CocoIndex

Your agents deserve fresh context.

CocoIndex turns codebases, meeting notes, inboxes, Slack, PDFs, and videos into live, continuously fresh context that your AI agents and LLM apps can reason over, reprocessing only what changed. Get a production AI agent running in 10 minutes on reliable, continuously fresh data: no stale batches, no context gap.

Incremental: only the delta · Any scale: parallel by default · Declarative: Python, 5 min

Built with CocoIndex ❤️

See all 20+ examples · updated every week →

React — for data engineering

The CocoIndex mental model: Target = F(Source). It is a persistent-state-driven dataflow: you declare the desired target state, and the engine keeps it in sync with the latest source data and code, continuously, at low latency and low cost.
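To make Target = F(Source) concrete, here is a minimal sketch in plain Python. It is not CocoIndex's API: `reconcile`, the dict-based source and target, and the use of `str.upper` as F are all hypothetical stand-ins. It illustrates only the declarative part of the idea (compute the desired state, then apply the difference to the target) and naively recomputes F for every source row.

```python
from typing import Callable, Dict, List, Tuple


def reconcile(source: Dict[str, str],
              target: Dict[str, str],
              f: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Bring `target` in line with F(source) and return the writes applied.

    Hypothetical helper for illustration; not part of CocoIndex.
    """
    # Desired state: apply F to every source row.
    desired = {key: f(value) for key, value in source.items()}
    actions: List[Tuple[str, str]] = []

    # Upsert rows that are missing from the target or out of date.
    for key in sorted(desired):
        if key not in target or target[key] != desired[key]:
            target[key] = desired[key]
            actions.append(("upsert", key))

    # Delete target rows whose source row no longer exists.
    for key in sorted(set(target) - set(desired)):
        del target[key]
        actions.append(("delete", key))

    return actions


source = {"a.md": "hello", "b.md": "world"}
target: Dict[str, str] = {}

first = reconcile(source, target, str.upper)   # initial build writes both rows

source["a.md"] = "hi"      # edit one source row
del source["b.md"]         # remove another
second = reconcile(source, target, str.upper)  # only the difference is written
```

As in React, the caller never issues individual writes; it only states what the target should be, and the reconciler works out the upserts and deletes.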

When either side changes, CocoIndex uses per-row provenance to propagate the Δ at minimum cost. A source change re-syncs only the affected target rows; a code change re-runs only the rows whose outputs depend on the changed code.
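One way to picture per-row provenance is a small stamp stored next to each target row, recording which source content and which version of the code produced it. The sketch below assumes that model; `Transform`, `Engine`, the manual `version` string, and the SHA-256 fingerprint are illustrative choices, not a description of how CocoIndex implements it.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Transform:
    name: str
    version: str                 # bumped whenever the transform's code changes
    fn: Callable[[str], str]


@dataclass
class Engine:
    # key -> (source fingerprint, transform name, transform version)
    provenance: Dict[str, Tuple[str, str, str]] = field(default_factory=dict)
    target: Dict[str, str] = field(default_factory=dict)

    def sync(self,
             source: Dict[str, Tuple[str, str]],
             transforms: Dict[str, Transform]) -> List[str]:
        """`source` maps key -> (transform name, content).

        Returns the keys that actually had to be re-run.
        """
        rerun: List[str] = []
        for key, (transform_name, content) in source.items():
            t = transforms[transform_name]
            fingerprint = hashlib.sha256(content.encode("utf-8")).hexdigest()
            stamp = (fingerprint, t.name, t.version)
            # Re-run only if the source content or the code behind this row changed.
            if self.provenance.get(key) != stamp:
                self.target[key] = t.fn(content)
                self.provenance[key] = stamp
                rerun.append(key)
        # Drop target rows whose source row disappeared.
        for key in list(self.target):
            if key not in source:
                del self.target[key]
                del self.provenance[key]
        return rerun
```

With two rows handled by two different transforms, editing one row's content re-runs that row only, and bumping one transform's version re-runs only the rows that transform produced.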

See the React ↔ CocoIndex mental model →

Incremental engine for long-horizon agents

Data transformation for any engineer, designed for AI workloads —
with a smart incremental engine for always-fresh, explainable data.

CocoIndex's Python-native transformation flows connect 8 source categories, through the incremental engine, to 6 target stores. Only the Δ is reprocessed: unchanged sources hit the cache, while changed sources re-run split() and only the Δ is re-embedded.
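The cache-then-re-embed behaviour can be sketched with a content-addressed cache. Everything here is an assumption made for illustration: this `split` is a toy paragraph splitter rather than CocoIndex's, `fake_embed` stands in for a real embedding model, and `ChunkCache` is a hypothetical in-memory store.

```python
import hashlib
from typing import Dict, List


def split(text: str) -> List[str]:
    """Toy splitter: one chunk per non-empty paragraph."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def fake_embed(chunk: str) -> List[float]:
    """Placeholder for a real (slow, costly) embedding model call."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:4]]


class ChunkCache:
    """Caches embeddings by chunk content hash, so only new chunks are embedded."""

    def __init__(self) -> None:
        self.vectors: Dict[str, List[float]] = {}
        self.embed_calls = 0  # counts how often the "model" was actually called

    def embed_document(self, text: str) -> List[List[float]]:
        result = []
        for chunk in split(text):
            key = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
            if key not in self.vectors:          # cache miss: this chunk is the delta
                self.vectors[key] = fake_embed(chunk)
                self.embed_calls += 1
            result.append(self.vectors[key])     # cache hit: reuse the stored vector
        return result


cache = ChunkCache()
cache.embed_document("intro\n\nsetup\n\nusage")          # three chunks embedded
cache.embed_document("intro\n\nsetup (edited)\n\nusage")  # only the edited chunk is new
```

After the second call, `cache.embed_calls` is 4 rather than 6: the two untouched paragraphs were served from the cache.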

Why incremental?

Your agents are only as good as the data they see.
Batch pipelines drift stale. CocoIndex stays live — and only runs the Δ.

Sub-second freshness, 10× cheaper at scale, explainable by default, and a production-grade Rust core with retries, back-off, dead-letter queues, and no-data-loss guarantees.
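For readers unfamiliar with the retry, back-off, and dead-letter pattern named above, here is a generic Python sketch of the idea. It does not reflect CocoIndex's Rust core; `process_with_retries`, its parameters, and the doubling delay schedule are assumptions chosen for illustration.

```python
import time
from typing import Callable, Iterable, List, Tuple


def process_with_retries(items: Iterable[str],
                         handler: Callable[[str], str],
                         max_attempts: int = 3,
                         base_delay: float = 0.1,
                         sleep: Callable[[float], None] = time.sleep,
                         ) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Run `handler` on each item, retrying failures with exponential back-off.

    Items that still fail after `max_attempts` are not dropped; they are
    collected in a dead-letter list together with the last error.
    """
    done: List[str] = []
    dead_letters: List[Tuple[str, str]] = []
    for item in items:
        for attempt in range(max_attempts):
            try:
                done.append(handler(item))
                break
            except Exception as exc:
                if attempt == max_attempts - 1:
                    # Out of attempts: park the item for later inspection.
                    dead_letters.append((item, repr(exc)))
                else:
                    # Wait base_delay, then 2x, then 4x, ... before retrying.
                    sleep(base_delay * 2 ** attempt)
    return done, dead_letters
```

Passing `sleep` as a parameter keeps the function easy to test: a test can supply a recorder such as `delays.append` instead of actually waiting.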