Codebase indexing is how AI coding tools build a searchable map of your project so they can find relevant code without reading everything from scratch. Cursor builds vector embeddings. Cline and Claude Code skip indexing entirely. Augment indexes everything including commit history and docs. The approach each tool takes determines its speed, accuracy, cost, and failure modes.
What Is Codebase Indexing
Codebase indexing is the process of building a searchable representation of your source code so an AI model can find relevant files, functions, and context quickly. Instead of reading your entire project on every query, the tool pre-processes your code into a format optimized for retrieval.
The simplest version is a text index, like what grep searches against. More sophisticated approaches use vector embeddings that capture the semantic meaning of code chunks, enabling queries like "where does authentication happen" to return results even when no file contains that exact string.
There are three main approaches in 2026:
Embedding-Based Indexing
Chunk code with AST-aware splitters, compute vector embeddings, store in a vector database. Query by semantic similarity. Used by Cursor, Windsurf, and Copilot Enterprise.
Agentic Search
No pre-built index. The model runs grep, glob, and file reads at runtime, iteratively refining its search. Used by Claude Code, Cline, and several open source agents.
Hybrid / MCP-Based
Plug semantic indexing into any agent via MCP. Augment Context Engine, CodeGrok, and Serena add structured search without replacing the core agent architecture.
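All three approaches ultimately rank candidate code against a query. The embedding-based variant can be sketched in a few lines. This is a toy: real tools use learned code embeddings from a transformer, while the sketch below fakes them with bag-of-words vectors and cosine similarity so only the retrieval loop is visible. File paths and contents are hypothetical.

```python
# Toy sketch of embedding-based retrieval. A real pipeline would call
# an embedding model; bag-of-words counts stand in for those vectors.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a learned embedding model: token counts.
    return Counter(text.lower().replace("_", " ").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical pre-chunked, pre-"embedded" index.
chunks = {
    "src/auth/session.py": "def validate token session user login auth",
    "src/billing/invoice.py": "def compute invoice total tax line items",
}

def search(query: str, k: int = 1):
    q = embed(query)
    ranked = sorted(chunks, key=lambda p: cosine(q, embed(chunks[p])),
                    reverse=True)
    return ranked[:k]

print(search("validate user login token"))  # → ['src/auth/session.py']
```

The interesting property is that ranking happens against a pre-built index, not the live filesystem, which is exactly where the staleness tradeoff discussed later comes from.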
How Cursor Indexes Your Code
Cursor's indexing pipeline is the most documented embedding-based approach. When you open a project, Cursor scans the folder and computes a Merkle tree of file hashes, which it syncs to Cursor's servers.
Cursor's indexing pipeline
1. Scan project → compute Merkle tree of file hashes
2. Chunk code using AST-aware splitters (tree-sitter)
3. Compute vector embeddings per chunk
4. Store in Turbopuffer (vector + full-text search)
5. Cache embeddings by content hash (skip unchanged files)
6. Re-sync every ~5 minutes to detect changes
The chunking step matters most. Naive text splitting tears apart function definitions, separating a function call from its implementation. Cursor uses tree-sitter for AST-aware splitting, traversing the syntax tree depth-first to produce chunks that respect code structure. Embeddings are cached by content hash on AWS, so unchanged code reuses previous embeddings across indexing runs.
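AST-aware chunking can be illustrated with Python's stdlib `ast` module as a simplified stand-in for tree-sitter: each top-level function or class becomes one chunk, so a definition is never torn in half the way fixed-size text splitting would tear it. The content hash mirrors Cursor's cache-by-hash step; the sample source is hypothetical.

```python
# Simplified AST-aware chunking sketch (stdlib ast standing in for
# tree-sitter). One chunk per top-level definition.
import ast
import hashlib

SOURCE = '''
def validate_token(token):
    return token == "secret"

class SessionStore:
    def get(self, key):
        return self._data.get(key)
'''

def chunk(source: str):
    tree = ast.parse(source)
    lines = source.splitlines()
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                             ast.ClassDef)):
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            # Content hash lets unchanged chunks reuse cached embeddings.
            yield hashlib.sha256(text.encode()).hexdigest()[:12], text

for digest, text in chunk(SOURCE):
    print(digest, text.splitlines()[0])
```

A production chunker also has to handle oversized definitions and nested scopes, but the principle is the same: split at syntax-tree boundaries, never at arbitrary byte offsets.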
Cursor's shared team indexing (launched 2026) means new team members can reuse existing indexes instead of waiting hours for initial indexing. For large monorepos, this cuts onboarding from hours to seconds.
Windsurf takes a similar approach
Windsurf uses its own RAG pipeline optimized for enterprise-scale repos with 100M+ lines of code. It pre-computes code snippet indexes and retrieves relevant context during generation. Remote indexing supports multiple repositories and maintains architectural awareness across distributed systems.
The Staleness Problem
Code indexes are snapshots frozen in time. Cursor re-syncs every five minutes, but a developer on a hot streak can rename functions, move files, and restructure modules in that window. Forge Code's research found that indexed agents are 22% faster on average, but produce incorrect suggestions when code has changed since the last sync. Non-indexed agents are slower but never reference phantom APIs or deleted functions.
| Factor | Indexed (Cursor, Windsurf) | Non-Indexed (Claude Code, Cline) |
|---|---|---|
| Semantic search | Yes, via embeddings | No (text matching only) |
| Staleness risk | 5-minute sync gap | None (reads live filesystem) |
| Token cost per query | Lower (pre-filtered results) | Higher (iterative search) |
| Setup overhead | Initial indexing (minutes to hours) | None |
| Security surface | Code sent to embedding service | Code stays local |
| Large repo performance | Fast after indexing | Scales with search complexity |
The Agentic Search Alternative
Claude Code, Cline, and several open source agents take the opposite approach: no index at all. The model explores the codebase at runtime using grep, glob, and file reads, with each search step informing the next.
Agentic search in practice
# The model iterates, refining each query based on results:
Turn 1: glob "**/*auth*" → 5 matching files
Turn 2: grep "validateToken" → 3 matches in src/middleware/
Turn 3: read src/middleware/auth.ts → found the handler
Turn 4: grep "auth.ts" imports → traced the call chain
# Contrast with embedding search:
Query: "where does authentication happen"
→ Returns top-5 nearest embeddings (one shot, no refinement)Boris Cherny, who created Claude Code, explained the reasoning: "Early versions of Claude Code used RAG + a local vector DB, but we found pretty quickly that agentic search generally works better. It is also simpler and doesn't have the same issues around security, privacy, staleness, and reliability."
Cline's team makes a sharper argument against chunking: "When you chunk code for embeddings, you're literally tearing apart its logic. A function call might be in one chunk, its definition in another, and the critical context scattered across other fragments." Cline instead reads files connection by connection, following imports and tracing dependencies to build understanding incrementally.
The tradeoff is real
Agentic search works well for targeted queries where the model knows roughly what to look for. It struggles with "I don't know the name" queries where the implementation uses different terminology than the spec. If you ask "where does rate limiting happen" but the code calls it throttleMiddleware, grep won't find it. Semantic search will.
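The vocabulary-gap failure is easy to reproduce. The sketch below shows exact text search missing throttleMiddleware for the query "rate limiting", then bridging the gap with a hand-written synonym map, a crude stand-in for what learned embeddings do implicitly. The file and synonym table are hypothetical.

```python
# Toy demo of the "I don't know the name" failure mode.
import re

FILES = {"src/middleware/throttle.ts":
         "export const throttleMiddleware = rateLimiterFactory()"}

def grep(term):
    return [p for p, src in FILES.items() if re.search(term, src, re.I)]

# Stand-in for semantic matching: explicit synonym expansion.
SYNONYMS = {"rate limiting": ["throttle", "rate.?limit", "backoff"]}

def semantic_ish(query):
    hits = grep(query)
    for alt in SYNONYMS.get(query, []):
        hits += grep(alt)
    return sorted(set(hits))

print(grep("rate limiting"))          # → [] — exact match fails
print(semantic_ish("rate limiting"))  # → ['src/middleware/throttle.ts']
```

A real embedding model needs no synonym table, which is precisely why semantic search wins on this class of query.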
Augment Context Engine MCP
Augment's Context Engine, released as an MCP server in February 2026, represents the hybrid approach. It ships semantic indexing as a pluggable component that works with any MCP-compatible agent, including Claude Code and Cursor.
The Context Engine goes beyond code structure. It indexes commit history, codebase patterns, external documentation, and what Augment calls tribal knowledge, the implicit conventions that live in a team's collective memory but nowhere in the code itself.
You can run the Auggie CLI locally as an MCP server that indexes in real-time as you edit, or connect to Augment's hosted service over HTTP. The key insight: the performance gain comes from better retrieval, not a better underlying model. The same Claude or GPT model performs dramatically better when it receives more relevant context.
The Token Cost Debate
The cost argument is the sharpest point of contention in the indexing debate. Milvus published a detailed critique of Claude Code's grep-only approach, arguing it burns tokens by dumping irrelevant matches into the context window. Their data shows vector search reduces token usage by 40% compared to raw grep.
| Approach | Token Efficiency | Accuracy Tradeoff |
|---|---|---|
| Raw grep (no filtering) | Worst: dumps all matches | High recall, low precision |
| Embedding search | 40% less than grep | Semantic matching, may miss exact code |
| AST-based retrieval | 74% less than raw reads | Structural accuracy, language-dependent |
| RL-trained search agent | 15.6% cheaper than self-search | Iterative refinement, high precision |
But the critique misses a nuance. Claude Code doesn't just run a single grep and dump results. It runs an agentic loop: grep, analyze results, refine the query, grep again. Each turn produces better-targeted results. The question is whether the cumulative cost of multiple refined searches exceeds the cost of a single embedding lookup.
For small-to-medium repos (under 50k lines), the difference is marginal. For monorepos with millions of lines, indexed search pulls ahead on cost because the search space is too large for iterative grep to navigate efficiently.
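The crossover can be made concrete with a back-of-envelope cost model. The key asymmetry: in an agentic loop each turn's results stay in context, so later turns pay to re-process everything accumulated so far, while an indexed lookup pays for its top-k chunks once. All token counts below are illustrative assumptions, not measurements.

```python
# Back-of-envelope token-cost model for the debate in the text.
def agentic_cost(turns: int, tokens_per_turn: int) -> int:
    # Results accumulate; every turn re-reads the growing context.
    total, context = 0, 0
    for _ in range(turns):
        context += tokens_per_turn
        total += context
    return total

def indexed_cost(topk_chunks: int, tokens_per_chunk: int) -> int:
    # One retrieval: top-k chunks enter the context once.
    return topk_chunks * tokens_per_chunk

print(agentic_cost(turns=4, tokens_per_turn=800))        # → 8000
print(indexed_cost(topk_chunks=5, tokens_per_chunk=400)) # → 2000
```

Under these (assumed) parameters indexed search wins 4x, but shrink the repo so two grep turns suffice and the gap nearly closes, which matches the small-repo observation above.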
Open Source Code Indexing Tools
The MCP ecosystem has produced several open source tools that add codebase indexing to any compatible agent:
CodeGrok MCP
Combines tree-sitter AST parsing with vector embeddings. Indexes code once, then AI queries semantically and receives only the 5-10 most relevant snippets. Claims 10-100x token reduction.
Serena MCP
Uses Language Server Protocol for symbol-level navigation across 30+ languages. Provides IDE-like capabilities (go-to-definition, find-references) to any MCP-compatible agent.
Claude Context (Zilliz)
Open source MCP plugin that adds semantic code search to Claude Code using Milvus/Zilliz vector database. Brings embedding-based retrieval to agents that don't natively support it.
CodeGrok is particularly interesting because it combines two indexing strategies: AST parsing for structural understanding and vector embeddings for semantic search. When an agent asks about a function, CodeGrok returns the function definition, its callers, and related types as a coherent unit rather than fragmented text chunks.
Serena takes a different angle. Instead of embeddings, it wraps the Language Server Protocol, giving AI agents the same go-to-definition, find-all-references, and symbol-search capabilities that IDEs provide. This means precise, always-current navigation without any embedding pipeline.
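Under the hood, an LSP-backed tool sends JSON-RPC 2.0 requests with the protocol's Content-Length framing. The sketch below only builds a `textDocument/definition` request; a real client would write it to a language server's stdin and parse the response. The file URI and position are hypothetical.

```python
# Constructing an LSP textDocument/definition request per the spec:
# Content-Length header, blank line, then a JSON-RPC 2.0 body.
import json

def lsp_frame(method: str, params: dict, msg_id: int = 1) -> bytes:
    body = json.dumps({"jsonrpc": "2.0", "id": msg_id,
                       "method": method, "params": params}).encode()
    return b"Content-Length: %d\r\n\r\n" % len(body) + body

req = lsp_frame("textDocument/definition", {
    "textDocument": {"uri": "file:///repo/src/middleware/auth.ts"},
    "position": {"line": 41, "character": 17},  # 0-based, per the spec
})
print(req.decode())
```

Because the language server answers from its live analysis of the workspace, the result is always current, with no embedding pipeline to rebuild.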
RL-Trained Search: A Third Path
Both indexing and agentic search have a shared weakness: they load search results directly into the coding model's context window. Whether those results come from an embedding query or a grep command, the coding model processes them alongside everything else it's tracking. This is where context rot sets in.
WarpGrep isolates search into a separate context window. It's an RL-trained search agent that explores your codebase, filters results, and returns only the relevant file and line ranges. The parent model never sees the 15 files that were explored and rejected.
How WarpGrep differs from indexing and agentic search
# Indexed search (Cursor-style):
1. Pre-build embeddings of all code chunks
2. Query → nearest-neighbor lookup → return top-k chunks
3. Results go into coding model's context
Problem: stale index, chunking artifacts, semantic-only matching
# Agentic search (Claude Code-style):
1. Coding model runs grep/glob/read in its own context
2. Each search result stays in the context window
3. Irrelevant results accumulate as context noise
Problem: token waste, context rot in the coding model
# RL-trained search agent (WarpGrep):
1. Search agent explores in its own isolated context window
2. Agent iteratively searches, reads, filters, backtracks
3. Returns only: "src/api/webhooks.ts, lines 47-89"
4. Coding model receives 150 tokens of precise context
Advantage: no index, no staleness, no context rot
The cost savings come from an unintuitive place. Adding a second model to the pipeline makes the overall system cheaper because the expensive reasoning model stops wasting tokens on search. It sees fewer irrelevant files, generates fewer wasted tokens, and finishes tasks sooner.
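The context-isolation pattern itself is simple to sketch: the search agent's exploration lives in its own transcript, and only a small pointer crosses into the coding model's context. Every function and path below is a hypothetical stand-in, not WarpGrep's actual interface.

```python
# Sketch of context isolation between a search agent and a coding model.
def search_agent(query: str) -> str:
    # Explores, reads, and rejects files inside its OWN context window;
    # none of that transcript is ever returned to the caller.
    explored = ["src/api/index.ts", "src/api/routes.ts",
                "src/api/webhooks.ts"]
    _ = explored  # rejected candidates never leave this function
    return "src/api/webhooks.ts:47-89"  # only a precise pointer escapes

def coding_model(task: str, pointer: str) -> str:
    # Receives roughly a sentence of context instead of whole files.
    return f"Editing {pointer} to {task}"

pointer = search_agent("where are webhook signatures verified")
print(coding_model("fix signature check", pointer))
```

The function boundary is the isolation mechanism in miniature: the parent never sees the intermediate state, only the return value.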
Choosing the Right Approach
The indexing debate isn't a binary choice. Each approach excels in specific scenarios:
| Scenario | Best Approach | Why |
|---|---|---|
| Large stable monorepo | Embedding index (Cursor) | Semantic search across millions of lines |
| Fast-moving startup codebase | Agentic search (Claude Code) | No stale index, always reads current code |
| Enterprise multi-repo | Augment Context Engine | Cross-repo awareness, team knowledge |
| Cost-sensitive long tasks | RL search agent (WarpGrep) | Context isolation prevents token waste |
| Need semantic + exact match | Hybrid (CodeGrok, Claude Context) | Combines AST parsing with embeddings |
| Symbol-level navigation | Serena MCP | LSP integration, always current |
The tools that will matter most are the ones that let you compose these approaches. MCP makes this possible: you can run Claude Code for agentic search, add Augment's Context Engine for semantic queries, and use WarpGrep for context-isolated search, all in the same workflow. The goal isn't to pick one approach forever. It's to match the retrieval strategy to the task.
Frequently Asked Questions
What is codebase indexing?
Codebase indexing is the process of building a searchable representation of your source code so an AI model can find relevant files, functions, and context without reading everything from scratch. Tools like Cursor use vector embeddings and AST-aware chunking, while others like Claude Code skip indexing and use agentic search at runtime.
How does Cursor index your codebase?
Cursor scans your project, computes a Merkle tree of file hashes, chunks code using tree-sitter AST-aware splitters, and stores vector embeddings in Turbopuffer. Unchanged files are cached by content hash. The index re-syncs approximately every 5 minutes to pick up changes.
Why doesn't Claude Code use codebase indexing?
Boris Cherny, Claude Code's creator, explained that early versions used RAG with a local vector DB, but the team found agentic search generally works better. The model runs grep, glob, and file reads at runtime. This avoids stale indexes, security concerns from duplicating code into vector stores, and the complexity of maintaining embedding pipelines.
What is the Augment Context Engine MCP?
The Augment Context Engine is an MCP server released in February 2026 that provides semantic codebase indexing to any MCP-compatible agent. It indexes code structure, commit history, external docs, and team conventions. Adding it improved agent performance by 70% across Claude Code, Cursor, and Codex.
Does codebase indexing waste tokens?
It depends on the approach. Embedding search reduces token usage by 40% compared to raw grep. AST-based retrieval can cut tokens by 74%. But agentic search advocates argue that iterative refinement produces more precise results. RL-trained search agents like WarpGrep reduce costs by 15.6% through context isolation.
What is agentic search for code?
Agentic search is a retrieval approach where the AI model explores the codebase at runtime using tools like grep, glob, and file reads rather than querying a pre-built index. Each step informs the next, allowing iterative refinement. Claude Code and Cline use this approach. The tradeoff is higher per-query token usage in exchange for always-current results.
What are the best open source codebase indexing tools?
Notable options include CodeGrok MCP (AST parsing + vector embeddings, claimed 10-100x token reduction), Serena MCP (LSP integration for symbol navigation across 30+ languages), and Augment Context Engine MCP. All work as MCP servers compatible with Claude Code, Cursor, and other agents.
How does WarpGrep compare to codebase indexing?
WarpGrep is an RL-trained search agent that runs in its own context window. Unlike static indexes, it searches the live filesystem and returns only relevant file and line ranges to the parent model. On SWE-Bench Pro, it lifts every frontier model to #1 while being 15.6% cheaper and 28% faster. No index to maintain means no staleness.
Search Without Indexing Overhead
WarpGrep is an RL-trained search agent that finds code in its own context window and returns only what matters. No embeddings to maintain, no index to go stale. 15.6% cheaper, 28% faster, and #1 on SWE-Bench Pro with every frontier model.