Agentic Search vs. RAG: Why Claude Code Doesn't Index Your Codebase (And Why That's the Point)
RAG-based code search hits a wall at scale: the index is stale before it ships. Claude Code's grep-and-read approach trades upfront indexing for live traversal — and that tradeoff scales differently than most teams realize.
There's a quiet design choice in Claude Code that most teams never notice until it matters: it doesn't index your codebase. No embeddings. No vector store. No nightly job that crawls the repo and pre-computes context for next morning's queries.
It just opens files. Greps. Follows references. Reads what it needs, when it needs it.
In the Applied AI team's enterprise patterns piece, this design choice gets one of the most quietly damning lines I've read in a tooling article this year:
"By the time a developer queries the index, it reflects the codebase as it previously existed weeks, days, or even hours before."
That's the RAG problem at scale. Stated plainly. And it explains a lot of why teams using embedding-based code search keep hitting the same ceiling.
What Agentic Search Actually Does
Claude Code's approach mirrors what a competent human engineer does when dropped into an unfamiliar codebase:
- Look at the directory structure.
- Open the obvious entry points.
- Grep for the symbol or string you care about.
- Follow the references the grep returns.
- Repeat until you understand enough to make the change.
There's no pre-computation. There's no "we have already indexed this." Every session starts cold. That sounds inefficient, but it's the property that lets the approach scale: there is no shared artifact that has to be kept in sync with the code.
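The loop above can be sketched in a few lines of Python. This is only an illustration of the grep-then-follow pattern, not Claude Code's implementation: the `grep` and `find_definition_and_callers` helpers, the regexes, and the two-file demo repo are all assumptions made up for this example.

```python
import re
import tempfile
from pathlib import Path


def grep(root, pattern, suffixes=(".py",)):
    """Scan files under root right now; return (path, line_no, text) per match."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for line_no, text in enumerate(lines, start=1):
            if regex.search(text):
                rel = path.relative_to(root).as_posix()
                hits.append((rel, line_no, text.strip()))
    return hits


def find_definition_and_callers(root, symbol):
    """Step 1: locate the definition. Step 2: follow references to call sites."""
    name = re.escape(symbol)
    definitions = grep(root, rf"^\s*def {name}\(")
    defined_at = {(path, line_no) for path, line_no, _ in definitions}
    callers = [hit for hit in grep(root, rf"\b{name}\(")
               if (hit[0], hit[1]) not in defined_at]
    return definitions, callers


if __name__ == "__main__":
    # Hypothetical two-file repo, created in a temp directory for the demo.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "billing.py").write_text(
            "def charge(user):\n    return user\n", encoding="utf-8")
        (root / "api.py").write_text(
            "from billing import charge\n\nresult = charge('u1')\n", encoding="utf-8")
        definitions, callers = find_definition_and_callers(root, "charge")
        print("defined at:", definitions)
        print("called from:", callers)
```

Nothing here is cached between calls. Every invocation re-reads the files, which is the "starts cold" property in miniature.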
The Index Staleness Problem
If your team is small and your codebase changes slowly, RAG-based code search works fine. The index might be a few hours out of date, but the symbols you care about are still in roughly the right place.
If your team is a thousand engineers committing to a monorepo all day, the index is effectively always stale: commits land faster than any indexer can re-run. The Applied AI team's framing is direct: by the time you query the index, it reflects the codebase from hours ago. Your colleague renamed the function 40 minutes ago. The index still points at the old name.
The failure mode isn't catastrophic — it's silently degrading. The tool returns answers that look right. They reference symbols that used to exist. The developer wastes time chasing names that have moved.
Agentic search doesn't have this failure mode because there's nothing to be stale. The codebase is the index. When Claude greps for a symbol, it sees what's actually there right now, not what was there at 3 AM when the indexer last ran.
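To make the rename scenario concrete, here is a toy reproduction. It rests on a simplifying assumption: a plain dictionary snapshot stands in for the embedding index. A real RAG pipeline stores vectors rather than a name-to-location map, but the staleness mechanics are the same. The file name and function names are hypothetical.

```python
import re
import tempfile
from pathlib import Path

DEF_PATTERN = re.compile(r"^\s*def (\w+)\(")


def scan_definitions(root):
    """Read every .py file under root and map function name -> (file, line)."""
    found = {}
    for path in sorted(Path(root).rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for line_no, text in enumerate(lines, start=1):
            match = DEF_PATTERN.match(text)
            if match:
                found[match.group(1)] = (path.name, line_no)
    return found


def rename_after_snapshot(root):
    """Build a snapshot, rename a function, then compare snapshot and live scan."""
    source = Path(root) / "billing.py"
    source.write_text("def charge(user):\n    return user\n", encoding="utf-8")
    snapshot = scan_definitions(root)  # stands in for the last indexer run

    # A colleague renames the function after the snapshot was taken.
    source.write_text("def charge_customer(user):\n    return user\n", encoding="utf-8")

    live = scan_definitions(root)  # what a fresh grep sees right now
    return snapshot, live


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        snapshot, live = rename_after_snapshot(tmp)
        print("snapshot still lists:", sorted(snapshot))
        print("live scan finds:", sorted(live))
```

The snapshot lookup still succeeds and returns a plausible file and line, so nothing signals that the answer is out of date. That is the silent part of the failure.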
The Tradeoff, Stated Honestly
The Applied AI team is direct that this isn't free:
"Quality depends on whether Claude has sufficient starting context to know where to look."
This is the catch. A pre-computed index can take you straight to a definition without any priors. Agentic search needs priors — something has to tell Claude where to start looking.
That "something" is the harness:
- CLAUDE.md files give Claude a map of the directory structure.
- A repo-root codebase map (a markdown file listing each top-level folder with a one-line description) gives Claude a table of contents to scan before opening files.
- Per-subdirectory CLAUDE.md files tell Claude what conventions apply locally.
- LSP integrations give Claude symbol-level filtering so a grep for a common name returns 3 real references instead of 1000 text matches.
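The last bullet is easier to see with an example. The sketch below uses Python's standard `ast` module as a stand-in for a language server. That is a deliberate simplification: a real LSP server resolves scopes, imports, and types across files, while this only separates identifiers from comments and string literals in a single snippet. The `SOURCE` snippet and both helper functions are hypothetical.

```python
import ast
import re

SOURCE = '''
# parse the config before anything else
def parse(text):
    return text.split(",")

message = "could not parse input"
rows = parse("a,b,c")
'''


def text_matches(source, name):
    """Line numbers where the name appears anywhere, comments and strings included."""
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    return [line_no
            for line_no, line in enumerate(source.splitlines(), start=1)
            if pattern.search(line)]


def symbol_references(source, name):
    """Line numbers where the name is a real identifier: a definition or a use."""
    lines = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if node.name == name:
                lines.add(node.lineno)
        elif isinstance(node, ast.Name) and node.id == name:
            lines.add(node.lineno)
        elif isinstance(node, ast.Attribute) and node.attr == name:
            lines.add(node.lineno)
    return sorted(lines)


if __name__ == "__main__":
    print("text matches:     ", text_matches(SOURCE, "parse"))
    print("symbol references:", symbol_references(SOURCE, "parse"))
```

On this snippet the text search reports four lines and the syntax-aware search reports two, the definition and the call. The gap widens quickly for common names in a large repo.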
The cost is real: a few engineering days of upfront setup. Once that setup is in place, though, the cost scales with the size of the codebase, not with its rate of change. No commit triggers a re-index. The "index" updates whenever a file is saved, because the index is the file.
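Some of that setup can be scripted. Below is a minimal sketch that drafts the repo-root codebase map described in the list above. It assumes a convention the original setup does not specify: each top-level folder's description comes from the first non-empty line of its `README.md`, with a TODO placeholder when there is none. The function names and the output layout are illustrative.

```python
import tempfile
from pathlib import Path


def first_readme_line(folder):
    """Use the first non-empty README.md line as the description (assumed convention)."""
    readme = folder / "README.md"
    if readme.is_file():
        for line in readme.read_text(encoding="utf-8").splitlines():
            text = line.strip().lstrip("#").strip()
            if text:
                return text
    return "TODO: describe this folder"


def build_codebase_map(root):
    """Return markdown listing each top-level folder with a one-line description."""
    root = Path(root)
    folders = sorted(p for p in root.iterdir()
                     if p.is_dir() and not p.name.startswith("."))
    lines = ["# Codebase map", ""]
    for folder in folders:
        lines.append(f"- `{folder.name}/`: {first_readme_line(folder)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical repo layout, created in a temp directory for the demo.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "api").mkdir()
        (root / "api" / "README.md").write_text("# HTTP handlers\n", encoding="utf-8")
        (root / "billing").mkdir()
        print(build_codebase_map(root))
```

A generated draft like this is a starting point for a human to edit. The value of the map is in the descriptions, and a script can only copy what someone already wrote.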
Why Most Teams Mis-Estimate the Cost
The reason RAG-based code search feels "simpler" is that the work happens once, asynchronously, by a system. The reason agentic search feels "harder" is that the work happens every session, synchronously, by Claude — which uses your context window.
Both costs are real. But they scale differently:
| Property | RAG Code Search | Agentic Search |
|---|---|---|
| Upfront setup | Indexer + embeddings + vector DB | CLAUDE.md hierarchy + LSP + skills |
| Per-session cost | Negligible | Real (context tokens) |
| Cost of code change | Index re-build | Zero |
| Cost of team scale | Index size + concurrent queries | No shared infrastructure; each session pays its own token cost |
| Cost of monorepo scale | Embedding cost grows linearly | Stays bounded per query |
| Failure mode | Silent staleness | Visible context overruns |
The "silent staleness" entry in the failure mode row is what makes the difference at enterprise scale. Visible failures get fixed. Silent ones don't, because no one knows there is anything to fix.
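As a rough illustration of what "visible" means here, consider a read budget that fails loudly when a session would exceed it. This is not how Claude Code accounts for context internally. The `ReadBudget` class, the exception name, and the four-characters-per-token estimate are all assumptions chosen to keep the example small.

```python
class ContextBudgetExceeded(RuntimeError):
    """Raised when reading another file would exceed the session budget."""


class ReadBudget:
    def __init__(self, max_tokens, chars_per_token=4):
        # chars_per_token=4 is a rough rule of thumb, not a real tokenizer.
        self.max_tokens = max_tokens
        self.chars_per_token = chars_per_token
        self.used_tokens = 0

    def estimate(self, text):
        """Estimate the token count of text using ceiling division."""
        return -(-len(text) // self.chars_per_token)

    def charge(self, name, text):
        """Record a file read, or raise if it would not fit in the budget."""
        cost = self.estimate(text)
        remaining = self.max_tokens - self.used_tokens
        if cost > remaining:
            raise ContextBudgetExceeded(
                f"reading {name} needs about {cost} tokens, {remaining} left")
        self.used_tokens += cost
        return cost


if __name__ == "__main__":
    budget = ReadBudget(max_tokens=10)
    budget.charge("small.py", "x" * 20)
    try:
        budget.charge("large.py", "x" * 24)
    except ContextBudgetExceeded as error:
        print("visible failure:", error)
```

The overrun surfaces as an error someone can act on, for example by narrowing the search or improving the starting context. A stale index lookup, by contrast, returns a result that looks normal.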
What This Implies For Your Investment
If you've been investing in code search infrastructure (your own RAG pipeline, a vendor product, a homegrown indexer), the Applied AI team's framing implies that investment may not compose with Claude Code the way you hoped. Claude Code isn't going to use your index. It's going to grep.
The same engineering budget redirected at the harness — CLAUDE.md hierarchies, LSP installs, skill packaging, plugin distribution — composes much better.
This isn't a critique of RAG generally. For documentation search, customer support, and many other use cases, RAG is the right architecture. For code — where the corpus changes by the minute and the cost of stale information is silent damage — agentic search has a structural advantage that the Applied AI team is now explicitly endorsing.
The Punchline
Claude Code's lack of a codebase index isn't a missing feature. It's the design that makes everything else work. Once you internalize that, the rest of the harness (CLAUDE.md, LSP, skills, plugins) stops looking like setup work and starts looking like the thing you're actually buying.
The model navigates. The harness tells it where to look.
Part of the Claude Code at Scale series. Previous: The harness, not the model. Next: CLAUDE.md decay and the 3-6 month review cadence.