What's Missing from Karpathy's Approach (and Why That's the Lesson)

The karpathy-guidelines skill has four rules. It doesn't have five. It doesn't have twenty-seven. The four it has address specific, observable failure modes in LLM-generated code. The things it doesn't address are not oversights — they're scope decisions.

Reading what's absent from Karpathy's approach is as instructive as reading what's present. The gaps reveal a theory of what behavioral constraints can and can't accomplish, and they point at the places where a different kind of tool is needed.

What the Skill Doesn't Cover

The skill contains nothing about:

Testing strategy (unit tests, integration tests, coverage targets)
Code review process
Documentation standards
Commit conventions or branching strategy
Team collaboration norms
Architectural patterns or design principles
Security practices
Performance considerations
Long-term codebase maintenance

A thoughtful team CLAUDE.md might include guidance on any or all of these. The karpathy-guidelines skill includes none of them. Why?

The answer is scope precision. The four rules in the skill address failure modes that appear in a single LLM coding session: wrong interpretation of the task, over-built solution, excess scope, vague success criteria. These are per-session problems. A rule that patches them in a single session patches them in every session.

Testing strategy, architecture, and team norms are not per-session problems. They're accumulated decisions that depend on context the skill doesn't have: your team's technical level, your codebase's history, your deployment environment, your industry's regulatory requirements. A behavioral rule about testing coverage can't be stated abstractly enough to be correct across all contexts. Any attempt to include it would either be so general as to be useless ("write adequate tests") or so specific as to be wrong for most projects.

The Scope Discipline

The discipline of omission is harder than it looks. Every artifact that ships instructions or guidelines faces pressure to be comprehensive. Comprehensive-looking documents earn more initial trust from users. They feel more professional. They suggest the author has thought of everything.

The problem with comprehensive guidelines is that they dilute the high-value signals with medium-value ones. If the four karpathy rules are surrounded by thirty contextual rules about various coding scenarios, the behavioral constraints that actually change session-to-session quality get less attention. The most important rules don't stand out.

Karpathy's public code demonstrates the same discipline in practice. nanoGPT doesn't cover distributed training, model serving, or quantization. micrograd doesn't cover multi-dimensional tensors. Each project covers exactly what it claims to cover and stops at the boundary. The scope is declared; the boundary is respected.

What's Actually Missing (and Intentionally So)

Some of what is absent from Karpathy's approach is a genuine gap rather than a deliberate exclusion. Two such gaps are worth naming.

Team context. The karpathy-guidelines skill is designed for a solo developer or a small team in which the person writing the task has full context. It doesn't account for the coordination overhead of larger teams: tasks specified by one developer and executed in a context another developer set up, architectural decisions that must stay consistent across multiple contributors, or review processes in which the agent's output needs to be legible to someone who wasn't in the session.

The skill's "Think Before Coding" rule helps — surfacing assumptions is useful in multi-person contexts. But the skill as a whole is optimized for a single-developer loop. The team dimension is outside its scope.

Long-session drift. The four rules address the start of a task — clarify before implementing, scope correctly, define success criteria. They don't address what happens over a long agentic session where early decisions compound, the context window fills, and the agent's behavior starts to drift from the original intent.

This is a real problem in complex Claude Code workflows: sessions that start well, follow the guidelines, and then produce output that's subtly wrong because locally sensible intermediate decisions accumulated into an incoherent global result. Catching this requires either explicit mid-session checkpoints (which the planning template in Goal-Driven Execution partially addresses) or shorter task scopes. The skill gestures at the problem but doesn't fully solve it.
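One way to picture explicit mid-session checkpoints is as a plan in which every short step carries its own verification, and the session halts at the first check that fails. The following is a minimal sketch under that assumption, not something the skill itself ships: `Step`, `run_with_checkpoints`, and the toy session state are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    """One short-scoped unit of work plus the check that verifies it (hypothetical)."""
    name: str
    run: Callable[[dict], None]     # mutates the shared session state
    verify: Callable[[dict], bool]  # True if the state still matches the original intent


def run_with_checkpoints(steps: List[Step], state: dict) -> Optional[str]:
    """Run steps in order. Return the name of the first step whose
    checkpoint fails, or None if every checkpoint passes."""
    for step in steps:
        step.run(state)
        if not step.verify(state):
            return step.name  # stop here instead of letting the drift compound
    return None


# Toy session: the original intent is "the record has an email field".
def has_email(state: dict) -> bool:
    return "email" in state["fields"]


steps = [
    Step("add field", lambda s: s.update(fields=s["fields"] + ["email"]), has_email),
    Step("drifted refactor", lambda s: s.update(fields=[]), has_email),
    Step("never reached", lambda s: s.update(reached=True), lambda s: True),
]

state = {"fields": ["id"]}
failed_at = run_with_checkpoints(steps, state)
print(failed_at)
```

In this toy run the second step undoes the first step's work, its checkpoint fails, and the third step never executes. The design choice being illustrated is that each check is evaluated against the original intent rather than against the previous step, which is what lets it catch accumulated drift.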

Why the Gaps Are the Lesson

The gaps aren't failures of the karpathy-guidelines skill. They're illustrations of what a 65-line behavioral constraint can and can't accomplish.

What it can do: patch default behavioral failure modes that appear in every session. What it can't do: supply project-specific context, encode team norms, or handle architectural concerns that require knowledge the skill doesn't have.

The lesson is that behavioral constraints are one layer of a three-layer context stack. Layer one is behavioral: what the model shouldn't do in any context (karpathy-guidelines). Layer two is project-specific: what conventions, constraints, and context your project has (CLAUDE.md). Layer three is domain-specific: what your industry, stack, or tool set requires (domain skills).

Each layer does something the others can't. Trying to solve layer-two problems with layer-one tools produces overly broad rules. Trying to solve layer-one problems with layer-two tools requires you to re-specify behavioral constraints in every new project.
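The layering can be sketched as a small composition function in which the behavioral layer is defined once and every project stacks its own context on top. This is an illustrative model only, assuming the layers are plain lists of rules; `Layer`, `build_context`, and all of the project and domain rules shown are hypothetical, and only the two rule names that appear in this article are listed in the behavioral layer.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Layer:
    name: str              # "behavioral", "project", or "domain"
    rules: Sequence[str]


# Layer one: written once, reused unchanged by every project.
# Only the two rule names mentioned in the article are listed here.
BEHAVIORAL = Layer("behavioral", ("Think Before Coding", "Goal-Driven Execution"))


def build_context(project: Layer, domain: Sequence[Layer] = ()) -> str:
    """Stack the layers in order: behavioral, then project, then domain."""
    sections = []
    for layer in (BEHAVIORAL, project, *domain):
        lines = [f"[{layer.name}]"] + [f"- {rule}" for rule in layer.rules]
        sections.append("\n".join(lines))
    return "\n\n".join(sections)


# Hypothetical project and domain layers, for illustration only.
api_project = Layer("project", ("Use the existing error-handling helpers",))
ml_project = Layer("project", ("Keep notebooks out of the package",))
payments = Layer("domain", ("Never log card numbers",))

api_context = build_context(api_project, [payments])
ml_context = build_context(ml_project)
print(api_context)
```

Both contexts begin with the same behavioral section, while the project and domain sections differ. Folding the behavioral rules into each project layer instead would mean restating them for `api_project` and `ml_project` separately, which is the re-specification cost described above.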

The karpathy-guidelines skill doesn't cover everything. It covers the one layer that applies everywhere. That's the right scope for a skill that should work in any project, for any developer, on any task.

For a broader look at what the skill's design says about Karpathy's approach to knowledge transfer, see Karpathy's Skill Authoring Style: Distillation Over Accretion. For the layer-two angle — what goes in a project CLAUDE.md — see CLAUDE.md as Context Engineering: Why Karpathy's Practice Matters.

Part of the Karpathy on Claude Code series. Published 2026-05-23.
