Karpathy's Skill Authoring Style: Distillation Over Accretion
The karpathy-guidelines skill is 65 lines. The Hermes superbundle is 144 skills. These represent opposite theories of how to encode AI capability. Karpathy's approach has specific advantages worth understanding.
There are two dominant theories of how to encode AI capability as Claude Code skills. The first is accretion: cover more use cases, add more commands, expand the surface area until the skill is comprehensive enough for most situations a developer might encounter. The Hermes catalog — expanded from 80 to 144 skills in a recent update — represents this approach taken seriously.
The second is distillation: identify the smallest set of constraints or principles that capture most of the value, encode those, and stop. The karpathy-guidelines skill represents this approach. Sixty-five lines. Four rules. No more.
Both approaches ship real value. But they optimize for different things, produce different failure modes, and imply different theories about what makes AI assistance useful. Understanding the distinction helps both when choosing skills to install and when designing skills yourself.
The Accretion Model
Skill bundles like Hermes are built on a theory of coverage. The more tasks a skill set can handle well, the more value it provides per install. A developer installs one bundle and gets coverage across a wide range of workflow scenarios. Each new skill added to the bundle is additional value at marginal cost.
This theory holds when the skills are well-designed and genuinely orthogonal. The failure mode is quality dilution: as a bundle grows, average skill quality tends to fall, because keeping 144 skills to a consistent standard takes far more effort than keeping a single 65-line skill to that standard. Skills that haven't been tested against recent model behavior linger in the catalog. The bundle's install count grows, but the value per skill shrinks.
The accretion model also has a discovery problem. A bundle of 144 skills requires a user interface (a catalog, a README, a search mechanism) to navigate. Finding the right skill for the current task requires understanding the full surface area. For a generalist skill catalog, this is appropriate — that's what a catalog is. For a behavioral skill designed to apply across all tasks, it's the wrong shape.
The Distillation Model
Karpathy's approach — visible in his public code and encoded in the karpathy-guidelines skill — is the opposite. Start from the failure modes. Identify the minimum set of constraints that address those failure modes. Encode the constraints. Stop.
The result is a skill that costs almost nothing to maintain (65 lines don't accumulate technical debt quickly), applies across all tasks (behavioral constraints aren't domain-specific), and is fully legible to any user who spends five minutes reading it (see Reading Karpathy's Guidelines: What His One Skill Reveals).
The failure mode of distillation is coverage gaps. A 65-line behavioral skill doesn't know anything about your specific domain, tools, or conventions. It constrains behavior without informing it. A developer working in a complex domain — medical software, financial systems, high-performance computing — needs domain knowledge in their context, not just behavioral constraints.
The distilled skill is best understood as a primitive: it handles one layer of the problem (default LLM behavioral failures) cleanly, and it composes with domain skills that handle other layers. Neither layer alone is complete; together they cover substantially more ground.
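One way to picture the layering is a toy model in which behavioral skills apply to every task and domain skills apply only when the task matches. The sketch below is an illustration under stated assumptions, not a description of how Claude Code actually selects skills: the `Skill` class, the `layer` labels, the tags, and the two domain skill names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Skill:
    name: str
    layer: str  # "behavioral" or "domain" (labels assumed for this sketch)
    tags: frozenset = field(default_factory=frozenset)


def skills_for_task(task_tags, installed):
    """Behavioral skills always apply; domain skills apply only on a tag match."""
    task_tags = set(task_tags)
    return [
        s.name
        for s in installed
        if s.layer == "behavioral" or (s.tags & task_tags)
    ]


installed = [
    Skill("karpathy-guidelines", "behavioral"),
    Skill("payments-ledger", "domain", frozenset({"finance"})),  # hypothetical skill
    Skill("hl7-messaging", "domain", frozenset({"medical"})),    # hypothetical skill
]

finance_task = skills_for_task({"finance"}, installed)
untagged_task = skills_for_task(set(), installed)
```

In this model a finance task picks up the behavioral layer plus the one matching domain skill, while a task with no recognized domain still gets the behavioral layer. That is the sense in which the distilled skill composes with domain skills and does not compete with them.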
The Single-Skill Choice as a Statement
The most interesting thing about the karpathy-guidelines skill is not what it contains — it's that it's a single skill at all.
The choice to publish one skill rather than building toward a broader catalog is a statement about what the author believes is most valuable to ship. Forrest Chang, who authored the skill based on Karpathy's observations, could have added skills for specific domains, specific tools, or specific workflow patterns. He didn't. He published the four behavioral constraints and stopped.
This mirrors how Karpathy himself publishes code. nanoGPT is not an attempt to cover all GPT training scenarios — it's the minimum implementation that communicates how GPT training works. micrograd doesn't cover all of automatic differentiation — it covers scalar-valued backpropagation cleanly enough to produce understanding. Each project has a specific, limited claim and fulfills it without scope creep.
The single-skill design says: this is the thing worth shipping, and everything beyond it is outside the stated scope. That's a more disciplined claim than "here are 144 skills we think are useful."
What the Contrast Reveals
Comparing the distillation and accretion models reveals a question that most skill authors don't explicitly ask: what problem am I solving?
If the problem is "developers need access to many specialized capabilities," accretion is the right model. A comprehensive catalog serves that problem. If the problem is "LLMs exhibit specific behavioral failure modes that degrade output quality," distillation is the right model. Four targeted constraints serve that problem.
The karpathy-guidelines skill is not a better version of a skill catalog. It's a different kind of artifact solving a different problem. The existence of Hermes's 144 skills doesn't make karpathy-guidelines less valuable; the existence of karpathy-guidelines doesn't make the Hermes catalog redundant.
The developers who install both — behavioral constraints from karpathy-guidelines, domain capabilities from specialized skills — are getting the most out of both models.
Designing Your Own Skills
If you're designing a skill for your own use or for publication, the distillation approach offers a useful discipline: what is the minimum specification that captures most of the value?
The test is whether you can state the skill's purpose in one sentence without loss of precision. "Reduce common LLM coding mistakes by encoding behavioral constraints derived from observed failure modes" — that's the karpathy-guidelines skill in one sentence. If you can do that for your skill, you've probably scoped it correctly. If the one-sentence description requires multiple clauses to cover all the cases, you may be in accretion territory.
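The one-sentence test can be roughed out as a small check on a skill's description. What follows is a crude heuristic, not a validated measure: the marker list, the `max_markers` threshold, and the function name `scope_check` are assumptions made for illustration, and the second description is an invented example.

```python
import re

# Hypothetical heuristic: words and punctuation that often signal an extra
# clause or an extra use case being bolted onto a description.
CLAUSE_MARKERS = re.compile(r"\b(?:and|or|as well as|plus|also)\b|[;,/]", re.IGNORECASE)


def scope_check(description: str, max_markers: int = 1) -> dict:
    """Rough one-sentence test for a skill description (illustrative only)."""
    text = description.strip()
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    markers = CLAUSE_MARKERS.findall(text)
    return {
        "sentences": len(sentences),
        "clause_markers": len(markers),
        "likely_distilled": len(sentences) == 1 and len(markers) <= max_markers,
    }


distilled = scope_check(
    "Reduce common LLM coding mistakes by encoding behavioral constraints "
    "derived from observed failure modes."
)
sprawling = scope_check(
    "Generate tests, review pull requests, and write release notes; "
    "also manages deployments."
)
```

Here `distilled["likely_distilled"]` comes out true and `sprawling["likely_distilled"]` comes out false. A description that fails the check is not necessarily a bad skill; it may simply be a catalog entry, which is the accretion mode described above.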
Both modes can be appropriate. The discipline is knowing which one you're in.
For the context-engineering angle on behavioral skills, see CLAUDE.md as Context Engineering: Why Karpathy's Practice Matters.
Part of the Karpathy on Claude Code series. Published 2026-05-23.