The Karpathy Skills Repo: A One-Skill Field Guide

When the multica-ai/andrej-karpathy-skills repository surfaced in early 2026, the framing was irresistible: Andrej Karpathy's coding philosophy, packaged as an installable Claude Code skill. Developers who expected a curated bundle of dozens of specialized skills found something more surprising — a single file, 65 lines, four rules.

That's it.

The repo publishes exactly one skill: karpathy-guidelines. No sub-skills for frontend work, no separate agent for testing, no collection of domain-specific prompts. One skill, one install, done.

For an ecosystem that trends toward maximalism — bundles of 30+ commands, multi-skill orchestration layers, workflow stacks that require a README just to understand what you've installed — the single-skill choice is almost confrontational. And understanding why it makes sense is the fastest way into Karpathy's actual philosophy.

The Setup: What the Repo Claims to Be

The repository's description is precise: "Behavioral guidelines to reduce common LLM coding mistakes, derived from Andrej Karpathy's observations on LLM coding pitfalls." The skill was authored by Forrest Chang, inspired by Karpathy's January 2026 X post on the failure modes he observes when working with LLM-generated code. Karpathy has not publicly endorsed the repo, but the source attribution is specific enough to trace.

The resulting SKILL.md contains four sections: Think Before Coding, Simplicity First, Surgical Changes, and Goal-Driven Execution. Each one is short. Each one names an exact failure pattern and the remedy. There are no examples beyond a brief planning template. There is no aspirational framing about "being a great software engineer."

The skill doesn't tell Claude to be better in some general sense. It tells Claude to stop doing four specific things that make LLM-generated code difficult to work with.

Why One Skill Is the Right Answer

The case for distillation over breadth is partly pragmatic and partly principled.

On the pragmatic side: CLAUDE.md files and loaded skills occupy the context window. Every token of behavioral guidance you add is a token of working memory unavailable to the actual task. A 65-line skill that ships four high-leverage rules costs almost nothing contextually. A 600-line skill that covers every edge case and workflow variant costs roughly nine times as much by line count, and most of those tokens address situations the current session never encounters.
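To make the cost comparison concrete, here is a back-of-the-envelope sketch. The figure of 12 tokens per line is an assumption chosen for illustration, not a measurement of the actual skill file, and real counts depend on the tokenizer and on how long each line is.

```python
# Rough context-cost estimate for a skill file.
# ASSUMPTION: ~12 tokens per line is an illustrative average, not a measured value.
TOKENS_PER_LINE = 12


def estimated_tokens(line_count: int, tokens_per_line: int = TOKENS_PER_LINE) -> int:
    """Return a rough token count for a file with `line_count` lines."""
    return line_count * tokens_per_line


small = estimated_tokens(65)   # the 65-line skill discussed in this article
large = estimated_tokens(600)  # a hypothetical 600-line skill

print(f"65-line skill:  ~{small} tokens")
print(f"600-line skill: ~{large} tokens")
print(f"ratio: ~{large / small:.1f}x")
```

Whatever per-line average you plug in, the ratio stays the same: the larger file costs about nine times as much context on every session, whether or not its extra rules ever apply.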

On the principled side: the four rules in karpathy-guidelines address failure modes that appear across nearly every kind of coding task. "Think before coding" is relevant whether you're building a REST endpoint or a data pipeline. "Simplicity first" applies whether the language is Python or TypeScript or Rust. These aren't domain-specific reminders — they're constraints on cognitive patterns that LLMs exhibit regardless of the task domain.

A skill that patches the underlying failure mode is more useful than a skill that patches its surface manifestations domain by domain.

The Rules Themselves

Each rule in the skill is worth reading on its own terms.

Think Before Coding targets the tendency to start generating implementation before clarifying intent. The instruction is to state assumptions explicitly, surface ambiguity rather than resolve it silently, and ask when confused rather than proceeding with a guess. This is not about slowing down — it's about not doing work that has to be undone because a key assumption was wrong.

Simplicity First is the most operationally specific: no unrequested features, no abstractions for single-use code, no error handling for impossible scenarios. The heuristic the skill provides is worth quoting: "If you write 200 lines and it could be 50, rewrite it." This constraint runs against LLMs' natural tendency to demonstrate thoroughness by writing more.
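As an illustration of the rule, the sketch below contrasts two answers to a hypothetical request, "sum the order totals." Both versions and all names in them are invented for this article; neither comes from the skill file.

```python
from abc import ABC, abstractmethod


# Over-built version: an abstraction layer introduced for a single use.
class Aggregator(ABC):
    @abstractmethod
    def aggregate(self, values: list[float]) -> float: ...


class SumAggregator(Aggregator):
    def aggregate(self, values: list[float]) -> float:
        if values is None:  # guards a case the caller never produces
            raise ValueError("values must not be None")
        total = 0.0
        for value in values:
            total += value
        return total


def order_total_overbuilt(totals: list[float]) -> float:
    return SumAggregator().aggregate(totals)


# Simple version: does exactly what was asked and nothing more.
def order_total(totals: list[float]) -> float:
    return sum(totals)


# Both produce the same result for the same input.
print(order_total([20, 5, 17]) == order_total_overbuilt([20, 5, 17]))
```

The over-built version is not wrong, which is what makes the pattern easy to miss. It simply carries a class hierarchy and a defensive check that nothing in the request called for.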

Surgical Changes addresses a subtler problem: the temptation to improve adjacent code, fix unrelated issues, or update formatting while implementing a specific request. The principle is that every changed line should trace directly to what was asked. Nothing else. If you notice dead code unrelated to your task, mention it — don't delete it.
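One hypothetical way to make "every changed line should trace to what was asked" checkable is to compare an edit against the line range the request covered. The sketch below uses Python's standard `difflib`; the helper names `changed_lines` and `is_surgical` and the sample source are assumptions for illustration, not part of the skill.

```python
import difflib


def changed_lines(before: str, after: str) -> set[int]:
    """Return 1-based line numbers of `before` that an edit touched."""
    a, b = before.splitlines(), after.splitlines()
    touched: set[int] = set()
    matcher = difflib.SequenceMatcher(None, a, b)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Pure insertions (i1 == i2) are attributed to the line they precede.
        touched.update(range(i1 + 1, max(i2, i1 + 1) + 1))
    return touched


def is_surgical(before: str, after: str, allowed: range) -> bool:
    """True if every touched line falls inside the requested range."""
    return changed_lines(before, after) <= set(allowed)


before = """def area(w, h):
    return w + h

def perimeter(w,h):
    return 2*(w+h)
"""

# Request: fix the bug in area(), which spans lines 1-2.
surgical = before.replace("w + h", "w * h")
# Same fix, plus an unrequested formatting tweak to perimeter().
drive_by = surgical.replace("perimeter(w,h)", "perimeter(w, h)")

print(is_surgical(before, surgical, range(1, 3)))
print(is_surgical(before, drive_by, range(1, 3)))
```

The first edit stays inside the requested range; the second also touches line 4, which is the kind of adjacent cleanup the rule says to mention rather than perform.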

Goal-Driven Execution converts vague instructions into verifiable criteria before execution. "Fix the bug" becomes "write a test that reproduces it, then make it pass." The skill provides a planning template for multi-step tasks that makes verification explicit at each step.
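The "reproduce it, then make it pass" pattern can be sketched as follows. The bug, the functions `average_before` and `average_after`, and the requirement that an empty list yields 0.0 are all hypothetical, chosen only to show the shape of the workflow.

```python
def average_before(values: list[float]) -> float:
    """Hypothetical buggy version: crashes on an empty list."""
    return sum(values) / len(values)


def average_after(values: list[float]) -> float:
    """Hypothetical fixed version: an empty list yields 0.0 (assumed requirement)."""
    if not values:
        return 0.0
    return sum(values) / len(values)


def check_empty_input(average) -> bool:
    """Success criterion: an empty list yields 0.0 instead of crashing."""
    try:
        return average([]) == 0.0
    except ZeroDivisionError:
        return False


# Step 1: the check reproduces the bug against the old code.
assert check_empty_input(average_before) is False
# Step 2: the same check passes against the fix.
assert check_empty_input(average_after) is True
# Step 3: existing behavior is unchanged.
assert average_after([2, 4, 6]) == 4.0
```

The point of writing the check first is that "fixed" stops being a judgment call: the same criterion that demonstrated the bug is the one that confirms the fix.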

The Implicit Argument

By publishing one skill rather than many, the repo makes an argument about what's actually wrong with LLM-generated code. The problem isn't domain knowledge — Claude knows plenty of Python, TypeScript, and SQL. The problem is behavioral: models tend toward elaboration over minimalism, guessing over clarifying, and touching more than they should.

The solution isn't more domain-specific guidance. It's four constraints that redirect those tendencies regardless of domain.

This is a sharply different theory from the approach most skill collections take. Most skill bundles assume the bottleneck is knowledge ("if Claude just knew more about X"). The karpathy-guidelines skill assumes the bottleneck is discipline ("if Claude just did less of Y").

Both theories have merit. But the discipline theory is harder to monetize, less visible in demos, and arguably more useful in daily work. It makes sense that it shows up as a single quiet file rather than a flashy bundle.

What This Means for Skill Design

If you author skills for Claude Code, the single-skill design pattern is worth understanding as a model. It doesn't argue against specialized skills; there are good reasons to have focused skills for specific domains or tools. But it does argue against bundling behavioral guidelines into those domain skills, which dilutes both.

The karpathy-guidelines skill is composable precisely because it's abstract. Install it alongside any domain skill and it shapes behavior across all of them. Install a bundle that includes behavioral guidelines baked into domain-specific instructions and you lose that composability — the behavioral layer can't be upgraded or removed independently.

Clean separation of concerns, in skill design, looks a lot like karpathy-guidelines.

Install and Move On

The install is one line. The skill runs silently. You don't have to invoke it; it shapes Claude's behavior from session start by virtue of being loaded. If you're already working with Claude Code on any coding task, you'll notice the difference mostly in what Claude doesn't do: it stops gold-plating, it asks before guessing, and it leaves alone code it wasn't asked to touch.

That's the pitch. Four rules. No ceremony.

Browse the full karpathy-guidelines skill and install it from the skill page. For the broader context on why these particular rules matter, see Reading Karpathy's Guidelines: What His One Skill Reveals.

Part of the Karpathy on Claude Code series. Published 2026-05-23.
