CLI-First Architecture for AI Tools
Why CLI-first beats MCP-only for AI tool design. Build tools that work everywhere, not just inside a specific AI framework.
The MCP (Model Context Protocol) ecosystem is growing rapidly. Every week brings new MCP servers that let AI agents interact with databases, APIs, file systems, and external services. But a pattern is emerging that the community has not discussed enough: the most reliable MCP servers started as CLI tools.
The tools that work most reliably with AI agents are the ones that also work without them. They have a command-line interface that a human can use directly. The MCP layer is an adapter on top of that CLI, not the primary interface. This is not an accident. It is a design principle worth adopting deliberately.
Key Takeaways
- CLI-first tools are testable without AI, which means you can verify correctness independently of the model's behavior
- The command-line interface acts as a contract that forces you to define clear inputs, outputs, and error states before wrapping it for AI consumption
- CLI tools compose with standard Unix tools (pipes, redirects, scripts), giving you a multiplier that MCP-only tools lack
- Debugging is simpler because you can reproduce any AI tool call by running the equivalent CLI command manually
- Migration between AI frameworks becomes trivial when your core tool logic is framework-independent
The Problem With MCP-Only Design
When you design a tool exclusively as an MCP server, you make several implicit decisions that limit its usefulness.
You couple your logic to a protocol. MCP defines how tools describe themselves and how results are returned. If you write your business logic inside MCP handler functions, that logic is locked into MCP. When a new protocol emerges (and it will), you need to rewrite.
You cannot test without a model. To verify that your MCP tool works correctly, you need to invoke it through an AI agent. This means your test cycle includes model inference, which is slow, expensive, and non-deterministic. A CLI tool can be tested with a shell script.
You lose composability. Unix tools compose through pipes. ls | grep '\.ts$' | wc -l counts TypeScript files by combining three simple tools. MCP tools do not compose this way. Each tool is an island, invocable only through the AI agent.
You lose discoverability. A CLI tool with a --help flag is self-documenting. An MCP tool is discoverable only through the AI agent or by reading the MCP server configuration. For a deeper dive into MCP itself, see our MCP guide.
The CLI-First Pattern
CLI-first design inverts the dependency. Instead of building an MCP server and hoping to add a CLI later, you build a CLI tool and add MCP as a thin adapter.
┌──────────────┐     ┌──────────────┐
│  CLI Binary  │     │  MCP Server  │
│  (primary)   │     │  (adapter)   │
└──────┬───────┘     └──────┬───────┘
       │                    │
       └──────────┬─────────┘
                  │
          ┌───────┴───────┐
          │ Core Library  │
          │  (business    │
          │   logic)      │
          └───────────────┘
The core library contains all the business logic. The CLI binary provides the command-line interface. The MCP server wraps the same library with MCP protocol handling. Both interfaces call the same functions.
Example: A Database Query Tool
Let's say you are building a tool that lets AI agents query a database.
Step 1: Build the core library.
// lib/query.ts
export interface QueryResult {
  rows: Record<string, unknown>[]
  rowCount: number
  duration: number
}

export async function executeQuery(
  connectionString: string,
  query: string,
  params?: unknown[]
): Promise<QueryResult> {
  const start = Date.now()
  // `getConnection` stands in for your database driver's setup
  const db = await getConnection(connectionString)
  const result = await db.query(query, params)
  return {
    rows: result.rows,
    rowCount: result.rowCount,
    duration: Date.now() - start,
  }
}
Step 2: Build the CLI.
// cli/index.ts
import { parseArgs } from 'node:util'
import { executeQuery } from '../lib/query'

const { values } = parseArgs({
  options: {
    connection: { type: 'string' },
    query: { type: 'string' },
    params: { type: 'string', multiple: true },
    format: { type: 'string', default: 'table' },
  },
})

const result = await executeQuery(values.connection!, values.query!, values.params)

if (values.format === 'json') {
  console.log(JSON.stringify(result, null, 2))
} else {
  console.table(result.rows)
}
Step 3: Add the MCP adapter.
// mcp/server.ts
// `server` is an MCP server instance created with your MCP SDK of choice
import { executeQuery } from '../lib/query'

server.tool('query_database', {
  description: 'Execute a read-only SQL query',
  parameters: {
    query: { type: 'string', description: 'SQL query to execute' },
  },
  handler: async ({ query }) => {
    if (!process.env.DATABASE_URL) {
      throw new Error('DATABASE_URL is not set')
    }
    const result = await executeQuery(process.env.DATABASE_URL, query)
    return {
      content: [{
        type: 'text',
        text: JSON.stringify(result, null, 2),
      }],
    }
  },
})
The CLI and MCP server share the same executeQuery function. Any bug fix or feature addition in the core library benefits both interfaces automatically.
Why This Matters for Testing
Testing is the strongest argument for CLI-first design. Consider the testing approaches for each architecture.
MCP-Only Testing
1. Start MCP server
2. Connect AI agent to server
3. Send natural language prompt that should trigger the tool
4. Parse AI response for tool call
5. Verify tool was called with correct parameters
6. Verify result was processed correctly
This test has multiple failure points: the model might not understand the prompt, might call the wrong tool, or might pass incorrect parameters. Any failure could be a bug in your tool or a quirk of the model. Distinguishing between the two is difficult.
CLI-First Testing
# Test the CLI directly
./query-tool --connection "$DB_URL" --query "SELECT count(*) FROM users" --format json
# Assert on the output (skip the nondeterministic duration field)
./query-tool --connection "$DB_URL" --query "SELECT count(*) FROM users" --format json | \
  jq -e '.rowCount == 1 and .rows[0].count == 42'
This test is deterministic, fast, and tests only your code. No model inference, no protocol overhead, no ambiguity about failure causes.
You still need integration tests that verify the MCP adapter works, but those tests are thin -- they just verify that the MCP handler correctly calls the core library function and formats the response.
Composability in Practice
CLI tools compose with standard Unix tools, creating capabilities that MCP-only tools cannot match.
# Count errors in the last hour
./log-tool --since 1h --level error | wc -l
# Find the most common error types
./log-tool --since 1d --level error --format json | jq '.type' | sort | uniq -c | sort -rn
# Pipe results into another tool
./query-tool --query "SELECT * FROM users WHERE active = false" --format csv | \
./email-tool --template deactivation --csv-input
# Use in a shell script
for table in users orders products; do
./query-tool --query "SELECT count(*) FROM $table" --format text
done
None of these compositions require AI. They work in shell scripts, in CI pipelines, in cron jobs. The AI agent is one consumer of the tool, not the only consumer.
Designing CLI Interfaces for AI Consumption
When you know your CLI tool will also be used through an MCP adapter, certain design choices make the adaptation smoother.
Use Structured Output
Always support JSON output. AI agents parse JSON reliably. They struggle with formatted table output.
# Human-friendly default
./tool --query "..."
# Output: formatted table
# Machine-friendly flag
./tool --query "..." --format json
# Output: {"rows": [...], "count": 42}
Make All Parameters Explicit
Avoid interactive prompts, confirmations, or menus. Every input should be passable as a flag or argument. AI agents cannot interact with interactive prompts.
# Bad: requires interactive confirmation
./deploy-tool # "Are you sure? (y/n)"
# Good: explicit confirmation flag
./deploy-tool --confirm
Return Meaningful Exit Codes
Exit code 0 for success, non-zero for failure. MCP adapters can use exit codes to determine whether to report success or error to the agent.
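One way an adapter might do that mapping is to spawn the CLI and branch on the exit status. A sketch, using node itself as a stand-in for the real CLI binary:

```typescript
import { spawnSync } from 'node:child_process'

// Map a CLI exit code to a success/error result for the agent.
function runCliTool(command: string, args: string[]) {
  const proc = spawnSync(command, args, { encoding: 'utf8' })
  if (proc.status === 0) {
    return { isError: false, output: proc.stdout }
  }
  // Non-zero exit: surface stderr to the agent as the error payload
  return { isError: true, output: proc.stderr }
}

// Simulate a succeeding and a failing tool invocation
const ok = runCliTool('node', ['-e', 'console.log("done")'])
const fail = runCliTool('node', ['-e', 'console.error("boom"); process.exit(1)'])

console.log(ok.isError, fail.isError) // prints: false true
```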
Include Machine-Readable Errors
Error messages should be parseable, not just human-readable.
# Bad
echo "Something went wrong"
exit 1
# Good
echo '{"error": "connection_timeout", "host": "db.example.com", "timeout_ms": 5000}' >&2
exit 1
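On the consuming side, a caller (an MCP adapter, or any script) can try to parse the structured error and fall back to wrapping the raw text. A sketch, with hypothetical names:

```typescript
// Shape of the structured errors the tool emits on stderr
interface ToolError {
  error: string
  [detail: string]: unknown
}

// Parse a machine-readable error, falling back to a generic wrapper when
// the tool emitted a plain message instead of JSON
function parseToolError(stderr: string): ToolError {
  try {
    const parsed = JSON.parse(stderr)
    if (typeof parsed.error === 'string') return parsed
  } catch {
    // Not JSON: fall through to the generic wrapper
  }
  return { error: 'unknown_error', message: stderr.trim() }
}

const structured = parseToolError(
  '{"error": "connection_timeout", "host": "db.example.com", "timeout_ms": 5000}'
)
const plain = parseToolError('Something went wrong')

console.log(structured.error) // prints: connection_timeout
console.log(plain.error)      // prints: unknown_error
```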
Migration and Future-Proofing
The AI tooling ecosystem is young and protocols will change. When they do, CLI-first tools adapt easily.
If MCP is superseded by a new protocol, you write a new adapter. Your core logic and CLI interface remain unchanged. If you want to support multiple AI frameworks simultaneously, each framework gets its own thin adapter layer.
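The adapters stay small because each one only re-wraps the core result in its protocol's envelope. A sketch with a deliberately invented second protocol (both envelope shapes are illustrative, not from any real specification):

```typescript
// One core function, two protocol-specific adapters.
function coreSummarize(text: string): { summary: string; length: number } {
  const summary = text.slice(0, 20) // stand-in for real logic
  return { summary, length: text.length }
}

// MCP-style adapter: wraps the result in a content array
function mcpAdapter(text: string) {
  const result = coreSummarize(text)
  return { content: [{ type: 'text', text: JSON.stringify(result) }] }
}

// Hypothetical future-protocol adapter: different envelope, same core
function otherAdapter(text: string) {
  const result = coreSummarize(text)
  return { status: 'ok', payload: result }
}

console.log(mcpAdapter('hello world').content[0].type) // prints: text
console.log(otherAdapter('hello world').status)        // prints: ok
```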
This is the same principle that drives hexagonal architecture in traditional software: keep your business logic independent of your delivery mechanisms. The CLI and MCP server are both delivery mechanisms. The business logic belongs in the core library.
For practical examples of building tools that work with Claude Code, see our tutorial on creating custom skills.
FAQ
Does CLI-first design add development time?
Marginally. You are building two interfaces (CLI and MCP) instead of one, but the core library is shared. The CLI adds about 20% more code. The testing improvements save more time than this costs.
What about tools that need real-time interaction?
Some tools (like a database explorer that maintains a connection) benefit from long-lived processes. These work fine as CLI tools with session management. The MCP adapter maintains the session on behalf of the agent.
Is this just Unix philosophy applied to AI?
Yes. Small, composable tools with clear interfaces and text-based communication. The Unix philosophy is 50 years old and still the best architecture for tool design.
Should I build the CLI or the MCP adapter first?
Build the core library first, then the CLI, then the MCP adapter. This order ensures your business logic is clean and independent before you add any interface layer.
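That order maps naturally onto a repository layout like the one implied by the example earlier (directory names are illustrative):

```
my-tool/
├── lib/
│   └── query.ts      # core library: all business logic
├── cli/
│   └── index.ts      # CLI binary: argument parsing + output formatting
└── mcp/
    └── server.ts     # MCP adapter: protocol handling only
```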
What about tools that only make sense in an AI context?
Some tools -- like a "summarize this conversation" tool -- only make sense when used by an AI agent. Even these benefit from a CLI interface for testing and debugging. You might never use the CLI in production, but you will use it during development.
Explore production-ready AI skills at aiskill.market/browse or submit your own skill to the marketplace.
Sources
- Command Line Interface Guidelines - Modern CLI design principles
- Model Context Protocol Specification - MCP standard documentation
- The Art of Unix Programming - Classic Unix design philosophy
- 12 Factor CLI Apps - Best practices for CLI application design