Tracking AI Token Usage and Costs
Monitor your Claude Code spending in real-time with token tracking dashboards, budget alerts, and usage optimization techniques.
AI tokens are the new cloud compute bill. They are invisible until the invoice arrives, they scale with usage in non-obvious ways, and a single runaway process can burn through your budget in hours. If you use Claude Code professionally, tracking your token usage is not optional -- it is as essential as monitoring your AWS bill.
This guide covers building a token tracking system from scratch: what to measure, how to collect the data, how to visualize it, and how to set alerts before costs spiral.
Key Takeaways
- A single Claude Code session with a large context window can consume $5-15 in tokens per hour, making daily tracking essential for professional use
- Input tokens (your prompts + context) usually dominate total spend because the full context is resent with every prompt, even though output tokens cost five times more per token -- optimizing context size has the highest ROI
- Tracking token usage per task, not just per day, reveals which activities are cost-efficient and which are burning budget
- Budget alerts at 50%, 80%, and 100% of your daily target prevent bill shock while still allowing productive work
- The biggest cost driver is usually context window size, not prompt count -- loading 50 files into context is expensive even for simple prompts
Understanding Token Economics
Before tracking costs, you need to understand what you are measuring.
What Is a Token?
A token is roughly 3/4 of a word in English, or about 4 characters. The sentence "Hello, how are you today?" is approximately 7 tokens. Code tends to be more token-dense than prose because of syntax characters, indentation, and naming conventions.
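The 4-characters-per-token rule of thumb gives a quick budget estimate without running a real tokenizer. A minimal sketch (the `estimateTokens` name is ours, not part of any SDK, and actual BPE tokenization varies by content):

```typescript
// Rough heuristic only: real tokenizers split differently for code,
// punctuation, and non-English text. Good enough for ballpark budgeting.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4) // ~4 characters per token
}

console.log(estimateTokens("Hello, how are you today?")) // 25 chars → ~7 tokens
```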
Pricing Model
Claude's pricing has two components:
- Input tokens: What you send to the model (your prompt + all context files)
- Output tokens: What the model generates (code, explanations, responses)
For Claude Sonnet (commonly used in Claude Code):
- Input: ~$3 per million tokens
- Output: ~$15 per million tokens
For Claude Opus:
- Input: ~$15 per million tokens
- Output: ~$75 per million tokens
Why Context Size Matters Most
Every time you send a prompt in Claude Code, the input includes your prompt plus every file currently loaded in context. If you have 50 files loaded (about 100K tokens of context), every single prompt costs the input price for those 100K tokens even if your prompt itself is only 50 tokens.
Prompt cost = (context_tokens + prompt_tokens) × input_price
Response cost = response_tokens × output_price
Total = prompt cost + response cost
This means a simple "rename this variable" prompt with 100K tokens of context costs the same as a complex "refactor this entire module" prompt with the same context. Context management is the primary cost lever.
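To make the arithmetic concrete, the formula above can be sketched as code, using the Sonnet rates from the pricing section (the function name and example token counts are ours):

```typescript
const INPUT_PRICE = 3 / 1_000_000   // $3 per 1M input tokens (Sonnet)
const OUTPUT_PRICE = 15 / 1_000_000 // $15 per 1M output tokens (Sonnet)

// Total cost of one prompt: the full context is billed as input every time.
function promptCost(
  contextTokens: number,
  promptTokens: number,
  responseTokens: number
): number {
  const input = (contextTokens + promptTokens) * INPUT_PRICE
  const output = responseTokens * OUTPUT_PRICE
  return input + output
}

// A 50-token "rename this variable" prompt with 100K tokens of context
// costs about $0.30 -- almost all of it for the context, not the prompt.
console.log(promptCost(100_000, 50, 20).toFixed(2)) // → "0.30"
```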
Building a Token Tracker
Step 1: Collect Usage Data
If you are using the Claude API directly, every response includes a usage object:
interface UsageData {
  input_tokens: number
  output_tokens: number
  cache_creation_input_tokens?: number
  cache_read_input_tokens?: number
}

// After each API call
function trackUsage(
  sessionId: string,
  task: string,
  usage: UsageData
) {
  const cost = calculateCost(usage)
  const entry = {
    timestamp: new Date().toISOString(),
    session_id: sessionId,
    task,
    input_tokens: usage.input_tokens,
    output_tokens: usage.output_tokens,
    cached_tokens: usage.cache_read_input_tokens || 0,
    cost_usd: cost,
  }
  appendToLog(entry)
}

function calculateCost(usage: UsageData): number {
  const INPUT_PRICE = 3.0 / 1_000_000 // $3 per 1M tokens (Sonnet)
  const OUTPUT_PRICE = 15.0 / 1_000_000 // $15 per 1M tokens (Sonnet)
  // Note: this treats cache reads at the full input price; cached tokens
  // are actually discounted (see the prompt caching section below).
  const inputCost = usage.input_tokens * INPUT_PRICE
  const outputCost = usage.output_tokens * OUTPUT_PRICE
  return Math.round((inputCost + outputCost) * 10000) / 10000
}
Step 2: Store Usage Data
Store usage data in a local SQLite database for fast querying without network dependencies.
CREATE TABLE token_usage (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp TEXT NOT NULL,
  session_id TEXT NOT NULL,
  task TEXT,
  input_tokens INTEGER NOT NULL,
  output_tokens INTEGER NOT NULL,
  cached_tokens INTEGER DEFAULT 0,
  cost_usd REAL NOT NULL,
  model TEXT DEFAULT 'claude-sonnet'
);

CREATE INDEX idx_timestamp ON token_usage(timestamp);
CREATE INDEX idx_session ON token_usage(session_id);
Step 3: Query and Visualize
-- Daily spending
SELECT
  DATE(timestamp) as day,
  SUM(input_tokens) as total_input,
  SUM(output_tokens) as total_output,
  ROUND(SUM(cost_usd), 2) as total_cost
FROM token_usage
WHERE timestamp > DATE('now', '-30 days')
GROUP BY day
ORDER BY day;

-- Cost per task type
SELECT
  task,
  COUNT(*) as request_count,
  SUM(input_tokens) as total_input,
  ROUND(SUM(cost_usd), 2) as total_cost,
  ROUND(AVG(cost_usd), 4) as avg_cost_per_request
FROM token_usage
WHERE timestamp > DATE('now', '-7 days')
GROUP BY task
ORDER BY total_cost DESC;

-- Most expensive sessions
SELECT
  session_id,
  COUNT(*) as requests,
  SUM(input_tokens + output_tokens) as total_tokens,
  ROUND(SUM(cost_usd), 2) as total_cost
FROM token_usage
WHERE timestamp > DATE('now', '-7 days')
GROUP BY session_id
ORDER BY total_cost DESC
LIMIT 10;
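If you keep a plain log of entries rather than SQLite, the daily rollup is just as easy in application code. A sketch mirroring the "daily spending" query above (the `Entry` shape is assumed from the tracker in Step 1):

```typescript
interface Entry {
  timestamp: string // ISO 8601, e.g. "2025-03-01T10:00:00Z"
  cost_usd: number
}

// Group entries by calendar day and sum cost per day.
function dailySpend(entries: Entry[]): Map<string, number> {
  const byDay = new Map<string, number>()
  for (const e of entries) {
    const day = e.timestamp.slice(0, 10) // "YYYY-MM-DD" prefix
    byDay.set(day, (byDay.get(day) ?? 0) + e.cost_usd)
  }
  return byDay
}
```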
Step 4: Set Budget Alerts
interface BudgetConfig {
  dailyLimit: number // USD
  weeklyLimit: number // USD
  alertThresholds: number[] // [0.5, 0.8, 1.0]
}

async function checkBudget(config: BudgetConfig): Promise<void> {
  const todaySpend = await getDailySpend()
  for (const threshold of config.alertThresholds) {
    const limit = config.dailyLimit * threshold
    if (todaySpend >= limit) {
      notify(
        `Token budget alert: ${Math.round(threshold * 100)}% ` +
        `of daily limit reached ($${todaySpend.toFixed(2)} / $${config.dailyLimit})`
      )
    }
  }
}
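As written, checkBudget re-fires every threshold it has already passed each time it runs. One way to alert only on newly crossed thresholds (a sketch; the deduplication state is our addition and should be reset at the start of each day):

```typescript
// Track which thresholds have already fired today so each alert is sent once.
const fired = new Set<number>()

function newlyCrossed(
  spend: number,
  dailyLimit: number,
  thresholds: number[]
): number[] {
  const crossed = thresholds.filter(
    (t) => spend >= dailyLimit * t && !fired.has(t)
  )
  crossed.forEach((t) => fired.add(t))
  return crossed // only these need a notification
}
```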
Cost Optimization Strategies
Strategy 1: Minimize Context Size
The single most effective optimization. Before each prompt, review what files are in context and remove any that are not relevant to the current task.
# In Claude Code, check current context
/context
# Remove files you no longer need
/remove src/components/unrelated-component.tsx
A session with 50K tokens of context costs roughly half as much per prompt in input tokens as a session with 100K tokens. The savings compound across hundreds of prompts per day.
Strategy 2: Use Prompt Caching
Claude's prompt caching reduces costs for repeated context. If you load the same files across multiple prompts (which you almost always do), cached tokens cost significantly less than fresh tokens.
Prompt caching is automatic in Claude Code for Anthropic API users. Monitor the cache_read_input_tokens field to verify caching is working.
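Cache reads are billed at a steep discount (roughly 10% of the base input rate on the Anthropic API, with a small surcharge for cache writes; verify against current pricing). A cost function that accounts for cache reads might look like this (the function name is ours):

```typescript
const INPUT_PRICE = 3 / 1_000_000        // Sonnet base input rate
const OUTPUT_PRICE = 15 / 1_000_000      // Sonnet output rate
const CACHE_READ_PRICE = 0.3 / 1_000_000 // ~10% of input rate; check current pricing

function costWithCache(
  freshInput: number,
  cachedInput: number,
  output: number
): number {
  return (
    freshInput * INPUT_PRICE +
    cachedInput * CACHE_READ_PRICE +
    output * OUTPUT_PRICE
  )
}

// A 100K-token context costs ~$0.30 of input when fresh,
// but only ~$0.03 when fully served from cache.
```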
Strategy 3: Batch Related Questions
Instead of sending five separate prompts that each include the full context:
Prompt 1: "What does function X do?"
Prompt 2: "What does function Y do?"
Prompt 3: "How do X and Y interact?"
Batch them:
"Explain functions X and Y, and describe how they interact with each other."
One prompt with the full context is cheaper than three prompts with the same context.
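The saving is easy to quantify: with C context tokens per prompt, N separate prompts pay for the context N times. A quick comparison, using the Sonnet input rate from the pricing section (function name ours; per-question prompt tokens and output cost ignored for simplicity):

```typescript
const INPUT_PRICE = 3 / 1_000_000 // Sonnet input rate

// Input cost of the context alone, paid once per prompt sent.
function contextCost(contextTokens: number, prompts: number): number {
  return contextTokens * prompts * INPUT_PRICE
}

console.log(contextCost(100_000, 3)) // three separate prompts: ~$0.90 of context
console.log(contextCost(100_000, 1)) // one batched prompt: ~$0.30
```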
Strategy 4: Use the Right Model
Not every task needs the most capable (and most expensive) model. Simple completions, formatting, and boilerplate generation work fine with Sonnet. Reserve Opus for complex reasoning, architecture decisions, and difficult debugging.
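At the rates quoted in the pricing section, the gap is 5× on both input and output, so the same request costs five times more on Opus. A small comparator (rates hardcoded from that section; verify against Anthropic's current price list):

```typescript
// Per-million-token rates from the pricing section above.
const RATES = {
  sonnet: { input: 3, output: 15 },
  opus: { input: 15, output: 75 },
} as const

function requestCost(
  model: keyof typeof RATES,
  inputTokens: number,
  outputTokens: number
): number {
  const r = RATES[model]
  return (inputTokens * r.input + outputTokens * r.output) / 1_000_000
}

// The same 50K-in / 2K-out request:
console.log(requestCost("sonnet", 50_000, 2_000)) // about $0.18
console.log(requestCost("opus", 50_000, 2_000))   // about $0.90
```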
For more on managing your AI tools cost-effectively, see our guide on self-hosting AI after limits hit.
The CLI Dashboard
Build a simple CLI command that shows your current usage status.
#!/bin/bash
# token-status.sh

echo "=== Token Usage Report ==="
echo ""

# Today's usage
sqlite3 ~/.token-usage.db "
  SELECT
    SUM(input_tokens) || ' input tokens' as input,
    SUM(output_tokens) || ' output tokens' as output,
    '\$' || ROUND(SUM(cost_usd), 2) || ' total cost' as cost
  FROM token_usage
  WHERE DATE(timestamp) = DATE('now')
"

echo ""
echo "--- This Week ---"
sqlite3 ~/.token-usage.db "
  SELECT
    '\$' || ROUND(SUM(cost_usd), 2) || ' spent this week'
  FROM token_usage
  WHERE timestamp > DATE('now', '-7 days')
"

echo ""
echo "--- Top Tasks by Cost ---"
sqlite3 ~/.token-usage.db "
  SELECT
    task || ': \$' || ROUND(SUM(cost_usd), 2)
  FROM token_usage
  WHERE DATE(timestamp) = DATE('now')
  GROUP BY task
  ORDER BY SUM(cost_usd) DESC
  LIMIT 5
"
Add this as a shell alias for quick access:
alias tokens="bash ~/.scripts/token-status.sh"
FAQ
How do I track Claude Code subscription usage vs API usage?
Claude Code Pro/Max subscriptions include a usage allowance that is not billed per-token. The tracking in this guide applies to direct API usage. For subscription usage, monitor your rate limits and the usage indicators in your Anthropic dashboard.
What is a reasonable daily budget for professional development?
Most individual developers spend $5-20 per day on Claude API tokens for coding work. Teams spending more than $50/developer/day should audit their context management -- large contexts are usually the culprit.
Does prompt caching work across sessions?
Caching has a time window (typically a few minutes). Within a session, caching is very effective. Across separate sessions separated by hours, the cache has likely expired. See the CLI commands reference for session management tips.
How do I track usage when using Claude Code through the subscription?
Claude Code with a Pro/Max subscription manages usage internally. You can monitor your remaining usage through the Anthropic console. The detailed per-request tracking in this guide is most relevant for API key users.
What is the most common cause of unexpectedly high costs?
Loading too many files into context and then running many prompts. A session with 200K tokens of context (about 100 source files) where you send 50 prompts costs roughly $30 for input tokens alone, even if your prompts are short.
Explore production-ready AI skills at aiskill.market/browse or submit your own skill to the marketplace.
Sources
- Anthropic Pricing - Current Claude model pricing
- Anthropic API Usage - Token counting and caching documentation
- SQLite Documentation - Local database for usage tracking