Tracking AI Token Usage and Costs
Monitor your Claude Code spending in real-time with token tracking dashboards, budget alerts, and usage optimization techniques.
AI tokens are the new cloud compute bill. They are invisible until the invoice arrives, they scale with usage in non-obvious ways, and a single runaway process can burn through your budget in hours. If you use Claude Code professionally, tracking your token usage is not optional -- it is as essential as monitoring your AWS bill.
This guide covers building a token tracking system from scratch: what to measure, how to collect the data, how to visualize it, and how to set alerts before costs spiral.
Key Takeaways
- A single Claude Code session with a large context window can consume $5-15 in tokens per hour, making daily tracking essential for professional use
- Input tokens (your prompts + context) usually dominate total spend because the full context is resent with every prompt, even though output tokens cost five times more per token -- optimizing context size has the highest ROI
- Tracking token usage per task, not just per day, reveals which activities are cost-efficient and which are burning budget
- Budget alerts at 50%, 80%, and 100% of your daily target prevent bill shock while still allowing productive work
- The biggest cost driver is usually context window size, not prompt count -- loading 50 files into context is expensive even for simple prompts
Understanding Token Economics
Before tracking costs, you need to understand what you are measuring.
What Is a Token?
A token is roughly 3/4 of a word in English, or about 4 characters. The sentence "Hello, how are you today?" is approximately 7 tokens. Code tends to be more token-dense than prose because of syntax characters, indentation, and naming conventions.
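The 4-characters-per-token rule of thumb gives a quick budget estimate without running a real tokenizer. A minimal sketch (the `estimateTokens` name is ours, not part of any SDK, and actual BPE tokenization varies by content):

```typescript
// Rough heuristic only: real tokenizers split differently for code,
// punctuation, and non-English text. Good enough for ballpark budgeting.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4) // ~4 characters per token
}

console.log(estimateTokens("Hello, how are you today?")) // 25 chars → ~7 tokens
```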
Pricing Model
Claude's pricing has two components:
- Input tokens: What you send to the model (your prompt + all context files)
- Output tokens: What the model generates (code, explanations, responses)
For Claude Sonnet (commonly used in Claude Code):
- Input: ~$3 per million tokens
- Output: ~$15 per million tokens
For Claude Opus:
- Input: ~$15 per million tokens
- Output: ~$75 per million tokens
Why Context Size Matters Most
Every time you send a prompt in Claude Code, the input includes your prompt plus every file currently loaded in context. If you have 50 files loaded (about 100K tokens of context), every single prompt costs the input price for those 100K tokens even if your prompt itself is only 50 tokens.
Prompt cost = (context_tokens + prompt_tokens) × input_price
Response cost = response_tokens × output_price
Total = prompt cost + response cost
This means a simple "rename this variable" prompt with 100K tokens of context costs the same as a complex "refactor this entire module" prompt with the same context. Context management is the primary cost lever.
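To make the arithmetic concrete, the formula above can be sketched as code, using the Sonnet rates from the pricing section (the function name and example token counts are ours):

```typescript
const INPUT_PRICE = 3 / 1_000_000   // $3 per 1M input tokens (Sonnet)
const OUTPUT_PRICE = 15 / 1_000_000 // $15 per 1M output tokens (Sonnet)

// Total cost of one prompt: the full context is billed as input every time.
function promptCost(
  contextTokens: number,
  promptTokens: number,
  responseTokens: number
): number {
  const input = (contextTokens + promptTokens) * INPUT_PRICE
  const output = responseTokens * OUTPUT_PRICE
  return input + output
}

// A 50-token "rename this variable" prompt with 100K tokens of context
// costs about $0.30 -- almost all of it for the context, not the prompt.
console.log(promptCost(100_000, 50, 20).toFixed(2)) // → "0.30"
```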
Building a Token Tracker
Step 1: Collect Usage Data
If you are using the Claude API directly, every response includes a usage object:
interface UsageData {
  input_tokens: number
  output_tokens: number
  cache_creation_input_tokens?: number
  cache_read_input_tokens?: number
}

// After each API call
function trackUsage(
  sessionId: string,
  task: string,
  usage: UsageData
) {
  const cost = calculateCost(usage)
  const entry = {
    timestamp: new Date().toISOString(),
    session_id: sessionId,
    task,
    input_tokens: usage.input_tokens,
    output_tokens: usage.output_tokens,
    cached_tokens: usage.cache_read_input_tokens || 0,
    cost_usd: cost,
  }
  appendToLog(entry)
}

function calculateCost(usage: UsageData): number {
  const INPUT_PRICE = 3.0 / 1_000_000 // $3 per 1M tokens (Sonnet)
  const OUTPUT_PRICE = 15.0 / 1_000_000 // $15 per 1M tokens (Sonnet)
  // Note: this treats cache reads at the full input price; cached tokens
  // are actually discounted (see the prompt caching section below).
  const inputCost = usage.input_tokens * INPUT_PRICE
  const outputCost = usage.output_tokens * OUTPUT_PRICE
  return Math.round((inputCost + outputCost) * 10000) / 10000
}
Step 2: Store Usage Data
Store usage data in a local SQLite database for fast querying without network dependencies.
CREATE TABLE token_usage (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp TEXT NOT NULL,
  session_id TEXT NOT NULL,
  task TEXT,
  input_tokens INTEGER NOT NULL,
  output_tokens INTEGER NOT NULL,
  cached_tokens INTEGER DEFAULT 0,
  cost_usd REAL NOT NULL,
  model TEXT DEFAULT 'claude-sonnet'
);

CREATE INDEX idx_timestamp ON token_usage(timestamp);
CREATE INDEX idx_session ON token_usage(session_id);
Step 3: Query and Visualize
-- Daily spending
SELECT
  DATE(timestamp) as day,
  SUM(input_tokens) as total_input,
  SUM(output_tokens) as total_output,
  ROUND(SUM(cost_usd), 2) as total_cost
FROM token_usage
WHERE timestamp > DATE('now', '-30 days')
GROUP BY day
ORDER BY day;

-- Cost per task type
SELECT
  task,
  COUNT(*) as request_count,
  SUM(input_tokens) as total_input,
  ROUND(SUM(cost_usd), 2) as total_cost,
  ROUND(AVG(cost_usd), 4) as avg_cost_per_request
FROM token_usage
WHERE timestamp > DATE('now', '-7 days')
GROUP BY task
ORDER BY total_cost DESC;

-- Most expensive sessions
SELECT
  session_id,
  COUNT(*) as requests,
  SUM(input_tokens + output_tokens) as total_tokens,
  ROUND(SUM(cost_usd), 2) as total_cost
FROM token_usage
WHERE timestamp > DATE('now', '-7 days')
GROUP BY session_id
ORDER BY total_cost DESC
LIMIT 10;
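If you keep a plain log of entries rather than SQLite, the daily rollup is just as easy in application code. A sketch mirroring the "daily spending" query above (the `Entry` shape is assumed from the tracker in Step 1):

```typescript
interface Entry {
  timestamp: string // ISO 8601, e.g. "2025-03-01T10:00:00Z"
  cost_usd: number
}

// Group entries by calendar day and sum cost per day.
function dailySpend(entries: Entry[]): Map<string, number> {
  const byDay = new Map<string, number>()
  for (const e of entries) {
    const day = e.timestamp.slice(0, 10) // "YYYY-MM-DD" prefix
    byDay.set(day, (byDay.get(day) ?? 0) + e.cost_usd)
  }
  return byDay
}
```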
Step 4: Set Budget Alerts
interface BudgetConfig {
  dailyLimit: number // USD
  weeklyLimit: number // USD
  alertThresholds: number[] // [0.5, 0.8, 1.0]
}

async function checkBudget(config: BudgetConfig): Promise<void> {
  const todaySpend = await getDailySpend()
  for (const threshold of config.alertThresholds) {
    const limit = config.dailyLimit * threshold
    if (todaySpend >= limit) {
      notify(
        `Token budget alert: ${Math.round(threshold * 100)}% ` +
        `of daily limit reached ($${todaySpend.toFixed(2)} / $${config.dailyLimit})`
      )
    }
  }
}
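As written, checkBudget re-fires every threshold it has already passed each time it runs. One way to alert only on newly crossed thresholds (a sketch; the deduplication state is our addition and should be reset at the start of each day):

```typescript
// Track which thresholds have already fired today so each alert is sent once.
const fired = new Set<number>()

function newlyCrossed(
  spend: number,
  dailyLimit: number,
  thresholds: number[]
): number[] {
  const crossed = thresholds.filter(
    (t) => spend >= dailyLimit * t && !fired.has(t)
  )
  crossed.forEach((t) => fired.add(t))
  return crossed // only these need a notification
}
```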
Cost Optimization Strategies
Strategy 1: Minimize Context Size
The single most effective optimization. Before each prompt, review what files are in context and remove any that are not relevant to the current task.
# In Claude Code, check current context
/context
# Remove files you no longer need
/remove src/components/unrelated-component.tsx
A session with 50K tokens of context costs roughly half as much per prompt in input tokens as a session with 100K tokens. The savings compound across hundreds of prompts per day.
Strategy 2: Use Prompt Caching
Claude's prompt caching reduces costs for repeated context. If you load the same files across multiple prompts (which you almost always do), cached tokens cost significantly less than fresh tokens.
Prompt caching is automatic in Claude Code for Anthropic API users. Monitor the cache_read_input_tokens field to verify caching is working.
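Cache reads are billed at a steep discount (roughly 10% of the base input rate on the Anthropic API, with a small surcharge for cache writes; verify against current pricing). A cost function that accounts for cache reads might look like this (the function name is ours):

```typescript
const INPUT_PRICE = 3 / 1_000_000        // Sonnet base input rate
const OUTPUT_PRICE = 15 / 1_000_000      // Sonnet output rate
const CACHE_READ_PRICE = 0.3 / 1_000_000 // ~10% of input rate; check current pricing

function costWithCache(
  freshInput: number,
  cachedInput: number,
  output: number
): number {
  return (
    freshInput * INPUT_PRICE +
    cachedInput * CACHE_READ_PRICE +
    output * OUTPUT_PRICE
  )
}

// A 100K-token context costs ~$0.30 of input when fresh,
// but only ~$0.03 when fully served from cache.
```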
Strategy 3: Batch Related Questions
Instead of sending five separate prompts that each include the full context:
Prompt 1: "What does function X do?"
Prompt 2: "What does function Y do?"
Prompt 3: "How do X and Y interact?"
Batch them:
"Explain functions X and Y, and describe how they interact with each other."
One prompt with the full context is cheaper than three prompts with the same context.
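The saving is easy to quantify: with C context tokens per prompt, N separate prompts pay for the context N times. A quick comparison, using the Sonnet input rate from the pricing section (function name ours; per-question prompt tokens and output cost ignored for simplicity):

```typescript
const INPUT_PRICE = 3 / 1_000_000 // Sonnet input rate

// Input cost of the context alone, paid once per prompt sent.
function contextCost(contextTokens: number, prompts: number): number {
  return contextTokens * prompts * INPUT_PRICE
}

console.log(contextCost(100_000, 3)) // three separate prompts: ~$0.90 of context
console.log(contextCost(100_000, 1)) // one batched prompt: ~$0.30
```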
Strategy 4: Use the Right Model
Not every task needs the most capable (and most expensive) model. Simple completions, formatting, and boilerplate generation work fine with Sonnet. Reserve Opus for complex reasoning, architecture decisions, and difficult debugging.
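At the rates quoted in the pricing section, the gap is 5× on both input and output, so the same request costs five times more on Opus. A small comparator (rates hardcoded from that section; verify against Anthropic's current price list):

```typescript
// Per-million-token rates from the pricing section above.
const RATES = {
  sonnet: { input: 3, output: 15 },
  opus: { input: 15, output: 75 },
} as const

function requestCost(
  model: keyof typeof RATES,
  inputTokens: number,
  outputTokens: number
): number {
  const r = RATES[model]
  return (inputTokens * r.input + outputTokens * r.output) / 1_000_000
}

// The same 50K-in / 2K-out request:
console.log(requestCost("sonnet", 50_000, 2_000)) // about $0.18
console.log(requestCost("opus", 50_000, 2_000))   // about $0.90
```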
For more on managing your AI tools cost-effectively, see our guide on self-hosting AI after limits hit.
The CLI Dashboard
Build a simple CLI command that shows your current usage status.
#!/bin/bash
# token-status.sh

echo "=== Token Usage Report ==="
echo ""

# Today's usage
sqlite3 ~/.token-usage.db "
  SELECT
    SUM(input_tokens) || ' input tokens' as input,
    SUM(output_tokens) || ' output tokens' as output,
    '\$' || ROUND(SUM(cost_usd), 2) || ' total cost' as cost
  FROM token_usage
  WHERE DATE(timestamp) = DATE('now')
"

echo ""
echo "--- This Week ---"
sqlite3 ~/.token-usage.db "
  SELECT
    '\$' || ROUND(SUM(cost_usd), 2) || ' spent this week'
  FROM token_usage
  WHERE timestamp > DATE('now', '-7 days')
"

echo ""
echo "--- Top Tasks by Cost ---"
sqlite3 ~/.token-usage.db "
  SELECT
    task || ': \$' || ROUND(SUM(cost_usd), 2)
  FROM token_usage
  WHERE DATE(timestamp) = DATE('now')
  GROUP BY task
  ORDER BY SUM(cost_usd) DESC
  LIMIT 5
"
Add this as a shell alias for quick access:
alias tokens="bash ~/.scripts/token-status.sh"
FAQ
How do I track Claude Code subscription usage vs API usage?
Claude Code Pro/Max subscriptions include a usage allowance that is not billed per-token. The tracking in this guide applies to direct API usage. For subscription usage, monitor your rate limits and the usage indicators in your Anthropic dashboard.
What is a reasonable daily budget for professional development?
Most individual developers spend $5-20 per day on Claude API tokens for coding work. Teams spending more than $50/developer/day should audit their context management -- large contexts are usually the culprit.
Does prompt caching work across sessions?
Caching has a time window (typically a few minutes). Within a session, caching is very effective. Across separate sessions separated by hours, the cache has likely expired. See the CLI commands reference for session management tips.
How do I track usage when using Claude Code through the subscription?
Claude Code with a Pro/Max subscription manages usage internally. You can monitor your remaining usage through the Anthropic console. The detailed per-request tracking in this guide is most relevant for API key users.
What is the most common cause of unexpectedly high costs?
Loading too many files into context and then running many prompts. A session with 200K tokens of context (about 100 source files) where you send 50 prompts costs roughly $30 for input tokens alone, even if your prompts are short.
Explore production-ready AI skills at aiskill.market/browse or submit your own skill to the marketplace.
Sources
- Anthropic Pricing - Current Claude model pricing
- Anthropic API Usage - Token counting and caching documentation
- SQLite Documentation - Local database for usage tracking