Level 3 Agents: Tool Calling Mastery
Build AI agents that select and invoke external tools. Learn tool definition, parameter extraction, result handling, and error recovery patterns.
Level 2 router agents decide between predefined paths. But what if your agent needs to interact with external systems? Query a database? Call an API? Execute a shell command?
Level 3 agents gain the ability to select and invoke tools. The LLM does not just choose a path; it decides which tool to use, extracts the correct parameters from context, and interprets the results. This transforms agents from decision-makers into actors who can affect the world.
This guide covers tool-calling agents comprehensively. You will learn how to define tools, enable intelligent tool selection, handle results and errors, and build reliable tool-using systems.
Understanding Tool Calling
What Makes a Tool-Calling Agent?
A tool-calling agent has access to external capabilities and the intelligence to use them appropriately. Three components distinguish these agents:
**Tool Selection:** The LLM examines the current task and selects which tool (or tools) would help accomplish it. This requires understanding what each tool does and when it applies.
**Parameter Extraction:** Once a tool is selected, the LLM must provide the correct arguments. This means extracting relevant values from the conversation context and formatting them appropriately.
**Result Interpretation:** Tools return data. The LLM must understand what the result means, whether the operation succeeded, and what to do next based on the outcome.
The Tool Calling Loop
Tool-calling agents follow an extended version of the agent loop:
Observe → Think → Select Tool → Extract Parameters → Call Tool →
Interpret Result → Think → (Select Another Tool or Complete)
This loop can repeat multiple times as the agent uses tools to gather information and take actions.
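A minimal sketch of this loop in Python may help make it concrete. The `llm` client and the `tools` registry here are hypothetical stand-ins, not a specific vendor API:

```python
# Minimal sketch of the tool-calling loop. The `llm` client and the `tools`
# registry (name -> callable) are hypothetical stand-ins, not a vendor API.
def run_agent(task: str, llm, tools: dict, max_steps: int = 10) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        decision = llm.next_step(messages)            # Think
        if decision.kind == "final_answer":
            return decision.content                   # Complete
        tool = tools[decision.tool_name]              # Select tool
        result = tool(**decision.arguments)           # Call tool with extracted parameters
        messages.append({                             # Feed the result back for interpretation
            "role": "tool",
            "name": decision.tool_name,
            "content": str(result),
        })
    return "Stopped: step limit reached before the task completed."
```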
Tools vs. Routes
The distinction matters:
| Aspect | Routes (Level 2) | Tools (Level 3) |
|---|---|---|
| Selection | Choose one path | Choose tool(s) |
| Parameters | None | LLM must provide |
| Execution | Workflow continues | External system called |
| Results | Path-dependent | Tool returns data |
| Chaining | Sequential paths | Multiple tools per step |
Level 3 agents often incorporate routing, using it to decide high-level strategy while tools handle specific actions.
Defining Tools for Agents
Tool Definition Structure
Tools need clear specifications that help the LLM understand when and how to use them:
## Available Tools
### Tool: search_database
**Purpose:** Query the customer database for information
**Parameters:**
- `query` (string, required): The search query
- `table` (string, required): Which table to search (customers, orders, products)
- `limit` (integer, optional): Maximum results to return (default: 10)
**Returns:** Array of matching records with id, name, and relevant fields
**Use when:**
- User asks about specific customer information
- Need to look up order history
- Searching for product details
**Do not use when:**
- Information is already in context
- User is asking a general question not requiring data lookup
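The same specification can also be expressed as a machine-readable schema. Here is a sketch in Python using a JSON-Schema-style parameter block, which most tool-calling APIs accept; the exact field names are illustrative rather than tied to a specific vendor:

```python
# Illustrative schema for the search_database tool described above.
search_database_tool = {
    "name": "search_database",
    "description": (
        "Query the customer database for information. Use when the user asks "
        "about specific customer, order, or product records. Do not use when "
        "the information is already in context."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "The search query"},
            "table": {
                "type": "string",
                "enum": ["customers", "orders", "products"],
                "description": "Which table to search",
            },
            "limit": {
                "type": "integer",
                "default": 10,
                "description": "Maximum results to return",
            },
        },
        "required": ["query", "table"],
    },
}
```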
Tool Categories
Different tool types serve different purposes:
Information Retrieval Tools
### Tool: web_search
Query external knowledge sources
- Input: Search query
- Output: Relevant search results
- Side effects: None
### Tool: read_file
Access local file content
- Input: File path
- Output: File contents
- Side effects: None
Action Tools
### Tool: send_email
Deliver messages to recipients
- Input: Recipient, subject, body
- Output: Confirmation or error
- Side effects: Email is sent (irreversible)
### Tool: create_ticket
Open a support ticket in the system
- Input: Title, description, priority
- Output: Ticket ID
- Side effects: Ticket created in database
Transformation Tools
### Tool: format_date
Convert date formats
- Input: Date string, target format
- Output: Formatted date string
- Side effects: None
### Tool: calculate
Perform mathematical calculations
- Input: Expression
- Output: Numeric result
- Side effects: None
Tool Descriptions That Work
The tool description guides the LLM. Make it specific:
Vague (problematic):
### Tool: process
Processes data
Specific (effective):
### Tool: extract_invoice_data
Parses a PDF invoice and extracts structured data including vendor name,
invoice number, line items with quantities and prices, subtotal, tax,
and total amount. Works with standard invoice formats from major vendors.
Implementing Tool-Calling Agents
Basic Tool-Calling Agent
Here is a research assistant that uses tools:
---
description: Research assistant with web search and file access
version: 1.0.0
tools:
- web_search
- read_file
- write_file
---
# Research Assistant Agent
## Objective
Help users research topics by searching the web, reading relevant documents,
and compiling findings into structured reports.
## Available Tools
### web_search
Search the web for current information on any topic.
- **Input:** `query` (string) - The search query
- **Output:** Array of results with title, snippet, and URL
- **Use for:** Current events, facts, documentation, tutorials
### read_file
Read contents of a local file.
- **Input:** `path` (string) - Path to the file
- **Output:** File contents as string
- **Use for:** Accessing local documents, reading previous research
### write_file
Save content to a local file.
- **Input:** `path` (string), `content` (string)
- **Output:** Confirmation of write
- **Use for:** Saving research notes, creating reports
## Tool Selection Strategy
When user asks a research question:
1. **Check local files first**
- If topic might have existing research, use read_file
- Check for relevant documents in ./research/ directory
2. **Search for current information**
- Use web_search for facts, data, current events
- Use specific, targeted queries
- Search multiple times if needed for comprehensive coverage
3. **Synthesize and save**
- Compile findings from multiple sources
- Use write_file to save structured research
- Include sources for all facts
## Execution Pattern
### For research requests:
1. Clarify scope if query is ambiguous
2. Check for existing local research (read_file)
3. Conduct web searches (web_search)
4. Synthesize findings
5. Save compiled research (write_file)
6. Present summary to user
### For follow-up questions:
1. Check if answer is in existing research (read_file)
2. If not, conduct targeted search (web_search)
3. Add new findings to research file (write_file)
4. Answer user's question
Multi-Tool Workflows
Complex tasks require multiple tools in sequence:
## Workflow: Customer Issue Resolution
### Step 1: Gather Context
Tools: search_database, read_file
- Search customer database for account info
- Read any attached documents or previous tickets
### Step 2: Diagnose Issue
Tools: check_system_status, run_diagnostic
- Check if issue relates to known system problems
- Run diagnostic checks if applicable
### Step 3: Take Action
Tools: update_ticket, send_notification, apply_fix
- Apply resolution if automated fix is available
- Update ticket with findings
- Notify customer of status
### Step 4: Document
Tools: write_file, update_knowledge_base
- Document resolution for future reference
- Update knowledge base if new issue pattern found
Parallel Tool Calling
When tools are independent, call them in parallel:
## Parallel Execution Rules
### Can run in parallel:
- Multiple read operations
- Independent searches
- Status checks for different systems
Example:
User: "Compare pricing between competitor A and competitor B" → web_search("competitor A pricing") AND web_search("competitor B pricing") → Wait for both results → Synthesize comparison
### Must run sequentially:
- Write operations that depend on read results
- Actions that depend on previous action success
- Operations with side effects that affect each other
Example:
User: "Update the customer record then send confirmation" → update_customer(data) → Wait for success → send_email(confirmation)
Parameter Extraction
Explicit Parameters
When users provide clear values:
## Parameter Extraction Rules
### Direct Mapping
User: "Search for Python tutorials"
→ web_search(query="Python tutorials")
User: "Read the file at /docs/readme.md"
→ read_file(path="/docs/readme.md")
### Implicit Formatting
User: "Find orders from last week"
→ search_database(
query="orders",
date_from=<calculated: today - 7 days>,
date_to=<calculated: today>
)
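The "last week" calculation can be done deterministically with the standard library rather than left to the model. A sketch (the `search_database` call is the hypothetical tool from earlier):

```python
from datetime import date, timedelta

def last_week_range(today=None):
    """Return ISO date strings for (today - 7 days, today)."""
    today = today or date.today()
    return (today - timedelta(days=7)).isoformat(), today.isoformat()

date_from, date_to = last_week_range()
# search_database(query="orders", date_from=date_from, date_to=date_to)
```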
Inferred Parameters
When values must be derived from context:
## Parameter Inference
### From Conversation Context
If user previously mentioned "customer Acme Corp":
→ Customer-related queries use customer_id for Acme Corp
### From Environment
If operation requires current date:
→ Use system date, do not ask user
### From Defaults
If optional parameter not specified:
→ Use documented default value
### When Ambiguous
If required parameter is unclear:
→ Ask user for clarification before calling tool
Parameter Validation
Validate before calling:
## Pre-Call Validation
### Required Parameters
- All required parameters must have values
- If missing, ask user rather than guessing
### Type Checking
- Numeric fields must be numbers
- Dates must be valid date formats
- Enums must match allowed values
### Range Validation
- Quantities must be positive
- Dates must be reasonable (not year 3000)
- Limits must be within allowed maximums
### Security Validation
- Paths must not escape allowed directories
- Queries must not contain injection patterns
- IDs must match expected formats
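A sketch of a pre-call validator covering these checks, assuming the JSON-Schema-style tool definition shown earlier (the allowed root directory is an illustrative value):

```python
from pathlib import Path

ALLOWED_ROOT = Path("/srv/agent-workspace").resolve()  # illustrative sandbox root

def validate_args(tool_schema: dict, args: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the call may proceed."""
    errors = []
    params = tool_schema["parameters"]["properties"]

    # Required parameters must be present.
    for name in tool_schema["parameters"].get("required", []):
        if name not in args:
            errors.append(f"Missing required parameter: {name}")

    for name, value in args.items():
        spec = params.get(name)
        if spec is None:
            errors.append(f"Unexpected parameter: {name}")
            continue
        # Type checking (simplified to the common JSON types).
        expected = {"string": str, "integer": int, "number": (int, float)}.get(spec["type"])
        if expected and not isinstance(value, expected):
            errors.append(f"{name} should be {spec['type']}, got {type(value).__name__}")
        # Enum values must match the allowed set.
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name} must be one of {spec['enum']}")
        # Range validation: simple positivity check for counts and limits.
        if spec["type"] == "integer" and isinstance(value, int) and value < 0:
            errors.append(f"{name} must not be negative")
        # Security: path parameters must stay inside the allowed directory.
        if name == "path" and ALLOWED_ROOT not in Path(str(value)).resolve().parents:
            errors.append(f"{name} escapes the allowed directory")
    return errors
```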
Handling Tool Results
Success Handling
When tools return successfully:
## Success Processing
### Information Results
- Extract relevant data from result
- Summarize for user if result is large
- Use data to inform next steps
### Action Results
- Confirm action completed
- Note any IDs or references returned
- Proceed to dependent actions
### Empty Results
- Distinguish between "not found" and "error"
- Report to user that search found nothing
- Consider alternative approaches
Error Handling
When tools fail:
## Error Recovery
### Transient Errors (retry appropriate)
- Network timeout
- Rate limiting
- Temporary unavailability
Action: Wait and retry (max 3 attempts)
### Permanent Errors (do not retry)
- Invalid parameters
- Resource not found
- Permission denied
Action: Report error, ask for corrected input
### Partial Success
- Some operations succeeded, others failed
Action: Report what succeeded, handle failures individually
### Unknown Errors
- Unexpected error format or message
Action: Log details, report generic error, suggest manual intervention
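This retry policy is straightforward to encode. A sketch in Python, with standard exception classes standing in for whatever the real tool raises:

```python
import time

TRANSIENT = (TimeoutError, ConnectionError)                       # retry these
PERMANENT = (ValueError, PermissionError, FileNotFoundError)      # report, do not retry

def call_with_retry(tool, max_attempts: int = 3, backoff_s: float = 1.0, **kwargs):
    """Retry transient failures with backoff; surface permanent failures immediately."""
    for attempt in range(1, max_attempts + 1):
        try:
            return tool(**kwargs)
        except TRANSIENT as exc:
            if attempt == max_attempts:
                raise RuntimeError(f"Gave up after {max_attempts} attempts: {exc}") from exc
            time.sleep(backoff_s * attempt)                        # simple linear backoff
        except PERMANENT:
            # Invalid parameters, missing resources, permission problems:
            # retrying will not help, so report and ask for corrected input.
            raise
```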
Result Interpretation
Make sense of tool output:
## Result Interpretation
### Numeric Results
- Understand units and context
- Compare to expected ranges
- Format appropriately for user
### Structured Results
- Parse JSON/XML responses
- Extract fields relevant to query
- Handle missing or null fields
### Boolean Results
- Map to meaningful messages
- Handle edge cases (operation succeeded but with warnings)
### Collection Results
- Handle empty collections gracefully
- Paginate or summarize large collections
- Identify patterns or anomalies
Advanced Tool Patterns
Tool Chaining
Tools that feed into each other:
## Chain: Research and Summarize
### Step 1: Search
Tool: web_search
Input: user query
Output: search_results
### Step 2: Extract Content
Tool: fetch_page (for each top result)
Input: url from search_results
Output: page_content[]
### Step 3: Summarize
Tool: summarize_text
Input: combined page_content
Output: summary
### Step 4: Save
Tool: write_file
Input: summary, path
Output: confirmation
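The same chain as a short Python sketch; the four tool functions are the hypothetical ones named in the steps above:

```python
def research_and_summarize(query: str, output_path: str, top_n: int = 3) -> str:
    results = web_search(query)                               # Step 1: search
    pages = [fetch_page(r["url"]) for r in results[:top_n]]   # Step 2: extract content
    summary = summarize_text("\n\n".join(pages))              # Step 3: summarize
    write_file(path=output_path, content=summary)             # Step 4: save
    return summary
```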
Conditional Tool Use
Select tools based on intermediate results:
## Conditional Logic
### Pattern: Check Then Act
1. Use read_tool to check current state
2. Based on result:
- If condition A: use tool_x
- If condition B: use tool_y
- If unexpected: ask user
### Example: File Update
1. read_file(path) → content
2. If file exists and contains expected structure:
→ write_file(path, updated_content)
3. If file missing:
→ create_file(path, initial_content)
4. If file has unexpected format:
→ ask user how to proceed
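A check-then-act sketch for the file-update example, assuming the `read_file`, `write_file`, and `create_file` tools from earlier and a caller-supplied `has_expected_structure` predicate:

```python
class NeedsUserInput(Exception):
    """Raised when the agent should stop and ask the user how to proceed."""

def update_config(path, updated_content, initial_content, has_expected_structure):
    try:
        content = read_file(path)                      # Step 1: check current state
    except FileNotFoundError:
        return create_file(path, initial_content)      # File missing: create it
    if has_expected_structure(content):
        return write_file(path, updated_content)        # Expected format: safe to update
    # Unexpected format: do not overwrite; stop and ask the user.
    raise NeedsUserInput(f"{path} has an unexpected format; ask the user how to proceed")
```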
Tool Fallbacks
When primary tool fails:
## Fallback Strategy
### Primary: API Tool
Use the dedicated API for accurate data
### Fallback 1: Cache Tool
If API unavailable, check local cache
### Fallback 2: Web Search
If no cache, search web for public data
### Fallback 3: User Input
If all else fails, ask user for data
### Rules:
- Try fallbacks in order
- Log which source was used
- Note if data might be stale
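A sketch of the fallback order in Python. All four data sources (`pricing_api`, `local_cache`, `web_search`, `ask_user_for_pricing`) are hypothetical callables; the return value records which source answered so staleness can be flagged:

```python
import logging

log = logging.getLogger(__name__)

def get_pricing_data(product_id: str) -> dict:
    sources = [
        ("api", lambda: pricing_api.get(product_id)),         # Primary: dedicated API
        ("cache", lambda: local_cache.get(product_id)),       # Fallback 1: local cache
        ("web", lambda: web_search(f"{product_id} price")),   # Fallback 2: public data
    ]
    for name, fetch in sources:
        try:
            data = fetch()
            if data:
                return {"source": name, "stale": name != "api", "data": data}
        except Exception as exc:
            log.warning("Source %s failed: %s", name, exc)     # Log which source was tried
    # Fallback 3: ask the user directly.
    return {"source": "user", "stale": False, "data": ask_user_for_pricing(product_id)}
```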
MCP Integration
The Model Context Protocol provides standardized tool access:
MCP Tool Discovery
## MCP Server Integration
### Available MCP Servers
#### filesystem
Local file operations
- list_directory
- read_file
- write_file
- search_files
#### github
GitHub API operations
- search_repos
- get_issue
- create_issue
- list_pull_requests
#### database
Database operations
- query
- insert
- update
- delete
### Tool Selection with MCP
When task requires file operations:
→ Use filesystem MCP server tools
When task requires GitHub data:
→ Use github MCP server tools
MCP Tool Calling
## MCP Tool Invocation
### Format
mcp_call(
server="server_name",
tool="tool_name",
arguments={...}
)
### Example
mcp_call(
server="github",
tool="search_repos",
arguments={
"query": "language:python stars:>1000",
"limit": 10
}
)
### Response Handling
- MCP returns standardized response format
- Check `success` field for operation status
- Extract `result` field for tool output
- Handle `error` field for failure details
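A sketch of a response-handling wrapper around the `mcp_call` pseudo-function above. The field names follow the description in this guide, not a specific MCP SDK:

```python
def call_mcp_tool(server: str, tool: str, arguments: dict):
    """Call an MCP tool and normalize success/error handling."""
    response = mcp_call(server=server, tool=tool, arguments=arguments)
    if response.get("success"):
        return response.get("result")          # Tool output on success
    error = response.get("error", {})
    raise RuntimeError(
        f"MCP tool {server}.{tool} failed: {error.get('message', 'unknown error')}"
    )

# Usage, mirroring the GitHub example above:
# repos = call_mcp_tool("github", "search_repos",
#                       {"query": "language:python stars:>1000", "limit": 10})
```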
Building a Complete Example
Let us build a code review agent that uses multiple tools:
---
description: Automated code review with multi-tool analysis
version: 1.0.0
tools:
- git_diff
- run_linter
- run_tests
- search_codebase
- write_comment
---
# Code Review Agent
## Objective
Perform comprehensive code review on pending changes using multiple
analysis tools. Identify issues, suggest improvements, and document
findings.
## Available Tools
### git_diff
Get the diff of pending changes
- **Input:** `base_branch` (string, default: "main")
- **Output:** Diff with changed files and content
- **Use for:** Understanding what changed
### run_linter
Execute linting on changed files
- **Input:** `files` (array of file paths)
- **Output:** Linting results with issues
- **Use for:** Catching style and basic errors
### run_tests
Execute test suite
- **Input:** `scope` (string: "all", "changed", "related")
- **Output:** Test results with pass/fail status
- **Use for:** Ensuring changes do not break existing functionality
### search_codebase
Search for patterns in code
- **Input:** `pattern` (string), `file_type` (string, optional)
- **Output:** Matching code locations
- **Use for:** Finding related code, checking consistency
### write_comment
Add review comment to specific location
- **Input:** `file` (string), `line` (number), `comment` (string)
- **Output:** Confirmation
- **Use for:** Recording review findings
## Review Process
### Phase 1: Understand Changes
1. Call git_diff to get all changes
2. Categorize changes by type:
- New files
- Modified files
- Deleted files
3. Identify change scope:
- Small (< 100 lines): Standard review
- Medium (100-500 lines): Thorough review
- Large (> 500 lines): Suggest breaking up
### Phase 2: Automated Checks
1. Run linter on changed files
run_linter(files=changed_files)
2. Run tests with related scope
run_tests(scope="related")
3. Collect all automated findings
### Phase 3: Contextual Analysis
For each significant change:
1. Search for similar patterns
search_codebase(pattern=<pattern from change>)
2. Check for consistency with existing code
3. Identify potential issues:
- Missing error handling
- Hardcoded values
- Duplicated logic
- Performance concerns
### Phase 4: Write Review Comments
For each finding:
1. Determine severity (error, warning, suggestion)
2. Write clear, actionable comment
3. Add comment to specific location
write_comment( file="path/to/file", line=42, comment="[Warning] Consider adding null check here" )
### Phase 5: Summary
Compile review summary:
- Total files reviewed
- Issues found by severity
- Test results
- Overall assessment (approve, request changes, needs discussion)
## Error Handling
### Linter Fails to Run
- Check if linter is installed
- Fall back to basic syntax checking
- Note limitation in review summary
### Tests Fail
- Distinguish test failures from broken tests
- If related tests fail, mark as blocking issue
- If unrelated tests fail, note for investigation
### Large Diffs
- If diff is too large, review in chunks
- Prioritize high-risk files
- Suggest splitting PR in review
Testing Tool-Calling Agents
Tool Mock Testing
Test without calling real tools:
## Mock Testing Strategy
### Setup
Create mock implementations for each tool:
- Return predictable responses
- Simulate success and failure cases
- Record all calls for verification
### Test Cases
#### Happy Path
- Mock tools return expected results
- Verify agent completes task correctly
- Check correct tools were called with correct params
#### Error Path
- Mock tools return errors
- Verify agent handles gracefully
- Check fallback behavior works
#### Edge Cases
- Empty results
- Large results (pagination needed)
- Timeout responses
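A sketch of a happy-path mock test with `unittest.mock`, assuming the `run_agent(task, llm, tools)` loop sketched earlier in this guide; `make_scripted_llm` is a hypothetical test helper:

```python
from types import SimpleNamespace
from unittest.mock import MagicMock

def make_scripted_llm(steps):
    """Fake LLM that replays a fixed sequence of decisions (hypothetical test helper)."""
    it = iter(steps)
    return SimpleNamespace(next_step=lambda messages: next(it))

def test_research_happy_path():
    # Mock tools return predictable results and record every call.
    tools = {
        "web_search": MagicMock(return_value=[{"title": "Doc", "url": "https://example.com"}]),
        "write_file": MagicMock(return_value={"success": True}),
    }
    llm = make_scripted_llm([
        SimpleNamespace(kind="tool_call", tool_name="web_search",
                        arguments={"query": "solar panels"}),
        SimpleNamespace(kind="tool_call", tool_name="write_file",
                        arguments={"path": "research/solar.md", "content": "..."}),
        SimpleNamespace(kind="final_answer", content="Research saved."),
    ])
    answer = run_agent("Research solar panels", llm, tools)

    # Verify the correct tools were called with the correct parameters.
    tools["web_search"].assert_called_once_with(query="solar panels")
    tools["write_file"].assert_called_once()
    assert answer == "Research saved."
```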
Integration Testing
Test with real tools in safe environment:
## Integration Testing
### Test Environment
- Use sandbox/test databases
- Use test API keys with limited scope
- Create reversible actions only
### Test Scenarios
1. End-to-end workflow with all tools
2. Recovery from real tool failures
3. Performance under realistic conditions
### Cleanup
- Revert any changes made during tests
- Clear test data created
- Reset to known state
Best Practices
Tool Granularity
Too coarse (problematic):
Tool: do_everything
Handles all operations
Too fine (problematic):
Tool: get_first_name
Tool: get_last_name
Tool: get_email
Tool: get_phone
...
Right level:
Tool: get_customer_profile
Returns complete customer information
Tool: update_customer_profile
Updates specified customer fields
Clear Tool Boundaries
Each tool should have a single, clear purpose:
### Good: Single Responsibility
- search_customers: Find customers matching criteria
- get_customer: Get one customer by ID
- update_customer: Modify customer data
### Bad: Mixed Responsibilities
- handle_customer: Searches, gets, or updates based on parameters
Safe Tool Design
Build safety into tools:
### Safety Principles
1. **Idempotent when possible**
Running the same operation twice has the same effect as running it once
2. **Reversible when possible**
Provide undo capability for destructive operations
3. **Scoped permissions**
Tools only access what they need
4. **Rate limited**
Prevent accidental overuse
5. **Logged**
All tool calls recorded for audit
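The logging principle is easy to enforce uniformly with a decorator. A minimal sketch, assuming tools are plain Python callables invoked with keyword arguments:

```python
import functools
import logging

audit_log = logging.getLogger("tool_audit")

def audited(tool_fn):
    """Wrap a tool so every call and its outcome are recorded for audit."""
    @functools.wraps(tool_fn)
    def wrapper(**kwargs):
        audit_log.info("call %s %s", tool_fn.__name__, kwargs)
        try:
            result = tool_fn(**kwargs)
            audit_log.info("ok   %s", tool_fn.__name__)
            return result
        except Exception as exc:
            audit_log.warning("fail %s: %s", tool_fn.__name__, exc)
            raise
    return wrapper

@audited
def create_ticket(title: str, description: str, priority: str = "normal"):
    ...  # real implementation lives elsewhere
```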
Summary
Level 3 agents bridge the gap between reasoning and acting. By giving agents the ability to select and invoke tools, you enable them to interact with external systems, gather real information, and take meaningful actions.
Key principles:
- Define tools clearly with specific purposes and parameters
- Enable intelligent selection by describing when each tool applies
- Extract parameters carefully using conversation context
- Handle results robustly including errors and edge cases
- Chain tools effectively for complex multi-step operations
Tool-calling agents form the foundation for most practical agent applications. Master this pattern, and you can build agents that research, analyze, communicate, and automate real workflows.
Ready to coordinate multiple agents working together? Continue to Level 4 Agents: Multi-Agent Orchestration to learn how to build agent teams.