Tools and Tool-Use in AI Agents: A Complete Guide
Master the art of integrating tools into AI agents. Learn tool design, MCP servers, custom tools, and best practices for connecting agents to the real world.
Tools are what transform AI from a thinking system into an acting system. Without tools, an LLM can only generate text. With tools, it can search the web, execute code, query databases, send emails, and interact with any API you expose.
This guide covers everything you need to know about integrating tools into AI agents—from basic concepts to advanced patterns.
What Are Tools?
In the context of AI agents, a tool is a function that the agent can invoke to interact with external systems. Tools bridge the gap between the agent's reasoning and the real world.
A tool definition typically includes:
- Name: What to call the tool
- Description: What the tool does (crucial for the LLM to decide when to use it)
- Parameters: What inputs the tool accepts
- Return value: What the tool outputs
# Example tool definition
{
"name": "web_search",
"description": "Search the web for current information on any topic. Use this when you need up-to-date information that might not be in your training data.",
"input_schema": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query. Be specific and include relevant context."
},
"num_results": {
"type": "integer",
"description": "Number of results to return (default: 5, max: 20)",
"default": 5
}
},
"required": ["query"]
}
}
The Tool-Use Loop
When an agent uses tools, it follows a specific pattern:
1. User provides a task
2. Agent reasons about what to do
3. Agent decides to use a tool
4. Agent specifies tool name and parameters
5. System executes the tool
6. System returns result to agent
7. Agent processes result
8. Repeat 2-7 until task is complete
9. Agent provides final response
Here's how this looks in code:
import anthropic

# Assumes ANTHROPIC_API_KEY is set in the environment
client = anthropic.Anthropic()

def agent_with_tools(task: str, tools: list, max_iterations: int = 10):
messages = [{"role": "user", "content": task}]
for _ in range(max_iterations):
# Get agent's response
response = client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=4096,
tools=tools,
messages=messages
)
# Check if agent is done
if response.stop_reason == "end_turn":
return extract_text(response)
# Process tool calls
if response.stop_reason == "tool_use":
# Add agent's response to history
messages.append({
"role": "assistant",
"content": response.content
})
# Execute tools and gather results
tool_results = []
for block in response.content:
if block.type == "tool_use":
result = execute_tool(block.name, block.input)
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": str(result)
})
# Add results back to conversation
messages.append({"role": "user", "content": tool_results})
return "Max iterations reached"
Designing Effective Tools
Tool design significantly impacts agent performance. Here are the key principles:
Principle 1: Clear, Specific Descriptions
The description is how the LLM decides when to use a tool. Be explicit:
# Bad: Vague description
{
"name": "search",
"description": "Searches for things",
...
}
# Good: Specific and instructive
{
"name": "web_search",
"description": "Search the web for current information. Use this when:\n- You need information after your training cutoff\n- You need to verify current facts\n- You need specific data (prices, statistics, etc.)\n\nDo NOT use for:\n- General knowledge questions\n- Conceptual explanations\n- Historical facts",
...
}
Principle 2: Minimal Parameters
Each parameter is a decision point. More parameters mean more chances for errors.
# Bad: Too many parameters
{
"name": "send_email",
"parameters": {
"to": ...,
"cc": ...,
"bcc": ...,
"subject": ...,
"body": ...,
"html_body": ...,
"attachments": ...,
"reply_to": ...,
"priority": ...,
"read_receipt": ...,
}
}
# Good: Essential parameters only
{
"name": "send_email",
"parameters": {
"to": ...,
"subject": ...,
"body": ...,
"cc": ... # Optional
}
}
Principle 3: Descriptive Parameter Definitions
Each parameter needs its own clear description:
{
"name": "create_file",
"parameters": {
"path": {
"type": "string",
"description": "The file path relative to the project root. Example: 'src/components/Button.tsx'"
},
"content": {
"type": "string",
"description": "The complete file content. Include all necessary imports and formatting."
},
"overwrite": {
"type": "boolean",
"description": "If true, overwrite existing files. If false (default), fail if file exists.",
"default": False
}
}
}
Principle 4: Meaningful Return Values
Tools should return structured, informative results:
# Bad: Minimal return
def search(query):
results = perform_search(query)
return results
# Good: Rich, structured return
def search(query):
results = perform_search(query)
return {
"query": query,
"total_results": len(results),
"results": [
{
"title": r.title,
"url": r.url,
"snippet": r.snippet,
"date": r.date
}
for r in results[:10]
],
"has_more": len(results) > 10
}
Principle 5: Error Information
When tools fail, return useful error information:
def execute_tool(name, params):
try:
result = tools[name](**params)
return {"success": True, "result": result}
except ValidationError as e:
return {
"success": False,
"error_type": "validation",
"message": str(e),
"hint": "Check parameter types and required fields"
}
except PermissionError as e:
return {
"success": False,
"error_type": "permission",
"message": str(e),
"hint": "This operation requires additional permissions"
}
except Exception as e:
return {
"success": False,
"error_type": "unknown",
"message": str(e)
}
Common Tool Categories
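The snippets below use a lightweight Tool wrapper to keep the examples compact. There is no single standard class for this; a minimal sketch of what such a wrapper might look like (names and fields are illustrative):
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Tool:
    """Pairs a tool's metadata with the Python function that implements it."""
    name: str
    description: str
    function: Callable[..., Any]

    def run(self, **kwargs: Any) -> Any:
        # Invoke the underlying function with the agent-supplied arguments
        return self.function(**kwargs)
A production version would also carry a JSON input schema so the definition can be sent to the model alongside the name and description.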
Information Retrieval Tools
# Web search
web_search = Tool(
name="web_search",
description="Search the web for information",
function=lambda query: search_api.search(query)
)
# Document search
doc_search = Tool(
name="search_documents",
description="Search internal documents and knowledge base",
function=lambda query: vector_store.similarity_search(query)
)
# Database query
db_query = Tool(
name="query_database",
description="Run read-only SQL queries against the database",
function=lambda sql: db.execute_read_only(sql)
)
Content Reading Tools
# Web page reader
read_page = Tool(
name="read_webpage",
description="Fetch and extract text content from a URL",
function=lambda url: fetch_and_parse(url)
)
# File reader
read_file = Tool(
name="read_file",
description="Read contents of a file",
    function=lambda path: Path(path).read_text()
)
# PDF reader
read_pdf = Tool(
name="read_pdf",
description="Extract text from a PDF document",
function=lambda path: pdf_parser.extract(path)
)
Code Execution Tools
# Python executor
python_exec = Tool(
name="execute_python",
description="Execute Python code and return the result",
function=lambda code: safe_exec.run(code)
)
# Shell command
shell_exec = Tool(
name="run_command",
description="Run a shell command and return output",
    function=lambda cmd: subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout
)
# API call
api_call = Tool(
name="call_api",
description="Make an HTTP request to an API",
function=lambda method, url, body: requests.request(method, url, json=body)
)
Communication Tools
# Email sender
send_email = Tool(
name="send_email",
description="Send an email",
function=lambda to, subject, body: email_client.send(to, subject, body)
)
# Slack message
send_slack = Tool(
name="send_slack",
description="Send a message to a Slack channel",
function=lambda channel, message: slack.post_message(channel, message)
)
File Operation Tools
# Write file
write_file = Tool(
name="write_file",
description="Write content to a file",
function=lambda path, content: Path(path).write_text(content)
)
# List directory
list_dir = Tool(
name="list_directory",
description="List files in a directory",
function=lambda path: os.listdir(path)
)
Model Context Protocol (MCP)
MCP is an open standard, introduced by Anthropic, for connecting AI agents to tools and data sources. Instead of building a custom integration for every service, you can plug in existing MCP servers.
What is MCP?
MCP provides:
- A standard protocol for tool discovery and invocation
- Pre-built servers for common integrations
- A way to package and share tool implementations
Using MCP Servers
# The MCP Python SDK client is async; this sketch connects to a server over stdio
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch an MCP server as a subprocess (command and args depend on the server)
    server = StdioServerParameters(command="python", args=["my_mcp_server.py"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Discover available tools
            tools = await session.list_tools()
            # Use a tool (names and arguments depend on the server)
            result = await session.call_tool("file_read", {"path": "/tmp/example.txt"})

asyncio.run(main())
Building an MCP Server
# Using the FastMCP helper from the official MCP Python SDK
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("my-tools")

@mcp.tool()
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression safely"""
    try:
        # Use a safe math parser instead of arbitrary code execution
        result = safe_math_parser.evaluate(expression)
        return str(result)
    except Exception as e:
        return f"Error: {e}"

@mcp.tool()
async def fetch_weather(city: str) -> dict:
    """Get current weather for a city"""
    return await weather_api.get(city)

if __name__ == "__main__":
    # Serves the tools over stdio by default
    mcp.run()
MCP in Claude Code
Claude Code integrates MCP natively. Configure servers in .mcp.json:
{
"mcpServers": {
"filesystem": {
"command": "npx",
"args": ["-y", "@anthropic-ai/mcp-server-filesystem", "/path/to/allowed/dir"]
},
"github": {
"command": "npx",
"args": ["-y", "@anthropic-ai/mcp-server-github"],
"env": {
"GITHUB_TOKEN": "${GITHUB_TOKEN}"
}
}
}
}
Advanced Tool Patterns
Pattern 1: Tool Chaining
Tools that return data formatted for other tools:
# Search returns URLs
search_tool = Tool(
name="search",
description="Returns URLs that can be passed to read_webpage",
...
)
# Reader accepts URLs from search
read_tool = Tool(
name="read_webpage",
description="Read a URL returned by search",
...
)
# Agent naturally chains: search -> get URLs -> read each URL
Pattern 2: Confirmation Tools
For sensitive operations, require confirmation:
@tool
def delete_file(path: str, confirmed: bool = False) -> dict:
"""Delete a file. Requires confirmed=True to actually delete."""
if not confirmed:
return {
"status": "pending_confirmation",
"message": f"Are you sure you want to delete {path}?",
"to_confirm": {"path": path, "confirmed": True}
}
os.remove(path)
return {"status": "deleted", "path": path}
Pattern 3: Paginated Results
For large result sets:
@tool
def search_logs(query: str, page: int = 1, page_size: int = 20) -> dict:
"""Search logs with pagination."""
all_results = log_store.search(query)
start = (page - 1) * page_size
end = start + page_size
return {
"results": all_results[start:end],
"page": page,
"total_pages": (len(all_results) + page_size - 1) // page_size,
"total_results": len(all_results),
"has_next": end < len(all_results),
"has_prev": page > 1
}
Pattern 4: Stateful Tools
Tools that maintain state across calls:
from typing import Any

class SessionTool:
def __init__(self):
self.data = {}
    def set(self, key: str, value: Any) -> dict:
"""Store a value in the session."""
self.data[key] = value
return {"stored": key, "keys": list(self.data.keys())}
def get(self, key: str) -> dict:
"""Retrieve a value from the session."""
if key in self.data:
return {"found": True, "value": self.data[key]}
return {"found": False, "available_keys": list(self.data.keys())}
Pattern 5: Composite Tools
Tools that combine multiple operations:
@tool
def update_and_test(file_path: str, new_content: str) -> dict:
"""Update a file and run relevant tests."""
# Write the file
Path(file_path).write_text(new_content)
# Find related tests
test_file = find_test_file(file_path)
# Run tests
result = run_tests(test_file)
return {
"file_updated": file_path,
"tests_run": test_file,
"test_passed": result.passed,
"test_output": result.output
}
Tool Selection and Optimization
The Right Number of Tools
In practice, agents tend to perform best with a focused set of roughly 5-15 tools. Too few limits capability; too many makes the agent more likely to pick the wrong one.
If you need more tools, consider:
- Tool namespacing: Group related tools with prefixes
- Dynamic tool loading: Load only relevant tools for each task
- Meta-tools: A tool that lists and dispatches to available specialized tools (see the sketch after this list)
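For instance, a meta-tool pair can hide a large catalog behind two entry points: one to discover specialized tools, one to invoke them. A minimal sketch (the specialized tools here are hypothetical stubs):
# Hypothetical specialized tools (stubs for illustration)
def resize_image(path: str, width: int) -> str:
    """Resize an image file to the given width in pixels."""
    return f"resized {path} to {width}px"

def extract_audio(path: str) -> str:
    """Extract the audio track from a video file."""
    return f"extracted audio from {path}"

SPECIALIZED_TOOLS = {
    "resize_image": resize_image,
    "extract_audio": extract_audio,
    # ...dozens more in a real system
}

def list_specialized_tools(query: str = "") -> dict:
    """Meta-tool: list specialized tools whose names match the query."""
    return {
        "tools": {
            name: (fn.__doc__ or "").strip()
            for name, fn in SPECIALIZED_TOOLS.items()
            if query.lower() in name.lower()
        }
    }

def call_specialized_tool(name: str, params: dict) -> dict:
    """Meta-tool: invoke a tool discovered via list_specialized_tools."""
    if name not in SPECIALIZED_TOOLS:
        return {"error": f"Unknown tool: {name}", "hint": "Call list_specialized_tools first"}
    return {"result": SPECIALIZED_TOOLS[name](**params)}
The agent's visible tool list stays at two entries while the catalog behind them can grow freely.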
Tool Prioritization
Order tools by likely usefulness:
# Primary tools (most common)
primary_tools = [
web_search,
read_file,
execute_python
]
# Secondary tools (occasional use)
secondary_tools = [
send_email,
create_calendar_event
]
# Include primary always, secondary based on context
tools = primary_tools
if "email" in task.lower() or "calendar" in task.lower():
tools = primary_tools + secondary_tools
Tool Caching
Cache expensive tool results:
from functools import lru_cache
@lru_cache(maxsize=100)
def cached_web_search(query: str) -> dict:
return web_search_api.search(query)
@tool
def web_search(query: str) -> dict:
"""Search the web (results cached for 5 minutes)."""
return cached_web_search(query)
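Note that lru_cache never expires entries on its own. If you want time-based expiry, one common trick is to pass a coarse "time bucket" as an extra cache key so entries go stale after a fixed window. A sketch, reusing the web_search_api placeholder from above:
import time
from functools import lru_cache

def _time_bucket(seconds: int = 300) -> int:
    # Changes every `seconds`, so cached entries are reused for at most that long
    return int(time.time() // seconds)

@lru_cache(maxsize=100)
def _cached_search(query: str, bucket: int) -> dict:
    return web_search_api.search(query)  # placeholder API from the example above

def web_search_with_ttl(query: str) -> dict:
    """Search the web; identical queries within ~5 minutes hit the cache."""
    return _cached_search(query, _time_bucket(300))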
Security Considerations
Input Validation
Always validate tool inputs:
@tool
def read_file(path: str) -> dict:
# Validate path
path = Path(path).resolve()
# Check it's within allowed directory
if not path.is_relative_to(ALLOWED_BASE_DIR):
return {"error": "Path outside allowed directory"}
# Check file exists
if not path.exists():
return {"error": "File not found"}
# Read with size limit
content = path.read_text()
if len(content) > MAX_FILE_SIZE:
content = content[:MAX_FILE_SIZE] + "\n... [truncated]"
return {"content": content}
Sandboxing
Run dangerous operations in sandboxes:
@tool
def execute_code(code: str) -> dict:
"""Execute code in a sandboxed environment."""
# Run in container with:
# - No network access
# - Limited filesystem
# - Resource limits
# - Timeout
result = sandbox.run(
code,
timeout=30,
memory_limit="128MB",
network=False
)
return result
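The sandbox object above is a placeholder. As a rough, POSIX-only illustration of the time and memory limits (it does not block network access or provide real isolation; use containers or microVMs for that), a subprocess-based sketch might look like:
import resource
import subprocess
import sys

def run_python_limited(code: str, timeout: int = 30, memory_mb: int = 128) -> dict:
    """Run Python code in a subprocess with CPU-time and memory limits (POSIX only)."""
    def limit_resources():
        limit = memory_mb * 1024 * 1024
        resource.setrlimit(resource.RLIMIT_AS, (limit, limit))       # cap address space
        resource.setrlimit(resource.RLIMIT_CPU, (timeout, timeout))  # cap CPU seconds

    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores env and user site
            capture_output=True,
            text=True,
            timeout=timeout,
            preexec_fn=limit_resources,
        )
        return {
            "status": "ok",
            "stdout": proc.stdout,
            "stderr": proc.stderr,
            "returncode": proc.returncode,
        }
    except subprocess.TimeoutExpired:
        return {"status": "timeout"}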
Audit Logging
Log all tool invocations:
def execute_tool(name: str, params: dict, user_id: str) -> dict:
    # Build one log entry and reuse it for the start and completion records
    entry = {
        "timestamp": datetime.now(),
        "user": user_id,
        "tool": name,
        "params": sanitize_params(params),
    }
    audit_log.write({**entry, "status": "started"})
    try:
        result = tools[name](**params)
        audit_log.write({**entry, "status": "success"})
        return result
    except Exception as e:
        audit_log.write({**entry, "status": "error", "error": str(e)})
        raise
Testing Tools
Unit Testing
Test tools in isolation:
def test_web_search():
result = web_search(query="test query")
assert "results" in result
assert len(result["results"]) > 0
def test_web_search_empty_query():
result = web_search(query="")
assert "error" in result
Integration Testing
Test tools as agents would use them:
def test_agent_uses_tools_correctly():
agent = Agent(tools=[web_search, read_page])
result = agent.run("What is the current price of Bitcoin?")
# Verify tools were used appropriately
assert "web_search" in agent.tools_used
assert result.contains_number() # Should have a price
Adversarial Testing
Test for misuse:
def test_file_read_path_traversal():
result = read_file("../../../etc/passwd")
assert "error" in result # Should be blocked
def test_code_exec_resource_bomb():
result = execute_code("while True: pass")
assert result["status"] == "timeout"
Conclusion
Tools are the hands and feet of AI agents. Well-designed tools enable agents to accomplish real work; poorly designed tools create frustration and errors.
Key takeaways:
- Write clear, specific tool descriptions
- Minimize parameters to essentials
- Return structured, informative results
- Handle errors gracefully
- Consider MCP for standardized integrations
- Implement proper security controls
- Test thoroughly
With effective tools, your agents can move from thinking to doing—which is where the real value lies.
Ready to build agents with multiple tools working together? Check out Multi-Agent Cooperation Patterns for advanced orchestration techniques.