ReAct Pattern: How Agents Reason and Act
Master the ReAct pattern - the fundamental loop of Reasoning and Acting that powers effective AI agents. Learn implementation, examples, and best practices.
The ReAct pattern—Reasoning and Acting—is the foundational loop that powers most effective AI agents. It's elegantly simple: think about what to do, do it, observe the result, then think again. This cycle continues until the task is complete.
While simple in concept, ReAct enables sophisticated behavior. This guide explains how it works and how to implement it effectively.
The ReAct Loop
ReAct alternates between two phases:
Reasoning (Re): The agent thinks about the current situation and decides what to do next.
Acting (Act): The agent takes an action and observes the result.
Thought → Action → Observation → Thought → Action → Observation → ... → Final Answer
Each iteration builds on the previous, creating a chain of reasoning that leads to the goal.
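The loop above can be sketched as a minimal driver. Note that `think` and `act` here are illustrative placeholders for "call the model" and "run a tool", not part of any library:

```python
def react_loop(think, act, max_steps=10):
    """Minimal ReAct driver: alternate thinking and acting until done.

    `think` maps the trajectory so far to (thought, action, final_answer);
    `act` executes an action and returns an observation.
    """
    trajectory = []
    for _ in range(max_steps):
        thought, action, answer = think(trajectory)
        trajectory.append(("thought", thought))
        if answer is not None:
            return answer  # reasoning concluded the task is complete
        observation = act(action)
        trajectory.append(("action", action))
        trajectory.append(("observation", observation))
    return None  # step budget exhausted

# Toy example: the "model" answers once it has seen one observation.
def think(traj):
    if any(kind == "observation" for kind, _ in traj):
        return "done", None, "42"
    return "need data", "search()", None

answer = react_loop(think, lambda action: "result")
```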
A Complete Example
Let's trace through a ReAct agent solving a research task:
Task: "What was the GDP growth rate of Japan in the year they hosted the Olympics most recently?"
Thought 1: I need to find when Japan most recently hosted the Olympics. I'll search for this information.
Action 1: search("Japan hosted Olympics when most recent")
Observation 1: Japan hosted the Summer Olympics in Tokyo in 2020 (held in 2021 due to COVID-19). They also hosted in 1964 (Tokyo) and 1998 (Nagano Winter Olympics).
Thought 2: The most recent Olympics Japan hosted was Tokyo 2020 (held in 2021). Now I need to find Japan's GDP growth rate for 2021.
Action 2: search("Japan GDP growth rate 2021")
Observation 2: Japan's real GDP growth rate in 2021 was 1.7%, recovering from a 4.5% contraction in 2020 due to the pandemic.
Thought 3: I now have the answer. Japan hosted the Olympics most recently in 2020/2021 (Tokyo), and their GDP growth rate that year was 1.7%.
Final Answer: Japan most recently hosted the Olympics in Tokyo in 2020 (held in 2021 due to COVID-19). In 2021, Japan's GDP growth rate was 1.7%.
Notice how each thought builds on previous observations, creating a logical chain to the answer.
Implementing ReAct
Here's a complete ReAct implementation:
from anthropic import Anthropic
import json
import re

class ReActAgent:
    def __init__(self, tools: dict):
        self.client = Anthropic()
        self.tools = tools

    def run(self, task: str, max_steps: int = 10) -> str:
        """Execute a task using the ReAct loop"""
        system_prompt = self._build_system_prompt()
        trajectory = f"Task: {task}\n\n"

        for step in range(max_steps):
            # Get the agent's next thought and action
            response = self.client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=1024,
                system=system_prompt,
                messages=[{"role": "user", "content": trajectory}]
            )
            response_text = response.content[0].text
            trajectory += response_text + "\n"

            # Check if we have a final answer
            if "Final Answer:" in response_text:
                return self._extract_final_answer(response_text)

            # Extract and execute the action
            action = self._extract_action(response_text)
            if action:
                observation = self._execute_action(action)
                trajectory += f"\nObservation: {observation}\n\n"
            else:
                trajectory += "\nObservation: No valid action found. Please provide a valid action.\n\n"

        return "Maximum steps reached without finding an answer."

    def _build_system_prompt(self) -> str:
        # Each tool entry is a dict with "description" and "function" keys
        tool_descriptions = "\n".join(
            f"- {name}: {spec['description']}"
            for name, spec in self.tools.items()
        )
        return f"""You are a ReAct agent that solves tasks by alternating between Thought and Action.

Available tools:
{tool_descriptions}

Format your response as:

Thought: [Your reasoning about what to do next]
Action: [tool_name]([parameters])

OR, if you have the final answer:

Thought: [Your reasoning about why you have the answer]
Final Answer: [Your complete answer to the task]

Rules:
1. Always start with a Thought
2. Each Thought should be followed by either an Action or Final Answer
3. Use observations to inform your next thought
4. Be thorough but efficient - don't take unnecessary actions
5. If an action fails, reason about why and try a different approach"""

    def _extract_action(self, text: str) -> dict | None:
        """Extract an action from the response text"""
        action_pattern = r"Action:\s*(\w+)\((.*?)\)"
        match = re.search(action_pattern, text, re.DOTALL)
        if match:
            tool_name = match.group(1)
            params_str = match.group(2).strip()
            return {"tool": tool_name, "params": params_str}
        return None

    def _execute_action(self, action: dict) -> str:
        """Execute an action and return the observation"""
        tool_name = action["tool"]
        params = action["params"]
        if tool_name not in self.tools:
            return f"Error: Unknown tool '{tool_name}'"
        try:
            result = self.tools[tool_name]["function"](params)
            return str(result)
        except Exception as e:
            return f"Error executing {tool_name}: {str(e)}"

    def _extract_final_answer(self, text: str) -> str:
        """Extract the final answer from the response"""
        if "Final Answer:" in text:
            return text.split("Final Answer:")[-1].strip()
        return text

# Example usage (web_search, safe_calculate and knowledge_base are
# placeholders for your own implementations)
tools = {
    "search": {
        "description": "Search the web for information. Usage: search(query)",
        "function": lambda q: web_search(q)
    },
    "calculate": {
        "description": "Perform calculations. Usage: calculate(expression)",
        "function": lambda e: safe_calculate(e)
    },
    "lookup": {
        "description": "Look up a specific fact. Usage: lookup(topic)",
        "function": lambda t: knowledge_base.get(t)
    }
}

agent = ReActAgent(tools)
result = agent.run("What is the population density of the most populous country?")
ReAct Variations
Variation 1: Structured ReAct
Use JSON for more precise parsing:
def structured_react_prompt():
    return """Respond in JSON format:

{
  "thought": "Your reasoning about what to do next",
  "action": {
    "tool": "tool_name",
    "params": {"param1": "value1"}
  }
}

OR for final answer:

{
  "thought": "Your reasoning about why you have the answer",
  "final_answer": "Your complete answer"
}"""

def parse_structured_response(response: str) -> dict:
    return json.loads(response)
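A direct `json.loads` fails if the model wraps its JSON in prose or a code fence. A slightly more tolerant parser (a sketch, falling back to the first brace-delimited span it can find):

```python
import json
import re

def parse_structured_response(response: str) -> dict:
    """Parse a JSON response, tolerating surrounding prose.

    Tries a direct parse first, then falls back to the first
    {...} span found in the text.
    """
    try:
        return json.loads(response)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", response, re.DOTALL)
        if match:
            return json.loads(match.group(0))
        raise

parsed = parse_structured_response(
    'Here you go:\n{"thought": "t", "final_answer": "a"}'
)
```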
Variation 2: ReAct with Memory
Add memory to inform reasoning:
from datetime import datetime

class ReActWithMemory(ReActAgent):
    def __init__(self, tools: dict):
        super().__init__(tools)
        self.memory = []

    def run(self, task: str, max_steps: int = 10) -> str:
        # Prepend relevant memories to the task so the base loop,
        # which builds its own trajectory, actually sees them
        memory_context = self._get_relevant_memories(task)
        if memory_context:
            task_with_context = f"Relevant past experience:\n{memory_context}\n\n{task}"
        else:
            task_with_context = task
        result = super().run(task_with_context, max_steps)

        # Store this interaction for future reference
        self.memory.append({
            "task": task,
            "result": result,
            "timestamp": datetime.now()
        })
        return result

    def _get_relevant_memories(self, task: str) -> str:
        # Find memories relevant to the current task; _is_relevant is left
        # to the implementation (e.g., keyword or embedding similarity)
        relevant = [m for m in self.memory if self._is_relevant(m, task)]
        return "\n".join(f"- {m['task']}: {m['result'][:100]}..." for m in relevant[-3:])
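One cheap way to implement the relevance check is word overlap between the stored task and the new one. The function below is a hypothetical sketch; a production system might use embedding similarity instead:

```python
def is_relevant(memory: dict, task: str, min_overlap: int = 2) -> bool:
    """Hypothetical relevance check: count words shared between the
    stored task and the new task."""
    past_words = set(memory["task"].lower().split())
    task_words = set(task.lower().split())
    return len(past_words & task_words) >= min_overlap

mem = {"task": "population of Japan in 2021", "result": "about 125 million"}
hit = is_relevant(mem, "GDP of Japan in 2021")    # shares 4 words
miss = is_relevant(mem, "weather in Paris")       # shares only "in"
```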
Variation 3: ReAct with Self-Critique
Add reflection after each action:
class ReActWithCritique(ReActAgent):
    def _build_system_prompt(self) -> str:
        base_prompt = super()._build_system_prompt()
        return base_prompt + """

After each Observation, also include:

Critique: [Brief assessment of whether this action was helpful and what you learned]

This helps you reason more effectively about your next action."""

    # No run() override is needed: the critique simply appears in the model's
    # response text and is appended to the trajectory by the base loop,
    # where it informs the next Thought.
Variation 4: Parallel ReAct
Execute independent actions in parallel:
import asyncio

class ParallelReActAgent(ReActAgent):
    async def run(self, task: str, max_steps: int = 10) -> str:
        trajectory = f"Task: {task}\n\n"
        for step in range(max_steps):
            response = await self._get_response(trajectory)
            trajectory += response + "\n"
            if "Final Answer:" in response:
                return self._extract_final_answer(response)

            # Extract all actions (agent can specify multiple)
            actions = self._extract_all_actions(response)
            if actions:
                # Execute independent actions in parallel
                observations = await asyncio.gather(*[
                    self._execute_action_async(a) for a in actions
                ])
                for i, obs in enumerate(observations):
                    trajectory += f"\nObservation {i+1}: {obs}\n"
        return "Maximum steps reached."

    async def _get_response(self, trajectory: str) -> str:
        # Run the synchronous client call off the event loop
        response = await asyncio.to_thread(
            self.client.messages.create,
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            system=self._build_system_prompt(),
            messages=[{"role": "user", "content": trajectory}],
        )
        return response.content[0].text

    async def _execute_action_async(self, action: dict) -> str:
        # Tool functions are synchronous; off-load each to a thread
        return await asyncio.to_thread(self._execute_action, action)

    def _extract_all_actions(self, text: str) -> list:
        """Extract multiple actions from the response"""
        pattern = r"Action \d+:\s*(\w+)\((.*?)\)"
        matches = re.findall(pattern, text, re.DOTALL)
        return [{"tool": m[0], "params": m[1]} for m in matches]
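The multi-action regex can be checked in isolation. This standalone sketch mirrors the `Action N:` pattern above:

```python
import re

def extract_all_actions(text: str) -> list:
    """Extract 'Action N: tool(params)' lines from a response."""
    pattern = r"Action \d+:\s*(\w+)\((.*?)\)"
    return [
        {"tool": tool, "params": params}
        for tool, params in re.findall(pattern, text, re.DOTALL)
    ]

actions = extract_all_actions(
    "Thought: I can look these up independently.\n"
    "Action 1: search(Japan GDP 2021)\n"
    "Action 2: search(Tokyo Olympics year)\n"
)
```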
ReAct Best Practices
1. Clear Thought Prompting
Encourage explicit reasoning:
thought_prompt = """
When generating a Thought, consider:
1. What do I know so far?
2. What do I still need to find out?
3. What's the most efficient next step?
4. Are there any assumptions I should verify?
"""
2. Action Validation
Validate actions before execution:
def validate_action(action: dict) -> tuple[bool, str]:
    """Validate an action before execution.

    Assumes structured (dict) params, as in the JSON variation above,
    and a global `available_tools` registry of tool specs.
    """
    if action["tool"] not in available_tools:
        return False, f"Unknown tool: {action['tool']}"
    tool_spec = available_tools[action["tool"]]
    required_params = tool_spec.get("required", [])
    for param in required_params:
        if param not in action["params"]:
            return False, f"Missing required parameter: {param}"
    return True, "OK"
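Exercised with a tiny registry (a self-contained sketch; `available_tools` here is a stand-in for your real tool specs):

```python
# Hypothetical registry: each tool spec lists its required parameters
available_tools = {"search": {"required": ["query"]}}

def validate_action(action: dict) -> tuple:
    """Check the tool exists and all required parameters are present."""
    if action["tool"] not in available_tools:
        return False, f"Unknown tool: {action['tool']}"
    required = available_tools[action["tool"]].get("required", [])
    for param in required:
        if param not in action["params"]:
            return False, f"Missing required parameter: {param}"
    return True, "OK"

ok, msg = validate_action({"tool": "search", "params": {"query": "GDP"}})
bad, err = validate_action({"tool": "search", "params": {}})
```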
3. Observation Formatting
Format observations for easy reasoning:
import json
from typing import Any

def format_observation(raw_result: Any) -> str:
    """Format an observation for agent consumption"""
    if isinstance(raw_result, dict):
        return json.dumps(raw_result, indent=2)
    if isinstance(raw_result, list):
        if len(raw_result) > 5:
            return f"Found {len(raw_result)} results:\n" + \
                "\n".join(str(r)[:100] for r in raw_result[:5]) + \
                f"\n... and {len(raw_result) - 5} more"
        return "\n".join(str(r) for r in raw_result)
    result_str = str(raw_result)
    if len(result_str) > 500:
        return result_str[:500] + "... [truncated]"
    return result_str
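The list-truncation behavior is easy to verify on its own. This compact sketch restates just that branch (`format_list` is a name introduced here for illustration):

```python
def format_list(results: list, limit: int = 5) -> str:
    """Show the first `limit` items and note how many were omitted."""
    if len(results) > limit:
        shown = "\n".join(str(r)[:100] for r in results[:limit])
        return (f"Found {len(results)} results:\n{shown}"
                f"\n... and {len(results) - limit} more")
    return "\n".join(str(r) for r in results)

out = format_list(list(range(8)))  # 8 items, only 5 shown
```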
4. Error Recovery
Help the agent recover from errors:
def execute_with_retry(action: dict, max_retries: int = 2) -> str:
    """Execute an action, retrying transient failures before giving up"""
    for attempt in range(max_retries + 1):
        try:
            result = execute_action(action)
            return format_observation(result)
        except Exception as e:
            if attempt == max_retries:
                return (f"Action failed after {max_retries + 1} attempts: {e}. "
                        "Consider a different approach.")
            # Otherwise fall through and retry
5. Step Limiting
Prevent infinite loops:
import time

class SafeReActAgent(ReActAgent):
    def __init__(self, tools: dict, max_steps: int = 10, max_time: int = 120):
        super().__init__(tools)
        self.max_steps = max_steps
        self.max_time = max_time
        self.action_history = []  # executed actions, used for loop detection

    def run(self, task: str) -> str:
        start_time = time.time()
        for step in range(self.max_steps):
            # Check time limit
            if time.time() - start_time > self.max_time:
                return self._summarize_progress("Time limit reached")
            # Check for loops
            if self._is_looping():
                return self._break_loop()
            # Normal ReAct step (_step, _summarize_progress and _break_loop
            # are sketched here; _step should append to self.action_history)
            result = self._step()
            if result:
                return result
        return self._summarize_progress("Step limit reached")

    def _is_looping(self) -> bool:
        """Detect if the agent is repeating the same action"""
        if len(self.action_history) < 3:
            return False
        recent = self.action_history[-3:]
        return len(set(str(a) for a in recent)) == 1
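The loop-detection check can be tested independently of the agent. A standalone version of the same logic:

```python
def is_looping(action_history: list, window: int = 3) -> bool:
    """True when the last `window` actions are all identical."""
    if len(action_history) < window:
        return False
    recent = action_history[-window:]
    return len(set(str(a) for a in recent)) == 1

stuck = is_looping(["search(x)", "search(x)", "search(x)"])
fine = is_looping(["search(x)", "lookup(y)", "search(x)"])
```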
ReAct vs Other Patterns
ReAct vs Chain-of-Thought
Chain-of-Thought (CoT) reasons without acting:
CoT: Think step by step → Generate answer
ReAct: Think → Act → Observe → Think → Act → ...
ReAct is better when:
- External information is needed
- Actions must be taken
- Results need verification
CoT is better when:
- All information is in the prompt
- Pure reasoning is sufficient
- Speed is critical
ReAct vs Plan-then-Execute
Plan-then-Execute creates a full plan upfront:
Plan: Create complete plan → Execute all steps → Done
ReAct: Think → Act → Observe → Think (adjust) → Act → ...
ReAct is better when:
- Requirements may change
- Observations influence next steps
- Early failures need recovery
Plan-then-Execute is better when:
- Task structure is well-known
- Parallel execution is possible
- Upfront planning is valuable
Debugging ReAct Agents
Trajectory Analysis
Log and analyze agent trajectories:
from datetime import datetime

class DebugReActAgent(ReActAgent):
    def __init__(self, tools: dict, debug: bool = True):
        super().__init__(tools)
        self.debug = debug
        self.trajectories = []

    def run(self, task: str, max_steps: int = 10) -> str:
        trajectory_log = {
            "task": task,
            # A fuller implementation would record each thought/action/
            # observation triple here from inside the loop
            "steps": [],
            "result": None,
            "start_time": datetime.now()
        }
        result = super().run(task, max_steps)
        trajectory_log["result"] = result
        trajectory_log["end_time"] = datetime.now()
        self.trajectories.append(trajectory_log)
        if self.debug:
            self._print_trajectory(trajectory_log)
        return result

    def _print_trajectory(self, log: dict):
        print(f"\n{'='*50}")
        print(f"Task: {log['task']}")
        print(f"Duration: {log['end_time'] - log['start_time']}")
        for i, step in enumerate(log['steps']):
            print(f"\nStep {i+1}:")
            print(f"  Thought: {step['thought'][:100]}...")
            print(f"  Action: {step['action']}")
            print(f"  Observation: {step['observation'][:100]}...")
        print(f"\nResult: {log['result']}")
        print('='*50)
Common Issues
1. Infinite Loops
- Symptom: Agent repeats same actions
- Cause: Not learning from observations
- Fix: Add loop detection and explicit "try different approach" prompts
2. Premature Stopping
- Symptom: Agent gives up too early
- Cause: Unclear success criteria
- Fix: Better define when task is complete
3. Tool Misuse
- Symptom: Wrong tool for the task
- Cause: Unclear tool descriptions
- Fix: Improve tool descriptions and add examples
4. Observation Overload
- Symptom: Agent confused by too much data
- Cause: Observations too long/complex
- Fix: Summarize and truncate observations
Conclusion
The ReAct pattern is the backbone of effective AI agents. By alternating between reasoning and acting, agents can:
- Tackle complex, multi-step tasks
- Adapt based on observations
- Self-correct when approaches don't work
- Build logical chains to answers
Key principles:
- Clear thoughts guide actions
- Actions produce observations
- Observations inform next thoughts
- Loop until goal is achieved
Master ReAct, and you've mastered the core of agentic AI.
Ready to tackle complex tasks that require upfront planning? Check out The Planning Pattern for strategic agent design.