Reflection and Adaptation in AI Agents

Reflection is the mechanism by which agents evaluate their progress, detect failures, and adapt their strategy. A naive agent executes a plan rigidly, regardless of results. A reflective agent pauses after each tool call to ask: "Did this bring me closer to the goal? Should I retry, use a different tool, or abandon this approach?" This introspection can improve success rates and helps agents handle situations their original plan did not anticipate. This article teaches you to build reflection into agent loops and create agents that learn and adapt.

Reflection transforms agents from script-like executors into reasoning systems that handle uncertainty and change course dynamically. An agent reflecting on a failed database query might realize the query syntax is wrong and try a different approach, rather than repeating the same error.

Reflection Patterns

There are three common reflection patterns, each trading complexity for robustness.

Pattern 1: Outcome Evaluation — After each tool call, the agent asks: "Did this succeed? Is the result useful?"

import anthropic

def run_agent_with_outcome_evaluation(user_message: str, tools: list, max_iterations: int = 10) -> str:
    """Agent that evaluates tool outcomes and adapts."""
    client = anthropic.Anthropic()
    messages = [{"role": "user", "content": user_message}]

    for iteration in range(max_iterations):
        # Standard call
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=2048,
            tools=tools,
            messages=messages
        )

        if response.stop_reason == "end_turn":
            return next(
                (block.text for block in response.content if hasattr(block, 'text')),
                "Finished"
            )

        # Execute tools
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []

        for block in response.content:
            if block.type == "tool_use":
                # execute_tool is assumed to be defined elsewhere; it runs the named tool
                result = execute_tool(block.name, block.input)

                # Reflection: evaluate the result
                evaluation_prompt = f"""
                Tool: {block.name}
                Input: {block.input}
                Result: {result}

                Evaluate this result:
                1. Did the tool succeed? (yes/no)
                2. Is the result useful for the goal? (yes/no)
                3. If not, what should we try next?
                """

                # Separate model call that sees only this tool result
                eval_response = client.messages.create(
                    model="claude-3-5-sonnet-20241022",
                    max_tokens=256,
                    messages=[{"role": "user", "content": evaluation_prompt}]
                )

                evaluation = eval_response.content[0].text
                print(f"Evaluation: {evaluation}")

                # Include evaluation in tool result
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"{result}\n\n[Evaluation: {evaluation}]"
                })

        messages.append({"role": "user", "content": tool_results})

    return "Max iterations"

This pattern is straightforward but adds one extra API call per tool call. Use it when tool calls are expensive or high-stakes (for example, paid external APIs) and their results are worth validating before the agent builds on them.
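The evaluation above comes back as free text, which the agent can read but the surrounding code cannot act on. One option, sketched here as an assumption rather than as part of the original pattern, is to ask the evaluator to answer in a fixed three-line format and parse it. The `Evaluation` class, the `parse_evaluation` function, and the `SUCCEEDED` / `USEFUL` / `NEXT` field names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    succeeded: bool
    useful: bool
    next_step: str


def parse_evaluation(text: str) -> Evaluation:
    """Parse a reply shaped like:

    SUCCEEDED: yes
    USEFUL: no
    NEXT: try the search tool instead
    """
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so the value may itself contain colons
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().upper()] = value.strip()
    return Evaluation(
        # Missing or malformed fields count as "no"
        succeeded=fields.get("SUCCEEDED", "").lower().startswith("y"),
        useful=fields.get("USEFUL", "").lower().startswith("y"),
        next_step=fields.get("NEXT", ""),
    )


reply = "SUCCEEDED: yes\nUSEFUL: no\nNEXT: try the search tool instead"
print(parse_evaluation(reply))
```

With a parsed result, the loop could, for instance, skip the evaluation text entirely when both flags are true and only attach it to the tool result when something went wrong.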

Pattern 2: Strategy Reflection — After several steps, the agent pauses to assess overall progress and adjust strategy.

import anthropic

def run_agent_with_strategy_reflection(user_message: str, tools: list, max_iterations: int = 10) -> str:
    """Agent that reflects on strategy every N steps."""
    client = anthropic.Anthropic()
    messages = [{"role": "user", "content": user_message}]

    for iteration in range(max_iterations):
        # Every 3 iterations, reflect on strategy
        if iteration > 0 and iteration % 3 == 0:
            # format_conversation_summary is assumed to be defined elsewhere;
            # it condenses the message history into plain text
            reflection_prompt = f"""
            Goal: {user_message}

            Progress so far:
            {format_conversation_summary(messages)}

            Reflect on your strategy:
            1. What have you learned?
            2. Are you making progress toward the goal?
            3. Should you change strategy? If yes, how?
            """

            reflection_response = client.messages.create(
                model="claude-3-5-sonnet-20241022",
                max_tokens=512,
                messages=[{"role": "user", "content": reflection_prompt}]
            )

            reflection_text = reflection_response.content[0].text
            print(f"Strategy reflection: {reflection_text}")

            # Add reflection to messages so the model remembers it
            messages.append({
                "role": "user",
                "content": f"Based on this reflection, continue with your task: {reflection_text}"
            })

        # Standard agent step
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=2048,
            tools=tools,
            messages=messages
        )

        # ... (rest of loop)

This pattern groups evaluation, reducing API calls. It's better for long-running agents.
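The reflection prompt calls format_conversation_summary(messages), which the article does not show. A minimal sketch of what such a helper might look like follows. It assumes messages are dicts with "role" and "content", and that content is either a string or a list of blocks that are dicts or SDK objects exposing type, text, name, input, content, and is_error. The `_field` helper and the `max_chars` truncation limit are hypothetical choices.

```python
def _field(block, name, default=None):
    """Read a field from a block that may be a dict or an SDK object."""
    if isinstance(block, dict):
        return block.get(name, default)
    return getattr(block, name, default)


def format_conversation_summary(messages: list, max_chars: int = 200) -> str:
    """Condense the message history into one short line per event."""
    lines = []
    for message in messages:
        role = message["role"]
        content = message["content"]
        if isinstance(content, str):
            lines.append(f"{role}: {content[:max_chars]}")
            continue
        for block in content:
            kind = _field(block, "type")
            if kind == "text":
                lines.append(f"{role}: {_field(block, 'text', '')[:max_chars]}")
            elif kind == "tool_use":
                lines.append(f"tool call: {_field(block, 'name')}({_field(block, 'input')})")
            elif kind == "tool_result":
                status = "error" if _field(block, "is_error", False) else "ok"
                body = str(_field(block, "content", ""))[:max_chars]
                lines.append(f"tool result ({status}): {body}")
    return "\n".join(lines)


history = [
    {"role": "user", "content": "Find the population of France"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "t1", "name": "search", "input": {"q": "France population"}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "t1", "content": "timeout", "is_error": True},
    ]},
]
print(format_conversation_summary(history))
```

Truncating each entry keeps the reflection prompt small even when tool results are long, at the cost of hiding detail the reflection step might have used.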

Pattern 3: Error-Driven Adaptation — When a tool fails, the agent automatically tries alternatives.

import anthropic

# Assumed to be defined elsewhere: extract_text, execute_tool_with_validation,
# and the ToolFailureException it raises when a tool fails.

def run_agent_with_error_adaptation(user_message: str, tools: list, max_iterations: int = 10) -> str:
    """Agent that adapts strategy when tools fail."""
    client = anthropic.Anthropic()
    messages = [{"role": "user", "content": user_message}]

    for iteration in range(max_iterations):
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=2048,
            tools=tools,
            messages=messages
        )

        if response.stop_reason == "end_turn":
            return extract_text(response)

        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        errors_occurred = False

        for block in response.content:
            if block.type == "tool_use":
                try:
                    result = execute_tool_with_validation(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    })
                except ToolFailureException as e:
                    # Tool failed: ask agent to try alternative
                    errors_occurred = True
                    error_message = (
                        f"Tool '{block.name}' failed with error: {str(e)}\n"
                        f"Input was: {block.input}\n"
                        f"Try a different approach or a different tool."
                    )
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": error_message,
                        "is_error": True
                    })

        messages.append({"role": "user", "content": tool_results})

        # If errors occurred, optionally add guidance
        if errors_occurred:
            messages.append({
                "role": "user",
                "content": "Some tools failed above. Try alternative approaches."
            })

    return "Max iterations"

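This pattern sets the is_error flag on the tool_result block, which the Anthropic Messages API uses to mark a failed tool call. The model sees the error along with the failing input and can choose a different input or a different tool on its next turn.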
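The loop above relies on two names it does not define: ToolFailureException and execute_tool_with_validation. The sketch below shows one way they might be written, assuming tools are plain Python functions kept in a dictionary. The `TOOL_REGISTRY` name, the stand-in `get_weather` tool, and the rule that an empty result counts as a failure are illustrative assumptions, not part of the original article.

```python
class ToolFailureException(Exception):
    """Raised when a tool cannot run or returns an unusable result."""


def get_weather(location: str) -> str:
    # Stand-in tool for illustration; a real tool would call an external service
    return f"Sunny in {location}"


# Hypothetical registry mapping tool names to Python callables
TOOL_REGISTRY = {"get_weather": get_weather}


def execute_tool_with_validation(name: str, tool_input: dict) -> str:
    """Run a registered tool and turn every failure into ToolFailureException."""
    tool = TOOL_REGISTRY.get(name)
    if tool is None:
        raise ToolFailureException(f"unknown tool '{name}'")
    try:
        result = tool(**tool_input)
    except Exception as exc:
        # Covers wrong argument names as well as errors raised inside the tool
        raise ToolFailureException(str(exc)) from exc
    if result is None or str(result).strip() == "":
        raise ToolFailureException("tool returned an empty result")
    return str(result)


print(execute_tool_with_validation("get_weather", {"location": "Paris"}))
```

Funnelling every failure into one exception type keeps the agent loop's except clause simple; the original message is preserved so the model still sees what went wrong.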

Learning from Mistakes

Agents can improve over time by maintaining a mistake log and learning from it:

import anthropic

# Assumed to be defined elsewhere: extract_text and execute_tool.

class LearningAgent:
    """Agent that remembers mistakes and learns from them."""

    def __init__(self):
        self.client = anthropic.Anthropic()
        self.mistake_log = []

    def record_mistake(self, context: str, mistake: str, lesson: str):
        """Record a mistake and the lesson learned."""
        self.mistake_log.append({
            "context": context,
            "mistake": mistake,
            "lesson": lesson
        })

    def get_lessons_context(self) -> str:
        """Format lessons for inclusion in agent prompts."""
        if not self.mistake_log:
            return ""

        context = "## Previous Lessons Learned\n\n"
        for entry in self.mistake_log[-5:]:  # Last 5 mistakes
            context += (
                f"- **Mistake:** {entry['mistake']}\n"
                f"  **Lesson:** {entry['lesson']}\n\n"
            )
        return context

    def run(self, user_message: str, tools: list) -> str:
        """Run agent with learned lessons in context."""
        # Built once per run, so mistakes logged below only affect later runs
        system_prompt = (
            "You are a helpful agent. Use tools to answer questions. "
            "Avoid the mistakes listed below.\n\n"
            + self.get_lessons_context()
        )

        messages = [{"role": "user", "content": user_message}]

        for iteration in range(10):
            response = self.client.messages.create(
                model="claude-3-5-sonnet-20241022",
                max_tokens=2048,
                tools=tools,
                messages=messages,
                system=system_prompt
            )

            if response.stop_reason == "end_turn":
                return extract_text(response)

            # Execute tools and handle results...
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []

            for block in response.content:
                if block.type == "tool_use":
                    try:
                        result = execute_tool(block.name, block.input)
                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": result
                        })
                    except Exception as e:
                        # Log this as a mistake for future learning
                        mistake_desc = f"Failed to call {block.name} with {block.input}"
                        lesson = f"Check {block.name}'s parameters before calling"
                        self.record_mistake(
                            context=user_message,
                            mistake=mistake_desc,
                            lesson=lesson
                        )

                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": f"Error: {e}",
                            "is_error": True
                        })

            messages.append({"role": "user", "content": tool_results})

        return "Max iterations"

# Usage (tools is the list of tool definitions passed to the API, defined elsewhere)
agent = LearningAgent()
agent.record_mistake(
    context="User asked for weather",
    mistake="Called get_weather with only a street address (missing city/state)",
    lesson="get_weather requires city and state format, e.g., 'San Francisco, CA'"
)
result = agent.run("What's the weather in New York?", tools)
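Several loops in this article call extract_text(response) without showing it. A plausible version, mirroring the inline next(...) expression used in Pattern 1, is sketched below. It assumes the response exposes a content list whose text blocks have type "text" and a text attribute; the "Finished" fallback is carried over from Pattern 1, and the SimpleNamespace objects are stand-ins for a real SDK response.

```python
from types import SimpleNamespace


def extract_text(response) -> str:
    """Join the text blocks of a response; fall back to 'Finished' if there are none."""
    parts = [
        block.text
        for block in response.content
        if getattr(block, "type", None) == "text"
    ]
    return "\n".join(parts) if parts else "Finished"


# Stand-in objects shaped like an SDK response, for illustration only
fake_response = SimpleNamespace(content=[
    SimpleNamespace(type="text", text="It is sunny in New York."),
    SimpleNamespace(type="tool_use", id="t1", name="get_weather", input={}),
])
print(extract_text(fake_response))
```

Unlike the next(...) form, this version joins every text block instead of returning only the first, which matters when a response interleaves text with tool calls.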

Self-Correction

Agents can self-correct by asking themselves clarifying questions:

import anthropic

# extract_text is assumed to be defined elsewhere.

def run_self_correcting_agent(user_message: str, tools: list, max_iterations: int = 10) -> str:
    """Agent that questions its own assumptions."""
    client = anthropic.Anthropic()
    messages = [{"role": "user", "content": user_message}]

    for iteration in range(max_iterations):
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=2048,
            tools=tools,
            messages=messages
        )

        if response.stop_reason == "end_turn":
            final_answer = extract_text(response)

            # Self-correction: verify the answer makes sense
            verification_prompt = f"""
            Original request: {user_message}
            Proposed answer: {final_answer}

            Check: Does this answer directly address the request?
            If not, what's missing?
            """

            verification_response = client.messages.create(
                model="claude-3-5-sonnet-20241022",
                max_tokens=256,
                messages=[{"role": "user", "content": verification_prompt}]
            )

            verification = verification_response.content[0].text
            # Keyword check on free text; a reply such as "nothing is missing" also matches
            if "missing" in verification.lower() or "incomplete" in verification.lower():
                # Loop back to gather more information
                messages.append({"role": "assistant", "content": final_answer})
                messages.append({
                    "role": "user",
                    "content": f"Your previous answer was incomplete. {verification}"
                })
                continue

            return final_answer

        # ... handle tool calls

    return "Max iterations"
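The check for "missing" or "incomplete" in the verifier's reply is easy to trip by accident: the prompt itself asks "what's missing?", so a reply like "Nothing is missing" would send the agent around the loop again. One possible alternative, offered as a sketch and not as the article's method, is to ask the verifier to put a one-word verdict on the first line and parse only that. The `parse_verdict` function and the COMPLETE / INCOMPLETE convention are hypothetical.

```python
from typing import Tuple


def parse_verdict(verification: str) -> Tuple[bool, str]:
    """Return (is_complete, feedback) from a reply whose first line is
    COMPLETE or INCOMPLETE, followed by optional feedback."""
    first, _, rest = verification.strip().partition("\n")
    verdict = first.strip().upper()
    # Test INCOMPLETE first; neither word is a prefix of the other, but this
    # keeps the stricter outcome ahead of the permissive one
    if verdict.startswith("INCOMPLETE"):
        return False, rest.strip()
    if verdict.startswith("COMPLETE"):
        return True, rest.strip()
    # Unrecognized format: accept the answer rather than loop on a guess
    return True, verification.strip()


print(parse_verdict("INCOMPLETE\nThe answer omits the temperature."))
```

Treating an unrecognized reply as complete is a deliberate choice here to avoid needless retries; a stricter agent could treat it as incomplete instead.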

Key Takeaways

  • Reflection enables agents to evaluate outcomes, detect failures, and adapt strategy
  • Outcome evaluation checks each tool result; strategy reflection happens every N steps
  • Error-driven adaptation automatically handles tool failures by trying alternatives
  • Learning agents maintain mistake logs and avoid repeating errors
  • Self-correction validates answers before returning them, catching incomplete or incorrect results

Frequently Asked Questions

Does reflection slow down agents?

Yes, it adds API calls. Outcome evaluation nearly doubles the call count. Use reflection selectively: for high-stakes tasks, expensive operations, or long-running agents where adaptation pays off. For simple queries, skip it.
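To make the overhead concrete, the following back-of-the-envelope sketch counts model calls the way the loops in this article make them: one call per iteration, one extra call per tool call for outcome evaluation, and one extra call whenever iteration > 0 and iteration % 3 == 0 for strategy reflection. The `api_calls` function is a hypothetical helper, and the 10-iteration, 9-tool-call run is an invented example.

```python
def api_calls(iterations: int, tool_calls: int, reflect_every: int = 3) -> dict:
    """Estimate model calls for a run under each reflection pattern."""
    # Reflection fires at iterations reflect_every, 2 * reflect_every, ...
    # that are still below `iterations`
    reflections = max(iterations - 1, 0) // reflect_every
    return {
        "baseline": iterations,
        "outcome_evaluation": iterations + tool_calls,
        "strategy_reflection": iterations + reflections,
    }


# Example: 10 iterations, 9 of which issued a single tool call
print(api_calls(iterations=10, tool_calls=9))
```

In this example outcome evaluation takes the run from 10 calls to 19, while strategy reflection adds only 3.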

How do I prevent an agent from getting stuck in an adaptation loop?

Set a maximum number of retries per subtask (e.g., try a tool 3 times, then give up). Track which approaches have failed and avoid repeating them. Add a "backoff" delay between retries.
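The three safeguards above (a retry cap, a record of failed approaches, and a backoff delay) can be combined in a small bookkeeping object. The sketch below is one possible shape; the `RetryBudget` class, its method names, and the exponential doubling of the delay are assumptions for illustration. The caller is expected to sleep for the returned delay.

```python
import json


class RetryBudget:
    """Tracks failures per tool so an agent loop can stop retrying."""

    def __init__(self, max_attempts: int = 3, base_delay: float = 1.0):
        self.max_attempts = max_attempts
        self.base_delay = base_delay
        self.attempts = {}   # tool name -> number of failures so far
        self.failed = set()  # (tool name, serialized input) pairs that failed

    @staticmethod
    def _key(tool_name: str, tool_input: dict) -> tuple:
        # sort_keys makes equal inputs serialize identically
        return (tool_name, json.dumps(tool_input, sort_keys=True, default=str))

    def already_failed(self, tool_name: str, tool_input: dict) -> bool:
        """True if this exact call has failed before."""
        return self._key(tool_name, tool_input) in self.failed

    def exhausted(self, tool_name: str) -> bool:
        """True once the tool has failed max_attempts times."""
        return self.attempts.get(tool_name, 0) >= self.max_attempts

    def record_failure(self, tool_name: str, tool_input: dict) -> float:
        """Record a failure and return the delay to wait before the next attempt."""
        self.failed.add(self._key(tool_name, tool_input))
        count = self.attempts.get(tool_name, 0) + 1
        self.attempts[tool_name] = count
        return self.base_delay * 2 ** (count - 1)


budget = RetryBudget(max_attempts=3, base_delay=0.5)
delay = budget.record_failure("search", {"q": "France population"})
print(delay, budget.exhausted("search"))
```

In an agent loop, a call for which already_failed or exhausted returns True could be answered with an is_error tool result telling the model to pick another approach, without running the tool again.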

Can agents learn across sessions?

Yes, by persisting the mistake log to a database or file. Load it when the agent starts. This is useful for agents running 24/7. Privacy-aware agents shouldn't store user requests, only generic lessons.
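A minimal sketch of file-based persistence follows, assuming the mistake log is a list of dicts with "context", "mistake", and "lesson" keys as in LearningAgent. The `save_mistake_log` and `load_mistake_log` functions and the JSON file layout are hypothetical. To follow the privacy advice, the sketch drops the "context" field, which holds the raw user request, before writing.

```python
import json
from pathlib import Path


def save_mistake_log(mistake_log: list, path: str) -> None:
    """Write lessons to a JSON file, omitting the raw user request."""
    lessons = [
        {"mistake": entry["mistake"], "lesson": entry["lesson"]}
        for entry in mistake_log
    ]
    Path(path).write_text(json.dumps(lessons, indent=2), encoding="utf-8")


def load_mistake_log(path: str) -> list:
    """Read lessons back; a missing file means no lessons yet."""
    file = Path(path)
    if not file.exists():
        return []
    return json.loads(file.read_text(encoding="utf-8"))
```

Since get_lessons_context reads only the "mistake" and "lesson" keys, a log loaded this way could be assigned to agent.mistake_log at startup. Note that the mistake text can itself embed tool inputs, so it may also need scrubbing before being stored.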

What's the difference between reflection and planning?

Planning (from an earlier article) happens before tool calls; reflection happens after. Planning is proactive (deciding what to do). Reflection is reactive (evaluating what happened). Both improve agent robustness.

Further Reading