Learning & Adaptation
How AI agents improve over time without retraining: learning in token space, reflexion patterns, and emerging self-evolving architectures.
The Learning Challenge
Traditional ML models learn by updating weights during training. But production LLM agents can't retrain on every interaction. How can they improve?
- Token Space Learning: store successful trajectories and inject them as few-shot examples.
- Reflexion: self-critique and iterative improvement on failures.
- Self-Evolution: agents that modify their own prompts, tools, or code.

No Weight Updates
All three patterns share one property: the model's weights never change. Learning lives in the context window, prompts, and stored artifacts around the model, not in its parameters.
The Learning Spectrum
Safety/Control ◄──────────────────────────────────────► Autonomy/Risk
┌─────────────────┬─────────────────┬─────────────────┬─────────────────┐
│ STATIC │ TOKEN SPACE │ REFLEXION │ SELF-EVOLVING │
│ PROMPTS │ LEARNING │ │ │
├─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ │ │ │ │
│ Fixed system │ Dynamic few- │ Self-critique │ Prompt/code │
│ prompt, no │ shot examples │ and iterative │ modification │
│ adaptation │ from trajectory │ improvement │ by agent │
│ │ storage │ │ │
│ │ │ │ │
│ • Predictable │ • Learns from │ • Improves on │ • Autonomous │
│ • Consistent │ successes │ failures │ improvement │
│ • No learning │ • Safe (read- │ • Multi-attempt │ • Risky if │
│ │ only context) │ solving │ unsupervised │
│ │ │ │ │
└─────────────────┴─────────────────┴─────────────────┴─────────────────┘
▲ ▲ ▲ ▲
│ │ │ │
   Most systems        Production         Research         Experimental
      today              ready            interest       (safety concerns)

1. Learning in Token Space
The simplest form of agent learning: store successful task completions and retrieve them as few-shot examples for similar future tasks.
┌─────────────────────────────────────────────────────────────────┐
│ NEW TASK │
│ "Parse this JSON" │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────┐
│ EMBED TASK │
│ DESCRIPTION │
└─────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ TRAJECTORY STORE │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Task: "Parse XML file" Similarity: 0.72 │ │
│ │ Steps: read_file → parse → extract │ │
│ ├─────────────────────────────────────────────────────────┤ │
│ │ Task: "Extract data from JSON" Similarity: 0.91 ◄───┼───│
│ │ Steps: read_file → json.loads → filter_keys │ │
│ ├─────────────────────────────────────────────────────────┤ │
│ │ Task: "Convert CSV to dict" Similarity: 0.68 │ │
│ │ Steps: read_file → csv.reader → to_dict │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ DYNAMIC PROMPT │
│ System: You are a data processing assistant... │
│ │
│ Example 1: (from trajectory store) │
│ User: Extract data from JSON │
│ Assistant: read_file → json.loads → filter_keys │
│ │
│ Current task: │
│ User: Parse this JSON │
└─────────────────────────────────────────────────────────────────┘

# Learning in Token Space: No weight updates, only context manipulation
class TokenSpaceLearner:
trajectoryStore: VectorDB # Stores successful task completions
fewShotSelector: EmbeddingModel
function learn(task, trajectory, outcome):
if outcome.success:
# Store successful trajectory for future reference
embedding = embed(task.description)
trajectoryStore.store({
task_embedding: embedding,
task_description: task.description,
trajectory: trajectory, # Steps taken
outcome: outcome,
timestamp: now()
})
function recall(newTask, k = 3):
# Find similar past tasks
embedding = embed(newTask.description)
similar = trajectoryStore.search(embedding, topK = k)
# Format as few-shot examples
examples = []
for item in similar:
examples.append({
task: item.task_description,
solution: formatTrajectory(item.trajectory)
})
return examples
function execute(task):
# Retrieve relevant past experiences
examples = recall(task)
# Build prompt with dynamic few-shot examples
prompt = buildPrompt(
systemMessage: "You are a helpful assistant...",
fewShotExamples: examples, # Learning injected here
currentTask: task
)
# Execute with learned context
trajectory = agent.run(prompt)
outcome = evaluate(trajectory)
# Learn from this execution
learn(task, trajectory, outcome)
        return outcome

import chromadb
import json
from dataclasses import dataclass
@dataclass
class Trajectory:
task: str
steps: list[dict] # {thought, action, observation}
outcome: dict
success: bool
class TokenSpaceLearner:
"""Learn from experience without updating model weights."""
def __init__(self, collection_name: str = "trajectories"):
self.client = chromadb.PersistentClient(path="./learning_db")
self.collection = self.client.get_or_create_collection(
name=collection_name,
metadata={"hnsw:space": "cosine"}
)
def store_trajectory(self, trajectory: Trajectory) -> None:
"""Store a successful trajectory for future reference."""
if not trajectory.success:
return # Only learn from successes
# Create searchable representation
doc = f"Task: {trajectory.task}\n"
doc += f"Approach: {self._summarize_approach(trajectory.steps)}"
self.collection.add(
            ids=[f"traj_{hash(trajectory.task)}_{self.collection.count()}"],  # count() avoids fetching all ids
documents=[doc],
metadatas=[{
"task": trajectory.task,
"steps": json.dumps(trajectory.steps),
"outcome": json.dumps(trajectory.outcome),
"success": trajectory.success
}]
)
def recall_similar(
self,
task: str,
k: int = 3,
min_similarity: float = 0.5
) -> list[Trajectory]:
"""Retrieve trajectories from similar past tasks."""
results = self.collection.query(
query_texts=[task],
n_results=k,
include=["metadatas", "distances"]
)
trajectories = []
for i, distance in enumerate(results['distances'][0]):
similarity = 1 - distance
if similarity < min_similarity:
continue
metadata = results['metadatas'][0][i]
trajectories.append(Trajectory(
task=metadata['task'],
steps=json.loads(metadata['steps']),
outcome=json.loads(metadata['outcome']),
success=metadata['success']
))
return trajectories
def build_few_shot_prompt(
self,
task: str,
system_message: str,
k: int = 3
) -> list[dict]:
"""Build a prompt with dynamic few-shot examples."""
examples = self.recall_similar(task, k=k)
messages = [{"role": "system", "content": system_message}]
# Add retrieved examples as conversation turns
for ex in examples:
messages.append({
"role": "user",
"content": f"Task: {ex.task}"
})
messages.append({
"role": "assistant",
"content": self._format_trajectory(ex.steps)
})
# Add current task
messages.append({
"role": "user",
"content": f"Task: {task}"
})
return messages
def _summarize_approach(self, steps: list[dict]) -> str:
"""Create a brief summary of the approach taken."""
actions = [s.get('action', '') for s in steps]
return " -> ".join(actions[:5])
def _format_trajectory(self, steps: list[dict]) -> str:
"""Format trajectory steps for few-shot example."""
formatted = []
for step in steps:
formatted.append(f"Thought: {step.get('thought', '')}")
formatted.append(f"Action: {step.get('action', '')}")
if 'observation' in step:
formatted.append(f"Observation: {step['observation']}")
return "\n".join(formatted)
# Usage
learner = TokenSpaceLearner()
# After successful task completion
learner.store_trajectory(Trajectory(
task="Parse the JSON file and extract all email addresses",
steps=[
{"thought": "Need to read the file first", "action": "read_file('data.json')"},
{"thought": "Parse JSON and find emails", "action": "extract_emails(data)"},
],
outcome={"emails_found": 15},
success=True
))
# When handling a new similar task
prompt = learner.build_few_shot_prompt(
task="Extract phone numbers from the CSV file",
system_message="You are a data extraction assistant."
)

using Qdrant.Client;
using Qdrant.Client.Grpc; // PointStruct, VectorParams, Distance live here
using System.Text.Json;
public record Trajectory(
string Task,
List<TrajectoryStep> Steps,
Dictionary<string, object> Outcome,
bool Success
);
public record TrajectoryStep(
string Thought,
string Action,
string? Observation = null
);
public class TokenSpaceLearner
{
private readonly QdrantClient _qdrant;
private readonly EmbeddingModel _embedder;
private const string CollectionName = "trajectories";
    public TokenSpaceLearner(string host = "localhost", int port = 6334)
    {
        _qdrant = new QdrantClient(host, port);
_embedder = new EmbeddingModel();
EnsureCollectionExists().Wait();
}
public async Task StoreTrajectoryAsync(
Trajectory trajectory,
CancellationToken ct = default)
{
if (!trajectory.Success)
return; // Only learn from successes
var doc = $"Task: {trajectory.Task}\n";
doc += $"Approach: {SummarizeApproach(trajectory.Steps)}";
var embedding = _embedder.Encode(doc);
var point = new PointStruct
{
Id = new PointId { Uuid = Guid.NewGuid().ToString() },
Vectors = embedding,
Payload = {
["task"] = trajectory.Task,
["steps"] = JsonSerializer.Serialize(trajectory.Steps),
["outcome"] = JsonSerializer.Serialize(trajectory.Outcome),
["success"] = trajectory.Success
}
};
await _qdrant.UpsertAsync(CollectionName, new[] { point }, ct);
}
public async Task<List<Trajectory>> RecallSimilarAsync(
string task,
int k = 3,
float minSimilarity = 0.5f,
CancellationToken ct = default)
{
var embedding = _embedder.Encode(task);
var results = await _qdrant.SearchAsync(
CollectionName,
embedding,
limit: (ulong)k,
scoreThreshold: minSimilarity,
ct: ct
);
return results.Select(r => new Trajectory(
Task: r.Payload["task"].StringValue,
Steps: JsonSerializer.Deserialize<List<TrajectoryStep>>(
r.Payload["steps"].StringValue
)!,
Outcome: JsonSerializer.Deserialize<Dictionary<string, object>>(
r.Payload["outcome"].StringValue
)!,
Success: r.Payload["success"].BoolValue
)).ToList();
}
public async Task<List<ChatMessage>> BuildFewShotPromptAsync(
string task,
string systemMessage,
int k = 3,
CancellationToken ct = default)
{
var examples = await RecallSimilarAsync(task, k, ct: ct);
var messages = new List<ChatMessage>
{
new("system", systemMessage)
};
foreach (var ex in examples)
{
messages.Add(new("user", $"Task: {ex.Task}"));
messages.Add(new("assistant", FormatTrajectory(ex.Steps)));
}
messages.Add(new("user", $"Task: {task}"));
return messages;
}
private string SummarizeApproach(List<TrajectoryStep> steps) =>
string.Join(" -> ", steps.Take(5).Select(s => s.Action));
private string FormatTrajectory(List<TrajectoryStep> steps) =>
string.Join("\n", steps.SelectMany(s => new[]
{
$"Thought: {s.Thought}",
$"Action: {s.Action}",
s.Observation != null ? $"Observation: {s.Observation}" : null
}.Where(x => x != null)));
    private async Task EnsureCollectionExists()
    {
        if (!await _qdrant.CollectionExistsAsync(CollectionName))
        {
            // Vector size must match the embedding model's output dimension
            await _qdrant.CreateCollectionAsync(
                CollectionName,
                new VectorParams { Size = 384, Distance = Distance.Cosine });
        }
    }
}

Key Benefits
| Benefit | Description |
|---|---|
| No retraining | Learning happens through context, not weight updates |
| Immediate | New experiences are available for the next request |
| Interpretable | You can inspect what examples were retrieved |
| Safe | Read-only operation; can't corrupt the model |
| Domain-specific | Naturally adapts to your specific use cases |
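The interpretability point is easy to act on: because recall is an ordinary vector query, you can log exactly which stored trajectories shaped a given prompt. A minimal sketch against the Python TokenSpaceLearner above (explain_recall is our name for the helper, not part of the class):

def explain_recall(learner: TokenSpaceLearner, task: str, k: int = 3) -> None:
    """Print which stored trajectories would be injected for this task."""
    results = learner.collection.query(
        query_texts=[task],
        n_results=k,
        include=["metadatas", "distances"]
    )
    for metadata, distance in zip(results["metadatas"][0], results["distances"][0]):
        similarity = 1 - distance  # Chroma cosine distance -> similarity
        print(f"similarity={similarity:.2f}  task={metadata['task']}")

explain_recall(learner, "Extract phone numbers from the CSV file")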
Quality Over Quantity
A trajectory store works best when it is curated. The success check in store_trajectory is deliberate: a handful of distinct, demonstrably good examples beats a large set of noisy ones.
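One way to enforce that curation, as a sketch: gate writes to the store on a minimum outcome quality and skip near-duplicates. The numeric "score" field on outcome and both thresholds are assumptions for illustration, not part of the Trajectory type above.

def store_if_worthwhile(
    learner: TokenSpaceLearner,
    trajectory: Trajectory,
    min_score: float = 0.8,
    dedup_similarity: float = 0.95
) -> bool:
    """Store only high-quality, non-redundant trajectories."""
    if not trajectory.success:
        return False
    # Hypothetical quality score attached by the evaluator
    if trajectory.outcome.get("score", 1.0) < min_score:
        return False
    # Skip if a near-identical task is already stored
    if learner.recall_similar(trajectory.task, k=1, min_similarity=dedup_similarity):
        return False
    learner.store_trajectory(trajectory)
    return True

The effect is that the few-shot slots stay reserved for distinct, verified approaches rather than filling up with near-copies of one success.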
2. Reflexion
Reflexion (Shinn et al., 2023) enables agents to learn from failures through self-reflection. Instead of giving up after an error, the agent analyzes what went wrong and tries again.
┌─────────────┐
│ TASK │
└──────┬──────┘
│
┌───────────────┼───────────────┐
│ ▼ │
│ ┌───────────────┐ │
│ │ ACTOR │ │
│ │ (Generate │ │
│ │ Trajectory) │ │
│ └───────┬───────┘ │
│ │ │
│ ▼ │
│ ┌───────────────┐ │
│ │ EVALUATOR │ │
│ │ (Check if │ │
│ │ Correct) │ │
│ └───────┬───────┘ │
│ │ │
│ ┌───────┴───────┐ │
│ │ │ │
│ Success Failure │
│ │ │ │
│ ▼ ▼ │
│ ┌─────┐ ┌───────────┐ │
│ │DONE │ │ REFLECTOR │ │
│ └─────┘ │ │ │
│ │ "What went│ │
│ │ wrong?" │ │
│ │ "Why?" │ │
│ │ "How to │ │
│ │ fix?" │ │
│ └─────┬─────┘ │
│ │ │
│ ▼ │
│ ┌───────────┐ │
│ │ MEMORY │ │
│ │(Reflections) │
│ └─────┬─────┘ │
│ │ │
└────────────────────┘ │
(retry with │
reflections) │
│
┌───────────────────────────────┘
│
▼
┌─────────────┐
│ LONG-TERM │
│ MEMORY │
│ (Learnings) │
└─────────────┘

# Reflexion: Self-reflection for iterative improvement
class ReflexionAgent:
shortTermMemory: [] # Current task context
longTermMemory: [] # Persistent reflections
function solve(task, maxAttempts = 3):
for attempt in range(maxAttempts):
# Generate trajectory (attempt to solve)
trajectory = actor.generate(
task: task,
reflections: longTermMemory, # Include past learnings
previousAttempt: shortTermMemory
)
# Evaluate the trajectory
evaluation = evaluator.assess(task, trajectory)
if evaluation.success:
# Success! Store positive reflection
reflection = reflector.generate(
task: task,
trajectory: trajectory,
outcome: "SUCCESS",
learnings: "This approach worked because..."
)
longTermMemory.append(reflection)
return trajectory
else:
# Failure - generate reflection
reflection = reflector.generate(
task: task,
trajectory: trajectory,
outcome: "FAILURE",
errors: evaluation.errors,
learnings: "This failed because... Next time I should..."
)
# Store for next attempt
shortTermMemory.append({
attempt: attempt,
trajectory: trajectory,
reflection: reflection
})
# Max attempts reached
return bestAttempt(shortTermMemory)
# Reflector generates structured self-critique
function generateReflection(task, trajectory, outcome, errors):
prompt = """
Task: {task}
Your approach:
{trajectory}
Outcome: {outcome}
Errors encountered: {errors}
Generate a reflection with:
1. What went wrong?
2. Why did it go wrong?
3. What should I do differently next time?
4. Specific actionable improvements
"""
    return llm.generate(prompt)

from dataclasses import dataclass, field
from typing import Callable
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field
class ReflectionOutput(BaseModel):
what_went_wrong: str = Field(description="What went wrong in this attempt")
why_it_failed: str = Field(description="Root cause of the failure")
improvements: list[str] = Field(description="Specific improvements for next attempt")
@dataclass
class Reflection:
task: str
attempt: int
trajectory: str
outcome: str
what_went_wrong: str
why_it_failed: str
improvements: list[str]
@dataclass
class ReflexionAgent:
"""Agent that learns from self-reflection on failures."""
llm: ChatOpenAI = field(default_factory=lambda: ChatOpenAI(model="gpt-4"))
short_term_memory: list[Reflection] = field(default_factory=list)
long_term_memory: list[Reflection] = field(default_factory=list)
def solve(
self,
task: str,
max_attempts: int = 3,
        evaluator: Callable | None = None
) -> tuple[str, bool]:
"""Attempt to solve task with reflection on failures."""
self.short_term_memory = []
for attempt in range(max_attempts):
trajectory = self._generate_trajectory(task, attempt)
success, errors = evaluator(trajectory) if evaluator else (False, [])
if success:
reflection = self._generate_reflection(task, attempt, trajectory, "SUCCESS", [])
self.long_term_memory.append(reflection)
return trajectory, True
reflection = self._generate_reflection(task, attempt, trajectory, "FAILURE", errors)
self.short_term_memory.append(reflection)
return self._select_best_attempt(), False
def _generate_trajectory(self, task: str, attempt: int) -> str:
"""Generate a solution attempt using LangChain."""
messages = [("system", self._build_system_prompt())]
if self.long_term_memory:
learnings = self._format_learnings(self.long_term_memory[-5:])
messages.append(("system", f"Learnings from past tasks:\n{learnings}"))
if self.short_term_memory:
reflections = self._format_reflections(self.short_term_memory)
messages.append(("user", f"Previous attempts and reflections:\n{reflections}"))
messages.append(("user", f"Task: {task}"))
        # Invoke with the message list directly; trajectory text may contain
        # braces that would break ChatPromptTemplate's template parsing
        response = self.llm.invoke(messages)
        return response.content
def _generate_reflection(
self, task: str, attempt: int, trajectory: str, outcome: str, errors: list[str]
) -> Reflection:
"""Generate structured reflection using LangChain JSON parser."""
parser = JsonOutputParser(pydantic_object=ReflectionOutput)
prompt = ChatPromptTemplate.from_messages([
("system", "Analyze this attempt and generate a reflection."),
("user", """Task: {task}
Attempt #{attempt}:
{trajectory}
Outcome: {outcome}
Errors: {errors}
{format_instructions}""")
])
chain = prompt | self.llm | parser
data = chain.invoke({
"task": task,
"attempt": attempt + 1,
"trajectory": trajectory,
"outcome": outcome,
"errors": ', '.join(errors) if errors else 'None',
"format_instructions": parser.get_format_instructions()
})
return Reflection(
task=task, attempt=attempt, trajectory=trajectory, outcome=outcome,
what_went_wrong=data.get("what_went_wrong", ""),
why_it_failed=data.get("why_it_failed", ""),
improvements=data.get("improvements", [])
)
    def _build_system_prompt(self) -> str:
        return "You are a careful problem solver. Think step by step."

    def _format_learnings(self, reflections: list[Reflection]) -> str:
        return "\n".join(
            f"- {r.task}: {'; '.join(r.improvements)}" for r in reflections
        )

    def _format_reflections(self, reflections: list[Reflection]) -> str:
        return "\n".join(
            f"Attempt {r.attempt + 1}: {r.what_went_wrong} "
            f"(why: {r.why_it_failed}; next: {'; '.join(r.improvements)})"
            for r in reflections
        )

    def _select_best_attempt(self) -> str:
        # All attempts failed; return the last (most-informed) trajectory
        return self.short_term_memory[-1].trajectory if self.short_term_memory else ""

# Usage
agent = ReflexionAgent()
def code_evaluator(trajectory: str) -> tuple[bool, list[str]]:
try:
exec(trajectory)
return True, []
except Exception as e:
return False, [str(e)]
solution, success = agent.solve(
task="Write a function to find the nth Fibonacci number",
evaluator=code_evaluator
)

using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;
using System.Text.Json;
using OpenAI;
public record Reflection(
string Task,
int Attempt,
string Trajectory,
string Outcome,
string WhatWentWrong,
string WhyItFailed,
List<string> Improvements
);
public class ReflexionAgent
{
private readonly AIAgent _agent;
private readonly IChatClient _chatClient;
private List<Reflection> _shortTermMemory = new();
private readonly List<Reflection> _longTermMemory = new();
public ReflexionAgent(string apiKey)
{
_chatClient = new OpenAIClient(apiKey)
.GetChatClient("gpt-4o")
.AsIChatClient();
_agent = _chatClient.CreateAIAgent(
name: "ReflexionAgent",
instructions: "You solve tasks and learn from your mistakes."
);
}
public async Task<(string Solution, bool Success)> SolveAsync(
string task,
Func<string, (bool Success, List<string> Errors)> evaluator,
int maxAttempts = 3)
{
_shortTermMemory = new();
for (int attempt = 0; attempt < maxAttempts; attempt++)
{
var trajectory = await GenerateTrajectoryAsync(task, attempt);
var (success, errors) = evaluator(trajectory);
if (success)
{
var reflection = await GenerateReflectionAsync(
task, attempt, trajectory, "SUCCESS", new()
);
_longTermMemory.Add(reflection);
return (trajectory, true);
}
var failureReflection = await GenerateReflectionAsync(
task, attempt, trajectory, "FAILURE", errors
);
_shortTermMemory.Add(failureReflection);
}
return (SelectBestAttempt(), false);
}
private async Task<string> GenerateTrajectoryAsync(string task, int attempt)
{
var thread = _agent.GetNewThread();
// Add learnings from past tasks
if (_longTermMemory.Count > 0)
{
var learnings = FormatLearnings(_longTermMemory.TakeLast(5));
await thread.AddMessageAsync($"Learnings from past tasks:\n{learnings}");
}
// Add reflections from previous attempts
if (_shortTermMemory.Count > 0)
{
var reflections = FormatReflections(_shortTermMemory);
await thread.AddMessageAsync($"Previous attempts:\n{reflections}");
}
return await _agent.RunAsync($"Task: {task}", thread);
}
private async Task<Reflection> GenerateReflectionAsync(
string task, int attempt, string trajectory, string outcome, List<string> errors)
{
var prompt = $@"Analyze this attempt and generate a reflection.
Task: {task}
Attempt #{attempt + 1}:
{trajectory}
Outcome: {outcome}
Errors: {string.Join(", ", errors)}
Return JSON with: what_went_wrong, why_it_failed, improvements[]";
var response = await _chatClient.GetResponseAsync(
prompt,
new() { ResponseFormat = ChatResponseFormat.Json }
);
var data = JsonDocument.Parse(response.Text);
var root = data.RootElement;
return new Reflection(
Task: task, Attempt: attempt, Trajectory: trajectory, Outcome: outcome,
WhatWentWrong: root.GetProperty("what_went_wrong").GetString() ?? "",
WhyItFailed: root.GetProperty("why_it_failed").GetString() ?? "",
Improvements: root.GetProperty("improvements")
.EnumerateArray().Select(e => e.GetString()!).ToList()
);
}
private string FormatReflections(IEnumerable<Reflection> reflections) =>
string.Join("\n", reflections.Select(r => $@"
Attempt {r.Attempt + 1}:
- What went wrong: {r.WhatWentWrong}
- Why: {r.WhyItFailed}
- Improvements: {string.Join(", ", r.Improvements)}"));
    private string FormatLearnings(IEnumerable<Reflection> reflections) =>
        string.Join("\n", reflections.Select(r =>
            $"- {r.Task}: {string.Join("; ", r.Improvements)}"));

    private string SelectBestAttempt() =>
        // All attempts failed; return the last (most-informed) trajectory
        _shortTermMemory.Count > 0 ? _shortTermMemory[^1].Trajectory : string.Empty;
}

Reflection Structure
A good reflection includes:
- What went wrong? - Specific error or failure mode
- Why did it fail? - Root cause analysis
- What should I try next? - Concrete alternative approach
- What did I learn? - Generalizable insight
Reflection Quality
The value of Reflexion depends on how actionable the reflections are. A vague "try harder" reflection adds tokens without adding signal; a specific root cause and a concrete next step are what actually reduce retries.
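One cheap guard, sketched below with arbitrary thresholds of our choosing: reject reflections that are too short or contain only platitudes, and ask the reflector to regenerate before anything is stored. It reuses the Reflection dataclass from the Python example above.

def is_actionable(reflection: Reflection, min_words: int = 8) -> bool:
    """Heuristic filter that rejects vague reflections."""
    if len(reflection.what_went_wrong.split()) < min_words:
        return False
    if not reflection.improvements:
        return False
    # Each improvement should be a concrete instruction, not a platitude
    platitudes = {"try harder", "be careful", "do better"}
    return all(imp.lower().strip(". ") not in platitudes for imp in reflection.improvements)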
3. Self-Evolving Agents
Experimental & Safety-Critical
The most advanced (and risky) form of agent learning: agents that modify their own prompts, generate new tools, or even write new code.
Research Directions
| Approach | What Evolves | Safety Level |
|---|---|---|
| Self-Critique (SCA) | Output quality through revision | Safe (no persistent changes) |
| Prompt Evolution | System prompts based on performance | Moderate (prompts can drift) |
| Tool Generation | New tools/functions | Risky (code execution) |
| Architecture Evolution | Agent structure itself | Highly experimental |
# Self-Evolving Agents: Agents that improve their own code/prompts
# WARNING: This is an active research area with safety concerns.
# These patterns are experimental and should be used with caution.
# Pattern 1: Self-Critique Agent (SCA)
# Agent generates, critiques, and revises its own outputs
class SelfCritiqueAgent:
function generate(task):
# Initial generation
output = llm.generate(task)
# Self-critique loop
for i in range(maxRevisions):
critique = llm.generate(
"Critique this output for errors, improvements: " + output
)
if critique.noIssuesFound:
break
# Revise based on critique
output = llm.generate(
"Original: " + output +
"Critique: " + critique +
"Improved version:"
)
return output
# Pattern 2: Prompt Evolution
# Agent evolves its own system prompt based on performance
class PromptEvolver:
currentPrompt: string
performanceHistory: []
function evolve(tasks, evaluator):
for task in tasks:
# Execute with current prompt
result = agent.run(currentPrompt, task)
score = evaluator(result)
performanceHistory.append({ prompt: currentPrompt, score: score })
# Generate improved prompt based on performance
lowScoreTasks = filter(performanceHistory, score < threshold)
newPrompt = llm.generate(
"Current prompt: " + currentPrompt +
"Failed on these tasks: " + lowScoreTasks +
"Generate an improved prompt that would handle these cases better."
)
# A/B test new prompt
if evaluate(newPrompt) > evaluate(currentPrompt):
currentPrompt = newPrompt
# Pattern 3: Tool/Skill Generation
# Agent creates new tools when existing ones are insufficient
class ToolGenerator:
function handleTask(task):
# Try existing tools
result = agent.run(task, availableTools)
if result.success:
return result
if result.error == "NO_SUITABLE_TOOL":
# Generate a new tool
newTool = llm.generate(
"Task that failed: " + task +
"Available tools: " + availableTools +
"Generate a new tool (name, description, implementation) " +
"that would help solve this task."
)
# Validate and sandbox the new tool
if validateTool(newTool) and sandboxTest(newTool):
availableTools.append(newTool)
return agent.run(task, availableTools)
        return result

from dataclasses import dataclass
from typing import Callable
import ast
import json
@dataclass
class EvolutionResult:
original: str
evolved: str
improvement_score: float
changes_made: list[str]
class SelfCritiqueAgent:
"""Agent that critiques and revises its own outputs."""
def __init__(self, client, model: str = "gpt-4"):
self.client = client
self.model = model
def generate_with_critique(
self,
task: str,
max_revisions: int = 3
) -> tuple[str, list[str]]:
"""Generate output with self-critique loop."""
critiques = []
output = self._generate(task)
for _ in range(max_revisions):
# Self-critique
critique = self._critique(task, output)
critiques.append(critique)
if "no issues found" in critique.lower():
break
# Revise based on critique
output = self._revise(task, output, critique)
return output, critiques
def _generate(self, task: str) -> str:
response = self.client.chat.completions.create(
model=self.model,
messages=[{"role": "user", "content": task}]
)
return response.choices[0].message.content
def _critique(self, task: str, output: str) -> str:
prompt = f"""Review this output for the given task.
Task: {task}
Output:
{output}
Identify any:
1. Errors or inaccuracies
2. Missing information
3. Areas for improvement
4. Logical inconsistencies
If the output is satisfactory, respond with "No issues found."
Otherwise, list specific issues."""
response = self.client.chat.completions.create(
model=self.model,
messages=[{"role": "user", "content": prompt}]
)
return response.choices[0].message.content
def _revise(self, task: str, output: str, critique: str) -> str:
prompt = f"""Revise this output based on the critique.
Task: {task}
Original output:
{output}
Critique:
{critique}
Provide an improved version that addresses all the issues."""
response = self.client.chat.completions.create(
model=self.model,
messages=[{"role": "user", "content": prompt}]
)
return response.choices[0].message.content
class PromptEvolver:
"""Evolves system prompts based on performance."""
def __init__(self, client, initial_prompt: str):
self.client = client
self.current_prompt = initial_prompt
self.history: list[dict] = []
def evolve(
self,
test_cases: list[tuple[str, str]], # (input, expected_output)
evaluator: Callable[[str, str], float],
generations: int = 5
) -> EvolutionResult:
"""Evolve prompt over multiple generations."""
        original_prompt = self.current_prompt
        original_score = self._evaluate_prompt(test_cases, evaluator)
        best_score = original_score
for gen in range(generations):
# Find failure cases
failures = self._identify_failures(test_cases, evaluator)
if not failures:
break # Perfect score
# Generate improved prompt
new_prompt = self._generate_improved_prompt(failures)
            # Evaluate new prompt
            new_score = self._evaluate_prompt(test_cases, evaluator, new_prompt)
            # Keep only if strictly better (no need to re-evaluate the current prompt)
            if new_score > best_score:
                self.current_prompt = new_prompt
                best_score = new_score
                self.history.append({
                    "generation": gen,
                    "score": new_score,
                    "prompt": new_prompt
                })
final_score = self._evaluate_prompt(test_cases, evaluator)
return EvolutionResult(
original=original_prompt,
evolved=self.current_prompt,
improvement_score=final_score - original_score,
changes_made=[h["prompt"][:100] for h in self.history]
)
    def _evaluate_prompt(
        self,
        test_cases: list[tuple[str, str]],
        evaluator: Callable[[str, str], float],
        prompt: str | None = None
    ) -> float:
        """Average evaluator score across all test cases."""
        prompt = prompt or self.current_prompt
        scores = [evaluator(self._run(prompt, inp), expected) for inp, expected in test_cases]
        return sum(scores) / len(scores)

    def _identify_failures(
        self,
        test_cases: list[tuple[str, str]],
        evaluator: Callable[[str, str], float],
        threshold: float = 0.5
    ) -> list[dict]:
        """Collect cases where the current prompt scores below threshold."""
        failures = []
        for inp, expected in test_cases:
            actual = self._run(self.current_prompt, inp)
            if evaluator(actual, expected) < threshold:
                failures.append({"input": inp, "expected": expected, "actual": actual})
        return failures

    def _run(self, system_prompt: str, user_input: str) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_input}
            ]
        )
        return response.choices[0].message.content

    def _generate_improved_prompt(self, failures: list[dict]) -> str:
failure_summary = "\n".join([
f"Input: {f['input']}\nExpected: {f['expected']}\nGot: {f['actual']}"
for f in failures[:5]
])
prompt = f"""Current system prompt:
{self.current_prompt}
This prompt failed on these cases:
{failure_summary}
Generate an improved system prompt that would handle these cases correctly.
Keep the same general purpose but add specific instructions to address the failures."""
response = self.client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}]
)
return response.choices[0].message.content
class SafeToolGenerator:
"""Generate new tools with safety constraints."""
    # eval/exec/compile are caught separately as calls in _is_safe below
    FORBIDDEN_MODULES = {'os', 'subprocess', 'sys', 'importlib', 'socket', 'shutil'}
def __init__(self, client):
self.client = client
self.generated_tools: dict[str, Callable] = {}
def generate_tool(
self,
task_description: str,
existing_tools: list[str]
) -> dict | None:
"""Generate a new tool for a task."""
prompt = f"""Generate a Python function to help with this task:
{task_description}
Existing tools: {', '.join(existing_tools)}
Requirements:
1. Pure function (no side effects)
2. No file system access
3. No network calls
4. No imports except: math, json, re, datetime
5. Include docstring and type hints
Return as JSON:
{{
"name": "function_name",
"description": "what it does",
"code": "def function_name(...):\n ..."
}}"""
response = self.client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}],
response_format={"type": "json_object"}
)
tool_spec = json.loads(response.choices[0].message.content)
# Validate safety
if not self._is_safe(tool_spec["code"]):
return None
# Test in sandbox
if not self._sandbox_test(tool_spec):
return None
return tool_spec
def _is_safe(self, code: str) -> bool:
"""Check if code is safe to execute."""
try:
tree = ast.parse(code)
for node in ast.walk(tree):
# Check for forbidden imports
if isinstance(node, ast.Import):
for alias in node.names:
if alias.name.split('.')[0] in self.FORBIDDEN_MODULES:
return False
if isinstance(node, ast.ImportFrom):
if node.module and node.module.split('.')[0] in self.FORBIDDEN_MODULES:
return False
# Check for eval/exec calls
if isinstance(node, ast.Call):
if isinstance(node.func, ast.Name):
if node.func.id in ('eval', 'exec', 'compile'):
return False
return True
except SyntaxError:
            return False

    def _sandbox_test(self, tool_spec: dict) -> bool:
        """Smoke-test the generated function in a restricted namespace.

        A real deployment should run this in a subprocess or container;
        restricted builtins alone are not a hard security boundary.
        """
        safe_globals = {"__builtins__": {
            "len": len, "range": range, "min": min, "max": max,
            "sum": sum, "abs": abs, "str": str, "int": int, "float": float
        }}
        try:
            exec(tool_spec["code"], safe_globals)  # defines the function only
            return callable(safe_globals.get(tool_spec["name"]))
        except Exception:
            return False

Safety Considerations
- Sandbox All Generated Code: never run agent-generated tools in the host process; validate them (as SafeToolGenerator does) and execute in an isolated environment.
- Version Control Prompts: keep every evolved prompt with its evaluation score so drift is visible and reversible (see the sketch below).
- Human-in-the-Loop: require explicit human approval before an evolved prompt or generated tool reaches production.
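The last two points fit in a few lines. A minimal sketch using only the standard library (PromptVersionStore and its fields are our own illustration, not an established API): every candidate prompt is recorded append-only with its score, and nothing becomes active without an explicit approval step.

import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class PromptVersionStore:
    """Append-only history of evolved prompts; activation requires human approval."""
    versions: list[dict] = field(default_factory=list)
    active_id: str | None = None

    def propose(self, prompt: str, score: float) -> str:
        version_id = hashlib.sha256(prompt.encode()).hexdigest()[:12]
        self.versions.append({
            "id": version_id,
            "prompt": prompt,
            "score": score,
            "created_at": datetime.now(timezone.utc).isoformat(),
            "approved": False,
        })
        return version_id

    def approve(self, version_id: str, reviewer: str) -> None:
        # Human-in-the-loop: only reviewed versions can become active
        for v in self.versions:
            if v["id"] == version_id:
                v["approved"] = True
                v["reviewer"] = reviewer
                self.active_id = version_id
                return
        raise KeyError(version_id)

    def rollback(self) -> None:
        # Revert to the most recently approved prior version, if any
        approved = [v for v in self.versions
                    if v["approved"] and v["id"] != self.active_id]
        self.active_id = approved[-1]["id"] if approved else None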
Evaluation Approach
| Metric | What it Measures | Applies To |
|---|---|---|
| Learning Curve | Performance improvement over tasks | All approaches |
| Sample Efficiency | Tasks needed to reach performance level | Token space, Reflexion |
| Reflection Quality | Actionability of generated reflections | Reflexion |
| Retry Reduction | Fewer attempts needed over time | Reflexion |
| Transfer Learning | Performance on related but new tasks | All approaches |
| Stability | Variance in performance over time | Self-evolving |
Metrics for evaluating agent learning
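The first two metrics are straightforward to compute from execution logs. A sketch, assuming you record a success flag and an attempt count per task (both names are ours):

def learning_curve(successes: list[bool], window: int = 20) -> list[float]:
    """Rolling success rate by task index; should trend upward if learning works."""
    return [
        sum(successes[max(0, i - window + 1):i + 1]) / min(i + 1, window)
        for i in range(len(successes))
    ]

def retry_reduction(attempts: list[int]) -> float:
    """Mean attempts in the first half minus the second half (positive = improving).
    Assumes at least two recorded tasks."""
    mid = len(attempts) // 2
    first, second = attempts[:mid], attempts[mid:]
    return sum(first) / len(first) - sum(second) / len(second)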
Choosing an Approach
| Factor | Token Space | Reflexion | Self-Evolving |
|---|---|---|---|
| Complexity | Low | Medium | High |
| Safety | High | High | Low |
| Latency Impact | Minimal | 2-3x per task | Variable |
| Best For | Routine tasks | Complex reasoning | Research |
| Production Ready | Yes | Yes (with limits) | No |
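Read as a decision rule, the table collapses to a few branches. A sketch (the inputs and thresholds are ours, not a standard):

def choose_learning_approach(
    production: bool,
    failures_dominate: bool,
    latency_budget_multiplier: float
) -> str:
    """Map the factors in the table above to a starting point."""
    if not production:
        return "self-evolving (research only, sandboxed)"
    if failures_dominate and latency_budget_multiplier >= 2:
        return "reflexion (with bounded attempts)"
    return "token space learning"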