Learning & Adaptation

How AI agents improve over time without retraining: learning in token space, reflexion patterns, and emerging self-evolving architectures.

The Learning Challenge

Traditional ML models learn by updating weights during training. But production LLM agents can't retrain on every interaction. How can they improve?

Token Space Learning

Store successful trajectories and inject them as few-shot examples

Reflexion

Self-critique and iterative improvement on failures

Self-Evolution

Agents that modify their own prompts, tools, or code

No Weight Updates

All these techniques work at inference time by manipulating context, not by changing model weights. This makes them practical for deployed systems.

The Learning Spectrum

Agent Learning Approaches
Safety/Control ◄──────────────────────────────────────► Autonomy/Risk

┌─────────────────┬─────────────────┬─────────────────┬─────────────────┐
│   STATIC        │   TOKEN SPACE   │   REFLEXION     │  SELF-EVOLVING  │
│   PROMPTS       │   LEARNING      │                 │                 │
├─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│                 │                 │                 │                 │
│ Fixed system    │ Dynamic few-    │ Self-critique   │ Prompt/code     │
│ prompt, no      │ shot examples   │ and iterative   │ modification    │
│ adaptation      │ from trajectory │ improvement     │ by agent        │
│                 │ storage         │                 │                 │
│                 │                 │                 │                 │
│ • Predictable   │ • Learns from   │ • Improves on   │ • Autonomous    │
│ • Consistent    │   successes     │   failures      │   improvement   │
│ • No learning   │ • Safe (read-   │ • Multi-attempt │ • Risky if      │
│                 │   only context) │   solving       │   unsupervised  │
│                 │                 │                 │                 │
└─────────────────┴─────────────────┴─────────────────┴─────────────────┘

       ▲                  ▲                  ▲                  ▲
       │                  │                  │                  │
  Most systems       Production         Research          Experimental
  today              ready              interest          (safety concerns)

1. Learning in Token Space

The simplest form of agent learning: store successful task completions and retrieve them as few-shot examples for similar future tasks.

Token Space Learning Flow
┌─────────────────────────────────────────────────────────────────┐
│                        NEW TASK                                  │
│                    "Parse this JSON"                             │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
                    ┌─────────────────┐
                    │   EMBED TASK    │
                    │   DESCRIPTION   │
                    └─────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                     TRAJECTORY STORE                            │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │  Task: "Parse XML file"          Similarity: 0.72       │   │
│  │  Steps: read_file → parse → extract                     │   │
│  ├─────────────────────────────────────────────────────────┤   │
│  │  Task: "Extract data from JSON"  Similarity: 0.91  ◄───┼───│
│  │  Steps: read_file → json.loads → filter_keys            │   │
│  ├─────────────────────────────────────────────────────────┤   │
│  │  Task: "Convert CSV to dict"     Similarity: 0.68       │   │
│  │  Steps: read_file → csv.reader → to_dict                │   │
│  └─────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                    DYNAMIC PROMPT                                │
│  System: You are a data processing assistant...                 │
│                                                                  │
│  Example 1: (from trajectory store)                              │
│  User: Extract data from JSON                                    │
│  Assistant: read_file → json.loads → filter_keys                 │
│                                                                  │
│  Current task:                                                   │
│  User: Parse this JSON                                           │
└─────────────────────────────────────────────────────────────────┘
Token Space Learning Implementation
# Learning in Token Space: No weight updates, only context manipulation

class TokenSpaceLearner:
    trajectoryStore: VectorDB  # Stores successful task completions
    fewShotSelector: EmbeddingModel

    function learn(task, trajectory, outcome):
        if outcome.success:
            # Store successful trajectory for future reference
            embedding = embed(task.description)
            trajectoryStore.store({
                task_embedding: embedding,
                task_description: task.description,
                trajectory: trajectory,  # Steps taken
                outcome: outcome,
                timestamp: now()
            })

    function recall(newTask, k = 3):
        # Find similar past tasks
        embedding = embed(newTask.description)
        similar = trajectoryStore.search(embedding, topK = k)

        # Format as few-shot examples
        examples = []
        for item in similar:
            examples.append({
                task: item.task_description,
                solution: formatTrajectory(item.trajectory)
            })

        return examples

    function execute(task):
        # Retrieve relevant past experiences
        examples = recall(task)

        # Build prompt with dynamic few-shot examples
        prompt = buildPrompt(
            systemMessage: "You are a helpful assistant...",
            fewShotExamples: examples,  # Learning injected here
            currentTask: task
        )

        # Execute with learned context
        trajectory = agent.run(prompt)
        outcome = evaluate(trajectory)

        # Learn from this execution
        learn(task, trajectory, outcome)

        return outcome

Python implementation

import chromadb
from dataclasses import dataclass
from typing import Optional
import json

@dataclass
class Trajectory:
    task: str
    steps: list[dict]  # {thought, action, observation}
    outcome: dict
    success: bool

class TokenSpaceLearner:
    """Learn from experience without updating model weights."""

    def __init__(self, collection_name: str = "trajectories"):
        self.client = chromadb.PersistentClient(path="./learning_db")
        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            metadata={"hnsw:space": "cosine"}
        )

    def store_trajectory(self, trajectory: Trajectory) -> None:
        """Store a successful trajectory for future reference."""
        if not trajectory.success:
            return  # Only learn from successes

        # Create searchable representation
        doc = f"Task: {trajectory.task}\n"
        doc += f"Approach: {self._summarize_approach(trajectory.steps)}"

        self.collection.add(
            ids=[f"traj_{hash(trajectory.task)}_{len(self.collection.get()['ids'])}"],
            documents=[doc],
            metadatas=[{
                "task": trajectory.task,
                "steps": json.dumps(trajectory.steps),
                "outcome": json.dumps(trajectory.outcome),
                "success": trajectory.success
            }]
        )

    def recall_similar(
        self,
        task: str,
        k: int = 3,
        min_similarity: float = 0.5
    ) -> list[Trajectory]:
        """Retrieve trajectories from similar past tasks."""
        results = self.collection.query(
            query_texts=[task],
            n_results=k,
            include=["metadatas", "distances"]
        )

        trajectories = []
        for i, distance in enumerate(results['distances'][0]):
            similarity = 1 - distance
            if similarity < min_similarity:
                continue

            metadata = results['metadatas'][0][i]
            trajectories.append(Trajectory(
                task=metadata['task'],
                steps=json.loads(metadata['steps']),
                outcome=json.loads(metadata['outcome']),
                success=metadata['success']
            ))

        return trajectories

    def build_few_shot_prompt(
        self,
        task: str,
        system_message: str,
        k: int = 3
    ) -> list[dict]:
        """Build a prompt with dynamic few-shot examples."""
        examples = self.recall_similar(task, k=k)

        messages = [{"role": "system", "content": system_message}]

        # Add retrieved examples as conversation turns
        for ex in examples:
            messages.append({
                "role": "user",
                "content": f"Task: {ex.task}"
            })
            messages.append({
                "role": "assistant",
                "content": self._format_trajectory(ex.steps)
            })

        # Add current task
        messages.append({
            "role": "user",
            "content": f"Task: {task}"
        })

        return messages

    def _summarize_approach(self, steps: list[dict]) -> str:
        """Create a brief summary of the approach taken."""
        actions = [s.get('action', '') for s in steps]
        return " -> ".join(actions[:5])

    def _format_trajectory(self, steps: list[dict]) -> str:
        """Format trajectory steps for few-shot example."""
        formatted = []
        for step in steps:
            formatted.append(f"Thought: {step.get('thought', '')}")
            formatted.append(f"Action: {step.get('action', '')}")
            if 'observation' in step:
                formatted.append(f"Observation: {step['observation']}")
        return "\n".join(formatted)

# Usage
learner = TokenSpaceLearner()

# After successful task completion
learner.store_trajectory(Trajectory(
    task="Parse the JSON file and extract all email addresses",
    steps=[
        {"thought": "Need to read the file first", "action": "read_file('data.json')"},
        {"thought": "Parse JSON and find emails", "action": "extract_emails(data)"},
    ],
    outcome={"emails_found": 15},
    success=True
))

# When handling a new similar task
prompt = learner.build_few_shot_prompt(
    task="Extract phone numbers from the CSV file",
    system_message="You are a data extraction assistant."
)

C# implementation

using Qdrant.Client;
using Qdrant.Client.Grpc;  // PointStruct, PointId
using System.Text.Json;

public record Trajectory(
    string Task,
    List<TrajectoryStep> Steps,
    Dictionary<string, object> Outcome,
    bool Success
);

public record TrajectoryStep(
    string Thought,
    string Action,
    string? Observation = null
);

// Simple role/content pair returned by BuildFewShotPromptAsync
public record ChatMessage(string Role, string Content);

public class TokenSpaceLearner
{
    private readonly QdrantClient _qdrant;
    private readonly EmbeddingModel _embedder;  // placeholder for your embedding client
    private const string CollectionName = "trajectories";

    public TokenSpaceLearner(string qdrantHost = "localhost", int qdrantPort = 6334)
    {
        _qdrant = new QdrantClient(qdrantHost, qdrantPort);
        _embedder = new EmbeddingModel();
        EnsureCollectionExists().Wait();
    }

    public async Task StoreTrajectoryAsync(
        Trajectory trajectory,
        CancellationToken ct = default)
    {
        if (!trajectory.Success)
            return; // Only learn from successes

        var doc = $"Task: {trajectory.Task}\n";
        doc += $"Approach: {SummarizeApproach(trajectory.Steps)}";

        var embedding = _embedder.Encode(doc);

        var point = new PointStruct
        {
            Id = new PointId { Uuid = Guid.NewGuid().ToString() },
            Vectors = embedding,
            Payload = {
                ["task"] = trajectory.Task,
                ["steps"] = JsonSerializer.Serialize(trajectory.Steps),
                ["outcome"] = JsonSerializer.Serialize(trajectory.Outcome),
                ["success"] = trajectory.Success
            }
        };

        await _qdrant.UpsertAsync(CollectionName, new[] { point }, ct);
    }

    public async Task<List<Trajectory>> RecallSimilarAsync(
        string task,
        int k = 3,
        float minSimilarity = 0.5f,
        CancellationToken ct = default)
    {
        var embedding = _embedder.Encode(task);

        var results = await _qdrant.SearchAsync(
            CollectionName,
            embedding,
            limit: (ulong)k,
            scoreThreshold: minSimilarity,
            ct: ct
        );

        return results.Select(r => new Trajectory(
            Task: r.Payload["task"].StringValue,
            Steps: JsonSerializer.Deserialize<List<TrajectoryStep>>(
                r.Payload["steps"].StringValue
            )!,
            Outcome: JsonSerializer.Deserialize<Dictionary<string, object>>(
                r.Payload["outcome"].StringValue
            )!,
            Success: r.Payload["success"].BoolValue
        )).ToList();
    }

    public async Task<List<ChatMessage>> BuildFewShotPromptAsync(
        string task,
        string systemMessage,
        int k = 3,
        CancellationToken ct = default)
    {
        var examples = await RecallSimilarAsync(task, k, ct: ct);

        var messages = new List<ChatMessage>
        {
            new("system", systemMessage)
        };

        foreach (var ex in examples)
        {
            messages.Add(new("user", $"Task: {ex.Task}"));
            messages.Add(new("assistant", FormatTrajectory(ex.Steps)));
        }

        messages.Add(new("user", $"Task: {task}"));

        return messages;
    }

    private string SummarizeApproach(List<TrajectoryStep> steps) =>
        string.Join(" -> ", steps.Take(5).Select(s => s.Action));

    private string FormatTrajectory(List<TrajectoryStep> steps) =>
        string.Join("\n", steps.SelectMany(s => new[]
        {
            $"Thought: {s.Thought}",
            $"Action: {s.Action}",
            s.Observation != null ? $"Observation: {s.Observation}" : null
        }.Where(x => x != null)));
}

Key Benefits

Benefit           Description
No retraining     Learning happens through context, not weight updates
Immediate         New experiences are available for the next request
Interpretable     You can inspect what examples were retrieved
Safe              Read-only operation; can't corrupt the model
Domain-specific   Naturally adapts to your specific use cases

Quality Over Quantity

Store only high-quality successful trajectories. A few excellent examples are better than many mediocre ones.
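
One way to enforce this is to gate storage on a quality check rather than on success alone. The sketch below is a minimal example layered on the TokenSpaceLearner above; the score_trajectory heuristic and the "verified" outcome flag are assumptions, not part of the original implementation.

def score_trajectory(trajectory: Trajectory) -> float:
    """Hypothetical quality heuristic: short, verified trajectories make better examples."""
    score = 1.0
    if len(trajectory.steps) > 10:                     # long, meandering solutions are weak few-shot examples
        score -= 0.4
    if not trajectory.outcome.get("verified", False):  # assumed flag set by your evaluator
        score -= 0.3
    return score

def store_if_good(learner: TokenSpaceLearner, trajectory: Trajectory, threshold: float = 0.7) -> bool:
    """Only keep trajectories worth imitating."""
    if trajectory.success and score_trajectory(trajectory) >= threshold:
        learner.store_trajectory(trajectory)
        return True
    return False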

2. Reflexion

Reflexion (Shinn et al., 2023) enables agents to learn from failures through self-reflection. Instead of giving up after an error, the agent analyzes what went wrong and tries again.

Reflexion Loop
                    ┌─────────────┐
                    │    TASK     │
                    └──────┬──────┘
                           │
           ┌───────────────┼───────────────┐
           │               ▼               │
           │      ┌───────────────┐        │
           │      │    ACTOR      │        │
           │      │  (Generate    │        │
           │      │   Trajectory) │        │
           │      └───────┬───────┘        │
           │              │                │
           │              ▼                │
           │      ┌───────────────┐        │
           │      │   EVALUATOR   │        │
           │      │  (Check if    │        │
           │      │   Correct)    │        │
           │      └───────┬───────┘        │
           │              │                │
           │      ┌───────┴───────┐        │
           │      │               │        │
           │   Success         Failure     │
           │      │               │        │
           │      ▼               ▼        │
           │   ┌─────┐     ┌───────────┐   │
           │   │DONE │     │ REFLECTOR │   │
           │   └─────┘     │           │   │
           │               │ "What went│   │
           │               │  wrong?"  │   │
           │               │ "Why?"    │   │
           │               │ "How to   │   │
           │               │  fix?"    │   │
           │               └─────┬─────┘   │
           │                     │         │
           │                     ▼         │
           │              ┌───────────┐    │
           │              │  MEMORY   │    │
           │              │(Reflections)   │
           │              └─────┬─────┘    │
           │                    │          │
           └────────────────────┘          │
                    (retry with            │
                     reflections)          │
                                           │
           ┌───────────────────────────────┘
           │
           ▼
    ┌─────────────┐
    │ LONG-TERM   │
    │ MEMORY      │
    │ (Learnings) │
    └─────────────┘
Reflexion Implementation
# Reflexion: Self-reflection for iterative improvement

class ReflexionAgent:
    shortTermMemory: []  # Current task context
    longTermMemory: []   # Persistent reflections

    function solve(task, maxAttempts = 3):
        for attempt in range(maxAttempts):
            # Generate trajectory (attempt to solve)
            trajectory = actor.generate(
                task: task,
                reflections: longTermMemory,  # Include past learnings
                previousAttempt: shortTermMemory
            )

            # Evaluate the trajectory
            evaluation = evaluator.assess(task, trajectory)

            if evaluation.success:
                # Success! Store positive reflection
                reflection = reflector.generate(
                    task: task,
                    trajectory: trajectory,
                    outcome: "SUCCESS",
                    learnings: "This approach worked because..."
                )
                longTermMemory.append(reflection)
                return trajectory

            else:
                # Failure - generate reflection
                reflection = reflector.generate(
                    task: task,
                    trajectory: trajectory,
                    outcome: "FAILURE",
                    errors: evaluation.errors,
                    learnings: "This failed because... Next time I should..."
                )

                # Store for next attempt
                shortTermMemory.append({
                    attempt: attempt,
                    trajectory: trajectory,
                    reflection: reflection
                })

        # Max attempts reached
        return bestAttempt(shortTermMemory)

# Reflector generates structured self-critique
function generateReflection(task, trajectory, outcome, errors):
    prompt = """
    Task: {task}

    Your approach:
    {trajectory}

    Outcome: {outcome}
    Errors encountered: {errors}

    Generate a reflection with:
    1. What went wrong?
    2. Why did it go wrong?
    3. What should I do differently next time?
    4. Specific actionable improvements
    """

    return llm.generate(prompt)

Python implementation

from dataclasses import dataclass, field
from typing import Callable
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field

class ReflectionOutput(BaseModel):
    what_went_wrong: str = Field(description="What went wrong in this attempt")
    why_it_failed: str = Field(description="Root cause of the failure")
    improvements: list[str] = Field(description="Specific improvements for next attempt")

@dataclass
class Reflection:
    task: str
    attempt: int
    trajectory: str
    outcome: str
    what_went_wrong: str
    why_it_failed: str
    improvements: list[str]

@dataclass
class ReflexionAgent:
    """Agent that learns from self-reflection on failures."""

    llm: ChatOpenAI = field(default_factory=lambda: ChatOpenAI(model="gpt-4"))
    short_term_memory: list[Reflection] = field(default_factory=list)
    long_term_memory: list[Reflection] = field(default_factory=list)

    def solve(
        self,
        task: str,
        max_attempts: int = 3,
        evaluator: Callable = None
    ) -> tuple[str, bool]:
        """Attempt to solve task with reflection on failures."""
        self.short_term_memory = []

        for attempt in range(max_attempts):
            trajectory = self._generate_trajectory(task, attempt)
            success, errors = evaluator(trajectory) if evaluator else (False, [])

            if success:
                reflection = self._generate_reflection(task, attempt, trajectory, "SUCCESS", [])
                self.long_term_memory.append(reflection)
                return trajectory, True

            reflection = self._generate_reflection(task, attempt, trajectory, "FAILURE", errors)
            self.short_term_memory.append(reflection)

        return self._select_best_attempt(), False

    def _generate_trajectory(self, task: str, attempt: int) -> str:
        """Generate a solution attempt using LangChain."""
        messages = [("system", self._build_system_prompt())]

        if self.long_term_memory:
            learnings = self._format_learnings(self.long_term_memory[-5:])
            messages.append(("system", f"Learnings from past tasks:\n{learnings}"))

        if self.short_term_memory:
            reflections = self._format_reflections(self.short_term_memory)
            messages.append(("user", f"Previous attempts and reflections:\n{reflections}"))

        messages.append(("user", f"Task: {task}"))

        prompt = ChatPromptTemplate.from_messages(messages)
        chain = prompt | self.llm
        response = chain.invoke({})
        return response.content

    def _generate_reflection(
        self, task: str, attempt: int, trajectory: str, outcome: str, errors: list[str]
    ) -> Reflection:
        """Generate structured reflection using LangChain JSON parser."""
        parser = JsonOutputParser(pydantic_object=ReflectionOutput)

        prompt = ChatPromptTemplate.from_messages([
            ("system", "Analyze this attempt and generate a reflection."),
            ("user", """Task: {task}

Attempt #{attempt}:
{trajectory}

Outcome: {outcome}
Errors: {errors}

{format_instructions}""")
        ])

        chain = prompt | self.llm | parser
        data = chain.invoke({
            "task": task,
            "attempt": attempt + 1,
            "trajectory": trajectory,
            "outcome": outcome,
            "errors": ', '.join(errors) if errors else 'None',
            "format_instructions": parser.get_format_instructions()
        })

        return Reflection(
            task=task, attempt=attempt, trajectory=trajectory, outcome=outcome,
            what_went_wrong=data.get("what_went_wrong", ""),
            why_it_failed=data.get("why_it_failed", ""),
            improvements=data.get("improvements", [])
        )

    def _build_system_prompt(self) -> str:
        return "You are a problem-solving agent. Think step by step and show your work."

    def _format_learnings(self, reflections: list[Reflection]) -> str:
        return "\n".join(f"- {r.task}: {'; '.join(r.improvements)}" for r in reflections)

    def _format_reflections(self, reflections: list[Reflection]) -> str:
        return "\n".join(
            f"Attempt {r.attempt + 1}: {r.what_went_wrong} | Fix: {'; '.join(r.improvements)}"
            for r in reflections
        )

    def _select_best_attempt(self) -> str:
        # With no successful attempt, fall back to the most recent one
        return self.short_term_memory[-1].trajectory if self.short_term_memory else ""

# Usage
agent = ReflexionAgent()

def code_evaluator(trajectory: str) -> tuple[bool, list[str]]:
    # Illustration only: execute model-generated code in a sandbox, never directly
    try:
        exec(trajectory)
        return True, []
        return True, []
    except Exception as e:
        return False, [str(e)]

solution, success = agent.solve(
    task="Write a function to find the nth Fibonacci number",
    evaluator=code_evaluator
)

C# implementation

using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;
using System.Text.Json;
using OpenAI;

public record Reflection(
    string Task,
    int Attempt,
    string Trajectory,
    string Outcome,
    string WhatWentWrong,
    string WhyItFailed,
    List<string> Improvements
);

public class ReflexionAgent
{
    private readonly AIAgent _agent;
    private readonly IChatClient _chatClient;
    private List<Reflection> _shortTermMemory = new();
    private readonly List<Reflection> _longTermMemory = new();

    public ReflexionAgent(string apiKey)
    {
        _chatClient = new OpenAIClient(apiKey)
            .GetChatClient("gpt-4o")
            .AsIChatClient();

        _agent = _chatClient.CreateAIAgent(
            name: "ReflexionAgent",
            instructions: "You solve tasks and learn from your mistakes."
        );
    }

    public async Task<(string Solution, bool Success)> SolveAsync(
        string task,
        Func<string, (bool Success, List<string> Errors)> evaluator,
        int maxAttempts = 3)
    {
        _shortTermMemory = new();

        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            var trajectory = await GenerateTrajectoryAsync(task, attempt);
            var (success, errors) = evaluator(trajectory);

            if (success)
            {
                var reflection = await GenerateReflectionAsync(
                    task, attempt, trajectory, "SUCCESS", new()
                );
                _longTermMemory.Add(reflection);
                return (trajectory, true);
            }

            var failureReflection = await GenerateReflectionAsync(
                task, attempt, trajectory, "FAILURE", errors
            );
            _shortTermMemory.Add(failureReflection);
        }

        return (SelectBestAttempt(), false);
    }

    private async Task<string> GenerateTrajectoryAsync(string task, int attempt)
    {
        var thread = _agent.GetNewThread();

        // Add learnings from past tasks
        if (_longTermMemory.Count > 0)
        {
            var learnings = FormatLearnings(_longTermMemory.TakeLast(5));
            await thread.AddMessageAsync($"Learnings from past tasks:\n{learnings}");
        }

        // Add reflections from previous attempts
        if (_shortTermMemory.Count > 0)
        {
            var reflections = FormatReflections(_shortTermMemory);
            await thread.AddMessageAsync($"Previous attempts:\n{reflections}");
        }

        return await _agent.RunAsync($"Task: {task}", thread);
    }

    private async Task<Reflection> GenerateReflectionAsync(
        string task, int attempt, string trajectory, string outcome, List<string> errors)
    {
        var prompt = $@"Analyze this attempt and generate a reflection.

Task: {task}
Attempt #{attempt + 1}:
{trajectory}
Outcome: {outcome}
Errors: {string.Join(", ", errors)}

Return JSON with: what_went_wrong, why_it_failed, improvements[]";

        var response = await _chatClient.GetResponseAsync(
            prompt,
            new() { ResponseFormat = ChatResponseFormat.Json }
        );

        var data = JsonDocument.Parse(response.Text);
        var root = data.RootElement;

        return new Reflection(
            Task: task, Attempt: attempt, Trajectory: trajectory, Outcome: outcome,
            WhatWentWrong: root.GetProperty("what_went_wrong").GetString() ?? "",
            WhyItFailed: root.GetProperty("why_it_failed").GetString() ?? "",
            Improvements: root.GetProperty("improvements")
                .EnumerateArray().Select(e => e.GetString()!).ToList()
        );
    }

    private string FormatReflections(IEnumerable<Reflection> reflections) =>
        string.Join("\n", reflections.Select(r => $@"
Attempt {r.Attempt + 1}:
- What went wrong: {r.WhatWentWrong}
- Why: {r.WhyItFailed}
- Improvements: {string.Join(", ", r.Improvements)}"));

    private string FormatLearnings(IEnumerable<Reflection> reflections) =>
        string.Join("\n", reflections.Select(r =>
            $"- {r.Task}: {string.Join("; ", r.Improvements)}"));

    private string SelectBestAttempt() =>
        _shortTermMemory.Count > 0 ? _shortTermMemory[^1].Trajectory : string.Empty;
}

Reflection Structure

A good reflection includes:

  • What went wrong? - Specific error or failure mode
  • Why did it fail? - Root cause analysis
  • What should I try next? - Concrete alternative approach
  • What did I learn? - Generalizable insight
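
As a purely illustrative example (values invented), a reflection that covers all four points might look like this, using the Reflection dataclass from the implementation above:

example_reflection = Reflection(
    task="Write a function to find the nth Fibonacci number",
    attempt=0,
    trajectory="def fib(n): return fib(n-1) + fib(n-2)",
    outcome="FAILURE",
    what_went_wrong="RecursionError: the function never terminates for any input",
    why_it_failed="The recursive definition was written without base cases",
    improvements=[
        "Return n directly when n < 2",
        "Test the smallest inputs (fib(0), fib(1)) before declaring success",
    ],
)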

Reflection Quality

Poor reflections (vague, non-actionable) can actually hurt performance. The reflector model must generate specific, actionable insights.

3. Self-Evolving Agents

Experimental & Safety-Critical

Self-evolving agents modify their own behavior. This is an active research area with significant safety considerations. Use with extreme caution in production.

The most advanced (and risky) form of agent learning: agents that modify their own prompts, generate new tools, or even write new code.

Research Directions

Approach                What Evolves                          Safety Level
Self-Critique (SCA)     Output quality through revision       Safe (no persistent changes)
Prompt Evolution        System prompts based on performance   Moderate (prompts can drift)
Tool Generation         New tools/functions                   Risky (code execution)
Architecture Evolution  Agent structure itself                Highly experimental
Self-Evolving Patterns
# Self-Evolving Agents: Agents that improve their own code/prompts

# WARNING: This is an active research area with safety concerns.
# These patterns are experimental and should be used with caution.

# Pattern 1: Self-Critique Agent (SCA)
# Agent generates, critiques, and revises its own outputs
class SelfCritiqueAgent:
    function generate(task):
        # Initial generation
        output = llm.generate(task)

        # Self-critique loop
        for i in range(maxRevisions):
            critique = llm.generate(
                "Critique this output for errors, improvements: " + output
            )

            if critique.noIssuesFound:
                break

            # Revise based on critique
            output = llm.generate(
                "Original: " + output +
                "Critique: " + critique +
                "Improved version:"
            )

        return output

# Pattern 2: Prompt Evolution
# Agent evolves its own system prompt based on performance
class PromptEvolver:
    currentPrompt: string
    performanceHistory: []

    function evolve(tasks, evaluator):
        for task in tasks:
            # Execute with current prompt
            result = agent.run(currentPrompt, task)
            score = evaluator(result)
            performanceHistory.append({ prompt: currentPrompt, score: score })

        # Generate improved prompt based on performance
        lowScoreTasks = filter(performanceHistory, score < threshold)

        newPrompt = llm.generate(
            "Current prompt: " + currentPrompt +
            "Failed on these tasks: " + lowScoreTasks +
            "Generate an improved prompt that would handle these cases better."
        )

        # A/B test new prompt
        if evaluate(newPrompt) > evaluate(currentPrompt):
            currentPrompt = newPrompt

# Pattern 3: Tool/Skill Generation
# Agent creates new tools when existing ones are insufficient
class ToolGenerator:
    function handleTask(task):
        # Try existing tools
        result = agent.run(task, availableTools)

        if result.success:
            return result

        if result.error == "NO_SUITABLE_TOOL":
            # Generate a new tool
            newTool = llm.generate(
                "Task that failed: " + task +
                "Available tools: " + availableTools +
                "Generate a new tool (name, description, implementation) " +
                "that would help solve this task."
            )

            # Validate and sandbox the new tool
            if validateTool(newTool) and sandboxTest(newTool):
                availableTools.append(newTool)
                return agent.run(task, availableTools)

        return result

Python implementation

from dataclasses import dataclass
from typing import Callable
import ast
import json

@dataclass
class EvolutionResult:
    original: str
    evolved: str
    improvement_score: float
    changes_made: list[str]

class SelfCritiqueAgent:
    """Agent that critiques and revises its own outputs."""

    def __init__(self, client, model: str = "gpt-4"):
        self.client = client
        self.model = model

    def generate_with_critique(
        self,
        task: str,
        max_revisions: int = 3
    ) -> tuple[str, list[str]]:
        """Generate output with self-critique loop."""

        critiques = []
        output = self._generate(task)

        for _ in range(max_revisions):
            # Self-critique
            critique = self._critique(task, output)
            critiques.append(critique)

            if "no issues found" in critique.lower():
                break

            # Revise based on critique
            output = self._revise(task, output, critique)

        return output, critiques

    def _generate(self, task: str) -> str:
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": task}]
        )
        return response.choices[0].message.content

    def _critique(self, task: str, output: str) -> str:
        prompt = f"""Review this output for the given task.

Task: {task}

Output:
{output}

Identify any:
1. Errors or inaccuracies
2. Missing information
3. Areas for improvement
4. Logical inconsistencies

If the output is satisfactory, respond with "No issues found."
Otherwise, list specific issues."""

        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content

    def _revise(self, task: str, output: str, critique: str) -> str:
        prompt = f"""Revise this output based on the critique.

Task: {task}

Original output:
{output}

Critique:
{critique}

Provide an improved version that addresses all the issues."""

        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content


class PromptEvolver:
    """Evolves system prompts based on performance."""

    def __init__(self, client, initial_prompt: str):
        self.client = client
        self.current_prompt = initial_prompt
        self.history: list[dict] = []

    def evolve(
        self,
        test_cases: list[tuple[str, str]],  # (input, expected_output)
        evaluator: Callable[[str, str], float],
        generations: int = 5
    ) -> EvolutionResult:
        """Evolve prompt over multiple generations."""

        original_prompt = self.current_prompt
        original_score = self._evaluate_prompt(test_cases, evaluator)

        for gen in range(generations):
            # Find failure cases
            failures = self._identify_failures(test_cases, evaluator)

            if not failures:
                break  # Perfect score

            # Generate improved prompt
            new_prompt = self._generate_improved_prompt(failures)

            # Evaluate the candidate against the current prompt
            current_score = self._evaluate_prompt(test_cases, evaluator)
            new_score = self._evaluate_prompt(test_cases, evaluator, new_prompt)

            # Keep the candidate only if it scores higher
            if new_score > current_score:
                self.current_prompt = new_prompt
                self.history.append({
                    "generation": gen,
                    "score": new_score,
                    "prompt": new_prompt
                })

        final_score = self._evaluate_prompt(test_cases, evaluator)

        return EvolutionResult(
            original=original_prompt,
            evolved=self.current_prompt,
            improvement_score=final_score - original_score,
            changes_made=[h["prompt"][:100] for h in self.history]
        )

    def _generate_improved_prompt(self, failures: list[dict]) -> str:
        failure_summary = "\n".join([
            f"Input: {f['input']}\nExpected: {f['expected']}\nGot: {f['actual']}"
            for f in failures[:5]
        ])

        prompt = f"""Current system prompt:
{self.current_prompt}

This prompt failed on these cases:
{failure_summary}

Generate an improved system prompt that would handle these cases correctly.
Keep the same general purpose but add specific instructions to address the failures."""

        response = self.client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content

    def _run_with_prompt(self, system_prompt: str, task: str) -> str:
        response = self.client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": task}
            ]
        )
        return response.choices[0].message.content

    def _evaluate_prompt(self, test_cases, evaluator, prompt: str | None = None) -> float:
        """Average evaluator score for a prompt (defaults to the current one)."""
        prompt = prompt or self.current_prompt
        scores = [
            evaluator(self._run_with_prompt(prompt, input_text), expected)
            for input_text, expected in test_cases
        ]
        return sum(scores) / len(scores) if scores else 0.0

    def _identify_failures(self, test_cases, evaluator, threshold: float = 0.5) -> list[dict]:
        """Collect cases where the current prompt scores below the threshold."""
        failures = []
        for input_text, expected in test_cases:
            actual = self._run_with_prompt(self.current_prompt, input_text)
            if evaluator(actual, expected) < threshold:
                failures.append({"input": input_text, "expected": expected, "actual": actual})
        return failures


class SafeToolGenerator:
    """Generate new tools with safety constraints."""

    FORBIDDEN_MODULES = {'os', 'subprocess', 'sys', 'importlib', 'eval', 'exec'}

    def __init__(self, client):
        self.client = client
        self.generated_tools: dict[str, Callable] = {}

    def generate_tool(
        self,
        task_description: str,
        existing_tools: list[str]
    ) -> dict | None:
        """Generate a new tool for a task."""

        prompt = f"""Generate a Python function to help with this task:
{task_description}

Existing tools: {', '.join(existing_tools)}

Requirements:
1. Pure function (no side effects)
2. No file system access
3. No network calls
4. No imports except: math, json, re, datetime
5. Include docstring and type hints

Return as JSON:
{{
    "name": "function_name",
    "description": "what it does",
    "code": "def function_name(...):\n    ..."
}}"""

        response = self.client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            response_format={"type": "json_object"}
        )

        tool_spec = json.loads(response.choices[0].message.content)

        # Validate safety
        if not self._is_safe(tool_spec["code"]):
            return None

        # Test in sandbox
        if not self._sandbox_test(tool_spec):
            return None

        return tool_spec

    def _is_safe(self, code: str) -> bool:
        """Check if code is safe to execute."""
        try:
            tree = ast.parse(code)

            for node in ast.walk(tree):
                # Check for forbidden imports
                if isinstance(node, ast.Import):
                    for alias in node.names:
                        if alias.name.split('.')[0] in self.FORBIDDEN_MODULES:
                            return False

                if isinstance(node, ast.ImportFrom):
                    if node.module and node.module.split('.')[0] in self.FORBIDDEN_MODULES:
                        return False

                # Check for eval/exec calls
                if isinstance(node, ast.Call):
                    if isinstance(node.func, ast.Name):
                        if node.func.id in ('eval', 'exec', 'compile'):
                            return False

            return True

        except SyntaxError:
            return False

    def _sandbox_test(self, tool_spec: dict) -> bool:
        """Minimal smoke check: the generated code must at least compile.
        A real deployment needs process-level isolation (no filesystem or network)."""
        try:
            compile(tool_spec["code"], "<generated_tool>", "exec")
            return True
        except SyntaxError:
            return False

Safety Considerations

Sandbox All Generated Code

Never execute LLM-generated code without sandboxing. Use restricted execution environments with no filesystem or network access.

Version Control Prompts

If evolving prompts, keep a history. Prompt drift can lead to unexpected behavior over time.

Human-in-the-Loop

For any persistent changes (new tools, modified prompts), require human approval before deployment.
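
A minimal sketch combining the last two points: keep every proposed prompt with its provenance, and require an explicit approval step before a candidate becomes active. PromptRegistry and the approval callback are assumptions; in practice the callback might open a review ticket or prompt a reviewer in a CLI.

from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable

@dataclass
class PromptVersion:
    text: str
    created_at: datetime
    reason: str            # e.g. the failing cases that motivated the change
    approved: bool = False

@dataclass
class PromptRegistry:
    """Version-controlled prompts behind a human approval gate (illustrative)."""
    versions: list[PromptVersion] = field(default_factory=list)

    @property
    def active(self) -> str:
        approved = [v for v in self.versions if v.approved]
        return approved[-1].text if approved else ""

    def propose(self, text: str, reason: str,
                approve: Callable[[PromptVersion], bool]) -> bool:
        candidate = PromptVersion(text=text, created_at=datetime.now(), reason=reason)
        self.versions.append(candidate)           # keep history even for rejected candidates
        candidate.approved = approve(candidate)   # human-in-the-loop decision
        return candidate.approved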

Evaluation Approach

Metric              What it Measures                            Applies To
Learning Curve      Performance improvement over tasks          All approaches
Sample Efficiency   Tasks needed to reach performance level     Token space, Reflexion
Reflection Quality  Actionability of generated reflections      Reflexion
Retry Reduction     Fewer attempts needed over time             Reflexion
Transfer Learning   Performance on related but new tasks        All approaches
Stability           Variance in performance over time           Self-evolving

Metrics for evaluating agent learning
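
Most of these metrics fall out of a simple per-task log. A rough sketch (the TaskRecord format is an assumption): the learning curve is a rolling success rate, and retry reduction is the drop in average attempts between early and recent tasks.

from dataclasses import dataclass

@dataclass
class TaskRecord:
    task_id: str
    success: bool
    attempts: int  # 1 = first-try success; >1 means reflexion retried

def learning_curve(records: list[TaskRecord], window: int = 20) -> list[float]:
    """Rolling success rate over consecutive tasks."""
    return [
        sum(r.success for r in records[i - window:i]) / window
        for i in range(window, len(records) + 1)
    ]

def retry_reduction(records: list[TaskRecord], window: int = 20) -> float:
    """Drop in mean attempts per task between the first and latest window (positive = improving)."""
    if len(records) < 2 * window:
        return 0.0
    early = records[:window]
    late = records[-window:]
    return (sum(r.attempts for r in early) / window) - (sum(r.attempts for r in late) / window)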

Choosing an Approach

Factor            Token Space     Reflexion           Self-Evolving
Complexity        Low             Medium              High
Safety            High            High                Low
Latency Impact    Minimal         2-3x per task       Variable
Best For          Routine tasks   Complex reasoning   Research
Production Ready  Yes             Yes (with limits)   No

Start Simple

Begin with token space learning. It's safe, effective, and easy to implement. Add reflexion for tasks that frequently fail. Reserve self-evolution for research.
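
In practice the first two approaches compose: build the few-shot prompt from past successes for every task, and pay the reflexion cost only when that cheap path fails. A rough sketch wiring together the TokenSpaceLearner and ReflexionAgent from earlier; run_agent and parse_trajectory are assumed helpers for executing the agent loop and converting its output into a Trajectory.

def solve_with_layered_learning(task: str, learner: TokenSpaceLearner,
                                reflexion: ReflexionAgent, evaluator) -> str:
    # Cheap path: dynamic few-shot prompt from similar past successes
    messages = learner.build_few_shot_prompt(task, system_message="You are a helpful assistant.")
    trajectory = run_agent(messages)        # assumed: runs the agent loop on these messages
    success, _ = evaluator(trajectory)

    if not success:
        # Expensive path: multi-attempt reflexion, only for tasks that fail
        trajectory, success = reflexion.solve(task, evaluator=evaluator)

    if success:
        # Close the loop: successes become few-shot examples for the next similar task
        learner.store_trajectory(parse_trajectory(task, trajectory))  # assumed parser -> Trajectory
    return trajectory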
