
How Optimization Works

Converra uses AI-powered simulation to find better versions of your prompts. For multi-step workflows, it also evaluates and improves agent behavior in context. It connects directly to where your prompts live in production.

The Full Lifecycle

                            CONVERRA
    ┌───────────────────────────────────────────────────────────────────────┐
    │                                                                       │
    │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐ │
    │  │ Analyze  │->│ Generate │->│ Simulate │->│Regression│->│  Select  │ │
    │  │  Prompt  │  │ Variants │  │          │  │   Test   │  │  Winner  │ │
    │  └──────────┘  └──────────┘  └──────────┘  └──────────┘  └──────────┘ │
    │                                                                       │
    └───────────────────────────────────────────────────────────────────────┘
           ↑                                             │
           │                                             ↓
    ┌──────┴──────┐                               ┌──────┴──────┐
    │  IMPORT     │                               │   DEPLOY    │
    │  prompts    │                               │   winner    │
    └──────┬──────┘                               └──────┬──────┘
           │                                             │
           ↑                                             ↓
╔══════════╧═════════════════════════════════════════════╧══════════╗
║                     YOUR PRODUCTION STACK                         ║
║                                                                   ║
║   ┌───────────┐    ┌───────────┐    ┌───────────┐                 ║
║   │Observabil-│    │  Manual   │    │  Custom   │                 ║
║   │ity tools  │    │  paste    │    │   API     │                 ║
║   │(LangSmith,│    └───────────┘    └───────────┘                 ║
║   │ Langfuse) │                                                   ║
║   └───────────┘                                                   ║
║                                                                   ║
╚═══════════════════════════════════════════════════════════════════╝

The key insight: Your prompts don't live in Converra—they live in your production systems. Converra connects to where they already are, optimizes them, and puts the improved versions back.

Agent Systems (V3)

If your production workflow uses multiple prompts (for example, a router handing off to specialists), Converra can discover an agent system from imported traces and evaluate prompts in system context.

What changes compared to single-prompt optimization:

  • Simulations include realistic handoff context from earlier steps.
  • Results can be grouped by path (the prompt sequence taken) for fair comparisons; see the sketch after this list.
  • System metrics are diagnostic; winner selection stays apples-to-apples within comparable paths.
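
To make path grouping concrete, here is a minimal sketch, assuming each simulation records the prompt sequence it traversed; the data shapes are illustrative, not Converra's internals:

    from collections import defaultdict

    # Each simulated conversation records the prompt sequence it took (the
    # "path") plus a score. All field names and values here are illustrative.
    results = [
        {"variant": "baseline", "path": ("router", "billing"), "score": 0.72},
        {"variant": "v1",       "path": ("router", "billing"), "score": 0.81},
        {"variant": "baseline", "path": ("router", "tech"),    "score": 0.64},
        {"variant": "v1",       "path": ("router", "tech"),    "score": 0.66},
    ]

    # Bucket scores by path, then by variant, so variants are only compared
    # against runs that traversed the same prompt sequence.
    by_path = defaultdict(lambda: defaultdict(list))
    for r in results:
        by_path[r["path"]][r["variant"]].append(r["score"])

    for path, variants in by_path.items():
        print(" -> ".join(path))
        for variant, scores in variants.items():
            print(f"  {variant}: mean {sum(scores) / len(scores):.2f}")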

Import: Where Prompts Come From

Converra pulls prompts from where they already live:

Source      How It Works
LangSmith   Import prompts + conversation traces from your observability data
API         Push prompts programmatically from your deployment pipeline
Manual      Paste prompts directly for quick testing

See Integrations for setup details.
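
For the API route, here is a sketch of what a push from a deployment pipeline could look like; the endpoint URL, payload shape, and environment variable are placeholders, not Converra's documented API (the Integrations page has the real contract):

    import os
    import requests

    # Placeholder endpoint; the real URL and payload come from the Integrations docs.
    CONVERRA_API_URL = "https://api.converra.example/v1/prompts"

    def push_prompt(name: str, text: str) -> None:
        """Push one prompt version to Converra from a CI/CD step."""
        response = requests.post(
            CONVERRA_API_URL,
            headers={"Authorization": f"Bearer {os.environ['CONVERRA_API_KEY']}"},
            json={"name": name, "text": text},
            timeout=30,
        )
        response.raise_for_status()

    push_prompt("support-agent", "You are a customer support agent. ...")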

The Optimization Loop

1. Analyze Prompt

Converra analyzes your prompt to understand:

  • Structure and formatting
  • Goals and constraints
  • Potential improvement areas
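
As a purely illustrative picture of what that analysis might surface for the short support prompt shown later on this page (the structure is an assumption, not Converra's output format):

    # Hypothetical analysis of: "You are a customer support agent. Help users
    # with their questions."
    analysis = {
        "structure": "one sentence; no sections, steps, or output format",
        "goals": ["answer customer questions"],
        "constraints": [],  # nothing pins down tone, scope, or escalation
        "improvement_areas": [
            "no company or product context",
            "no escalation path for unresolved issues",
            "tone and response structure unspecified",
        ],
    }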

2. Generate Variants

AI creates alternative versions of your prompt:

  • Each variant targets specific improvements
  • Variants maintain your core requirements
  • Typically 3-5 variants are tested

3. Simulate

Each variant is tested against simulated personas:

  • Diverse user types (frustrated, technical, new, etc.)
  • Multiple conversation scenarios
  • Realistic interaction patterns
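
The shape of this step is roughly a grid of variants x personas x scenarios. A minimal sketch follows, with a random stand-in where a real role-played conversation and LLM judge would go:

    import itertools
    import random

    variants = {
        "baseline": "You are a customer support agent. Help users with their questions.",
        "v1": "You are a customer support agent for TechCorp. ...",
    }
    personas = ["frustrated customer", "technical user", "new user"]
    scenarios = ["refund request", "login failure"]

    def run_conversation(prompt: str, persona: str, scenario: str) -> float:
        """Stand-in for one simulated conversation: role-play the persona
        through the scenario against the prompt, then score the transcript.
        A random score keeps this sketch runnable."""
        return random.random()

    mean_scores = {}
    for name, prompt in variants.items():
        runs = [
            run_conversation(prompt, persona, scenario)
            for persona, scenario in itertools.product(personas, scenarios)
        ]
        mean_scores[name] = sum(runs) / len(runs)
    print(mean_scores)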

4. Regression Test

When a leading variant emerges, the system automatically tests it against a "golden set" of scenarios:

  • Golden set: Scenarios your baseline prompt handles reliably (auto-generated)
  • Short exchanges: 2-3 turns per scenario for fast validation
  • Pass/fail: Each scenario must maintain baseline performance

If regressions are found, you see the tradeoff: "Improved X but regressed on Y. Apply anyway?"
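
A minimal sketch of that pass/fail gate, assuming each golden scenario yields a 0-1 score for both the baseline and the candidate (the scores and tolerance below are invented for illustration):

    # Hypothetical golden-set scores: scenario -> (baseline, candidate).
    golden_results = {
        "refund within policy": (0.90, 0.93),
        "password reset":       (0.95, 0.94),
        "angry repeat contact": (0.80, 0.62),
    }

    TOLERANCE = 0.05  # small dips within noise still pass

    regressions = {
        scenario: (base, cand)
        for scenario, (base, cand) in golden_results.items()
        if cand < base - TOLERANCE
    }

    if regressions:
        # Surface the tradeoff instead of silently applying the variant.
        for scenario, (base, cand) in regressions.items():
            print(f"Regressed on '{scenario}': {base:.2f} -> {cand:.2f}")
    else:
        print("All golden scenarios held baseline performance.")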

See Regression Testing for details.

5. Select Winner

Performance is evaluated across metrics:

  • Task completion rate
  • Response quality
  • User sentiment
  • Goal achievement
  • Regression test results

The best-performing variant is identified.
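
One way to picture the final call is a weighted composite over those metrics, with regression results as a hard gate; the weights and scores below are assumptions for illustration, not Converra's formula:

    # Hypothetical per-variant metric scores (0-1) from simulation.
    metrics = {
        "baseline": {"completion": 0.61, "quality": 0.70, "sentiment": 0.58, "goal": 0.60},
        "v1":       {"completion": 0.82, "quality": 0.78, "sentiment": 0.74, "goal": 0.79},
        "v2":       {"completion": 0.85, "quality": 0.80, "sentiment": 0.70, "goal": 0.72},
    }
    weights = {"completion": 0.4, "quality": 0.3, "sentiment": 0.2, "goal": 0.1}
    passed_regression = {"baseline": True, "v1": True, "v2": False}

    def composite(scores: dict) -> float:
        return sum(weights[m] * scores[m] for m in weights)

    # Variants that failed regression testing never compete for the win.
    eligible = {v: composite(s) for v, s in metrics.items() if passed_regression[v]}
    winner = max(eligible, key=eligible.get)
    print(winner, round(eligible[winner], 3))  # -> v1 0.789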

Deploy: Putting Winners Back in Production

Once you have a winning variant, deploy it back to where your prompt lives:

Destination   How It Works
API/Webhook   Notify your systems to pull the new version
Manual        Copy the optimized prompt and update your code
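
For the API/Webhook destination, one plausible shape is a small handler on your side that writes the winning version wherever production reads prompts from; the payload fields and storage layout here are hypothetical:

    from pathlib import Path

    PROMPT_DIR = Path("prompts")  # wherever your production code loads prompts from

    def handle_converra_webhook(payload: dict) -> None:
        """Hypothetical handler: called when Converra (or your glue code)
        announces a new winning variant. Writes it where production reads it."""
        name, text = payload["prompt_name"], payload["text"]
        PROMPT_DIR.mkdir(exist_ok=True)
        (PROMPT_DIR / f"{name}.txt").write_text(text)

    handle_converra_webhook({"prompt_name": "support-agent", "text": "You are ..."})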

The goal is a closed loop: prompts flow from production → through optimization → back to production.

Roadmap

GitHub PR creation and LangSmith prompt registry sync are planned.

What Gets Optimized

Aspect         Example Improvement
Clarity        Clearer instructions, better structure
Tone           More appropriate formality level
Efficiency     Shorter responses that still work
Completeness   Better coverage of edge cases
Consistency    More predictable behavior

Optimization Modes

Exploratory Mode

Best for: Finding improvements quickly

  • Fewer simulations per variant
  • Faster results (minutes)
  • Good for iteration

Validation Mode

Best for: Production decisions

  • More simulations per variant
  • Statistical confidence
  • Takes longer but produces more reliable results

Replay Mode

Best for: Verifying fixes on real failures

  • Tests variants against imported production traces (offline)
  • Confirms fixes work on the exact cases that failed
  • Available when you've imported traces from LangSmith
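
The three modes trade simulation volume against speed and evidence. As a purely illustrative sketch of the knobs involved (the names and counts are assumptions, not Converra settings):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class RunConfig:
        """Illustrative knobs an optimization mode might control."""
        simulations_per_variant: int
        require_statistical_confidence: bool
        replay_imported_traces: bool  # test against real production failures

    EXPLORATORY = RunConfig(5, require_statistical_confidence=False, replay_imported_traces=False)
    VALIDATION = RunConfig(30, require_statistical_confidence=True, replay_imported_traces=False)
    # In replay, the imported traces themselves define the test set.
    REPLAY = RunConfig(0, require_statistical_confidence=False, replay_imported_traces=True)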

What Stays the Same

Converra preserves your:

  • Core purpose and role
  • Key constraints and boundaries
  • Required output formats
  • Brand voice fundamentals

Simulation Personas

Your prompts are tested against diverse users:

Persona               Tests
Frustrated Customer   De-escalation, empathy
Technical User        Accuracy, depth
New User              Clarity, onboarding
Impatient User        Conciseness
Confused User         Patience, explanation

You can also create custom personas matching your actual users.
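
A custom persona can be thought of as a name plus the behaviors the simulator should role-play. A hypothetical sketch (this schema is illustrative, not Converra's):

    from dataclasses import dataclass, field

    @dataclass
    class Persona:
        """Illustrative shape for a simulated user."""
        name: str
        traits: list  # behaviors the simulator role-plays
        opening_message: str
        tests: list = field(default_factory=list)  # what this persona stresses

    power_admin = Persona(
        name="Power Admin",
        traits=["terse", "expects exact commands", "low tolerance for filler"],
        opening_message="SSO login loop since the 4.2 upgrade. Logs attached. Fix?",
        tests=["accuracy", "conciseness"],
    )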

Metrics Evaluated

Primary Metrics

  • Task Completion - Did the AI help the user achieve their goal?
  • Response Quality - Was the response accurate and helpful?
  • User Sentiment - How would the user feel about the interaction?

Secondary Metrics

  • Conciseness - Appropriate length for the context
  • Consistency - Similar situations handled similarly
  • Safety - Stayed within appropriate boundaries

Example Optimization

Original Prompt:

You are a customer support agent. Help users with their questions.

Optimized Variant (Winner):

You are a customer support agent for TechCorp. Your goal is to
resolve issues quickly while maintaining a friendly tone.

When helping users:
1. Acknowledge their issue
2. Ask clarifying questions if needed
3. Provide a clear solution
4. Confirm the issue is resolved

If you can't resolve an issue, offer to escalate to a specialist.

Improvement: +34% task completion, +28% user satisfaction

Next Steps