How Optimization Works
Converra uses AI-powered simulation to find better versions of your prompts and, for multi-step workflows, to evaluate and improve agent behavior in context. It connects directly to where your prompts live in production.
The Full Lifecycle
CONVERRA
┌───────────────────────────────────────────────────────────────────────┐
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Analyze │->│ Generate │->│ Simulate │->│Regression│->│ Select │ │
│ │ Prompt │ │ Variants │ │ │ │ Test │ │ Winner │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
│ │
└───────────────────────────────────────────────────────────────────────┘
↑ │
│ ↓
┌──────┴──────┐ ┌──────┴──────┐
│ IMPORT │ │ DEPLOY │
│ prompts │ │ winner │
└──────┬──────┘ └──────┬──────┘
│ │
↑ ↓
╔══════════╧═════════════════════════════════════════════╧══════════╗
║ YOUR PRODUCTION STACK ║
║ ║
║ ┌───────────┐ ┌───────────┐ ┌───────────┐ ║
║ │Observabil-│ │ Manual │ │ Custom │ ║
║ │ity tools │ │ paste │ │ API │ ║
║ │(LangSmith,│ └───────────┘ └───────────┘ ║
║ │ Langfuse) │ ║
║ └───────────┘ ║
║ ║
╚═══════════════════════════════════════════════════════════════════╝

The key insight: Your prompts don't live in Converra—they live in your production systems. Converra connects to where they already are, optimizes them, and puts the improved versions back.
Agent Systems (V3)
If your production workflow uses multiple prompts (for example, a router handing off to specialists), Converra can discover an agent system from imported traces and evaluate prompts in system context.
What changes compared to single-prompt optimization:
- Simulations include realistic handoff context from earlier steps.
- Results can be grouped by path (the prompt sequence taken) for fair comparisons.
- System metrics are diagnostic; winner selection stays apples-to-apples within comparable paths.
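To make the per-path comparison concrete, here is a minimal sketch (not Converra's internal code; the record shape is an assumption) of grouping simulation scores by the prompt sequence taken, so a variant is only compared against the baseline on the same path:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical result records: (variant_id, path taken, task-completion score).
results = [
    ("baseline",  ("router", "billing_agent"),      0.72),
    ("variant_a", ("router", "billing_agent"),      0.81),
    ("baseline",  ("router", "tech_support_agent"), 0.65),
    ("variant_a", ("router", "tech_support_agent"), 0.64),
]

# Group scores by (path, variant) so comparisons stay within a comparable path.
by_path = defaultdict(lambda: defaultdict(list))
for variant, path, score in results:
    by_path[path][variant].append(score)

for path, variants in by_path.items():
    print(" -> ".join(path))
    for variant, scores in variants.items():
        print(f"  {variant}: mean completion {mean(scores):.2f}")
```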
Import: Where Prompts Come From
Converra pulls prompts from where they already live:
| Source | How It Works |
|---|---|
| LangSmith | Import prompts + conversation traces from your observability data |
| API | Push prompts programmatically from your deployment pipeline |
| Manual | Paste prompts directly for quick testing |
See Integrations for setup details.
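If you push prompts from a deployment pipeline, the request looks roughly like the sketch below. The endpoint URL and payload fields here are placeholders, not the documented schema; see Integrations and the API reference for the real details.

```python
import json
import os
import urllib.request

# Hypothetical endpoint and payload shape -- check the API reference
# for the real schema before wiring this into a CI/CD pipeline.
payload = {
    "name": "support-agent-system-prompt",
    "content": "You are a customer support agent for TechCorp...",
    "metadata": {"source": "ci-pipeline", "commit": "abc1234"},
}

req = urllib.request.Request(
    "https://api.converra.example/v1/prompts",  # illustrative URL, not documented
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.environ.get('CONVERRA_API_KEY', '')}",
        "Content-Type": "application/json",
    },
    method="POST",
)

with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())
```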
The Optimization Loop
1. Analyze Prompt
Converra analyzes your prompt to understand:
- Structure and formatting
- Goals and constraints
- Potential improvement areas
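As a rough illustration of what this step produces (the field names are assumptions, not Converra's schema), the analysis can be thought of as a structured summary:

```python
from dataclasses import dataclass, field

@dataclass
class PromptAnalysis:
    # Structured summary a prompt-analysis step might produce (illustrative fields).
    structure_notes: list[str] = field(default_factory=list)    # formatting, sections
    goals: list[str] = field(default_factory=list)              # what the prompt tries to achieve
    constraints: list[str] = field(default_factory=list)        # hard requirements to preserve
    improvement_areas: list[str] = field(default_factory=list)  # candidate weaknesses

analysis = PromptAnalysis(
    structure_notes=["single paragraph, no numbered steps"],
    goals=["resolve customer support questions"],
    constraints=["stay in support-agent role"],
    improvement_areas=["no escalation path", "tone unspecified"],
)
print(analysis.improvement_areas)
```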
2. Generate Variants
AI creates alternative versions of your prompt:
- Each variant targets specific improvements
- Variants maintain your core requirements
- Typically 3-5 variants are tested
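A minimal sketch of how those variants might be represented, with each one tied to the improvement area it targets (illustrative names only):

```python
from dataclasses import dataclass

@dataclass
class Variant:
    variant_id: str
    targets: str      # which improvement area this variant addresses
    prompt_text: str  # full rewritten prompt, core requirements preserved

variants = [
    Variant("v1", "add escalation path", "You are a support agent... escalate if unresolved."),
    Variant("v2", "specify tone", "You are a friendly, concise support agent..."),
    Variant("v3", "add step-by-step structure", "You are a support agent. 1) Acknowledge..."),
]
assert 3 <= len(variants) <= 5  # typically 3-5 variants are tested
```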
3. Simulate
Each variant is tested against simulated personas:
- Diverse user types (frustrated, technical, new, etc.)
- Multiple conversation scenarios
- Realistic interaction patterns
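Conceptually, every variant is exercised against every persona and scenario combination. The sketch below stubs out the conversation itself with a deterministic random score; it only shows the shape of the loop, not the actual simulation engine:

```python
import itertools
import random

PERSONAS = ["frustrated", "technical", "new_user", "impatient", "confused"]
SCENARIOS = ["billing dispute", "password reset", "feature question"]
VARIANTS = ["baseline", "v1", "v2"]

def simulate(variant: str, persona: str, scenario: str) -> float:
    """Stand-in for a simulated conversation; returns a 0-1 quality score."""
    random.seed(hash((variant, persona, scenario)) % 2**32)  # deterministic stub
    return round(random.uniform(0.5, 1.0), 2)

# Every variant runs against every persona/scenario combination.
scores = {
    (v, p, s): simulate(v, p, s)
    for v, p, s in itertools.product(VARIANTS, PERSONAS, SCENARIOS)
}
print(len(scores), "simulated conversations")
```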
4. Regression Test
When a leading variant emerges, the system automatically tests it against a "golden set" of scenarios:
- Golden set: Scenarios your baseline prompt handles reliably (auto-generated)
- Short exchanges: 2-3 turns per scenario for fast validation
- Pass/fail: Each scenario must maintain baseline performance
If regressions are found, you see the tradeoff: "Improved X but regressed on Y. Apply anyway?"
See Regression Testing for details.
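The pass/fail rule can be pictured as a simple per-scenario comparison against the baseline. The tolerance threshold below is an assumption for illustration, not the platform's actual rule:

```python
def regression_check(golden_scores: dict[str, float],
                     candidate_scores: dict[str, float],
                     tolerance: float = 0.05) -> list[str]:
    """Return golden-set scenarios where the candidate falls below baseline.

    Illustrative rule: a scenario regresses if the candidate's score drops
    more than `tolerance` below the baseline's score on that scenario.
    """
    return [
        scenario
        for scenario, baseline in golden_scores.items()
        if candidate_scores.get(scenario, 0.0) < baseline - tolerance
    ]

golden = {"refund request": 0.90, "password reset": 0.95}
candidate = {"refund request": 0.93, "password reset": 0.80}

regressions = regression_check(golden, candidate)
if regressions:
    print("Improved overall but regressed on:", ", ".join(regressions))
```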
5. Select Winner
Performance is evaluated across metrics:
- Task completion rate
- Response quality
- User sentiment
- Goal achievement
- Regression test results
The best-performing variant is identified.
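One way to picture winner selection is a weighted combination of the metrics, gated on the regression result. The weights below are illustrative, not Converra's actual scoring formula:

```python
# Illustrative metric weights -- not the platform's actual formula.
WEIGHTS = {
    "task_completion": 0.35,
    "response_quality": 0.25,
    "user_sentiment": 0.20,
    "goal_achievement": 0.20,
}

def overall_score(metrics: dict[str, float]) -> float:
    return sum(metrics[name] * weight for name, weight in WEIGHTS.items())

candidates = {
    "baseline":  {"task_completion": 0.70, "response_quality": 0.75,
                  "user_sentiment": 0.68, "goal_achievement": 0.72,
                  "passed_regression": True},
    "variant_a": {"task_completion": 0.86, "response_quality": 0.81,
                  "user_sentiment": 0.79, "goal_achievement": 0.84,
                  "passed_regression": True},
}

# Only variants that held up in regression testing are eligible to win.
eligible = {name: m for name, m in candidates.items() if m["passed_regression"]}
winner = max(eligible, key=lambda name: overall_score(eligible[name]))
print("winner:", winner)
```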
Deploy: Putting Winners Back in Production
Once you have a winning variant, deploy it back to where your prompt lives:
| Destination | How It Works |
|---|---|
| API/Webhook | Notify your systems to pull the new version |
| Manual | Copy the optimized prompt and update your code |
The goal is a closed loop: prompts flow from production → through optimization → back to production. You can deploy via API/webhooks or copy manually.
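For the API/webhook path, your side of the loop is a small receiver that reacts to a "winner selected" notification and updates wherever your application reads its prompt from. The payload fields below are assumptions; check the API documentation for the real event shape:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class DeployWebhook(BaseHTTPRequestHandler):
    """Illustrative receiver for a 'winner selected' notification.

    The payload shape is an assumption, not the documented event schema.
    """

    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        event = json.loads(body or b"{}")
        # A real handler would fetch the new prompt version and update the
        # prompt store or config your application reads from.
        print("deploying prompt:", event.get("prompt_id"),
              "winning variant:", event.get("winning_variant"))
        self.send_response(204)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), DeployWebhook).serve_forever()
```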
Roadmap
GitHub PR creation and LangSmith prompt registry sync are planned.
What Gets Optimized
| Aspect | Example Improvement |
|---|---|
| Clarity | Clearer instructions, better structure |
| Tone | More appropriate formality level |
| Efficiency | Shorter responses that still work |
| Completeness | Better coverage of edge cases |
| Consistency | More predictable behavior |
Optimization Modes
Exploratory Mode
Best for: Finding improvements quickly
- Fewer simulations per variant
- Faster results (minutes)
- Good for iteration
Validation Mode
Best for: Production decisions
- More simulations per variant
- Statistical confidence
- Takes longer but more reliable
Replay Mode
Best for: Verifying fixes on real failures
- Tests variants against imported production traces (offline)
- Confirms fixes work on the exact cases that failed
- Available when you've imported traces from LangSmith
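In practice the modes trade simulation budget against speed and evidence. A sketch of how you might choose between them (the per-variant counts are illustrative; the real budgets are chosen by the platform):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModeSettings:
    simulations_per_variant: int
    uses_imported_traces: bool

# Illustrative budgets only.
MODES = {
    "exploratory": ModeSettings(simulations_per_variant=10, uses_imported_traces=False),
    "validation":  ModeSettings(simulations_per_variant=50, uses_imported_traces=False),
    "replay":      ModeSettings(simulations_per_variant=0,  uses_imported_traces=True),
}

def pick_mode(verifying_fix_on_real_failures: bool, need_statistical_confidence: bool) -> str:
    if verifying_fix_on_real_failures:
        return "replay"  # requires imported traces (e.g. from LangSmith)
    return "validation" if need_statistical_confidence else "exploratory"

print(pick_mode(verifying_fix_on_real_failures=False, need_statistical_confidence=True))
```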
What Stays the Same
Converra preserves your:
- Core purpose and role
- Key constraints and boundaries
- Required output formats
- Brand voice fundamentals
Simulation Personas
Your prompts are tested against diverse users:
| Persona | Tests |
|---|---|
| Frustrated Customer | De-escalation, empathy |
| Technical User | Accuracy, depth |
| New User | Clarity, onboarding |
| Impatient User | Conciseness |
| Confused User | Patience, explanation |
You can also create custom personas matching your actual users.
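A custom persona is essentially a named bundle of user traits plus the qualities it stresses in your prompt. A minimal sketch, with field names that are illustrative rather than Converra's schema:

```python
from dataclasses import dataclass

@dataclass
class Persona:
    name: str
    traits: list[str]     # behaviours the simulated user exhibits
    tests_for: list[str]  # what the persona stresses in the prompt

# A custom persona matching your actual users (illustrative fields).
custom = Persona(
    name="Enterprise admin on a deadline",
    traits=["time-pressed", "uses internal jargon", "asks for audit logs"],
    tests_for=["conciseness", "accuracy on account and permission questions"],
)
print(custom.name)
```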
Metrics Evaluated
Primary Metrics
- Task Completion - Did the AI help the user achieve their goal?
- Response Quality - Was the response accurate and helpful?
- User Sentiment - How would the user feel about the interaction?
Secondary Metrics
- Conciseness - Appropriate length for the context
- Consistency - Similar situations handled similarly
- Safety - Stayed within appropriate boundaries
Example Optimization
Original Prompt:

You are a customer support agent. Help users with their questions.

Optimized Variant (Winner):

You are a customer support agent for TechCorp. Your goal is to
resolve issues quickly while maintaining a friendly tone.

When helping users:
1. Acknowledge their issue
2. Ask clarifying questions if needed
3. Provide a clear solution
4. Confirm the issue is resolved

If you can't resolve an issue, offer to escalate to a specialist.

Improvement: +34% task completion, +28% user satisfaction
Next Steps
- Running Optimizations - Start an optimization
- Understanding Results - Interpret results
- Regression Testing - How regressions are detected
- Best Practices - Write optimization-ready prompts
