# Understanding Results
Learn how to interpret optimization results and apply improvements.
## Results Overview
When an optimization completes, you'll see:
- **Winner**: The best-performing variant (or the original, if no variant improved on it)
- **Lift**: Percentage improvement over the original
- **Confidence**: How reliable the result is
- **Metrics**: A detailed performance breakdown
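As a rough TypeScript sketch, a result summary might be modeled like this. The field names are assumptions for illustration, not the documented API schema:

```typescript
// Illustrative shape of an optimization result summary.
// Field names are assumptions for this sketch, not Converra's API schema.
interface OptimizationSummary {
  winner: 'variant' | 'original' | 'inconclusive';
  liftPct: number;       // improvement over the original, in percent
  confidencePct: number; // how reliable the result is
}

const example: OptimizationSummary = {
  winner: 'variant',
  liftPct: 23,
  confidencePct: 97,
};
```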
## Reading the Results
### Winner Status
| Status | Meaning |
|---|---|
| Variant Won | A variant outperformed the original |
| Original Won | Your agent is already performing well |
| Inconclusive | Not enough difference to declare a winner |
### Lift Percentage
The improvement compared to your original prompt:
- +20% task completion = Users complete their goals 20% more often
- +15% sentiment = Users feel 15% more positive about interactions
- -10% response length = Responses are 10% more concise
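When reading these numbers, the distinction between percentage-point lift and relative lift matters. A minimal sketch (the helper names are our own, not part of the SDK; the percentage-point convention matches the sample results table later in this page):

```typescript
// Percentage-point lift: the raw difference between variant and original
// scores. E.g. 72% -> 89% task completion is +17 points.
function liftPoints(originalPct: number, variantPct: number): number {
  return Math.round(variantPct - originalPct);
}

// Relative lift: the improvement as a fraction of the original score.
// E.g. 72% -> 89% is a ~24% relative improvement.
function liftRelative(originalPct: number, variantPct: number): number {
  return Math.round(((variantPct - originalPct) / originalPct) * 100);
}

liftPoints(72, 89);   // 17 (percentage points)
liftRelative(72, 89); // 24 (relative percent, rounded)
```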
### Confidence Level
How sure we are about the result:
| Confidence | Meaning |
|---|---|
| High (>95%) | Very reliable, safe to deploy |
| Medium (80-95%) | Likely accurate, consider more testing |
| Low (<80%) | Inconclusive, run more simulations |
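These bands are easy to capture in a small helper. A sketch, assuming the thresholds in the table above:

```typescript
type Confidence = 'high' | 'medium' | 'low';

// Map a confidence percentage onto the bands in the table above.
function confidenceBand(pct: number): Confidence {
  if (pct > 95) return 'high';   // very reliable, safe to deploy
  if (pct >= 80) return 'medium'; // likely accurate, consider more testing
  return 'low';                   // inconclusive, run more simulations
}

confidenceBand(97); // 'high'
confidenceBand(85); // 'medium'
confidenceBand(60); // 'low'
```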
## Viewing Variant Details
### Dashboard
Click on any variant to see:
- Full prompt content
- Side-by-side comparison with original
- Sample conversations
- Metric breakdown
### Via MCP

```
Show me the variants from my last optimization
```

```
Compare variant B to my original prompt
```

### Via SDK
```typescript
const variants = await converra.optimizations.getVariants('opt_123');

variants.forEach(v => {
  console.log(`${v.name}:`);
  console.log(`  Task completion: ${v.metrics.taskCompletion}%`);
  console.log(`  Lift: ${v.metrics.lift}%`);
});
```

## Metrics Explained
### Task Completion
Did the AI help users achieve their goal?
- High: Users got what they needed
- Low: Users left without resolution
### Response Quality
Was the response accurate, helpful, and appropriate?
- High: Clear, correct, actionable responses
- Low: Vague, incorrect, or unhelpful
### User Sentiment
How would users feel about the interaction?
- Positive: Satisfied, happy
- Neutral: Neither satisfied nor dissatisfied
- Negative: Frustrated, disappointed
### Conciseness
Are responses appropriately sized?
- Good: Right length for the context
- Too long: Verbose, could be shortened
- Too short: Missing important information
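Taken together, the four metrics above form a variant's scorecard. One possible shape, sketched in TypeScript (field names are assumptions for illustration, not the documented schema):

```typescript
// Illustrative metric breakdown for a single variant.
// Field names are assumptions, not Converra's documented API schema.
interface VariantMetrics {
  taskCompletion: number;  // % of simulations where users reached their goal
  responseQuality: number; // % of responses rated clear, correct, actionable
  userSentiment: number;   // % of interactions rated positive
  conciseness: number;     // % of responses judged appropriately sized
}

// Values from the sample results later in this page.
const variantB: VariantMetrics = {
  taskCompletion: 89,
  responseQuality: 94,
  userSentiment: 85,
  conciseness: 78,
};
```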
## Testing Before Applying
Want to validate the winner before deploying? Use Test Winner to run additional simulations.
### Dashboard
From the optimization dropdown menu, select Test Winner. This runs the winning variant through another round of simulations to confirm performance.
### Via MCP
```
Test the winner from my last optimization
```

### When to Use
- High-stakes prompts where you want extra confidence
- When the lift is marginal and you want to verify
- Before deploying to production after a long optimization
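These criteria can be folded into a simple pre-deploy check. A heuristic sketch only; the function and its thresholds are illustrative assumptions, not Converra behavior:

```typescript
// Heuristic: should we re-test the winner before deploying?
// Thresholds are illustrative assumptions, not Converra defaults.
function shouldTestWinner(
  liftPct: number,
  confidencePct: number,
  highStakes: boolean
): boolean {
  if (highStakes) return true;        // high-stakes prompts: always verify
  if (liftPct < 5) return true;       // marginal lift: confirm it's real
  if (confidencePct < 95) return true; // not yet high confidence
  return false;
}

shouldTestWinner(23, 97, false); // false: strong lift, high confidence
shouldTestWinner(3, 97, false);  // true: marginal lift, worth verifying
```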
## Deploying the Winner
Deployment works like CI/CD for agent improvements. Every fix passes simulation and regression testing before it ships; you choose how it lands:
### Auto-deploy
Enable auto-deploy in Settings and winners ship automatically when they pass regression testing. If any metric regresses after deployment, it rolls back before the next customer conversation.
### GitHub PR
Converra creates a PR with the improved prompt, metrics, evidence, and check runs. Merge to deploy. Works like Dependabot for AI agents.
See GitHub Integration for setup.
### Dashboard / MCP / SDK
Apply manually when you prefer:

```
Apply the winning variant from my last optimization
```

```typescript
await converra.optimizations.applyVariant('opt_123');
```

### What Happens
- Your agent's prompt content is updated to the winning variant
- A new version is created in version history
- Your original is preserved (can revert anytime)
- If any metric regresses, it rolls back automatically
- Cache is invalidated (SDK users get new content immediately)
## Regression Test Results
When a variant shows improvement, Converra automatically runs regression tests. Results appear alongside winner metrics:
```
Regression Check: PASSED (5/5 scenarios)
```

or

```
Regression Check: 1 REGRESSION
4/5 scenarios passed
✗ Technical support inquiry: -16%

Apply anyway? [Yes] [No - Keep Baseline]
```

Regressions don't automatically block deployment; you see the tradeoff and decide.
See Regression Testing for details.
## When No Clear Winner
If results are inconclusive:
- **Run validation mode** - more simulations mean clearer results
- **Adjust intent** - focus the optimization on specific improvements
- **Review manually** - sometimes human judgment is needed
- **Keep the original** - if it's working, don't change it
## Learning from Results
### What Worked
Look at winning variant changes:
- Added examples? → Examples help.
- Restructured? → Format matters.
- Changed tone? → Audience preference revealed.
### What Didn't Work
Failed variants show what to avoid:
- Too formal? Too casual?
- Too verbose? Too terse?
- Missing context? Over-explained?
## Sample Results
```
Optimization Complete: opt_abc123

Winner: Variant B (+23% lift)
Confidence: High (97%)

Metrics vs Original:
┌──────────────────┬──────────┬───────────┬────────┐
│ Metric           │ Original │ Variant B │ Change │
├──────────────────┼──────────┼───────────┼────────┤
│ Task Completion  │ 72%      │ 89%       │ +17%   │
│ Response Quality │ 81%      │ 94%       │ +13%   │
│ User Sentiment   │ 68%      │ 85%       │ +17%   │
│ Conciseness      │ 65%      │ 78%       │ +13%   │
└──────────────────┴──────────┴───────────┴────────┘

Key Changes in Variant B:
- Added step-by-step format for instructions
- Included acknowledgment before solutions
- Added follow-up confirmation question
```

## Next Steps
- Logging Conversations - Track real performance
- Analyzing Insights - Understand patterns
