A timeline of features, improvements, and integrations across the Converra platform.
When an optimization doesn't find a winner, Converra now classifies why it failed, learns from the outcome, and automatically restarts with an adapted strategy. The same improvement loop we run on your agents now runs on ours.
Agent instructions now have full version history with lineage tracking. When one agent in a sibling group gets optimized, the others show staleness indicators so you know which agents are falling behind.
Preview the full pull request before creating it — the diff, metrics comparison, diagnosed issues, regression results, and conversation replay. Review everything in one place before committing to the PR.
Optimization PRs are now structured for AI code reviewers. Full context, metrics, evidence, and before/after comparisons are embedded inline so Claude Code, Copilot, and other AI reviewers can evaluate the change without needing access to Converra.
Every optimization now shows side-by-side comparisons of how the agent responded before and after the change. See the actual improvement in context, not just a score delta.
Instead of a separate email for every diagnosed conversation, you now get one agent-level alert that summarizes all issues across conversations. Less noise, same signal.
The copy fix button now includes testing methodology, regression results, real production conversation quotes, and review guidance — everything a reviewer needs without opening Converra.
Define evaluation rules at the organization level that apply across all your agents. Custom rules are plumbed directly into evaluation prompts so every optimization and insight respects your business-specific quality standards.
Each agent in a multi-agent conversation now gets its own scoped insights. Secondary agents surface their own failure patterns and performance metrics instead of being rolled into the primary agent's analysis.
Agent and fleet insights now show behavior-specific failure patterns with real conversation counts — not generic buckets. Each pattern links directly to the affected conversations. Cost and upside cards estimate the business impact of fixing the top issues.
Agent issues are now deep business insights, not metric labels. Each issue includes a headline, evidence from diagnosed conversations, a recommended fix, the team that owns it, and whether Converra can auto-fix it via prompt optimization.
Converra now creates pull requests in your GitHub repos when optimizations find improvements. Connect your GitHub, and optimization winners are automatically sent as PRs with metrics, evidence, and a one-click merge path. Supports auto-PR on completion, manual PR creation, merge-back sync, and Python agent file detection.
When a benchmark finds a better model, Converra automatically opens a GitHub PR to switch the model config in your repo — complete with a comparison table showing quality score, cost, and latency across difficulty levels.
Model benchmarks now live on their own page with a dedicated nav entry. Browse all benchmark runs, view inline conversation scores, and launch new comparisons without leaving context.
See every agent's health at a glance. A single dashboard with optimization progress over time, a scoreboard ranking agents by performance, failure distribution, improvement potential, and pending deploys — everything you need to decide what to work on next.
The guided tour now starts with the Fleet page and walks through connecting your first agent. A faster path from signup to value.
Import converra/auto and every LLM call in your app is captured automatically — no wrapper functions, no code changes. Works with OpenAI, Anthropic, and Vercel AI SDK.
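Zero-code capture of this kind typically works by patching the client's call method at import time. A minimal Python sketch of that pattern — `StubLLMClient`, `auto_instrument`, and the `captured` buffer are illustrative stand-ins, not the actual Converra SDK:

```python
# Illustrative sketch of import-time auto-instrumentation (not the real
# converra/auto package): patch the client method once so every call is
# recorded without any changes to application code.
captured = []

class StubLLMClient:
    """Stand-in for an OpenAI/Anthropic-style client."""
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"

def auto_instrument(client_cls):
    """Wrap the client's call method to record every request/response."""
    original = client_cls.complete

    def wrapped(self, prompt: str) -> str:
        response = original(self, prompt)
        captured.append({"prompt": prompt, "response": response})
        return response

    client_cls.complete = wrapped
    return client_cls

# Importing an auto-instrumentation module would run this patch once.
auto_instrument(StubLLMClient)

client = StubLLMClient()
client.complete("hello")
```

Because the patch happens at import, application code keeps calling the client exactly as before.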
Langfuse and OpenTelemetry integrations now match LangSmith feature-for-feature — async sync triggers, pre-flight validation, usage limit checks, and import metrics.
Wrap your OpenAI, Anthropic, or Vercel AI SDK client with one line. Every conversation is captured automatically. Multi-agent tracing links orchestrator and sub-agent calls into a single execution graph. A/B variant swapping tests optimized prompts against real traffic.
pip install converra — Python SDK with sync/async/streaming support for OpenAI and Anthropic. LangChain callback handler included.
New API endpoints for SDK integration — prompt matching by content hash, active variant lookup for A/B testing, bulk SDK configuration endpoint. Testing mode setting (proxy/simulation) added to dashboard.
Send traces directly to Converra via the SDK (converra.traces.create) — no LangSmith, Langfuse, or OTel pipeline required. The fastest path from your agent to Converra.
Give the optimizer direct feedback. Thumbs up/down from the UI or programmatic feedback via MCP tools — both feed into the optimization agent's planning so it learns from your judgment, not just metrics.
Benchmark comparisons now show actual per-conversation scores inline. See exactly how each model performed, not just a summary.
Redesigned optimization results with an activity card and deploy banner. Clearer post-optimization experience so you can review and deploy faster.
Get notified when optimizations complete or conversation syncs finish. Real-time alerts in the app so you never miss a result.
Step-level failure diagnosis now runs on every conversation — not just low-scoring multi-agent traces. Every agent gets actionable root cause analysis regardless of score or architecture.
Waitlist removed. Sign up with email or Google and start connecting agents immediately.
Start using Converra at no cost. The free tier includes conversation imports, insights, and a limited number of optimizations so you can evaluate before committing.
Focus optimization on what matters most. Choose from 24 built-in focus areas or define custom goals — simulations, evaluations, and variant generation all align to your intent.
Re-optimize agents that are already in monitoring state. Unresolved issues from prior runs carry forward automatically so the optimizer picks up where it left off.
Failures across your agents are now grouped by root cause category in the Systems view. Quickly spot whether issues stem from hallucinations, instruction gaps, tool errors, or context limits.
See recurring failure patterns for individual prompts. Identify which failure types affect each agent so you can prioritize the highest-impact fixes.
Step-level failure diagnosis now shows the actual conversation messages exchanged during the failing step, giving you full context without leaving the diagnosis view.
Import production conversations from any OpenTelemetry-compatible tracing pipeline. Connect Axiom or other OTel backends to automatically sync your agent's traces.
Production user feedback is now surfaced in conversation insights and factored into evaluation scores. See what real users thought alongside AI analysis.
Optimization automatically triggers when step diagnosis detects fixable failures. Winners can auto-deploy with settings-gated controls — no manual intervention needed.
Winners are automatically tested against a golden set of scenarios before deployment. Catch regressions before they reach production.
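The core of such a gate is simple: a candidate may not score worse than the baseline on any golden scenario. A hedged sketch — the scenario names and the tolerance value are illustrative, not Converra's actual thresholds:

```python
def passes_regression_gate(baseline: dict, candidate: dict,
                           tolerance: float = 0.02) -> bool:
    """Block deployment if the candidate regresses on any golden scenario.

    baseline/candidate map scenario name -> evaluation score in [0, 1].
    The tolerance absorbs small score noise; both it and the per-scenario
    rule are assumptions for illustration.
    """
    return all(
        candidate[name] >= score - tolerance
        for name, score in baseline.items()
    )
```

A per-scenario check like this catches a winner that improves the average while quietly breaking one scenario, which an aggregate threshold would miss.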
Define business-specific metrics beyond the built-in evaluation suite. Measure what matters most for your agent's domain.
Compare model performance side-by-side. Run the same scenarios across different LLMs to find the best fit for your agent.
Redesigned conversation insights with above-the-fold metrics, prompt links, and consolidated qualitative sections.
Multi-agent simulations now inject synthetic orchestrator context for higher fidelity. Simulated conversations reflect how your agents actually interact in production.
Optimizations now target the specific agent step responsible for diagnosed failures, instead of optimizing blindly.
See recurring failure types across all your agents at a glance. Spot systemic issues before they become customer-facing problems.
Converra launches. Autonomous agent optimization with simulation testing, real-time performance tracking, and continuous prompt improvement.
Pinpoints which step in a multi-agent conversation caused a failure. See the execution flow, identify the responsible agent, and get actionable fix recommendations.
The optimizer now detects and resolves contradictions, redundancy, and formatting issues in your agent's instructions — not just metric-driven changes.
Extract variables from agent instructions and deploy optimized variants across sibling agents that share the same structure.
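Variable extraction over a templated instruction can be sketched with the standard library. This assumes Python-style {braces} as the template syntax; Converra's actual instruction format may differ:

```python
from string import Formatter

def extract_variables(instructions: str) -> list[str]:
    """Return the placeholder names found in an instruction template.

    Uses string.Formatter, which parses templates into
    (literal_text, field_name, format_spec, conversion) tuples;
    literal-only segments yield field_name=None and are skipped.
    """
    return [
        field for _, field, _, _ in Formatter().parse(instructions)
        if field
    ]
```

Once the shared variables are known, a variant optimized for one sibling can be re-rendered for the others by filling the same slots with each agent's values.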
Converra is now accessible as an MCP server — manage agents, run simulations, and trigger optimizations from any MCP-compatible client.
Break down agent performance by segment. See which parts of your agent's instructions contribute most to success or failure.
Mark sections of your agent's instructions as protected so the optimizer preserves them during variant generation.
Import production conversations from Langfuse with continuous sync. Supports self-hosted instances and multi-agent trace detection.
Variant selection now uses persona-level head-to-head comparisons as the single source of truth, eliminating false positives from aggregated scores.
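A small worked example shows why aggregated scores can mislead: a variant that wins big on one persona can post the higher average while losing most head-to-head matchups. The persona names and scores below are illustrative, not real evaluation data:

```python
from statistics import mean

# Per-persona scores for baseline "A" and variant "B" (illustrative).
scores = {
    "new_user":   {"A": 0.80, "B": 0.75},
    "power_user": {"A": 0.80, "B": 0.75},
    "edge_case":  {"A": 0.80, "B": 0.95},
}

def aggregate_winner(scores):
    """Pick the variant with the higher mean score across personas."""
    avg = {v: mean(p[v] for p in scores.values()) for v in ("A", "B")}
    return max(avg, key=avg.get)

def head_to_head_winner(scores):
    """Pick the variant that wins the most per-persona matchups."""
    wins = {"A": 0, "B": 0}
    for persona in scores.values():
        wins[max(persona, key=persona.get)] += 1
    return max(wins, key=wins.get)
```

Here the aggregate favors B on the strength of one outlier persona, while head-to-head correctly shows A winning for the majority of personas.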
Import production conversations directly from LangSmith. Connect your existing tracing pipeline to Converra without code changes.
Automatically detect multi-agent architectures from ingested conversations. See which agents participate and how they hand off.
Archive and delete conversations in bulk. Filter, select, and clean up your conversation data at scale.
Connect your agent and start seeing improvements in minutes.
Start for free