Name: Converra
Author: Converra

Question 1

Are the personas real users?

Accepted Answer

No — the probes are LLM-driven synthetic personas, calibrated against patterns we see in real production conversations. The findings are real (your agent really does what the transcript shows). The persona is the test instrument, not the user.

Question 2

What if my vendor isn't supported?

Accepted Answer

We try first-touch agentic discovery on every URL. If we can't generate an adapter automatically, you get a wait-list confirmation and we run the audit manually within 24 hours. Either way, you get a real audit: automation doesn't cover every vendor, and we don't pretend otherwise.

Question 3

Is the report private?

Accepted Answer

Yes. Reports live at converra.ai/eval/r/[token] — only people with the link can read them. No index, no search, no public gallery without your opt-in. Share by sending the link, or revoke the token from your settings.

Question 4

How is this different from Braintrust / Promptfoo / Patronus?

Accepted Answer

Eval frameworks score test sets you write. We probe the agent you ship: adversarial probes look for behaviors you didn't think to test for, and they run against your live production agent with no integration required.

Question 5

What does "prompt-level fix" mean?

Accepted Answer

Each finding includes the actual edit to make to your system prompt or scaffolding to close the issue. Not generic advice — the literal text or the literal config flag. Same as you'd get from a thorough code review.

Question 6

How is the score computed?

Accepted Answer

Each report scores the agent against five categories: task completion, accuracy, tone, safety, and platform hygiene. The headline score is a weighted aggregate. The eval set used is shown on every report so you can see what you're comparing against, and it's pinned across re-audits so the rubric isn't shifting underneath you.
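As a rough illustration of what a weighted aggregate over the five categories could look like, here is a minimal Python sketch. The weights, the 0-100 scale, and the names `CATEGORY_WEIGHTS` and `headline_score` are assumptions made for the example; the answer above does not state the actual weights or formula.

```python
# Hypothetical weights for illustration only; the real weights are not published here.
CATEGORY_WEIGHTS = {
    "task_completion": 0.30,
    "accuracy": 0.25,
    "tone": 0.10,
    "safety": 0.25,
    "platform_hygiene": 0.10,
}


def headline_score(category_scores, weights=CATEGORY_WEIGHTS):
    """Return the weighted average of per-category scores (assumed 0-100 scale)."""
    missing = set(weights) - set(category_scores)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    # Dividing by the total weight keeps the result on the same scale
    # as the inputs even if the weights do not sum to exactly 1.
    weighted_sum = sum(category_scores[name] * w for name, w in weights.items())
    return weighted_sum / total_weight
```

Normalizing by the total weight means the weights can be adjusted without rescaling the headline number, which is one way to keep a pinned rubric comparable across runs.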

Question 7

Can I re-audit a saved report?

Accepted Answer

Yes. Save a report to a Converra account, then hit Re-audit — the new run is compared against the previous one with score deltas per category, scenarios that flipped pass/fail, and a verdict-shift banner at the top. Re-audits use the same eval set version as the original, so the comparison is apples-to-apples.
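The comparison described above (per-category score deltas plus scenarios that flipped pass/fail, on a matching eval set version) can be sketched roughly as follows. The `AuditRun` structure, its field names, and `compare_runs` are hypothetical and only illustrate the idea; they are not Converra's data model.

```python
from dataclasses import dataclass


@dataclass
class AuditRun:
    # Hypothetical shape of a saved report, for illustration only.
    eval_set_version: str
    category_scores: dict  # category name -> score
    scenario_passed: dict  # scenario id -> True (pass) / False (fail)


def compare_runs(previous, current):
    """Compare two runs of the same agent on the same eval set version."""
    if previous.eval_set_version != current.eval_set_version:
        raise ValueError("eval set versions differ; comparison is not like-for-like")

    # Score delta per category present in both runs (current minus previous).
    score_deltas = {
        name: current.category_scores[name] - previous.category_scores[name]
        for name in previous.category_scores
        if name in current.category_scores
    }

    # Scenarios whose pass/fail result changed, as (before, after) pairs.
    flipped = {
        sid: (previous.scenario_passed[sid], current.scenario_passed[sid])
        for sid in previous.scenario_passed
        if sid in current.scenario_passed
        and previous.scenario_passed[sid] != current.scenario_passed[sid]
    }
    return {"score_deltas": score_deltas, "flipped_scenarios": flipped}
```

Refusing to compare runs from different eval set versions is the design choice that keeps the deltas meaningful: a score change then reflects the agent, not a changed rubric.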

Question 8

Do you scrape competitors?

Accepted Answer

We probe agents that are publicly accessible from a browser. Same data anyone with DevTools can see. We honor explicit no-audit requests and redact customer-specific content from any published audit.

We'll break your AI agent.

Sample · Customer-Support Agent

The transcript and the fix. Every finding.

System prompt streams verbatim to every browser

What happened

Transcript excerpt — raw SSE stream

How to fix

URL in. Scorecard out. Ten minutes.

You paste a URL.

We discover the agent.

We run 35 probes.

You get the report.

Run yours. See exactly what we find.

Score test sets you write.

Probes the agent you ship.

Free to start. Pay when you want it continuously.

Questions we hear.

See your agent's score in ten minutes.