Why I Founded Converra

I was responsible for shipping AI into production at Totango—and at scale, every release felt like a gamble.

The process was familiar: adjust a prompt, manually test expected cases, scan logs, deploy, then wait for real users to surface edge cases. I could validate what I anticipated—but production behavior was always messier, more diverse, and harder to predict.

Learning happened after deployment, not before it. Improvements were inferred from downstream signals, not proven upfront. Confidence came only after issues appeared—or didn't.

After leaving Totango, I advised a founder building an AI product already seeing real usage. Different team, different stack, same failure mode. Manual testing. Log-driven analysis. Post-hoc fixes. The pattern repeated.

That's when it became clear this wasn't a team problem or a tooling gap.

It was structural.

As systems scale, we stop relying on intuition:

  • Infrastructure becomes self-healing.
  • Ad platforms optimize continuously.
  • Models retrain from feedback loops.

But production AI agents—systems expected to reason, adapt, and interact with humans—are still improved through manual iteration and hindsight.

That approach doesn't scale.

The Insight

Why evaluation—not observability—is the bottleneck

Most AI teams already have observability: logs, traces, dashboards, feedback pipelines. Observability explains what happened. It's essential for debugging failures after the fact.

Evaluation serves a different purpose. It establishes causality.

Without systematic, repeatable evaluation against realistic scenarios, teams can't reliably answer foundational questions:

  • Did this change actually improve behavior?
  • Are the gains meaningful, or just noise?
  • What regressed elsewhere?

Logs capture outcomes. Evaluation isolates signal from noise.
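
To make the distinction concrete, here is a minimal sketch of what repeatable evaluation could look like. Everything in it is hypothetical and illustrative, not Converra's implementation: agent_fn stands in for any agent variant, scenarios for a fixed set of realistic test cases, and score_fn for whatever quality metric a team trusts.

    import random
    import statistics

    def evaluate(agent_fn, scenarios, score_fn, runs=3):
        """Score one agent variant on a fixed scenario set. Repeated runs
        per scenario smooth out nondeterministic outputs; returns one mean
        score per scenario, in scenario order."""
        per_scenario = []
        for s in scenarios:
            trials = [score_fn(s, agent_fn(s)) for _ in range(runs)]
            per_scenario.append(statistics.mean(trials))
        return per_scenario

    def is_real_improvement(baseline, candidate, n_boot=10_000, confidence=0.95):
        """Paired bootstrap over per-scenario score deltas: resample the
        scenario set and count how often the candidate still wins. Answers
        'meaningful gain or noise?', not just 'which mean is bigger?'"""
        deltas = [c - b for b, c in zip(baseline, candidate)]
        wins = sum(
            statistics.mean(random.choices(deltas, k=len(deltas))) > 0
            for _ in range(n_boot)
        )
        return wins / n_boot >= confidence

A paired bootstrap is only one way to separate signal from noise. The property that matters is that both variants face the same fixed scenarios, rather than whatever traffic happens to arrive after deploy.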

At scale, progress comes from measured evolution, not revolution. Large behavioral rewrites are risky, hard to validate, and often regress in unexpected ways. What actually compounds is a sequence of small, provable improvements—each one understood, evaluated, and intentionally promoted.

Designer Virgil Abloh built his practice around what he called the “3% approach”—the idea that meaningful innovation comes from small, deliberate edits rather than wholesale reinvention.

Improving AI agents works the same way. The challenge isn't generating bold new behaviors—it's determining whether a change makes an agent meaningfully better, by a measurable margin, without introducing new failure modes. Without evaluation, teams are forced into all-or-nothing releases. With it, improvement becomes incremental, safe, and cumulative.
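
Continuing the hypothetical sketch above, the ship decision itself becomes a gate. The thresholds are illustrative, not prescriptive; the 3% minimum gain is a nod to Abloh, nothing more.

    def should_promote(baseline, candidate, min_gain=0.03, max_regression=0.05):
        """Gate a candidate change: require a measurable aggregate gain and
        reject anything that regresses any single scenario badly (a new
        failure mode in disguise)."""
        gain = statistics.mean(candidate) - statistics.mean(baseline)
        worst = min(c - b for b, c in zip(baseline, candidate))
        return gain >= min_gain and worst >= -max_regression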

Frameworks help generate better variants. Observability helps explain outcomes. Converra exists to decide—systematically—which changes should ship.

As teams scale, internal tooling and frameworks tend to accrete around logging, experimentation, and prompt iteration. What's missing is a system that turns those inputs into decisions. Converra doesn't replace generation frameworks or observability tools—it sits above them, establishing which changes are safe, meaningful, and worth deploying.

Converra is an autonomous optimization layer for production AI agents—built for teams shipping to real users. It closes the loop between simulation, evaluation, and deployment so agent behavior can evolve safely, deliberately, and with confidence.
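
In sketch form, using the hypothetical helpers above (with generate_variant standing in for whatever framework proposes a change), one iteration of that loop might look like:

    def improvement_step(agent, generate_variant, scenarios, score_fn):
        """One closed-loop iteration: propose a variant, evaluate incumbent
        and candidate on the same simulated scenarios, and promote the
        candidate only if the gain is real and clears the gate."""
        candidate = generate_variant(agent)
        base = evaluate(agent, scenarios, score_fn)
        cand = evaluate(candidate, scenarios, score_fn)
        if is_real_improvement(base, cand) and should_promote(base, cand):
            return candidate
        return agent

The incumbent keeps serving production unless the candidate provably earns its place.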

The goal isn't more prompts.

It's fewer wrong decisions.

The Approach

Converra was built end-to-end by design. The core challenge wasn't writing code—it was designing a system with tight control over the feedback loop between agent behavior, evaluation, and real-world usage.

Building the foundation myself allowed rapid iteration and immediate testing of assumptions, and kept the system from being prematurely locked into abstractions that didn't reflect production reality. That coherence was necessary to prove the model: that AI behavior can be improved autonomously, not manually.

Establishing that foundation came first. Scaling the team comes later.

The Founder

Oren Cohen

Founder, Converra

Second-time founder. Previously founded Buildup, a construction tech company, later acquired by Stanley Black & Decker.

As VP Product Growth at Totango, helped re-accelerate ARR growth from approximately 10% to 43% YoY while leading multiple AI initiatives in production.

Computer Science, Technion.

Based in New York City.
