[ECONOMICS]

Agent Cost Economics

Fix vulnerabilities for $2.64–$52 with agents. 100x cheaper than incident response. Real cost data.

Agent Cost Tiers

  1. Standard agents: $2.64–$15 per fix
  2. Advanced agents: $15–$45 per fix
  3. Frontier agents: $30–$52 per fix

Hidden Costs of Incident Response

When a CVE hits production, costs multiply: engineer time, customer notifications, reputation damage, regulatory fines. Fixing in pre-production saves orders of magnitude.

Cost Optimization

Use cheaper agents for easy bugs (syntax errors, refactors). Reserve frontier agents for hard architectural problems. XOR tracks which agent solves which classes of bugs best.

Total benchmark cost: about $11,660 (sum of the API Cost column in the rankings table below)
Cheapest per pass: $2.64
Best trade-off agents: 3
Test runs completed: 1,920


What it costs to fix a bug with AI

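We spent about $11,660 in API costs (the sum of the API Cost column in the rankings table) running 15 agents across 128 real bugs. The cheapest agent fixes bugs for $2.64 each. The most expensive costs $51.88 per fix, while the highest pass rate (62.7%, codex-gpt-5.2) costs $5.30 per fix. The benchmark is growing to 6,138+ vulnerabilities across 250+ projects.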

Cost matters because you will run agents repeatedly. If each fix costs $5 and you patch 500 vulnerabilities, you spend $2,500. The same workload with a $0.50 agent costs $250. These numbers drive real procurement decisions. We measure actual token consumption from API logs, not estimates.
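The budget arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming every vulnerability is fixed at the quoted per-fix price; the function name projected_spend is hypothetical and not part of any tool described here.

```python
def projected_spend(cost_per_fix: float, num_vulnerabilities: int) -> float:
    """Total spend if each vulnerability costs cost_per_fix to patch."""
    return cost_per_fix * num_vulnerabilities


# The two scenarios from the text: a $5 agent and a $0.50 agent on 500 bugs.
print(projected_spend(5.00, 500))  # 2500.0
print(projected_spend(0.50, 500))  # 250.0
```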

Budget with real data

Security ROI = (risk reduced - cost) / cost. These tested cost-per-fix numbers replace guesswork in your budget.
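As a rough sketch, the ROI formula translates directly into code. The dollar figures in the example are hypothetical placeholders, not benchmark results, and security_roi is an illustrative name.

```python
def security_roi(risk_reduced: float, cost: float) -> float:
    """Security ROI = (risk reduced - cost) / cost, both in the same currency."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (risk_reduced - cost) / cost


# Hypothetical inputs: $1,000 of risk reduced for $250 of agent spend.
print(security_roi(1000.0, 250.0))  # 3.0, i.e. a 300% return
```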

See Agentic SecEcon →

Cost vs Performance

Each dot is an agent. X-axis: cost per successful patch (log scale). Y-axis: pass rate. The dashed line shows the best trade-off: for any agent on it, no other agent is both cheaper and more accurate.

The scatter reveals agent clusters. Some agents cluster together at 40-50% pass rate with similar costs; on these aggregate numbers they are roughly interchangeable. But outliers exist: agents that are cheap but miss easy bugs, or expensive but uniquely capable on hard ones. Your choice depends on your budget and your bug distribution.

Pareto Frontier with Confidence Intervals

Cost efficiency frontier with 95% Wald confidence intervals on pass rates. The Pareto frontier identifies agents where no alternative is both cheaper AND more accurate. Every agent on this line represents a genuine trade-off decision: lower cost or higher accuracy, but not both.
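One way to compute such a frontier is a single sweep over agents sorted by cost, keeping each agent that beats the best pass rate seen so far. The sketch below uses the $/Pass and Pass Rate values from the Cost Efficiency Rankings table; the function name and the tie-breaking rule are assumptions, not the benchmark's own implementation.

```python
# (agent, cost per pass in USD, pass rate in percent), from the rankings table.
AGENTS = [
    ("claude-claude-opus-4-5", 2.64, 45.7),
    ("claude-claude-opus-4-6", 2.93, 61.6),
    ("gemini31-gemini-3.1-pro-preview", 3.92, 58.7),
    ("cursor-composer-1.5", 3.93, 45.2),
    ("gemini-gemini-3-pro-preview", 4.85, 43.0),
    ("codex-gpt-5.2", 5.30, 62.7),
    ("opencode-gemini-gemini-3.1-pro-preview", 5.81, 54.9),
    ("cursor-gpt-5.3-codex", 6.16, 50.4),
    ("cursor-gpt-5.2", 6.26, 51.6),
    ("codex-gpt-5.2-codex", 6.65, 49.2),
    ("opencode-gpt-5.2", 6.65, 51.6),
    ("opencode-gpt-5.2-codex", 8.73, 37.8),
    ("cursor-opus-4.6", 35.40, 62.5),
    ("opencode-claude-opus-4-5", 40.13, 36.8),
    ("opencode-claude-opus-4-6", 51.88, 47.5),
]


def pareto_frontier(agents):
    """Return names of agents that no other agent beats on both cost and pass rate."""
    frontier = []
    best_rate = float("-inf")
    # Cheapest first; at equal cost, the higher pass rate goes first.
    for name, cost, rate in sorted(agents, key=lambda a: (a[1], -a[2])):
        if rate > best_rate:  # strictly better than every cheaper agent
            frontier.append(name)
            best_rate = rate
    return frontier


print(pareto_frontier(AGENTS))
```

On this table the sweep keeps three agents (claude-claude-opus-4-5, claude-claude-opus-4-6, codex-gpt-5.2), which matches the count of best trade-off agents reported above.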

Confidence intervals show the statistical range around each agent's pass rate. Wider intervals indicate greater uncertainty, typically from lower sample counts or from pass rates near 50%, where a Wald interval is widest. Agents on the frontier with tight confidence intervals are more reliable choices than those with wide bands.
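For reference, a 95% Wald interval on a pass rate is p ± 1.96 * sqrt(p(1 - p) / n). The sketch below applies it to 77 passes; the sample count n = 125 is an assumption inferred from 77 passes at a 61.6% pass rate, not a figure stated on this page.

```python
import math


def wald_interval(passes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wald confidence interval for a pass rate, clipped to [0, 1]."""
    p = passes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Assumed n = 125 (inferred, see lead-in); z = 1.96 gives a 95% interval.
low, high = wald_interval(77, 125)
print(f"{low:.3f} to {high:.3f}")  # roughly 0.531 to 0.701
```

Because the half-width shrinks with the square root of n, quadrupling the sample count only halves the interval.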

Oracle Set Cover

Greedy set cover showing marginal value of adding each agent to the ensemble. This analysis answers a practical question: if you can run multiple agents on the same bug, which ones should you add to maximize coverage? Start with the best agent (highest pass rate) and add agents that fix bugs the leader misses.

The first agent covers maybe 60% of bugs. Adding the second agent might bring you to 72%. Adding a third might hit 78%. But at some point, the marginal gain from each new agent drops below the cost. This visualization helps you find the optimal ensemble size for your budget.
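The greedy procedure can be sketched as follows. The per-bug results here are toy data with made-up agent and bug names, since this page does not publish which agent fixed which bug; the function is an illustration of greedy set cover, not the benchmark's code.

```python
def greedy_set_cover(solved_by: dict[str, set[str]]) -> list[tuple[str, int, int]]:
    """Repeatedly pick the agent that fixes the most not-yet-covered bugs.

    Returns (agent, marginal gain, cumulative bugs covered) in pick order.
    """
    covered: set[str] = set()
    remaining = dict(solved_by)
    order = []
    while remaining:
        name = max(remaining, key=lambda a: len(remaining[a] - covered))
        gain = len(remaining[name] - covered)
        if gain == 0:  # no remaining agent adds coverage
            break
        covered |= remaining.pop(name)
        order.append((name, gain, len(covered)))
    return order


# Hypothetical results: which bugs each agent fixed.
toy = {
    "agent_a": {"b1", "b2", "b3", "b4", "b5", "b6"},
    "agent_b": {"b1", "b2", "b3", "b7", "b8"},
    "agent_c": {"b4", "b5", "b9"},
    "agent_d": {"b1", "b2"},
}
for name, gain, total in greedy_set_cover(toy):
    print(name, gain, total)
```

In the toy data, agent_d is never picked because everything it fixes is already covered. The same shrinking marginal gain is what sets a sensible ensemble size.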

Cost Efficiency Rankings

Rank  Agent                                    $/Pass   API Cost  Pass Rate  Passes
1     claude-claude-opus-4-5                   $2.64    $153      45.7%      58
2     claude-claude-opus-4-6                   $2.93    $225      61.6%      77
3     gemini31-gemini-3.1-pro-preview          $3.92    $251      58.7%      64
4     cursor-composer-1.5                      $3.93    $224      45.2%      57
5     gemini-gemini-3-pro-preview              $4.85    $267      43.0%      55
6     codex-gpt-5.2                            $5.30    $419      62.7%      79
7     opencode-gemini-gemini-3.1-pro-preview   $5.81    $389      54.9%      67
8     cursor-gpt-5.3-codex                     $6.16    $394      50.4%      64
9     cursor-gpt-5.2                           $6.26    $394      51.6%      63
10    codex-gpt-5.2-codex                      $6.65    $419      49.2%      63
11    opencode-gpt-5.2                         $6.65    $419      51.6%      63
12    opencode-gpt-5.2-codex                   $8.73    $419      37.8%      48
13    cursor-opus-4.6                          $35.40   $2832     62.5%      80
14    opencode-claude-opus-4-5                 $40.13   $1846     36.8%      46
15    opencode-claude-opus-4-6                 $51.88   $3009     47.5%      58
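The $/Pass column appears to equal API cost divided by the number of passes. That relationship is inferred from the table rather than stated in the methodology, so the sketch below is only a consistency check on three rows.

```python
def cost_per_pass(api_cost: float, passes: int) -> float:
    """Cost per successful fix, assuming $/Pass = API cost / passes."""
    return api_cost / passes


# (agent, API cost in USD, passes, listed $/Pass), copied from the table.
rows = [
    ("claude-claude-opus-4-5", 153, 58, 2.64),
    ("cursor-opus-4.6", 2832, 80, 35.40),
    ("opencode-claude-opus-4-6", 3009, 58, 51.88),
]
for agent, api_cost, passes, listed in rows:
    print(agent, round(cost_per_pass(api_cost, passes), 2), listed)
```

Small differences are expected because the API Cost column is rounded to whole dollars.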

[NEXT STEPS]

Optimize your agent spend

The cheapest path: claude-claude-opus-4-5 at $2.64/fix. The highest pass rate: codex-gpt-5.2 at 62.7% for $5.30/fix. The most expensive: opencode-claude-opus-4-6 at $51.88/fix. For most teams, the best pair covers 96/128 bugs.

FAQ

How much does an agent fix cost?

$2.64 to $52 depending on agent and model. Calculated from real API costs across 1,920 evaluations.

Why such a wide range?

Different agents have different API costs (Claude vs Codex vs Gemini). Different bugs require different reasoning depth. Some agents solve in one attempt; others need multiple tries.

How does this compare to incident response?

Incident response for a critical CVE typically costs $10K–$50K in engineer time plus downtime. Agent-based pre-production fixing costs dollars per fix, which makes it 100x–1000x cheaper.

What if the agent fails?

Failed fixes still provide learning signals. You see which agents struggled, which tools they tried, and which approaches didn't work. The spend is not wasted: it buys data.

[RELATED TOPICS]

See which agents produce fixes that work

128 CVEs. 15 agents. 1,920 evaluations. Agents learn from every run.