[PATCHING VERIFICATION PLATFORM]

Patch vulnerabilities with AI. Prove every fix works.

XOR orchestrates coding agents to fix CVEs, verifies each patch against the vulnerability it claims to fix, and feeds results back so agents learn from every run. 136 bugs tested. 9 agents benchmarked.

verification
harfbuzz/harfbuzz · CVE-2024-11033

Works with

Claude Code, Codex, Gemini CLI, Cursor
[SHIP FASTER]

Deploy verified agent patches overnight.

How verification works

Agents fix bugs in minutes. XOR verifies each fix against the vulnerability it claims to fix. Ship verified patches by morning instead of waiting weeks for manual review. Best pass rate: 50.7%. Works with Claude Code, Codex, Gemini CLI, and Cursor.

[CUT CVE COSTS]

Verified fix for $4.16. Incident response costs thousands.

See cost breakdown

Pre-production agent patches cost $4.16 to $87 per verified fix. Post-incident response costs $50,000+. XOR feeds every result back into the agent harness so costs go down and pass rates go up with every run. 1,224 results already in the learning dataset.
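To put those figures side by side, here is a back-of-envelope calculation in Python. It uses only the numbers quoted above and treats $50,000 as a flat incident cost, which is a simplifying assumption because the text gives it as a lower bound. The name `fixes_per_incident` is illustrative and not part of XOR.

```python
# Figures from the text, held in cents to avoid floating-point rounding.
FIX_COST_LOW_CENTS = 416          # $4.16 per verified fix (low end)
FIX_COST_HIGH_CENTS = 8_700       # $87 per verified fix (high end)
INCIDENT_COST_CENTS = 5_000_000   # $50,000, assumed flat (the text says "$50,000+")


def fixes_per_incident(fix_cost_cents: int) -> int:
    """Whole verified fixes that fit inside the cost of one incident."""
    return INCIDENT_COST_CENTS // fix_cost_cents


cheapest = fixes_per_incident(FIX_COST_LOW_CENTS)
priciest = fixes_per_incident(FIX_COST_HIGH_CENTS)

print(f"At $4.16 per fix: {cheapest} fixes per incident budget")
print(f"At $87 per fix:   {priciest} fixes per incident budget")
```

Even at the high end of the range, the lower-bound cost of one incident covers several hundred verified fixes.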

[UNLOCK REGULATED MARKETS]

Signed compliance evidence from every agent run.

Standards alignment

Sell to regulated buyers. Every verification run produces a cryptographically signed audit trail covering tool calls, file edits, and reasoning steps. The trail serves as evidence for SOC 2, EU AI Act, Cyber Resilience Act, FedRAMP, and PCI DSS.
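As a rough illustration of what a tamper-evident run record can look like, the sketch below signs a JSON record with HMAC-SHA256 from Python's standard library. This is an assumption made for illustration only: the page does not say which signature scheme XOR uses, and a production audit trail would more likely rely on asymmetric signatures. The record fields, `sign_record`, and `verify_record` are hypothetical names.

```python
import hashlib
import hmac
import json


def sign_record(record: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 over a canonical JSON encoding of the record."""
    # Sorted keys and fixed separators make the encoding deterministic.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_record(record: dict, key: bytes, signature: str) -> bool:
    """Check the signature in constant time."""
    return hmac.compare_digest(sign_record(record, key), signature)


key = b"demo-key-not-for-production"  # hypothetical key, for the example only
run_record = {
    "run_id": "example-run-1",  # hypothetical identifier
    "events": [
        {"type": "tool_call", "detail": "run tests"},
        {"type": "file_edit", "detail": "src/parser.c"},
        {"type": "reasoning_step", "detail": "add bounds check"},
    ],
}

signature = sign_record(run_record, key)

# Changing any field after signing invalidates the signature.
tampered = json.loads(json.dumps(run_record))
tampered["events"][1]["detail"] = "src/other.c"

print(verify_record(run_record, key, signature))
print(verify_record(tampered, key, signature))
```

The point of the canonical encoding is that the same record always produces the same bytes, so any later edit to a tool call, file edit, or reasoning step is detectable.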

Your priorities, verified and shipped.

Agents fix bugs in minutes. XOR verifies each fix against the vulnerability it claims to fix and attaches pass/fail evidence to the PR. No manual reproduction. No guessing. Best agent pass rate: 50.7%.

How verification works
[THE GAP]

AI agents fix bugs fast. Nobody checks if the fixes work.

Teams deploy Claude Code, Codex, and Cursor to patch vulnerabilities. The agents generate fixes in minutes. But whether those fixes actually resolve the vulnerability — that still requires manual verification.

As you scale agent usage across more repos and more CVEs, manual review becomes the bottleneck. You need automated verification that tests each fix against the specific vulnerability it claims to resolve.

XOR closes the gap. It wraps your agent in a verification harness, writes a verifier for each CVE, and confirms the fix passes — before the code reaches review.
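One common way to check a fix against a specific vulnerability is a before/after test: a reproducer input must trigger the bug on the original code and must not trigger it on the patched code. The sketch below shows that idea on a toy parser. It is a minimal illustration under that assumption, not a description of XOR's actual verifiers, and every name in it (`Verdict`, `verify_fix`, `parse_original`, `parse_patched`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

Parser = Callable[[bytes], int]


@dataclass
class Verdict:
    passed: bool
    reason: str


def crashes(parse: Parser, data: bytes) -> bool:
    """Return True if the parser hits the out-of-bounds read on this input."""
    try:
        parse(data)
    except IndexError:
        return True
    return False


def verify_fix(original: Parser, patched: Parser, reproducer: bytes) -> Verdict:
    """Pass only if the bug reproduces before the patch and not after it."""
    if not crashes(original, reproducer):
        return Verdict(False, "reproducer does not trigger the bug on the original code")
    if crashes(patched, reproducer):
        return Verdict(False, "reproducer still triggers the bug after the patch")
    return Verdict(True, "bug reproduces before the patch and not after")


def parse_original(data: bytes) -> int:
    length = data[0]
    return data[length]  # out-of-bounds read when length >= len(data)


def parse_patched(data: bytes) -> int:
    length = data[0]
    if length >= len(data):  # the fix: validate the offset before reading
        return -1
    return data[length]


reproducer = bytes([5, 1, 2])  # first byte claims an offset past the end

good = verify_fix(parse_original, parse_patched, reproducer)
no_op = verify_fix(parse_original, parse_original, reproducer)
print(good.passed, "-", good.reason)
print(no_op.passed, "-", no_op.reason)
```

Checking the original code first guards against a reproducer that never triggered the bug, which would otherwise make any patch look like a pass.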

[PRINCIPLE 1]

Agents get smarter with every run.

Reasoning tokens have real value. Most enterprises pay for them and capture none of it. XOR records every agent trajectory, verifies outcomes, and feeds the results back. Your agents get better. Your costs go down. 1,224 evaluations already in the learning corpus.

How agents learn
[PRINCIPLE 2]

Triage by business impact, not severity score.

CVSS scores measure technical severity. XOR adds business context: which vulnerabilities affect revenue-critical code, which agents fix them most cheaply, and where your budget goes furthest. $4.16 to $87 per verified fix.
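To make the difference from severity-only ranking concrete, here is a small Python sketch that orders findings by business impact first and CVSS second. The ordering rule, the `Finding` fields, and the sample identifiers are assumptions made for illustration. The page does not describe how XOR actually weighs these factors.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    cve_id: str             # placeholder identifiers, not real CVEs
    cvss: float             # technical severity score
    revenue_critical: bool  # hypothetical business-context flag


def triage(findings: List[Finding]) -> List[Finding]:
    """Revenue-critical findings first, then highest CVSS within each group."""
    return sorted(findings, key=lambda f: (not f.revenue_critical, -f.cvss))


findings = [
    Finding("EXAMPLE-A", 9.8, False),
    Finding("EXAMPLE-B", 6.5, True),
    Finding("EXAMPLE-C", 7.2, True),
]

order = [f.cve_id for f in triage(findings)]
print(order)
```

Under this rule the 9.8-scored finding drops to last place because it does not touch revenue-critical code, which a CVSS-only sort would put first.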

Cost and triage data

"Are these real bugs or hand-picked examples?"

Real bugs. All 136 vulnerabilities are from the public CVE database or ARVO v3.0.0 corpus. We don't cherry-pick. Every agent is tested on the same set.

"How recent are the CVEs?"

The dataset is updated regularly. Oldest CVE in the dataset: 2020-01-15. Newest: 2026-02-16. The set covers recent, active threats alongside older ones.

"Can I run these tests on my own infrastructure?"

Yes. Download the benchmark dataset. Run locally or in your environment. We publish the full CVE list with reproduction steps.

Free benchmark report. GitHub App free for open source. Agent Plugin free during beta. No credit card.
READY TO START

$ xor patch --verify --learn

Book a demo
XOR | Automated Vulnerability Patching and Verification