Patch vulnerabilities with AI. Prove every fix works.

XOR orchestrates coding agents to fix vulnerabilities, verifies each patch against the vulnerability it claims to fix, and feeds results back so agents learn from every run. 128 bugs tested. 15 agents benchmarked.


verification

your codebase · 15 agents tested

Works with

Claude Code

Codex

Gemini CLI

Cursor

Deploy verified agent patches overnight.

How verification works

Agents fix bugs in minutes. XOR verifies each fix against the vulnerability it claims to fix. Ship verified patches by morning instead of waiting weeks for manual review. The best agent in the benchmark reached a 62.7% pass rate. Works with Claude Code, Codex, Gemini CLI, and Cursor.

Verified fixes from $2.64. Incident response costs $50,000+.

See cost breakdown

Pre-production agent patches cost $2.64 to $52 per verified fix. Post-incident response costs $50,000+. XOR feeds every result back into the agent harness so costs go down and pass rates go up with every run. 1,920 results already in the learning dataset.
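The figures quoted here can be checked with simple arithmetic. The sketch below assumes one evaluation per bug and agent pair, which is consistent with the 1,920 results but is not stated on this page. The variable names are illustrative, and the incident figure is the quoted floor, not a measured average.

```python
# Figures quoted on this page; the variable names are illustrative only.
BUGS_TESTED = 128
AGENTS_BENCHMARKED = 15
FIX_COST_LOW = 2.64        # USD per verified fix, low end of the quoted range
FIX_COST_HIGH = 52.00      # USD per verified fix, high end of the quoted range
INCIDENT_COST = 50_000.00  # USD, quoted floor for post-incident response

# Assumption: one evaluation per (bug, agent) pair.
evaluations = BUGS_TESTED * AGENTS_BENCHMARKED

# How many verified fixes the cost of a single incident would cover.
fixes_at_low_cost = INCIDENT_COST / FIX_COST_LOW
fixes_at_high_cost = INCIDENT_COST / FIX_COST_HIGH

print(f"evaluations: {evaluations:,}")
print(f"verified fixes per incident budget: "
      f"{fixes_at_high_cost:,.0f} to {fixes_at_low_cost:,.0f}")
```

Under these assumptions, one incident's budget covers somewhere between roughly a thousand and roughly nineteen thousand verified fixes, depending on where a fix falls in the quoted price range.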

Signed compliance evidence from every agent run.

Standards alignment

Sell to regulated buyers. Every verification run produces a signed audit trail: tool calls, file edits, reasoning steps, all cryptographically signed. Produces evidence for SOC 2, EU AI Act, Cyber Resilience Act, FedRAMP, and PCI DSS.
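To make the idea of a signed audit trail concrete, here is a minimal sketch using only the Python standard library. It is not XOR's implementation: the page does not say which signature scheme is used, so HMAC-SHA256 with a shared key stands in here, and the function names, event fields, and key are all hypothetical. A production system would more likely use asymmetric signatures so that auditors can verify without holding the signing key.

```python
import hashlib
import hmac
import json


def _canonical(events: list[dict]) -> bytes:
    # Stable serialization so the same events always produce the same bytes.
    return json.dumps(events, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_trail(events: list[dict], key: bytes) -> dict:
    """Attach an HMAC-SHA256 tag to a list of audit events (hypothetical format)."""
    tag = hmac.new(key, _canonical(events), hashlib.sha256).hexdigest()
    return {"events": events, "signature": tag}


def verify_trail(record: dict, key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, _canonical(record["events"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])


# Illustrative events covering the three kinds of entries named above.
key = b"demo-key-not-for-production"
trail = [
    {"step": 1, "kind": "tool_call", "detail": "run test suite"},
    {"step": 2, "kind": "file_edit", "detail": "src/parser.c"},
    {"step": 3, "kind": "reasoning", "detail": "added bounds check"},
]
record = sign_trail(trail, key)
print("untampered:", verify_trail(record, key))

# Any change to a recorded event invalidates the signature.
record["events"][1]["detail"] = "src/other.c"
print("tampered:", verify_trail(record, key))
```

The point of the design is that the evidence is checkable after the fact: anyone holding the verification key can detect an edited tool call, file edit, or reasoning step.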

Your priorities, verified and shipped.

Agents fix bugs in minutes. XOR verifies each fix against the vulnerability it claims to fix and attaches pass/fail evidence to the PR. No manual reproduction. No guessing. Best agent pass rate: 62.7%.

How verification works

verification pipeline

Detect → Patch → Verify → Learn

[01] DETECT: Identify the vulnerability

[02] PATCH: Agent generates fix

[03] VERIFY: Test fix against vulnerability

[04] LEARN: Feed results back

Evidence

detect: Buffer overflow · src/parser.c:142
patch: codex gpt 5.2 · 23 lines
verify: 14/14 tests pass
learn: Results recorded

Each fix verified against the vulnerability it claims to address
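The four pipeline stages map naturally onto a single evidence record per run. The sketch below is a hypothetical data model, not XOR's schema: the `Evidence` class, its field names, and the `learn` helper are assumptions made for illustration. The sample values are taken from the evidence panel above.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    finding: str            # DETECT: what was found and where
    agent: str              # PATCH: which agent produced the fix
    lines_changed: int      # PATCH: size of the fix
    tests_passed: int       # VERIFY: verifier tests that passed
    tests_total: int        # VERIFY: verifier tests that ran
    recorded: bool = False  # LEARN: whether the result was fed back

    @property
    def verified(self) -> bool:
        # A fix counts as verified only if every verifier test passes.
        return self.tests_total > 0 and self.tests_passed == self.tests_total

    def summary(self) -> list[str]:
        learn_state = "Results recorded" if self.recorded else "pending"
        return [
            f"detect: {self.finding}",
            f"patch: {self.agent} · {self.lines_changed} lines",
            f"verify: {self.tests_passed}/{self.tests_total} tests pass",
            f"learn: {learn_state}",
        ]


def learn(evidence: Evidence, corpus: list[Evidence]) -> None:
    """Append the run to a learning corpus (here just an in-memory list)."""
    corpus.append(evidence)
    evidence.recorded = True


corpus: list[Evidence] = []
run = Evidence("Buffer overflow · src/parser.c:142", "codex gpt 5.2", 23, 14, 14)
learn(run, corpus)
print("\n".join(run.summary()))
```

Keeping pass and total counts, instead of a single boolean, lets failed runs carry as much information into the corpus as successful ones.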

AI agents fix bugs fast. Nobody checks if the fixes work.

Teams deploy Claude Code, Codex, and Cursor to patch vulnerabilities. The agents generate fixes in minutes. But confirming that a fix actually resolves the vulnerability still requires manual verification.

As you scale agent usage across more repos and more vulnerabilities, manual review becomes the bottleneck. You need automated verification that tests each fix against the specific vulnerability it claims to resolve.

XOR closes the gap. It wraps your agent in a verification harness, writes a verifier for each vulnerability, and confirms the fix passes before the code reaches review.
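One way to picture a per-vulnerability verifier is as a check with three conditions: the exploit reproduces on the original code, it no longer reproduces on the patched code, and normal behavior is preserved. The sketch below illustrates that idea with a toy length-checked parser. It is an assumption about what such a verifier could look like, not XOR's harness, and every name in it is hypothetical.

```python
from typing import Callable

Parser = Callable[[bytes], bytes]
BUFFER_SIZE = 8  # toy limit standing in for a fixed-size buffer


def vulnerable_parse(data: bytes) -> bytes:
    # Toy stand-in for an unchecked copy: accepts input of any length.
    return data


def patched_parse(data: bytes) -> bytes:
    # The "fix": reject input that would overflow the buffer.
    if len(data) > BUFFER_SIZE:
        raise ValueError("input exceeds buffer size")
    return data


def exploit_succeeds(parse: Parser) -> bool:
    """True if oversized input gets through the parser."""
    try:
        out = parse(b"A" * (BUFFER_SIZE + 1))
    except ValueError:
        return False
    return len(out) > BUFFER_SIZE


def regression_ok(parse: Parser) -> bool:
    """True if valid input is still handled correctly."""
    try:
        return parse(b"ok") == b"ok"
    except ValueError:
        return False


def verify_fix(original: Parser, patched: Parser) -> bool:
    # The bug must reproduce before the patch, be gone after it,
    # and the patch must not break normal input.
    return (
        exploit_succeeds(original)
        and not exploit_succeeds(patched)
        and regression_ok(patched)
    )


print("real fix verified:", verify_fix(vulnerable_parse, patched_parse))
print("no-op fix verified:", verify_fix(vulnerable_parse, vulnerable_parse))
```

Requiring the exploit to reproduce on the original code guards against a verifier that passes simply because it never triggered the bug in the first place.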

Agents get smarter with every run.

Reasoning tokens have real value. Most enterprises pay for them and capture none of it. XOR records every agent trajectory, verifies outcomes, and feeds the results back. Your agents get better. Your costs go down. 1,920 evaluations already in the learning corpus.

How agents learn

Triage by business impact, not severity score.

CVSS scores measure technical severity. XOR adds business context: which vulnerabilities affect revenue-critical code, which agents fix them cheapest, and where your budget goes furthest. $2.64 to $52 per verified fix.
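The difference between ranking by severity and ranking by business impact can be shown with a toy scoring function. Everything in this sketch is an assumption made for illustration: the findings are invented, the `revenue_critical` flag and its weight of 2.0 are arbitrary, and the formula is not XOR's triage model. Only the two cost figures come from the range quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    name: str
    cvss: float             # technical severity, 0.0 to 10.0
    revenue_critical: bool  # hypothetical business-context flag
    est_fix_cost: float     # hypothetical estimated USD per verified fix


def business_priority(f: Finding) -> float:
    # Assumed formula: severity, weighted by business context,
    # per dollar of expected fix cost. Higher means fix sooner.
    weight = 2.0 if f.revenue_critical else 1.0
    return f.cvss * weight / f.est_fix_cost


# Invented findings, for illustration only.
findings = [
    Finding("internal-tool-overflow", cvss=9.1, revenue_critical=False, est_fix_cost=52.00),
    Finding("checkout-injection", cvss=7.4, revenue_critical=True, est_fix_cost=2.64),
]

by_cvss = sorted(findings, key=lambda f: f.cvss, reverse=True)
by_business = sorted(findings, key=business_priority, reverse=True)

print("by CVSS:    ", [f.name for f in by_cvss])
print("by business:", [f.name for f in by_business])
```

In this toy data the two orderings disagree: the finding with the lower CVSS score ranks first once revenue impact and fix cost are taken into account.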

Cost and triage data

"Are these real bugs or hand-picked examples?"

Real bugs. All 128 vulnerabilities are curated with reproduction environments and deterministic verifiers. We don't cherry-pick. Every agent is tested on the same set.

"How recent are the vulnerabilities?"

The dataset is updated on a rolling basis. Every one of the 128 samples ships with a reproduction environment and a deterministic verifier, so agents are tested against bugs that still reproduce today.

"Can I run these tests on my own infrastructure?"

Yes. Download the benchmark dataset. Run locally or in your environment. We publish the full vulnerability list with reproduction steps.

Free benchmark report. GitHub App free for public repos. Agent Plugin free during beta. No credit card.

$ xor patch --verify --learn


See benchmark results →

Ship verified patches faster than manual review.

Cut vulnerability costs from $50,000+ incidents to $2.64 agent fixes.

Pick the right agent for each vulnerability.

Triage by business impact, not just CVSS scores.

Enter regulated markets with signed audit trails.
