
Detect. Patch. Verify. Learn.

One loop that patches vulnerabilities, proves fixes work, and makes your agents smarter. 15 agents. 128 bugs. Scaling to 6,138+.

verification
memory safety · your codebase
Claude Opus 4.6 ● 14/14 tests
62.7% best pass rate
1,920 verified evaluations
$2.64 cheapest verified fix

Automated patching with proof that fixes work.

How verification works

XOR detects the vulnerability, dispatches an agent to patch it, and writes a verifier to confirm the fix resolves the specific vulnerability. Best pass rate: 62.7%. 1,920 evaluations completed.

Agents improve with every verification cycle.

Learning data

Failed fixes are the primary learning signal. XOR upgrades the agent harness (system prompt, tools, memory) after every run. Pass rates go up. Costs go down. $2.64 to $52 per verified fix across 15 configurations.

Compliance evidence from every run.

Standards alignment

Every agent action is cryptographically signed and logged. Produces evidence for SOC 2, FedRAMP, EU AI Act, Cyber Resilience Act, and PCI DSS. Built on an open IETF Internet-Draft.

Two interfaces. One verification engine.

GitHub App: automates PR review, fix generation, and CI hardening on your repos. Agent Plugin: wraps your coding agent in a verification harness with secure skills and memory. Choose one or both.


Detect. Patch. Verify. Learn. Repeat.

XOR detects the vulnerability, dispatches an agent to write a fix, tests the fix against a verifier it wrote for the specific vulnerability, records the result, and feeds the outcome back into the agent harness. Failed fixes teach agents what to avoid. Passing fixes expand the training set. Every cycle makes agents more accurate and cheaper.
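The cycle described above can be sketched in a few lines. This is a minimal illustration, not XOR's implementation: `Harness`, `Outcome`, `run_cycle`, `toy_agent`, and `toy_verifier` are hypothetical names, the "harness" is reduced to a list of recorded lessons, and detection is assumed to have already produced the vulnerability label.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Harness:
    """Hypothetical agent harness: here, just a list of lessons from past runs."""
    lessons: List[str] = field(default_factory=list)


@dataclass
class Outcome:
    vulnerability: str
    patch: str
    verified: bool


def run_cycle(
    vulnerability: str,
    agent: Callable[[str, Harness], str],
    verifier: Callable[[str], bool],
    harness: Harness,
) -> Outcome:
    # Patch: the agent proposes a fix, informed by the current harness.
    patch = agent(vulnerability, harness)
    # Verify: test the fix with a check written for this vulnerability.
    verified = verifier(patch)
    # Learn: record the result so the next attempt can use it.
    verdict = "passed" if verified else "failed"
    harness.lessons.append(f"{vulnerability}: {patch!r} {verdict}")
    return Outcome(vulnerability, patch, verified)


def toy_agent(vulnerability: str, harness: Harness) -> str:
    # Toy agent: tries a bounds check only after a failed attempt is on record.
    if any("failed" in lesson for lesson in harness.lessons):
        return "add bounds check"
    return "increase buffer size"


def toy_verifier(patch: str) -> bool:
    # Toy verifier: only a bounds check resolves this overflow.
    return patch == "add bounds check"


harness = Harness()
first = run_cycle("buffer overflow", toy_agent, toy_verifier, harness)
second = run_cycle("buffer overflow", toy_agent, toy_verifier, harness)
```

In this toy run the first attempt fails verification, the failure is recorded, and the second attempt uses that record to produce a fix that passes.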

verification pipeline
[01] DETECT: Identify the vulnerability
[02] PATCH: Agent generates fix
[03] VERIFY: Test fix against vulnerability
[04] LEARN: Feed results back
Evidence
detect: Buffer overflow · src/parser.c:142
patch: codex gpt 5.2 · 23 lines
verify: 14/14 tests pass
learn: Results recorded
Each fix verified against the vulnerability it claims to address
[AGENT COMPARISON]
Pass rate by agent, across all evaluations (15 agents)
All agents avg: 50.5%
Best agent: 62.7%
Cheapest fix: $2.64 (of $52 max)
Latest: text-shaping/engine · VERIFIED

Four skills. Every wrapped agent.

The Agent Plugin provides four core skills to every coding agent it wraps.

[Scan]

Identify vulnerabilities in the target codebase.

[Audit]

Verify agent tool configurations, sandbox boundaries, and credential exposure.

[Report]

Generate evidence reports with pass/fail outcomes and audit trails.

[Sign]

Cryptographically sign the verification record (COSE_Sign1).
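To illustrate why a signed record is tamper-evident, here is a rough sketch using only Python's standard library. It does not implement COSE_Sign1: it swaps in an HMAC-SHA256 tag chained across entries, which is a symmetric stand-in for a real signature. The key, the entry fields, and the helper names `append_entry` and `verify_log` are all assumptions made for the example.

```python
import hashlib
import hmac
import json

KEY = b"demo-key-not-for-production"  # hypothetical key, for illustration only


def _tag(action, prev, key):
    # Tag covers the action and the previous entry's tag, chaining the log.
    payload = json.dumps({"action": action, "prev": prev}, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def append_entry(log, action, key=KEY):
    prev = log[-1]["tag"] if log else ""
    log.append({"action": action, "prev": prev, "tag": _tag(action, prev, key)})


def verify_log(log, key=KEY):
    prev = ""
    for entry in log:
        expected = _tag(entry["action"], prev, key)
        if entry["prev"] != prev or not hmac.compare_digest(entry["tag"], expected):
            return False
        prev = entry["tag"]
    return True


log = []
append_entry(log, {"tool": "edit_file", "path": "src/parser.c"})
append_entry(log, {"tool": "run_tests", "result": "14/14 pass"})
intact = verify_log(log)

# Editing a recorded action after the fact breaks verification.
log[0]["action"]["path"] = "src/other.c"
tampered_ok = verify_log(log)
```

A real COSE_Sign1 record would use an asymmetric signature, so a verifier would not need the signing key; the chaining idea shown here is an addition for the example and is not claimed by the page.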

Three artifacts. Every run.

Evidence Report

Attached to every PR. Shows the bug, the fix, test results, and pass/fail outcome.

Signed Audit Trail

Cryptographically signed record of every action the agent took. Every tool used, file edited, and reasoning step.

Benchmark Report

128 vulnerability test cases. 15 agents. 1,920 results. Pass rates, cost per fix, difficulty scores.
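The benchmark totals follow from simple arithmetic: 128 test cases run by each of 15 agents gives 1,920 results. The sketch below shows that product and one plausible way to derive per-agent pass rate and cost per verified fix. The `summarize` helper, the tuple layout, and the toy data are hypothetical, and defining cost per verified fix as total spend divided by passing fixes is an assumption, not a definition taken from the report.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

TEST_CASES = 128
AGENTS = 15
TOTAL_EVALUATIONS = TEST_CASES * AGENTS  # 1,920


def summarize(results: Iterable[Tuple[str, bool, float]]) -> Dict[str, dict]:
    """results: (agent, passed, cost_usd) for each evaluation."""
    totals = defaultdict(lambda: {"runs": 0, "passes": 0, "cost": 0.0})
    for agent, passed, cost_usd in results:
        t = totals[agent]
        t["runs"] += 1
        t["passes"] += int(passed)
        t["cost"] += cost_usd

    summary = {}
    for agent, t in totals.items():
        summary[agent] = {
            "pass_rate": t["passes"] / t["runs"],
            # Assumed definition: total spend divided by verified fixes.
            "cost_per_verified_fix": (
                t["cost"] / t["passes"] if t["passes"] else None
            ),
        }
    return summary


# Toy data, not benchmark results.
toy_results = [
    ("agent-a", True, 2.0),
    ("agent-a", False, 2.0),
    ("agent-b", True, 5.0),
    ("agent-b", True, 5.0),
]
summary = summarize(toy_results)
```

Under this definition, failed attempts still count toward cost, so an agent with cheap runs but a low pass rate can end up with a higher cost per verified fix than a pricier, more reliable one.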

No auto-merge

Every change requires verification. No shortcuts.

No unmonitored runs

If XOR can't observe the agent, it can't verify the output.

No claims without data

Every number on this page is from verified benchmarks.

Free benchmark report. GitHub App free for public repos. Agent Plugin free during beta. No credit card.

$ xor patch --verify --learn

Book a demo