Open vSwitch in CVE-Agent-Bench — 6 vulnerabilities tested
6 vulnerability samples from Open vSwitch (virtual switch), generating 90 evaluations across 15 agents.
Overview
Open vSwitch is a production-grade virtual switch for network virtualization, used by OpenStack and VMware NSX. The switch processes network packets in real time, which requires careful handling of flow rules, packet parsing, and memory management under high throughput. Because Open vSwitch is a key component of cloud infrastructure, bugs in its packet processing can affect data center operations and network isolation between virtual machines.
Benchmark coverage
6 vulnerability samples from Open vSwitch are included in CVE-Agent-Bench, generating 90 individual evaluations across 15 agent configurations. These samples include buffer overflows in flow parsing, MPLS stack overflow vulnerabilities, and packet handling edge cases.
Vulnerability classes
Open vSwitch samples cover vulnerability patterns in packet processing and network protocol handling:
- Buffer overflows in flow rule parsing where field extraction exceeds allocated buffer bounds
- MPLS label stack overflows where a deeply nested label stack exceeds the space reserved for it
- Out-of-bounds reads in packet header parsing when field offsets are incorrect
- Integer overflow in packet field calculations leading to undersized buffers
- Denial of service via crafted packets that trigger expensive operations or memory exhaustion
- OpenFlow protocol violations where malformed messages cause state machine errors
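The MPLS and out-of-bounds items above share one defensive pattern: check the remaining length before every read and cap the recursion or stack depth. The following is a minimal sketch of that pattern, not Open vSwitch code. The function name `parse_mpls_stack` and the cap `MAX_LABELS` are hypothetical and chosen for illustration; only the 32-bit label stack entry layout (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL) comes from the MPLS encoding itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap on recorded labels; not an Open vSwitch constant. */
#define MAX_LABELS 3

/*
 * Parse an MPLS label stack starting at buf.
 * Returns the number of labels stored in labels[], or -1 if the packet
 * is truncated or the stack is deeper than MAX_LABELS.
 */
int parse_mpls_stack(const uint8_t *buf, size_t len,
                     uint32_t labels[MAX_LABELS])
{
    size_t off = 0;
    int count = 0;

    assert(buf != NULL && labels != NULL);

    for (;;) {
        /* off never exceeds len, so this subtraction cannot wrap. */
        if (len - off < 4) {
            return -1;              /* truncated label stack entry */
        }
        if (count == MAX_LABELS) {
            return -1;              /* deeper than we can store: reject */
        }

        /* Label stack entries are big-endian on the wire. */
        uint32_t lse = ((uint32_t)buf[off] << 24) |
                       ((uint32_t)buf[off + 1] << 16) |
                       ((uint32_t)buf[off + 2] << 8) |
                       (uint32_t)buf[off + 3];

        labels[count++] = lse >> 12;    /* top 20 bits are the label */
        off += 4;

        if (lse & 0x100u) {
            return count;           /* bottom-of-stack flag set */
        }
    }
}
```

The depth check runs before the write into `labels[]`, so a packet with more entries than the array can hold is rejected rather than written past the end. A real fix may prefer to skip excess labels instead of rejecting the packet; that choice depends on the switch's semantics and is not specified here.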
Why Open vSwitch bugs are interesting for agent evaluation
Open vSwitch vulnerabilities test an agent's ability to understand network packet processing and state machine implementation. The codebase handles multiple network protocols including OpenFlow, MPLS, and custom encapsulation. Bugs often involve boundary conditions in packet parsing or incorrect field extraction. Agents must generate fixes that enforce packet validation correctly while maintaining the high performance required in data center environments.
Packet processing code is particularly challenging because bugs are often protocol-specific and may only trigger on packets that meet specific criteria, making them difficult to find without fuzzing or formal verification.
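To make the boundary-condition point concrete, here is a small sketch of two validation helpers of the kind such fixes tend to need. Both names (`field_in_bounds`, `checked_array_size`) are hypothetical and do not come from the Open vSwitch source. The first checks that a field lies inside the packet without computing `offset + field_len`, which could wrap around; the second refuses a size multiplication that would overflow and yield an undersized buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * True if the byte range [offset, offset + field_len) lies inside a
 * packet of pkt_len bytes. Phrased as a subtraction so that a huge
 * offset or field_len cannot wrap around and pass the check.
 */
bool field_in_bounds(size_t pkt_len, size_t offset, size_t field_len)
{
    return offset <= pkt_len && field_len <= pkt_len - offset;
}

/*
 * Store count * elem_size in *out and return true, or return false
 * (leaving *out untouched) if the product does not fit in size_t.
 */
bool checked_array_size(size_t count, size_t elem_size, size_t *out)
{
    assert(out != NULL);

    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        return false;               /* product would overflow */
    }
    *out = count * elem_size;
    return true;
}
```

Both checks are a handful of comparisons with no allocation, so they are cheap enough for a per-packet path. A naive `offset + field_len <= pkt_len` test would accept `offset = SIZE_MAX, field_len = 2` because the sum wraps to 1; the subtraction form rejects it.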
Agent performance on Open vSwitch
Per-project performance data is not yet published. Aggregate results across all projects are available at the full results page, where you can compare agents by pass rate and cost. The benchmark methodology documents the evaluation process.
Related projects
Projects with similar network protocol and real-time processing challenges:
- envoyproxy: HTTP/2 protocol handling with connection state management
- Apache: network-facing protocol parsing with request validation
- libarchive: untrusted binary input parsing with format validation
Explore more
- Full benchmark results
- Agent profiles
- Methodology
- Economics analysis, cost per verified patch
FAQ
Why test agents on Open vSwitch?
Open vSwitch is critical for data center networking. Its 6 samples test packet processing, flow parsing, and the correctness required in high-throughput network systems.
Benchmark Results
62.7% pass rate. $2.64 per fix. Real data from 1,920 evaluations.
Benchmark Methodology
How XOR benchmarks AI coding agents on real security vulnerabilities. Reproducible, deterministic, and transparent.
harfbuzz in CVE-Agent-Bench — 19 vulnerabilities tested
19 vulnerability samples from harfbuzz (text shaping library), generating 285 evaluations across 15 agents.
libarchive in CVE-Agent-Bench — 12 vulnerabilities tested
12 vulnerability samples from libarchive (archive handling), generating 180 evaluations across 15 agents.
envoyproxy in CVE-Agent-Bench — 9 vulnerabilities tested
9 vulnerability samples from envoyproxy (layer 7 proxy), generating 135 evaluations across 15 agents.
See which agents produce fixes that work
128 CVEs. 15 agents. 1,920 evaluations. Agents learn from every run.