libjxl in CVE-Agent-Bench — 4 vulnerabilities tested
4 vulnerability samples from libjxl (JPEG XL codec), generating 60 evaluations across 15 agents.
Overview
libjxl is the reference implementation of JPEG XL, a next-generation image codec that is gaining browser support. The format compresses better than JPEG and adds features such as lossless compression and animation. The implementation runs complex decoding pipelines that combine entropy decoding, filtering, and color space conversion. Because browsers decode images from untrusted sources, a decoder bug can reach billions of users viewing images online.
Benchmark coverage
CVE-Agent-Bench includes 4 vulnerability samples from libjxl. Each sample is run against 15 agent configurations, giving 60 individual evaluations (4 × 15). The samples focus on heap buffer overflows in image decoding and integer overflows in image dimension calculations.
Vulnerability classes
libjxl samples cover vulnerability patterns in image codec implementation:
- Heap buffer overflows in image line decompression where output buffer size calculations are incorrect
- Integer overflow in image dimension handling that leads to undersized buffer allocation
- Out-of-bounds writes in color space conversion loops where boundary conditions are not checked
- Denial of service via crafted images that trigger infinite loops in entropy decoding
- Resource exhaustion where compressed image metadata claims excessively large dimensions
- Out-of-bounds reads in frame boundary handling where read pointers move before the start or past the end of the data
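The integer overflow pattern in the list above can be illustrated with a minimal sketch. It is not taken from libjxl: the helper name `CheckedImageBytes`, its parameters, and the caller-supplied `max_bytes` cap are all hypothetical. The idea is to compute the buffer size in 64-bit arithmetic, check each multiplication before it can wrap, and refuse sizes above a limit, which also covers the case where metadata claims excessively large dimensions.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper, not part of libjxl. Returns the byte size of a
// width x height x channels buffer (one byte per channel), or std::nullopt
// if the size would overflow or exceed the caller-chosen cap.
std::optional<size_t> CheckedImageBytes(uint32_t width, uint32_t height,
                                        uint32_t channels, size_t max_bytes) {
  if (width == 0 || height == 0 || channels == 0) return std::nullopt;

  // Both factors fit in 32 bits, so their 64-bit product cannot wrap.
  const uint64_t pixels = static_cast<uint64_t>(width) * height;

  // pixels * channels could exceed 64 bits, so check before multiplying.
  if (pixels > std::numeric_limits<uint64_t>::max() / channels) {
    return std::nullopt;
  }
  const uint64_t bytes = pixels * channels;

  // Reject sizes above the cap; this also guarantees the value fits size_t.
  if (bytes > max_bytes) return std::nullopt;
  return static_cast<size_t>(bytes);
}
```

A caller would allocate only when the helper returns a value and reject the image otherwise. For example, a 65536 × 65536 single-channel image wraps to 0 bytes in plain 32-bit arithmetic, while this sketch reports it as too large for a 1 GiB cap.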
Why libjxl bugs are interesting for agent evaluation
libjxl vulnerabilities test an agent's ability to understand image format parsing and decoding state machines. The codebase requires understanding of compression algorithms, color space conversions, and frame handling. Bugs often involve boundary conditions in decoder loops or incorrect calculations in memory allocation. Agents must generate fixes that safely handle arbitrary image inputs while maintaining the performance characteristics needed for real-time decoding in web browsers.
Image codecs are particularly high-impact because they process untrusted user input (any image on the web) in browsers that serve billions of users. A single decoder bug can become a zero-day affecting major browser versions.
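The boundary conditions in decoder loops mentioned above can be sketched on a toy example. The format here, a sequence of (count, value) byte pairs, is hypothetical and unrelated to the real JPEG XL bitstream, and `DecodeRuns` is an illustrative name. The sketch shows the kind of checks a fix typically needs: validate the remaining input before each read, validate the remaining output space before each write, and make sure every iteration consumes input so the loop cannot spin forever.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy run-length decoder, not libjxl code. Fills `out` with
// exactly `expected_size` bytes, or returns false on malformed input
// instead of reading or writing out of bounds.
bool DecodeRuns(const std::vector<uint8_t>& in, size_t expected_size,
                std::vector<uint8_t>* out) {
  out->clear();
  size_t pos = 0;  // Invariant: pos <= in.size().
  while (out->size() < expected_size) {
    // Need two more input bytes. Subtracting keeps the check free of
    // wraparound, which pos + 2 could in principle suffer from.
    if (in.size() - pos < 2) return false;
    const size_t count = in[pos];
    const uint8_t value = in[pos + 1];
    pos += 2;  // Every iteration consumes input, so the loop terminates.

    // The run must fit in the space that is still left in the output.
    if (count > expected_size - out->size()) return false;
    out->insert(out->end(), count, value);
  }
  return true;
}
```

In a real decoder `expected_size` would itself come from validated image dimensions, so a check like this is only as trustworthy as the size calculation that precedes it.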
Agent performance on libjxl
Per-project performance data is not yet published. Aggregate results across all projects are available at the full results page, where you can view individual agent pass rates and costs. The benchmark methodology documents the evaluation process in detail.
Related projects
Projects with similar codec and data transformation challenges:
- blosc, a compression library with buffer boundary handling
- libarchive, multi-format parsing with decompression
- libgit2, binary format parsing with compression handling
Explore more
- Full benchmark results
- Agent profiles
- Methodology
- Economics analysis, cost per verified patch
FAQ
What is the significance of libjxl samples?
libjxl is the reference implementation of JPEG XL, a next-generation image codec with browser support. The 4 samples test agents on modern codec code and on the level of correctness that browser integration demands.
Benchmark Results
62.7% pass rate. $2.64 per fix. Real data from 1,920 evaluations.
Benchmark Methodology
How XOR benchmarks AI coding agents on real security vulnerabilities. Reproducible, deterministic, and transparent.
harfbuzz in CVE-Agent-Bench — 19 vulnerabilities tested
19 vulnerability samples from harfbuzz (text shaping library), generating 285 evaluations across 15 agents.
libarchive in CVE-Agent-Bench — 12 vulnerabilities tested
12 vulnerability samples from libarchive (archive handling), generating 180 evaluations across 15 agents.
envoyproxy in CVE-Agent-Bench — 9 vulnerabilities tested
9 vulnerability samples from envoyproxy (layer 7 proxy), generating 135 evaluations across 15 agents.
See which agents produce fixes that work
128 CVEs. 15 agents. 1,920 evaluations. Agents learn from every run.