Image Codec in Vulnerability-Agent-Bench — 4 vulnerabilities tested

4 vulnerability samples from an image codec, generating 60 evaluations across 15 agents.

Overview

This image codec is the reference implementation of JPEG XL, a next-generation image format being adopted by Chrome and Safari. The format offers better compression than JPEG and adds features such as lossless compression and animation support. The implementation handles complex image decoding pipelines with color space conversions, filtering, and entropy decoding. Because browsers decode images from untrusted sources, a bug in the decoder can affect billions of users viewing images online.

Benchmark coverage

4 vulnerability samples from this image codec are included in Vulnerability-Agent-Bench. Each sample is run against 15 agent configurations, generating 60 individual evaluations (4 × 15). These samples focus on heap buffer overflow vulnerabilities in image decoding and integer overflow bugs in image dimension calculations.

Vulnerability classes

The samples cover vulnerability patterns common in image codec implementations:

Heap buffer overflows in image line decompression where output buffer size calculations are incorrect
Integer overflow in image dimension handling that leads to undersized buffer allocation
Out-of-bounds writes in color space conversion loops where boundary conditions are not checked
Denial of service via crafted images that trigger infinite loops in entropy decoding
Resource exhaustion where compressed image metadata claims excessively large dimensions
Out-of-bounds reads in frame boundary handling where read pointers move outside the data bounds
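
As an illustration of the integer overflow and resource exhaustion patterns above, here is a minimal C++ sketch. It is not taken from the codec or from any benchmark sample: the function names and the kMaxPixels limit are hypothetical, and it assumes a platform where unsigned int is 32 bits wide.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical cap on total pixels; a real decoder would choose its own policy.
constexpr std::uint64_t kMaxPixels = std::uint64_t{1} << 28;

// Unsafe pattern: the product is computed in 32 bits and wraps around,
// so the result can be far smaller than the image actually needs.
std::uint32_t UncheckedBufferSize(std::uint32_t width, std::uint32_t height,
                                  std::uint32_t bytes_per_pixel) {
  return width * height * bytes_per_pixel;
}

// Safer pattern: widen to 64 bits, reject oversized dimensions before
// multiplying further, and only then convert to size_t.
std::optional<std::size_t> CheckedBufferSize(std::uint32_t width,
                                             std::uint32_t height,
                                             std::uint32_t bytes_per_pixel) {
  if (width == 0 || height == 0 || bytes_per_pixel == 0) return std::nullopt;

  // Two 32-bit values multiplied in 64 bits cannot overflow.
  const std::uint64_t pixels = std::uint64_t{width} * height;
  if (pixels > kMaxPixels) return std::nullopt;

  // pixels <= 2^28 and bytes_per_pixel < 2^32, so this stays below 2^60.
  const std::uint64_t bytes = pixels * bytes_per_pixel;
  if (bytes > std::numeric_limits<std::size_t>::max()) return std::nullopt;
  return static_cast<std::size_t>(bytes);
}
```

With these definitions, a 65537 × 65536 image at one byte per pixel makes UncheckedBufferSize return 65536, an allocation far too small for the rows the decoder would then write, while CheckedBufferSize returns std::nullopt and the image is rejected.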

Why image codec bugs are interesting for agent evaluation

Image codec vulnerabilities test an agent's ability to understand image format parsing and decoding state machines. The codebase requires understanding of compression algorithms, color space conversions, and frame handling. Bugs often involve boundary conditions in decoder loops or incorrect calculations in memory allocation. Agents must generate fixes that safely handle arbitrary image inputs while maintaining the performance characteristics needed for real-time decoding in web browsers.
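
To make the "boundary conditions in decoder loops" point concrete, the sketch below decodes a toy run-length format of (count, value) byte pairs. The format and the DecodeRuns function are invented for illustration and are unrelated to the codec's actual entropy coding. It shows the checks a fix typically has to add: detect truncated input, always advance the read position so the loop terminates, and refuse to write more than the declared output size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Decodes (count, value) byte pairs into exactly `expected_size` bytes.
// Returns std::nullopt if the input is truncated, contains an empty run,
// or would produce more or fewer bytes than declared.
std::optional<std::vector<std::uint8_t>> DecodeRuns(
    const std::vector<std::uint8_t>& input, std::size_t expected_size) {
  std::vector<std::uint8_t> out;
  std::size_t pos = 0;

  while (pos < input.size()) {
    // A pair needs two bytes; a dangling byte means truncated input.
    if (input.size() - pos < 2) return std::nullopt;

    const std::size_t count = input[pos];
    const std::uint8_t value = input[pos + 1];
    pos += 2;  // Advance unconditionally so every iteration makes progress.

    if (count == 0) return std::nullopt;  // Treated as malformed in this toy format.

    // Invariant: out never grows past expected_size, so the subtraction
    // below cannot wrap around.
    assert(out.size() <= expected_size);
    if (count > expected_size - out.size()) return std::nullopt;

    out.insert(out.end(), count, value);
  }

  if (out.size() != expected_size) return std::nullopt;
  return out;
}
```

The remaining-space comparison is written as count > expected_size - out.size() rather than out.size() + count > expected_size, so the check itself cannot overflow.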

Image codecs are particularly high-impact because they process untrusted user input (any image on the web) in browsers that serve billions of users. A single decoder bug can become a zero-day affecting major browser versions.

Agent performance on image codec

Per-project performance data is not yet published. Aggregate results across all codebases are available at the full results page, where you can view individual agent pass rates and costs. The benchmark methodology documents the evaluation process in detail.

Codebases with similar codec and data transformation challenges:

Data Compressor, compression algorithm with buffer boundary handling
Archive Library, multi-format parsing with decompression
Git Library, binary format parsing with compression handling

Explore more

Full benchmark results
Agent profiles
Methodology
Economics analysis, cost per verified patch

FAQ

What is the significance of image codec samples?

This image codec is the reference implementation of a next-generation image format adopted by browsers. The 4 samples test agents on modern codec implementation and the correctness needed for browser integration.

Benchmark Results

62.7% pass rate. $2.64 per fix. Real data from 1,920 evaluations.

Benchmark Methodology

How XOR benchmarks AI coding agents on real security vulnerabilities. Reproducible, deterministic, and transparent.