Blosc in CVE-Agent-Bench — 5 vulnerabilities tested
5 vulnerability samples from Blosc (compression library), generating 75 evaluations across 15 agents.
Overview
Blosc is a high-performance data compression library used by HDF5, Zarr, and scientific computing frameworks to compress large datasets. The library implements multiple compression algorithms and is optimized for throughput on multi-gigabyte datasets. Its compression and decompression routines are complex state machines that run in tight, performance-critical loops, which creates tension between safety checks and speed.
Benchmark coverage
CVE-Agent-Bench includes 5 vulnerability samples from Blosc, which generate 75 individual evaluations across 15 agent configurations (5 samples × 15 agents). These samples focus on buffer overflows in decompression and integer overflows in size calculations.
Vulnerability classes
Blosc samples cover vulnerability patterns in high-performance data transformation:
- Heap buffer overflows during decompression when output size estimates are too small
- Integer overflow in compression buffer allocation where size calculations wrap around
- Out-of-bounds writes in decompression loops when boundary conditions are not checked
- Buffer underflow in format parsing where read pointers are not bounds-checked
- Resource exhaustion where decompression of crafted data triggers excessive memory allocation
- Null pointer dereference when compression codec parameters are invalid or missing
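To make the integer overflow and resource exhaustion patterns above concrete, here is a minimal sketch in C of an overflow-checked size calculation with an allocation cap. The names `checked_mul_size`, `alloc_output`, and `max_bytes` are hypothetical and chosen for illustration; they are not part of Blosc's API, and the benchmark samples may be fixed differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: multiply two sizes and report wraparound instead of
 * silently truncating. Returns false if a * b does not fit in size_t. */
static bool checked_mul_size(size_t a, size_t b, size_t *out)
{
    if (a != 0 && b > SIZE_MAX / a) {
        return false;               /* product would wrap around */
    }
    *out = a * b;
    return true;
}

/* Hypothetical allocator for a decompression output buffer.
 * nitems and typesize would come from an untrusted header; max_bytes is a
 * caller-chosen cap that guards against excessive allocation.
 * Returns NULL on invalid sizes, on overflow, or when the cap is exceeded. */
static void *alloc_output(size_t nitems, size_t typesize,
                          size_t max_bytes, size_t *nbytes)
{
    size_t total;

    if (typesize == 0 || !checked_mul_size(nitems, typesize, &total)) {
        return NULL;
    }
    if (total == 0 || total > max_bytes) {
        return NULL;
    }
    *nbytes = total;
    return malloc(total);
}
```

The check costs one division per allocation rather than per byte, so it sits outside the hot loop.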
Why Blosc bugs are interesting for agent evaluation
Blosc vulnerabilities test an agent's ability to understand compression algorithm implementation and memory safety during data transformation. The codebase requires careful handling of compressed data streams and format validation. Bugs often involve boundary conditions between compression blocks or incorrect size calculations in decompression loops. Agents must generate fixes that validate compressed data correctly while preserving the performance characteristics that make the library valuable in scientific computing.
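One way to validate a compressed stream without giving up throughput is to check each run's length once against the remaining input and output space, then copy the run with a single `memcpy`. The sketch below uses a toy format invented for this example (a length byte followed by that many literal bytes); it is not Blosc's on-disk format, and `decode_literal_runs` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy format (NOT Blosc's): repeated [len: 1 byte][len literal bytes].
 * Returns the number of bytes written to dst, or -1 if the input is
 * truncated or the output buffer is too small. */
static long decode_literal_runs(const uint8_t *src, size_t src_len,
                                uint8_t *dst, size_t dst_cap)
{
    size_t ip = 0;  /* read position, invariant: ip <= src_len */
    size_t op = 0;  /* write position, invariant: op <= dst_cap */

    while (ip < src_len) {
        size_t run = src[ip++];

        /* One check per run, not per byte. The subtractions cannot
         * underflow because of the invariants above. */
        if (run > src_len - ip) {
            return -1;              /* run would read past the input */
        }
        if (run > dst_cap - op) {
            return -1;              /* run would write past the output */
        }

        memcpy(dst + op, src + ip, run);
        ip += run;
        op += run;
    }
    return (long)op;
}
```

Comparing `run` against the remaining space (`src_len - ip`) rather than computing `ip + run` avoids introducing a second overflow in the check itself.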
Scientific computing frameworks often process untrusted data files, and a decompression bug can cause memory corruption that silently alters numerical results. This makes decompression one of the most security-critical, and most often overlooked, components in data processing pipelines.
Agent performance on Blosc
Per-project performance data is not yet published. Overall agent performance is available at the full results page, where you can view pass rates and costs by agent. The benchmark methodology explains how agents were evaluated.
Related projects
Projects with similar compression and algorithm-heavy code:
- libarchive, archive format parsing with embedded compression handling
- libjxl, image codec with complex decoding pipelines
- libgit2, Git object compression with variable-length encoding
Explore more
- Full benchmark results
- Agent profiles
- Methodology
- Economics analysis, cost per verified patch
FAQ
What does Blosc testing reveal about agents?
Blosc requires understanding compression algorithms and memory safety in decompression. The 5 samples test how agents handle data transformation and boundary conditions.
Benchmark Results
62.7% pass rate. $2.64 per fix. Real data from 1,920 evaluations.
Benchmark Methodology
How XOR benchmarks AI coding agents on real security vulnerabilities. Reproducible, deterministic, and transparent.
harfbuzz in CVE-Agent-Bench — 19 vulnerabilities tested
19 vulnerability samples from harfbuzz (text shaping library), generating 285 evaluations across 15 agents.
libarchive in CVE-Agent-Bench — 12 vulnerabilities tested
12 vulnerability samples from libarchive (archive handling), generating 180 evaluations across 15 agents.
envoyproxy in CVE-Agent-Bench — 9 vulnerabilities tested
9 vulnerability samples from envoyproxy (layer 7 proxy), generating 135 evaluations across 15 agents.
See which agents produce fixes that work
128 CVEs. 15 agents. 1,920 evaluations. Agents learn from every run.