
Blosc in CVE-Agent-Bench — 5 vulnerabilities tested

5 vulnerability samples from Blosc (compression library), generating 75 evaluations across 15 agents.

Overview

Blosc is a high-performance data compression library used by HDF5, Zarr, and scientific computing frameworks to compress large datasets. The library implements multiple compression algorithms and is optimized for throughput, handling gigabytes of data efficiently. Compression algorithms are complex state machines with tight loop performance requirements, creating tension between safety and speed.

Benchmark coverage

5 vulnerability samples from Blosc are included in CVE-Agent-Bench, generating 75 individual evaluations across 15 agent configurations. These samples focus on buffer overflow vulnerabilities in decompression and integer overflow bugs in size calculations.

Vulnerability classes

Blosc samples cover vulnerability patterns in high-performance data transformation:

  • Heap buffer overflows during decompression when output size estimates are too small
  • Integer overflow in compression buffer allocation where size calculations wrap around
  • Out-of-bounds writes in decompression loops when boundary conditions are not checked
  • Buffer underflow in format parsing where read pointers are not bounds-checked
  • Resource exhaustion where decompression of crafted data triggers excessive memory allocation
  • Null pointer dereference when compression codec parameters are invalid or missing
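The second class above, integer overflow in buffer allocation, follows a common shape: an output size is computed as input size plus a fixed header overhead, and for very large inputs the addition wraps around, producing an undersized allocation. The sketch below is purely illustrative — `safe_output_size` and `HDR_OVERHEAD` are hypothetical names, not Blosc's actual API — but it shows the pre-addition check this kind of fix typically adds.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header overhead, for illustration only. */
#define HDR_OVERHEAD 16u

/* Returns the required output buffer size for nbytes of input,
 * or 0 if the addition would wrap around SIZE_MAX. Checking
 * *before* adding is what prevents the undersized allocation. */
static size_t safe_output_size(size_t nbytes) {
    if (nbytes > SIZE_MAX - HDR_OVERHEAD)
        return 0;  /* would overflow; caller must reject the request */
    return nbytes + HDR_OVERHEAD;
}
```

The unchecked version, `nbytes + HDR_OVERHEAD`, is well-defined for unsigned types in C — it silently wraps — which is exactly why these bugs compile cleanly and survive review.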

Why Blosc bugs are interesting for agent evaluation

Blosc vulnerabilities test an agent's ability to understand compression algorithm implementation and memory safety during data transformation. The codebase requires careful handling of compressed data streams and format validation. Bugs often involve boundary conditions between compression blocks or incorrect size calculations in decompression loops. Agents must generate fixes that validate compressed data correctly while preserving the performance characteristics that make the library valuable in scientific computing.
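The boundary checks such fixes add can be sketched with a toy run-length decoder — not Blosc's real codec, and `rle_decode` is a hypothetical name — where a crafted run length would otherwise write past the end of the output buffer:

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch: decodes (count, value) byte pairs into dest.
 * Without the bounds check marked below, a crafted run length writes
 * past the end of dest — a heap buffer overflow. Returns the number
 * of bytes written, or -1 on malformed input. */
static ptrdiff_t rle_decode(const uint8_t *src, size_t src_size,
                            uint8_t *dest, size_t dest_size) {
    size_t spos = 0, dpos = 0;
    while (spos + 2 <= src_size) {       /* each record: count, value */
        size_t run = src[spos];
        uint8_t value = src[spos + 1];
        spos += 2;
        if (run > dest_size - dpos)      /* the fix: check before writing */
            return -1;                   /* output estimate too small */
        for (size_t i = 0; i < run; i++)
            dest[dpos++] = value;
    }
    return (ptrdiff_t)dpos;
}
```

The check lives inside the hot loop, which is the tension the paragraph above describes: a correct fix must reject crafted input without measurably slowing the common path.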

Scientific computing frameworks often process untrusted data files, and decompression bugs can cause memory corruption that silently alters numerical results. This makes decompression one of the most security-critical yet frequently overlooked components in data processing pipelines.

Agent performance on Blosc

Per-project performance data is not yet published. Overall agent performance is available on the full results page, which lists pass rates and costs by agent. The benchmark methodology explains how agents were evaluated.

Projects with similar compression and algorithm-heavy code:

  • libarchive: archive format parsing with embedded compression handling
  • libjxl: image codec with complex decoding pipelines
  • libgit2: Git object compression with variable-length encoding

Explore more

FAQ

What does Blosc testing reveal about agents?

Blosc requires understanding compression algorithms and memory safety in decompression. Its five samples test agents on data transformation and boundary-condition handling.


See which agents produce fixes that work

128 CVEs. 15 agents. 1,920 evaluations. Agents learn from every run.