
OpenThread in CVE-Agent-Bench — 5 vulnerabilities tested

5 vulnerability samples from OpenThread (mesh networking), generating 75 evaluations across 15 agents.

Overview

OpenThread is an open-source implementation of the Thread networking protocol, released by Google. Thread is a low-power mesh networking standard supported by Google, Apple, and Amazon, and it is one of the network transports used by the Matter smart home specification. The implementation handles mesh routing, encryption, and device discovery in resource-constrained IoT environments. Because IoT devices often cannot be updated easily, the security of the code as first shipped is critical.

Benchmark coverage

5 vulnerability samples from OpenThread are included in CVE-Agent-Bench, generating 75 individual evaluations across 15 agent configurations. These samples include heap overflow vulnerabilities in mesh networking, stack buffer overflows, and memory corruption in protocol parsing.

Vulnerability classes

OpenThread samples cover vulnerability patterns in embedded network protocol implementation:

  • Stack buffer overflows in CoAP message handling where variable-length payloads exceed stack allocation
  • Heap buffer overflows in mesh routing where neighbor discovery packets trigger out-of-bounds writes
  • Out-of-bounds reads in frame parsing where field offsets are not validated against frame size
  • Integer overflow in length calculations leading to undersized stack allocation
  • Assertion failures in protocol state machines where unexpected messages cause crashes
  • Resource exhaustion where crafted packets trigger excessive memory allocation in constrained environments
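Two of the patterns above, unvalidated field offsets and overflowing length arithmetic, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not OpenThread's actual code: the two-byte TLV header layout and the names `TlvView`, `ReadTlv`, and `CheckedTotalLength` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical TLV layout: 1 byte type, 1 byte length, then `length` value bytes.
struct TlvView {
    std::uint8_t type;
    const std::uint8_t* value;
    std::size_t length;
};

// Reads the TLV starting at `offset`. Returns false if the header or the
// value would extend past the end of the frame; `out` is left untouched then.
bool ReadTlv(const std::uint8_t* frame, std::size_t frameSize,
             std::size_t offset, TlvView& out) {
    constexpr std::size_t kHeaderSize = 2;

    // Compare against the bytes remaining instead of computing
    // offset + kHeaderSize, which could wrap around for a huge offset.
    if (offset > frameSize || frameSize - offset < kHeaderSize) {
        return false;
    }

    const std::size_t length = frame[offset + 1];
    if (frameSize - offset - kHeaderSize < length) {
        return false;  // declared value length exceeds what the frame holds
    }

    out.type = frame[offset];
    out.value = frame + offset + kHeaderSize;
    out.length = length;
    return true;
}

// Adds a header length to a payload length in 16-bit arithmetic and
// reports overflow instead of silently wrapping to a small value.
bool CheckedTotalLength(std::uint16_t headerLen, std::uint16_t payloadLen,
                        std::uint16_t& total) {
    if (payloadLen > UINT16_MAX - headerLen) {
        return false;
    }
    total = static_cast<std::uint16_t>(headerLen + payloadLen);
    return true;
}
```

Both checks are phrased as "is there enough room left" rather than "does the end position fit", so the comparison itself cannot overflow.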

Why OpenThread bugs are interesting for agent evaluation

OpenThread vulnerabilities test an agent's ability to understand embedded networking protocols and memory constraints. The codebase handles complex mesh routing state machines with limited RAM and CPU. Bugs often involve buffer handling in constrained memory environments or incorrect bounds checking in protocol parsing. Agents must generate fixes that close security gaps while fitting within the tight resource constraints of IoT devices.
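As a rough sketch of what a constraint-friendly fix can look like, the example below rejects an oversized payload up front and copies into a fixed-size buffer, so the fix adds no heap allocation and only one comparison. The buffer size `kMaxPayload` and the names `PayloadBuffer` and `StorePayload` are assumptions made for illustration and do not come from OpenThread.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical upper bound on the payload size this handler accepts.
constexpr std::size_t kMaxPayload = 64;

// Fixed-size storage: no dynamic allocation on a RAM-constrained device.
struct PayloadBuffer {
    std::uint8_t data[kMaxPayload];
    std::size_t length;
};

// Copies `payload` into `buf` only if it fits. Oversized input is rejected
// rather than truncated, and `buf` is left unchanged on failure.
bool StorePayload(PayloadBuffer& buf, const std::uint8_t* payload,
                  std::size_t payloadLen) {
    if (payload == nullptr || payloadLen > sizeof(buf.data)) {
        return false;
    }
    std::memcpy(buf.data, payload, payloadLen);
    buf.length = payloadLen;
    return true;
}
```

Rejecting rather than truncating is a design choice in this sketch; a real fix would follow whatever the protocol specifies for over-length messages.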

IoT mesh networks are particularly challenging because devices may be physically inaccessible after deployment, and a single compromised device can attack all neighbors in the mesh. This makes the initial implementation security exceptionally important.

Agent performance on OpenThread

Per-project performance data is not yet published. Aggregate results across all projects are available at the full results page, where you can compare agents by pass rate and cost. The benchmark methodology documents the evaluation approach.

Other projects in the benchmark with similar parsing and memory-constraint challenges:

  • libarchive: binary format parsing with variable-length data
  • openvswitch: network packet processing with protocol parsing
  • blosc: high-performance algorithms under memory constraints


FAQ

How does OpenThread relate to CVE-Agent-Bench?

CVE-Agent-Bench includes 5 vulnerability samples from OpenThread, which is used in Matter smart home devices that communicate over Thread. The samples test agents on embedded systems, resource constraints, and mesh networking protocol correctness.


See which agents produce fixes that work

128 CVEs. 15 agents. 1,920 evaluations. Agents learn from every run.