
The end of insecurity

I. The paper world

The world is less secure than it has been in a generation.

Global cybercrime costs are measured in trillions of dollars annually and rising. Ransomware operations are automated and run at scale on industrial schedules. Supply-chain attacks spread through compromised software that runs inside every critical sector, including hospitals, power grids, and financial infrastructure. Vulnerabilities in volunteer-maintained, publicly available libraries have exposed the whole Internet multiple times this year alone. This is today's operating environment.

The industry's response has been to add more of what is already in place: more remediation tooling, more compliance frameworks, more alerting infrastructure, more dashboards. Cyber spending has grown accordingly, and breaches continue to rise. Security teams operate in firefighting mode, working through false positives across fragmented systems. Engineers are overwhelmed by compliance documents that describe work they are not doing. Insurers underwrite risk they cannot observe in real-time. The gap between the paper world of security policy and the actual reality of production code has become the defining feature of the industry.

The response to AI has repeated the pattern. Runtime blockers, policy filters, and guardrail layers are placed between models and production systems. Capable models route around them. The assumption that a smaller, less capable model can constrain a larger one has not survived contact with the work.

The deeper issue is a category error. Cyber is limited by economic and organizational factors more than technical ones. Incentives reward shipping features, not preventing breaches. Volunteer maintainers support critical infrastructure without compensation. Attackers operate on contingency; defenders operate on annual budgets. The asymmetry lives in the market structure around the code, not in the code itself.

Better detection does not change this. Detection matters, but it has not been the binding constraint for years.

The binding constraint is the mechanism that turns detection into resolved risk at scale, and that mechanism does not yet exist.

This is the age of insecurity. It ends here.

II. What static data cannot teach

Static training data has taken models as far as static training data can take them. The next step requires a different substrate: the underlying layer of environments, trajectories, and verified process knowledge from which models actually learn to do the work.

Every frontier model has been trained on public code, published vulnerability reports, GitHub commit histories, Common Vulnerabilities and Exposures (CVE) databases, and security research. Public corpora are saturated as a source of security training data. Labs now buy curated training data from general-purpose providers like Scale, Mercor, Turing, and Mechanize; none of them produces security-specific environments. The gains from adding more of the same kind of data are incremental, and incremental gains are not sufficient for the trajectory security is on.

Static data teaches a model what successful work looks like. It does not teach a model to do the work. A successful patch in a public repository records one outcome: the fix that shipped. It does not record the twenty-nine attempts that failed, the incorrect reasoning the developer ruled out, or the downstream breakage that forced a revert. Those trajectories are the training signal that teaches an agent to reason about unfamiliar code. They exist only in reinforcement learning (RL) environments where an agent is run against a verifiable target.

Cybersecurity is structurally suited to this method. The first structural property is that exploit resolution is binary at the verification layer. A patch resolves the exploit, or it does not. The exploit fires against the patched version, or it does not. Most domains where RL is applied require human raters, which introduces latency, cost, and bias. Cyber produces a clean signal where most domains produce a noisy one.

Production-worthiness is a different question, and a harder one. A patch that closes the vulnerability can still introduce regressions, break unrelated functionality, degrade performance, or resolve the bug by removing the feature that contained it. Real defensive training has to optimize against the binary exploit signal and the multi-constraint production signal simultaneously. This is what static corpora cannot teach and generic RL environments do not capture.
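To make the two signals concrete, here is a minimal sketch of how an environment might score a candidate patch. Every name and number in it is a hypothetical assumption made for illustration (PatchOutcome, patch_reward, the 10% latency budget, the halved score for a slow patch); it does not describe any actual environment. The exploit check acts as a binary gate, and the production constraints only count once that gate is passed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchOutcome:
    """Hypothetical record of what the environment observed for one patch."""
    exploit_fires: bool      # does the exploit still fire against the patched build?
    tests_passed: int        # regression tests passing after the patch
    tests_total: int         # regression tests in the suite
    latency_ratio: float     # patched latency / baseline latency
    feature_removed: bool    # did the patch "fix" the bug by deleting the feature?


def patch_reward(outcome: PatchOutcome, max_latency_ratio: float = 1.10) -> float:
    """Score a patch in [0, 1]. The weights here are illustrative assumptions."""
    # Binary exploit signal: a patch that leaves the exploit working earns nothing.
    if outcome.exploit_fires:
        return 0.0
    # Removing the vulnerable feature closes the bug but is not a real fix.
    if outcome.feature_removed:
        return 0.0
    # Production signal, part 1: share of the regression suite still passing.
    if outcome.tests_total == 0:
        pass_rate = 0.0
    else:
        pass_rate = outcome.tests_passed / outcome.tests_total
    # Production signal, part 2: halve the score if the latency budget is exceeded.
    perf_factor = 1.0 if outcome.latency_ratio <= max_latency_ratio else 0.5
    return pass_rate * perf_factor


if __name__ == "__main__":
    clean = PatchOutcome(False, 10, 10, 1.0, False)
    slow_and_leaky = PatchOutcome(False, 8, 10, 1.2, False)
    print(patch_reward(clean), patch_reward(slow_and_leaky))
```

The gate-then-score shape is one possible design choice among several: it keeps the exploit result binary while still letting an agent distinguish a patch that merely closes the hole from one that could ship.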

The second structural property is that most of the relevant data sits inside enterprises. How a bank moves a fix from developer to production is not public. How a car manufacturer handles a Controller Area Network (CAN) bus vulnerability is not public. How any enterprise actually patches, as distinct from how it documents patching, is not public. This process knowledge is the scarce input for defensive training, and it cannot be produced from public corpora. A partner that operates inside enterprises, records the workflow, and converts it into a verifiable environment has access to a substrate that very few others can build.

The third structural property is the defender-attacker dynamic. Attackers get AI regardless of regulatory or commercial decisions. Ransomware operators and nation-state actors use open-weight models and operate without compliance constraints. A defender trained once on static data has a fixed ceiling. A defender trained continuously, in environments that regenerate with each new vulnerability class, keeps compounding.

Compounding is the variable that matters. The only stable equilibrium, given attackers with access to the same models, is one in which defenders compound faster. Any other equilibrium tends toward a state where breach notifications arrive at the cadence of push notifications.

Frontier AI labs are investing in defensive cyber work directly. Anthropic's cross-industry defensive cyber initiative is one example.

This investment should continue. What it cannot produce, by the structure of the problem, is an independent measurement layer for its own output. Labs measure many things internally, but a security solution cannot credibly verify itself: the verification layer has to be anchored in environments the labs do not own, trained on data the labs cannot reach, and validated against standards the labs do not write alone. XOR builds that layer. The labs build the models. XOR builds the substrate that makes those models verifiably trustworthy in a domain where trust is the product.

III. What breaks

Once the substrate exists and begins to compound, several things that currently hold the industry together stop holding.

Enterprise security cadence

Most large organizations operate on a cycle of quarterly audits, annual penetration tests, and multi-month patch rollouts. That cycle was calibrated for a threat environment in which attackers moved at roughly human speed. AI-capable attackers do not. A vulnerability disclosed on a Monday can be weaponized, packaged, and deployed at scale before the affected organization's next change-control meeting, and many never wait for disclosure at all: according to Mandiant M-Trends 2026, 54% of vulnerabilities weaponized in 2025 were exploited as 0-days before disclosure. The cycle fails on timing alone. Enterprises that continue to operate on it will be breached faster than they can approve the fix. Enterprises that adopt continuous, agent-driven patching will move from quarters to hours, and the gap between the two populations will widen.

Cyber insurance

Cyber insurance is the next structure to shift, and probably the cleanest lever for economic restructuring. Insurers currently underwrite risk on the basis of annual questionnaires, compliance attestations, and lagging incident data. That approach is already insufficient, and the gap is widening. A verified, continuous measurement layer changes what is insurable. Insurers with real-time access to an organization's defensive posture can price risk dynamically, decline coverage to actors who drift below threshold, and shift the loss function inside enterprises from "avoid paperwork failures" to "maintain measurable defensive capability." Regulation still matters, but regulation produces documents on timelines measured in years, and those documents are rarely enforced at the level they describe. Enterprises stage compliance theater for audits and return to real workflows the next day. Underwriting, by contrast, produces budget authority on timelines measured in quarters. That difference is why insurance tends to move first.

Compliance frameworks

Compliance frameworks have persisted in their current form because there has been no alternative. Measurable, continuous, verifiable defense is the alternative. When a defensive capability can be demonstrated in an environment rather than asserted in a document, the documents move downstream of the measurement, where they belong, and the frameworks shift from describing the work to auditing it. The industry stops writing policy that the code ignores and starts writing policy that the code enforces.

Lab positioning in the value chain

The position of frontier AI labs in the cyber value chain is already changing as well, and the standards layer is ahead of the commercial discussion. Labs will continue to invest in defensive capability, and that investment will continue to produce results. But labs cannot occupy every layer of the stack, and the measurement and environment layers have to sit outside them for the reasons already covered. The Internet Engineering Task Force (IETF) Remote ATtestation procedureS (RATS) and Supply Chain Integrity, Transparency, and Trust (SCITT) working groups, both co-chaired by Henk Birkholz, XOR's CTO and co-founder, are already defining how systems attest to their own integrity at the substrate level, and their work is being adopted by companies including Microsoft, Intel, and ARM. The value chain is being built, not proposed. Labs produce the models. Specialized partners produce the environments and verification. Standards bodies produce the shared substrate.

Defender-attacker equilibrium

The defender-attacker equilibrium is where all of this lands. An industry with a compounding defender substrate does not eliminate attacks. It changes which side compounds. Attackers continue to operate and improve. Defenders, for the first time, improve faster than attackers do, because the environment regenerates with every new vulnerability class and the training signal accumulates inside enterprises whose data attackers cannot reach. Breaching a well-defended system becomes uneconomic, not because it is impossible, but because the defender-side cost curve has inverted. This is what ending insecurity actually looks like in practice: a mechanism that systematically outpaces the threat.

What separates secure from insecure institutions, once the substrate is available, is not budget or headcount or vendor selection. It is adoption speed. The organizations that move first compound first. The organizations that wait fall further behind a capability that is accelerating away from them.