Announcement · First of its Kind · Slow Burn · Arc: Anthropic Safety Focus (ch. 45)
AI News

Anthropic keeps new AI model private after it finds thousands of external vulnerabilities

Read the full article on AI News

What Happened

Anthropic’s most capable AI model has already found thousands of security vulnerabilities across every major operating system and web browser. The company’s response was not to release it, but to quietly hand it to the organizations responsible for keeping the internet running.

Our Take

they kept the model private because the liability of releasing it would be catastrophic. when you're dealing with potential exploits across every OS and browser, the risk of a bad actor weaponizing that knowledge outweighs the benefit of open-sourcing it immediately.

it's not about intellectual property; it's about security risk. releasing a model that has already found thousands of vulnerabilities is handing a massive exploit toolkit to the bad guys. their move to hand it over to the internet infrastructure folks is a cynical but sensible trade-off.

the principle here is simple: don't expose the bleeding edge if you don't have the security infrastructure to contain the damage. that's the adult move, something we engineers should always default to.

What To Do

implement closed, highly scrutinized security review processes before exposing any new, powerful model.
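one way to make that review gate concrete: treat the audit's findings as data and block release on any critical-severity result. this is a minimal, hypothetical sketch — `Finding`, `release_blocked`, and the CVSS-like severity scale are illustrative assumptions, not any real Anthropic process or API.

```python
# Hypothetical pre-release capability gate. Assumes an internal security
# audit produces a list of findings, each with a CVSS-like severity score.
# All names here are illustrative, not a real tool or API.
from dataclasses import dataclass


@dataclass
class Finding:
    target: str      # e.g. "chromium", "linux-kernel" (illustrative targets)
    severity: float  # 0.0-10.0, CVSS-like score


def release_blocked(findings, max_critical=0, critical_threshold=9.0):
    """Return True if the audit surfaced more critical findings than allowed."""
    critical = [f for f in findings if f.severity >= critical_threshold]
    return len(critical) > max_critical


# A single critical finding is enough to hold the release.
audit = [Finding("chromium", 9.8), Finding("openssl", 5.4)]
print(release_blocked(audit))  # True: one finding at/above 9.0 blocks release
```

the design choice worth noting: the gate defaults to blocking on any critical finding (`max_critical=0`), so shipping requires an explicit, reviewable override rather than a silent pass.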

Builder's Brief

Who

Security teams and product orgs assessing AI-assisted vulnerability research

What changes

Sets a precedent that capability audits can delay or block releases, reshaping expectations for model deployment timelines

When

months

Watch for

Whether other labs adopt pre-release offensive capability audits or whether regulators mandate them

What Skeptics Say

'Thousands of vulnerabilities found' is unauditable from the outside, making this indistinguishable from a safety-washing PR move that lets Anthropic claim frontier restraint without independent verification. Withholding capability under safety framing still leaves the capability developed and under one company's control.

2 comments

Tariq Okonkwo

thousands of vulns across every major OS and browser and they just... sat on it. honestly the correct call but that number is terrifying

Priya Subramaniam

so we have a model too dangerous to release and the bar for that was 'found more bugs than all of CVE history'. ok then
