Anthropic keeps new AI model private after it finds thousands of external vulnerabilities
What Happened
Anthropic’s most capable AI model has already found thousands of cybersecurity vulnerabilities across every major operating system and web browser. Rather than release it, the company quietly handed it to the organisations responsible for keeping the internet running.
Our Take
They kept the model private because the liability of releasing it would be catastrophic. When a model can surface potential exploits across every OS and browser, the risk of a bad actor weaponising that knowledge outweighs the benefit of open-sourcing it immediately.
This isn't about intellectual property; it's about security risk. Releasing a model that has already found thousands of vulnerabilities would hand a massive exploit toolkit to the bad guys. Handing it to the internet-infrastructure organisations instead is a cynical but sensible trade-off.
The principle here is simple: don't expose the bleeding edge if you don't have the security infrastructure to contain the damage. That's the adult move, and the one engineers should default to.
What To Do
Implement closed, heavily scrutinised security review processes before exposing any new, powerful model.
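To make the recommendation concrete, here is a minimal sketch of what a pre-release capability gate might look like. This is purely illustrative: the `ReviewFinding` and `ReleaseGate` names, the severity scale, and the review areas are all hypothetical, not Anthropic's actual process.

```python
# Hypothetical sketch of a pre-release review gate; names and thresholds
# are illustrative assumptions, not any lab's real process.
from dataclasses import dataclass, field


@dataclass
class ReviewFinding:
    area: str       # e.g. "vuln-discovery", "cyber-offense"
    severity: int   # 0 (none) .. 3 (critical)


@dataclass
class ReleaseGate:
    """Blocks release while any finding exceeds the allowed severity."""
    max_severity: int = 1
    findings: list = field(default_factory=list)

    def record(self, finding: ReviewFinding) -> None:
        self.findings.append(finding)

    def cleared_for_release(self) -> bool:
        # Every recorded finding must be at or below the threshold.
        return all(f.severity <= self.max_severity for f in self.findings)


gate = ReleaseGate()
gate.record(ReviewFinding("vuln-discovery", severity=3))
print(gate.cleared_for_release())  # False: a critical finding blocks release
```

The point of the sketch is the default: release stays blocked until reviewers explicitly clear every finding, rather than shipping and patching afterwards.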
What Skeptics Say
'Thousands of vulnerabilities found' is unauditable from the outside, making this indistinguishable from a safety-washing PR move that lets Anthropic claim frontier restraint without independent verification. Withholding capability under safety framing still leaves the capability developed and under one company's control.
2 comments
thousands of vulns across every major OS and browser and they just... sat on it. honestly the correct call but that number is terrifying
so we have a model too dangerous to release and the bar for that was 'found more bugs than all of CVE history'. ok then