What happens when one of the world’s most powerful AI models becomes both incredibly useful — and potentially risky?
That’s the balancing act Anthropic is now trying to manage with its latest release.
The company has rolled out Claude Fable 5, a public version of its advanced “Mythos” model.
But with strict guardrails designed to block use in sensitive areas like cybersecurity.
Why? Because earlier previews reportedly impressed researchers with their ability to detect software vulnerabilities — raising both excitement and alarm.
Anthropic says this is its most capable model yet for software engineering and data analysis.
But access is being expanded carefully after initial testing with around 200 organisations.
Including US government users under a controlled program.
What’s Different This Time?
The model is designed to refuse high-risk requests. As Anthropic’s head of product, Dianne Penn, explained.
“If a user tries to get help finding cyber vulnerabilities, the model will refuse and fall back to a safer version.”
In other words, it can do powerful work — but only within set boundaries.
The company is also betting on efficiency. While Fable 5 is more expensive per unit.
It reportedly uses fewer tokens per task, which could reduce overall costs for customers.

Still, competition is heating up as AI giants race toward public deployment and potential IPOs.
And with pricing set at $10 per million input tokens and $50 per million output tokens, the question becomes: will safety-first AI also win the cost race?
Because in the AI arms race, the real challenge isn’t just building smarter systems — it’s deciding how much power is too much power.


