Back in April, Anthropic announced it had built a new AI model, called Mythos, that had an unprecedented ability to find gaps in cybersecurity systems. Citing the risk of the powerful new model falling into the wrong hands, the company initially withheld its public release, only slowly granting access to a small group of early testers through a project known as Project Glasswing.
Today, Mythos finally made its first public debut—albeit in a considerably nerfed fashion.
Anthropic announced Tuesday that it’s releasing a new model called Claude Fable 5, described in a company blog post as “a Mythos-class model made safe for general use.” The new model’s capabilities across tasks like coding, vision, tool use, legal analysis, and scientific research “exceed those of every model we’ve ever made generally available,” the company wrote. (The implication there is that the most powerful version of Mythos is capable of more, but that it hasn’t (yet) been deemed fit for public availability.) “The longer and more complex the task, the larger Fable 5’s lead over our other models.”
The company has also launched another model called Mythos 5, which has the same capabilities as Fable 5, minus some of the safeguards. That model, however, is for the time being only available to the small cohort of organizations that are part of Project Glasswing, but Anthropic said access will be expanded soon.
Safeguards
Fable 5 comes with safeguards designed to prevent it from responding to questions on particularly sensitive subjects, like cybersecurity, biology, and chemistry. In those cases, it will automatically revert to an earlier Claude model, Opus 4.8, to avoid divulging information that could, say, make it easier for aspiring terrorists to cook up a biological weapon.
Anthropic says it red-teamed the new generally available model, both internally and in collaboration with external testers, to root out security flaws that could lead to jailbreaks. (That is, manipulating an AI chatbot to ignore and bypass its built-in safety guardrails).
Both Fable 5 and Mythos 5 are trained to play it safe, Anthropic said, which could lead to instances in which user queries that are in fact benign are erroneously flagged as dangerous by the model. “Because we have prioritized safety, we’ve deliberately tuned the safeguards to be cautious, and they are still stricter than would be ideal,” the company wrote in the blog post. “We recognize that this will be frustrating to some users, and our aim is to reduce false positives as we update and refine the safeguards after launch.” Opus 4.8 has already been annoying some users after Anthropic boosted the model’s “honesty.”
Cost
With a hefty price tag of $10 per million input tokens and $50 per million output tokens, Fable 5 and Mythos 5 cost twice as much as the standard version of Opus 4.8. At a time when many individual developers and organizations are starting to fret over the costs of using AI, that will likely push Anthropic’s new model beyond the budgets of much of its userbase.
But the powerful aura that’s been built up around Mythos over the past couple of months also shouldn’t be underestimated: Many people who have been hearing rumours about the fabled Mythos will likely be more than glad to foot the bill for token usage just to see if Fable lives up to the rumors.
Read the full article here
