Anthropic has agreed to pay $1.5 billion to settle a lawsuit brought by authors and publishers over its use of millions of copyrighted books to train the models for its AI chatbot Claude, according to a legal filing posted online.
A federal judge ruled in June that Anthropic's use of copyrighted books to train its models was protected under fair use, but that holding some 7 million pirated works in a "central library" violated copyright law. The judge found that executives at the company knew they were downloading pirated works, and a trial was scheduled for December.
The settlement, which was presented to a federal judge on Friday, still needs final approval but would pay $3,000 per book to hundreds of thousands of authors, according to the New York Times. The $1.5 billion settlement would be the largest payout in the history of U.S. copyright law, though per-work damages in past copyright cases have sometimes been higher. In 2012, for example, a woman in Minnesota was ordered to pay about $9,000 per downloaded song, a figure brought down after she was initially ordered to pay over $60,000 per song.
In a statement to Gizmodo on Friday, Anthropic touted the earlier ruling from June that it was engaging in fair use by training models with millions of books.
“In June, the District Court issued a landmark ruling on AI development and copyright law, finding that Anthropic’s approach to training AI models constitutes fair use,” Aparna Sridhar, deputy general counsel at Anthropic, said in a statement by email.
“Today’s settlement, if approved, will resolve the plaintiffs’ remaining legacy claims. We remain committed to developing safe AI systems that help people and organizations extend their capabilities, advance scientific discovery, and solve complex problems,” Sridhar continued.
According to the legal filing, Anthropic says the payments will go out in four tranches tied to court-approved milestones. The first $300 million would be due within five days of the court's preliminary approval of the settlement, and another $300 million within five days of the final approval order. A further $450 million, with interest, would be due within 12 months of the preliminary order, and a final $450 million within the year after that.
Anthropic, which was recently valued at $183 billion, still faces lawsuits from companies like Reddit, which struck a deal in early 2024 to let Google train its AI models on the platform's content. Authors also have active lawsuits against other big tech firms, including OpenAI, Microsoft, and Meta.
The ruling from June explained that Anthropic’s training of AI models with copyrighted books would be considered fair use under U.S. copyright law because theoretically someone could read “all the modern-day classics” and emulate them, which would be protected:
…not reproduced to the public a given work’s creative elements, nor even one author’s identifiable expressive style…Yes, Claude has outputted grammar, composition, and style that the underlying LLM distilled from thousands of works. But if someone were to read all the modern-day classics because of their exceptional expression, memorize them, and then emulate a blend of their best writing, would that violate the Copyright Act? Of course not.
“Like any reader aspiring to be a writer, Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them—but to turn a hard corner and create something different,” the ruling said.
Under this legal theory, all the company needed to do was buy every book it pirated to lawfully train its models, something that certainly costs less than $3,000 per book. But as the New York Times notes, this settlement won’t set any legal precedent that could determine future cases because it isn’t going to trial.