[Photo: Judge William Alsup of the Northern District of California.]
Law and Technology

Bartz v. Anthropic: All you need to know about the largest copyright settlement in history

The clash between AI's feeding on data and the Constitution's promise to reward authors is playing out in a US court, and its resolution will influence how machine learning models are built and governed.

ARTIFICIAL INTELLIGENCE (‘AI’) MODEL TRAINING and human authorship have collided in Bartz v. Anthropic, the recently settled class action in which authors sued the AI firm Anthropic for training its Claude model on pirated books.

The stakes could not be higher: this $1.5 billion deal, the largest U.S. copyright settlement ever, signals to the booming AI industry that taking free literary "raw material" from shadow libraries comes at immense cost and violates the statutory rights that copyright law grants to authors. Authors Guild CEO Mary Rasenberger hailed the settlement as "a vital step" and warned that AI companies "cannot simply steal authors' creative work…just because they need books to develop quality LLMs".

Chronology of Bartz v. Anthropic (2024–2025): In August 2024, nonfiction authors Andrea Bartz, Charles Graeber and Kirk Johnson sued Anthropic, alleging it had copied their books to train its AI without permission. In June 2025, Judge William Haskell Alsup held on summary judgment that training Claude on legally acquired books was "transformative" fair use, but he refused to excuse Anthropic's mass downloading of pirated books. By July, the court certified a class of all registered rightsholders whose works were taken from known pirate sites (LibGen and PiLiMi). The parties announced a $1.5 billion settlement on September 5, 2025, which Judge Alsup preliminarily approved at a September 25 hearing.

Systemic unpacking

Copyright law and fair use: Under the U.S. Copyright Act, authors have exclusive rights to copy and distribute their books, subject only to narrow exceptions. Section 107's "fair use" doctrine permits some unlicensed uses if they are socially valuable and don't unfairly supplant the market for the original. In AI model training cases, courts have focused on whether the model's use is transformative - essentially turning the text into something qualitatively new. 

Judge Alsup viewed text‐based training as highly transformative: he wrote that the technology was "among the most transformative many of us will see in our lifetimes" (comparing AI learning to how humans learn by reading). He held that using lawfully purchased books for destructive digitisation and model training was "quintessentially transformative" and thus protected by fair use. 

In contrast, he rejected Anthropic's bid for blanket immunity for its central library of pirated books, finding that "downloading millions of pirated books to build a permanent digital library" was not justified by fair use. In other words, the court drew a line: the training process itself (on authorised inputs) is permissible, but Anthropic's underlying acquisition of those inputs via piracy remains infringement. Importantly, Judge Alsup left open whether training on pirated copies could ever be fair use; he expressed scepticism that copying first from a pirate site could later be "subsumed" by a training use. Skadden LLP notes these rulings are highly fact-specific - one cannot simply generalise that all AI training is fair use - and Judge Chhabria (Kadrey v. Meta) has warned that even a strong finding of transformation "is not the end of the analysis".

Class certification: Judge Alsup certified a nationwide class of copyright owners in July 2025. The class includes all beneficial and legal copyright owners of any book versions downloaded by Anthropic from the LibGen or PiLiMi databases (so long as the work was properly registered with the U.S. Copyright Office and has an ISBN/ASIN). In practice, only about 500,000 of the roughly 7 million downloaded titles qualified under these criteria. 

Crucially, the class was certified only for the piracy claims - the fair-use ruling on training applies only to the three named plaintiffs, not to the class as a whole. Had the case gone to trial, each work in the class could have generated up to $150,000 in statutory damages for willful infringement - an aggregate exposure in the tens of billions of dollars. Class counsel built the case on the allegation that "Anthropic has built a multibillion-dollar business by stealing hundreds of thousands of copyrighted books", with the company's engineers funnelling millions of titles from notorious pirate sites into Claude's model training pipeline.
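The scale of that exposure is easy to verify from the figures reported in the case; the sketch below is purely illustrative (juries set actual awards per work within the statutory range):

```python
# Illustrative statutory-damages arithmetic for the certified class.
# Figures from the case record: ~500,000 qualifying works, a $750
# per-work statutory floor, and a $150,000 per-work cap for willful
# infringement.
QUALIFYING_WORKS = 500_000
STATUTORY_FLOOR = 750
WILLFUL_CAP = 150_000

floor_exposure = QUALIFYING_WORKS * STATUTORY_FLOOR   # $375 million
cap_exposure = QUALIFYING_WORKS * WILLFUL_CAP         # $75 billion

print(f"Floor: ${floor_exposure:,}  Cap: ${cap_exposure:,}")
```

The $75 billion ceiling is what makes the "tens of billions" aggregate plausible, and it explains the settlement pressure on both sides.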

The settlement terms: Under the proposed deal, Anthropic will pay $1.5 billion into a settlement fund. After fees and expenses, every qualifying title in the class will receive an equal share - about $3,000 per book, to be split between author and publisher. Anthropic also agreed to destroy the original pirated files and any copies it made, though it keeps any books it purchased legitimately. The release is explicitly limited to past conduct: Anthropic obtains peace of mind only for its prior downloading and internal training activities through August 25, 2025, on the identified works. The company does not get a blanket release for future copying, for uses of other works, or for any allegedly infringing outputs of the Claude model. In short, the settlement resolves the piracy claims up to a cutoff date but leaves all future copyright issues open, including whether Claude's outputs themselves infringe.
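Taking the reported figures at face value, the headline per-title number is simple division - a sketch that ignores attorneys' fees, expenses, and the mechanics of the author/publisher split:

```python
# Back-of-the-envelope arithmetic on the settlement figures quoted above.
FUND = 1_500_000_000          # settlement fund in dollars
QUALIFYING_TITLES = 500_000   # works meeting the class criteria
STATUTORY_FLOOR = 750         # minimum per-work statutory award
WILLFUL_CAP = 150_000         # maximum per-work award for willful infringement

per_title = FUND / QUALIFYING_TITLES    # 3000.0 dollars per title
vs_floor = per_title / STATUTORY_FLOOR  # 4.0 - four times the floor
vs_cap = per_title / WILLFUL_CAP        # 0.02 - two percent of the cap

print(per_title, vs_floor, vs_cap)
```

These two ratios - four times the $750 floor, two percent of the $150,000 cap - are exactly the poles around which the fairness debate below turns.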

Normative and critical pivot

Is roughly $3,000 per title "justice"? The answer depends on perspective. On one hand, $3,000 far exceeds the statutory floor - four times the $750 minimum per-work award - and without a settlement plaintiffs risked recovering nothing (trial could have meant years of appeals, with no guarantee of full damages). On the other hand, $3,000 is only 2 percent of the $150,000 statutory cap per book - a recovery that may seem paltry next to the value of an author's creative labour. Any plaintiff who believed her title was worth 100 or 1,000 times more gave up that gamble by settling. Even compensating "only" the authors and publishers of the pirated subset, the lump sum surely falls far below what might have been obtained from Anthropic's vast resources (the company was valued at $183 billion in September 2025).

Yet this settlement was a middle path. Collecting maximum damages on hundreds of thousands of works would have been unpredictable at trial and would likely have produced a verdict that Anthropic appealed for years. By contrast, the $3,000 figure provides immediate certainty. Ropes & Gray observes that the award is "four times larger than the $750 floor" and far above the $200 per-work rate for innocent infringement. In that sense, each author at least beats the floor. Plaintiffs' counsel also pointed out that splitting proceeds between authors and publishers down the middle (a 50/50 baseline) is rooted in industry norms for trade books.

An important question is how this outcome shapes future licensing of training data. Authors' groups and lawyers view it as a nudge toward negotiated deals. The Authors Guild predicts the settlement "will lead to more licensing that gives authors both compensation and control". Ropes & Gray argues that AI developers will increasingly seek proactive licensing agreements - allowing them to bargain market‐rate royalties rather than risk statutory damages. In fact, Anthropic's deal itself lays groundwork for future contracts: it demonstrates an expectation that companies must pay for scraped books just as they do for computing power or other inputs. Ropes & Gray emphasises the need for compliance strategies, including detailed data provenance systems and prompt deletion of any tainted materials.

What about the other side of fair use - the creative incentive? Some might worry that Judge Alsup's holding that AI model training on purchased texts is fair use is too permissive, effectively allowing models to ingest any publicly sold book for free. But Bartz v. Anthropic did not bless wholesale copying of every copyrighted text. The court stressed that a transformative purpose was key: simply pilfering a pirate library was not tolerated. Moreover, Judge Alsup signalled doubt that any fair use could justify the initial act of piracy. The divergence between courts (e.g. Kadrey v. Meta, where training was found fair use on the record presented, versus Anthropic's piracy holding) means that higher courts or legislation may need to clarify the law. For now, the message is: building an AI on texts is likely fair use when done through proper channels, but unauthorised copying remains illegal.

Forward-looking close

The Bartz v. Anthropic settlement will influence AI governance and copyright enforcement going forward. For one thing, it sets a financial benchmark for other copyright plaintiffs in the AI context, who may point to roughly $3,000 per title as a reference point for damages and negotiation. Plaintiffs may demand that future settlements similarly require identification and destruction of infringing data sources. Certainly, AI developers will take note. Internally, companies building large language models will likely revise their data practices, which may mean more licensing deals, greater reliance on public-domain or open-licensed corpora, and new compliance regimes. Content provenance and permissions will take on new prominence after Bartz v. Anthropic: AI firms should keep rigorous records of where their training data comes from and be prepared to destroy any illicit datasets if challenged.
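The provenance point can be made concrete with a minimal sketch. Everything here is a hypothetical illustration - the case imposes no particular schema, and the field names are assumptions, not an industry standard:

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical sketch of a training-data provenance record - the kind of
# bookkeeping the compliance commentary above points toward. Field names
# are illustrative assumptions, not a court-mandated or standard schema.
@dataclass
class ProvenanceRecord:
    title: str
    isbn: str
    source: str           # e.g. "publisher licence" - never a shadow library
    licence_ref: str      # contract or invoice identifier
    acquired: date
    lawfully_acquired: bool

record = ProvenanceRecord(
    title="Example Novel",
    isbn="978-0-00-000000-0",
    source="publisher licence",
    licence_ref="LIC-2025-0001",
    acquired=date(2025, 1, 15),
    lawfully_acquired=True,
)
print(record.lawfully_acquired)  # True
```

Even a record this simple answers the two questions the litigation turned on: where a work came from, and whether it was lawfully acquired.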

Beyond compliance, the case highlights broader constitutional and policy implications. The U.S. Copyright Clause exists to "promote the progress of science and useful arts" by granting creators exclusive rights for a limited time. The Bartz v. Anthropic settlement reasserts that bargain: creators' works should yield a return when companies exploit them for profit. It also underlines that innovation must respect lawful institutions and norms. In a sense, Bartz v. Anthropic bridges the high-tech world of generative AI with the centuries-old edifice of intellectual property, reminding us that even groundbreaking technologies operate under the rule of law.

Looking ahead, many questions remain open. Bartz v. Anthropic resolved past copying, but it left untouched the thorny issues of AI output infringement and future model-training practices. In its wake, other ongoing lawsuits (against OpenAI, Meta, Google and others) will test how far "transformative use" can stretch. Legislators and regulators may take note too: some lawmakers are already debating AI-specific copyright reforms and disclosure rules. For now, AI firms have learned that wholesale scraping of copyrighted books is too risky, and authors have shown they can muster collective redress in the courts. The Bartz v. Anthropic settlement will thus echo through AI governance discussions, licensing markets, and the AI community's ethical norms for years to come.
