14.10.2025

United States - Copyrights and AI: the case of Andrea Bartz, et al. versus Anthropic PBC

Introduction

On 23 June 2025, the United States District Court for the Northern District of California (hereafter ‘the Court’) issued a summary judgment in the case Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson v. Anthropic PBC. On the question of training, the Court ruled in favor of Anthropic PBC (Anthropic), holding that using copies of books, including pirated ones, to train artificial intelligence software (specifically large language models) does not violate US copyright law. The case is significant because it is one of the first AI-related copyright infringement cases to be decided.

First, the background facts of the case are presented. Next, the Court’s assessment and decision are analyzed. Finally, a conclusion is presented.

Background

Anthropic is an AI software company that offers a service called Claude, an AI tool that generates human-like text responses based on user prompts. Anthropic trained Claude's large language model using its library that included both pirated and purchased texts. The plaintiffs, authors of books that Anthropic copied from pirated and purchased sources, allege that Anthropic maintained these copies as a permanent resource, even after deciding not to use certain texts for training its language models. All copying was done without the plaintiffs' authorization. 

In August 2024, the three authors filed a class action lawsuit, claiming that Anthropic infringed their federal copyrights in various ways, including by adding pirated copies to its library and by reproducing them to train its language models.

The Court's assessment and decision

Anthropic contended that its use and storage of the books fall under fair use as defined by Section 107 of the Copyright Act. The fair use doctrine allows the use of copyrighted works without permission if certain criteria are met. The Copyright Act lists four factors to be considered in determining whether a given use is fair, with no single factor being decisive on its own:

  1. The purpose and character of the use;
  2. The nature of the copyrighted work;
  3. The amount and substantiality of the portion used in relation to the entire work;
  4. The effect of the use on the potential market for or value of the copyrighted work.

In this case, the Court found two distinct uses: (i) creating a comprehensive central library of potentially useful content and (ii) training large language models (LLMs) using that library's content.

Training LLMs through copies of books

The Court ruled that using copies to train LLMs qualifies as fair use based on an assessment of those four factors. The first, third and fourth factors weigh in favor of fair use, while the second factor, often seen as the least important, weighs against it because the authors’ works are expressive and thus highly protected.

In reaching his decision, Judge Alsup emphasized the first fair use factor: the purpose and character of the use. This factor favors fair use when the new use is transformative, meaning it serves a new and different function. Judge Alsup described Anthropic's use of the authors' books to train LLMs as “exceedingly transformative”, making it permissible under U.S. copyright law. He further noted that the outputs generated by Claude were not at issue: none of the authors' works were made available to the public in an infringing manner, which the authors acknowledged, and they never raised such a claim.

Creating a comprehensive central library

Regarding the other use, creating a comprehensive central library, the Court distinguished between purchased and pirated copies. For the purchased copies, which Anthropic converted from print to digital, the Court found a separate fair use justification (that of format change), since the original print copies were destroyed in the process and the digital versions were not shared.

In contrast, the pirated copies used to build the library were not justified as fair use. All factors weighed against fair use in this instance. Anthropic's employees indicated they would retain these pirated copies indefinitely for general purposes, even after deciding not to use them for training LLMs. The Court emphasized that each use requires its own justification, and Anthropic failed to provide a valid reason for keeping the pirated copies beyond convenience and cost savings.

Conclusion

In conclusion, Judge Alsup's summary judgment ruled that Anthropic did not infringe the authors’ copyrights by using copies of their books, including pirated ones, to train LLMs, because this is considered fair use. His reasoning focused primarily on the transformative nature of LLM training. That said, the ruling makes an important distinction between using content for training purposes and storing that content in a central library. With regard to storage: while the storage of purchased copies falls under fair use, the retention of pirated copies does not.

Anthropic has since agreed to pay $1.5 billion to settle the legal proceedings relating to the retention of pirated copies.

Finally, this judgment is significant as one of the first to be decided in copyright lawsuits related to AI. However, it is only a single case, and many similar lawsuits are expected to be decided in the near future (e.g. the Meta case), potentially involving plaintiffs with deeper pockets and more extensive legal resources. For now, the most important conclusion for AI developers is that fair use starts with the lawful acquisition of training data. In addition, U.S. law differs significantly from EU law, meaning that such reasoning may not hold up in European courts.

Authors

Koen Vranckaert
Sultan Erdogan
Shannen Verlee