In a development with potentially global ramifications, two judges of the U.S. District Court for the Northern District of California have ruled that using legally obtained copyrighted books to train artificial intelligence (AI) models constitutes fair use under American copyright law. These rulings, involving AI companies Anthropic and Meta, are among the first judicial decisions to directly address the legality of training large language models (LLMs) using copyrighted material.
This article analyses the rulings, their implications for AI developers and content creators, and how this evolving legal landscape may impact the broader debate on copyright and machine learning.
Background: Fair Use and AI Training
Fair use is a doctrine under U.S. copyright law that allows limited use of copyrighted material without permission from the rights holders, particularly for transformative purposes such as research, education, commentary, or parody. In recent years, this doctrine has come under scrutiny in the context of AI training, where massive datasets—often containing copyrighted works—are used to train generative models such as ChatGPT, Claude, and LLaMA.
Key Rulings on Fair Use in AI Training
1. Bartz, et al. v. Anthropic PBC
Court: U.S. District Court, Northern District of California
Date: June 23, 2025
Judge: Hon’ble William Alsup
In this case, the Plaintiffs alleged that Anthropic used their copyrighted books without authorization to train its Claude chatbot. The Court ruled that using lawfully obtained books for AI training constitutes fair use, emphasizing the transformative nature of the usage.
Highlights:
The Court likened AI training to a human author learning by reading books—not to copy, but to internalize patterns and concepts.
All four factors of the fair use test under Section 107 of the U.S. Copyright Act were analyzed:
- Purpose & Character of Use: Training was transformative and non-expressive.
- Nature of the Work: The use of creative works may still qualify as fair use where the purpose is new and non-expressive.
- Amount Used: Copying each work in full was deemed necessary for training, and the text was not reproduced verbatim in the model's outputs.
- Market Effect: Plaintiffs failed to show that the training had a negative commercial impact on book sales.
Caveat: Pirated Data
The Court explicitly held that using pirated copies of books sourced from unauthorized online libraries like Bibliotik and Z-Library is not protected under fair use. That portion of the case will proceed to trial in December 2025.
2. Kadrey, et al. v. Meta Platforms Inc.
Court: U.S. District Court, Northern District of California
Date: June 25, 2025
Judge: Hon’ble Vince Chhabria
In a separate case decided two days later, the Court granted summary judgment in favour of Meta, ruling that the use of copyrighted books to train its LLaMA models fell within the scope of fair use.
Key Takeaways:
- The Plaintiffs were unable to prove actual or potential market harm, a critical component of the fair use analysis.
- The ruling underscores the significance of transformative use and market impact in assessing AI training on copyrighted content.
Broader Impact on Global Copyright Law
While these rulings apply under U.S. copyright law, they set persuasive precedents that may influence courts and policymakers in other jurisdictions, including the UK, EU, and India, where copyright exceptions for AI training are still under debate.
These decisions also follow the U.S. Supreme Court’s reasoning in Google LLC v. Oracle America, Inc. (2021), where transformative use was a central factor in finding that Google's copying of software API declarations was fair use.
Guidance for Businesses and Legal Teams
1. Audit Training Data: Ensure all training datasets are sourced from authorized or public domain content.
2. Maintain Transparency: Document how copyrighted works are processed and whether output content is transformative.
3. Avoid Shadow Libraries: Do not use content from platforms that distribute unauthorized copies of copyrighted works.
4. Monitor Evolving Laws: As international jurisdictions respond to these rulings, companies should remain vigilant for changes in local legislation.
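For teams that keep a machine-readable manifest of their training data, the first and third points above can be partly automated. The following is a minimal Python sketch of such a check, not a definitive compliance tool. The record fields (title, source, acquisition), the acquisition labels, and the blocklist are all hypothetical; the two blocklist names are simply the shadow libraries mentioned in this article. The check only flags entries for legal review and does not itself establish that a copy was lawfully obtained.

```python
from dataclasses import dataclass

# Hypothetical blocklist. The two names are the shadow libraries mentioned in
# the article; a real audit would maintain its own list with legal counsel.
BLOCKED_SOURCES = {"bibliotik", "z-library"}

# Hypothetical labels describing how a copy was acquired.
ALLOWED_ACQUISITIONS = {"purchased", "licensed", "public_domain"}


@dataclass
class DatasetRecord:
    title: str
    source: str       # where the copy came from
    acquisition: str  # how the copy was obtained


def audit(records):
    """Return (record, reason) pairs for entries that need legal review."""
    flagged = []
    for rec in records:
        if rec.source.strip().lower() in BLOCKED_SOURCES:
            flagged.append((rec, "source is a known shadow library"))
        elif rec.acquisition.strip().lower() not in ALLOWED_ACQUISITIONS:
            flagged.append((rec, "no documented lawful acquisition"))
    return flagged


if __name__ == "__main__":
    manifest = [
        DatasetRecord("Book A", "Publisher direct", "purchased"),
        DatasetRecord("Book B", "Z-Library", "download"),
        DatasetRecord("Book C", "Unknown scrape", ""),
    ]
    for rec, reason in audit(manifest):
        print(f"{rec.title}: {reason}")
```

Keeping the reason alongside each flagged record also supports the documentation point, since the audit output can be stored as part of the provenance record for the dataset.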
Conclusion
The recent rulings from U.S. courts mark a major step forward in clarifying the relationship between copyright law and artificial intelligence. While the Courts have recognized fair use in training AI with legally obtained materials, they have also drawn a firm boundary against the use of pirated content.
As AI continues to evolve, so too will the legal frameworks that govern it. These decisions provide a legal foundation—but not a final answer—for how copyright law will interact with machine learning in the years to come.


