Internal documents unsealed in a copyright lawsuit show that Meta used pirated books from LibGen to train its artificial intelligence systems. The revelations, reported by Wired magazine and commented on by Lithub, emerged from a case filed by several prominent authors, including Junot Díaz, Sarah Silverman, and Ta-Nehisi Coates.
The documents contain direct quotes from Meta employees discussing their use of LibGen, a known piracy site. In one exchange, a Meta engineer expressed discomfort about “torrenting from a corporate laptop.” The documents also indicate that Meta CEO Mark Zuckerberg was aware of and approved the use of pirated materials for AI training.
The lawsuit, known as Kadrey et al. v. Meta Platforms, was filed in the Northern District of California. The plaintiffs are now seeking to expand their case based on new evidence suggesting Meta not only downloaded pirated content but also participated in distributing it through torrent networks.
According to court records, a Meta representative testified under oath in November 2024 that the company engaged in “seeding” pirated files containing the plaintiffs’ works on torrent sites. Seeding means uploading a downloaded file, or pieces of it, to other users in a peer-to-peer network.
Meta has defended its actions by claiming that the materials were publicly available and that using them to train AI is protected under the fair use doctrine. The plaintiffs counter that public availability does not make a use lawful, especially when the content is known to be pirated.
The case highlights growing concerns about the training data used in commercial AI systems. It raises questions about intellectual property rights in AI development and about whether major technology companies are responsible for ensuring that their models are trained on legally obtained materials.