Meta Accused of Using 82TB of Pirated Books to Train AI Models

Court documents reveal Meta allegedly downloaded copyrighted material from shadow libraries, raising ethical and legal concerns.

Overview

Meta is facing a lawsuit alleging the company illegally downloaded 82 terabytes of pirated books from shadow libraries like LibGen, Z-Library, and Anna's Archive to train its AI models.
Internal communications show Meta employees expressing discomfort with the use of pirated content, with one researcher stating that it crosses ethical boundaries.
CEO Mark Zuckerberg reportedly attended a January 2023 meeting where he pushed to move the AI training effort forward despite concerns about its legality.
Meta employees allegedly discussed using VPNs to mask their IP addresses while downloading the data, suggesting a deliberate effort to conceal the activity.
The lawsuit, filed by authors including Ta-Nehisi Coates and Sarah Silverman, underscores broader debates over copyright infringement in AI training and its impact on creators' rights.