Overview
- The Manhattan federal court denied OpenAI’s motion to remove or destroy data that could reveal how its models ingested copyrighted material.
- The order requires the data to be preserved under confidentiality protections that strip identifying details while still allowing the evidence to be reviewed.
- News organizations argue the preserved records will show that millions of articles were used without a license to train ChatGPT and other AI tools.
- OpenAI asserts that its data practices fall under fair use and that retaining the data would violate user privacy; the judge rejected the privacy argument.
- Legal experts say the preservation requirement may set an industry precedent for AI companies’ handling of copyrighted content in training datasets.