Overview
- Under Reddit’s updated robots.txt settings, the Wayback Machine can now crawl only reddit.com’s homepage; post detail pages, comments, and user profiles are blocked.
- Reddit says it identified instances of AI developers mining archived snapshots in violation of its platform policies, prompting the technical blockade to protect user privacy.
- The company notified the Internet Archive in advance and began “ramping up” the restrictions on August 11, while both parties continue negotiations over future archival access.
- Existing Wayback Machine captures of Reddit content remain accessible for now, but all new archiving beyond top-level listings will be prevented unless a new agreement is reached.
- The measure underscores Reddit’s broader data strategy, which includes multimillion-dollar licensing deals with Google and OpenAI and legal actions such as its lawsuit against Anthropic.
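
As a rough illustration of how a robots.txt restriction like the one described in the first bullet is evaluated, the sketch below uses Python's standard `urllib.robotparser`. The rules shown are hypothetical and are not Reddit's actual robots.txt; the `ia_archiver` user-agent string, the blocked path prefixes, and the example URLs are assumptions chosen only to mirror the homepage-only behavior described above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT Reddit's real robots.txt.
# "ia_archiver" is assumed here as the archive crawler's user-agent token.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /r/
Disallow: /user/
Disallow: /comments/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def crawl_permissions(parser: RobotFileParser, agent: str, urls: list[str]) -> dict[str, bool]:
    """Map each URL to whether the given user agent may fetch it."""
    return {url: parser.can_fetch(agent, url) for url in urls}


# Example URLs are made up; they stand in for a homepage, a post, and a profile.
EXAMPLE_URLS = [
    "https://www.reddit.com/",
    "https://www.reddit.com/r/example/comments/abc123/some_post/",
    "https://www.reddit.com/user/example_user",
]

if __name__ == "__main__":
    parser = build_parser(HYPOTHETICAL_ROBOTS_TXT)
    for url, allowed in crawl_permissions(parser, "ia_archiver", EXAMPLE_URLS).items():
        print(("allowed " if allowed else "blocked ") + url)
```

Note that robots.txt is advisory: it only restricts crawlers that choose to honor it, and it governs new crawling rather than captures that already exist.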