Particle News: Reddit Restricts Wayback Machine to Homepage to Thwart AI Scraping

Overview

Under updated robots.txt settings, the Wayback Machine can now crawl only reddit.com’s homepage; post detail pages, comments, and user profiles are blocked.
Reddit says it found AI developers mining archived snapshots in violation of its platform policies, and that it imposed the technical block to protect user privacy.
The company notified the Internet Archive in advance and began “ramping up” the restrictions on August 11, while both parties continue negotiations over future archival access.
Existing Wayback Machine captures of Reddit content remain accessible for now, but new captures of anything beyond top-level listings will be blocked unless a new agreement is reached.
The measure underscores Reddit’s broader data strategy, which includes multimillion-dollar licensing deals with Google and OpenAI and legal actions such as its lawsuit against Anthropic.
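
For readers unfamiliar with how a robots.txt restriction of this kind works, the following is a minimal sketch using Python's standard-library robots.txt parser. The rules shown are hypothetical and are not Reddit's actual robots.txt; the user-agent token `ia_archiver` and the blocked path prefixes are assumptions chosen only to mirror the homepage-only behavior described above. Note also that robots.txt is advisory: it works only because a crawler chooses to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT Reddit's actual robots.txt.
# "ia_archiver" is an assumed user-agent token for the archive crawler.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /r/
Disallow: /user/
Disallow: /comments/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, url: str, agent: str = "ia_archiver") -> bool:
    """Return True if the rules let the given agent fetch the URL."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    parser = build_parser(HYPOTHETICAL_ROBOTS_TXT)
    # Example URLs are made up for the demonstration.
    for url in (
        "https://www.reddit.com/",
        "https://www.reddit.com/r/example/comments/abc123/",
        "https://www.reddit.com/user/example",
    ):
        print(url, "->", "allowed" if may_crawl(parser, url) else "blocked")
```

Under these assumed rules, a compliant crawler would fetch the homepage and skip subreddit, comment, and profile paths.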