Particle.news

Reddit Restricts Wayback Machine to Homepage to Thwart AI Scraping

New rules took effect this week after Reddit detected AI companies scraping its content via archived pages.

A person holds a smartphone displaying the Reddit logo.

Overview

  • Under updated robots.txt settings, the Wayback Machine can now crawl only reddit.com’s homepage; post detail pages, comments, and user profiles are blocked.
  • Reddit says it identified instances of AI developers mining archived snapshots in violation of its platform policies, prompting the technical blockade to protect user privacy.
  • The company notified the Internet Archive in advance and began “ramping up” the restrictions on August 11, while both parties continue negotiations over future archival access.
  • Existing Wayback Machine captures of Reddit content remain accessible for now, but all new archiving beyond top-level listings will be prevented unless a new agreement is reached.
  • The measure underscores Reddit’s broader data strategy, which includes multimillion-dollar licensing deals with Google and OpenAI and legal actions such as its lawsuit against Anthropic.
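
For readers curious how a robots.txt file can admit a crawler to a homepage while blocking everything beneath it, here is a minimal sketch of the path-matching logic described in the Robots Exclusion Protocol (RFC 9309). The two rules, the sample paths, and the function names are hypothetical and chosen for illustration; they are not Reddit's actual robots.txt, and the sketch leaves out user-agent groups entirely.

```python
import re

# Hypothetical rules for illustration only; NOT Reddit's actual robots.txt.
RULES = [
    ("allow", "/$"),     # "$" anchors the pattern, so only the bare homepage path matches
    ("disallow", "/"),   # prefix match: every other path on the site
]

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, trailing '$' anchor) into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_allowed(path, rules=RULES):
    """Longest matching pattern wins; a tie goes to 'allow'; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, (kind == "allow")
    return allowed

if __name__ == "__main__":
    # Sample paths are made up for the demonstration.
    for path in ["/", "/r/example/comments/abc123/", "/user/example"]:
        print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

Under these assumed rules, only the path `/` is permitted, and any post, comment, or profile path falls through to the blanket disallow. Robots.txt is a request that compliant crawlers choose to honor, so a sketch like this describes what a well-behaved archiver would skip, not an enforcement mechanism.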