Particle.news

Study Finds Viral Social Media Data Causes Lasting 'Brain Rot' in AI Models

A new arXiv preprint reports lasting performance declines in models exposed to viral posts and identifies engagement metrics as the primary driver.

Overview

  • Researchers at the University of Texas at Austin, Texas A&M, and Purdue continually pretrained Llama3 and Qwen variants on datasets of X posts selected for virality or clickbait-style content to measure the effects on model cognition.
  • Performance dropped in a dose-dependent way as the ratio of junk data rose: ARC-Challenge fell from 74.9 to 57.2 and RULER-CWE from 84.4 to 52.3. Popularity signals harmed reasoning more than low semantic quality did.
  • Degraded models showed a failure pattern dubbed “thought skipping,” yielding shorter, less-structured answers with more factual and logical errors.
  • Retraining on cleaner data produced only partial recovery, which the authors attribute to persistent representational drift that standard fine-tuning could not reverse.
  • All tested models declined, with Llama3 8B the most sensitive and Qwen3 4B relatively more resilient. The paper, which has not yet been peer reviewed, urges stricter data curation, provenance, and routine cognitive health checks.
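
To put the two benchmark figures quoted above on a common footing, the short sketch below converts each before/after pair into an absolute drop in points and a relative drop in percent. The helper name score_drop and the dictionary layout are illustrative assumptions, not anything taken from the paper; only the four scores come from the summary.

```python
def score_drop(before: float, after: float) -> tuple[float, float]:
    """Return (absolute drop in points, relative drop in percent), rounded to 1 decimal."""
    absolute = before - after
    relative = 100.0 * absolute / before
    return round(absolute, 1), round(relative, 1)


# Scores as quoted in the summary above (before -> after junk exposure).
reported = {
    "ARC-Challenge": (74.9, 57.2),
    "RULER-CWE": (84.4, 52.3),
}

for name, (before, after) in reported.items():
    absolute, relative = score_drop(before, after)
    print(f"{name}: -{absolute} points ({relative}% relative drop)")
```

Under these numbers, ARC-Challenge loses 17.7 points (about 23.6% of its starting score) and RULER-CWE loses 32.1 points (about 38.0%), so the RULER-CWE decline is the larger of the two on both measures.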