
Study Finds Viral Social Media Data Causes Lasting Reasoning Decline in LLMs

Training on high‑engagement posts measurably degrades open‑model reasoning, with recovery proving elusive.

Overview

  • Researchers retrained open models such as LLaMA and Qwen on X/Twitter corpora built from viral posts versus longer factual text, then evaluated performance on standard benchmarks (see the evaluation sketch after this list).
  • Reasoning accuracy on ARC‑Challenge fell from 74.9 to 57.2 percent, and long‑context comprehension on RULER‑CWE dropped from 84.4 to 52.3 percent, when models were trained solely on viral content.
  • The team identified a failure mode dubbed “thought skipping,” in which models omit intermediate reasoning steps and produce shorter, less structured answers with more errors (a crude detection heuristic is sketched after this list).
  • Performance did not return to baseline after fine‑tuning on clean data, which the authors attribute to representational drift that conventional retraining cannot fully reverse.
  • The study links the degradation to engagement signals rather than semantic content and reports personality‑like safety regressions, prompting calls for stronger data provenance and quality controls, including in emerging on‑chain data marketplaces (a minimal engagement‑based filter is sketched below).
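
As a rough illustration of the kind of benchmark run described above, and not the authors' actual harness, the sketch below scores a Hugging Face checkpoint on ARC‑Challenge using EleutherAI's open‑source lm‑evaluation‑harness; the checkpoint path is a placeholder, not a model from the study.

```python
# Sketch: scoring a checkpoint on ARC-Challenge with EleutherAI's
# lm-evaluation-harness (pip install lm-eval). Illustrative only; the
# checkpoint path is a placeholder, not one of the study's models.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face backend
    model_args="pretrained=meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder
    tasks=["arc_challenge"],
    num_fewshot=0,
)

# Metric key names vary across harness versions, so print the whole entry.
print(results["results"]["arc_challenge"])
```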
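
“Thought skipping” shows up as truncated or absent chains of reasoning. A crude way to flag it in model transcripts is to count explicit intermediate‑step markers before the final answer; the marker list and threshold below are illustrative assumptions, not the paper's metric.

```python
import re

# Crude heuristic for flagging "thought skipping" in a chain-of-thought
# transcript: count explicit step markers before the final answer. The
# marker words and threshold are illustrative assumptions, not the
# study's actual measurement.
STEP_MARKERS = re.compile(
    r"\b(?:step\s*\d+|first|then|next|therefore|hence|thus)\b",
    re.IGNORECASE,
)

def looks_like_thought_skipping(response: str, min_steps: int = 2) -> bool:
    """Return True if the response shows fewer reasoning steps than expected."""
    return len(STEP_MARKERS.findall(response)) < min_steps

full = "Step 1: list the primes under 10. Step 2: sum them. Therefore the answer is 17."
skipped = "The answer is 17."
print(looks_like_thought_skipping(full))     # False: multiple explicit steps
print(looks_like_thought_skipping(skipped))  # True: jumps straight to the answer
```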
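
Because the reported harm tracks engagement signals rather than topic, one control the findings imply is filtering pretraining candidates on those signals before ingestion. Here is a minimal sketch, assuming each record carries engagement counts and text; the field names, popularity weighting, and thresholds are all hypothetical.

```python
from dataclasses import dataclass

# Minimal sketch of an engagement-based ingestion filter. Field names,
# the popularity formula, and thresholds are hypothetical; the point is
# that the filter keys on engagement signals and length, not topic.
@dataclass
class Post:
    text: str
    likes: int
    reposts: int

def keep_for_training(post: Post, max_popularity: int = 500, min_words: int = 50) -> bool:
    """Drop short, highly viral posts; keep longer, low-engagement text."""
    popularity = post.likes + 2 * post.reposts  # hypothetical weighting
    return popularity <= max_popularity and len(post.text.split()) >= min_words

corpus = [
    Post("hot take ...", likes=40_000, reposts=12_000),
    Post(" ".join(["substantive"] * 80), likes=35, reposts=2),
]
clean = [p for p in corpus if keep_for_training(p)]
print(len(clean))  # 1: only the longer, low-engagement post survives
```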