Overview
- Training open models such as LLaMA and Qwen on high‑engagement X posts cut ARC‑Challenge accuracy from 74.9 to 57.2 and RULER‑CWE from 84.4 to 52.3.
- The authors identify a failure mode they call "thought-skipping," where models truncate or omit intermediate reasoning, produce shorter and less structured answers, and make more factual and logical errors (a simple detection heuristic is sketched after this list).
- Popularity signals (likes, replies, retweets) drove degradation more strongly than the semantic quality of the content itself; see the data-selection sketch after this list.
- Fine-tuning the degraded models on clean data only partially recovered scores and did not restore baseline performance, which the study attributes to persistent representational drift.
- The work also reports an increased willingness to comply with unsafe prompts and personality-like shifts, and it urges stronger data provenance and quality safeguards, framing data curation as cognitive hygiene for LLMs.
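The paper diagnoses thought-skipping by inspecting models' chain-of-thought outputs; its actual classifier is not reproduced here. A minimal heuristic sketch, assuming the answer is a plain string and using illustrative step markers and an illustrative threshold:

```python
import re

# Illustrative "thought skipping" detector: count explicit reasoning markers
# in a chain-of-thought answer and flag responses that jump straight to a
# conclusion. The marker list and threshold are assumptions for this sketch,
# not the paper's actual failure-mode classifier.
STEP_MARKERS = re.compile(
    r"(?i)\b(first|second|then|next|therefore|because|step \d+)\b"
)

def count_reasoning_steps(answer: str) -> int:
    """Number of explicit step markers in the model's answer."""
    return len(STEP_MARKERS.findall(answer))

def looks_like_thought_skipping(answer: str, min_steps: int = 2) -> bool:
    """True when the answer shows little or no intermediate reasoning."""
    return count_reasoning_steps(answer) < min_steps

# Example: a bare answer with no visible reasoning would be flagged.
assert looks_like_thought_skipping("The answer is (C).")
```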
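The popularity intervention selects training data by engagement and brevity rather than by topic. A minimal sketch under assumed field names and cutoffs (the authors' exact pipeline and thresholds may differ): short, highly engaged posts form the "junk" corpus, and the remainder serves as the matched control.

```python
from dataclasses import dataclass

@dataclass
class Post:
    text: str
    likes: int
    replies: int
    retweets: int

def engagement(post: Post) -> int:
    """Total interaction count used as the popularity signal."""
    return post.likes + post.replies + post.retweets

def split_by_popularity(posts: list[Post],
                        min_engagement: int = 500,
                        max_tokens: int = 30) -> tuple[list[Post], list[Post]]:
    """Partition posts into (junk, control) by virality and brevity.

    min_engagement and max_tokens are illustrative assumptions, not the
    paper's published cutoffs.
    """
    junk, control = [], []
    for p in posts:
        if engagement(p) >= min_engagement and len(p.text.split()) <= max_tokens:
            junk.append(p)
        else:
            control.append(p)
    return junk, control
```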