Particle.news


Goldman Sachs Data Chief Says AI Has Run Out of Human Training Data

The bank says unlocking proprietary corporate data offers the clearest near-term path to sustaining model gains.

Overview

  • Neema Raphael said, "We’ve already run out of data," noting that developers are increasingly relying on synthetic outputs or training on other models’ responses.
  • He warned about the risk of model collapse, where repeatedly training on AI-generated content degrades accuracy and amplifies errors.
  • Goldman Sachs argues that large stores of information behind corporate firewalls remain underused and could provide higher-quality inputs for future systems.
  • Raphael said data scarcity should not be a massive constraint if firms clean, normalize, and properly govern their internal datasets.
  • The remarks echo broader concerns about a looming peak-data crunch: Nature has forecast a crisis by 2028, Ilya Sutskever has predicted that rapid gains will end, and some reporting suggests companies may shift their focus toward more agentic AI.
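The model-collapse risk Raphael describes can be illustrated with a toy sketch (not Goldman's analysis, and far simpler than real language-model training): repeatedly fit a distribution to samples generated by the previous generation's fit, so each model trains only on synthetic output of its predecessor. Compounding sampling noise makes the fitted parameters drift away from the original data distribution; all names and parameter choices here are illustrative.

```python
# Toy illustration of "model collapse": each generation fits a
# Gaussian to a finite sample drawn from the PREVIOUS generation's
# Gaussian, never touching real data again. Sampling error compounds
# across generations, so the fitted parameters drift away from the
# original distribution (with small samples, variance often decays).
import random
import statistics

random.seed(0)

mu, sigma = 0.0, 1.0    # generation 0: the "real data" distribution
n_samples = 20          # small finite sample per generation
variances = [sigma ** 2]

for generation in range(50):
    # Draw purely synthetic data from the current model ...
    samples = [random.gauss(mu, sigma) for _ in range(n_samples)]
    # ... and refit the next model on that synthetic data alone.
    mu = statistics.fmean(samples)
    sigma = statistics.stdev(samples)
    variances.append(sigma ** 2)

print(f"variance: gen 0 = {variances[0]:.3f}, gen 50 = {variances[-1]:.3f}")
```

Mixing in a fresh slice of real (or high-quality proprietary) data each generation counteracts this drift, which is one way to read the bank's argument for unlocking corporate datasets.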