Overview
- European teams from EMBL, DKFZ and the University of Copenhagen published a study in Nature introducing Delphi-2M, a transformer-based system for modeling longitudinal medical records.
- Training drew on roughly 400,000–454,000 UK Biobank participants and 811 million clinical events, with retrospective validation on about 1.9 million Danish patient records.
- The approach treats medical histories as sequences to model event order and timing, enabling calibrated risk estimates and timing predictions rather than definitive diagnoses.
- Performance was strongest for conditions with consistent progression, such as certain cancers, myocardial infarction and septicemia, with some five-year predictions reaching AUCs above 0.85 and outperforming traditional tools on selected tasks.
- The authors note demographic biases stemming from groups that are underrepresented in the data, and emphasize next steps including prospective trials, evaluation in broader populations and frameworks for ethics, privacy and transparency.
- The model can also simulate multi-decade health trajectories and produce privacy-preserving synthetic data.
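
Two ideas in the summary above, treating a medical history as an ordered sequence and scoring predictions with AUC, can be illustrated with a small Python sketch. This is not Delphi-2M's code or data format: the `Event` type, the `to_sequence` and `auc` helpers, the diagnosis codes, ages and risk scores are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    age_years: float  # age at which the event was recorded
    code: str         # illustrative ICD-10-style diagnosis code


def to_sequence(events):
    """Sort events by age and pair each code with the years elapsed since the previous event."""
    ordered = sorted(events, key=lambda e: e.age_years)
    sequence = []
    previous_age = None
    for event in ordered:
        gap = 0.0 if previous_age is None else event.age_years - previous_age
        sequence.append((event.code, round(gap, 2)))
        previous_age = event.age_years
    return sequence


def auc(scores, labels):
    """Probability that a random positive case outranks a random negative one (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# A made-up history, deliberately given out of order.
history = [Event(61.5, "I21"), Event(48.0, "E11"), Event(55.25, "I10")]
print(to_sequence(history))
# [('E11', 0.0), ('I10', 7.25), ('I21', 6.25)]

# Made-up risk scores for four people; 1 means the outcome occurred.
risk_scores = [0.9, 0.2, 0.35, 0.4]
outcomes = [1, 0, 1, 0]
print(auc(risk_scores, outcomes))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported values above 0.85 mean that people who went on to develop a condition were scored higher than those who did not in more than 85% of such pairings. AUC measures ranking only; calibration, which the summary also mentions, has to be checked separately.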