Overview
- A paper in Nature describes Delphi-2M, an AI system trained on nearly 500,000 UK Biobank medical histories to estimate both an individual's disease risk and population-level prevalence long before onset.
- Researchers validated the model on Danish health records from roughly two million people; its predictions span cancers, diabetes, and cardiovascular and respiratory conditions.
- Delphi-2M applies large-language-model techniques to sequences of diagnoses, learning patterns that can refine risk estimates beyond what traditional factors such as age provide.
- Study authors and independent experts highlight biases in the UK and Danish datasets, noting that the model is not ready for deployment without additional evaluation.
- The team contrasts its broad scope with single-outcome tools like QRISK3 and identifies interpretability and ethical safeguards as core development goals.