Overview
- On 110 published cases scored with the Manchester Triage System, ChatGPT 3.5 reached 50.4% overall accuracy, versus 65.5% for nurses and 70.6% for doctors.
- Sensitivity for truly urgent cases was lower for the AI, at 58.3%, than for nurses (73.8%) or doctors (83.0%); doctors led across categories, including surgical and therapeutic cases.
- In the highest-urgency category, the AI outperformed nurses on accuracy (27.3% vs 9.3%) and specificity (27.8% vs 8.3%), though it tended to over-triage many patients.
- The single-center, case-based comparison at Vilnius University Hospital Santaros Klinikos involved 44 nurses and six emergency doctors, and evaluated ChatGPT 3.5 outside real-time clinical workflows.
- EUSEM experts and the study team say AI should not replace trained staff for triage and will instead be explored as decision support. Planned follow-up work covers fine-tuned models, larger cohorts, ECG interpretation, and training and mass-casualty scenarios.
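
For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how accuracy, sensitivity, and specificity are conventionally derived from a two-by-two confusion table for a single urgency category. The function name `triage_metrics` and the counts are hypothetical and chosen only for illustration; they are not taken from the study, and how the study itself computed its figures is not stated here.

```python
def triage_metrics(tp, fp, tn, fn):
    """Standard binary metrics for one urgency category.

    tp: urgent cases correctly flagged as urgent
    fp: non-urgent cases wrongly flagged as urgent (over-triage)
    tn: non-urgent cases correctly left as non-urgent
    fn: urgent cases that were missed (under-triage)
    """
    def ratio(num, den):
        # Return None instead of raising when a category has no cases.
        return num / den if den else None

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # share of urgent cases caught
        "specificity": ratio(tn, tn + fp),   # share of non-urgent cases spared
    }


# Hypothetical counts for illustration only (NOT data from the study).
metrics = triage_metrics(tp=30, fp=20, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

In this framing, over-triage shows up as false positives, which lower specificity without affecting sensitivity.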