Particle.news

Clinicians Outperform ChatGPT 3.5 in Emergency Triage, EUSEM Study Finds

Researchers report lower accuracy and sensitivity for the AI than for clinicians, prompting supervised real-time trials with newer models.

Overview

  • Using 110 published cases scored with the Manchester Triage System, ChatGPT 3.5 achieved 50.4% overall accuracy versus 65.5% for nurses and 70.6% for doctors.
  • Sensitivity for truly urgent cases was lower for the AI, at 58.3% compared with 73.8% for nurses and 83.0% for doctors. Doctors led across categories, including surgical and therapeutic cases.
  • In the highest-urgency category, the AI outperformed nurses on accuracy (27.3% vs 9.3%) and specificity (27.8% vs 8.3%), though it tended to over-triage many patients.
  • The single-center, case-based comparison at Vilnius University Hospital Santaros Klinikos involved 44 nurses and six emergency doctors, and evaluated ChatGPT 3.5 outside real-time clinical workflows.
  • EUSEM experts and the study team say AI should not replace trained staff for triage and will be explored as decision support. Planned follow-up work includes fine-tuned models, larger cohorts, ECG interpretation, and training and mass-casualty scenarios.
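
For readers unfamiliar with the metrics cited in the overview, the sketch below shows how accuracy, sensitivity, and specificity are conventionally computed from a confusion matrix. It assumes a binary, one-vs-rest view of a single urgency category; the study's exact scoring method is not described here. The function name `triage_metrics` and all counts are hypothetical and are not taken from the study.

```python
def triage_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification metrics for one urgency category.

    tp: cases correctly assigned to the category
    fp: cases wrongly assigned to the category (over-triage into it)
    tn: cases correctly kept out of the category
    fn: cases that belonged in the category but were missed
    """
    total = tp + fp + tn + fn
    return {
        # Share of all decisions that were correct
        "accuracy": (tp + tn) / total,
        # Share of true category members that were identified
        "sensitivity": tp / (tp + fn),
        # Share of non-members that were correctly excluded
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for illustration only (not study data)
metrics = triage_metrics(tp=30, fp=20, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under this definition, a rater who over-triages produces many false positives for the urgent category, which lowers specificity even when sensitivity stays high.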