Overview
- MAI-DxO was evaluated on the Sequential Diagnosis Benchmark, built from 304 complex New England Journal of Medicine case reports, and achieved up to 85.5% diagnostic accuracy, compared with 20% for human physicians.
- The model-agnostic orchestrator simulates a panel of five AI agents that iteratively refine hypotheses and select strategic tests to mirror expert clinical reasoning.
- In simulated evaluations, its diagnostic costs were 20% lower than those of doctors and 70% lower than those of standard AI models.
- Microsoft has not set a commercialization timeline and positions the AI tool as a complement to physicians rather than a replacement.
- Medical experts emphasize that formal peer review and trials with actual patients are essential before its clinical efficacy and cost benefits can be confirmed.