From breadth to depth in clinical artificial intelligence evaluation
A large-scale benchmark of 87 clinical text tasks across nine languages reveals just how far large language models remain from mastering real-world medical records — and raises the question of what comes next.
Wu, J. et al. Nat. Biomed. Eng. https://doi.org/10.1038/s41551-026-01719-2 (2026).
Tordjman, M. et al. Nat. Med. 31, 2550–2555 (2025).
Raji, I. D., Daneshjou, R. & Alsentzer, E. NEJM AI 2, AIe2401235 (2025).
Bedi, S. et al. Nat. Med. 32, 943–951 (2026).
Bedi, S. et al. JAMA 333, 319–328 (2025).
Wu, D. et al. Preprint at https://arxiv.org/abs/2512.01241 (2025).
McCoy, L. G., Manrai, A. K. & Rodman, A. N. Engl. J. Med. 391, 1561–1564 (2024).
Rodman, A., Zwaan, L., Olson, A. & Manrai, A. K. NEJM AI 2, AIe2500143 (2025).
Division of Neurology, University of Alberta, Edmonton, Alberta, Canada
Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
Harvard Combined Dermatology Program, Harvard Medical School, Boston, MA, USA
Department of Dermatology, Mass General Brigham, Boston, MA, USA
Correspondence to Liam G. McCoy.
L.G.M. and D.W. report paid consulting for Meta Platforms via Magnit Global.
McCoy, L. G. & Wu, D. From breadth to depth in clinical artificial intelligence evaluation. Nat. Biomed. Eng. (2026). https://doi.org/10.1038/s41551-026-01691-x
Version of record: 24 June 2026
DOI: https://doi.org/10.1038/s41551-026-01691-x