Language | Est. 2022

MultiMedQA

MultiMedQA combines several medical question-answering benchmarks, including MedQA (USMLE-style), MedMCQA, PubMedQA, and clinical case studies, to evaluate medical knowledge and clinical reasoning.

Metrics

Accuracy (%) on medical QA tasks
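
As a rough illustration of the metric, accuracy on a multiple-choice QA set is the share of questions where the predicted option matches the reference answer, expressed as a percentage. The sketch below assumes exact-match scoring on option letters; the function name `accuracy_percent` and the sample answers are hypothetical and are not taken from the benchmark's own evaluation code.

```python
def accuracy_percent(predictions, gold):
    """Return the percentage of predictions that exactly match the gold answers."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("at least one example is required")
    # Count exact matches between predicted and reference option letters.
    correct = sum(p == g for p, g in zip(predictions, gold))
    return 100.0 * correct / len(gold)


# Hypothetical answers for four multiple-choice questions (illustrative only).
gold = ["A", "C", "B", "D"]
predictions = ["A", "C", "D", "D"]

print(f"{accuracy_percent(predictions, gold):.1f}%")  # prints 75.0%
```

Free-text subsets may need a different matching rule (for example, normalised string comparison or human rating), so exact match here is only an assumption for the multiple-choice case.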

Created By

Google Research / DeepMind

Top Model Scores

Rank  Model            Score   Date
1     GPT-5.2          93.7%   2026-03
2     Med-Gemini 3     93.1%   2026-01
3     Claude Opus 4.6  91.8%   2026-02
4     Grok 4           88.4%   2026-02
5     Llama 4 405B     85.6%   2026-01