AIME 2024
The American Invitational Mathematics Examination (AIME) is a high-school competition administered by the MAA; the 2024 edition comprises 30 problems (15 each from AIME I and AIME II) that test advanced mathematical problem-solving. These competition problems require creative mathematical thinking and are widely used to evaluate frontier model math capabilities.
Metrics
Solve rate (%) on the 30 AIME 2024 problems. Each AIME answer is an integer from 0 to 999, so responses are scored by exact match.
Created By
MAA (Mathematical Association of America)
Paper
Website
Top Model Scores
| Rank | Model | Score | Date |
|---|---|---|---|
| 1 | GPT-5.2 | 83.3% | 2026-03 |
| 2 | Claude Opus 4.6 | 80.0% | 2026-02 |
| 3 | Gemini 3 Ultra | 76.7% | 2026-01 |
| 4 | DeepSeek Math V2 | 73.3% | 2026-01 |
| 5 | Grok 4 | 70.0% | 2026-02 |
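Because the benchmark has exactly 30 problems, the reported percentages correspond to whole-number problem counts. A quick sanity check (helper name and rounding convention are assumptions for illustration):

```python
# Convert leaderboard solve rates back to counts out of 30 problems.

def problems_solved(rate_pct: float, total: int = 30) -> int:
    """Nearest whole number of problems for a reported solve rate."""
    return round(rate_pct / 100.0 * total)

for model, rate in [("GPT-5.2", 83.3), ("Claude Opus 4.6", 80.0),
                    ("Gemini 3 Ultra", 76.7), ("DeepSeek Math V2", 73.3),
                    ("Grok 4", 70.0)]:
    print(f"{model}: {problems_solved(rate)}/30")
```

For example, 83.3% works out to 25/30 problems and 70.0% to 21/30, so adjacent leaderboard positions differ by a single problem.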
Related Math Benchmarks
MATH
The MATH benchmark consists of 12,500 challenging competition mathematics problems from AMC, AIME, and Olympiad competitions. Problems span seven subjects: Prealgebra, Algebra, Number Theory, Counting and Probability, Geometry, Intermediate Algebra, and Precalculus.
Top: GPT-5.2 — 89.6%
GSM8K
Grade School Math 8K is a dataset of 8,500 high-quality, linguistically diverse grade school math word problems. It tests multi-step mathematical reasoning with problems requiring 2-8 steps to solve.
Top: GPT-5.2 — 98.1%
MGSM
Multilingual Grade School Math (MGSM) extends GSM8K to 10 typologically diverse languages including Bengali, Chinese, French, German, Japanese, Russian, Spanish, Swahili, Telugu, and Thai, testing multilingual mathematical reasoning.
Top: GPT-5.2 — 93.7%