AIME 2024
The American Invitational Mathematics Examination (AIME) is a high-school competition administered by the MAA; the 2024 edition comprises 30 problems (15 each from AIME I and AIME II) that test advanced mathematical problem-solving. These competition problems require creative mathematical thinking and are widely used to evaluate frontier model math capabilities.
Metrics
Solve rate (%) on the 30 AIME 2024 problems. Each AIME answer is an integer from 0 to 999, so responses are scored by exact match.
Created By
MAA (Mathematical Association of America)
Paper
Website
Top Model Scores
| Rank | Model | Score | Date |
|---|---|---|---|
| 1 | GPT-5.2 | 83.3% | 2026-03 |
| 2 | Claude Opus 4.6 | 80.0% | 2026-02 |
| 3 | Gemini 3 Ultra | 76.7% | 2026-01 |
| 4 | DeepSeek Math V2 | 73.3% | 2026-01 |
| 5 | Grok 4 | 70.0% | 2026-02 |
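Because the benchmark has exactly 30 problems, the reported percentages correspond to whole-number problem counts. A quick sanity check (helper name and rounding convention are assumptions for illustration):

```python
# Convert leaderboard solve rates back to counts out of 30 problems.

def problems_solved(rate_pct: float, total: int = 30) -> int:
    """Nearest whole number of problems for a reported solve rate."""
    return round(rate_pct / 100.0 * total)

for model, rate in [("GPT-5.2", 83.3), ("Claude Opus 4.6", 80.0),
                    ("Gemini 3 Ultra", 76.7), ("DeepSeek Math V2", 73.3),
                    ("Grok 4", 70.0)]:
    print(f"{model}: {problems_solved(rate)}/30")
```

For example, 83.3% works out to 25/30 problems and 70.0% to 21/30, so adjacent leaderboard positions differ by a single problem.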
Related Math Benchmarks
MATH
The MATH benchmark consists of 12,500 challenging competition mathematics problems from AMC, AIME, and Olympiad competitions. Problems span seven subjects: Prealgebra, Algebra, Number Theory, Counting and Probability, Geometry, Intermediate Algebra, and Precalculus.
Top: GPT-5.2 — 89.6%
GSM8K
Grade School Math 8K is a dataset of 8,500 high-quality, linguistically diverse grade school math word problems. It tests multi-step mathematical reasoning with problems requiring 2-8 steps to solve.
Top: GPT-5.2 — 98.1%
MGSM
Multilingual Grade School Math (MGSM) extends GSM8K to 10 typologically diverse languages including Bengali, Chinese, French, German, Japanese, Russian, Spanish, Swahili, Telugu, and Thai, testing multilingual mathematical reasoning.
Top: GPT-5.2 — 93.7%