General · Est. 2024

LiveBench

LiveBench is a continuously updated benchmark that limits test-set contamination by adding new questions every month. It covers six categories (math, coding, reasoning, language, instruction following, and data analysis), and every question has an objective, verifiable answer.

Metrics

Average accuracy (%) across 6 categories
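
As a rough illustration of this metric, the sketch below averages six per-category accuracies into one score. It assumes an unweighted mean over the categories, which this page does not state explicitly. The function name, the category keys, and the example numbers are hypothetical placeholders, not real LiveBench results.

```python
from statistics import mean

# Category keys are illustrative names for the six areas listed above.
CATEGORIES = (
    "math",
    "coding",
    "reasoning",
    "language",
    "instruction_following",
    "data_analysis",
)


def average_accuracy(category_accuracy: dict[str, float]) -> float:
    """Return the unweighted mean accuracy (%) over the six categories.

    Assumption: every category counts equally toward the overall score.
    """
    missing = [c for c in CATEGORIES if c not in category_accuracy]
    if missing:
        raise ValueError(f"missing category scores: {missing}")
    return mean(category_accuracy[c] for c in CATEGORIES)


# Made-up per-category accuracies (%), for illustration only.
example_scores = {
    "math": 80.0,
    "coding": 70.0,
    "reasoning": 60.0,
    "language": 90.0,
    "instruction_following": 85.0,
    "data_analysis": 65.0,
}

print(round(average_accuracy(example_scores), 1))  # prints 75.0
```

Under this assumption a model cannot hide a weak category behind a strong one with more questions, since each category contributes one sixth of the total.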

Created By

Abacus.AI

Top Model Scores

Rank  Model            Score  Date
1     GPT-5.2          82.6%  2026-03
2     Claude Opus 4.6  81.3%  2026-02
3     Gemini 3 Ultra   79.8%  2026-01
4     Grok 4           77.4%  2026-02
5     Llama 4 405B     73.9%  2026-01