Code · Est. 2023

Codeforces Benchmark

The Codeforces Benchmark evaluates AI models on competitive programming problems from Codeforces, one of the world's largest competitive programming communities. Problems range from beginner to expert difficulty and require algorithmic thinking, knowledge of data structures, and efficient implementation. Models are rated with the same Elo-style system used for human competitors, so a model's rating can be compared directly with those of human programmers.
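To give a sense of what an Elo-style rating means, here is a minimal sketch of the classic Elo expected-score and update formulas. It is an illustration only: the rating formula Codeforces actually uses differs in its details, and the function names and the K-factor of 32 are assumptions made for this example, not part of the benchmark.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Classic Elo estimate of the probability that A outperforms B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating: float, expected: float, actual: float,
                   k: float = 32.0) -> float:
    """Move a rating toward the observed result.

    `actual` is 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    `k` (hypothetical default) controls how far one result moves the rating.
    """
    return rating + k * (actual - expected)


if __name__ == "__main__":
    # Two equally rated competitors: each is expected to win half the time.
    print(expected_score(1500, 1500))
    # A 400-point gap corresponds to roughly 10:1 odds in the classic formula.
    print(round(expected_score(1900, 1500), 3))
    # An evenly matched win moves the winner up by k / 2.
    print(updated_rating(1500, 0.5, 1.0))
```

Under this classic formula, a 400-point rating gap corresponds to about a 91% expected score for the higher-rated competitor, which is why ratings on the same scale can be compared between models and humans.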

Metrics

Elo rating (competitive programming scale)

Created By

Mike Mirzayanov (Codeforces)

Top Model Scores

Rank  Model            Score  Date
1     Claude Opus 4.6  1892   2026-02
2     GPT-5.2          1856   2026-03
3     DeepSeek-V4      1798   2026-01
4     Gemini 3 Ultra   1752   2026-01
5     Grok 4           1689   2026-02