Language · Est. 2019

Natural Questions

Natural Questions is a question answering benchmark built from real queries issued to Google Search. Each question is paired with a Wikipedia page, from which annotators mark a long answer (typically a paragraph) and a short answer (an entity or short phrase) when the page contains one. The task tests both retrieval and comprehension.

Metrics

F1 score on Google Search questions
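
The page does not say how the F1 score is computed. As an illustration only, the sketch below assumes a token-overlap F1 between a predicted short answer and a reference answer, in the style commonly used for extractive question answering. The function names (`normalize`, `token_f1`, `best_f1`) and the normalization rules are hypothetical, and the scoring behind the numbers on this page may differ.

```python
from collections import Counter
import re
import string


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace.

    These normalization rules are an assumption, not taken from the benchmark.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred = normalize(prediction)
    ref = normalize(reference)
    if not pred or not ref:
        # Two empty answers count as a match; one empty answer is a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction: str, references: list[str]) -> float:
    """Score against several acceptable answers and keep the best match."""
    return max(token_f1(prediction, ref) for ref in references)
```

For example, under these assumptions `token_f1("Paris France", "Paris")` has precision 1/2 and recall 1, giving an F1 of 2/3. A benchmark-level score would then be the mean of `best_f1` over all questions, scaled to 0-100.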

Created By

Google Research

Top Model Scores

Rank  Model            Score  Date
1     GPT-5.2          78.3   2026-03
2     Gemini 3 Ultra   77.6   2026-01
3     Claude Opus 4.6  76.8   2026-02
4     Grok 4           74.2   2026-02
5     Llama 4 405B     71.5   2026-01