LLM · January 28, 2022 · Google Brain
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou
Abstract
We explore how generating a chain of thought — a series of intermediate reasoning steps — significantly improves the ability of large language models to perform complex reasoning. We show that chain-of-thought prompting substantially outperforms standard prompting on arithmetic, commonsense, and symbolic reasoning benchmarks, with improvements most dramatic in the largest models.
Key Findings
1. Demonstrated that chain-of-thought prompting dramatically improves LLM reasoning
2. Showed that providing step-by-step reasoning examples unlocks emergent capabilities
3. Achieved state-of-the-art on the GSM8K math benchmark with prompting alone
4. Found that chain-of-thought is an emergent ability appearing primarily in large models
5. Required no fine-tuning, only changes to the prompt format
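Since the technique requires only a change to the prompt format, it can be sketched in a few lines. The snippet below is a minimal illustration, not code from the paper: it builds a few-shot prompt whose exemplar answer spells out the intermediate reasoning steps (here, the tennis-balls example used in the paper) before stating the final answer, in contrast to standard prompting, where the exemplar would contain only the answer.

```python
# Minimal sketch of chain-of-thought prompting. The exemplar answer walks
# through intermediate reasoning steps before giving the final answer,
# which is the only difference from a standard few-shot prompt.

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11. The answer is 11.\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend the chain-of-thought exemplar to a new question."""
    return f"{COT_EXEMPLAR}\nQ: {question}\nA:"

prompt = build_cot_prompt(
    "The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?"
)
print(prompt)
```

At inference time, the model continues the text after the trailing `A:`, and the exemplar's step-by-step style elicits a similar reasoning trace before the model commits to an answer.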
Impact & Significance
Chain-of-thought prompting became one of the most widely used techniques in prompt engineering and influenced how models like GPT-4 and Claude are designed to reason through complex problems step by step.
Related Papers
LLM · July 23, 2024
The Llama 3 Herd of Models
Meta AI

LLM · July 15, 2024
Qwen2 Technical Report
Alibaba Cloud / Qwen Team

Efficiency · May 7, 2024
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek AI

LLM · March 4, 2024
The Claude 3 Model Family: Opus, Sonnet, and Haiku
Anthropic