LLM · January 28, 2022 · Google Brain

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou

Abstract

We explore how generating a chain of thought — a series of intermediate reasoning steps — significantly improves the ability of large language models to perform complex reasoning. We show that chain-of-thought prompting substantially outperforms standard prompting on arithmetic, commonsense, and symbolic reasoning benchmarks, with improvements most dramatic in the largest models.

Key Findings

  1. Demonstrated that chain-of-thought prompting dramatically improves LLM reasoning
  2. Showed that providing step-by-step reasoning examples unlocks emergent capabilities
  3. Achieved state-of-the-art results on the GSM8K math word-problem benchmark with prompting alone
  4. Found that chain-of-thought reasoning is an emergent ability, appearing primarily in large models
  5. Required no fine-tuning, only changes to the prompt format
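Since the technique changes only the prompt format, it can be illustrated with plain string construction. The sketch below (function and variable names are ours, not the paper's) contrasts a standard few-shot exemplar, whose answer is given directly, with a chain-of-thought exemplar, whose answer spells out the intermediate reasoning steps first:

```python
# Minimal sketch of chain-of-thought prompt construction.
# Names (build_prompt, exemplar variables) are illustrative, not from the paper.

def build_prompt(exemplars, question):
    """Concatenate few-shot Q/A exemplars and the target question into one prompt."""
    parts = [f"Q: {q}\nA: {a}" for q, a in exemplars]
    parts.append(f"Q: {question}\nA:")  # the model completes this final answer
    return "\n\n".join(parts)

exemplar_question = (
    "Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?"
)

# Standard prompting: the exemplar gives only the final answer.
standard_exemplar = (exemplar_question, "The answer is 11.")

# Chain-of-thought prompting: the exemplar's answer walks through the reasoning.
cot_exemplar = (
    exemplar_question,
    "Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. "
    "5 + 6 = 11. The answer is 11.",
)

target_question = (
    "The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?"
)

standard_prompt = build_prompt([standard_exemplar], target_question)
cot_prompt = build_prompt([cot_exemplar], target_question)
```

Both prompts end with an unanswered `A:`; the only difference is whether the exemplar models step-by-step reasoning, which is what nudges a sufficiently large model to produce its own reasoning chain before the final answer.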

Impact & Significance

Chain-of-thought prompting became one of the most widely used techniques in prompt engineering and influenced how models like GPT-4 and Claude are designed to reason through complex problems step by step.
