Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela
Abstract
Large pre-trained language models have been shown to store factual knowledge in their parameters. However, their ability to access and precisely manipulate knowledge is still limited. We explore a general-purpose fine-tuning recipe for retrieval-augmented generation (RAG) — models which combine pre-trained parametric and non-parametric memory for language generation.
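The combination is made concrete by treating the retrieved passage as a latent variable. In the paper's RAG-Sequence variant, a DPR retriever p_η(z|x) scores passages z for the input x, and a BART generator p_θ produces the output y while marginalizing over the top-k retrieved passages (notation follows the paper):

```latex
p(y \mid x) \;\approx\;
  \sum_{z \,\in\, \mathrm{top}\text{-}k\, \left(p_\eta(\cdot \mid x)\right)}
  p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta\!\left(y_i \mid x, z, y_{1:i-1}\right)
```

The RAG-Token variant instead marginalizes per output token, so each generated token can draw on a different retrieved passage.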
Key Findings
- Introduced the RAG architecture, combining retrieval with generation (see the code sketch after this list)
- Demonstrated improved factual accuracy by grounding generation in retrieved documents
- Showed state-of-the-art results on open-domain question answering
- Enabled models to access up-to-date information beyond their training data
- Provided a framework for knowledge-grounded text generation
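A minimal usage sketch of the retrieve-then-generate loop, assuming the facebook/rag-sequence-nq checkpoint released alongside the paper via Hugging Face transformers (the dummy index here stands in for the full Wikipedia passage index, which is very large; the datasets and faiss-cpu packages are also required):

```python
from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration

# Load the tokenizer, the DPR-based retriever, and the BART-based generator.
# use_dummy_dataset=True swaps in a tiny toy index for illustration only;
# real use points the retriever at the full wiki_dpr index.
tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)
model = RagSequenceForGeneration.from_pretrained(
    "facebook/rag-sequence-nq", retriever=retriever
)

# Encode a question; the model retrieves top-k passages, conditions the
# generator on each (question, passage) pair, and marginalizes over them.
inputs = tokenizer("who wrote the origin of species", return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

Because the retriever and generator are both differentiable, the whole pipeline is fine-tuned end to end on downstream tasks rather than trained with separate retrieval supervision.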
Impact & Significance
RAG became the dominant pattern for building production AI systems that need accurate, up-to-date information: because knowledge lives in a retrieval index rather than in model weights, the index can be refreshed without retraining the model. It is now a standard approach in enterprise search, customer-support bots, and knowledge-management systems.