Alpaca: A Strong, Replicable Instruction-Following Model
Rohan Taori, Ishaan Gulrajani, Tianyi Zhang, Yann Dubois, Xuechen Li, Carlos Guestrin, Percy Liang, Tatsunori B. Hashimoto
Abstract
We demonstrate that fine-tuning Meta's LLaMA 7B model on 52K instruction-following demonstrations, generated in the style of self-instruct using OpenAI's text-davinci-003 (a GPT-3.5-series model), produces a model that behaves qualitatively similarly to text-davinci-003 itself. Alpaca costs less than $600 to reproduce, making it an accessible starting point for the research community to study instruction-following models.
Key Findings
- Fine-tuned LLaMA 7B on 52K instruction-following examples for under $600
- Produced a model qualitatively similar to text-davinci-003
- Demonstrated that instruction tuning with synthetic data is highly effective
- Released training code, data, and model for the research community
- Showed the viability of low-cost instruction-tuned model creation
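The fine-tuning step above amounts to standard supervised training on formatted (instruction, input, output) records. The sketch below shows how such records can be rendered into prompts; the template wording follows the publicly released Alpaca repository, but treat it as an approximation rather than a verbatim reproduction, and `format_example` is a hypothetical helper name.

```python
# Minimal sketch of Alpaca-style prompt formatting for the 52K
# (instruction, input, output) records used in fine-tuning.
# Template text approximates the one in the public Alpaca repo.

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def format_example(record: dict) -> str:
    """Render one training example; during fine-tuning the model learns
    to continue the prompt with record['output']."""
    if record.get("input"):
        prompt = PROMPT_WITH_INPUT.format(**record)
    else:
        prompt = PROMPT_NO_INPUT.format(instruction=record["instruction"])
    return prompt + record["output"]
```

During training, the loss is computed only on the response tokens, so the model learns to map the templated prompt to the demonstration output.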
Impact & Significance
Alpaca sparked the open-source instruction-tuning revolution, showing that high-quality chatbots could be created cheaply. It led to dozens of follow-up projects (Vicuna, Koala, GPT4All) and democratized LLM fine-tuning research.