Self-Instruct: Aligning Language Models with Self-Generated Instructions
Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi
Abstract
We introduce Self-Instruct, a framework for improving the instruction-following capabilities of pretrained language models by bootstrapping off their own generations. Our pipeline generates instruction, input, and output samples from a language model, then uses them to fine-tune the original model. Applying Self-Instruct to GPT-3 leads to a 33% absolute improvement over the original model on Super-NaturalInstructions.
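The bootstrapping loop described above can be sketched in a few lines. This is a toy illustration, not the paper's implementation: `model_generate` is a hypothetical stand-in for prompting a pretrained LM with in-context examples, and `difflib` string similarity substitutes for the ROUGE-L novelty filter used in the actual pipeline.

```python
import random
from difflib import SequenceMatcher

# Seed instruction pool (the real pipeline starts from 175 human-written seed tasks).
SEED_TASKS = [
    "Write a short poem about autumn.",
    "Translate the sentence into French.",
    "List three uses of a paperclip.",
]

def model_generate(prompt_examples):
    """Hypothetical LM call: the real method samples new instructions from GPT-3
    conditioned on in-context examples. Here we fake a variation for demo purposes."""
    base = random.choice(prompt_examples)
    return base.replace("three", "five")

def too_similar(candidate, pool, threshold=0.7):
    """Novelty filter: reject candidates too close to any existing instruction.
    (Stand-in for the ROUGE-L overlap check in the paper.)"""
    return any(SequenceMatcher(None, candidate, t).ratio() > threshold
               for t in pool)

def self_instruct(seed_pool, rounds=10, prompt_size=3):
    """Bootstrap loop: sample in-context examples from the pool, generate a
    candidate instruction, keep it only if it is sufficiently novel."""
    pool = list(seed_pool)
    for _ in range(rounds):
        examples = random.sample(pool, min(prompt_size, len(pool)))
        candidate = model_generate(examples)
        if not too_similar(candidate, pool):
            pool.append(candidate)
    return pool

tasks = self_instruct(SEED_TASKS)
```

The grown pool would then be paired with generated inputs and outputs and used to fine-tune the same model, closing the loop.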
Key Findings
1. Bootstrapped instruction-following data from the model itself
2. Achieved a 33% improvement on instruction-following without human annotation
3. Demonstrated a scalable approach to alignment data generation
4. Generated 52K instruction-following examples for fine-tuning
5. Influenced how open-source models generate training data
Impact & Significance
Self-Instruct enabled the creation of instruction-tuned models without expensive human annotation, directly inspiring Stanford Alpaca and the wave of self-instruct fine-tuned open-source models.
Related Papers
The Llama 3 Herd of Models
Meta AI
Qwen2 Technical Report
Alibaba Cloud / Qwen Team
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek AI
The Claude 3 Model Family: Opus, Sonnet, and Haiku
Anthropic