Toolformer: Language Models Can Teach Themselves to Use Tools
Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom
Abstract
We introduce Toolformer, a model trained to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction. Toolformer achieves substantially improved zero-shot performance across a variety of downstream tasks, often competitive with much larger models, without sacrificing its core language modeling abilities.
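The paper represents API calls as plain text spliced into the training corpus, e.g. "[Calculator(400 / 1400)]", with the tool's result appended after an arrow once the call is executed. The sketch below illustrates that execute-and-splice step for a single calculator tool; the `safe_eval` helper and the exact marker syntax are illustrative assumptions, not the paper's implementation.

```python
import ast
import operator
import re

# Supported arithmetic operators for the toy calculator tool.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    """Evaluate a basic arithmetic expression without using eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported expression: {expr}")
    return walk(ast.parse(expr, mode="eval"))

# Matches an inline, not-yet-executed calculator call such as "[Calculator(400 / 1400)]".
CALL_PATTERN = re.compile(r"\[Calculator\((.*?)\)\]")

def execute_calls(text: str) -> str:
    """Replace each "[Calculator(expr)]" marker with "[Calculator(expr) -> result]"."""
    def fill(match: re.Match) -> str:
        expr = match.group(1)
        result = round(safe_eval(expr), 2)
        return f"[Calculator({expr}) -> {result}]"
    return CALL_PATTERN.sub(fill, text)

print(execute_calls("Out of 1400 participants, 400 [Calculator(400 / 1400)] passed."))
# -> Out of 1400 participants, 400 [Calculator(400 / 1400) -> 0.29] passed.
```

Because calls and results are ordinary tokens, the same language model that emits the marker can condition on the returned value when predicting the rest of the sentence.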
Key Findings
1. Demonstrated that LLMs can learn to use tools through self-supervised training
2. Showed tool use (calculator, search, calendar, translator) without human annotation
3. Achieved improved factuality and mathematical reasoning through tool integration
4. Used a self-supervised approach where the model decides when and how to call APIs
5. Maintained general language modeling abilities while adding tool competencies
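The self-supervision works by filtering sampled API calls: a call is kept for fine-tuning only if inserting the call and its result reduces the model's loss on the following tokens by more than a threshold, compared to both no call and the call without its result. A minimal sketch of that criterion, assuming a hypothetical `loss(prefix, continuation)` function standing in for the language model's weighted cross-entropy:

```python
def keep_api_call(loss, context, call_with_result, call_only, continuation,
                  threshold=1.0):
    """Return True if an executed API call is worth keeping as training data.

    loss(prefix, continuation) -> float: LM loss on `continuation` given `prefix`.
    The paper keeps a call when the loss with call+result beats the best of
    (no call at all, call without its result) by at least `threshold`.
    """
    l_with_result = loss(context + call_with_result, continuation)
    l_no_call = loss(context, continuation)
    l_call_no_result = loss(context + call_only, continuation)
    # Keep only calls whose *result* genuinely helps prediction.
    return min(l_no_call, l_call_no_result) - l_with_result >= threshold
```

Calls that survive this filter are written back into the corpus, and the model is fine-tuned on the augmented text, so no human ever labels where a tool should be used.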
Impact & Significance
Toolformer provided an early conceptual and empirical foundation for tool-using AI assistants. It influenced how ChatGPT plugins, Claude tool use, and Gemini function calling were designed and implemented.
Related Papers
The Llama 3 Herd of Models
Meta AI
Qwen2 Technical Report
Alibaba Cloud / Qwen Team
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek AI
The Claude 3 Model Family: Opus, Sonnet, and Haiku
Anthropic