Toolformer: Language Models Can Teach Themselves to Use Tools
Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom
Abstract
We introduce Toolformer, a model trained to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction. Toolformer achieves substantially improved zero-shot performance across a variety of downstream tasks, often competitive with much larger models, without sacrificing its core language modeling abilities.
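The paper represents API calls as plain text spliced into the training corpus, e.g. "[Calculator(400 / 1400)]", with the tool's result appended after an arrow once the call is executed. The sketch below illustrates that execute-and-splice step for a single calculator tool; the `safe_eval` helper and the exact marker syntax are illustrative assumptions, not the paper's implementation.

```python
import ast
import operator
import re

# Supported arithmetic operators for the toy calculator tool.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    """Evaluate a basic arithmetic expression without using eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported expression: {expr}")
    return walk(ast.parse(expr, mode="eval"))

# Matches an inline, not-yet-executed calculator call such as "[Calculator(400 / 1400)]".
CALL_PATTERN = re.compile(r"\[Calculator\((.*?)\)\]")

def execute_calls(text: str) -> str:
    """Replace each "[Calculator(expr)]" marker with "[Calculator(expr) -> result]"."""
    def fill(match: re.Match) -> str:
        expr = match.group(1)
        result = round(safe_eval(expr), 2)
        return f"[Calculator({expr}) -> {result}]"
    return CALL_PATTERN.sub(fill, text)

print(execute_calls("Out of 1400 participants, 400 [Calculator(400 / 1400)] passed."))
# -> Out of 1400 participants, 400 [Calculator(400 / 1400) -> 0.29] passed.
```

Because calls and results are ordinary tokens, the same language model that emits the marker can condition on the returned value when predicting the rest of the sentence.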
Key Findings
1. Demonstrated that LLMs can learn to use tools through self-supervised training
2. Showed tool use (calculator, search, calendar, translator) without human annotation
3. Achieved improved factuality and mathematical reasoning through tool integration
4. Used a self-supervised approach where the model decides when and how to call APIs
5. Maintained general language modeling abilities while adding tool competencies
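The self-supervision works by filtering sampled API calls: a call is kept for fine-tuning only if inserting the call and its result reduces the model's loss on the following tokens by more than a threshold, compared to both no call and the call without its result. A minimal sketch of that criterion, assuming a hypothetical `loss(prefix, continuation)` function standing in for the language model's weighted cross-entropy:

```python
def keep_api_call(loss, context, call_with_result, call_only, continuation,
                  threshold=1.0):
    """Return True if an executed API call is worth keeping as training data.

    loss(prefix, continuation) -> float: LM loss on `continuation` given `prefix`.
    The paper keeps a call when the loss with call+result beats the best of
    (no call at all, call without its result) by at least `threshold`.
    """
    l_with_result = loss(context + call_with_result, continuation)
    l_no_call = loss(context, continuation)
    l_call_no_result = loss(context + call_only, continuation)
    # Keep only calls whose *result* genuinely helps prediction.
    return min(l_no_call, l_call_no_result) - l_with_result >= threshold
```

Calls that survive this filter are written back into the corpus, and the model is fine-tuned on the augmented text, so no human ever labels where a tool should be used.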
Impact & Significance
Toolformer provided an early conceptual and empirical foundation for tool-using AI assistants. It influenced how ChatGPT plugins, Claude tool use, and Gemini function calling were designed and implemented.
Related Papers
The Llama 3 Herd of Models
Meta AI
Qwen2 Technical Report
Alibaba Cloud / Qwen Team
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepSeek AI
The Claude 3 Model Family: Opus, Sonnet, and Haiku
Anthropic