Efficiency · December 1, 2023 · Carnegie Mellon / Princeton
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Albert Gu, Tri Dao
Abstract
We introduce Mamba, a new architecture for sequence modeling based on structured state space models (SSMs) with a selection mechanism. Mamba achieves performance comparable to Transformers while scaling linearly with sequence length instead of quadratically. On language modeling, Mamba matches or exceeds Transformers of the same size while being 5x faster at inference.
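To make the abstract concrete, here is a minimal sketch of the core idea: a diagonal state space model whose step size, B, and C are computed from the current input (the "selection" mechanism), scanned sequentially so the cost is linear in sequence length. All parameter names and shapes here are illustrative assumptions, not Mamba's actual layer layout.

```python
import numpy as np

def selective_ssm_scan(x, A, W_B, W_C, W_dt):
    """Sequential scan of a toy diagonal selective SSM.

    x:    (T, D) input sequence
    A:    (D, N) diagonal state matrix (entries should be negative for stability)
    W_B, W_C: (D, N) illustrative projections making B_t, C_t input-dependent
    W_dt: (D,) illustrative projection for the input-dependent step size
    """
    T, D = x.shape
    h = np.zeros_like(A)          # hidden state, (D, N)
    y = np.zeros((T, D))
    for t in range(T):
        # Selection: step size, B, and C are functions of the current input,
        # so the model can decide what to remember or ignore per token.
        dt = np.log1p(np.exp(x[t] * W_dt))   # softplus -> positive step, (D,)
        B_t = x[t][:, None] * W_B            # (D, N)
        C_t = x[t][:, None] * W_C            # (D, N)
        # Zero-order-hold style discretization of the continuous-time SSM.
        A_bar = np.exp(dt[:, None] * A)      # (D, N)
        B_bar = dt[:, None] * B_t            # first-order approximation
        # Linear recurrence: O(D * N) per step, hence O(T) in sequence length.
        h = A_bar * h + B_bar * x[t][:, None]
        y[t] = (h * C_t).sum(axis=1)
    return y
```

Because each step touches only a fixed-size state, inference cost per token is constant, in contrast to attention, whose per-token cost grows with the length of the context.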
Key Findings
- Achieved linear-time sequence modeling, versus the quadratic cost of Transformer attention
- Matched Transformer performance on language modeling benchmarks
- Delivered 5x faster inference than equivalent-size Transformers
- Introduced a selection mechanism for input-dependent state space dynamics
- Demonstrated strong performance on long-sequence tasks without attention
Impact & Significance
Mamba challenged the dominance of Transformer attention for sequence modeling and sparked intense research into SSM-based and hybrid architectures. It demonstrated a viable path to more efficient sequence models at scale.