Segment Anything
Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, Ross Girshick
Abstract
We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation. Using our efficient model in a data collection loop, we built the largest segmentation dataset to date, with over 1 billion masks on 11 million licensed and privacy-respecting images. The model is designed and trained to be promptable, so it can transfer zero-shot to new image distributions and tasks.
Key Findings
1. Created the largest segmentation dataset to date, with 1 billion masks on 11 million images
2. Introduced a promptable segmentation model (SAM) supporting point, box, and text prompts
3. Achieved strong zero-shot transfer to diverse segmentation tasks
4. Demonstrated a foundation model approach to computer vision segmentation
5. Released the model and dataset openly
Impact & Significance
SAM established the foundation model paradigm for image segmentation, making pixel-level understanding accessible through simple prompts. It transformed image editing tools, medical imaging, autonomous driving, and any application requiring object segmentation.
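As a sketch of what "promptable" means in practice: the officially released `segment-anything` package exposes a `SamPredictor` that takes an image plus prompt arrays (point coordinates and foreground/background labels) and returns candidate masks with confidence scores. The helper below packages a single click into that prompt format; the checkpoint filename refers to the released ViT-H weights, which must be downloaded separately, so the SAM-dependent import is kept inside the function.

```python
import numpy as np

def point_prompt(xy, foreground=True):
    """Package a single click as SAM-style prompt arrays:
    coords of shape (1, 2) and labels of shape (1,), where 1 = foreground."""
    coords = np.asarray([xy], dtype=np.float32)
    labels = np.asarray([1 if foreground else 0], dtype=np.int64)
    return coords, labels

def segment_with_click(image, xy, checkpoint="sam_vit_h_4b8939.pth"):
    """Segment the object under a click. Requires the `segment-anything`
    package and a downloaded SAM checkpoint; imported here so the prompt
    helper above stays dependency-free."""
    from segment_anything import sam_model_registry, SamPredictor

    sam = sam_model_registry["vit_h"](checkpoint=checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(image)  # HxWx3 uint8 RGB array

    coords, labels = point_prompt(xy)
    masks, scores, _ = predictor.predict(point_coords=coords, point_labels=labels)
    return masks[np.argmax(scores)]  # keep the highest-confidence mask
```

Box prompts work the same way via the predictor's `box` argument; the same image embedding is reused across prompts, which is what makes interactive use fast.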