
What Is TPU (Tensor Processing Unit)?

Definition

A TPU (Tensor Processing Unit) is a custom-designed AI accelerator chip developed by Google specifically for machine learning workloads, optimized for the matrix and tensor operations that form the core computations in neural network training and inference.

How TPU (Tensor Processing Unit) Works

While GPUs (originally designed for graphics) have become the dominant AI hardware, Google developed TPUs as purpose-built alternatives optimized exclusively for machine learning. At the heart of each TPU is a systolic array of multiply-accumulate units (the matrix multiply unit, or MXU), which performs the matrix multiplications that neural networks rely on with high throughput and energy efficiency. TPUs pair this compute with large amounts of high-bandwidth memory, and they are interconnected into pods of thousands of chips for distributed training. Google uses TPUs to train and serve its own models (Gemini, PaLM) and offers them to external customers through Google Cloud. TPUs compete with NVIDIA GPUs, and the choice between them often depends on the specific workload, software ecosystem, and cost.
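The matrix-heavy computation described above can be sketched with JAX, Google's numerical library that compiles programs to TPUs through the XLA compiler. This minimal example runs on whatever backend is available (CPU, GPU, or TPU); the layer sizes are illustrative, not tied to any real model.

```python
import jax
import jax.numpy as jnp

# On a Cloud TPU VM this lists TpuDevice entries; on a laptop, a CpuDevice.
print(jax.devices())

@jax.jit  # XLA-compile the function; on TPU hardware the matmul lowers to the MXU
def dense_layer(x, w):
    # A single dense layer: matrix multiply followed by ReLU --
    # exactly the kind of tensor operation TPUs are built to accelerate.
    return jax.nn.relu(x @ w)

x = jax.random.normal(jax.random.PRNGKey(0), (128, 512))  # batch of activations
w = jax.random.normal(jax.random.PRNGKey(1), (512, 256))  # weight matrix
out = dense_layer(x, w)
print(out.shape)  # (128, 256)
```

The same code runs unchanged on all three backends; only the `jax.devices()` output differs, which is why JAX is a common entry point for experimenting with Cloud TPUs.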

Real-World Examples

1. Google training Gemini on TPU v5p pods containing thousands of interconnected chips

2. Researchers using Google Cloud TPUs to train large language models at lower cost than equivalent GPU clusters

3. A startup using TPU v4 chips on Google Cloud to accelerate training of its computer vision model by 3x compared to GPUs
