What Is FLOPS (Floating Point Operations Per Second)?
FLOPS (Floating Point Operations Per Second) is a unit of computing performance that counts how many floating-point arithmetic operations a processor can execute each second. It is the primary metric for comparing the capabilities of AI training and inference hardware.
How FLOPS (Floating Point Operations Per Second) Works
Neural network training and inference consist primarily of floating-point operations: multiplications and additions on decimal numbers inside matrix computations. FLOPS measures how many of these operations hardware can perform per second. Modern AI GPUs operate in the teraFLOPS (10^12) to petaFLOPS (10^15) range; an NVIDIA H100 delivers about 1,979 teraFLOPS of FP8 performance. A related quantity, usually written FLOPs (a total count, not a rate), is the number of floating-point operations a training run performs in total, and it is a useful metric for comparing model scales: GPT-3 required approximately 3.14 x 10^23 FLOPs to train, while GPT-4 is estimated to have required 10-100x more. Understanding both numbers helps estimate training time, hardware costs, and model scaling behavior.
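The total-compute figures above can be approximated with the widely used 6ND rule of thumb: a dense transformer performs roughly 6 floating-point operations per parameter per training token. As a sketch (the parameter and token counts below are published estimates for GPT-3, not exact values):

```python
def training_flops(params: float, tokens: float) -> float:
    """Approximate total floating-point operations (FLOPs) to train
    a dense model, using the ~6 * parameters * tokens rule of thumb."""
    return 6 * params * tokens

# GPT-3: ~175 billion parameters trained on ~300 billion tokens
gpt3_flops = training_flops(params=175e9, tokens=300e9)
print(f"GPT-3: ~{gpt3_flops:.2e} FLOPs")  # on the order of 3e23
```

Plugging in the GPT-3 estimates gives about 3.15 x 10^23 FLOPs, in line with the figure quoted above.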
Real-World Examples
NVIDIA's H100 GPU delivering approximately 2 petaFLOPS of FP8 AI performance for training workloads
Researchers estimating that training GPT-4 required roughly 2 x 10^25 FLOPs of compute over several months
A team calculating that their training run needs 10^21 FLOPs of total compute and provisioning the right number of GPUs accordingly