What Is Accuracy (AI Metric)?
Accuracy is a fundamental machine learning metric that measures the proportion of correct predictions a model makes out of the total number of predictions, calculated as (correct predictions) / (total predictions) and often expressed as a percentage.
How Accuracy (AI Metric) Works
Accuracy is the most intuitive evaluation metric: if a model classifies 95 out of 100 images correctly, its accuracy is 95%. However, accuracy can be misleading in many real-world scenarios, particularly with imbalanced datasets. For example, if 99% of transactions are legitimate and only 1% are fraudulent, a model that simply labels everything as 'legitimate' achieves 99% accuracy but catches zero fraud. In such cases, metrics like F1 score, precision, and recall provide a more meaningful picture. Accuracy is most useful when classes are roughly balanced and the cost of different types of errors is similar.
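The calculation above can be sketched in a few lines of plain Python (the labels here are illustrative, not from any real dataset):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Balanced toy example: 4 positives, 4 negatives.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]
print(accuracy(y_true, y_pred))  # 6 of 8 correct -> 0.75
```

Because the classes are roughly balanced here, the 0.75 score is a reasonable summary of the model's behavior.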
Real-World Examples
GPT-4 achieving 86.4% accuracy on the MMLU benchmark across 57 academic subjects
An image classification model achieving 97% accuracy on a balanced dataset of cats vs. dogs
A team discovering that their 99.5% accurate fraud detector is useless because it achieves that score by never flagging any transactions
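The fraud-detection failure mode above can be reproduced with a small sketch, assuming a hypothetical dataset of 990 legitimate and 10 fraudulent transactions and a degenerate model that flags nothing:

```python
# Hypothetical imbalanced dataset: 990 legitimate (0), 10 fraudulent (1).
y_true = [0] * 990 + [1] * 10

# A degenerate model that never flags fraud.
y_pred = [0] * 1000

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_pos / sum(y_true)  # fraction of actual fraud caught

print(f"accuracy = {accuracy:.1%}")  # 99.0%
print(f"recall   = {recall:.1%}")    # 0.0%
```

Accuracy alone reports 99%, while recall exposes that the model catches none of the fraud, which is why imbalanced problems call for precision, recall, or F1 alongside accuracy.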