What Is Catastrophic Forgetting?
Catastrophic forgetting is a phenomenon in neural networks where learning new information causes the model to abruptly lose or significantly degrade its performance on previously learned tasks, because new training overwrites the weights that stored earlier knowledge.
How Catastrophic Forgetting Works
When a neural network is trained on a new task, the gradient updates that help it learn the new task can overwrite the weights that were important for previous tasks. The result is catastrophic forgetting: the model performs well on the new task but "forgets" what it learned before. For example, fine-tuning a general language model on medical text might make it excellent at medical tasks but worse at general conversation. This is a fundamental challenge for continual learning systems that must keep learning over time. Common mitigation strategies include elastic weight consolidation (penalizing changes to weights that were important for old tasks), rehearsal (mixing old data back into new training), progressive networks (adding fresh capacity for each new task), and LoRA (freezing the original weights and training small adapter matrices instead).
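The mechanism is easy to reproduce in miniature. The sketch below is illustrative only (the two tasks, data sizes, and hyperparameters are invented): it trains a linear model on task A with plain gradient descent, then continues training on task B alone, and measures how the task A error degrades.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two toy "tasks": linear regression with very different true weights.
X_a = rng.normal(size=(100, 5))
y_a = X_a @ np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X_b = rng.normal(size=(100, 5))
y_b = X_b @ np.array([-5.0, -4.0, -3.0, -2.0, -1.0])

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

def sgd(w, X, y, lr=0.05, epochs=200):
    # Full-batch gradient descent on squared error.
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

w = np.zeros(5)
w = sgd(w, X_a, y_a)                 # learn task A first
loss_a_before = mse(w, X_a, y_a)
w = sgd(w, X_b, y_b)                 # then train on task B only
loss_a_after = mse(w, X_a, y_a)
print(loss_a_before, loss_a_after)   # task A error jumps after training on B
```

Nothing about task B's updates "knows" which weights mattered for task A, so gradient descent freely moves them; that is exactly the overwriting described above.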
Real-World Examples
A chatbot fine-tuned on customer support data suddenly becoming unable to perform general knowledge tasks it could handle before
A multilingual model fine-tuned exclusively on Japanese text losing its English language capabilities
A researcher using elastic weight consolidation to fine-tune a model on new data while preserving performance on the original tasks
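The elastic weight consolidation approach in the last example can be sketched as a regularizer added to the new-task loss, of the form (lambda/2) * sum_i F_i * (w_i - w_star_i)^2, where w_star holds the weights learned on the old task and F_i is a diagonal Fisher-information estimate of each weight's importance. The toy below uses invented data, and the per-feature second moment of the old task's inputs stands in for the Fisher diagonal, which is a simplifying assumption for this least-squares setting, not the general recipe:

```python
import numpy as np

rng = np.random.default_rng(1)

# Old task A (we only keep its inputs and learned weights) and new task B.
X_a = rng.normal(size=(200, 4))
w_a = np.array([2.0, -1.0, 0.5, 3.0])        # weights learned on task A
X_b = rng.normal(size=(200, 4))
y_b = X_b @ np.array([-2.0, 1.0, -0.5, -3.0])

# Crude diagonal "importance" estimate for task A's weights:
# the per-feature second moment of task A's inputs (a stand-in for
# the diagonal Fisher used by EWC).
fisher = np.mean(X_a ** 2, axis=0)

def train_on_b(w, lam, lr=0.05, epochs=300):
    for _ in range(epochs):
        grad_b = 2 * X_b.T @ (X_b @ w - y_b) / len(y_b)  # new-task gradient
        grad_ewc = lam * fisher * (w - w_a)              # pull toward old weights
        w = w - lr * (grad_b + grad_ewc)
    return w

# How far do the weights drift from their task A values?
drift_plain = np.linalg.norm(train_on_b(w_a.copy(), lam=0.0) - w_a)
drift_ewc = np.linalg.norm(train_on_b(w_a.copy(), lam=10.0) - w_a)
print(drift_plain, drift_ewc)  # EWC keeps weights much closer to w_a
```

The penalty makes moving an "important" weight expensive in proportion to its estimated Fisher value, so training on task B settles on a compromise between the two tasks instead of fully overwriting the old solution.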