
What Is LSTM (Long Short-Term Memory)?

Definition

LSTM (Long Short-Term Memory) is a specialized recurrent neural network architecture that uses gated memory cells to selectively remember or forget information over long sequences, solving the vanishing gradient problem that plagued standard RNNs.

How LSTM (Long Short-Term Memory) Works

LSTMs introduce a cell state and three gates — input, forget, and output — that control the flow of information. The forget gate decides what information to discard from the cell state, the input gate decides what new information to store, and the output gate determines what the cell outputs. This gating mechanism allows LSTMs to maintain information over hundreds of time steps, making them effective for tasks requiring long-term memory. LSTMs dominated sequence modeling from their introduction in 1997 until Transformers emerged in 2017, and they remain relevant for time series analysis and real-time processing where Transformer overhead is excessive.
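The gating mechanism described above can be sketched as a single LSTM time step. This is a minimal illustrative implementation in NumPy, not a reference version: the function name `lstm_step` and the convention of stacking the four gates' parameters into single `W`, `U`, and `b` arrays are assumptions made for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative sketch).

    W, U, b hold the stacked parameters for the forget, input,
    candidate, and output gates, in that order.
    """
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b      # pre-activations, shape (4*H,)
    f = sigmoid(z[0:H])             # forget gate: what to discard from the cell state
    i = sigmoid(z[H:2*H])           # input gate: what new information to store
    g = np.tanh(z[2*H:3*H])         # candidate values to be stored
    o = sigmoid(z[3*H:4*H])         # output gate: what the cell exposes
    c = f * c_prev + i * g          # updated cell state
    h = o * np.tanh(c)              # new hidden state / output
    return h, c
```

Processing a sequence means looping `lstm_step` over the time steps while carrying `h` and `c` forward; because the cell state `c` is updated additively, gradients can flow through long sequences without vanishing the way they do in a standard RNN.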

Real-World Examples

1. A speech recognition system using LSTMs to transcribe long audio recordings by maintaining context across many time steps

2. A predictive text keyboard using LSTM to suggest next words based on the entire sentence so far

3. An anomaly detection system using LSTM to learn normal patterns in sensor data and flag unusual readings

Related Terms