
What Is Encoder-Decoder Architecture?

Definition

The encoder-decoder architecture is a neural network design pattern where an encoder processes an input sequence into a compressed internal representation, and a decoder generates an output sequence from that representation, enabling sequence-to-sequence tasks like translation and summarization.

How Encoder-Decoder Architecture Works

The encoder reads the entire input and compresses it into a fixed- or variable-length representation that captures its meaning. The decoder then uses this representation to generate the output one step at a time. In the original Transformer, both the encoder and decoder used self-attention layers, with the decoder also attending to the encoder's output through cross-attention. Models like T5 and BART use the full encoder-decoder architecture, while GPT models are decoder-only and BERT is encoder-only. Each variant suits different tasks: encoder-decoder for translation, decoder-only for text generation, and encoder-only for understanding and classification.
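The encode-then-generate loop described above can be sketched in a few lines. This is a deliberately toy illustration, not a real model: the "encoder" just averages random token embeddings into one context vector, and the "decoder" greedily emits one token per step from that context plus the previously generated token. All names, sizes, and parameters here are hypothetical and untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 10   # toy vocabulary size (hypothetical)
DIM = 8      # toy embedding/hidden size (hypothetical)

# Random stand-in parameters; a real model learns these during training.
embed = rng.normal(size=(VOCAB, DIM))
out_proj = rng.normal(size=(DIM, VOCAB))

def encode(tokens):
    """Encoder: compress the whole input sequence into a single
    context vector (here, simply the mean of token embeddings)."""
    return embed[tokens].mean(axis=0)

def decode(context, max_len=5, bos=0):
    """Decoder: generate the output one step at a time, conditioning
    on the encoder's context plus the last generated token."""
    out, prev = [], bos
    for _ in range(max_len):
        state = context + embed[prev]   # combine context with last token
        logits = state @ out_proj       # score every vocabulary entry
        prev = int(np.argmax(logits))   # greedy choice of next token
        out.append(prev)
    return out

src = [3, 1, 4, 1, 5]        # hypothetical input token ids
print(decode(encode(src)))   # list of 5 output token ids
```

A real sequence-to-sequence model replaces the mean with attention layers and the greedy loop with learned cross-attention over the encoder's full output, but the division of labor is the same: encode once, then decode step by step.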

Real-World Examples

1. Google Translate using an encoder-decoder model to encode an English sentence and decode it into French
2. T5 using its encoder to understand a long document and its decoder to generate a concise summary
3. A speech-to-text system encoding audio into a representation and decoding it into written text
