
What Is LoRA (Low-Rank Adaptation)?

Definition

LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning technique that adapts pre-trained AI models by injecting small, trainable low-rank matrices into the model's layers, enabling customization without modifying the original model weights.

How LoRA (Low-Rank Adaptation) Works

Fine-tuning a large language model normally requires updating billions of parameters, which demands enormous GPU memory and compute. LoRA dramatically reduces these requirements by freezing the original model weights and adding small trainable matrices (typically less than 1% of the original parameters) to specific layers. Concretely, for each adapted weight matrix W of shape d×k, LoRA learns an update ΔW = BA, where B is d×r and A is r×k with rank r ≪ min(d, k), so only r(d + k) parameters are trained instead of d·k. These adapters capture task-specific knowledge while the base model remains unchanged. Multiple LoRA adapters can be swapped in and out of the same base model, enabling a single model to serve many different specialized tasks. LoRA has become a go-to method for fine-tuning open-source models like LLaMA and Stable Diffusion.
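The idea above can be sketched in a few lines of NumPy. This is a minimal illustration of the forward pass and the parameter savings, not a training loop; the dimensions, rank, and scaling factor `alpha` are illustrative values chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 512, 512, 8            # layer dimensions and LoRA rank (r << d, k)
alpha = 16                       # LoRA scaling factor (illustrative value)

W = rng.standard_normal((d, k))          # frozen pretrained weight (not trained)
A = rng.standard_normal((r, k)) * 0.01   # trainable, small random init
B = np.zeros((d, r))                     # trainable, zero init: no change at start

def forward(x, W, A, B):
    # y = x W^T + (alpha / r) * x (B A)^T  -- base output plus low-rank update
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.standard_normal((4, k))
y0 = forward(x, W, A, B)
# Because B starts at zero, the adapted model matches the base model exactly
assert np.allclose(y0, x @ W.T)

# Parameter count: trainable LoRA params vs. full fine-tuning of this layer
full_params = d * k              # 262,144
lora_params = r * (d + k)        # 8,192 -> ~3% of the full count here
```

Zero-initializing B is the standard trick that makes the adapter a no-op at the start of training, so fine-tuning begins exactly from the pretrained model's behavior.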

Real-World Examples

1. A developer fine-tuning LLaMA-70B with LoRA on a single consumer GPU instead of requiring a cluster of A100s

2. An artist training a LoRA adapter on Stable Diffusion to generate images in their specific art style

3. A company maintaining multiple LoRA adapters for different departments that all share the same base model
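The third example, one base model serving many tasks, can be sketched as follows. This is a hypothetical illustration: each adapter is just a pair of small matrices, and the task names and dimensions are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, r, alpha = 64, 64, 4, 8    # illustrative dimensions and rank

W_base = rng.standard_normal((d, k))   # one frozen base weight, shared by all tasks

# Hypothetical per-task adapters: each is only a (B, A) pair of small matrices,
# cheap to store and to swap in and out
adapters = {
    "legal":   (rng.standard_normal((d, r)) * 0.01, rng.standard_normal((r, k)) * 0.01),
    "support": (rng.standard_normal((d, r)) * 0.01, rng.standard_normal((r, k)) * 0.01),
}

def effective_weight(task):
    # Merge the chosen task's low-rank update into the shared base on the fly
    B, A = adapters[task]
    return W_base + (alpha / r) * (B @ A)

x = rng.standard_normal((1, k))
y_legal = x @ effective_weight("legal").T
y_support = x @ effective_weight("support").T
# Same base model, different behavior depending on which adapter is active
assert not np.allclose(y_legal, y_support)
```

Because each adapter holds only r(d + k) numbers, storing one adapter per department costs a tiny fraction of duplicating the full model for each.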

LoRA (Low-Rank Adaptation) on Vincony

Vincony's fine-tuning capabilities support parameter-efficient methods like LoRA, enabling users to customize AI models for specific tasks without massive compute costs.

