# Stage 9: Parameter-Efficient Fine-Tuning (PEFT)

Adapting large models without breaking the bank
## Overview
Modern LLMs have billions of parameters. Fine-tuning all of them is:

- Expensive: a 7B model needs ~28GB just to hold the weights in fp32, and training adds gradients and optimizer states on top of that
- Slow: every optimizer step updates billions of parameters
- Wasteful: most parameters don't need to change much

PEFT methods solve this by training only a tiny fraction of the parameters while keeping the rest of the model frozen.

> "Fine-tuning 1% of parameters can achieve 99% of full fine-tuning performance."
## The Key Insight
Research shows that weight updates during fine-tuning have low intrinsic rank. This means:
- The change from pretrained weights to fine-tuned weights can be approximated with far fewer parameters
- We don't need to update 7 billion parameters—a few million carefully placed parameters suffice
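Concretely, instead of learning a full update ΔW with the same shape as W, low-rank methods learn two thin matrices whose product approximates it. A minimal NumPy sketch; the hidden size `d = 4096` and rank `r = 8` are illustrative assumptions, not values fixed by this course:

```python
import numpy as np

d, r = 4096, 8          # hidden size and low-rank dimension (illustrative values)

# A full update delta_W has d x d entries...
full_update_params = d * d                    # 16,777,216

# ...but a rank-r update is the product of two thin matrices.
B = np.random.randn(d, r)                     # d x r
A = np.random.randn(r, d)                     # r x d
delta_W = B @ A                               # still d x d, but rank at most r
low_rank_params = B.size + A.size             # 65,536

print(f"{full_update_params // low_rank_params}x fewer trainable parameters")  # 256x
```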
## Methods We'll Cover

| Method | Key Idea | Trainable Parameters |
|---|---|---|
| LoRA | Low-rank weight updates | ~0.1-1% |
| Adapters | Bottleneck layers | ~1-5% |
| Prefix Tuning | Learned key/value prefixes | ~0.01% |
| Prompt Tuning | Soft input prompts | ~0.001% |
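To make the most popular of these concrete before the full implementation in `code/stage-09/peft.py`, here is a minimal sketch of a LoRA-wrapped linear layer in plain PyTorch. The class name `LoRALinear` and the defaults `r=8`, `alpha=16` are illustrative choices, not the course's canonical implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: y = base(x) + (x A^T) B^T * scale."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)        # freeze the pretrained weight
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: update starts at zero
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

layer = LoRALinear(nn.Linear(4096, 4096))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 65,536 trainable
```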
## Why This Matters
For a 7B-parameter model:

| Method | Trainable Params | GPU Memory (approx.) |
|---|---|---|
| Full fine-tuning | 7B | ~112GB (fp32 weights + gradients + Adam states) |
| LoRA (r=8) | ~4M | ~8GB |
| Prompt tuning | ~80K | ~2GB |
That's the difference between needing data-center hardware and training on a single consumer GPU.
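Where does the ~4M figure for LoRA come from? A sketch of the arithmetic, assuming a LLaMA-7B-like shape (32 layers, hidden size 4096) with r=8 applied only to the query and value projections; these architectural assumptions are illustrative:

```python
# Rough count of LoRA trainable parameters for a 7B-class decoder.
# Assumptions (illustrative): 32 layers, hidden size 4096, r=8,
# LoRA applied to the query and value projections only.
n_layers, d_model, r = 32, 4096, 8
adapted_matrices_per_layer = 2                  # q_proj and v_proj

params_per_matrix = d_model * r + r * d_model   # A (r x d) plus B (d x r)
total = n_layers * adapted_matrices_per_layer * params_per_matrix
print(f"{total:,}")                             # 4,194,304  (~4M, ~0.06% of 7B)
```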
## Learning Objectives
By the end of this stage, you will:
- Understand why PEFT works (the low-rank hypothesis)
- Implement LoRA from scratch
- Implement adapters with bottleneck architecture
- Understand prefix and prompt tuning
- Know when to use each method
## Sections
- The Fine-Tuning Problem - Why full fine-tuning is hard
- LoRA: Low-Rank Adaptation - The most popular PEFT method
- Adapter Layers - Bottleneck modules
- Prefix and Prompt Tuning - Learning soft prompts
- Choosing a Method - Trade-offs and recommendations
- Implementation - Building PEFT from scratch
## Prerequisites
- Understanding of transformer architecture (Stage 6)
- Familiarity with backpropagation (Stage 2)
- Experience with optimization (Stage 4)
## Key Insight
PEFT isn't about approximating full fine-tuning—it's about finding the right subspace for adaptation. Often, this subspace is tiny compared to the full parameter space.
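One way to internalize this: freeze everything, attach a small set of new parameters, and check what fraction of the model is actually trainable. A toy PyTorch sketch (the two-layer stand-in model and the `adapter` attribute are purely illustrative):

```python
import torch.nn as nn

# Toy stand-in for a pretrained model (illustrative sizes).
model = nn.Sequential(nn.Linear(4096, 4096), nn.Linear(4096, 4096))

# Freeze the "pretrained" weights.
for p in model.parameters():
    p.requires_grad_(False)

# Attach a small trainable adaptation module: the subspace we actually train.
model.adapter = nn.Sequential(nn.Linear(4096, 16), nn.Linear(16, 4096))

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable fraction: {trainable / total:.4%}")
```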
Code & Resources¶
| Resource | Description |
|---|---|
code/stage-09/peft.py |
LoRA, Adapters, and Prompt Tuning |
code/stage-09/tests/ |
Test suite |
| Exercises | Practice problems |
| Common Mistakes | Debugging guide |