Overview
The training module provides tools for fine-tuning diffusion models using LoRA (Low-Rank Adaptation). LoRA is a parameter-efficient fine-tuning technique that trains only a small number of additional weights while keeping the base model frozen.
What is LoRA?
LoRA (Low-Rank Adaptation) adds trainable low-rank matrices to model layers, allowing you to customize a model with minimal computational resources. It’s perfect for:
- Learning specific styles or subjects
- Adapting models to custom datasets
- Training on consumer GPUs
LoRATrainer
The main class for configuring and running LoRA training.
Constructor
Parameters
- Model instance to train (from model.load()).
- LoRA rank - the dimensionality of the low-rank matrices. Lower rank means:
  - Fewer trainable parameters
  - Faster training
  - Less expressive (may not capture complex patterns)
  Typical values:
  - 8 - Very lightweight (0.5-1M parameters)
  - 16 - Good balance (1-2M parameters)
  - 32 - More expressive (2-4M parameters)
  - 64 - High capacity (4-8M parameters)
- LoRA alpha - scaling factor for the LoRA updates. Generally set to 2× rank. Higher alpha = stronger LoRA influence. Formula: scaling = alpha / rank
- Target modules - which model modules to apply LoRA to. If None, automatically targets attention layers. Default: ["to_q", "to_k", "to_v", "to_out.0"] (UNet attention layers). Advanced users can specify custom modules based on model architecture.
- Dropout probability for LoRA layers. Helps prevent overfitting. Typical range: 0.0 - 0.2
Example
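A minimal sketch of constructing a trainer. The import paths, the model identifier, and the alpha/dropout keyword names are assumptions based on the parameter descriptions above; check them against the actual API.

```python
import model                        # assumed: module exposing model.load()
from training import LoRATrainer   # assumed import path for the training module

# Load the base model to fine-tune (the identifier is a placeholder).
base_model = model.load("stable-diffusion-v1-5")

# rank=16 with alpha=32 gives scaling = alpha / rank = 2.0.
trainer = LoRATrainer(
    base_model,
    rank=16,
    alpha=32,       # assumed keyword name for "LoRA alpha"
    dropout=0.05,   # assumed keyword name; typical range is 0.0 - 0.2
)
```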
LoRATrainer.train()
Train the LoRA adapter on a dataset.
Parameters
- Dataset to train on (from dataset.load()).
- Total number of training steps to perform.
- Learning rate for the optimizer. Use a float or "auto" for automatic selection. Recommended values:
  - 1e-4 (0.0001) - Standard starting point
  - 1e-5 (0.00001) - Conservative, slower learning
  - 5e-5 (0.00005) - Good for SDXL
  - 1e-3 (0.001) - Aggressive, may be unstable
- Number of images per training batch. Use "auto" for automatic selection. Larger batches = more stable gradients but more memory usage. Typical values:
  - 1 - Minimal memory usage
  - 2-4 - Good for most GPUs
  - 8+ - High-memory GPUs only
- Accumulate gradients over multiple steps before updating weights. Effective batch size = batch_size × gradient_accumulation_steps. Use this to simulate larger batches without using more memory: batch_size=1, gradient_accumulation_steps=16 → effective batch size of 16.
- Save a checkpoint every N steps. If None, only saves at the end. Must specify output_dir for checkpoints to be saved.
- Directory to save checkpoints to. If None, checkpoints are not saved to disk. Checkpoints are saved in subdirectories named checkpoint-{step}.
- Additional training arguments (reserved for future use).
Returns
Dictionary containing the trained LoRA state dict. This can be saved and loaded later.
Example
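A sketch of a full training run, continuing with the trainer constructed in the earlier example. The dataset path, the steps/save_every keyword names, and the use of torch.save to persist the result are assumptions; the other keyword names come from the parameter descriptions above.

```python
import dataset   # assumed: module exposing dataset.load()
import torch     # assumed: used only to persist the returned state dict

# Load training images (the path is a placeholder).
train_data = dataset.load("./my_images")

# Effective batch size = 1 * 4 = 4; checkpoints land in ./lora_out/checkpoint-{step}.
lora_state_dict = trainer.train(
    train_data,
    steps=1000,                     # assumed keyword name for total training steps
    learning_rate=1e-4,
    batch_size=1,
    gradient_accumulation_steps=4,
    save_every=250,                 # assumed keyword name for the checkpoint interval
    output_dir="./lora_out",
)

# Persist the returned LoRA state dict for later use (assumes torch tensors).
torch.save(lora_state_dict, "./lora_out/lora_weights.pt")
```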
Training Output
During training, you’ll see progress output as the run proceeds.
train_lora()
Convenience function for quick LoRA training without creating a trainer instance.
Parameters
- Model to train
- Dataset to train on
- All arguments from LoRATrainer.train() are supported
Returns
Dictionary containing trained LoRA weights
Example
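A sketch using the convenience function; the import paths, the placeholder identifiers, and the steps keyword name are assumptions.

```python
import model, dataset             # assumed import paths
from training import train_lora   # assumed import path

base_model = model.load("stable-diffusion-v1-5")   # placeholder identifier
train_data = dataset.load("./my_images")           # placeholder path

# Any LoRATrainer.train() argument can be passed through here as well.
lora_state_dict = train_lora(
    base_model,
    train_data,
    steps=1000,           # assumed keyword name
    learning_rate=1e-4,
)
```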
This function creates a LoRATrainer with default settings internally. For more control over training parameters, use LoRATrainer directly.
Training Best Practices
Dataset Size
Small Dataset (10-50 images)
- Use higher rank (32-64)
- More training steps (1500-3000)
- Lower learning rate (5e-5)
Large Dataset (100+ images)
- Use lower rank (8-16)
- Fewer steps (500-1000)
- Standard learning rate (1e-4)
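As a rough sketch, the two regimes might translate into calls like the following, reusing the assumed names from the earlier examples:

```python
# Small dataset (~30 images): higher rank, more steps, lower learning rate.
small_trainer = LoRATrainer(base_model, rank=32)
small_weights = small_trainer.train(train_data, steps=2000, learning_rate=5e-5)

# Large dataset (100+ images): lower rank, fewer steps, standard learning rate.
large_trainer = LoRATrainer(base_model, rank=16)
large_weights = large_trainer.train(train_data, steps=800, learning_rate=1e-4)
```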
Memory Optimization
If you run out of GPU memory:
- Reduce batch size: Start with batch_size=1
- Use gradient accumulation: gradient_accumulation_steps=16 for larger effective batches
- Reduce rank: Try rank=8 instead of rank=16
- Use mixed precision: Models are loaded with float16 by default
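Putting these tips together, a low-memory run might look like this sketch (keyword names as assumed in the earlier examples):

```python
# Low-memory setup: rank 8, batch size 1, and gradient accumulation for an
# effective batch size of 16; the base model is already loaded in float16.
low_mem_trainer = LoRATrainer(base_model, rank=8)
weights = low_mem_trainer.train(
    train_data,
    batch_size=1,
    gradient_accumulation_steps=16,
)
```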
Learning Rate Selection
1. Start Conservative
   Begin with learning_rate=1e-4 for most cases.
2. Monitor Training
   Watch for:
   - Too high: Loss spikes, unstable training
   - Too low: Very slow convergence
3. Adjust
   - Increase if training is too slow
   - Decrease if training is unstable
Training Duration
- Style learning: 500-1000 steps
- Subject learning: 1000-2000 steps
- Complex concepts: 2000-3000+ steps