Overview

The training module provides tools for fine-tuning diffusion models using LoRA (Low-Rank Adaptation). LoRA is a parameter-efficient fine-tuning technique that trains only a small number of additional weights while keeping the base model frozen.
What is LoRA?

LoRA (Low-Rank Adaptation) adds trainable low-rank matrices to model layers, allowing you to customize a model with minimal computational resources. It’s perfect for:
  • Learning specific styles or subjects
  • Adapting models to custom datasets
  • Training on consumer GPUs
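
Conceptually, the low-rank adaptation is just two small matrices whose product is added to a frozen weight. The snippet below is a minimal sketch in plain PyTorch (not hypergen code) that illustrates the idea and the scaling formula used by the alpha parameter documented later on this page:

import torch

# Minimal illustration of the LoRA idea (plain PyTorch, not the hypergen API).
# The frozen base weight W is augmented with two small trainable matrices:
# A (rank x in_features) and B (out_features x rank). Only A and B are trained.
out_features, in_features, rank, alpha = 64, 64, 16, 32

W = torch.randn(out_features, in_features)   # frozen base weight
A = torch.randn(rank, in_features) * 0.01    # trainable low-rank factor
B = torch.zeros(out_features, rank)          # trainable, initialized to zero

scaling = alpha / rank                        # same scaling formula as the alpha parameter
W_effective = W + scaling * (B @ A)           # adapted weight used at inference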

LoRATrainer

The main class for configuring and running LoRA training.
from hypergen import model, dataset
from hypergen.training import LoRATrainer

m = model.load("stabilityai/stable-diffusion-xl-base-1.0").to("cuda")
trainer = LoRATrainer(m)
lora_weights = trainer.train(dataset, steps=1000)

Constructor

LoRATrainer(
    model,
    rank=16,
    alpha=32,
    target_modules=None,
    dropout=0.1
)

Parameters

model
Model
required
Model instance to train (from model.load())
rank
integer
default:"16"
LoRA rank - the dimensionality of the low-rank matrices. Lower rank means:
  • Fewer trainable parameters
  • Faster training
  • Less expressive (may not capture complex patterns)
Common values:
  • 8 - Very lightweight (0.5-1M parameters)
  • 16 - Good balance (1-2M parameters)
  • 32 - More expressive (2-4M parameters)
  • 64 - High capacity (4-8M parameters)
alpha
integer
default:"32"
LoRA alpha - scaling factor for the LoRA updates. Generally set to 2× rank. Higher alpha = stronger LoRA influence. Formula: scaling = alpha / rank
target_modules
list[string]
Which model modules to apply LoRA to. If None, automatically targets attention layers. Default: ["to_q", "to_k", "to_v", "to_out.0"] (UNet attention layers). Advanced users can specify custom modules based on model architecture.
dropout
float
default:"0.1"
Dropout probability for LoRA layers. Helps prevent overfitting. Typical range: 0.0 - 0.2
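
As a quick sanity check on the alpha/rank relationship (plain arithmetic, not an API call), the common rank values listed above, paired with alpha = 2× rank, all keep the scaling factor at 2.0, so they differ in capacity rather than update strength:

# Setting alpha = 2 * rank keeps scaling = alpha / rank constant
# while the adapter capacity (parameter count) grows with rank.
for rank, alpha in [(8, 16), (16, 32), (32, 64), (64, 128)]:
    print(f"rank={rank:>2}  alpha={alpha:>3}  scaling={alpha / rank}")
# scaling is 2.0 in every case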

Example

from hypergen import model
from hypergen.training import LoRATrainer

m = model.load("stabilityai/stable-diffusion-xl-base-1.0").to("cuda")

# Lightweight LoRA
trainer = LoRATrainer(m, rank=8, alpha=16)

# High-capacity LoRA
trainer = LoRATrainer(m, rank=64, alpha=128, dropout=0.05)

# Custom target modules
trainer = LoRATrainer(
    m,
    rank=32,
    target_modules=["to_q", "to_k", "to_v", "to_out.0", "ff.net.0"]
)

LoRATrainer.train()

Train the LoRA adapter on a dataset.
lora_weights = trainer.train(dataset, steps=1000)

Parameters

dataset
Dataset
required
Dataset to train on (from dataset.load())
steps
integer
default:"1000"
Total number of training steps to perform
learning_rate
float | string
default:"1e-4"
Learning rate for the optimizer. Use a float or "auto" for automatic selection. Recommended values:
  • 1e-4 (0.0001) - Standard starting point
  • 1e-5 (0.00001) - Conservative, slower learning
  • 5e-5 (0.00005) - Good for SDXL
  • 1e-3 (0.001) - Aggressive, may be unstable
batch_size
integer | string
default:"1"
Number of images per training batch. Use "auto" for automatic selection. Larger batches = more stable gradients but more memory usage. Typical values:
  • 1 - Minimal memory usage
  • 2-4 - Good for most GPUs
  • 8+ - High-memory GPUs only
gradient_accumulation_steps
integer
default:"1"
Accumulate gradients over multiple steps before updating weights. Effective batch size = batch_size × gradient_accumulation_steps. Use this to simulate larger batches without using more memory:
  • batch_size=1, gradient_accumulation_steps=16 → effective batch size of 16
save_steps
integer
Save a checkpoint every N steps. If None, only saves at the end. Must specify output_dir for checkpoints to be saved.
output_dir
string
Directory to save checkpoints to. If None, checkpoints are not saved to disk. Checkpoints are saved in subdirectories named checkpoint-{step}.
**kwargs
dict
Additional training arguments (reserved for future use)
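
The "auto" options for learning_rate and batch_size are not shown in the examples on this page, so here is a short sketch of passing them, using only the parameters documented above (the dataset path is a placeholder):

from hypergen import model, dataset
from hypergen.training import LoRATrainer

m = model.load("stabilityai/stable-diffusion-xl-base-1.0").to("cuda")
ds = dataset.load("./my_training_images")
trainer = LoRATrainer(m)

# Let hypergen pick the learning rate and batch size automatically ("auto" is
# accepted for both parameters, as documented above).
lora_weights = trainer.train(ds, steps=1000, learning_rate="auto", batch_size="auto")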

Returns

lora_weights
dict[str, Any]
Dictionary containing the trained LoRA state dict. This can be saved and loaded later.

Example

from hypergen import model, dataset
from hypergen.training import LoRATrainer

# Load model and data
m = model.load("stabilityai/stable-diffusion-xl-base-1.0").to("cuda")
ds = dataset.load("./my_training_images")

# Create trainer
trainer = LoRATrainer(m, rank=16, alpha=32)

# Basic training
lora_weights = trainer.train(ds, steps=1000)

# Advanced training with checkpointing
lora_weights = trainer.train(
    ds,
    steps=2000,
    learning_rate=1e-4,
    batch_size=2,
    gradient_accumulation_steps=8,  # Effective batch size = 16
    output_dir="./lora_checkpoints",
    save_steps=250  # Save every 250 steps
)

# Use the trained model
image = m.generate("A portrait in my custom style")
image.save("output.png")
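
Because the return value is a plain state dict, one way to persist and reload it is standard PyTorch serialization. This is an assumption rather than a documented hypergen helper; check whether the library provides its own save/load utilities:

import torch

# Persist the trained LoRA weights returned by trainer.train().
# NOTE: torch.save/torch.load is an assumption based on the dict[str, Any]
# return type, not a documented hypergen API.
torch.save(lora_weights, "./lora_checkpoints/my_lora.pt")

# Later, reload the state dict for reuse.
lora_weights = torch.load("./lora_checkpoints/my_lora.pt")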

Training Output

During training, you’ll see output like:
LoRA trainable parameters: 1,234,567 / 987,654,321 (0.12%)

Starting LoRA training:
  Total steps: 1000
  Learning rate: 0.0001
  Batch size: 2
  Gradient accumulation: 8

Training: 100%|██████████| 1000/1000 [12:34<00:00, 1.32it/s]

Training complete!

train_lora()

Convenience function for quick LoRA training without creating a trainer instance.
from hypergen.training import train_lora

lora = train_lora(model, dataset, steps=1000)

Parameters

model
Model
required
Model to train
dataset
Dataset
required
Dataset to train on
**kwargs
dict
All arguments from LoRATrainer.train() are supported

Returns

lora_weights
dict[str, Any]
Dictionary containing trained LoRA weights

Example

from hypergen import model, dataset
from hypergen.training import train_lora

# Load model and dataset
m = model.load("stabilityai/sdxl-turbo").to("cuda")
ds = dataset.load("./images")

# Quick training
lora = train_lora(m, ds, steps=500, learning_rate=1e-4)

# Generate with trained model
image = m.generate("A photo in my style")
This function creates a LoRATrainer with default settings internally. For more control over training parameters, use LoRATrainer directly.

Training Best Practices

Dataset Size

Small Dataset (10-50 images)

  • Use higher rank (32-64)
  • More training steps (1500-3000)
  • Lower learning rate (5e-5)

Large Dataset (100+ images)

  • Use lower rank (8-16)
  • Fewer steps (500-1000)
  • Standard learning rate (1e-4)
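
Applied to the parameters documented above, the two regimes translate into configurations roughly like the sketch below (the dataset paths are placeholders and the exact numbers are guidelines, not requirements):

from hypergen import model, dataset
from hypergen.training import LoRATrainer

m = model.load("stabilityai/stable-diffusion-xl-base-1.0").to("cuda")

# Small dataset (10-50 images): higher rank, more steps, lower learning rate
small_ds = dataset.load("./small_style_dataset")
trainer = LoRATrainer(m, rank=64, alpha=128)
lora = trainer.train(small_ds, steps=2000, learning_rate=5e-5)

# Large dataset (100+ images): lower rank, fewer steps, standard learning rate
large_ds = dataset.load("./large_style_dataset")
trainer = LoRATrainer(m, rank=16, alpha=32)
lora = trainer.train(large_ds, steps=800, learning_rate=1e-4)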

Memory Optimization

If you run out of GPU memory:
  1. Reduce batch size: Start with batch_size=1
  2. Use gradient accumulation: gradient_accumulation_steps=16 for larger effective batches
  3. Reduce rank: Try rank=8 instead of rank=16
  4. Use mixed precision: Models are loaded with float16 by default
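
Combining these tips, a memory-constrained run might look like the sketch below; only documented parameters are used, and the effective batch size stays at 16:

from hypergen import model, dataset
from hypergen.training import LoRATrainer

m = model.load("stabilityai/stable-diffusion-xl-base-1.0").to("cuda")
ds = dataset.load("./my_training_images")

# Low-memory configuration: small rank, batch size of 1, and gradient
# accumulation to reach an effective batch size of 16.
trainer = LoRATrainer(m, rank=8, alpha=16)
lora = trainer.train(
    ds,
    steps=1000,
    batch_size=1,
    gradient_accumulation_steps=16,
)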

Learning Rate Selection

1. Start conservative: begin with learning_rate=1e-4 for most cases.
2. Monitor training and watch for:
  • Too high: Loss spikes, unstable training
  • Too low: Very slow convergence
3. Adjust accordingly:
  • Increase if training is too slow
  • Decrease if training is unstable

Training Duration

  • Style learning: 500-1000 steps
  • Subject learning: 1000-2000 steps
  • Complex concepts: 2000-3000+ steps
Monitor the generated outputs to avoid overfitting. Stop when results look good!

Example Workflows

Portrait Style Training

from hypergen import model, dataset
from hypergen.training import LoRATrainer

# Load base model
m = model.load("stabilityai/stable-diffusion-xl-base-1.0").to("cuda")

# Load portrait dataset
ds = dataset.load("./portrait_photos")

# Configure trainer for style
trainer = LoRATrainer(
    m,
    rank=32,      # Higher rank for complex styles
    alpha=64,
    dropout=0.1
)

# Train
lora = trainer.train(
    ds,
    steps=1500,
    learning_rate=5e-5,  # Conservative
    batch_size=2,
    output_dir="./portrait_lora"
)

# Test
image = m.generate(
    "A portrait of a woman in professional lighting",
    num_inference_steps=30
)
image.save("portrait_test.png")

Fast Style Transfer

from hypergen import model, dataset
from hypergen.training import train_lora

# Load fast model
m = model.load("stabilityai/sdxl-turbo").to("cuda")
ds = dataset.load("./art_style_images")

# Quick training
lora = train_lora(
    m,
    ds,
    steps=500,
    learning_rate=1e-4,
    rank=16,
    batch_size=4
)

# Generate quickly (SDXL Turbo only needs 1-4 steps)
image = m.generate(
    "A landscape painting",
    num_inference_steps=4
)

Type Reference

LoRATrainer

class LoRATrainer:
    model: Model
    rank: int
    alpha: int
    target_modules: list[str] | None
    dropout: float

    def __init__(
        model: Model,
        rank: int = 16,
        alpha: int = 32,
        target_modules: list[str] | None = None,
        dropout: float = 0.1
    ): ...

    def train(
        dataset: Dataset,
        steps: int = 1000,
        learning_rate: float | str = 1e-4,
        batch_size: int | str = 1,
        gradient_accumulation_steps: int = 1,
        save_steps: int | None = None,
        output_dir: str | None = None,
        **kwargs
    ) -> dict[str, Any]: ...

train_lora

def train_lora(
    model: Model,
    dataset: Dataset,
    **kwargs
) -> dict[str, Any]: ...