Overview
The model module provides a high-level interface for loading and working with diffusion models from HuggingFace or local paths. It automatically handles model optimization, device management, and inference.
model.load()
Load a diffusion model from HuggingFace or a local path.

Parameters

HuggingFace model ID (e.g., "stabilityai/stable-diffusion-xl-base-1.0") or local path to a model directory
Data type for model weights. Options:

- "float16" or "fp16" - Half precision (recommended for most GPUs)
- "bfloat16" or "bf16" - Brain float (better for newer hardware)
- "float32" or "fp32" - Full precision (slower but more accurate)

Additional arguments passed to DiffusionPipeline.from_pretrained(). Common options:

- variant - Model variant (e.g., "fp16")
- use_auth_token - HuggingFace authentication token
- revision - Model revision to use
Returns
Model instance ready for inference and training
Example
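A minimal loading sketch. The import path and the dtype keyword name are assumptions (the package's real name and signature are not shown in this reference), so verify them against the actual API; the model ID and variant come from the parameter descriptions above.

```python
# Assumed import; adjust to the actual package layout.
import model

# Load SDXL from the HuggingFace Hub in half precision
# (recommended for most GPUs).
m = model.load(
    "stabilityai/stable-diffusion-xl-base-1.0",
    dtype="float16",   # assumed keyword name; see the dtype options above
    variant="fp16",    # forwarded to DiffusionPipeline.from_pretrained()
)
```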
Model.to()
Move the model to a specific device (GPU, CPU, etc.).

Parameters

Device to move the model to:

- "cuda" - NVIDIA GPU (default CUDA device)
- "cuda:0", "cuda:1", etc. - Specific CUDA device
- "cpu" - CPU
- "mps" - Apple Silicon GPU (Metal Performance Shaders)
Returns
Returns self for method chaining
Example
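A short device-placement sketch, assuming a model loaded as in model.load() above (the import path remains an assumption). Because to() returns self, loading and placement can be chained.

```python
import model  # assumed import path

# Chain loading with placement on the first CUDA GPU.
m = model.load("stabilityai/stable-diffusion-xl-base-1.0").to("cuda:0")

# Alternatives from the device list above:
# m.to("cpu")   # CPU
# m.to("mps")   # Apple Silicon (Metal Performance Shaders)
```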
Model.generate()
Generate images from text prompt(s).

Parameters
Text prompt(s) describing the image to generate. Can be a single string or list of strings for batch generation.
Negative prompt(s) describing what to avoid in the generation
Number of denoising steps. More steps = higher quality but slower. Typical range: 20-100.
Classifier-free guidance scale. Higher values = stronger adherence to prompt. Typical range: 5.0-15.0.
Output image width in pixels. Must be divisible by 8.
Output image height in pixels. Must be divisible by 8.
Number of images to generate per prompt
Random seed for reproducible generation. If not specified, a random seed is used.
Additional arguments passed to the underlying pipeline (model-specific)
Returns
Generated image(s). Returns a single PIL Image if one image is generated, or a list of images for batch generation.
Example
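A generation sketch. The keyword names below (negative_prompt, num_inference_steps, guidance_scale, width, height, num_images_per_prompt, seed) are assumptions modeled on the parameter descriptions above and on common diffusers conventions; confirm them against the actual signature.

```python
# Single prompt returns a single PIL Image.
image = m.generate(
    "a watercolor painting of a lighthouse at dawn",
    negative_prompt="blurry, low quality",
    num_inference_steps=30,   # typical range: 20-100
    guidance_scale=7.5,       # typical range: 5.0-15.0
    width=1024,               # must be divisible by 8
    height=1024,              # must be divisible by 8
    seed=42,                  # fix the seed for reproducible output
)
image.save("lighthouse.png")

# A list of prompts returns a list of PIL Images.
images = m.generate(
    ["a red fox in snow", "a blue heron at dusk"],
    num_images_per_prompt=2,  # two images for each prompt
)
```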
Model.train_lora()
Train a LoRA (Low-Rank Adaptation) adapter on a custom dataset. This is the simple, high-level interface for LoRA training.

Parameters
Dataset to train on, created with dataset.load()

Number of training steps to perform
Learning rate for training. Use a float value or "auto" for automatic learning rate selection. Typical values:

- 1e-4 - Good starting point for most cases
- 1e-5 - More conservative, slower convergence
- 1e-3 - Aggressive, may cause instability
LoRA rank. Lower rank = fewer parameters and faster training, but less expressive. Common values:

- 8 - Very lightweight, fast training
- 16 - Good balance (default)
- 32 - More expressive, slower training
- 64 - High capacity, slow training
LoRA alpha scaling factor. Typically set to 2x the rank. Controls the magnitude of LoRA updates.
Batch size for training. Use "auto" for automatic batch size selection based on available memory.

Number of steps to accumulate gradients before updating weights. Effective batch size = batch_size × gradient_accumulation_steps.
Directory to save checkpoints. If not specified, checkpoints are not saved to disk.
Save a checkpoint every N steps. Only applies if output_dir is specified.

Additional training arguments passed to the trainer
Returns
Dictionary containing trained LoRA weights that can be loaded later
Example
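A training sketch assuming a model m loaded as in model.load() above. The keyword names (steps, learning_rate, rank, lora_alpha, batch_size, gradient_accumulation_steps, output_dir, save_every) are assumptions drawn from the parameter descriptions above; check them against the actual signature.

```python
import dataset  # assumed companion module providing dataset.load()

ds = dataset.load("./my_training_images")

lora_weights = m.train_lora(
    ds,
    steps=1000,
    learning_rate=1e-4,             # good starting point for most cases
    rank=16,                        # balanced default
    lora_alpha=32,                  # typically 2x the rank
    batch_size="auto",              # sized to available memory
    gradient_accumulation_steps=4,  # effective batch = batch_size x 4
    output_dir="./lora_checkpoints",
    save_every=250,                 # checkpoint every 250 steps
)
# The returned dictionary of LoRA weights can be saved and loaded later.
```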
Notes
LoRA training is memory-efficient and fast compared to full fine-tuning. It only trains a small number of additional parameters while keeping the base model frozen.
Type Reference
Model
The main model class returned by model.load().