Overview
Generate images from text descriptions using the loaded diffusion model. This endpoint is compatible with OpenAI’s image generation API.

Authentication
Bearer token with your API key (if the server was started with --api-key).
Format: Bearer YOUR_API_KEY
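For example, with Python's requests library (a sketch; only the Bearer scheme above is prescribed, the JSON content type is an assumption):

```python
# Hypothetical example of the headers sent with every authenticated request.
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
}
```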
Request Body

prompt
Text description of the image to generate. Can be detailed and descriptive.
Example: "A professional portrait of a person in natural lighting, photorealistic, high detail"

model
Model identifier (optional; ignored by HyperGen, since the server uses its preloaded model).
Note: Unlike OpenAI, HyperGen uses the model specified when starting the server.
n
Number of images to generate.
Range: 1 to 10
size
Image dimensions in the format WIDTHxHEIGHT. Must be divisible by 8.
Common sizes:
- "512x512" - Fast, lower quality
- "768x768" - Balanced
- "1024x1024" - High quality (recommended for SDXL)
- "1024x768" - Landscape
- "768x1024" - Portrait
Other sizes are also supported (e.g., "1920x1080", "2048x2048").

quality
Image quality setting (OpenAI compatibility parameter).
Options: "standard" or "hd"
Note: This parameter is accepted for OpenAI compatibility but doesn’t affect HyperGen output. Use num_inference_steps instead.

response_format
Response format for images.
Options:
- "url" - Returns an image URL (placeholder, not fully implemented)
- "b64_json" - Returns base64-encoded PNG image data
HyperGen Extensions
Negative prompt
Description of what to avoid in the image. Helps improve quality by specifying undesired elements.
Example: "blurry, low quality, distorted, watermark, text"

num_inference_steps
Number of denoising steps. More steps = higher quality but slower generation.
Range: 1 to 150
Recommended:
- SDXL: 30-50 steps
- SDXL Turbo: 1-4 steps
- SD 1.5/2.1: 20-50 steps
guidance_scale
Classifier-free guidance scale. Controls how closely the image follows the prompt.
Range: 1.0 to 20.0
Recommended:
- 7.5 - Standard, balanced
- 5.0-6.0 - More creative, less literal
- 10.0-15.0 - Very literal, strict adherence to prompt
seed
Random seed for reproducible generation. Use the same seed with the same prompt to get identical results.
Example: 42, 12345, 999999

LoRA path
Path to LoRA weights to use for this request. Overrides the server default.
Example: "/path/to/custom_lora.safetensors"

LoRA scale
LoRA influence strength.
Range: 0.0 to 2.0
- 0.0 - No LoRA influence (base model only)
- 1.0 - Full LoRA influence (default)
- >1.0 - Amplified LoRA influence
Response
created
Unix timestamp when the images were generated.

data
Array of generated image objects.
Examples
Basic Request
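A minimal sketch in Python. It assumes the server is reachable at http://localhost:8000 and exposes the OpenAI-compatible path /v1/images/generations; both are placeholders, adjust them for your deployment.

```python
import base64

import requests

# Send a basic generation request and save the first image to disk.
response = requests.post(
    "http://localhost:8000/v1/images/generations",  # placeholder URL
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "prompt": "A professional portrait of a person in natural lighting, photorealistic, high detail",
        "n": 1,
        "size": "1024x1024",
        "response_format": "b64_json",
    },
)
response.raise_for_status()

image_b64 = response.json()["data"][0]["b64_json"]
with open("output.png", "wb") as f:
    f.write(base64.b64decode(image_b64))
```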
Response
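The created and data fields follow the schema above; with response_format set to "b64_json", the body looks roughly like this (the base64 payload is truncated, and fields beyond those documented are not guaranteed):

```json
{
  "created": 1713999999,
  "data": [
    {
      "b64_json": "iVBORw0KGgoAAAANSUhEUg..."
    }
  ]
}
```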
Advanced Request with All Parameters
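A sketch of a request that sets every documented field. The extension field names negative_prompt, lora_path, and lora_scale are assumptions based on the descriptions above, and the URL is a placeholder as before.

```python
import requests

payload = {
    # OpenAI-compatible parameters
    "prompt": "A professional portrait of a person in natural lighting, photorealistic, high detail",
    "model": "stabilityai/stable-diffusion-xl-base-1.0",  # accepted but ignored; the server uses its preloaded model
    "n": 2,
    "size": "1024x1024",
    "quality": "hd",  # accepted for compatibility only; tune num_inference_steps instead
    "response_format": "b64_json",
    # HyperGen extensions (these three field names are assumptions, not confirmed identifiers)
    "negative_prompt": "blurry, low quality, distorted, watermark, text",
    "num_inference_steps": 40,
    "guidance_scale": 7.5,
    "seed": 42,
    "lora_path": "/path/to/custom_lora.safetensors",
    "lora_scale": 1.0,
}

response = requests.post(
    "http://localhost:8000/v1/images/generations",  # placeholder URL
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
)
response.raise_for_status()
print(len(response.json()["data"]), "images returned")
```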
Using OpenAI Python Client
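Because the endpoint is OpenAI-compatible, the official OpenAI Python client can be pointed at the server. A sketch, assuming the placeholder base URL http://localhost:8000/v1; the extra_body field names are assumptions, as above.

```python
import base64

from openai import OpenAI

# Point the official OpenAI client at the HyperGen server (base URL is a placeholder).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="YOUR_API_KEY")

result = client.images.generate(
    prompt="A professional portrait of a person in natural lighting, photorealistic, high detail",
    n=1,
    size="1024x1024",
    response_format="b64_json",
    # HyperGen extensions are not part of the OpenAI client signature, so they are
    # forwarded verbatim through extra_body (field names assumed from the descriptions above).
    extra_body={"num_inference_steps": 40, "guidance_scale": 7.5, "seed": 42},
)

with open("portrait.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
```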
Batch Generation
Generate multiple variations of the same prompt:
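A sketch that raises n to get several variations in one call (placeholder URL; the example prompt is illustrative only):

```python
import base64

import requests

response = requests.post(
    "http://localhost:8000/v1/images/generations",  # placeholder URL
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "prompt": "A watercolor painting of a lighthouse at dusk",
        "n": 4,  # up to 10 images per request
        "size": "768x768",
        "response_format": "b64_json",
    },
)
response.raise_for_status()

# Save each variation to its own file.
for i, item in enumerate(response.json()["data"]):
    with open(f"variation_{i}.png", "wb") as f:
        f.write(base64.b64decode(item["b64_json"]))
```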
Reproducible Generation
Use seeds for consistent results:
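A sketch showing that the same seed with identical parameters should reproduce the same image (placeholder URL; hardware differences may still cause slight variations):

```python
import requests

payload = {
    "prompt": "A watercolor painting of a lighthouse at dusk",
    "size": "1024x1024",
    "num_inference_steps": 40,
    "guidance_scale": 7.5,
    "seed": 12345,  # fixed seed for reproducibility
    "response_format": "b64_json",
}
headers = {"Authorization": "Bearer YOUR_API_KEY"}
url = "http://localhost:8000/v1/images/generations"  # placeholder URL

first = requests.post(url, headers=headers, json=payload).json()
second = requests.post(url, headers=headers, json=payload).json()

# Expected to print True on the same hardware with identical parameters.
print(first["data"][0]["b64_json"] == second["data"][0]["b64_json"])
```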
400 Bad Request
Invalid request parameters:- Invalid size format (not in
WIDTHxHEIGHTformat) - Dimensions not divisible by 8
nparameter out of range (must be 1-10)- Invalid parameter types
401 Unauthorized
Missing or invalid API key:
- Missing Authorization header
- Incorrect API key
- Wrong authorization format
500 Internal Server Error
Generation failed:
- Out of GPU memory (try a smaller image size or fewer images)
- Model loading error
- Invalid LoRA path
- Hardware issues
Performance Tips
Faster Generation
- Use SDXL Turbo with num_inference_steps=4 for 10x faster generation
- Reduce image size (512x512 instead of 1024x1024)
- Lower num_inference_steps (25-30 for SDXL)
- Generate single images (n=1)
Higher Quality
- Increase num_inference_steps (50-100)
- Use larger image sizes (1024x1024 or larger)
- Use negative prompts to avoid unwanted elements
- Adjust guidance_scale (7.5-10.0 for SDXL)
Memory Management
- Generate fewer images per request (reduce n)
- Use smaller image sizes
- Use float16 dtype (default)
- Clear GPU cache between large batches
Reproducibility
- Always set seed for consistent results
- Keep all parameters identical
- Note that different hardware may produce slight variations
Model-Specific Recommendations
SDXL (Stable Diffusion XL)
As noted above: 30-50 inference steps, guidance_scale 7.5-10.0, 1024x1024 recommended.

SDXL Turbo
As noted above: 1-4 inference steps for roughly 10x faster generation.

Stable Diffusion 1.5/2.1
As noted above: 20-50 inference steps.
Rate Limiting
Requests are queued and processed sequentially:
- Queue size visible via the /health endpoint
- Maximum queue size configurable via --max-queue-size (default: 100)
- Requests block until completed
- No explicit rate limit; limited by processing speed

Monitor the /health endpoint to check queue status and server load.
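A sketch of a quick health check before submitting large jobs (placeholder URL; the exact fields returned by /health are not specified here, so the raw JSON is printed):

```python
import requests

# Poll the /health endpoint to inspect queue status and server load.
health = requests.get("http://localhost:8000/health")  # placeholder URL
health.raise_for_status()
print(health.json())
```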