POST /v1/images/generations
curl --request POST \
  --url http://localhost:8000/v1/images/generations \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: application/json' \
  --data '{
  "prompt": "<string>",
  "model": "<string>",
  "n": 123,
  "size": "<string>",
  "quality": "<string>",
  "response_format": "<string>",
  "negative_prompt": "<string>",
  "num_inference_steps": 123,
  "guidance_scale": 123,
  "seed": 123,
  "lora_path": "<string>",
  "lora_scale": 123
}'
{
  "created": 123,
  "data": [
    {
      "b64_json": "<string>",
      "url": "<string>",
      "revised_prompt": "<string>"
    }
  ]
}

Overview

Generate images from text descriptions using the loaded diffusion model. This endpoint is compatible with OpenAI’s image generation API.

Authentication

Authorization
string
required
Bearer token with your API key (if the server was started with --api-key).
Format: Bearer YOUR_API_KEY

Request Body

prompt
string
required
Text description of the image to generate. Can be detailed and descriptive.
Example: "A professional portrait of a person in natural lighting, photorealistic, high detail"
model
string
Model identifier (optional; ignored by HyperGen, since the server uses its preloaded model).
Note: Unlike OpenAI, HyperGen uses the model specified when starting the server.
n
integer
default:"1"
Number of images to generate.
Range: 1 to 10
size
string
default:"1024x1024"
Image dimensions in the format WIDTHxHEIGHT. Both dimensions must be divisible by 8.
Common sizes:
  • "512x512" - Fast, lower quality
  • "768x768" - Balanced
  • "1024x1024" - High quality (recommended for SDXL)
  • "1024x768" - Landscape
  • "768x1024" - Portrait
Custom sizes: Any dimensions divisible by 8 (e.g., "1920x1080", "2048x2048")
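A quick client-side check can catch malformed sizes before a request is sent. This is a sketch only (the helper name is ours, and the server performs its own validation):

```python
import re

def is_valid_size(size: str) -> bool:
    """Return True if size is 'WIDTHxHEIGHT' with both dimensions divisible by 8."""
    match = re.fullmatch(r"(\d+)x(\d+)", size)
    if not match:
        return False
    width, height = int(match.group(1)), int(match.group(2))
    return width % 8 == 0 and height % 8 == 0

print(is_valid_size("1024x1024"))  # True
print(is_valid_size("500x500"))    # False (500 is not divisible by 8)
print(is_valid_size("1024"))       # False (missing the 'x' separator)
```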
quality
string
default:"standard"
Image quality setting (OpenAI compatibility parameter).
Options: "standard" or "hd"
Note: This parameter is accepted for OpenAI compatibility but doesn’t affect HyperGen output. Use num_inference_steps instead.
response_format
string
default:"url"
Response format for images.
Options:
  • "url" - Returns image URL (placeholder, not fully implemented)
  • "b64_json" - Returns base64-encoded PNG image data

HyperGen Extensions

negative_prompt
string
Description of what to avoid in the image. Helps improve quality by specifying undesired elements.
Example: "blurry, low quality, distorted, watermark, text"
num_inference_steps
integer
default:"50"
Number of denoising steps. More steps mean higher quality but slower generation.
Range: 1 to 150
Recommended:
  • SDXL: 30-50 steps
  • SDXL Turbo: 1-4 steps
  • SD 1.5/2.1: 20-50 steps
guidance_scale
float
default:"7.5"
Classifier-free guidance scale. Controls how closely the image follows the prompt.
Range: 1.0 to 20.0
Recommended:
  • 7.5 - Standard, balanced
  • 5.0-6.0 - More creative, less literal
  • 10.0-15.0 - Very literal, strict adherence to prompt
Note: SDXL Turbo should use lower values (1.0-2.0)
seed
integer
Random seed for reproducible generation. Use the same seed with the same prompt to get identical results.
Example: 42, 12345, 999999
lora_path
string
Path to LoRA weights to use for this request. Overrides the server default.
Example: "/path/to/custom_lora.safetensors"
lora_scale
float
default:"1.0"
LoRA influence strength.
Range: 0.0 to 2.0
  • 0.0 - No LoRA influence (base model only)
  • 1.0 - Full LoRA influence (default)
  • >1.0 - Amplified LoRA influence
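Putting the two LoRA extensions together, a request can apply custom weights per call. The helper below is a sketch (the function name is ours, and the weights path is a placeholder); it builds the JSON body, which you would then POST as in the curl examples:

```python
import json

def lora_request_payload(prompt: str, lora_path: str, lora_scale: float = 1.0) -> dict:
    """Build a /v1/images/generations payload that overrides the server's default LoRA."""
    return {
        "prompt": prompt,
        "lora_path": lora_path,    # server-side path to .safetensors weights
        "lora_scale": lora_scale,  # 0.0-2.0; 1.0 = full influence
        "response_format": "b64_json",
    }

payload = lora_request_payload(
    "A watercolor landscape",
    "/path/to/custom_lora.safetensors",  # placeholder path
    lora_scale=0.8,
)
print(json.dumps(payload, indent=2))
# POST with: requests.post("http://localhost:8000/v1/images/generations",
#            headers={"Authorization": "Bearer sk-hypergen-123456"}, json=payload)
```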

Response

created
integer
Unix timestamp when the images were generated
data
array
Array of generated image objects
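Each object in data carries either b64_json or url, depending on response_format (and url support is a placeholder, per the parameter notes above). A hedged sketch of handling both shapes when saving results — the helper name is ours:

```python
import base64

def save_images(response_json: dict, prefix: str = "image") -> list:
    """Save each base64 image in a generations response; return filenames (or URLs)."""
    saved = []
    for i, item in enumerate(response_json["data"]):
        if item.get("b64_json"):            # base64-encoded PNG data
            filename = f"{prefix}_{i}.png"
            with open(filename, "wb") as f:
                f.write(base64.b64decode(item["b64_json"]))
            saved.append(filename)
        elif item.get("url"):               # URL form; fetch separately if needed
            saved.append(item["url"])
    return saved
```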

Examples

Basic Request

curl http://localhost:8000/v1/images/generations \
  -H "Authorization: Bearer sk-hypergen-123456" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A cat holding a sign that says Hello World",
    "size": "1024x1024"
  }'

Response

{
  "created": 1708472400,
  "data": [
    {
      "b64_json": "iVBORw0KGgoAAAANSUhEUgAAA...",
      "revised_prompt": "A cat holding a sign that says Hello World"
    }
  ]
}

Advanced Request with All Parameters

curl http://localhost:8000/v1/images/generations \
  -H "Authorization: Bearer sk-hypergen-123456" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A futuristic cityscape at sunset, cyberpunk style, neon lights, highly detailed",
    "negative_prompt": "blurry, low quality, distorted, watermark",
    "n": 2,
    "size": "1024x768",
    "num_inference_steps": 50,
    "guidance_scale": 7.5,
    "seed": 42,
    "response_format": "b64_json"
  }'

Using OpenAI Python Client

from openai import OpenAI

client = OpenAI(
    api_key="sk-hypergen-123456",
    base_url="http://localhost:8000/v1"
)

response = client.images.generate(
    prompt="A professional portrait in natural lighting",
    n=1,
    size="1024x1024"
)

# Note: With OpenAI client, response_format is always "url"
# But HyperGen returns base64 in the background
print(response.data[0].url)

Batch Generation

Generate multiple variations of the same prompt:
import requests
import base64
from PIL import Image
from io import BytesIO

response = requests.post(
    "http://localhost:8000/v1/images/generations",
    headers={
        "Authorization": "Bearer sk-hypergen-123456",
        "Content-Type": "application/json"
    },
    json={
        "prompt": "A beautiful landscape painting",
        "n": 4,  # Generate 4 variations
        "size": "1024x1024",
        "response_format": "b64_json"
    }
)

# Save all 4 images
for i, img_data in enumerate(response.json()["data"]):
    image = Image.open(BytesIO(base64.b64decode(img_data["b64_json"])))
    image.save(f"landscape_{i}.png")

Reproducible Generation

Use seeds for consistent results:
import requests

# First generation with seed
response1 = requests.post(
    "http://localhost:8000/v1/images/generations",
    headers={"Authorization": "Bearer sk-hypergen-123456"},
    json={
        "prompt": "A red apple on a wooden table",
        "seed": 42,
        "size": "1024x1024",
        "response_format": "b64_json"
    }
)

# Second generation with same seed - will produce identical image
response2 = requests.post(
    "http://localhost:8000/v1/images/generations",
    headers={"Authorization": "Bearer sk-hypergen-123456"},
    json={
        "prompt": "A red apple on a wooden table",
        "seed": 42,  # Same seed = same result
        "size": "1024x1024",
        "response_format": "b64_json"
    }
)

# response1 and response2 will have identical images

Error Responses

400 Bad Request

Invalid request parameters:
{
  "detail": "Invalid size format: 1000x1000. Use format like '1024x1024'"
}
Common causes:
  • Invalid size format (not in WIDTHxHEIGHT format)
  • Dimensions not divisible by 8
  • n parameter out of range (must be 1-10)
  • Invalid parameter types

401 Unauthorized

Missing or invalid API key:
{
  "detail": "Invalid API key"
}
Causes:
  • Missing Authorization header
  • Incorrect API key
  • Wrong authorization format

500 Internal Server Error

Generation failed:
{
  "error": {
    "message": "CUDA out of memory",
    "type": "internal_error"
  }
}
Common causes:
  • Out of GPU memory (try smaller image size or fewer images)
  • Model loading error
  • Invalid LoRA path
  • Hardware issues
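A client can map these status codes to distinct failure modes and retry transient 500s. The sketch below is illustrative (the function name, retry count, and backoff schedule are our choices, not server requirements):

```python
import time
import requests

def generate_with_retry(payload: dict, api_key: str,
                        url: str = "http://localhost:8000/v1/images/generations",
                        max_retries: int = 3) -> dict:
    """POST a generation request, retrying 500s with exponential backoff."""
    headers = {"Authorization": f"Bearer {api_key}"}
    for attempt in range(max_retries):
        resp = requests.post(url, headers=headers, json=payload)
        if resp.status_code == 400:
            raise ValueError(f"Bad request: {resp.json().get('detail')}")
        if resp.status_code == 401:
            raise PermissionError("Invalid or missing API key")
        if resp.status_code == 500 and attempt < max_retries - 1:
            time.sleep(2 ** attempt)   # 1s, 2s, 4s backoff
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError("Generation failed after retries")
```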

Performance Tips

For Faster Generation:
  • Use SDXL Turbo with num_inference_steps=4 for 10x faster generation
  • Reduce image size (512x512 instead of 1024x1024)
  • Lower num_inference_steps (25-30 for SDXL)
  • Generate single images (n=1)

For Higher Quality:
  • Increase num_inference_steps (50-100)
  • Use larger image sizes (1024x1024 or larger)
  • Use negative prompts to avoid unwanted elements
  • Adjust guidance_scale (7.5-10.0 for SDXL)

For Lower Memory Usage:
  • Generate fewer images per request (reduce n)
  • Use smaller image sizes
  • Use float16 dtype (default)
  • Clear GPU cache between large batches

For Reproducible Results:
  • Always set seed for consistent results
  • Keep all parameters identical
  • Note that different hardware may produce slight variations

Model-Specific Recommendations

SDXL (Stable Diffusion XL)

{
  "prompt": "Your detailed prompt here",
  "size": "1024x1024",
  "num_inference_steps": 40,
  "guidance_scale": 7.5
}

SDXL Turbo

{
  "prompt": "Your prompt here",
  "size": "1024x1024",
  "num_inference_steps": 4,
  "guidance_scale": 1.0
}

Stable Diffusion 1.5/2.1

{
  "prompt": "Your prompt here",
  "size": "512x512",
  "num_inference_steps": 30,
  "guidance_scale": 7.5
}

Rate Limiting

Requests are queued and processed sequentially:
  • Queue size visible via /health endpoint
  • Maximum queue size configurable via --max-queue-size (default: 100)
  • Requests block until completed
  • No explicit rate limit; throughput is limited by processing speed
Monitor the /health endpoint to check queue status and server load.
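Since requests block until completed, a client submitting large batches may want to wait for the queue to drain first. A polling sketch, assuming a "queue_size" field in the /health payload (the field name and helper are our assumptions; inspect your server's actual /health response):

```python
import time
import requests

def wait_for_queue(base_url: str = "http://localhost:8000",
                   max_pending: int = 5, timeout: float = 60.0) -> bool:
    """Poll /health until the reported queue size drops below max_pending."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        health = requests.get(f"{base_url}/health").json()
        # "queue_size" is an assumed field name
        if health.get("queue_size", 0) < max_pending:
            return True
        time.sleep(1.0)
    return False
```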