Overview
Generate images from text descriptions using the loaded diffusion model. This endpoint is compatible with OpenAI’s image generation API.

Authentication
Bearer token with your API key (if the server was started with --api-key).
Format: Bearer YOUR_API_KEY
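For example, with Python's requests library (a sketch; only the Bearer scheme above is prescribed, the JSON content type is an assumption):

```python
# Hypothetical example of the headers sent with every authenticated request.
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
}
```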
Request Body

prompt
Text description of the image to generate. Can be detailed and descriptive.
Example: "A professional portrait of a person in natural lighting, photorealistic, high detail"

model
Model identifier (optional; ignored by HyperGen, since the server uses its preloaded model).
Note: Unlike OpenAI, HyperGen uses the model specified when starting the server.
n
Number of images to generate.
Range: 1 to 10
size
Image dimensions in the format WIDTHxHEIGHT. Must be divisible by 8.
Common sizes:
- "512x512" - Fast, lower quality
- "768x768" - Balanced
- "1024x1024" - High quality (recommended for SDXL)
- "1024x768" - Landscape
- "768x1024" - Portrait
Other sizes are also supported (e.g., "1920x1080", "2048x2048").

quality
Image quality setting (OpenAI compatibility parameter).
Options: "standard" or "hd"
Note: This parameter is accepted for OpenAI compatibility but doesn’t affect HyperGen output. Use num_inference_steps instead.

response_format
Response format for images.
Options:
- "url" - Returns an image URL (placeholder, not fully implemented)
- "b64_json" - Returns base64-encoded PNG image data
HyperGen Extensions
Negative prompt
Description of what to avoid in the image. Helps improve quality by specifying undesired elements.
Example: "blurry, low quality, distorted, watermark, text"

num_inference_steps
Number of denoising steps. More steps = higher quality but slower generation.
Range: 1 to 150
Recommended:
- SDXL: 30-50 steps
- SDXL Turbo: 1-4 steps
- SD 1.5/2.1: 20-50 steps
guidance_scale
Classifier-free guidance scale. Controls how closely the image follows the prompt.
Range: 1.0 to 20.0
Recommended:
- 7.5 - Standard, balanced
- 5.0-6.0 - More creative, less literal
- 10.0-15.0 - Very literal, strict adherence to prompt
seed
Random seed for reproducible generation. Use the same seed with the same prompt to get identical results.
Example: 42, 12345, 999999

LoRA path
Path to LoRA weights to use for this request. Overrides the server default.
Example: "/path/to/custom_lora.safetensors"

LoRA scale
LoRA influence strength.
Range: 0.0 to 2.0
- 0.0 - No LoRA influence (base model only)
- 1.0 - Full LoRA influence (default)
- >1.0 - Amplified LoRA influence
Response
created
Unix timestamp when the images were generated.

data
Array of generated image objects.
Examples
Basic Request
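A minimal sketch in Python. It assumes the server is reachable at http://localhost:8000 and exposes the OpenAI-compatible path /v1/images/generations; both are placeholders, adjust them for your deployment.

```python
import base64

import requests

# Send a basic generation request and save the first image to disk.
response = requests.post(
    "http://localhost:8000/v1/images/generations",  # placeholder URL
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "prompt": "A professional portrait of a person in natural lighting, photorealistic, high detail",
        "n": 1,
        "size": "1024x1024",
        "response_format": "b64_json",
    },
)
response.raise_for_status()

image_b64 = response.json()["data"][0]["b64_json"]
with open("output.png", "wb") as f:
    f.write(base64.b64decode(image_b64))
```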
Response
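The created and data fields follow the schema above; with response_format set to "b64_json", the body looks roughly like this (the base64 payload is truncated, and fields beyond those documented are not guaranteed):

```json
{
  "created": 1713999999,
  "data": [
    {
      "b64_json": "iVBORw0KGgoAAAANSUhEUg..."
    }
  ]
}
```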
Advanced Request with All Parameters
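A sketch of a request that sets every documented field. The extension field names negative_prompt, lora_path, and lora_scale are assumptions based on the descriptions above, and the URL is a placeholder as before.

```python
import requests

payload = {
    # OpenAI-compatible parameters
    "prompt": "A professional portrait of a person in natural lighting, photorealistic, high detail",
    "model": "stabilityai/stable-diffusion-xl-base-1.0",  # accepted but ignored; the server uses its preloaded model
    "n": 2,
    "size": "1024x1024",
    "quality": "hd",  # accepted for compatibility only; tune num_inference_steps instead
    "response_format": "b64_json",
    # HyperGen extensions (these three field names are assumptions, not confirmed identifiers)
    "negative_prompt": "blurry, low quality, distorted, watermark, text",
    "num_inference_steps": 40,
    "guidance_scale": 7.5,
    "seed": 42,
    "lora_path": "/path/to/custom_lora.safetensors",
    "lora_scale": 1.0,
}

response = requests.post(
    "http://localhost:8000/v1/images/generations",  # placeholder URL
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json=payload,
)
response.raise_for_status()
print(len(response.json()["data"]), "images returned")
```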
Using OpenAI Python Client
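Because the endpoint is OpenAI-compatible, the official OpenAI Python client can be pointed at the server. A sketch, assuming the placeholder base URL http://localhost:8000/v1; the extra_body field names are assumptions, as above.

```python
import base64

from openai import OpenAI

# Point the official OpenAI client at the HyperGen server (base URL is a placeholder).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="YOUR_API_KEY")

result = client.images.generate(
    prompt="A professional portrait of a person in natural lighting, photorealistic, high detail",
    n=1,
    size="1024x1024",
    response_format="b64_json",
    # HyperGen extensions are not part of the OpenAI client signature, so they are
    # forwarded verbatim through extra_body (field names assumed from the descriptions above).
    extra_body={"num_inference_steps": 40, "guidance_scale": 7.5, "seed": 42},
)

with open("portrait.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
```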
Batch Generation
Generate multiple variations of the same prompt:
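A sketch that raises n to get several variations in one call (placeholder URL; the example prompt is illustrative only):

```python
import base64

import requests

response = requests.post(
    "http://localhost:8000/v1/images/generations",  # placeholder URL
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "prompt": "A watercolor painting of a lighthouse at dusk",
        "n": 4,  # up to 10 images per request
        "size": "768x768",
        "response_format": "b64_json",
    },
)
response.raise_for_status()

# Save each variation to its own file.
for i, item in enumerate(response.json()["data"]):
    with open(f"variation_{i}.png", "wb") as f:
        f.write(base64.b64decode(item["b64_json"]))
```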
Reproducible Generation
Use seeds for consistent results:
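A sketch showing that the same seed with identical parameters should reproduce the same image (placeholder URL; hardware differences may still cause slight variations):

```python
import requests

payload = {
    "prompt": "A watercolor painting of a lighthouse at dusk",
    "size": "1024x1024",
    "num_inference_steps": 40,
    "guidance_scale": 7.5,
    "seed": 12345,  # fixed seed for reproducibility
    "response_format": "b64_json",
}
headers = {"Authorization": "Bearer YOUR_API_KEY"}
url = "http://localhost:8000/v1/images/generations"  # placeholder URL

first = requests.post(url, headers=headers, json=payload).json()
second = requests.post(url, headers=headers, json=payload).json()

# Expected to print True on the same hardware with identical parameters.
print(first["data"][0]["b64_json"] == second["data"][0]["b64_json"])
```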
400 Bad Request
Invalid request parameters:- Invalid size format (not in
WIDTHxHEIGHTformat) - Dimensions not divisible by 8
nparameter out of range (must be 1-10)- Invalid parameter types
401 Unauthorized
Missing or invalid API key:
- Missing Authorization header
- Incorrect API key
- Wrong authorization format
500 Internal Server Error
Generation failed:
- Out of GPU memory (try a smaller image size or fewer images)
- Model loading error
- Invalid LoRA path
- Hardware issues
Performance Tips
Faster Generation
- Use SDXL Turbo with num_inference_steps=4 for 10x faster generation
- Reduce image size (512x512 instead of 1024x1024)
- Lower num_inference_steps (25-30 for SDXL)
- Generate single images (n=1)
Higher Quality
- Increase num_inference_steps (50-100)
- Use larger image sizes (1024x1024 or larger)
- Use negative prompts to avoid unwanted elements
- Adjust guidance_scale (7.5-10.0 for SDXL)
Memory Management
- Generate fewer images per request (reduce n)
- Use smaller image sizes
- Use float16 dtype (default)
- Clear GPU cache between large batches
Reproducibility
- Always set seed for consistent results
- Keep all parameters identical
- Note that different hardware may produce slight variations
Model-Specific Recommendations
SDXL (Stable Diffusion XL)
As noted above: 30-50 inference steps, guidance_scale 7.5-10.0, 1024x1024 recommended.

SDXL Turbo
As noted above: 1-4 inference steps for roughly 10x faster generation.

Stable Diffusion 1.5/2.1
As noted above: 20-50 inference steps.
Rate Limiting
Requests are queued and processed sequentially:
- Queue size visible via the /health endpoint
- Maximum queue size configurable via --max-queue-size (default: 100)
- Requests block until completed
- No explicit rate limit; limited by processing speed

Monitor the /health endpoint to check queue status and server load.
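A sketch of a quick health check before submitting large jobs (placeholder URL; the exact fields returned by /health are not specified here, so the raw JSON is printed):

```python
import requests

# Poll the /health endpoint to inspect queue status and server load.
health = requests.get("http://localhost:8000/health")  # placeholder URL
health.raise_for_status()
print(health.json())
```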