🌌 LumaForge v1.1 - SDXL Turbo Image Generation

LumaForge is a powerful image generation model built on SDXL Turbo, featuring ultra-fast 4-step generation, superior quality, and advanced image editing capabilities. This repository contains the complete model backend with a FastAPI interface, designed to be deployed directly to Hugging Face Spaces.

🚀 What's New in v2.0

⚡ SDXL Turbo: Upgraded from SD 1.5 to SDXL Turbo for dramatically better quality
🎯 4-Step Generation: Ultra-fast 4-6 step generation (vs 30-40 steps in v1.x)
📈 3-4x Faster: 8-15 seconds per image (vs 40-60 seconds)
🎨 Better Quality: Superior prompt following, better anatomy, higher resolution
✨ Enhanced Prompts: Optimized prompt engineering for SDXL Turbo

Model Capabilities

Text-to-Image generation with 16 specialized categories, Image-to-Image styling, advanced image editing (colorization & face restoration), 2x upscaling, background removal, dataset curation, and fine-tuning support.

📊 Model Specifications

| Specification | Details |
| --- | --- |
| Base Model | SDXL Turbo (Stability AI) |
| Generation Speed | 4 steps, 8-15 seconds per image |
| Quality | High-quality, photorealistic results |
| Backend | FastAPI with PyTorch & Diffusers |
| Device Support | Apple Silicon MPS, CPU fallback |
| Categories | 16 specialized categories with 110+ prompt templates |
| Image Editing | Colorization (5 styles), Face Restoration (4 levels), Background Removal, Upscaling (2x) |
| Deployment | Docker or Python SDK on Hugging Face Spaces |
| Rate Limiting | 10 generations/min, 60 API calls/min |
| Output Format | Base64 PNG with metadata |

🚀 Hugging Face Space Deployment

Hugging Face Spaces automatically detect configuration metadata from the YAML frontmatter at the top of this file.

Option A: Docker Space (Recommended)

This folder is configured to run on port 7860 (the default Hugging Face Space port). You can create a Hugging Face space using the Docker SDK and push the contents of the model/ directory along with a standard Dockerfile:

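```
FROM python:3.10-slim

WORKDIR /app

# Install system dependencies for Pillow and image processing.
# libgl1 provides libGL.so.1; the older libgl1-mesa-glx package name
# is not available on newer Debian base images.
RUN apt-get update && apt-get install -y \
    build-essential \
    libgl1 \
    libglib2.0-0 \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first so the dependency layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Default Hugging Face Space port
EXPOSE 7860

# Run FastAPI server
CMD ["python", "app.py"]
```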

Option B: FastAPI Space

Create a Hugging Face space with the FastAPI SDK, selecting Python 3.10, and copy the contents of the model/ directory. Hugging Face will automatically recognize app.py as the entrypoint.

📡 API Endpoints Reference

1. System Status

GET /api/status
- Returns device specs (Metal MPS vs CPU) and local Ollama server connectivity logs.

2. Text-to-Image Generation

POST /api/generate
- Payload:
```
{
  "prompt": "studio ghibli street",
  "mode": "general | poster | character",
  "aspect_ratio": "1:1 | 16:9 | 9:16 | 4:3 | 3:4",
  "steps": 20,
  "guidance_scale": 7.5,
  "seed": -1,
  "mock": false
}
```
- Actions: Checks the prompt against text safety boundaries (Ollama client), adapts and structurally expands the prompt, runs latent diffusion on MPS, watermarks the result with the LumaForge logo, and returns the image as a Base64 string.
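
As a rough client-side illustration of the payload above, the following sketch builds a request body and decodes a Base64 image string. The helper names `build_generate_payload` and `decode_image_b64` are hypothetical, and the name of the response field that carries the image is not specified in this README, so only the decoding step is shown.

```python
import base64
import json

# Option values taken from the payload description above.
ALLOWED_MODES = {"general", "poster", "character"}
ALLOWED_ASPECT_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:4"}


def build_generate_payload(prompt, mode="general", aspect_ratio="1:1",
                           steps=20, guidance_scale=7.5, seed=-1, mock=False):
    """Return the JSON body for POST /api/generate, rejecting unknown options."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    return {
        "prompt": prompt,
        "mode": mode,
        "aspect_ratio": aspect_ratio,
        "steps": steps,
        "guidance_scale": guidance_scale,
        "seed": seed,
        "mock": mock,
    }


def decode_image_b64(image_b64):
    """Decode a Base64 image string, tolerating a 'data:image/png;base64,' prefix."""
    if image_b64.startswith("data:"):
        image_b64 = image_b64.split(",", 1)[1]
    return base64.b64decode(image_b64)


payload = build_generate_payload("studio ghibli street", mode="poster",
                                 aspect_ratio="16:9")
body = json.dumps(payload)  # send this as the request body with any HTTP client
```

Validating `mode` and `aspect_ratio` before sending avoids spending one of the rate-limited generation calls on a request the server would reject anyway.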

3. Image-to-Image Stylization

POST /api/generate-img2img
- Payload:
```
{
  "prompt": "Convert this photo into anime illustration",
  "image_b64": "data:image/png;base64,...",
  "strength": 0.32,
  "mode": "general",
  "steps": 20,
  "guidance_scale": 7.5,
  "seed": -1,
  "mock": false
}
```
- Actions: Styles the input image using shared pipeline weights. Caps strength at 0.32 and applies a Radial Face Protection Mask to preserve original facial structure and details with pixel-level accuracy.
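
One way to picture the strength cap and the radial protection mask is the sketch below. It is an illustration under stated assumptions, not the LumaForge implementation: the function names, the face centre `(cx, cy)`, and the two radii are hypothetical, and the linear falloff is just one plausible choice.

```python
import math

MAX_STRENGTH = 0.32  # cap stated in the endpoint description


def cap_strength(requested):
    """Clamp the requested img2img strength into [0.0, MAX_STRENGTH]."""
    return max(0.0, min(requested, MAX_STRENGTH))


def radial_protection(x, y, cx, cy, inner_radius, outer_radius):
    """Weight of the ORIGINAL pixel at (x, y).

    1.0 inside inner_radius of the face centre (cx, cy), 0.0 beyond
    outer_radius, linear falloff in between. Assumes inner < outer.
    """
    distance = math.hypot(x - cx, y - cy)
    if distance <= inner_radius:
        return 1.0
    if distance >= outer_radius:
        return 0.0
    return (outer_radius - distance) / (outer_radius - inner_radius)


def blend_pixel(original, stylized, protection):
    """Mix one RGB pixel: protection=1 keeps the original, 0 keeps the stylized."""
    return tuple(
        round(protection * o + (1.0 - protection) * s)
        for o, s in zip(original, stylized)
    )
```

With this kind of mask, pixels near the face centre are copied from the input unchanged, which is what makes pixel-level preservation possible there, while the stylized output takes over gradually toward the edge of the protected region.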

4. High-Fidelity 2x Upscaling

POST /api/upscale
- Payload:
```
{
  "image_b64": "data:image/png;base64,...",
  "scale_factor": 2.0,
  "mock": false
}
```
- Actions: Doubles the resolution of the image using high-quality Lanczos interpolation and sharpens details using an Unsharp Mask.
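
The two operations named above can be illustrated on a single row of samples. This is a one-dimensional sketch of the underlying math only; the endpoint's actual implementation is not shown in this README, and the function names, the window size `a=3`, the 3-tap box blur, and the `amount` value are all assumptions.

```python
import math


def lanczos_kernel(x, a=3):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def upscale_row(row, scale=2, a=3):
    """Resample a 1-D row of samples to len(row) * scale values (integer scale)."""
    n = len(row)
    out = []
    for j in range(n * scale):
        # Centre of output sample j, expressed in source coordinates.
        src = (j + 0.5) / scale - 0.5
        base = math.floor(src)
        total = weight_sum = 0.0
        for i in range(base - a + 1, base + a + 1):
            w = lanczos_kernel(src - i, a)
            clamped = min(max(i, 0), n - 1)  # repeat edge samples
            total += w * row[clamped]
            weight_sum += w
        out.append(total / weight_sum)  # normalise so flat areas stay flat
    return out


def unsharp_mask(row, amount=0.5):
    """Sharpen: original + amount * (original - 3-tap box blur)."""
    n = len(row)
    out = []
    for i, value in enumerate(row):
        left, right = row[max(i - 1, 0)], row[min(i + 1, n - 1)]
        blurred = (left + value + right) / 3.0
        out.append(value + amount * (value - blurred))
    return out
```

Unsharp masking overshoots on both sides of an edge, so a real image pipeline would clip the result back into the valid pixel range afterwards.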

5. Transparent Background Removal

POST /api/remove-background
- Payload:
```
{
  "image_b64": "data:image/png;base64,...",
  "mock": false
}
```
- Actions: Isolates the foreground subject. Uses rembg if available, falling back to a vectorized NumPy color-threshold algorithm featuring linear alpha feathering to prevent jagged edges.
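
The fallback idea, a color threshold with a linear alpha ramp, could look roughly like the sketch below. It is written per pixel in plain Python for readability, whereas the description above refers to a vectorized NumPy version. The function names, the background color estimate, and the `inner`/`outer` thresholds are hypothetical.

```python
import math


def feathered_alpha(pixel, background, inner=30.0, outer=60.0):
    """Alpha (0-255) for one RGB pixel given an estimated background colour.

    Pixels within `inner` colour distance of the background become fully
    transparent, pixels beyond `outer` stay opaque, and alpha ramps linearly
    in between so edges are not jagged.
    """
    distance = math.dist(pixel, background)
    t = (distance - inner) / (outer - inner)
    return round(255 * min(1.0, max(0.0, t)))


def remove_background(pixels, background, inner=30.0, outer=60.0):
    """Turn rows of RGB tuples into rows of RGBA tuples."""
    return [
        [(r, g, b, feathered_alpha((r, g, b), background, inner, outer))
         for (r, g, b) in row]
        for row in pixels
    ]
```

A hard threshold would produce only alpha values of 0 or 255; the ramp between `inner` and `outer` is what gives edge pixels partial transparency.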

6. Image Colorization (v1.1)

POST /api/colorize
- Payload:
```
{
  "image_b64": "data:image/png;base64,...",
  "style": "vibrant | warm | cool | vintage | sepia",
  "mock": false
}
```
- Styles:
  - Vibrant: Boost saturation and contrast for punchy, eye-catching colors
  - Warm: Golden temperature shift for cozy, sunset-like atmospheres
  - Cool: Blue temperature shift for calming, professional aesthetics
  - Vintage: Retro film look with muted tones and warm overlay
  - Sepia: Classic sepia tone for timeless, nostalgic effects
- Actions: Applies adaptive color grading and enhancement filters to transform image color profiles.
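
As a small example of what a style transform can look like, the sketch below applies a widely used sepia weighting matrix to a single RGB pixel. This README does not state which coefficients LumaForge uses, so the matrix and the function name `sepia` are assumptions for illustration only.

```python
# Commonly used sepia weights: each output channel is a weighted sum of the
# input R, G, B values. Assumed here, not taken from the LumaForge source.
SEPIA_MATRIX = (
    (0.393, 0.769, 0.189),  # output red
    (0.349, 0.686, 0.168),  # output green
    (0.272, 0.534, 0.131),  # output blue
)


def sepia(pixel):
    """Map one RGB pixel (0-255 per channel) to its sepia-toned equivalent."""
    r, g, b = pixel
    return tuple(
        min(255, round(wr * r + wg * g + wb * b))  # clamp bright values
        for wr, wg, wb in SEPIA_MATRIX
    )
```

The red and green rows sum to more than 1, which is why bright inputs need clamping and why the result shifts toward warm brown tones.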

7. Face Restoration (v1.1)

POST /api/face-restoration
- Payload:
```
{
  "image_b64": "data:image/png;base64,...",
  "intensity": "low | medium | high | ultra",
  "mock": false
}
```
- Intensity Levels:
  - Low: Subtle enhancement, preserves original character
  - Medium: Balanced enhancement for improved clarity
  - High: Aggressive enhancement for maximum facial detail
  - Ultra: Maximum enhancement with intensive denoising and sharpening
- Actions: Applies denoising, sharpening, contrast enhancement, and color vibrancy boost to improve facial features and clarity.

8. Model Training Telemetry

POST /api/train: Triggers PyTorch UNet LoRA layer fine-tuning on a background thread.
GET /api/train/status: Returns live telemetry logs (epoch progress, validation loss metrics, prompt adherence).

9. Dataset Curation & Benchmarking

POST /api/curate: Curates and captions images.
POST /api/benchmark: Evaluates pipeline adherence, processing latency, and VRAM footprints.

⚡ Performance Optimizations

Attention Slicing: Pipeline memory slicing allows Stable Diffusion to run on standard consumer MPS buffers without out-of-memory errors.
Vectorized Processing: Replaced slow pixel iteration loops with fast vectorized NumPy operations, reducing processing latencies (Sketch generation to 4ms, Background removal to 8ms).
Token-Bucket Rate Limiters: Restricts API calls to prevent client flooding (10 generations/min, 60 general api calls/min).
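
A token bucket with the limits quoted above can be sketched as follows. This is a minimal single-threaded illustration, not the project's limiter: the class name, the continuous refill policy, and the injectable `clock` parameter are assumptions, and a server handling concurrent requests would also need a lock around `allow`.

```python
import time


class TokenBucket:
    """Allow up to `capacity` calls per `period` seconds, refilled continuously."""

    def __init__(self, capacity, period=60.0, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_rate = capacity / period  # tokens per second
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self):
        """Consume one token and return True, or return False if none are left."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Add the tokens earned since the last call, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


generation_limiter = TokenBucket(capacity=10)  # 10 generations/min
api_limiter = TokenBucket(capacity=60)         # 60 general API calls/min
```

Compared with a fixed one-minute window, a bucket allows a short burst up to its capacity and then admits requests at a steady rate, so a client that pauses briefly regains capacity gradually rather than all at once.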
