How to use sujithputta/Lumaforge with Diffusers:
pip install -U diffusers transformers accelerate

import torch
from diffusers import DiffusionPipeline

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("sujithputta/Lumaforge", dtype=torch.bfloat16, device_map="cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]
LumaForge: AI Image Generation Platform
Text-to-Image • Image Styling • Background Removal • 2x Upscaling • LoRA Fine-tuning
Modular image generation backend designed for creative developers. Combines Stable Diffusion, LoRA fine-tuning, and image enhancement with a professional web UI, optimized for Apple Silicon.
Explore Examples • Try Now • API Docs • Deploy
What is LumaForge?
LumaForge is a production-ready, modular image generation platform combining:
- AI Engine Backend (FastAPI + PyTorch + Stable Diffusion)
- Spatial UI Web Playground (Next.js + Tailwind + Bun)
- Advanced Safety & Moderation (Ollama-based content checks)
- Performance Optimizations (Apple Silicon MPS, vectorized processing)
- Deployment-Ready (Docker, Hugging Face Spaces, cloud-ready)
Perfect for building AI creative suites, automating design workflows, or deploying image generation at scale.
Features
| Feature | Status | Tech |
|---|---|---|
| Text-to-Image | ✅ | Stable Diffusion v1.5 |
| Image-to-Image Styling | ✅ | Img2Img with face protection |
| 2x Upscaling | ✅ | Lanczos + Unsharp Mask |
| Background Removal | ✅ | Vectorized NumPy (~8.9ms) |
| LoRA Fine-tuning | ✅ | PyTorch UNet adaptation |
| Web UI Dashboard | ✅ | Next.js + Tailwind glassmorphic |
| REST API | ✅ | FastAPI with rate limiting |
| Apple Silicon | ✅ | MPS acceleration (M1/M2/M3) |
| Safety & Auditing | ✅ | Ollama + JSONL logging |
Examples
Text-to-Image
Prompt: "A futuristic cyberpunk city at sunset"
curl -X POST http://localhost:7860/api/generate \
-H "Content-Type: application/json" \
-d '{"prompt": "cyberpunk city", "steps": 30, "guidance_scale": 7.5}'
Generate stunning images with rich composition and vibrant colors.
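The same call can be made from Python. The sketch below only mirrors the curl request above using the standard library; the function names are illustrative, and because the response schema is not documented in this README, the sender returns the raw body without parsing it.

```python
import json
import urllib.request

# URL and JSON fields are copied from the curl example; nothing else is assumed.
API_URL = "http://localhost:7860/api/generate"


def build_generate_request(prompt, steps=30, guidance_scale=7.5):
    """Build (but do not send) a POST request equivalent to the curl call."""
    payload = {"prompt": prompt, "steps": steps, "guidance_scale": guidance_scale}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=120):
    """Send the request to a running backend and return the raw response bytes."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With the backend running on port 7860, `send(build_generate_request("cyberpunk city"))` issues the request.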
Image-to-Image Styling
Input: Portrait photo → Output: Anime illustration
The pipeline preserves facial structure using a Radial Face Protection Mask while applying creative styles.
Key innovation: Pixel-accurate detail transfer preserves eyes, nose, and expression.
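A radial protection mask of this kind can be sketched in NumPy as follows. This is an illustration, not the code in pipeline.py: the function names, the face center, and the sigma parameter are assumptions, and the 55% / 90% weights are the ones quoted under Key Enhancements in this README.

```python
import numpy as np


def radial_face_mask(height, width, center, sigma):
    """Gaussian falloff: 1.0 at the face center, approaching 0.0 far away."""
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = center
    dist_sq = (ys - cy) ** 2 + (xs - cx) ** 2
    return np.exp(-dist_sq / (2.0 * sigma ** 2))


def protect_face(original, stylized, center, sigma,
                 face_weight=0.55, bg_style_weight=0.90):
    """Blend the original photo back into the stylized output around the face."""
    mask = radial_face_mask(original.shape[0], original.shape[1], center, sigma)
    mask = mask[..., None]  # broadcast over the color channels
    # Share of the original kept per pixel: face_weight at the center,
    # (1 - bg_style_weight) far from it.
    bg_keep = 1.0 - bg_style_weight
    keep = bg_keep + (face_weight - bg_keep) * mask
    blended = keep * original.astype(np.float32) + (1.0 - keep) * stylized.astype(np.float32)
    return np.clip(np.rint(blended), 0, 255).astype(np.uint8)
```

In practice the center and sigma would come from a face detector; here they are plain arguments to keep the sketch self-contained.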
Background Removal
Before: Product with background | After: Transparent background
Vectorized NumPy segmentation with smooth alpha feathering; completes in ~8.9ms.
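As a rough sketch of what a vectorized color-threshold segmenter with linear alpha feathering might look like: the function name, the reference background color, and the two distance thresholds are hypothetical, chosen only to show the technique.

```python
import numpy as np


def remove_background(rgb, bg_color, d_min=30.0, d_max=60.0):
    """Return an RGBA image whose alpha ramps linearly with color distance.

    Pixels closer than d_min to bg_color become transparent, pixels farther
    than d_max stay opaque, and pixels in between get a feathered alpha.
    """
    diff = rgb.astype(np.float32) - np.asarray(bg_color, dtype=np.float32)
    distance = np.sqrt((diff ** 2).sum(axis=-1))
    alpha = np.clip((distance - d_min) / (d_max - d_min), 0.0, 1.0)
    alpha_u8 = np.rint(alpha * 255).astype(np.uint8)
    return np.dstack([rgb, alpha_u8])
```

The whole image is processed with array operations, with no per-pixel Python loop, which is what makes millisecond-scale latencies plausible.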
2x Upscaling
Original: 512×512 | Upscaled: 1024×1024
Lanczos resampling + Unsharp Mask filter for crisp, detailed outputs.
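With Pillow, the two steps can be approximated as below. The function name is illustrative, and the Unsharp Mask settings are Pillow's defaults rather than values taken from the LumaForge pipeline.

```python
from PIL import Image, ImageFilter


def upscale_2x(image, radius=2.0, percent=150, threshold=3):
    """Double the resolution with Lanczos resampling, then sharpen."""
    width, height = image.size
    enlarged = image.resize((width * 2, height * 2), resample=Image.LANCZOS)
    sharpen = ImageFilter.UnsharpMask(radius=radius, percent=percent, threshold=threshold)
    return enlarged.filter(sharpen)
```

For example, `upscale_2x(Image.open("input.png")).save("output.png")` writes a 2x version of a local file.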
LoRA Fine-tuning
Train custom adapters in minutes:
python main.py train --epochs 5 --lr 5e-6 --batch_size 2
Monitor real-time loss metrics, prompt adherence, and progress.
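For readers unfamiliar with LoRA, the core idea is a trainable low-rank correction added to a frozen weight matrix. The NumPy sketch below shows the generic formulation only; it is not the code in train.py, and the argument names are illustrative.

```python
import numpy as np


def lora_forward(x, weight, lora_a, lora_b, alpha):
    """Frozen projection plus a scaled low-rank correction.

    weight: (out, in) frozen base matrix
    lora_a: (rank, in) trainable down-projection
    lora_b: (out, rank) trainable up-projection, usually initialized to zero
    """
    rank = lora_a.shape[0]
    scale = alpha / rank
    return x @ weight.T + scale * (x @ lora_a.T) @ lora_b.T
```

Because only `lora_a` and `lora_b` are updated, the number of trained parameters is a small fraction of the full layer, which is why adapters are cheap to train.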
The codebase is split into two self-contained subsystems:
graph TD
A[Next.js Spatial UI Client] -->|Bun Proxy Routes / Rate Limiters| B[FastAPI Backend Server]
B -->|PyTorch MPS / CPU| C[LumaForge Core Pipeline]
B -->|urllib API Call| D[Ollama LLM Client]
C -->|Stable Diffusion v1.5| E[Image Generation / Img2Img]
C -->|Vectorized NumPy & PIL| F[Post-Processing Filters]
C -->|LoRA Training Script| G[Fine-Tuning Engine]
D -->|llama3.2:1b| H[Prompt Expansion & Safety]
1. The Core AI Engine (model/)
- lumaforge/pipeline.py: The central image synthesis pipeline. It manages:
  - Text-to-Image Generation: Uses StableDiffusionPipeline loaded onto Apple Silicon MPS with attention slicing and float32 precision.
  - Image-to-Image (Img2Img): Instantiates StableDiffusionImg2ImgPipeline sharing preloaded model weights to minimize the unified memory footprint.
  - High-Fidelity 2x Upscaling: Upscales images using Lanczos resampling and an Unsharp Mask filter for crisp details.
  - Vectorized Background Remover: A fallback color-threshold segmenter vectorized in NumPy (running in 8.9ms) featuring smooth linear alpha feathering.
  - NumPy-Vectorized Mock Shaders: Full procedural pipeline to simulate sketches (dodge-blend), Ghibli paintings (NumPy 5x5 Bilateral Filter, YCbCr cell-shading, gradient ink outlines, and volumetric bloom highlights), and weather effects (motion-blurred rain/snow).
- lumaforge/ollama_client.py: Interacts with local Ollama (llama3.2:1b) to perform safety classification, creative prompt expansion (structured into subject, action, environment, style, lighting, camera, mood), and prompt rewriting.
- lumaforge/safety.py: Standardizes pre-generation text checking and post-generation image screening, archiving events in audit_log.jsonl.
- lumaforge/train.py: Runs PyTorch UNet LoRA layer fine-tuning on a curated dataset, writing live progress telemetry to train_log.json.
- lumaforge/dataset_curator.py: Automates image downloading, hashing, deduplication, and LLM-based captioning.
- lumaforge/benchmark.py: Profiles model performance, measuring generation latency, prompt adherence, and MPS VRAM overhead.
- app.py: FastAPI server exposing full endpoint proxies, custom token-bucket rate limiters, and background workers.
- main.py: Consolidated Command Line Interface (CLI) exposing generate, benchmark, curate, train, and audit subcommands.
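A safety classification call to a local Ollama server can be sketched with urllib as below. The endpoint and the `model`, `prompt`, and `stream` fields are Ollama's standard generate API; the instruction wording, the SAFE/UNSAFE labels, and the function names are assumptions and may differ from what ollama_client.py actually sends.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_safety_request(prompt, model="llama3.2:1b"):
    """Build a non-streaming generate request asking for a one-word verdict."""
    # Hypothetical instruction text; the real classifier prompt is not shown in this README.
    instruction = (
        "Classify the following image prompt as SAFE or UNSAFE. "
        "Answer with one word.\n\nPrompt: " + prompt
    )
    body = {"model": model, "prompt": instruction, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(prompt, timeout=60):
    """Send the request and return the model's text reply (requires Ollama running)."""
    with urllib.request.urlopen(build_safety_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"].strip()
```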
2. Next.js Web Playground (web/)
- Spatial UI Dashboard: Cards, backdrop blur components, and glowing background spotlights.
- Playground Panel: Offers side-by-side Text-to-Image and Image-to-Image controls, file upload drag-zones, strength sliders, and preset task templates (Style Transfer, Color Recolor, Object Addition, Background Replacement).
- Hover Viewport Overlays: Success screens support immediate Download, Scale Up 2x, and Remove BG actions.
- Fine-Tuning Telemetry: Real-time graphs showing training/validation loss, prompt adherence, overall progress bars, and scrolling stdout logs.
- Censorship Audit logs: Tabulates prompt status (APPROVED, REWRITTEN, REFUSED) with safety classification reasoning.
- Bun API Proxying: Employs sliding-window rate limiters restricting web users to 10 generations and 20 upscales per minute.
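The sliding-window idea behind the proxy limits can be illustrated in a few lines. The actual limiter lives in the Bun proxy routes and is not shown in this README; this is a Python sketch of the technique with hypothetical names, using the 10-per-minute generation limit as the example.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within a moving time window."""

    def __init__(self, limit, window_seconds=60.0):
        self.limit = limit
        self.window = window_seconds
        self.hits = {}  # client id -> deque of request timestamps

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        timestamps = self.hits.setdefault(client_id, deque())
        # Drop timestamps that have slid out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False
        timestamps.append(now)
        return True
```

A caller would keep one limiter per action, for example `SlidingWindowLimiter(10)` for generations and `SlidingWindowLimiter(20)` for upscales.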
Key Enhancements & Optimizations
- Pixel-Accurate Detail Preservation (Tom Holland Face & Suit Rescue):
- Adaptive Detail Transfer: In Img2Img, the pipeline computes a high-pass gradient mask of the original photo. It overlays high-frequency edge details (eyes, nose, mouth contours, suit webs) back onto the cartoon output to prevent morphing.
- Radial Face Protection Mask: Blends 55% of the original photo into the face region with a soft Gaussian falloff, while letting the background be almost fully cartoonized (90% weight), so the portrait stays recognizable.
- Strength Cap: Dynamically limits diffusion strength to 0.32 for cartoon styles to preserve facial layouts during denoising.
- 500x Vectorization Speedups:
- Ported slow pure-Python nested pixel loops (Pencil Sketch dodge-blends, background removal thresholds) to vectorized NumPy arrays. Reduced sketch generation to 4.1ms and background removal to 8.9ms on a single thread.
- Smooth Alpha Feathering:
- Uses linear alpha interpolation between a min and max distance threshold to resolve background cutouts with smooth margins, eliminating pixelated outlines.
- VRAM Safety:
- Employs from_pipe shared diffusers pipelines and MPS attention slicing to generate images locally on macOS without bottlenecking VRAM.
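To make the vectorization point concrete, here is a minimal NumPy sketch of a pencil-sketch dodge-blend with no per-pixel Python loops. It is an illustration only: the function names are hypothetical, and a box blur stands in for whatever blur the real shader uses.

```python
import numpy as np


def box_blur(gray, radius):
    """Separable box blur via cumulative sums, with edge padding."""
    size = 2 * radius + 1
    padded = np.pad(gray.astype(np.float64), radius, mode="edge")
    # Horizontal pass: windowed sums from a zero-prefixed cumulative sum.
    csum = np.cumsum(padded, axis=1)
    csum = np.hstack([np.zeros((csum.shape[0], 1)), csum])
    rows = (csum[:, size:] - csum[:, :-size]) / size
    # Vertical pass on the horizontally blurred rows.
    csum = np.cumsum(rows, axis=0)
    csum = np.vstack([np.zeros((1, csum.shape[1])), csum])
    return (csum[size:, :] - csum[:-size, :]) / size


def pencil_sketch(gray, radius=4):
    """Color-dodge the grayscale image with a blurred copy of its inverse."""
    gray = gray.astype(np.float64)
    blurred_inverse = box_blur(255.0 - gray, radius)
    # Dodge blend: flat regions go white, edges stay dark. Guard the divisor.
    sketch = gray * 255.0 / np.maximum(255.0 - blurred_inverse, 1.0)
    return np.clip(np.rint(sketch), 0, 255).astype(np.uint8)
```

Flat regions map to white and intensity edges remain as dark strokes, which gives the sketch look.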
Getting Started
Prerequisites
- macOS with Apple Silicon (M1/M2/M3)
- Python 3.10+
- Node.js 18+ & Bun
- Ollama installed and running locally with the llama3.2:1b model pulled:
ollama pull llama3.2:1b
Backend Setup & Execution
- Navigate to the model folder and install Python dependencies:
cd model
pip install -r requirements.txt
- Start the FastAPI backend server (defaults to port 7860 with hot-reloading):
python3 app.py
- (Optional) Run pipeline commands directly via the CLI:
  - Generate an Image (Mock Mode):
python3 main.py generate --prompt "cyberpunk street" --mock
  - Generate an Image (Real Diffusion):
python3 main.py generate --prompt "studio ghibli scene" --device mps
  - Run Evaluation Benchmarks:
python3 main.py benchmark --mock
Frontend Web Setup & Execution
- Navigate to the web folder and install Node packages:
cd web
bun install
- Start the Next.js development server (runs on http://localhost:3000):
bun run dev
- Open your browser and navigate to http://localhost:3000 to interact with the workstation.
Evaluation & Verification
A dedicated test suite is available at the root directory to verify pipeline performance:
python3 test_enhancements.py
Asserted Latencies:
- Vectorized Background Removal: ~8 ms (Expected: <100 ms)
- Vectorized Pencil Sketch Dodge-Blend: ~4 ms (Expected: <50 ms)
- Bilateral Cell-Shaded Ghibli Cartoon Shader: ~100 ms (Expected: <250 ms)
- Composited Background Replacement: ~10 ms (Expected: <50 ms)
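A latency assertion of this kind can be written with the standard library alone. The helper names below are hypothetical and are not taken from test_enhancements.py; they only show one way to check a function against a millisecond budget.

```python
import time


def measure_latency_ms(func, *args, repeats=5, **kwargs):
    """Return the best-of-N wall-clock latency of func in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        best = min(best, (time.perf_counter() - start) * 1000.0)
    return best


def assert_latency(func, budget_ms, *args, **kwargs):
    """Fail with a readable message if func exceeds its latency budget."""
    elapsed = measure_latency_ms(func, *args, **kwargs)
    assert elapsed < budget_ms, f"{func.__name__}: {elapsed:.2f} ms exceeds {budget_ms} ms"
    return elapsed
```

Taking the best of several runs reduces noise from warm-up and background load, which matters when the budget is only a few milliseconds.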
License
This project is licensed under the MIT License - see the LICENSE file for details.
System Architecture
┌────────────────────────────────────────────────────────────────┐
│                     Next.js Web Playground                     │
│         (Glassmorphic Spatial UI, Realtime Monitoring)         │
└───────────────────────────────┬────────────────────────────────┘
                                │
                      Bun API Proxy Routes
                     (Rate Limiting / Auth)
                                │
┌───────────────────────────────┴────────────────────────────────┐
│                    FastAPI Backend (app.py)                    │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │ Safety Manager (Ollama Integration)                      │  │
│  │ - Prompt moderation & safety classification              │  │
│  │ - Output screening & audit logging                       │  │
│  └──────────────────────────────────────────────────────────┘  │
│                               │                                │
│  ┌────────────────────────────┴─────────────────────────────┐  │
│  │          LumaForge Core Pipeline (pipeline.py)           │  │
│  │   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │  │
│  │   │Text-to-Image │  │Img-to-Img    │  │Upscaling     │   │  │
│  │   └──────────────┘  └──────────────┘  └──────────────┘   │  │
│  │   ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │  │
│  │   │BG Removal    │  │LoRA Training │  │Benchmarks    │   │  │
│  │   └──────────────┘  └──────────────┘  └──────────────┘   │  │
│  └────────────────────────────┬─────────────────────────────┘  │
│                               │                                │
│                         PyTorch + MPS                          │
│                     Stable Diffusion v1.5                      │
│                   (Apple Silicon Optimized)                    │
│                                                                │
└────────────────────────────────────────────────────────────────┘
Project Structure
LumaForge/
├── model/                      # Backend (FastAPI + PyTorch)
│   ├── app.py                  # FastAPI server entrypoint
│   ├── main.py                 # CLI interface
│   ├── requirements.txt        # Python dependencies
│   ├── Dockerfile              # Docker configuration
│   ├── README.md               # Model documentation
│   └── lumaforge/
│       ├── pipeline.py         # Core image synthesis
│       ├── ollama_client.py    # LLM integration
│       ├── safety.py           # Content moderation
│       ├── train.py            # LoRA fine-tuning
│       ├── dataset_curator.py  # Image curation
│       └── benchmark.py        # Performance evaluation
├── web/                        # Frontend (Next.js)
│   ├── app/                    # Next.js 13+ app directory
│   ├── components/             # UI components
│   └── README.md               # Web UI docs
├── data/                       # Dataset storage
├── outputs/                    # Generated images
└── README.md                   # This file
Key Optimizations
Performance
- Vectorized NumPy: Background removal in ~8.9ms, sketch generation in ~4.1ms
- Apple Silicon MPS: GPU acceleration with attention slicing for memory efficiency
- Shared Pipeline Weights: Minimize VRAM overhead
- Token-Bucket Rate Limiting: 10 gen/min, 60 API calls/min per IP
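For reference, a token bucket sized for 10 generations per minute can be sketched as follows. The class and function names are hypothetical; the limiter in app.py is not shown in this README and may be structured differently.

```python
import time


class TokenBucket:
    """Bucket that refills continuously and spends one token per request."""

    def __init__(self, capacity, refill_per_second, now=None):
        self.capacity = float(capacity)
        self.refill_per_second = float(refill_per_second)
        self.tokens = float(capacity)
        self.updated = time.monotonic() if now is None else now

    def consume(self, now=None, cost=1.0):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


buckets = {}  # one bucket per client IP


def allow_generation(ip, now=None):
    """Per-IP check: 10 tokens of burst, refilled at 10 tokens per minute."""
    if ip not in buckets:
        buckets[ip] = TokenBucket(capacity=10, refill_per_second=10 / 60, now=now)
    return buckets[ip].consume(now=now)
```

Unlike a fixed window, the bucket allows a short burst up to its capacity and then throttles to the steady refill rate.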
Quality
- Radial Face Protection Mask: Preserves facial structure in transformations
- High-Pass Detail Transfer: Pixel-accurate detail preservation
- Adaptive Strength Capping: Limited to 0.32 for cartoon styles
- Lanczos + Unsharp Mask: High-fidelity 2x upscaling
Safety
- Multi-Stage Moderation: Pre & post-generation checks
- Ollama Integration: Local LLM-based classification
- Audit Logging: JSONL format for compliance
- Content Tagging: Automatic classification
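JSONL audit logging amounts to appending one JSON object per line. The sketch below assumes a hypothetical record layout (timestamp, prompt, status, reason); only the APPROVED, REWRITTEN, and REFUSED status values come from this README, and the real schema of audit_log.jsonl may differ.

```python
import json
import time
from pathlib import Path

VALID_STATUSES = {"APPROVED", "REWRITTEN", "REFUSED"}


def log_audit_event(path, prompt, status, reason):
    """Append one moderation decision to a JSONL file and return the record."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    # Field names are illustrative, not the project's actual schema.
    event = {"timestamp": time.time(), "prompt": prompt, "status": status, "reason": reason}
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(event, ensure_ascii=False) + "\n")
    return event


def read_audit_log(path):
    """Load every record from a JSONL audit file."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Append-only, line-delimited records can be tailed or filtered with standard tools without loading the whole file.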
Quick Start
Prerequisites
- macOS with Apple Silicon (M1/M2/M3)
- Python 3.10+, Node.js 18+, Bun
- Ollama running locally with llama3.2:1b
Backend
cd model
pip install -r requirements.txt
python app.py
Server: http://localhost:7860
Frontend
cd web
bun install
bun run dev
UI: http://localhost:3000
Quick Test
cd model
python main.py generate --prompt "cyberpunk street" --mock
API Endpoints
- POST /api/generate - Text-to-Image
- POST /api/generate-img2img - Image styling
- POST /api/upscale - 2x upscaling
- POST /api/remove-background - Background removal
- POST /api/train - Start LoRA fine-tuning
- GET /api/train/status - Training progress
- GET /api/status - System status
Full API reference: model/README.md
Performance Metrics
| Operation | Latency | Device |
|---|---|---|
| Text-to-Image (30 steps) | ~12-15s | M1 MPS |
| Image-to-Image (20 steps) | ~8-10s | M1 MPS |
| 2x Upscaling | ~1.2s | CPU |
| Background Removal | ~8.9ms | NumPy |
| Pencil Sketch | ~4.1ms | NumPy |
Deployment
Docker
cd model
docker build -t lumaforge .
docker run -p 7860:7860 lumaforge
Hugging Face Spaces
- Create Docker space
- Push model/ directory
- Auto-deploys to your URL
Safety
- Content moderation with Ollama
- Comprehensive audit trails
- Per-IP rate limiting
- Optional watermarking
Documentation
Built for Creative AI Development