FLUX.2 and Krea-2-Turbo run natively on Apple Silicon — text encoder, diffusion transformer, and VAE, all on-device. Type a prompt, or hand it your own photo and tell it what to change. No cloud, no Python, no subscription.
Pick your model in the Image pane, hit Download once, and generate with a live progress bar as it denoises. Both pipelines were validated as numerically faithful to their reference implementations.
FLUX.2-klein: 4B parameters, ~5 GB pre-quantized. Quick results, and the engine behind instruction-based photo editing. Comfortable on 8–16 GB Macs.
Krea-2-Turbo: 12.9B parameters, one-click ~15 GB download. Photorealistic output, validated at 0.9996 end-to-end pixel cosine similarity against the reference implementation.
Every generated image passes a local NSFW classifier — nothing is uploaded anywhere. On by default, with a Safe-mode toggle (and a --no-safety flag).
In Agent mode, "draw a red fox in the snow" renders inline in the conversation with your saved Image settings. Double-click to open full-size.

Attach a photo, type what should change, and FLUX.2-klein edits it while keeping the subject, pose, and scene intact. This isn't a noisy remix — your photo rides through the model as a clean in-context reference, the exact mechanism the model was trained on. Measured live: a "make the fox blue" edit kept 97% structural correlation with the original.
Your photo keeps its proportions, too: a portrait or landscape source is recomposed into the output size — never stretched, never squished.

Every image model — Krea-2 included — takes a source image plus a strength slider: low for a subtle remix that keeps the composition, high for a full re-imagination that keeps only the vibe. Sources with a different shape than the output are center-cropped, never distorted.
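The center-crop behavior described above can be sketched in a few lines. This is a minimal illustration of the general technique, not mlx-serve's actual code; the function name `center_crop_box` is hypothetical. It finds the largest centered region of the source that matches the output's aspect ratio, so the later resize never has to stretch anything.

```python
def center_crop_box(src_w, src_h, out_w, out_h):
    """Return (left, top, right, bottom) of the largest centered region of the
    source whose aspect ratio matches the output size."""
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if src_w * out_h > src_h * out_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = src_h
        crop_w = src_h * out_w // out_h
    else:
        # Source is taller (or equal): keep full width, trim top and bottom.
        crop_w = src_w
        crop_h = src_w * out_h // out_w
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)
```

For a 4000x3000 landscape photo and a 1024x1024 output, this keeps the middle 3000x3000 square and discards 500 pixels on each side.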
Both VAE encoders were validated by encode→decode round-trips at pixel correlation 0.999+, and they ship inside the model downloads you already have — nothing extra to fetch.
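For readers curious what a pixel-correlation check involves, here is a minimal sketch using the Pearson correlation over flattened pixel values. The page does not say exactly which correlation was used, so treat this as one plausible reading; `pixel_correlation` is a hypothetical helper, and the inputs are plain lists of pixel intensities.

```python
import math

def pixel_correlation(a, b):
    """Pearson correlation between two equally sized, flattened images."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Covariance and variances, left unnormalized because the factor cancels.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / math.sqrt(var_a * var_b)
```

A round-trip test would pass the original pixels as `a` and the encode-then-decode result as `b`; values close to 1.0 mean the reconstruction tracks the original almost exactly.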
An Instagram filter adjusts the pixels you already have. A LoRA changes how the model paints: attach one small .safetensors file and every generation comes out watercolor, anime, film-noir, or in the signature look of whoever trained it — composition, lighting, brushwork and all. It's the difference between tinting a photo and hiring a different artist.
mlx-serve applies diffusers-format LoRAs at runtime: no re-quantization, zero quality loss on the base weights, clean detach between requests. Grab any compatible LoRA from HuggingFace or Civitai, point Advanced options at the file, and dial the strength. The same mechanism restyles LTX video generations too.
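The idea behind runtime LoRA application can be shown with a toy linear layer. This is a sketch of the standard LoRA formulation, y = Wx + scale * B(Ax), and not a description of mlx-serve's internals: whether the server computes a side branch like this or something equivalent is an assumption, the names are hypothetical, and the usual alpha/rank factor is folded into `scale` for brevity.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(base_w, lora_a, lora_b, scale, x):
    """Compute W x + scale * B (A x) without modifying the base weights."""
    base_out = matvec(base_w, x)
    # The low-rank path: project down with A, then back up with B.
    low_rank = matvec(lora_b, matvec(lora_a, x))
    return [b + scale * d for b, d in zip(base_out, low_rank)]

# Toy example: a 2x2 base layer with a rank-1 adapter.
base_w = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 1.0]]        # shape 1x2 (down projection)
lora_b = [[0.5], [-0.5]]     # shape 2x1 (up projection)
styled = lora_forward(base_w, lora_a, lora_b, 0.9, [2.0, 4.0])
plain = lora_forward(base_w, lora_a, lora_b, 0.0, [2.0, 4.0])
```

Because the adapter is added alongside the base weights rather than baked into them, setting the scale to zero (or dropping the adapter) returns the unmodified model output, which is what makes a clean detach possible.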
# "image" is optional; "mode" is "edit", or "variation" plus a strength
curl http://localhost:11234/v1/images/generations -d '{
  "prompt": "a lighthouse at dusk, long exposure",
  "size": "1024x1024",
  "image": "<base64 source photo>",
  "mode": "edit",
  "lora_path": "/path/to/watercolor.safetensors",
  "lora_scale": 0.9
}'
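The same request can be assembled from a script. The sketch below uses only the Python standard library and the field names shown in the curl example; the `strength` field name, the JSON content-type header, and the helper names `build_payload` and `generate` are assumptions, and the response is returned as raw bytes because its format is not described here.

```python
import base64
import json
import urllib.request

def build_payload(prompt, size="1024x1024", image_bytes=None, mode=None,
                  strength=None, lora_path=None, lora_scale=None):
    """Build the JSON body, including only the optional fields that were set."""
    payload = {"prompt": prompt, "size": size}
    if image_bytes is not None:
        # The source photo travels as base64 text inside the JSON body.
        payload["image"] = base64.b64encode(image_bytes).decode("ascii")
    if mode is not None:
        payload["mode"] = mode
    if strength is not None:
        payload["strength"] = strength  # assumed field name for variation mode
    if lora_path is not None:
        payload["lora_path"] = lora_path
    if lora_scale is not None:
        payload["lora_scale"] = lora_scale
    return payload

def generate(payload, url="http://localhost:11234/v1/images/generations"):
    """POST the payload to a running local server and return the raw response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the server running, `generate(build_payload("a lighthouse at dusk, long exposure", lora_path="/path/to/watercolor.safetensors", lora_scale=0.9))` would send the text-to-image version of the request above.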
No. The whole pipeline — text encoder, diffusion transformer, VAE — runs inside the same native Zig server that does chat, through MLX. Click Download in the Image pane, then Generate.
Img2img (variation mode) re-noises your photo and re-imagines it — great for remixes, useless for "change one thing". Edit mode passes your photo as a clean in-context reference and generates fresh, so "remove the monitor in the background" keeps the person, pose, and room. Live measurement: 0.97 structural correlation vs 0.16 without the reference mechanism.
Any diffusers-format LoRA .safetensors — the standard on HuggingFace and Civitai. Common alias layouts (lora_A/B, lora_down/up, wrapper prefixes) are handled automatically. Attach under Advanced options or pass lora_path / lora_scale on the API.
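To illustrate what handling alias layouts can look like, here is a minimal sketch that rewrites tensor names into one canonical form. It is not mlx-serve's loader: the prefix list is a set of example wrappers chosen for illustration, and the function name `normalize_key` is hypothetical. The one firm convention it relies on is that `lora_down` plays the role of `lora_A` and `lora_up` the role of `lora_B`.

```python
# Example wrapper prefixes; the real set a loader accepts is an assumption here.
WRAPPER_PREFIXES = ("base_model.model.", "diffusion_model.", "transformer.")

# The down projection is A and the up projection is B.
ALIASES = {"lora_down": "lora_A", "lora_up": "lora_B"}

def normalize_key(key):
    """Strip known wrapper prefixes and map alias names to lora_A / lora_B."""
    changed = True
    while changed:
        changed = False
        for prefix in WRAPPER_PREFIXES:
            if key.startswith(prefix):
                key = key[len(prefix):]
                changed = True
    # Replace alias components one dotted segment at a time.
    return ".".join(ALIASES.get(part, part) for part in key.split("."))
```

After normalization, files saved with either naming scheme yield the same keys, so the rest of the loading path only needs to understand one layout.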
FLUX.2-klein is comfortable on 8–16 GB; Krea-2-Turbo wants ~16 GB. The app pre-checks free memory before generating, and models load on demand and unload after. Outputs land in ~/.mlx-serve/generations/images/, organized by date.
No. Generation, editing, and the safety classifier all run on-device. Your source photos and outputs never leave the Mac.
Download MLX Core, grab FLUX.2 or Krea-2 with one click, and generate — or hand it your own photos and start editing with words.