How do I run an AI agent's shell commands safely on a Mac?

Turn on Agent Sandbox in MLX Core and every shell command the agent issues runs inside an isolated Linux VM built directly on Apple's Virtualization framework, not on macOS itself. The VM boots in under a second, a green shield in the chat toolbar shows when commands run isolated, and the agent is told which environment it's in so it uses the right tools.

Can I still use the web apps the agent builds inside the sandbox?

Yes. This is what makes the sandbox pleasant to use: any server the agent starts inside the VM on a non-loopback address is mirrored live to localhost on your Mac. An Express app the agent runs on guest port 8080 is reachable at http://localhost:8080 in your browser, automatically. On the host, mirroring is loopback-only; nothing is exposed to your LAN.

Does the Agent Sandbox need Docker?

No. It pulls standard OCI container images directly — no Docker daemon, no Docker Desktop license. The image's filesystem becomes the VM's root, shared on demand, so guest overhead is about 1 GiB of RAM and boot takes under a second.

What can the sandboxed agent actually touch on my Mac?

Only the workspace folder you chose, mounted into the VM at /workspace. The mount follows automatically when you switch working folders. Networking is a separate toggle: turn it off and the VM gets no network device at all. Everything else on your Mac is invisible to the guest.

Let the agent go wild. Your Mac stays untouched.

The problem

You can't review every command. So stop having to.

Agent mode is most useful when you let it run — npm install, scaffold, migrate, retry. But every one of those commands executes somewhere, and per-command approval dialogs train you to click Allow without reading. The honest fix isn't more dialogs; it's making the blast radius someone else's filesystem.

With the sandbox on, the agent's shell is a real Linux machine that isn't yours. It can rm -rf to its heart's content: the VM's root is disposable, and apart from the one workspace folder you shared, your Mac's disk was never in the room.

Screenshot: MLX Core chat with the green sandbox shield active while an agent runs shell commands in the Linux VM. The green shield means every shell command is running inside the VM, not on macOS.

The engineering

A real VM that behaves like a lightweight one

This is Apple's Virtualization framework driving a minimal Linux kernel — the same foundation apps like UTM ship on, Mac App Store-compatible, no kernel extensions, no Docker daemon. The root filesystem is a standard OCI container image, pulled directly (no Docker required) and shared into the VM on demand, so the guest needs only ~1 GiB of RAM and cold-boots in under a second.

The agent is also told which environment it's in — so it reaches for apt inside the VM instead of brew, and vice versa when the sandbox is off.
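
As a rough illustration of that environment hint, here is a minimal sketch in JavaScript. The function name `environmentHint` and the wording of the note are assumptions made for this example; they are not MLX Core's actual prompt or API.

```javascript
// Hypothetical sketch: build the environment note an agent might receive.
// The function name, fields, and wording are illustrative assumptions.
function environmentHint(sandboxOn) {
  if (sandboxOn) {
    return {
      os: "linux",
      packageManager: "apt",
      note: "Shell commands run inside a Linux VM. Use apt, not brew.",
    };
  }
  return {
    os: "macos",
    packageManager: "brew",
    note: "Shell commands run on macOS. Use brew, not apt.",
  };
}

// Print the note for a sandboxed session.
console.log(environmentHint(true).note);
```

The point of the design is that the tool choice is decided once, from the sandbox state, rather than left for the agent to guess from failed commands.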

Apple Virtualization framework — hardware-enforced isolation, signed by the OS vendor.
OCI images as rootfs — pull any standard container image, no Docker.
Sub-second boot, ~1 GiB RAM — cheap enough to leave on always.
Background processes tracked — the chat shows what's running in the guest, with a kill button.

The magic trick

Guest servers appear on your localhost. Live.

The classic sandbox pain: the agent builds you a web app, and now it's trapped in a VM you can't reach. Here, every non-loopback port the guest starts listening on is mirrored to localhost on your Mac automatically. The agent runs an Express app on guest port 8080, and http://localhost:8080 opens it in your browser, live, while it runs.

On the host, mirroring binds to loopback only, so nothing is ever exposed to your LAN. Networking itself is a separate toggle: switch it off and the VM gets no network device at all, which also means no mirroring.

agent, inside the VM

$ node server.js &
Listening on port 8080

# on your Mac, automatically:
http://localhost:8080  → the guest app, live
# loopback only: never your LAN, never the guest IP

The boundary

Exactly one shared folder. You picked it.

The only piece of your Mac the guest can see is the workspace folder you chose, mounted at /workspace, and the mount follows automatically when you switch working folders. That's where the agent's deliverables land; everything else on your disk simply doesn't exist from inside the VM.

A live RAM readout for the guest sits in the menu-bar tray, and the whole sandbox — image, kernel, caches — lives under ~/.mlx-serve/sandbox/ where you can delete it any time.

Files: only /workspace crosses the boundary.
Network: on = DHCP inside the guest + loopback mirroring; off = no device.
Visibility: green shield in the toolbar, guest RAM in the tray.
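
To make the single-folder boundary concrete, here is a minimal sketch of how a host path could be translated into the guest's view. The helper `toGuestPath` and the example paths are hypothetical and only illustrate the rule; the real isolation comes from the VM's mount, not from a path check like this one.

```javascript
const path = require("node:path");

// Hypothetical helper: translate a host path into the guest's view.
// Returns null for anything outside the shared workspace folder,
// reflecting the rule that only /workspace crosses the boundary.
// Note: this is purely lexical and does not resolve symlinks.
function toGuestPath(hostWorkspace, hostPath) {
  const root = path.posix.resolve(hostWorkspace);
  const target = path.posix.resolve(root, hostPath);
  const rel = path.posix.relative(root, target);
  // A relative path that climbs out with ".." escapes the workspace.
  if (rel === ".." || rel.startsWith("../")) {
    return null;
  }
  return rel === "" ? "/workspace" : path.posix.join("/workspace", rel);
}

console.log(toGuestPath("/Users/me/proj", "src/app.js"));            // /workspace/src/app.js
console.log(toGuestPath("/Users/me/proj", "/Users/me/.ssh/id_rsa")); // null
```

Anything that maps to `null` here has no counterpart inside the VM, which is the property the section describes.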

FAQ

Sandbox questions, answered

Do I need Docker or any extra install?

No. The sandbox pulls standard OCI images directly and boots them on Apple's Virtualization framework — no Docker daemon, no Docker Desktop license, no kernel extensions. It's all inside the signed, notarized app.

What's the performance cost?

Boot is under a second and the guest reserves about 1 GiB of RAM for workload headroom — the root filesystem is shared on demand rather than loaded into memory. Inference isn't affected at all; only shell commands route into the VM.

Can the agent still break anything that matters?

It can modify the workspace folder you mounted — that's the point of having one — and, if networking is on, make outbound connections from inside the guest. Everything else on your Mac is invisible to it. Turn networking off for fully airgapped runs.

How do I reach a server the agent started?

Just open http://localhost:<port>. Every non-loopback port the guest listens on is mirrored live to your Mac's loopback interface and released when the guest stops listening. Ports already in use on the host are skipped and logged.
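
The mirroring rule in this answer can be sketched as a small pure function. This is an illustrative model under stated assumptions, not the app's implementation: `planMirrors`, its input shape, and the returned fields are all hypothetical names.

```javascript
// Hypothetical sketch of the mirroring rule described above.
// guestListeners: [{ address, port }] sockets listening inside the guest.
// hostPortsInUse: Set of ports already bound on the Mac.
function planMirrors(guestListeners, hostPortsInUse) {
  const isLoopback = (addr) => addr === "::1" || addr.startsWith("127.");
  const mirrored = [];
  const skipped = [];
  const seen = new Set();
  for (const { address, port } of guestListeners) {
    // Guest-loopback listeners stay private; duplicates are handled once.
    if (isLoopback(address) || seen.has(port)) continue;
    seen.add(port);
    if (hostPortsInUse.has(port)) {
      skipped.push(port); // host port taken: skip it and report it
    } else {
      mirrored.push({ hostAddress: "127.0.0.1", port }); // loopback only
    }
  }
  return { mirrored, skipped };
}

const plan = planMirrors(
  [
    { address: "0.0.0.0", port: 8080 },   // web app: mirrored
    { address: "127.0.0.1", port: 5432 }, // guest-local database: not mirrored
    { address: "0.0.0.0", port: 3000 },   // host already uses 3000: skipped
  ],
  new Set([3000])
);
console.log(plan);
```

One practical consequence, if the rule works as modeled here: a server inside the guest that binds only to 127.0.0.1 would not appear on your Mac, so a guest app meant for your browser should listen on a non-loopback address such as 0.0.0.0.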

More deep dives

Self-healing tool calls →

Claude Code, fully local →

Always-on assistant →

Speculative decoding →

LM Studio alternative →

Ollama alternative →

Autonomy without the anxiety.