# claudebox

A thin layer over Claude Code's built-in [`/sandbox`](https://code.claude.com/docs/en/sandboxing) that adds **CIDR-level egress blocking** (Tailscale, RFC1918, MagicDNS) and **hardened credential-path denyRead defaults**. NixOS-distributed.

## When to use this

claudebox is worth it if **any** of these apply:

- You have **internal/private-network services** reachable from your machine that you don't want a prompt-injected agent to touch — anything on a mesh VPN (Tailscale, Headscale, Nebula, ZeroTier, WireGuard), anything on RFC1918 LAN (router admin, NAS, homelab, internal dashboards), or cloud metadata services (169.254.169.254).
- You're on **NixOS** and want hardened sandbox defaults (denyRead trifecta, opinionated `allowedDomains`) shipped as a flake input rather than hand-rolled per-project.

The gap claudebox closes over plain `/sandbox`: built-in `/sandbox` does **hostname-based egress allowlist only** — it cannot block address *ranges* like `100.64.0.0/10` (CGNAT, used by Tailscale and some ISPs), `192.168.0.0/16`, or `169.254.169.254`. If the agent resolves a name to one of those IPs (e.g. MagicDNS), the hostname allowlist won't catch the connection.

## When *not* to use this

Skip claudebox and just use `claude` with `/sandbox` enabled if:

- **No internal network exposure.** Your machine doesn't reach anything you wouldn't put on the public internet anyway. Hostname allowlist (`api.anthropic.com`, `github.com`, etc.) covers your exfil concern.
- **Not on NixOS.** This is distributed as a NixOS flake with a NixOS module for the nftables rules. The wrapper-only piece works elsewhere but you'd reinvent the network policy by hand.
- **You need hostname-only filtering.** `/sandbox` does that natively via `sandbox.network.allowedDomains` in `.claude/settings.json` — claudebox doesn't add anything there.

Put bluntly: if you took your laptop to a coffee shop and never noticed anything was missing, you probably don't need claudebox.
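To make the hostname-versus-range gap concrete, here is a minimal bash sketch that tests whether an IPv4 address falls inside one of the internal ranges named in this README. The function names (`ip_to_int`, `in_cidr`, `is_internal`) and the `INTERNAL_CIDRS` array are hypothetical helpers for illustration only. They are not part of claudebox, which enforces the block with nftables rather than shell.

```bash
#!/usr/bin/env bash
# Illustrative only: claudebox itself does this in the kernel via nftables.
# IPv4 ranges taken from this README (CGNAT, RFC1918, link-local).
INTERNAL_CIDRS=(100.64.0.0/10 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 169.254.0.0/16)

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if address $1 lies inside CIDR $2 (e.g. 10.0.0.0/8).
in_cidr() {
  local ip=$1 net=${2%/*} bits=${2#*/}
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ( $(ip_to_int "$ip") & mask ) == ( $(ip_to_int "$net") & mask ) ))
}

# Succeed if address $1 lies inside any of the internal ranges.
is_internal() {
  local cidr
  for cidr in "${INTERNAL_CIDRS[@]}"; do
    in_cidr "$1" "$cidr" && return 0
  done
  return 1
}

for ip in 100.100.100.100 192.168.1.1 169.254.169.254 8.8.8.8; do
  if is_internal "$ip"; then
    echo "$ip: internal range"
  else
    echo "$ip: public"
  fi
done
```

A hostname allowlist never performs this kind of check. It decides on the name, so a name that resolves into one of these ranges is judged by the name alone.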
## Quick start

```bash
nix run git+https://git.toph.so/toph/claudebox
```

Or add to your flake:

```nix
{
  inputs.claudebox.url = "git+https://git.toph.so/toph/claudebox";
}
```

Then add `inputs.claudebox.packages.${system}.default` to your `environment.systemPackages` or home-manager packages, and import the NixOS module to install the nftables rules:

```nix
{
  imports = [ inputs.claudebox.nixosModules.default ];
  services.claudebox.enable = true;
}
```

Without the module, `claudebox` still runs but the CIDR block won't be enforced — you'll get only the hardened denyRead defaults on top of `/sandbox`.

## What it does

- Writes a hardened `sandbox.*` config into `./.claude/settings.local.json` (deep-merge: preserves your other keys, replaces the sandbox subtree).
- Launches `claude` inside the `claude-sandbox.slice` systemd user scope so nftables rules can match by cgroup.
- NixOS module installs the nftables `output` chain that drops egress to private/internal ranges — CGNAT (`100.64.0.0/10`, used by Tailscale/Headscale/some ISPs), RFC1918 (`10/8`, `172.16/12`, `192.168/16`), link-local (`169.254/16`, includes cloud metadata services), Tailscale's IPv6 ULA prefix (`fd7a:115c:a1e0::/48`), generic IPv6 ULA (`fc00::/7`), and IPv6 link-local — only for processes inside that slice. CIDRs are configurable via the module.

What it doesn't do (anymore, post-rewrite): no bwrap orchestration of its own, no SANDBOX.md injection, no per-project history overlay, no forced `--dangerously-skip-permissions`. Claude's built-in `/sandbox` handles the kernel-isolation primitives; claudebox does network policy + Nix glue.

## Scope and limits

Right now, there are likely files on your machine you'd rather an attacker not exfiltrate — an unencrypted SSH key, an agenix age key, mail server credentials, your `~/.aws/credentials`. This section describes what the sandbox does and does not keep them safe from. The defaults are not no-op; they protect against things you may not have catalogued.
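As a quick way to see what is at stake on your own machine, the sketch below prints which of a few common credential locations exist. The `list_present` helper and the path list are assumptions chosen for illustration. This is not claudebox's actual `denyRead` list, which lives in the generated sandbox config.

```bash
#!/usr/bin/env bash
# Print each given path that exists on this machine.
# Hypothetical helper, not part of claudebox.
list_present() {
  local p
  for p in "$@"; do
    if [ -e "$p" ]; then
      printf '%s\n' "$p"
    fi
  done
}

# Illustrative paths only; add whatever unusual secret locations you use.
list_present \
  "$HOME/.ssh" \
  "$HOME/.gnupg" \
  "$HOME/.aws/credentials" \
  "$HOME/.config/gcloud"
```

Anything this prints is worth comparing against the audit shown at launch, since locations outside the preset are not covered unless you add them.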
**Explicitly in scope:**

- Reducing blast radius of model misbehavior to "I lost an hour of work" rather than "my SSH keys are on pastebin."
- Making the easy path the safer one — fewer footguns, less to remember.
- Knowing which tier I'm in for any given session, and switching deliberately. See [THREAT-MODEL.md](./THREAT-MODEL.md) for the posture ladder.

**What this sandbox protects:**

- **Reads of well-known credential paths are denied.** SSH keys, GPG keys, AWS/GCP creds, agenix/sops secrets, Tailscale state — the standard list of dotfiles and runtime secret locations. Enforced by `sandbox.filesystem.denyRead` at the syscall layer. ([Reasoning, including the list-drift caveat.](./GUARANTEES.md#mount-namespace-denial))
- **Writes outside the working directory are denied** by default (Claude Code's `/sandbox` default policy). The agent cannot overwrite your `~/.bashrc`, drop a hook into `~/.claude/hooks/`, or touch anything else in `$HOME` without being explicitly allowed.
- **The agent cannot reach internal-network hosts.** CGNAT (Tailscale, etc.), RFC1918, MagicDNS, link-local — all dropped by nftables matched on cgroup membership. This one *is* structural: kernel-enforced, won't drift, fires at packet emit time. ([Why this holds.](./GUARANTEES.md#internal-network-block))

The network block is the strongest claim — kernel rules matched on slice membership, no configuration list to forget. The credential-read denial is a hardened preset; the list is opinionated but finite, and unusual credential locations on your machine won't be covered unless you add them.

**What it does *not* guarantee:**

- Anything in the working directory that you wouldn't want public — `.env` files, hardcoded credentials, customer-data test fixtures, database dumps — can be exfiltrated through allowed network destinations (GitHub, npm, Anthropic API, anything you've permitted). Source code itself is rarely the worry; LLMs have made code largely commodity.
  The issue is *what's next to* the code in the same dir. The sandbox confines the *session*; it does not protect what flows out of it. **Code review at commit/push time** is the control for that leg. ([CWD exfil reasoning.](./GUARANTEES.md#cwd-exfil) · [Code review as control.](./GUARANTEES.md#code-review-as-control))
- Defense against an attacker with **specific knowledge of your setup**. claudebox is good for untargeted attacks (random injections, generic exfil payloads). It is *not* sufficient against someone actively targeting you who knows your dotfile layout, dependency stack, CI pipeline, or homelab topology. For higher-risk work, escalate to a remote VM or managed sandbox — see [THREAT-MODEL.md](./THREAT-MODEL.md#the-line-we-draw).

If you want to skip the sandbox for a session — you trust this task, you need full homelab access, you're decrypting agenix locally — run bare `claude` instead. The choice happens at the binary name. No flag inside the wrapper turns the sandbox off; that would be a false-safety footgun.

## Flags

| Flag | Description |
|------|-------------|
| `--yes`, `-y` | Skip the audit prompt and launch immediately |
| `--dry-run` | Print the launch command without executing |
| `--check` | Verify prerequisites (claude, jq, systemd-run, nftables chain) and exit |
| `--no-slice` | Skip the systemd slice scope (CIDR block won't apply — for debugging) |
| `--` | Pass remaining args to Claude Code |

## How it works

```
project root/
└── .claude/
    └── settings.local.json   # managed by claudebox (sandbox.* keys),
                              # user keys preserved on merge
```

On launch the wrapper:

1. Computes the canonical project root (worktree-aware via `git rev-parse --git-common-dir`).
2. Deep-merges the hardened `sandbox.*` config into `.claude/settings.local.json`. Existing top-level keys (model, env, MCP servers, etc.) are kept; the `sandbox` subtree is replaced wholesale.
3. Shows an audit of what's being applied, asks for confirmation.
4. Execs `systemd-run --user --scope --slice=claude-sandbox.slice -- claude "$@"`.

Inside that slice, two things happen in parallel:

- Claude Code reads `settings.local.json` and activates its built-in `/sandbox` — bwrap + seccomp + namespace isolation + hostname-allowlisted proxy.
- The kernel nftables rules (installed by the NixOS module) fire on every outgoing packet from any socket inside `claude-sandbox.slice`, dropping packets bound for internal CIDRs.

Together: kernel-isolated process for the session, kernel-enforced CIDR block for the network, hostname allowlist on top.

## Requirements

- NixOS with flakes enabled (the NixOS module is the value-add — without it, `claudebox` falls back to the same set of guarantees as plain `/sandbox`).
- `jq`, `systemd-run`, and `claude` on PATH (bundled via the flake's `runtimeInputs`).
- cgroup v2 (default on every modern systemd setup).
- Kernel with `socket cgroupv2` nftables match (default on NixOS).

## License

MIT