# claudebox

A thin layer over Claude Code's built-in [`/sandbox`](https://code.claude.com/docs/en/sandboxing) that adds **CIDR-level egress blocking** (Tailscale, RFC1918, MagicDNS) and **hardened credential-path denyRead defaults**. NixOS-distributed.

## When to use this

claudebox is worth it if **any** of these apply:

- You have **internal/private-network services** reachable from your machine that you don't want a prompt-injected agent to touch — anything on a mesh VPN (Tailscale, Headscale, Nebula, ZeroTier, WireGuard), anything on RFC1918 LAN (router admin, NAS, homelab, internal dashboards), or cloud metadata services (169.254.169.254).
- You're on **NixOS** and want hardened sandbox defaults (denyRead trifecta, opinionated `allowedDomains`) shipped as a flake input rather than hand-rolled per-project.

The gap claudebox closes over plain `/sandbox`: the built-in sandbox enforces a **hostname-based egress allowlist only**. It cannot block by destination address, whether a *range* like `100.64.0.0/10` (CGNAT, used by Tailscale and some ISPs) or `192.168.0.0/16`, or a single address like `169.254.169.254`. If the agent resolves a name to one of those IPs (e.g. via MagicDNS), the hostname allowlist won't catch the connection.
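To make the distinction concrete, here is a minimal sketch of the kind of address-range test a hostname allowlist never performs. It is purely illustrative: claudebox does not run this check in userspace (the real enforcement lives in nftables), and the function names `ip_to_int` and `ip_in_cidr` are made up for this example.

```bash
#!/usr/bin/env bash
# Illustrative only: IPv4 CIDR membership in plain shell arithmetic.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  set -- $1   # split the address on dots into $1..$4
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit 0) if address $1 falls inside CIDR $2, e.g. 100.64.0.0/10.
ip_in_cidr() {
  local ip net bits mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Classify a few destinations against the ranges named in this README.
for ip in 100.100.100.100 192.168.1.1 169.254.169.254 93.184.216.34; do
  verdict="public"
  for cidr in 100.64.0.0/10 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 169.254.0.0/16; do
    if ip_in_cidr "$ip" "$cidr"; then verdict="internal ($cidr)"; fi
  done
  echo "$ip: $verdict"
done
```

The point of the sketch is that the decision depends only on the destination address, not on whatever name was used to look it up.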

## When *not* to use this

Skip claudebox and just use `claude` with `/sandbox` enabled if:

- **No internal network exposure.** Your machine doesn't reach anything you wouldn't put on the public internet anyway. The hostname allowlist (`api.anthropic.com`, `github.com`, etc.) already covers your exfiltration concern.
- **Not on NixOS.** This is distributed as a NixOS flake with a NixOS module for the nftables rules. The wrapper-only piece works elsewhere, but you'd have to reinvent the network policy by hand.
- **You need hostname-only filtering.** `/sandbox` does that natively via `sandbox.network.allowedDomains` in `.claude/settings.json` — claudebox doesn't add anything there.

Put bluntly: if you took your laptop to a coffee shop and never noticed anything was missing, you probably don't need claudebox.

## Quick start

```bash
nix run git+https://git.toph.so/toph/claudebox
```

Or add to your flake:

```nix
{
  inputs.claudebox.url = "git+https://git.toph.so/toph/claudebox";
}
```

Then add `inputs.claudebox.packages.${system}.default` to your `environment.systemPackages` or home-manager packages, and import the NixOS module to install the nftables rules:

```nix
{
  imports = [ inputs.claudebox.nixosModules.default ];
  services.claudebox.enable = true;
}
```

Without the module, `claudebox` still runs but the CIDR block won't be enforced — you'll get only the hardened denyRead defaults on top of `/sandbox`.

## What it does

- Writes a hardened `sandbox.*` config into `./.claude/settings.local.json` (deep-merge: preserves your other keys, replaces the sandbox subtree).
- Launches `claude` in a transient systemd user scope under `claude-sandbox.slice` so nftables rules can match by cgroup.
- The NixOS module installs the nftables `output` chain that drops egress to private/internal ranges, only for processes inside that slice: CGNAT (`100.64.0.0/10`, used by Tailscale/Headscale/some ISPs), RFC1918 (`10.0.0.0/8`, `172.16.0.0/12`, `192.168.0.0/16`), link-local (`169.254.0.0/16`, which includes cloud metadata services), Tailscale's IPv6 ULA prefix (`fd7a:115c:a1e0::/48`), generic IPv6 ULA (`fc00::/7`), and IPv6 link-local (`fe80::/10`). CIDRs are configurable via the module.

What it doesn't do (anymore, post-rewrite): no bwrap orchestration of its own, no SANDBOX.md injection, no per-project history overlay, no forced `--dangerously-skip-permissions`. Claude's built-in `/sandbox` handles the kernel-isolation primitives; claudebox does network policy + Nix glue.

## Scope and limits

Right now, there are likely files on your machine you'd rather an attacker not exfiltrate: an unencrypted SSH key, an agenix age key, mail server credentials, your `~/.aws/credentials`. This section describes what the sandbox does and does not keep them safe from. The defaults are not a no-op; they protect against things you may not have catalogued.

**Explicitly in scope:**

- Reducing blast radius of model misbehavior to "I lost an hour of work" rather than "my SSH keys are on pastebin."
- Making the easy path the safer one — fewer footguns, less to remember.
- Knowing which tier I'm in for any given session, and switching deliberately. See [THREAT-MODEL.md](./THREAT-MODEL.md) for the posture ladder.

**What this sandbox protects:**

- **Reads of well-known credential paths are denied.** SSH keys, GPG keys, AWS/GCP creds, agenix/sops secrets, Tailscale state — the standard list of dotfiles and runtime secret locations. Enforced by `sandbox.filesystem.denyRead` at the syscall layer. ([Reasoning, including the list-drift caveat.](./GUARANTEES.md#mount-namespace-denial))
- **Writes outside the working directory are denied** by default (Claude Code's `/sandbox` default policy). The agent cannot overwrite your `~/.bashrc`, drop a hook into `~/.claude/hooks/`, or touch anything else in `$HOME` without being explicitly allowed.
- **The agent cannot reach internal-network hosts.** CGNAT (Tailscale, etc.), RFC1918, MagicDNS, link-local — all dropped by nftables matched on cgroup membership. This one *is* structural: kernel-enforced, won't drift, fires at packet emit time. ([Why this holds.](./GUARANTEES.md#internal-network-block))

The network block is the strongest claim — kernel rules matched on slice membership, no configuration list to forget. The credential-read denial is a hardened preset; the list is opinionated but finite, and unusual credential locations on your machine won't be covered unless you add them.

**What it does *not* guarantee:**

- Anything in the working directory that you wouldn't want public — `.env` files, hardcoded credentials, customer-data test fixtures, database dumps — can be exfiltrated through allowed network destinations (GitHub, npm, Anthropic API, anything you've permitted). Source code itself is rarely the worry; LLMs have made code largely commodity. The issue is *what's next to* the code in the same dir. The sandbox confines the *session*; it does not protect what flows out of it. **Code review at commit/push time** is the control for that leg. ([CWD exfil reasoning.](./GUARANTEES.md#cwd-exfil) · [Code review as control.](./GUARANTEES.md#code-review-as-control))
- Defense against an attacker with **specific knowledge of your setup**. claudebox is good against untargeted attacks (random injections, generic exfil payloads). It is *not* sufficient against someone actively targeting you who knows your dotfile layout, dependency stack, CI pipeline, or homelab topology. For higher-risk work, escalate to a remote VM or managed sandbox; see [THREAT-MODEL.md](./THREAT-MODEL.md#the-line-we-draw).

If you want to skip the sandbox for a session — you trust this task, you need full homelab access, you're decrypting agenix locally — run bare `claude` instead. The choice happens at the binary name. No flag inside the wrapper turns the sandbox off; that would be a false-safety footgun.

## Flags

| Flag | Description |
|------|-------------|
| `--yes`, `-y` | Skip the audit prompt and launch immediately |
| `--dry-run` | Print the launch command without executing |
| `--check` | Verify prerequisites (claude, jq, systemd-run, nftables chain) and exit |
| `--no-slice` | Skip the systemd slice scope (CIDR block won't apply — for debugging) |
| `--` | Pass remaining args to Claude Code |
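
As a rough idea of what a prerequisite check in the spirit of `--check` could look like, here is a minimal sketch. It is not the wrapper's actual implementation: `check_prereqs` is a hypothetical helper, and it only tests that commands are on `PATH`. It does not inspect the nftables chain.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a prerequisite check; not claudebox's real code.

# Report each named command as present or missing on PATH.
# Returns non-zero if any command is missing.
check_prereqs() {
  local cmd missing=0
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "ok       $cmd"
    else
      echo "missing  $cmd"
      missing=1
    fi
  done
  return "$missing"
}

# Example: the commands this README lists, plus nft for the ruleset
# (the nft binary name is an assumption about what --check looks for).
check_prereqs claude jq systemd-run nft || echo "some prerequisites are missing"
```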

## How it works

```
project root/
└── .claude/
    └── settings.local.json   # managed by claudebox (sandbox.* keys),
                              # user keys preserved on merge
```

On launch the wrapper:

1. Computes the canonical project root (worktree-aware via `git rev-parse --git-common-dir`).
2. Deep-merges the hardened `sandbox.*` config into `.claude/settings.local.json`. Existing top-level keys (model, env, MCP servers, etc.) are kept; the `sandbox` subtree is replaced wholesale.
3. Shows an audit of what's being applied, asks for confirmation.
4. Execs `systemd-run --user --scope --slice=claude-sandbox.slice -- claude "$@"`.
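
The merge in step 2 can be sketched with `jq` as follows. This is an approximation under stated assumptions, not the wrapper's source: the helper names `merge_sandbox` and `apply_preset` are hypothetical, and the preset is assumed to be a JSON file with a single top-level `sandbox` key.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of step 2; helper names and file layout are assumptions.

# Print EXISTING with its "sandbox" subtree replaced by the one from PRESET.
# Object addition in jq is shallow, so every other top-level key in EXISTING
# is kept and "sandbox" is swapped wholesale.
merge_sandbox() {
  jq -s '.[0] + { sandbox: .[1].sandbox }' "$1" "$2"
}

# Apply the preset to a settings file in place, creating it if needed.
# Writes to a temp file first so a failed merge never truncates the original.
apply_preset() {
  local settings=$1 preset=$2 tmp
  mkdir -p "$(dirname "$settings")"
  [ -f "$settings" ] || echo '{}' > "$settings"
  tmp=$(mktemp "${settings}.XXXXXX") &&
    merge_sandbox "$settings" "$preset" > "$tmp" &&
    mv "$tmp" "$settings"
}
```

With a preset file at hand, `apply_preset .claude/settings.local.json preset.json` would leave keys such as `model` or `env` untouched and overwrite only `sandbox`. Step 4 then hands the session to `systemd-run`, which places it in `claude-sandbox.slice`.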

Inside that slice, two things happen in parallel:

- Claude Code reads `settings.local.json` and activates its built-in `/sandbox` — bwrap + seccomp + namespace isolation + hostname-allowlisted proxy.
- The kernel nftables rules (installed by the NixOS module) are evaluated for every outgoing packet from sockets owned by processes in `claude-sandbox.slice`, and drop packets bound for internal CIDRs.

Together: kernel-isolated process for the session, kernel-enforced CIDR block for the network, hostname allowlist on top.
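
For readers who want a feel for the kernel side, the sketch below prints an nftables ruleset of roughly the shape described above. It is an illustration, not the ruleset the NixOS module installs: the table name `claudebox_sketch`, the helper `render_rules`, and the example cgroup path are all assumptions. It uses nftables' `socket cgroupv2 level N "path"` match, where the level is taken to be the number of components in the cgroup path.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: render (do not load) an nftables ruleset that drops
# egress to internal ranges for sockets in one cgroup. Names are assumptions.

render_rules() {
  local cgroup_path=$1 level
  # Path is relative to /sys/fs/cgroup with no leading slash.
  # The match level is the depth of that cgroup: one per path component.
  level=$(printf '%s\n' "$cgroup_path" | awk -F/ '{ print NF }')
  cat <<EOF
table inet claudebox_sketch {
  chain output {
    type filter hook output priority filter; policy accept;
    socket cgroupv2 level $level "$cgroup_path" ip daddr { 10.0.0.0/8, 100.64.0.0/10, 169.254.0.0/16, 172.16.0.0/12, 192.168.0.0/16 } drop
    socket cgroupv2 level $level "$cgroup_path" ip6 daddr { fc00::/7, fe80::/10 } drop
  }
}
EOF
}

# Example path for UID 1000; verify the real one with systemd-cgls before
# relying on it. Tailscale's fd7a:115c:a1e0::/48 is already inside fc00::/7.
render_rules "user.slice/user-1000.slice/user@1000.service/claude.slice/claude-sandbox.slice"
```

Because the match is on the socket's cgroup rather than on a process ID or user, anything the agent spawns inside the scope inherits the same restriction.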

## Requirements

- NixOS with flakes enabled. The NixOS module is the value-add: without it, `claudebox` provides only plain `/sandbox` plus the hardened denyRead defaults, with no CIDR block.
- `jq`, `systemd-run`, and `claude` on PATH (bundled via the flake's `runtimeInputs`).
- cgroup v2 (default on every modern systemd setup).
- Kernel with `socket cgroupv2` nftables match (default on NixOS).

## License

MIT