claudebox
Run Claude Code inside a bubblewrap sandbox with an allowlisted environment, explicit filesystem mounts, and a minimal PATH.
SSH keys, GPG/age secrets, cloud tokens, and Tailscale state stay completely invisible to the AI agent. If a secret is accessible inside the sandbox, it's a bug.
Quick start
nix run git+https://git.toph.so/toph/claudebox
Or add to your flake:
{
  inputs.claudebox.url = "git+https://git.toph.so/toph/claudebox";
}
Then add inputs.claudebox.packages.${system}.default to your environment.systemPackages or home-manager packages.
What it does
- Starts Claude Code inside a bwrap namespace with --clearenv
- Only allowlisted env vars enter the sandbox (HOME, PATH, TERM, EDITOR, LANG, ANTHROPIC_API_KEY if set)
- Mounts CWD read-write, Nix store read-only, everything else is tmpfs
- Provides nix shell and comma (, <tool>) so Claude can install tools on demand
- Injects a SANDBOX.md so Claude knows it's sandboxed and how to get tools
- Pre-configures git identity and safe.directory from host
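The bullets above map roughly onto a bubblewrap command line. The following is a minimal sketch of that shape, not the command claudebox.sh actually builds (`claudebox --dry-run` prints the real one). The function name `sketch_bwrap_cmd` is hypothetical, and only a subset of the allowlisted variables is shown; PATH, EDITOR, LANG and ANTHROPIC_API_KEY would follow the same `--setenv` pattern.

```sh
#!/bin/sh
# Illustrative only: prints (does not run) a bwrap command shaped like the
# bullets above. The exact flags claudebox uses may differ.
sketch_bwrap_cmd() {
  project_dir=$1
  # --clearenv drops the host environment; --setenv re-adds allowlisted vars.
  # --tmpfs / gives an empty root; later binds are layered on top of it.
  set -- bwrap \
    --clearenv \
    --setenv HOME "${HOME:-/}" \
    --setenv TERM "${TERM:-dumb}" \
    --tmpfs / \
    --proc /proc \
    --dev /dev \
    --ro-bind /nix/store /nix/store \
    --bind "$project_dir" "$project_dir" \
    --chdir "$project_dir" \
    --die-with-parent \
    claude
  printf '%s\n' "$*"
}

sketch_bwrap_cmd "$PWD"
```

Because bwrap applies its mount operations in order, the empty tmpfs root comes first and the read-only store and read-write project directory are bound onto it afterwards.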
Scope and limits
Right now, there are likely files on your machine you'd rather an attacker not exfiltrate: an unencrypted SSH key, an agenix age key, mail server credentials, your ~/.aws/credentials. This section describes what the sandbox does and does not protect them from. The defaults are not a no-op; they protect against things you may not have catalogued.
Explicitly in scope:
- Reducing blast radius of model misbehavior to "I lost an hour of work" rather than "my SSH keys are on pastebin."
- Making the easy path the safer one — fewer footguns, less to remember.
- Knowing which tier you're in for any given session, and switching deliberately. See THREAT-MODEL.md for the posture ladder.
Hard guarantees this sandbox provides:
- The agent cannot read your SSH keys, GPG keys, cloud credentials, or other dotfiles outside the working directory. (Why this holds.)
- The agent cannot reach your homelab or internal-network hosts (Tailscale, RFC1918, MagicDNS resolver). (Why this holds.)
These are structural — kernel-enforced via mount namespaces, cgroup membership, and nftables — not configuration that can drift if you forget to add a path to a denylist.
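One way to spot-check the first guarantee is to look for well-known secret locations from inside the sandbox. The sketch below is an assumption about how such a check could look, not part of claudebox itself; the function name `report_visible` and the path list are illustrative. It is meant to be run from a shell started with `claudebox --shell`.

```sh
#!/bin/sh
# Prints every listed path that exists and returns non-zero if any is visible.
# The path list is an illustrative assumption, not claudebox's own test suite.
report_visible() {
  found=0
  for p in "$@"; do
    if [ -e "$p" ]; then
      printf 'VISIBLE: %s\n' "$p"
      found=1
    fi
  done
  return "$found"
}

if report_visible "$HOME/.ssh" "$HOME/.gnupg" "$HOME/.aws/credentials"; then
  echo "none of the listed paths are visible"
fi
```

On the host this will normally print one or more VISIBLE lines; inside the sandbox any VISIBLE line would count as a bug by the project's own definition.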
What it does not guarantee:
- Anything in the working directory that you wouldn't want public (.env files, hardcoded credentials, customer-data test fixtures, database dumps) can be exfiltrated through allowed network destinations (GitHub, npm, Anthropic API, anything you've permitted). Source code itself is rarely the worry; LLMs have made code largely commodity. The issue is what sits next to the code in the same directory. The sandbox confines the session; it does not protect what flows out of it. Code review at commit/push time is the control for that leg. (CWD exfil reasoning. · Code review as control.)
- Defense against an attacker with specific knowledge of your setup. claudebox is good against untargeted attacks (random injections, generic exfil payloads). It is not sufficient against someone actively targeting you who knows your dotfile layout, dependency stack, CI pipeline, or homelab topology. For higher-risk work, escalate to a remote VM or managed sandbox; see THREAT-MODEL.md.
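Since the working directory is mounted read-write, it can help to inventory what sits next to the code before launching. The following is a rough sketch, assuming a handful of file-name patterns that commonly indicate secrets or data dumps; the function name `list_risky_files` and the pattern list are hypothetical and far from exhaustive.

```sh
#!/bin/sh
# Lists files under a directory whose names commonly indicate secrets or
# data dumps. The patterns are illustrative assumptions; extend as needed.
list_risky_files() {
  find "${1:-.}" -type f \
    \( -name '.env' -o -name '.env.*' -o -name '*.pem' \
       -o -name '*.key' -o -name '*.sql' -o -name '*.dump' \) \
    ! -path '*/.git/*' 2>/dev/null
}
```

Running `list_risky_files .` from the project root prints one path per line. An empty result does not mean the directory is clean, since name matching cannot detect hardcoded credentials inside source files.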
If you want to skip the sandbox for a session — you trust this task, you need full homelab access, you're decrypting agenix locally — run bare claude instead. The choice happens at the binary name. No flag inside the wrapper turns the sandbox off; that would be a false-safety footgun.
Flags
| Flag | Description |
|---|---|
| --yes, -y | Skip the env audit and launch immediately |
| --dry-run | Print the bwrap command without executing |
| --check | Verify prerequisites and exit |
| --shell | Drop into a bash shell instead of Claude Code |
| --gc | Remove stale per-project instance dirs and exit |
| -- | Pass remaining args to Claude Code |
Env vars
Env files (preferred) — define vars without polluting your shell:
~/.claudebox/env — global, loaded on every launch:
ANTHROPIC_API_KEY=sk-ant-...
MY_GLOBAL_VAR=value
<project>/.claudebox.env — per-project, loaded when present:
DATABASE_URL=postgres://localhost/myapp
SOME_PROJECT_VAR=value
Add .claudebox.env to your .gitignore if it contains secrets.
Pass-through — inject host vars already set in your shell:
CLAUDEBOX_EXTRA_ENV=MY_VAR,OTHER_VAR claudebox
All injected vars appear in the [+] section of the env audit.
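The two injection paths described above (env files and pass-through) can be sketched as follows. This is a simplified illustration, not claudebox's actual parser: the function names are hypothetical, and the parsing rules (skip blank lines and `#` comments, no quoting or `export` support) are assumptions.

```sh
#!/bin/sh
# Sketch: turn KEY=VALUE env files and a comma-separated pass-through list
# into "--setenv KEY VALUE" lines. Parsing rules here are assumptions.
emit_env_file() {
  [ -r "$1" ] || return 0            # a missing file is simply skipped
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      ''|'#'*) continue ;;           # blank line or comment
      *=*) printf '%s %s %s\n' --setenv "${line%%=*}" "${line#*=}" ;;
    esac
  done < "$1"
}

emit_passthrough() {
  # $1 is a list such as "MY_VAR,OTHER_VAR"; unset host vars are skipped.
  old_ifs=$IFS; IFS=,
  for name in $1; do
    # Only accept plain identifiers before handing the name to eval.
    case $name in ''|*[!A-Za-z0-9_]*|[0-9]*) continue ;; esac
    eval "value=\${$name-}; is_set=\${$name+x}"
    if [ -n "$is_set" ]; then
      printf '%s %s %s\n' --setenv "$name" "$value"
    fi
  done
  IFS=$old_ifs
}
```

A wrapper could call `emit_env_file` once for ~/.claudebox/env, once for the project's .claudebox.env, and then `emit_passthrough "$CLAUDEBOX_EXTRA_ENV"`. How claudebox resolves a variable defined in more than one place is not specified here, so this sketch does not attempt it.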
How it works
~/.claudebox/                 # persistent config dir (host)
├── SANDBOX.md                # managed by claudebox, overwritten each launch
├── history.jsonl             # conversation history
├── .credentials.json         # Claude Code credentials (if present)
└── projects/
    └── <16-char-hex>/        # per-project instance dir (keyed by canonical git root)
        └── project-root      # records the canonical path for this instance
Inside the sandbox:
~/.claude → bind-mounted from host (plugins, skills, hooks, MCP all visible)
~/.claude/projects → bind-mounted from ~/.claudebox/projects/<hash>/ (per-project isolation)
~/.claude/history.jsonl → bind-mounted from ~/.claudebox/history.jsonl
~/.claude/SANDBOX.md → bind-mounted from ~/.claudebox/SANDBOX.md
Each project gets an isolated ~/.claude/projects/ directory inside the sandbox, so conversation history and project state are separated per repo. Git worktrees share the same instance dir as their main worktree.
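The per-project key and the worktree sharing described above could be derived along these lines. This is a sketch under stated assumptions: the README only says the key is 16 hex characters keyed by the canonical git root, so the choice of SHA-256 truncated to 16 characters is a guess, and both function names are hypothetical. `--path-format=absolute` needs git 2.31 or newer.

```sh
#!/bin/sh
# Sketch: derive a 16-character hex key from a project's canonical root.
# The hash function (SHA-256, truncated) is an assumption.
project_key() {
  printf '%s' "$1" | sha256sum | cut -c1-16
}

canonical_root() {
  # For a linked worktree, --git-common-dir points at the main worktree's
  # .git directory, so every worktree resolves to the same root.
  # Bare repositories and submodules are not handled by this sketch.
  common=$(git -C "$1" rev-parse --path-format=absolute --git-common-dir 2>/dev/null) || return 1
  dirname "$common"
}
```

With these, `project_key "$(canonical_root "$PWD")"` yields the same key from the main checkout and from any of its worktrees, which is the behaviour the paragraph above describes.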
Requirements
- NixOS or Nix with flakes enabled
- User namespaces (enabled by default on NixOS)
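A quick way to check these two requirements by hand is sketched below. Which checks `claudebox --check` actually performs is not documented here, so treat the function names and the specific tests as assumptions. Some distributions restrict unprivileged user namespaces through additional mechanisms that this sketch does not detect, and it does not verify that flakes are enabled.

```sh
#!/bin/sh
# Sketch of a prerequisite check; not the implementation behind --check.
have_cmd() { command -v "$1" >/dev/null 2>&1; }

userns_enabled() {
  # Linux exposes the per-user namespace limit here; 0 means disabled.
  f=/proc/sys/user/max_user_namespaces
  [ -r "$f" ] && [ "$(cat "$f")" -gt 0 ]
}

check_prereqs() {
  status=0
  if have_cmd nix; then
    echo "ok: nix"
  else
    echo "missing: nix"
    status=1
  fi
  if userns_enabled; then
    echo "ok: user namespaces"
  else
    echo "missing: user namespaces"
    status=1
  fi
  return "$status"
}
```

Calling `check_prereqs` prints one line per requirement and returns non-zero if either is missing.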
License
MIT