claudebox/README.md
Christopher Mühl fbca134511
docs: add scope/limits section, GUARANTEES and THREAT-MODEL
README gains a scope section linking to two new docs: GUARANTEES.md
(mechanism-level reasoning behind hard guarantees) and THREAT-MODEL.md
(posture ladder, lethal-trifecta framing).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 09:21:47 +02:00

# claudebox
Run [Claude Code](https://docs.anthropic.com/en/docs/claude-code) inside a [bubblewrap](https://github.com/containers/bubblewrap) sandbox with an allowlisted environment, explicit filesystem mounts, and a minimal PATH.
SSH keys, GPG/age secrets, cloud tokens, and Tailscale state stay completely invisible to the AI agent. If a secret is accessible inside the sandbox, it's a bug.
## Quick start
```bash
nix run git+https://git.toph.so/toph/claudebox
```
Or add to your flake:
```nix
{
  inputs.claudebox.url = "git+https://git.toph.so/toph/claudebox";
}
```
Then add `inputs.claudebox.packages.${system}.default` to your `environment.systemPackages` or home-manager packages.
## What it does
- Starts Claude Code inside a bwrap namespace with `--clearenv`
- Only allowlisted env vars enter the sandbox (HOME, PATH, TERM, EDITOR, LANG, ANTHROPIC_API_KEY if set)
- Mounts CWD read-write and the Nix store read-only; everything else is tmpfs
- Provides `nix shell` and [comma](https://github.com/nix-community/comma) (`, <tool>`) so Claude can install tools on demand
- Injects a SANDBOX.md so Claude knows it's sandboxed and how to get tools
- Pre-configures git identity and safe.directory from host
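The list above can be pictured as a bwrap argument list. The following is a minimal sketch, not the command claudebox actually builds: the helper names `emit` and `build_bwrap_args` are hypothetical, the allowlist is abbreviated, and the mount order is an assumption. Use `claudebox --dry-run` to see the real command.

```bash
# Sketch only: prints (does not run) bwrap arguments, one token per line.
emit() { printf '%s\n' "$@"; }

build_bwrap_args() {
  cwd="$1"
  emit --clearenv                         # start from an empty environment
  emit --setenv HOME "$HOME"              # re-add allowlisted variables one by one
  emit --setenv TERM "${TERM:-dumb}"
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    emit --setenv ANTHROPIC_API_KEY "$ANTHROPIC_API_KEY"   # only if set on the host
  fi
  emit --tmpfs "$HOME"                    # empty home: host dotfiles are never mounted
  emit --ro-bind /nix/store /nix/store    # Nix store, read-only
  emit --bind "$cwd" "$cwd"               # project directory, read-write
  emit --chdir "$cwd"
  emit --die-with-parent                  # sandbox exits when the wrapper exits
}

build_bwrap_args "$PWD"
```

The point of the shape is that nothing is visible unless an argument puts it there: the environment starts empty and the home directory starts as an empty tmpfs.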
## Scope and limits
Right now, there are likely files on your machine you'd rather an attacker not exfiltrate: an unencrypted SSH key, an agenix age key, mail server credentials, your `~/.aws/credentials`. This section describes what the sandbox does and does not keep them safe from. The defaults are not a no-op; they protect against things you may not have catalogued.
**Explicitly in scope:**
- Reducing blast radius of model misbehavior to "I lost an hour of work" rather than "my SSH keys are on pastebin."
- Making the easy path the safer one — fewer footguns, less to remember.
- Knowing which tier you're in for any given session, and switching deliberately. See [THREAT-MODEL.md](./THREAT-MODEL.md) for the posture ladder.
**Hard guarantees this sandbox provides:**
- The agent cannot read your SSH keys, GPG keys, cloud credentials, or other dotfiles outside the working directory. ([Why this holds.](./GUARANTEES.md#mount-namespace-denial))
- The agent cannot reach your homelab or internal-network hosts (Tailscale, RFC1918, MagicDNS resolver). ([Why this holds.](./GUARANTEES.md#internal-network-block))
These are structural — kernel-enforced via mount namespaces, cgroup membership, and nftables — not configuration that can drift if you forget to add a path to a denylist.
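One way to spot-check the first guarantee is to run a small probe from `claudebox --shell`. This is a hedged sketch: `audit_visible_secrets` is a hypothetical helper, and the path list is illustrative rather than the set claudebox itself reasons about.

```bash
# Sketch: report which commonly sensitive paths are visible under a home dir.
# Returns 0 when none of the listed paths exist, 1 otherwise.
audit_visible_secrets() {
  home="${1:-$HOME}"
  found=0
  for p in "$home/.ssh" "$home/.gnupg" "$home/.aws/credentials"; do
    if [ -e "$p" ]; then
      echo "VISIBLE: $p"
      found=1
    fi
  done
  if [ "$found" -eq 0 ]; then
    echo "no listed secrets visible"
  fi
  return "$found"
}
```

Inside the sandbox the expectation is the "no listed secrets visible" line; any `VISIBLE:` line there would be the kind of bug the introduction describes. On the bare host the same probe will usually list several paths, which makes the contrast easy to see.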
**What it does *not* guarantee:**
- Anything in the working directory that you wouldn't want public (`.env` files, hardcoded credentials, customer-data test fixtures, database dumps) can be exfiltrated through allowed network destinations (GitHub, npm, Anthropic API, anything you've permitted). Source code itself is rarely the worry; LLMs have made code largely a commodity. The issue is *what's next to* the code in the same directory. The sandbox confines the *session*; it does not protect what flows out of it. **Code review at commit/push time** is the control for that leg. ([CWD exfil reasoning.](./GUARANTEES.md#cwd-exfil) · [Code review as control.](./GUARANTEES.md#code-review-as-control))
- Defense against an attacker with **specific knowledge of your setup**. claudebox is good for untargeted attacks (random injections, generic exfil payloads). It is *not* sufficient against someone actively targeting you who knows your dotfile layout, dependency stack, CI pipeline, or homelab topology. For higher-risk work, escalate to a remote VM or managed sandbox — see [THREAT-MODEL.md](./THREAT-MODEL.md#the-line-we-draw).
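Because the working directory is the part the sandbox exposes, it can help to look at what sits next to the code before starting a session. The following is a rough sketch, not a claudebox feature: `scan_cwd_for_secrets` is a hypothetical helper and the filename patterns are illustrative guesses that you would extend for your own stack. It does not replace code review.

```bash
# Sketch: list files under a project directory whose names commonly indicate
# secrets or private data. The .git directory is skipped.
scan_cwd_for_secrets() {
  find "${1:-.}" \
    -path '*/.git' -prune -o \
    -type f \( -name '.env' -o -name '.env.*' -o -name '*.env' \
               -o -name '*.pem' -o -name '*.sql' \) -print
}

scan_cwd_for_secrets .
```

An empty result only means none of these name patterns matched; credentials hardcoded inside source files will not show up here.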
If you want to skip the sandbox for a session — you trust this task, you need full homelab access, you're decrypting agenix locally — run bare `claude` instead. The choice happens at the binary name. No flag inside the wrapper turns the sandbox off; that would be a false-safety footgun.
## Flags
| Flag | Description |
|------|-------------|
| `--yes`, `-y` | Skip the env audit and launch immediately |
| `--dry-run` | Print the bwrap command without executing |
| `--check` | Verify prerequisites and exit |
| `--shell` | Drop into a bash shell instead of Claude Code |
| `--gc` | Remove stale per-project instance dirs and exit |
| `--` | Pass remaining args to Claude Code |
## Env vars
**Env files (preferred)** — define vars without polluting your shell:
`~/.claudebox/env` — global, loaded on every launch:
```bash
ANTHROPIC_API_KEY=sk-ant-...
MY_GLOBAL_VAR=value
```
`<project>/.claudebox.env` — per-project, loaded when present:
```bash
DATABASE_URL=postgres://localhost/myapp
SOME_PROJECT_VAR=value
```
Add `.claudebox.env` to your `.gitignore` if it contains secrets.
**Pass-through** — inject host vars already set in your shell:
```bash
CLAUDEBOX_EXTRA_ENV=MY_VAR,OTHER_VAR claudebox
```
All injected vars appear in the `[+]` section of the env audit.
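The two mechanisms above can be sketched as functions that turn their input into `--setenv` arguments for bwrap. This is an illustration under assumptions, not claudebox's parser: the function names are hypothetical, blank lines and `#` comments are assumed to be skipped, values are taken literally with no quote handling, and `printenv` is assumed to be available.

```bash
# Sketch: convert a KEY=VALUE env file into bwrap --setenv arguments,
# one token per line. A missing file yields no output.
env_file_to_setenv() {
  [ -f "$1" ] || return 0
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      ''|'#'*) continue ;;                       # skip blanks and comments
      *=*) printf '%s\n' --setenv "${line%%=*}" "${line#*=}" ;;
    esac
  done < "$1"
}

# Sketch: expand a comma-separated list of names (as in CLAUDEBOX_EXTRA_ENV)
# into --setenv arguments, skipping names that are not exported on the host.
passthrough_to_setenv() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r name; do
    [ -n "$name" ] || continue
    if value="$(printenv "$name")"; then
      printf '%s\n' --setenv "$name" "$value"
    fi
  done
}

env_file_to_setenv "$HOME/.claudebox/env"
env_file_to_setenv ./.claudebox.env
passthrough_to_setenv "${CLAUDEBOX_EXTRA_ENV:-}"
```

Splitting on the first `=` only means a value such as a connection string may itself contain `=` characters.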
## How it works
```
~/.claudebox/               # persistent config dir (host)
├── SANDBOX.md              # managed by claudebox, overwritten each launch
├── history.jsonl           # conversation history
├── .credentials.json       # Claude Code credentials (if present)
└── projects/
    └── <16-char-hex>/      # per-project instance dir (keyed by canonical git root)
        └── project-root    # records the canonical path for this instance

Inside the sandbox:
~/.claude → bind-mounted from host (plugins, skills, hooks, MCP all visible)
~/.claude/projects → bind-mounted from ~/.claudebox/projects/<hash>/ (per-project isolation)
~/.claude/history.jsonl → bind-mounted from ~/.claudebox/history.jsonl
~/.claude/SANDBOX.md → bind-mounted from ~/.claudebox/SANDBOX.md
```
Each project gets an isolated `~/.claude/projects/` directory inside the sandbox, so conversation history and project state are separated per repo. Git worktrees share the same instance dir as their main worktree.
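The keying scheme can be sketched as follows. The README only states that the directory name is 16 hex characters keyed by the canonical git root; the use of SHA-256, the truncation, and both function names here are assumptions for illustration. The root lookup relies on `git rev-parse --git-common-dir`, which points at the main repository's `.git` directory even from a linked worktree; it needs a Git version that supports `--path-format` and does not handle bare repositories or submodules.

```bash
# Sketch: derive a 16-character hex ID from a canonical project path.
# Assumption: SHA-256 truncated to 16 hex characters.
instance_id() {
  printf '%s' "$1" | sha256sum | cut -c1-16
}

# Sketch: resolve the main worktree root so linked worktrees share one ID.
# Falls back to the physical current directory outside a git repository.
canonical_root() {
  if common="$(git rev-parse --path-format=absolute --git-common-dir 2>/dev/null)"; then
    dirname "$common"
  else
    pwd -P
  fi
}

instance_id "$(canonical_root)"
```

Hashing the main worktree root rather than the current directory is what lets every worktree of a repository land in the same instance dir.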
## Requirements
- NixOS or Nix with flakes enabled
- User namespaces (enabled by default on NixOS)
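A quick manual probe for these requirements might look like the sketch below. It is not the implementation of `--check`: `check_prereqs` is a hypothetical name, it only tests that `nix` is on `PATH` (not that flakes are enabled), and it reads the Linux sysctl file `/proc/sys/user/max_user_namespaces`, which is 0 when user namespaces are disabled. Some distributions gate unprivileged user namespaces through other settings that this does not inspect.

```bash
# Sketch: report whether nix is on PATH and user namespaces appear enabled.
# Returns 0 when both checks pass, 1 otherwise.
check_prereqs() {
  status=0
  if command -v nix >/dev/null 2>&1; then
    echo "ok: nix"
  else
    echo "missing: nix"
    status=1
  fi
  # Treat an unreadable or absent sysctl file as "disabled".
  max="$(cat /proc/sys/user/max_user_namespaces 2>/dev/null || echo 0)"
  if [ "$max" -gt 0 ]; then
    echo "ok: user namespaces"
  else
    echo "missing: user namespaces"
    status=1
  fi
  return "$status"
}
```

For the authoritative answer, run `claudebox --check`.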
## License
MIT