claude-code with --dangerously-skip-permissions, minus the danger

ai-safety claude-code cli developer-tools go nix nix-flake nixos security

Christopher Mühl fbca134511 docs: add scope/limits section, GUARANTEES and THREAT-MODEL README gains a scope section linking to two new docs: GUARANTEES.md (mechanism-level reasoning behind hard guarantees) and THREAT-MODEL.md (posture ladder, lethal-trifecta framing). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>		2026-05-11 09:21:47 +02:00
.planning	docs(quick-260505-le7): Add harness config file support to claudebox	2026-05-05 15:34:33 +00:00
CLAUDE.md	docs: create roadmap (3 phases)	2026-04-09 10:32:35 +02:00
claudebox.sh	feat(260505-le7): add config file globals, CLI flags, load_config_file, HARNESS_BIN resolution	2026-05-05 15:31:11 +00:00
flake.lock	fix: SHELL path, PATH isolation, --shell flag, nix-claude-code input	2026-04-09 14:59:43 +02:00
flake.nix	fix: SHELL path, PATH isolation, --shell flag, nix-claude-code input	2026-04-09 14:59:43 +02:00
GUARANTEES.md	docs: add scope/limits section, GUARANTEES and THREAT-MODEL	2026-05-11 09:21:47 +02:00
README.md	docs: add scope/limits section, GUARANTEES and THREAT-MODEL	2026-05-11 09:21:47 +02:00
test-gc.sh	test(05-02): add GC integration test covering stale removal, valid preservation, empty-dir safety	2026-04-13 10:02:31 +00:00
THREAT-MODEL.md	docs: add scope/limits section, GUARANTEES and THREAT-MODEL	2026-05-11 09:21:47 +02:00

README.md

claudebox

Run Claude Code inside a bubblewrap sandbox with an allowlisted environment, explicit filesystem mounts, and a minimal PATH.

SSH keys, GPG/age secrets, cloud tokens, and Tailscale state stay completely invisible to the AI agent. If a secret is accessible inside the sandbox, it's a bug.

Quick start

nix run git+https://git.toph.so/toph/claudebox

Or add to your flake:

{
  inputs.claudebox.url = "git+https://git.toph.so/toph/claudebox";
}

Then add inputs.claudebox.packages.${system}.default to your environment.systemPackages or home-manager packages.

What it does

Starts Claude Code inside a bwrap namespace with --clearenv
Only allowlisted env vars enter the sandbox (HOME, PATH, TERM, EDITOR, LANG, ANTHROPIC_API_KEY if set)
Mounts CWD read-write and the Nix store read-only; everything else is tmpfs
Provides nix shell and comma (, <tool>) so Claude can install tools on demand
Injects a SANDBOX.md so Claude knows it's sandboxed and how to get tools
Pre-configures git identity and safe.directory from host
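
The list above maps roughly onto standard bubblewrap options. The following is a minimal sketch, not claudebox's actual command line: the function name `print_bwrap_sketch` and the exact mount set are assumptions, and PATH setup, `nix shell`/comma wiring, and SANDBOX.md injection are left out. It only prints a command, in the spirit of `--dry-run`, and never executes bwrap.

```shell
# Hypothetical sketch: print (never run) a bwrap command shaped like the list above.
print_bwrap_sketch() (
  cwd=$1
  home=$2
  # Start from an empty environment, then re-add a few allowlisted variables.
  set -- bwrap --clearenv \
    --setenv HOME "$home" \
    --setenv TERM "${TERM:-dumb}"
  # Home becomes an empty tmpfs, so host dotfiles are simply absent.
  set -- "$@" --tmpfs "$home"
  # The Nix store is visible read-only; the working directory is writable.
  set -- "$@" --ro-bind /nix/store /nix/store --bind "$cwd" "$cwd"
  # Minimal /proc and /dev, start in the project, exit when the parent exits.
  set -- "$@" --proc /proc --dev /dev --chdir "$cwd" --die-with-parent
  # Command to run inside the sandbox.
  set -- "$@" claude
  printf '%s ' "$@"
  printf '\n'
)
```

Calling `print_bwrap_sketch "$PWD" "$HOME"` prints the assembled command. The tmpfs for the home directory comes before the working-directory bind because bwrap applies mounts in order, so a project located under the home directory stays visible on top of the empty tmpfs.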

Scope and limits

Right now, there are likely files on your machine you'd rather an attacker not exfiltrate: an unencrypted SSH key, an agenix age key, mail server credentials, your ~/.aws/credentials. This section describes what the sandbox does and does not protect them from. The defaults are not a no-op; they protect against things you may not have catalogued.

Explicitly in scope:

Reducing blast radius of model misbehavior to "I lost an hour of work" rather than "my SSH keys are on pastebin."
Making the easy path the safer one — fewer footguns, less to remember.
Knowing which tier you're in for any given session, and switching deliberately. See THREAT-MODEL.md for the posture ladder.

Hard guarantees this sandbox provides:

The agent cannot read your SSH keys, GPG keys, cloud credentials, or other dotfiles outside the working directory. (Why this holds: see GUARANTEES.md.)
The agent cannot reach your homelab or internal-network hosts (Tailscale, RFC1918, MagicDNS resolver). (Why this holds: see GUARANTEES.md.)

These are structural — kernel-enforced via mount namespaces, cgroup membership, and nftables — not configuration that can drift if you forget to add a path to a denylist.
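
The first guarantee can be spot-checked from inside the sandbox, for example after starting `claudebox --shell`. This is a minimal sketch under stated assumptions: the helper name `find_visible_secrets` is hypothetical, and the path list is only an illustrative subset of the secrets named above. A clean result is consistent with the guarantee but does not prove it.

```shell
# Hypothetical helper: report well-known secret locations that exist under a home dir.
# Returns 0 when none are visible, 1 when at least one is.
find_visible_secrets() (
  home=$1
  found=0
  # Illustrative subset only; extend with whatever matters on your machine.
  for rel in .ssh .gnupg .aws/credentials; do
    if [ -e "$home/$rel" ]; then
      printf 'visible: %s\n' "$home/$rel"
      found=1
    fi
  done
  [ "$found" -eq 0 ]
)
```

Inside the sandbox shell, `find_visible_secrets "$HOME"` should print nothing and return success; any `visible:` line would count as a bug by the README's own definition.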

What it does not guarantee:

Anything in the working directory that you wouldn't want public (.env files, hardcoded credentials, customer-data test fixtures, database dumps) can be exfiltrated through allowed network destinations (GitHub, npm, Anthropic API, anything you've permitted). Source code itself is rarely the worry; LLMs have made code largely a commodity. The issue is what sits next to the code in the same directory. The sandbox confines the session; it does not protect what flows out of it. Code review at commit/push time is the control for that leg. (CWD exfil reasoning. · Code review as control.)
Defense against an attacker with specific knowledge of your setup. claudebox holds up against untargeted attacks (random injections, generic exfil payloads). It is not sufficient against someone actively targeting you who knows your dotfile layout, dependency stack, CI pipeline, or homelab topology. For higher-risk work, escalate to a remote VM or managed sandbox; see THREAT-MODEL.md.
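
Because the working directory is the exposed surface, it can help to look at what sits next to the code before launching a session. This is a rough sketch, not part of claudebox: the helper name `list_risky_files` and the filename patterns are assumptions, and name matching cannot detect hardcoded credentials inside ordinary source files.

```shell
# Hypothetical helper: list files under a directory whose names suggest secrets or dumps.
list_risky_files() (
  dir=$1
  # Patterns are illustrative; .git internals are skipped.
  find "$dir" -type f \
    \( -name '.env' -o -name '.env.*' -o -name '*.env' \
       -o -name '*.pem' -o -name '*.sql' -o -name '*.dump' \) \
    ! -path '*/.git/*' 2>/dev/null | LC_ALL=C sort
)
```

Running `list_risky_files .` before a session gives a quick inventory of files worth moving out of the project or excluding some other way.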

If you want to skip the sandbox for a session — you trust this task, you need full homelab access, you're decrypting agenix locally — run bare claude instead. The choice happens at the binary name. No flag inside the wrapper turns the sandbox off; that would be a false-safety footgun.

Flags

Flag	Description
`--yes`, `-y`	Skip the env audit and launch immediately
`--dry-run`	Print the bwrap command without executing
`--check`	Verify prerequisites and exit
`--shell`	Drop into a bash shell instead of Claude Code
`--gc`	Remove stale per-project instance dirs and exit
`--`	Pass remaining args to Claude Code
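
For readers wrapping or scripting around these flags, the table can be read as a small parsing loop. This is a sketch only, not the code in claudebox.sh: the function name `parse_flags`, the variables `MODE`, `SKIP_AUDIT`, and `CONSUMED`, and the handling of unknown flags are all assumptions.

```shell
# Hypothetical sketch of a parser for the flags in the table above.
# Sets MODE, SKIP_AUDIT, and CONSUMED (how many arguments the wrapper itself used).
parse_flags() {
  MODE=run
  SKIP_AUDIT=0
  total=$#
  while [ $# -gt 0 ]; do
    case $1 in
      -y|--yes)  SKIP_AUDIT=1 ;;
      --dry-run) MODE=dry-run ;;
      --check)   MODE=check ;;
      --shell)   MODE=shell ;;
      --gc)      MODE=gc ;;
      --)        shift; break ;;   # everything after this belongs to Claude Code
      *)         printf 'unknown flag: %s\n' "$1" >&2; return 2 ;;
    esac
    shift
  done
  CONSUMED=$((total - $#))
}
```

A caller would run `parse_flags "$@"` followed by `shift "$CONSUMED"`, leaving only the arguments after `--` to hand to Claude Code.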

Env vars

Env files (preferred) — define vars without polluting your shell:

~/.claudebox/env — global, loaded on every launch:

ANTHROPIC_API_KEY=sk-ant-...
MY_GLOBAL_VAR=value

<project>/.claudebox.env — per-project, loaded when present:

DATABASE_URL=postgres://localhost/myapp
SOME_PROJECT_VAR=value

Add .claudebox.env to your .gitignore if it contains secrets.
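
The two env files can be thought of as a global layer followed by a per-project layer. The sketch below reads both in that order. It is an illustration under assumptions: the function names are hypothetical, `#` comment and blank-line handling is assumed rather than documented, and the README does not state which file wins when both define the same key.

```shell
# Hypothetical: emit KEY=VALUE lines from one env file, skipping blanks and # comments.
read_env_file() {
  [ -f "$1" ] || return 0
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      ''|'#'*) continue ;;
      *=*)     printf '%s\n' "$line" ;;
    esac
  done < "$1"
}

# Global file first, then the per-project file when present.
collect_env() {
  read_env_file "$1/.claudebox/env"
  read_env_file "$2/.claudebox.env"
}
```

`collect_env "$HOME" "$PWD"` prints the combined list; a consumer that applies the lines in order would let the project file override the global one, which is a design choice of this sketch.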

Pass-through — inject host vars already set in your shell:

CLAUDEBOX_EXTRA_ENV=MY_VAR,OTHER_VAR claudebox

All injected vars appear in the [+] section of the env audit.
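
The pass-through list amounts to splitting a comma-separated string and copying across only the names that are actually set on the host. This is a minimal sketch, assuming that reading of the feature: the function name `extra_env_pairs` is hypothetical, and how claudebox treats unset or malformed names is not documented here.

```shell
# Hypothetical: print NAME=VALUE for each listed variable present in the host environment.
extra_env_pairs() (
  list=$1
  set -f        # no glob expansion while splitting
  IFS=,
  for name in $list; do
    # Skip anything that is not a plain identifier.
    case $name in
      ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*) continue ;;
    esac
    # printenv fails for names missing from the environment, so unset vars are skipped.
    if value=$(printenv "$name"); then
      printf '%s=%s\n' "$name" "$value"
    fi
  done
)
```

With `MY_VAR` exported and `OTHER_VAR` unset, `extra_env_pairs "$CLAUDEBOX_EXTRA_ENV"` would yield only the `MY_VAR` line.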

How it works

~/.claudebox/          # persistent config dir (host)
├── SANDBOX.md         # managed by claudebox, overwritten each launch
├── history.jsonl      # conversation history
├── .credentials.json  # Claude Code credentials (if present)
└── projects/
    └── <16-char-hex>/ # per-project instance dir (keyed by canonical git root)
        └── project-root  # records the canonical path for this instance

Inside the sandbox:
  ~/.claude            →  bind-mounted from host (plugins, skills, hooks, MCP all visible)
  ~/.claude/projects   →  bind-mounted from ~/.claudebox/projects/<hash>/ (per-project isolation)
  ~/.claude/history.jsonl → bind-mounted from ~/.claudebox/history.jsonl
  ~/.claude/SANDBOX.md →  bind-mounted from ~/.claudebox/SANDBOX.md

Each project gets an isolated ~/.claude/projects/ directory inside the sandbox, so conversation history and project state are separated per repo. Git worktrees share the same instance dir as their main worktree.
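
One way to get a stable 16-character key per repository, with worktrees collapsing onto their main worktree, is to hash the canonical root path. The README does not say which hash claudebox uses, so this is a sketch under assumptions: SHA-256 truncated to 16 hex characters, with hypothetical function names, and no handling for bare repositories.

```shell
# Hypothetical: derive a 16-hex-character key from a canonical project path.
# The hash choice (SHA-256, truncated) is an assumption.
project_key() {
  printf '%s' "$1" | sha256sum | cut -c1-16
}

# Resolve the main worktree root so linked worktrees map to the same key.
# Uses the parent of git's common directory.
canonical_git_root() (
  common=$(git -C "$1" rev-parse --git-common-dir) || exit 1
  cd "$1" && cd "$common" && cd .. && pwd -P
)
```

Under these assumptions, `project_key "$(canonical_git_root "$PWD")"` names the instance directory below `~/.claudebox/projects/`.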

Requirements

NixOS or Nix with flakes enabled
User namespaces (enabled by default on NixOS)
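
On systems other than NixOS it can be useful to confirm these prerequisites by hand. The sketch below is not the implementation of `--check`: the function names are hypothetical, `/proc/sys/user/max_user_namespaces` is one common Linux indicator rather than a complete test, and whether `bwrap` must be on the host PATH is an assumption.

```shell
# Hypothetical: succeed if the given file reports a positive user-namespace limit.
userns_enabled() {
  limit_file=${1:-/proc/sys/user/max_user_namespaces}
  [ -r "$limit_file" ] || return 1
  limit=$(cat "$limit_file")
  [ "$limit" -gt 0 ] 2>/dev/null
}

# Hypothetical: report missing tools and a disabled user-namespace limit.
check_prereqs() {
  status=0
  for tool in bwrap nix; do
    command -v "$tool" >/dev/null 2>&1 || { printf 'missing: %s\n' "$tool"; status=1; }
  done
  userns_enabled || { printf 'user namespaces appear disabled\n'; status=1; }
  return "$status"
}
```

`check_prereqs` prints one line per problem and returns non-zero if anything is missing; some distributions gate unprivileged user namespaces through other settings that this sketch does not inspect.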

License

MIT