claudebox/.planning/research/ARCHITECTURE.md
Christopher Mühl a6f8ea95d3
docs: complete project research
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 10:05:35 +02:00


# Architecture Patterns
**Domain:** Nix bubblewrap sandbox wrapper
**Researched:** 2026-04-09
## Recommended Architecture
claudebox is a single Nix derivation that produces a shell script. The script runs five logical stages in sequence; the last one `exec`s into the sandboxed Claude process.
```
claudebox (entry)
  |
  v
[1. Argument Parsing]    --yes/-y flag, passthrough args for claude
  |
  v
[2. Environment Build]   Start empty, allowlist safe vars from host
  |
  v
[3. Env Audit Display]   Show what's entering the sandbox, prompt user
  |
  v
[4. bwrap Invocation]    Namespace + mount table + env + exec chain
  |     |
  |     +-- Mount table (ro: /nix/store, /etc/resolv.conf, ...)
  |     +-- Mount table (rw: CWD, ~/.claudebox -> ~/.claude)
  |     +-- Mount table (tmpfs: /tmp, /run, sandbox home)
  |     +-- Namespace config (unshare user, pid, ipc, cgroup)
  |     +-- Env vars (--clearenv + explicit --setenv per var)
  |
  v
[5. exec claude]         --dangerously-skip-permissions + user args
```
### Component Boundaries
| Component | Responsibility | Notes |
|-----------|---------------|-------|
| **Nix derivation** (`default.nix` / `flake.nix`) | Pins all runtime deps, builds wrapper via `writeShellApplication` | Closure includes coreutils, git, curl, jq, rg, fd, nix, comma, claude-code |
| **Argument parser** | Handles `--yes`/`-y`, collects passthrough args | Simple `case`/`shift` loop, no getopt needed |
| **Env builder** | Constructs the `--setenv` flag list from allowlist | Reads host vars, filters through allowlist, builds array |
| **Env auditor** | Displays env to user, prompts for confirmation | Skipped with `--yes`; uses stderr for display |
| **Mount table** | Defines all filesystem bindings for bwrap | Static mounts + dynamic CWD mount |
| **bwrap exec** | Assembles and execs the bwrap command | Final `exec bwrap ... -- claude ...` |
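The argument parser row can be sketched as follows. This is a minimal illustration rather than the project's implementation: the function name `parse_args` and the variables `skip_audit` and `passthrough` are assumptions made for the example.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the argument-parsing stage.
skip_audit=false
passthrough=()

parse_args() {
  skip_audit=false
  passthrough=()
  while (($# > 0)); do
    case "$1" in
      -y|--yes)
        skip_audit=true
        shift
        ;;
      --)
        # Everything after a literal "--" belongs to claude, untouched.
        shift
        passthrough+=("$@")
        break
        ;;
      *)
        passthrough+=("$1")
        shift
        ;;
    esac
  done
}

parse_args --yes --model opus "fix the tests"
printf 'skip_audit=%s args=%d\n' "$skip_audit" "${#passthrough[@]}"
```

Note that this sketch consumes `-y` wherever it appears before `--`; a `-y` meant for claude itself has to be passed after a literal `--`.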
### The bwrap Invocation Structure
bubblewrap flags are order-sensitive for mounts (later mounts overlay earlier ones) but not for namespace flags. The canonical structure:
```bash
# Flags are collected in an array because comments cannot be interleaved
# with backslash-continued lines: the first comment would end the command.
bwrap_args=(
  # --- Namespace isolation ---
  --unshare-user
  --unshare-pid
  --unshare-ipc
  --unshare-cgroup
  --die-with-parent

  # --- Environment (start clean) ---
  --clearenv
  --setenv HOME "$sandbox_home"
  --setenv PATH "$sandbox_path"
  --setenv TERM "$TERM"
  # ... more --setenv flags from allowlist ...

  # --- Base filesystem (read-only) ---
  --ro-bind /nix/store /nix/store
  --ro-bind /etc/resolv.conf /etc/resolv.conf
  --ro-bind /etc/ssl /etc/ssl
  --ro-bind /etc/nix /etc/nix
  --ro-bind /etc/passwd /etc/passwd
  --ro-bind /etc/group /etc/group
  --ro-bind /etc/nsswitch.conf /etc/nsswitch.conf

  # --- Nix daemon socket (required for nix commands) ---
  --bind /nix/var/nix/daemon-socket /nix/var/nix/daemon-socket
  --ro-bind /nix/var/nix/db /nix/var/nix/db
  --ro-bind /nix/var/nix/profiles /nix/var/nix/profiles

  # --- tmpfs layers ---
  --tmpfs /tmp
  --tmpfs /run

  # --- Proc/dev (needed for process management) ---
  --proc /proc
  --dev /dev

  # --- User home (isolated) ---
  --tmpfs "$sandbox_home"

  # --- Persistent Claude config ---
  --bind "$HOME/.claudebox" "$sandbox_home/.claude"

  # --- XDG cache for nix/comma ---
  --bind "$HOME/.claudebox/cache" "$sandbox_home/.cache"

  # --- Working directory (read-write) ---
  --bind "$(pwd)" "$(pwd)"
  --chdir "$(pwd)"
)

exec bwrap "${bwrap_args[@]}" -- claude --dangerously-skip-permissions "$@"
```
**Flag ordering rationale:**
1. Namespace flags first (they configure the sandbox type)
2. `--clearenv` before any `--setenv` (clear then populate)
3. Read-only system mounts before read-write user mounts (base before overlay)
4. `--tmpfs` for home before `--bind` into home (create the mount point, then bind into it)
5. `--chdir` last before `--` (sets starting directory)
6. `--` separates bwrap flags from the command to execute
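
Because ordering mistakes are easy to make, it can help to print the assembled command before running it. A small sketch, assuming a hypothetical helper `print_command` (it is not part of bwrap or claudebox) and a made-up example path:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print an argv as one shell-quoted, copy-pasteable line.
print_command() {
  local out
  # %q quotes each argument so that spaces and globs survive a round trip.
  printf -v out '%q ' "$@"
  printf '%s\n' "${out% }"
}

# The path below is an invented example containing a space.
print_command bwrap --tmpfs /home/alice \
  --bind "/home/alice/my project" "/home/alice/my project"
```

Reading the printed line top to bottom makes it straightforward to confirm that `--clearenv` precedes every `--setenv` and that each `--tmpfs` precedes the binds placed inside it.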
### How /nix/store Works Inside bwrap
The Nix store is the critical piece. Here is how each layer works:
**Read-only store access (`--ro-bind /nix/store /nix/store`):**
- All store paths (the closure of the wrapper script) are immediately available
- Programs in PATH resolve because PATH points to `/nix/store/...-coreutils/bin` etc.
- This is a bind mount, not a copy -- zero overhead
**Nix daemon socket (`--bind /nix/var/nix/daemon-socket`):**
- `nix` commands (build, shell, run) communicate with the Nix daemon via a Unix socket at `/nix/var/nix/daemon-socket/socket`
- The daemon runs OUTSIDE the sandbox as root -- it handles store writes
- Inside the sandbox, the user can request builds but the daemon does the actual `/nix/store` writing
- This is why `/nix/store` can be `--ro-bind` even though nix builds "write" to it: the daemon writes from outside
**Nix DB access (`--ro-bind /nix/var/nix/db`):**
- The Nix database (SQLite) tells `nix` what's installed and what paths are valid
- Read-only is sufficient; the daemon handles mutations
**Nix profiles (`--ro-bind /nix/var/nix/profiles`):**
- Needed for `nix` to resolve channels/registries
- Read-only is fine
**Result:** `nix shell nixpkgs#python3 -c python3` works inside the sandbox. The daemon fetches/builds the derivation, writes to the store (outside sandbox), and the new store path becomes visible through the existing `--ro-bind` mount (because bind mounts reflect the source's live state).
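A quick smoke test for the mounts described above could look like the following. It is a sketch under stated assumptions: `check_nix_layout` is a hypothetical helper, and it only checks that the paths exist, not that the daemon actually answers.

```bash
#!/usr/bin/env bash
# Hypothetical helper: verify that the paths the nix client needs are
# present under ROOT (an empty ROOT means the real filesystem root).
check_nix_layout() {
  local root="${1:-}" status=0
  if [[ ! -d "$root/nix/store" ]]; then
    echo "missing: /nix/store"
    status=1
  fi
  if [[ ! -S "$root/nix/var/nix/daemon-socket/socket" ]]; then
    echo "missing: nix daemon socket"
    status=1
  fi
  if [[ ! -d "$root/nix/var/nix/db" ]]; then
    echo "missing: /nix/var/nix/db"
    status=1
  fi
  return "$status"
}

# Intended to be run inside the sandbox.
if check_nix_layout ""; then
  echo "nix layout looks complete"
fi
```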
### How comma (`,`) Works Inside the Sandbox
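comma is a wrapper around `nix shell` that runs a program from nixpkgs without installing it. It takes the name of a binary, not of a package, so Claude runs `, rg pattern` rather than `, ripgrep`:
1. comma looks up which nixpkgs attribute provides the `rg` binary in the `nix-index` database (a prebuilt file index)
2. comma runs the equivalent of `nix shell nixpkgs#ripgrep -c rg pattern`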
3. Nix daemon fetches/builds the derivation outside the sandbox
4. The result appears in `/nix/store` which is bind-mounted
5. The command executes
**Requirements for comma to work:**
- `nix-index` database must exist. Two options:
  - Pre-populate in the derivation (larger closure, stale)
  - Bind-mount host's `~/.cache/nix-index` read-only (recommended -- uses host's existing DB)
- The `nix` command must be in PATH
- The Nix daemon socket must be accessible
**Recommended approach:** Bind-mount the host nix-index database:
```bash
--ro-bind "$HOME/.cache/nix-index" "$sandbox_home/.cache/nix-index"
```
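Or, if using the `nix-index-database` flake (common on NixOS), bind-mount its store path instead. Either way, the flag must come after the `$sandbox_home/.cache` bind, because a later mount over `.cache` would hide it.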
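Since bwrap fails when a bind source does not exist, the mount is best added only when the host actually has a database. A minimal sketch, where `add_nix_index_mount` and `nix_index_args` are hypothetical names:

```bash
#!/usr/bin/env bash
# Hypothetical helper: fill nix_index_args with the bind flags for the host's
# nix-index database, or leave it empty if the host has none.
nix_index_args=()

add_nix_index_mount() {
  local host_home="$1" sandbox_home="$2"
  local host_db="$host_home/.cache/nix-index"
  nix_index_args=()
  if [[ -d "$host_db" ]]; then
    nix_index_args=(--ro-bind "$host_db" "$sandbox_home/.cache/nix-index")
  else
    echo "claudebox: no nix-index database at $host_db, comma lookups will fail" >&2
  fi
}

add_nix_index_mount "$HOME" "$HOME"
echo "nix-index flags: ${#nix_index_args[@]}"
```

The resulting array would be spliced into the bwrap flag list after the `.cache` bind.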
### Data Flow
```
Host environment
  |
  |-- [env vars] --> allowlist filter --> --setenv flags --> sandbox env
  |
  |-- [/nix/store]          --ro-bind-->    sandbox /nix/store
  |-- [nix daemon socket]   --bind-->       sandbox can request builds
  |-- [CWD]                 --bind (rw)-->  sandbox CWD (Claude edits code here)
  |-- [~/.claudebox/]       --bind (rw)-->  sandbox ~/.claude (config persists)
  |-- [~/.claudebox/cache/] --bind (rw)-->  sandbox ~/.cache
  |
  |-- [~/.ssh, ~/.gnupg, ~/.aws, ...] --> NOT MOUNTED (invisible)
  |
  v
sandbox
  |-- claude --dangerously-skip-permissions
  |-- reads/writes CWD (code)
  |-- reads/writes ~/.claude (config, CLAUDE.md, etc.)
  |-- can run: git, curl, jq, rg, fd, nix, comma
  |-- can install tools via comma/nix shell
  |-- CANNOT see secrets
```
### ~/.claudebox to ~/.claude Mapping
The bind mount `--bind "$HOME/.claudebox" "$sandbox_home/.claude"` means:
- **Outside sandbox:** `~/.claudebox/` is the real directory on disk
- **Inside sandbox:** It appears as `~/.claude/` (where Claude Code expects its config)
- Claude Code reads/writes `~/.claude/settings.json`, `~/.claude/CLAUDE.md`, etc. -- all actually stored in `~/.claudebox/`
- The real `~/.claude/` on the host (if it exists) is never visible inside the sandbox
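- First-run setup: `mkdir -p ~/.claudebox/cache` before first launch (bwrap refuses to bind a source path that does not exist, and both `~/.claudebox` and `~/.claudebox/cache` are bind sources)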
Contents to pre-seed in `~/.claudebox/`:
- `CLAUDE.md` with sandbox-aware instructions (how to use comma, what tools are available)
- `settings.json` if needed for Claude Code config
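
The first-run setup and pre-seeding could be combined into one idempotent step. The sketch below is illustrative only: `ensure_state_dir` is a hypothetical name and the seeded `CLAUDE.md` text is placeholder content, not the project's actual instructions.

```bash
#!/usr/bin/env bash
# Hypothetical first-run helper: create the state directory and seed a
# default CLAUDE.md without overwriting an existing one.
ensure_state_dir() {
  local state_dir="$1"
  mkdir -p "$state_dir/cache"
  if [[ ! -e "$state_dir/CLAUDE.md" ]]; then
    cat > "$state_dir/CLAUDE.md" <<'EOF'
# Sandbox notes (placeholder text)
You are running inside claudebox, a bubblewrap sandbox.
Tools that are not on PATH can be run on demand with comma, e.g. `, python3`.
EOF
  fi
}

# Demonstrated against a throwaway directory rather than the real ~/.claudebox.
demo_dir="$(mktemp -d)"
ensure_state_dir "$demo_dir/.claudebox"
ls "$demo_dir/.claudebox"
rm -rf "$demo_dir"
```

Checking for an existing file first means user edits to `CLAUDE.md` survive later launches.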
## Patterns to Follow
### Pattern 1: writeShellApplication with runtimeInputs
**What:** Use `pkgs.writeShellApplication` to create the wrapper, with all tools in `runtimeInputs`
**Why:** Automatically sets up PATH, adds `set -euo pipefail`, shellcheck-validates the script
```nix
{ pkgs }:
pkgs.writeShellApplication {
  name = "claudebox";
  runtimeInputs = with pkgs; [
    bubblewrap
    coreutils
    # These go into the wrapper's PATH, not the sandbox's PATH
  ];
  text = builtins.readFile ./claudebox.sh;
}
```
**Important distinction:** `runtimeInputs` sets the PATH of the wrapper script itself (needs bwrap). The sandbox's internal PATH is constructed separately by the script and passed via `--setenv PATH`.
### Pattern 2: Constructing Sandbox PATH from Nix Store Paths
**What:** Build the sandbox's PATH from explicit Nix store paths, not from the wrapper's PATH
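```nix
# In the Nix expression, compute the sandbox PATH from store paths
{ pkgs, lib, claude-code }:
let
  sandboxPath = lib.makeBinPath [
    pkgs.coreutils
    pkgs.git
    pkgs.curl
    pkgs.jq
    pkgs.ripgrep
    pkgs.fd
    pkgs.nix
    pkgs.comma
    claude-code # however this is packaged
  ];
in
pkgs.writeShellApplication {
  name = "claudebox";
  runtimeInputs = [ pkgs.bubblewrap pkgs.coreutils ];
  # readFile does not interpolate, so inject the value as a shell variable
  text = ''
    sandbox_path="${sandboxPath}"
  '' + builtins.readFile ./claudebox.sh;
}
```
Then in the shell script: `--setenv PATH "$sandbox_path"`. Because `builtins.readFile` returns the script verbatim, a Nix-style `${sandboxPath}` written inside `claudebox.sh` would not be substituted; the value has to be injected as shown (or the script written inline in `text`). This guarantees the sandbox PATH contains exactly and only the intended tools, all as `/nix/store/...` paths.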
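The "only `/nix/store` paths" property can also be asserted at runtime before launching the sandbox. A minimal sketch, assuming a hypothetical guard named `validate_sandbox_path`; the store path in the demo call is invented for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical guard: succeed only if every PATH entry is under /nix/store.
validate_sandbox_path() {
  local path_value="$1" entry
  local -a entries
  [[ -n "$path_value" ]] || return 1
  IFS=':' read -r -a entries <<< "$path_value"
  for entry in "${entries[@]}"; do
    if [[ "$entry" != /nix/store/* ]]; then
      echo "unexpected PATH entry: ${entry:-<empty>}" >&2
      return 1
    fi
  done
}

# The second entry is a host path, so this value is rejected.
if validate_sandbox_path "/nix/store/abc-coreutils/bin:/usr/bin" 2>/dev/null; then
  echo "ok"
else
  echo "rejected"
fi
```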
### Pattern 3: Env Allowlist as Array
**What:** Define allowed env vars as a bash array, loop to build `--setenv` flags
```bash
allowed_vars=(
  HOME PATH TERM EDITOR VISUAL
  LANG LC_ALL LC_CTYPE
  COLORTERM FORCE_COLOR
  NO_COLOR
  XDG_RUNTIME_DIR
  CLAUDE_CODE_API_KEY
  ANTHROPIC_API_KEY
)
env_args=()
for var in "${allowed_vars[@]}"; do
  if [[ -n "${!var:-}" ]]; then
    env_args+=(--setenv "$var" "${!var}")
  fi
done
```
HOME and PATH get overridden with sandbox-specific values after this loop.
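One way to make the override explicit is to skip HOME and PATH in the loop and set them directly, so each variable is emitted exactly once. This is a sketch of that variant, not the project's code: `build_env_args` is a hypothetical name, the allowlist is shortened, and the paths in the demo call are made up.

```bash
#!/usr/bin/env bash
allowed_vars=(HOME PATH TERM LANG ANTHROPIC_API_KEY)
env_args=()

# Hypothetical helper: sandbox HOME/PATH are set explicitly; every other
# allowlisted variable is copied from the host only if it is non-empty.
build_env_args() {
  local sandbox_home="$1" sandbox_path="$2" var
  env_args=(--setenv HOME "$sandbox_home" --setenv PATH "$sandbox_path")
  for var in "${allowed_vars[@]}"; do
    case "$var" in
      HOME|PATH) continue ;;
    esac
    if [[ -n "${!var:-}" ]]; then
      env_args+=(--setenv "$var" "${!var}")
    fi
  done
}

build_env_args /home/sandbox /nix/store/example/bin
# Print only the variable names; the array also holds values such as API keys.
for ((i = 1; i < ${#env_args[@]}; i += 3)); do
  echo "${env_args[i]}"
done
```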
### Pattern 4: Pre-launch Audit on stderr
**What:** Print the env vars that will enter the sandbox, prompt on stderr
```bash
if [[ "${skip_audit}" != "true" ]]; then
  echo "=== claudebox: environment entering sandbox ===" >&2
  for var in "${allowed_vars[@]}"; do
    if [[ -n "${!var:-}" ]]; then
      echo "  ${var}=${!var}" >&2
    fi
  done
  echo "" >&2
  read -rp "Proceed? [Y/n] " answer < /dev/tty
  if [[ "${answer}" =~ ^[Nn] ]]; then
    echo "Aborted." >&2
    exit 1
  fi
fi
```
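The audit above prints values verbatim, which includes API keys. If that is undesirable on a shared screen, a masking step could be added. The following is only a sketch of one option: `mask_value` is a hypothetical helper, and the suffix patterns it treats as secret are an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: abbreviate values of variables whose names look secret.
mask_value() {
  local name="$1" value="$2"
  case "$name" in
    *_KEY|*_TOKEN|*_SECRET)
      if ((${#value} > 8)); then
        # Show a short prefix so the user can tell which key is in use.
        printf '%s...(%d chars)\n' "${value:0:4}" "${#value}"
      else
        printf '<redacted>\n'
      fi
      ;;
    *)
      printf '%s\n' "$value"
      ;;
  esac
}

mask_value TERM xterm-256color
mask_value ANTHROPIC_API_KEY "sk-ant-1234567890"
```

In the audit loop, `${!var}` would be replaced by `$(mask_value "$var" "${!var}")`.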
## Anti-Patterns to Avoid
### Anti-Pattern 1: Using --bind Instead of --ro-bind for /nix/store
**What:** Mounting /nix/store read-write inside the sandbox (via `--bind` or `--dev-bind`)
**Why bad:** The sandbox process could write to the store, bypassing the Nix daemon. No security benefit and potential store corruption.
**Instead:** `--ro-bind /nix/store /nix/store` -- the daemon handles writes from outside.
### Anti-Pattern 2: Env Denylist
**What:** Starting with the full host env and removing known-bad vars
**Why bad:** New secrets (e.g., `VAULT_TOKEN`, `OPENAI_API_KEY`) leak automatically. You must know every possible secret name.
**Instead:** `--clearenv` + explicit `--setenv` for each allowed var.
### Anti-Pattern 3: Bind-Mounting All of /home
**What:** `--bind /home /home` for convenience
**Why bad:** Exposes `~/.ssh`, `~/.gnupg`, `~/.aws`, `~/.config/gcloud`, age keys, everything
**Instead:** `--tmpfs $HOME` then selectively bind specific directories.
### Anti-Pattern 4: Forgetting --die-with-parent
**What:** Omitting `--die-with-parent` from bwrap flags
**Why bad:** If the wrapper script is killed, the sandbox process becomes orphaned and keeps running
**Instead:** Always include `--die-with-parent`.
### Anti-Pattern 5: Bind-Mounting /nix/store But Not the Daemon Socket
**What:** Read-only store mount without daemon access
**Why bad:** `nix shell`, `nix build`, and comma all fail because they cannot talk to the daemon. Tools are frozen to what's in PATH.
**Instead:** Also bind the daemon socket and /nix/var/nix/db.
## Component Build Order
Build and test each component incrementally:
### Stage 1: Minimal bwrap exec (get a shell)
- Hardcode everything, no env audit, no argument parsing
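- Goal: `bwrap --ro-bind /nix/store /nix/store --bind $(pwd) $(pwd) ... -- /bin/sh` (the shell must itself be visible inside the sandbox: either bind `/bin` for this experiment or exec a shell by its `/nix/store` path)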
- Validates: mount table works, namespace config doesn't crash
- Test: Can you run `ls` inside the sandbox? Can you see `/nix/store`?
### Stage 2: Run Claude inside bwrap
- Replace `/bin/sh` with `claude --dangerously-skip-permissions`
- Add the `~/.claudebox` -> `~/.claude` bind mount
- Add proper env setup (HOME, PATH, TERM, API key)
- Test: Does Claude launch? Can it read/write CWD?
### Stage 3: Nix/comma inside the sandbox
- Add daemon socket mount
- Add nix db/profiles mounts
- Add nix-index database mount for comma
- Test: Can Claude run `, python3` and get a working Python?
### Stage 4: Env audit + argument parsing
- Add the allowlist builder
- Add the pre-launch audit display
- Add `--yes`/`-y` flag
- Test: Does the audit show correct vars? Does `-y` skip it?
### Stage 5: Nix packaging
- `writeShellApplication` wrapper
- Construct sandbox PATH via `lib.makeBinPath`
- Wire into flake
- Test: `nix run .#claudebox` works end-to-end
### Stage 6: Polish
- Default CLAUDE.md with sandbox instructions
- Error messages for missing `~/.claudebox`
- XDG_RUNTIME_DIR handling
## Scalability Considerations
Not applicable -- this is a single-user local tool. The architecture is a shell script wrapping a single process.
## Key Technical Notes
### /nix/store Bind Mount Reflects Live Changes
When bwrap does `--ro-bind /nix/store /nix/store`, it creates a bind mount. Bind mounts in Linux reflect the live state of the source. So when the Nix daemon (running outside) adds new paths to `/nix/store`, they immediately appear inside the sandbox through the existing mount. This is why `nix shell` works: the daemon builds, writes the result to `/nix/store`, and the sandbox sees it instantly.
### --unshare-net Is Intentionally Omitted
The project explicitly keeps network access (Claude needs API access, git needs remotes, curl needs endpoints). Network isolation is out of scope per PROJECT.md -- Claude Code's own proxy handles domain allowlisting.
### User Namespace Requirement
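`--unshare-user` requires unprivileged user namespaces. On mainline kernels the relevant limit is `user.max_user_namespaces` (must be greater than 0); some distribution kernels additionally gate the feature behind `kernel.unprivileged_userns_clone=1`. NixOS enables user namespaces by default (`security.allowUserNamespaces` defaults to true). Without user namespaces, bwrap needs to be setuid root, which on NixOS means a setuid wrapper (`security.wrappers`), since Nix store paths cannot carry the setuid bit.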
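A pre-flight check along these lines could fail early with a clear message instead of a bwrap error. This is a sketch: the function name is hypothetical, and it assumes the procfs locations `/proc/sys/user/max_user_namespaces` and, on kernels that carry the distribution patch, `/proc/sys/kernel/unprivileged_userns_clone`.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check. Both file paths can be overridden, which
# keeps the function testable outside a real /proc.
userns_available() {
  local max_file="${1:-/proc/sys/user/max_user_namespaces}"
  local clone_file="${2:-/proc/sys/kernel/unprivileged_userns_clone}"
  local max
  [[ -r "$max_file" ]] || return 1
  max="$(<"$max_file")"
  if [[ ! "$max" =~ ^[0-9]+$ ]] || ((max == 0)); then
    return 1
  fi
  # The clone switch only exists on some distribution kernels.
  if [[ -r "$clone_file" && "$(<"$clone_file")" != "1" ]]; then
    return 1
  fi
  return 0
}

if userns_available; then
  echo "user namespaces: available"
else
  echo "user namespaces: unavailable, bwrap --unshare-user will fail" >&2
fi
```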
### XDG_RUNTIME_DIR
Some tools (including potentially Claude Code) expect `XDG_RUNTIME_DIR` to exist. Options:
- `--tmpfs /run/user/$(id -u)` and `--setenv XDG_RUNTIME_DIR /run/user/$(id -u)`
- Or simply don't pass it and let tools fall back to `/tmp`
Recommend the tmpfs approach for maximum compatibility.
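The tmpfs approach might be assembled like this. It is a sketch under assumptions: `build_xdg_args` and `xdg_args` are hypothetical names, and the flags are meant to be appended after `--tmpfs /run` so the per-user directory is not covered by that mount.

```bash
#!/usr/bin/env bash
# Hypothetical helper: emit the flags for a private XDG_RUNTIME_DIR.
xdg_args=()

build_xdg_args() {
  local uid runtime_dir
  uid="$(id -u)"
  runtime_dir="/run/user/$uid"
  xdg_args=(--tmpfs "$runtime_dir" --setenv XDG_RUNTIME_DIR "$runtime_dir")
}

build_xdg_args
printf '%s\n' "${xdg_args[*]}"
```

With this in place, the host's `XDG_RUNTIME_DIR` should not also be forwarded through the allowlist, since it would name a directory that does not exist inside the sandbox.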
## Sources
- bubblewrap documentation and manpage (training data, HIGH confidence -- bwrap is stable and rarely changes API)
- Nix daemon architecture (training data, HIGH confidence -- fundamental Nix design)
- nixpkgs `writeShellApplication` patterns (training data, HIGH confidence)
- Linux bind mount semantics (training data, HIGH confidence -- kernel behavior)
- comma/nix-index mechanics (training data, MEDIUM confidence -- verify comma's current invocation style)