# Project Research Summary
**Project:** claudebox
**Domain:** Nix bubblewrap sandbox wrapper for AI coding agents
**Researched:** 2026-04-09
**Confidence:** MEDIUM-HIGH
## Executive Summary
claudebox is a single-purpose Nix derivation that wraps Claude Code in a bubblewrap sandbox, hiding secrets (SSH keys, GPG, AWS creds, age keys, Tailscale state) while preserving full coding capability. The established approach is straightforward: `writeShellApplication` produces a shellcheck-validated bash script that assembles bwrap flags and `exec`s into the sandboxed Claude process. The entire security model rests on two primitives -- `--clearenv` with an explicit `--setenv` allowlist for the environment, and selective bind-mounts over an empty root for the filesystem (default-deny). This is a well-trodden pattern in the Nix ecosystem: nixpak, bubblejail, and nix-bubblewrap all implement variations of it.
The recommended approach is a five-stage shell script (arg parsing, env building, env audit, bwrap invocation, exec claude) packaged as a Nix flake. The key differentiator over generic sandbox wrappers is tool self-provisioning via comma/nix-index-database -- Claude can install any Nix package on demand inside the sandbox because the Nix daemon socket is bind-mounted, and new store paths appear through the live bind mount. This eliminates the "frozen toolset" problem that makes most sandboxes painful for development work.
The primary risks are all in the "looks like it works but doesn't" category. The sandbox can appear functional after a basic test while leaking environment variables, lacking DNS resolution, missing SSL certificates, or having broken git. Research identified 15 specific pitfalls; 6 are critical, and all 6 surface in Phase 1. The mitigation is a strict 7-point integration test (env check, curl HTTPS, nix shell, git operations, Node.js, full Claude session, tool installation via comma) that must pass before declaring any phase complete.
## Key Findings
### Recommended Stack
The stack is minimal by design -- a shell script and two Nix packages. No frameworks, no languages beyond bash, no build systems beyond Nix.
**Core technologies:**
- **`writeShellApplication`**: Nix function to produce the wrapper -- provides shellcheck at build time, `set -euo pipefail`, and `runtimeInputs` PATH wiring
- **`bubblewrap` (bwrap)**: Unprivileged user-namespace sandbox -- no setuid needed on NixOS, mature and stable API
- **`lib.makeBinPath`**: Constructs the sandbox-internal PATH from explicit Nix store paths -- guarantees only declared tools are available
- **`comma` + `nix-index-database`**: On-demand package installation inside sandbox -- use `comma-with-db` or bind-mount host's nix-index DB
**Runtime deps for sandbox PATH:** coreutils, git, curl, jq, ripgrep, fd, nix, comma, bash, nodejs
**Explicitly excluded from sandbox:** gnupg, openssh, age/agenix, tailscale (secret material / infrastructure access)
### Expected Features
**Must have (table stakes):**
- Filesystem isolation with default-deny (bwrap `--tmpfs /` base)
- Environment allowlist via `--clearenv` + `--setenv`
- Secret path hiding (~/.ssh, ~/.gnupg, ~/.aws, age keys -- simply never mounted)
- Minimal PATH from Nix store paths only
- Nix store read-only mount + daemon socket for tool provisioning
- Persistent config directory (~/.claudebox mapped to ~/.claude inside sandbox)
- Pre-launch env audit with `--yes`/`-y` skip flag
- Working /tmp, /dev, /proc
- Exit code passthrough and signal forwarding via `exec`
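
The environment allowlist item can be sketched as follows. This is a minimal illustration, not the project's code: the `ALLOWED_VARS` array, the `ENV_FLAGS` array, and the `build_env_flags` helper are hypothetical names, and `ANTHROPIC_API_KEY` is an assumed name for the API key variable. Only `--clearenv` and `--setenv` are bwrap flags named in the research.

```shell
# Hypothetical allowlist; the real list needs live testing against Claude Code.
ALLOWED_VARS=(HOME TERM ANTHROPIC_API_KEY SSL_CERT_FILE NIX_SSL_CERT_FILE)

# Fill the global ENV_FLAGS array with --clearenv plus one --setenv per
# allowlisted variable that is actually set on the host.
build_env_flags() {
  ENV_FLAGS=(--clearenv)
  local name
  for name in "${ALLOWED_VARS[@]}"; do
    # [[ -v NAME ]] is true only when the variable called NAME is set.
    if [[ -v $name ]]; then
      ENV_FLAGS+=(--setenv "$name" "${!name}")
    fi
  done
}
```

`PATH` is deliberately absent from the allowlist: the sandbox-internal PATH comes from `lib.makeBinPath` and would be added as its own `--setenv PATH ...` pair rather than copied from the host.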
**Should have (differentiators for v1):**
- Tool self-provisioning via comma (already planned, low complexity)
- Injected system prompt (CLAUDE.md in ~/.claudebox telling Claude about sandbox capabilities)
- Dry-run mode (`--dry-run` prints bwrap command without executing)
- Sandbox health check (`claudebox --check`)
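
One possible shape for the dry-run mode is shown below. It is a sketch under assumptions: `print_command`, `launch`, and the `DRY_RUN` variable are illustrative names, not part of the research. The idea is to quote every argument with bash's `printf '%q'` so the printed line can be pasted back into a shell unchanged.

```shell
# Print a command as one shell-quoted line instead of running it.
print_command() {
  printf '%q ' "$@"
  printf '\n'
}

# Hypothetical launcher: prints when DRY_RUN=1, otherwise replaces the
# wrapper process with the command so exit codes and signals pass through.
launch() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    print_command "$@"
  else
    exec "$@"
  fi
}
```

Because the real path uses `exec`, the same function covers the exit code passthrough requirement: there is no wrapper process left to translate the status.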
**Defer (v2+):**
- Env var leak detection (regex scanning for secret-like patterns)
- Project-local tool declarations (.claudebox.toml)
- Git credential isolation (sandbox-specific .gitconfig)
- Multiple working directories (--mount-ro/--mount-rw flags)
- Configurable security profiles (one hardcoded posture is correct for v1)
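
For the deferred env var leak detection, a name-based check might look like the sketch below. The patterns and the `looks_secret` name are purely illustrative assumptions; a real implementation would need a reviewed pattern list and would probably inspect values as well as names.

```shell
# Succeed when a variable NAME looks like it could hold a secret.
# Both patterns are illustrative guesses, not a vetted list.
looks_secret() {
  local suffix_re='(TOKEN|SECRET|PASSWORD|CREDENTIALS?|_KEY)$'
  local prefix_re='^(AWS|SSH|GPG|KUBE)'
  [[ $1 =~ $suffix_re || $1 =~ $prefix_re ]]
}
```

A check like this can only warn: an allowlisted variable such as an API key would match too, so it complements the allowlist rather than replacing it.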
**Anti-features (never build):**
- Network isolation (Claude Code handles domain allowlisting; bwrap netns is fragile)
- GUI/audio/DBus passthrough (CLI tool, no desktop integration)
- Seccomp/capability dropping (threat model is data exfiltration, not privilege escalation)
- Docker/OCI wrapping (Nix+bwrap is lighter and daemonless)
### Architecture Approach
The architecture is a single shell script with five sequential stages, packaged as one Nix derivation. There are no services, no config files to parse (in v1), no persistent state beyond ~/.claudebox. The critical architectural insight is the two-PATH distinction: `runtimeInputs` sets the wrapper script's PATH (needs bwrap), while `lib.makeBinPath` constructs the sandbox-internal PATH (needs git, curl, etc.). Mount ordering is the primary complexity -- bwrap processes mounts sequentially, later mounts overlay earlier ones, so the order must be: tmpfs root, read-only system mounts, tmpfs home, specific bind-mounts into home, CWD bind-mount.
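
The mount order described above can be sketched as a bash array. This is an illustration under assumptions: `build_mount_flags`, `MOUNT_FLAGS`, and the two parameters are hypothetical names, and the exact set of system mounts still has to be confirmed by live testing. The flags themselves (`--tmpfs`, `--ro-bind`, `--bind`, `--dev`, `--proc`) are standard bwrap options.

```shell
# Build the mount portion of a bwrap command line in dependency order.
# $1 = home directory, $2 = project directory (illustrative parameters).
build_mount_flags() {
  local home=$1 project=$2
  MOUNT_FLAGS=(
    --tmpfs /                                  # 1. empty root: default-deny
    --ro-bind /nix/store /nix/store            # 2. read-only system mounts
    --bind /nix/var/nix/daemon-socket /nix/var/nix/daemon-socket
    --ro-bind /etc/ssl /etc/ssl
    --dev /dev
    --proc /proc
    --tmpfs /tmp
    --tmpfs "$home"                            # 3. empty home hides ~/.ssh etc.
    --bind "$home/.claudebox" "$home/.claude"  # 4. specific binds into home
    --bind "$project" "$project"               # 5. working directory last
  )
}
```

Keeping the flags in an array rather than a string preserves paths that contain spaces, and it makes the ordering reviewable in one place: moving the `--tmpfs "$home"` line below the `.claudebox` bind would silently hide the config directory.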
**Major components:**
1. **Nix derivation** (flake.nix) -- pins all deps, builds wrapper via `writeShellApplication`, interpolates sandbox PATH via `lib.makeBinPath`
2. **Argument parser** -- handles `--yes`, `--dry-run`, `--check`, collects passthrough args for claude
3. **Env builder** -- reads host vars, filters through allowlist array, builds `--setenv` flag list
4. **Env auditor** -- displays filtered env on stderr, prompts for confirmation (skippable with `--yes`)
5. **bwrap invocation** -- assembles namespace flags, mount table, env flags, execs into `claude --dangerously-skip-permissions`
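
The argument parser component could be as small as the following sketch. The variable names (`ASSUME_YES`, `DRY_RUN`, `CHECK`, `CLAUDE_ARGS`) and the treatment of `--` as an explicit passthrough separator are assumptions made for illustration; the research only fixes the flag names.

```shell
# Parse wrapper flags; every other argument is collected for claude.
parse_args() {
  ASSUME_YES=0 DRY_RUN=0 CHECK=0
  CLAUDE_ARGS=()
  while [ $# -gt 0 ]; do
    case $1 in
      -y|--yes)  ASSUME_YES=1 ;;
      --dry-run) DRY_RUN=1 ;;
      --check)   CHECK=1 ;;
      --)        shift; CLAUDE_ARGS+=("$@"); break ;;  # rest goes to claude verbatim
      *)         CLAUDE_ARGS+=("$1") ;;
    esac
    shift
  done
}
```

With this shape, `claudebox --yes -- --check` would pass `--check` to claude instead of triggering the wrapper's own health check, which avoids flag-name collisions between the wrapper and the wrapped tool.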
### Critical Pitfalls
1. **Environment variable leaks** -- bwrap inherits the parent env by default; `--clearenv` is mandatory from day one. Without it, `SSH_AUTH_SOCK`, `AWS_PROFILE`, and `KUBECONFIG` all pass through and the sandbox is theater. Test by running `env` inside the sandbox.
2. **Nix daemon socket missing** -- mounting `/nix/store` read-only but forgetting `/nix/var/nix/daemon-socket` kills all comma/nix-shell functionality. Bind-mount the socket directory with `--bind`; whether `--ro-bind` also works for a Unix socket is untested (see Research Flags).
3. **DNS/SSL resolution failure** -- on NixOS, `/etc/resolv.conf` is often a symlink; must resolve with `readlink -f` before mounting. Must also mount `/etc/ssl`, `/etc/nsswitch.conf`, and pass `NIX_SSL_CERT_FILE`/`SSL_CERT_FILE` in the env allowlist. Without this, nothing network-dependent works.
4. **Git broken inside sandbox** -- missing ~/.gitconfig (no user identity), potential safe.directory rejection from UID mismatch with `--unshare-user`, credential helpers referencing binaries not in the sandbox. Mount the host gitconfig read-only or generate a minimal one.
5. **Missing /dev nodes** -- `--dev /dev` provides basics but may lack `/dev/shm` (Node.js V8), `/dev/pts` (PTY allocation), `/dev/tty` (git prompts). Test with an actual Claude Code session, not just `echo hello`.
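
The symlink handling from the DNS/SSL pitfall can be sketched as below. `build_network_flags` and `NETWORK_FLAGS` are hypothetical names; the technique is simply to resolve each path with `readlink -f` (GNU coreutils) and bind the real file onto the original path, so the sandbox never sees a link pointing at something that was not mounted.

```shell
# Emit --ro-bind flags for network-related files, resolving symlinks first
# so the bind source is the real file rather than a dangling link.
build_network_flags() {
  NETWORK_FLAGS=()
  local path real
  for path in "$@"; do
    [ -e "$path" ] || continue        # skip files this host does not have
    real=$(readlink -f "$path")
    NETWORK_FLAGS+=(--ro-bind "$real" "$path")
  done
}
```

A call such as `build_network_flags /etc/resolv.conf /etc/nsswitch.conf /etc/ssl` covers the files named in the pitfall; the certificate variables still have to pass through the env allowlist separately.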
## Implications for Roadmap
Based on research, suggested phase structure:
### Phase 1: Minimal Viable Sandbox
**Rationale:** All critical pitfalls (6 of 6) and all table stakes features converge here. Nothing else can be built or tested without a working sandbox. The architecture research provides an explicit 6-stage build order within this phase.
**Delivers:** A working `claudebox` command that launches Claude Code in a bwrap sandbox with env isolation, filesystem isolation, and basic tool access.
**Addresses:** All table stakes features (filesystem isolation, env allowlist, secret hiding, minimal PATH, persistent config, /tmp, /dev, /proc, exit code passthrough, signal forwarding)
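**Avoids:** Env leaks (#1), missing /dev (#2), daemon socket (#3), DNS (#4), /tmp (#5), git (#6), symlinks (#10), SSL (#11), locale (#12), home dir (#13), mount ordering (#14), hardcoded paths (#15). Pitfall numbers here and in the later phases refer to the full 15-item list from the pitfalls research, not to the condensed list under Critical Pitfalls.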
**Build sub-order within phase:**
1. Bare bwrap invocation (get a shell, validate mounts)
2. Run Claude inside bwrap (add config mount, env setup, API key)
3. Add Nix daemon socket + comma support
4. Fix git (gitconfig mount, safe.directory)
5. Env audit + argument parsing (--yes flag)
6. Nix packaging (writeShellApplication, flake, lib.makeBinPath)
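
For sub-step 4, generating a sandbox-specific gitconfig is one of the two options the research mentions. A minimal sketch, assuming a hypothetical `write_sandbox_gitconfig` helper and that identity values contain no characters needing git-config escaping:

```shell
# Write a minimal gitconfig for the sandbox: identity plus a safe.directory
# entry for the project, and deliberately no credential helper.
# $1 = output file, $2 = user name, $3 = email, $4 = project directory.
write_sandbox_gitconfig() {
  local out=$1 name=$2 email=$3 project=$4
  cat > "$out" <<EOF
[user]
    name = $name
    email = $email
[safe]
    directory = $project
EOF
}
```

The generated file would then be bind-mounted read-only at `~/.gitconfig` inside the sandbox. Whether the `safe.directory` entry is actually needed depends on the `--unshare-user` behavior that the research flags for empirical verification.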
### Phase 2: System Prompt and UX Polish
**Rationale:** Once the sandbox works, Claude needs to know it's sandboxed and how to use comma. This is low-effort, high-impact.
**Delivers:** Default CLAUDE.md in ~/.claudebox with sandbox-aware instructions, `--dry-run` mode, `--check` health check, error messages for missing prerequisites.
**Addresses:** Injected system prompt, dry-run mode, sandbox health check
**Avoids:** TTY/PTY issues (#8), XDG/cache directory issues (#9)
### Phase 3: Hardening and Testing
**Rationale:** After functionality is proven, lock down remaining attack surface and formalize the test suite.
**Delivers:** PID namespace isolation (`--unshare-pid`), formalized 7-point integration test script, documentation.
**Addresses:** /proc info leak (#7), the meta-pitfall of happy-path-only testing
**Avoids:** Regression on any earlier pitfall via automated tests
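
The formalized integration test could be built on a small runner like the sketch below. `run_check` and `FAILED` are hypothetical names, and the example checks in the comments are illustrations of the 7 points rather than an agreed test plan. The runner records each result and keeps going, so one failure does not mask the others.

```shell
# Run one named check; record the result without aborting the whole run.
FAILED=0
run_check() {
  local name=$1
  shift
  if "$@" >/dev/null 2>&1; then
    printf 'PASS  %s\n' "$name"
  else
    printf 'FAIL  %s\n' "$name"
    FAILED=$((FAILED + 1))
  fi
}

# Illustrative use inside the sandbox (commands are examples only):
#   run_check "no SSH agent leak" sh -c '! env | grep -q "^SSH_AUTH_SOCK="'
#   run_check "HTTPS works"       curl -fsS https://example.com
#   run_check "git identity"      git config user.email
#   exit "$FAILED"
```

Exiting with the failure count makes the script usable as a gate: any non-zero status blocks declaring the phase complete.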
### Phase Ordering Rationale
- Phase 1 must come first because every other phase depends on a working sandbox. The internal build order (shell first, then Claude, then Nix, then git, then UX, then packaging) follows the dependency chain identified in architecture research.
- Phase 2 is separated from Phase 1 because it is UX work that adds no security value. It is still low complexity and dramatically improves the actual Claude experience.
- Phase 3 is last because hardening and testing are polish on a working tool. PID namespace isolation is not blocking functionality.
- All three phases are small. This is a shell script, not a platform. Total implementation is likely under 200 lines of bash + 50 lines of Nix.
### Research Flags
Phases likely needing deeper research during planning:
- **Phase 1 (Nix/comma sub-step):** Verify `comma-with-db` packaging in current `nix-community/nix-index-database` flake. Verify `--clearenv` availability in nixpkgs bwrap version. Test daemon socket bind-mount vs ro-bind behavior.
Phases with standard patterns (skip research-phase):
- **Phase 1 (all other sub-steps):** bwrap flags, writeShellApplication, mount ordering -- all well-documented, stable APIs.
- **Phase 2:** Entirely standard (writing a markdown file, adding CLI flags to a bash script).
- **Phase 3:** Standard bwrap flag (`--unshare-pid`), standard shell test assertions.
## Confidence Assessment
| Area | Confidence | Notes |
|------|------------|-------|
| Stack | HIGH | writeShellApplication and bwrap are stable, well-documented Nix/Linux primitives |
| Features | MEDIUM | Feature landscape derived from training data on firejail/nixpak/bubblejail; core features are certain, differentiator priority is judgment |
| Architecture | HIGH | Single shell script architecture is obvious for this scope; mount ordering and PATH construction are well-established patterns |
| Pitfalls | MEDIUM-HIGH | Pitfalls are real and well-known in the bwrap community, but some NixOS-specific behaviors (symlink resolution, daemon socket permissions) need live testing |
**Overall confidence:** MEDIUM-HIGH
### Gaps to Address
- **`--clearenv` version requirement:** `--clearenv` was added in bubblewrap 0.5.0; verify that current nixpkgs ships at least that version. If it is older, fall back to an `env -i` prefix.
- **comma-with-db packaging:** Verify current nix-index-database flake API. May need to bind-mount host DB instead.
- **Claude Code env var requirements:** Research identified the obvious vars (HOME, PATH, TERM, API key) but Claude Code may need additional vars not documented. Requires live testing.
- **`--unshare-user` and git safe.directory interaction:** Research flags potential UID mismatch. Needs empirical verification -- may need to skip `--unshare-user` or add git safe.directory config.
- **`/dev/shm` and `/dev/pts` availability:** `--dev /dev` may or may not provide these. Requires testing on NixOS with current bwrap version.
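
The version gap could be closed with a runtime guard along these lines. `version_ge` is a hypothetical helper built on GNU `sort -V`; the commented usage assumes `bwrap --version` prints the version number as its second word, which should be confirmed against the packaged binary.

```shell
# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
  # sort -V orders version strings; if $2 sorts first (or ties), $1 >= $2.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Illustrative use (output format of `bwrap --version` is an assumption):
#   have=$(bwrap --version | awk '{print $2}')
#   version_ge "$have" "$MIN_BWRAP_VERSION" || echo "bwrap too old" >&2
```

Plain string comparison would get this wrong for two-digit components (`0.11.0` sorts before `0.5.0` lexically), which is why the sketch leans on version-aware sorting.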
## Sources
### Primary (HIGH confidence)
- bubblewrap documentation and manpage -- mount semantics, namespace flags, --clearenv
- nixpkgs `writeShellApplication` API -- stable since 2022, standard Nix pattern
- Linux bind mount semantics -- kernel behavior, well-established
- Git safe.directory behavior -- Git 2.35.2+, well-documented
### Secondary (MEDIUM confidence)
- firejail feature documentation -- used for feature landscape comparison
- nixpak/bubblejail GitHub repositories -- used for architectural pattern comparison
- comma/nix-index-database mechanics -- community project, verify current API
- Node.js /dev requirements -- inferred from V8 runtime behavior
### Tertiary (LOW confidence)
- Specific bwrap version in current nixpkgs -- needs verification
- comma-with-db package availability -- needs verification against current flake
---
*Research completed: 2026-04-09*
*Ready for roadmap: yes*