# Project Research Summary
**Project:** claudebox v2.0 — Network Isolation & Profiles
**Domain:** Nix/bubblewrap sandbox wrapper for AI coding agents (Claude Code)
**Researched:** 2026-04-10
**Confidence:** HIGH (auth passthrough, instance isolation, network none/full tiers), MEDIUM-HIGH (internet-only tier, profiles)
## Executive Summary
claudebox v2.0 extends the validated v1.0 foundation (writeShellApplication, bubblewrap, comma-with-db) with four new capability areas: tiered network isolation, per-project instance isolation, named profiles, and host auth passthrough. The only new runtime dependency is a userspace networking sidecar (`pkgs.slirp4netns` or `pkgs.passt`) for the internet-only network tier. All other features are pure shell extensions with no new deps.
The recommended build order is strictly dependency-driven. Auth passthrough must come first because every downstream feature assumes Claude Code can authenticate inside the sandbox. Instance isolation depends on the auth mount path being stable. The `none` network tier validates the exec-to-wait refactoring before the high-complexity `inet` tier is added. Named profiles tie together all prior subsystems. Nix package injection is independently testable and should be last.
The highest-risk feature is the internet-only network tier. It requires process coordination between bwrap and a sidecar: bwrap must be backgrounded (not exec'd), its sandbox PID captured via `--info-fd` or `--pidfile`, the sidecar started and waited on for readiness, and a custom `/etc/resolv.conf` injected because the host's resolv.conf points to a loopback DNS unreachable from the new network namespace.
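
As a minimal sketch of the PID-capture step: this assumes bwrap's `--info-fd` output is a small JSON object containing a `child-pid` field, which should be verified against the installed bubblewrap version. The function name `parse_child_pid` is hypothetical. It avoids a jq dependency and stops at the first match, so it does not rely on the writer closing the descriptor.

```bash
# Hypothetical helper: extract the sandbox PID from the JSON that
# `bwrap --info-fd FD` writes, e.g. { "child-pid": 12345 }.
# Reads stdin line by line and returns at the first match.
parse_child_pid() {
  local line
  local re='"child-pid"[[:space:]]*:[[:space:]]*([0-9]+)'
  # `|| [ -n "$line" ]` also handles a final line without a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    if [[ "$line" =~ $re ]]; then
      printf '%s\n' "${BASH_REMATCH[1]}"
      return 0
    fi
  done
  return 1  # no child-pid field seen
}
```

The extracted PID would then be handed to the sidecar as its target namespace (slirp4netns takes the PID as a positional argument), and the script would wait on the sidecar's readiness descriptor before relying on the network.
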
## Key Findings
### Recommended Stack
The existing stack carries forward unchanged. New additions:
**Core technologies:**
- `pkgs.slirp4netns` (v1.3.3): internet-only network tier sidecar — well-documented `--ready-fd`/`--exit-fd` sync primitives for bash coordination
- `pkgs.passt` (pasta binary): alternative sidecar — Podman 5 default, NAT-free, cleaner DNS; consider as primary if bash integration proves clean
- Bash-sourced `.sh` profile files OR flat JSON + jq: named profile config
- `sha256sum` (coreutils, already present): instance directory hashing
### Expected Features
**Must have (table stakes):**
- Host auth passthrough — rw mount of `~/.claude/.credentials.json` (rw required for OAuth token refresh)
- Per-project instance isolation — `~/.claudebox/instances/<hash>/.claude/` with git worktree awareness
- Named profiles (`--profile foo` / `CLAUDEBOX_PROFILE=foo`) — env vars, mounts, packages, network tier
- Tiered network isolation: `none` (offline) and `inet` (internet, no LAN/Tailscale)
**Should have (differentiators):**
- Profile `extends` / inheritance
- Network tier and active profile shown in pre-launch env audit
- Profile `--list` and `--show` commands
- Instance dir GC (`--gc`)
**Defer to v2.1+:**
- Full `nix develop .#devShell` integration — profile `packages` field covers 80% case
- Domain-level network allowlists
**Anti-features (explicitly avoid):**
- Mounting `.credentials.json` read-only — breaks OAuth token refresh
- Auto-detecting and injecting devShell on every launch — breaks "no surprises" principle
- Storing secret values in profile files — profiles reference env var names, not values
### Architecture Approach
`claudebox.sh` gains four new functions, plus modifications to the argument parser, env builder, mount builder, and exec block. The exec block must branch on network tier: `full` and `none` use `exec bwrap`; `inet` uses `bwrap ... &` + sidecar + `wait`.
**Major components:**
1. **Arg parse** — adds `--profile NAME` and `--network full|inet|none` flags
2. **Profile loader** — reads `~/.claudebox/profiles/<name>.json` via jq; yields network, packages, env, mounts, passthrough settings
3. **Instance resolver** — resolves git worktree common dir, hashes canonical project root, creates instance dir
4. **Auth mount** — `--bind "$HOME/.claude/.credentials.json"` (read-write, not read-only)
5. **Package injector** — `nix build --no-link --print-out-paths nixpkgs#<pkg>` loop; prepends to SANDBOX_PATH
6. **Network setup** — `--unshare-net` for none/inet; sidecar coordination for inet; temp resolv.conf for inet
7. **Exec block** — three-branch: full → `exec bwrap`; none → `exec bwrap --unshare-net`; inet → `bwrap --pidfile &` + sidecar + `wait`
8. **Pre-launch audit** — extended to show active profile, network tier, extra mounts
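
The three-branch exec block could be driven by a small lookup like the one below. This is a sketch only: `network_launch_plan` and its output format ("mode" followed by extra bwrap flags) are hypothetical and not part of the existing `claudebox.sh`.

```bash
# Hypothetical helper: map a network tier to the launch strategy described above.
# Prints "<mode> [extra bwrap flags]", where mode is "exec" (replace the shell)
# or "background" (run bwrap with &, start the sidecar, then wait).
network_launch_plan() {
  local tier="${1:-full}"
  case "$tier" in
    full) printf 'exec\n' ;;
    none) printf 'exec --unshare-net\n' ;;
    inet) printf 'background --unshare-net\n' ;;
    *)
      printf 'claudebox: unknown network tier: %s\n' "$tier" >&2
      return 2
      ;;
  esac
}
```

Keeping the tier-to-strategy mapping in one place means the `none` tier exercises the same code path as `inet` up to the point where the sidecar is needed.
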
### Critical Pitfalls
1. **Auth mount must be read-write, not read-only** — Claude Code's OAuth flow writes refreshed tokens back to `.credentials.json`. A `--ro-bind` causes silent EACCES; users get locked out. *Phase 4.*
2. **Sidecar requires process coordination** — bwrap must be backgrounded to capture sandbox PID; `--ready-fd` awaited before proceeding; `--exit-fd` used to prevent process leaks on abnormal exit. *Phase 6.*
3. **DNS breaks in isolated namespace** — Host `/etc/resolv.conf` points to `127.0.0.53` (loopback, unreachable in new namespace). Must generate temp resolv.conf with sidecar DNS gateway. *Phase 6.*
4. **Git worktree hash divergence** — Hashing CWD gives different hashes for worktrees of the same repo, so each worktree would get its own instance dir. Use `git rev-parse --git-common-dir` to normalize. *Phase 5.*
5. **Concurrent sessions race on instance directory** — Two claudebox invocations in the same project write to the same files. Add flock lockfile. *Phase 5.*
6. **Profile sourcing requires permission validation** — Shell-sourcing without checking ownership/permissions is code injection. Validate file ownership. *Phase 7.*
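
For the concurrent-session pitfall, a flock guard might look like the following sketch. It assumes the util-linux `flock` binary is available and that a second session should fail fast rather than queue; the function name `with_instance_lock`, the `.lock` filename, and exit code 75 are all hypothetical choices.

```bash
# Hypothetical helper: run a command while holding an exclusive lock on the
# instance directory. A second invocation fails immediately instead of blocking.
with_instance_lock() {
  local dir="$1"
  shift
  mkdir -p "$dir" || return 1
  (
    # fd 9 holds the lock for the lifetime of this subshell; the lock is
    # released automatically when the subshell (and its children) exit.
    exec 9>"$dir/.lock" || exit 1
    if ! flock -n 9; then
      printf 'claudebox: instance already in use: %s\n' "$dir" >&2
      exit 75
    fi
    "$@"
  )
}
```

Usage would be along the lines of `with_instance_lock "$instance_dir" launch_sandbox`, where `launch_sandbox` stands in for whatever function runs bwrap. Note that this only composes with the backgrounded `inet` launch; an `exec bwrap` inside the lock would need the descriptor to stay open across the exec, which it does by default since fd 9 is inherited.
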
## Implications for Roadmap
### Phase 4: Auth Passthrough
Auth must come first — every downstream feature needs Claude to authenticate inside the sandbox.
- Mount `~/.claude/.credentials.json` read-write into instance dir
- Validate token refresh works (not just initial auth)
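
A minimal sketch of the auth mount arguments, assuming the sandbox sees the same `$HOME` as the host and that the instance directory is bound over `~/.claude` earlier on the bwrap command line (bwrap applies mounts in order, so the file bind has to come afterwards). `auth_mount_args` is a hypothetical name.

```bash
# Hypothetical helper: emit the bwrap arguments for the credentials mount,
# one argument per line. Uses --bind (read-write), never --ro-bind, so that
# refreshed OAuth tokens can be written back.
auth_mount_args() {
  local creds="$HOME/.claude/.credentials.json"
  # --bind requires an existing source; emit nothing if the user has not
  # logged in on the host yet.
  if [ -f "$creds" ]; then
    printf '%s\n' --bind "$creds" "$creds"
  fi
}
```

The caller could collect these with `mapfile -t AUTH_ARGS < <(auth_mount_args)` and splice `"${AUTH_ARGS[@]}"` into the bwrap invocation after the instance directory mount.
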
### Phase 5: Per-Project Instance Isolation
Depends on Phase 4 (auth mount path must be stable).
- `~/.claudebox/instances/<sha256(canonical_root)[0:16]>/` as `~/.claude`
- Git worktree-aware hashing via `git rev-parse --git-common-dir`
- flock-based concurrent session guard
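
The worktree-aware hashing above can be sketched as follows. The function names are hypothetical, and the sketch assumes a non-bare repository: it treats the parent of the git common dir as the canonical root, and falls back to the directory itself outside a repository.

```bash
# Hypothetical helper: resolve the project root shared by all worktrees.
canonical_project_root() {
  local dir="${1:-$PWD}" top common
  if top="$(git -C "$dir" rev-parse --show-toplevel 2>/dev/null)" &&
     common="$(git -C "$top" rev-parse --git-common-dir 2>/dev/null)"; then
    # Queried from the worktree top level, the common dir is either ".git"
    # (main checkout) or an absolute path (linked worktree).
    [[ "$common" = /* ]] || common="$top/$common"
    (cd "$common/.." && pwd -P)
  else
    (cd "$dir" && pwd -P)
  fi
}

# First 16 hex characters of the SHA-256 of the canonical root path.
instance_hash() {
  printf '%s' "$1" | sha256sum | cut -c1-16
}

# Instance directory for a given working directory.
instance_dir_for() {
  local root
  root="$(canonical_project_root "${1:-$PWD}")" || return 1
  printf '%s/.claudebox/instances/%s\n' "$HOME" "$(instance_hash "$root")"
}
```

Because the hash input is the resolved root rather than the current directory, launching from a subdirectory or from a linked worktree should yield the same instance directory as launching from the main checkout.
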
### Phase 6: Tiered Network Isolation
Highest complexity. Two sub-phases: `none` first (trivial), `inet` second (sidecar coordination).
- `none`: add `--unshare-net`, keep `exec bwrap`
- `inet`: `bwrap &` + slirp4netns/pasta sidecar + `--ready-fd`/`--exit-fd` + temp resolv.conf
- `--network` flag and `CLAUDEBOX_NETWORK` env var
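
The temp resolv.conf step might be sketched like this. The default of `10.0.2.3` assumes slirp4netns with its default addressing, where that address is the built-in DNS forwarder; pasta handles DNS differently, so the address would need to be passed explicitly there. `write_sandbox_resolv_conf` is a hypothetical name.

```bash
# Hypothetical helper: write a resolv.conf that points at the sidecar's DNS
# address and print the path of the generated file.
write_sandbox_resolv_conf() {
  local dns="${1:-10.0.2.3}" file
  file="$(mktemp "${TMPDIR:-/tmp}/claudebox-resolv.XXXXXX")" || return 1
  printf 'nameserver %s\n' "$dns" > "$file"
  printf '%s\n' "$file"
}
```

The generated file would then be mounted over the sandbox's resolver config, for example with `--ro-bind "$resolv" /etc/resolv.conf` on the bwrap command line, and removed once the sandbox exits.
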
### Phase 7: Named Profiles
Ties together all prior subsystems.
- `--profile foo` / `CLAUDEBOX_PROFILE=foo` (flag wins)
- Profile schema: network, env, extra_env_passthrough, mounts, packages
- `~/.claudebox/profiles/<name>.json` parsed with jq
- Permission validation before loading
- Pre-launch audit extended with profile info
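
A sketch of the precedence rule and the permission check, under stated assumptions: GNU `stat` is available (as it is with coreutils from nixpkgs), and "safe" is taken to mean owned by the current user and not writable by group or others. Both function names are hypothetical.

```bash
# Hypothetical helper: the --profile flag value wins over CLAUDEBOX_PROFILE.
# Prints an empty line when neither is set.
resolve_profile_name() {
  local flag_value="${1:-}"
  printf '%s\n' "${flag_value:-${CLAUDEBOX_PROFILE:-}}"
}

# Hypothetical helper: succeed only if the profile file is a regular file,
# owned by the invoking user, and not group- or world-writable.
profile_is_safe() {
  local file="$1" owner mode
  [ -f "$file" ] || return 1
  owner="$(stat -c '%u' "$file")" || return 1
  mode="$(stat -c '%a' "$file")" || return 1
  [ "$owner" = "$(id -u)" ] || return 1
  # Octal 022 masks the group-write and other-write bits.
  (( (8#$mode & 8#022) == 0 ))
}
```

The check matters most if profiles end up being bash-sourced; with JSON parsed by jq it is still a cheap guard against another user altering mounts or the network tier.
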
### Phase 8: Nix Package Injection
Last because it has startup latency risk and is independently testable.
- Profile `packages` field resolved via `nix build --no-link --print-out-paths`
- Store paths prepended to SANDBOX_PATH
- Result caching to avoid re-resolving
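
The resolution loop could be sketched as below. It assumes each package exposes its executables under `bin/` in its output path and leaves out the caching step; `prepend_packages_to_path` is a hypothetical name.

```bash
# Hypothetical helper: resolve each package to its store path(s) and print a
# PATH value with their bin/ directories prepended to the given base path.
prepend_packages_to_path() {
  local base_path="$1"
  shift
  local pkg out store_path prefix=""
  for pkg in "$@"; do
    out="$(nix build --no-link --print-out-paths "nixpkgs#${pkg}")" || return 1
    # A package can have several outputs, printed one path per line.
    while IFS= read -r store_path; do
      if [ -n "$store_path" ]; then
        prefix+="${store_path}/bin:"
      fi
    done <<<"$out"
  done
  printf '%s%s\n' "$prefix" "$base_path"
}
```

A call such as `SANDBOX_PATH="$(prepend_packages_to_path "$SANDBOX_PATH" jq ripgrep)"` would extend the path; since every `nix build` invocation evaluates the flake, this loop is where the startup latency risk comes from and where a cache keyed on the package list would pay off.
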
### Phase Ordering Rationale
- Auth before isolation: credential mount path must be established first
- Isolation before profiles: per-project history makes profile defaults meaningful
- Network `none` before `inet`: validates exec→wait refactor cheaply
- Network before profiles: profiles set the network tier; implementation must exist first
- Profiles before package injection: package injection consumes profile packages field
### Research Flags
- **Phase 6 (inet tier):** pasta/slirp4netns exact CLI flags need live verification
- **Phases 4, 5, 7, 8:** Standard patterns, skip research-phase
## Confidence Assessment
| Area | Confidence | Notes |
|------|------------|-------|
| Stack | HIGH | Existing stack unchanged; slirp4netns/passt verified in nixpkgs |
| Features | HIGH | Auth file path confirmed from official Claude Code docs |
| Architecture | MEDIUM-HIGH | Existing codebase read directly; sidecar integration flags are MEDIUM |
| Pitfalls | HIGH | Sourced from official docs, upstream issue trackers |
**Overall confidence:** HIGH for Phases 4-5, 7-8. MEDIUM-HIGH for Phase 6 inet tier.
### Gaps to Address
- **pasta vs slirp4netns final decision:** Attempt pasta first in Phase 6; fall back to slirp4netns if integration proves difficult
- **Profile format: JSON vs bash-sourced:** JSON is safer (no code injection); bash-sourced is simpler. Decide in Phase 7 planning.
- **Auth mount rw semantics:** Must verify token refresh works after Phase 4, not just initial auth
## Sources
### Primary (HIGH confidence)
- Claude Code official docs — `~/.claude/.credentials.json` path; credential precedence
- slirp4netns GitHub (v1.3.3) — `--ready-fd`, `--exit-fd`, `--configure`, `--disable-host-loopback`
- bubblewrap manpage — `--unshare-net`, `--info-fd`, `--pidfile`
- Existing codebase (claudebox.sh, flake.nix) — direct read
### Secondary (MEDIUM confidence)
- passt.top — pasta architecture, `--config-net`, LAN isolation
- Claude Code GitHub issues #24317, #27933 (OAuth refresh), #34437 (worktrees)
---
*Research completed: 2026-04-10*
*Ready for roadmap: yes*