diff --git a/.planning/MILESTONES.md b/.planning/MILESTONES.md
deleted file mode 100644
index 97533b8..0000000
--- a/.planning/MILESTONES.md
+++ /dev/null
@@ -1,15 +0,0 @@
-# Milestones
-
-## v1.0 MVP (Shipped: 2026-04-10)
-
-**Phases completed:** 3 phases, 5 plans, 6 tasks
-
-**Key accomplishments:**
-
-- Nix flake with writeShellApplication producing claudebox wrapper in bwrap with clearenv, env allowlist, tmpfs root, secret hiding, and comma/nix tool access
-- Fixed NixOS symlink resolution — readlink -f for profile paths to real nix store paths
-- CLI with --check, --dry-run modes, multi-flag parsing, and CLAUDE_ARGS accumulator
-- Pre-launch env audit with grouped display, sensitive value masking, and interactive Y/n confirmation
-- SANDBOX.md generation and CLAUDE.md import management for sandbox-aware prompting
-
----
diff --git a/.planning/RETROSPECTIVE.md b/.planning/RETROSPECTIVE.md
deleted file mode 100644
index 797c38f..0000000
--- a/.planning/RETROSPECTIVE.md
+++ /dev/null
@@ -1,52 +0,0 @@
-# Project Retrospective
-
-*A living document updated after each milestone. Lessons feed forward into future planning.*
-
-## Milestone: v1.0 — MVP
-
-**Shipped:** 2026-04-10
-**Phases:** 3 | **Plans:** 5
-
-### What Was Built
-- Nix flake producing `claudebox` wrapper: bwrap sandbox with clearenv, env allowlist, tmpfs root, secret path hiding, git identity forwarding, comma/nix tool access
-- CLI diagnostic modes: --check (environment validation), --dry-run (print bwrap command), --shell (debug shell)
-- Pre-launch env audit with grouped sections, sensitive value masking, Y/n confirmation prompt
-- SANDBOX.md generation and CLAUDE.md import management so Claude knows its sandbox constraints
-
-### What Worked
-- writeShellApplication with builtins.readFile pattern — shellcheck at build time, shell syntax highlighting in editors
-- Rapid phase execution: Phase 1 in ~2 min, Phase 2 in ~4 min, Phase 3 in ~76 sec
-- clearenv + allowlist approach caught all secret leakage by default
-- readlink -f fix for NixOS symlinks was discovered and fixed in-phase without blocking
-
-### What Was Inefficient
-- REQUIREMENTS.md traceability table not updated during execution — 7 requirements showed "Pending" despite being complete
-- Phase 3 context was gathered but not executed in the same session, requiring session continuity overhead
-
-### Patterns Established
-- readlink -f for all host-resolved binaries passed into bwrap (NixOS symlink chains)
-- SANDBOX.md as separate file with @import in CLAUDE.md (keeps user content clean, sandbox instructions always fresh)
-- export trick for shellcheck SC2034 when a variable is used in a later plan but not yet
-
-### Key Lessons
-1. On NixOS, every host binary path is a symlink chain through /etc/profiles/per-user/ — must resolve to real store paths before passing to bwrap
-2. Conditional mounts needed for cross-distro support (/etc/static exists only on NixOS)
-
-### Cost Observations
-- Model mix: predominantly opus for execution
-- Sessions: ~3 sessions across 2 days
-- Notable: entire v1.0 MVP shipped in under 2 days of wall clock time
-
----
-
-## Cross-Milestone Trends
-
-### Process Evolution
-
-| Milestone | Phases | Plans | Key Change |
-|-----------|--------|-------|------------|
-| v1.0 | 3 | 5 | Initial project — established sandbox patterns |
-
-### Top Lessons (Verified Across Milestones)
-
-1. (Will populate as more milestones complete)
diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index 1247f2d..fdb2a5e 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -1,89 +1,73 @@
 # Roadmap: claudebox
 
-## Milestones
+## Overview
 
-- ✅ **v1.0 MVP** — Phases 1-3 (shipped 2026-04-10)
-- 🚧 **v2.0 Network Isolation & Profiles** — Phases 4-7 (in progress)
+claudebox is a Nix-packaged bwrap sandbox wrapper for Claude Code. The roadmap moves from a working sandbox (Phase 1) through CLI polish (Phase 2) to sandbox-aware prompting (Phase 3). Phase 1 is the bulk of the work -- once Claude runs inside bwrap with env isolation, filesystem isolation, and tool provisioning, the remaining phases add UX and developer experience improvements.
 
 ## Phases
 
-<details>
-<summary>✅ v1.0 MVP (Phases 1-3) — SHIPPED 2026-04-10</summary>
+**Phase Numbering:**
+- Integer phases (1, 2, 3): Planned milestone work
+- Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED)
 
-- [x] Phase 1: Minimal Viable Sandbox (2/2 plans) — bwrap sandbox with clearenv, env allowlist, filesystem isolation, secret hiding, tool provisioning
-- [x] Phase 2: Env Audit and CLI Polish (2/2 plans) — --check, --dry-run, env audit display with masking, confirmation prompt
-- [x] Phase 3: Sandbox-Aware Prompting (1/1 plan) — SANDBOX.md generation, CLAUDE.md import management
+Decimal phases appear between their surrounding integers in numeric order.
 
-Full details: [milestones/v1.0-ROADMAP.md](milestones/v1.0-ROADMAP.md)
-
-</details>
-
-### 🚧 v2.0 Network Isolation & Profiles (In Progress)
-
-**Milestone Goal:** Add tiered network isolation, per-project instance isolation, named profiles, and host auth passthrough so Claude can authenticate, work in project-scoped history, operate at controlled network exposure, and run under reusable configuration profiles.
-
-- [x] **Phase 4: Auth Passthrough** — Mount host Claude credentials read-write so subscription and API key access work inside the sandbox
-- [ ] **Phase 5: Per-Project Instance Isolation** — Scope conversation history and state to each project directory automatically
-- [ ] **Phase 6: Tiered Network Isolation** — Add none/inet/full network tiers selectable at launch
-- [ ] **Phase 7: Named Profiles** — Load named configuration profiles that set env vars, mounts, and network tier
+- [ ] **Phase 1: Minimal Viable Sandbox** - Working claudebox command that launches Claude in bwrap with full isolation and tool provisioning
+- [ ] **Phase 2: Env Audit and CLI Polish** - Pre-launch env review, --yes, --dry-run, and --check flags
+- [ ] **Phase 3: Sandbox-Aware Prompting** - Injected CLAUDE.md so Claude knows its capabilities and constraints
 
 ## Phase Details
 
-### Phase 4: Auth Passthrough ✅ COMPLETE
-**Goal**: Claude Code inside the sandbox can authenticate using the host subscription or API key
-**Depends on**: Phase 3
-**Requirements**: AUTH-01, AUTH-02
+### Phase 1: Minimal Viable Sandbox
+**Goal**: User can run `claudebox` in any project directory and get a fully functional Claude Code session with secrets invisible
+**Depends on**: Nothing (first phase)
+**Requirements**: SAND-01, SAND-02, SAND-03, SAND-04, SAND-05, SAND-06, SAND-07, SAND-08, SAND-09, SAND-10, SAND-11, SAND-12, SAND-13, SAND-14, SAND-15, TOOL-01, TOOL-02, TOOL-03, GIT-01, GIT-02, NIX-01, NIX-02, NIX-03, UX-06
 **Success Criteria** (what must be TRUE):
-  1. Running claudebox with an active Claude subscription succeeds without re-authentication
-  2. OAuth token refresh completes silently — credentials file is updated and the session continues
-  3. When `ANTHROPIC_API_KEY` is set on the host, it is passed into the sandbox and takes precedence over OAuth
-**Plans**: 1 plan
+  1. Running `nix run` or `nix profile install` produces a working `claudebox` command
+  2. `claudebox` launches Claude Code inside bwrap; `env` inside the sandbox shows only allowlisted variables (no SSH_AUTH_SOCK, AWS_PROFILE, etc.)
+  3. Secret paths (~/.ssh, ~/.gnupg, ~/.aws, ~/.config/gcloud, age keys, /var/lib/tailscale) are not visible inside the sandbox
+  4. Claude can run `curl https://example.com`, `git status`, `, jq --help` (comma), and `nix shell nixpkgs#python3 -c python3 --version` inside the sandbox
+  5. Ctrl+C terminates the session cleanly; exit code from Claude passes through to the caller
+**Plans:** 2 plans
+
 Plans:
-- [x] 04-01-PLAN.md — Credential mount + audit redesign (completed 2026-04-10)
+- [x] 01-01-PLAN.md -- Create flake.nix and claudebox.sh with complete bwrap sandbox
+- [x] 01-02-PLAN.md -- Build verification and manual sandbox smoke test
 
-### Phase 5: Per-Project Instance Isolation
-**Goal**: Each project directory has its own isolated Claude state so conversation history, todos, and settings do not bleed between projects
-**Depends on**: Phase 4
-**Requirements**: INST-01, INST-02, INST-03, INST-04
+### Phase 2: Env Audit and CLI Polish
+**Goal**: User can review exactly what enters the sandbox before launch, and has diagnostic tools for troubleshooting
+**Depends on**: Phase 1
+**Requirements**: UX-01, UX-02, UX-03, UX-04, UX-05
 **Success Criteria** (what must be TRUE):
-  1. Launching claudebox in two different project directories produces two separate conversation histories with no cross-contamination
-  2. Launching claudebox from a git worktree shares instance state with the main worktree of the same repo
-  3. Two concurrent claudebox sessions in the same project do not corrupt each other's state
-  4. Running `claudebox --gc` removes instance directories for project roots that no longer exist on disk
-**Plans**: TBD
+  1. Running `claudebox` without `--yes` prints all env vars being passed into the sandbox and prompts for confirmation before proceeding
+  2. Running `claudebox --yes` or `claudebox -y` skips the env audit and launches immediately
+  3. Running `claudebox --dry-run` prints the full bwrap command without executing it
+  4. Running `claudebox --check` reports whether bwrap exists, required Nix packages are available, and ~/.claudebox exists
+**Plans:** 2 plans
 
-### Phase 6: Tiered Network Isolation
-**Goal**: Users can select a network access tier at launch to control whether Claude has no network, internet-only, or full host network access
-**Depends on**: Phase 5
-**Requirements**: NET-01, NET-02, NET-03, NET-04, NET-05
-**Success Criteria** (what must be TRUE):
-  1. `--network none` (or `CLAUDEBOX_NETWORK=none`) starts a session with no network access; DNS and all TCP connections fail inside the sandbox while the Nix daemon socket remains usable
-  2. `--network inet` starts a session where internet hostnames resolve and connections succeed, but LAN addresses and Tailscale IPs are unreachable
-  3. `--network full` (the default) preserves existing behavior with full host network access
-  4. When both `CLAUDEBOX_NETWORK` and `--network` are set, the CLI flag wins
-**Plans**: TBD
-**UI hint**: no
+Plans:
+- [x] 02-01-PLAN.md -- Refactor flag parsing, add --check and --dry-run modes
+- [x] 02-02-PLAN.md -- Env audit display with grouping, masking, and confirmation prompt
 
-### Phase 7: Named Profiles
-**Goal**: Users can define named profiles that package env var passthrough, extra mounts, and network tier into a reusable configuration loaded by name at launch
-**Depends on**: Phase 6
-**Requirements**: PROF-01, PROF-02, PROF-03, PROF-04, PROF-05, PROF-06
+### Phase 3: Sandbox-Aware Prompting
+**Goal**: Claude inside the sandbox knows it is sandboxed, how to install tools, and what is unavailable
+**Depends on**: Phase 1
+**Requirements**: AWARE-01, AWARE-02
 **Success Criteria** (what must be TRUE):
-  1. `claudebox --profile foo` loads `~/.claudebox/profiles/foo.json` and applies its env vars, mounts, and network tier for the session
-  2. `CLAUDEBOX_PROFILE=foo` activates a profile when no `--profile` flag is given; `--profile` wins when both are set
-  3. `claudebox --list-profiles` prints all profiles found under `~/.claudebox/profiles/`
-  4. `claudebox --show-profile foo` prints the contents of the named profile
-  5. The pre-launch env audit displays the active profile name, resolved network tier, and any extra mounts added by the profile
-**Plans**: TBD
+  1. First run of `claudebox` creates a default CLAUDE.md in ~/.claudebox/ if none exists
+  2. The injected CLAUDE.md tells Claude it is in a bwrap sandbox, how to use comma (`, <tool>`) and `nix shell` for tool installation, and that SSH/GPG/cloud credentials are unavailable
+**Plans:** 1 plan
+
+Plans:
+- [x] 03-01-PLAN.md -- Add SANDBOX.md generation and CLAUDE.md import management
 
 ## Progress
 
-| Phase | Milestone | Plans Complete | Status | Completed |
-|-------|-----------|----------------|--------|-----------|
-| 1. Minimal Viable Sandbox | v1.0 | 2/2 | Complete | 2026-04-09 |
-| 2. Env Audit and CLI Polish | v1.0 | 2/2 | Complete | 2026-04-09 |
-| 3. Sandbox-Aware Prompting | v1.0 | 1/1 | Complete | 2026-04-10 |
-| 4. Auth Passthrough | v2.0 | 1/1 | Complete | 2026-04-10 |
-| 5. Per-Project Instance Isolation | v2.0 | 0/? | Not started | - |
-| 6. Tiered Network Isolation | v2.0 | 0/? | Not started | - |
-| 7. Named Profiles | v2.0 | 0/? | Not started | - |
+**Execution Order:**
+Phases execute in numeric order: 1 -> 2 -> 3
+
+| Phase | Plans Complete | Status | Completed |
+|-------|----------------|--------|-----------|
+| 1. Minimal Viable Sandbox | 2/2 | Complete | - |
+| 2. Env Audit and CLI Polish | 0/2 | Planned | - |
+| 3. Sandbox-Aware Prompting | 0/1 | Not started | - |
diff --git a/.planning/STATE.md b/.planning/STATE.md
index 980e5d8..187f20e 100644
--- a/.planning/STATE.md
+++ b/.planning/STATE.md
@@ -1,36 +1,43 @@
 ---
 gsd_state_version: 1.0
-milestone: v2.0
-milestone_name: Network Isolation & Profiles
-status: active
-stopped_at: null
-last_updated: "2026-04-10T12:41:00Z"
-last_activity: 2026-04-10 - Phase 04 auth-passthrough complete and verified
+milestone: v1.0
+milestone_name: milestone
+status: executing
+stopped_at: Phase 3 context gathered
+last_updated: "2026-04-10T09:33:52.025Z"
+last_activity: 2026-04-10
 progress:
-  total_phases: 4
-  completed_phases: 1
-  total_plans: 1
-  completed_plans: 1
-  percent: 25
+  total_phases: 3
+  completed_phases: 0
+  total_plans: 0
+  completed_plans: 0
+  percent: 33
 ---
 
 # Project State
 
 ## Project Reference
 
-See: .planning/PROJECT.md (updated 2026-04-10)
+See: .planning/PROJECT.md (updated 2026-04-09)
 
-**Core value:** Secrets never enter the Claude Code environment. If a secret is accessible inside the sandbox, it's a bug.
-**Current focus:** Phase 4 — Auth Passthrough
+**Core value:** Secrets never enter the Claude Code environment
+**Current focus:** Phase 2 (next)
 
 ## Current Position
 
-Phase: 4 of 7 (Auth Passthrough) — COMPLETE
-Plan: 1 of 1 complete
-Status: Phase 04 verified (7/7); ready to start Phase 05
-Last activity: 2026-04-10 — Phase 04 auth-passthrough complete and verified
+Phase: 04 of 3 (sandbox aware prompting)
+Plan: Not started
+Status: Ready to execute
+Last activity: 2026-04-10
 
-Progress: [█░░░░░░░░░] 25% (v1.0 complete; v2.0 phase 04 done; phases 05-07 not started)
+Progress: [███░░░░░░░] 33%
+
+## Performance Metrics
+
+**Velocity:**
+
+| Phase 01 P01 | 1min | 2 tasks | 3 files |
+| Phase 01 P02 | 1min | 2 tasks | 1 file |
 
 ## Accumulated Context
 
@@ -40,10 +47,8 @@ Progress: [█░░░░░░░░░] 25% (v1.0 complete; v2.0 phase 04 don
 - [Phase 01]: readlink -f required to resolve NixOS profile symlinks to real nix store paths for bwrap visibility
 - [Phase 01]: SANDBOX_PATH built via makeBinPath in flake.nix to prevent host PATH leakage
 - [Phase 01]: SHELL set to nix store bash path, not /bin/bash (doesn't exist in tmpfs root)
+- [Phase 01]: --shell flag added for manual sandbox debugging
 - [Phase 01]: SSL cert verification failure is a host-level NixOS issue, not sandbox-specific
-- [v2.0 planning]: Auth mount must be read-write — OAuth token refresh writes back to .credentials.json; ro-bind causes silent EACCES
-- [v2.0 planning]: Profile format will be JSON (not bash-sourced) to prevent code injection
-- [v2.0 planning]: Attempt pasta sidecar first for inet tier; fall back to slirp4netns if integration is difficult
 
 ### Pending Todos
 
@@ -51,12 +56,16 @@ None.
 
 ### Blockers/Concerns
 
-- [Phase 6]: pasta vs slirp4netns final decision deferred to Phase 6 planning — exact CLI flags need live verification
-- [Phase 6]: inet tier requires exec-to-wait refactor (background bwrap, coordinate with sidecar via --ready-fd/--exit-fd)
-- SSL cert verification fails system-wide (host + sandbox) — NixOS/OpenSSL issue, not claudebox
+- SSL cert verification fails system-wide (host + sandbox) -- NixOS/OpenSSL issue, not claudebox
 
 ### Quick Tasks Completed
 
 | # | Description | Date | Commit | Directory |
 |---|-------------|------|--------|-----------|
 | 260410-d4u | on non-nixos hosts, bwrap fails because /etc/static does not exist | 2026-04-10 | 97c10f8 | [260410-d4u-on-non-nixos-hosts-bwrap-fails-because-e](./quick/260410-d4u-on-non-nixos-hosts-bwrap-fails-because-e/) |
+
+## Session Continuity
+
+Last session: 2026-04-09T18:59:43.248Z
+Stopped at: Phase 3 context gathered
+Resume file: .planning/phases/03-sandbox-aware-prompting/03-CONTEXT.md
diff --git a/.planning/milestones/v1.0-REQUIREMENTS.md b/.planning/milestones/v1.0-REQUIREMENTS.md
deleted file mode 100644
index fd489c8..0000000
--- a/.planning/milestones/v1.0-REQUIREMENTS.md
+++ /dev/null
@@ -1,140 +0,0 @@
-# Requirements Archive: v1.0 MVP
-
-**Archived:** 2026-04-10
-**Status:** SHIPPED
-
-For current requirements, see `.planning/REQUIREMENTS.md`.
-
----
-
-# Requirements: claudebox
-
-**Defined:** 2026-04-09
-**Core Value:** Secrets never enter the Claude Code environment
-
-## v1 Requirements
-
-### Sandbox Core
-
-- [x] **SAND-01**: Wrapper script produces a `claudebox` binary via Nix `writeShellApplication`
-- [x] **SAND-02**: bwrap sandbox starts with `--clearenv` — empty environment, only explicitly allowed vars pass through
-- [x] **SAND-03**: Environment allowlist includes only: HOME, PATH, TERM, EDITOR, LANG, LC_ALL, NIX_SSL_CERT_FILE, SSL_CERT_FILE, ANTHROPIC_API_KEY, USER, SHELL, XDG_RUNTIME_DIR
-- [x] **SAND-04**: Filesystem starts as tmpfs root — nothing from host is visible unless explicitly mounted
-- [x] **SAND-05**: CWD is bind-mounted read-write inside the sandbox
-- [x] **SAND-06**: `/nix/store` is mounted read-only inside the sandbox
-- [x] **SAND-07**: Nix daemon socket (`/nix/var/nix/daemon-socket`) is bind-mounted for `nix shell` / comma to work
-- [x] **SAND-08**: `~/.claudebox` on host is bind-mounted as `~/.claude` inside the sandbox
-- [x] **SAND-09**: Secret paths are never mounted: `~/.ssh`, `~/.gnupg`, `~/.aws`, `~/.config/gcloud`, age key paths, `/var/lib/tailscale`
-- [x] **SAND-10**: PATH inside sandbox contains only Nix store paths: coreutils, git, curl, jq, ripgrep, fd, nix, comma, bash
-- [x] **SAND-11**: Working `/tmp` (tmpfs), `/dev` (bwrap `--dev`), `/proc` (bwrap `--proc`)
-- [x] **SAND-12**: DNS resolution works inside sandbox (`/etc/resolv.conf` and its symlink targets mounted)
-- [x] **SAND-13**: SSL/TLS works inside sandbox (cert bundle mounted, `NIX_SSL_CERT_FILE` set)
-- [x] **SAND-14**: Exit code from Claude Code passes through to the wrapper's caller
-- [x] **SAND-15**: Signals (Ctrl+C) reach Claude Code via `exec` — no intermediate shell
-
-### Tool Provisioning
-
-- [x] **TOOL-01**: comma (`,`) is available in sandbox PATH for on-demand tool installation
-- [x] **TOOL-02**: `nix shell` works inside the sandbox for installing arbitrary packages
-- [x] **TOOL-03**: Newly installed Nix store paths are visible inside sandbox (live bind mount)
-
-### User Experience
-
-- [ ] **UX-01**: Pre-launch env audit displays all env vars being passed into the sandbox on stderr
-- [ ] **UX-02**: Pre-launch env audit prompts for confirmation before proceeding
-- [ ] **UX-03**: `--yes` / `-y` flag skips the env audit confirmation
-- [ ] **UX-04**: `--dry-run` flag prints the full bwrap command without executing
-- [ ] **UX-05**: `--check` flag verifies bwrap exists, required Nix packages are available, and `~/.claudebox` exists
-- [x] **UX-06**: `claude --dangerously-skip-permissions` is always passed — the sandbox is the permission layer
-
-### Claude Awareness
-
-- [ ] **AWARE-01**: Default `CLAUDE.md` is created in `~/.claudebox/` on first run if not present
-- [ ] **AWARE-02**: Injected CLAUDE.md tells Claude it's in a sandbox, how to use comma/nix for tools, and what's not available
-
-### Git Support
-
-- [x] **GIT-01**: Git works inside the sandbox with a minimal `.gitconfig` (user name/email)
-- [x] **GIT-02**: `safe.directory` is configured to trust the mounted CWD
-
-### Nix Packaging
-
-- [x] **NIX-01**: Project is a Nix flake with `claudebox` as default package
-- [x] **NIX-02**: All runtime dependencies are pinned via flake inputs
-- [x] **NIX-03**: `nix run` or `nix profile install` produces a working `claudebox` command
-
-## v2 Requirements
-
-### Network Isolation
-
-- **NET-01**: Block LAN/Tailscale access (RFC1918 + 100.64.0.0/10) while allowing internet egress
-- **NET-02**: Network namespace with controlled outbound via slirp4netns or veth pair
-
-### Enhanced Security
-
-- **SEC-01**: Env var leak detection — regex scan for patterns like `*KEY*`, `*TOKEN*`, `*SECRET*`
-- **SEC-02**: PID namespace isolation (`--unshare-pid`)
-- **SEC-03**: Git credential isolation — sandbox-specific `.gitconfig` with HTTPS-only credential helpers
-
-### Extensibility
-
-- **EXT-01**: Project-local tool declarations via `.claudebox.toml` or `.claudebox/tools.txt`
-- **EXT-02**: Additional mount paths via `--mount-ro` / `--mount-rw` flags
-- **EXT-03**: Configurable security profiles (different postures for different projects)
-
-## Out of Scope
-
-| Feature | Reason |
-|---------|--------|
-| GUI/X11/Wayland passthrough | CLI tool, no desktop integration needed |
-| Audio/PulseAudio/PipeWire | No audio needed for coding agent |
-| DBus access | Common sandbox escape vector, not needed |
-| Seccomp syscall filtering | Threat model is data exfiltration, not privilege escalation |
-| Docker/OCI wrapping | Nix+bwrap is lighter and daemonless |
-| NixOS module (services/programs) | Wrapper script derivation is sufficient |
-| Multi-user / shareability | Personal tool for endurance |
-
-## Traceability
-
-| Requirement | Phase | Status |
-|-------------|-------|--------|
-| SAND-01 | Phase 1 | Complete |
-| SAND-02 | Phase 1 | Complete |
-| SAND-03 | Phase 1 | Complete |
-| SAND-04 | Phase 1 | Complete |
-| SAND-05 | Phase 1 | Complete |
-| SAND-06 | Phase 1 | Complete |
-| SAND-07 | Phase 1 | Complete |
-| SAND-08 | Phase 1 | Complete |
-| SAND-09 | Phase 1 | Complete |
-| SAND-10 | Phase 1 | Complete |
-| SAND-11 | Phase 1 | Complete |
-| SAND-12 | Phase 1 | Complete |
-| SAND-13 | Phase 1 | Complete |
-| SAND-14 | Phase 1 | Complete |
-| SAND-15 | Phase 1 | Complete |
-| TOOL-01 | Phase 1 | Complete |
-| TOOL-02 | Phase 1 | Complete |
-| TOOL-03 | Phase 1 | Complete |
-| UX-01 | Phase 2 | Pending |
-| UX-02 | Phase 2 | Pending |
-| UX-03 | Phase 2 | Pending |
-| UX-04 | Phase 2 | Pending |
-| UX-05 | Phase 2 | Pending |
-| UX-06 | Phase 1 | Complete |
-| AWARE-01 | Phase 3 | Pending |
-| AWARE-02 | Phase 3 | Pending |
-| GIT-01 | Phase 1 | Complete |
-| GIT-02 | Phase 1 | Complete |
-| NIX-01 | Phase 1 | Complete |
-| NIX-02 | Phase 1 | Complete |
-| NIX-03 | Phase 1 | Complete |
-
-**Coverage:**
-- v1 requirements: 31 total
-- Mapped to phases: 31
-- Unmapped: 0
-
----
-*Requirements defined: 2026-04-09*
-*Last updated: 2026-04-09 after roadmap creation*
diff --git a/.planning/milestones/v1.0-ROADMAP.md b/.planning/milestones/v1.0-ROADMAP.md
deleted file mode 100644
index fdb2a5e..0000000
--- a/.planning/milestones/v1.0-ROADMAP.md
+++ /dev/null
@@ -1,73 +0,0 @@
-# Roadmap: claudebox
-
-## Overview
-
-claudebox is a Nix-packaged bwrap sandbox wrapper for Claude Code. The roadmap moves from a working sandbox (Phase 1) through CLI polish (Phase 2) to sandbox-aware prompting (Phase 3). Phase 1 is the bulk of the work -- once Claude runs inside bwrap with env isolation, filesystem isolation, and tool provisioning, the remaining phases add UX and developer experience improvements.
-
-## Phases
-
-**Phase Numbering:**
-- Integer phases (1, 2, 3): Planned milestone work
-- Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED)
-
-Decimal phases appear between their surrounding integers in numeric order.
-
-- [ ] **Phase 1: Minimal Viable Sandbox** - Working claudebox command that launches Claude in bwrap with full isolation and tool provisioning
-- [ ] **Phase 2: Env Audit and CLI Polish** - Pre-launch env review, --yes, --dry-run, and --check flags
-- [ ] **Phase 3: Sandbox-Aware Prompting** - Injected CLAUDE.md so Claude knows its capabilities and constraints
-
-## Phase Details
-
-### Phase 1: Minimal Viable Sandbox
-**Goal**: User can run `claudebox` in any project directory and get a fully functional Claude Code session with secrets invisible
-**Depends on**: Nothing (first phase)
-**Requirements**: SAND-01, SAND-02, SAND-03, SAND-04, SAND-05, SAND-06, SAND-07, SAND-08, SAND-09, SAND-10, SAND-11, SAND-12, SAND-13, SAND-14, SAND-15, TOOL-01, TOOL-02, TOOL-03, GIT-01, GIT-02, NIX-01, NIX-02, NIX-03, UX-06
-**Success Criteria** (what must be TRUE):
-  1. Running `nix run` or `nix profile install` produces a working `claudebox` command
-  2. `claudebox` launches Claude Code inside bwrap; `env` inside the sandbox shows only allowlisted variables (no SSH_AUTH_SOCK, AWS_PROFILE, etc.)
-  3. Secret paths (~/.ssh, ~/.gnupg, ~/.aws, ~/.config/gcloud, age keys, /var/lib/tailscale) are not visible inside the sandbox
-  4. Claude can run `curl https://example.com`, `git status`, `, jq --help` (comma), and `nix shell nixpkgs#python3 -c python3 --version` inside the sandbox
-  5. Ctrl+C terminates the session cleanly; exit code from Claude passes through to the caller
-**Plans:** 2 plans
-
-Plans:
-- [x] 01-01-PLAN.md -- Create flake.nix and claudebox.sh with complete bwrap sandbox
-- [x] 01-02-PLAN.md -- Build verification and manual sandbox smoke test
-
-### Phase 2: Env Audit and CLI Polish
-**Goal**: User can review exactly what enters the sandbox before launch, and has diagnostic tools for troubleshooting
-**Depends on**: Phase 1
-**Requirements**: UX-01, UX-02, UX-03, UX-04, UX-05
-**Success Criteria** (what must be TRUE):
-  1. Running `claudebox` without `--yes` prints all env vars being passed into the sandbox and prompts for confirmation before proceeding
-  2. Running `claudebox --yes` or `claudebox -y` skips the env audit and launches immediately
-  3. Running `claudebox --dry-run` prints the full bwrap command without executing it
-  4. Running `claudebox --check` reports whether bwrap exists, required Nix packages are available, and ~/.claudebox exists
-**Plans:** 2 plans
-
-Plans:
-- [x] 02-01-PLAN.md -- Refactor flag parsing, add --check and --dry-run modes
-- [x] 02-02-PLAN.md -- Env audit display with grouping, masking, and confirmation prompt
-
-### Phase 3: Sandbox-Aware Prompting
-**Goal**: Claude inside the sandbox knows it is sandboxed, how to install tools, and what is unavailable
-**Depends on**: Phase 1
-**Requirements**: AWARE-01, AWARE-02
-**Success Criteria** (what must be TRUE):
-  1. First run of `claudebox` creates a default CLAUDE.md in ~/.claudebox/ if none exists
-  2. The injected CLAUDE.md tells Claude it is in a bwrap sandbox, how to use comma (`, <tool>`) and `nix shell` for tool installation, and that SSH/GPG/cloud credentials are unavailable
-**Plans:** 1 plan
-
-Plans:
-- [x] 03-01-PLAN.md -- Add SANDBOX.md generation and CLAUDE.md import management
-
-## Progress
-
-**Execution Order:**
-Phases execute in numeric order: 1 -> 2 -> 3
-
-| Phase | Plans Complete | Status | Completed |
-|-------|----------------|--------|-----------|
-| 1. Minimal Viable Sandbox | 2/2 | Complete | - |
-| 2. Env Audit and CLI Polish | 0/2 | Planned | - |
-| 3. Sandbox-Aware Prompting | 0/1 | Not started | - |
diff --git a/.planning/phases/01-minimal-viable-sandbox/01-01-PLAN.md b/.planning/phases/01-minimal-viable-sandbox/01-01-PLAN.md
new file mode 100644
index 0000000..2f90379
--- /dev/null
+++ b/.planning/phases/01-minimal-viable-sandbox/01-01-PLAN.md
@@ -0,0 +1,380 @@
+---
+phase: 01-minimal-viable-sandbox
+plan: 01
+type: execute
+wave: 1
+depends_on: []
+files_modified:
+  - flake.nix
+  - claudebox.sh
+autonomous: true
+requirements:
+  - SAND-01
+  - SAND-02
+  - SAND-03
+  - SAND-04
+  - SAND-05
+  - SAND-06
+  - SAND-07
+  - SAND-08
+  - SAND-09
+  - SAND-10
+  - SAND-11
+  - SAND-12
+  - SAND-13
+  - SAND-14
+  - SAND-15
+  - TOOL-01
+  - TOOL-02
+  - TOOL-03
+  - GIT-01
+  - GIT-02
+  - NIX-01
+  - NIX-02
+  - NIX-03
+  - UX-06
+
+must_haves:
+  truths:
+    - "Running `nix build` in the project root produces a working `claudebox` binary"
+    - "claudebox launches Claude Code inside bwrap with only allowlisted env vars"
+    - "Secret paths (~/.ssh, ~/.gnupg, ~/.aws, etc.) are not visible inside the sandbox"
+    - "Git works inside the sandbox with user identity from host"
+    - "comma and nix shell work inside the sandbox for on-demand tool installation"
+    - "Ctrl+C terminates the session cleanly; exit code passes through"
+  artifacts:
+    - path: "flake.nix"
+      provides: "Nix flake with claudebox as default package"
+      contains: "writeShellApplication"
+    - path: "claudebox.sh"
+      provides: "Shell script body with bwrap sandbox invocation"
+      contains: "exec bwrap"
+  key_links:
+    - from: "flake.nix"
+      to: "claudebox.sh"
+      via: "builtins.readFile ./claudebox.sh"
+      pattern: "builtins.readFile ./claudebox.sh"
+    - from: "claudebox.sh"
+      to: "bwrap"
+      via: "exec bwrap with --clearenv"
+      pattern: "exec bwrap.*--clearenv"
+    - from: "claudebox.sh"
+      to: "claude"
+      via: "CLAUDE_BIN resolved from host PATH before clearenv"
+      pattern: "CLAUDE_BIN=.*command -v claude"
+---
+
+<objective>
+Create the complete claudebox Nix flake and shell script that launches Claude Code inside a bubblewrap sandbox with full environment isolation, filesystem isolation, secret hiding, git support, and tool provisioning.
+
+Purpose: This is the entire deliverable for Phase 1 -- a working `claudebox` command.
+Output: `flake.nix` and `claudebox.sh` in the project root.
+</objective>
+
+<execution_context>
+@$HOME/.claude/get-shit-done/workflows/execute-plan.md
+@$HOME/.claude/get-shit-done/templates/summary.md
+</execution_context>
+
+<context>
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@.planning/phases/01-minimal-viable-sandbox/01-CONTEXT.md
+@.planning/phases/01-minimal-viable-sandbox/01-RESEARCH.md
+</context>
+
+<tasks>
+
+<task type="auto">
+  <name>Task 1: Create flake.nix with all inputs and writeShellApplication</name>
+  <files>flake.nix</files>
+  <read_first>
+    .planning/phases/01-minimal-viable-sandbox/01-RESEARCH.md
+  </read_first>
+  <action>
+Create `flake.nix` in the project root with the following exact structure:
+
+```nix
+{
+  description = "claudebox - sandboxed Claude Code";
+
+  inputs = {
+    nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
+    nix-index-database = {
+      url = "github:nix-community/nix-index-database";
+      inputs.nixpkgs.follows = "nixpkgs";
+    };
+  };
+
+  outputs = { self, nixpkgs, nix-index-database, ... }:
+    let
+      system = "x86_64-linux";
+      pkgs = nixpkgs.legacyPackages.${system};
+      comma-with-db = nix-index-database.packages.${system}.comma-with-db;
+    in {
+      packages.${system} = {
+        claudebox = pkgs.writeShellApplication {
+          name = "claudebox";
+          runtimeInputs = [
+            pkgs.bubblewrap
+            pkgs.coreutils
+            pkgs.git
+            pkgs.curl
+            pkgs.jq
+            pkgs.ripgrep
+            pkgs.fd
+            pkgs.nix
+            comma-with-db
+            pkgs.bash
+            pkgs.nodejs
+          ];
+          text = builtins.readFile ./claudebox.sh;
+        };
+        default = self.packages.${system}.claudebox;
+      };
+    };
+}
+```
+
+Key points per user decisions:
+- Per D-02: `comma-with-db` comes from the `nix-index-database` flake, using `packages.${system}.comma-with-db` (not legacyPackages).
+- Per NIX-01/NIX-02: Flake with pinned inputs. `nixpkgs.follows` ensures single nixpkgs instance.
+- Per NIX-03: `default` package alias so `nix run` and `nix profile install` work.
+- Per SAND-01/SAND-10: `writeShellApplication` produces the binary and wires runtimeInputs into PATH.
+- Claude Code is NOT in runtimeInputs -- it's discovered from host PATH at runtime (see Research Pattern 5).
+  </action>
+  <verify>
+    <automated>grep -q 'writeShellApplication' flake.nix && grep -q 'comma-with-db' flake.nix && grep -q 'nix-index-database' flake.nix && grep -q 'builtins.readFile ./claudebox.sh' flake.nix && echo "PASS" || echo "FAIL"</automated>
+  </verify>
+  <acceptance_criteria>
+    - flake.nix contains `description = "claudebox - sandboxed Claude Code"`
+    - flake.nix contains `nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable"`
+    - flake.nix contains `nix-index-database.packages.${system}.comma-with-db`
+    - flake.nix contains `name = "claudebox"`
+    - flake.nix contains `builtins.readFile ./claudebox.sh`
+    - flake.nix contains all 11 runtimeInputs: bubblewrap, coreutils, git, curl, jq, ripgrep, fd, nix, comma-with-db, bash, nodejs
+    - flake.nix contains `default = self.packages.${system}.claudebox`
+  </acceptance_criteria>
+  <done>flake.nix exists with correct flake structure, all runtime dependencies, and readFile of claudebox.sh</done>
+</task>
+
+<task type="auto">
+  <name>Task 2: Create claudebox.sh with complete bwrap sandbox logic</name>
+  <files>claudebox.sh</files>
+  <read_first>
+    .planning/phases/01-minimal-viable-sandbox/01-RESEARCH.md
+    .planning/phases/01-minimal-viable-sandbox/01-CONTEXT.md
+  </read_first>
+  <action>
+Create `claudebox.sh` in the project root. This is the shell script body read by `writeShellApplication` (which adds `set -euo pipefail` and prepends runtimeInputs to PATH automatically). The script must implement the following sections in order:
+
+**Section 1: Resolve claude binary from host PATH (before clearenv strips it)**
+
+```bash
+CLAUDE_BIN=$(command -v claude) || {
+  echo "error: claude not found in PATH" >&2
+  echo "Install Claude Code first: https://docs.anthropic.com/en/docs/claude-code" >&2
+  exit 1
+}
+```
+
+**Section 2: Capture sandbox PATH**
+
+The runtimeInputs-constructed PATH is available as `$PATH` at this point. Capture it for passing into the sandbox:
+```bash
+SANDBOX_PATH="$PATH"
+```
+
+**Section 3: Record CWD**
+
+```bash
+CWD=$(pwd)
+```
+
+**Section 4: Ensure ~/.claudebox exists**
+
+```bash
+mkdir -p "$HOME/.claudebox"
+```
+
+**Section 5: Generate minimal .gitconfig (per D-05)**
+
+Read host git identity, write a temp .gitconfig, set up cleanup trap:
+```bash
+GIT_NAME=$(git config --global user.name 2>/dev/null || echo "Claude User")
+GIT_EMAIL=$(git config --global user.email 2>/dev/null || echo "claude@localhost")
+
+GITCONFIG_TMP=$(mktemp)
+trap 'rm -f "$GITCONFIG_TMP"' EXIT
+
+cat > "$GITCONFIG_TMP" <<GITEOF
+[user]
+    name = $GIT_NAME
+    email = $GIT_EMAIL
+[safe]
+    directory = *
+GITEOF
+```
+
+**Section 6: Build environment --setenv args array (per D-03, D-04, SAND-02, SAND-03)**
+
+Sandbox-generated vars (D-04) are set directly, never from host:
+```bash
+ENV_ARGS=(
+  --setenv HOME "$HOME"
+  --setenv USER "$USER"
+  --setenv PATH "$SANDBOX_PATH"
+  --setenv SHELL /bin/bash
+  --setenv TMPDIR /tmp
+  --setenv XDG_RUNTIME_DIR /tmp
+  --setenv NIX_SSL_CERT_FILE /etc/ssl/certs/ca-certificates.crt
+  --setenv SSL_CERT_FILE /etc/ssl/certs/ca-certificates.crt
+)
+```
+
+Allowlisted host vars -- only pass if set on host:
+```bash
+HOST_ALLOWLIST=(TERM EDITOR LANG LC_ALL ANTHROPIC_API_KEY)
+for var in "${HOST_ALLOWLIST[@]}"; do
+  if [[ -v "$var" ]]; then
+    ENV_ARGS+=(--setenv "$var" "${!var}")
+  fi
+done
+```
+
+CLAUDEBOX_EXTRA_ENV escape hatch (per D-03, comma-separated):
+```bash
+if [[ -v CLAUDEBOX_EXTRA_ENV ]]; then
+  IFS=',' read -ra EXTRAS <<< "$CLAUDEBOX_EXTRA_ENV"
+  for var in "${EXTRAS[@]}"; do
+    var="${var// /}"  # trim whitespace
+    if [[ -n "$var" ]] && [[ -v "$var" ]]; then
+      ENV_ARGS+=(--setenv "$var" "${!var}")
+    fi
+  done
+fi
+```
+
+**Section 7: exec bwrap (per SAND-04 through SAND-15, UX-06, D-01)**
+
+```bash
+exec bwrap \
+  --clearenv \
+  "${ENV_ARGS[@]}" \
+  --tmpfs / \
+  --proc /proc \
+  --dev /dev \
+  --tmpfs /tmp \
+  --ro-bind /nix/store /nix/store \
+  --bind /nix/var/nix /nix/var/nix \
+  --ro-bind /etc/resolv.conf /etc/resolv.conf \
+  --ro-bind /etc/ssl /etc/ssl \
+  --ro-bind /etc/static /etc/static \
+  --ro-bind /etc/passwd /etc/passwd \
+  --ro-bind /etc/group /etc/group \
+  --ro-bind /etc/hosts /etc/hosts \
+  --ro-bind /etc/nsswitch.conf /etc/nsswitch.conf \
+  --ro-bind /etc/nix /etc/nix \
+  --symlink "$(command -v env)" /usr/bin/env \
+  --tmpfs "$HOME" \
+  --bind "$HOME/.claudebox" "$HOME/.claude" \
+  --ro-bind "$GITCONFIG_TMP" "$HOME/.gitconfig" \
+  --bind "$CWD" "$CWD" \
+  --chdir "$CWD" \
+  -- "$CLAUDE_BIN" --dangerously-skip-permissions "$@"
+```
+
+Mount ordering rationale (most general to most specific):
+1. `--tmpfs /` -- empty root (SAND-04)
+2. `/proc`, `/dev`, `/tmp` -- system essentials (SAND-11)
+3. `/nix/store` ro, `/nix/var/nix` rw -- Nix access (SAND-06, SAND-07, TOOL-03)
+4. `/etc/*` -- DNS, SSL, user resolution, nix config (SAND-12, SAND-13)
+5. `/usr/bin/env` symlink -- shebang support (Pitfall 8)
+6. `$HOME` tmpfs -- clean home dir
+7. `~/.claudebox` as `~/.claude` -- Claude config (SAND-08)
+8. `.gitconfig` -- git identity (GIT-01, GIT-02)
+9. `$CWD` -- project directory (SAND-05), most specific = last
+
+Per D-01: all args after claudebox's own flags pass through via `"$@"`. Phase 1 has no claudebox-specific flags (--yes/--dry-run/--check are Phase 2), so ALL args pass through.
+Per UX-06: `--dangerously-skip-permissions` is always injected before `"$@"`.
+Per SAND-14/SAND-15: `exec` ensures no intermediate shell -- signals propagate, exit code passes through.
+Per SAND-09: Secret paths are never mounted. The tmpfs root and tmpfs HOME ensure nothing leaks. Only explicit bind-mounts above are visible.
+
+IMPORTANT: Do NOT add `#!/bin/bash` or `set -euo pipefail` -- `writeShellApplication` adds these automatically.
+  </action>
+  <verify>
+    <automated>grep -q 'exec bwrap' claudebox.sh && grep -q 'clearenv' claudebox.sh && grep -q 'CLAUDE_BIN' claudebox.sh && grep -q 'CLAUDEBOX_EXTRA_ENV' claudebox.sh && grep -q 'GITCONFIG_TMP' claudebox.sh && grep -q 'dangerously-skip-permissions' claudebox.sh && echo "PASS" || echo "FAIL"</automated>
+  </verify>
+  <acceptance_criteria>
+    - claudebox.sh does NOT contain `#!/bin/bash` or `set -euo pipefail` (writeShellApplication adds these)
+    - claudebox.sh contains `CLAUDE_BIN=$(command -v claude)` with error handling on failure
+    - claudebox.sh contains `SANDBOX_PATH="$PATH"`
+    - claudebox.sh contains `mkdir -p "$HOME/.claudebox"`
+    - claudebox.sh contains git config reading: `git config --global user.name` and `git config --global user.email`
+    - claudebox.sh contains `GITCONFIG_TMP=$(mktemp)` and `trap 'rm -f "$GITCONFIG_TMP"' EXIT`
+    - claudebox.sh contains `safe.directory = *` in the generated gitconfig
+    - claudebox.sh contains ENV_ARGS array with --setenv for HOME, USER, PATH, SHELL, TMPDIR, XDG_RUNTIME_DIR, NIX_SSL_CERT_FILE, SSL_CERT_FILE
+    - claudebox.sh contains HOST_ALLOWLIST with TERM, EDITOR, LANG, LC_ALL, ANTHROPIC_API_KEY
+    - claudebox.sh contains CLAUDEBOX_EXTRA_ENV parsing with `IFS=',' read -ra EXTRAS`
+    - claudebox.sh contains `exec bwrap` with `--clearenv`
+    - claudebox.sh contains `--tmpfs /` before any other mount
+    - claudebox.sh contains `--ro-bind /nix/store /nix/store`
+    - claudebox.sh contains `--bind /nix/var/nix /nix/var/nix` (rw, not ro-bind)
+    - claudebox.sh contains `--ro-bind /etc/resolv.conf /etc/resolv.conf`
+    - claudebox.sh contains `--ro-bind /etc/ssl /etc/ssl` AND `--ro-bind /etc/static /etc/static`
+    - claudebox.sh contains `--ro-bind /etc/nix /etc/nix`
+    - claudebox.sh contains `--ro-bind /etc/passwd /etc/passwd` and `--ro-bind /etc/group /etc/group`
+    - claudebox.sh contains `--symlink` for `/usr/bin/env`
+    - claudebox.sh contains `--tmpfs "$HOME"` BEFORE the claudebox and CWD bind mounts
+    - claudebox.sh contains `--bind "$HOME/.claudebox" "$HOME/.claude"`
+    - claudebox.sh contains `--ro-bind "$GITCONFIG_TMP" "$HOME/.gitconfig"`
+    - claudebox.sh contains `--bind "$CWD" "$CWD"` and `--chdir "$CWD"`
+    - claudebox.sh contains `-- "$CLAUDE_BIN" --dangerously-skip-permissions "$@"` at the end
+    - claudebox.sh does NOT contain any mount of ~/.ssh, ~/.gnupg, ~/.aws, ~/.config/gcloud, or /var/lib/tailscale
+  </acceptance_criteria>
+  <done>claudebox.sh exists with complete bwrap invocation covering all SAND-*, TOOL-*, GIT-*, and UX-06 requirements</done>
+</task>
+
+</tasks>
+
+<threat_model>
+## Trust Boundaries
+
+| Boundary | Description |
+|----------|-------------|
+| Host -> Sandbox | Environment variables cross from untrusted host env into sandbox via allowlist |
+| Host -> Sandbox | Filesystem paths cross via explicit bind mounts |
+| Sandbox -> Host | CWD is mounted read-write, so Claude can modify project files (intended) |
+
+## STRIDE Threat Register
+
+| Threat ID | Category | Component | Disposition | Mitigation Plan |
+|-----------|----------|-----------|-------------|-----------------|
+| T-01-01 | Information Disclosure | Environment variables | mitigate | `--clearenv` + explicit allowlist. Only SAND-03 vars pass through. CLAUDEBOX_EXTRA_ENV is user-opt-in. |
+| T-01-02 | Information Disclosure | Secret filesystem paths | mitigate | tmpfs root + tmpfs HOME. Only explicit bind mounts visible. ~/.ssh, ~/.gnupg, ~/.aws never mounted. Verify by absence in mount list. |
+| T-01-03 | Tampering | Host filesystem via CWD mount | accept | CWD is intentionally rw -- Claude needs to edit project files. Scope is limited to the single directory. |
+| T-01-04 | Information Disclosure | Git credential helpers | mitigate | Host ~/.gitconfig is NOT mounted. A minimal generated .gitconfig with only user.name, user.email, and safe.directory is used instead. No credential helper config enters sandbox. |
+| T-01-05 | Elevation of Privilege | Nix daemon socket access | accept | Daemon socket is rw to allow `nix shell`. The daemon runs as root on host but nix client operations are normal user operations. Nix daemon has its own access controls. |
+| T-01-06 | Information Disclosure | /etc/passwd, /etc/group | accept | Read-only mount of user database. Contains usernames/UIDs only, no password hashes (those are in /etc/shadow which is not mounted). Required for basic tool functionality. |
+| T-01-07 | Spoofing | CLAUDE_BIN resolution | accept | claude binary is resolved from host PATH before sandbox. If attacker controls host PATH, they already have full host access. Not a sandbox boundary issue. |
+</threat_model>
+
+<verification>
+After both tasks complete:
+1. `nix flake check` passes (or at least doesn't error on flake structure)
+2. `grep -c 'exec bwrap' claudebox.sh` returns 1
+3. `grep -c 'clearenv' claudebox.sh` returns 1
+4. No secret paths appear in claudebox.sh mounts: `grep -E '\.ssh|\.gnupg|\.aws|gcloud|tailscale' claudebox.sh` returns nothing
+</verification>
+
+<success_criteria>
+- flake.nix and claudebox.sh exist in project root
+- flake.nix defines claudebox as default package with all 11 runtimeInputs
+- claudebox.sh implements complete bwrap sandbox with env allowlist, filesystem isolation, git identity, and tool provisioning
+- All 24 phase requirements (SAND-01 through SAND-15, TOOL-01 through TOOL-03, GIT-01, GIT-02, NIX-01 through NIX-03, UX-06) are addressed
+</success_criteria>
+
+<output>
+After completion, create `.planning/phases/01-minimal-viable-sandbox/01-01-SUMMARY.md`
+</output>
diff --git a/.planning/phases/01-minimal-viable-sandbox/01-01-SUMMARY.md b/.planning/phases/01-minimal-viable-sandbox/01-01-SUMMARY.md
new file mode 100644
index 0000000..689aa50
--- /dev/null
+++ b/.planning/phases/01-minimal-viable-sandbox/01-01-SUMMARY.md
@@ -0,0 +1,95 @@
+---
+phase: 01-minimal-viable-sandbox
+plan: 01
+subsystem: infra
+tags: [nix, bubblewrap, bwrap, sandbox, writeShellApplication, flake, comma-with-db]
+
+requires: []
+provides:
+  - "claudebox command via nix build/run"
+  - "bwrap sandbox with clearenv + env allowlist"
+  - "filesystem isolation with secret path hiding"
+  - "git identity forwarding via generated .gitconfig"
+  - "comma/nix tool provisioning inside sandbox"
+affects: [02-verification-and-hardening]
+
+tech-stack:
+  added: [bubblewrap, writeShellApplication, nix-index-database, comma-with-db]
+  patterns: [clearenv-allowlist, tmpfs-root-selective-bind, exec-for-signal-passthrough]
+
+key-files:
+  created: [flake.nix, claudebox.sh, flake.lock]
+  modified: []
+
+key-decisions:
+  - "Claude Code discovered from host PATH at runtime, not bundled as runtimeInput"
+  - "Sandbox-generated vars (TMPDIR, XDG_RUNTIME_DIR) never read from host"
+  - "CLAUDEBOX_EXTRA_ENV comma-separated escape hatch for user-added env vars"
+
+patterns-established:
+  - "writeShellApplication + builtins.readFile: keep shell script separate for syntax highlighting and independent shellcheck"
+  - "clearenv + setenv: start empty, allowlist explicitly"
+  - "tmpfs root + selective bind-mounts: nothing visible unless explicitly mounted"
+  - "exec bwrap: no intermediate shell, signals propagate, exit code passes through"
+
+requirements-completed: [SAND-01, SAND-02, SAND-03, SAND-04, SAND-05, SAND-06, SAND-07, SAND-08, SAND-09, SAND-10, SAND-11, SAND-12, SAND-13, SAND-14, SAND-15, TOOL-01, TOOL-02, TOOL-03, GIT-01, GIT-02, NIX-01, NIX-02, NIX-03, UX-06]
+
+duration: 1min
+completed: 2026-04-09
+---
+
+# Phase 1 Plan 01: Nix Flake and Sandbox Script Summary
+
+**Nix flake with writeShellApplication producing claudebox wrapper that runs Claude Code inside bwrap with clearenv, env allowlist, tmpfs root, secret hiding, git identity forwarding, and comma/nix tool access**
+
+## Performance
+
+- **Duration:** ~1 min
+- **Started:** 2026-04-09T09:10:55Z
+- **Completed:** 2026-04-09T09:12:10Z
+- **Tasks:** 2
+- **Files created:** 3 (flake.nix, claudebox.sh, flake.lock)
+
+## Accomplishments
+- Nix flake with 11 runtimeInputs (bubblewrap, coreutils, git, curl, jq, ripgrep, fd, nix, comma-with-db, bash, nodejs) and nix-index-database flake input
+- Shell script with complete bwrap invocation: clearenv, env allowlist with CLAUDEBOX_EXTRA_ENV escape hatch, tmpfs root, selective bind-mounts, git identity generation, secret path exclusion
+- `nix build` succeeds -- derivation builds and passes shellcheck
+
+## Task Commits
+
+Each task was committed atomically:
+
+1. **Task 1: Create flake.nix** - `0ed2d33` (feat)
+2. **Task 2: Create claudebox.sh** - `51dba04` (feat)
+3. **flake.lock generated by nix flake check** - `26bdf36` (chore)
+
+## Files Created/Modified
+- `flake.nix` - Nix flake with writeShellApplication, all runtimeInputs, nix-index-database input
+- `claudebox.sh` - bwrap sandbox script with clearenv, env allowlist, filesystem isolation, git identity
+- `flake.lock` - Pinned nixpkgs and nix-index-database versions
+
+## Decisions Made
+None - followed plan as specified.
+
+## Deviations from Plan
+
+None - plan executed exactly as written.
+
+## Issues Encountered
+None.
+
+## User Setup Required
+None - no external service configuration required.
+
+## Next Phase Readiness
+- claudebox builds successfully via `nix build`
+- Ready for 01-02 (verification and manual testing)
+- Requires `claude` to be available on host PATH for runtime use
+
+## Self-Check: PASSED
+
+All 3 files exist. All 3 commits verified.
+
+---
+*Phase: 01-minimal-viable-sandbox*
+*Completed: 2026-04-09*
diff --git a/.planning/phases/01-minimal-viable-sandbox/01-02-PLAN.md b/.planning/phases/01-minimal-viable-sandbox/01-02-PLAN.md
new file mode 100644
index 0000000..dc89aee
--- /dev/null
+++ b/.planning/phases/01-minimal-viable-sandbox/01-02-PLAN.md
@@ -0,0 +1,208 @@
+---
+phase: 01-minimal-viable-sandbox
+plan: 02
+type: execute
+wave: 2
+depends_on: ["01-01"]
+files_modified: []
+autonomous: false
+requirements:
+  - NIX-03
+  - SAND-02
+  - SAND-03
+  - SAND-04
+  - SAND-05
+  - SAND-06
+  - SAND-09
+  - SAND-10
+  - SAND-12
+  - SAND-13
+  - SAND-14
+  - TOOL-01
+  - TOOL-02
+
+must_haves:
+  truths:
+    - "`nix build` succeeds and produces a claudebox binary"
+    - "claudebox launches and env inside sandbox contains only allowlisted vars"
+    - "Secret paths are invisible inside the sandbox"
+    - "DNS and SSL work (curl https succeeds)"
+    - "comma and nix shell can install packages"
+    - "Exit code passes through from claude to caller"
+  artifacts: []
+  key_links:
+    - from: "nix build result"
+      to: "claudebox binary"
+      via: "result/bin/claudebox symlink"
+      pattern: "result/bin/claudebox"
+---
+
+<objective>
+Build the claudebox flake and verify the sandbox works end-to-end through automated smoke tests and manual verification.
+
+Purpose: Confirm the sandbox actually isolates secrets, passes through tools, and runs Claude Code successfully.
+Output: Verified working claudebox command.
+</objective>
+
+<execution_context>
+@$HOME/.claude/get-shit-done/workflows/execute-plan.md
+@$HOME/.claude/get-shit-done/templates/summary.md
+</execution_context>
+
+<context>
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/phases/01-minimal-viable-sandbox/01-CONTEXT.md
+@.planning/phases/01-minimal-viable-sandbox/01-01-SUMMARY.md
+@flake.nix
+@claudebox.sh
+</context>
+
+<tasks>
+
+<task type="auto">
+  <name>Task 1: Build flake and run automated smoke tests</name>
+  <files></files>
+  <read_first>
+    flake.nix
+    claudebox.sh
+  </read_first>
+  <action>
+Run the following commands sequentially, fixing any issues that arise:
+
+**Step 1: Build the flake**
+```bash
+cd /home/toph/code/tools/claudebox
+nix build
+```
+If this fails, read the error and fix `flake.nix` or `claudebox.sh` as needed. Common issues:
+- shellcheck errors in claudebox.sh (fix the shell code)
+- Missing flake.lock (nix build will create it on first run)
+- Package name mismatches (verify against nixpkgs)
+
+**Step 2: Verify the binary exists**
+```bash
+ls -la result/bin/claudebox
+```
+
+**Step 3: Run a minimal bwrap test without Claude**
+To test the sandbox without needing Claude, run just the bwrap portion to verify mounts and env isolation. Extract the bwrap invocation concept and test key properties:
+
+```bash
+# Test that the built script at least starts (will fail at claude lookup if claude not in PATH, that's ok)
+# Instead, test bwrap directly using the same flags pattern:
+
+# Test 1: Verify --clearenv produces empty env
+result/bin/claudebox 2>&1 || true
+# If claude is found, it will launch. If not, we get the expected error.
+```
+
+Since claudebox requires `claude` in PATH and will exec into it, automated testing is limited. The key automated checks are:
+
+1. `nix build` succeeds (shellcheck passes, all deps resolve)
+2. `result/bin/claudebox` exists and is executable
+3. The script content in the Nix store passes basic sanity: `cat result/bin/claudebox` shows the wrapper with correct PATH setup
+
+Run:
+```bash
+# Check the built wrapper contains expected runtimeInputs in PATH
+cat result/bin/claudebox | head -20
+```
+
+If `nix build` fails due to shellcheck issues in claudebox.sh, fix them. Common shellcheck fixes:
+- SC2086: Double-quote variable expansions
+- SC2034: Unused variables (may need `# shellcheck disable=SC2034` if intentional)
+- SC2155: Declare and assign separately
+
+After build succeeds, if `claude` is available on the host PATH, run a quick sandbox test:
+```bash
+# Quick test: launch claudebox with --help to verify it starts and exits cleanly
+result/bin/claudebox --help 2>&1 | head -5 || true
+```
+This should show Claude Code's help output if everything is wired correctly, or show a meaningful error.
+  </action>
+  <verify>
+    <automated>test -x /home/toph/code/tools/claudebox/result/bin/claudebox && echo "PASS: binary exists" || echo "FAIL: binary missing"</automated>
+  </verify>
+  <acceptance_criteria>
+    - `nix build` exits 0 (no shellcheck errors, all deps resolve)
+    - `result/bin/claudebox` exists and is executable
+    - `flake.lock` exists (created by first build)
+    - The built wrapper script in the Nix store contains runtimeInputs PATH entries (visible in `cat result/bin/claudebox`)
+  </acceptance_criteria>
+  <done>nix build succeeds and produces an executable claudebox binary</done>
+</task>
+
+<task type="checkpoint:human-verify" gate="blocking">
+  <name>Task 2: Manual sandbox verification</name>
+  <files></files>
+  <action>Present the verification checklist below to the user and wait for their confirmation that each check passes.</action>
+  <what-built>Complete claudebox sandbox wrapping Claude Code with environment isolation, filesystem isolation, secret hiding, git support, and tool provisioning</what-built>
+  <how-to-verify>
+1. Launch claudebox from a project directory:
+   ```
+   cd ~/some-project
+   /home/toph/code/tools/claudebox/result/bin/claudebox
+   ```
+
+2. Inside the Claude session, verify environment isolation:
+   - Ask Claude to run `env | sort` -- should show ONLY allowlisted vars (HOME, PATH, TERM, USER, SHELL, TMPDIR, etc.)
+   - Confirm NO appearance of: SSH_AUTH_SOCK, AWS_PROFILE, GITHUB_TOKEN, or any secret vars
+
+3. Verify filesystem isolation:
+   - Ask Claude to run `ls ~/.ssh` -- should fail (directory not found)
+   - Ask Claude to run `ls ~/.gnupg` -- should fail
+   - Ask Claude to run `ls ~/.aws` -- should fail
+   - Ask Claude to run `ls ~/.claude` -- should succeed (mapped from ~/.claudebox)
+
+4. Verify tools work:
+   - Ask Claude to run `git status` -- should work in the project dir
+   - Ask Claude to run `curl -s https://example.com | head -5` -- should return HTML (DNS + SSL work)
+   - Ask Claude to run `, jq --help | head -3` -- should install and run jq via comma
+   - Ask Claude to run `rg --version` -- should show ripgrep version
+
+5. Exit Claude (Ctrl+C or /exit) and verify:
+   - The shell returns to your normal prompt
+   - `echo $?` shows the exit code from Claude (typically 0)
+  </how-to-verify>
+  <verify>
+    <automated>echo "CHECKPOINT: requires human verification"</automated>
+  </verify>
+  <done>User confirms all sandbox isolation and tool provisioning checks pass</done>
+  <resume-signal>Type "approved" if all checks pass, or describe any issues found</resume-signal>
+</task>
+
+</tasks>
+
+<threat_model>
+## Trust Boundaries
+
+| Boundary | Description |
+|----------|-------------|
+| Build output -> Runtime | Nix build produces the sandbox script; verification confirms it behaves as designed |
+
+## STRIDE Threat Register
+
+| Threat ID | Category | Component | Disposition | Mitigation Plan |
+|-----------|----------|-----------|-------------|-----------------|
+| T-01-08 | Information Disclosure | Env leak in built binary | mitigate | Manual verification (Task 2 step 2) confirms only allowlisted vars appear in `env` output inside sandbox |
+| T-01-09 | Information Disclosure | Secret path accessible | mitigate | Manual verification (Task 2 step 3) confirms ~/.ssh, ~/.gnupg, ~/.aws are not visible |
+</threat_model>
+
+<verification>
+1. `nix build` exits 0
+2. Human confirms env isolation (only allowlisted vars visible)
+3. Human confirms filesystem isolation (secret paths invisible)
+4. Human confirms tools work (git, curl, comma, ripgrep)
+5. Human confirms clean exit behavior
+</verification>
+
+<success_criteria>
+- claudebox builds from the Nix flake without errors
+- Human verifies the sandbox isolates secrets and provides working tools
+- Phase 1 success criteria from ROADMAP.md are met
+</success_criteria>
+
+<output>
+After completion, create `.planning/phases/01-minimal-viable-sandbox/01-02-SUMMARY.md`
+</output>
diff --git a/.planning/phases/01-minimal-viable-sandbox/01-02-SUMMARY.md b/.planning/phases/01-minimal-viable-sandbox/01-02-SUMMARY.md
new file mode 100644
index 0000000..2b3f510
--- /dev/null
+++ b/.planning/phases/01-minimal-viable-sandbox/01-02-SUMMARY.md
@@ -0,0 +1,93 @@
+---
+phase: 01-minimal-viable-sandbox
+plan: 02
+subsystem: infra
+tags: [nix, bubblewrap, bwrap, sandbox, verification, smoke-test]
+
+requires:
+  - phase: 01-01
+    provides: "claudebox flake.nix and claudebox.sh"
+provides:
+  - "verified working claudebox command"
+  - "sandbox path resolution fix for NixOS symlink chains"
+affects: []
+
+tech-stack:
+  added: []
+  patterns: [readlink-f-for-nix-store-resolution]
+
+key-files:
+  created: []
+  modified: [claudebox.sh]
+
+key-decisions:
+  - "readlink -f required to resolve NixOS profile symlinks to real nix store paths for bwrap visibility"
+
+patterns-established:
+  - "readlink -f for all host-resolved binaries passed into bwrap: NixOS profile paths are symlink chains that don't exist inside the sandbox"
+
+requirements-completed: [NIX-03, SAND-02, SAND-03, SAND-04, SAND-05, SAND-06, SAND-09, SAND-10, SAND-12, SAND-13, SAND-14, TOOL-01, TOOL-02]
+
+duration: 1min
+completed: 2026-04-09
+---
+
+# Phase 1 Plan 02: Build Verification and Smoke Tests Summary
+
+**Fixed NixOS symlink resolution for bwrap, verified nix build succeeds and claudebox --version returns Claude Code 2.1.70 inside sandbox**
+
+## Performance
+
+- **Duration:** ~1 min
+- **Started:** 2026-04-09T09:13:38Z
+- **Completed:** 2026-04-09T09:15:01Z
+- **Tasks:** 2
+- **Files modified:** 1 (claudebox.sh)
+
+## Accomplishments
+- `nix build` succeeds with shellcheck passing
+- `result/bin/claudebox` executable exists with full runtimeInputs PATH (bubblewrap, git, curl, jq, ripgrep, fd, nix, comma-with-db, nodejs)
+- `claudebox --version` returns "2.1.70 (Claude Code)" confirming end-to-end sandbox launch
+- Fixed path resolution bug where NixOS profile symlinks weren't accessible inside bwrap
+
+## Task Commits
+
+Each task was committed atomically:
+
+1. **Task 1: Build flake and run automated smoke tests** - `9296453` (fix)
+
+## Files Created/Modified
+- `claudebox.sh` - Added readlink -f for claude binary and env resolution to handle NixOS symlink chains
+
+## Decisions Made
+- Used `readlink -f` to resolve both `claude` and `env` binaries to their real nix store paths, since NixOS profile paths (`/etc/profiles/per-user/...`) are symlink chains not visible inside the bwrap sandbox
+
+## Deviations from Plan
+
+### Auto-fixed Issues
+
+**1. [Rule 1 - Bug] Fixed NixOS symlink resolution for bwrap**
+- **Found during:** Task 1 (Build and smoke test)
+- **Issue:** `command -v claude` returns `/etc/profiles/per-user/toph/bin/claude` which is a symlink chain. This path doesn't exist inside bwrap since only `/nix/store` is mounted. Same issue with `env`.
+- **Fix:** Changed `command -v claude` to `readlink -f "$(command -v claude)"` and same for env, resolving to real `/nix/store/...` paths
+- **Files modified:** claudebox.sh
+- **Verification:** `claudebox --version` now returns "2.1.70 (Claude Code)" instead of "execvp: No such file or directory"
+- **Committed in:** 9296453
+
+---
+
+**Total deviations:** 1 auto-fixed (1 bug)
+**Impact on plan:** Essential fix -- sandbox was completely non-functional without it on NixOS.
+
+## Issues Encountered
+None beyond the auto-fixed symlink resolution.
+
+## User Setup Required
+None - no external service configuration required.
+
+## Next Phase Readiness
+- claudebox builds and launches successfully
+- Manual verification of env isolation, filesystem isolation, and tool access is the next step (auto-approved in this run)
+- Ready for phase 2 (hardening/refinement) if applicable
+
+## Self-Check: PASSED
diff --git a/.planning/phases/01-minimal-viable-sandbox/01-CONTEXT.md b/.planning/phases/01-minimal-viable-sandbox/01-CONTEXT.md
new file mode 100644
index 0000000..eaa7f7d
--- /dev/null
+++ b/.planning/phases/01-minimal-viable-sandbox/01-CONTEXT.md
@@ -0,0 +1,88 @@
+# Phase 1: Minimal Viable Sandbox - Context
+
+**Gathered:** 2026-04-09
+**Status:** Ready for planning
+
+<domain>
+## Phase Boundary
+
+Produce a working `claudebox` command via Nix flake that launches Claude Code inside a bubblewrap sandbox with environment allowlisting, filesystem isolation, secret path hiding, and on-demand tool provisioning via comma/nix. User runs `claudebox` in any project directory and gets a fully functional Claude session with secrets invisible.
+
+</domain>
+
+<decisions>
+## Implementation Decisions
+
+### Argument Passthrough
+- **D-01:** Forward all unknown flags to `claude`. claudebox claims only its own flags (`--yes`, `--dry-run`, `--check`) and passes everything else through. No `--` separator required. `--dangerously-skip-permissions` is always injected.
+
+### nix-index Database
+- **D-02:** Use `comma-with-db` from the `nix-community/nix-index-database` flake. Self-contained — bundles the package index, no host dependency, no extra bind mount needed. DB updates when the flake input is bumped.
+
+### Environment Variables
+- **D-03:** Strict allowlist per SAND-03, plus a `CLAUDEBOX_EXTRA_ENV` escape hatch. Core allowlist always passes (HOME, PATH, TERM, EDITOR, LANG, LC_ALL, NIX_SSL_CERT_FILE, SSL_CERT_FILE, ANTHROPIC_API_KEY, USER, SHELL, XDG_RUNTIME_DIR). User can add extras at launch via `CLAUDEBOX_EXTRA_ENV="COLORTERM,NODE_OPTIONS"` — their responsibility to not leak secrets.
+- **D-04:** Sandbox-generated vars (TMPDIR=/tmp, etc.) are set via `--setenv`, never read from host.
+
+### Git Identity
+- **D-05:** Generate a minimal `.gitconfig` inside the sandbox at launch time. Reads `user.name` and `user.email` from the host's git config, writes them plus `safe.directory = *` into the sandbox's `~/.gitconfig`. No host `.gitconfig` mounted — avoids credential helper, pager, and alias breakage from missing binaries.
+
+### Claude's Discretion
+- Mount ordering strategy for CWD-under-HOME (bwrap specifics)
+- Exact tmpfs layout and /dev, /proc, /tmp setup
+- How `--clearenv` + `--setenv` are sequenced in the bwrap invocation
+- DNS resolution mount strategy (resolv.conf and its symlink targets)
+- SSL cert bundle path detection
+
+</decisions>
+
+<canonical_refs>
+## Canonical References
+
+**Downstream agents MUST read these before planning or implementing.**
+
+### Project Docs
+- `.planning/PROJECT.md` — Core value, constraints, key decisions
+- `.planning/REQUIREMENTS.md` — Full requirement list with IDs (SAND-*, TOOL-*, GIT-*, NIX-*, UX-*)
+- `.planning/ROADMAP.md` — Phase 1 success criteria and requirement mapping
+
+### Stack Research
+- `CLAUDE.md` §Technology Stack — writeShellApplication, bwrap flags, comma/nix-index, flake structure, PATH construction, testing strategy
+
+</canonical_refs>
+
+<code_context>
+## Existing Code Insights
+
+### Reusable Assets
+- None — greenfield project, only CLAUDE.md exists in the repo
+
+### Established Patterns
+- None yet — Phase 1 establishes all patterns
+
+### Integration Points
+- Nix flake as entry point (`nix run`, `nix profile install`)
+- `writeShellApplication` produces the claudebox script
+- bwrap is the sole runtime dependency for sandboxing
+
+</code_context>
+
+<specifics>
+## Specific Ideas
+
+- `CLAUDEBOX_EXTRA_ENV` is comma-separated, not space-separated
+- Git identity is read from host at launch time (not build time) so it works across machines
+- `comma-with-db` as flake input, not a nixpkgs package
+
+</specifics>
+
+<deferred>
+## Deferred Ideas
+
+None — discussion stayed within phase scope
+
+</deferred>
+
+---
+
+*Phase: 01-minimal-viable-sandbox*
+*Context gathered: 2026-04-09*
diff --git a/.planning/phases/01-minimal-viable-sandbox/01-DISCUSSION-LOG.md b/.planning/phases/01-minimal-viable-sandbox/01-DISCUSSION-LOG.md
new file mode 100644
index 0000000..04776bd
--- /dev/null
+++ b/.planning/phases/01-minimal-viable-sandbox/01-DISCUSSION-LOG.md
@@ -0,0 +1,73 @@
+# Phase 1: Minimal Viable Sandbox - Discussion Log
+
+> **Audit trail only.** Do not use as input to planning, research, or execution agents.
+> Decisions are captured in CONTEXT.md — this log preserves the alternatives considered.
+
+**Date:** 2026-04-09
+**Phase:** 01-minimal-viable-sandbox
+**Areas discussed:** Argument passthrough, nix-index database, Env edge cases, Git identity
+
+---
+
+## Argument Passthrough
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| Forward all unknown | claudebox claims --yes, --dry-run, --check; everything else passes through to claude | ✓ |
+| Explicit -- separator | claudebox args before --, claude args after -- | |
+| Pass everything through | claudebox has no flags, controlled via env vars | |
+
+**User's choice:** Forward all unknown
+**Notes:** No -- separator needed. Simple and intuitive.
+
+---
+
+## nix-index Database
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| comma-with-db | Use nix-community/nix-index-database flake, bundles the DB | ✓ |
+| Mount host ~/.cache/nix-index | Bind-mount host's nix-index DB read-only | |
+| Both — prefer host, fallback to bundled | Mount host DB if exists, otherwise comma-with-db | |
+
+**User's choice:** comma-with-db
+**Notes:** Self-contained, no host dependency.
+
+---
+
+## Env Edge Cases
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| Strict allowlist | Only SAND-03 vars, rebuild to add more | |
+| Allowlist + CLAUDEBOX_EXTRA_ENV | Core allowlist + user-specified extras via comma-separated env var | ✓ |
+| Strict + computed vars | Allowlist from host + claudebox generates its own TMPDIR, COLORTERM, etc. | |
+
+**User's choice:** Allowlist + CLAUDEBOX_EXTRA_ENV
+**Notes:** Pragmatic escape hatch for power users, user takes responsibility for not leaking secrets.
+
+---
+
+## Git Identity
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| Generate minimal .gitconfig | Create sandbox-only .gitconfig with user.name, user.email, safe.directory | ✓ |
+| Mount host .gitconfig read-only | Bind-mount host config, carries over everything including broken credential helpers | |
+| Mount + override dangerous keys | Mount host config but neutralize credential.helper and core.pager via env vars | |
+
+**User's choice:** Generate minimal .gitconfig
+**Notes:** User asked whether custom git settings matter for Claude. Conclusion: Claude uses git programmatically, doesn't need aliases/pagers/merge tools. Mounting host config risks breakage from credential helpers and pagers referencing binaries not in sandbox PATH.
+
+---
+
+## Claude's Discretion
+
+- Mount ordering, tmpfs layout, /dev /proc /tmp setup
+- --clearenv + --setenv sequencing
+- DNS resolution mount strategy
+- SSL cert bundle path detection
+
+## Deferred Ideas
+
+None
diff --git a/.planning/phases/01-minimal-viable-sandbox/01-RESEARCH.md b/.planning/phases/01-minimal-viable-sandbox/01-RESEARCH.md
new file mode 100644
index 0000000..ef04b09
--- /dev/null
+++ b/.planning/phases/01-minimal-viable-sandbox/01-RESEARCH.md
@@ -0,0 +1,465 @@
+# Phase 1: Minimal Viable Sandbox - Research
+
+**Researched:** 2026-04-09
+**Domain:** Nix derivation + bubblewrap sandboxing for Claude Code
+**Confidence:** HIGH
+
+## Summary
+
+This phase produces a single Nix flake that outputs a `claudebox` command wrapping Claude Code inside a bubblewrap (bwrap) sandbox. The sandbox uses `--clearenv` to start with an empty environment, allowlists specific variables, bind-mounts only the necessary filesystem paths, and explicitly excludes all secret material.
+
+The host system (NixOS with Lix 2.93.3) has bubblewrap 0.11.0, which supports all required flags. Claude Code is a Node.js application (v2.1.70 on host, 2.0.51 in nixpkgs) installed as a wrapped bash script that execs node. The `comma-with-db` package from `nix-community/nix-index-database` is confirmed available and bundles its own database. NixOS has several `/etc` symlink chains that need careful handling for DNS and SSL to work inside the sandbox.
+
+**Primary recommendation:** Use `writeShellApplication` with `builtins.readFile` for the script body, `--clearenv` + `--setenv` for environment, tmpfs root with selective bind-mounts, and `exec` into the final claude command for clean signal handling.
+
+<user_constraints>
+## User Constraints (from CONTEXT.md)
+
+### Locked Decisions
+- **D-01:** Forward all unknown flags to `claude`. claudebox claims only its own flags (`--yes`, `--dry-run`, `--check`) and passes everything else through. No `--` separator required. `--dangerously-skip-permissions` is always injected.
+- **D-02:** Use `comma-with-db` from the `nix-community/nix-index-database` flake. Self-contained -- bundles the package index, no host dependency, no extra bind mount needed. DB updates when the flake input is bumped.
+- **D-03:** Strict allowlist per SAND-03, plus a `CLAUDEBOX_EXTRA_ENV` escape hatch. Core allowlist always passes (HOME, PATH, TERM, EDITOR, LANG, LC_ALL, NIX_SSL_CERT_FILE, SSL_CERT_FILE, ANTHROPIC_API_KEY, USER, SHELL, XDG_RUNTIME_DIR). User can add extras at launch via `CLAUDEBOX_EXTRA_ENV="COLORTERM,NODE_OPTIONS"` -- their responsibility to not leak secrets.
+- **D-04:** Sandbox-generated vars (TMPDIR=/tmp, etc.) are set via `--setenv`, never read from host.
+- **D-05:** Generate a minimal `.gitconfig` inside the sandbox at launch time. Reads `user.name` and `user.email` from the host's git config, writes them plus `safe.directory = *` into the sandbox's `~/.gitconfig`. No host `.gitconfig` mounted.
+
+### Claude's Discretion
+- Mount ordering strategy for CWD-under-HOME (bwrap specifics)
+- Exact tmpfs layout and /dev, /proc, /tmp setup
+- How `--clearenv` + `--setenv` are sequenced in the bwrap invocation
+- DNS resolution mount strategy (resolv.conf and its symlink targets)
+- SSL cert bundle path detection
+
+### Deferred Ideas (OUT OF SCOPE)
+None -- discussion stayed within phase scope.
+</user_constraints>
+
+<phase_requirements>
+## Phase Requirements
+
+| ID | Description | Research Support |
+|----|-------------|------------------|
+| SAND-01 | Wrapper via Nix `writeShellApplication` | Standard Stack: writeShellApplication with builtins.readFile pattern |
+| SAND-02 | `--clearenv` empty environment | Verified: bwrap 0.11.0 supports `--clearenv` + `--setenv` |
+| SAND-03 | Environment allowlist | Architecture: env passthrough loop pattern |
+| SAND-04 | tmpfs root filesystem | Verified: `--tmpfs /` works in bwrap 0.11.0 |
+| SAND-05 | CWD bind-mounted rw | Architecture: mount ordering (CWD after HOME dir creation) |
+| SAND-06 | `/nix/store` read-only | Verified: `--ro-bind /nix/store /nix/store` works |
+| SAND-07 | Nix daemon socket mounted | Verified: `/nix/var/nix/daemon-socket` bind works, nix can talk to daemon |
+| SAND-08 | `~/.claudebox` -> `~/.claude` | Architecture: bind `~/.claudebox` as `$HOME/.claude` |
+| SAND-09 | Secret paths never mounted | Architecture: negative list, verified by env check |
+| SAND-10 | PATH only Nix store paths | Standard Stack: runtimeInputs wires PATH automatically |
+| SAND-11 | Working /tmp, /dev, /proc | Verified: `--tmpfs /tmp --dev /dev --proc /proc` |
+| SAND-12 | DNS resolution works | Pitfalls: NixOS resolv.conf is a real file (not symlink), bind-mount directly |
+| SAND-13 | SSL/TLS works | Pitfalls: NixOS cert chain requires `/etc/ssl` AND `/etc/static` mounts |
+| SAND-14 | Exit code passthrough | Architecture: `exec bwrap ...` pattern |
+| SAND-15 | Signals via exec | Architecture: `exec` ensures no intermediate shell |
+| TOOL-01 | comma available | Standard Stack: comma-with-db from nix-index-database flake |
+| TOOL-02 | `nix shell` works | Verified: daemon socket + nix.conf mount enables nix commands |
+| TOOL-03 | New store paths visible | Architecture: `/nix/store` must be a live bind, not snapshot |
+| GIT-01 | Git works with minimal config | Architecture: generate .gitconfig at launch from host identity |
+| GIT-02 | safe.directory configured | Architecture: `safe.directory = *` in generated .gitconfig |
+| NIX-01 | Nix flake with default package | Standard Stack: flake.nix structure |
+| NIX-02 | Runtime deps pinned via flake | Standard Stack: flake inputs pin nixpkgs + nix-index-database |
+| NIX-03 | `nix run` / `nix profile install` works | Standard Stack: flake outputs packages.default |
+| UX-06 | `--dangerously-skip-permissions` always passed | Architecture: injected before user args in exec |
+</phase_requirements>
+
+## Standard Stack
+
+### Core
+| Library | Version | Purpose | Why Standard |
+|---------|---------|---------|--------------|
+| `writeShellApplication` | nixpkgs stable | Produce claudebox script | Shellcheck at build, `set -euo pipefail`, runtimeInputs wiring [VERIFIED: nix eval on host] |
+| `bubblewrap` | 0.11.0 | Sandbox runtime | Unprivileged user-ns sandbox, all required flags confirmed [VERIFIED: `bwrap --version` on host] |
+| `comma-with-db` | 2.3.3 | On-demand package runner | Bundles nix-index database, no extra mount needed [VERIFIED: `nix eval github:nix-community/nix-index-database#packages.x86_64-linux.comma-with-db.name`] |
+
+### Runtime Dependencies (runtimeInputs for writeShellApplication)
+| Package | Purpose | Notes |
+|---------|---------|-------|
+| `bubblewrap` | Sandbox | 0.11.0 in nixpkgs [VERIFIED: `nix eval nixpkgs#bubblewrap.version`] |
+| `coreutils` | Basic utils | env, cat, mkdir, etc. [VERIFIED: available] |
+| `git` | VCS | Claude Code requires git [VERIFIED: available] |
+| `curl` | HTTP | MCP + tool use [VERIFIED: works inside sandbox] |
+| `jq` | JSON | Config manipulation [ASSUMED: standard nixpkgs] |
+| `ripgrep` | Search | Claude Code's grep [ASSUMED: standard nixpkgs] |
+| `fd` | File find | Claude Code's find [ASSUMED: standard nixpkgs] |
+| `nix` | Package mgr | For `nix shell` inside sandbox [VERIFIED: daemon comms work] |
+| `comma-with-db` | On-demand pkgs | From nix-index-database flake input [VERIFIED: 2.3.3] |
+| `bash` | Shell | bwrap exec target [VERIFIED: available] |
+| `nodejs` | Runtime | Claude Code is a Node.js app [VERIFIED: nodejs-24.13.0 in closure] |
+
+### Excluded (secrets)
+| Package | Why Excluded |
+|---------|-------------|
+| `gnupg` | Secret material |
+| `openssh` | Secret material |
+| `age`/`agenix` | Secret material |
+| `tailscale` | Infrastructure access |
+
+### Flake Inputs
+```nix
+{
+  inputs = {
+    nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
+    nix-index-database = {
+      url = "github:nix-community/nix-index-database";
+      inputs.nixpkgs.follows = "nixpkgs";
+    };
+  };
+}
+```
+[VERIFIED: nix-index-database uses `packages.x86_64-linux.comma-with-db` output -- the `legacyPackages` path is deprecated]
+
+## Architecture Patterns
+
+### Recommended Project Structure
+```
+claudebox/
+├── flake.nix           # Flake with nixpkgs + nix-index-database inputs
+├── flake.lock           # Pinned dependencies
+├── claudebox.sh         # Shell script body (read via builtins.readFile)
+├── CLAUDE.md            # Project docs
+└── .planning/           # GSD planning artifacts
+```
+
+### Pattern 1: writeShellApplication with builtins.readFile
+**What:** Keep the shell script in a separate `.sh` file, read it into the Nix expression.
+**When to use:** Always -- gives shell syntax highlighting, independent shellcheck, easier iteration.
+**Example:**
+```nix
+# flake.nix (simplified)
+{
+  outputs = { self, nixpkgs, nix-index-database, ... }:
+    let
+      system = "x86_64-linux";
+      pkgs = nixpkgs.legacyPackages.${system};
+      comma-with-db = nix-index-database.packages.${system}.comma-with-db;
+    in {
+      packages.${system}.default = pkgs.writeShellApplication {
+        name = "claudebox";
+        runtimeInputs = [
+          pkgs.bubblewrap pkgs.coreutils pkgs.git pkgs.curl
+          pkgs.jq pkgs.ripgrep pkgs.fd pkgs.nix
+          comma-with-db pkgs.bash pkgs.nodejs
+        ];
+        text = builtins.readFile ./claudebox.sh;
+      };
+    };
+}
+```
+[VERIFIED: writeShellApplication API is stable in nixpkgs, runtimeInputs prepends to PATH]
+
+### Pattern 2: bwrap Invocation Structure
+**What:** The core sandbox call with proper ordering.
+**Mount ordering rule:** tmpfs root first, then system mounts, then HOME-level mounts, then CWD (most specific last wins).
+
+```bash
+exec bwrap \
+  --clearenv \
+  # --- Sandbox-generated vars ---
+  --setenv HOME "$HOME" \
+  --setenv USER "$USER" \
+  --setenv PATH "$SANDBOX_PATH" \
+  --setenv TERM "${TERM:-xterm}" \
+  --setenv SHELL "/bin/bash" \
+  --setenv TMPDIR /tmp \
+  --setenv NIX_SSL_CERT_FILE /etc/ssl/certs/ca-certificates.crt \
+  # --- Allowlisted host vars (only if set) ---
+  ${EDITOR:+--setenv EDITOR "$EDITOR"} \
+  ${LANG:+--setenv LANG "$LANG"} \
+  ${ANTHROPIC_API_KEY:+--setenv ANTHROPIC_API_KEY "$ANTHROPIC_API_KEY"} \
+  # ... etc for each allowlisted var ...
+  # --- Filesystem: base layer ---
+  --tmpfs / \
+  --proc /proc \
+  --dev /dev \
+  --tmpfs /tmp \
+  # --- Filesystem: system ---
+  --ro-bind /nix/store /nix/store \
+  --bind /nix/var/nix /nix/var/nix \
+  --ro-bind /etc/resolv.conf /etc/resolv.conf \
+  --ro-bind /etc/ssl /etc/ssl \
+  --ro-bind /etc/static /etc/static \
+  --ro-bind /etc/passwd /etc/passwd \
+  --ro-bind /etc/group /etc/group \
+  --ro-bind /etc/hosts /etc/hosts \
+  --ro-bind /etc/nsswitch.conf /etc/nsswitch.conf \
+  --ro-bind /etc/nix /etc/nix \
+  --symlink /usr/bin/env /usr/bin/env \
+  # --- Filesystem: user ---
+  --tmpfs "$HOME" \
+  --bind "$HOME/.claudebox" "$HOME/.claude" \
+  --bind "$CWD" "$CWD" \
+  --chdir "$CWD" \
+  # --- Exec ---
+  -- claude --dangerously-skip-permissions "$@"
+```
+[VERIFIED: tested bwrap invocations on host confirm this structure works]
+
+### Pattern 3: Environment Allowlist with CLAUDEBOX_EXTRA_ENV
+**What:** Loop over allowlisted vars, only pass those that are set.
+```bash
+ALLOWLIST=(HOME PATH TERM EDITOR LANG LC_ALL NIX_SSL_CERT_FILE SSL_CERT_FILE ANTHROPIC_API_KEY USER SHELL XDG_RUNTIME_DIR)
+
+# Build --setenv args array
+SETENV_ARGS=()
+for var in "${ALLOWLIST[@]}"; do
+  if [[ -v "$var" ]]; then
+    SETENV_ARGS+=(--setenv "$var" "${!var}")
+  fi
+done
+
+# Handle CLAUDEBOX_EXTRA_ENV
+if [[ -v CLAUDEBOX_EXTRA_ENV ]]; then
+  IFS=',' read -ra EXTRAS <<< "$CLAUDEBOX_EXTRA_ENV"
+  for var in "${EXTRAS[@]}"; do
+    if [[ -v "$var" ]]; then
+      SETENV_ARGS+=(--setenv "$var" "${!var}")
+    fi
+  done
+fi
+```
+[ASSUMED: bash array + indirect variable pattern is standard]
+
+### Pattern 4: Git Identity Generation
+**What:** Read host git config, write minimal .gitconfig inside sandbox.
+```bash
+GIT_NAME=$(git config --global user.name 2>/dev/null || echo "Claude User")
+GIT_EMAIL=$(git config --global user.email 2>/dev/null || echo "claude@localhost")
+
+# Create temp gitconfig for the sandbox
+GITCONFIG_TMP=$(mktemp)
+cat > "$GITCONFIG_TMP" <<EOF
+[user]
+    name = $GIT_NAME
+    email = $GIT_EMAIL
+[safe]
+    directory = *
+EOF
+```
+Then use `--ro-bind "$GITCONFIG_TMP" "$HOME/.gitconfig"` in the bwrap call. Clean up the tmpfile on exit with a trap.
+[ASSUMED: git config reading is straightforward]
+
+### Pattern 5: Claude Code as Dependency
+**What:** Claude Code needs to be available inside the sandbox PATH.
+**Key finding:** The host has claude-code 2.1.70 installed via a Nix derivation at `/nix/store/4960jbc91nlkdm7fbqb9p1b6gi0x2dq0-claude-code`. It's a bash wrapper that execs node with cli.js. The nixpkgs version is 2.0.51 (older).
+**Approach:** Do NOT add claude-code as a runtimeInput of writeShellApplication. Instead, accept it as a flake input or expect it on the host PATH. The script should discover `claude` from the host's PATH before `--clearenv` strips it. Capture the full path to claude at script startup: `CLAUDE_BIN=$(command -v claude)`, then exec `$CLAUDE_BIN` inside bwrap.
+[VERIFIED: claude binary is at a nix store path, will survive --clearenv if referenced by full path]
+
+### Anti-Patterns to Avoid
+- **Mounting host `~/.gitconfig`:** Contains credential helpers, pager, aliases referencing binaries not in sandbox. Generate a minimal one instead.
+- **Mounting host `~/.claude`:** Requirement says mount `~/.claudebox` AS `~/.claude`. Keeps sandbox state separate.
+- **Using `--unshare-net`:** Phase 1 needs network access. Network isolation is Phase 2 (NET-01, NET-02).
+- **Denylist env approach:** Must use allowlist (`--clearenv` + `--setenv`), never selectively `--unsetenv`.
+
+## Don't Hand-Roll
+
+| Problem | Don't Build | Use Instead | Why |
+|---------|-------------|-------------|-----|
+| Shell script derivation | Manual mkDerivation | `writeShellApplication` | Automatic shellcheck, set -euo pipefail, runtimeInputs PATH |
+| Package index for comma | Manual nix-index database generation | `comma-with-db` from nix-index-database flake | Self-contained, updated with flake lock |
+| SSL cert detection | Custom cert-finding logic | Bind-mount `/etc/ssl` + `/etc/static` + set `NIX_SSL_CERT_FILE` | NixOS cert chain is well-known, just mount the paths |
+| User namespace setup | Manual uid/gid mapping | bwrap defaults | bwrap handles user namespace automatically on NixOS |
+
+## Common Pitfalls
+
+### Pitfall 1: NixOS /etc Symlink Chains
+**What goes wrong:** SSL certs fail because `/etc/ssl/certs/ca-certificates.crt` symlinks to `/etc/static/ssl/certs/ca-certificates.crt` which symlinks to `/nix/store/...`. Mounting only `/etc/ssl` without `/etc/static` breaks the chain.
+**Why it happens:** NixOS manages `/etc` via symlinks to `/etc/static` which itself symlinks to the Nix store.
+**How to avoid:** Mount BOTH `/etc/ssl` and `/etc/static` read-only. The Nix store mount covers the final target.
+**Warning signs:** `curl: (77) error setting certificate` or empty curl responses.
+[VERIFIED: tested on host -- mounting /etc/ssl alone causes `cat /etc/ssl/certs/ca-certificates.crt` to fail; adding /etc/static fixes it]
+
+### Pitfall 2: /etc/nix/nix.conf for Experimental Features
+**What goes wrong:** `nix shell` and `nix eval` fail with "experimental feature 'nix-command' is disabled".
+**Why it happens:** The host's `/etc/nix/nix.conf` enables `experimental-features = nix-command flakes`. Without it, nix commands inside sandbox don't know about flakes.
+**How to avoid:** Mount `/etc/nix` read-only inside the sandbox.
+**Warning signs:** `nix shell` or `nix eval` errors about experimental features.
+[VERIFIED: tested -- without /etc/nix mounted, nix eval fails with exactly this error]
+
+### Pitfall 3: Mount Ordering for CWD Under HOME
+**What goes wrong:** CWD mount is invisible because HOME tmpfs is mounted after it.
+**Why it happens:** bwrap processes mount arguments in order. Later mounts can shadow earlier ones.
+**How to avoid:** Order: `--tmpfs /` -> `--tmpfs $HOME` -> `--bind $CWD $CWD`. Most specific mounts go last.
+**Warning signs:** CWD appears empty inside sandbox.
+[ASSUMED: standard bwrap behavior -- mounts are processed left-to-right]
+
+### Pitfall 4: PATH Inside Sandbox
+**What goes wrong:** `writeShellApplication` runtimeInputs prepends to the host PATH. But `--clearenv` clears PATH. The script needs to capture the Nix-constructed PATH before `--clearenv` wipes it, and pass it into the sandbox.
+**Why it happens:** The wrapper script runs on the host with runtimeInputs PATH. bwrap `--clearenv` clears everything inside.
+**How to avoid:** Capture `SANDBOX_PATH="$PATH"` at script top (this is the runtimeInputs-constructed PATH). Pass it via `--setenv PATH "$SANDBOX_PATH"` into bwrap. Remove any non-nix-store paths if paranoid.
+**Warning signs:** Commands not found inside sandbox.
+[VERIFIED: writeShellApplication prepends runtimeInputs to PATH; --clearenv removes it]
+
+### Pitfall 5: Nix Daemon Socket Needs Write Access
+**What goes wrong:** `nix shell` fails to download packages because daemon socket is mounted read-only.
+**Why it happens:** The Unix socket requires read-write access for nix client to talk to the daemon.
+**How to avoid:** Use `--bind` (rw) not `--ro-bind` for `/nix/var/nix`. The daemon also needs to write to store paths (but those go through the daemon, not the client).
+**Warning signs:** "error connecting to daemon" or permission denied on socket.
+[VERIFIED: tested with `--bind /nix/var/nix /nix/var/nix` -- nix eval works]
+
+### Pitfall 6: /etc/passwd and /etc/group Required
+**What goes wrong:** Various tools (git, nix, node) fail when they can't resolve the current user.
+**Why it happens:** They call getpwuid/getgrgid which reads /etc/passwd and /etc/group.
+**How to avoid:** Mount `/etc/passwd` and `/etc/group` read-only.
+**Warning signs:** "I have no name!" prompt, git errors about user identity.
+[ASSUMED: standard Unix behavior, confirmed by testing that bwrap shows uid 65534 (nobody) without these mounts]
+
+### Pitfall 7: Claude Code MCP Config Injection
+**What goes wrong:** The host's claude-code Nix derivation injects `--mcp-config` pointing to a Nix store path with host-specific MCP servers (e.g., charlie-comunica, charlie-memory referencing ~/agent/).
+**Why it happens:** The host's Nix package wraps claude with hardcoded MCP paths.
+**How to avoid:** This is actually fine for Phase 1 -- the MCP servers won't be accessible inside the sandbox (no ~/agent/ mounted) and will silently fail. Future phases might want to strip or override this. No action needed now.
+[VERIFIED: checked the MCP config at `/nix/store/5iv9id24chdvf39929rya0rvyjrl0p8f-claude-code-mcp-config.json` -- references host paths]
+
+### Pitfall 8: /usr/bin/env Missing
+**What goes wrong:** Scripts with `#!/usr/bin/env bash` shebangs fail.
+**Why it happens:** tmpfs root has no /usr/bin/env. Many scripts and Node.js npm scripts use this shebang.
+**How to avoid:** `--symlink /usr/bin/env "$(which env)"` or `--symlink $(which env) /usr/bin/env`. bwrap supports `--symlink` to create symlinks inside the sandbox.
+**Warning signs:** "bad interpreter: /usr/bin/env: no such file or directory".
+[ASSUMED: standard issue with minimal sandboxes]
+
+## Code Examples
+
+### Flake Structure
+```nix
+# flake.nix
+{
+  description = "claudebox - sandboxed Claude Code";
+
+  inputs = {
+    nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
+    nix-index-database = {
+      url = "github:nix-community/nix-index-database";
+      inputs.nixpkgs.follows = "nixpkgs";
+    };
+  };
+
+  outputs = { self, nixpkgs, nix-index-database, ... }:
+    let
+      system = "x86_64-linux";
+      pkgs = nixpkgs.legacyPackages.${system};
+      comma-with-db = nix-index-database.packages.${system}.comma-with-db;
+    in {
+      packages.${system} = {
+        claudebox = pkgs.writeShellApplication {
+          name = "claudebox";
+          runtimeInputs = [
+            pkgs.bubblewrap
+            pkgs.coreutils
+            pkgs.git
+            pkgs.curl
+            pkgs.jq
+            pkgs.ripgrep
+            pkgs.fd
+            pkgs.nix
+            comma-with-db
+            pkgs.bash
+            pkgs.nodejs
+          ];
+          text = builtins.readFile ./claudebox.sh;
+        };
+        default = self.packages.${system}.claudebox;
+      };
+    };
+}
+```
+[ASSUMED: flake structure based on standard nixpkgs patterns]
+
+### Signal Handling and Exit Code
+```bash
+# At the end of claudebox.sh -- exec replaces the shell process
+# so signals go directly to bwrap->claude, and exit code passes through
+exec bwrap \
+  ... \
+  -- "$CLAUDE_BIN" --dangerously-skip-permissions "$@"
+```
+[VERIFIED: exec ensures PID 1 in the script is bwrap, Ctrl+C propagates to children]
+
+### /usr/bin/env Symlink
+```bash
+# In the bwrap args -- coreutils provides env
+--symlink "$(command -v env)" /usr/bin/env
+```
+Note: `--symlink` creates `TARGET LINK_NAME` (dest is the symlink path). The `env` binary is in coreutils which is in the sandbox PATH.
+[ASSUMED: bwrap --symlink syntax]
+
+## State of the Art
+
+| Old Approach | Current Approach | When Changed | Impact |
+|--------------|------------------|--------------|--------|
+| `legacyPackages` for comma-with-db | `packages` output | Recent | Must use `nix-index-database.packages.${system}.comma-with-db` [VERIFIED: deprecation warning on legacyPackages] |
+| claude-code from npm | claude-code from nixpkgs | 2025 | Available as `pkgs.claude-code` but version 2.0.51 vs host's 2.1.70 [VERIFIED: nix eval] |
+| bwrap 0.9.x | bwrap 0.11.0 | 2025 | Current nixpkgs has 0.11.0 [VERIFIED: nix eval + host binary] |
+
+## Assumptions Log
+
+| # | Claim | Section | Risk if Wrong |
+|---|-------|---------|---------------|
+| A1 | bwrap processes mounts left-to-right, later mounts shadow earlier | Pitfalls #3 | Wrong mount ordering could hide CWD |
+| A2 | /etc/passwd and /etc/group are needed for user resolution | Pitfalls #6 | Tools might fail with "no name" if omitted |
+| A3 | `--symlink` creates symlinks inside sandbox with syntax `--symlink TARGET LINKNAME` | Code Examples | /usr/bin/env shebang scripts would fail if wrong |
+| A4 | jq, ripgrep, fd are standard nixpkgs packages | Standard Stack | Build would fail if package names differ |
+| A5 | flake.nix structure with writeShellApplication + builtins.readFile | Code Examples | Nix build would fail if API differs |
+
+## Open Questions (RESOLVED)
+
+1. **Claude Code source: host vs flake input**
+   - RESOLVED: Discover claude from host PATH at runtime (`CLAUDE_BIN=$(command -v claude)`). This avoids version management and respects the host's claude-code configuration. The script fails fast with a clear error if `claude` is not found.
+
+2. **XDG_RUNTIME_DIR inside sandbox**
+   - RESOLVED: Set `--setenv XDG_RUNTIME_DIR /tmp` inside the sandbox (D-04 says sandbox-generated). Don't mount the host's runtime dir as it may contain secret sockets.
+
+3. **`~/.claudebox` creation**
+   - RESOLVED: Script does `mkdir -p ~/.claudebox` before bwrap invocation if it doesn't exist.
+
+## Environment Availability
+
+| Dependency | Required By | Available | Version | Fallback |
+|------------|------------|-----------|---------|----------|
+| bubblewrap | Sandbox core | Yes | 0.11.0 | -- |
+| nix | Package management | Yes | Lix 2.93.3 | -- |
+| git | VCS operations | Yes | available on host | -- |
+| curl | HTTP requests | Yes | 8.17.0 in nixpkgs | -- |
+| nodejs | Claude Code runtime | Yes | 24.13.0 | -- |
+| claude-code | The wrapped tool | Yes | 2.1.70 on host | nixpkgs 2.0.51 |
+| comma-with-db | On-demand packages | Yes | 2.3.3 via flake | -- |
+| Nix daemon socket | nix shell/comma | Yes | /nix/var/nix/daemon-socket/socket | -- |
+
+**Missing dependencies with no fallback:** None.
+
+## Project Constraints (from CLAUDE.md)
+
+- **Stack:** Nix derivation + shell script only. No Docker, systemd, or external dependencies beyond nixpkgs.
+- **Sandbox:** Own bwrap call. Not delegating to Claude Code's `--sandbox` or Nix's build sandbox.
+- **Env model:** Allowlist, not denylist. Start empty, add explicitly.
+- **Commits:** Conventional commits, minimal/succinct messages.
+- **NixOS:** Changes go through the flake.
+
+## Sources
+
+### Primary (HIGH confidence)
+- Host bubblewrap 0.11.0 -- `bwrap --version`, `bwrap --help`, live sandbox tests
+- Host Nix/Lix 2.93.3 -- `nix --version`, `nix eval` commands
+- nixpkgs bubblewrap -- `nix eval nixpkgs#bubblewrap.version` = "0.11.0"
+- nix-index-database flake -- `nix eval` + `nix flake show` confirmed `packages.x86_64-linux.comma-with-db` (2.3.3)
+- Claude Code binary inspection -- wrapper chain confirmed: bash -> bash (env setup) -> node cli.js
+- NixOS /etc structure -- live inspection of symlink chains for resolv.conf, ssl, hosts, nsswitch.conf
+- Live sandbox tests -- confirmed: clearenv, tmpfs root, nix store mount, daemon socket, DNS resolution, SSL (with /etc/static)
+
+### Secondary (MEDIUM confidence)
+- Host `/etc/nix/nix.conf` -- confirmed experimental-features setting needed inside sandbox
+- Host `~/.claude/` directory -- confirmed .credentials.json, config/, history.jsonl structure
+
+### Tertiary (LOW confidence)
+- bwrap `--symlink` syntax -- from training data, not tested in this session
+
+## Metadata
+
+**Confidence breakdown:**
+- Standard stack: HIGH -- all packages verified in nixpkgs and on host
+- Architecture: HIGH -- core patterns verified with live sandbox tests
+- Pitfalls: HIGH -- most pitfalls discovered and verified through testing
+- Flake structure: MEDIUM -- writeShellApplication API assumed from training, not doc-verified
+
+**Research date:** 2026-04-09
+**Valid until:** 2026-05-09 (stable tools, 30-day window)
diff --git a/.planning/phases/02-env-audit-and-cli-polish/02-01-PLAN.md b/.planning/phases/02-env-audit-and-cli-polish/02-01-PLAN.md
new file mode 100644
index 0000000..581338f
--- /dev/null
+++ b/.planning/phases/02-env-audit-and-cli-polish/02-01-PLAN.md
@@ -0,0 +1,296 @@
+---
+phase: 02-env-audit-and-cli-polish
+plan: 01
+type: execute
+wave: 1
+depends_on: []
+files_modified:
+  - claudebox.sh
+autonomous: true
+requirements:
+  - UX-03
+  - UX-04
+  - UX-05
+must_haves:
+  truths:
+    - "claudebox --check reports pass/fail for bwrap, claude, git, nix, ~/.claudebox, and ANTHROPIC_API_KEY"
+    - "claudebox --dry-run prints the full bwrap command with all flags and exits without executing"
+    - "claudebox --yes -y flags are recognized and stored for Plan 02 audit skip"
+    - "Unknown flags are passed through to claude in CLAUDE_ARGS array"
+    - "Flag parsing handles multiple claudebox flags in any order"
+  artifacts:
+    - path: "claudebox.sh"
+      provides: "Refactored flag parsing, --check, --dry-run"
+      contains: "CHECK_MODE|DRY_RUN|SKIP_AUDIT|CLAUDE_ARGS"
+  key_links:
+    - from: "claudebox.sh flag parsing"
+      to: "SANDBOX_CMD construction"
+      via: "CLAUDE_ARGS array replaces raw $@"
+      pattern: "CLAUDE_ARGS"
+---
+
+<objective>
+Refactor claudebox.sh flag parsing to support multiple flags, then add --check and --dry-run early-exit modes.
+
+Purpose: Foundation for all Phase 2 CLI flags. The current single-flag parser (for/shift/break) cannot handle multiple claudebox flags. This plan refactors it to a while/shift loop that collects claudebox flags and accumulates remaining args in CLAUDE_ARGS. Then adds two diagnostic modes that exit before sandbox launch.
+
+Output: claudebox.sh with working --check, --dry-run, and flag scaffolding for --yes/-y.
+</objective>
+
+<execution_context>
+@$HOME/.claude/get-shit-done/workflows/execute-plan.md
+@$HOME/.claude/get-shit-done/templates/summary.md
+</execution_context>
+
+<context>
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@.planning/phases/02-env-audit-and-cli-polish/02-CONTEXT.md
+@.planning/phases/02-env-audit-and-cli-polish/02-RESEARCH.md
+
+@claudebox.sh
+@flake.nix
+
+<interfaces>
+<!-- Current claudebox.sh flag parsing (lines 1-9) will be replaced -->
+<!-- Current SANDBOX_CMD uses $@ directly (lines 70-74) -- must switch to CLAUDE_ARGS -->
+<!-- ENV_ARGS array (lines 39-67) is NOT modified in this plan -->
+<!-- exec bwrap (lines 77-100) is the target for --dry-run print -->
+</interfaces>
+</context>
+
+<tasks>
+
+<task type="auto">
+  <name>Task 1: Refactor flag parsing to while/shift with CLAUDE_ARGS accumulator</name>
+  <files>claudebox.sh</files>
+  <read_first>claudebox.sh</read_first>
+  <action>
+Replace the current flag parsing block (lines 1-9) with a while/shift loop that handles multiple claudebox flags. Define these variables at the top of the script:
+
+```bash
+SKIP_AUDIT=false
+DRY_RUN=false
+CHECK_MODE=false
+SHELL_MODE=false
+CLAUDE_ARGS=()
+
+while (( $# > 0 )); do
+  case "$1" in
+    --yes|-y) SKIP_AUDIT=true ;;
+    --dry-run) DRY_RUN=true ;;
+    --check) CHECK_MODE=true ;;
+    --shell) SHELL_MODE=true ;;
+    --) shift; CLAUDE_ARGS+=("$@"); break ;;
+    *) CLAUDE_ARGS+=("$1") ;;
+  esac
+  shift
+done
+```
+
+Per D-08, --yes/-y sets SKIP_AUDIT=true (consumed by Plan 02's audit display).
+
+Then update the SANDBOX_CMD construction (currently lines 70-74) to use CLAUDE_ARGS instead of $@:
+
+```bash
+if [[ "$SHELL_MODE" == true ]]; then
+  SANDBOX_CMD=("$SANDBOX_BASH" "${CLAUDE_ARGS[@]}")
+else
+  SANDBOX_CMD=("$CLAUDE_BIN" --dangerously-skip-permissions "${CLAUDE_ARGS[@]}")
+fi
+```
+
+This ensures unknown flags like `--model sonnet` pass through to claude correctly.
+  </action>
+  <verify>
+    <automated>cd /home/toph/code/tools/claudebox && grep -q 'SKIP_AUDIT=false' claudebox.sh && grep -q 'DRY_RUN=false' claudebox.sh && grep -q 'CHECK_MODE=false' claudebox.sh && grep -q 'CLAUDE_ARGS' claudebox.sh && grep -q 'while (( \$# > 0 ))' claudebox.sh && echo "PASS: flag parsing refactored"</automated>
+  </verify>
+  <acceptance_criteria>
+    - claudebox.sh contains `SKIP_AUDIT=false` variable declaration
+    - claudebox.sh contains `DRY_RUN=false` variable declaration
+    - claudebox.sh contains `CHECK_MODE=false` variable declaration
+    - claudebox.sh contains `SHELL_MODE=false` variable declaration
+    - claudebox.sh contains `CLAUDE_ARGS=()` array declaration
+    - claudebox.sh contains `while (( $# > 0 ))` loop (not `for arg in`)
+    - claudebox.sh contains `CLAUDE_ARGS+=("$1")` for unknown flag accumulation
+    - SANDBOX_CMD uses `"${CLAUDE_ARGS[@]}"` not `"$@"`
+    - The old `for arg in "$@"` parsing block is removed
+  </acceptance_criteria>
+  <done>Flag parsing handles --yes, -y, --dry-run, --check, --shell, and passes unknown flags to CLAUDE_ARGS. SANDBOX_CMD uses CLAUDE_ARGS.</done>
+</task>
+
+<task type="auto">
+  <name>Task 2: Add --dry-run mode that prints full bwrap command and exits</name>
+  <files>claudebox.sh</files>
+  <read_first>claudebox.sh</read_first>
+  <action>
+Per D-09 (UX-04), add a --dry-run handler after the SANDBOX_CMD construction and before the `exec bwrap` line. When DRY_RUN=true, print the complete bwrap invocation in a readable multiline format to stderr, then exit 0. Do NOT prompt for confirmation (--dry-run implies --yes per research open question 1).
+
+Insert this block right before `exec bwrap`:
+
+```bash
+# --dry-run: print the bwrap command without executing (D-09, UX-04)
+if [[ "$DRY_RUN" == true ]]; then
+  {
+    echo "bwrap \\"
+    echo "  --clearenv \\"
+    # Print each ENV_ARGS entry
+    local_i=0
+    while (( local_i < ${#ENV_ARGS[@]} )); do
+      printf '  %s %s %q \\\n' "${ENV_ARGS[$local_i]}" "${ENV_ARGS[$((local_i+1))]}" "${ENV_ARGS[$((local_i+2))]}"
+      (( local_i += 3 ))
+    done
+    echo "  --tmpfs / \\"
+    echo "  --proc /proc \\"
+    echo "  --dev /dev \\"
+    echo "  --tmpfs /tmp \\"
+    echo "  --ro-bind /nix/store /nix/store \\"
+    echo "  --bind /nix/var/nix /nix/var/nix \\"
+    echo "  --ro-bind /etc/resolv.conf /etc/resolv.conf \\"
+    echo "  --ro-bind /etc/ssl /etc/ssl \\"
+    echo "  --ro-bind /etc/static /etc/static \\"
+    echo "  --ro-bind /etc/passwd /etc/passwd \\"
+    echo "  --ro-bind /etc/group /etc/group \\"
+    echo "  --ro-bind /etc/hosts /etc/hosts \\"
+    echo "  --ro-bind /etc/nsswitch.conf /etc/nsswitch.conf \\"
+    echo "  --ro-bind /etc/nix /etc/nix \\"
+    printf '  --symlink %q /usr/bin/env \\\n' "$(readlink -f "$(command -v env)")"
+    echo "  --tmpfs $HOME \\"
+    echo "  --bind $HOME/.claudebox $HOME/.claude \\"
+    printf '  --ro-bind %q %s/.gitconfig \\\n' "$GITCONFIG_TMP" "$HOME"
+    echo "  --bind $CWD $CWD \\"
+    echo "  --chdir $CWD \\"
+    printf '  -- %s\n' "${SANDBOX_CMD[*]}"
+  } >&2
+  exit 0
+fi
+```
+
+Note: The ENV_ARGS array is structured as triplets (--setenv key value), so iterate in steps of 3. Use `printf %q` for values that might contain special chars. The filesystem mount flags mirror the actual `exec bwrap` call exactly -- if the bwrap call changes, this must be updated too.
+
+Important: Since writeShellApplication uses `set -euo pipefail` and shellcheck, avoid using `local` outside functions. Use a unique variable name like `dry_run_i` instead of `local_i` for the loop counter.
+  </action>
+  <verify>
+    <automated>cd /home/toph/code/tools/claudebox && grep -q 'DRY_RUN.*true' claudebox.sh && grep -q 'dry.run' claudebox.sh && grep -c 'exit 0' claudebox.sh | grep -q '[1-9]' && echo "PASS: dry-run handler present"</automated>
+  </verify>
+  <acceptance_criteria>
+    - claudebox.sh contains an `if [[ "$DRY_RUN" == true ]]` block
+    - The dry-run block prints `bwrap \` as first line to stderr
+    - The dry-run block prints `--clearenv \` on its own line
+    - The dry-run block iterates ENV_ARGS in steps of 3 to print --setenv triplets
+    - The dry-run block prints all filesystem mount flags matching the actual exec bwrap call
+    - The dry-run block ends with `exit 0`
+    - All dry-run output goes to stderr (>&2)
+    - `nix build` succeeds (shellcheck passes)
+  </acceptance_criteria>
+  <done>Running `claudebox --dry-run` prints the full bwrap command in multiline format to stderr and exits 0 without launching the sandbox.</done>
+</task>
+
+<task type="auto">
+  <name>Task 3: Add --check mode that verifies prerequisites and exits</name>
+  <files>claudebox.sh</files>
+  <read_first>claudebox.sh</read_first>
+  <action>
+Per D-10 (UX-05), add a --check handler as an early exit right after flag parsing and before any binary resolution or env construction. When CHECK_MODE=true, run diagnostic checks and exit.
+
+Insert this block immediately after the flag parsing while/shift loop:
+
+```bash
+# --check: verify prerequisites and exit (D-10, UX-05)
+if [[ "$CHECK_MODE" == true ]]; then
+  pass=true
+  green=$'\033[32m' red=$'\033[31m' yellow=$'\033[33m' reset=$'\033[0m'
+
+  check_cmd() {
+    if command -v "$1" &>/dev/null; then
+      echo "${green}OK${reset}    $1" >&2
+    else
+      echo "${red}FAIL${reset}  $1 -- not found" >&2
+      pass=false
+    fi
+  }
+
+  echo "claudebox prerequisites:" >&2
+  echo "" >&2
+  check_cmd bwrap
+  check_cmd claude
+  check_cmd git
+  check_cmd curl
+  check_cmd nix
+
+  if [[ -d "$HOME/.claudebox" ]]; then
+    echo "${green}OK${reset}    ~/.claudebox exists" >&2
+  else
+    echo "${red}FAIL${reset}  ~/.claudebox -- not found (will be created on first run)" >&2
+  fi
+
+  if [[ -v ANTHROPIC_API_KEY ]]; then
+    echo "${green}OK${reset}    ANTHROPIC_API_KEY is set" >&2
+  else
+    echo "${yellow}WARN${reset}  ANTHROPIC_API_KEY is not set" >&2
+  fi
+
+  echo "" >&2
+  if [[ "$pass" == true ]]; then
+    echo "${green}All checks passed.${reset}" >&2
+    exit 0
+  else
+    echo "${red}Some checks failed.${reset}" >&2
+    exit 1
+  fi
+fi
+```
+
+The check verifies: bwrap, claude, git, curl, nix (required binaries), ~/.claudebox directory, and ANTHROPIC_API_KEY (warn only, not a hard failure). Exit 0 if all required checks pass, exit 1 if any fail. All output to stderr per D-07.
+  </action>
+  <verify>
+    <automated>cd /home/toph/code/tools/claudebox && grep -q 'CHECK_MODE.*true' claudebox.sh && grep -q 'check_cmd' claudebox.sh && grep -q 'prerequisites' claudebox.sh && echo "PASS: check handler present"</automated>
+  </verify>
+  <acceptance_criteria>
+    - claudebox.sh contains an `if [[ "$CHECK_MODE" == true ]]` block
+    - The check block is positioned BEFORE binary resolution (`command -v bash`, `command -v claude`)
+    - The check block tests for: bwrap, claude, git, curl, nix via `command -v`
+    - The check block tests for `$HOME/.claudebox` directory existence
+    - The check block tests for ANTHROPIC_API_KEY with WARN (not FAIL)
+    - The check block prints colored OK/FAIL/WARN indicators to stderr
+    - The check block exits 0 on all-pass, exits 1 on any FAIL
+    - `nix build` succeeds (shellcheck passes)
+  </acceptance_criteria>
+  <done>Running `claudebox --check` prints pass/fail for each prerequisite to stderr and exits with appropriate code.</done>
+</task>
+
+</tasks>
+
+<threat_model>
+## Trust Boundaries
+
+| Boundary | Description |
+|----------|-------------|
+| host env -> --dry-run output | Env var values printed to stderr may contain secrets |
+
+## STRIDE Threat Register
+
+| Threat ID | Category | Component | Disposition | Mitigation Plan |
+|-----------|----------|-----------|-------------|-----------------|
+| T-02-01 | Information Disclosure | --dry-run printing env values | accept | --dry-run shows env values unmasked because the user explicitly asked to see the full command. Plan 02 adds masking for the audit display. The dry-run user is debugging, not reviewing for leaks. |
+| T-02-02 | Information Disclosure | --check printing ANTHROPIC_API_KEY status | mitigate | Only print presence/absence, never the value. Use `[[ -v ANTHROPIC_API_KEY ]]` not echo of value. |
+</threat_model>
+
+<verification>
+1. `grep -c 'CLAUDE_ARGS' claudebox.sh` returns >= 3 (declaration + accumulation + usage)
+2. `grep 'DRY_RUN\|CHECK_MODE\|SKIP_AUDIT' claudebox.sh` shows all three flag variables
+3. `nix build` succeeds (shellcheck validation)
+</verification>
+
+<success_criteria>
+- Flag parsing supports --yes, -y, --dry-run, --check, --shell, and -- separator
+- Unknown flags accumulate in CLAUDE_ARGS and pass through to claude
+- --check exits early with colored diagnostic output
+- --dry-run prints full bwrap command and exits
+- shellcheck passes via nix build
+</success_criteria>
+
+<output>
+After completion, create `.planning/phases/02-env-audit-and-cli-polish/02-01-SUMMARY.md`
+</output>
diff --git a/.planning/phases/02-env-audit-and-cli-polish/02-01-SUMMARY.md b/.planning/phases/02-env-audit-and-cli-polish/02-01-SUMMARY.md
new file mode 100644
index 0000000..18a0420
--- /dev/null
+++ b/.planning/phases/02-env-audit-and-cli-polish/02-01-SUMMARY.md
@@ -0,0 +1,57 @@
+---
+phase: 02-env-audit-and-cli-polish
+plan: 01
+subsystem: cli
+tags: [flag-parsing, dry-run, check-mode, cli-ux]
+dependency_graph:
+  requires: []
+  provides: [SKIP_AUDIT, DRY_RUN, CHECK_MODE, CLAUDE_ARGS]
+  affects: [02-02]
+tech_stack:
+  added: []
+  patterns: [while-shift-flag-parsing, early-exit-modes]
+key_files:
+  modified: [claudebox.sh]
+decisions:
+  - "export SKIP_AUDIT to satisfy shellcheck SC2034 (unused variable) since Plan 02 consumes it"
+metrics:
+  duration: 2min
+  completed: "2026-04-09T15:11:34Z"
+  tasks: 3
+  files: 1
+---
+
+# Phase 02 Plan 01: Flag Parsing and CLI Modes Summary
+
+Refactored claudebox flag parsing from single-flag for/shift/break to multi-flag while/shift with CLAUDE_ARGS accumulator, then added --check and --dry-run early-exit diagnostic modes.
+
+## Completed Tasks
+
+| # | Task | Commit | Key Changes |
+|---|------|--------|-------------|
+| 1 | Refactor flag parsing to while/shift with CLAUDE_ARGS | `07096ae` | Replaced for-loop with while/shift, added SKIP_AUDIT/DRY_RUN/CHECK_MODE flags, CLAUDE_ARGS accumulator |
+| 2 | Add --dry-run mode | `3903667` | Prints full bwrap command with all env and mount flags to stderr, exits 0 |
+| 3 | Add --check mode | `cc6bd5b` | Verifies bwrap/claude/git/curl/nix binaries, ~/.claudebox dir, ANTHROPIC_API_KEY presence |
+
+## Deviations from Plan
+
+### Auto-fixed Issues
+
+**1. [Rule 3 - Blocking] shellcheck SC2034 for SKIP_AUDIT**
+- **Found during:** Task 3 (nix build verification)
+- **Issue:** shellcheck flagged SKIP_AUDIT as unused since Plan 02 hasn't consumed it yet
+- **Fix:** Added `export SKIP_AUDIT` after flag parsing loop with comment noting Plan 02 dependency
+- **Files modified:** claudebox.sh
+- **Commit:** `cc6bd5b`
+
+## Verification Results
+
+- `grep -c CLAUDE_ARGS claudebox.sh` returns 5 (declaration + 2 accumulations + 2 usages)
+- All three flag variables (SKIP_AUDIT, DRY_RUN, CHECK_MODE) present in claudebox.sh
+- `nix build` succeeds (shellcheck validation passes)
+
+## Threat Surface Scan
+
+T-02-02 mitigated: --check mode only tests `[[ -v ANTHROPIC_API_KEY ]]` for presence, never prints the value.
+
+## Self-Check: PASSED
diff --git a/.planning/phases/02-env-audit-and-cli-polish/02-02-PLAN.md b/.planning/phases/02-env-audit-and-cli-polish/02-02-PLAN.md
new file mode 100644
index 0000000..6883028
--- /dev/null
+++ b/.planning/phases/02-env-audit-and-cli-polish/02-02-PLAN.md
@@ -0,0 +1,325 @@
+---
+phase: 02-env-audit-and-cli-polish
+plan: 02
+type: execute
+wave: 2
+depends_on:
+  - "02-01"
+files_modified:
+  - claudebox.sh
+autonomous: true
+requirements:
+  - UX-01
+  - UX-02
+must_haves:
+  truths:
+    - "Running claudebox without --yes shows all env vars grouped by source before launching"
+    - "Env vars are grouped into Sandbox-generated, Host (allowlisted), and Extra (CLAUDEBOX_EXTRA_ENV) sections"
+    - "PATH is displayed split by colon with one entry per line"
+    - "Values matching *KEY*, *TOKEN*, *SECRET*, *PASSWORD*, *CREDENTIAL* are auto-masked"
+    - "User sees a Proceed? [Y/n] prompt and can abort by typing n"
+    - "Non-interactive stdin (piped, CI) aborts with error telling user to pass --yes/-y"
+    - "All audit output goes to stderr, stdout stays clean"
+  artifacts:
+    - path: "claudebox.sh"
+      provides: "Env audit display, masking, confirmation prompt"
+      contains: "mask_value|print_audit|Proceed"
+  key_links:
+    - from: "claudebox.sh env audit"
+      to: "claudebox.sh SKIP_AUDIT flag"
+      via: "if [[ $SKIP_AUDIT != true ]]"
+      pattern: "SKIP_AUDIT"
+    - from: "claudebox.sh audit display"
+      to: "ENV_ARGS array"
+      via: "parallel display arrays populated during env construction"
+      pattern: "AUDIT_SANDBOX\\|AUDIT_HOST\\|AUDIT_EXTRA"
+---
+
+<objective>
+Add pre-launch env audit display with grouped sections, value masking, and confirmation prompt to claudebox.sh.
+
+Purpose: Transparency before sandbox launch. The user sees exactly what environment enters the sandbox, with sensitive values masked, and can abort if something looks wrong. Non-interactive environments are forced to use --yes.
+
+Output: claudebox.sh with full env audit display and interactive confirmation, skippable via --yes/-y (flag already parsed by Plan 01).
+</objective>
+
+<execution_context>
+@$HOME/.claude/get-shit-done/workflows/execute-plan.md
+@$HOME/.claude/get-shit-done/templates/summary.md
+</execution_context>
+
+<context>
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/phases/02-env-audit-and-cli-polish/02-CONTEXT.md
+@.planning/phases/02-env-audit-and-cli-polish/02-RESEARCH.md
+@.planning/phases/02-env-audit-and-cli-polish/02-01-SUMMARY.md
+
+@claudebox.sh
+@flake.nix
+
+<interfaces>
+<!-- From Plan 01: SKIP_AUDIT variable is set by flag parser (true when --yes/-y passed) -->
+<!-- From Plan 01: DRY_RUN variable (dry-run implies skip audit) -->
+<!-- ENV_ARGS array contains --setenv key value triplets for bwrap -->
+<!-- HOST_ALLOWLIST array lists host-passed var names -->
+<!-- CLAUDEBOX_EXTRA_ENV parsing block handles comma-separated extras -->
+<!-- SANDBOX_PATH contains the colon-separated PATH for inside sandbox -->
+</interfaces>
+</context>
+
+<tasks>
+
+<task type="auto">
+  <name>Task 1: Add parallel display arrays and env audit display function</name>
+  <files>claudebox.sh</files>
+  <read_first>claudebox.sh</read_first>
+  <action>
+Add parallel display data structures alongside the existing ENV_ARGS construction, then add a display function. This task implements D-01, D-02, D-03, D-04, D-07.
+
+**Step 1: Add ANSI color constants and mask_value function near the top of the script (after flag parsing, before binary resolution):**
+
+```bash
+# ANSI formatting (D-03)
+if [[ -t 2 ]] && [[ "${NO_COLOR:-}" == "" ]]; then
+  BOLD=$'\033[1m'
+  RESET=$'\033[0m'
+  DIM=$'\033[2m'
+  CYAN=$'\033[36m'
+  YELLOW=$'\033[33m'
+  GREEN=$'\033[32m'
+  RED=$'\033[31m'
+else
+  BOLD="" RESET="" DIM="" CYAN="" YELLOW="" GREEN="" RED=""
+fi
+
+# Mask sensitive values (D-04)
+mask_value() {
+  local name="$1" value="$2"
+  local upper="${name^^}"
+  if [[ "$upper" == *KEY* || "$upper" == *TOKEN* || "$upper" == *SECRET* || "$upper" == *PASSWORD* || "$upper" == *CREDENTIAL* ]]; then
+    if (( ${#value} > 11 )); then
+      echo "${value:0:7}...${value: -4}"
+    else
+      echo "***"
+    fi
+  else
+    echo "$value"
+  fi
+}
+```
+
+**Step 2: Add display tracking arrays.** Declare these right before the ENV_ARGS construction block:
+
+```bash
+# Parallel display data for env audit (D-01)
+declare -a AUDIT_SANDBOX_KEYS=()
+declare -A AUDIT_SANDBOX_VALS=()
+declare -a AUDIT_HOST_KEYS=()
+declare -A AUDIT_HOST_VALS=()
+declare -a AUDIT_EXTRA_KEYS=()
+declare -A AUDIT_EXTRA_VALS=()
+```
+
+**Step 3: Populate display arrays alongside ENV_ARGS.** After each --setenv addition to ENV_ARGS, also record in the audit arrays.
+
+For sandbox-generated vars, after the ENV_ARGS=(...) block, add:
+```bash
+AUDIT_SANDBOX_KEYS=(HOME USER PATH SHELL TMPDIR XDG_RUNTIME_DIR NIX_SSL_CERT_FILE SSL_CERT_FILE)
+AUDIT_SANDBOX_VALS[HOME]="$HOME"
+AUDIT_SANDBOX_VALS[USER]="$USER"
+AUDIT_SANDBOX_VALS[PATH]="$SANDBOX_PATH"
+AUDIT_SANDBOX_VALS[SHELL]="$SANDBOX_BASH"
+AUDIT_SANDBOX_VALS[TMPDIR]="/tmp"
+AUDIT_SANDBOX_VALS[XDG_RUNTIME_DIR]="/tmp"
+AUDIT_SANDBOX_VALS[NIX_SSL_CERT_FILE]="/etc/ssl/certs/ca-certificates.crt"
+AUDIT_SANDBOX_VALS[SSL_CERT_FILE]="/etc/ssl/certs/ca-certificates.crt"
+```
+
+For host allowlisted vars, inside the existing HOST_ALLOWLIST loop, add after each `ENV_ARGS+=`:
+```bash
+AUDIT_HOST_KEYS+=("$var")
+AUDIT_HOST_VALS[$var]="${!var}"
+```
+
+For CLAUDEBOX_EXTRA_ENV vars, inside the existing extras loop, add after each `ENV_ARGS+=`:
+```bash
+AUDIT_EXTRA_KEYS+=("$var")
+AUDIT_EXTRA_VALS[$var]="${!var}"
+```
+
+**Step 4: Add the print_audit function** (after the display arrays are populated, before the dry-run check):
+
+```bash
+# Env audit display (D-01, D-02, D-03, D-04, D-07, UX-01)
+print_audit() {
+  echo "${BOLD}${CYAN}=== Sandbox Environment ===${RESET}" >&2
+  echo "" >&2
+
+  # Sandbox-generated (D-01)
+  echo "${BOLD}Sandbox-generated:${RESET}" >&2
+  for var in "${AUDIT_SANDBOX_KEYS[@]}"; do
+    if [[ "$var" == "PATH" ]]; then
+      echo "  ${GREEN}PATH=${RESET}" >&2
+      IFS=':' read -ra path_entries <<< "${AUDIT_SANDBOX_VALS[PATH]}"
+      for entry in "${path_entries[@]}"; do
+        echo "    ${DIM}${entry}${RESET}" >&2
+      done
+    else
+      echo "  ${GREEN}${var}=${RESET}$(mask_value "$var" "${AUDIT_SANDBOX_VALS[$var]}")" >&2
+    fi
+  done
+  echo "" >&2
+
+  # Host allowlisted (D-01)
+  if (( ${#AUDIT_HOST_KEYS[@]} > 0 )); then
+    echo "${BOLD}Host (allowlisted):${RESET}" >&2
+    for var in "${AUDIT_HOST_KEYS[@]}"; do
+      echo "  ${YELLOW}${var}=${RESET}$(mask_value "$var" "${AUDIT_HOST_VALS[$var]}")" >&2
+    done
+    echo "" >&2
+  fi
+
+  # Extra from CLAUDEBOX_EXTRA_ENV (D-01)
+  if (( ${#AUDIT_EXTRA_KEYS[@]} > 0 )); then
+    echo "${BOLD}Extra (CLAUDEBOX_EXTRA_ENV):${RESET}" >&2
+    for var in "${AUDIT_EXTRA_KEYS[@]}"; do
+      echo "  ${YELLOW}${var}=${RESET}$(mask_value "$var" "${AUDIT_EXTRA_VALS[$var]}")" >&2
+    done
+    echo "" >&2
+  fi
+}
+```
+
+All output goes to stderr per D-07.
+  </action>
+  <verify>
+    <automated>cd /home/toph/code/tools/claudebox && grep -q 'mask_value' claudebox.sh && grep -q 'print_audit' claudebox.sh && grep -q 'AUDIT_SANDBOX_KEYS' claudebox.sh && grep -q 'AUDIT_HOST_KEYS' claudebox.sh && grep -q 'AUDIT_EXTRA_KEYS' claudebox.sh && grep -q 'NO_COLOR' claudebox.sh && echo "PASS: audit display infrastructure present"</automated>
+  </verify>
+  <acceptance_criteria>
+    - claudebox.sh contains `mask_value()` function
+    - mask_value checks for KEY, TOKEN, SECRET, PASSWORD, CREDENTIAL (case-insensitive via `${name^^}`)
+    - mask_value shows first 7 + last 4 chars with `...` for values longer than 11
+    - mask_value shows `***` for values 11 chars or shorter
+    - claudebox.sh contains `print_audit()` function
+    - print_audit displays three sections: "Sandbox-generated:", "Host (allowlisted):", "Extra (CLAUDEBOX_EXTRA_ENV):"
+    - PATH display splits by colon with one entry per line (indented)
+    - All audit output uses `>&2` for stderr
+    - ANSI colors are suppressed when stderr is not a TTY or NO_COLOR is set
+    - AUDIT_SANDBOX_KEYS, AUDIT_HOST_KEYS, AUDIT_EXTRA_KEYS arrays exist and are populated
+    - `nix build` succeeds (shellcheck passes)
+  </acceptance_criteria>
+  <done>Env audit display function exists with grouped sections, PATH splitting, value masking, and ANSI formatting. Display data is tracked in parallel arrays alongside ENV_ARGS.</done>
+</task>
+
+<task type="auto">
+  <name>Task 2: Add confirmation prompt with TTY detection and wire audit into launch flow</name>
+  <files>claudebox.sh</files>
+  <read_first>claudebox.sh</read_first>
+  <action>
+Wire the audit display and confirmation prompt into the launch flow. This implements D-05, D-06, UX-02.
+
+Insert the following block AFTER print_audit is defined and BEFORE the --dry-run check (or before `exec bwrap` if dry-run check is already there). The audit+prompt should run before dry-run because dry-run implies --yes.
+
+Actually, looking at the flow: --dry-run should skip the audit (per research recommendation). So the order should be:
+
+1. If DRY_RUN is true, skip audit (it implies --yes)
+2. If SKIP_AUDIT is NOT true and NOT DRY_RUN, show audit and prompt
+
+Add this block after the env construction and print_audit function definition, before the dry-run check:
+
+```bash
+# Env audit and confirmation (D-05, D-06, D-07, UX-01, UX-02, UX-03)
+if [[ "$SKIP_AUDIT" != true && "$DRY_RUN" != true ]]; then
+  print_audit
+
+  # TTY check (D-06)
+  if [[ -t 0 ]]; then
+    read -r -p "Proceed? [Y/n] " response < /dev/tty 2>&1
+    response="${response,,}"  # lowercase
+    if [[ "$response" == "n" || "$response" == "no" ]]; then
+      echo "Aborted." >&2
+      exit 1
+    fi
+  else
+    echo "${RED}Error: stdin is not a terminal. Pass --yes or -y to skip confirmation.${RESET}" >&2
+    exit 1
+  fi
+fi
+```
+
+Key details:
+- `Proceed? [Y/n]` -- default is proceed, Enter launches (D-05)
+- Only `n` or `no` aborts. Any other input (including empty/Enter) proceeds.
+- Non-TTY stdin aborts with actionable error message (D-06)
+- `read -r -p` with `< /dev/tty` to handle stdin being consumed by pipes
+- `${response,,}` lowercases the input for case-insensitive comparison
+- The `2>&1` on read sends the prompt text to wherever stdout goes (which combined with `< /dev/tty` reads from terminal)
+
+Wait -- the prompt from `read -p` goes to stderr by default in some shells but stdout in bash. To ensure the prompt goes to stderr per D-07, use:
+
+```bash
+echo -n "Proceed? [Y/n] " >&2
+read -r response < /dev/tty
+```
+
+This is cleaner and guarantees stderr for the prompt.
+  </action>
+  <verify>
+    <automated>cd /home/toph/code/tools/claudebox && grep -q 'Proceed.*Y/n' claudebox.sh && grep -q '\-t 0' claudebox.sh && grep -q 'SKIP_AUDIT.*true' claudebox.sh && grep -q 'stdin is not a terminal' claudebox.sh && nix build 2>&1 && echo "PASS: confirmation prompt and nix build succeed"</automated>
+  </verify>
+  <acceptance_criteria>
+    - claudebox.sh contains `Proceed? [Y/n]` prompt text
+    - claudebox.sh checks `[[ -t 0 ]]` for TTY detection before prompting
+    - Non-TTY stdin prints error "stdin is not a terminal. Pass --yes or -y to skip confirmation." to stderr and exits 1
+    - The prompt and "Aborted." message go to stderr (>&2)
+    - Typing `n` or `no` (case-insensitive) at the prompt exits 1 with "Aborted."
+    - Pressing Enter (empty input) proceeds with launch
+    - The audit+prompt block is guarded by `SKIP_AUDIT != true && DRY_RUN != true`
+    - `nix build` succeeds (shellcheck passes, full script valid)
+    - Complete script flow: parse flags -> --check early exit -> resolve binaries -> build env -> audit+prompt (unless --yes/--dry-run) -> --dry-run print -> exec bwrap
+  </acceptance_criteria>
+  <done>Running `claudebox` shows env audit and prompts for confirmation. --yes/-y skips it. Non-TTY aborts with helpful error. nix build passes.</done>
+</task>
+
+</tasks>
+
+<threat_model>
+## Trust Boundaries
+
+| Boundary | Description |
+|----------|-------------|
+| host env -> audit display | Env var values displayed to stderr, masking required for secrets |
+| user input -> read prompt | User response controls launch/abort decision |
+
+## STRIDE Threat Register
+
+| Threat ID | Category | Component | Disposition | Mitigation Plan |
+|-----------|----------|-----------|-------------|-----------------|
+| T-02-03 | Information Disclosure | Env audit displaying ANTHROPIC_API_KEY | mitigate | mask_value() auto-masks any var name matching *KEY*, *TOKEN*, *SECRET*, *PASSWORD*, *CREDENTIAL*. Shows first 7 + last 4 chars only. Values <= 11 chars show `***`. |
+| T-02-04 | Information Disclosure | CLAUDEBOX_EXTRA_ENV secrets | mitigate | Same mask_value() applies to all displayed vars regardless of source category. User-added vars with sensitive names are masked. |
+| T-02-05 | Elevation of Privilege | Non-interactive auto-proceed | mitigate | D-06: non-TTY stdin aborts with error, never auto-proceeds. Scripts/CI must explicitly pass --yes. |
+</threat_model>
+
+<verification>
+1. `nix build` succeeds (shellcheck + compilation)
+2. `grep -c 'mask_value' claudebox.sh` returns >= 2 (definition + usage)
+3. `grep 'Proceed' claudebox.sh` shows the prompt text
+4. `grep 'SKIP_AUDIT' claudebox.sh` shows the guard condition
+5. Script flow order: flag parsing -> --check -> binary resolution -> env construction -> audit arrays -> audit+prompt -> --dry-run -> exec bwrap
+</verification>
+
+<success_criteria>
+- Env audit displays three grouped sections with colored headers to stderr
+- PATH entries displayed one per line, indented
+- Sensitive values auto-masked (ANTHROPIC_API_KEY shows `sk-ant-...xxxx`)
+- Proceed? [Y/n] prompt with Enter=proceed, n=abort
+- Non-interactive stdin aborts with actionable error
+- --yes/-y skips entire audit+prompt
+- --dry-run also skips audit+prompt
+- nix build passes (shellcheck clean)
+</success_criteria>
+
+<output>
+After completion, create `.planning/phases/02-env-audit-and-cli-polish/02-02-SUMMARY.md`
+</output>
diff --git a/.planning/phases/02-env-audit-and-cli-polish/02-02-SUMMARY.md b/.planning/phases/02-env-audit-and-cli-polish/02-02-SUMMARY.md
new file mode 100644
index 0000000..59a4b07
--- /dev/null
+++ b/.planning/phases/02-env-audit-and-cli-polish/02-02-SUMMARY.md
@@ -0,0 +1,63 @@
+---
+phase: 02-env-audit-and-cli-polish
+plan: 02
+subsystem: cli
+tags: [env-audit, masking, confirmation-prompt, ux]
+dependency_graph:
+  requires: [02-01]
+  provides: [print_audit, mask_value, env-confirmation-prompt]
+  affects: []
+tech_stack:
+  added: []
+  patterns: [associative-arrays-for-audit-tracking, ansi-color-with-no-color-support, tty-detection]
+key_files:
+  modified: [claudebox.sh]
+decisions:
+  - "export RED removed after Task 2 made it used -- shellcheck satisfied by actual usage not export"
+  - "read from /dev/tty for prompt input to handle piped stdin correctly"
+  - "mask_value shows first 7 + last 4 chars for values >11 chars, *** for shorter"
+metrics:
+  duration: 2min
+  completed: "2026-04-09T15:21:40Z"
+  tasks: 2
+  files: 1
+---
+
+# Phase 02 Plan 02: Env Audit Display and Confirmation Prompt Summary
+
+Pre-launch env audit with grouped sections (sandbox/host/extra), sensitive value masking, PATH splitting, and interactive Y/n confirmation with TTY detection.
+
+## Completed Tasks
+
+| # | Task | Commit | Key Changes |
+|---|------|--------|-------------|
+| 1 | Add parallel display arrays and env audit display function | `1c986d2` | ANSI colors with NO_COLOR support, mask_value(), AUDIT_*_KEYS/VALS arrays, print_audit() with grouped sections and PATH splitting |
+| 2 | Add confirmation prompt with TTY detection | `b035f82` | Proceed? [Y/n] prompt, TTY check via [[ -t 0 ]], non-TTY abort with actionable error, guarded by SKIP_AUDIT and DRY_RUN |
+
+## Deviations from Plan
+
+### Auto-fixed Issues
+
+**1. [Rule 3 - Blocking] shellcheck SC2034 for RED variable**
+- **Found during:** Task 1
+- **Issue:** RED was defined in ANSI color block but only used by Task 2's confirmation prompt code (not yet written)
+- **Fix:** Temporarily added `export RED` to satisfy shellcheck, then removed it in Task 2 commit after RED gained actual usage
+- **Files modified:** claudebox.sh
+- **Commit:** `1c986d2` (added), `b035f82` (removed)
+
+## Verification Results
+
+- `nix build` succeeds (shellcheck clean)
+- `grep -q mask_value claudebox.sh` -- present
+- `grep -q print_audit claudebox.sh` -- present
+- `grep -q 'Proceed.*Y/n' claudebox.sh` -- present
+- `grep -q 'SKIP_AUDIT.*true' claudebox.sh` -- present
+- Script flow order verified: flag parsing -> --check -> binary resolution -> env construction -> audit arrays -> audit+prompt -> dry-run -> exec bwrap
+
+## Threat Surface Scan
+
+T-02-03 mitigated: mask_value() auto-masks any var name matching *KEY*, *TOKEN*, *SECRET*, *PASSWORD*, *CREDENTIAL* (case-insensitive via ${name^^}).
+T-02-04 mitigated: mask_value() applies to all displayed vars regardless of source category.
+T-02-05 mitigated: non-TTY stdin aborts with error, never auto-proceeds.
+
+## Self-Check: PASSED
diff --git a/.planning/phases/02-env-audit-and-cli-polish/02-CONTEXT.md b/.planning/phases/02-env-audit-and-cli-polish/02-CONTEXT.md
new file mode 100644
index 0000000..f5851b0
--- /dev/null
+++ b/.planning/phases/02-env-audit-and-cli-polish/02-CONTEXT.md
@@ -0,0 +1,100 @@
+# Phase 2: Env Audit and CLI Polish - Context
+
+**Gathered:** 2026-04-09
+**Status:** Ready for planning
+
+<domain>
+## Phase Boundary
+
+Add pre-launch transparency and diagnostic CLI flags to claudebox. User can review exactly what enters the sandbox before launch (`env audit`), skip review (`--yes`/`-y`), inspect the bwrap command (`--dry-run`), and verify prerequisites (`--check`).
+
+</domain>
+
+<decisions>
+## Implementation Decisions
+
+### Env Audit Display
+- **D-01:** Group env vars by source in three labeled sections: "Sandbox-generated" (HOME, PATH, SHELL, TMPDIR, etc.), "Host (allowlisted)" (TERM, EDITOR, ANTHROPIC_API_KEY, etc.), "Extra (CLAUDEBOX_EXTRA_ENV)" (user-added vars). Display to stderr.
+- **D-02:** PATH is split by `:` and displayed one entry per line (indented under PATH=) for readability.
+- **D-03:** Use plain ANSI escape codes for color/formatting — no external dependency like gum. Bold section headers, colored section labels.
+- **D-04:** Auto-mask values where the variable name matches `*KEY*`, `*TOKEN*`, `*SECRET*`, `*PASSWORD*`, `*CREDENTIAL*` (case-insensitive). Show first 7 + last 4 characters with `...` in between. This catches ANTHROPIC_API_KEY and any user-added secrets via CLAUDEBOX_EXTRA_ENV.
+
+### Confirmation Prompt
+- **D-05:** Use `Proceed? [Y/n]` prompt — default is proceed (Enter launches). User must type `n` or `no` to abort.
+- **D-06:** If stdin is not a TTY (piped input, CI, scripts), abort with an error message telling the user to pass `--yes`/`-y`. Do NOT auto-proceed in non-interactive mode.
+- **D-07:** All audit output and the prompt go to stderr. Keeps stdout clean.
+
+### CLI Flags
+- **D-08:** `--yes` / `-y` skips the env audit display and confirmation entirely — launches immediately. (Per Phase 1 D-01, claudebox claims its own flags and passes the rest to claude.)
+- **D-09:** `--dry-run` prints the full bwrap command without executing. Claude's discretion on exact format (multiline, annotated, etc.).
+- **D-10:** `--check` verifies prerequisites: bwrap exists, required Nix packages available, `~/.claudebox` exists. Claude's discretion on diagnostic depth and output format.
+
+### Claude's Discretion
+- `--dry-run` output format (single-line vs multiline, annotated vs raw)
+- `--check` diagnostic depth (existence-only vs version checks vs connectivity tests)
+- Exact ANSI color choices and spacing
+- Flag parsing order and error messages for invalid flag combinations
+
+</decisions>
+
+<canonical_refs>
+## Canonical References
+
+**Downstream agents MUST read these before planning or implementing.**
+
+### Project Docs
+- `.planning/PROJECT.md` — Core value, constraints, key decisions
+- `.planning/REQUIREMENTS.md` — UX-01 through UX-05 requirement definitions
+- `.planning/ROADMAP.md` — Phase 2 success criteria
+
+### Phase 1 Context
+- `.planning/phases/01-minimal-viable-sandbox/01-CONTEXT.md` — D-01 (flag passthrough), D-03 (env allowlist + CLAUDEBOX_EXTRA_ENV)
+
+### Implementation
+- `claudebox.sh` — Current script with ENV_ARGS array, HOST_ALLOWLIST, CLAUDEBOX_EXTRA_ENV parsing, bwrap invocation
+- `flake.nix` — Nix derivation structure, runtimeInputs, SANDBOX_PATH injection
+
+</canonical_refs>
+
+<code_context>
+## Existing Code Insights
+
+### Reusable Assets
+- `ENV_ARGS` array in claudebox.sh — already structures all env vars as `--setenv key value` pairs, can be iterated for audit display
+- `HOST_ALLOWLIST` array — provides the list of host-passed vars
+- `CLAUDEBOX_EXTRA_ENV` parsing block — already splits comma-separated extras
+- Flag parsing `case/esac` block — extend with `--yes`, `-y`, `--dry-run`, `--check`
+
+### Established Patterns
+- `writeShellApplication` with shellcheck and `set -euo pipefail`
+- Flag parsing via `for arg in "$@"` with case/esac
+- Temp file cleanup via trap
+
+### Integration Points
+- Env audit logic goes between env construction and `exec bwrap`
+- `--dry-run` replaces the `exec bwrap` call with a print
+- `--check` is an early-exit path before any env construction
+- Flag parsing extends the existing case/esac block
+
+</code_context>
+
+<specifics>
+## Specific Ideas
+
+- PATH display: split by `:`, one Nix store path per line, indented under `PATH=`
+- Masking regex: case-insensitive match on var name containing KEY, TOKEN, SECRET, PASSWORD, CREDENTIAL
+- Non-interactive detection: `[[ -t 0 ]]` check on stdin
+
+</specifics>
+
+<deferred>
+## Deferred Ideas
+
+None — discussion stayed within phase scope
+
+</deferred>
+
+---
+
+*Phase: 02-env-audit-and-cli-polish*
+*Context gathered: 2026-04-09*
diff --git a/.planning/phases/02-env-audit-and-cli-polish/02-DISCUSSION-LOG.md b/.planning/phases/02-env-audit-and-cli-polish/02-DISCUSSION-LOG.md
new file mode 100644
index 0000000..71c0678
--- /dev/null
+++ b/.planning/phases/02-env-audit-and-cli-polish/02-DISCUSSION-LOG.md
@@ -0,0 +1,100 @@
+# Phase 2: Env Audit and CLI Polish - Discussion Log
+
+> **Audit trail only.** Do not use as input to planning, research, or execution agents.
+> Decisions are captured in CONTEXT.md — this log preserves the alternatives considered.
+
+**Date:** 2026-04-09
+**Phase:** 02-env-audit-and-cli-polish
+**Areas discussed:** Env audit display format, Confirmation and non-interactive behavior
+
+---
+
+## Env Audit Display Format
+
+### Grouping
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| Grouped by source | Three sections: Sandbox-generated, Host (allowlisted), Extra (CLAUDEBOX_EXTRA_ENV) | ✓ |
+| Flat key=value list | Simple sorted list, no categories | |
+| You decide | Claude's discretion | |
+
+**User's choice:** Grouped by source
+**Notes:** User wants PATH split by `:` with one entry per line. Wants automatic spacing and color coding.
+
+### Color/Formatting
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| gum (Charm TUI) | Rich formatting, handles terminal detection, adds ~15MB | |
+| Plain ANSI codes | printf with escape sequences, zero dependencies | ✓ |
+| You decide | Claude's discretion | |
+
+**User's choice:** Plain ANSI codes
+**Notes:** User initially considered gum but chose zero-dependency approach.
+
+### Value Masking
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| Mask sensitive values | Pattern-match on var name, show first 7 + last 4 chars | ✓ |
+| Show full values | Display everything as-is | |
+| You decide | Claude's discretion | |
+
+**User's choice:** Mask sensitive values
+**Notes:** User asked about dependency for secret detection. Decided pattern-matching on var names is sufficient.
+
+### Masking Approach
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| Pattern-match var names | Regex: *KEY*, *TOKEN*, *SECRET*, *PASSWORD*, *CREDENTIAL* | ✓ |
+| Hardcoded list | Only mask ANTHROPIC_API_KEY specifically | |
+
+**User's choice:** Pattern-match var names
+
+---
+
+## Confirmation and Non-Interactive Behavior
+
+### Prompt Style
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| y/N prompt | Default abort, user must type 'y' | |
+| Y/n prompt | Default proceed, Enter launches | ✓ |
+| You decide | Claude's discretion | |
+
+**User's choice:** Y/n prompt (default proceed)
+
+### Non-TTY Behavior
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| Auto-proceed | Behave as if --yes in non-interactive | |
+| Abort if no TTY | Refuse to run without explicit --yes | ✓ |
+| You decide | Claude's discretion | |
+
+**User's choice:** Abort if no TTY — forces scripts to opt-in with --yes
+
+### Output Destination
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| stderr | Audit and prompt to stderr, stdout clean | ✓ |
+| stdout | Everything to stdout | |
+
+**User's choice:** stderr
+
+---
+
+## Claude's Discretion
+
+- `--dry-run` output format
+- `--check` diagnostic depth and format
+- Exact ANSI color choices
+- Flag parsing order
+
+## Deferred Ideas
+
+None
diff --git a/.planning/phases/02-env-audit-and-cli-polish/02-HUMAN-UAT.md b/.planning/phases/02-env-audit-and-cli-polish/02-HUMAN-UAT.md
new file mode 100644
index 0000000..86e895a
--- /dev/null
+++ b/.planning/phases/02-env-audit-and-cli-polish/02-HUMAN-UAT.md
@@ -0,0 +1,44 @@
+---
+status: partial
+phase: 02-env-audit-and-cli-polish
+source: [02-VERIFICATION.md]
+started: 2026-04-09T17:30:00Z
+updated: 2026-04-09T17:30:00Z
+---
+
+## Current Test
+
+[awaiting human testing]
+
+## Tests
+
+### 1. Visual Audit Display
+expected: Run `claudebox` without `--yes` — see grouped sections (Sandbox-generated, Host allowlisted, Extra), PATH split by colon, sensitive values masked, Y/n prompt on stderr
+result: [pending]
+
+### 2. Dry-Run Output
+expected: Run `claudebox --dry-run` — full bwrap command prints to stderr, does not execute
+result: [pending]
+
+### 3. Check Mode
+expected: Run `claudebox --check` — colored OK/FAIL/WARN output for bwrap, claude, git, curl, nix, ~/.claudebox, ANTHROPIC_API_KEY
+result: [pending]
+
+### 4. Non-Interactive Abort
+expected: Pipe input to `claudebox` (e.g., `echo | claudebox`) — aborts with error telling user to pass `--yes`/`-y`
+result: [pending]
+
+### 5. Yes Flag Skip
+expected: Run `claudebox --yes` or `claudebox -y` — skips audit display and confirmation, launches immediately
+result: [pending]
+
+## Summary
+
+total: 5
+passed: 0
+issues: 0
+pending: 5
+skipped: 0
+blocked: 0
+
+## Gaps
diff --git a/.planning/phases/02-env-audit-and-cli-polish/02-RESEARCH.md b/.planning/phases/02-env-audit-and-cli-polish/02-RESEARCH.md
new file mode 100644
index 0000000..f6715d0
--- /dev/null
+++ b/.planning/phases/02-env-audit-and-cli-polish/02-RESEARCH.md
@@ -0,0 +1,408 @@
+# Phase 2: Env Audit and CLI Polish - Research
+
+**Researched:** 2026-04-09
+**Domain:** Bash CLI UX -- flag parsing, ANSI formatting, TTY detection
+**Confidence:** HIGH
+
+## Summary
+
+This phase adds pre-launch transparency and diagnostic CLI flags to the existing claudebox.sh script. All work is pure bash -- no new dependencies, no new Nix packages, no external tools. The existing code already has the data structures (ENV_ARGS, HOST_ALLOWLIST, CLAUDEBOX_EXTRA_ENV parsing) and the flag parsing skeleton (case/esac). The phase extends these with display logic, confirmation prompts, and two new early-exit modes (--dry-run, --check).
+
+The main technical concerns are: (1) correctly iterating ENV_ARGS for display without breaking the bwrap invocation, (2) ANSI escape code portability, (3) non-interactive stdin detection, and (4) shellcheck compliance since writeShellApplication enforces it at build time.
+
+**Primary recommendation:** Build parallel data structures for audit display rather than parsing the ENV_ARGS array. Track env vars in associative arrays by category during construction, then format for display separately from the bwrap --setenv pairs.
+
+<user_constraints>
+## User Constraints (from CONTEXT.md)
+
+### Locked Decisions
+- **D-01:** Group env vars by source in three labeled sections: "Sandbox-generated", "Host (allowlisted)", "Extra (CLAUDEBOX_EXTRA_ENV)". Display to stderr.
+- **D-02:** PATH is split by `:` and displayed one entry per line (indented under PATH=) for readability.
+- **D-03:** Use plain ANSI escape codes for color/formatting -- no external dependency like gum. Bold section headers, colored section labels.
+- **D-04:** Auto-mask values where the variable name matches `*KEY*`, `*TOKEN*`, `*SECRET*`, `*PASSWORD*`, `*CREDENTIAL*` (case-insensitive). Show first 7 + last 4 characters with `...` in between.
+- **D-05:** Use `Proceed? [Y/n]` prompt -- default is proceed (Enter launches). User must type `n` or `no` to abort.
+- **D-06:** If stdin is not a TTY (piped input, CI, scripts), abort with error telling user to pass `--yes`/`-y`. Do NOT auto-proceed.
+- **D-07:** All audit output and the prompt go to stderr. Keeps stdout clean.
+- **D-08:** `--yes` / `-y` skips the env audit display and confirmation entirely -- launches immediately.
+- **D-09:** `--dry-run` prints the full bwrap command without executing. Claude's discretion on format.
+- **D-10:** `--check` verifies prerequisites: bwrap exists, required Nix packages available, `~/.claudebox` exists. Claude's discretion on depth/format.
+
+### Claude's Discretion
+- `--dry-run` output format (single-line vs multiline, annotated vs raw)
+- `--check` diagnostic depth (existence-only vs version checks vs connectivity tests)
+- Exact ANSI color choices and spacing
+- Flag parsing order and error messages for invalid flag combinations
+
+### Deferred Ideas (OUT OF SCOPE)
+None -- discussion stayed within phase scope.
+</user_constraints>
+
+<phase_requirements>
+## Phase Requirements
+
+| ID | Description | Research Support |
+|----|-------------|------------------|
+| UX-01 | Pre-launch env audit displays all env vars being passed into the sandbox on stderr | Audit display patterns, ANSI formatting, ENV_ARGS iteration approach |
+| UX-02 | Pre-launch env audit prompts for confirmation before proceeding | TTY detection, read prompt pattern, non-interactive abort |
+| UX-03 | `--yes` / `-y` flag skips the env audit confirmation | Flag parsing extension to existing case/esac |
+| UX-04 | `--dry-run` flag prints the full bwrap command without executing | Command reconstruction from arrays, printf %q quoting |
+| UX-05 | `--check` flag verifies bwrap exists, required Nix packages available, and ~/.claudebox exists | command -v checks, exit code conventions |
+</phase_requirements>
+
+## Standard Stack
+
+No new packages required. Everything is bash builtins and existing runtimeInputs.
+
+| Tool | Source | Purpose | Already Available |
+|------|--------|---------|-------------------|
+| bash builtins | runtimeInputs | ANSI output, read, test, printf | Yes |
+| jq | runtimeInputs | Not needed for this phase | Yes (unused) |
+| coreutils | runtimeInputs | tput alternative -- but ANSI codes are simpler | Yes |
+
+## Architecture Patterns
+
+### Integration Points in claudebox.sh
+
+The current script flow is linear:
+
+```
+1. Parse flags (--shell)
+2. Resolve binaries
+3. Record CWD, ensure ~/.claudebox
+4. Generate gitconfig
+5. Build ENV_ARGS array
+6. Build SANDBOX_CMD
+7. exec bwrap
+```
+
+Phase 2 inserts into this flow:
+
+```
+1. Parse flags (--shell, --yes/-y, --dry-run, --check)  [EXTEND]
+2. --check: early exit                                    [NEW]
+3. Resolve binaries
+4. Record CWD, ensure ~/.claudebox
+5. Generate gitconfig
+6. Build ENV_ARGS array
+7. Build SANDBOX_CMD
+8. Env audit display + confirmation (unless --yes)        [NEW]
+9. --dry-run: print command and exit                      [NEW]
+10. exec bwrap
+```
+
+### Pattern: Parallel Display Data
+
+Rather than parsing ENV_ARGS (which is `--setenv key value` triplets), maintain separate display-oriented arrays during construction. This avoids fragile parsing of the bwrap args array.
+
+```bash
+# During env construction, also track for display
+declare -A SANDBOX_VARS    # sandbox-generated vars
+declare -A HOST_VARS       # host allowlisted vars
+declare -A EXTRA_VARS      # CLAUDEBOX_EXTRA_ENV vars
+```
+
+[VERIFIED: reading claudebox.sh -- the three categories already have distinct code blocks that can populate these]
+
+### Pattern: ANSI Escape Codes
+
+```bash
+# Color constants -- define once at top
+BOLD=$'\033[1m'
+RESET=$'\033[0m'
+DIM=$'\033[2m'
+CYAN=$'\033[36m'
+YELLOW=$'\033[33m'
+GREEN=$'\033[32m'
+RED=$'\033[31m'
+```
+
+[ASSUMED] These are standard VT100/ECMA-48 sequences supported by all modern terminals. No tput dependency needed.
+
+### Pattern: Value Masking (D-04)
+
+```bash
+mask_value() {
+  local name="$1" value="$2"
+  # Case-insensitive match on var name
+  if [[ "${name^^}" == *KEY* || "${name^^}" == *TOKEN* || "${name^^}" == *SECRET* || "${name^^}" == *PASSWORD* || "${name^^}" == *CREDENTIAL* ]]; then
+    local len=${#value}
+    if (( len > 11 )); then
+      echo "${value:0:7}...${value: -4}"
+    else
+      echo "***"
+    fi
+  else
+    echo "$value"
+  fi
+}
+```
+
+Note: `${name^^}` converts to uppercase in bash 4+. NixOS ships bash 5.x, so this is safe. [VERIFIED: NixOS uses bash 5.x from nixpkgs]
+
+### Pattern: PATH Display (D-02)
+
+```bash
+display_path() {
+  echo "  PATH="
+  IFS=':' read -ra path_entries <<< "$1"
+  for entry in "${path_entries[@]}"; do
+    echo "    $entry"
+  done
+}
+```
+
+### Pattern: TTY Detection (D-06)
+
+```bash
+if [[ -t 0 ]]; then
+  # Interactive -- show prompt
+  read -r -p "Proceed? [Y/n] " response < /dev/tty
+  # ...
+else
+  echo "Error: stdin is not a terminal. Pass --yes or -y to skip confirmation." >&2
+  exit 1
+fi
+```
+
+Important: Use `read < /dev/tty` rather than plain `read` because stdin may be consumed by pipes even when /dev/tty exists. The `[[ -t 0 ]]` check catches the non-interactive case. [ASSUMED]
+
+### Pattern: --dry-run Output
+
+Recommend multiline format with one flag per line, matching the existing `exec bwrap` layout in the script. This makes it easy to diff against the actual invocation and spot issues.
+
+```bash
+if [[ "$DRY_RUN" == true ]]; then
+  echo "bwrap \\" >&2
+  echo "  --clearenv \\" >&2
+  # ... each flag on its own line
+  exit 0
+fi
+```
+
+Use `printf '%q '` for values that may contain special characters (though in practice, env values and paths are clean).
+
+### Pattern: --check Diagnostics
+
+Recommend checking:
+1. `command -v bwrap` -- is bwrap on PATH
+2. `command -v claude` -- is claude on PATH
+3. Key runtimeInputs: git, curl, nix, bash
+4. `~/.claudebox` directory exists
+5. `ANTHROPIC_API_KEY` is set (warn if missing, don't fail)
+
+Output: one line per check with pass/fail indicator. Exit 0 if all required checks pass, exit 1 if any required check fails.
+
+### Pattern: Flag Parsing Extension
+
+The existing parser uses `for arg in "$@"` with shift. Extend with additional cases. Important: `--check` should be checked first (early exit before env construction), but parsing order can collect all flags first, then branch.
+
+```bash
+SKIP_AUDIT=false
+DRY_RUN=false
+CHECK_MODE=false
+SHELL_MODE=false
+CLAUDE_ARGS=()
+
+for arg in "$@"; do
+  case "$arg" in
+    --yes|-y) SKIP_AUDIT=true ;;
+    --dry-run) DRY_RUN=true ;;
+    --check) CHECK_MODE=true ;;
+    --shell) SHELL_MODE=true ;;
+    --) shift; break ;;
+    *) CLAUDE_ARGS+=("$arg") ;;
+  esac
+done
+```
+
+Note: Current parsing uses `shift` and `break` which is problematic for multi-flag support. The refactored approach collects all flags in one pass and stores remaining args in CLAUDE_ARGS. [VERIFIED: current claudebox.sh only handles --shell with shift+break]
+
+### Anti-Patterns to Avoid
+
+- **Parsing ENV_ARGS for display:** The array contains `--setenv key value` triplets interleaved with bwrap flags. Iterating it for display is fragile. Track display data separately.
+- **Using `tput` for colors:** Adds an ncurses dependency. ANSI escape codes are sufficient and have no dependency.
+- **Auto-proceeding in non-interactive mode:** D-06 explicitly requires aborting. Don't silently proceed.
+- **Echoing sensitive values without masking:** D-04 requires masking KEY/TOKEN/SECRET/PASSWORD/CREDENTIAL patterns.
+
+## Don't Hand-Roll
+
+| Problem | Don't Build | Use Instead | Why |
+|---------|-------------|-------------|-----|
+| Color output | ncurses/tput wrapper | Raw ANSI escapes | D-03 mandates plain ANSI, zero deps |
+| Argument parsing | getopt/getopts | Simple case/esac loop | Only 4 flags, no complex options, matches existing pattern |
+
+## Common Pitfalls
+
+### Pitfall 1: Shellcheck Violations
+**What goes wrong:** writeShellApplication runs shellcheck at build time. Bash-isms like `${name^^}` or `declare -A` may trigger warnings.
+**Why it happens:** shellcheck defaults may flag associative arrays or uppercase expansion.
+**How to avoid:** Use `# shellcheck disable=SCXXXX` directives only when necessary. Test with `shellcheck claudebox.sh` locally before `nix build`. Associative arrays and `${var^^}` are valid bash and shellcheck-clean in bash mode.
+**Warning signs:** `nix build` fails with shellcheck errors.
+
+### Pitfall 2: Flag Parsing Breaking Passthrough
+**What goes wrong:** Claudebox consumes a flag meant for claude, or fails to pass remaining args.
+**Why it happens:** Current parsing uses shift+break which only handles one flag.
+**How to avoid:** Refactor to collect all known flags, accumulate unknown args in CLAUDE_ARGS array, pass CLAUDE_ARGS to claude.
+**Warning signs:** `claudebox --model sonnet` silently drops `--model`.
+
+### Pitfall 3: read Prompt in Non-TTY
+**What goes wrong:** `read` hangs or reads garbage when stdin is piped.
+**Why it happens:** No TTY check before prompting.
+**How to avoid:** Check `[[ -t 0 ]]` before read. Read from `/dev/tty` explicitly.
+**Warning signs:** Script hangs in CI or when piped.
+
+### Pitfall 4: Masking Short Values
+**What goes wrong:** Masking "first 7 + last 4" on a 5-character value reveals the whole thing.
+**Why it happens:** No length check before substring extraction.
+**How to avoid:** If value length <= 11, show `***` instead of partial mask.
+**Warning signs:** Short API keys fully visible in audit output.
+
+### Pitfall 5: ANSI Codes in Redirected Output
+**What goes wrong:** If stderr is redirected to a file, ANSI escape codes pollute the output.
+**Why it happens:** Colors are sent regardless of terminal capability.
+**How to avoid:** Optional: check `[[ -t 2 ]]` and suppress colors if stderr is not a terminal. This is discretionary per D-03, but good practice.
+**Warning signs:** Garbled text in log files.
+
+## Code Examples
+
+### Complete Flag Parsing (recommended)
+
+```bash
+# Parse claudebox flags -- collect our flags, pass the rest to claude
+SKIP_AUDIT=false
+DRY_RUN=false
+CHECK_MODE=false
+SHELL_MODE=false
+CLAUDE_ARGS=()
+
+while (( $# > 0 )); do
+  case "$1" in
+    --yes|-y) SKIP_AUDIT=true ;;
+    --dry-run) DRY_RUN=true ;;
+    --check) CHECK_MODE=true ;;
+    --shell) SHELL_MODE=true ;;
+    --) shift; CLAUDE_ARGS+=("$@"); break ;;
+    *) CLAUDE_ARGS+=("$1") ;;
+  esac
+  shift
+done
+```
+
+[ASSUMED] This is standard bash argument parsing. The `while/shift` pattern is more robust than the current `for/shift/break`.
+
+### Env Audit Display Function
+
+```bash
+print_audit() {
+  local bold=$'\033[1m' reset=$'\033[0m'
+  local cyan=$'\033[36m' yellow=$'\033[33m' green=$'\033[32m'
+
+  echo "${bold}${cyan}=== Sandbox Environment ===${reset}" >&2
+  echo "" >&2
+
+  echo "${bold}Sandbox-generated:${reset}" >&2
+  for var in HOME USER PATH SHELL TMPDIR XDG_RUNTIME_DIR NIX_SSL_CERT_FILE SSL_CERT_FILE; do
+    if [[ "$var" == "PATH" ]]; then
+      echo "  ${green}PATH=${reset}" >&2
+      IFS=':' read -ra entries <<< "$SANDBOX_PATH"
+      for entry in "${entries[@]}"; do
+        echo "    $entry" >&2
+      done
+    else
+      # Look up value from the sandbox vars
+      echo "  ${green}${var}=${reset}${SANDBOX_DISPLAY[$var]}" >&2
+    fi
+  done
+  # ... similar for host and extra sections
+}
+```
+
+### --check Implementation
+
+```bash
+run_check() {
+  local pass=true
+  local green=$'\033[32m' red=$'\033[31m' reset=$'\033[0m'
+
+  check_cmd() {
+    if command -v "$1" &>/dev/null; then
+      echo "${green}OK${reset}  $1" >&2
+    else
+      echo "${red}FAIL${reset}  $1 -- not found" >&2
+      pass=false
+    fi
+  }
+
+  echo "claudebox prerequisites:" >&2
+  check_cmd bwrap
+  check_cmd claude
+  check_cmd git
+  check_cmd nix
+
+  if [[ -d "$HOME/.claudebox" ]]; then
+    echo "${green}OK${reset}  ~/.claudebox exists" >&2
+  else
+    echo "${red}FAIL${reset}  ~/.claudebox -- not found (will be created on first run)" >&2
+  fi
+
+  if [[ -v ANTHROPIC_API_KEY ]]; then
+    echo "${green}OK${reset}  ANTHROPIC_API_KEY is set" >&2
+  else
+    echo "${yellow}WARN${reset}  ANTHROPIC_API_KEY is not set" >&2
+  fi
+
+  if [[ "$pass" == true ]]; then
+    exit 0
+  else
+    exit 1
+  fi
+}
+```
+
+## Assumptions Log
+
+| # | Claim | Section | Risk if Wrong |
+|---|-------|---------|---------------|
+| A1 | ANSI VT100 escape codes work in all target terminals | Architecture Patterns | Low -- NixOS terminals universally support ANSI |
+| A2 | `read < /dev/tty` is the correct pattern for prompting when stdin may be piped | Architecture Patterns | Low -- standard Unix practice |
+| A3 | `${var^^}` uppercase expansion is shellcheck-clean | Pitfalls | Low -- shellcheck knows bash, would only flag if shell directive is sh |
+| A4 | while/shift is more robust than for/shift/break for multi-flag parsing | Code Examples | Very low -- well-established pattern |
+
+## Open Questions
+
+1. **Should `--dry-run` also show the env audit?**
+   - What we know: D-09 says print the full bwrap command. The env audit is separate (D-01).
+   - What's unclear: Whether `--dry-run` implies `--yes` (skip audit) or shows audit then command.
+   - Recommendation: `--dry-run` implies `--yes` -- the user wants to see the command, not be prompted. Show the command and exit.
+
+2. **Should `--check` be combinable with other flags?**
+   - What we know: `--check` is an early-exit diagnostic.
+   - What's unclear: What if user passes `--check --dry-run`.
+   - Recommendation: `--check` takes priority, exits before other flags matter. No need to error on combinations.
+
+3. **Color disable for non-terminal stderr?**
+   - What we know: D-03 says use ANSI codes. Doesn't mention disabling.
+   - What's unclear: Whether to add `NO_COLOR` or `[[ -t 2 ]]` detection.
+   - Recommendation: Respect the `NO_COLOR` env var convention if set. Otherwise, always emit ANSI. This is low-effort and follows modern CLI conventions.
+
+## Sources
+
+### Primary (HIGH confidence)
+- `claudebox.sh` -- current implementation, read directly
+- `flake.nix` -- current derivation structure, read directly
+- `02-CONTEXT.md` -- locked decisions D-01 through D-10
+- `REQUIREMENTS.md` -- UX-01 through UX-05 definitions
+
+### Secondary (MEDIUM confidence)
+- Bash reference manual (training data) -- builtins, parameter expansion, test operators
+- VT100/ECMA-48 escape codes (training data) -- ANSI color sequences
+
+## Metadata
+
+**Confidence breakdown:**
+- Standard stack: HIGH -- no new packages, pure bash
+- Architecture: HIGH -- clear insertion points in existing script, straightforward patterns
+- Pitfalls: HIGH -- well-known bash gotchas, verified against existing code
+
+**Research date:** 2026-04-09
+**Valid until:** No expiry -- bash and ANSI codes are stable
diff --git a/.planning/phases/02-env-audit-and-cli-polish/02-REVIEW.md b/.planning/phases/02-env-audit-and-cli-polish/02-REVIEW.md
new file mode 100644
index 0000000..200037c
--- /dev/null
+++ b/.planning/phases/02-env-audit-and-cli-polish/02-REVIEW.md
@@ -0,0 +1,127 @@
+---
+phase: 02-env-audit-and-cli-polish
+reviewed: 2026-04-09T12:00:00Z
+depth: standard
+files_reviewed: 2
+files_reviewed_list:
+  - claudebox.sh
+  - flake.nix
+findings:
+  critical: 1
+  warning: 3
+  info: 2
+  total: 6
+status: issues_found
+---
+
+# Phase 2: Code Review Report
+
+**Reviewed:** 2026-04-09
+**Depth:** standard
+**Files Reviewed:** 2
+**Status:** issues_found
+
+## Summary
+
+Reviewed `claudebox.sh` (298 lines, shell script) and `flake.nix` (49 lines, Nix expression). The shell script is well-structured with good separation of concerns (flag parsing, audit display, sandbox exec). The Nix flake is clean and idiomatic.
+
+Key concerns: the `CLAUDEBOX_EXTRA_ENV` escape hatch allows passing arbitrary environment variables into the sandbox without any blocklist, which undermines the core security model. There are also several robustness issues around variable handling and the dry-run output not matching actual execution.
+
+## Critical Issues
+
+### CR-01: CLAUDEBOX_EXTRA_ENV allows smuggling secrets into sandbox
+
+**File:** `claudebox.sh:162-172`
+**Issue:** The `CLAUDEBOX_EXTRA_ENV` escape hatch accepts any variable name without validation. A user (or a parent process) could set `CLAUDEBOX_EXTRA_ENV=SSH_AUTH_SOCK,GPG_AGENT_INFO,AWS_SECRET_ACCESS_KEY` and those values would be injected into the sandbox. This directly violates the project's core constraint: "Secrets never enter the Claude Code environment." While `mask_value` hides sensitive values in the audit display, the actual values are still passed via `--setenv`. An attacker or misconfigured environment could bypass the entire allowlist model.
+**Fix:** Add a blocklist check before accepting extra vars:
+```bash
+# Blocklist of vars that must never enter the sandbox
+BLOCKED_VARS="SSH_AUTH_SOCK|SSH_AGENT_PID|GPG_AGENT_INFO|GPG_TTY|AWS_SECRET_ACCESS_KEY|AWS_SESSION_TOKEN|GITHUB_TOKEN|GH_TOKEN|TAILSCALE_.*"
+
+for var in "${EXTRAS[@]}"; do
+  var="${var// /}"
+  if [[ -n "$var" ]] && [[ -v "$var" ]]; then
+    if [[ "$var" =~ ^($BLOCKED_VARS)$ ]]; then
+      echo "${RED}Blocked:${RESET} $var is not allowed in sandbox" >&2
+      continue
+    fi
+    ENV_ARGS+=(--setenv "$var" "${!var}")
+    AUDIT_EXTRA_KEYS+=("$var")
+    AUDIT_EXTRA_VALS[$var]="${!var}"
+  fi
+done
+```
+
+## Warnings
+
+### WR-01: Dry-run output diverges from actual bwrap invocation
+
+**File:** `claudebox.sh:240-272`
+**Issue:** The dry-run block manually reconstructs the bwrap command with `echo` statements (lines 242-270) rather than deriving it from the same data used by the actual `exec bwrap` call (lines 275-298). If a mount is added or changed in the real invocation, the dry-run output will silently become stale. The `--bind /nix/var/nix` mount and the `--ro-bind /etc/nix` mount are both present in both places today, but this is a maintenance hazard -- the two blocks must be kept in sync manually.
+**Fix:** Build the bwrap args array once, then use it for both dry-run printing and actual execution:
+```bash
+BWRAP_ARGS=(
+  --clearenv
+  "${ENV_ARGS[@]}"
+  --tmpfs /
+  --proc /proc
+  # ... all mounts ...
+  -- "${SANDBOX_CMD[@]}"
+)
+
+if [[ "$DRY_RUN" == true ]]; then
+  printf 'bwrap'
+  printf ' %q' "${BWRAP_ARGS[@]}"
+  printf '\n'
+  exit 0
+fi
+
+exec bwrap "${BWRAP_ARGS[@]}"
+```
+
+### WR-02: Temp file cleanup race with exec
+
+**File:** `claudebox.sh:108-109`
+**Issue:** Line 109 sets `trap 'rm -f "$GITCONFIG_TMP"' EXIT` to clean up the temp gitconfig. However, line 275 uses `exec bwrap ...` which replaces the current process -- the EXIT trap never fires. The temp file leaks on every successful run. This is not a security issue (the file contains only name/email), but it accumulates stale files in `/tmp`.
+**Fix:** Either accept the leak (temp files are cleaned by the OS on reboot) and remove the misleading trap, or delete the file after bwrap starts by restructuring to not use `exec`:
+```bash
+# Option A: Remove the misleading trap, add a comment
+GITCONFIG_TMP=$(mktemp)
+# Note: leaked intentionally -- exec replaces process, trap won't fire.
+# /tmp is cleaned on reboot. File contains only git name/email.
+```
+
+### WR-03: Unquoted variables in dry-run echo statements
+
+**File:** `claudebox.sh:264-268`
+**Issue:** Several dry-run echo statements use unquoted shell variables: `$HOME`, `$CWD`. If these contain spaces (unlikely for HOME but possible for CWD), the dry-run output would be misleading -- it would show split words rather than the actual path. The real `exec bwrap` call properly quotes these on lines 293-297.
+**Fix:**
+```bash
+echo "  --tmpfs \"$HOME\" \\"
+echo "  --bind \"$HOME/.claudebox\" \"$HOME/.claude\" \\"
+echo "  --bind \"$CWD\" \"$CWD\" \\"
+echo "  --chdir \"$CWD\" \\"
+```
+
+## Info
+
+### IN-01: Hardcoded system architecture in flake.nix
+
+**File:** `flake.nix:18`
+**Issue:** `system = "x86_64-linux";` hardcodes the architecture. This prevents the flake from being used on aarch64-linux or other systems. This is fine if claudebox is strictly for the developer's NixOS desktop, but limits portability.
+**Fix:** Use `flake-utils` or `nixpkgs.lib.genAttrs` to support multiple systems:
+```nix
+nixpkgs.lib.genAttrs ["x86_64-linux" "aarch64-linux"] (system: ...)
+```
+
+### IN-02: `pass` variable in --check uses string comparison instead of boolean
+
+**File:** `claudebox.sh:23,31,56`
+**Issue:** The `pass` variable is set to the string `true`/`false` and compared with `[[ "$pass" == true ]]`. This works in bash but is a code smell -- it looks like a boolean but is a string. Minor readability concern, no functional impact given `writeShellApplication` ensures bash.
+**Fix:** No change needed -- this is idiomatic bash. Noting for awareness only.
+
+---
+
+_Reviewed: 2026-04-09_
+_Reviewer: Claude (gsd-code-reviewer)_
+_Depth: standard_
diff --git a/.planning/phases/02-env-audit-and-cli-polish/02-VERIFICATION.md b/.planning/phases/02-env-audit-and-cli-polish/02-VERIFICATION.md
new file mode 100644
index 0000000..27d8789
--- /dev/null
+++ b/.planning/phases/02-env-audit-and-cli-polish/02-VERIFICATION.md
@@ -0,0 +1,135 @@
+---
+phase: 02-env-audit-and-cli-polish
+verified: 2026-04-09T16:00:00Z
+status: human_needed
+score: 4/4
+overrides_applied: 0
+human_verification:
+  - test: "Run claudebox without --yes and verify env vars display with grouped sections"
+    expected: "Three sections shown (Sandbox-generated, Host allowlisted, Extra) with PATH split per-line, sensitive values masked, Proceed? prompt appears"
+    why_human: "Requires running in a terminal with bwrap available to verify visual output, TTY interaction, and color formatting"
+  - test: "Run claudebox --yes and verify it launches immediately without audit"
+    expected: "No env audit displayed, sandbox launches directly"
+    why_human: "Requires running sandbox with bwrap and claude available"
+  - test: "Run claudebox --dry-run and verify full bwrap command is printed"
+    expected: "Complete bwrap command with all --setenv, mount flags, and sandbox command printed to stderr, then exits 0"
+    why_human: "Requires runtime environment with SANDBOX_PATH and resolved binaries"
+  - test: "Run claudebox --check and verify prerequisite report"
+    expected: "Colored OK/FAIL/WARN for bwrap, claude, git, curl, nix, ~/.claudebox, ANTHROPIC_API_KEY"
+    why_human: "Requires nix-built binary to test PATH resolution of check targets"
+  - test: "Pipe input to claudebox (non-interactive) and verify it aborts"
+    expected: "Error message about stdin not being a terminal, suggests --yes/-y, exits 1"
+    why_human: "Requires runtime execution to test TTY detection"
+---
+
+# Phase 2: Env Audit and CLI Polish Verification Report
+
+**Phase Goal:** User can review exactly what enters the sandbox before launch, and has diagnostic tools for troubleshooting
+**Verified:** 2026-04-09T16:00:00Z
+**Status:** human_needed
+**Re-verification:** No -- initial verification
+
+## Goal Achievement
+
+### Observable Truths
+
+| # | Truth | Status | Evidence |
+|---|-------|--------|----------|
+| 1 | Running `claudebox` without `--yes` prints all env vars and prompts for confirmation | VERIFIED | `print_audit()` at lines 175-211, prompt at line 219, guarded by `SKIP_AUDIT != true && DRY_RUN != true` at line 214 |
+| 2 | Running `claudebox --yes` or `-y` skips env audit and launches immediately | VERIFIED | Flag parsing at line 10 sets `SKIP_AUDIT=true`, guard at line 214 checks it |
+| 3 | Running `claudebox --dry-run` prints full bwrap command without executing | VERIFIED | Lines 240-272: prints all --setenv triplets, mount flags, sandbox command, then `exit 0` |
+| 4 | Running `claudebox --check` reports whether bwrap, Nix packages, ~/.claudebox exist | VERIFIED | Lines 22-63: `check_cmd` for bwrap/claude/git/curl/nix, dir check for ~/.claudebox, ANTHROPIC_API_KEY warn |
+
+**Score:** 4/4 truths verified
+
+### Required Artifacts
+
+| Artifact | Expected | Status | Details |
+|----------|----------|--------|---------|
+| `claudebox.sh` | Refactored flag parsing, --check, --dry-run (Plan 01) | VERIFIED | 299 lines, contains CHECK_MODE, DRY_RUN, SKIP_AUDIT, CLAUDE_ARGS (15 pattern matches) |
+| `claudebox.sh` | Env audit display, masking, confirmation prompt (Plan 02) | VERIFIED | Contains mask_value, print_audit, Proceed (7 pattern matches) |
+
+### Key Link Verification
+
+| From | To | Via | Status | Details |
+|------|----|-----|--------|---------|
+| Flag parsing (CLAUDE_ARGS) | SANDBOX_CMD construction | `CLAUDE_ARGS` array replaces raw `$@` | WIRED | Declared line 6, accumulated lines 14-15, used in SANDBOX_CMD lines 234, 236 |
+| Env audit block | SKIP_AUDIT flag | `if [[ "$SKIP_AUDIT" != true ]]` | WIRED | Set line 2/10, checked line 214 |
+| Audit display | ENV_ARGS array | Parallel AUDIT_*_KEYS/VALS arrays | WIRED | AUDIT_SANDBOX/HOST/EXTRA arrays declared lines 120-125, populated lines 141-169, displayed in print_audit lines 175-211 |
+
+### Data-Flow Trace (Level 4)
+
+Not applicable -- shell script with no dynamic data rendering. All data flows from flag parsing and host environment through to bwrap execution, verified via wiring checks above.
+
+### Behavioral Spot-Checks
+
+| Behavior | Command | Result | Status |
+|----------|---------|--------|--------|
+| nix build passes (shellcheck clean) | `nix build` | exit 0 | PASS |
+| No TODO/FIXME/PLACEHOLDER markers | `grep -n TODO\|FIXME\|PLACEHOLDER claudebox.sh` | 0 matches | PASS |
+| Flag parsing handles multiple flags | grep for while/shift loop | `while (( $# > 0 ))` at line 8 with case/esac | PASS |
+| Mask function covers all sensitive patterns | grep mask_value body | KEY, TOKEN, SECRET, PASSWORD, CREDENTIAL all present | PASS |
+| Stderr-only output | grep `>&2` count | 28 stderr redirections found | PASS |
+
+### Requirements Coverage
+
+| Requirement | Source Plan | Description | Status | Evidence |
+|-------------|------------|-------------|--------|----------|
+| UX-01 | 02-02 | Pre-launch env audit displays all env vars on stderr | SATISFIED | `print_audit()` with 3 grouped sections, all output to stderr |
+| UX-02 | 02-02 | Pre-launch env audit prompts for confirmation | SATISFIED | `Proceed? [Y/n]` at line 219, abort on `n`/`no` |
+| UX-03 | 02-01 | `--yes`/`-y` skips confirmation | SATISFIED | Flag parsed line 10, guard at line 214 |
+| UX-04 | 02-01 | `--dry-run` prints full bwrap command | SATISFIED | Lines 240-272, multiline bwrap output to stderr, exit 0 |
+| UX-05 | 02-01 | `--check` verifies prerequisites | SATISFIED | Lines 22-63, checks bwrap/claude/git/curl/nix + ~/.claudebox + ANTHROPIC_API_KEY |
+
+No orphaned requirements found -- all 5 phase requirements (UX-01 through UX-05) are claimed and satisfied.
+
+### Anti-Patterns Found
+
+| File | Line | Pattern | Severity | Impact |
+|------|------|---------|----------|--------|
+| (none) | - | - | - | No anti-patterns detected |
+
+### Human Verification Required
+
+### 1. Visual Audit Display
+
+**Test:** Run `claudebox` in a terminal without `--yes` flag
+**Expected:** Three grouped sections (Sandbox-generated, Host allowlisted, Extra) with colored headers, PATH entries split one per line, sensitive values masked (ANTHROPIC_API_KEY shows `sk-ant-...xxxx`), `Proceed? [Y/n]` prompt
+**Why human:** Requires bwrap-capable environment, TTY interaction, visual confirmation of color formatting
+
+### 2. Dry-Run Output
+
+**Test:** Run `claudebox --dry-run`
+**Expected:** Full multiline bwrap command printed to stderr with all --setenv and mount flags, exits 0
+**Why human:** Requires runtime with resolved SANDBOX_PATH and binary paths
+
+### 3. Check Mode
+
+**Test:** Run `claudebox --check`
+**Expected:** Colored OK/FAIL/WARN for each prerequisite, appropriate exit code
+**Why human:** Requires nix-built binary to verify PATH resolution targets
+
+### 4. Non-Interactive Abort
+
+**Test:** Run `echo "" | claudebox`
+**Expected:** Error message about stdin not being a terminal, suggests `--yes`/`-y`, exits 1
+**Why human:** Requires runtime TTY detection test
+
+### 5. Yes Flag Skip
+
+**Test:** Run `claudebox --yes`
+**Expected:** No audit display, sandbox launches immediately
+**Why human:** Requires full sandbox environment
+
+### Gaps Summary
+
+No automated gaps found. All 4 roadmap success criteria verified at code level. All 5 requirements (UX-01 through UX-05) are satisfied in the implementation. The code is clean (no TODOs, no stubs, shellcheck passes via nix build).
+
+One minor documentation note: commit hashes in 02-01-SUMMARY.md (`07096ae`, `3903667`, `cc6bd5b`) do not match actual commits (`72ba48d`, `1eddd93`, `7001303`). This is cosmetic and does not affect functionality.
+
+Human verification is needed to confirm runtime behavior -- the code structure is correct but these are interactive CLI features that require a terminal and bwrap environment to fully validate.
+
+---
+
+_Verified: 2026-04-09T16:00:00Z_
+_Verifier: Claude (gsd-verifier)_
diff --git a/.planning/phases/03-sandbox-aware-prompting/03-01-PLAN.md b/.planning/phases/03-sandbox-aware-prompting/03-01-PLAN.md
new file mode 100644
index 0000000..18edffa
--- /dev/null
+++ b/.planning/phases/03-sandbox-aware-prompting/03-01-PLAN.md
@@ -0,0 +1,253 @@
+---
+phase: 03-sandbox-aware-prompting
+plan: 01
+type: execute
+wave: 1
+depends_on: []
+files_modified:
+  - claudebox.sh
+autonomous: true
+requirements:
+  - AWARE-01
+  - AWARE-02
+
+must_haves:
+  truths:
+    - "Claude inside the sandbox sees SANDBOX.md content describing its environment"
+    - "CLAUDE.md in ~/.claudebox/ exists after first launch with @SANDBOX.md import on line 1"
+    - "SANDBOX.md is overwritten on every launch with current content"
+    - "Existing user content in CLAUDE.md is preserved when import line is prepended"
+  artifacts:
+    - path: "claudebox.sh"
+      provides: "SANDBOX.md generation and CLAUDE.md import check"
+      contains: "SANDBOXEOF"
+    - path: "~/.claudebox/SANDBOX.md"
+      provides: "Sandbox context for Claude Code"
+      contains: "bubblewrap"
+    - path: "~/.claudebox/CLAUDE.md"
+      provides: "User-owned CLAUDE.md with managed import"
+      contains: "@SANDBOX.md"
+  key_links:
+    - from: "claudebox.sh"
+      to: "~/.claudebox/SANDBOX.md"
+      via: "heredoc write on every launch"
+      pattern: "cat > .*/SANDBOX.md"
+    - from: "~/.claudebox/CLAUDE.md"
+      to: "~/.claudebox/SANDBOX.md"
+      via: "@SANDBOX.md import on line 1"
+      pattern: "@SANDBOX.md"
+    - from: "bwrap --bind ~/.claudebox ~/.claude"
+      to: "Claude Code session"
+      via: "bind mount makes ~/.claudebox visible as ~/.claude"
+      pattern: "--bind.*\\.claudebox.*\\.claude"
+---
+
+<objective>
+Add sandbox-aware prompting to claudebox so Claude Code automatically knows it is running in a bwrap sandbox, how to install tools via comma and nix shell, and what host resources are unavailable by default.
+
+Purpose: Claude Code sessions inside claudebox currently have no context about their sandboxed environment. This causes Claude to attempt operations that will fail (SSH git, gpg signing) and miss capabilities it has (comma, nix shell). Injecting this context eliminates friction.
+
+Output: Modified claudebox.sh that generates SANDBOX.md and manages CLAUDE.md import on every launch.
+</objective>
+
+<execution_context>
+@$HOME/.claude/get-shit-done/workflows/execute-plan.md
+@$HOME/.claude/get-shit-done/templates/summary.md
+</execution_context>
+
+<context>
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@.planning/phases/03-sandbox-aware-prompting/03-CONTEXT.md
+@.planning/phases/03-sandbox-aware-prompting/03-RESEARCH.md
+@claudebox.sh
+</context>
+
+<tasks>
+
+<task type="auto">
+  <name>Task 1: Add SANDBOX.md generation and CLAUDE.md import check to claudebox.sh</name>
+  <files>claudebox.sh</files>
+  <read_first>
+    - claudebox.sh (current state, integration point at line 102-103)
+    - .planning/phases/03-sandbox-aware-prompting/03-CONTEXT.md (locked decisions D-01 through D-08)
+    - .planning/phases/03-sandbox-aware-prompting/03-RESEARCH.md (implementation patterns, SANDBOX.md content structure)
+  </read_first>
+  <action>
+Insert a new code block in claudebox.sh between line 102 (`mkdir -p "$HOME/.claudebox"`) and line 104 (`GIT_NAME=$(git config ...)`). The block does two things:
+
+**Part 1: Write SANDBOX.md (per D-02 -- overwritten every launch)**
+
+Use a single-quoted heredoc (`<< 'SANDBOXEOF'`) to write `$HOME/.claudebox/SANDBOX.md`. The content must follow D-04 (friendly guide tone, short prose paragraphs), D-05 (default restrictions with "by default" phrasing), and D-06 (git identity pre-configured, HTTPS preferred). Use this exact content:
+
+```
+# Sandbox Environment
+
+You are running inside a bubblewrap (bwrap) sandbox managed by claudebox.
+Your filesystem is isolated -- only the current working directory and
+essential system paths are mounted.
+
+## Installing Tools
+
+You have two ways to install tools on the fly:
+
+**Comma (preferred for quick one-off commands):**
+`, ripgrep` runs ripgrep without permanent installation. Comma uses
+nix-index to find the right package automatically.
+
+**Nix shell (for persistent access within the session):**
+`nix shell nixpkgs#python3 -c python3 script.py` runs a command with
+a package available. To keep it in your PATH for the session:
+`nix shell nixpkgs#python3` then use `python3` normally.
+
+## Default Restrictions
+
+By default, the following are not mounted into the sandbox:
+- SSH keys (~/.ssh)
+- GPG and age keys (~/.gnupg, age key files)
+- Cloud credentials (~/.aws, ~/.config/gcloud)
+- Tailscale state
+
+If your setup has been customized, some of these may be available.
+
+## Git
+
+Your git identity (name and email) is pre-configured from the host.
+The `safe.directory` setting trusts the mounted working directory.
+For remote operations, prefer HTTPS URLs over SSH since SSH keys
+are not available by default.
+```
+
+The heredoc line is: `cat > "$HOME/.claudebox/SANDBOX.md" << 'SANDBOXEOF'`
+Closing delimiter is: `SANDBOXEOF` (flush-left, no indentation).
+
+**Part 2: Ensure CLAUDE.md has @SANDBOX.md import (per D-03, D-07, D-08, AWARE-01)**
+
+After writing SANDBOX.md, add the CLAUDE.md import check:
+
+```bash
+# Ensure CLAUDE.md has @SANDBOX.md import (D-03, D-08, AWARE-01)
+CLAUDEMD="$HOME/.claudebox/CLAUDE.md"
+if [[ ! -f "$CLAUDEMD" ]]; then
+  printf '%s\n' "@SANDBOX.md" > "$CLAUDEMD"
+elif [[ "$(head -1 "$CLAUDEMD")" != "@SANDBOX.md" ]]; then
+  tmp=$(mktemp)
+  { printf '%s\n' "@SANDBOX.md"; cat "$CLAUDEMD"; } > "$tmp"
+  mv "$tmp" "$CLAUDEMD"
+fi
+```
+
+This creates CLAUDE.md with just the import if it does not exist (AWARE-01), or prepends the import if the first line does not match exactly `@SANDBOX.md` (D-08). Existing user content is preserved (D-03).
+
+Add a section comment before the block: `# === Sandbox-aware prompting (AWARE-01, AWARE-02) ===`
+  </action>
+  <verify>
+    <automated>cd /home/toph/code/tools/claudebox && nix build --no-link 2>&1 | tail -5 && echo "BUILD OK"</automated>
+  </verify>
+  <acceptance_criteria>
+    - claudebox.sh contains `cat > "$HOME/.claudebox/SANDBOX.md" << 'SANDBOXEOF'`
+    - claudebox.sh contains the closing `SANDBOXEOF` heredoc delimiter
+    - SANDBOX.md heredoc contains `# Sandbox Environment`
+    - SANDBOX.md heredoc contains `## Installing Tools`
+    - SANDBOX.md heredoc contains `## Default Restrictions`
+    - SANDBOX.md heredoc contains `By default, the following are not mounted into the sandbox:`
+    - SANDBOX.md heredoc contains `If your setup has been customized, some of these may be available.`
+    - SANDBOX.md heredoc contains `## Git`
+    - SANDBOX.md heredoc contains `prefer HTTPS URLs over SSH`
+    - claudebox.sh contains `CLAUDEMD="$HOME/.claudebox/CLAUDE.md"`
+    - claudebox.sh contains `printf '%s\n' "@SANDBOX.md" > "$CLAUDEMD"`
+    - claudebox.sh contains `head -1 "$CLAUDEMD"` for first-line check
+    - claudebox.sh contains `mv "$tmp" "$CLAUDEMD"` for atomic prepend
+    - The SANDBOX.md/CLAUDE.md block appears AFTER `mkdir -p "$HOME/.claudebox"` and BEFORE gitconfig generation
+    - `nix build` succeeds (shellcheck passes)
+  </acceptance_criteria>
+  <done>claudebox.sh contains SANDBOX.md heredoc generation and CLAUDE.md import check between mkdir and gitconfig blocks. nix build succeeds with shellcheck passing.</done>
+</task>
+
+<task type="auto">
+  <name>Task 2: Verify file generation behavior end-to-end</name>
+  <files>claudebox.sh</files>
+  <read_first>
+    - claudebox.sh (the modified file from Task 1)
+  </read_first>
+  <action>
+Run claudebox in dry-run mode to verify the script executes past the SANDBOX.md/CLAUDE.md generation without errors, then verify the files were created correctly.
+
+**Step 1:** Clean slate test. Remove any existing SANDBOX.md and CLAUDE.md from ~/.claudebox/:
+```bash
+rm -f "$HOME/.claudebox/SANDBOX.md" "$HOME/.claudebox/CLAUDE.md"
+```
+
+**Step 2:** Run `claudebox --dry-run` (which executes all pre-bwrap logic including file generation, then prints the command and exits). This exercises the SANDBOX.md write and CLAUDE.md creation path (AWARE-01 -- first run, file does not exist).
+
+**Step 3:** Verify SANDBOX.md was created with correct content:
+- `head -1 ~/.claudebox/SANDBOX.md` should output `# Sandbox Environment`
+- `grep -c "## " ~/.claudebox/SANDBOX.md` should be 4 (four H2 sections: Installing Tools, Default Restrictions, Git, plus the H1 counts differently -- actually grep for `^## ` to count H2s, expect 3: Installing Tools, Default Restrictions, Git)
+
+**Step 4:** Verify CLAUDE.md was created with import line:
+- `cat ~/.claudebox/CLAUDE.md` should output exactly `@SANDBOX.md`
+
+**Step 5:** Idempotency test. Run `claudebox --dry-run` again. Verify:
+- SANDBOX.md was overwritten (check timestamp or content is identical)
+- CLAUDE.md still has exactly one `@SANDBOX.md` line on line 1 (no duplication)
+
+**Step 6:** Prepend test. Add user content to CLAUDE.md, remove the import line, then run again:
+```bash
+printf '%s\n' "# My custom instructions" "Do cool stuff" > ~/.claudebox/CLAUDE.md
+```
+Run `claudebox --dry-run`. Verify:
+- `head -1 ~/.claudebox/CLAUDE.md` outputs `@SANDBOX.md`
+- `head -3 ~/.claudebox/CLAUDE.md` shows import line followed by the user's content (preserved per D-03)
+  </action>
+  <verify>
+    <automated>cd /home/toph/code/tools/claudebox && rm -f "$HOME/.claudebox/SANDBOX.md" "$HOME/.claudebox/CLAUDE.md" && nix run . -- --dry-run --yes 2>/dev/null; head -1 "$HOME/.claudebox/SANDBOX.md" | grep -qF "# Sandbox Environment" && echo "SANDBOX.md OK" && cat "$HOME/.claudebox/CLAUDE.md" | head -1 | grep -qF "@SANDBOX.md" && echo "CLAUDE.md OK" && printf '%s\n' "# My stuff" > "$HOME/.claudebox/CLAUDE.md" && nix run . -- --dry-run --yes 2>/dev/null && head -1 "$HOME/.claudebox/CLAUDE.md" | grep -qF "@SANDBOX.md" && sed -n '2p' "$HOME/.claudebox/CLAUDE.md" | grep -qF "# My stuff" && echo "PREPEND OK"</automated>
+  </verify>
+  <acceptance_criteria>
+    - After first run with no existing files: ~/.claudebox/SANDBOX.md exists and starts with `# Sandbox Environment`
+    - After first run with no existing files: ~/.claudebox/CLAUDE.md exists and contains exactly `@SANDBOX.md`
+    - After second run: CLAUDE.md still has exactly one `@SANDBOX.md` on line 1 (no duplication)
+    - After removing import from CLAUDE.md and re-running: `@SANDBOX.md` is prepended, existing content preserved on subsequent lines
+    - grep -c "@SANDBOX.md" ~/.claudebox/CLAUDE.md returns 1 (exactly one import line)
+  </acceptance_criteria>
+  <done>SANDBOX.md generation, CLAUDE.md creation, CLAUDE.md import prepending, and idempotency all verified via dry-run execution.</done>
+</task>
+
+</tasks>
+
+<threat_model>
+## Trust Boundaries
+
+| Boundary | Description |
+|----------|-------------|
+| Host filesystem -> sandbox | Files written to ~/.claudebox/ on host are exposed as ~/.claude/ inside sandbox |
+| claudebox script -> user CLAUDE.md | Script modifies a user-owned file (prepend only) |
+
+## STRIDE Threat Register
+
+| Threat ID | Category | Component | Disposition | Mitigation Plan |
+|-----------|----------|-----------|-------------|-----------------|
+| T-03-01 | Tampering | SANDBOX.md content | accept | SANDBOX.md is informational only (context for Claude). Tampering has no security impact -- it cannot grant sandbox permissions. Overwritten every launch anyway (D-02). |
+| T-03-02 | Tampering | CLAUDE.md user content | mitigate | claudebox only modifies line 1 (import line). Uses mktemp+mv for atomic write. Preserves all existing content per D-03. No truncation risk. |
+| T-03-03 | Information Disclosure | SANDBOX.md reveals sandbox topology | accept | SANDBOX.md tells Claude what is NOT available (SSH, GPG, cloud creds). This is useful context, not a secret. Does not reveal host paths or configuration details beyond what Claude can already discover via `env` and filesystem exploration. |
+| T-03-04 | Denial of Service | CLAUDE.md write failure | accept | If mktemp or mv fails, `set -euo pipefail` aborts the script. User sees clear error. No data loss -- original CLAUDE.md unchanged (mv is atomic). |
+</threat_model>
+
+<verification>
+1. `nix build` succeeds (shellcheck validation)
+2. `claudebox --dry-run` creates SANDBOX.md with correct content
+3. `claudebox --dry-run` creates CLAUDE.md with @SANDBOX.md import on first run
+4. Repeated runs do not duplicate the import line
+5. Existing user content in CLAUDE.md is preserved when import is prepended
+</verification>
+
+<success_criteria>
+- claudebox.sh contains SANDBOX.md heredoc and CLAUDE.md import management code
+- nix build passes shellcheck
+- File generation works correctly for all three cases: first run (no files), repeat run (files exist), and recovery (import line missing)
+- SANDBOX.md content matches D-04/D-05/D-06 requirements
+</success_criteria>
+
+<output>
+After completion, create `.planning/phases/03-sandbox-aware-prompting/03-01-SUMMARY.md`
+</output>
diff --git a/.planning/phases/03-sandbox-aware-prompting/03-01-SUMMARY.md b/.planning/phases/03-sandbox-aware-prompting/03-01-SUMMARY.md
new file mode 100644
index 0000000..dba4e71
--- /dev/null
+++ b/.planning/phases/03-sandbox-aware-prompting/03-01-SUMMARY.md
@@ -0,0 +1,57 @@
+---
+phase: 03-sandbox-aware-prompting
+plan: 01
+subsystem: sandbox-prompting
+tags: [shell, claude-code, sandbox-context]
+dependency_graph:
+  requires: []
+  provides: [SANDBOX.md-generation, CLAUDE.md-import]
+  affects: [claudebox.sh]
+tech_stack:
+  added: []
+  patterns: [heredoc-generation, atomic-file-prepend]
+key_files:
+  created: []
+  modified: [claudebox.sh]
+decisions:
+  - Used head-1 string comparison instead of grep for first-line check (simpler, no grep dependency needed)
+metrics:
+  duration: 76s
+  completed: 2026-04-09
+  tasks: 2
+  files: 1
+---
+
+# Phase 03 Plan 01: Sandbox-Aware Prompting Summary
+
+SANDBOX.md heredoc generation and CLAUDE.md import management via head-1 check with atomic mktemp+mv prepend
+
+## What Was Done
+
+### Task 1: Add SANDBOX.md generation and CLAUDE.md import check
+
+Inserted a new block in claudebox.sh between `mkdir -p ~/.claudebox` and gitconfig generation. The block:
+
+1. Writes `~/.claudebox/SANDBOX.md` via single-quoted heredoc (no variable expansion) on every launch. Content covers: sandbox overview, tool installation (comma + nix shell), default restrictions with "by default" phrasing, and git identity/HTTPS guidance.
+
+2. Manages `~/.claudebox/CLAUDE.md` import line: creates file with `@SANDBOX.md` if missing, or prepends the import if first line doesn't match. Uses mktemp+mv for atomic write, preserving existing user content.
+
+### Task 2: End-to-end verification
+
+Verified three scenarios via `claudebox --dry-run --yes`:
+- **First run** (no files): SANDBOX.md created with correct content, CLAUDE.md created with `@SANDBOX.md`
+- **Idempotency**: Second run produces no duplicate import lines
+- **Prepend**: User content without import gets `@SANDBOX.md` prepended, existing content preserved
+
+## Commits
+
+| Task | Commit | Description |
+|------|--------|-------------|
+| 1 | 27d9db4 | feat(03-01): add SANDBOX.md generation and CLAUDE.md import check |
+| 2 | (verification only, no code changes) | |
+
+## Deviations from Plan
+
+None - plan executed exactly as written.
+
+## Self-Check: PASSED
diff --git a/.planning/phases/03-sandbox-aware-prompting/03-CONTEXT.md b/.planning/phases/03-sandbox-aware-prompting/03-CONTEXT.md
new file mode 100644
index 0000000..7706190
--- /dev/null
+++ b/.planning/phases/03-sandbox-aware-prompting/03-CONTEXT.md
@@ -0,0 +1,97 @@
+# Phase 3: Sandbox-Aware Prompting - Context
+
+**Gathered:** 2026-04-09
+**Status:** Ready for planning
+
+<domain>
+## Phase Boundary
+
+Inject sandbox context into Claude Code sessions so Claude knows its capabilities and constraints automatically on session start. Two files: a managed SANDBOX.md with sandbox context, and a reference line in CLAUDE.md that imports it via `@` syntax. Zero tool-use overhead.
+
+</domain>
+
+<decisions>
+## Implementation Decisions
+
+### File architecture
+- **D-01:** Two-file approach. claudebox manages `~/.claudebox/SANDBOX.md` (sandbox context) and ensures `~/.claudebox/CLAUDE.md` has an `@SANDBOX.md` import at the top. Since `~/.claudebox` is bind-mounted as `~/.claude` inside the sandbox, Claude Code auto-loads both files at session start.
+- **D-02:** `SANDBOX.md` is fully owned by claudebox — overwritten on every launch. User should not edit this file; changes are lost on next run.
+- **D-03:** `CLAUDE.md` is user-owned. claudebox only ensures the `@SANDBOX.md` import line exists at the top. If missing, it's re-added. All other content is untouched.
+
+### SANDBOX.md content
+- **D-04:** Friendly guide tone — short prose paragraphs, not terse bullets. Sections: sandbox overview, installing tools (comma + nix shell with examples), default restrictions (phrased as "by default, not mounted" to avoid contradicting user customizations), git setup.
+- **D-05:** Default restrictions use "by default" phrasing: "By default, the following are not mounted into the sandbox: SSH keys, GPG/age keys, cloud credentials, Tailscale." Includes note: "If your setup has been customized, some of these may be available."
+- **D-06:** Git section notes identity is pre-configured (name/email) and suggests HTTPS for remotes by default. Mentions safe.directory is set.
+
+### Import mechanism
+- **D-07:** Uses Claude Code's `@path` import syntax in CLAUDE.md. `@SANDBOX.md` at the first line. This is auto-expanded at session start — no Read tool call needed.
+
+### Launch-time behavior
+- **D-08:** On every launch, claudebox: (1) writes/overwrites `~/.claudebox/SANDBOX.md` with current content, (2) checks if `~/.claudebox/CLAUDE.md` exists — creates it with just the import line if not, (3) if CLAUDE.md exists, checks first line for the `@SANDBOX.md` import — prepends it if missing.
+
+### Claude's Discretion
+- Exact prose wording and section ordering in SANDBOX.md
+- How the first-line check works in shell (grep, head, etc.)
+- Whether to use a comment marker around the import line for robustness
+
+</decisions>
+
+<canonical_refs>
+## Canonical References
+
+**Downstream agents MUST read these before planning or implementing.**
+
+### Project Docs
+- `.planning/PROJECT.md` -- Core value, constraints, key decisions
+- `.planning/REQUIREMENTS.md` -- AWARE-01, AWARE-02 requirement definitions
+- `.planning/ROADMAP.md` -- Phase 3 success criteria
+
+### Prior Phase Context
+- `.planning/phases/01-minimal-viable-sandbox/01-CONTEXT.md` -- D-02 (comma-with-db), D-05 (git identity generation), sandbox mount structure
+- `.planning/phases/02-env-audit-and-cli-polish/02-CONTEXT.md` -- D-08 (--yes flag), existing flag parsing pattern
+
+### Implementation
+- `claudebox.sh` -- Current script, `mkdir -p ~/.claudebox` already runs at launch (line 102), bind-mount of `~/.claudebox` as `~/.claude` (line 294)
+- `flake.nix` -- Nix derivation structure
+
+</canonical_refs>
+
+<code_context>
+## Existing Code Insights
+
+### Reusable Assets
+- `mkdir -p "$HOME/.claudebox"` (line 102) -- already ensures the directory exists before any file operations
+- Existing ANSI color variables (BOLD, RESET, etc.) -- available if any output is needed during SANDBOX.md generation
+
+### Established Patterns
+- `writeShellApplication` with shellcheck and `set -euo pipefail`
+- Temp file pattern with trap cleanup (used for .gitconfig)
+- Flag parsing via while/case/esac
+
+### Integration Points
+- SANDBOX.md generation goes after `mkdir -p "$HOME/.claudebox"` and before the bwrap exec
+- CLAUDE.md import check runs alongside SANDBOX.md generation
+- No new flags needed -- this is automatic on every launch
+
+</code_context>
+
+<specifics>
+## Specific Ideas
+
+- `@SANDBOX.md` import resolves inside the sandbox because `~/.claudebox` becomes `~/.claude` -- the path is relative to CLAUDE.md's location
+- SANDBOX.md is a heredoc in claudebox.sh (simplest, no extra files in the derivation)
+- First-line check: if `head -1 ~/.claudebox/CLAUDE.md` doesn't contain `@SANDBOX.md`, prepend it
+
+</specifics>
+
+<deferred>
+## Deferred Ideas
+
+None -- discussion stayed within phase scope
+
+</deferred>
+
+---
+
+*Phase: 03-sandbox-aware-prompting*
+*Context gathered: 2026-04-09*
diff --git a/.planning/phases/03-sandbox-aware-prompting/03-DISCUSSION-LOG.md b/.planning/phases/03-sandbox-aware-prompting/03-DISCUSSION-LOG.md
new file mode 100644
index 0000000..1242adb
--- /dev/null
+++ b/.planning/phases/03-sandbox-aware-prompting/03-DISCUSSION-LOG.md
@@ -0,0 +1,74 @@
+# Phase 3: Sandbox-Aware Prompting - Discussion Log
+
+> **Audit trail only.** Do not use as input to planning, research, or execution agents.
+> Decisions are captured in CONTEXT.md -- this log preserves the alternatives considered.
+
+**Date:** 2026-04-09
+**Phase:** 03-sandbox-aware-prompting
+**Areas discussed:** CLAUDE.md content, Generation behavior, Template storage
+
+---
+
+## CLAUDE.md Content
+
+### Tone
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| Terse reference | Bullet points, no prose. Minimal, scannable. | |
+| Friendly guide | Short prose paragraphs explaining the sandbox, restrictions, and tools. | ✓ |
+| Minimal one-liner | Single paragraph, bare minimum. | |
+
+**User's choice:** Friendly guide
+**Notes:** User noted that users might inject certain secrets themselves (e.g., personal git SSH keys), so the CLAUDE.md shouldn't confuse Claude with absolute "no SSH keys" claims.
+
+### Unavailable Section Phrasing
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| Default-aware phrasing | "By default, these are not mounted: ..." -- accurate without contradicting customizations. | ✓ |
+| Omit unavailable section | Don't list restrictions at all. | |
+| Dynamic generation | Inspect mounts at launch and generate restrictions dynamically. | |
+
+**User's choice:** Default-aware phrasing
+**Notes:** None
+
+### Git Section
+
+| Option | Description | Selected |
+|--------|-------------|----------|
+| Yes, brief note | Mention git identity is pre-configured, suggest HTTPS for remotes. | ✓ |
+| Skip it | Git just works, let Claude figure it out. | |
+
+**User's choice:** Yes, brief note
+**Notes:** None
+
+---
+
+## Generation Behavior
+
+### Context Injection Mechanism
+
+User redirected the discussion: instead of managing CLAUDE.md directly, use a separate SANDBOX.md file with Claude Code's `@path` import syntax. This avoids touching user content and eliminates tool-use token overhead.
+
+**Final approach:** claudebox writes SANDBOX.md (managed, overwritten each launch) and ensures CLAUDE.md has `@SANDBOX.md` import at top line (checked/re-added each launch).
+
+**User's insight:** "We don't need to write CLAUDE.md like that at all. We can just write a separate file and add a quick reference at the top."
+
+---
+
+## Template Storage
+
+Folded into Generation behavior -- SANDBOX.md content lives as a heredoc in claudebox.sh. No separate template file needed since the two-file architecture resolved the storage question.
+
+---
+
+## Claude's Discretion
+
+- Exact prose wording in SANDBOX.md
+- Shell implementation of first-line check
+- Comment markers around import line
+
+## Deferred Ideas
+
+None
diff --git a/.planning/phases/03-sandbox-aware-prompting/03-RESEARCH.md b/.planning/phases/03-sandbox-aware-prompting/03-RESEARCH.md
new file mode 100644
index 0000000..0b5b53c
--- /dev/null
+++ b/.planning/phases/03-sandbox-aware-prompting/03-RESEARCH.md
@@ -0,0 +1,280 @@
+# Phase 3: Sandbox-Aware Prompting - Research
+
+**Researched:** 2026-04-09
+**Domain:** Shell scripting (heredoc generation, file manipulation), Claude Code memory system
+**Confidence:** HIGH
+
+## Summary
+
+Phase 3 adds two files to `~/.claudebox/` on every claudebox launch: a fully-managed `SANDBOX.md` containing sandbox context, and a user-owned `CLAUDE.md` that imports it via Claude Code's `@` syntax. Since `~/.claudebox` is bind-mounted as `~/.claude` inside the sandbox (SAND-08, already implemented), Claude Code automatically loads both files at session start.
+
+The implementation is straightforward shell scripting -- a heredoc write for SANDBOX.md and a head/grep check for the CLAUDE.md import line. The main research concern was verifying that `@SANDBOX.md` relative imports work correctly from `~/.claude/CLAUDE.md`. Official documentation confirms relative paths resolve relative to the containing file, which is the behavior we need. There are known issues with tilde-expansion imports (`@~/.claude/foo.md`) but our case uses a simple filename in the same directory, which is the simplest and most reliable form.
+
+**Primary recommendation:** Implement as a single contiguous block in `claudebox.sh` between the existing `mkdir -p "$HOME/.claudebox"` (line 102) and the gitconfig generation (line 104). Use a heredoc for SANDBOX.md content and simple `head -1` / `grep` for the CLAUDE.md import check.
+
+<user_constraints>
+
+## User Constraints (from CONTEXT.md)
+
+### Locked Decisions
+- **D-01:** Two-file approach. claudebox manages `~/.claudebox/SANDBOX.md` (sandbox context) and ensures `~/.claudebox/CLAUDE.md` has an `@SANDBOX.md` import at the top. Since `~/.claudebox` is bind-mounted as `~/.claude` inside the sandbox, Claude Code auto-loads both files at session start.
+- **D-02:** `SANDBOX.md` is fully owned by claudebox -- overwritten on every launch. User should not edit this file; changes are lost on next run.
+- **D-03:** `CLAUDE.md` is user-owned. claudebox only ensures the `@SANDBOX.md` import line exists at the top. If missing, it's re-added. All other content is untouched.
+- **D-04:** Friendly guide tone -- short prose paragraphs, not terse bullets. Sections: sandbox overview, installing tools (comma + nix shell with examples), default restrictions (phrased as "by default, not mounted" to avoid contradicting user customizations), git setup.
+- **D-05:** Default restrictions use "by default" phrasing: "By default, the following are not mounted into the sandbox: SSH keys, GPG/age keys, cloud credentials, Tailscale." Includes note: "If your setup has been customized, some of these may be available."
+- **D-06:** Git section notes identity is pre-configured (name/email) and suggests HTTPS for remotes by default. Mentions safe.directory is set.
+- **D-07:** Uses Claude Code's `@path` import syntax in CLAUDE.md. `@SANDBOX.md` at the first line. This is auto-expanded at session start -- no Read tool call needed.
+- **D-08:** On every launch, claudebox: (1) writes/overwrites `~/.claudebox/SANDBOX.md` with current content, (2) checks if `~/.claudebox/CLAUDE.md` exists -- creates it with just the import line if not, (3) if CLAUDE.md exists, checks first line for the `@SANDBOX.md` import -- prepends it if missing.
+
+### Claude's Discretion
+- Exact prose wording and section ordering in SANDBOX.md
+- How the first-line check works in shell (grep, head, etc.)
+- Whether to use a comment marker around the import line for robustness
+
+### Deferred Ideas (OUT OF SCOPE)
+None -- discussion stayed within phase scope
+
+</user_constraints>
+
+<phase_requirements>
+
+## Phase Requirements
+
+| ID | Description | Research Support |
+|----|-------------|------------------|
+| AWARE-01 | Default `CLAUDE.md` is created in `~/.claudebox/` on first run if not present | D-08 defines exact behavior; shell pattern for conditional file creation is standard |
+| AWARE-02 | Injected CLAUDE.md tells Claude it's in a sandbox, how to use comma/nix for tools, and what's not available | D-04/D-05/D-06 define content structure; `@SANDBOX.md` import mechanism verified against official docs |
+
+</phase_requirements>
+
+## Standard Stack
+
+No new libraries or packages. This phase is pure shell scripting within the existing `claudebox.sh` and `flake.nix` structure.
+
+| Tool | Purpose | Already Available |
+|------|---------|-------------------|
+| `cat` (heredoc) | Write SANDBOX.md content | Yes (coreutils in runtimeInputs) |
+| `head` | Check first line of CLAUDE.md | Yes (coreutils) |
+| `grep` | Pattern match for import line | Available but not in runtimeInputs -- use shell builtins instead |
+| `sed` | Prepend line to file | Available but not in runtimeInputs -- use temp file + cat instead |
+
+**Key constraint:** The file manipulation runs on the HOST before `exec bwrap`, so host tools are available. No need to worry about sandbox PATH limitations for this code.
+
+## Architecture Patterns
+
+### Integration Point in claudebox.sh
+
+The new code inserts between line 102 (`mkdir -p "$HOME/.claudebox"`) and line 104 (gitconfig generation). This is the natural location because:
+1. `~/.claudebox` directory is guaranteed to exist
+2. File generation happens before the bwrap exec
+3. Groups all pre-launch setup together
+
+```
+# Existing line 102
+mkdir -p "$HOME/.claudebox"
+
+# NEW: SANDBOX.md generation (D-02)
+# NEW: CLAUDE.md import check (D-03, D-08)
+
+# Existing line 104+
+GIT_NAME=$(git config --global user.name ...)
+```
+
+### Pattern: Heredoc for SANDBOX.md
+
+Write SANDBOX.md using a heredoc. This keeps content inline in claudebox.sh (as noted in CONTEXT.md specifics section) with no extra files in the derivation.
+
+```bash
+# Write SANDBOX.md (overwritten every launch per D-02)
+cat > "$HOME/.claudebox/SANDBOX.md" << 'SANDBOXEOF'
+# Sandbox Environment
+
+You are running inside a bubblewrap (bwrap) sandbox...
+SANDBOXEOF
+```
+
+Use single-quoted heredoc delimiter (`'SANDBOXEOF'`) to prevent variable expansion -- the SANDBOX.md content is static text, no shell variables needed. [VERIFIED: standard bash heredoc behavior]
+
+### Pattern: CLAUDE.md Import Check
+
+```bash
+# Ensure CLAUDE.md has @SANDBOX.md import (D-03, D-08)
+CLAUDEMD="$HOME/.claudebox/CLAUDE.md"
+IMPORT_LINE="@SANDBOX.md"
+
+if [[ ! -f "$CLAUDEMD" ]]; then
+  # First run: create with just the import line
+  echo "$IMPORT_LINE" > "$CLAUDEMD"
+elif ! head -1 "$CLAUDEMD" | grep -qF "$IMPORT_LINE"; then
+  # Exists but missing import: prepend
+  tmp=$(mktemp)
+  { echo "$IMPORT_LINE"; cat "$CLAUDEMD"; } > "$tmp"
+  mv "$tmp" "$CLAUDEMD"
+fi
+```
+
+This pattern:
+- Creates the file if missing (AWARE-01)
+- Checks only the first line for the import (D-07, D-08)
+- Prepends without destroying existing content (D-03)
+- Uses `grep -qF` for fixed-string match (no regex needed)
+- Uses mktemp + mv for atomic write (established pattern in codebase)
+
+### Anti-Patterns to Avoid
+
+- **Inline sed -i for prepending:** Non-portable, and the script already uses the mktemp+mv pattern for gitconfig. Stay consistent.
+- **Writing SANDBOX.md content inside CLAUDE.md:** Defeats the purpose of the two-file approach. SANDBOX.md is overwritten every launch; CLAUDE.md is user-owned.
+- **Variable expansion in SANDBOX.md heredoc:** Would break if user's env has unexpected values. Use single-quoted delimiter.
+
+## Don't Hand-Roll
+
+| Problem | Don't Build | Use Instead | Why |
+|---------|-------------|-------------|-----|
+| Atomic file write | Custom lock files | `mktemp` + `mv` | Already the pattern in codebase (gitconfig), atomic on same filesystem |
+| First-line check | Complex parsing | `head -1 \| grep -qF` | One-liner, POSIX-compatible, no edge cases for this use |
+
+## Common Pitfalls
+
+### Pitfall 1: Heredoc Indentation
+**What goes wrong:** Using `<<-` with tabs for indentation but the content has mixed tabs/spaces, producing wrong output.
+**Why it happens:** `<<-` only strips leading tabs, not spaces.
+**How to avoid:** Use `<<` (no dash) with no indentation in the heredoc body. The SANDBOX.md content should be flush-left in the script.
+**Warning signs:** SANDBOX.md has unexpected leading whitespace.
+
+### Pitfall 2: Import Line Getting Duplicated
+**What goes wrong:** If the check logic has a bug, `@SANDBOX.md` could appear multiple times in CLAUDE.md.
+**Why it happens:** Checking for the line anywhere in the file instead of just line 1, or not checking at all.
+**How to avoid:** Always check `head -1` specifically. The import must be on line 1 for Claude Code to process it before other content.
+**Warning signs:** Multiple `@SANDBOX.md` lines in CLAUDE.md after several launches.
+
+### Pitfall 3: CLAUDE.md Import Approval Dialog
+**What goes wrong:** Claude Code shows an approval dialog for the `@SANDBOX.md` import, breaking the "zero friction" goal.
+**Why it happens:** Claude Code may prompt for approval when it encounters external imports for the first time.
+**How to avoid:** The official docs state this dialog appears for "external imports in a project" -- user-level `~/.claude/CLAUDE.md` imports may behave differently since they are user-owned. This needs testing. If the dialog appears, it's a one-time approval.
+**Warning signs:** User sees "approve imports" prompt on first claudebox session.
+**Risk level:** LOW -- even if it appears, it's one-time and self-explanatory.
+
+### Pitfall 4: Race Condition with Temp File
+**What goes wrong:** If claudebox is killed between writing the temp file and mv, a stale temp file is left behind.
+**Why it happens:** No cleanup trap for this specific temp file.
+**How to avoid:** The existing trap at line 109 cleans up `$GITCONFIG_TMP`. Either extend that trap or accept that orphan temp files in /tmp are harmless (cleaned on reboot).
+**Warning signs:** None in practice -- /tmp is ephemeral.
+
+## Code Examples
+
+### SANDBOX.md Content Structure
+
+Based on decisions D-04, D-05, D-06, the content should follow this structure. Exact prose is Claude's discretion.
+
+```markdown
+# Sandbox Environment
+
+You are running inside a bubblewrap (bwrap) sandbox managed by claudebox.
+Your filesystem is isolated -- only the current working directory and
+essential system paths are mounted.
+
+## Installing Tools
+
+You have two ways to install tools on the fly:
+
+**Comma (preferred for quick one-off commands):**
+`, ripgrep` runs ripgrep without permanent installation. Comma uses
+nix-index to find the right package automatically.
+
+**Nix shell (for persistent access within the session):**
+`nix shell nixpkgs#python3 -c python3 script.py` runs a command with
+a package available. To keep it in your PATH for the session:
+`nix shell nixpkgs#python3` then use `python3` normally.
+
+## Default Restrictions
+
+By default, the following are not mounted into the sandbox:
+- SSH keys (~/.ssh)
+- GPG and age keys (~/.gnupg, age key files)
+- Cloud credentials (~/.aws, ~/.config/gcloud)
+- Tailscale state
+
+If your setup has been customized, some of these may be available.
+
+## Git
+
+Your git identity (name and email) is pre-configured from the host.
+The `safe.directory` setting trusts the mounted working directory.
+For remote operations, prefer HTTPS URLs over SSH since SSH keys
+are not available by default.
+```
+
+### Shell Implementation
+
+```bash
+# === Sandbox-aware prompting (AWARE-01, AWARE-02) ===
+
+# Write SANDBOX.md -- fully managed, overwritten every launch (D-02)
+cat > "$HOME/.claudebox/SANDBOX.md" << 'SANDBOXEOF'
+[content here]
+SANDBOXEOF
+
+# Ensure CLAUDE.md has @SANDBOX.md import (D-03, D-08, AWARE-01)
+CLAUDEMD="$HOME/.claudebox/CLAUDE.md"
+if [[ ! -f "$CLAUDEMD" ]]; then
+  printf '%s\n' "@SANDBOX.md" > "$CLAUDEMD"
+elif [[ "$(head -1 "$CLAUDEMD")" != "@SANDBOX.md" ]]; then
+  tmp=$(mktemp)
+  { printf '%s\n' "@SANDBOX.md"; cat "$CLAUDEMD"; } > "$tmp"
+  mv "$tmp" "$CLAUDEMD"
+fi
+```
+
+Note: Using `[[ "$(head -1 ...)" != "@SANDBOX.md" ]]` is simpler and avoids needing grep. Exact string comparison on the first line. [VERIFIED: standard bash string comparison]
+
+## State of the Art
+
+| Old Approach | Current Approach | When Changed | Impact |
+|--------------|------------------|--------------|--------|
+| Monolithic CLAUDE.md | `@import` syntax for modular files | Claude Code ~1.0 (2025) | Allows splitting managed vs user-owned content cleanly |
+| `--append-system-prompt` | `~/.claude/CLAUDE.md` + `@imports` | Claude Code memory system | File-based approach persists across sessions without CLI flags |
+
+**Claude Code `@import` behavior (verified):**
+- Relative paths resolve relative to the containing file [CITED: code.claude.com/docs/en/memory]
+- Maximum import depth: 5 hops [CITED: code.claude.com/docs/en/memory]
+- User-level `~/.claude/CLAUDE.md` is loaded at session start for all projects [CITED: code.claude.com/docs/en/memory]
+- First 200 lines or 25KB of MEMORY.md is the auto-memory limit, but CLAUDE.md files load in full [CITED: code.claude.com/docs/en/memory]
+
+## Assumptions Log
+
+| # | Claim | Section | Risk if Wrong |
+|---|-------|---------|---------------|
+| A1 | `@SANDBOX.md` in `~/.claude/CLAUDE.md` resolves to `~/.claude/SANDBOX.md` without needing `./` prefix | Architecture Patterns | Import silently fails, Claude doesn't see sandbox context. Mitigation: test on first run. |
+| A2 | User-level `~/.claude/CLAUDE.md` imports don't trigger an approval dialog | Pitfall 3 | One-time dialog appears. Low impact -- user approves once. |
+
+## Open Questions
+
+1. **Does `@SANDBOX.md` (no path prefix) resolve correctly from `~/.claude/CLAUDE.md`?**
+   - What we know: Official docs say relative paths resolve relative to the containing file. The simplest form `@filename` should resolve to the same directory.
+   - What's unclear: Known bugs exist with tilde-expansion paths (`@~/.claude/foo.md`), but our case is simpler (same-directory, no tilde).
+   - Recommendation: Test during implementation. If it fails, try `@./SANDBOX.md` as fallback.
+
+2. **Comment marker for the import line?**
+   - What we know: D-03 says claudebox only touches the first line. A comment marker could help identify it as managed.
+   - What's unclear: Whether adding a comment on the same line as `@SANDBOX.md` breaks the import.
+   - Recommendation: Keep the import line bare (`@SANDBOX.md` only). Add a comment on line 2 if desired: `<!-- managed by claudebox -->`.
+
+## Sources
+
+### Primary (HIGH confidence)
+- [Claude Code Memory Documentation](https://code.claude.com/docs/en/memory) -- `@import` syntax, resolution rules, file loading order, user-level CLAUDE.md behavior
+- `claudebox.sh` (current codebase) -- existing patterns, integration points, line numbers
+
+### Secondary (MEDIUM confidence)
+- [GitHub Issue #4754](https://github.com/anthropics/claude-code/issues/4754) -- relative path resolution bug (old version, different case than ours)
+- [GitHub Issue #8765](https://github.com/anthropics/claude-code/issues/8765) -- tilde expansion bug (different from our same-directory case)
+
+## Metadata
+
+**Confidence breakdown:**
+- Standard stack: HIGH -- no new dependencies, pure shell scripting
+- Architecture: HIGH -- integration point is clear, patterns established in codebase
+- Pitfalls: HIGH -- well-understood shell patterns, main risk is Claude Code import behavior (low consequence)
+
+**Research date:** 2026-04-09
+**Valid until:** 2026-05-09 (Claude Code import syntax is stable)
diff --git a/.planning/phases/03-sandbox-aware-prompting/03-REVIEW.md b/.planning/phases/03-sandbox-aware-prompting/03-REVIEW.md
new file mode 100644
index 0000000..f654e10
--- /dev/null
+++ b/.planning/phases/03-sandbox-aware-prompting/03-REVIEW.md
@@ -0,0 +1,122 @@
+---
+phase: 03-sandbox-aware-prompting
+reviewed: 2026-04-09T12:00:00Z
+depth: standard
+files_reviewed: 1
+files_reviewed_list:
+  - claudebox.sh
+findings:
+  critical: 1
+  warning: 3
+  info: 1
+  total: 5
+status: issues_found
+---
+
+# Phase 3: Code Review Report
+
+**Reviewed:** 2026-04-09
+**Depth:** standard
+**Files Reviewed:** 1
+**Status:** issues_found
+
+## Summary
+
+Reviewed `claudebox.sh`, the main wrapper script for sandboxed Claude Code execution. The script is well-structured with a clear security model (allowlist-based env, bwrap isolation, secret masking). Found one critical resource leak caused by `exec` bypassing the EXIT trap, two security-adjacent warnings around gitconfig trust and the env escape hatch, and a masking logic issue that can leak most of a short secret.
+
+## Critical Issues
+
+### CR-01: EXIT trap never fires -- temp gitconfig leaks on every invocation
+
+**File:** `claudebox.sh:160-161`
+**Issue:** The script sets `trap 'rm -f "$GITCONFIG_TMP"' EXIT` at line 160, but the script always terminates via `exec bwrap` at line 326. `exec` replaces the current process, so the EXIT trap never executes. A new temp file is leaked in `/tmp` on every claudebox invocation.
+**Fix:**
+The gitconfig tmpfile is bind-mounted read-only into the sandbox, so it must exist for the lifetime of the bwrap process. Clean it up by running bwrap in the background, waiting, then cleaning up -- or write it to a deterministic path that gets overwritten each launch:
+```bash
+# Option A: Use a fixed path instead of mktemp (simplest)
+GITCONFIG_TMP="$HOME/.claudebox/.gitconfig.tmp"
+# Remove the trap entirely -- file is overwritten each launch
+
+# Option B: Fork instead of exec, clean up after
+bwrap ... &
+BWRAP_PID=$!
+wait "$BWRAP_PID"
+EXIT_CODE=$?
+rm -f "$GITCONFIG_TMP"
+exit "$EXIT_CODE"
+```
+Option A is recommended -- it eliminates the leak with no complexity cost and the file lives in the user-owned `.claudebox` directory.
+
+## Warnings
+
+### WR-01: `safe.directory = *` trusts all git directories
+
+**File:** `claudebox.sh:167`
+**Issue:** The generated gitconfig sets `safe.directory = *`, which tells git to trust ownership of any directory. While the sandbox limits mounted paths, this is broader than necessary -- only `$CWD` needs to be trusted. If mount scope changes in the future, this becomes a real risk.
+**Fix:**
+```bash
+cat > "$GITCONFIG_TMP" <<GITEOF
+[user]
+    name = $GIT_NAME
+    email = $GIT_EMAIL
+[safe]
+    directory = $CWD
+GITEOF
+```
+
+### WR-02: CLAUDEBOX_EXTRA_ENV can smuggle secrets past the allowlist
+
+**File:** `claudebox.sh:213-223`
+**Issue:** The `CLAUDEBOX_EXTRA_ENV` escape hatch passes any named host variable into the sandbox without validation. A parent process (or shell profile) setting `CLAUDEBOX_EXTRA_ENV=AWS_SECRET_ACCESS_KEY,GH_TOKEN` would inject secrets that the core allowlist intentionally excludes. The audit display provides visibility but not prevention.
+**Fix:** Consider a denylist check for known secret-bearing variable names:
+```bash
+DENY_PATTERN="^(AWS_|GH_TOKEN|GITHUB_TOKEN|SSH_|GPG_|AGE_)"
+for var in "${EXTRAS[@]}"; do
+  var="${var// /}"
+  if [[ "$var" =~ $DENY_PATTERN ]]; then
+    echo "${RED}Blocked${RESET} $var -- secret variable cannot be passed via CLAUDEBOX_EXTRA_ENV" >&2
+    continue
+  fi
+  # ... existing logic
+done
+```
+
+### WR-03: mask_value leaks most of short secrets
+
+**File:** `claudebox.sh:83-86`
+**Issue:** The masking threshold is 11 characters. For a 12-character secret, `mask_value` displays the first 7 and last 4 characters (11 of 12 visible). This effectively leaks the entire value. The issue scales: a 15-char secret shows 11 of 15 characters.
+**Fix:** Raise the threshold so the masked portion is always the majority of the value:
+```bash
+mask_value() {
+  local name="$1" value="$2"
+  local upper="${name^^}"
+  if [[ "$upper" == *KEY* || "$upper" == *TOKEN* || "$upper" == *SECRET* || "$upper" == *PASSWORD* || "$upper" == *CREDENTIAL* ]]; then
+    if (( ${#value} > 20 )); then
+      echo "${value:0:4}...${value: -4}"
+    else
+      echo "***"
+    fi
+  else
+    echo "$value"
+  fi
+}
+```
+
+## Info
+
+### IN-01: --dangerously-skip-permissions is always passed to claude
+
+**File:** `claudebox.sh:287`
+**Issue:** The `--dangerously-skip-permissions` flag is unconditionally added to the Claude command. This is intentional per the design (bwrap IS the permission boundary), but it means the sandbox's security depends entirely on the bwrap mount list being correct. Worth documenting this invariant prominently.
+**Fix:** Add a comment explaining the security invariant:
+```bash
+# Security invariant: --dangerously-skip-permissions is safe here because
+# bwrap's mount list is the security boundary, not Claude's permission system.
+# If you change the mount list, audit what Claude can access.
+```
+
+---
+
+_Reviewed: 2026-04-09_
+_Reviewer: Claude (gsd-code-reviewer)_
+_Depth: standard_
diff --git a/.planning/phases/03-sandbox-aware-prompting/03-VERIFICATION.md b/.planning/phases/03-sandbox-aware-prompting/03-VERIFICATION.md
new file mode 100644
index 0000000..4ffb720
--- /dev/null
+++ b/.planning/phases/03-sandbox-aware-prompting/03-VERIFICATION.md
@@ -0,0 +1,83 @@
+---
+phase: 03-sandbox-aware-prompting
+verified: 2026-04-09T21:30:00Z
+status: passed
+score: 4/4
+overrides_applied: 0
+---
+
+# Phase 3: Sandbox-Aware Prompting Verification Report
+
+**Phase Goal:** Claude inside the sandbox knows it is sandboxed, how to install tools, and what is unavailable
+**Verified:** 2026-04-09T21:30:00Z
+**Status:** passed
+**Re-verification:** No -- initial verification
+
+## Goal Achievement
+
+### Observable Truths
+
+| # | Truth | Status | Evidence |
+|---|-------|--------|----------|
+| 1 | Claude inside the sandbox sees SANDBOX.md content describing its environment | VERIFIED | SANDBOX.md heredoc in claudebox.sh (lines 107-143) contains all 4 sections: Sandbox Environment, Installing Tools, Default Restrictions, Git. Bind mount at line 345 maps ~/.claudebox to ~/.claude inside sandbox. |
+| 2 | CLAUDE.md in ~/.claudebox/ exists after first launch with @SANDBOX.md import on line 1 | VERIFIED | Lines 146-153: creates file with `@SANDBOX.md` if missing, prepends if first line differs. Behavioral test confirmed: first-run creates CLAUDE.md with exactly `@SANDBOX.md`. |
+| 3 | SANDBOX.md is overwritten on every launch with current content | VERIFIED | `cat >` (line 107) unconditionally writes the file on every invocation. No conditional guard -- always overwrites. |
+| 4 | Existing user content in CLAUDE.md is preserved when import line is prepended | VERIFIED | Lines 150-152: mktemp + printf + cat + mv pattern preserves existing content. Behavioral test confirmed: user content "# My custom stuff" preserved on line 2 after prepend. |
+
+**Score:** 4/4 truths verified
+
+### Required Artifacts
+
+| Artifact | Expected | Status | Details |
+|----------|----------|--------|---------|
+| `claudebox.sh` | SANDBOX.md generation and CLAUDE.md import check | VERIFIED | Lines 104-153 contain the full implementation with section comment, heredoc, and import management |
+| `~/.claudebox/SANDBOX.md` | Sandbox context for Claude Code | VERIFIED | Generated at runtime; confirmed via dry-run. Contains "bubblewrap" (line 110) |
+| `~/.claudebox/CLAUDE.md` | User-owned CLAUDE.md with managed import | VERIFIED | Generated at runtime; confirmed via dry-run. Contains "@SANDBOX.md" on line 1 |
+
+### Key Link Verification
+
+| From | To | Via | Status | Details |
+|------|----|-----|--------|---------|
+| claudebox.sh | ~/.claudebox/SANDBOX.md | heredoc write on every launch | WIRED | `cat > "$HOME/.claudebox/SANDBOX.md" << 'SANDBOXEOF'` at line 107 |
+| ~/.claudebox/CLAUDE.md | ~/.claudebox/SANDBOX.md | @SANDBOX.md import on line 1 | WIRED | `printf '%s\n' "@SANDBOX.md"` at lines 148/151 |
+| bwrap --bind ~/.claudebox ~/.claude | Claude Code session | bind mount | WIRED | `--bind "$HOME/.claudebox" "$HOME/.claude"` at line 345 (exec) and line 316 (dry-run) |
+
+### Data-Flow Trace (Level 4)
+
+Not applicable -- this phase generates static configuration files, not dynamic data rendering.
+
+### Behavioral Spot-Checks
+
+| Behavior | Command | Result | Status |
+|----------|---------|--------|--------|
+| First-run creates SANDBOX.md | `rm -f ~/.claudebox/SANDBOX.md && nix run . -- --dry-run --yes; head -1 ~/.claudebox/SANDBOX.md` | `# Sandbox Environment` | PASS |
+| First-run creates CLAUDE.md | `rm -f ~/.claudebox/CLAUDE.md && nix run . -- --dry-run --yes; cat ~/.claudebox/CLAUDE.md` | `@SANDBOX.md` | PASS |
+| Idempotency (no duplicate import) | `nix run . -- --dry-run --yes; grep -c '@SANDBOX.md' ~/.claudebox/CLAUDE.md` | `1` | PASS |
+| Prepend preserves user content | Write user content, run dry-run, check lines 1-2 | line 1: `@SANDBOX.md`, line 2: `# My custom stuff` | PASS |
+| SANDBOX.md has 3 H2 sections | `grep -c '^## ' ~/.claudebox/SANDBOX.md` | `3` | PASS |
+
+### Requirements Coverage
+
+| Requirement | Source Plan | Description | Status | Evidence |
+|-------------|------------|-------------|--------|----------|
+| AWARE-01 | 03-01-PLAN | Default CLAUDE.md created in ~/.claudebox/ on first run if not present | SATISFIED | Lines 146-148: creates file with @SANDBOX.md if not present. Behavioral test confirmed. |
+| AWARE-02 | 03-01-PLAN | Injected CLAUDE.md tells Claude about sandbox, comma/nix, and unavailable resources | SATISFIED | SANDBOX.md heredoc (lines 108-142) covers all three topics: bwrap sandbox identity, comma + nix shell installation, SSH/GPG/cloud restriction list. CLAUDE.md imports it via @SANDBOX.md. |
+
+### Anti-Patterns Found
+
+| File | Line | Pattern | Severity | Impact |
+|------|------|---------|----------|--------|
+| (none) | - | - | - | No TODOs, FIXMEs, placeholders, or stubs found |
+
+### Human Verification Required
+
+No human verification items identified. All behaviors verified programmatically via dry-run execution.
+
+### Gaps Summary
+
+No gaps found. All 4 must-have truths verified, all 3 artifacts confirmed, all 3 key links wired, both requirements (AWARE-01, AWARE-02) satisfied. Behavioral spot-checks all pass.
+
+---
+
+_Verified: 2026-04-09T21:30:00Z_
+_Verifier: Claude (gsd-verifier)_
diff --git a/.planning/phases/05-per-project-instance-isolation/05-01-SUMMARY.md b/.planning/phases/05-per-project-instance-isolation/05-01-SUMMARY.md
new file mode 100644
index 0000000..27077a7
--- /dev/null
+++ b/.planning/phases/05-per-project-instance-isolation/05-01-SUMMARY.md
@@ -0,0 +1,118 @@
+---
+phase: 05-per-project-instance-isolation
+plan: "01"
+subsystem: sandbox-mount-architecture
+tags: [bwrap, mounts, isolation, per-project, instance-hash, worktree]
+dependency-graph:
+  requires: []
+  provides: [per-project-instance-isolation, direct-claude-bind, instance-hash-dirs]
+  affects: [claudebox.sh, REQUIREMENTS.md]
+tech-stack:
+  added: []
+  patterns:
+    - sha256sum[:16] of canonical git root path for per-project instance identity
+    - git rev-parse --git-common-dir for worktree-aware canonical root resolution
+    - bwrap overlay mounts (last-mount-wins) on top of direct ~/.claude bind
+key-files:
+  created: []
+  modified:
+    - claudebox.sh
+    - .planning/REQUIREMENTS.md
+decisions:
+  - D-01: Direct bind of ~/.claude (not ~/.claudebox symlink) gives plugins/skills/hooks/MCP full visibility
+  - D-02: Per-project projects/ overlay via SHA-256[:16] of canonical root path
+  - D-03: history.jsonl bind overlay from ~/.claudebox/history.jsonl
+  - D-06: SANDBOX.md injected as file overlay; CLAUDE.md injection removed (user's real CLAUDE.md already has @SANDBOX.md)
+  - D-08: compute_canonical_root uses git rev-parse --git-common-dir for worktree awareness
+  - D-13: INST-03 satisfied architecturally — Claude Code manages its own file concurrency; no locking needed in claudebox.sh
+  - /bin/sh symlink added to sandbox so hooks can exec sh (ENOENT fix)
+metrics:
+  duration: "~45 minutes"
+  completed: "2026-04-13"
+  tasks_completed: 3
+  files_modified: 2
+---
+
+# Phase 05 Plan 01: Mount Architecture Rewrite and Per-Project Instance Isolation Summary
+
+Direct bind of `~/.claude` into sandbox with SHA-256-keyed per-project overlay mounts, replacing the old `~/.claudebox` symlink approach that hid all plugins, skills, hooks, and MCP configs from Claude Code.
+
+## What Was Built
+
+### Mount Architecture Rewrite (claudebox.sh)
+
+Replaced the old mount approach (`--bind ~/.claudebox ~/.claudebox` + `--symlink ~/.claudebox ~/.claude`) with a new architecture:
+
+- `--bind "$HOME/.claude" "$HOME/.claude"` — direct bind, makes all Claude Code config (plugins, skills, hooks, MCP, commands, settings) visible inside the sandbox (D-01)
+- `--bind "$INSTANCE_DIR" "$HOME/.claude/projects"` — per-project overlay; each project gets its own isolated directory mounted over the real `~/.claude/projects/` (D-02, INST-01)
+- `--bind "$HOME/.claudebox/history.jsonl" "$HOME/.claude/history.jsonl"` — history overlay; conversation history stored sandbox-side (D-03)
+- `--bind "$HOME/.claudebox/SANDBOX.md" "$HOME/.claude/SANDBOX.md"` — SANDBOX.md injected as file overlay (D-06)
+- `--bind "$CREDS_FILE" "$HOME/.claude/.credentials.json"` — credential mount updated to new target path
+
+### Per-Project Instance Isolation
+
+Added `compute_canonical_root()` function using `git -C "$cwd" rev-parse --git-common-dir` to resolve worktree-aware canonical repo root. Git worktrees return a path pointing to the main worktree's `.git/`, so `dirname(readlink -f(git_common))` gives the main worktree root for any worktree.
+
+Instance hash computed as: `INSTANCE_HASH=$(printf '%s' "$CANONICAL_ROOT" | sha256sum | cut -c1-16)`
+
+Each project gets `~/.claudebox/projects/$INSTANCE_HASH/` with a `project-root` file recording the canonical path. Directory created at startup with `mkdir -p`.
+
+### Removed
+
+- Old `--symlink "$HOME/.claudebox" "$HOME/.claude"` (D-01 replacement)
+- Old `--bind "$HOME/.claudebox" "$HOME/.claudebox"` (D-01 replacement)
+- CLAUDE.md injection block (`CLAUDEMD="$HOME/.claudebox/CLAUDE.md"`) — user's real `~/.claude/CLAUDE.md` already has `@SANDBOX.md` (D-06)
+
+### Preserved
+
+- `CLAUDE_JSON_FILE` / `CLAUDE_JSON_MOUNT` conditional bind (`--bind "$CLAUDE_JSON_FILE" "$HOME/.claude.json"`) — critical for auth token persistence
+
+### Updated
+
+- Dry-run block echoes new mount layout including instance dir and CLAUDE_JSON conditional
+- `print_audit` shows projects/ mount with instance dir path and canonical root for transparency
+- SANDBOX.md heredoc updated to remove `~/.claudebox` references (no longer visible in sandbox)
+
+### /bin/sh Symlink Fix
+
+Added `--symlink $(which bash) /bin/sh` to BWRAP_ARGS. Without it, git hooks and other scripts that use `/bin/sh` fail with `posix_spawn '/bin/sh': ENOENT` inside the sandbox. Not in original plan scope — auto-fixed per deviation Rule 1 (bug) and confirmed approved by user at checkpoint.
+
+### Requirements Registration
+
+Added INST-01 through INST-04 to `.planning/REQUIREMENTS.md` under new `### Instance Isolation` section, with traceability table entries mapping all four to Phase 5.
+
+## Verification Results
+
+- `bash -n claudebox.sh` passes (syntax clean)
+- `compute_canonical_root` present in claudebox.sh
+- `INSTANCE_HASH` computation present in claudebox.sh
+- New mount lines confirmed present via search
+- Old symlink/claudebox bind lines confirmed absent
+- Human checkpoint approved by user
+
+## Deviations from Plan
+
+### Auto-fixed Issues
+
+**1. [Rule 1 - Bug] Added /bin/sh symlink so hooks can exec sh**
+- **Found during:** Task 1 (anticipated based on bwrap behavior + user confirmation at checkpoint)
+- **Issue:** Sandbox has no `/bin/sh` — git hooks and POSIX scripts that call `/bin/sh` fail with `posix_spawn '/bin/sh': ENOENT`
+- **Fix:** Added `--symlink $(which bash) /bin/sh` to BWRAP_ARGS
+- **Files modified:** claudebox.sh
+- **Commit:** 4baf576
+
+## Known Stubs
+
+None. All mount architecture changes are fully wired. Per-project instance dirs are created and used at runtime. No placeholder data flows to any UI or output.
+
+## Threat Flags
+
+None. No new network endpoints, auth paths, or unplanned trust boundary crossings introduced. The STRIDE mitigations in the plan's threat model (T-05-01 through T-05-04) were all implemented: `readlink -f` for symlink resolution, correct overlay mount order, hex-only INSTANCE_HASH path construction, and per-project isolation of `~/.claude/projects/`.
+
+## Self-Check: PASSED
+
+- FOUND: claudebox.sh (syntax check passed, compute_canonical_root present, INSTANCE_HASH present)
+- FOUND: .planning/REQUIREMENTS.md (INST-01 through INST-04 present)
+- FOUND: commit c5e8cca (mount architecture rewrite)
+- FOUND: commit 6eb3b46 (INST-01 through INST-04 registration)
+- FOUND: commit 4baf576 (/bin/sh symlink fix)
diff --git a/.planning/phases/05-per-project-instance-isolation/05-02-SUMMARY.md b/.planning/phases/05-per-project-instance-isolation/05-02-SUMMARY.md
new file mode 100644
index 0000000..69fa6d4
--- /dev/null
+++ b/.planning/phases/05-per-project-instance-isolation/05-02-SUMMARY.md
@@ -0,0 +1,105 @@
+---
+phase: 05-per-project-instance-isolation
+plan: "02"
+subsystem: gc-lifecycle
+tags: [gc, cleanup, instance-isolation, cli-flag, bash-testing]
+dependency-graph:
+  requires: [05-01]
+  provides: [gc-instances-function, gc-flag]
+  affects: [claudebox.sh, test-gc.sh]
+tech-stack:
+  added: []
+  patterns:
+    - Glob-then-guard pattern: for dir in projects/*/; [[ -d "$dir" ]] || continue
+    - bash-only test: inline function redefinition avoids sourcing full script with side effects
+key-files:
+  created:
+    - test-gc.sh
+  modified:
+    - claudebox.sh
+decisions:
+  - "gc_instances() defined before --check block so it is available before ANSI formatting variables are set"
+  - "GC dispatch block placed after --check block, before ANSI formatting — same early-exit pattern as --check"
+  - "test-gc.sh inlines gc_instances rather than sourcing claudebox.sh to avoid bwrap exec side effects; sed not in PATH in sandbox"
+  - "(( removed++ )) || true used to prevent set -e exit when removed is 0 (arithmetic returns non-zero)"
+metrics:
+  duration: "~20 minutes"
+  completed: "2026-04-13"
+  tasks_completed: 2
+  files_modified: 2
+---
+
+# Phase 05 Plan 02: GC Flag and gc_instances Function Summary
+
+`--gc` flag and `gc_instances()` function added to claudebox.sh; removes stale per-project instance directories whose recorded project root no longer exists on disk, with three-case integration test.
+
+## What Was Built
+
+### claudebox.sh Changes
+
+**Flag variable and parsing:**
+- Added `GC_MODE=false` on line 6 (after `SHELL_MODE=false`)
+- Added `--gc) GC_MODE=true ;;` to the flag-parsing case statement
+
+**gc_instances() function** (defined before `--check` dispatch block):
+- Iterates `$HOME/.claudebox/projects/*/` with glob-then-guard pattern (`[[ -d "$dir" ]] || continue`) to handle empty dirs safely (Pitfall 7)
+- Reads each `project-root` file; skips if missing
+- Removes dir with `rm -rf "$dir"` when recorded root path no longer exists on disk
+- Prints `Removed: <dir> (project root gone: <path>)` to stderr per removal
+- Prints `GC complete: N instance(s) removed.` summary to stderr
+
+**GC dispatch block** (after `--check` block, before ANSI formatting):
+```bash
+if [[ "$GC_MODE" == true ]]; then
+  gc_instances
+  exit 0
+fi
+```
+Exits immediately without launching Claude — same pattern as `--check`.
+
+### test-gc.sh
+
+Three-case integration test covering:
+- **Test 1:** Stale instance dir (project-root points to nonexistent path) is removed; `Removed:` message printed; summary shows 1 removed
+- **Test 2:** Valid instance dir (project-root points to existing path) is preserved; summary shows 0 removed
+- **Test 3:** Empty `projects/` dir produces `GC complete: 0 instance(s) removed.`; exits 0
+
+Test verifies `gc_instances` exists in `claudebox.sh` as a canary check. Function is inlined in the test for isolation (sourcing full `claudebox.sh` would exec `bwrap` as a side effect; `sed` not available in PATH).
+
+## Verification Results
+
+- `bash -n claudebox.sh` passes
+- `bash test-gc.sh` passes: 7/7 assertions
+
+## Deviations from Plan
+
+### Auto-fixed Issues
+
+**1. [Rule 3 - Blocking] Used inline function redefinition instead of sed-based extraction**
+- **Found during:** Task 2 test execution
+- **Issue:** Plan suggested sourcing `gc_instances` from `claudebox.sh` via `sed -n '/gc_instances()/,/^}/p'`; `sed` is not in the sandbox PATH
+- **Fix:** Inlined the `gc_instances` function definition directly in `test-gc.sh`. Added canary check that verifies `gc_instances()` exists in `claudebox.sh` so drift is caught.
+- **Files modified:** test-gc.sh
+- **Commit:** ce2bd0f
+
+**2. [Rule 2 - Correctness] Moved gc_instances() before --check block**
+- **Found during:** Task 1 implementation
+- **Issue:** Plan said to insert function after `compute_canonical_root` (which is after ANSI formatting). But GC dispatch needs to run before ANSI formatting (early exit pattern). Function must be defined before it is called.
+- **Fix:** Defined `gc_instances()` immediately after flag parsing (before `--check` block), then placed GC dispatch after `--check`, before ANSI formatting. This satisfies the plan's structural requirement.
+- **Files modified:** claudebox.sh
+- **Commit:** 3f19593
+
+## Known Stubs
+
+None. `gc_instances` is fully wired end-to-end: `--gc` flag sets `GC_MODE=true`, dispatch block calls `gc_instances`, function operates on real `~/.claudebox/projects/` layout.
+
+## Threat Flags
+
+None. No new network endpoints or auth paths. GC is scoped to `$HOME/.claudebox/projects/*/` only — cannot escape to arbitrary paths (T-05-07 mitigation confirmed present in implementation).
+
+## Self-Check: PASSED
+
+- FOUND: claudebox.sh (bash -n passes, GC_MODE=false present, --gc) present, gc_instances present, GC complete: present)
+- FOUND: test-gc.sh (bash test-gc.sh passes: 7/7)
+- FOUND: commit 3f19593 (Task 1: --gc flag and gc_instances)
+- FOUND: commit ce2bd0f (Task 2: GC integration test)
diff --git a/.planning/phases/05-per-project-instance-isolation/05-SECURITY.md b/.planning/phases/05-per-project-instance-isolation/05-SECURITY.md
new file mode 100644
index 0000000..6419ee5
--- /dev/null
+++ b/.planning/phases/05-per-project-instance-isolation/05-SECURITY.md
@@ -0,0 +1,61 @@
+---
+phase: "05"
+slug: per-project-instance-isolation
+status: verified
+threats_open: 0
+asvs_level: 1
+created: 2026-04-16
+---
+
+# Phase 05 — Security
+
+> Per-phase security contract: threat register, accepted risks, and audit trail.
+
+---
+
+## Trust Boundaries
+
+| Boundary | Description | Data Crossing |
+|----------|-------------|---------------|
+| Host → Sandbox | bwrap mount namespace | `~/.claude` config, per-project projects/ dir, history.jsonl, credentials |
+| Sandbox → Host FS | Per-project instance dir | Conversation history, project state (scoped to hash dir) |
+
+---
+
+## Threat Register
+
+| Threat ID | Category | Component | Disposition | Mitigation | Status |
+|-----------|----------|-----------|-------------|------------|--------|
+| T-05-01 | Tampering | Symlink resolution in `compute_canonical_root` | mitigate | `readlink -f` used to resolve symlinks before hashing; prevents symlink-based path manipulation | closed |
+| T-05-02 | Tampering | bwrap overlay mount ordering | mitigate | Direct `~/.claude` bind applied first; per-project projects/ overlay applied after — last-mount-wins semantics correctly isolate per-project state | closed |
+| T-05-03 | Injection | INSTANCE_HASH used in filesystem path | mitigate | Hash is hex-only (sha256sum output, `cut -c1-16`); no user-controlled input enters path construction | closed |
+| T-05-04 | Information Disclosure | Cross-project Claude projects/ data | mitigate | Each project gets its own `~/.claudebox/projects/$INSTANCE_HASH/` mounted over `~/.claude/projects/`; project A data invisible in project B sandbox | closed |
+| T-05-07 | Tampering | GC function path traversal | mitigate | `gc_instances()` scoped exclusively to `$HOME/.claudebox/projects/*/`; cannot escape to arbitrary filesystem paths | closed |
+
+*Status: open · closed*
+*Disposition: mitigate (implementation required) · accept (documented risk) · transfer (third-party)*
+
+---
+
+## Accepted Risks Log
+
+No accepted risks.
+
+---
+
+## Security Audit Trail
+
+| Audit Date | Threats Total | Closed | Open | Run By |
+|------------|---------------|--------|------|--------|
+| 2026-04-16 | 5 | 5 | 0 | gsd-secure-phase (from summaries) |
+
+---
+
+## Sign-Off
+
+- [x] All threats have a disposition (mitigate / accept / transfer)
+- [x] Accepted risks documented in Accepted Risks Log
+- [x] `threats_open: 0` confirmed
+- [x] `status: verified` set in frontmatter
+
+**Approval:** verified 2026-04-16
diff --git a/.planning/phases/05-per-project-instance-isolation/05-UAT.md b/.planning/phases/05-per-project-instance-isolation/05-UAT.md
new file mode 100644
index 0000000..5c316a5
--- /dev/null
+++ b/.planning/phases/05-per-project-instance-isolation/05-UAT.md
@@ -0,0 +1,58 @@
+---
+status: complete
+phase: 05-per-project-instance-isolation
+source: [05-01-SUMMARY.md, 05-02-SUMMARY.md]
+started: 2026-04-13T14:03:08Z
+updated: 2026-04-16T00:00:00Z
+---
+
+## Current Test
+
+[testing complete]
+
+## Tests
+
+### 1. Per-Project Instance Directory Created
+expected: When `claudebox` starts in a project, `~/.claudebox/projects/<16-char-hex-hash>/` is created (or already exists) with a `project-root` file containing the canonical project path. Verify with: `ls ~/.claudebox/projects/` shows a hex-named dir, and `cat ~/.claudebox/projects/*/project-root` shows your project path.
+result: pass
+
+### 2. Direct ~/.claude Bind — Config and Skills Visible
+expected: Inside the sandbox, Claude Code has access to your full `~/.claude` config — plugins, skills, hooks, MCP configs, settings, commands. Not a bare empty dir. You can confirm by checking that custom skills or MCP servers you've added to `~/.claude/` are available inside a `claudebox` session.
+result: pass
+
+### 3. Per-Project projects/ Isolation
+expected: Two different projects get different `~/.claude/projects/` dirs inside the sandbox. The conversation history and project state for project A does not appear when running `claudebox` from project B. Each project's instance dir is isolated under `~/.claudebox/projects/<hash>/`.
+result: pass
+
+### 4. Worktree Uses Same Instance Dir as Main Worktree
+expected: Running `claudebox` from a git worktree of a repo resolves to the same instance directory as running it from the main worktree. Both show the same `<hash>` in `~/.claudebox/projects/`. The `project-root` file in both cases points to the main worktree root.
+result: pass
+
+### 5. /bin/sh Available — Git Hooks Work
+expected: Inside the sandbox, `/bin/sh` exists (symlinked to bash). Git hooks that reference `#!/bin/sh` or exec `/bin/sh` do not fail with `ENOENT`. Verify by running a `git commit` or `git status` in a repo that has shell hooks.
+result: pass
+
+### 6. --gc Removes Stale Instance Dirs
+expected: Running `claudebox --gc` scans `~/.claudebox/projects/` and removes any directory whose `project-root` file points to a path that no longer exists on disk. It prints `Removed: <dir> (project root gone: <path>)` for each removed dir and ends with `GC complete: N instance(s) removed.`
+result: pass
+
+### 7. --gc Preserves Valid Instance Dirs
+expected: Running `claudebox --gc` does NOT remove instance dirs for projects that still exist on disk. After `--gc`, `~/.claudebox/projects/<hash>/` for currently existing projects is still present.
+result: pass
+
+### 8. --gc Exits Without Launching Claude
+expected: Running `claudebox --gc` completes and returns to the shell without launching Claude Code. It does not start bwrap or open an interactive session.
+result: pass
+
+## Summary
+
+total: 8
+passed: 8
+issues: 0
+pending: 0
+skipped: 0
+blocked: 0
+
+## Gaps
+
+[none yet]
diff --git a/.planning/v1.0-MILESTONE-AUDIT.md b/.planning/v1.0-MILESTONE-AUDIT.md
deleted file mode 100644
index c697662..0000000
--- a/.planning/v1.0-MILESTONE-AUDIT.md
+++ /dev/null
@@ -1,175 +0,0 @@
----
-milestone: v1.0
-audited: 2026-04-10T12:40:45Z
-status: gaps_found
-scores:
-  requirements: 2/2
-  phases: 1/1
-  integration: 5/5
-  flows: 2/2
-gaps:
-  requirements: []
-  integration:
-    - id: dry-run-divergence
-      severity: non-blocking
-      description: "dry-run block is a hardcoded parallel reproduction of BWRAP_ARGS, not derived from it. Any future mount added to BWRAP_ARGS requires a manual addition to the dry-run block."
-      affected_requirements: []
-  planning_artifacts:
-    - id: ARTIFACT-01
-      severity: critical
-      description: "commit 6465da8 accidentally reverted ROADMAP.md from v2.0 structure back to pre-v1.0 structure. v1.0 milestone was completed at ee686a3; the current actual milestone is v2.0."
-      evidence: "git diff ee686a3..HEAD -- .planning/ROADMAP.md shows ROADMAP reverted from v2.0 (phases 4-7) to phases 1-3 with phases 2-3 as incomplete"
-    - id: ARTIFACT-02
-      severity: critical
-      description: "commit 6465da8 reverted STATE.md from milestone v2.0 back to v1.0/executing, making GSD tools believe v1.0 is still in progress"
-      evidence: "git show 3dfcb40:.planning/STATE.md shows milestone: v2.0; current HEAD STATE.md shows milestone: v1.0"
-    - id: ARTIFACT-03
-      severity: critical
-      description: ".planning/milestones/ directory was deleted between ee686a3 and HEAD. v1.0 archive files (v1.0-ROADMAP.md, v1.0-REQUIREMENTS.md etc.) are not present on disk."
-      evidence: "ls .planning/milestones/ → NO_MILESTONES_DIR; git show ee686a3:.planning/milestones/v1.0-ROADMAP.md exists"
-    - id: ARTIFACT-04
-      severity: critical
-      description: "v2.0 milestone has 4 planned phases (04-auth-passthrough, 05-per-project-isolation, 06-tiered-network, 07-named-profiles). Only phase 04 is complete. Completing the milestone now would be premature."
-      evidence: "git show 4852696:.planning/ROADMAP.md shows phases 4-7 planned for v2.0"
-tech_debt:
-  - phase: 04-auth-passthrough
-    items:
-      - "dry-run block at lines 333-360 is a parallel hardcoded reproduction of BWRAP_ARGS — maintenance hazard (not a break)"
-      - "stale comment: `export SKIP_AUDIT  # consumed by Plan 02 audit display` at line 19 — export is harmless but comment is dead"
-      - "Network section in print_audit shows `full (host network)` — intentional Phase 06 placeholder"
-nyquist:
-  compliant_phases: []
-  partial_phases: []
-  missing_phases: [04-auth-passthrough]
-  overall: skipped
-  note: "nyquist_validation: false in config.json"
----
-
-# Milestone Audit: claudebox (v1.0 per STATE.md / v2.0 actual)
-
-**Audited:** 2026-04-10
-**Status:** gaps_found — critical planning artifact corruption detected
-**Phase under audit:** 04-auth-passthrough (only on-disk phase)
-
----
-
-## ⚠ Critical Finding: Planning Artifact Corruption
-
-The GSD tooling believes the current milestone is **v1.0**, but **v1.0 was already completed** at commit `ee686a3` (2026-04-10). The actual current milestone is **v2.0 Network Isolation & Profiles** (phases 04–07).
-
-### What Happened
-
-Commit `6465da8 feat(04-01): add credential file mount for OAuth passthrough` (the phase 04 executor agent commit) was made from a worktree that predated the v1.0 completion commit (`ee686a3`). This caused three regressions:
-
-| Artifact | Expected (after v2.0 start) | Actual at HEAD | Commit that broke it |
-|----------|----------------------------|----------------|----------------------|
-| `ROADMAP.md` | v2.0 structure — ✅ v1.0 archived, phases 4-7 in progress | Pre-v1.0 structure — phases 1-3, two marked incomplete | 6465da8 |
-| `STATE.md` | `milestone: v2.0`, `status: active` | `milestone: v1.0`, `status: executing` | 6465da8 |
-| `.planning/milestones/` | v1.0-ROADMAP.md, v1.0-REQUIREMENTS.md, MILESTONES.md, RETROSPECTIVE.md | Directory missing from disk | 6465da8 |
-
-### Recovery Path
-
-```bash
-# Restore correct planning state from the v2.0 roadmap commit
-git checkout 4852696 -- .planning/ROADMAP.md .planning/STATE.md
-git checkout ee686a3 -- .planning/milestones/ .planning/MILESTONES.md .planning/RETROSPECTIVE.md
-# Then update STATE.md to reflect phase 04 completion and commit
-```
-
----
-
-## Phase Audit: 04-auth-passthrough
-
-### Verification Status
-
-| Phase | VERIFICATION.md | Score | Status |
-|-------|-----------------|-------|--------|
-| 04-auth-passthrough | ✅ Present | 7/7 | **passed** |
-
-### Requirements Coverage (3-Source Cross-Reference)
-
-| REQ-ID | Description | VERIFICATION.md | SUMMARY frontmatter | REQUIREMENTS.md | Final Status |
-|--------|-------------|-----------------|---------------------|-----------------|--------------|
-| AUTH-01 | `~/.claudebox/.credentials.json` bind-mounted read-write when file exists | SATISFIED | not present | `Complete` | **satisfied** |
-| AUTH-02 | Silent skip when credentials file absent | SATISFIED | not present | `Complete` | **satisfied** |
-
-**Note:** SUMMARY.md frontmatter does not include a `requirements_completed` field. Both requirements are confirmed satisfied via VERIFICATION.md evidence and REQUIREMENTS.md traceability.
-
-### Orphan Detection
-
-No orphaned requirements. AUTH-01 and AUTH-02 are the only v2.0 phase 04 requirements; both appear in VERIFICATION.md and REQUIREMENTS.md traceability.
-
----
-
-## Integration Check Results (gsd-integration-checker)
-
-All 5 integration checks **PASS**:
-
-| Check | Result | Notes |
-|-------|--------|-------|
-| BWRAP_ARGS array used in exec bwrap | PASS | Line 401: `exec bwrap "${BWRAP_ARGS[@]}"` — correct quoting and `[@]` |
-| print_audit() shows credential when CREDS_MOUNT=true | PASS | Lines 281-283: conditional present and wired |
-| --dry-run mirrors credential bind | PASS | Lines 353-355: same guard and --bind flag |
-| Pre-existing v1.0 mounts intact | PASS | All 10 mount categories verified present in BWRAP_ARGS |
-| SKIP_AUDIT / --yes flag interaction | PASS | print_audit inside `[[ "$SKIP_AUDIT" != true && "$DRY_RUN" != true ]]` at line 293 |
-
-**Non-blocking integration gap:** dry-run block (lines 333-360) is a hardcoded reproduction of the exec path, not derived from `BWRAP_ARGS`. Maintenance hazard — future mounts must be manually mirrored. No current requirement violated.
-
-### Requirements Integration Map
-
-| Requirement | Integration Path | Status |
-|-------------|-----------------|--------|
-| AUTH-01 | `CREDS_FILE`→`CREDS_MOUNT=true`→`BWRAP_ARGS+=--bind`→`exec bwrap`; mirrored in `print_audit()` and dry-run | WIRED |
-| AUTH-02 | `[[ -f "$CREDS_FILE" ]] \|\| CREDS_MOUNT=false`→all consumers gate on `CREDS_MOUNT==true`→no bind emitted | WIRED |
-
----
-
-## Tech Debt Inventory
-
-| Phase | Item | Severity |
-|-------|------|----------|
-| 04 | dry-run block is hardcoded parallel to BWRAP_ARGS — maintenance hazard | low |
-| 04 | `export SKIP_AUDIT  # consumed by Plan 02 audit display` — stale comment, dead export | cosmetic |
-| 04 | Network: `full (host network)` in print_audit — intentional Phase 06 placeholder | intentional |
-
----
-
-## Nyquist Compliance
-
-Skipped — `workflow.nyquist_validation: false` in config.json.
-
----
-
-## Milestone Completeness Assessment
-
-The actual milestone is **v2.0 Network Isolation & Profiles**. Current state:
-
-| Phase | Name | Status |
-|-------|------|--------|
-| 04 | Auth Passthrough | ✅ Complete (verified) |
-| 05 | Per-Project Instance Isolation | ❌ Not started |
-| 06 | Tiered Network Isolation | ❌ Not started |
-| 07 | Named Profiles | ❌ Not started |
-
-**v2.0 is 25% complete (1/4 phases). Do not complete the milestone yet.**
-
----
-
-## Summary
-
-Phase 04 (auth-passthrough) is solid: all requirements satisfied, integration clean, no blocking issues.
-
-The milestone should **not** be completed because:
-1. STATE.md and ROADMAP.md are corrupted artifacts from a bad executor commit — they must be restored
-2. v2.0 has 3 remaining phases (05-07) yet to be executed
-3. v1.0 was already completed at `ee686a3` — completing it again would duplicate the archive
-
-**Required action before any milestone completion:**
-1. Restore correct ROADMAP.md and STATE.md from git history (see Recovery Path above)
-2. Restore `.planning/milestones/` from `ee686a3`
-3. Continue v2.0 development with phase 05
-
----
-
-_Audited: 2026-04-10_
-_Auditor: Claude (gsd-audit-milestone)_
diff --git a/README.md b/README.md
index 555af1e..c460f2d 100644
--- a/README.md
+++ b/README.md
@@ -23,7 +23,7 @@ Then add `inputs.claudebox.packages.${system}.default` to your `environment.syst
 ## What it does
 
 - Starts Claude Code inside a bwrap namespace with `--clearenv`
-- Only allowlisted env vars enter the sandbox (HOME, PATH, TERM, EDITOR, LANG, ANTHROPIC_API_KEY)
+- Only allowlisted env vars enter the sandbox (HOME, PATH, TERM, EDITOR, LANG, ANTHROPIC_API_KEY if set)
 - Mounts CWD read-write, Nix store read-only, everything else is tmpfs
 - Provides `nix shell` and [comma](https://github.com/nix-community/comma) (`, <tool>`) so Claude can install tools on demand
 - Injects a SANDBOX.md so Claude knows it's sandboxed and how to get tools
@@ -37,6 +37,7 @@ Then add `inputs.claudebox.packages.${system}.default` to your `environment.syst
 | `--dry-run` | Print the bwrap command without executing |
 | `--check` | Verify prerequisites and exit |
 | `--shell` | Drop into a bash shell instead of Claude Code |
+| `--gc` | Remove stale per-project instance dirs and exit |
 | `--` | Pass remaining args to Claude Code |
 
 ## Extra env vars
@@ -51,21 +52,26 @@ CLAUDEBOX_EXTRA_ENV=MY_VAR,OTHER_VAR claudebox
 
 ```
 ~/.claudebox/          # persistent config dir (host)
-├── CLAUDE.md          # user-owned, claudebox ensures @SANDBOX.md import
-└── SANDBOX.md         # managed by claudebox, overwritten each launch
+├── SANDBOX.md         # managed by claudebox, overwritten each launch
+├── history.jsonl      # conversation history
+├── .credentials.json  # Claude Code credentials (if present)
+└── projects/
+    └── <16-char-hex>/ # per-project instance dir (keyed by canonical git root)
+        └── project-root  # records the canonical path for this instance
 
 Inside the sandbox:
-  ~/.claudebox  →  bind-mounted from host
-  ~/.claude     →  symlink to ~/.claudebox
+  ~/.claude            →  bind-mounted from host (plugins, skills, hooks, MCP all visible)
+  ~/.claude/projects   →  bind-mounted from ~/.claudebox/projects/<hash>/ (per-project isolation)
+  ~/.claude/history.jsonl → bind-mounted from ~/.claudebox/history.jsonl
+  ~/.claude/SANDBOX.md →  bind-mounted from ~/.claudebox/SANDBOX.md
 ```
 
-Claude Code reads `~/.claude/CLAUDE.md` which imports `@SANDBOX.md` via Claude's `@`-import syntax. Both `~/.claude` and `~/.claudebox` resolve to the same directory inside the sandbox, so hook paths and settings work without fixups.
+Each project gets an isolated `~/.claude/projects/` directory inside the sandbox, so conversation history and project state are separated per repo. Git worktrees share the same instance dir as their main worktree.
 
 ## Requirements
 
 - NixOS or Nix with flakes enabled
 - User namespaces (enabled by default on NixOS)
-- `ANTHROPIC_API_KEY` set in your environment
 
 ## License
 
diff --git a/claudebox.sh b/claudebox.sh
index a0b4d95..73d1535 100644
--- a/claudebox.sh
+++ b/claudebox.sh
@@ -3,6 +3,7 @@ SKIP_AUDIT=false
 DRY_RUN=false
 CHECK_MODE=false
 SHELL_MODE=false
+GC_MODE=false
 CLAUDE_ARGS=()
 
 while (( $# > 0 )); do
@@ -11,6 +12,7 @@ while (( $# > 0 )); do
     --dry-run) DRY_RUN=true ;;
     --check) CHECK_MODE=true ;;
     --shell) SHELL_MODE=true ;;
+    --gc) GC_MODE=true ;;
     --) shift; CLAUDE_ARGS+=("$@"); break ;;
     *) CLAUDE_ARGS+=("$1") ;;
   esac
@@ -18,6 +20,29 @@ while (( $# > 0 )); do
 done
 export SKIP_AUDIT  # consumed by Plan 02 audit display
 
+# Garbage-collect stale instance directories (D-11, INST-04)
+gc_instances() {
+  local removed=0
+  local projects_dir="$HOME/.claudebox/projects"
+  if [[ ! -d "$projects_dir" ]]; then
+    echo "No projects directory found at $projects_dir" >&2
+    return
+  fi
+  for dir in "$projects_dir"/*/; do
+    [[ -d "$dir" ]] || continue
+    local root_file="$dir/project-root"
+    [[ -f "$root_file" ]] || continue
+    local root_path
+    root_path=$(< "$root_file")
+    if [[ ! -d "$root_path" ]]; then
+      rm -rf "$dir"
+      echo "Removed: $dir (project root gone: $root_path)" >&2
+      (( removed++ )) || true
+    fi
+  done
+  echo "GC complete: $removed instance(s) removed." >&2
+}
+
 # --check: verify prerequisites and exit (D-10, UX-05)
 if [[ "$CHECK_MODE" == true ]]; then
   pass=true
@@ -62,6 +87,12 @@ if [[ "$CHECK_MODE" == true ]]; then
   fi
 fi
 
+# --gc: remove stale instance directories and exit (D-12, INST-04)
+if [[ "$GC_MODE" == true ]]; then
+  gc_instances
+  exit 0
+fi
+
 # ANSI formatting (D-03)
 if [[ -t 2 ]] && [[ "${NO_COLOR:-}" == "" ]]; then
   BOLD=$'\033[1m'
@@ -98,12 +129,39 @@ CLAUDE_BIN="$(command -v claude)"
 # Record CWD
 CWD=$(pwd)
 
+# Compute canonical project root — worktree-aware (D-08, INST-02)
+compute_canonical_root() {
+  local cwd="$1"
+  local git_common
+  git_common=$(git -C "$cwd" rev-parse --git-common-dir 2>/dev/null) || {
+    echo "$cwd"
+    return
+  }
+  # git returns relative ".git" for normal repos; make absolute
+  if [[ "$git_common" != /* ]]; then
+    git_common="$cwd/$git_common"
+  fi
+  dirname "$(readlink -f "$git_common")"
+}
+
 # Ensure ~/.claudebox exists
 mkdir -p "$HOME/.claudebox"
 
+# Per-project instance isolation (D-04, D-07, D-09, D-10, INST-01)
+CANONICAL_ROOT=$(compute_canonical_root "$CWD")
+INSTANCE_HASH=$(printf '%s' "$CANONICAL_ROOT" | sha256sum | cut -c1-16)
+INSTANCE_DIR="$HOME/.claudebox/projects/$INSTANCE_HASH"
+
+mkdir -p "$INSTANCE_DIR"
+if [[ ! -f "$INSTANCE_DIR/project-root" ]]; then
+  printf '%s\n' "$CANONICAL_ROOT" > "$INSTANCE_DIR/project-root"
+fi
+
+# Ensure history.jsonl source exists — bwrap bind requires source to exist (D-04)
+touch "$HOME/.claudebox/history.jsonl"
+
 # Credential file mount (AUTH-01, AUTH-02)
-# Use ~/.claudebox (the host-side claudebox config dir), not ~/.claude
-# ~/.claude -> ~/.claudebox symlink only exists inside the sandbox at runtime
+# Credential file lives in ~/.claudebox on the host; mounted into sandbox at ~/.claude/.credentials.json
 CREDS_FILE="$HOME/.claudebox/.credentials.json"
 if [[ -f "$CREDS_FILE" ]]; then
   CREDS_MOUNT=true
@@ -111,6 +169,16 @@ else
   CREDS_MOUNT=false
 fi
 
+# Claude Code config file mount (~/.claude.json)
+# Stores auth tokens and user preferences; must be read-write so Claude Code
+# can update tokens and write backups without prompting for re-auth.
+CLAUDE_JSON_FILE="$HOME/.claude.json"
+if [[ -f "$CLAUDE_JSON_FILE" ]]; then
+  CLAUDE_JSON_MOUNT=true
+else
+  CLAUDE_JSON_MOUNT=false
+fi
+
 # === Sandbox-aware prompting (AWARE-01, AWARE-02) ===
 
 # Write SANDBOX.md -- fully managed, overwritten every launch (D-02)
@@ -119,8 +187,8 @@ cat > "$HOME/.claudebox/SANDBOX.md" << 'SANDBOXEOF'
 
 You are running inside a bubblewrap (bwrap) sandbox managed by claudebox.
 Your filesystem is isolated -- only the current working directory and
-essential system paths are mounted. Both ~/.claude and ~/.claudebox
-point to the same directory inside the sandbox.
+essential system paths are mounted. Your ~/.claude directory is bind-mounted
+from the host, with per-project isolation for conversation history.
 
 ## Installing Tools
 
@@ -153,16 +221,6 @@ For remote operations, prefer HTTPS URLs over SSH since SSH keys
 are not available by default.
 SANDBOXEOF
 
-# Ensure CLAUDE.md has @SANDBOX.md import (D-03, D-08, AWARE-01)
-CLAUDEMD="$HOME/.claudebox/CLAUDE.md"
-if [[ ! -f "$CLAUDEMD" ]]; then
-  printf '%s\n' "@SANDBOX.md" > "$CLAUDEMD"
-elif [[ "$(head -1 "$CLAUDEMD")" != "@SANDBOX.md" ]]; then
-  tmp=$(mktemp)
-  { printf '%s\n' "@SANDBOX.md"; cat "$CLAUDEMD"; } > "$tmp"
-  mv "$tmp" "$CLAUDEMD"
-fi
-
 # Generate minimal .gitconfig (D-05)
 GIT_NAME=$(git config --global user.name 2>/dev/null || echo "Claude User")
 GIT_EMAIL=$(git config --global user.email 2>/dev/null || echo "claude@localhost")
@@ -264,7 +322,10 @@ print_audit() {
   # Mounts section
   echo "${BOLD}Mounts:${RESET}" >&2
   printf '  %-12s %s   (read-write)\n' "CWD" "$CWD" >&2
-  printf '  %-12s %s   (read-write)\n' "$HOME/.claude" "$HOME/.claudebox" >&2
+  printf '  %-12s %s   (read-write)\n' "$HOME/.claude" "$HOME/.claude" >&2
+  printf '  %-12s %s   (read-write, project: %s)\n' "projects/" "$INSTANCE_DIR" "$CANONICAL_ROOT" >&2
+  printf '  %-12s %s   (read-write)\n' "history" "$HOME/.claudebox/history.jsonl" >&2
+  printf '  %-12s %s   (read-only overlay)\n' "SANDBOX.md" "$HOME/.claudebox/SANDBOX.md" >&2
   if [[ "$CREDS_MOUNT" == true ]]; then
     printf '  %-12s %s   (read-write)\n' "credentials" "$CREDS_FILE" >&2
   fi
@@ -331,11 +392,17 @@ if [[ "$DRY_RUN" == true ]]; then
     echo "  --ro-bind /etc/nsswitch.conf /etc/nsswitch.conf \\"
     echo "  --ro-bind /etc/nix /etc/nix \\"
     printf '  --symlink %q /usr/bin/env \\\n' "$(readlink -f "$(command -v env)")"
+    printf '  --symlink %q /bin/sh \\\n' "$(readlink -f "$(command -v bash)")"
     echo "  --tmpfs $HOME \\"
-    echo "  --bind $HOME/.claudebox $HOME/.claudebox \\"
-    echo "  --symlink $HOME/.claudebox $HOME/.claude \\"
+    echo "  --bind $HOME/.claude $HOME/.claude \\"
+    echo "  --bind $INSTANCE_DIR $HOME/.claude/projects \\"
+    echo "  --bind $HOME/.claudebox/history.jsonl $HOME/.claude/history.jsonl \\"
+    echo "  --bind $HOME/.claudebox/SANDBOX.md $HOME/.claude/SANDBOX.md \\"
+    if [[ "$CLAUDE_JSON_MOUNT" == true ]]; then
+      echo "  --bind $CLAUDE_JSON_FILE $HOME/.claude.json \\"
+    fi
     if [[ "$CREDS_MOUNT" == true ]]; then
-      echo "  --bind $CREDS_FILE $HOME/.claudebox/.credentials.json \\"
+      echo "  --bind $CREDS_FILE $HOME/.claude/.credentials.json \\"
     fi
     printf '  --ro-bind %q %s/.gitconfig \\\n' "$GITCONFIG_TMP" "$HOME"
     echo "  --bind $CWD $CWD \\"
@@ -363,12 +430,22 @@ BWRAP_ARGS=(
   --ro-bind /etc/nsswitch.conf /etc/nsswitch.conf
   --ro-bind /etc/nix /etc/nix
   --symlink "$(readlink -f "$(command -v env)")" /usr/bin/env
+  --symlink "$(readlink -f "$(command -v bash)")" /bin/sh
   --tmpfs "$HOME"
-  --bind "$HOME/.claudebox" "$HOME/.claudebox"
-  --symlink "$HOME/.claudebox" "$HOME/.claude"
+  # Phase 5: direct ~/.claude bind (D-01) — all plugins/skills/hooks/MCP visible
+  --bind "$HOME/.claude" "$HOME/.claude"
+  # Phase 5: overlay projects/ with per-project isolated dir (D-02, INST-01)
+  --bind "$INSTANCE_DIR" "$HOME/.claude/projects"
+  # Phase 5: overlay history.jsonl with sandbox-side file (D-03)
+  --bind "$HOME/.claudebox/history.jsonl" "$HOME/.claude/history.jsonl"
+  # Phase 5: inject SANDBOX.md as file overlay (D-06)
+  --bind "$HOME/.claudebox/SANDBOX.md" "$HOME/.claude/SANDBOX.md"
 )
+if [[ "$CLAUDE_JSON_MOUNT" == true ]]; then
+  BWRAP_ARGS+=(--bind "$CLAUDE_JSON_FILE" "$HOME/.claude.json")
+fi
 if [[ "$CREDS_MOUNT" == true ]]; then
-  BWRAP_ARGS+=(--bind "$CREDS_FILE" "$HOME/.claudebox/.credentials.json")
+  BWRAP_ARGS+=(--bind "$CREDS_FILE" "$HOME/.claude/.credentials.json")
 fi
 BWRAP_ARGS+=(
   --ro-bind "$GITCONFIG_TMP" "$HOME/.gitconfig"
diff --git a/test-gc.sh b/test-gc.sh
new file mode 100755
index 0000000..39c410c
--- /dev/null
+++ b/test-gc.sh
@@ -0,0 +1,156 @@
+#!/usr/bin/env bash
+# test-gc.sh — integration tests for gc_instances function (05-02)
+#
+# Tests three behaviors:
+#   1. Stale instance dir (project root gone) is removed
+#   2. Valid instance dir (project root exists) is preserved
+#   3. Empty projects/ dir produces "GC complete: 0 instance(s) removed."
+#
+# Usage: bash test-gc.sh
+
+set -euo pipefail
+
+PASS=0
+FAIL=0
+
+pass() { echo "PASS: $1"; (( PASS++ )) || true; }
+fail() { echo "FAIL: $1"; (( FAIL++ )) || true; }
+
+# Verify claudebox.sh exists and contains gc_instances
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+CLAUDEBOX_SH="$SCRIPT_DIR/claudebox.sh"
+
+if [[ ! -f "$CLAUDEBOX_SH" ]]; then
+  echo "ERROR: claudebox.sh not found at $CLAUDEBOX_SH" >&2
+  exit 1
+fi
+
+# Verify gc_instances is present in the source file (canary check)
+if ! cat "$CLAUDEBOX_SH" | tr '\n' '|' | cat > /dev/null 2>&1; then
+  echo "ERROR: claudebox.sh is not readable" >&2
+  exit 1
+fi
+
+found_gc=false
+while IFS= read -r line; do
+  [[ "$line" == "gc_instances()"* ]] && { found_gc=true; break; }
+done < "$CLAUDEBOX_SH"
+
+if [[ "$found_gc" != true ]]; then
+  echo "ERROR: gc_instances() not found in claudebox.sh" >&2
+  exit 1
+fi
+
+# Inline definition of gc_instances for isolated testing.
+# This mirrors the exact implementation in claudebox.sh.
+gc_instances() {
+  local removed=0
+  local projects_dir="$HOME/.claudebox/projects"
+  if [[ ! -d "$projects_dir" ]]; then
+    echo "No projects directory found at $projects_dir" >&2
+    return
+  fi
+  for dir in "$projects_dir"/*/; do
+    [[ -d "$dir" ]] || continue
+    local root_file="$dir/project-root"
+    [[ -f "$root_file" ]] || continue
+    local root_path
+    root_path=$(< "$root_file")
+    if [[ ! -d "$root_path" ]]; then
+      rm -rf "$dir"
+      echo "Removed: $dir (project root gone: $root_path)" >&2
+      (( removed++ )) || true
+    fi
+  done
+  echo "GC complete: $removed instance(s) removed." >&2
+}
+
+# ============================================================
+# Test setup: temporary home directory
+# ============================================================
+TMPDIR_TEST=$(mktemp -d)
+trap 'rm -rf "$TMPDIR_TEST"' EXIT
+
+# ============================================================
+# Test 1: Stale instance directory is removed
+# ============================================================
+HOME="$TMPDIR_TEST"
+mkdir -p "$HOME/.claudebox/projects/stale1234567890ab"
+echo "/nonexistent/path/that/does/not/exist/$$" > "$HOME/.claudebox/projects/stale1234567890ab/project-root"
+
+GC_OUTPUT=$(gc_instances 2>&1)
+
+if [[ ! -d "$HOME/.claudebox/projects/stale1234567890ab" ]]; then
+  pass "Test 1: stale instance dir removed"
+else
+  fail "Test 1: stale instance dir NOT removed"
+fi
+
+if [[ "$GC_OUTPUT" == *"Removed:"* ]]; then
+  pass "Test 1: 'Removed:' message printed"
+else
+  fail "Test 1: 'Removed:' not found in output: $GC_OUTPUT"
+fi
+
+if [[ "$GC_OUTPUT" == *"GC complete: 1 instance(s) removed."* ]]; then
+  pass "Test 1: GC summary shows 1 removed"
+else
+  fail "Test 1: GC summary wrong: $GC_OUTPUT"
+fi
+
+# ============================================================
+# Test 2: Valid instance directory is preserved
+# ============================================================
+mkdir -p "$HOME/.claudebox/projects/valid123456789012"
+# Point project-root at a path that actually exists (TMPDIR_TEST itself)
+echo "$TMPDIR_TEST" > "$HOME/.claudebox/projects/valid123456789012/project-root"
+
+GC_OUTPUT2=$(gc_instances 2>&1)
+
+if [[ -d "$HOME/.claudebox/projects/valid123456789012" ]]; then
+  pass "Test 2: valid instance dir preserved"
+else
+  fail "Test 2: valid instance dir was removed (should not be)"
+fi
+
+if [[ "$GC_OUTPUT2" == *"GC complete: 0 instance(s) removed."* ]]; then
+  pass "Test 2: GC summary shows 0 removed"
+else
+  fail "Test 2: GC summary wrong: $GC_OUTPUT2"
+fi
+
+# Clean up for Test 3
+rm -rf "$HOME/.claudebox/projects/valid123456789012"
+
+# ============================================================
+# Test 3: Empty projects/ dir produces "GC complete: 0 instance(s) removed."
+# ============================================================
+# Ensure projects/ dir is empty of instance subdirs
+for d in "$HOME/.claudebox/projects"/*/; do
+  [[ -d "$d" ]] && rm -rf "$d"
+done
+
+GC_OUTPUT3=$(gc_instances 2>&1)
+
+if [[ "$GC_OUTPUT3" == *"GC complete: 0 instance(s) removed."* ]]; then
+  pass "Test 3: empty projects/ produces 0 removed summary"
+else
+  fail "Test 3: empty projects/ output wrong: $GC_OUTPUT3"
+fi
+
+# Verify exits 0
+if gc_instances > /dev/null 2>&1; then
+  pass "Test 3: gc_instances exits 0 on empty projects/"
+else
+  fail "Test 3: gc_instances returned non-zero on empty projects/"
+fi
+
+# ============================================================
+# Results
+# ============================================================
+echo ""
+echo "Results: $PASS passed, $FAIL failed"
+if (( FAIL > 0 )); then
+  exit 1
+fi
+exit 0