chore: complete v1.0 MVP milestone

Archive milestone artifacts, evolve PROJECT.md, reorganize ROADMAP.md,
write retrospective. Requirements archived to milestones/v1.0-REQUIREMENTS.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Christopher Mühl 2026-04-10 08:05:53 +00:00
parent 778216ead9
commit a494576f44
8 changed files with 202 additions and 130 deletions

15
.planning/MILESTONES.md Normal file
View file

@ -0,0 +1,15 @@
# Milestones
## v1.0 MVP (Shipped: 2026-04-10)
**Phases completed:** 3 phases, 5 plans, 6 tasks
**Key accomplishments:**
- Nix flake with writeShellApplication producing claudebox wrapper in bwrap with clearenv, env allowlist, tmpfs root, secret hiding, and comma/nix tool access
- Fixed NixOS symlink resolution — readlink -f for profile paths to real nix store paths
- CLI with --check, --dry-run modes, multi-flag parsing, and CLAUDE_ARGS accumulator
- Pre-launch env audit with grouped display, sensitive value masking, and interactive Y/n confirmation
- SANDBOX.md generation and CLAUDE.md import management for sandbox-aware prompting
---

View file

@ -2,7 +2,7 @@
## What This Is
A Nix derivation that produces a `claudebox` wrapper script for Claude Code. It runs Claude inside a bubblewrap sandbox with an allowlisted environment, explicit filesystem mounts, and a minimal PATH — keeping SSH keys, GPG/age secrets, cloud tokens, and Tailscale state completely invisible to the AI agent.
A Nix derivation that produces a `claudebox` wrapper script for Claude Code. It runs Claude inside a bubblewrap sandbox with an allowlisted environment, explicit filesystem mounts, and a minimal PATH — keeping SSH keys, GPG/age secrets, cloud tokens, and Tailscale state completely invisible to the AI agent. Includes pre-launch env audit, diagnostic modes, and sandbox-aware prompting.
## Core Value
@ -12,20 +12,22 @@ Secrets never enter the Claude Code environment. If a secret is accessible insid
### Validated
- [x] Default prompt/instructions injected so Claude knows how to use nix/comma to get dev tools — Validated in Phase 3: Sandbox-Aware Prompting
- ✓ Wrapper script that execs `claude --dangerously-skip-permissions` inside a bwrap sandbox — v1.0
- ✓ Environment allowlist: start with empty env, explicitly pass only known-safe vars — v1.0
- ✓ Pre-launch env audit: list all env vars being passed in for user review — v1.0
- ✓ `--yes` / `-y` flag to skip the env audit — v1.0
- ✓ Filesystem isolation: only CWD mounted read-write, plus `~/.claudebox` mapped to `~/.claude` — v1.0
- ✓ Secret paths hidden: `~/.ssh`, `~/.gnupg`, `~/.config/gcloud`, `~/.aws`, Tailscale state, age keys — v1.0
- ✓ Minimal PATH: Nix store paths only — coreutils, git, curl, jq, ripgrep, fd, nix, comma — v1.0
- ✓ Claude can self-install tools via `nix shell` or `, <tool>` (comma) — v1.0
- ✓ Default SANDBOX.md injected so Claude knows its capabilities and constraints — v1.0
- ✓ Works on endurance (NixOS desktop) — v1.0
- ✓ `--check` flag for environment diagnostics — v1.0
- ✓ `--dry-run` flag to print bwrap command without executing — v1.0
### Active
- [ ] Wrapper script that execs `claude --dangerously-skip-permissions` inside a bwrap sandbox
- [ ] Environment allowlist: start with empty env, explicitly pass only known-safe vars (HOME, PATH, TERM, EDITOR, LANG, etc.)
- [ ] Pre-launch env audit: before running, list all env vars being passed in so the user can review for secrets. Proceed on confirmation, abort on rejection
- [ ] `--yes` / `-y` flag to skip the env audit and launch immediately
- [ ] Filesystem isolation: only CWD mounted read-write, plus `~/.claudebox` mapped to `~/.claude` inside the sandbox
- [ ] Secret paths hidden: `~/.ssh`, `~/.gnupg`, `~/.config/gcloud`, `~/.aws`, Tailscale state, age keys — none of these visible inside the sandbox
- [ ] Minimal PATH: Nix store paths only — coreutils, git, curl, jq, ripgrep, fd, nix, comma
- [ ] Claude can self-install tools via `nix shell` or `, <tool>` (comma) inside the sandbox
- [x] Default prompt/instructions injected so Claude knows how to use nix/comma to get dev tools — Validated in Phase 3
- [ ] Works on endurance (NixOS desktop)
(No active requirements — start next milestone with `/gsd-new-milestone`)
### Out of Scope
@ -36,12 +38,10 @@ Secrets never enter the Claude Code environment. If a secret is accessible insid
## Context
- Target machine: endurance (NixOS desktop)
- Claude Code already has bubblewrap sandboxing (`--sandbox`) but it doesn't control env vars or secret file visibility — that's claudebox's job
- bwrap is in nixpkgs, so the derivation uses `writeShellApplication` wrapping a bwrap invocation
- `~/.claudebox/` is the persistent config directory that gets bind-mounted as `~/.claude` inside the sandbox, keeping real `~/.claude` outside
- comma (`,`) is a tool that runs `nix shell nixpkgs#<pkg> -c <pkg>` — lets Claude pull in any tool on demand without pre-declaring it
- The Nix store needs to be mounted read-only inside the sandbox for nix/comma to work
Shipped v1.0 with 399 LOC (350 shell + 49 Nix).
Tech stack: Nix flake (`writeShellApplication`) + bubblewrap + comma-with-db.
Runs on NixOS (endurance) with readlink -f workaround for symlink chain resolution.
Non-NixOS support added via conditional `/etc/static` mount.
## Constraints
@ -53,28 +53,14 @@ Secrets never enter the Claude Code environment. If a secret is accessible insid
| Decision | Rationale | Outcome |
|----------|-----------|---------|
| Own bwrap over Claude's --sandbox | Full control over mounts, env, namespaces | — Pending |
| Env allowlist over denylist | Denylist misses unknown vars; allowlist is secure by default | — Pending |
| comma for tool access | Claude can pull any tool on demand without pre-declaring PATH entries | — Pending |
| --dangerously-skip-permissions always | The bwrap sandbox IS the permission layer — Claude's prompts are redundant | — Pending |
| Pre-launch env audit | User reviews exactly what enters the sandbox, catches leaks before they happen | — Pending |
## Evolution
This document evolves at phase transitions and milestone boundaries.
**After each phase transition** (via `/gsd-transition`):
1. Requirements invalidated? → Move to Out of Scope with reason
2. Requirements validated? → Move to Validated with phase reference
3. New requirements emerged? → Add to Active
4. Decisions to log? → Add to Key Decisions
5. "What This Is" still accurate? → Update if drifted
**After each milestone** (via `/gsd-complete-milestone`):
1. Full review of all sections
2. Core Value check — still the right priority?
3. Audit Out of Scope — reasons still valid?
4. Update Context with current state
| Own bwrap over Claude's --sandbox | Full control over mounts, env, namespaces | ✓ Good |
| Env allowlist over denylist | Denylist misses unknown vars; allowlist is secure by default | ✓ Good |
| comma for tool access | Claude can pull any tool on demand without pre-declaring PATH entries | ✓ Good |
| --dangerously-skip-permissions always | The bwrap sandbox IS the permission layer — Claude's prompts are redundant | ✓ Good |
| Pre-launch env audit | User reviews exactly what enters the sandbox, catches leaks before they happen | ✓ Good |
| readlink -f for binary resolution | NixOS profile symlinks aren't visible inside bwrap; resolve to real store paths | ✓ Good |
| Claude Code via nix-claude-code flake | ryoppippi/nix-claude-code, not host PATH | ✓ Good |
| SANDBOX.md as separate file with @import | Keeps user CLAUDE.md clean, sandbox instructions always fresh | ✓ Good |
---
*Last updated: 2026-04-09 after Phase 3 completion*
*Last updated: 2026-04-10 after v1.0 milestone*

View file

@ -0,0 +1,52 @@
# Project Retrospective
*A living document updated after each milestone. Lessons feed forward into future planning.*
## Milestone: v1.0 — MVP
**Shipped:** 2026-04-10
**Phases:** 3 | **Plans:** 5
### What Was Built
- Nix flake producing `claudebox` wrapper: bwrap sandbox with clearenv, env allowlist, tmpfs root, secret path hiding, git identity forwarding, comma/nix tool access
- CLI diagnostic modes: --check (environment validation), --dry-run (print bwrap command), --shell (debug shell)
- Pre-launch env audit with grouped sections, sensitive value masking, Y/n confirmation prompt
- SANDBOX.md generation and CLAUDE.md import management so Claude knows its sandbox constraints
### What Worked
- writeShellApplication with builtins.readFile pattern — shellcheck at build time, shell syntax highlighting in editors
- Rapid phase execution: Phase 1 in ~2 min, Phase 2 in ~4 min, Phase 3 in ~76 sec
- clearenv + allowlist approach caught all secret leakage by default
- readlink -f fix for NixOS symlinks was discovered and fixed in-phase without blocking
### What Was Inefficient
- REQUIREMENTS.md traceability table not updated during execution — 7 requirements showed "Pending" despite being complete
- Phase 3 context was gathered but not executed in the same session, requiring session continuity overhead
### Patterns Established
- readlink -f for all host-resolved binaries passed into bwrap (NixOS symlink chains)
- SANDBOX.md as separate file with @import in CLAUDE.md (keeps user content clean, sandbox instructions always fresh)
- export trick for shellcheck SC2034 when a variable is used in a later plan but not yet
### Key Lessons
1. On NixOS, every host binary path is a symlink chain through /etc/profiles/per-user/ — must resolve to real store paths before passing to bwrap
2. Conditional mounts needed for cross-distro support (/etc/static exists only on NixOS)
### Cost Observations
- Model mix: predominantly opus for execution
- Sessions: ~3 sessions across 2 days
- Notable: entire v1.0 MVP shipped in under 2 days of wall clock time
---
## Cross-Milestone Trends
### Process Evolution
| Milestone | Phases | Plans | Key Change |
|-----------|--------|-------|------------|
| v1.0 | 3 | 5 | Initial project — established sandbox patterns |
### Top Lessons (Verified Across Milestones)
1. (Will populate as more milestones complete)

View file

@ -1,73 +1,26 @@
# Roadmap: claudebox
## Overview
## Milestones
claudebox is a Nix-packaged bwrap sandbox wrapper for Claude Code. The roadmap moves from a working sandbox (Phase 1) through CLI polish (Phase 2) to sandbox-aware prompting (Phase 3). Phase 1 is the bulk of the work -- once Claude runs inside bwrap with env isolation, filesystem isolation, and tool provisioning, the remaining phases add UX and developer experience improvements.
- ✅ **v1.0 MVP** — Phases 1-3 (shipped 2026-04-10)
## Phases
**Phase Numbering:**
- Integer phases (1, 2, 3): Planned milestone work
- Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED)
<details>
<summary>✅ v1.0 MVP (Phases 1-3) — SHIPPED 2026-04-10</summary>
Decimal phases appear between their surrounding integers in numeric order.
- [x] Phase 1: Minimal Viable Sandbox (2/2 plans) — bwrap sandbox with clearenv, env allowlist, filesystem isolation, secret hiding, tool provisioning
- [x] Phase 2: Env Audit and CLI Polish (2/2 plans) — --check, --dry-run, env audit display with masking, confirmation prompt
- [x] Phase 3: Sandbox-Aware Prompting (1/1 plan) — SANDBOX.md generation, CLAUDE.md import management
- [ ] **Phase 1: Minimal Viable Sandbox** - Working claudebox command that launches Claude in bwrap with full isolation and tool provisioning
- [ ] **Phase 2: Env Audit and CLI Polish** - Pre-launch env review, --yes, --dry-run, and --check flags
- [ ] **Phase 3: Sandbox-Aware Prompting** - Injected CLAUDE.md so Claude knows its capabilities and constraints
Full details: [milestones/v1.0-ROADMAP.md](milestones/v1.0-ROADMAP.md)
## Phase Details
### Phase 1: Minimal Viable Sandbox
**Goal**: User can run `claudebox` in any project directory and get a fully functional Claude Code session with secrets invisible
**Depends on**: Nothing (first phase)
**Requirements**: SAND-01, SAND-02, SAND-03, SAND-04, SAND-05, SAND-06, SAND-07, SAND-08, SAND-09, SAND-10, SAND-11, SAND-12, SAND-13, SAND-14, SAND-15, TOOL-01, TOOL-02, TOOL-03, GIT-01, GIT-02, NIX-01, NIX-02, NIX-03, UX-06
**Success Criteria** (what must be TRUE):
1. Running `nix run` or `nix profile install` produces a working `claudebox` command
2. `claudebox` launches Claude Code inside bwrap; `env` inside the sandbox shows only allowlisted variables (no SSH_AUTH_SOCK, AWS_PROFILE, etc.)
3. Secret paths (~/.ssh, ~/.gnupg, ~/.aws, ~/.config/gcloud, age keys, /var/lib/tailscale) are not visible inside the sandbox
4. Claude can run `curl https://example.com`, `git status`, `, jq --help` (comma), and `nix shell nixpkgs#python3 -c python3 --version` inside the sandbox
5. Ctrl+C terminates the session cleanly; exit code from Claude passes through to the caller
**Plans:** 2 plans
Plans:
- [x] 01-01-PLAN.md -- Create flake.nix and claudebox.sh with complete bwrap sandbox
- [x] 01-02-PLAN.md -- Build verification and manual sandbox smoke test
### Phase 2: Env Audit and CLI Polish
**Goal**: User can review exactly what enters the sandbox before launch, and has diagnostic tools for troubleshooting
**Depends on**: Phase 1
**Requirements**: UX-01, UX-02, UX-03, UX-04, UX-05
**Success Criteria** (what must be TRUE):
1. Running `claudebox` without `--yes` prints all env vars being passed into the sandbox and prompts for confirmation before proceeding
2. Running `claudebox --yes` or `claudebox -y` skips the env audit and launches immediately
3. Running `claudebox --dry-run` prints the full bwrap command without executing it
4. Running `claudebox --check` reports whether bwrap exists, required Nix packages are available, and ~/.claudebox exists
**Plans:** 2 plans
Plans:
- [x] 02-01-PLAN.md -- Refactor flag parsing, add --check and --dry-run modes
- [x] 02-02-PLAN.md -- Env audit display with grouping, masking, and confirmation prompt
### Phase 3: Sandbox-Aware Prompting
**Goal**: Claude inside the sandbox knows it is sandboxed, how to install tools, and what is unavailable
**Depends on**: Phase 1
**Requirements**: AWARE-01, AWARE-02
**Success Criteria** (what must be TRUE):
1. First run of `claudebox` creates a default CLAUDE.md in ~/.claudebox/ if none exists
2. The injected CLAUDE.md tells Claude it is in a bwrap sandbox, how to use comma (`, <tool>`) and `nix shell` for tool installation, and that SSH/GPG/cloud credentials are unavailable
**Plans:** 1 plan
Plans:
- [x] 03-01-PLAN.md -- Add SANDBOX.md generation and CLAUDE.md import management
</details>
## Progress
**Execution Order:**
Phases execute in numeric order: 1 -> 2 -> 3
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Minimal Viable Sandbox | 2/2 | Complete | - |
| 2. Env Audit and CLI Polish | 0/2 | Planned | - |
| 3. Sandbox-Aware Prompting | 0/1 | Not started | - |
| Phase | Milestone | Plans Complete | Status | Completed |
|-------|-----------|----------------|--------|-----------|
| 1. Minimal Viable Sandbox | v1.0 | 2/2 | Complete | 2026-04-09 |
| 2. Env Audit and CLI Polish | v1.0 | 2/2 | Complete | 2026-04-09 |
| 3. Sandbox-Aware Prompting | v1.0 | 1/1 | Complete | 2026-04-10 |

View file

@ -1,11 +1,11 @@
---
gsd_state_version: 1.0
milestone: v1.0
milestone_name: milestone
status: executing
stopped_at: Phase 3 context gathered
last_updated: "2026-04-09T19:24:16.913Z"
last_activity: 2026-04-09
milestone_name: MVP
status: complete
stopped_at: Milestone v1.0 complete
last_updated: "2026-04-10"
last_activity: 2026-04-10 - Completed v1.0 milestone
progress:
total_phases: 3
completed_phases: 3
@ -18,26 +18,15 @@ progress:
## Project Reference
See: .planning/PROJECT.md (updated 2026-04-09)
See: .planning/PROJECT.md (updated 2026-04-10)
**Core value:** Secrets never enter the Claude Code environment
**Current focus:** Phase 2 (next)
**Current focus:** Planning next milestone
## Current Position
Phase: 03 of 3 (sandbox aware prompting)
Plan: Not started
Status: Ready to execute
Last activity: 2026-04-10 - Completed quick task 260410-d4u: on non-nixos hosts, bwrap fails because /etc/static does not exist
Progress: [███░░░░░░░] 33%
## Performance Metrics
**Velocity:**
| Phase 01 P01 | 1min | 2 tasks | 3 files |
| Phase 01 P02 | 1min | 2 tasks | 1 file |
Milestone: v1.0 MVP — SHIPPED 2026-04-10
All 3 phases complete, 5 plans executed.
## Accumulated Context
@ -56,16 +45,10 @@ None.
### Blockers/Concerns
- SSL cert verification fails system-wide (host + sandbox) -- NixOS/OpenSSL issue, not claudebox
- SSL cert verification fails system-wide (host + sandbox) NixOS/OpenSSL issue, not claudebox
### Quick Tasks Completed
| # | Description | Date | Commit | Directory |
|---|-------------|------|--------|-----------|
| 260410-d4u | on non-nixos hosts, bwrap fails because /etc/static does not exist | 2026-04-10 | 97c10f8 | [260410-d4u-on-non-nixos-hosts-bwrap-fails-because-e](./quick/260410-d4u-on-non-nixos-hosts-bwrap-fails-because-e/) |
## Session Continuity
Last session: 2026-04-09T18:59:43.248Z
Stopped at: Phase 3 context gathered
Resume file: .planning/phases/03-sandbox-aware-prompting/03-CONTEXT.md

View file

@ -28,7 +28,8 @@
"skip_discuss": false,
"code_review": true,
"code_review_depth": "standard",
"use_worktrees": true
"use_worktrees": true,
"_auto_chain_active": false
},
"hooks": {
"context_warnings": true

View file

@ -1,3 +1,12 @@
# Requirements Archive: v1.0 MVP
**Archived:** 2026-04-10
**Status:** SHIPPED
For current requirements, see `.planning/REQUIREMENTS.md`.
---
# Requirements: claudebox
**Defined:** 2026-04-09

View file

@ -0,0 +1,73 @@
# Roadmap: claudebox
## Overview
claudebox is a Nix-packaged bwrap sandbox wrapper for Claude Code. The roadmap moves from a working sandbox (Phase 1) through CLI polish (Phase 2) to sandbox-aware prompting (Phase 3). Phase 1 is the bulk of the work -- once Claude runs inside bwrap with env isolation, filesystem isolation, and tool provisioning, the remaining phases add UX and developer experience improvements.
## Phases
**Phase Numbering:**
- Integer phases (1, 2, 3): Planned milestone work
- Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED)
Decimal phases appear between their surrounding integers in numeric order.
- [ ] **Phase 1: Minimal Viable Sandbox** - Working claudebox command that launches Claude in bwrap with full isolation and tool provisioning
- [ ] **Phase 2: Env Audit and CLI Polish** - Pre-launch env review, --yes, --dry-run, and --check flags
- [ ] **Phase 3: Sandbox-Aware Prompting** - Injected CLAUDE.md so Claude knows its capabilities and constraints
## Phase Details
### Phase 1: Minimal Viable Sandbox
**Goal**: User can run `claudebox` in any project directory and get a fully functional Claude Code session with secrets invisible
**Depends on**: Nothing (first phase)
**Requirements**: SAND-01, SAND-02, SAND-03, SAND-04, SAND-05, SAND-06, SAND-07, SAND-08, SAND-09, SAND-10, SAND-11, SAND-12, SAND-13, SAND-14, SAND-15, TOOL-01, TOOL-02, TOOL-03, GIT-01, GIT-02, NIX-01, NIX-02, NIX-03, UX-06
**Success Criteria** (what must be TRUE):
1. Running `nix run` or `nix profile install` produces a working `claudebox` command
2. `claudebox` launches Claude Code inside bwrap; `env` inside the sandbox shows only allowlisted variables (no SSH_AUTH_SOCK, AWS_PROFILE, etc.)
3. Secret paths (~/.ssh, ~/.gnupg, ~/.aws, ~/.config/gcloud, age keys, /var/lib/tailscale) are not visible inside the sandbox
4. Claude can run `curl https://example.com`, `git status`, `, jq --help` (comma), and `nix shell nixpkgs#python3 -c python3 --version` inside the sandbox
5. Ctrl+C terminates the session cleanly; exit code from Claude passes through to the caller
**Plans:** 2 plans
Plans:
- [x] 01-01-PLAN.md -- Create flake.nix and claudebox.sh with complete bwrap sandbox
- [x] 01-02-PLAN.md -- Build verification and manual sandbox smoke test
### Phase 2: Env Audit and CLI Polish
**Goal**: User can review exactly what enters the sandbox before launch, and has diagnostic tools for troubleshooting
**Depends on**: Phase 1
**Requirements**: UX-01, UX-02, UX-03, UX-04, UX-05
**Success Criteria** (what must be TRUE):
1. Running `claudebox` without `--yes` prints all env vars being passed into the sandbox and prompts for confirmation before proceeding
2. Running `claudebox --yes` or `claudebox -y` skips the env audit and launches immediately
3. Running `claudebox --dry-run` prints the full bwrap command without executing it
4. Running `claudebox --check` reports whether bwrap exists, required Nix packages are available, and ~/.claudebox exists
**Plans:** 2 plans
Plans:
- [x] 02-01-PLAN.md -- Refactor flag parsing, add --check and --dry-run modes
- [x] 02-02-PLAN.md -- Env audit display with grouping, masking, and confirmation prompt
### Phase 3: Sandbox-Aware Prompting
**Goal**: Claude inside the sandbox knows it is sandboxed, how to install tools, and what is unavailable
**Depends on**: Phase 1
**Requirements**: AWARE-01, AWARE-02
**Success Criteria** (what must be TRUE):
1. First run of `claudebox` creates a default CLAUDE.md in ~/.claudebox/ if none exists
2. The injected CLAUDE.md tells Claude it is in a bwrap sandbox, how to use comma (`, <tool>`) and `nix shell` for tool installation, and that SSH/GPG/cloud credentials are unavailable
**Plans:** 1 plan
Plans:
- [x] 03-01-PLAN.md -- Add SANDBOX.md generation and CLAUDE.md import management
## Progress
**Execution Order:**
Phases execute in numeric order: 1 -> 2 -> 3
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Minimal Viable Sandbox | 2/2 | Complete | - |
| 2. Env Audit and CLI Polish | 0/2 | Planned | - |
| 3. Sandbox-Aware Prompting | 0/1 | Not started | - |