---
phase: 01-minimal-viable-sandbox
plan: 02
type: execute
wave: 2
depends_on: ["01-01"]
files_modified: []
autonomous: false
requirements:
  - NIX-03
  - SAND-02
  - SAND-03
  - SAND-04
  - SAND-05
  - SAND-06
  - SAND-09
  - SAND-10
  - SAND-12
  - SAND-13
  - SAND-14
  - TOOL-01
  - TOOL-02

must_haves:
  truths:
    - "`nix build` succeeds and produces a claudebox binary"
    - "claudebox launches and env inside sandbox contains only allowlisted vars"
    - "Secret paths are invisible inside the sandbox"
    - "DNS and SSL work (curl https succeeds)"
    - "comma and nix shell can install packages"
    - "Exit code passes through from claude to caller"
  artifacts: []
  key_links:
    - from: "nix build result"
      to: "claudebox binary"
      via: "result/bin/claudebox symlink"
      pattern: "result/bin/claudebox"
---

<objective>
Build the claudebox flake and verify the sandbox works end-to-end through automated smoke tests and manual verification.

Purpose: Confirm the sandbox actually isolates secrets, passes through tools, and runs Claude Code successfully.
Output: Verified working claudebox command.
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/phases/01-minimal-viable-sandbox/01-CONTEXT.md
@.planning/phases/01-minimal-viable-sandbox/01-01-SUMMARY.md
@flake.nix
@claudebox.sh
</context>

<tasks>

<task type="auto">
<name>Task 1: Build flake and run automated smoke tests</name>
<files></files>
<read_first>
flake.nix
claudebox.sh
</read_first>
<action>
Run the following commands sequentially, fixing any issues that arise:

**Step 1: Build the flake**
```bash
cd /home/toph/code/tools/claudebox
nix build
```
If this fails, read the error and fix `flake.nix` or `claudebox.sh` as needed. Common issues:
- shellcheck errors in claudebox.sh (fix the shell code)
- Missing flake.lock (nix build will create it on first run)
- Package name mismatches (verify against nixpkgs)

**Step 2: Verify the binary exists**
```bash
ls -la result/bin/claudebox
```

**Step 3: Run a minimal bwrap test without Claude**
The goal is to test the sandbox without needing Claude by exercising the bwrap portion (mounts and env isolation) on its own. The built wrapper does not expose that portion separately, so the command below only confirms that the script starts; verifying a property such as `--clearenv` requires invoking bwrap directly with the same flags pattern as `claudebox.sh`.

```bash
# Smoke test: confirm the built script at least starts.
# If claude is on PATH, it will launch. If not, the script fails at the
# claude lookup, which is the expected error here.
# Note: this does not verify that --clearenv produces an empty env; that
# requires running bwrap directly using the same flags pattern.
result/bin/claudebox 2>&1 || true
```
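
The direct bwrap check mentioned above is not spelled out in this plan. A minimal sketch follows, assuming `bwrap` is on the host PATH and unprivileged user namespaces are permitted. The mount flags are illustrative and are not copied from `claudebox.sh`; the names `probe_args` and `probe_clearenv` are hypothetical.

```bash
# Sketch only: an illustrative flag set, NOT the exact flags used by claudebox.sh.
# --clearenv drops every inherited variable; --setenv re-adds PATH so that
# `env` can still be found inside the sandbox.
probe_args=(
  --ro-bind / /
  --dev /dev
  --proc /proc
  --clearenv
  --setenv PATH "$PATH"
  env
)

# Run `env` inside a throwaway sandbox and print what it sees.
probe_clearenv() {
  bwrap "${probe_args[@]}"
}
```

Calling `probe_clearenv` should print little beyond `PATH`; any host variable such as `SSH_AUTH_SOCK` appearing in the output would indicate that the environment was not cleared.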

Since claudebox requires `claude` in PATH and will exec into it, automated testing is limited. The key automated checks are:

1. `nix build` succeeds (shellcheck passes, all deps resolve)
2. `result/bin/claudebox` exists and is executable
3. The script content in the Nix store passes a basic sanity check: `head -n 20 result/bin/claudebox` shows the wrapper with correct PATH setup

Run:
```bash
# Check the built wrapper contains expected runtimeInputs in PATH
head -n 20 result/bin/claudebox
```
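
The wrapper inspection above can also be done without reading the output by eye. The sketch below assumes the script is built with nixpkgs' `writeShellApplication`, which places `runtimeInputs` on PATH through an `export PATH=` line; if the flake uses a different builder, the pattern will need adjusting. The function name `wrapper_has_store_path` is hypothetical.

```bash
# Return 0 if the given wrapper script exports a PATH that contains at least
# one /nix/store entry, 1 otherwise.
# Assumption: the wrapper sets PATH with a line of the form: export PATH="..."
wrapper_has_store_path() {
  local wrapper="$1"
  grep -Eq '^export PATH="[^"]*/nix/store/' "$wrapper"
}
```

For example, `wrapper_has_store_path result/bin/claudebox && echo "PASS: PATH setup found" || echo "FAIL: no store paths in PATH"`.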

If `nix build` fails due to shellcheck issues in claudebox.sh, fix them. Common shellcheck fixes:
- SC2086: Double-quote variable expansions
- SC2034: Unused variables (may need `# shellcheck disable=SC2034` if intentional)
- SC2155: Declare and assign separately

After build succeeds, if `claude` is available on the host PATH, run a quick sandbox test:
```bash
# Quick test: launch claudebox with --help to verify it starts and exits cleanly
result/bin/claudebox --help 2>&1 | head -5 || true
```
This should show Claude Code's help output if everything is wired correctly, or show a meaningful error.
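
The quick test above pipes through `head` and appends `|| true`, so the exit status of claudebox is discarded. Where the status matters (for instance, for the "exit code passes through" truth), a small helper along these lines may be useful. It is a sketch, and the function name `run_and_report` is hypothetical.

```bash
# Run a command, print the first five lines of its combined output,
# then report and return the command's own exit status.
run_and_report() {
  local output status
  # Capture stdout+stderr; record the status without tripping `set -e`.
  output=$("$@" 2>&1) && status=0 || status=$?
  head -n 5 <<<"$output"
  echo "exit status: $status"
  return "$status"
}
```

For example, `run_and_report result/bin/claudebox --help` shows the start of the help text followed by the status that claudebox returned.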
</action>
<verify>
<automated>test -x /home/toph/code/tools/claudebox/result/bin/claudebox && echo "PASS: binary exists" || echo "FAIL: binary missing"</automated>
</verify>
<acceptance_criteria>
- `nix build` exits 0 (no shellcheck errors, all deps resolve)
- `result/bin/claudebox` exists and is executable
- `flake.lock` exists (created by first build)
- The built wrapper script in the Nix store contains runtimeInputs PATH entries (visible in `cat result/bin/claudebox`)
</acceptance_criteria>
<done>nix build succeeds and produces an executable claudebox binary</done>
</task>

<task type="checkpoint:human-verify" gate="blocking">
<name>Task 2: Manual sandbox verification</name>
<files></files>
<action>Present the verification checklist below to the user and wait for their confirmation that each check passes.</action>
<what-built>Complete claudebox sandbox wrapping Claude Code with environment isolation, filesystem isolation, secret hiding, git support, and tool provisioning</what-built>
<how-to-verify>
1. Launch claudebox from a project directory:
   ```
   cd ~/some-project
   /home/toph/code/tools/claudebox/result/bin/claudebox
   ```

2. Inside the Claude session, verify environment isolation:
   - Ask Claude to run `env | sort` -- should show ONLY allowlisted vars (HOME, PATH, TERM, USER, SHELL, TMPDIR, etc.)
   - Confirm NO appearance of: SSH_AUTH_SOCK, AWS_PROFILE, GITHUB_TOKEN, or any secret vars

3. Verify filesystem isolation:
   - Ask Claude to run `ls ~/.ssh` -- should fail (directory not found)
   - Ask Claude to run `ls ~/.gnupg` -- should fail
   - Ask Claude to run `ls ~/.aws` -- should fail
   - Ask Claude to run `ls ~/.claude` -- should succeed (mapped from ~/.claudebox)

4. Verify tools work:
   - Ask Claude to run `git status` -- should work in the project dir
   - Ask Claude to run `curl -s https://example.com | head -5` -- should return HTML (DNS + SSL work)
   - Ask Claude to run `, jq --help | head -3` -- should install and run jq via comma
   - Ask Claude to run `rg --version` -- should show ripgrep version

5. Exit Claude (Ctrl+C or /exit) and verify:
   - The shell returns to your normal prompt
   - `echo $?` shows the exit code from Claude (typically 0)
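
The environment check in step 2 can be made less error-prone with a small filter. This is a sketch: the function name `unexpected_env_vars` is hypothetical, and the allowlist is passed in as arguments because this plan names only part of it (the "etc." above); the authoritative list is whatever `claudebox.sh` allows.

```bash
# Read `env` output on stdin and print the name of every variable that is
# not in the allowlist given as arguments. Returns 1 if any were found.
unexpected_env_vars() {
  local line name allowed found=0
  while IFS= read -r line; do
    # Only lines that start with NAME= are variables; continuation lines of
    # multi-line values are skipped.
    [[ "$line" =~ ^([A-Za-z_][A-Za-z0-9_]*)= ]] || continue
    name="${BASH_REMATCH[1]}"
    for allowed in "$@"; do
      [[ "$name" == "$allowed" ]] && continue 2
    done
    echo "$name"
    found=1
  done
  return "$found"
}
```

Inside the sandbox, something like `env | unexpected_env_vars HOME PATH TERM USER SHELL TMPDIR` lists anything outside those six names. The shell itself typically adds `PWD`, `SHLVL`, and `_`, so those may need to be appended to the arguments along with the rest of the real allowlist.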
</how-to-verify>
<verify>
<automated>echo "CHECKPOINT: requires human verification"</automated>
</verify>
<done>User confirms all sandbox isolation and tool provisioning checks pass</done>
<resume-signal>Type "approved" if all checks pass, or describe any issues found</resume-signal>
</task>

</tasks>

<threat_model>
## Trust Boundaries

| Boundary | Description |
|----------|-------------|
| Build output -> Runtime | Nix build produces the sandbox script; verification confirms it behaves as designed |

## STRIDE Threat Register

| Threat ID | Category | Component | Disposition | Mitigation Plan |
|-----------|----------|-----------|-------------|-----------------|
| T-01-08 | Information Disclosure | Env leak in built binary | mitigate | Manual verification (Task 2 step 2) confirms only allowlisted vars appear in `env` output inside sandbox |
| T-01-09 | Information Disclosure | Secret path accessible | mitigate | Manual verification (Task 2 step 3) confirms ~/.ssh, ~/.gnupg, ~/.aws are not visible |
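
For T-01-09, the three `ls` checks in Task 2 step 3 can be collapsed into one command run inside the sandbox. The helper below is a sketch with a hypothetical name (`visible_secret_paths`); it treats a path as hidden only when it does not exist at all, which matches the "directory not found" expectation in this plan.

```bash
# Print every given path that exists from the current process's point of view.
# Returns 1 if any path was visible, 0 if all of them were hidden.
visible_secret_paths() {
  local path found=0
  for path in "$@"; do
    if [[ -e "$path" ]]; then
      echo "$path"
      found=1
    fi
  done
  return "$found"
}
```

Inside the sandbox, `visible_secret_paths ~/.ssh ~/.gnupg ~/.aws && echo "PASS: secret paths hidden"` prints `PASS` only when none of the three paths exist.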
</threat_model>

<verification>
1. `nix build` exits 0
2. Human confirms env isolation (only allowlisted vars visible)
3. Human confirms filesystem isolation (secret paths invisible)
4. Human confirms tools work (git, curl, comma, ripgrep)
5. Human confirms clean exit behavior
</verification>

<success_criteria>
- claudebox builds from the Nix flake without errors
- Human verifies the sandbox isolates secrets and provides working tools
- Phase 1 success criteria from ROADMAP.md are met
</success_criteria>

<output>
After completion, create `.planning/phases/01-minimal-viable-sandbox/01-02-SUMMARY.md`
</output>