test(02): persist human verification items as UAT
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent
caabd59ae2
commit
c83129953f
2 changed files with 179 additions and 0 deletions
44
.planning/phases/02-env-audit-and-cli-polish/02-HUMAN-UAT.md
Normal file
@@ -0,0 +1,44 @@
---
status: partial
phase: 02-env-audit-and-cli-polish
source: [02-VERIFICATION.md]
started: 2026-04-09T17:30:00Z
updated: 2026-04-09T17:30:00Z
---

## Current Test

[awaiting human testing]

## Tests

### 1. Visual Audit Display
expected: Run `claudebox` without `--yes`; the audit shows grouped sections (Sandbox-generated, Host allowlisted, Extra), PATH split on colons with one entry per line, sensitive values masked, and a Y/n prompt on stderr
result: [pending]

### 2. Dry-Run Output
expected: Run `claudebox --dry-run`; the full bwrap command prints to stderr and is not executed
result: [pending]

### 3. Check Mode
expected: Run `claudebox --check`; colored OK/FAIL/WARN output appears for bwrap, claude, git, curl, nix, ~/.claudebox, and ANTHROPIC_API_KEY
result: [pending]

### 4. Non-Interactive Abort
expected: Pipe input to `claudebox` (e.g., `echo | claudebox`); it aborts with an error telling the user to pass `--yes`/`-y`
result: [pending]

### 5. Yes Flag Skip
expected: Run `claudebox --yes` or `claudebox -y`; it skips the audit display and confirmation and launches immediately
result: [pending]

## Summary

total: 5
passed: 0
issues: 0
pending: 5
skipped: 0
blocked: 0

## Gaps
135
.planning/phases/02-env-audit-and-cli-polish/02-VERIFICATION.md
Normal file
@@ -0,0 +1,135 @@
---
phase: 02-env-audit-and-cli-polish
verified: 2026-04-09T16:00:00Z
status: human_needed
score: 4/4
overrides_applied: 0
human_verification:
  - test: "Run claudebox without --yes and verify env vars display with grouped sections"
    expected: "Three sections shown (Sandbox-generated, Host allowlisted, Extra) with PATH split per-line, sensitive values masked, Proceed? prompt appears"
    why_human: "Requires running in a terminal with bwrap available to verify visual output, TTY interaction, and color formatting"
  - test: "Run claudebox --yes and verify it launches immediately without audit"
    expected: "No env audit displayed, sandbox launches directly"
    why_human: "Requires running sandbox with bwrap and claude available"
  - test: "Run claudebox --dry-run and verify full bwrap command is printed"
    expected: "Complete bwrap command with all --setenv, mount flags, and sandbox command printed to stderr, then exits 0"
    why_human: "Requires runtime environment with SANDBOX_PATH and resolved binaries"
  - test: "Run claudebox --check and verify prerequisite report"
    expected: "Colored OK/FAIL/WARN for bwrap, claude, git, curl, nix, ~/.claudebox, ANTHROPIC_API_KEY"
    why_human: "Requires nix-built binary to test PATH resolution of check targets"
  - test: "Pipe input to claudebox (non-interactive) and verify it aborts"
    expected: "Error message about stdin not being a terminal, suggests --yes/-y, exits 1"
    why_human: "Requires runtime execution to test TTY detection"
---

# Phase 2: Env Audit and CLI Polish Verification Report

**Phase Goal:** User can review exactly what enters the sandbox before launch, and has diagnostic tools for troubleshooting
**Verified:** 2026-04-09T16:00:00Z
**Status:** human_needed
**Re-verification:** No -- initial verification

## Goal Achievement

### Observable Truths

| # | Truth | Status | Evidence |
|---|-------|--------|----------|
| 1 | Running `claudebox` without `--yes` prints all env vars and prompts for confirmation | VERIFIED | `print_audit()` at lines 175-211, prompt at line 219, guarded by `SKIP_AUDIT != true && DRY_RUN != true` at line 214 |
| 2 | Running `claudebox --yes` or `-y` skips env audit and launches immediately | VERIFIED | Flag parsing at line 10 sets `SKIP_AUDIT=true`, guard at line 214 checks it |
| 3 | Running `claudebox --dry-run` prints full bwrap command without executing | VERIFIED | Lines 240-272: prints all --setenv triplets, mount flags, sandbox command, then `exit 0` |
| 4 | Running `claudebox --check` reports whether bwrap, Nix packages, ~/.claudebox exist | VERIFIED | Lines 22-63: `check_cmd` for bwrap/claude/git/curl/nix, dir check for ~/.claudebox, ANTHROPIC_API_KEY warn |

**Score:** 4/4 truths verified

### Required Artifacts

| Artifact | Expected | Status | Details |
|----------|----------|--------|---------|
| `claudebox.sh` | Refactored flag parsing, --check, --dry-run (Plan 01) | VERIFIED | 299 lines, contains CHECK_MODE, DRY_RUN, SKIP_AUDIT, CLAUDE_ARGS (15 pattern matches) |
| `claudebox.sh` | Env audit display, masking, confirmation prompt (Plan 02) | VERIFIED | Contains mask_value, print_audit, Proceed (7 pattern matches) |

### Key Link Verification

| From | To | Via | Status | Details |
|------|----|-----|--------|---------|
| Flag parsing (CLAUDE_ARGS) | SANDBOX_CMD construction | `CLAUDE_ARGS` array replaces raw `$@` | WIRED | Declared line 6, accumulated lines 14-15, used in SANDBOX_CMD lines 234, 236 |
| Env audit block | SKIP_AUDIT flag | `if [[ "$SKIP_AUDIT" != true ]]` | WIRED | Set line 2/10, checked line 214 |
| Audit display | ENV_ARGS array | Parallel AUDIT_*_KEYS/VALS arrays | WIRED | AUDIT_SANDBOX/HOST/EXTRA arrays declared lines 120-125, populated lines 141-169, displayed in print_audit lines 175-211 |

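For readers without `claudebox.sh` open, the flag-to-`CLAUDE_ARGS` wiring described in the table above might look roughly like the sketch below. It is an illustration under assumptions, not an excerpt from the script: the `parse_flags` wrapper, the exact `case` arms, and the sample arguments are hypothetical. Only the variable names and the `while (( $# > 0 ))` loop shape come from this report.

```bash
# Hypothetical sketch of the flag-parsing pattern; not copied from claudebox.sh.
SKIP_AUDIT=false
DRY_RUN=false
CHECK_MODE=false
CLAUDE_ARGS=()

# parse_flags is an illustrative wrapper; the real script may parse inline.
parse_flags() {
  while (( $# > 0 )); do
    case "$1" in
      --yes|-y)  SKIP_AUDIT=true ;;
      --dry-run) DRY_RUN=true ;;
      --check)   CHECK_MODE=true ;;
      *)         CLAUDE_ARGS+=("$1") ;;  # anything else is passed through untouched
    esac
    shift
  done
}

# Placeholder arguments: two claudebox flags plus two pass-through words.
parse_flags --dry-run -y --placeholder "two words"

printf 'SKIP_AUDIT=%s DRY_RUN=%s CHECK_MODE=%s\n' "$SKIP_AUDIT" "$DRY_RUN" "$CHECK_MODE"
printf 'pass-through arg: %s\n' "${CLAUDE_ARGS[@]}"
```

Collecting the remainder in an array rather than re-using `$@` keeps arguments containing spaces intact when they are later expanded as `"${CLAUDE_ARGS[@]}"`.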
### Data-Flow Trace (Level 4)

Not applicable -- shell script with no dynamic data rendering. All data flows from flag parsing and host environment through to bwrap execution, verified via wiring checks above.

### Behavioral Spot-Checks

| Behavior | Command | Result | Status |
|----------|---------|--------|--------|
| nix build passes (shellcheck clean) | `nix build` | exit 0 | PASS |
| No TODO/FIXME/PLACEHOLDER markers | `grep -n TODO\|FIXME\|PLACEHOLDER claudebox.sh` | 0 matches | PASS |
| Flag parsing handles multiple flags | grep for while/shift loop | `while (( $# > 0 ))` at line 8 with case/esac | PASS |
| Mask function covers all sensitive patterns | grep mask_value body | KEY, TOKEN, SECRET, PASSWORD, CREDENTIAL all present | PASS |
| Stderr-only output | grep `>&2` count | 28 stderr redirections found | PASS |

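To make the masking spot-check concrete, here is a minimal sketch of a name-based mask. It is an assumption-laden illustration, not the body of the real `mask_value`: the `(name, value)` signature, the 7-character prefix, the 4-character suffix, and the `****` fallback are all hypothetical choices made to resemble the `sk-ant-...xxxx` format mentioned in this report. Only the five name patterns come from the table above.

```bash
# Hypothetical sketch: mask a value when the variable NAME looks sensitive.
mask_value() {
  local name="$1" value="$2"
  case "$name" in
    *KEY*|*TOKEN*|*SECRET*|*PASSWORD*|*CREDENTIAL*)
      if (( ${#value} > 11 )); then
        # Keep a short prefix and suffix so the user can recognize the secret.
        printf '%s...%s\n' "${value:0:7}" "${value: -4}"
      else
        printf '****\n'   # too short to reveal any part of it
      fi
      ;;
    *)
      printf '%s\n' "$value"   # non-sensitive names are shown in full
      ;;
  esac
}

mask_value ANTHROPIC_API_KEY "sk-ant-EXAMPLE-NOT-A-REAL-KEY-1234"
mask_value EDITOR "vim"
```

Substring patterns such as `*KEY*` also match harmless names that merely contain the word, so a sketch like this errs on the side of masking too much rather than too little.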
### Requirements Coverage

| Requirement | Source Plan | Description | Status | Evidence |
|-------------|------------|-------------|--------|----------|
| UX-01 | 02-02 | Pre-launch env audit displays all env vars on stderr | SATISFIED | `print_audit()` with 3 grouped sections, all output to stderr |
| UX-02 | 02-02 | Pre-launch env audit prompts for confirmation | SATISFIED | `Proceed? [Y/n]` at line 219, abort on `n`/`no` |
| UX-03 | 02-01 | `--yes`/`-y` skips confirmation | SATISFIED | Flag parsed line 10, guard at line 214 |
| UX-04 | 02-01 | `--dry-run` prints full bwrap command | SATISFIED | Lines 240-272, multiline bwrap output to stderr, exit 0 |
| UX-05 | 02-01 | `--check` verifies prerequisites | SATISFIED | Lines 22-63, checks bwrap/claude/git/curl/nix + ~/.claudebox + ANTHROPIC_API_KEY |

No orphaned requirements found -- all 5 phase requirements (UX-01 through UX-05) are claimed and satisfied.

### Anti-Patterns Found

| File | Line | Pattern | Severity | Impact |
|------|------|---------|----------|--------|
| (none) | - | - | - | No anti-patterns detected |

### Human Verification Required

### 1. Visual Audit Display

**Test:** Run `claudebox` in a terminal without `--yes` flag
**Expected:** Three grouped sections (Sandbox-generated, Host allowlisted, Extra) with colored headers, PATH entries split one per line, sensitive values masked (ANTHROPIC_API_KEY shows `sk-ant-...xxxx`), `Proceed? [Y/n]` prompt
**Why human:** Requires bwrap-capable environment, TTY interaction, visual confirmation of color formatting

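As a reference point for what "PATH entries split one per line" should look like, the sketch below splits a colon-separated value and prints each entry on its own indented line. The function name `print_path_entries`, the four-space indent, and the sample value are hypothetical; the real `print_audit()` may format its output differently.

```bash
# Hypothetical sketch: print a colon-separated PATH value one entry per line.
print_path_entries() {
  local value="$1" entry
  local -a entries
  # Split on ':' only, so entries containing spaces stay in one piece.
  IFS=':' read -r -a entries <<< "$value"
  for entry in "${entries[@]}"; do
    printf '    %s\n' "$entry"
  done
}

# Audit output goes to stderr, matching the stderr-only rule in this report.
print_path_entries "/usr/local/bin:/usr/bin:/bin" >&2
```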
### 2. Dry-Run Output

**Test:** Run `claudebox --dry-run`
**Expected:** Full multiline bwrap command printed to stderr with all --setenv and mount flags, exits 0
**Why human:** Requires runtime with resolved SANDBOX_PATH and binary paths

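The dry-run behaviour can be pictured with the simplified sketch below, which prints a command array in shell-quoted form instead of running it. Everything here is illustrative: `print_cmd`, the `BWRAP_CMD` array, and its sample arguments are hypothetical, and the output is a single line rather than the multiline format the real script is reported to produce.

```bash
# Hypothetical sketch: show a command instead of executing it.
print_cmd() {
  local out
  printf -v out '%q ' "$@"   # shell-quote every argument (bash builtin printf)
  printf '%s\n' "${out% }"   # drop the trailing space
}

# Sample command only; the real bwrap invocation is built elsewhere in the script.
BWRAP_CMD=(bwrap --setenv HOME "/home/some user" --ro-bind /nix /nix claude)
DRY_RUN=true

if [[ "$DRY_RUN" == true ]]; then
  print_cmd "${BWRAP_CMD[@]}" >&2
  # A real script would `exit 0` here so that bwrap is never started.
fi
```

Quoting with `%q` means the printed line can be pasted back into a shell even when an argument contains spaces.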
### 3. Check Mode

**Test:** Run `claudebox --check`
**Expected:** Colored OK/FAIL/WARN for each prerequisite, appropriate exit code
**Why human:** Requires nix-built binary to verify PATH resolution targets

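For orientation, a prerequisite check of this kind could be sketched as follows. Only the name `check_cmd` and the list of checked items come from this report; the function body, the color codes, the message wording, and the `failures` counter are assumptions for illustration.

```bash
# Hypothetical sketch of a --check style prerequisite report (all output on stderr).
check_cmd() {
  local name="$1"
  if command -v "$name" >/dev/null 2>&1; then
    printf '\033[32mOK\033[0m    %s\n' "$name" >&2
    return 0
  fi
  printf '\033[31mFAIL\033[0m  %s (not found on PATH)\n' "$name" >&2
  return 1
}

failures=0
for tool in bwrap claude git curl nix; do
  check_cmd "$tool" || failures=$((failures + 1))
done

# Directory and API key checks, reported as warnings in this sketch.
[ -d "${HOME:-}/.claudebox" ] || printf '\033[33mWARN\033[0m  ~/.claudebox does not exist\n' >&2
[ -n "${ANTHROPIC_API_KEY:-}" ] || printf '\033[33mWARN\033[0m  ANTHROPIC_API_KEY is not set\n' >&2

printf '%d prerequisite(s) missing\n' "$failures" >&2
```

A real `--check` mode would presumably turn a non-zero `failures` count into a non-zero exit code, which is what the "appropriate exit code" expectation above asks the tester to confirm.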
### 4. Non-Interactive Abort

**Test:** Run `echo "" | claudebox`
**Expected:** Error message about stdin not being a terminal, suggests `--yes`/`-y`, exits 1
**Why human:** Requires runtime TTY detection test

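The TTY detection being tested here typically relies on the shell's `-t 0` test, which is true only when stdin is a terminal. The sketch below shows one way such a guard could work; the function name `require_tty_or_yes` and the error wording are hypothetical, and it returns 1 where a real script would `exit 1`.

```bash
# Hypothetical sketch of a non-interactive guard; not copied from claudebox.sh.
require_tty_or_yes() {
  local skip_audit="$1"
  if [[ "$skip_audit" != true && ! -t 0 ]]; then
    echo "error: stdin is not a terminal; pass --yes/-y to skip the confirmation prompt" >&2
    return 1   # a real script would exit 1 here
  fi
  return 0
}

# Simulate piped input by redirecting stdin from /dev/null.
if require_tty_or_yes false </dev/null 2>/dev/null; then
  echo "would show the audit and prompt"
else
  echo "aborted: non-interactive without --yes"
fi
```

Because `echo "" | claudebox` replaces stdin with a pipe, the `-t 0` test fails and the guard fires before any prompt could block waiting for input.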
### 5. Yes Flag Skip

**Test:** Run `claudebox --yes`
**Expected:** No audit display, sandbox launches immediately
**Why human:** Requires full sandbox environment

### Gaps Summary

No automated gaps found. All 4 roadmap success criteria verified at code level. All 5 requirements (UX-01 through UX-05) are satisfied in the implementation. The code is clean (no TODOs, no stubs, shellcheck passes via nix build).

One minor documentation note: commit hashes in 02-01-SUMMARY.md (`07096ae`, `3903667`, `cc6bd5b`) do not match actual commits (`72ba48d`, `1eddd93`, `7001303`). This is cosmetic and does not affect functionality.

Human verification is needed to confirm runtime behavior -- the code structure is correct but these are interactive CLI features that require a terminal and bwrap environment to fully validate.

---

_Verified: 2026-04-09T16:00:00Z_
_Verifier: Claude (gsd-verifier)_