16 KiB
Phase 2: Env Audit and CLI Polish - Research
Researched: 2026-04-09 Domain: Bash CLI UX -- flag parsing, ANSI formatting, TTY detection Confidence: HIGH
Summary
This phase adds pre-launch transparency and diagnostic CLI flags to the existing claudebox.sh script. All work is pure bash -- no new dependencies, no new Nix packages, no external tools. The existing code already has the data structures (ENV_ARGS, HOST_ALLOWLIST, CLAUDEBOX_EXTRA_ENV parsing) and the flag parsing skeleton (case/esac). The phase extends these with display logic, confirmation prompts, and two new early-exit modes (--dry-run, --check).
The main technical concerns are: (1) correctly iterating ENV_ARGS for display without breaking the bwrap invocation, (2) ANSI escape code portability, (3) non-interactive stdin detection, and (4) shellcheck compliance since writeShellApplication enforces it at build time.
Primary recommendation: Build parallel data structures for audit display rather than parsing the ENV_ARGS array. Track env vars in associative arrays by category during construction, then format for display separately from the bwrap --setenv pairs.
<user_constraints>
User Constraints (from CONTEXT.md)
Locked Decisions
- D-01: Group env vars by source in three labeled sections: "Sandbox-generated", "Host (allowlisted)", "Extra (CLAUDEBOX_EXTRA_ENV)". Display to stderr.
- D-02: PATH is split by
:and displayed one entry per line (indented under PATH=) for readability. - D-03: Use plain ANSI escape codes for color/formatting -- no external dependency like gum. Bold section headers, colored section labels.
- D-04: Auto-mask values where the variable name matches
*KEY*,*TOKEN*,*SECRET*,*PASSWORD*,*CREDENTIAL*(case-insensitive). Show first 7 + last 4 characters with...in between. - D-05: Use
Proceed? [Y/n]prompt -- default is proceed (Enter launches). User must typenornoto abort. - D-06: If stdin is not a TTY (piped input, CI, scripts), abort with error telling user to pass
--yes/-y. Do NOT auto-proceed. - D-07: All audit output and the prompt go to stderr. Keeps stdout clean.
- D-08:
--yes/-yskips the env audit display and confirmation entirely -- launches immediately. - D-09:
--dry-runprints the full bwrap command without executing. Claude's discretion on format. - D-10:
--checkverifies prerequisites: bwrap exists, required Nix packages available,~/.claudeboxexists. Claude's discretion on depth/format.
Claude's Discretion
--dry-runoutput format (single-line vs multiline, annotated vs raw)--checkdiagnostic depth (existence-only vs version checks vs connectivity tests)- Exact ANSI color choices and spacing
- Flag parsing order and error messages for invalid flag combinations
Deferred Ideas (OUT OF SCOPE)
None -- discussion stayed within phase scope. </user_constraints>
<phase_requirements>
Phase Requirements
| ID | Description | Research Support |
|---|---|---|
| UX-01 | Pre-launch env audit displays all env vars being passed into the sandbox on stderr | Audit display patterns, ANSI formatting, ENV_ARGS iteration approach |
| UX-02 | Pre-launch env audit prompts for confirmation before proceeding | TTY detection, read prompt pattern, non-interactive abort |
| UX-03 | --yes / -y flag skips the env audit confirmation |
Flag parsing extension to existing case/esac |
| UX-04 | --dry-run flag prints the full bwrap command without executing |
Command reconstruction from arrays, printf %q quoting |
| UX-05 | --check flag verifies bwrap exists, required Nix packages available, and ~/.claudebox exists |
command -v checks, exit code conventions |
| </phase_requirements> |
Standard Stack
No new packages required. Everything is bash builtins and existing runtimeInputs.
| Tool | Source | Purpose | Already Available |
|---|---|---|---|
| bash builtins | runtimeInputs | ANSI output, read, test, printf | Yes |
| jq | runtimeInputs | Not needed for this phase | Yes (unused) |
| coreutils | runtimeInputs | tput alternative -- but ANSI codes are simpler | Yes |
Architecture Patterns
Integration Points in claudebox.sh
The current script flow is linear:
1. Parse flags (--shell)
2. Resolve binaries
3. Record CWD, ensure ~/.claudebox
4. Generate gitconfig
5. Build ENV_ARGS array
6. Build SANDBOX_CMD
7. exec bwrap
Phase 2 inserts into this flow:
1. Parse flags (--shell, --yes/-y, --dry-run, --check) [EXTEND]
2. --check: early exit [NEW]
3. Resolve binaries
4. Record CWD, ensure ~/.claudebox
5. Generate gitconfig
6. Build ENV_ARGS array
7. Build SANDBOX_CMD
8. Env audit display + confirmation (unless --yes) [NEW]
9. --dry-run: print command and exit [NEW]
10. exec bwrap
Pattern: Parallel Display Data
Rather than parsing ENV_ARGS (which is --setenv key value triplets), maintain separate display-oriented arrays during construction. This avoids fragile parsing of the bwrap args array.
# During env construction, also track for display
declare -A SANDBOX_VARS # sandbox-generated vars
declare -A HOST_VARS # host allowlisted vars
declare -A EXTRA_VARS # CLAUDEBOX_EXTRA_ENV vars
[VERIFIED: reading claudebox.sh -- the three categories already have distinct code blocks that can populate these]
Pattern: ANSI Escape Codes
# Color constants -- define once at top
BOLD=$'\033[1m'
RESET=$'\033[0m'
DIM=$'\033[2m'
CYAN=$'\033[36m'
YELLOW=$'\033[33m'
GREEN=$'\033[32m'
RED=$'\033[31m'
[ASSUMED] These are standard VT100/ECMA-48 sequences supported by all modern terminals. No tput dependency needed.
Pattern: Value Masking (D-04)
mask_value() {
local name="$1" value="$2"
# Case-insensitive match on var name
if [[ "${name^^}" == *KEY* || "${name^^}" == *TOKEN* || "${name^^}" == *SECRET* || "${name^^}" == *PASSWORD* || "${name^^}" == *CREDENTIAL* ]]; then
local len=${#value}
if (( len > 11 )); then
echo "${value:0:7}...${value: -4}"
else
echo "***"
fi
else
echo "$value"
fi
}
Note: ${name^^} converts to uppercase in bash 4+. NixOS ships bash 5.x, so this is safe. [VERIFIED: NixOS uses bash 5.x from nixpkgs]
Pattern: PATH Display (D-02)
display_path() {
echo " PATH="
IFS=':' read -ra path_entries <<< "$1"
for entry in "${path_entries[@]}"; do
echo " $entry"
done
}
Pattern: TTY Detection (D-06)
if [[ -t 0 ]]; then
# Interactive -- show prompt
read -r -p "Proceed? [Y/n] " response < /dev/tty
# ...
else
echo "Error: stdin is not a terminal. Pass --yes or -y to skip confirmation." >&2
exit 1
fi
Important: Use read < /dev/tty rather than plain read because stdin may be consumed by pipes even when /dev/tty exists. The [[ -t 0 ]] check catches the non-interactive case. [ASSUMED]
Pattern: --dry-run Output
Recommend multiline format with one flag per line, matching the existing exec bwrap layout in the script. This makes it easy to diff against the actual invocation and spot issues.
if [[ "$DRY_RUN" == true ]]; then
echo "bwrap \\" >&2
echo " --clearenv \\" >&2
# ... each flag on its own line
exit 0
fi
Use printf '%q ' for values that may contain special characters (though in practice, env values and paths are clean).
Pattern: --check Diagnostics
Recommend checking:
command -v bwrap-- is bwrap on PATHcommand -v claude-- is claude on PATH- Key runtimeInputs: git, curl, nix, bash
~/.claudeboxdirectory existsANTHROPIC_API_KEYis set (warn if missing, don't fail)
Output: one line per check with pass/fail indicator. Exit 0 if all required checks pass, exit 1 if any required check fails.
Pattern: Flag Parsing Extension
The existing parser uses for arg in "$@" with shift. Extend with additional cases. Important: --check should be checked first (early exit before env construction), but parsing order can collect all flags first, then branch.
SKIP_AUDIT=false
DRY_RUN=false
CHECK_MODE=false
SHELL_MODE=false
CLAUDE_ARGS=()
for arg in "$@"; do
case "$arg" in
--yes|-y) SKIP_AUDIT=true ;;
--dry-run) DRY_RUN=true ;;
--check) CHECK_MODE=true ;;
--shell) SHELL_MODE=true ;;
--) shift; break ;;
*) CLAUDE_ARGS+=("$arg") ;;
esac
done
Note: Current parsing uses shift and break which is problematic for multi-flag support. The refactored approach collects all flags in one pass and stores remaining args in CLAUDE_ARGS. [VERIFIED: current claudebox.sh only handles --shell with shift+break]
Anti-Patterns to Avoid
- Parsing ENV_ARGS for display: The array contains
--setenv key valuetriplets interleaved with bwrap flags. Iterating it for display is fragile. Track display data separately. - Using
tputfor colors: Adds an ncurses dependency. ANSI escape codes are sufficient and have no dependency. - Auto-proceeding in non-interactive mode: D-06 explicitly requires aborting. Don't silently proceed.
- Echoing sensitive values without masking: D-04 requires masking KEY/TOKEN/SECRET/PASSWORD/CREDENTIAL patterns.
Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| Color output | ncurses/tput wrapper | Raw ANSI escapes | D-03 mandates plain ANSI, zero deps |
| Argument parsing | getopt/getopts | Simple case/esac loop | Only 4 flags, no complex options, matches existing pattern |
Common Pitfalls
Pitfall 1: Shellcheck Violations
What goes wrong: writeShellApplication runs shellcheck at build time. Bash-isms like ${name^^} or declare -A may trigger warnings.
Why it happens: shellcheck defaults may flag associative arrays or uppercase expansion.
How to avoid: Use # shellcheck disable=SCXXXX directives only when necessary. Test with shellcheck claudebox.sh locally before nix build. Associative arrays and ${var^^} are valid bash and shellcheck-clean in bash mode.
Warning signs: nix build fails with shellcheck errors.
Pitfall 2: Flag Parsing Breaking Passthrough
What goes wrong: Claudebox consumes a flag meant for claude, or fails to pass remaining args.
Why it happens: Current parsing uses shift+break which only handles one flag.
How to avoid: Refactor to collect all known flags, accumulate unknown args in CLAUDE_ARGS array, pass CLAUDE_ARGS to claude.
Warning signs: claudebox --model sonnet silently drops --model.
Pitfall 3: read Prompt in Non-TTY
What goes wrong: read hangs or reads garbage when stdin is piped.
Why it happens: No TTY check before prompting.
How to avoid: Check [[ -t 0 ]] before read. Read from /dev/tty explicitly.
Warning signs: Script hangs in CI or when piped.
Pitfall 4: Masking Short Values
What goes wrong: Masking "first 7 + last 4" on a 5-character value reveals the whole thing.
Why it happens: No length check before substring extraction.
How to avoid: If value length <= 11, show *** instead of partial mask.
Warning signs: Short API keys fully visible in audit output.
Pitfall 5: ANSI Codes in Redirected Output
What goes wrong: If stderr is redirected to a file, ANSI escape codes pollute the output.
Why it happens: Colors are sent regardless of terminal capability.
How to avoid: Optional: check [[ -t 2 ]] and suppress colors if stderr is not a terminal. This is discretionary per D-03, but good practice.
Warning signs: Garbled text in log files.
Code Examples
Complete Flag Parsing (recommended)
# Parse claudebox flags -- collect our flags, pass the rest to claude
SKIP_AUDIT=false
DRY_RUN=false
CHECK_MODE=false
SHELL_MODE=false
CLAUDE_ARGS=()
while (( $# > 0 )); do
case "$1" in
--yes|-y) SKIP_AUDIT=true ;;
--dry-run) DRY_RUN=true ;;
--check) CHECK_MODE=true ;;
--shell) SHELL_MODE=true ;;
--) shift; CLAUDE_ARGS+=("$@"); break ;;
*) CLAUDE_ARGS+=("$1") ;;
esac
shift
done
[ASSUMED] This is standard bash argument parsing. The while/shift pattern is more robust than the current for/shift/break.
Env Audit Display Function
print_audit() {
local bold=$'\033[1m' reset=$'\033[0m'
local cyan=$'\033[36m' yellow=$'\033[33m' green=$'\033[32m'
echo "${bold}${cyan}=== Sandbox Environment ===${reset}" >&2
echo "" >&2
echo "${bold}Sandbox-generated:${reset}" >&2
for var in HOME USER PATH SHELL TMPDIR XDG_RUNTIME_DIR NIX_SSL_CERT_FILE SSL_CERT_FILE; do
if [[ "$var" == "PATH" ]]; then
echo " ${green}PATH=${reset}" >&2
IFS=':' read -ra entries <<< "$SANDBOX_PATH"
for entry in "${entries[@]}"; do
echo " $entry" >&2
done
else
# Look up value from the sandbox vars
echo " ${green}${var}=${reset}${SANDBOX_DISPLAY[$var]}" >&2
fi
done
# ... similar for host and extra sections
}
--check Implementation
run_check() {
local pass=true
local green=$'\033[32m' red=$'\033[31m' reset=$'\033[0m'
check_cmd() {
if command -v "$1" &>/dev/null; then
echo "${green}OK${reset} $1" >&2
else
echo "${red}FAIL${reset} $1 -- not found" >&2
pass=false
fi
}
echo "claudebox prerequisites:" >&2
check_cmd bwrap
check_cmd claude
check_cmd git
check_cmd nix
if [[ -d "$HOME/.claudebox" ]]; then
echo "${green}OK${reset} ~/.claudebox exists" >&2
else
echo "${red}FAIL${reset} ~/.claudebox -- not found (will be created on first run)" >&2
fi
if [[ -v ANTHROPIC_API_KEY ]]; then
echo "${green}OK${reset} ANTHROPIC_API_KEY is set" >&2
else
echo "${yellow}WARN${reset} ANTHROPIC_API_KEY is not set" >&2
fi
if [[ "$pass" == true ]]; then
exit 0
else
exit 1
fi
}
Assumptions Log
| # | Claim | Section | Risk if Wrong |
|---|---|---|---|
| A1 | ANSI VT100 escape codes work in all target terminals | Architecture Patterns | Low -- NixOS terminals universally support ANSI |
| A2 | read < /dev/tty is the correct pattern for prompting when stdin may be piped |
Architecture Patterns | Low -- standard Unix practice |
| A3 | ${var^^} uppercase expansion is shellcheck-clean |
Pitfalls | Low -- shellcheck knows bash, would only flag if shell directive is sh |
| A4 | while/shift is more robust than for/shift/break for multi-flag parsing | Code Examples | Very low -- well-established pattern |
Open Questions
-
Should
--dry-runalso show the env audit?- What we know: D-09 says print the full bwrap command. The env audit is separate (D-01).
- What's unclear: Whether
--dry-runimplies--yes(skip audit) or shows audit then command. - Recommendation:
--dry-runimplies--yes-- the user wants to see the command, not be prompted. Show the command and exit.
-
Should
--checkbe combinable with other flags?- What we know:
--checkis an early-exit diagnostic. - What's unclear: What if user passes
--check --dry-run. - Recommendation:
--checktakes priority, exits before other flags matter. No need to error on combinations.
- What we know:
-
Color disable for non-terminal stderr?
- What we know: D-03 says use ANSI codes. Doesn't mention disabling.
- What's unclear: Whether to add
NO_COLORor[[ -t 2 ]]detection. - Recommendation: Respect the
NO_COLORenv var convention if set. Otherwise, always emit ANSI. This is low-effort and follows modern CLI conventions.
Sources
Primary (HIGH confidence)
claudebox.sh-- current implementation, read directlyflake.nix-- current derivation structure, read directly02-CONTEXT.md-- locked decisions D-01 through D-10REQUIREMENTS.md-- UX-01 through UX-05 definitions
Secondary (MEDIUM confidence)
- Bash reference manual (training data) -- builtins, parameter expansion, test operators
- VT100/ECMA-48 escape codes (training data) -- ANSI color sequences
Metadata
Confidence breakdown:
- Standard stack: HIGH -- no new packages, pure bash
- Architecture: HIGH -- clear insertion points in existing script, straightforward patterns
- Pitfalls: HIGH -- well-known bash gotchas, verified against existing code
Research date: 2026-04-09 Valid until: No expiry -- bash and ANSI codes are stable