# Pitfalls Research
**Domain:** Bubblewrap sandbox wrappers for CLI tools on NixOS — v2.0 additions
**Researched:** 2026-04-10
**Confidence:** MEDIUM-HIGH (combination of live web research + authoritative sources + training data)
This file supersedes the v1.0 pitfalls from 2026-04-09 and focuses on the four new feature areas of the v2.0 milestone: tiered network isolation (slirp4netns), per-project instance isolation, named profiles, and host auth passthrough.
---
## Critical Pitfalls
### Pitfall 1: Auth Passthrough Read-Only Mount Breaks OAuth Token Refresh
**What goes wrong:**
Mounting `~/.claude/.credentials.json` as `--ro-bind` (read-only) into the sandbox to provide auth passthrough seems correct from a security standpoint. In practice, Claude Code's OAuth flow performs a read-refresh-write cycle on `.credentials.json` on every session startup and periodically when the access token approaches expiry. A read-only mount causes the write to fail silently or with a cryptic EACCES error, breaking authentication.
**Why it happens:**
The mental model is "auth is a secret, secrets should be read-only." But Claude Code on Linux stores the entire credentials object — including the refreshToken — in `.credentials.json`, and OAuth access tokens expire. The refresh flow reads the current token, requests a new access+refresh token pair from Anthropic's auth server, and writes the new pair back. If the write fails, the next invocation has an expired access token and a still-valid refresh token it can't update, eventually causing 401 errors.
Additionally, OAuth refresh tokens are single-use server-side. If a concurrent claude session inside the sandbox refreshes the token and can't write back, the original on-disk token is now invalid too. The user gets locked out.
**How to avoid:**
Do not mount `.credentials.json` read-only. Instead, mount `~/.claude` (or the specific subset of auth files) with `--bind` (read-write). Alternatively, keep credentials on the read-write `~/.claudebox` instance directory and symlink or copy on launch, updating the host copy on exit — but this is more complex.
The simplest correct approach: mount `~/.claude` read-write for auth files only, and keep the per-project conversation history and settings on the instance-scoped `~/.claudebox/instances/<hash>/.claude/`. Use a two-directory structure:
```
~/.claudebox/
  auth/              # writable bind-mount to host ~/.claude auth files
  instances/<hash>/  # per-project instance (conversation history, settings)
```
Inside the sandbox:
```
~/.claude                    -> ~/.claudebox/instances/<hash>/        (conversation history)
~/.claude/.credentials.json  -> ~/.claudebox/auth/.credentials.json   (via symlink or separate bind)
```
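A sketch of how this split can translate into bwrap mount arguments. The variable names and paths here are illustrative, not the project's actual layout; the key point is the ordering, since later binds layer over earlier ones:

```shell
# Per-project hash is computed earlier in the wrapper (placeholder default here)
INSTANCE_HASH=${INSTANCE_HASH:-0123456789abcdef}

# The credentials bind must come AFTER the instance bind so that the file
# inside ~/.claude resolves to the shared, writable host copy.
bwrap_mounts=(
    --bind "$HOME/.claudebox/instances/$INSTANCE_HASH" "$HOME/.claude"
    --bind "$HOME/.claudebox/auth/.credentials.json" "$HOME/.claude/.credentials.json"
)
```

Both binds are read-write on purpose: the first carries per-project state, the second lets the OAuth refresh cycle write the new token pair back to the host.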
**Warning signs:**
- Authentication works on first launch but fails after a few days
- Claude Code asks to re-authenticate on every launch
- Error messages mentioning 401, expired token, or authentication failure appear without the user having changed anything
**Phase to address:** Auth passthrough phase (Phase 1 of v2.0). Must be correct before any other feature.
---
### Pitfall 2: slirp4netns Requires Background Process Coordination — Not a Simple bwrap Flag
**What goes wrong:**
Developers treat "internet-only network isolation" as if it were a single bwrap option, similar to `--unshare-net`. No such flag exists; there is no bwrap option for "internet but no LAN." The correct approach requires:
1. Creating a new network namespace via `--unshare-net`
2. Launching `slirp4netns` as a separate background process (before or concurrent with bwrap)
3. Connecting slirp4netns to the network namespace via the sandbox process's PID
4. Configuring the TAP device (`ip link set tap0 up`, `ip addr add 10.0.2.100/24 dev tap0`, `ip route add default via 10.0.2.2`)
5. Writing an `/etc/resolv.conf` inside the sandbox pointing to slirp4netns's built-in DNS at `10.0.2.3`
This is a non-trivial process orchestration problem in bash: you must start bwrap, capture the sandbox PID before it execs, start slirp4netns targeting that PID, wait for slirp4netns to signal readiness (`--ready-fd`), configure the network interface inside the namespace, then let the sandbox proceed.
**Why it happens:**
The conceptual model of bwrap as a self-contained invocation (set flags, exec, done) breaks down for network namespacing. slirp4netns is a peer process, not a child, and must outlive the bwrap invocation. All existing production users (podman, rootless containers, Nix daemon for fixed-output derivations) implement this in C or Go, not in bash. There is no documented bash-only reference implementation.
**How to avoid:**
Implement the slirp4netns setup using bwrap's fd-based handshake flags:
1. Launch bwrap with `--unshare-net --info-fd 4`, where fd 4 is a pipe back to the wrapper (note: `--sync-fd` keeps the sandbox alive while a fd stays open, but it is `--info-fd` that reports the PID)
2. Read the sandbox child PID from the info fd: bwrap writes JSON containing `child-pid` before it execs the command
3. Start `slirp4netns --configure --ready-fd 5 "$BWRAP_PID" tap0 &`
4. Wait for the slirp4netns ready signal on fd 5
5. Only then allow the sandboxed command to proceed, for example by gating it behind a `read` on a pipe
Alternatively, use `--userns-block-fd` to block bwrap until network setup completes. This is what the Guix daemon and Podman do.
The `--ready-fd` flag on slirp4netns writes a byte when initialization (TAP up + routing configured) is complete. Do not proceed without it — there is a window where the TAP device exists but has no route, causing the first DNS queries to fail.
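A bash sketch of this ordering. Function names, fd numbers, and pipe paths are all illustrative; it assumes bwrap's `--info-fd` JSON is newline-terminated and that bwrap leaves unrelated inherited fds open for the sandboxed process. Treat it as an ordering sketch, not hardened code:

```shell
# Parse "child-pid" out of bwrap's --info-fd JSON (avoids a jq dependency).
read_child_pid() {
    sed -n 's/.*"child-pid"[[:space:]]*:[[:space:]]*\([0-9]\{1,\}\).*/\1/p'
}

launch_with_slirp() {
    local info_pipe ready_pipe gate_pipe child_pid info_json
    info_pipe=$(mktemp -u);  mkfifo "$info_pipe"
    ready_pipe=$(mktemp -u); mkfifo "$ready_pipe"
    gate_pipe=$(mktemp -u);  mkfifo "$gate_pipe"

    # Open each FIFO read-write in the wrapper; on Linux this never blocks,
    # which avoids open-ordering deadlocks between the processes below.
    exec {INFO_FD}<>"$info_pipe" {READY_FD}<>"$ready_pipe" {GATE_FD}<>"$gate_pipe"

    # 1. Start bwrap. It reports the sandbox PID as JSON on fd 4 (--info-fd).
    #    The sandboxed command blocks on fd 6 so nothing runs pre-network.
    bwrap --unshare-net --die-with-parent --info-fd 4 \
          -- sh -c 'read -r _ <&6; exec "$@"' _ "$@" \
          4>&"$INFO_FD" 6<&"$GATE_FD" &

    # 2. Capture the sandbox PID (with a timeout in case bwrap failed).
    read -r -t 10 info_json <&"$INFO_FD"
    child_pid=$(printf '%s' "$info_json" | read_child_pid)

    # 3. Attach slirp4netns to that PID's network namespace.
    slirp4netns --configure --ready-fd 5 "$child_pid" tap0 5>&"$READY_FD" &

    # 4. Block until slirp4netns writes its ready byte.
    read -r -n1 -t 10 _ <&"$READY_FD"

    # 5. Release the gate; the sandboxed command execs with networking up.
    echo go >&"$GATE_FD"
}
```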
**Warning signs:**
- Network works "most of the time" but occasionally fails at startup (race condition — proceeding before slirp4netns is ready)
- DNS fails inside sandbox but ping 10.0.2.2 works (route configured, DNS not yet set up)
- `nix shell` fails inside internet-only mode (missing /etc/resolv.conf pointing to 10.0.2.3)
**Phase to address:** Network isolation phase. Expect this to be the most complex phase. Plan extra time.
---
### Pitfall 3: slirp4netns DNS Breaks When Host Uses systemd-resolved
**What goes wrong:**
On the host, `/etc/resolv.conf` typically points to `127.0.0.53` (systemd-resolved stub) or a loopback address (dnsmasq). Inside the sandbox with `--unshare-net`, the network namespace has no loopback access to the host's DNS resolver. Bind-mounting the host's `/etc/resolv.conf` into the sandbox gives a file pointing to an address unreachable from the new namespace.
slirp4netns provides a built-in DNS resolver at `10.0.2.3`, but this is only active inside the slirp4netns virtual network. You must create or bind-mount a custom `resolv.conf` inside the sandbox that says `nameserver 10.0.2.3`.
**Why it happens:**
The existing claudebox script already bind-mounts `/etc/resolv.conf` from the host. When adding network isolation, this existing mount becomes wrong for the internet-only tier. Developers add slirp4netns but forget to also replace the resolv.conf.
**How to avoid:**
Before launching bwrap for internet-only mode, write a temporary resolv.conf:
```bash
RESOLV_TMP=$(mktemp)
echo "nameserver 10.0.2.3" > "$RESOLV_TMP"
trap 'rm -f "$RESOLV_TMP"' EXIT
# Then in bwrap:
--ro-bind "$RESOLV_TMP" /etc/resolv.conf
```
For the "full network" tier, the existing host resolv.conf bind-mount is correct. Make the resolv.conf source conditional on network tier.
**Warning signs:**
- `curl https://...` fails with "Could not resolve host" inside internet-only sandbox
- `nix shell` hangs at "downloading..." indefinitely
- DNS works in full-network mode but not in internet-only mode
**Phase to address:** Network isolation phase, DNS subsection.
---
### Pitfall 4: slirp4netns Process Leaks When Sandbox Exits Abnormally
**What goes wrong:**
slirp4netns is launched as a background process that must be killed when the sandbox exits. If the sandbox process is killed with SIGKILL, the bash trap handler does not run, and the slirp4netns process becomes an orphan owned by init. On long-running systems, these accumulate. Podman has a documented bug where zombie slirp4netns processes pile up.
**Why it happens:**
bash `trap ... EXIT` handles normal exits, SIGTERM, and SIGINT but not SIGKILL. There is no portable way to register a SIGKILL handler. The `--die-with-parent` flag on bwrap causes bwrap to die if its parent (the wrapper script) dies, but the reverse (killing bwrap kills slirp4netns) is not automatic.
**How to avoid:**
Use `slirp4netns --exit-fd` to give slirp4netns a file descriptor that it monitors. When the fd is closed (because the holding process exited), slirp4netns exits itself. This is the correct mechanism.
```bash
# Create a FIFO; slirp4netns inherits its read end and exits on hangup
EXIT_PIPE=$(mktemp -u)
mkfifo "$EXIT_PIPE"
slirp4netns --exit-fd 9 --ready-fd "$READY_FD" "$BWRAP_PID" tap0 9<"$EXIT_PIPE" &
SLIRP_PID=$!
# Hold the write end in the wrapper. Closing it (via the EXIT trap, or by
# the kernel when the wrapper dies, even from SIGKILL) tells slirp4netns to exit
exec {EXIT_FD}>"$EXIT_PIPE"
trap 'exec {EXIT_FD}>&-; rm -f "$EXIT_PIPE"' EXIT
```
Note: `--exit-fd` requires slirp4netns 0.4.0+. Verify nixpkgs version.
**Warning signs:**
- `ps aux | grep slirp4netns` shows accumulating processes after repeated claudebox runs
- Memory usage grows gradually on systems with heavy claudebox usage
- Killing claudebox with Ctrl+C leaves a slirp4netns running
**Phase to address:** Network isolation phase, cleanup subsection.
---
### Pitfall 5: Per-Project Hash Collides for Git Worktrees
**What goes wrong:**
Per-project instance directories are keyed by hashing the project path (CWD). Claude Code itself uses this same approach (`~/.claude/projects/<hash-of-path>/`). When the user uses git worktrees, the main repo at `/home/user/myproject` and the worktree at `/home/user/myproject-feature` get different hashes and different instance directories. This splits conversation history and project memory across multiple isolated instances, even though they're branches of the same repository.
Worse: if the worktree is checked out inside the main repo (at `/home/user/myproject/.worktrees/feature`), claudebox's CWD hash approach and Claude Code's internal path hash approach may disagree, creating double-isolation where the user thinks they're resuming a session but they're in a fresh one.
**Why it happens:**
Hashing CWD is simple and correct for the non-worktree case. The edge case is only apparent during development workflows that use worktrees heavily (which is increasingly common with Claude Code being used for parallel feature development).
**How to avoid:**
Before computing the instance hash, attempt to resolve the canonical repo root:
```bash
canonical_project_root() {
    local cwd="$1"
    # If we're in a git worktree, resolve to the main worktree's root
    local git_common
    git_common=$(git -C "$cwd" rev-parse --git-common-dir 2>/dev/null) \
        || { echo "$cwd"; return; }
    # --git-common-dir may return a path relative to $cwd (often just ".git");
    # canonicalize it before stripping the /.git suffix to get the project root
    git_common=$(cd "$cwd" && realpath "$git_common")
    echo "${git_common%/.git}"
}
INSTANCE_KEY=$(canonical_project_root "$CWD")
INSTANCE_HASH=$(printf '%s' "$INSTANCE_KEY" | sha256sum | cut -c1-16)
```
This is a best-effort approach. Document the worktree behavior clearly so users know what to expect.
**Warning signs:**
- User reports "it forgot everything I told it" when switching to a worktree
- Multiple instance directories accumulate for what the user thinks is one project
- `~/.claudebox/instances/` grows unexpectedly large
**Phase to address:** Per-project isolation phase.
---
## Moderate Pitfalls
### Pitfall 6: Profile Config Format Creates Bash Parsing Complexity
**What goes wrong:**
Named profiles (`--profile foo`) must be stored in a config format that the bash script can parse. Using anything beyond simple `KEY=VALUE` pairs (e.g., TOML, YAML, JSON) requires either parsing tools inside the Nix derivation or adding jq/tomlq/yq as runtime dependencies specifically for the profile system. Profile config often needs list values (extra mounts, extra packages), which flat KEY=VALUE cannot represent cleanly.
Attempting to parse nested structures in bash leads to fragile code that breaks on paths with spaces, special characters, or newlines — all common in practice.
**Why it happens:**
Profile configs naturally want to describe lists (extra packages to add to PATH, extra bind mounts, extra env vars). The temptation is to use a "real" config format. But the wrapper script is bash, and adding a config language parser adds dependencies and complexity.
**How to avoid:**
Use shell-sourceable profile files (`~/.claudebox/profiles/foo.sh`) that are sourced (not parsed) by the wrapper script. The profile file sets variables following a declared schema:
```bash
# ~/.claudebox/profiles/foo.sh
PROFILE_NETWORK_TIER=internet-only
PROFILE_EXTRA_ENV=(SOME_VAR ANOTHER_VAR)
PROFILE_EXTRA_MOUNTS=(/data/myproject/secrets:/run/secrets:ro)
PROFILE_EXTRA_PACKAGES=(pkgs.python3 pkgs.postgresql)
```
The main script sources the profile with `source "$profile_file"` after validating it contains no dangerous patterns. This avoids a config parser entirely.
Risk: sourcing arbitrary files is a code injection vector if profile files are world-writable. Validate file permissions (must be owned by and only writable by the current user) before sourcing.
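A sketch of that permission check, assuming GNU coreutils `stat` (the `-c` format flags) and a hypothetical `safe_source_profile` helper:

```shell
# Refuse to source a profile unless it is owned by the current user and
# writable only by the owner.
safe_source_profile() {
    local f="$1" owner mode
    owner=$(stat -c '%u' "$f") || return 1
    mode=$(stat -c '%a' "$f")  || return 1
    if [ "$owner" != "$(id -u)" ]; then
        echo "refusing to source $f: not owned by current user" >&2
        return 1
    fi
    # Reject any group- or world-write bit (octal mask 022)
    if (( 8#$mode & 8#022 )); then
        echo "refusing to source $f: group/world-writable" >&2
        return 1
    fi
    # shellcheck disable=SC1090
    source "$f"
}
```

The check must happen in the same code path as the `source`, with no window in between, or an attacker can swap the file after validation.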
**Warning signs:**
- Profile parsing breaks on project paths containing spaces
- Lists of packages must be comma-separated, semicolon-separated, and newline-separated (inconsistency)
- Bash arrays can't be exported through the environment to child processes (requires workarounds)
**Phase to address:** Profile system phase.
---
### Pitfall 7: Nix Devshell Injection Requires Realizing Store Paths Before bwrap
**What goes wrong:**
Profile-specified packages (`PROFILE_EXTRA_PACKAGES`) must be resolved to actual Nix store paths before the bwrap call, so they can be added to `SANDBOX_PATH` and potentially bind-mounted. Attempting to run `nix shell nixpkgs#python3` inside the sandbox to "inject" a package only works if Nix daemon access is available inside the sandbox — and adds startup latency.
The correct approach is to resolve packages to store paths outside the sandbox before constructing the bwrap command. This requires a `nix build` or `nix eval` call in the pre-launch phase. If packages need to be fetched, this adds significant startup time (potentially 30-120 seconds for uncached packages) with no progress indication.
**Why it happens:**
The natural thought is "I'll just add the package to nix shell inside the sandbox." But that re-introduces the build step inside the sandbox, and the sandbox PATH doesn't include the injected package for non-shell invocations.
**How to avoid:**
In pre-launch (outside bwrap), resolve each profile package:
```bash
pkg_path=$(nix build --no-link --print-out-paths "nixpkgs#${pkg}" 2>/dev/null)
EXTRA_PATH="${EXTRA_PATH}:${pkg_path}/bin"
```
Cache this resolution: store the resolved store paths in `~/.claudebox/profiles/foo.resolved` with a lockfile and invalidate on flake lock update or nixpkgs channel change. Avoid re-resolving on every launch.
Show progress to the user when packages need to be fetched: "Resolving profile packages (first run may take a moment)..."
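A sketch of the cached resolution. The cache file layout (key on the first line, store paths after) and the function name are illustrative; the `nix build` call is the one shown above:

```shell
# Resolve profile packages, caching on the package list plus the flake lock.
resolve_profile_packages() {
    local lockfile="$1" cache="$2" key pkg
    key=$( { cat "$lockfile" 2>/dev/null
             printf '%s\n' "${PROFILE_EXTRA_PACKAGES[@]}"
           } | sha256sum | cut -c1-16 )

    # Cache hit: first line stores the key, the rest are store paths.
    if [ -f "$cache" ] && [ "$(head -n1 "$cache")" = "$key" ]; then
        tail -n +2 "$cache"
        return 0
    fi

    echo "Resolving profile packages (first run may take a moment)..." >&2
    { echo "$key"
      for pkg in "${PROFILE_EXTRA_PACKAGES[@]}"; do
          nix build --no-link --print-out-paths "nixpkgs#${pkg}"
      done
    } >"$cache.tmp" && mv "$cache.tmp" "$cache"
    tail -n +2 "$cache"
}
```

Changing either the package list or the flake lock changes the key, which invalidates the cache automatically on the next launch.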
**Warning signs:**
- claudebox start time grows from ~1 second to 30+ seconds after adding profile packages
- Profile package resolution is re-run on every launch even when nothing changed
- `SANDBOX_PATH` doesn't include profile packages because they were resolved inside the sandbox
**Phase to address:** Profile + Nix devshell injection phase.
---
### Pitfall 8: Multiple Concurrent Instances of Same Project Race on Instance Directory
**What goes wrong:**
If a user runs two `claudebox` invocations in the same project directory (common when doing parallel work or forgetting a background session), both instances compute the same project hash and attempt to use the same `~/.claudebox/instances/<hash>/` directory simultaneously. Claude Code writes conversation history to JSONL files in that directory. Concurrent writes without coordination produce corrupted JSONL files.
This is not hypothetical: Claude Code already has a documented OAuth token refresh race condition when multiple instances run concurrently (GitHub issue #24317, #27933).
**Why it happens:**
The instance directory scheme assumes one session per project. Concurrent sessions of the same project break this assumption.
**How to avoid:**
Add a lockfile to the instance directory:
```bash
LOCK_FILE="$INSTANCE_DIR/.claudebox.lock"
exec {LOCK_FD}>"$LOCK_FILE"
if ! flock -n "$LOCK_FD"; then
    echo "Another claudebox session is already running for this project." >&2
    echo "Use --force to run anyway (conversation history may be interleaved)." >&2
    exit 1
fi
```
Or allow concurrent sessions but assign distinct JSONL sub-directories per session (using a timestamp suffix), accepting that conversation history is session-scoped not project-scoped.
**Warning signs:**
- Corrupted `~/.claude/projects/` JSONL files after running two terminals in the same project
- "Unexpected end of JSON input" errors in Claude Code on startup
- Session history appears partially missing
**Phase to address:** Per-project isolation phase.
---
### Pitfall 9: IPv6 Tentative Address Delay Causes First-Connection Failures with slirp4netns
**What goes wrong:**
When slirp4netns configures a TAP device with an IPv6 address, the Linux kernel puts the address into "tentative" state and runs Duplicate Address Detection (DAD). DAD takes several seconds to complete. During this window, outgoing connections to IPv6 addresses fail. The first `curl` or `nix shell` command issued immediately after sandbox startup may fail with a connection error, even though the same command succeeds one second later.
This is documented in the Guix daemon slirp4netns implementation as a reason to explicitly disable DAD.
**Why it happens:**
Standard IPv6 address assignment always includes DAD. Most production container runtimes disable DAD for virtual interfaces because they know the address is unique within a private namespace. This is non-obvious unless you've worked with container networking before.
**How to avoid:**
After configuring the TAP device inside the namespace, disable DAD:
```bash
# Inside the network namespace (or via nsenter):
sysctl -w net.ipv6.conf.tap0.accept_dad=0
sysctl -w net.ipv6.conf.tap0.dad_transmits=0
```
Or use `ip link set tap0 addrgenmode none` before assigning the address.
Alternatively, if IPv6 is not needed for the use case, only configure IPv4 on the TAP device and let IPv6 be absent. Most Anthropic API endpoints and Nix binary caches resolve over IPv4.
**Warning signs:**
- First network request after sandbox start intermittently fails
- Problem is timing-dependent and hard to reproduce consistently
- `sleep 3 && curl https://...` works but `curl https://...` immediately after sandbox start fails
**Phase to address:** Network isolation phase, IPv6 subsection.
---
### Pitfall 10: Profile-Defined Extra Mounts Can Expose Secrets
**What goes wrong:**
Allowing profiles to define arbitrary extra bind mounts via `PROFILE_EXTRA_MOUNTS` breaks the core security invariant if users put secret paths there. A profile for a "cloud deployment" project might mount `~/.aws` or `~/.ssh` — and this is exactly what the profile system is meant to support. But it means the "secrets never enter the sandbox" guarantee becomes conditional on user discipline.
The meta-risk: a compromised or misconfigured profile file (`~/.claudebox/profiles/work.sh`) can silently mount secrets without the user reviewing the audit display.
**Why it happens:**
The profile system is designed to give users power to mount what they need. The same power that makes profiles useful makes them dangerous if misused. The env audit (pre-launch review) exists for env vars, but mounts are not currently in the audit display.
**How to avoid:**
Extend the pre-launch env audit to display active mounts from the profile:
```
Active profile: work
  Network tier:   internet-only
  Extra mounts:
    /data/myproject/config -> /run/config  (read-only)
    ~/.aws -> ~/.aws                       (read-write)  <-- HIGHLIGHTED IN RED
  Extra packages: python3, postgresql
```
Highlight any mount that includes known-secret paths (`~/.ssh`, `~/.gnupg`, `~/.aws`, `~/.config/gcloud`, age key paths) in red with a warning. Do not block — the user may intentionally want to give Claude cloud access — but make it visible.
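A sketch of the highlighting logic. The secret-path list mirrors the paths named above but is illustrative, as are the function names and the `src:dst[:ro]` mount syntax:

```shell
# Known secret locations to flag in the audit display.
SECRET_PATHS=("$HOME/.ssh" "$HOME/.gnupg" "$HOME/.aws" "$HOME/.config/gcloud")

is_secret_path() {
    local src="$1" s
    for s in "${SECRET_PATHS[@]}"; do
        # Match the path itself or anything underneath it
        [[ "$src" == "$s" || "$src" == "$s"/* ]] && return 0
    done
    return 1
}

print_mount_audit() {
    local m src
    for m in "$@"; do
        src="${m%%:*}"   # mounts use src:dst[:ro] syntax
        if is_secret_path "$src"; then
            printf '\033[31m  %s  <-- known secret path\033[0m\n' "$m"
        else
            printf '  %s\n' "$m"
        fi
    done
}
# Usage: print_mount_audit "${PROFILE_EXTRA_MOUNTS[@]}"
```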
**Warning signs:**
- Profile silently mounts credentials without user awareness
- Pre-launch audit shows env vars but not mounts, giving false sense of security
- "It worked in the cloud project" — user discovers retrospectively that AWS keys were accessible
**Phase to address:** Profile system phase, audit integration subsection.
---
## Technical Debt Patterns
| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
|----------|-------------------|----------------|-----------------|
| Auth files mounted read-only | Simpler, "more secure" | Authentication breaks after token expiry | Never for active sessions |
| Shell-sourceable profile files without permission check | Avoids parser complexity | Code injection via malicious profile file | Never — always check ownership/permissions |
| Skip slirp4netns --ready-fd synchronization | Simpler startup code | Race condition at sandbox start causes intermittent network failures | Never |
| Single global ~/.claudebox/ directory for all instances | Avoids hash computation | Concurrent sessions corrupt shared state | Never if per-project isolation is a stated goal |
| Re-run `nix build` for profile packages on every launch | Always up-to-date | 30+ second startup penalty | Never for interactive use; acceptable in CI |
| Hardcode `nameserver 10.0.2.3` in resolv.conf without checking slirp4netns version | Simple | Breaks if slirp4netns DNS address changes in future versions | Only as MVP, document and add TODO |
---
## Integration Gotchas
| Integration | Common Mistake | Correct Approach |
|-------------|----------------|------------------|
| Claude Code OAuth on Linux | Mounting `~/.claude/.credentials.json` read-only | Mount the auth directory read-write; credentials need to be refreshed on disk |
| slirp4netns + bwrap | Launching slirp4netns after bwrap execs | Capture bwrap child PID before exec using `--sync-fd`; start slirp4netns targeting that PID |
| systemd-resolved host DNS | Bind-mounting host `/etc/resolv.conf` into network-isolated sandbox | Write a fresh resolv.conf pointing to `10.0.2.3` when using slirp4netns tier |
| Git worktrees | Hash CWD for project identity | Resolve git common dir to get canonical project root before hashing |
| Nix devshell packages | Resolve packages inside sandbox using nix shell | Pre-resolve store paths outside sandbox before bwrap invocation; cache results |
| Multiple concurrent sessions | No coordination | Lockfile on instance directory or per-session sub-directories |
---
## Security Mistakes
| Mistake | Risk | Prevention |
|---------|------|------------|
| Sourcing profile files without permission validation | Arbitrary code execution if profile file is modified by another process or user | Check `stat` — file must be owned by current user and not group/world writable |
| Displaying extra mounts in audit but not highlighting secret paths | User doesn't notice `~/.ssh` is being mounted | Highlight known secret paths in red in the audit display |
| Relying on slirp4netns `--disable-host-loopback` for LAN isolation | Does not block access to non-loopback LAN addresses | slirp4netns `--disable-host-loopback` only blocks 127.x.x.x; true LAN isolation requires additional iptables rules inside the namespace |
| Deriving the instance hash from the CWD path alone | Hash is predictable; another local user could pre-create a malicious instance directory | Include the uid in the hash: `sha256sum(uid:path)` |
| Profile files with plaintext secrets | Profile file itself becomes a secret file that must be protected | Profile files should reference env var names, not values; actual values come from host environment at launch time |
---
## "Looks Done But Isn't" Checklist
- [ ] **Auth passthrough:** Verify token refresh still works 24 hours after initial auth by checking that `.credentials.json` is writable inside the sandbox
- [ ] **Internet-only network:** Verify LAN addresses (192.168.x.x, 10.x.x.x) are unreachable but github.com and api.anthropic.com work
- [ ] **Network offline tier:** Verify `curl https://github.com` times out, but `nix shell` still works (Nix daemon socket is a Unix socket, not network; must remain mounted)
- [ ] **Per-project isolation:** Verify two different projects get different instance directories and their conversation histories don't mix
- [ ] **slirp4netns cleanup:** Verify `ps aux | grep slirp4netns` shows no processes after claudebox exits normally AND after Ctrl+C
- [ ] **Profile audit display:** Verify the pre-launch audit shows active profile, network tier, extra mounts, AND extra env vars — not just env vars
- [ ] **Profile permission check:** Verify sourcing a world-writable profile file is rejected with a clear error
- [ ] **Concurrent sessions:** Verify running two claudebox instances in the same project does not corrupt JSONL history
---
## Recovery Strategies
| Pitfall | Recovery Cost | Recovery Steps |
|---------|---------------|----------------|
| Auth passthrough breaks token refresh | MEDIUM | Re-authenticate via `claude /login`; fix mount to be read-write; may need to revoke and re-issue OAuth token |
| slirp4netns process leak | LOW | `pkill slirp4netns`; add --exit-fd to prevent recurrence |
| Instance directory corruption from concurrent sessions | MEDIUM | Delete corrupted JSONL files in `~/.claudebox/instances/<hash>/.claude/projects/`; conversation history lost but auth/settings preserved |
| Profile sources and mounts wrong packages | LOW | Remove or edit profile file; re-launch |
| IPv6 DAD causes intermittent startup failures | LOW | Add DAD disable to TAP device setup; or retry first connection |
| Wrong resolv.conf in network-isolated sandbox | LOW | Fix pre-launch resolv.conf generation for internet-only tier |
---
## Pitfall-to-Phase Mapping
| Pitfall | Prevention Phase | Verification |
|---------|------------------|--------------|
| Auth passthrough read-only breaks OAuth refresh (#1) | Auth passthrough phase | `touch ~/.claude/.credentials.json` inside sandbox succeeds; token refresh after 24h |
| slirp4netns process coordination complexity (#2) | Network isolation phase | `--dry-run` shows correct bwrap flags; network works on first launch |
| DNS breaks with systemd-resolved in isolated namespace (#3) | Network isolation phase | `curl https://cache.nixos.org` works in internet-only mode |
| slirp4netns process leak (#4) | Network isolation phase | No orphan processes after 10 start/stop cycles |
| Git worktree hash collision (#5) | Per-project isolation phase | Two worktrees of same repo share instance directory |
| Profile config parsing fragility (#6) | Profile system phase | Profile with path containing spaces works correctly |
| Nix devshell injection startup latency (#7) | Profile + devshell phase | Cached packages resolve in <1 second on second launch |
| Concurrent session race on instance directory (#8) | Per-project isolation phase | Two parallel sessions warn/block appropriately |
| IPv6 DAD delay (#9) | Network isolation phase | First `curl` after sandbox start succeeds consistently |
| Profile mounts exposing secrets silently (#10) | Profile system phase | Secret path mounts appear highlighted in pre-launch audit |
---
## Sources
- Claude Code GitHub issues: OAuth refresh race condition (#24317, #27933), credentials.json on Linux (confirmed at code.claude.com/docs/en/authentication) HIGH confidence
- slirp4netns man page and rootless-containers/slirp4netns GitHub HIGH confidence (official source)
- Guix daemon slirp4netns implementation (mail-archive.com/guix-commits, April 2025) HIGH confidence (authoritative implementation reference)
- bubblewrap issue #392 (slirp4netns feature request) MEDIUM confidence
- bubblewrap issue #633 (die-with-parent race condition) HIGH confidence
- bubblewrap issue #504 (abstract network namespace sharing) HIGH confidence
- Claude Code project storage structure (confirmed via inventivehq.com knowledge base and milvus.io deep dive) HIGH confidence
- Claude Code GitHub issue #34437 (worktrees share project directory) HIGH confidence
- Podman zombie slirp4netns issue #9777 HIGH confidence
- IPv6 DAD behavior from Guix implementation notes HIGH confidence
---
*Pitfalls research for: claudebox v2.0 — network isolation, per-project profiles, auth passthrough*
*Researched: 2026-04-10*