Phase 5: Per-Project Instance Isolation - Research

Researched: 2026-04-13
Domain: bash scripting, bubblewrap mounts, Claude Code storage layout
Confidence: HIGH

<user_constraints>

User Constraints (from CONTEXT.md)

Locked Decisions

  • D-01: Replace --bind ~/.claudebox ~/.claudebox + --symlink ~/.claudebox ~/.claude with --bind ~/.claude ~/.claude
  • D-02: Overlay --bind ~/.claudebox/projects ~/.claude/projects AFTER the ~/.claude bind mount (bwrap last-mount-wins). Isolates per-project memory while keeping everything else from real ~/.claude.
  • D-03: Overlay --bind ~/.claudebox/history.jsonl ~/.claude/history.jsonl AFTER the ~/.claude bind mount. Isolates session prompt history.
  • D-04: Ensure ~/.claudebox/projects/ directory and ~/.claudebox/history.jsonl file exist at startup (before bwrap).
  • D-05: .credentials.json handling stays as-is — already separate in current code.
  • D-06: SANDBOX.md and CLAUDE.md management moves from ~/.claudebox/ to ~/.claude/ approach — must not overwrite user's real ~/.claude/CLAUDE.md destructively.
  • D-07: ~/.claudebox/projects/ contains per-project subdirs keyed by hash: ~/.claudebox/projects/<16-char-hash>/. Inside sandbox, these appear at ~/.claude/projects/<hash>/.
  • D-08: Canonical project root = git rev-parse --git-common-dir resolved to absolute path, falling back to $CWD for non-git directories. All worktrees of the same repo get the same hash.
  • D-09: Hash = SHA-256 of canonical root path, truncated to 16 hex chars.
  • D-10: On instance creation, write project-root plaintext file inside ~/.claudebox/projects/<hash>/ containing the canonical root path.
  • D-11: claudebox --gc iterates ~/.claudebox/projects/*/project-root, reads each path, removes any dir whose recorded path no longer exists on disk. Prints removed paths to stderr.
  • D-12: Add --gc to flag-parsing block alongside --yes, --dry-run, --check.
  • D-13: No locking needed. Claude Code manages file-level concurrency within its own data dir.
  • D-14: CLAUDE_CONFIG_DIR env var approach is abandoned. Direct mount + overlay is simpler.

Claude's Discretion

  • Exact hash truncation length (16 chars recommended)
  • Whether to print instance path at launch (verbose addition later)
  • SANDBOX.md strategy: whether to keep writing it to ~/.claudebox/ and bind-mount it, or write to ~/.claude/ directly
  • Whether history.jsonl needs per-project hashing too or one shared sandbox history is fine

Deferred Ideas (OUT OF SCOPE)

  • Per-instance settings.json override (project-specific model selection) — Phase 7
  • --gc --dry-run to preview what would be removed
  • claudebox --list-instances to show all instances with their project roots

</user_constraints>

<phase_requirements>

Phase Requirements

INST-01 through INST-04 are NOT yet defined in REQUIREMENTS.md. The CONTEXT.md says they need to be added during planning. The planner should define them based on the Phase 5 success criteria and add them to REQUIREMENTS.md.

| ID | Description | Research Support |
|---|---|---|
| INST-01 | Each project directory has isolated conversation history (no cross-contamination between projects) | D-02 overlay mounts ~/.claudebox/projects/<hash>/ as ~/.claude/projects/; Claude Code writes per-project subdirs inside |
| INST-02 | Git worktrees of the same repo share instance state with the main worktree | D-08: git rev-parse --git-common-dir resolved to absolute gives same root for all worktrees |
| INST-03 | Two concurrent claudebox sessions in the same project do not corrupt each other's state | D-13: No locking needed; Claude Code handles its own file-level concurrency |
| INST-04 | claudebox --gc removes instance directories for project roots that no longer exist on disk | D-11/D-12: iterate project-root files, remove stale dirs |

There is also an implicit "plugin fix" requirement (call it PLUG-01 or scope it as part of INST-01): the real ~/.claude/ being mounted means all plugins, skills, hooks, MCP configs, commands, agents, keybindings, and settings.json become visible inside the sandbox. The planner should decide whether to track this separately.

</phase_requirements>

Summary

Phase 5 is two coordinated changes: (1) fix the plugin mount architecture so that all Claude Code config files in ~/.claude/ are visible inside the sandbox, and (2) implement per-project isolation so that conversation history and project memory are scoped to the canonical git root.

The current architecture mounts ~/.claudebox at ~/.claudebox and symlinks it to ~/.claude inside the sandbox. This means any Claude Code config that lives in ~/.claude/ (plugins, skills, hooks, mcp.json, etc.) is invisible — the sandbox only sees ~/.claudebox/ contents. The fix is direct: mount ~/.claude at ~/.claude, then overlay only the two paths that need isolation (projects/ and history.jsonl) with content from ~/.claudebox/.

Per-project isolation works by mounting ~/.claudebox/projects/<16-char-hash>/ as the entire ~/.claude/projects/ view inside the sandbox. Claude Code writes conversation history and project memory into ~/.claude/projects/, which under the overlay lands in the project-specific hash dir. Two different projects get different hash dirs, so their histories never mix. Git worktrees of the same repo resolve to the same canonical root (via git rev-parse --git-common-dir), so they share the same hash dir.

Primary recommendation: Follow decisions D-01 through D-14 exactly as written. The mount, overlay, and hashing mechanics were verified by live testing in this environment; D-13 (no locking) rests on assumption A1 in the Assumptions Log rather than on a test. The only open ambiguity is the SANDBOX.md injection strategy (see Open Questions, item 1).

Standard Stack

Core

| Tool | Version | Purpose | Why Standard |
|---|---|---|---|
| bubblewrap (bwrap) | 0.9.x (nixpkgs) | Sandbox + filesystem isolation | Already in use; --bind overlay behavior confirmed via live test [VERIFIED: live bwrap test] |
| sha256sum (coreutils) | GNU coreutils 9.10 | Hash canonical root path | Already in runtimeInputs; hashes the output of printf '%s' "$path" (full pipeline in Pattern 1) |
| git | nixpkgs | Resolve canonical root for worktrees | Already in runtimeInputs; git rev-parse --git-common-dir returns .git (relative) for normal repos, absolute path for worktrees [VERIFIED: live test] |
| readlink -f | coreutils | Resolve relative .git to absolute path | Already used in claudebox.sh for NixOS symlink resolution |

No new packages needed. All required tools are already in runtimeInputs in flake.nix.

Installation: No changes to flake.nix required.

Architecture Patterns

~/.claudebox/
├── .credentials.json     # OAuth tokens (hard-linked to ~/.claude/.credentials.json)
├── SANDBOX.md            # Managed by claudebox, injected as overlay
├── CLAUDE.md             # Managed by claudebox (kept for backward compat, not used)
├── history.jsonl         # Sandbox-side prompt history (overlays ~/.claude/history.jsonl)
└── projects/
    ├── 2458cbe666750168/ # SHA-256[:16] of /home/user/code/myproject
    │   ├── project-root  # plaintext: /home/user/code/myproject
    │   └── -home-user-code-myproject/   # Claude Code writes here (path-based name)
    │       ├── *.jsonl   # conversation history
    │       └── memory/   # project memory
    └── 421ebd2f76562141/ # SHA-256[:16] of /home/user/code/other
        ├── project-root
        └── -home-user-code-other/

Mount Order (critical)

--bind ~/.claude ~/.claude
    (makes all real ~/.claude/ content visible)
--bind ~/.claudebox/projects/<hash>/ ~/.claude/projects/
    (overlays projects/ with this project's isolated dir)
--bind ~/.claudebox/history.jsonl ~/.claude/history.jsonl
    (overlays history.jsonl with sandbox-side file)
--bind ~/.claudebox/SANDBOX.md ~/.claude/SANDBOX.md
    (injects SANDBOX.md without touching user's CLAUDE.md)
--bind ~/.claudebox/.credentials.json ~/.claude/.credentials.json
    (overlays credentials with claudebox-managed copy)

Last-mount-wins is bwrap's documented behavior. Verified with live bwrap tests: both subdirectory overlays and file-level overlays work correctly. [VERIFIED: live bwrap tests]

Pattern 1: Canonical Root Computation

# Source: live testing + D-08 design
compute_canonical_root() {
  local cwd="$1"
  local git_common
  git_common=$(git -C "$cwd" rev-parse --git-common-dir 2>/dev/null) || {
    echo "$cwd"
    return
  }
  # Make absolute if relative (normal repo returns ".git")
  if [[ "$git_common" != /* ]]; then
    git_common="$cwd/$git_common"
  fi
  dirname "$(readlink -f "$git_common")"
}

CANONICAL_ROOT=$(compute_canonical_root "$CWD")
INSTANCE_HASH=$(printf '%s' "$CANONICAL_ROOT" | sha256sum | cut -c1-16)
INSTANCE_DIR="$HOME/.claudebox/projects/$INSTANCE_HASH"

Why --git-common-dir and not --show-toplevel: In a worktree, --show-toplevel returns the worktree's own root (not the main repo root). --git-common-dir always points to the main repo's .git directory, so all worktrees of the same repo get the same canonical root. [VERIFIED: git docs + live testing]
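
The difference can be reproduced in a throwaway repository. The following is a minimal sketch, not claudebox code: the temporary paths, the branch name demo-branch, and the helper name canonical_root are assumptions made for illustration, and the helper simply mirrors Pattern 1.

```bash
#!/usr/bin/env bash
# Sketch: compare --show-toplevel with --git-common-dir for a main checkout
# and a linked worktree. canonical_root is a hypothetical name mirroring
# compute_canonical_root from Pattern 1.

canonical_root() {
  local cwd="$1" git_common
  git_common=$(git -C "$cwd" rev-parse --git-common-dir 2>/dev/null) || {
    printf '%s\n' "$cwd"          # not a git directory: fall back to cwd
    return
  }
  # A normal repo reports a relative path such as ".git"; anchor it at cwd.
  [[ "$git_common" == /* ]] || git_common="$cwd/$git_common"
  dirname "$(readlink -f "$git_common")"
}

tmp=$(mktemp -d)
main="$tmp/main"
wt="$tmp/wt"

git init -q "$main"
git -C "$main" -c user.name=demo -c user.email=demo@example.invalid \
  -c commit.gpgsign=false commit -q --allow-empty -m init
git -C "$main" worktree add -b demo-branch "$wt" >/dev/null 2>&1

main_top=$(git -C "$main" rev-parse --show-toplevel)
wt_top=$(git -C "$wt" rev-parse --show-toplevel)
main_root=$(canonical_root "$main")
wt_root=$(canonical_root "$wt")

echo "show-toplevel:  main=$main_top  worktree=$wt_top"
echo "canonical root: main=$main_root  worktree=$wt_root"

rm -rf "$tmp"
```

The two --show-toplevel values differ, while both canonical roots resolve to the main checkout, which is what lets every worktree hash to the same instance directory. Newer git releases also accept --path-format=absolute with rev-parse, which might remove the need for the relative-path branch; whether the packaged git supports it was not checked here.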

Pattern 2: Instance Initialization

# Source: D-04, D-10
mkdir -p "$INSTANCE_DIR"
HISTORY_FILE="$HOME/.claudebox/history.jsonl"
touch "$HISTORY_FILE"

# Write project-root only on first creation (idempotent)
if [[ ! -f "$INSTANCE_DIR/project-root" ]]; then
  printf '%s\n' "$CANONICAL_ROOT" > "$INSTANCE_DIR/project-root"
fi

Why touch for history.jsonl: bwrap bind mount requires the source path to exist. Attempting to bind-mount a non-existent file produces bwrap: Can't find source path: No such file or directory and exits 1. [VERIFIED: live bwrap test]

Pattern 3: GC Implementation

# Source: D-11, D-12
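gc_instances() {
  local removed=0 dir root_file root_path
  for dir in "$HOME/.claudebox/projects"/*/; do
    [[ -d "$dir" ]] || continue          # unmatched glob stays literal; skip it
    dir="${dir%/}"                       # drop the trailing slash left by the glob
    root_file="$dir/project-root"
    [[ -f "$root_file" ]] || continue    # skip dirs without tracking file (defensive)
    root_path=$(< "$root_file")
    [[ -n "$root_path" ]] || continue    # empty tracking file: leave the dir alone (defensive)
    if [[ ! -d "$root_path" ]]; then
      rm -rf -- "$dir"
      echo "Removed: $dir (project path gone: $root_path)" >&2
      removed=$((removed + 1))           # plain assignment, safe under set -e
    fi
  done
  echo "GC complete: $removed instance(s) removed." >&2
}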

Pattern 4: Credential Mount Target Update

# OLD (current, with symlink architecture):
# --bind "$HOME/.claudebox/.credentials.json" "$HOME/.claudebox/.credentials.json"
# (visible as ~/.claude/.credentials.json via the symlink)

# NEW (with direct ~/.claude bind):
# --bind "$HOME/.claudebox/.credentials.json" "$HOME/.claude/.credentials.json"
# (explicit overlay after the ~/.claude bind mount)

Note: on this system ~/.claude/.credentials.json and ~/.claudebox/.credentials.json are the same inode (hard links from Phase 4). The overlay still has the correct behavior: it re-mounts the claudebox-managed credential file on top of whatever ~/.claude/.credentials.json the direct bind exposed. [VERIFIED: inode check]

Anti-Patterns to Avoid

  • Mounting entire ~/.claudebox/projects/ as ~/.claude/projects/: This would expose ALL project dirs to every project, defeating isolation. Mount only the per-project hash subdir.
  • Using git rev-parse --show-toplevel instead of --git-common-dir: --show-toplevel returns the worktree root, not the main repo root. Worktrees would get different hashes than the main checkout.
  • Skipping touch ~/.claudebox/history.jsonl: bwrap fails with a hard error if the source doesn't exist. The script would crash on first run.
  • Writing to ~/.claude/CLAUDE.md: With the new mount architecture, ~/.claude is the user's real config dir. Writing CLAUDE.md there would destructively modify the user's actual config. Use bind-mounted SANDBOX.md instead.

Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| Content-addressed storage | Custom hash scheme | sha256sum \| cut -c1-16 | Already in coreutils; stable, collision-resistant for path strings |
| Filesystem overlay/union | Custom copy-on-write | bwrap --bind last-mount-wins | Kernel-native, no performance cost, no FUSE dependencies |
| File locking for concurrent sessions | Custom lockfile scheme | Nothing (D-13) | Claude Code already manages its own concurrency within its data dir |
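
As a small illustration of the first row, the sketch below derives the 16-character key and shows why printf '%s' matters: echo appends a newline, which changes the digest. The helper name instance_hash is hypothetical and the paths are the illustrative ones from the directory layout above.

```bash
#!/usr/bin/env bash
# Sketch: derive the 16-character instance key from a path.
# instance_hash is a hypothetical helper name, not part of claudebox.sh.

instance_hash() {
  # printf '%s' emits the path without a trailing newline; the newline
  # that echo adds would be hashed too and produce a different key.
  printf '%s' "$1" | sha256sum | cut -c1-16
}

a=$(instance_hash "/home/user/code/myproject")
b=$(instance_hash "/home/user/code/other")
with_newline=$(echo "/home/user/code/myproject" | sha256sum | cut -c1-16)

echo "myproject:            $a"
echo "other:                $b"
echo "myproject via echo:   $with_newline"
```

The same path always yields the same key, different paths yield different keys, and the echo-based variant does not match the printf-based one, so any tool that recomputes the key needs to hash the path the same way.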

Common Pitfalls

Pitfall 1: bwrap requires source to exist before launch

What goes wrong: Adding --bind ~/.claudebox/history.jsonl ~/.claude/history.jsonl without first ensuring the source file exists. bwrap exits 1 with "Can't find source path".
Why it happens: bwrap performs bind mounts at namespace creation time; the source must be a real, existing path on the host filesystem.
How to avoid: touch "$HOME/.claudebox/history.jsonl" before the bwrap call (already in D-04).
Also applies to: The per-project hash dir: mkdir -p "$INSTANCE_DIR" must happen before bwrap.
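
One way to catch this early is a preflight pass over every bind source before bwrap is invoked. The sketch below is only an illustration: missing_bind_sources is a hypothetical helper, and the demo runs against a temporary directory standing in for ~/.claudebox with a placeholder hash.

```bash
#!/usr/bin/env bash
# Sketch: verify that every bind source exists before invoking bwrap.
# missing_bind_sources is a hypothetical helper; it only reads the filesystem.

missing_bind_sources() {
  local missing=0 src
  for src in "$@"; do
    if [[ ! -e "$src" ]]; then
      echo "bind source missing: $src" >&2
      missing=$((missing + 1))
    fi
  done
  [[ $missing -eq 0 ]]             # succeed only if nothing was missing
}

# Demo against a temporary stand-in for $HOME/.claudebox.
demo=$(mktemp -d)
mkdir -p "$demo/projects/0123456789abcdef"   # placeholder hash
touch "$demo/history.jsonl"

if missing_bind_sources "$demo/projects/0123456789abcdef" "$demo/history.jsonl"; then
  first_check=ok
else
  first_check=failed
fi

# SANDBOX.md was never created in the demo dir, so this check fails.
if missing_bind_sources "$demo/SANDBOX.md" 2>/dev/null; then
  second_check=ok
else
  second_check=failed
fi

echo "dir + history: $first_check; SANDBOX.md: $second_check"
rm -rf "$demo"
```

A check like this would turn the bwrap hard error into a claudebox-level message that names the missing path.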

Pitfall 2: Overlay mount order matters

What goes wrong: Placing the ~/.claude/projects/ overlay bind BEFORE the ~/.claude bind. The ~/.claude bind would then overwrite the overlay (last-mount-wins goes the wrong way).
Why it happens: bwrap processes --bind args left to right; the last bind for any given path wins.
How to avoid: Always: --bind ~/.claude ~/.claude FIRST, then overlays.
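
The ordering rule can also be checked mechanically against an argument array. This is a rough sketch under stated assumptions: check_overlay_order is a hypothetical helper, /home/user is a stand-in home directory, and the scan only recognizes three-token "--bind SRC DEST" groups rather than parsing every bwrap option.

```bash
#!/usr/bin/env bash
# Sketch: check that the base bind precedes every overlay beneath it.
# check_overlay_order is a hypothetical helper. Non-bind options are skipped
# one token at a time, which is enough for a demo but is not a full parser.

check_overlay_order() {
  local base="$1"; shift
  local -a args=("$@")
  local i=0 base_seen=false dest
  while (( i < ${#args[@]} )); do
    if [[ "${args[i]}" == --bind ]]; then
      dest="${args[i+2]}"
      if [[ "$dest" == "$base" ]]; then
        base_seen=true
      elif [[ "$dest" == "$base"/* && "$base_seen" == false ]]; then
        echo "overlay $dest appears before the $base bind" >&2
        return 1
      fi
      i=$((i + 3))
    else
      i=$((i + 1))
    fi
  done
  return 0
}

H=/home/user   # stand-in for $HOME
good=(--tmpfs "$H" --bind "$H/.claude" "$H/.claude"
      --bind "$H/.claudebox/history.jsonl" "$H/.claude/history.jsonl")
bad=(--tmpfs "$H" --bind "$H/.claudebox/history.jsonl" "$H/.claude/history.jsonl"
     --bind "$H/.claude" "$H/.claude")

if check_overlay_order "$H/.claude" "${good[@]}"; then good_result=ok; else good_result=misordered; fi
if check_overlay_order "$H/.claude" "${bad[@]}" 2>/dev/null; then bad_result=ok; else bad_result=misordered; fi

echo "good: $good_result, bad: $bad_result"
```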

Pitfall 3: Credential mount target path not updated

What goes wrong: Keeping --bind "$CREDS_FILE" "$HOME/.claudebox/.credentials.json" after removing the ~/.claudebox bind and symlink. The target path no longer exists in the sandbox (no ~/.claudebox/ dir is mounted).
How to avoid: Change target to "$HOME/.claude/.credentials.json".

Pitfall 4: CLAUDE.md injection modifies user's real config

What goes wrong: Keeping the current CLAUDE.md injection logic (lines 167-174) that writes to "$HOME/.claudebox/CLAUDE.md". After the mount change, ~/.claudebox/CLAUDE.md is a host-side file not visible in the sandbox at all. Worse, if the code is updated to write to ~/.claude/CLAUDE.md, it destructively prepends @SANDBOX.md to the user's real config.
How to avoid: Mount SANDBOX.md as a single-file overlay (--bind ~/.claudebox/SANDBOX.md ~/.claude/SANDBOX.md). The user's real ~/.claude/CLAUDE.md already contains @SANDBOX.md on this system.
Warning sign: If SANDBOX.md content is missing inside the sandbox after the migration.
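
Because the overlay only helps when CLAUDE.md actually references SANDBOX.md, a read-only check could surface the warning sign at startup without modifying the user's file. The sketch below is an assumption-laden illustration: claude_md_imports_sandbox is a hypothetical helper, the sample files are invented, and the match is a plain substring search rather than a parse of Claude Code's import syntax.

```bash
#!/usr/bin/env bash
# Sketch: read-only check that a CLAUDE.md file mentions @SANDBOX.md.
# claude_md_imports_sandbox is a hypothetical helper; it never writes.

claude_md_imports_sandbox() {
  local claude_md="$1"
  [[ -f "$claude_md" ]] && grep -qF '@SANDBOX.md' "$claude_md"
}

demo=$(mktemp -d)
printf '%s\n' '@SANDBOX.md' 'Prefer small commits.' > "$demo/with-import.md"
printf '%s\n' 'Prefer small commits.' > "$demo/without-import.md"

results=""
for f in with-import.md without-import.md missing.md; do
  if claude_md_imports_sandbox "$demo/$f"; then
    results+="$f=yes "
  else
    results+="$f=no "             # a real script might print a warning here
  fi
done

echo "$results"
rm -rf "$demo"
```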

Pitfall 5: Dry-run block not mirrored

What goes wrong: Updating BWRAP_ARGS but forgetting the --dry-run echo block (lines 319-361). The dry-run output shows the old mount layout.
How to avoid: The dry-run block must be updated in sync with BWRAP_ARGS. Both blocks need: new ~/.claude bind, projects overlay, history overlay, SANDBOX.md overlay, updated credentials target.
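
One possible way to rule out this drift, offered as a sketch rather than a description of the existing dry-run block, is to print the dry-run output from the same array that gets executed. print_bwrap_cmd is a hypothetical helper, and HOME_DIR and the hash are placeholders for the demo.

```bash
#!/usr/bin/env bash
# Sketch: derive the --dry-run output from the array that would be executed,
# so the two cannot diverge. print_bwrap_cmd is a hypothetical helper.

print_bwrap_cmd() {
  # %q quotes each argument so the line can be pasted back into a shell.
  printf 'bwrap'
  printf ' %q' "$@"
  printf '\n'
}

HOME_DIR=/home/user                                            # stand-in for $HOME
INSTANCE_DIR="$HOME_DIR/.claudebox/projects/0123456789abcdef"  # placeholder hash
BWRAP_ARGS=(
  --bind "$HOME_DIR/.claude" "$HOME_DIR/.claude"
  --bind "$INSTANCE_DIR" "$HOME_DIR/.claude/projects"
  --bind "$HOME_DIR/.claudebox/history.jsonl" "$HOME_DIR/.claude/history.jsonl"
)

DRY_RUN=true
if [[ "$DRY_RUN" == true ]]; then
  dry_output=$(print_bwrap_cmd "${BWRAP_ARGS[@]}")
  echo "$dry_output"
fi
```

The trade-off is readability: a single quoted command line is less friendly than a hand-formatted echo block, so this may suit a verification step better than the user-facing output.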

Pitfall 6: Unstaged CLAUDE_JSON changes from Phase 4

What goes wrong: The working tree has uncommitted changes adding CLAUDE_JSON_FILE detection and a --bind $CLAUDE_JSON_FILE $HOME/.claude.json mount (the ~/.claude.json file, separate from the ~/.claude/ directory). These changes are not in the last commit.
Context: ~/.claude.json is at $HOME/.claude.json (not inside ~/.claude/), so this mount is independent of the Phase 5 architecture changes. The CLAUDE_JSON_FILE mount should be incorporated into Phase 5 work since it was left uncommitted from Phase 4.

Pitfall 7: Glob on empty projects/ directory

What goes wrong: In the GC loop for dir in "$HOME/.claudebox/projects"/*/; do, if projects/ is empty, bash leaves the unmatched glob as a literal string and the loop runs once with a non-existent path.
How to avoid: Add [[ -d "$dir" ]] || continue as the first statement in the loop (already in the Pattern 3 example above).
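
The behavior is easy to reproduce in an empty temporary directory. The sketch below assumes bash's default options (nullglob and failglob off) and also shows nullglob as a bash-specific alternative to the guard.

```bash
#!/usr/bin/env bash
# Sketch: how an unmatched glob behaves in a for loop, with and without a guard.

empty=$(mktemp -d)

unguarded=0
for dir in "$empty"/*/; do
  unguarded=$((unguarded + 1))      # runs once with the literal pattern
done

guarded=0
for dir in "$empty"/*/; do
  [[ -d "$dir" ]] || continue       # the literal ".../*/" is not a directory
  guarded=$((guarded + 1))
done

# Alternative: nullglob makes an unmatched glob expand to nothing.
shopt -s nullglob
with_nullglob=0
for dir in "$empty"/*/; do
  with_nullglob=$((with_nullglob + 1))
done
shopt -u nullglob

echo "unguarded=$unguarded guarded=$guarded nullglob=$with_nullglob"
# prints: unguarded=1 guarded=0 nullglob=0
rmdir "$empty"
```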

Code Examples

Complete instance initialization block

# Source: D-04, D-08, D-09, D-10

# Compute canonical project root (worktree-aware)
CWD=$(pwd)
GIT_COMMON=$(git -C "$CWD" rev-parse --git-common-dir 2>/dev/null) || true
if [[ -n "$GIT_COMMON" ]]; then
  [[ "$GIT_COMMON" != /* ]] && GIT_COMMON="$CWD/$GIT_COMMON"
  CANONICAL_ROOT=$(dirname "$(readlink -f "$GIT_COMMON")")
else
  CANONICAL_ROOT="$CWD"
fi

INSTANCE_HASH=$(printf '%s' "$CANONICAL_ROOT" | sha256sum | cut -c1-16)
INSTANCE_DIR="$HOME/.claudebox/projects/$INSTANCE_HASH"

# Create instance dir and project-root file
mkdir -p "$INSTANCE_DIR"
if [[ ! -f "$INSTANCE_DIR/project-root" ]]; then
  printf '%s\n' "$CANONICAL_ROOT" > "$INSTANCE_DIR/project-root"
fi

# Ensure history.jsonl source exists (bwrap requires it)
touch "$HOME/.claudebox/history.jsonl"

Relevant BWRAP_ARGS section (after change)

# Source: D-01, D-02, D-03, D-05, D-06 (after Phase 5 changes)
BWRAP_ARGS=(
  # ... clearenv, ENV_ARGS, tmpfs, proc, dev, nix, etc ...
  --tmpfs "$HOME"
  # Phase 5: mount real ~/.claude (replaces ~/.claudebox bind + symlink)
  --bind "$HOME/.claude" "$HOME/.claude"
  # Phase 5: overlay projects/ with this project's isolated dir
  --bind "$INSTANCE_DIR" "$HOME/.claude/projects"
  # Phase 5: overlay history.jsonl with sandbox-side file
  --bind "$HOME/.claudebox/history.jsonl" "$HOME/.claude/history.jsonl"
  # Phase 5: inject SANDBOX.md as file overlay
  --bind "$HOME/.claudebox/SANDBOX.md" "$HOME/.claude/SANDBOX.md"
)
# Credentials overlay (mount target changes from .claudebox to .claude)
if [[ "$CREDS_MOUNT" == true ]]; then
  BWRAP_ARGS+=(--bind "$CREDS_FILE" "$HOME/.claude/.credentials.json")
fi
# CLAUDE_JSON mount (independent of ~/.claude/ dir)
if [[ "$CLAUDE_JSON_MOUNT" == true ]]; then
  BWRAP_ARGS+=(--bind "$CLAUDE_JSON_FILE" "$HOME/.claude.json")
fi

Flag parsing addition for --gc

# Source: D-12 (add to existing while/case block)
GC_MODE=false

while (( $# > 0 )); do
  case "$1" in
    --yes|-y)  SKIP_AUDIT=true ;;
    --dry-run) DRY_RUN=true ;;
    --check)   CHECK_MODE=true ;;
    --shell)   SHELL_MODE=true ;;
    --gc)      GC_MODE=true ;;    # NEW
    --) shift; CLAUDE_ARGS+=("$@"); break ;;
    *) CLAUDE_ARGS+=("$1") ;;
  esac
  shift
done
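
Parsing the flag is only half of D-12; the script also needs to act on GC_MODE, presumably before any mount setup. The following is a minimal sketch of that dispatch, not the actual claudebox.sh flow: run_claudebox and CLAUDEBOX_DIR are hypothetical names, CLAUDEBOX_DIR points at a temporary directory instead of ~/.claudebox, and gc_instances is a compact version of Pattern 3.

```bash
#!/usr/bin/env bash
# Sketch: act on --gc before any sandbox setup runs.
# CLAUDEBOX_DIR stands in for $HOME/.claudebox so the demo stays in a temp dir.

CLAUDEBOX_DIR=$(mktemp -d)

gc_instances() {
  local dir root_path
  for dir in "$CLAUDEBOX_DIR/projects"/*/; do
    [[ -f "${dir}project-root" ]] || continue   # also skips an unmatched glob
    root_path=$(< "${dir}project-root")
    if [[ ! -d "$root_path" ]]; then
      rm -rf -- "$dir"
      echo "Removed: $dir (project path gone: $root_path)" >&2
    fi
  done
}

run_claudebox() {
  local gc_mode=false
  while (( $# > 0 )); do
    case "$1" in
      --gc) gc_mode=true ;;
      --)   shift; break ;;
      *)    ;;                      # other flags omitted in this sketch
    esac
    shift
  done
  if [[ "$gc_mode" == true ]]; then
    gc_instances
    return 0                        # GC is a standalone action; no sandbox launch
  fi
  echo "would launch sandbox"
}

# Demo: one instance whose project still exists, one whose project is gone.
live_root="$CLAUDEBOX_DIR/live-project"
mkdir -p "$live_root" \
  "$CLAUDEBOX_DIR/projects/aaaaaaaaaaaaaaaa" \
  "$CLAUDEBOX_DIR/projects/bbbbbbbbbbbbbbbb"
printf '%s\n' "$live_root" > "$CLAUDEBOX_DIR/projects/aaaaaaaaaaaaaaaa/project-root"
printf '%s\n' "$CLAUDEBOX_DIR/deleted-project" > "$CLAUDEBOX_DIR/projects/bbbbbbbbbbbbbbbb/project-root"

run_claudebox --gc
```

After the call, the aaaaaaaaaaaaaaaa instance remains and the bbbbbbbbbbbbbbbb instance is removed, with the removal reported on stderr as D-11 requires.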

Open Questions

  1. SANDBOX.md injection strategy

    • What we know: User's real ~/.claude/CLAUDE.md already has @SANDBOX.md on this system. The CONTEXT.md says D-06 "must not overwrite user's real ~/.claude/CLAUDE.md destructively".
    • What's unclear: Should claudebox continue to manage SANDBOX.md content (write it to ~/.claudebox/SANDBOX.md and bind-mount it), or stop injecting it entirely since the user's CLAUDE.md already has @SANDBOX.md?
    • Recommendation: Continue writing ~/.claudebox/SANDBOX.md and bind-mounting it as ~/.claude/SANDBOX.md. This keeps the content claudebox-controlled without touching the user's CLAUDE.md. Remove the CLAUDE.md injection logic (lines 167-174) entirely — the real ~/.claude/CLAUDE.md already has @SANDBOX.md, and we must not touch it.
  2. Uncommitted CLAUDE_JSON_FILE changes

    • What we know: The working tree has uncommitted changes (not in HEAD) adding CLAUDE_JSON mount for ~/.claude.json (the root-level file, not inside ~/.claude/). Phase 4 verification passed without this code.
    • What's unclear: Was this intentionally left uncommitted, or is it Phase 4 work that needs to be committed first?
    • Recommendation: Incorporate these changes into Phase 5. The ~/.claude.json mount is independent of Phase 5 architecture changes and is useful (stores auth tokens). The planner should include a task to commit these changes as part of Phase 5 work.
  3. ~/.claudebox/projects/ overlay vs subdir mount

    • This was clarified during research: the design mounts ~/.claudebox/projects/<hash>/ as ~/.claude/projects/ (the per-project subdir AS the entire projects view), not ~/.claudebox/projects/ as ~/.claude/projects/ (which would expose all projects).
    • Claude Code creates path-based subdirs (e.g., -home-user-code-myproject/) INSIDE the mounted projects dir, which lands inside the hash subdir on the host.

Environment Availability

| Dependency | Required By | Available | Version | Fallback |
|---|---|---|---|---|
| sha256sum (coreutils) | D-09 hash computation | yes | GNU coreutils 9.10 | |
| git | D-08 canonical root | yes | nixpkgs git | |
| readlink -f (coreutils) | D-08 absolute path | yes | GNU coreutils 9.10 | |
| bwrap subdirectory overlay | D-02 projects isolation | yes | 0.9.x (nixpkgs) | |
| bwrap file overlay | D-03 history isolation | yes | 0.9.x (nixpkgs) | |

All dependencies available. No new packages required in flake.nix.

State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
| ~/.claudebox bind + symlink to ~/.claude | Direct ~/.claude bind + overlays for isolation paths | Phase 5 (this) | All plugins/skills/hooks become visible in sandbox |
| No per-project isolation | Hash-based per-project instance dirs | Phase 5 (this) | Conversation history scoped to git root |
| No GC | --gc flag removes stale instance dirs | Phase 5 (this) | Prevents unbounded growth of ~/.claudebox/projects/ |

Deprecated/outdated:

  • --symlink ~/.claudebox ~/.claude in BWRAP_ARGS: replaced by direct --bind ~/.claude ~/.claude
  • CLAUDE.md injection logic (lines 167-174): no longer needed; user's real CLAUDE.md already has @SANDBOX.md
  • Target path ~/.claudebox/.credentials.json for credentials overlay: changes to ~/.claude/.credentials.json

Assumptions Log

| # | Claim | Section | Risk if Wrong |
|---|---|---|---|
| A1 | Claude Code manages its own file-level concurrency within its data dir (D-13) | Architecture Patterns | If wrong, concurrent sessions in same project could corrupt data; but since D-13 is a locked decision, implementation proceeds without locking |
| A2 | User's ~/.claude/CLAUDE.md already contains @SANDBOX.md on all deployments | Open Questions | SANDBOX.md content wouldn't be injected if user doesn't have @SANDBOX.md in their CLAUDE.md; mitigated by the SANDBOX.md bind-mount overlay |

All other claims in this research were verified via live testing or direct inspection of the codebase.

Sources

Primary (HIGH confidence)

  • Live bwrap testing — --bind overlay behavior, subdirectory overlay under bound parent, file overlay, behavior when source doesn't exist
  • claude --help output — confirmed no --data-dir flag exists; Claude Code v2.1.97
  • ~/.claude/projects/ directory inspection — confirmed path-based naming convention (e.g., -home-toph-code-tools-claudebox/)
  • ~/.claude/history.jsonl content inspection — confirmed it stores prompt display history with project path references
  • Inode inspection — confirmed ~/.claude/.credentials.json and ~/.claudebox/.credentials.json are the same inode (hard links)
  • sha256sum availability — confirmed in coreutils 9.10 already in runtimeInputs
  • git rev-parse --git-common-dir behavior — confirmed .git (relative) for normal repos, resolves correctly via readlink -f
  • claudebox.sh source (working tree) — read completely; all line numbers referenced in CONTEXT.md verified
  • flake.nix — confirmed runtimeInputs; no new packages needed

Secondary (MEDIUM confidence)

  • PLUGIN_MOUNT_FIX.md — architectural rationale document confirming the plugin visibility problem and overlay fix
  • CONTEXT.md decisions D-01 through D-14 — user-validated design decisions from discuss-phase

Metadata

Confidence breakdown:

  • Standard stack: HIGH — all tools verified live on this system
  • Architecture: HIGH — bwrap overlay behavior verified with live tests; Claude Code storage layout confirmed by inspection
  • Pitfalls: HIGH — most pitfalls discovered via direct testing or code inspection, not assumption

Research date: 2026-04-13
Valid until: 2026-07-13 (stable APIs; Claude Code storage format could change on any release)