Context
Follow-up to closed predecessor #10224 (which established the coarse-grained .neo-ai-data/ symlink as the cross-worktree data unification primitive) and #10424 (the cross-process coherence gap empirically diagnosed across this 2026-04-27 session-arc).
bootstrapWorktree.mjs --link-data currently symlinks the entire .neo-ai-data/ directory atomically. With --force, it recursively removes the existing dir (via fs.rm) before symlinking. This works for a fresh worktree but breaks for any worktree that:
- Has the git-tracked .neo-ai-data/concepts/ directory present (every worktree, since it's tracked) → the symlink hides the worktree's own tracked files behind canonical's
- Has been previously fixed via unlink + git checkout to restore concepts/ → the data-link is now broken (regular dirs for sqlite/, chroma/ instead of symlinks), causing cross-process coherence drift between MCP servers and the bridge daemon
This session-arc was the empirical anchor: my MCP server's list_messages repeatedly missed messages from @neo-gemini-3-1-pro that raw SQL on canonical confirmed existed; bridge daemon delivered to phantom WAKE_SUB IDs that no MCP-server view contained.
The Problem
The .gitignore boundary inside .neo-ai-data/:
.neo-ai-data
!.neo-ai-data/concepts/
Everything under .neo-ai-data/ is gitignored EXCEPT concepts/ which IS tracked. The bootstrap's all-or-nothing symlink can't honor this distinction:
- All-symlink → tracked concepts/ becomes invisible (replaced by canonical's view)
- All-regular-dir (after the manual unlink + git checkout reversal) → gitignored substrate-data subdirs (sqlite/, chroma/, wake-daemon/, etc.) get isolated per-worktree, breaking cross-process coherence
The empirical pattern observed this session-arc: 11 worktrees on disk, only 1 symlinked correctly (peaceful-pare-9cbfcb); 10 had regular dirs and 0 (or stale) WAKE_SUBSCRIPTION nodes in their isolated DBs. MCP servers in those worktrees couldn't see canonical's live state.
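The audit above can be reproduced per worktree by calling lstat on the data directory and its children. The sketch below is illustrative only: describeDataLink and kindOf are hypothetical helpers, not part of bootstrapWorktree.mjs, and the default subdir list is an assumption taken from the subdirs named in this issue.

```javascript
import {lstat} from 'node:fs/promises';
import path from 'node:path';

// Classifies a path without following symlinks.
async function kindOf(p) {
    try {
        const stat = await lstat(p);
        if (stat.isSymbolicLink()) return 'symlink';
        return stat.isDirectory() ? 'regular-dir' : 'other';
    } catch (err) {
        if (err.code === 'ENOENT') return 'missing';
        throw err;
    }
}

// Hypothetical helper (not in the repo): reports how one worktree's data dir is wired.
export async function describeDataLink(worktreeRoot, subdirs = ['sqlite', 'chroma', 'wake-daemon']) {
    const dataDir = path.join(worktreeRoot, '.neo-ai-data');
    const report = {parent: await kindOf(dataDir), subdirs: {}};
    for (const subdir of subdirs) {
        report.subdirs[subdir] = await kindOf(path.join(dataDir, subdir));
    }
    return report;
}
```

A parent reported as symlink means the coarse-grained link from #10224 is in place, and the child entries then describe canonical's directories as seen through that link. Regular-dir children under a regular-dir parent correspond to the isolated state described above.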
The Architectural Reality
- ai/scripts/bootstrapWorktree.mjs: the canonical worktree-init substrate. It already has BOOTSTRAP_CONFIGS (the explicit allowlist for config.mjs files); the same shape applies cleanly to data subdirs.
- symlinkDataDir({dir = '.neo-ai-data'}): current shape is a single-dir all-or-nothing symlink with force clobber.
- test/playwright/unit/ai/scripts/bootstrapWorktree.spec.mjs: existing test surface; the granular update needs symmetric coverage.
- Canonical .neo-ai-data/ substrate-data subdirs (gitignored): sqlite/, chroma/, wake-daemon/, backups/, datasets/, neo-sqlite/. Plus tracked-only: concepts/. Plus a top-level memory-core.sqlite placeholder file (empty, harmless).
- wake-daemon/ symlinking is critical for #10423's PID-lock singleton enforcement to span worktrees: currently each worktree has its own wake-daemon/bridge-daemon.pid, so daemons spawned from different worktrees can't see each other's locks. The same logic applies to bridge.log (#10425 persistent log substrate).
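To make the PID-lock point concrete: two worktrees can only observe the same bridge-daemon.pid if their wake-daemon/ paths resolve to the same physical directory. The following is a minimal sketch of that check, assuming only the directory layout described in this issue; sharesWakeDaemonDir is a hypothetical helper and says nothing about how #10423 actually implements the lock.

```javascript
import {realpath} from 'node:fs/promises';
import path from 'node:path';

// Hypothetical helper (not in the repo): true when both roots resolve
// .neo-ai-data/wake-daemon to the same physical directory, so a PID file
// written from one root is visible from the other.
// Rejects with ENOENT if either root has no wake-daemon/ entry at all.
export async function sharesWakeDaemonDir(rootA, rootB) {
    const [a, b] = await Promise.all([rootA, rootB].map(root =>
        realpath(path.join(root, '.neo-ai-data', 'wake-daemon'))
    ));
    return a === b;
}
```

With per-worktree regular dirs this returns false for any pair of worktrees, which is the situation where daemons cannot see each other's locks. Once wake-daemon/ is symlinked to canonical in each worktree, it returns true.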
The Fix
Replace the single-dir symlinkDataDir with a granular per-subdir version:
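export const DATA_SUBDIRS_TO_LINK = [
    'sqlite',       // Memory Core graph DB
    'chroma',       // vector DBs (KB + memory-core)
    'wake-daemon',  // PID-lock + bridge.log + lastSyncId; unifies singleton across worktrees (#10423/#10425)
    'backups',      // JSONL backups
    'datasets',     // canonical CSVs
    'neo-sqlite'    // legacy DB (still 329MB / referenced)
];
export async function symlinkDataDir({
    mainCheckout, projectRoot,
    subdirs = DATA_SUBDIRS_TO_LINK,
    force = false, log = console.log
}) {
    // Ensure .neo-ai-data/ exists as a regular dir; never symlink the parent.
    // For each subdir in the allowlist:
    //   - lstat dst; if symlink → 'already-linked' (idempotent)
    //   - if non-symlink dir + canonical lacks subdir → 'skip-no-source'
    //   - if non-symlink dir + force=false → throw (data-loss guard preserved)
    //   - if non-symlink dir + force=true → recursive rm → symlink
    //   - else (no dst) → mkdir parent → symlink
    // concepts/ is never in the allowlist → never touched.
    // Returns per-subdir result map: { sqlite: 'linked', chroma: 'already-linked', ... }
}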
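One possible shape for the per-subdir loop is sketched below. It is not the final implementation and rests on several assumptions: lstatOrNull is a hypothetical helper; a missing canonical subdir is skipped even when no destination exists (to avoid dangling links); and a parent .neo-ai-data/ that is still a symlink from the old coarse-grained scheme is rejected outright, since removing children through that link would delete canonical's data.

```javascript
import {lstat, mkdir, rm, symlink} from 'node:fs/promises';
import path from 'node:path';

export const DATA_SUBDIRS_TO_LINK = ['sqlite', 'chroma', 'wake-daemon', 'backups', 'datasets', 'neo-sqlite'];

// Hypothetical helper: lstat result for p, or null if nothing exists there.
async function lstatOrNull(p) {
    try {
        return await lstat(p);
    } catch (err) {
        if (err.code === 'ENOENT') return null;
        throw err;
    }
}

export async function symlinkDataDir({
    mainCheckout, projectRoot,
    subdirs = DATA_SUBDIRS_TO_LINK,
    force = false, log = console.log
}) {
    const dataDir = path.join(projectRoot, '.neo-ai-data');

    // Assumption: refuse to operate through a symlinked parent, because any
    // recursive rm below would then hit the canonical checkout's data.
    if ((await lstatOrNull(dataDir))?.isSymbolicLink()) {
        throw new Error(`${dataDir} is a symlink; unlink it before linking subdirs`);
    }
    // The parent stays a regular directory; only its children are linked.
    await mkdir(dataDir, {recursive: true});

    const results = {};
    for (const subdir of subdirs) {
        const src = path.join(mainCheckout, '.neo-ai-data', subdir);
        const dst = path.join(dataDir, subdir);
        const dstStat = await lstatOrNull(dst);

        if (dstStat?.isSymbolicLink()) {
            results[subdir] = 'already-linked';
        } else if (!(await lstatOrNull(src))) {
            results[subdir] = 'skip-no-source';
        } else if (dstStat && !force) {
            // Data-loss guard: never replace real data without an explicit force.
            throw new Error(`${dst} exists and is not a symlink; pass force to replace it`);
        } else {
            if (dstStat) await rm(dst, {recursive: true, force: true});
            await symlink(src, dst, 'dir');
            results[subdir] = 'linked';
        }
        log(`${subdir}: ${results[subdir]}`);
    }
    return results;
}
```

Because the loop only ever builds paths from the allowlist, concepts/ and any other unlisted entry under .neo-ai-data/ are never stat'ed, removed, or linked, regardless of force.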
Touched surface:
- ai/scripts/bootstrapWorktree.mjs: function signature change + new DATA_SUBDIRS_TO_LINK export + per-subdir loop
- test/playwright/unit/ai/scripts/bootstrapWorktree.spec.mjs: extend existing tests and add cases for idempotent re-link per-subdir, refuse-clobber-without-force per-subdir, force-clobber per-subdir, skip-no-source-subdir, and concepts/ never touched
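Acceptance Criteria
- symlinkDataDir accepts a subdirs allowlist (defaults to DATA_SUBDIRS_TO_LINK)
- concepts/ is NEVER in the default allowlist (cannot be clobbered by --force accidentally)
- […] force flag
- --force clobbers only listed subdirs, never concepts/ or other unlisted paths
- […] (node ai/scripts/bootstrapWorktree.mjs --link-data) prints per-subdir result lines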
Out of Scope
- Auto-discovery of subdirs via git check-ignore (deferred: the hardcoded allowlist is more conservative and the substrate-data subdir set is small + stable; revisit if churn warrants)
- Migration script for existing broken worktrees (manual rm -rf + ln -s is one-shot; no need for tooling)
- Changes to the parent .neo-ai-data/ directory itself (it's left as a regular dir; only its substrate-data children are symlinked)
- Changes to concepts/ synchronization across worktrees (it's git-tracked; works through normal git mechanics)
- Bridge-daemon-side changes to handle non-symlinked deployments (out of scope; this fix addresses the substrate, not the consumer)
Avoided Traps
- Trap: Auto-discover via git check-ignore per-subdir at runtime. Avoided: Adds a dependency on git invocation in a path that should be fast and deterministic; a hardcoded allowlist is more transparent, and a small list is fine.
- Trap: Symlink the parent .neo-ai-data/ and special-case concepts/ afterward. Avoided: Leaves a window where concepts/ is wrong; granular per-subdir symlinking is cleaner.
- Trap: Discriminate based on git-tracked status programmatically (e.g., git ls-files filter). Avoided: Coupling the bootstrap to git-tracking introspection adds overhead; the gitignore boundary is stable enough that an explicit allowlist captures the right shape with one source of truth.
Related
- Closed predecessor: #10224 (coarse-grained symlink that this refines)
- Related substrate bug: #10424 (cross-process coherence gap that the granular fix unblocks)
- Singleton enforcement that depends on wake-daemon/ symlinking: #10423
- Persistent log substrate that depends on wake-daemon/ symlinking: #10425
- Bootstrap script @see chain: #10095, #10176
Origin Session ID: 594d8e82-3c69-4d66-8038-fcba9a96efa7
Retrieval Hint: "bootstrapWorktree granular subdir symlink concepts gitignore boundary"