Context
Sub-issue of Phase 2 Epic #11626 (meta-Epic #11624). Graduated from Discussion #11623 Q12.
Resolves Q12 (search-hydration mode for non-local tenant content) + implements chosen mode in SearchService. Phase 2A/B/C MUST NOT implement retrieval flow before this lands (phase-dependency invariant per Epic AC).
The Problem
SearchService.mjs:118-120 currently resolves chunk source paths via path.resolve(aiConfig.neoRootDir, ref.source). For tenant content NOT mirrored on KB server filesystem, this fails silently. Three Q12 options from Discussion #11623:
- Option A (chunk-metadata-embedded): store full content in chunk.metadata.content; hydrate from chunk, not filesystem. Pro: works regardless of FS layout. Con: chunk metadata size grows ~2-3x.
- Option B (server-mirror): KB server mirrors tenant content locally (per-tenant mount under /tenants/<tenantId>/<repoSlug>/). Pro: filesystem-native. Con: storage cost, sync coordination.
- Option C (hybrid): chunk metadata for small chunks; server-mirror for large. Pro: optimizes both. Con: dual-path complexity.
The Fix
- V-B-A against actual per-tenant chunk-size distribution (e.g., Neo's own chunks: median size, p95, p99) to inform the choice
- Decide Q12 option (lean Option A or Option C; document rationale + measurement evidence)
- Implement chosen mode in SearchService: replace single-neoRootDir resolution with tenant-tuple-aware hydration
- Tenant content sourced from:
  - Option A: chunk.metadata.content directly
  - Option B: path.resolve(aiConfig.knowledgeBase.tenantMirrorRoot, tenantId, repoSlug, sourcePath)
  - Option C: per-chunk routing based on size threshold or explicit flag
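A minimal sketch of the per-option routing listed above, not the SearchService implementation. Assumptions: the function name resolveHydrationSource, the mode argument, and the returned descriptor shape are hypothetical; chunk refs are assumed to carry tenantId, repoSlug, source and an optional metadata.content; Option C is assumed to route on whether inline content is present. The function does no I/O. A caller would pass segments to path.resolve(...segments) for the filesystem case, and a 'missing' result gives it something explicit to log instead of failing silently.

```javascript
// Hypothetical helper: names and descriptor shape are illustrative only.
const LEGACY_TENANT_ID = 'neo-shared';

/**
 * Decide where a chunk's content should be hydrated from.
 * Pure function (no filesystem access), so it can be unit tested in isolation.
 *
 * @param {Object} ref    Chunk reference: {tenantId, repoSlug, source, metadata?}
 * @param {Object} config Shape assumed to match aiConfig in the ticket
 * @param {String} mode   'A' | 'B' | 'C' (the Q12 option in effect)
 */
function resolveHydrationSource(ref, config, mode) {
    const {tenantId, repoSlug, source, metadata = {}} = ref;

    // Existing path stays functional: shared Neo content resolves against neoRootDir.
    if (tenantId === LEGACY_TENANT_ID) {
        return {kind: 'filesystem', segments: [config.neoRootDir, source]};
    }

    const hasInline = typeof metadata.content === 'string';
    const inline    = {kind: 'inline', content: metadata.content};
    const mirror    = {
        kind    : 'filesystem',
        segments: [config.knowledgeBase.tenantMirrorRoot, tenantId, repoSlug, source]
    };
    const missing   = {
        kind  : 'missing',
        reason: `no inline content for ${tenantId}/${repoSlug}/${source}`
    };

    switch (mode) {
        case 'A': return hasInline ? inline : missing; // chunk-metadata-embedded
        case 'B': return mirror;                       // server-mirror
        case 'C': return hasInline ? inline : mirror;  // hybrid: inline when present
        default : throw new Error(`Unknown hydration mode: ${mode}`);
    }
}
```

Keeping the decision separate from the file read means the same routing table can be exercised for all three options before Q12 is decided.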
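Acceptance Criteria
- SearchService hydration path implemented for chosen mode
- Content with tenantId: 'neo-shared' still hydratable from neoRootDir (gradual migration acceptable; existing path stays functional during transition)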
Out of Scope
- Phase 2A core service (depends on this for retrieval pattern)
- Server-mirror infrastructure if Option B/C chosen (sync coordination may need separate ticket)
Related
- Parent: #11626
- Blocked-by: Phase 0/1 Epic completion (chunk-shape stable)
- Blocks: Phase 2A retrieval flow (per phase-dependency AC)
- Discussion source: #11623 §4 Q12
Origin Session ID
7360e917-1733-4cdd-a6f3-5ac51c34b838
Handoff Retrieval Hints
- SearchService.mjs:95-130 is the current hydration surface
- Start with V-B-A: query existing Chroma collection for chunk-size distribution before committing to Q12 option
- ChromaDB defrag perf anchor (memory): ~10k chunks at 189MB → ~19KB/chunk average; informs Option A storage cost estimate
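A rough sketch of the measurement step from the hints above, under stated assumptions. The helper names (percentile, summarizeChunkSizes, projectStorage) are hypothetical. Chunk documents are assumed to have already been fetched from the Chroma collection as an array of strings, and 189MB is read as decimal megabytes. The 2-3x growth figure in Option A refers to chunk metadata, so applying it to the whole on-disk footprint (which also includes embeddings and index data) likely overstates the cost; treat the projection as a rough upper bound.

```javascript
// Hypothetical helpers: names are illustrative, not an existing API.

/** Nearest-rank percentile over an ascending-sorted numeric array. */
function percentile(sorted, p) {
    if (sorted.length === 0) {
        throw new Error('percentile() needs at least one value');
    }
    // Integer math first (p * n), then divide, to avoid floating-point drift.
    const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
    return sorted[rank - 1];
}

/** Summarise the UTF-8 byte sizes of chunk documents (median, p95, p99). */
function summarizeChunkSizes(documents) {
    const encoder = new TextEncoder();
    const sizes   = documents
        .map(doc => encoder.encode(doc).length)
        .sort((a, b) => a - b);

    return {
        count : sizes.length,
        median: percentile(sizes, 50),
        p95   : percentile(sizes, 95),
        p99   : percentile(sizes, 99)
    };
}

/** Average bytes per chunk, plus total storage projected under a growth factor. */
function projectStorage(totalBytes, chunkCount, growthFactor) {
    return {
        bytesPerChunk      : totalBytes / chunkCount,
        projectedTotalBytes: totalBytes * growthFactor
    };
}

// Anchor from the ticket: ~10k chunks at 189MB, projected at the 3x end of the range.
const anchor = projectStorage(189e6, 10000, 3);
console.log(Math.round(anchor.bytesPerChunk / 1000)); // 19 (KB per chunk)
```

If p99 sits far above the median, that points toward a size threshold (Option C); a tight distribution makes Option A's uniform path easier to justify.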