Frontmatter
id: 11636
title: Phase 2D — Q12 Search-Hydration Mode Resolution + SearchService Implementation
state: Closed
labels: enhancement, ai, architecture
assignees: neo-gpt
createdAt: May 19, 2026, 1:55 PM
updatedAt: May 20, 2026, 10:29 PM
githubUrl: https://github.com/neomjs/neo/issues/11636
author: neo-opus-4-7
commentsCount: 1
parentIssue: 11626
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 20, 2026, 10:29 PM

Phase 2D — Q12 Search-Hydration Mode Resolution + SearchService Implementation

neo-opus-4-7 commented on May 19, 2026, 1:55 PM

Context

Sub of Phase 2 Epic #11626 (meta-Epic #11624). Graduated from Discussion #11623 Q12.

Resolves Q12 (search-hydration mode for non-local tenant content) + implements chosen mode in SearchService. Phase 2A/B/C MUST NOT implement retrieval flow before this lands (phase-dependency invariant per Epic AC).

The Problem

SearchService.mjs:118-120 currently resolves chunk source paths via path.resolve(aiConfig.neoRootDir, ref.source). For tenant content that is not mirrored on the KB server's filesystem, this resolution fails silently. Discussion #11623 proposes three Q12 options:

  • Option A — chunk-metadata-embedded: store full content in chunk.metadata.content; hydrate from chunk, not filesystem. Pro: works regardless of FS layout. Con: chunk metadata size grows ~2-3x.
  • Option B — server-mirror: KB server mirrors tenant content locally (per-tenant mount under /tenants/<tenantId>/<repoSlug>/). Pro: filesystem-native. Con: storage cost, sync coordination.
  • Option C — hybrid: chunk metadata for small chunks; server-mirror for large. Pro: optimizes both. Con: dual-path complexity.

The Fix

  1. V-B-A against actual per-tenant chunk-size distribution (e.g., Neo's own chunks: median size, p95, p99) to inform the choice
  2. Decide Q12 option (lean Option A or Option C; document rationale + measurement evidence)
  3. Implement chosen mode in SearchService — replace single-neoRootDir resolution with tenant-tuple-aware hydration
  4. Tenant content sourced from:
    • Option A: chunk.metadata.content directly
    • Option B: path.resolve(aiConfig.knowledgeBase.tenantMirrorRoot, tenantId, repoSlug, sourcePath)
    • Option C: per-chunk routing based on size threshold or explicit flag
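The routing described in steps 3 and 4 could look roughly like the sketch below. It is not the SearchService implementation: the `mode` switch, the injected `readMirror` and `readLegacy` readers, and the chunk shape are all hypothetical names chosen for illustration. The readers are injected so the routing can be exercised without a filesystem; in practice `readLegacy` would wrap the existing neoRootDir resolution and `readMirror` the tenantMirrorRoot resolution from Option B.

```javascript
// Hypothetical names throughout: `mode`, `readMirror`, `readLegacy` and the
// chunk shape are assumptions for illustration, not the SearchService API.
const LEGACY_TENANT = 'neo-shared';

/**
 * Resolve the text for one search hit.
 * @param {Object} chunk {tenantId, repoSlug, source, metadata}
 * @param {Object} io    {mode, readMirror, readLegacy}; readers are injected
 * @returns {Promise<String|null>} content, or null when nothing could be hydrated
 */
async function hydrateChunk(chunk, {mode, readMirror, readLegacy}) {
    const embedded = chunk.metadata?.content;

    // Option A, and the small-chunk half of Option C: content travels with the chunk.
    if ((mode === 'A' || mode === 'C') && typeof embedded === 'string') {
        return embedded;
    }

    // Backward-compat: existing Neo content still hydrates from neoRootDir.
    if (chunk.tenantId === LEGACY_TENANT) {
        return readLegacy(chunk.source);
    }

    // Option B, and the large-chunk half of Option C: per-tenant mirror lookup.
    if (mode === 'B' || mode === 'C') {
        return readMirror(chunk.tenantId, chunk.repoSlug, chunk.source);
    }

    // Option A without embedded content: return null so the caller can
    // log the miss instead of letting it pass silently.
    return null;
}
```

Returning null for an unhydratable chunk is one possible design choice; throwing would also make the miss visible, at the cost of failing the whole search response for one bad hit.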

Acceptance Criteria

  • V-B-A document: chunk-size distribution captured for current Neo content (median, p95, p99, max)
  • Q12 option chosen + rationale documented in code comments + cross-link to this ticket
  • SearchService hydration path implemented for chosen mode
  • Backward-compat: existing Neo content (with tenantId: 'neo-shared') still hydratable from neoRootDir (gradual migration acceptable; existing path stays functional during transition)
  • Unit tests: hydration for each mode-relevant path
  • Integration test: tenant content retrievable + content visible in search results
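The chunk-size distribution in the first criterion can be computed with a few lines once the chunk documents are in memory. The sketch below assumes the documents have already been fetched from the collection as an array of strings; `summarizeChunkSizes` and `percentile` are hypothetical helper names, and the nearest-rank percentile method is one convention among several.

```javascript
// Nearest-rank percentile over an ascending-sorted array of byte sizes.
// For an even-length array the median is the lower of the two middle values.
function percentile(sorted, p) {
    if (sorted.length === 0) return null;
    const rank = Math.ceil((p * sorted.length) / 100);
    return sorted[Math.max(rank, 1) - 1];
}

// `documents` is assumed to be an array of chunk texts (null entries count as empty).
function summarizeChunkSizes(documents) {
    const encoder = new TextEncoder(); // measures UTF-8 bytes, not characters
    const sizes   = documents
        .map(doc => encoder.encode(doc ?? '').length)
        .sort((a, b) => a - b);

    return {
        count : sizes.length,
        median: percentile(sizes, 50),
        p95   : percentile(sizes, 95),
        p99   : percentile(sizes, 99),
        max   : sizes.length ? sizes[sizes.length - 1] : null
    };
}
```

Sizes are measured in UTF-8 bytes because storage cost, not character count, is what the Option A trade-off depends on.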

Out of Scope

  • Phase 2A core service (depends on this for retrieval pattern)
  • Server-mirror infrastructure if Option B/C chosen (sync coordination may need separate ticket)

Related

  • Parent: #11626
  • Blocked-by: Phase 0/1 Epic completion (chunk-shape stable)
  • Blocks: Phase 2A retrieval flow (per phase-dependency AC)
  • Discussion source: #11623 §4 Q12

Origin Session ID

7360e917-1733-4cdd-a6f3-5ac51c34b838

Handoff Retrieval Hints

  • SearchService.mjs:95-130 is the current hydration surface
  • Start with V-B-A: query existing Chroma collection for chunk-size distribution before committing to Q12 option
  • ChromaDB defrag perf anchor (memory): ~10k chunks at 189MB → ~19KB/chunk average; informs Option A storage cost estimate
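The arithmetic behind the last hint, as a small sketch. `estimateStorage` is a hypothetical helper, and decimal megabytes are assumed. Applying the ticket's ~2-3x figure to the whole collection is also an assumption: the ticket states that growth for chunk metadata only, so the projected totals are a rough upper bound rather than a measurement.

```javascript
const MB = 1000 * 1000; // decimal megabytes (assumption)

function estimateStorage({totalBytes, chunkCount, growthFactors}) {
    const avgBytes = totalBytes / chunkCount;

    return {
        avgKB      : avgBytes / 1000,
        // Upper bound: scales the entire collection, not just the metadata.
        projectedMB: growthFactors.map(factor => (totalBytes * factor) / MB)
    };
}

const estimate = estimateStorage({
    totalBytes   : 189 * MB,
    chunkCount   : 10000,
    growthFactors: [2, 3]
});
// estimate.avgKB → 18.9, estimate.projectedMB → [378, 567]
```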
tobiu referenced in commit 3383a3e - "fix(kb): hydrate tenant search results from metadata (#11636) (#11694)" on May 20, 2026, 10:29 PM
tobiu closed this issue on May 20, 2026, 10:29 PM