Context
Phase 2 sub-ticket of Epic #11624 (Cloud-Native KB Ingestion for External Workspaces). Graduated from Discussion #11623. Blocked by Phase 0/1 #11625 — contracts MUST stabilize before endpoint implementation per cross-family peer convergence (substrate-correct ordering: contracts before transport).
This phase implements the cross-server push pipeline — the actual ingestion endpoints that consume Phase 0/1 contracts. Two facades behind a shared service, plus the Q12 search-hydration mode decision.
The Problem
After Phase 0/1 lands, the substrate has:
- Stable `parsed-chunk-v1` + `backup-record-v1` schemas
- Source/Parser registry with `useDefaultSources` / `useDefaultParsers`
- Path-identity tuple `{tenantId, repoSlug, rootKind, sourcePath}`
- Tombstone / manifest / revision-boundary deletion-signaling contract
- `memorySharing` KB port (write-side stamping + read-side filter)
What's missing for cloud deployments: the actual ingestion endpoint that clients call from their git hooks. A pure-MCP path cannot handle bulk initial imports (`VectorService.mjs:216-240` refuses syncs > `mcpSyncMaxChunks`); a pure-bulk path loses the agent-native command-plane affordance for small batches. Two facades behind one service is the structurally necessary shape.
Also: Q12 search hydration (chunk-metadata-embedded vs server-mirror vs hybrid) MUST be resolved before retrieval flow lands — Phase 0/1 marks the boundary; Phase 2 chooses the implementation.
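To make the role of the path-identity tuple concrete, here is a minimal sketch of deriving a stable identity key from `{tenantId, repoSlug, rootKind, sourcePath}`. The helper name `pathIdentityKey`, the path normalization, and the JSON encoding are assumptions for illustration only; the authoritative contract is whatever Phase 0/1 (#11625) defines.

```javascript
// Hypothetical helper: builds a stable string key from the path-identity tuple.
// The field names come from the ticket; everything else is an assumption.
const TUPLE_FIELDS = ['tenantId', 'repoSlug', 'rootKind', 'sourcePath'];

function pathIdentityKey(tuple) {
    for (const field of TUPLE_FIELDS) {
        if (typeof tuple[field] !== 'string' || tuple[field].length === 0) {
            throw new TypeError(`path-identity tuple is missing "${field}"`);
        }
    }

    // Normalize Windows separators and a leading "./" so the same file pushed
    // from different client machines maps to the same key.
    const sourcePath = tuple.sourcePath.replace(/\\/g, '/').replace(/^\.\//, '');

    // JSON-encode the ordered fields so a separator character inside a value
    // cannot make two different tuples collide.
    return JSON.stringify([tuple.tenantId, tuple.repoSlug, tuple.rootKind, sourcePath]);
}
```

Because `tenantId` is part of the key, two tenants pushing the same `sourcePath` can never address each other's chunks through this key.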
The Architectural Reality
This phase touches:
| File | Change |
| --- | --- |
| NEW `ai/services/knowledge-base/KnowledgeBaseIngestionService.mjs` | New singleton service behind shared service layer; orchestrates parsing + content-hash delta + tenant-scoping + Chroma upsert via existing `VectorService.embed` path |
| NEW `ai/mcp/server/knowledge-base/tools/ingestSourceFiles.mjs` (or equivalent registration) | MCP tool: small-batch command-plane facade; threads `viaMcp` per #10572 work-volume gate; returns structured `KB_INGEST_VOLUME_EXCEEDED` response when batch > threshold, pointing to bulk facade |
| NEW `buildScripts/ai/ingest-tenant.mjs` (or `ai/scripts/ingest-tenant.mjs` after structural-pre-flight) | CLI facade for bulk imports + hook bursts; bypasses MCP volume gate; streams `parsed-chunk-v1` records |
| NEW `ai/mcp/server/knowledge-base/tools/ingestStream.mjs` (or HTTP endpoint registration TBD) | HTTP/streaming facade for cross-server tenant push (final transport shape decided during Phase 2 implementation; MCP-tool-only acceptable for V1 if HTTP defers to a follow-up) |
| `SearchService.mjs` | Implement chosen Q12 hydration mode (chunk-metadata-embedded vs server-mirror vs hybrid) |
| `QueryService.mjs` | (Already updated in Phase 0/1 with `where` filter; Phase 2 verifies retrieval flow correctness against new hydration mode) |
| Tenant config storage | Per Q5 from Discussion: tenant-config-node in Native Edge Graph (#10011 substrate) OR `kb-config.yaml` bootstrap OR both; choice deferred to Phase 2 implementation review |
The Fix
1. KnowledgeBaseIngestionService (singleton)
Neo.ai.services.knowledge-base.KnowledgeBaseIngestionService
├─ ingestSourceFiles({tenantId, files: [{path, content, parser?}], deleted?, manifestSnapshot?, baseRevision?, headRevision?})
│ ├─ Validate tenant via AgentIdentity context (#9999)
│ ├─ Apply tenant source/parser config (from Phase 0/1 registry)
│ ├─ Server-shipped parsers → parse raw content server-side
│ ├─ Client-side parsers → validate incoming `parsed-chunk-v1` records (reject `embedding` field)
│ ├─ Server-stamp `{tenantId, visibility, originAgentIdentity?}` into chunk metadata
│ ├─ Apply tombstone/manifest/revision-boundary deletion contract
│ ├─ Route to VectorService.embed (server embeds + Chroma upsert)
│ └─ Return ingestion summary `{ingested, deleted, embeddingsGenerated, errors}`
└─ ingestSourceFilesBulk(...): same contract, no MCP gate, streams response
2. MCP Facade
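Both facades delegate to the shared service from section 1, so its validation and stamping steps are worth making concrete first. The following is a minimal sketch of the "reject `embedding`" and "server-stamp" steps from the pipeline above. The record shape (`{id, content, metadata}`) and the function names are assumptions for illustration; the real record shape is defined by the Phase 0/1 `parsed-chunk-v1` schema.

```javascript
// Illustrative only: validates client-parsed records and stamps tenant metadata.
// The record shape used here is assumed, not taken from the real schema.
function validateParsedChunk(record) {
    if (record === null || typeof record !== 'object') {
        return 'record must be an object';
    }
    if ('embedding' in record) {
        // Per the ticket: clients never ship embeddings; the server embeds.
        return 'client-supplied "embedding" field is not allowed';
    }
    if (typeof record.content !== 'string' || record.content.length === 0) {
        return 'record.content must be a non-empty string';
    }
    return null;
}

function stampChunk(record, {tenantId, visibility, originAgentIdentity}) {
    // Server-owned fields are written last so a client cannot override them.
    const metadata = {...record.metadata, tenantId, visibility};

    if (originAgentIdentity !== undefined) {
        metadata.originAgentIdentity = originAgentIdentity;
    }
    return {...record, metadata};
}

// Splits a batch into stamped, accepted records and per-record errors.
function prepareChunks(records, identity) {
    const accepted = [];
    const errors   = [];

    records.forEach((record, index) => {
        const reason = validateParsedChunk(record);

        if (reason) {
            errors.push({index, reason});
        } else {
            accepted.push(stampChunk(record, identity));
        }
    });
    return {accepted, errors};
}
```

In this sketch the accepted records would then be routed to the embedding path, and `errors` would feed the `errors` field of the ingestion summary.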
`ingestSourceFiles` MCP tool registered via `toolService.mjs`:
- Accepts batches up to `aiConfig.mcpSyncMaxChunks` (default 50; gated by #10572)
- When exceeded: returns structured `{error: 'KB_INGEST_VOLUME_EXCEEDED', message: '...', bulkPath: 'npm run ai:ingest-tenant <tenantId>'}` response
- AgentIdentity-authenticated via standard MCP auth substrate
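The volume gate described above could be sketched roughly as follows. This is not the actual tool registration: `createIngestSourceFilesTool` and the injected `ingest` callback are hypothetical stand-ins for the MCP tool handler and the shared service call, and the sketch assumes batch size is measured as the number of files, which the ticket leaves unspecified.

```javascript
// Illustrative sketch of the small-batch gate. The default of 50 mirrors the
// default stated in the ticket for aiConfig.mcpSyncMaxChunks.
function createIngestSourceFilesTool({ingest, mcpSyncMaxChunks = 50}) {
    return function ingestSourceFilesTool({tenantId, files = []}) {
        if (files.length > mcpSyncMaxChunks) {
            // Structured refusal that points the caller at the bulk facade.
            // Substituting the tenant id into bulkPath is an assumption.
            return {
                error   : 'KB_INGEST_VOLUME_EXCEEDED',
                message : `Batch of ${files.length} exceeds the MCP limit of ${mcpSyncMaxChunks}; use the bulk facade.`,
                bulkPath: `npm run ai:ingest-tenant ${tenantId}`
            };
        }

        // Within the limit: delegate to the shared service, flagged as an MCP call.
        return ingest({tenantId, files, viaMcp: true});
    };
}
```

Returning a structured object rather than throwing lets an agent read `bulkPath` and switch to the CLI without parsing an error string.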
3. Bulk Facade
CLI command `npm run ai:ingest-tenant <tenantId>`:
- Reads `parsed-chunk-v1` records from file or stdin
- Bypasses MCP volume gate (CLI path; `viaMcp: false`)
- Streams progress
- Suitable for initial tenant onboarding + hook bursts > MCP threshold
Optional V1.5: HTTP/streaming endpoint for true cross-server push (deferred decision based on V1 deployment shape).
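A rough sketch of the bulk facade's record intake and batching follows. Newline-delimited JSON is an assumed input format (the ticket only says "from file or stdin"), and `parseRecordLines`, `toBatches`, `ingestBulk`, the `batchSize` default, and the injected `ingest` / `onProgress` callbacks are all hypothetical names used for illustration.

```javascript
// Illustrative only: parses newline-delimited JSON into records, collecting
// per-line errors instead of aborting the whole import.
function parseRecordLines(text) {
    const records = [];
    const errors  = [];

    text.split(/\r?\n/).forEach((line, index) => {
        if (line.trim() === '') {
            return; // tolerate blank lines and a trailing newline
        }
        try {
            records.push(JSON.parse(line));
        } catch (e) {
            errors.push({line: index + 1, reason: e.message});
        }
    });
    return {records, errors};
}

// Yields fixed-size batches so progress can be reported after each one.
function* toBatches(records, batchSize) {
    if (!Number.isInteger(batchSize) || batchSize < 1) {
        throw new RangeError('batchSize must be a positive integer');
    }
    for (let i = 0; i < records.length; i += batchSize) {
        yield records.slice(i, i + batchSize);
    }
}

function ingestBulk({tenantId, text, batchSize = 100, ingest, onProgress = () => {}}) {
    const {records, errors} = parseRecordLines(text);
    let ingested = 0;

    for (const batch of toBatches(records, batchSize)) {
        // viaMcp: false marks the CLI path, which is not subject to the MCP gate.
        ingest({tenantId, records: batch, viaMcp: false});
        ingested += batch.length;
        onProgress({ingested, total: records.length});
    }
    return {ingested, errors};
}
```

A real CLI would read stdin or a file incrementally rather than taking the whole input as one string; the string form is used here only to keep the sketch short and synchronous.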
4. Q12 Search Hydration Resolution
Required architectural choice before Phase 2 retrieval flow lands. Three options from Discussion §4 Q12:
- Option A (chunk-metadata-embedded): store full content in `chunk.metadata.content` (already partially the case); hydrate from chunk, not filesystem. Pro: works regardless of FS layout. Con: chunk metadata size grows ~2-3x.
- Option B (server-mirror): KB server mirrors tenant content locally (per-tenant mount under `/tenants/<tenantId>/<repoSlug>/`). Pro: filesystem-native, preserves existing hydration path. Con: storage cost, sync coordination.
- Option C (hybrid): chunk metadata carries content for small chunks; large chunks reference an on-server mirror. Pro: optimizes both axes. Con: dual-path complexity.
Open — Phase 2 implementation owner decides + V-B-A against per-tenant chunk-size distribution before committing.
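To make the dual-path cost of Option C concrete (not as a recommendation, since the choice is still open), a hybrid scheme might look roughly like this. The size threshold, the `contentRef` field, and the `readMirror` callback are hypothetical; only `metadata.content` and the `/tenants/<tenantId>/<repoSlug>/` mount come from the options above.

```javascript
// Illustrative sketch of Option C. The threshold is a placeholder that would be
// chosen from the per-tenant chunk-size distribution.
const INLINE_CONTENT_MAX_BYTES = 4096;

// Write side: decide whether a chunk carries its content inline or by reference.
function toStoredMetadata(chunk, maxInlineBytes = INLINE_CONTENT_MAX_BYTES) {
    const sizeInBytes = new TextEncoder().encode(chunk.content).length;

    if (sizeInBytes <= maxInlineBytes) {
        return {...chunk.metadata, content: chunk.content};
    }

    // Large chunk: point at the mirrored source file instead. A real scheme
    // would also need the chunk's position inside that file.
    const {tenantId, repoSlug, sourcePath} = chunk.metadata;
    return {...chunk.metadata, contentRef: `/tenants/${tenantId}/${repoSlug}/${sourcePath}`};
}

// Read side: hydrate from metadata when possible, otherwise from the mirror.
function hydrate(metadata, readMirror) {
    if (typeof metadata.content === 'string') {
        return metadata.content;
    }
    if (typeof metadata.contentRef === 'string') {
        return readMirror(metadata.contentRef);
    }
    throw new Error('chunk has neither inline content nor a mirror reference');
}
```

The two branches in `hydrate` are the "dual-path complexity" named as the con: both must be tested, and the mirror branch inherits the sync-coordination concerns of Option B.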
5. Tenant Config Storage Resolution (Q5)
Phase 2 implementation also resolves Q5: Native Edge Graph tenant-config-node (#10011) vs kb-config.yaml vs both. Lean (from Discussion): tenant-config-as-graph-node for canonical state, with optional kb-config.yaml bootstrap for first-deploy.
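If the lean above is adopted, the resolution order could be as simple as the sketch below. `resolveTenantConfig`, the `graphNode` and `bootstrap` inputs, and the `seedGraph` flag are assumptions for illustration: `graphNode` stands in for a tenant-config node read from the Native Edge Graph, and `bootstrap` for an already-parsed `kb-config.yaml`.

```javascript
// Illustrative only: canonical graph state wins; the bootstrap file is a
// first-deploy fallback that should then be persisted into the graph.
function resolveTenantConfig({graphNode = null, bootstrap = null}) {
    if (graphNode) {
        return {source: 'graph', config: graphNode, seedGraph: false};
    }
    if (bootstrap) {
        // First deploy: use the file and signal that the graph node should be
        // created from it, so later reads come from canonical state.
        return {source: 'bootstrap', config: bootstrap, seedGraph: true};
    }
    throw new Error('no tenant config found in graph or bootstrap file');
}
```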
6. Density / UX Measurement Gate
Per Discussion §6 sweep point 5: Phase 2 ships an empirical-evidence-threshold for when per-tenant Chroma sharding would be reopened (e.g., "if median tenant chunk count > X AND query p95 latency > Y at Z tenants → file follow-up Discussion to re-audit per-tenant storage split"). Threshold values empirically tunable during Phase 2 implementation; ticket AC asserts the threshold EXISTS and is documented.
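The gate itself can be a small, documented predicate. In the sketch below, X, Y, and Z are deliberately left as placeholder numbers because the ticket leaves them open, "at Z tenants" is interpreted as "at least Z tenants", and the function names are hypothetical.

```javascript
// Illustrative only: the numbers are placeholders, not proposed thresholds.
const PLACEHOLDER_THRESHOLDS = {
    medianChunkCount: 50000, // X
    queryP95Ms      : 500,   // Y
    minTenantCount  : 20     // Z
};

function median(values) {
    if (values.length === 0) {
        return 0;
    }
    const sorted = [...values].sort((a, b) => a - b);
    const mid    = Math.floor(sorted.length / 2);

    return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// True when all three conditions hold, i.e. when a follow-up Discussion on
// per-tenant storage split should be filed.
function shouldReopenSharding({tenantChunkCounts, queryP95Ms}, thresholds = PLACEHOLDER_THRESHOLDS) {
    return tenantChunkCounts.length >= thresholds.minTenantCount
        && median(tenantChunkCounts) > thresholds.medianChunkCount
        && queryP95Ms > thresholds.queryP95Ms;
}
```

Keeping the thresholds in one named object means tuning them during Phase 2 is a one-line change, and the acceptance criterion "threshold exists and is documented" maps to a single place in the code.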
Acceptance Criteria
- `KnowledgeBaseIngestionService` singleton implemented behind shared service layer
- `ingestSourceFiles` registered + gated by `aiConfig.mcpSyncMaxChunks` (#10572)
- `npm run ai:ingest-tenant <tenantId>` CLI command implemented; streams `parsed-chunk-v1` records
- `parsed-chunk-v1` validation (Phase 0/1 schema; rejects records with `embedding` field outside restore mode)
- [...] `SearchService`; AC body documents rationale + per-tenant chunk-size distribution V-B-A used
- [...] (`mini-neo-workspace/`, `mini-es5-workspace/`, `mini-cpp-workspace/`) push via `ingestSourceFiles` → ingestion → query → tenant-isolation verified
- [...] `private` content
Out of Scope
- Cloud deployment guide → Phase 3 (#TBD, blocked-by this ticket)
- HTTP/streaming endpoint if MCP + CLI proves sufficient for V1 deployment shape (operator-deferred decision)
- WASM/tree-sitter server-side custom-parser sandboxing → future Discussion
- Per-tenant Chroma sharding implementation (Phase 2 ships the measurement gate; sharding triggered only if gate's threshold trips per follow-up Discussion)
Avoided Traps
| Trap | Why rejected |
| --- | --- |
| MCP-only ingestion path | #10572 work-volume gate refuses bulk; structurally cannot handle initial tenant onboarding |
| Bulk-only ingestion path | Loses agent-native command-plane affordance for small hook batches |
| Skipping Q12 resolution | Per Phase 0/1 AC: retrieval flow MUST NOT land before hydration mode chosen; phase-dependency invariant |
| Building HTTP endpoint without V1 shape evidence | Premature optimization; MCP + CLI may suffice for V1; HTTP can land as V1.5 if measurement justifies |
| Skipping density measurement gate | Re-auditing per-tenant storage split needs empirical anchor; baking in the trigger now prevents re-derivation |
Related
- Parent Epic: #11624
- Blocked-by: #11625 (Phase 0/1; contracts must stabilize)
- Origin Discussion: #11623 (archaeological source post-graduation)
- Sibling Epic: #9999 (read-side multi-tenancy substrate)
- Load-bearing dependency: #10572 (MCP work-volume gate threading)
- Identity substrate: #10011 (Native Edge Graph RLS), #9999 (AgentIdentity)
Origin Session ID
7360e917-1733-4cdd-a6f3-5ac51c34b838
Handoff Retrieval Hints
- `query_raw_memories({query: 'KnowledgeBaseIngestionService MCP small-batch bulk facade'})`
- `query_raw_memories({query: 'Q12 search hydration chunk-metadata-embedded server-mirror'})`
- `ask_knowledge_base({query: 'MCP work-volume gate ingestion threshold', type: 'src'})`
- Discussion #11623 §7 Phase 2 + §4 Q3 + §4 Q12 + §6 sweep point 5 are the architectural source-of-authority
- Phase 0/1 (#11625) MUST be merged before this ticket's implementation begins; verify `parsed-chunk-v1.schema.json` + `useDefaultSources`/`useDefaultParsers` configs + memorySharing KB port + byte-equivalence fixture all present in dev