Frontmatter
id: 11631
title: Phase 0/1C — KB Tenant Isolation Write-Side: VectorService Server-Stamping + Tenant-Aware chunkId + Spoof-Rejection
state: Closed
labels: enhancement, ai, architecture
assignees: neo-gpt
createdAt: May 19, 2026, 1:54 PM
updatedAt: May 20, 2026, 8:02 AM
githubUrl: https://github.com/neomjs/neo/issues/11631
author: neo-opus-4-7
commentsCount: 0
parentIssue: 11625
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: [x] 11630 Phase 0/1B — Source/Parser Registry Extraction + Per-Source Path Externalization + Byte-Equivalence Fixture
blocking: [x] 11632 Phase 0/1D — KB Tenant Isolation Read-Side: QueryService/SearchService where-Filter + Fail-Closed Test Suite
closedAt: May 20, 2026, 8:02 AM

Phase 0/1C — KB Tenant Isolation Write-Side: VectorService Server-Stamping + Tenant-Aware chunkId + Spoof-Rejection

neo-opus-4-7 commented on May 19, 2026, 1:54 PM

Context

Sub of Phase 0/1 Epic #11625 (meta-Epic #11624). Graduated from Discussion #11623, Cycle 2.5/2.6 absorption (Gemini + GPT cross-substrate-assumption catches).

Implements the write-side server-stamping security invariant — clients MUST NOT determine their own privilege boundary.

Topology anchor: Per ADR 0003 — Chroma Topology Unified Only, this work writes tenant-scoping metadata to the knowledge-base collection in the unified Chroma daemon (shared by KB + MC MCP servers but with per-server-owned collections). It does NOT mutate topology, does NOT introduce a new Chroma instance, does NOT touch Memory Core collections (neo-agent-memory, neo-agent-sessions).

The Problem

V-B-A confirmed (Cycle 2.5/2.6):

  • memorySharing enum is Memory-Core-only today (MemoryService.mjs:314,388,391,403 + SummaryService.mjs:118,257,260,272)
  • KB has ZERO memorySharing/tenantId references (grep -rn "memorySharing\|tenantId" ai/services/knowledge-base/ → 0 hits)
  • Without server-stamping, a malicious or buggy client can spoof tenantId: 'other-tenant' or visibility: 'team' and tamper with cross-tenant data

Pattern reused (Memory Core's memorySharing semantics), infrastructure new (KB write-side stamping path).

The Fix

In ai/services/knowledge-base/VectorService.mjs:

  1. embed() path (lines 188-274) injects server-derived {tenantId, visibility, originAgentIdentity?} when stamping chunk metadata (metadatas construction at lines 263-269)
  2. Tenant-aware chunkId derivation: chunk.hash computation extended with tenantId + repoSlug so same source content under two tenants yields distinct ids (no cross-tenant chunk-shadow)
  3. Client-supplied tenant field handling: configurable mode (default: server-OVERWRITE + structured warning log; alt: REJECT) — clients may not spoof
  4. Authenticated AgentIdentity context propagation: ingestSourceFiles (Phase 2) → KnowledgeBaseIngestionService (Phase 2) → VectorService.embed passes server-derived AgentIdentity context, NOT client payload
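
Items 1 and 3 above can be sketched as a single stamping step. This is a minimal illustration under stated assumptions, not the actual VectorService code: `stampTenantMetadata`, `SERVER_OWNED_FIELDS`, the `identity` shape (`tenantId`, `visibility`, `agentId`), the mode value `'overwrite'`, and the log entry layout are all hypothetical names chosen for the example. It also assumes that any client-supplied server-owned field triggers the warning or rejection, even when its value happens to match.

```javascript
// Hypothetical sketch: none of these names come from VectorService.mjs.
const SERVER_OWNED_FIELDS = ['tenantId', 'visibility', 'originAgentIdentity'];

/**
 * Returns a copy of the chunk metadata with server-derived tenant fields applied.
 * @param {Object} clientMetadata Metadata as received from the client payload
 * @param {Object} identity       Server-derived context: {tenantId, visibility, agentId?}
 * @param {Object} [options]
 * @param {String} [options.mode='overwrite'] 'overwrite' or 'reject' (assumed values)
 * @param {Object} [options.logger=console]   Any object exposing warn()
 */
function stampTenantMetadata(clientMetadata, identity, {mode = 'overwrite', logger = console} = {}) {
    // Fail closed: without a server-derived context nothing gets stamped.
    if (!identity?.tenantId || !identity?.visibility) {
        throw new Error('Missing server-derived tenant context');
    }

    const serverValues = {tenantId: identity.tenantId, visibility: identity.visibility};

    if (identity.agentId) {
        // Kept as a plain string so the metadata stays flat.
        serverValues.originAgentIdentity = identity.agentId;
    }

    // Which server-owned fields did the client try to set?
    const supplied = SERVER_OWNED_FIELDS.filter(field => Object.hasOwn(clientMetadata, field));

    if (supplied.length > 0) {
        if (mode === 'reject') {
            throw new Error(`Client-supplied tenant fields rejected: ${supplied.join(', ')}`);
        }

        // Structured warning: client value, server value, and acting identity.
        logger.warn({
            event   : 'kb-tenant-field-overwritten',
            supplied: Object.fromEntries(supplied.map(field => [field, clientMetadata[field]])),
            stamped : {...serverValues},
            agentId : identity.agentId ?? null
        });
    }

    // Drop every server-owned field from the client copy, then apply server values.
    const stamped = {...clientMetadata};

    for (const field of SERVER_OWNED_FIELDS) {
        delete stamped[field];
    }

    return Object.assign(stamped, serverValues);
}
```

Stripping the server-owned keys before applying the server values means a client cannot smuggle in `originAgentIdentity` when the server context carries no agent id.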

Neo's curated content is tagged with a shared constant (e.g., tenantId: 'neo-shared'); per-tenant content is tagged with its own <tenantId>.

Acceptance Criteria

  • VectorService.embed injects {tenantId, visibility, originAgentIdentity?} from authenticated AgentIdentity into chunk metadata at upsert
  • Client-supplied tenantId/visibility fields in chunk metadata are server-OVERWRITTEN (default) or REJECTED (configurable via aiConfig.knowledgeBase.spoofRejectionMode)
  • Spoof-overwrite path emits a structured warning log (logger.warn) containing the client-supplied value, the server-overwritten value, and the AgentIdentity
  • Tenant-aware chunkId hash derivation: chunk.hash includes tenantId + repoSlug in createContentHash inputs
  • Backward-compat: existing Neo content (when reformulated with tenantId: 'neo-shared') yields the same effective query results (the byte-equivalence fixture from Phase 0/1B must continue passing)
  • Unit tests:
    • Server-stamping injects correct fields from AgentIdentity context
    • Client-supplied tenantId: 'other-tenant' overwritten + warning logged
    • Same source content under two tenants → distinct chunk ids
    • aiConfig.knowledgeBase.spoofRejectionMode = 'reject' causes error response on client-supplied tenant field
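
The "same source content under two tenants → distinct chunk ids" criterion can be illustrated with a small hashing sketch. The issue names createContentHash but does not show its signature, so `createTenantAwareChunkId` below is a hypothetical stand-in, and SHA-256 over a JSON-encoded tuple is an assumed choice rather than the project's actual derivation.

```javascript
import {createHash} from 'node:crypto';

/**
 * Hypothetical helper: derives a chunk id from tenant, repo, and content.
 * @param {Object} input
 * @param {String} input.tenantId Server-derived tenant id (e.g. 'neo-shared')
 * @param {String} input.repoSlug Repository slug the chunk was ingested from
 * @param {String} input.content  Chunk source content
 * @returns {String} 64-character hex digest
 */
function createTenantAwareChunkId({tenantId, repoSlug, content}) {
    if (!tenantId || !repoSlug) {
        throw new Error('tenantId and repoSlug are required');
    }

    // Encoding the parts as a JSON array keeps the field boundaries unambiguous,
    // so ('a', 'b/c') and ('a/b', 'c') cannot collapse into the same input
    // the way plain string concatenation could.
    const payload = JSON.stringify([tenantId, repoSlug, content]);

    return createHash('sha256').update(payload, 'utf8').digest('hex');
}

// Usage: identical content, different tenants. The repoSlug value is illustrative.
const shared = createTenantAwareChunkId({tenantId: 'neo-shared', repoSlug: 'neomjs/neo', content: '# Grid'});
const tenant = createTenantAwareChunkId({tenantId: 'acme',       repoSlug: 'neomjs/neo', content: '# Grid'});

console.log(shared !== tenant); // true: no cross-tenant chunk-shadow
```

Because the id depends only on its three inputs, re-ingesting unchanged content for the same tenant and repo produces the same id.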

Out of Scope

  • Read-side where filter → Phase 0/1D
  • Fail-closed end-to-end test suite → Phase 0/1D
  • Service-level AgentIdentity context propagation → Phase 2 (this sub mocks via test harness; Phase 2 wires real auth)

Related

  • Parent: #11625
  • Blocked-by: Phase 0/1A (#TBD — schema fields), Phase 0/1B (#TBD — registry interaction with chunkId)
  • Blocks: Phase 0/1D (#TBD — read-side filter validates write-side stamps)
  • Discussion source: #11623 §4 Q13a + §7 Phase 0/1 + §11 Avoided Trap "Client-supplied tenant/visibility trust"
  • Sibling pattern: MemoryService.mjs:391-410 (Memory Core query-time policy pattern)

Origin Session ID

7360e917-1733-4cdd-a6f3-5ac51c34b838

Handoff Retrieval Hints

  • VectorService.mjs:188-274 is the write-side surface
  • query_raw_memories({query: 'memorySharing KB port write-side stamping spoof-rejection'})
  • Mirror the MemoryService.mjs:391-410 policy pattern; KB-side is the new infrastructure half
tobiu referenced in commit 918ee03 - "feat(kb): add write-side tenant stamping (#11631) (#11662)" on May 20, 2026, 8:02 AM
tobiu closed this issue on May 20, 2026, 8:02 AM
tobiu referenced in commit 93276d7 - "feat(kb): read-side tenant isolation filter + fail-closed suite (#11632) (#11674)" on May 20, 2026, 11:03 AM