Frontmatter
id: 11011
title: Retire chromaUnified flag + federated Chroma topology
state: Closed
labels: enhancement, ai, refactoring, architecture, model-experience
assignees: neo-gemini-3-1-pro
createdAt: May 9, 2026, 2:34 PM
updatedAt: May 12, 2026, 4:09 AM
githubUrl: https://github.com/neomjs/neo/issues/11011
author: neo-opus-4-7
commentsCount: 0
parentIssue: 10822
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 9, 2026, 4:33 PM

Retire chromaUnified flag + federated Chroma topology

neo-opus-4-7 commented on May 9, 2026, 2:34 PM

Context

Filed 2026-05-09 to capture the operator ruling on AC3 of #10990 (federated topology deferral). @tobiu's substrate-truth analysis surfaced that the prevailing "deferral with operator-migration carve-out" assumption was over-cautious: chromaUnified was introduced ONLY in dev-branch substrate post-v12.1, so no externally released config has ever exposed the flag to an end user. The latest release (v12.1) shipped federated-only; the unified flag has lived in dev-only territory since being added during the multi-tenancy substrate work.

Because v13 is a major release that explicitly allows breaking changes, AND zero external operator deployments have the flag configured, the cleanest disposition is to delete the flag entirely rather than follow the deprecate-then-retire shape that ConfigSubstrateDeadConfigAudit.md:40 previously scheduled under #10822 Phase 2.

This ticket consolidates the operator ruling (#10990 AC3), simplifies the audit-track (#10822 Phase 2), folds in an architectural decision record, and seeds graph traversal for #10449 Sub-Issue 2 (ArchitectureOverview.md ADR-link audit).

The Problem

The dev branch currently maintains TWO Chroma topology shapes:

  • Federated (legacy): Memory Core has its own ChromaDB instance on port 8001; Knowledge Base has its own on port 8000.
  • Unified (current product default): MC and KB share a single ChromaDB instance on port 8000.

Two-mode complexity costs:

  • ~18 file surface carrying topology-aware branching (if (aiConfig.chromaUnified) cases in ChromaLifecycleService, ChromaManager, HealthService, DeploymentConfig, backfillChromaSharedUserId, plus 2 config templates and 2 deploy compose files).
  • Dedicated test surface: the ChromaLifecycleService unified-mode bypass spec exists explicitly to cover the divergent lifecycle, alongside paired federated-mode test cases in HealthService.spec.mjs and ChromaManager.spec.mjs.
  • Diverging HealthService JSON shape across the two modes (different chroma.{host, port} vs engines.kb.chroma.{host, port} references).
  • Cognitive load on maintainers — every new Chroma consumer must reason about both topologies.
  • Doc divergence in learn/agentos/SharedDeployment.md and learn/agentos/MemoryCore.md carrying federated-mode sections.
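
To make the branching cost concrete, here is a minimal sketch of the kind of topology-aware lookup a Chroma consumer carries while both modes exist. The config literal, the helper name resolveCoordinatesTwoMode, and the hosts are assumptions for illustration; only the flag name, the two coordinate paths, and the ports come from this ticket. It is not the actual ChromaManager code.

```javascript
// Hypothetical two-mode config; only the key paths and ports mirror the ticket.
const aiConfigTwoMode = {
    chromaUnified: true,
    engines: {
        chroma: {host: 'localhost', port: 8001},      // federated: MC-owned instance
        kb: {chroma: {host: 'localhost', port: 8000}} // KB instance, shared in unified mode
    }
};

// Every consumer has to branch on the flag before it can talk to Chroma.
function resolveCoordinatesTwoMode(aiConfig) {
    const source = aiConfig.chromaUnified
        ? aiConfig.engines.kb.chroma // unified: MC shares the KB instance
        : aiConfig.engines.chroma;   // federated: MC uses its own instance

    return {host: source.host, port: source.port};
}

const unifiedCoords   = resolveCoordinatesTwoMode(aiConfigTwoMode);
const federatedCoords = resolveCoordinatesTwoMode({...aiConfigTwoMode, chromaUnified: false});
```

Each additional consumer repeats some variant of this conditional, and each variant needs a test case per mode.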

v13 architectural posture (per learn/agentos/v13-path.md): thin MCP servers (M2 ✓), daemons-in-charge (M3 ✓), flat SDK boundary (M6 ✓). Two-Chroma topology adds dimensional complexity that doesn't earn its slot in any milestone.

Tenant isolation is independent. MemoryService.mjs:96 ships multi-tenant isolation via per-user collections (Epic #9999 / sub #10016 / #10000). That mechanism lives entirely in unified mode unchanged. Federated Chroma adds zero isolation guarantee that collection-level multi-tenancy doesn't already provide; the two-mode complexity is pure dimensional debt.
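
As an illustration of why isolation does not depend on instance topology, the sketch below derives one collection name per user on a single shared instance. The naming scheme and the helper collectionNameFor are hypothetical; the actual mechanism in MemoryService.mjs may differ.

```javascript
// Hypothetical naming scheme: one collection per (base name, user) pair.
function collectionNameFor(baseName, userId) {
    // Assumed rule for this sketch: lowercase, and replace anything outside [a-z0-9_-].
    const safeUser = String(userId).toLowerCase().replace(/[^a-z0-9_-]/g, '_');
    return `${baseName}_${safeUser}`;
}

// Two tenants on the same Chroma instance resolve to distinct collections.
const aliceMemories = collectionNameFor('memories', 'alice');
const bobMemories   = collectionNameFor('memories', 'bob');
```

The separation happens at the collection level, so running one Chroma instance or two does not change which records a given user can reach.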

The Architectural Reality

File surface (verified via grep -rln "chromaUnified" --include="*.mjs"):

| Path | Role |
| --- | --- |
| ai/mcp/server/memory-core/config.template.mjs | Source-of-truth config; declares chromaUnified default |
| ai/mcp/server/memory-core/config.mjs | Operator-instantiated config (gitignored; dev-only artifact) |
| ai/services/memory-core/HealthService.mjs | Topology-aware healthcheck JSON shape branching |
| ai/services/memory-core/lifecycle/ChromaLifecycleService.mjs | startDatabase skipped_unified_mode bypass branch |
| ai/services/memory-core/managers/ChromaManager.mjs | resolveChromaCoordinates topology-routing branch |
| ai/mcp/server/shared/helpers/DeploymentConfig.mjs | Cross-server deployment-shape wiring |
| ai/scripts/backfillChromaSharedUserId.mjs | One-shot operator script with topology branch |

Test surface:

  • test/playwright/unit/ai/services/memory-core/HealthService.spec.mjs — chromaUnified JSON-shape assertions
  • test/playwright/unit/ai/services/memory-core/managers/ChromaManager.spec.mjs — topology-routing test cases
  • test/playwright/unit/ai/services/memory-core/lifecycle/ChromaLifecycleService.spec.mjs: unified-mode bypass spec (entire spec retires)
  • test/playwright/unit/ai/buildScripts/backup.spec.mjs + restore.spec.mjs — backup-path topology cases

Deploy surface:

  • ai/deploy/docker-compose.yml — production compose with federated chroma-mc + unified chroma
  • ai/deploy/docker-compose.test.yml — test fixture with same dual-Chroma shape

Doc surface:

  • learn/agentos/SharedDeployment.md — federated-mode deployment section
  • learn/agentos/MemoryCore.md — federated-mode references
  • learn/agentos/ConfigSubstrateDeadConfigAudit.md:40 — currently classified defer-to-Phase-1.5; updates to removed-in-v13

ADR surface:

  • learn/agentos/decisions/0003-chroma-topology-unified-only.md (NEW) — captures the architectural decision, alternatives considered, rejection rationale

The Fix

Single PR, clean delete:

  1. Delete the flag entirely from config.template.mjs. Remove aiConfig.chromaUnified consumer reads across all 7 source .mjs files.
  2. Collapse Chroma config coordinates:
    • Retire aiConfig.engines.chroma.{host,port} (the federated MC's own Chroma).
    • Rename aiConfig.engines.kb.chroma.{host,port} → aiConfig.engines.chroma.{host,port} as the canonical single coordinate. Drop the kb-specific naming since there's no longer a KB-vs-MC topology split to disambiguate.
  3. Delete topology-divergent code paths:
    • ChromaLifecycleService.startDatabase skipped_unified_mode bypass branch retires
    • ChromaManager.resolveChromaCoordinates topology-routing branch retires
    • HealthService topology-aware JSON-shape branch collapses to single shape
    • DeploymentConfig cross-server topology wiring simplifies
    • backfillChromaSharedUserId.mjs topology branch retires
  4. Delete dedicated tests:
    • ChromaLifecycleService — unified-mode bypass spec retires entirely
    • HealthService.spec.mjs chromaUnified-related assertions collapse to single-shape
    • ChromaManager.spec.mjs topology-routing test cases retire
    • backup.spec.mjs + restore.spec.mjs topology branches retire
  5. Collapse compose templates: docker-compose.yml + docker-compose.test.yml retire the standalone MC chroma-mc service (port 8001); keep only the shared chroma service (port 8000).
  6. Update docs:
    • SharedDeployment.md retires federated-mode section
    • MemoryCore.md retires federated-mode references
    • ConfigSubstrateDeadConfigAudit.md:40 row updates from defer-to-Phase-1.5 to removed-in-v13
  7. Author the ADR at learn/agentos/decisions/0003-chroma-topology-unified-only.md. Captures: operator ruling rationale, alternatives considered (maintain non-MVP diagnostic suite per #10990 branching AC; backwards-compat shim; deprecate-then-retire), rejection rationale (no external deployments to preserve; v13 = major; tenant isolation lives in collection-level multi-tenancy independent of topology; horizontal-scale Chroma is the right substrate if performance becomes a concern, not separate instances).
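
Steps 1 to 3 can be sketched as the following end state: a config with no flag and one coordinate, and a resolver with no topology branch. The config literal, the host value, and the helper name resolveSingleCoordinate are assumptions for illustration; the ticket only fixes the key path aiConfig.engines.chroma.{host,port} and port 8000.

```javascript
// Hypothetical post-retirement config: no chromaUnified flag, one coordinate.
const aiConfigUnifiedOnly = {
    engines: {
        chroma: {host: 'localhost', port: 8000} // formerly engines.kb.chroma
    }
};

// Single-coordinate resolution: there is no mode left to branch on.
function resolveSingleCoordinate(aiConfig) {
    const {host, port} = aiConfig.engines.chroma;
    return {host, port};
}

const coords = resolveSingleCoordinate(aiConfigUnifiedOnly);
```

With one coordinate, the healthcheck payload and the specs only ever see a single shape, which is what lets the paired test cases retire.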

No backwards-compat shim. No deprecation warning. No migration script. Justified by:

  • v12.1 was federated-only (no external operator has chromaUnified configured)
  • KB content ships as zip per release (no Chroma data migration concern across v12.1 → v13 boundary)
  • v13 = major = breaking changes explicitly in-scope
  • Dev-branch state ≠ released-version state; backwards-compat thinking must be scoped to actually-released versions

Contract Ledger Matrix

| Target Surface | Source of Authority | Proposed Behavior | Fallback | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| aiConfig.chromaUnified config field | This ticket + #10990 AC3 ruling | Field deleted entirely; no replacement | None (flag never released externally) | ConfigSubstrateDeadConfigAudit.md updates to removed-in-v13 | grep returns 0 hits post-merge |
| aiConfig.engines.chroma.{host,port} | This ticket | Retire federated coordinate; rename engines.kb.chroma.{host,port} → engines.chroma.{host,port} as canonical | None (federated coordinate never reached external operators) | MemoryCore.md + SharedDeployment.md updated | Healthcheck JSON shape simplifies; spec assertions collapse |
| ChromaLifecycleService.startDatabase | M3 lifecycle contract | Single-shape startDatabase (no skipped_unified_mode) | None | JSDoc on the method | unified-mode bypass spec retires; new single-shape spec |
| ChromaManager.resolveChromaCoordinates | This ticket | Single-coordinate resolution from aiConfig.engines.chroma | None | JSDoc | ChromaManager.spec.mjs topology cases retire |
| HealthService JSON shape | Existing healthcheck contract | Single-topology JSON shape; no topology-aware branching | None | HealthService JSDoc | HealthService.spec.mjs topology assertions collapse |
| docker-compose.{yml,test.yml} | Deploy substrate | Single shared chroma service on port 8000; standalone chroma-mc on port 8001 retired | None | learn/agentos/DeploymentCookbook.md if affected | Compose templates pass docker-compose config validation |
| ADR 0003-chroma-topology-unified-only.md | This ticket + operator ruling | New ADR documenting decision, alternatives, rejection rationale | N/A | learn/agentos/decisions/ | ADR file exists; ArchitectureOverview.md Structural Inventory MC row links to it (Sub-Issue 2 of #10449 territory) |

Acceptance Criteria

  • AC1: aiConfig.chromaUnified field deleted from config.template.mjs; grep -rln "chromaUnified" --include="*.mjs" . returns 0 hits in source tree (excluding archived resources/content/issue-archive/).
  • AC2: aiConfig.engines.chroma.{host,port} retired; aiConfig.engines.kb.chroma.{host,port} renamed to aiConfig.engines.chroma.{host,port} as the canonical single coordinate.
  • AC3: All topology-aware branches removed from ChromaLifecycleService.startDatabase, ChromaManager.resolveChromaCoordinates, HealthService, DeploymentConfig, backfillChromaSharedUserId.mjs.
  • AC4: ChromaLifecycleService — unified-mode bypass spec retires entirely; HealthService.spec.mjs + ChromaManager.spec.mjs + backup/restore spec topology cases retire; new single-shape spec coverage where appropriate.
  • AC5: docker-compose.yml + docker-compose.test.yml collapse to single shared chroma service; standalone chroma-mc service retires.
  • AC6: learn/agentos/SharedDeployment.md + learn/agentos/MemoryCore.md federated-mode sections retire; ConfigSubstrateDeadConfigAudit.md:40 row updates from defer-to-Phase-1.5 to removed-in-v13.
  • AC7: ADR authored at learn/agentos/decisions/0003-chroma-topology-unified-only.md with: decision summary, operator ruling source (#10990 AC3), alternatives considered (backwards-compat shim / deprecate-then-retire / non-MVP diagnostic suite), rejection rationale (no external deployments; v13 major; tenant-isolation independence; horizontal-scale is the perf substrate).
  • AC8: Full unit test run shows zero new failures attributable to this change. test-integration-unified row green post-merge.
  • AC9: AC3 of #10990 marked closed with this ticket as the operator-ruling-implementation reference; #10990 itself can close once federated-test-fixture purge confirms.
  • AC10: #10009 ("Playwright Test Coverage: Federated Cloud Topology") closes with wontfix-equivalent rationale (federated retired, no coverage needed).
  • AC11: #10950 ("Reconcile topology matrix coverage") simplifies — no longer reconciling two topologies, single-topology coverage is the matrix.
  • AC12: #10822 Phase 2 dead-config audit row for chromaUnified updates from defer-to-Phase-1.5 to removed-in-v13.
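
The AC1 check can be approximated in code as well as with grep. The sketch below applies the same rule (only .mjs files, archive directory excluded) to an in-memory map of path to content. The helper findFlagHits and the sample snapshot, including its archive file name, are hypothetical and stand in for a real walk of the source tree.

```javascript
// Paths under this prefix are archived issue content and excluded by AC1.
const ARCHIVE_PREFIX = 'resources/content/issue-archive/';

// Returns the .mjs paths whose content still mentions the flag.
function findFlagHits(files, needle = 'chromaUnified') {
    return Object.entries(files)
        .filter(([path]) => path.endsWith('.mjs') && !path.startsWith(ARCHIVE_PREFIX))
        .filter(([, content]) => content.includes(needle))
        .map(([path]) => path)
        .sort();
}

// Hypothetical in-memory snapshot standing in for the source tree.
const snapshot = {
    'ai/services/memory-core/HealthService.mjs': 'class HealthService {}',
    'resources/content/issue-archive/example.mjs': 'chromaUnified',
    'learn/agentos/MemoryCore.md': 'chromaUnified (historical note)'
};

const hits = findFlagHits(snapshot); // empty when AC1 is satisfied
```

A non-empty result lists the files that still need a consumer read removed before AC1 can pass.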

Out of Scope

  • Sub-Issue 2 of #10449 (ArchitectureOverview.md ADR-link audit / Structural Inventory enrichment) — separate scope. The new ADR 0003-chroma-topology-unified-only.md becomes a candidate for the audit's link list, but the audit ticket owns the wiring.
  • Horizontal-scale Chroma replication — separate hypothesis. The ADR captures "if perf becomes a concern, horizontal scale is the right substrate" as architectural posture, but actual scale-out work is a future ticket.
  • Multi-tenancy substrate changes — collection-level multi-tenancy (Epic #9999) lives in unified mode unchanged.
  • MC backup/restore semantic changes — backup script topology branches retire mechanically; backup contract semantics unchanged.
  • ai/mcp/server/knowledge-base/config.template.mjs chromaUnified retirement is the exception in this list: KB config also reads the flag, so it shares the retirement scope and is included in this PR.
  • External release-note authoring — release notes for v13 will reference this ticket, but the release-note authoring is a separate workflow.
  • learn/agentos/v13-path.md M-milestone updates — if this retirement adds context to the M-milestone narrative, that's a separate doc-only PR.

Avoided Traps

  • Backwards-compat shim (rejected per operator analysis at AC3 of #10990): the shim path was the over-cautious default. v12.1 was federated-only; chromaUnified is dev-only; no external operator has the flag configured. A shim preserves a contract that doesn't exist.
  • Deprecate-then-retire (rejected for same reason): deprecation warnings serve external operators with stale configs; there are zero such operators here.
  • Maintain non-MVP diagnostic suite per #10990 branching AC (rejected): explicit operator ruling at AC3 closes that branch; maintaining a diagnostic-only federated suite keeps the dimensional complexity without product-path value.
  • Phased retirement under #10822 Phase 2 sequencing (simplified): #10822's audit row currently reads "removal after operator migration"; with operator ruling supplying the absent-migration finding, that phasing collapses to immediate-removal.
  • Skipping the ADR (rejected): the ADR is substrate-truth durability for future-self. Future agents asking "why did we drop federated mode?" should hit the ADR, not re-derive from scattered tickets.
  • Title prefix [breaking] (rejected per feedback_no_title_prefix_duplicating_labels): release:v13 label + body context handle the breaking-change framing.
  • Inline title parenthetical "(v13 breaking)" (rejected): label + body framing are sufficient; title budget reserved for subject specificity.

Related

  • Operator ruling source: #10990 AC3 — federated topology deferral ticket; this PR's merge supplies the explicit operator ruling that AC3 awaited.
  • Audit-track parent: #10822 — Config substrate cleanup epic; ConfigSubstrateDeadConfigAudit.md:40 row tracks chromaUnified retirement under Phase 2.
  • Federated-coverage ticket: #10009 — Playwright Test Coverage: Federated Cloud Topology (closes upon retirement of the topology itself).
  • Topology matrix ticket: #10950 — Reconcile topology matrix coverage for Memory Core deployment (simplifies post-retirement).
  • ADR-link audit: #10449 Sub-Issue 2 — ArchitectureOverview.md Structural Inventory ADR-link enrichment; new ADR 0003 becomes a candidate node for the audit's link wiring.
  • Multi-tenancy substrate (independent): #9999, sub #10016, #10000 — collection-level multi-tenancy lives unchanged in unified mode.
  • Dev-only-flag empirical anchor: ConfigSubstrateDeadConfigAudit.md:40 documents chromaUnified as dev-only since introduction.

Origin Session ID: c2912891-b459-4a03-b2af-154d5e264df1

Retrieval Hint: query_raw_memories(query="chromaUnified flag retirement federated Chroma topology v13 breaking change unified-only ADR 0003 single-Chroma multi-tenancy collection-level isolation #10990 AC3 operator ruling #10822 Phase 2")

tobiu referenced in commit c7a4cc3 - "refactor(memory-core): retire chromaUnified topology and enforce unified mode (#11011) (#11014)" on May 9, 2026, 4:33 PM
tobiu closed this issue on May 9, 2026, 4:33 PM