Frontmatter
id: 11496
title: Orchestrator still starts Memory Core Chroma after ADR 0003
state: Closed
labels: bug, ai, regression, architecture, performance, model-experience
assignees: neo-gpt
createdAt: May 16, 2026, 10:57 PM
updatedAt: May 16, 2026, 11:33 PM
githubUrl: https://github.com/neomjs/neo/issues/11496
author: neo-gpt
commentsCount: 1
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 16, 2026, 11:33 PM

Orchestrator still starts Memory Core Chroma after ADR 0003

neo-gpt commented on May 16, 2026, 10:57 PM

Context

The current daemon incident surfaced two separate failure classes:

  1. summary and kbSync can launch in the same restart poll. That is already covered by #11487 / PR #11489.
  2. The orchestrator is still running a second Memory Core Chroma daemon even though ADR 0003 and #11011 retired federated Chroma topology in favor of one shared Chroma instance.

This ticket tracks only the second failure.

The Problem

Live operator logs from 2026-05-16 show the orchestrator adopting both Chroma processes:

Found running chroma daemon process (PID: 2077). Adopting.
Found running memory core chroma process (PID: 2078). Adopting.

V-B-A against the running machine confirmed the topology is real, not log noise:

node 2077 ... chroma run --path .neo-ai-data/chroma/knowledge-base --port 8000
node 2078 ... chroma run --path .neo-ai-data/chroma/memory-core --port 8001

lsof confirmed 2077 listens on 8000 and 2078 listens on 8001. Both processes run from /Users/Shared/github/neomjs/neo.

This directly contradicts the accepted unified-only architecture in ADR 0003.

The Architectural Reality

ADR 0003 (learn/agentos/decisions/0003-chroma-topology-unified-only.md) states:

  • A single ChromaDB daemon is shared by both Knowledge Base and Memory Core.
  • chromaUnified is removed from MCP server configurations.
  • ChromaLifecycleService no longer spawns or manages a local Memory Core daemon.

Current source partially reflects that:

  • ai/services/memory-core/lifecycle/ChromaLifecycleService.mjs logs unified topology and does not spawn Memory Core Chroma.
  • ai/services/memory-core/HealthService.mjs reports topology mode as static unified.

But current source also regresses the topology:

  • ai/daemons/TaskDefinitions.mjs defines a separate memoryCoreChroma task with --path .neo-ai-data/chroma/memory-core --port 8001.
  • ai/daemons/Orchestrator.mjs includes memoryCoreChroma in continuousTasks.
  • ai/mcp/server/memory-core/config.mjs still points engines.chroma at port 8001 and retains chromaUnified / engines.kb.chroma remnants.
  • ai/mcp/server/knowledge-base/config.mjs still retains chromaUnified.

Lineage check: #11011 closed on 2026-05-09, but git blame shows memoryCoreChroma was added later by commit 94cd337c0d on 2026-05-15. This is therefore a post-retirement regression, not unfinished #11011 scope.

The Fix

Restore unified-only topology in the live daemon/runtime path:

  1. Remove the orchestrator-owned memoryCoreChroma continuous task and its PID state from TaskDefinitions / orchestrator expectations.
  2. Point Memory Core's default Chroma client coordinates at the shared Chroma daemon (localhost:8000) instead of a Memory Core-only 8001 daemon.
  3. Remove or neutralize remaining chromaUnified / federated mirror-block config surfaces that survived #11011 where they affect runtime behavior.
  4. Add focused unit coverage proving the orchestrator no longer defines or supervises memoryCoreChroma and Memory Core defaults resolve to the shared Chroma port.
  5. Preserve collection separation (neo-agent-memory, neo-agent-sessions, neo-knowledge-base) inside the single Chroma daemon; do not collapse logical collections.
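
Step 2 can be illustrated with a small resolver. This is a sketch under stated assumptions: the function name `resolveChromaCoordinates`, the constant name, and the choice to throw on an invalid port are hypothetical. Only the default `localhost:8000` and the `NEO_CHROMA_HOST` / `NEO_CHROMA_PORT` override names come from this ticket.

```javascript
// Hypothetical resolver; the real defaults live in
// ai/mcp/server/memory-core/config.mjs and may be structured differently.
const SHARED_CHROMA_DEFAULTS = Object.freeze({host: 'localhost', port: 8000});

function resolveChromaCoordinates(env = {}) {
    const host = env.NEO_CHROMA_HOST || SHARED_CHROMA_DEFAULTS.host;
    const rawPort = env.NEO_CHROMA_PORT;

    // No override: fall back to the shared daemon port, never 8001.
    if (rawPort === undefined || rawPort === '') {
        return {host, port: SHARED_CHROMA_DEFAULTS.port};
    }

    const port = Number(rawPort);

    // Rejecting malformed overrides is an assumption of this sketch.
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
        throw new RangeError(`Invalid NEO_CHROMA_PORT: ${rawPort}`);
    }

    return {host, port};
}

// Default resolution: the shared Chroma daemon.
const defaults = resolveChromaCoordinates({});

// Explicit operator override via the canonical env vars.
const overridden = resolveChromaCoordinates({
    NEO_CHROMA_HOST: 'chroma.internal',
    NEO_CHROMA_PORT: '9000'
});
```

In a real process the caller would pass `process.env`; taking the environment as a parameter keeps the default-versus-override behavior easy to cover in a unit test.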

Contract Ledger Matrix

| Target Surface | Source of Authority | Proposed Behavior | Fallback | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| Orchestrator Chroma process supervision | ADR 0003 + #11011 | Only the shared chroma daemon is supervised by the orchestrator | Manual operator can still restart npm run ai:orchestrator; no memoryCoreChroma task exists | ADR 0003 | Unit test: task definitions and orchestrator state omit memoryCoreChroma |
| Memory Core Chroma client coordinates | ADR 0003 + HealthService.buildTopologyBlock() | Memory Core connects to shared engines.chroma on localhost:8000 by default | Operator override via canonical NEO_CHROMA_HOST / NEO_CHROMA_PORT remains available | Deployment docs / config comments | Unit test: config default port is 8000 |
| Logical collection separation | Memory Core / KB collection configs | KB and MC share the daemon but keep distinct collection names | No fallback; collection collision is a correctness bug | Config JSDoc | Existing collection-name tests plus targeted assertion if needed |
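
The "targeted assertion" for the collection-separation row could look like the sketch below. The collection names are the ones listed in this ticket; grouping them by owner and the helper name `findCollectionCollisions` are assumptions made for illustration.

```javascript
// Collection names as listed in this ticket; the owner grouping is assumed.
const collections = {
    memoryCore: ['neo-agent-memory', 'neo-agent-sessions'],
    knowledgeBase: ['neo-knowledge-base']
};

// Returns every collection name that is claimed by more than one owner.
function findCollectionCollisions(collectionsByOwner) {
    const ownersByName = new Map(); // collection name -> Set of owners

    for (const [owner, names] of Object.entries(collectionsByOwner)) {
        for (const name of names) {
            if (!ownersByName.has(name)) {
                ownersByName.set(name, new Set());
            }
            ownersByName.get(name).add(owner);
        }
    }

    return [...ownersByName]
        .filter(([, owners]) => owners.size > 1)
        .map(([name]) => name);
}

// Sharing one daemon is fine as long as this list stays empty.
const collisions = findCollectionCollisions(collections);
```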

Acceptance Criteria

  • buildTaskDefinitions() no longer returns a memoryCoreChroma task.
  • Orchestrator continuous supervision no longer includes memoryCoreChroma.
  • Memory Core's default engines.chroma.port resolves to the shared Chroma port (8000) unless explicitly overridden by canonical env vars.
  • Remaining runtime chromaUnified / federated mirror-block reads are removed or proven inert with tests.
  • Focused Playwright unit coverage passes for daemon task definitions and Memory Core Chroma defaults.
  • PR body includes operator validation note: after merge/restart, ps / lsof should show one Chroma listener for KB + MC, not two.
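
The operator validation in the last criterion can be scripted. A minimal sketch, assuming process listings in the abbreviated `command pid ... chroma run --path <dir> --port <n>` form quoted earlier in this ticket; real `ps` output has more columns and may need a different pattern. The helper name `parseChromaProcesses` is hypothetical.

```javascript
// Extracts Chroma daemons from a process listing. Assumes the PID is the
// second whitespace-separated column, as in the lines quoted in this ticket.
function parseChromaProcesses(psOutput) {
    const pattern = /^\S+\s+(\d+)\s.*\bchroma run\b.*?--path\s+(\S+).*?--port\s+(\d+)/;

    return psOutput
        .split('\n')
        .map(line => pattern.exec(line.trim()))
        .filter(Boolean)
        .map(([, pid, path, port]) => ({pid: Number(pid), path, port: Number(port)}));
}

// The two lines observed on 2026-05-16 (elisions kept as in the ticket).
const observed = [
    'node 2077 ... chroma run --path .neo-ai-data/chroma/knowledge-base --port 8000',
    'node 2078 ... chroma run --path .neo-ai-data/chroma/memory-core --port 8001'
].join('\n');

const chromaProcesses = parseChromaProcesses(observed);

// After the fix and a restart, exactly one listener is expected.
const topologyIsUnified = chromaProcesses.length === 1;
```

For the observed listing, `chromaProcesses` has two entries (ports 8000 and 8001), so `topologyIsUnified` is `false`.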

Out of Scope

  • Does not solve the parallel summary + kbSync startup race; #11487 / PR #11489 owns that.
  • Does not kill live PIDs as part of the PR. Operational cleanup should happen manually after the code fix lands.
  • Does not migrate existing vector data between .neo-ai-data/chroma/memory-core and .neo-ai-data/chroma/knowledge-base; if data migration is needed, file a separate explicit migration ticket.
  • Does not change embedding provider selection or vector dimensions.

Related

  • #9999 — Cloud-Native Knowledge & Multi-Tenant Memory Core
  • #11011 — Retire chromaUnified flag + federated Chroma topology
  • #11487 / PR #11489 — maintenance backpressure for parallel heavy tasks
  • #11471 — stale MLX Gemma 2 model source fix; distinct from this live stale-process/topology issue
  • ADR 0003 — learn/agentos/decisions/0003-chroma-topology-unified-only.md

Handoff Retrieval Hints: memoryCoreChroma TaskDefinitions ADR 0003 unified-only Chroma regression, orchestrator adopts memory core chroma PID 2078 port 8001, 94cd337c0d added memoryCoreChroma after #11011

tobiu referenced in commit 91983d6 - "fix(ai): remove memory core chroma daemon task (#11496) (#11499)" on May 16, 2026, 11:33 PM
tobiu closed this issue on May 16, 2026, 11:33 PM