LearnNewsExamplesServices
Frontmatter
id11111
titleChromaManager dedup: extract shared Chroma-client primitive (M6 retrospective)
stateClosed
labels
enhancementairefactoringarchitecturemodel-experience
assigneesneo-opus-4-7
createdAtMay 10, 2026, 3:37 PM
updatedAtMay 25, 2026, 12:04 AM
githubUrlhttps://github.com/neomjs/neo/issues/11111
authorneo-opus-4-7
commentsCount0
parentIssuenull
subIssues[]
subIssuesCompleted0
subIssuesTotal0
blockedBy[]
blocking[]
closedAtMay 25, 2026, 12:04 AM

ChromaManager dedup: extract shared Chroma-client primitive (M6 retrospective)

Closedenhancementairefactoringarchitecturemodel-experience
neo-opus-4-7
neo-opus-4-7 commented on May 10, 2026, 3:37 PM

Context

M6 epic #10986 (Tier-1 SDK migration, closed 2026-05-09) moved both KB and Memory Core ChromaManager files to the SDK boundary without auditing whether the underlying Chroma-client wrapping logic is dedup-eligible. Operator @tobiu surfaced post-merge during the M6 4-item retrospective:

"next item: 2 chroma service(s) [multiple files]. moved to the sdk, a lot of no longer needed duplication."

Empirical State

Two files exist post-M6:

File Extends Caches Purpose
ai/services/knowledge-base/ChromaManager.mjs Base knowledge-base collection (single) KB's vector DB wrapper
ai/services/memory-core/managers/ChromaManager.mjs AbstractVectorManager memory + summary collections (multi) MC's vector DB wrapper

V-B-A on the substrate-correct dedup shape:

  • AbstractVectorManager is Memory-Core-specific by design. className: 'Neo.ai.services.memory-core.managers.AbstractVectorManager'; methods are getMemoryCollection() + getSummaryCollection() — explicitly MC's two-collection abstraction. KB doesn't have a "summary collection" so the abstract contract doesn't apply. Lifting AbstractVectorManager UP is wrong-shape.
  • The actual duplication is at the Chroma-client-wrapping layer (lower than the per-server collection-abstraction layer): both files independently import { ChromaClient }, manage connection state, handle the dummyEmbeddingFunction quirk, and provide getOrCreateCollection-style helpers.

The Problem

Each ChromaManager file independently re-implements:

  • ChromaClient instantiation + connection lifecycle
  • dummyEmbeddingFunction boilerplate (Chroma client warning suppression)
  • getOrCreateCollection semantics (with Chroma-version-specific edge cases)
  • Connection error handling + retries

When Chroma client semantics change (e.g., new API version, new error patterns, new embedding-function requirements), both files need parallel updates. M6 migration mechanically moved both to SDK without consolidating.

The Fix

Extract a shared ChromaClient primitive at ai/services/shared/ChromaClient.mjs (or ai/services/shared/vector/ChromaClient.mjs if a vector/ namespace is preferred). Both ChromaManager files consume the primitive; per-server collection abstractions stay where they are.

Concrete shape:

ai/services/shared/ChromaClient.mjs   (NEW — wraps `import {ChromaClient} from 'chromadb'`)
   ├── connect()
   ├── getOrCreateCollection(name, metadata, embeddingFn)
   ├── ensureDummyEmbedding()
   └── disconnect()

ai/services/knowledge-base/ChromaManager.mjs   (UPDATED — uses shared primitive)
   └── extends Base
   └── getKnowledgeBaseCollection() — KB's single-collection accessor

ai/services/memory-core/managers/ChromaManager.mjs   (UPDATED — uses shared primitive)
   └── extends AbstractVectorManager   (stays MC-specific)
   └── getMemoryCollection() / getSummaryCollection() — MC's multi-collection accessors

The shared primitive owns Chroma-client mechanics. The ChromaManager files own per-server collection-abstraction semantics. Clean separation; future Chroma-client changes propagate via single update; per-server specializations preserved.

Acceptance Criteria

  • ai/services/shared/ChromaClient.mjs created with extracted Chroma-client-wrapping logic + appropriate JSDoc (Anchor & Echo per AGENTS.md §15.2)
  • ai/services/knowledge-base/ChromaManager.mjs refactored to consume the shared primitive (preserves KB's single-collection abstraction)
  • ai/services/memory-core/managers/ChromaManager.mjs refactored to consume the shared primitive (preserves MC's AbstractVectorManager extension + two-collection abstraction)
  • All existing tests pass (test/playwright/unit/ai/services/knowledge-base/ + test/playwright/unit/ai/services/memory-core/)
  • Net line-count reduction across the 2 ChromaManager files (the duplicated boilerplate moves to 1 file)
  • No behavior change in either KB or MC vector-collection access paths (smoke test by booting both MCP servers + exercising vector queries)

Out of Scope

  • Lifting AbstractVectorManager up to a shared namespace — explicitly rejected (see Avoided Traps below); MC-specific by design.
  • Refactoring KB to extend AbstractVectorManager — would force KB to implement getSummaryCollection() it doesn't need; scope-creep + wrong-shape.
  • Replacing Chroma with another vector DB — orthogonal; if needed, file separately.
  • Other vector-store managers (e.g., SQLiteVectorManager referenced in AbstractVectorManager comments) — same AbstractVectorManager family; not the dedup target here.

Avoided Traps / Gold Standards Rejected

Decision Matrix

  1. Shared ChromaClient primitive at ai/services/shared/ (Selected): Extracts the actual duplication (Chroma-client-wrapping mechanics) without forcing MC-specific abstractions onto KB. Substrate-correct because the layer-of-duplication is BELOW the per-server collection-abstraction layer, not at it. Future Chroma-client API changes propagate via single update.

  2. Lift AbstractVectorManager up to shared and have KB extend it: Rejected per V-B-A on AbstractVectorManager — its className + methods are MC-specific (getMemoryCollection + getSummaryCollection); KB doesn't have a "summary collection" so the abstract contract doesn't fit. Lifting it would force KB to implement abstract methods it doesn't need. Empirical anchor: AbstractVectorManager.mjs:13 @class Neo.ai.services.memory-core.managers.AbstractVectorManager — the namespace itself is MC-bound.

  3. Keep duplicates + document the dedup-debt: Rejected. Per @tobiu's "no longer needed duplication" framing; the duplication has zero substrate-quality justification.

  4. Have KB absorb into MC's directory structure: Rejected. Asymmetric (privileges MC over KB); doesn't address the actual duplication (which is at the client-wrapping layer, not at the manager-class layer).

  5. Have MC absorb into KB's structure (extend Base directly, drop AbstractVectorManager): Rejected. AbstractVectorManager exists for genuine multi-engine swap (Chroma vs SQLite-Vec per its docstring); dropping it loses the abstraction substrate.

  • Trap: Treating "everything moves to SDK" as success criterion without auditing dedup-substrate. Rejection: M6 mechanically moved both files; that's file-relocation, not deduplication. The substrate-correct test asks: "what's the actual layer of duplication, and at what level is it shareable?" Sharing at the wrong level (lift AbstractVectorManager up) creates worse problems than no sharing.
  • Trap: Designing the dedup-shape from class-name alone. Rejection: ChromaManager in both files implies "same thing, dedup it" — but the empirical content shows divergent abstractions over a shared primitive. Empirical V-B-A on the actual code structure is the gate.

Related

  • M6 epic #10986 (closed; the migration that moved both files to SDK without dedup audit)
  • @tobiu's M6 4-item retrospective (this session, 2026-05-10)
  • #11107 (sibling architectural-move retrospective — toolService.mjs back to MCP server dirs)
  • #11110 (sibling test-coverage retrospective — github-workflow MCP needs systematic boot/wiring tests)
  • AGENTS.md §3.5 V-B-A core value (the discipline that surfaced the AbstractVectorManager-MC-specificity finding above)
  • ai/services/memory-core/managers/AbstractVectorManager.mjs (the abstraction whose generalizability is the empirical anchor)

Origin Session ID: c2912891-b459-4a03-b2af-154d5e264df1 Retrieval Hint: "ChromaManager dedup", "shared ChromaClient primitive", "AbstractVectorManager memory-core-specific"

tobiu closed this issue on May 25, 2026, 12:04 AM