Context
Operator ran /opt/homebrew/bin/npm run ai:run-sandman from /Users/Shared/github/neomjs/neo on current dev (dev == origin/dev, 3271fb0281b2cd81fca84f403c0696b76e74d930) on 2026-05-18.
After REM extraction and graph maintenance, Golden Path synthesis failed its semantic query:
[WARN] [GoldenPathSynthesizer] Failed to query semantic vectors from ChromaDB. ChromaClientError: Bad request ... Collection expecting embedding with dimension of 4096, got 3072
at ... GoldenPathSynthesizer.mjs:91:37
The process still exited 0, but Golden Path semantic ranking could not run.
The Problem
GoldenPathSynthesizer.synthesizeGoldenPath() generates frontierEmbedding via:
frontierEmbedding = await TextEmbeddingService.embedText(frontierText, aiConfig.embeddingProvider);
It then directly queries the graph Chroma collection:
await graphColl.query({ queryEmbeddings: [frontierEmbedding], nResults: 20 });
In the observed run, the query vector produced by the configured provider had 3072 dimensions, while the existing neo-native-graph Chroma collection expects 4096, so Chroma rejects the query. This is a recurrence of the older #10003 failure class (4096 vs 3072), but on the Sandman / Golden Path read path rather than the KB sync write path that #10003/#10558 resolved.
The Architectural Reality
- ai/mcp/server/memory-core/config.mjs / template still expose embeddingProvider, provider-specific embedding model keys, and vectorDimension independently.
- TextEmbeddingService.embedText() returns whatever dimension the active provider/model emits; it does not enforce aiConfig.vectorDimension.
- GoldenPathSynthesizer assumes the generated embedding dimension matches the graph collection dimension and only catches the resulting Chroma error after the query.
- Existing docs (learn/agentos/SharedDeployment.md) explain that operators must align NEO_VECTOR_DIMENSION with the active embedding model, but Sandman currently lacks a runtime preflight that turns a mismatch into an actionable health/degraded state before the Golden Path query.
- #10003 closed the KB/Memory Core embedding boundary mismatch for sync flows. This failure proves the Golden Path query path still needs a guard or config-alignment contract.
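The gap between the configured and the emitted dimension can be made observable with a small probe. The following is a minimal sketch under stated assumptions, not the project's implementation: measureEmbeddingDimension and buildEmbeddingHealth are hypothetical names, the injected embedText argument stands in for TextEmbeddingService.embedText(), and the 'healthy' / 'degraded' status labels are illustrative.

```javascript
// Hypothetical helpers for illustration; they do not exist in the repository.

/**
 * Embeds a short probe string and returns the length of the resulting vector.
 * `embedText` is injected so the sketch does not depend on a real provider.
 */
async function measureEmbeddingDimension(embedText, provider) {
    const vector = await embedText('dimension probe', provider);
    return vector?.length ?? 0;
}

/**
 * Compares the configured dimension with a measured one and returns a
 * plain report object that a healthcheck could expose.
 */
function buildEmbeddingHealth({embeddingProvider, vectorDimension}, actualDimension) {
    const aligned = actualDimension === vectorDimension;

    return {
        status             : aligned ? 'healthy' : 'degraded',
        provider           : embeddingProvider,
        configuredDimension: vectorDimension,
        actualDimension,
        detail             : aligned
            ? 'Embedding output matches vectorDimension.'
            : `Provider emits ${actualDimension} dimensions, config expects ${vectorDimension}.`
    };
}
```

With a configured dimension of 4096 and a measured length of 3072, as in the observed run, buildEmbeddingHealth() would report the degraded status together with both numbers, which is the information the Chroma error only surfaces after the failed query.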
The Fix
Implement a runtime guard at the Golden Path / embedding boundary, and decide whether the default config also needs alignment:
- In GoldenPathSynthesizer.synthesizeGoldenPath(), validate frontierEmbedding.length against aiConfig.vectorDimension before querying Chroma. If it differs, log one actionable warning naming provider, model, configured dimension, actual dimension, and remediation (NEO_EMBEDDING_PROVIDER / NEO_VECTOR_DIMENSION / collection rebuild). Then skip the semantic route cleanly.
- Add healthcheck observability if not already present: Memory Core health should surface the actual embedding-vector length for the active provider, not only configured dimensions.
- Re-evaluate config defaults: if embeddingProvider: 'gemini' produces 3072 in the default path, default vectorDimension: 4096 is internally inconsistent unless the provider call requests a 4096-dimensional output. Fix either the default provider/dimension pair or the embedding request shape.
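The preflight from the first bullet could look roughly like the sketch below. It is a hedged illustration only: checkEmbeddingDimension and querySemanticRoute are hypothetical names, the config object is assumed to carry embeddingProvider, embeddingModel, and vectorDimension (the real config exposes provider-specific model keys, so embeddingModel is a simplification), and the warning wording is an example rather than an agreed message.

```javascript
// Hypothetical helper: none of these names exist in the repository as written.
function checkEmbeddingDimension(embedding, {embeddingProvider, embeddingModel, vectorDimension}) {
    // Works for plain arrays and typed arrays; anything else counts as length 0.
    const actual = embedding?.length ?? 0;

    if (actual === vectorDimension) {
        return {ok: true, actual, expected: vectorDimension};
    }

    return {
        ok      : false,
        actual,
        expected: vectorDimension,
        message : `[GoldenPathSynthesizer] Embedding dimension mismatch: provider "${embeddingProvider}" ` +
                  `(model "${embeddingModel}") returned ${actual} dimensions, but vectorDimension is ${vectorDimension}. ` +
                  'Align NEO_EMBEDDING_PROVIDER / NEO_VECTOR_DIMENSION or rebuild the collection. ' +
                  'Skipping the semantic route.'
    };
}

// Sketch of the call site: the Chroma query only runs when the preflight passes.
async function querySemanticRoute(graphColl, frontierEmbedding, config, logger = console) {
    const preflight = checkEmbeddingDimension(frontierEmbedding, config);

    if (!preflight.ok) {
        logger.warn(preflight.message); // one actionable warning, no Chroma round trip
        return null;                    // caller skips the semantic route
    }

    return graphColl.query({queryEmbeddings: [frontierEmbedding], nResults: 20});
}
```

Keeping the comparison in a pure function is a design choice for this sketch: it lets the same check back both the Golden Path guard and a healthcheck, and it can be unit-tested without a Chroma instance.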
Contract Ledger Matrix
| Target Surface | Source of Authority | Proposed Behavior | Fallback | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| GoldenPathSynthesizer semantic query preflight | This ticket + #10003 recurrence | Refuse Chroma query when embedding length differs from configured/collection dimension; emit actionable warning | Current Chroma exception caught after failed query | JSDoc on synthesizeGoldenPath() | Unit test stubs embedText() to 3072 with vectorDimension=4096 and asserts no graphColl.query() call |
| Memory Core embedding health | This ticket + SharedDeployment dimension contract | Healthcheck reports actual provider output dimension when feasible | Config-only dimensions remain visible | learn/agentos/SharedDeployment.md | Targeted health test or documented operator smoke |
| Default embedding config | This ticket + current runtime evidence | Defaults do not pair a 3072 provider output with a 4096 vector dimension | Explicit operator override remains possible | Config template comments | Config/unit test asserts default provider + dimension pair is coherent |
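The Evidence entry for the first row (stub embedText() to 3072, set vectorDimension=4096, assert no graphColl.query() call) might be exercised along these lines. This is a sketch with hand-rolled stubs rather than the project's test setup: semanticRouteUnderTest is a hypothetical, reduced stand-in for the guarded part of synthesizeGoldenPath(), and runMismatchScenario, 'stubProvider', and the warning text are illustrative.

```javascript
// Hypothetical stand-in for the guarded part of synthesizeGoldenPath(),
// reduced to the steps the test cares about.
async function semanticRouteUnderTest({embedText, graphColl, config, logger}) {
    const frontierEmbedding = await embedText('frontier text', config.embeddingProvider);

    if (frontierEmbedding.length !== config.vectorDimension) {
        logger.warn(`Embedding dimension mismatch: got ${frontierEmbedding.length}, expected ${config.vectorDimension}`);
        return null;
    }

    return graphColl.query({queryEmbeddings: [frontierEmbedding], nResults: 20});
}

// Runs the mismatch scenario with stubs and returns what was observed.
async function runMismatchScenario() {
    const queryCalls = [];
    const warnings   = [];

    const result = await semanticRouteUnderTest({
        // Stubbed provider output: 3072 dimensions
        embedText: async () => new Array(3072).fill(0),
        // Stubbed collection: records every query it receives
        graphColl: {query: async args => { queryCalls.push(args); return {ids: [[]]}; }},
        config   : {embeddingProvider: 'stubProvider', vectorDimension: 4096},
        logger   : {warn: message => warnings.push(message)}
    });

    return {result, queryCalls: queryCalls.length, warnings};
}
```

A real test would await runMismatchScenario() and assert that queryCalls is 0 and that exactly one warning was recorded, using whichever test runner the repository already uses.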
Acceptance Criteria
- GoldenPathSynthesizer checks generated embedding length before graphColl.query() and skips with an actionable warning on mismatch.
- A unit test covers frontierEmbedding.length !== aiConfig.vectorDimension and proves the Chroma query is not called.
- The default embeddingProvider / embeddingModel / vectorDimension tuple is coherent, or the mismatch is explicitly marked operator-local and detected at runtime.
- A Sandman run no longer logs "Collection expecting embedding with dimension of 4096, got 3072"; if configuration remains mismatched, it logs the new actionable preflight message instead.
Out of Scope
- Dropping or rebuilding Chroma collections automatically.
- Migrating all historical embeddings.
- Reopening the full #10003 architecture unless the implementation V-B-A proves the unified boundary itself regressed.
Avoided Traps
- Treating this as only operator misconfiguration: rejected. The process can detect actual vector length before Chroma query and produce a useful failure mode.
- Re-filing #10003 wholesale: rejected. #10003 covered KB/Memory Core sync unification; this is the Sandman/Golden Path read-path recurrence.
- Auto-wiping Chroma on mismatch: rejected. Destructive collection rebuilds require explicit operator action and backup discipline.
Related
- #10003 / PR #10558: prior embedding-boundary unification; closed after KB sync succeeded with 4096-dim openAiCompatible embeddings.
- learn/agentos/SharedDeployment.md: documents the NEO_VECTOR_DIMENSION alignment requirement.
- ai/daemons/services/GoldenPathSynthesizer.mjs:75-95
- ai/services/memory-core/TextEmbeddingService.mjs
- Operator Sandman run, 2026-05-18, current dev at 3271fb0281b2cd81fca84f403c0696b76e74d930.
Origin Session ID: 8591bc48-0ddc-48bf-aa47-58e53ea81a57
Retrieval Hint: query_raw_memories("Sandman GoldenPathSynthesizer Chroma embedding dimension mismatch 4096 3072 frontierEmbedding vectorDimension")