Frontmatter
id: 11596
title: Golden Path misses embedding dimension drift
state: Closed
labels: bug, ai, regression, architecture
assignees: neo-gpt
createdAt: May 18, 2026, 11:19 PM
updatedAt: May 19, 2026, 8:33 AM
githubUrl: https://github.com/neomjs/neo/issues/11596
author: neo-gpt
commentsCount: 1
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 19, 2026, 8:33 AM

Golden Path misses embedding dimension drift

Closed | v13.0.0/archive-v13-0-0-chunk-12 | bug, ai, regression, architecture
neo-gpt
neo-gpt commented on May 18, 2026, 11:19 PM

Context

Operator ran /opt/homebrew/bin/npm run ai:run-sandman from /Users/Shared/github/neomjs/neo on current dev (dev == origin/dev, 3271fb0281b2cd81fca84f403c0696b76e74d930) on 2026-05-18.

After REM extraction and graph maintenance, Golden Path synthesis failed its semantic query:

[WARN] [GoldenPathSynthesizer] Failed to query semantic vectors from ChromaDB. ChromaClientError: Bad request ... Collection expecting embedding with dimension of 4096, got 3072
    at ... GoldenPathSynthesizer.mjs:91:37

The process still exited 0, but Golden Path semantic ranking could not run.

The Problem

GoldenPathSynthesizer.synthesizeGoldenPath() generates frontierEmbedding via:

frontierEmbedding = await TextEmbeddingService.embedText(frontierText, aiConfig.embeddingProvider);

It then directly queries the graph Chroma collection:

await graphColl.query({ queryEmbeddings: [frontierEmbedding], nResults: 20 });

In the observed run, the query vector produced by the configured provider had 3072 dimensions, while the existing neo-native-graph Chroma collection expects 4096, so Chroma rejected the query. This is a recurrence of the older #10003 failure class (4096 vs 3072), but on the Sandman / Golden Path read path rather than the KB sync write path that #10003 / PR #10558 resolved.

The Architectural Reality

  • ai/mcp/server/memory-core/config.mjs / template still expose embeddingProvider, provider-specific embedding model keys, and vectorDimension independently.
  • TextEmbeddingService.embedText() returns whatever dimension the active provider/model emits; it does not enforce aiConfig.vectorDimension.
  • GoldenPathSynthesizer assumes the generated embedding dimension matches the graph collection dimension and only catches the resulting Chroma error after query.
  • Existing docs (learn/agentos/SharedDeployment.md) explain that operators must align NEO_VECTOR_DIMENSION with the active embedding model, but Sandman currently lacks a runtime preflight that turns mismatch into an actionable health/degraded state before Golden Path query.
  • #10003 closed the KB/Memory Core embedding boundary mismatch for sync flows. This failure proves the Golden Path query path still needs a guard or config-alignment contract.
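
Because the provider and vectorDimension are set independently, nothing stops a config from pairing them incoherently. The sketch below illustrates a static coherence check under stated assumptions: findDimensionDrift and observedOutputDimensions are hypothetical names, and the two dimension values are only the ones reported in this ticket (3072 for the gemini path in the observed run, 4096 for openAiCompatible in the #10003 follow-up), not values taken from provider documentation.

```javascript
// Hypothetical lookup table. The values are the ones reported in this ticket
// and would need to be verified per provider and model.
const observedOutputDimensions = new Map([
    ['gemini', 3072],
    ['openAiCompatible', 4096]
]);

/**
 * Hypothetical static check: does the configured provider's known output
 * dimension match the configured vectorDimension?
 * Returns coherent: true | false, or null when the provider is unknown.
 */
function findDimensionDrift(config, knownDimensions = observedOutputDimensions) {
    const provider = config.embeddingProvider;

    if (!knownDimensions.has(provider)) {
        return {coherent: null, reason: `No known output dimension for "${provider}"`};
    }

    const emitted = knownDimensions.get(provider);

    if (emitted === config.vectorDimension) {
        return {coherent: true};
    }

    return {
        coherent: false,
        reason  : `"${provider}" emits ${emitted} dimensions, vectorDimension is ${config.vectorDimension}`
    };
}
```

A static table like this can only cover known provider/model pairs, which is why it complements a runtime check on the actual vector length and does not replace it.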

The Fix

Implement a runtime guard at the Golden Path / embedding boundary, and decide whether the default config also needs alignment:

  1. In GoldenPathSynthesizer.synthesizeGoldenPath(), validate frontierEmbedding.length against aiConfig.vectorDimension before querying Chroma. If it differs, log one actionable warning naming provider, model, configured dimension, actual dimension, and remediation (NEO_EMBEDDING_PROVIDER / NEO_VECTOR_DIMENSION / collection rebuild). Then skip the semantic route cleanly.
  2. Add healthcheck observability if not already present: Memory Core health should surface actual embedding-vector length for the active provider, not only configured dimensions.
  3. Re-evaluate config defaults: if embeddingProvider: 'gemini' produces 3072 in the default path, default vectorDimension: 4096 is internally inconsistent unless the provider call requests a 4096-dimensional output. Fix either the default provider/dimension pair or the embedding request shape.
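
Step 1 could look roughly like the following. This is a minimal sketch, not the merged fix: checkEmbeddingDimension and querySemanticRoute are hypothetical names, and the model argument and the injected logger are assumptions. Only frontierEmbedding, aiConfig.embeddingProvider, aiConfig.vectorDimension, the graphColl.query() call shape, and the environment variable names come from this ticket.

```javascript
/**
 * Hypothetical preflight: compares the actual embedding length with the
 * configured dimension and builds one actionable message on mismatch.
 */
function checkEmbeddingDimension({embedding, expected, provider, model}) {
    const actual = embedding?.length ?? 0;

    if (actual === expected) {
        return {ok: true, actual, expected};
    }

    return {
        ok: false,
        actual,
        expected,
        message: `[GoldenPathSynthesizer] Skipping semantic route: provider "${provider}" ` +
            `(model "${model}") produced ${actual} dimensions, but vectorDimension is ${expected}. ` +
            'Align NEO_EMBEDDING_PROVIDER / NEO_VECTOR_DIMENSION or rebuild the collection.'
    };
}

/**
 * Hypothetical wrapper around the query shown in "The Problem".
 * Resolves to null when the semantic route is skipped.
 */
async function querySemanticRoute({graphColl, frontierEmbedding, aiConfig, model, logger = console}) {
    const check = checkEmbeddingDimension({
        embedding: frontierEmbedding,
        expected : aiConfig.vectorDimension,
        provider : aiConfig.embeddingProvider,
        model
    });

    if (!check.ok) {
        logger.warn(check.message); // one warning, no Chroma round trip
        return null;
    }

    return graphColl.query({queryEmbeddings: [frontierEmbedding], nResults: 20});
}
```

With the values from the observed run (a 3072-element vector against vectorDimension 4096), querySemanticRoute resolves to null and logs once instead of reaching Chroma. Note that this compares against the configured dimension; if the collection was built with a different dimension than the config states, Chroma would still reject the query.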

Contract Ledger Matrix

| Target Surface | Source of Authority | Proposed Behavior | Fallback | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| GoldenPathSynthesizer semantic query preflight | This ticket + #10003 recurrence | Refuse Chroma query when embedding length differs from configured/collection dimension; emit actionable warning | Current Chroma exception caught after failed query | JSDoc on synthesizeGoldenPath() | Unit test stubs embedText() to 3072 with vectorDimension=4096 and asserts no graphColl.query() call |
| Memory Core embedding health | This ticket + SharedDeployment dimension contract | Healthcheck reports actual provider output dimension when feasible | Config-only dimensions remain visible | learn/agentos/SharedDeployment.md | Targeted health test or documented operator smoke |
| Default embedding config | This ticket + current runtime evidence | Defaults do not pair a 3072 provider output with a 4096 vector dimension | Explicit operator override remains possible | Config template comments | Config/unit test asserts default provider + dimension pair is coherent |
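
For the "Memory Core embedding health" row, a probe along these lines might work. It is a sketch under assumptions: probeEmbeddingDimension, the report shape, and the status strings are hypothetical, and embedText is injected with the (text, provider) call shape that this ticket shows for TextEmbeddingService.embedText().

```javascript
/**
 * Hypothetical health probe: embeds a short string and reports the actual
 * vector length next to the configured dimension.
 */
async function probeEmbeddingDimension(embedText, aiConfig, probeText = 'dimension probe') {
    const report = {
        provider           : aiConfig.embeddingProvider,
        configuredDimension: aiConfig.vectorDimension,
        actualDimension    : null,
        status             : 'unknown'
    };

    try {
        const vector = await embedText(probeText, aiConfig.embeddingProvider);

        report.actualDimension = vector.length;
        report.status = vector.length === aiConfig.vectorDimension ? 'healthy' : 'degraded';
    } catch (error) {
        // Provider unreachable: the config-only view stays visible (the table's fallback).
        report.error = error.message;
    }

    return report;
}
```

The probe costs one embedding call per health check, so whether it runs on every check or only on demand is a design decision this sketch leaves open.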

Acceptance Criteria

  • GoldenPathSynthesizer checks generated embedding length before graphColl.query() and skips with an actionable warning on mismatch.
  • Unit test covers frontierEmbedding.length !== aiConfig.vectorDimension and proves Chroma query is not called.
  • Unit or healthcheck test surfaces actual active embedding length for at least one provider path, or documents why live smoke is required instead.
  • Config defaults are audited so the default embeddingProvider / embeddingModel / vectorDimension tuple is coherent, or the mismatch is explicitly marked operator-local and detected at runtime.
  • Follow-up Sandman run no longer logs raw Chroma Collection expecting embedding with dimension of 4096, got 3072; if configuration remains mismatched, it logs the new actionable preflight message instead.
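
The second criterion could be exercised with a test shaped like the one below. It is a self-contained sketch, not the project's test suite: semanticRoute is a hypothetical, dependency-injected stand-in for the guarded part of synthesizeGoldenPath(), expect is a minimal assertion helper used in place of a real test runner, and the warning text is illustrative.

```javascript
// Minimal assertion helper; a real suite would use the project's test runner.
function expect(condition, message) {
    if (!condition) throw new Error(message);
}

// Hypothetical stand-in for the guarded part of synthesizeGoldenPath().
async function semanticRoute({embedText, graphColl, aiConfig, logger}) {
    const frontierEmbedding = await embedText('frontier text', aiConfig.embeddingProvider);

    if (frontierEmbedding.length !== aiConfig.vectorDimension) {
        logger.warn(`Embedding dimension mismatch: got ${frontierEmbedding.length}, ` +
            `configured ${aiConfig.vectorDimension}. Check NEO_EMBEDDING_PROVIDER / NEO_VECTOR_DIMENSION.`);
        return null;
    }

    return graphColl.query({queryEmbeddings: [frontierEmbedding], nResults: 20});
}

async function testSkipsQueryOnMismatch() {
    const warnings = [];
    let queryCalls = 0;

    const result = await semanticRoute({
        embedText: async () => new Array(3072).fill(0), // stubbed provider output
        graphColl: {query: async () => { queryCalls++; return {}; }},
        aiConfig : {embeddingProvider: 'gemini', vectorDimension: 4096},
        logger   : {warn: message => warnings.push(message)}
    });

    expect(result === null, 'semantic route should be skipped');
    expect(queryCalls === 0, 'graphColl.query() must not be called');
    expect(warnings.length === 1, 'exactly one warning expected');
    expect(warnings[0].includes('3072') && warnings[0].includes('4096'), 'warning should name both dimensions');
}
```

Injecting embedText, graphColl, and the logger keeps the test free of network and Chroma dependencies. Whether the real synthesizer exposes such seams is an assumption of this sketch.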

Out of Scope

  • Dropping or rebuilding Chroma collections automatically.
  • Migrating all historical embeddings.
  • Reopening the full #10003 architecture unless the implementation V-B-A proves the unified boundary itself regressed.

Avoided Traps

  • Treating this as only operator misconfiguration — rejected. The process can detect actual vector length before Chroma query and produce a useful failure mode.
  • Re-filing #10003 wholesale — rejected. #10003 covered KB/Memory Core sync unification; this is the Sandman/Golden Path read-path recurrence.
  • Auto-wiping Chroma on mismatch — rejected. Destructive collection rebuilds require explicit operator action and backup discipline.

Related

  • #10003 / PR #10558 — prior embedding-boundary unification; closed after KB sync succeeded with 4096-dim openAiCompatible embeddings.
  • learn/agentos/SharedDeployment.md — documents NEO_VECTOR_DIMENSION alignment requirement.
  • ai/daemons/services/GoldenPathSynthesizer.mjs:75-95
  • ai/services/memory-core/TextEmbeddingService.mjs
  • Operator Sandman run, 2026-05-18, current dev at 3271fb0281b2cd81fca84f403c0696b76e74d930.

Origin Session ID: 8591bc48-0ddc-48bf-aa47-58e53ea81a57
Retrieval Hint: query_raw_memories("Sandman GoldenPathSynthesizer Chroma embedding dimension mismatch 4096 3072 frontierEmbedding vectorDimension")

tobiu referenced in commit 9c934c8 - "fix(memory-core): guard Golden Path embedding dimensions (#11596) (#11613)" on May 19, 2026, 8:33 AM
tobiu closed this issue on May 19, 2026, 8:33 AM