Context
Surfaced 2026-05-08 during 5-iteration AC4-strict verification on origin/dev at a771afdb0 ("test(memory-core): Stabilize daemon specs via non-destructive cleanup and engine correction (#10924)" — Gemini's TestLifecycleHelper daemon-spec migration via PR #10940).
The "synthesizeGoldenPath executes without crashing" test was originally a HARD failure under the prior config (aiConfig.engine = 'neo', which skipped ChromaManager init per the April 15/30 Hybrid RAG refactor, so topology.frontier came back undefined). Gemini's fix at a771afdb0 correctly switched aiConfig.engine to 'hybrid' and removed the NEO_TEST_SKIP_CI describe-skip guard, taking the test from never-passing to mostly-passing.
Empirical post-fix state: the test now flakes in 5/5 runs (it fails the first attempt and passes on retry every time). It is NOT a hard fail: Playwright counts it as flaky-but-passed, so the CI exit code stays 0. It still adds noise and retry cost to every CI run.
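Because the exit code stays 0 for a flaky-but-passed run, a verification loop cannot rely on exit status alone. The following is a minimal sketch of a 5-iteration harness, not the tooling actually used for this ticket. It assumes the reporter prints a summary line such as "1 flaky" when a test passes only on retry; classifyRun and runIterations are hypothetical names introduced for illustration.

```javascript
// Classify one test-run result.
// ASSUMPTION: the reporter prints a summary line such as "  1 flaky"
// when a test passed only on retry. Adjust the pattern to the reporter in use.
function classifyRun({status, stdout}) {
    if (status !== 0) return 'failed';
    return /^\s*\d+ flaky\b/m.test(stdout) ? 'flaky' : 'clean';
}

// Run a command N times with CI=true and tally the outcomes.
async function runIterations(command, args, iterations = 5) {
    // Dynamic import works from both CommonJS and ES modules
    const {spawnSync} = await import('node:child_process');
    const tally = {clean: 0, flaky: 0, failed: 0};

    for (let i = 0; i < iterations; i++) {
        const result = spawnSync(command, args, {
            encoding: 'utf8',
            env     : {...process.env, CI: 'true'}
        });
        // stdout is null when the process could not be spawned
        tally[classifyRun({status: result.status, stdout: result.stdout ?? ''})]++;
    }

    return tally;
}

// Usage (mirrors the command named in this ticket):
// runIterations('npm', ['run', 'test-unit']).then(console.log);
```

A tally with flaky > 0 and failed === 0 would match the state described above: green CI, but with a first-attempt failure on every run.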
The Problem
test/playwright/unit/ai/daemons/DreamServiceGoldenPath.spec.mjs:77 (the "synthesizeGoldenPath executes without crashing" test) flakes deterministically across all 5 post-fix iterations. The first-attempt failure mode needs investigation; it likely traces to a race in the DreamService.ingestIssueStates() → synthesizeGoldenPath() → getContextFrontier() lookup at lines 80-93, where the frontier node setup is not yet settled when the topology is read.
The failure is NOT a singleton-pollution / close+null pattern (those were addressed by PR #10940). The fix was scoped to engine-config + helper migration, not to the synchronization semantics inside DreamService's golden-path synthesis pipeline.
The Architectural Reality
- test/playwright/unit/ai/daemons/DreamServiceGoldenPath.spec.mjs:77-94 is the flaky test (the single test in the describe)
- ai/daemons/DreamService.mjs#synthesizeGoldenPath is the producer of the topology being asserted
- ai/mcp/server/memory-core/services/GraphService.mjs#getContextFrontier is the consumer that reads the topology
- Likely race surface: the frontier node is upserted during synthesizeGoldenPath but the read happens before the SQLite-side persistence settles (similar in shape to the WAL-snapshot lag pattern referenced elsewhere in GraphService.spec at the linkNodes cache-warm test)
- The await new Promise(resolve => setTimeout(resolve, 50)) pattern used elsewhere in GraphService.spec for SQLite settling is NOT present in this test
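The hypothesized race can be reproduced in isolation. The sketch below is only an illustration of the suspected shape, not the real DreamService or GraphService code: LaggyStore and synthesizeRacy are hypothetical stand-ins, and the 10ms lag is an arbitrary value chosen to model a write that becomes visible after the producer has already returned.

```javascript
// Hypothetical stand-in for a graph store whose writes become visible
// only after an async delay (NOT the real GraphService).
class LaggyStore {
    constructor(lagMs = 10) {
        this.lagMs   = lagMs;
        this.visible = new Map();
    }

    // Resolves once the write is observable to readers
    upsert(id, node) {
        return new Promise(resolve => {
            setTimeout(() => {
                this.visible.set(id, node);
                resolve();
            }, this.lagMs);
        });
    }

    get(id) {
        return this.visible.get(id);
    }
}

// Racy producer: starts the write but returns before it is observable
async function synthesizeRacy(store) {
    store.upsert('frontier', {type: 'FRONTIER'}); // note: no await
}

async function demo() {
    const store = new LaggyStore(10);

    await synthesizeRacy(store);
    const firstRead = store.get('frontier'); // the read races the write

    // The 50ms settle pattern quoted above
    await new Promise(resolve => setTimeout(resolve, 50));
    const settledRead = store.get('frontier');

    return {firstRead, settledRead};
}
```

In this model the first read misses and the read after the settle succeeds. Whether the real pipeline behaves the same way is exactly what the investigation has to confirm.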
The Fix (Investigation-Shaped)
Two candidate paths:
- Spec-level: add explicit await settling between synthesizeGoldenPath() and the topology read (the 50ms-settle pattern used in sibling specs)
- Producer-level: ensure DreamService.synthesizeGoldenPath() returns a promise that only resolves after the frontier-node + GUIDES edges are observable in GraphService.getContextFrontier(), i.e., propagate the SQLite-write barrier into the synthesizeGoldenPath contract
Option 2 (producer-level) is structurally cleaner, since the test does not paper over a producer-side race, but it has larger scope. Option 1 (spec-level) is the immediate alternative to a Phase 3 (#10939) skip-guard if the investigation is deferred.
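The two candidate paths can be contrasted in a small self-contained sketch. Everything here is hypothetical (createStore, waitUntil, synthesizeWithBarrier, and the node ids are invented for illustration), and the spec-level variant swaps the fixed 50ms sleep for a bounded poll, which is a different technique from the one used in the sibling specs.

```javascript
// Hypothetical store: writes become observable after a short delay
function createStore(lagMs = 10) {
    const visible = new Map();

    return {
        upsert: (id, node) => new Promise(resolve => {
            setTimeout(() => {
                visible.set(id, node);
                resolve();
            }, lagMs);
        }),
        get: id => visible.get(id)
    };
}

// Spec-level: poll until the read succeeds or a deadline passes
// (a bounded-poll variant, swapped in for the fixed 50ms sleep)
async function waitUntil(read, {timeoutMs = 500, intervalMs = 10} = {}) {
    const deadline = Date.now() + timeoutMs;

    for (;;) {
        const value = read();
        if (value !== undefined) return value;
        if (Date.now() >= deadline) throw new Error('value did not settle in time');
        await new Promise(resolve => setTimeout(resolve, intervalMs));
    }
}

// Producer-level: resolve only after every write is observable
async function synthesizeWithBarrier(store) {
    await Promise.all([
        store.upsert('frontier', {type: 'FRONTIER'}),
        store.upsert('frontier->guide', {type: 'GUIDES'})
    ]);
}

async function demoFixes() {
    // Spec-level: the producer stays racy, the test waits for the value
    const racyStore = createStore();
    racyStore.upsert('frontier', {type: 'FRONTIER'}); // un-awaited write
    const viaPolling = await waitUntil(() => racyStore.get('frontier'));

    // Producer-level: an immediate read is safe once the promise resolves
    const barrierStore = createStore();
    await synthesizeWithBarrier(barrierStore);
    const viaBarrier = barrierStore.get('frontier');

    return {viaPolling, viaBarrier};
}
```

The producer-level version moves the guarantee into the contract, so every caller benefits. The spec-level version leaves the producer unchanged and only stabilizes this one test.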
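Acceptance Criteria
- … CI=true npm run test-unit × 5 iterations on origin/dev
- … synthesizeGoldenPath not awaiting all ingest writes, or getContextFrontier not seeing settled SQLite state?
- … CI=true npm run test-unit invocations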
Out of Scope
- Re-investigation of singleton-pollution patterns (covered by #10941 + #10936 + #10937)
- Migration of DreamService to per-test instances (architectural change beyond AC scope)
- Fixing other unmasked-by-#10940 surfaces (one ticket per surface per substrate-discipline)
Avoided Traps
- Treating as duplicate of #10941: rejected. Different consumer surface (DreamService synthesizeGoldenPath vs GraphService.getNeighbors), different root-cause class (synthesis-write race vs singleton-pollution). Both were unmasked by PR #10940 but differ in fix layer.
- Skip-guard restoration: PR #10940 deliberately removed the bucket-G3 skip-guard because the engine-config fix resolves the hard fail. Restoring the skip-guard would lose the empirical signal that the engine fix worked. If Phase 3 needs a skip-guard, it should reference THIS ticket, not bucket-G3.
- Increasing Playwright retries: does not address the root cause; it only papers over the flake.
Related
- Surfacing: post-#10940 verification at a771afdb0 (5-iteration empirical evidence; documented in Phase 3 PR #10939 body when re-filed)
- Substrate fix that caused the unmasking: PR #10940 (TestLifecycleHelper substrate primitive + DreamServiceGoldenPath engine fix)
- Sibling residual: #10941 (GraphService.spec:107 — different surface, different hypothesis class)
- Phase 3 PR: #10939 (will reference this ticket if a skip-guard is needed on first attempt)
- Bucket G epic: #10924
Origin Session ID: 005b6edf-85d8-4980-9e17-486b6b8bed3f
Retrieval Hint: query_raw_memories(query="DreamServiceGoldenPath synthesizeGoldenPath flake post-PR-10940 engine hybrid frontier topology race")