Context
Surfaced 2026-05-08 during PR #10933 (Phase 3 unit-row re-add), CI run 25524203756. Originally classified as G5#2 in the #10924 G5 triage matrix. The triage hypothesis was "auto-resolve post-G4-merge"; the unit row's first activation in CI falsified it empirically.
Skip-guarded in PR #10933 commit 8e6c3fd78 per the established #10907/#10921/#10928 pattern; this ticket tracks the proper investigation + fix.
The Problem
Test test/playwright/unit/ai/mcp/server/knowledge-base/services/KBRecorderService.spec.mjs:91 flakes with:
TypeError: Cannot read properties of undefined (reading '0')
expect(listed.faqs[0].canonicalQuery).toContain('reactive');
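The message (reading '0') means the index lookup itself failed: listed.faqs is undefined, so KBRecorderService.listAgentFaqs({minCount: 2}) returned a result with no faqs array. An empty array ({faqs: []}) would instead fail one step later, on reading 'canonicalQuery'. The test wrote 3 entries to the singleton's kb_query_log via KBRecorderService.log(...) and called buildAgentFaqs first; both should have populated state.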
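To tell the two candidate result shapes apart from the error text alone, here is a minimal sketch. It assumes a V8-based runtime such as Node.js, and both result objects are hypothetical stand-ins, since the real return value of listAgentFaqs was not captured in the CI log.

```javascript
// Returns the message of the error thrown by fn, or null if fn does not throw.
function messageOf(fn) {
  try {
    fn();
    return null;
  } catch (error) {
    return error instanceof TypeError ? error.message : String(error);
  }
}

// Hypothetical result shapes (assumptions, not captured from the real service).
const missingList = {faqs: undefined};
const emptyList = {faqs: []};

// Indexing into a missing list fails on the index itself.
const missingListMessage = messageOf(() => missingList.faqs[0].canonicalQuery);

// Indexing into an empty list succeeds (yielding undefined); the property read then fails.
const emptyListMessage = messageOf(() => emptyList.faqs[0].canonicalQuery);

console.log(missingListMessage);
console.log(emptyListMessage);
```

The first message names the index '0' and the second names canonicalQuery, so the CI error text is enough to tell which shape the service returned.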
workers:1 substrate amplifies singleton-state pollution. Other specs touch the same singleton (DreamService.spec.mjs writes kb_query_log per grep), and per CI's serial execution they share the SAME KBRecorderService.db connection state. If a sibling spec runs first and either closes/reopens the singleton's DB OR clears kb_query_log/kb_query_faqs tables in its own beforeEach/afterAll, the FAQ-cluster build in this spec returns no rows.
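The close/reopen hypothesis can be modelled without the real service. The sketch below is illustrative only: FakeRecorder is a hypothetical in-memory stand-in, and the silent no-op behaviour on a closed connection is an assumption that the investigation still has to confirm against KBRecorderService.

```javascript
// Hypothetical stand-in for a module-scope singleton; not the real KBRecorderService.
class FakeRecorder {
  constructor() {
    this.open();
  }

  // (Re)creates the in-memory "connection" with empty tables.
  open() {
    this.db = {kb_query_log: [], kb_query_faqs: []};
  }

  // Drops the connection, as a sibling spec's afterAll might.
  close() {
    this.db = null;
  }

  // ASSUMPTION: writes are silently dropped when no connection is open.
  log(query) {
    this.db?.kb_query_log.push(query);
  }

  // Clusters identical queries into FAQ rows with a hit count.
  buildAgentFaqs() {
    if (!this.db) return;
    const counts = new Map();
    for (const query of this.db.kb_query_log) {
      counts.set(query, (counts.get(query) ?? 0) + 1);
    }
    this.db.kb_query_faqs = [...counts].map(([canonicalQuery, count]) => ({canonicalQuery, count}));
  }

  // ASSUMPTION: with no connection, the result carries no faqs list at all.
  listAgentFaqs({minCount}) {
    return {faqs: this.db?.kb_query_faqs.filter(faq => faq.count >= minCount)};
  }
}

// One instance shared by every spec in the worker.
const recorder = new FakeRecorder();

// Replays the flaky test's sequence against the shared instance.
function runFaqTest() {
  recorder.log('reactive config');
  recorder.log('reactive config');
  recorder.log('unrelated query');
  recorder.buildAgentFaqs();
  return recorder.listAgentFaqs({minCount: 2});
}

const cleanRun = runFaqTest();    // connection still open
recorder.close();                 // a sibling spec tears the connection down
const pollutedRun = runFaqTest(); // same test, same worker, polluted state
```

In this model cleanRun.faqs holds the clustered 'reactive config' row, while pollutedRun.faqs is undefined even though the test performed exactly the same writes.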
The Architectural Reality
- test/playwright/unit/ai/mcp/server/knowledge-base/services/KBRecorderService.spec.mjs:91-129: the flaky test
- test/playwright/unit/ai/mcp/server/knowledge-base/services/KBRecorderService.spec.mjs:49-50: beforeEach clears kb_query_log + kb_query_faqs for this spec, but doesn't account for sibling-spec mutations between this spec's tests
- ai/mcp/server/knowledge-base/services/KBRecorderService.mjs: module-scope singleton (export default new KBRecorderService()); the singleton's db connection persists across spec boundaries within one worker
- Sibling spec writers verified via grep: test/playwright/unit/ai/daemons/DreamService.spec.mjs writes kb_query_log
The Fix (TBD via investigation)
Two candidate paths (mirrors the FileSystemIngestor #10934 prescription):
- Spec-level: switch to test.describe.configure({mode: 'serial'}) and add a beforeAll that re-initializes the singleton DB; or extract the singleton import into a per-test factory.
- SDK-level: harden KBRecorderService.listAgentFaqs to return {faqs: []} defensively when the underlying query returns no rows (vs undefined); decouple test isolation from singleton-data isolation by giving each test its own db instance.
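The SDK-level hardening could look roughly like the sketch below. Both functions are hypothetical: queryFaqRows stands in for whatever database access the real service performs, and the db argument is an in-memory object rather than a real connection.

```javascript
// Hypothetical query layer: returns undefined when the connection is unavailable,
// otherwise the matching rows. Stands in for the real database access.
function queryFaqRows(db, minCount) {
  return db?.kb_query_faqs.filter(row => row.count >= minCount);
}

// Hardened variant: always hands callers an array, never undefined.
function listAgentFaqs(db, {minCount = 1} = {}) {
  const rows = queryFaqRows(db, minCount);
  return {faqs: Array.isArray(rows) ? rows : []};
}

const healthyDb = {kb_query_faqs: [{canonicalQuery: 'reactive config', count: 2}]};

const healthy = listAgentFaqs(healthyDb, {minCount: 2});
const degraded = listAgentFaqs(null, {minCount: 2});

console.log(healthy.faqs.length, degraded.faqs.length);
```

This only guarantees callers a stable shape. With {faqs: []} the assertion on faqs[0].canonicalQuery would still fail, so the hardening complements the pollution fix and does not replace it.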
Investigation needed before locking the prescription. The substrate-discipline lesson from PR #10933 review: re-measurement claims for singleton-pollution patterns MUST run with WORKERS=1 locally to match CI substrate.
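For local re-measurement, the worker count has to resolve to the same value CI uses. The helper below is a hypothetical sketch of how a config could honour a WORKERS override; how test/playwright/playwright.config.unit.mjs actually derives its workers value is not shown in this ticket and may differ.

```javascript
// Hypothetical helper; the real unit config may resolve workers differently.
// Returns a positive integer, or undefined to let Playwright pick its default.
function resolveWorkers(env) {
  const requested = Number.parseInt(env.WORKERS ?? '', 10);
  if (Number.isInteger(requested) && requested > 0) return requested;
  return env.CI ? 1 : undefined;
}

console.log(resolveWorkers({WORKERS: '1'})); // explicit local override
console.log(resolveWorkers({CI: 'true'}));   // CI substrate
console.log(resolveWorkers({}));             // default local parallelism
```

Under this assumption, a local run with WORKERS=1 executes specs in a single worker like CI does, which is the precondition for reproducing singleton pollution.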
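Acceptance Criteria
- … WORKERS=1 (matches CI substrate; default-workers parallelism does NOT reproduce)
- … kb_query_log/kb_query_faqs before this test runs
- … 8e6c3fd78
- … npm run test-unit invocations on CI substrate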
Out of Scope
- Migrating KBRecorderService to per-test instances (too aggressive; solve at sibling-spec or SDK layer first)
- Cross-spec singleton-lifecycle audit (separate epic-shaped concern; this ticket is the targeted fix for KBRecorderService specifically)
Avoided Traps
- Increasing retries: 2 → 5: papers over the flake without addressing the root cause
- Disabling workers:1 in CI: that's a feature for deterministic singleton pollution detection, not a bug
- Skip-guard as permanent solution: applied as immediate ship-the-PR move on PR #10933 + tracked here for proper fix
Related
- Surfacing CI run: 25524203756
- Triage origin: #10924 G5 row (G5#2)
- Triage correction: #10924 comment 4401547656
- Sibling state-pollution patterns: #10934 (FileSystemIngestor singleton SQLite-close), G5#3 sibling, #10935 (TransportService residual race)
- Substrate config: test/playwright/playwright.config.unit.mjs workers: 1 in CI
- Skip-guard commit: 8e6c3fd78 on PR #10933
Origin Session ID: 7e897a0b-33ce-4d6c-b1a9-a1ff93e4e571
Retrieval Hint: query_raw_memories(query="KBRecorderService singleton kb_query_log workers 1 flake G5#2 #10924 PR 10933")