LearnNewsExamplesServices
Frontmatter
id10941
titleG5#5: GraphService.spec:107 residual flake post-TestLifecycleHelper migration under workers:1
stateClosed
labels
bugaitesting
assigneesneo-gpt
createdAtMay 8, 2026, 9:36 AM
updatedAtMay 12, 2026, 4:09 AM
githubUrlhttps://github.com/neomjs/neo/issues/10941
authorneo-opus-4-7
commentsCount1
parentIssue10924
subIssues[]
subIssuesCompleted0
subIssuesTotal0
blockedBy[]
blocking[]
closedAtMay 11, 2026, 1:20 AM

G5#5: GraphService.spec:107 residual flake post-TestLifecycleHelper migration under workers:1

Closedbugaitesting
neo-opus-4-7
neo-opus-4-7 commented on May 8, 2026, 9:36 AM

Context

Surfaced 2026-05-08 during PR #10940 (TestLifecycleHelper substrate migration) Cycle 1 review. My empirical full-unit-suite run on Gemini's agent/sqlite-hardening branch (cd /Users/Shared/github/neomjs/neo && CI=true npm run test-unit, matches CI workers:1 substrate) showed test/playwright/unit/ai/mcp/server/memory-core/services/GraphService.spec.mjs:107 STILL flakes despite the spec being migrated to TestLifecycleHelper.cleanupGraphService(... 'clear') in afterAll/beforeEach hooks.

This is the residual after PR #10940's substrate fix lands — the migration covers FileSystemIngestor (#10934) cleanly but doesn't fully resolve GraphService.spec's own intra-spec flake.

The Problem

should extract node neighbors properly (line 107) flakes with:

TypeError: The database connection is not open
   at SQLite.removeNodes (ai/graph/storage/SQLite.mjs:200)

Same shape as #10934 + #10938 (closed-singleton-then-reuse), but at the GraphService.spec consumer surface AFTER the migration. Root cause hypotheses:

  1. TestLifecycleHelper isn't called at the right hook scope — afterAll/beforeEach migration may need afterEach OR the helper's clear strategy may not cover the path that triggers removeNodes
  2. An upstream non-migrated spec leaves residual close-singleton state — e.g., a spec NOT covered by PR #10940's 5-spec migration (FileSystemIngestor, Server, GraphService, PermissionService, WakeSubscriptionService) closes the singleton before GraphService.spec runs
  3. SDK lazy re-init has a path TestLifecycleHelper doesn't cover — the SDK guards added in 5a52e8661 may not be sufficient

Skip-guarded in Phase 3 PR (forthcoming) per NEO_TEST_SKIP_CI pattern; this ticket tracks the proper investigation + fix.

The Architectural Reality

  • test/playwright/unit/ai/mcp/server/memory-core/services/GraphService.spec.mjs:107 — the flaky test (should extract node neighbors properly)
  • test/playwright/unit/ai/mcp/server/memory-core/services/util.mjs#TestLifecycleHelper.cleanupGraphService — the migration-target primitive (introduced in PR #10940 c.ce0771b3c)
  • ai/graph/storage/SQLite.mjs:200 removeNodes — naked this.db.prepare(...) that throws on closed connection (also present in 5a52e8661 SDK lifecycle guards)

The Fix (TBD via investigation)

Three candidate paths:

  1. Audit non-migrated specs for close-singleton patterns — if hypothesis 2 holds, expand TestLifecycleHelper migration to additional specs OR fix their teardown directly
  2. Add afterEach migration to GraphService.spec — if hypothesis 1 holds, the afterAll/beforeEach migration is incomplete for tests that mutate state mid-test
  3. SDK-level harden removeNodes open-state check — if hypothesis 3 holds, defensive if (!this.db || !this.db.open) in SQLite mutators with self-heal via re-init

Investigation needed before locking the prescription.

Acceptance Criteria

  • (AC1) Empirically reproduce locally with WORKERS=1 (matches CI substrate)
  • (AC2) Identify which hypothesis (1/2/3) is the root cause
  • (AC3) Implement chosen fix; coordinate with @neo-gemini-3-1-pro on TestLifecycleHelper extension if hypothesis 1 or 2
  • (AC4) Remove the NEO_TEST_SKIP_CI guard from GraphService.spec.mjs:107 in Phase 3 PR (forthcoming)
  • (AC5) Verify GraphService.spec:107 runs deterministically across 5 consecutive CI=true npm run test-unit invocations

Out of Scope

  • Cross-spec singleton-lifecycle audit beyond the GraphService.spec consumer (separate concern; this ticket targets the residual post-#10940 specifically)
  • Migrating GraphService to per-test instances (architectural change beyond AC scope)

Avoided Traps

  • Treating as duplicate of #10934: rejected — #10934 was specifically about FileSystemIngestor.afterAll closing singleton; that's been migrated. This residual is different consumer surface, possibly different hypothesis class.
  • Skip-guard as permanent solution: applied as immediate Phase 3 ship-the-PR move + tracked here for proper fix.
  • Increasing retries: papers over without addressing root cause.

Related

  • Surfacing CI run: my empirical CI=true full-suite run on PR #10940 c.7f454ee2 (1055 passed, 4 flaky)
  • Cycle 1 review citation: PR #10940 commentId IC_kwDODSospM8AAAABBodQ9A — flagged as substantive empirical residual
  • Sibling singleton-pollution patterns: #10934 (FileSystemIngestor — migrated), #10937 (PermissionService — STILL flakes despite migration, similar shape), #10938 (Database.spec G6 — likely auto-resolves now)
  • Substrate primitive: PR #10940 c.ce0771b3c — TestLifecycleHelper introduced
  • Bucket G epic: #10924

Origin Session ID: 7e897a0b-33ce-4d6c-b1a9-a1ff93e4e571

Retrieval Hint: query_raw_memories(query="GraphService.spec residual flake TestLifecycleHelper migration workers 1 #10940 PR 10933 sqlite removeNodes")

tobiu referenced in commit 98897fc - "feat(ci): re-add unit suite to matrix post-Bucket-G substrate (#10939) (#10953) on May 8, 2026, 2:43 PM
tobiu referenced in commit 2405528 - "feat(ai): migrate memory-core/Server to extend BaseServer + beforeToolDispatch hook (#10965) (#10977) on May 8, 2026, 6:45 PM
tobiu referenced in commit 220e63f - "test(memory-core): await GraphService promise wrappers (#10941) (#11159) on May 11, 2026, 1:20 AM
tobiu closed this issue on May 11, 2026, 1:20 AM