Frontmatter
id: 10786
title: Test isolation bug: harnessLifecycle.spec.mjs fails under default-worker config (shared state-file identity)
state: Closed
labels: bug, ai, testing, architecture
assignees: neo-gpt
createdAt: May 5, 2026, 10:17 PM
updatedAt: May 6, 2026, 12:08 AM
githubUrl: https://github.com/neomjs/neo/issues/10786
author: neo-opus-4-7
commentsCount: 0
parentIssue: 10671
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 6, 2026, 12:08 AM

Test isolation bug: harnessLifecycle.spec.mjs fails under default-worker config (shared state-file identity)

Closed · bug · ai · testing · architecture
neo-opus-4-7 commented on May 5, 2026, 10:17 PM

Context

Surfaced 2026-05-05 by @neo-gpt during a verify-before-assert pass on the #10781 / #10671 wake-substrate validation lane. While confirming the existing heartbeat substrate is functionally complete (50 unit tests pass on origin/dev in --workers=1 serial mode), GPT discovered that test/playwright/unit/ai/scripts/harnessLifecycle.spec.mjs fails under the default-worker config.

Root cause: every test in harnessLifecycle.spec.mjs uses the same @neo-test-harness-agent state-file identity. The default Playwright unit-runner config uses 6 workers, so tests running in parallel collide on the shared state file. This is a test-isolation bug, not a substrate bug.

This blocks the validation needed for #10671 epic-finish (sunsetted-harness E2E recovery proof per @tobiu's AC clarification today via GPT's relay).

The Problem

GPT's empirical evidence:

"Default-worker verification of harnessLifecycle.spec.mjs fails under current unit config: npm run test-unit -- test/playwright/unit/ai/scripts/harnessLifecycle.spec.mjs ran 6 tests with 6 workers and failed because every test uses the same @neo-test-harness-agent state-file identity. This is a test isolation bug, not proof the substrate is broken."

Concrete failure surface:

  • Multiple tests writing to the same state-file path concurrently
  • Tests reading state written by other workers
  • Race conditions in setup/teardown around the shared identity
  • Local CI and contributor environments running the default-worker config fail; only --workers=1 passes
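
The first bullet above can be illustrated with a minimal, deterministic sketch of a lost update on a shared state file. The file name, the JSON shape, and the `read`/`write` helpers are assumptions made for illustration; the real state-file format used by harnessLifecycle.mjs is not shown in this issue. Two simulated "workers" read the file before either writes, so the second write silently discards the first:

```javascript
import {mkdtempSync, readFileSync, rmSync, writeFileSync} from 'node:fs';
import {tmpdir} from 'node:os';
import {join} from 'node:path';

// Hypothetical shared state file: one path used by every test.
const dir        = mkdtempSync(join(tmpdir(), 'harness-state-'));
const sharedFile = join(dir, 'neo-test-harness-agent.json');

const read  = () => JSON.parse(readFileSync(sharedFile, 'utf8'));
const write = state => writeFileSync(sharedFile, JSON.stringify(state));

write({});

// Both simulated workers read the state before either one writes.
const seenByA = read();
const seenByB = read();

write({...seenByA, testA: 'started'});
write({...seenByB, testB: 'started'}); // overwrites worker A's update

const finalState = read();
console.log(finalState); // { testB: 'started' }

// Clean up the temporary directory.
rmSync(dir, {recursive: true, force: true});
```

With real parallel workers the interleaving is not fixed, which is why such failures tend to look intermittent rather than reproducible.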

The substrate correctness of harnessLifecycle.mjs itself is not in question: a --workers=1 run passes all 6 tests cleanly. The bug is in the test design, not the implementation.

The Architectural Reality

  • test/playwright/unit/ai/scripts/harnessLifecycle.spec.mjs — the failing spec
  • ai/scripts/harnessLifecycle.mjs — the substrate under test (correctness verified via serial runs)
  • Sibling specs that DO handle parallel correctly: checkSunsetted.spec.mjs, resumeHarness.spec.mjs, wakeSafetyGate.spec.mjs, swarm-heartbeat.spec.mjs (all pass under default workers)
  • Test runner: Playwright unit-test mode with 6 default workers (per playwright.config.mjs)

The Fix

Two alternative shapes, per GPT's recommendation:

Option A (smallest delta): add test.describe.configure({mode: 'serial'}) to the spec's top-level describe block. Forces serial execution within the file; default-workers still apply across files. Matches existing pattern in other Neo specs that depend on shared state.
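
A minimal sketch of what Option A could look like in the spec. The describe title and the placeholder test are hypothetical, and the import line assumes the spec uses the standard `@playwright/test` entry point. This is a spec-file fragment that only runs under the Playwright test runner, not under plain `node`:

```javascript
// Fragment of test/playwright/unit/ai/scripts/harnessLifecycle.spec.mjs
import {expect, test} from '@playwright/test';

test.describe('harnessLifecycle', () => { // hypothetical describe title
    // Every test in this block shares the @neo-test-harness-agent state-file
    // identity, so the tests must run one after another. See #10786.
    test.describe.configure({mode: 'serial'});

    test('placeholder for an existing test', async () => {
        expect(true).toBe(true);
    });
});
```

In Playwright's serial mode the tests of the block run in order in a single worker, and if one test fails the remaining tests in the block are skipped. The comment above the `configure` call is one way to satisfy the explanatory-comment requirement in the acceptance criteria.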

Option B (more robust): per-test unique state-file identity. Each test gets @neo-test-harness-agent-<test-uuid> or similar. Avoids serial-execution overhead while preserving worker parallelism. More code change but no execution-time penalty.
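
A minimal sketch of the unique-identity idea in Option B. The helper names `createTestIdentity` and `stateFileFor`, and the mapping from identity to file path, are hypothetical; the issue does not show how harnessLifecycle.mjs derives its state-file path from the identity:

```javascript
import {randomUUID} from 'node:crypto';
import {tmpdir} from 'node:os';
import {join} from 'node:path';

// Shared identity named in the issue; everything derived from it below is hypothetical.
const BASE_IDENTITY = '@neo-test-harness-agent';

// Returns an identity of the form '@neo-test-harness-agent-<uuid>', unique per call.
function createTestIdentity(base = BASE_IDENTITY) {
    return `${base}-${randomUUID()}`;
}

// Maps an identity to a state-file path (illustrative layout, not the real harness one).
function stateFileFor(identity, dir = tmpdir()) {
    const safeName = identity.replace(/[^A-Za-z0-9-]/g, '_');
    return join(dir, `${safeName}.json`);
}

const first  = createTestIdentity();
const second = createTestIdentity();

console.log(first !== second);                             // true
console.log(stateFileFor(first) !== stateFileFor(second)); // true
```

In a spec, a `test.beforeEach` hook could call `createTestIdentity()` and hand the result to the code under test. That assumes the substrate accepts the identity as a parameter or environment variable, which this issue does not confirm.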

Recommendation: Option A first (smaller surface, fast fix), Option B as scope-extension if the spec gains many more tests and serial-execution time becomes meaningful.

Acceptance Criteria

  • (AC1) harnessLifecycle.spec.mjs passes under default-worker config (currently fails)
  • (AC2) No regression: all 6 existing tests still pass (verify against current --workers=1 baseline)
  • (AC3) No regression on sibling specs: checkSunsetted.spec.mjs + resumeHarness.spec.mjs + wakeSafetyGate.spec.mjs + swarm-heartbeat.spec.mjs continue passing
  • (AC4) Brief comment in the spec explaining the serial-execution constraint OR the unique-identity pattern (operator/agent picking up the file later understands the discipline)

Out of Scope

  • Refactoring harnessLifecycle.mjs substrate code (correctness verified)
  • Replacing Playwright's worker model
  • Adding parallel-test infrastructure beyond what Playwright provides

Avoided Traps

  • Treating this as a substrate bug: GPT's verify-before-assert pass empirically demonstrated the substrate works in serial mode; the failure mode is purely test-side
  • Disabling default-worker mode globally: that would slow ALL tests; this fix is local to one spec file
  • Implementing Option B prematurely: unique-identity pattern is more complex; Option A is sufficient for current scale

Related

  • Parent epic: #10671 — substrate-restart recovery; this test-isolation bug blocks default-worker validation that #10671 epic-finish depends on
  • Empirical anchor: @neo-gpt's verify-before-assert pass 2026-05-05 ~20:15Z (relayed via A2A MESSAGE:c147c7d3-a7df-4f1a-bee1-b70da20d9370)
  • Substrate-validation context: captured in my comment on #10671 — sunsetted-harness E2E recovery is the load-bearing milestone
  • Sibling spec precedents: checkSunsetted.spec.mjs, resumeHarness.spec.mjs, wakeSafetyGate.spec.mjs, swarm-heartbeat.spec.mjs — all handle parallel correctly; can serve as pattern reference

Origin Session ID: 23b9cbcd-4938-4a46-b21a-0d48dd12e7e7

Retrieval Hint: query_raw_memories(query="harnessLifecycle.spec.mjs test isolation bug shared state-file identity neo-test-harness-agent default workers playwright 10671")

tobiu referenced in commit 652b36b - "fix(test): serialize harness lifecycle spec (#10786) (#10791)" on May 6, 2026, 12:08 AM
tobiu closed this issue on May 6, 2026, 12:08 AM