Context
Surfaced 2026-05-05 by @neo-gpt during a verify-before-assert pass on the #10781 / #10671 wake-substrate validation lane. While confirming the existing heartbeat substrate is functionally complete (50 unit tests pass on origin/dev in --workers=1 serial mode), GPT discovered that test/playwright/unit/ai/scripts/harnessLifecycle.spec.mjs fails under the default-worker config.
Root cause: every test in harnessLifecycle.spec.mjs uses the same @neo-test-harness-agent state-file identity. Default Playwright unit-runner uses 6 workers; parallel tests collide on the shared state file. Not a substrate bug — a test-isolation bug.
This blocks the validation needed for #10671 epic-finish (sunsetted-harness E2E recovery proof per @tobiu's AC clarification today via GPT's relay).
The Problem
GPT's empirical evidence:
"Default-worker verification of harnessLifecycle.spec.mjs fails under current unit config: npm run test-unit -- test/playwright/unit/ai/scripts/harnessLifecycle.spec.mjs ran 6 tests with 6 workers and failed because every test uses the same @neo-test-harness-agent state-file identity. This is a test isolation bug, not proof the substrate is broken."
Concrete failure surface:
- Multiple tests writing to the same state-file path concurrently
- Tests reading state from previous workers' writes
- Race conditions in setup/teardown around the shared identity
- Local CI / contributor environments running default-worker fail; only
--workers=1 passes
The substrate-correctness of harnessLifecycle.mjs itself is not in question — --workers=1 execution passes all 6 tests cleanly. The bug is in test design, not implementation.
The Architectural Reality
test/playwright/unit/ai/scripts/harnessLifecycle.spec.mjs — the failing spec
ai/scripts/harnessLifecycle.mjs — the substrate under test (correctness verified via serial runs)
- Sibling specs that DO handle parallel correctly:
checkSunsetted.spec.mjs, resumeHarness.spec.mjs, wakeSafetyGate.spec.mjs, swarm-heartbeat.spec.mjs (all pass under default workers)
- Test runner: Playwright unit-test mode with 6 default workers (per
playwright.config.mjs)
The Fix
Two equivalent shapes per GPT's recommendation:
Option A (smallest delta): add test.describe.configure({mode: 'serial'}) to the spec's top-level describe block. Forces serial execution within the file; default-workers still apply across files. Matches existing pattern in other Neo specs that depend on shared state.
Option B (more robust): per-test unique state-file identity. Each test gets @neo-test-harness-agent-<test-uuid> or similar. Avoids serial-execution overhead while preserving worker parallelism. More code change but no execution-time penalty.
Recommendation: Option A first (smaller surface, fast fix), Option B as scope-extension if the spec gains many more tests and serial-execution time becomes meaningful.
Acceptance Criteria
Out of Scope
- Refactoring
harnessLifecycle.mjs substrate code (correctness verified)
- Replacing Playwright's worker model
- Adding parallel-test infrastructure beyond what Playwright provides
Avoided Traps
- Treating this as substrate bug: GPT's verify-before-assert pass empirically demonstrated the substrate works in serial mode; the failure mode is purely test-side
- Disabling default-worker mode globally: that would slow ALL tests; this fix is local to one spec file
- Implementing Option B prematurely: unique-identity pattern is more complex; Option A is sufficient for current scale
Related
- Parent epic: #10671 — substrate-restart recovery; this test-isolation bug blocks default-worker validation that #10671 epic-finish depends on
- Empirical anchor: @neo-gpt's verify-before-assert pass 2026-05-05 ~20:15Z (relayed via A2A MESSAGE:c147c7d3-a7df-4f1a-bee1-b70da20d9370)
- Substrate-validation context: captured in my comment on #10671 — sunsetted-harness E2E recovery is the load-bearing milestone
- Sibling spec precedents:
checkSunsetted.spec.mjs, resumeHarness.spec.mjs, wakeSafetyGate.spec.mjs, swarm-heartbeat.spec.mjs — all handle parallel correctly; can serve as pattern reference
Origin Session ID: 23b9cbcd-4938-4a46-b21a-0d48dd12e7e7
Retrieval Hint: query_raw_memories(query="harnessLifecycle.spec.mjs test isolation bug shared state-file identity neo-test-harness-agent default workers playwright 10671")
Context
Surfaced 2026-05-05 by @neo-gpt during a verify-before-assert pass on the #10781 / #10671 wake-substrate validation lane. While confirming the existing heartbeat substrate is functionally complete (50 unit tests pass on
origin/devin--workers=1serial mode), GPT discovered thattest/playwright/unit/ai/scripts/harnessLifecycle.spec.mjsfails under the default-worker config.Root cause: every test in
harnessLifecycle.spec.mjsuses the same@neo-test-harness-agentstate-file identity. Default Playwright unit-runner uses 6 workers; parallel tests collide on the shared state file. Not a substrate bug — a test-isolation bug.This blocks the validation needed for #10671 epic-finish (sunsetted-harness E2E recovery proof per @tobiu's AC clarification today via GPT's relay).
The Problem
GPT's empirical evidence:
Concrete failure surface:
--workers=1passesThe substrate-correctness of
harnessLifecycle.mjsitself is not in question —--workers=1execution passes all 6 tests cleanly. The bug is in test design, not implementation.The Architectural Reality
test/playwright/unit/ai/scripts/harnessLifecycle.spec.mjs— the failing specai/scripts/harnessLifecycle.mjs— the substrate under test (correctness verified via serial runs)checkSunsetted.spec.mjs,resumeHarness.spec.mjs,wakeSafetyGate.spec.mjs,swarm-heartbeat.spec.mjs(all pass under default workers)playwright.config.mjs)The Fix
Two equivalent shapes per GPT's recommendation:
Option A (smallest delta): add
test.describe.configure({mode: 'serial'})to the spec's top-level describe block. Forces serial execution within the file; default-workers still apply across files. Matches existing pattern in other Neo specs that depend on shared state.Option B (more robust): per-test unique state-file identity. Each test gets
@neo-test-harness-agent-<test-uuid>or similar. Avoids serial-execution overhead while preserving worker parallelism. More code change but no execution-time penalty.Recommendation: Option A first (smaller surface, fast fix), Option B as scope-extension if the spec gains many more tests and serial-execution time becomes meaningful.
Acceptance Criteria
harnessLifecycle.spec.mjspasses under default-worker config (currently fails)--workers=1baseline)checkSunsetted.spec.mjs+resumeHarness.spec.mjs+wakeSafetyGate.spec.mjs+swarm-heartbeat.spec.mjscontinue passingOut of Scope
harnessLifecycle.mjssubstrate code (correctness verified)Avoided Traps
Related
checkSunsetted.spec.mjs,resumeHarness.spec.mjs,wakeSafetyGate.spec.mjs,swarm-heartbeat.spec.mjs— all handle parallel correctly; can serve as pattern referenceOrigin Session ID: 23b9cbcd-4938-4a46-b21a-0d48dd12e7e7
Retrieval Hint:
query_raw_memories(query="harnessLifecycle.spec.mjs test isolation bug shared state-file identity neo-test-harness-agent default workers playwright 10671")