Context
Sub-issue of #10671. Unit tests for resumeHarness.mjs (and likely bridge-daemon.mjs) invoke the real script via execFileSync / spawnSync, which calls real osascript against the live host. On a Mac with Claude Desktop / Antigravity / Codex installed AND System Events accessibility permission granted, osascript succeeds and actually pastes the boot-grounding prompt into the live app, spawning real new sessions.
Empirical anchor (2026-05-04): @neo-gemini-3-1-pro ran npm run test-unit test/playwright/unit/ai/scripts/resumeHarness.spec.mjs while implementing PR #10680 (her #10678 Antigravity terminal-restart investigation track). The test suite spawned 2 phantom Claude Desktop sessions on @tobiu's host with the boot-grounding prompt as their first user message ("Recovery context: test" — matching the 'test' reason argument the test passes). Confirmed in same-session A2A.
Operator framing: "imagine a real night shift. one of you starts 'all' unit tests and game over." Autonomous overnight operation + this test = catastrophic spawn storm.
The Problem
The test pattern at test/playwright/unit/ai/scripts/resumeHarness.spec.mjs:
- Line 64: execFileSync('node', [scriptPath, '@neo-unknown', 'test'], {env: overrideEnv}). Safe: the unknown identity exits early, before osascript.
- Line 74: execFileSync('node', [scriptPath, '@neo-opus-4-7', 'test'], {env: overrideEnv}). UNSAFE: known identity → osascript fires → real Claude Desktop paste.
- Line 94: spawnSync('node', [scriptPath, '@neo-opus-4-7', 'test'], {env: gateOnlyEnv()}). Safe: gate-tripped path, the gate blocks before osascript.
- Line 113: spawnSync('node', [scriptPath, '@neo-opus-4-7', 'test'], {env: <gate-enabled>}). UNSAFE: gate-enabled path → osascript fires → real paste.
overrideEnv (line 45) explicitly sets WAKE_GATE_OVERRIDE: '1' to bypass the gate, which is correct for testing the post-gate code paths but unsafe when those paths hit real osascript.
The expected output check expect(output).toContain('Failed to resume...') on line 78 only triggers if osascript FAILS — on a host where it succeeds, the test passes silently and the side effect lands.
bridge-daemon.spec.mjs exists at the same directory level and likely has analogous patterns (audit needed in implementation).
The Architectural Reality
Test isolation primitive options:
1. Stub spawnAsync at the resumeHarness module level: inject a mock dispatcher in unit-test mode (NEO_UNIT_TEST_MODE=1) so osascript args are captured and asserted but not executed
2. Mock at the subprocess level: shadow osascript on PATH with a recording stub for the test duration
3. Gate behind a RUN_LIVE_OSASCRIPT=1 env var: skip the dangerous test path by default; CI / explicit isolation runs can opt in
4. Move the osascript-routing assertion to a non-execution layer: the test on line 84 (expect(scriptContent).toContain("...freshSessionShortcut: 'n'...")) already covers config routing structurally without needing to run osascript
Option 4 (eliminate the unsafe execution entirely; rely on structural assertion) is the cleanest. Options 1-3 retain the live-execution variant for a CI-only mode.
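As a rough illustration of option 1, the sketch below picks a recording dispatcher when NEO_UNIT_TEST_MODE=1 is set, so the osascript arguments can be asserted without being executed. The names createRecordingDispatcher and selectDispatcher and the shape of the returned result are hypothetical; they are not taken from resumeHarness.mjs, and the real spawnAsync signature may differ.

```javascript
// Hypothetical sketch of option 1. Function names and the result shape are
// assumptions for illustration, not the actual resumeHarness.mjs API.

// Builds a dispatcher that records every call instead of spawning a process.
function createRecordingDispatcher() {
    const calls = [];

    async function dispatch(command, args) {
        calls.push({command, args: [...args]});
        // Report success without touching the host.
        return {code: 0, stdout: '', stderr: ''};
    }

    return {calls, dispatch};
}

// Picks the recording dispatcher in unit-test mode, the live one otherwise.
function selectDispatcher(env, liveDispatch, recordingDispatch) {
    return env.NEO_UNIT_TEST_MODE === '1' ? recordingDispatch : liveDispatch;
}
```

Because the spec runs the script as a subprocess, the recorded calls would still need to be surfaced to the test, for example by printing them to stdout; that part is left out here.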
The Fix
- Audit all execFileSync / spawnSync / spawnAsync invocations in test/playwright/unit/ai/scripts/ and test/playwright/unit/ai/daemons/ that call resumeHarness/bridge-daemon scripts
- For each unsafe site: replace real-osascript execution with mock capture, OR move the assertion to structural inspection of script source (option 4 pattern)
- Add a test-suite-level invariant: NO test path may invoke real osascript/open/pkill against host applications without RUN_LIVE_OSASCRIPT=1 explicit opt-in
- Document the test-isolation discipline in learn/agentos/ or equivalent
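One way the invariant could be enforced is a small guard that every host-facing call passes through. This is a minimal sketch under assumptions: HOST_COMMANDS, isLiveHostExecutionAllowed, and assertHostCommandAllowed are hypothetical names, and where such a guard would live in the codebase is not decided by this ticket.

```javascript
// Hypothetical guard for the test-suite invariant. The names and the exact
// command list are assumptions; only RUN_LIVE_OSASCRIPT comes from this issue.
const HOST_COMMANDS = new Set(['osascript', 'open', 'pkill']);

// Live host execution is off unless the opt-in variable is exactly '1'.
function isLiveHostExecutionAllowed(env) {
    return env.RUN_LIVE_OSASCRIPT === '1';
}

// Throws before a host-facing command can run without the explicit opt-in.
function assertHostCommandAllowed(command, env) {
    if (HOST_COMMANDS.has(command) && !isLiveHostExecutionAllowed(env)) {
        throw new Error(
            `Refusing to run "${command}" against the host: set RUN_LIVE_OSASCRIPT=1 to opt in`
        );
    }
}
```

Failing loudly, rather than skipping silently, means an accidental live invocation shows up as a test error instead of a side effect on the host.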
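Acceptance Criteria
- npm run test-unit against resumeHarness.spec.mjs produces ZERO host-environment side effects (no Claude Desktop paste, no Antigravity paste, no Codex thread spawn) on a Mac with full accessibility permissions
- bridge-daemon.spec.mjs audited; same invariant holds
- RUN_LIVE_OSASCRIPT=1 opt-in; CI isolation / production guard documented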
Out of Scope
- Rewriting the test runner architecture
- Migrating other Playwright unit tests not related to harness/daemon scripts
- Adding RUN_LIVE_OSASCRIPT runner support to CI/CD pipelines (separate work; this ticket only ensures default-off behavior)
Avoided Traps
- Trap: "the test EXPECTS osascript to fail anyway." Empirically false — on hosts where osascript succeeds, the test passes silently while the side effect lands; the failure-expectation only catches restricted-permission CI environments.
- Trap: "just run tests in CI, never on developer hosts." Doesn't survive contact with reality — agents working on this substrate empirically run unit tests during investigation (#10678 / #10679 ongoing), and developer-host spawn storms compound across the trio.
- Trap: "structural assertion alone is insufficient." Line 84 already does structural assertion of the config; the live-osascript invocation adds little additional coverage but enormous side-effect risk.
- Trap: "WAKE_GATE_OVERRIDE in test-mode is the bug." It's not — the gate-override is correct for testing the post-gate code paths. The bug is calling REAL osascript while bypassing the gate. Fix is at the osascript layer, not the gate-override layer.
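The last trap is easier to see when the two layers are modelled separately. The sketch below is a simplified, hypothetical model (resumeSession, gateTripped, and the return values are assumptions, not the real script's structure): the gate override decides whether dispatch is reached, while the injected dispatcher decides what actually runs.

```javascript
// Hypothetical model of the two layers; names and return shapes are
// assumptions, only WAKE_GATE_OVERRIDE comes from this issue.
function resumeSession({env, gateTripped, dispatch}) {
    // Layer 1: the wake gate. WAKE_GATE_OVERRIDE=1 bypasses it, which the
    // specs rely on to reach the post-gate code paths.
    const gateBypassed = env.WAKE_GATE_OVERRIDE === '1';
    if (gateTripped && !gateBypassed) {
        return {status: 'gated'};
    }

    // Layer 2: dispatch. Whether this touches the host depends only on the
    // dispatcher that was injected, not on the gate.
    dispatch('osascript', ['-e', 'placeholder script']);
    return {status: 'dispatched'};
}
```

With a recording dispatcher injected, a spec can keep WAKE_GATE_OVERRIDE=1 and still exercise the post-gate path without anything reaching the host.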
Related
- Parent: #10671
- Sibling forensic: #10672 (this issue extends the runaway-spawn forensic record with the test-suite vector)
- Empirical trigger: PR #10680 (@neo-gemini-3-1-pro's #10678 Antigravity track): running resumeHarness.spec.mjs during her implementation work caused the 2026-05-04 09:03Z spawn event
- Adjacent in-flight lock validation: Gemini's antigravity chat -n natively crashed her MCP servers via parallel-init port collisions, which empirically validates the #10674 in-flight lock primitive
Origin Session ID: cce1fea5-32ff-410c-b820-2e9a27b3cd51
Retrieval Hint: query_summaries("mock osascript unit test host environment side effects spawn storm") + query_raw_memories("test/playwright/unit/ai/scripts/resumeHarness.spec real osascript paste live Claude Desktop")