Frontmatter
id: 11080
title: Orchestrator daemon fails to start (Sub-4 #11064 regression)
state: Closed
labels: bug, ai, regression, architecture
assignees: neo-gemini-3-1-pro, neo-opus-4-7
createdAt: May 10, 2026, 2:03 AM
updatedAt: May 12, 2026, 4:08 AM
githubUrl: https://github.com/neomjs/neo/issues/11080
author: neo-opus-4-7
commentsCount: 0
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 10, 2026, 3:39 AM

Orchestrator daemon fails to start (Sub-4 #11064 regression)

neo-opus-4-7 commented on May 10, 2026, 2:03 AM

Context

The Agent OS orchestrator daemon (npm run ai:orchestrator → ai/scripts/orchestrator-daemon.mjs) currently crashes on start with ReferenceError: path is not defined. The crash was introduced by PR #11064 (Sub-4 of M3.5 epic #11022), merged 2026-05-10. Production daemon is non-functional on dev HEAD.

Empirical crash trace (reproduced on a clean working tree tracking origin/dev after #11064 merge):

[Orchestrator] Failed to start: ReferenceError: path is not defined
    at Orchestrator.configure (file://.../ai/daemons/Orchestrator.mjs:183:60)
    at Orchestrator.start (file://.../ai/daemons/Orchestrator.mjs:214:14)
    at startOrchestrator (file://.../ai/scripts/orchestrator-daemon.mjs:140:25)

A second related regression — DEFAULT_SCRIPT_DIR resolves to <repo>/scripts/ instead of <repo>/ai/scripts/ — is masked behind the first one (it would fire as soon as the daemon attempts to spawn summary or kbSync task subprocesses).

The Problem

PR #11064 extracted task definitions from Orchestrator.mjs into TaskDefinitions.mjs. Per the squash commit message, the file was initially placed in ai/daemons/utils/TaskDefinitions.mjs, then relocated to top-level ai/daemons/ before merge (final commit message: "refactor(ai): relocate TaskDefinitions to top-level and harden runIfDue"). Two collateral defects landed during this relocate step.

Bug A — Missing path import in Orchestrator.mjs

Pre-Sub-4 ai/daemons/Orchestrator.mjs had import path from 'path'; (line 7 of the pre-Sub-4 file, verified via git show 2f6f9b310^:ai/daemons/Orchestrator.mjs). When task-definition responsibilities moved into TaskDefinitions.mjs, the path import in Orchestrator.mjs was dropped — but Orchestrator.configure() lines 183–184 still reference path.join(...) for the logFile and stateFile defaults:

this.logFile   = options.logFile   || path.join(dataDir, 'orchestrator.log');
this.stateFile = options.stateFile || path.join(dataDir, 'orchestrator-state.json');

The boot wrapper ai/scripts/orchestrator-daemon.mjs calls Orchestrator.start({dataDir: DAEMON_DATA_DIR, ...options}) with no logFile/stateFile overrides → path.join(...) branch fires → ReferenceError. Daemon dies immediately.
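
The failure mode can be reproduced in isolation. The sketch below is not the Orchestrator source: configureSketch and missingPath are hypothetical stand-ins that mimic the || defaulting shown above, using an identifier that was never imported. It illustrates why a call that supplies both overrides survives while a dataDir-only call throws.

```javascript
// Minimal stand-in for the defaulting pattern in Orchestrator.configure().
// `missingPath` is intentionally never imported or declared, mirroring the dropped `path` import.
function configureSketch(options = {}) {
    const {dataDir} = options;

    return {
        // The right-hand side of || is evaluated only when no override is supplied
        logFile  : options.logFile   || missingPath.join(dataDir, 'orchestrator.log'),
        stateFile: options.stateFile || missingPath.join(dataDir, 'orchestrator-state.json')
    };
}

// Both overrides supplied: the undeclared identifier is never evaluated, so nothing throws
const withOverrides = configureSketch({
    dataDir  : '/tmp/d',
    logFile  : '/tmp/d/a.log',
    stateFile: '/tmp/d/s.json'
});

// Production-style call (dataDir only): the default branch runs and throws
let caught = null;

try {
    configureSketch({dataDir: '/tmp/d'});
} catch (error) {
    caught = error;
}

console.log(withOverrides.logFile);            // /tmp/d/a.log
console.log(caught instanceof ReferenceError); // true
```

Because the missing identifier is only evaluated on the default branch, a caller that always passes overrides can never surface it.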

Bug B — DEFAULT_SCRIPT_DIR resolves to wrong directory

ai/daemons/TaskDefinitions.mjs line 13:

export const DEFAULT_SCRIPT_DIR = path.resolve(__dirname, '../../scripts');

__dirname of TaskDefinitions.mjs is <repo>/ai/daemons/. '../../scripts' resolves to <repo>/scripts/. But the actual script lives at <repo>/ai/scripts/summarize-sessions.mjs (per package.json ai:summarize-sessions entry: node ./ai/scripts/summarize-sessions.mjs). The pre-Sub-4 expression was '../scripts' — single .., correct from ai/daemons/. The relocate-from-utils/ step preserved the deeper-path formula without normalizing for the new shallower location.

Knock-on effect: even if Bug A is fixed, the next sweep would attempt to spawn a non-existent script. For kbSync, path.resolve(scriptDir, '../../buildScripts/ai/syncKnowledgeBase.mjs') from the wrong base resolves to <parent-of-repo>/buildScripts/ai/syncKnowledgeBase.mjs — also wrong (escapes the repo entirely).
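
The two resolutions can be checked with a short script. This is a sketch under stated assumptions: /work/neo is a made-up checkout location standing in for <repo>, and path.posix is used so the output is the same on every platform. The relative segments are the ones quoted above.

```javascript
import path from 'node:path';

// Hypothetical checkout location; stands in for <repo>
const repoRoot = '/work/neo';

// Stands in for __dirname of ai/daemons/TaskDefinitions.mjs
const daemonsDir = path.posix.join(repoRoot, 'ai/daemons');

// Double ".." (the regression) versus single ".." (the pre-Sub-4 expression)
const brokenScriptDir = path.posix.resolve(daemonsDir, '../../scripts');
const fixedScriptDir  = path.posix.resolve(daemonsDir, '../scripts');

// The kbSync script is resolved relative to scriptDir, so the wrong base propagates
const kbSyncRelative = '../../buildScripts/ai/syncKnowledgeBase.mjs';
const brokenKbSync   = path.posix.resolve(brokenScriptDir, kbSyncRelative);
const fixedKbSync    = path.posix.resolve(fixedScriptDir, kbSyncRelative);

console.log(brokenScriptDir); // /work/neo/scripts
console.log(fixedScriptDir);  // /work/neo/ai/scripts
console.log(brokenKbSync);    // /work/buildScripts/ai/syncKnowledgeBase.mjs
console.log(fixedKbSync);     // /work/neo/buildScripts/ai/syncKnowledgeBase.mjs
```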

Test coverage gap

Both bugs are invisible to existing test coverage:

  • test/playwright/unit/ai/daemons/Orchestrator.spec.mjs uses Neo.create(Orchestrator, {...}) with explicit logFile=null + taskDefinitions overrides (lines 28-39) → never invokes configure() path.
  • test/playwright/unit/ai/scripts/orchestrator-daemon.spec.mjs line 17 passes explicit scriptDir to buildTaskDefinitions(...) → never exercises DEFAULT_SCRIPT_DIR. Also never calls startOrchestrator() end-to-end; only does string-content assertions on source files (lines 30-52).

Neither test exercises the production code path the daemon actually takes when started via npm run ai:orchestrator. CI passing on #11064 was not evidence the daemon works.

The Architectural Reality

  • ai/daemons/Orchestrator.mjs (the daemon class) and ai/daemons/TaskDefinitions.mjs (extracted constants/factory) sit in the same directory at depth <repo>/ai/daemons/. Therefore __dirname relative paths must use single .., not double.
  • ai/scripts/orchestrator-daemon.mjs (the entry-point boot wrapper) only passes dataDir. All other configuration MUST therefore work correctly with defaults — defaults are not optional polish, they are load-bearing on the production path.
  • Sub-4's stated goal — "slim Orchestrator.mjs while keeping behavior intact" — relies on entry-point-driven default paths being live-tested. The current test suite tests pieces in isolation but never the entry-point happy path. The string-content assertions on source files (expect(daemonSource).not.toContain('summarize-sessions.mjs')) prove the separation of concerns but not the runtime correctness of either piece.

The Fix

  1. Restore import path from 'path'; in ai/daemons/Orchestrator.mjs (single line, between import {spawn} on line 7 and import Base on line 8). Pre-Sub-4 ordering convention preserved.
  2. Correct DEFAULT_SCRIPT_DIR in ai/daemons/TaskDefinitions.mjs line 13 from path.resolve(__dirname, '../../scripts') to path.resolve(__dirname, '../scripts'). One-segment fix (drop one ../).
  3. Add a regression test that exercises the no-override configure() path and asserts:
    • Orchestrator.configure({dataDir: tmp}) does not throw,
    • this.logFile and this.stateFile resolve to path.join(tmp, ...),
    • this.taskDefinitions.summary.args[0] resolves to <repo>/ai/scripts/summarize-sessions.mjs (existing file),
    • this.taskDefinitions.kbSync.args[0] resolves to <repo>/buildScripts/ai/syncKnowledgeBase.mjs (existing file).
     This test must fail on the unfixed code to prove it would have caught Sub-4 at review time.

Acceptance Criteria

  • ai/daemons/Orchestrator.mjs imports path from 'path'.
  • DEFAULT_SCRIPT_DIR in ai/daemons/TaskDefinitions.mjs resolves to <repo>/ai/scripts/ (assertion in regression test using path.resolve(process.cwd(), 'ai/scripts') equivalence).
  • New regression test in test/playwright/unit/ai/daemons/Orchestrator.spec.mjs (or orchestrator-daemon.spec.mjs) exercises Orchestrator.configure({dataDir}) with no other overrides and asserts no throw + correct path defaults.
  • Regression test verified to FAIL on unfixed code (to prove regression coverage).
  • npm run ai:orchestrator starts cleanly without ReferenceError. Verified by running daemon ≥10 seconds and confirming no crash trace in .neo-ai-data/orchestrator-daemon/orchestrator.log.
  • npm run test-unit -- test/playwright/unit/ai/scripts/orchestrator-daemon.spec.mjs passes.
  • npm run test-unit -- test/playwright/unit/ai/daemons/Orchestrator.spec.mjs passes.
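
The "runs for at least 10 seconds" criterion could be scripted roughly as follows. This is a sketch, not existing Neo tooling: staysAlive is a hypothetical helper that only reports whether a child process is still running after a given time. It does not inspect orchestrator.log, and it does not clean up grandchild processes that a wrapper such as npm might leave behind.

```javascript
import {spawn} from 'node:child_process';

/**
 * Resolves to true if the child is still running after `ms` milliseconds,
 * and to false if it exited earlier. Rejects if the process cannot be spawned.
 */
function staysAlive(command, args, ms) {
    return new Promise((resolve, reject) => {
        const child = spawn(command, args, {stdio: 'ignore'});

        // Survived the observation window: stop watching and terminate the child
        const timer = setTimeout(() => {
            child.removeAllListeners('exit');
            child.kill();
            resolve(true);
        }, ms);

        child.once('error', error => {
            clearTimeout(timer);
            reject(error);
        });

        // Early exit (for example the ReferenceError crash) counts as a failure
        child.once('exit', () => {
            clearTimeout(timer);
            resolve(false);
        });
    });
}
```

A call such as staysAlive(process.execPath, ['ai/scripts/orchestrator-daemon.mjs'], 10000) from the repo root would approximate the manual check, assuming the boot wrapper can be started directly with node. The log file still needs to be inspected separately.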

Out of Scope

  • M4 coordinator landscape work. Discussion #11076 is ongoing and halted per FULL TEAM STOP. This hotfix is strictly M3.5 substrate-only.
  • Magic-number migration to ai/config.template.mjs (#11075).
  • Broader test-coverage audit across other ai/daemons/services/*.spec.mjs files for the same "fixture override masks production path" pattern. Surface as a separate follow-up ticket if pattern recurs; do not expand this scope.
  • Refactoring configure() shape itself (e.g., split into per-concern setters). The existing shape is fine; the bug is two specific lines.

Avoided Traps

  • Don't replace path.join with template-literal interpolation ${dataDir}/orchestrator.log. The path.join is correct in spirit (cross-platform separator handling, normalize trailing slashes); only the missing import is the bug. Restoring the import preserves design intent. Bash-style string concat would propagate forward and bite Windows users later.
  • Don't widen Orchestrator.spec.mjs's existing explicit-override test fixture to also "test defaults" inline. Adding a logFile: undefined to the existing fixture would suppress the override but the test would still bypass configure(). The new regression test must exercise the same code path the production entry-point takes: Orchestrator.start({dataDir}) (or directly configure({dataDir})) with nothing else.
  • Don't move DEFAULT_SCRIPT_DIR back into Orchestrator.mjs. The TaskDefinitions extraction was correct in shape (Sub-4's design intent preserved); only the path expression needs the off-by-one correction.
  • Don't bundle this with a broader "improve daemon tests" rewrite. Hotfix discipline: ticket scope = the regression. Test-coverage-philosophy improvements deserve their own ticket.
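
The first trap can be illustrated with a short comparison. The directory values below are hypothetical, and path.posix / path.win32 are used explicitly so the output does not depend on the machine running the snippet.

```javascript
import path from 'node:path';

// Hypothetical dataDir values with a trailing separator
const posixDataDir = '/data/orchestrator/';
const win32DataDir = 'C:\\data\\orchestrator\\';

// Template-literal interpolation keeps the duplicate separator
const interpolated = `${posixDataDir}/orchestrator.log`;

// path.join normalizes separators for the target platform
const joinedPosix = path.posix.join(posixDataDir, 'orchestrator.log');
const joinedWin32 = path.win32.join(win32DataDir, 'orchestrator.log');

console.log(interpolated); // /data/orchestrator//orchestrator.log
console.log(joinedPosix);  // /data/orchestrator/orchestrator.log
console.log(joinedWin32);  // C:\data\orchestrator\orchestrator.log
```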

Related

  • Introduced this regression: PR #11064 (Sub-4 of M3.5 #11022). Author: @neo-gemini-3-1-pro.
  • Parent epic: #11022.
  • M4 work indirectly blocked: Discussion #11076 needs the production daemon to actually start before Sandman/Bridge coordinator hookup is meaningful.
  • Test-coverage-pattern observation: this is the second time a "tests bypass production code path via fixture overrides" pattern has bitten substrate. If it recurs, file an MX-skill / pr-review-checklist enhancement ticket.

Origin Session ID: c2912891-b459-4a03-b2af-154d5e264df1

Retrieval Hint: query_raw_memories({query: "Sub-4 PR #11064 orchestrator daemon ReferenceError missing path import DEFAULT_SCRIPT_DIR"})
Retrieval Hint: Commit SHA 2f6f9b310 (Sub-4 squash-merge into dev).

tobiu referenced in commit 79578b9 - "fix(ai): restore missing path import and default script resolution in Orchestrator daemon (#11080) (#11087)" on May 10, 2026, 3:39 AM
tobiu closed this issue on May 10, 2026, 3:39 AM