Frontmatter
id: 11077
title: M4 Architectural Convergence: Orchestrator Supervision & Unified Daemons
state: Closed
labels: epic, ai, refactoring, architecture, model-experience
assignees: neo-gemini-3-1-pro
createdAt: May 10, 2026, 1:55 AM
updatedAt: May 10, 2026, 8:15 PM
githubUrl: https://github.com/neomjs/neo/issues/11077
author: neo-gemini-3-1-pro
commentsCount: 6
parentIssue: null
subIssues:
  • 11081 Rename Orchestrator.mjs to AgentOrchestrator.mjs
  • 11093 Migrate Daemon Supervision to AgentOrchestrator
  • 11094 Event-Driven Issue Sync and Golden Path Synthesis
subIssuesCompleted: 3
subIssuesTotal: 3
blockedBy: []
blocking: []
closedAt: May 10, 2026, 2:31 PM

M4 Architectural Convergence: Orchestrator Supervision & Unified Daemons

neo-gemini-3-1-pro commented on May 10, 2026, 1:55 AM

Context

The M4 agentic infrastructure has historically relied on individual MCP servers (e.g., knowledge-base) spawning and managing isolated child processes for Neo.mjs daemons. This approach led to fragmented, polled coordinators, cross-agent termination friction, and stale data during ingestion syncs.

The Problem

  • Decentralized Daemon Management: MCP servers spawning their own daemons (like Chroma or MLX) leads to orphan processes, race conditions, and a lack of universal supervision.
  • Polled Coordinators: M4 tickets (#11070, #11071, #11072) proposed individual polling coordinators (DreamCoord, GoldenPathCoord, GraphMaintenanceCoord) which created significant cognitive load and fragmented execution logic.
  • State Stagnation: The IssueCoordinator relied on stale data due to a single-stage ingestion process.

The Architectural Reality

The existing system lacked a central supervisor, even though the Orchestrator (ai/daemons/Orchestrator.mjs) already has the capability, via ProcessSupervisorService (ai/daemons/services/ProcessSupervisorService.mjs), to act as a resilient, Systemd-like manager for all long-running tasks. Furthermore, the CadenceEngine already handles time-based execution once it is configured.
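
To make the "Systemd-like manager" idea concrete, here is a minimal sketch of a supervisor that owns spawning, restarting, and stopping of named daemons. It is not the actual ProcessSupervisorService API: the class name `MiniSupervisor`, its methods, and the `maxRestarts` option are hypothetical, and the spawn function is injected so that either Node's `child_process.spawn` or a test double can be passed in.

```javascript
// Hypothetical sketch, not the real ProcessSupervisorService API.
// spawnFn(command, args) must return an object exposing on('exit', cb) and kill(),
// which is the shape Node's child_process.spawn() returns.
class MiniSupervisor {
    constructor(spawnFn, {maxRestarts = 3} = {}) {
        this.spawnFn     = spawnFn;
        this.maxRestarts = maxRestarts;
        this.daemons     = new Map(); // name -> {command, args, child, restarts, state}
    }

    // Register a daemon under a unique name and launch it.
    start(name, command, args = []) {
        if (this.daemons.has(name)) {
            throw new Error(`Daemon "${name}" is already supervised`);
        }
        const entry = {command, args, child: null, restarts: 0, state: 'running'};
        this.daemons.set(name, entry);
        this.launch(entry);
    }

    // Spawn the process and restart it on unexpected exit, up to maxRestarts times.
    launch(entry) {
        const child = this.spawnFn(entry.command, entry.args);
        entry.child = child;
        child.on('exit', () => {
            entry.child = null;
            if (entry.state === 'stopping') {
                entry.state = 'stopped';      // requested shutdown: do not restart
            } else if (entry.restarts < this.maxRestarts) {
                entry.restarts++;
                this.launch(entry);
            } else {
                entry.state = 'failed';       // restart budget exhausted
            }
        });
    }

    // Request a shutdown; the exit handler marks the daemon as stopped.
    stop(name) {
        const entry = this.daemons.get(name);
        if (!entry || !entry.child) return;
        entry.state = 'stopping';
        entry.child.kill();
    }

    // Read-only snapshot for observers such as a health service.
    status() {
        const out = {};
        for (const [name, {state, restarts}] of this.daemons) {
            out[name] = {state, restarts};
        }
        return out;
    }
}
```

In a Node process, passing the `spawn` function from `node:child_process` as `spawnFn` should fit this shape. Because a single object holds every child handle, an observer can call `status()` without ever owning a process, which is the separation the rest of this epic aims for.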

The Fix

This epic executes the M4 architecture reset aligned in Discussion #11076:

  1. Systemd for Neo: Elevate Orchestrator.mjs to manage all maintenance daemons (Bridge, Chroma, MLX) via ProcessSupervisorService.
  2. Unified Taxonomy Enforcement:
    • ai/daemons/services/ for scheduling ("When")
    • ai/services/ for logic ("What")
    • ai/scripts/ for entry points.
  3. Event-Driven Replacements: Replace polled Golden Path logic with a direct mutate_frontier MCP tool trigger.
  4. Issue Freshness Pipeline: Implement a strict two-stage issue synchronization pipeline (fetch to local .md -> ingest to Graph).
  5. Decouple Observability from Lifecycle: Refactor DatabaseLifecycleService.mjs to remove process spawning logic and push it to the Orchestrator, leaving the service as a read-only observer (HealthService).
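
Step 4 of the list above can be sketched as two separate functions, where ingestion only ever reads what the fetch stage has written locally. This is an illustration under stated assumptions, not the implemented pipeline: the function names, the `issue-<id>.md` file naming, the frontmatter layout, and the use of a `Map` as a stand-in for the on-disk directory of `.md` files and for the Graph are all hypothetical.

```javascript
// Hypothetical sketch of a two-stage sync: fetch -> local .md mirror -> graph.
// `mirror` is a Map standing in for a directory of markdown files,
// `graph` is a Map standing in for the Graph store.

// Stage 1: render each fetched issue to markdown and write it only if it changed.
function syncToMirror(fetchIssues, mirror) {
    const written = [];
    for (const issue of fetchIssues()) {
        const markdown = [
            '---',
            `id: ${issue.id}`,
            `state: ${issue.state}`,
            `updatedAt: ${issue.updatedAt}`,
            '---',
            `# ${issue.title}`
        ].join('\n');
        const file = `issue-${issue.id}.md`;
        if (mirror.get(file) !== markdown) {
            mirror.set(file, markdown);
            written.push(file);
        }
    }
    return written;
}

// Parse the frontmatter block and the title line back into a plain object.
function parseIssueMarkdown(text) {
    const match = /^---\n([\s\S]*?)\n---\n# (.*)$/.exec(text);
    if (!match) throw new Error('Unexpected mirror file layout');
    const node = {title: match[2]};
    for (const line of match[1].split('\n')) {
        const idx = line.indexOf(': ');
        node[line.slice(0, idx)] = line.slice(idx + 2);
    }
    return node;
}

// Stage 2: ingest from the mirror only, never from the remote response.
function ingestFromMirror(mirror, graph, files) {
    for (const file of files) {
        const node = parseIssueMarkdown(mirror.get(file));
        graph.set(node.id, node);
    }
}

// Run both stages in order and report which files were refreshed.
function runIssueSync(fetchIssues, mirror, graph) {
    const changed = syncToMirror(fetchIssues, mirror);
    ingestFromMirror(mirror, graph, changed);
    return changed;
}
```

Keeping the stages separate means the graph can only contain what the local mirror contains, so a partial or failed fetch cannot leave the graph ahead of, or different from, the files on disk.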

Acceptance Criteria

  • Orchestrator.mjs successfully spawns and supervises chroma and bridge-daemon via ProcessSupervisorService. (RESIDUAL_L4 — AC1 uniform daemon oversight runtime verification deferred per evidence-ladder L4-deferred operator handoff)
  • DatabaseLifecycleService.mjs and related MCP lifecycles are stripped of process management (spawn/kill). (RESIDUAL_L4 — AC2 uniform daemon oversight runtime verification deferred per evidence-ladder L4-deferred operator handoff)
  • Two-stage GitHub Issue sync pipeline implemented. (RESIDUAL_L4 — AC3 Event-ordering runtime verification deferred per evidence-ladder L4-deferred operator handoff)
  • mutate_frontier correctly triggers event-driven Golden Path synthesis. (RESIDUAL_L4 — AC4 Event-ordering runtime verification deferred per evidence-ladder L4-deferred operator handoff)
  • ai/agent/Orchestrator.mjs renamed to ai/agent/AgentOrchestrator.mjs to eliminate namespace collision.
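
The mutate_frontier criterion above describes synthesis that runs as a direct consequence of a mutation instead of on a polling interval. A minimal sketch of that wiring follows. The real mutate_frontier tool signature is not shown in this issue, so `createFrontier`, `onMutated`, `synthesizeGoldenPath`, and the `priority` field are hypothetical, and the sort stands in for whatever the actual synthesis does.

```javascript
// Hypothetical sketch: the mutation itself triggers synthesis, no poller involved.
function createFrontier(onMutated) {
    const items = new Map(); // id -> item fields

    return {
        // Merge a patch into an item, then notify the listener exactly once.
        mutate(id, patch) {
            const next = {...(items.get(id) || {}), ...patch};
            items.set(id, next);
            onMutated({id, item: next});
            return next;
        },
        // Read-only copy of the current frontier.
        snapshot() {
            return [...items.entries()].map(([id, item]) => ({id, ...item}));
        }
    };
}

// Placeholder synthesis: order item ids by descending priority.
function synthesizeGoldenPath(frontierItems) {
    return frontierItems
        .slice()
        .sort((a, b) => (b.priority ?? 0) - (a.priority ?? 0))
        .map(item => item.id);
}

// Wiring: every mutation recomputes the path immediately.
let goldenPath = [];
const frontier = createFrontier(() => {
    goldenPath = synthesizeGoldenPath(frontier.snapshot());
});
```

With this shape the path is recomputed once per mutation and never while the frontier is idle, which is the property a polled coordinator cannot give.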

Out of Scope

  • Expansion of the Graph memory schema beyond the current M4 baseline.
  • New swarm agents or MLX provider configuration outside the existing Neo substrate.

Avoided Traps / Gold Standards Rejected

Decision Matrix / Divergent Options Considered

Before execution, we evaluated the following shapes for Daemon Management:

  1. Orchestrator-as-process-supervisor (Selected): Integrates natively with our existing Orchestrator.mjs and ProcessSupervisorService. Zero external dependencies. Keeps the M4 architectural footprint minimal while centralizing control.
  2. Renaming daemon-side Orchestrator instead: Resolves the naming collision but leaves process supervision scattered across MCP lifecycles (e.g., DatabaseLifecycleService). Fails to address the core orphan-process issue.
  3. External OS/process supervision (systemd/pm2): Gold Standard for typical servers, but an Avoided Trap here. Neo.mjs agentic environments run locally on operator machines; adding pm2 or systemd dependencies breaks our zero-config sandbox requirement.
  4. Hybrid supervision boundaries: Attempting to mix Orchestrator control for some daemons and MCP control for others. Rejected due to cognitive load and fractured state guarantees.
  5. Health/lease registry first: A registry solves discovery but not lifecycle. Daemons still need a supervisor to spawn/kill them. Supervisor first, registry second.
  • Trap: Adopting decentralized sub-agent process management. Rejection Rationale: This caused severe orphaned process leaks during multi-turn LLM inference loops.
  • Trap: Creating individual coordinators for every maintenance sweep. Rejection Rationale: Bloats the Orchestrator with unnecessary setInterval loops instead of leveraging the deterministic CadenceEngine.
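
The second trap contrasts one `setInterval` loop per coordinator with a single cadence-driven scheduler. The sketch below shows the general shape of the latter. It is not the CadenceEngine API: `MiniCadence`, `register`, and `tick` are hypothetical names, and the current time is passed in explicitly so that the behaviour is deterministic.

```javascript
// Hypothetical sketch of a single registry that replaces per-coordinator timers.
class MiniCadence {
    constructor() {
        this.tasks = [];
    }

    // Register a task that should run at most once every `everyMs` milliseconds.
    register(name, everyMs, run) {
        this.tasks.push({name, everyMs, run, nextDue: 0}); // due on the first tick
    }

    // Run every task that is due at `now`, in registration order.
    // Returns the names of the tasks that ran.
    tick(now) {
        const ran = [];
        for (const task of this.tasks) {
            if (now >= task.nextDue) {
                task.run(now);
                task.nextDue = now + task.everyMs;
                ran.push(task.name);
            }
        }
        return ran;
    }
}
```

A host process would then need only one timer that calls `tick(Date.now())`. Adding another maintenance sweep becomes a `register` call, not another loop, and because time is a parameter the schedule can be replayed in a test.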

Related

Origin Session ID: d5ed6767-0292-46bf-9346-439f268048ec
Retrieval Hint: "Systemd-for-Neo", "ProcessSupervisorService Orchestrator migration", "M4 Architectural Convergence"

tobiu referenced in commit 00069c6 - "feat(skills): Double Diamond graduation guard for ideation-sandbox (#11086) (#11095)" on May 10, 2026, 12:32 PM
tobiu referenced in commit d2db7f9 - "fix(github-workflow): correct stale ToolService import path post-M6 migration (#11103) (#11104)" on May 10, 2026, 2:57 PM