Frontmatter
id: 11380
title: Orchestrator daemon child process failures (stdio, mlx arg, chroma gap)
state: Closed
labels: bug, ai
assignees: neo-gemini-3-1-pro
createdAt: May 15, 2026, 1:46 AM
updatedAt: May 15, 2026, 2:24 AM
githubUrl: https://github.com/neomjs/neo/issues/11380
author: neo-gemini-3-1-pro
commentsCount: 0
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 15, 2026, 2:24 AM

Orchestrator daemon child process failures (stdio, mlx arg, chroma gap)

neo-gemini-3-1-pro commented on May 15, 2026, 1:46 AM

Context

During the Antigravity boot sequence and orchestrator daemon startup, several child processes failed with exit code 1 or went into restart loops. This instability was detected in the ai:orchestrator logs.

The Problem

  1. ProcessSupervisorService spawns child processes with stdio: 'ignore'. This discards the child's stderr, so the reason for a failure is never recorded and cannot be recovered for post-mortem analysis.
  2. The mlx task definition passes --model gemma4:31b, which is an Ollama-style model tag, but mlx_lm.server expects a valid HuggingFace repository ID or a local path. As a result, the mlx inference server crashes and restarts every ~15s.
  3. The Memory Core Chroma instance (port 8001) is not managed by the orchestrator daemon. If it is down, dependent tasks (such as kbSync, backup, summary) fail on boot in a cascading chain until it is started manually.
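
The first point can be reproduced in isolation. The following is a minimal sketch, not the ProcessSupervisorService code: it uses Node's synchronous `spawnSync` for brevity (the service itself presumably uses asynchronous `spawn`, which is an assumption), and the failing child and its message are made up. It is written as a standalone CommonJS script.

```javascript
const { spawnSync } = require('node:child_process');

// Stand-in child (hypothetical): writes one line to stderr, then exits with code 1.
const failingChild = ['-e', 'console.error("boom: bad --model"); process.exit(1)'];

// Same child, two stdio settings.
const hidden = spawnSync(process.execPath, failingChild, { stdio: 'ignore' });
const piped = spawnSync(process.execPath, failingChild, {
    stdio: ['ignore', 'pipe', 'pipe'],
    encoding: 'utf8'
});

// With 'ignore', only the exit code survives; the reason for the failure is gone.
console.log('ignore:', hidden.status, hidden.stderr);

// With stderr piped, the parent can read (and log) what the child reported.
console.log('piped: ', piped.status, JSON.stringify(piped.stderr));
```

In the `'ignore'` case the parent sees the exit code but `stderr` is `null`, which matches the "exit code 1 with no explanation" symptom described above.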

The Architectural Reality

  • ai/daemons/services/ProcessSupervisorService.mjs line 207 spawns the child with {stdio: 'ignore'}.
  • ai/daemons/TaskDefinitions.mjs line 54 defines the mlx task with the incorrect model argument.
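
To illustrate the model-argument mismatch, here is a rough sketch of a pre-flight check that tells an Ollama-style tag apart from a HuggingFace repo ID or a local path. The function names and both regular expressions are assumptions for illustration; they are heuristics, not an official grammar for either format, and nothing like this is confirmed to exist in TaskDefinitions.mjs. Written as a standalone CommonJS script.

```javascript
const fs = require('node:fs');

// Heuristics only, not an official grammar for either format.
const HF_REPO_ID = /^[\w.-]+\/[\w.-]+$/;   // shape of "google/gemma-2-27b-it"
const OLLAMA_TAG = /^[\w.-]+:[\w.-]+$/;    // shape of "gemma4:31b"

// Classify a --model value before it is handed to mlx_lm.server.
function classifyModelArg(model) {
    if (fs.existsSync(model)) return 'local-path';
    if (HF_REPO_ID.test(model)) return 'hf-repo-id';
    if (OLLAMA_TAG.test(model)) return 'ollama-tag';
    return 'unknown';
}

// Hypothetical guard: refuse to start the task with an unusable model value.
function assertMlxModelArg(model) {
    const kind = classifyModelArg(model);
    if (kind !== 'local-path' && kind !== 'hf-repo-id') {
        throw new Error(
            `--model "${model}" looks like ${kind}; expected a HuggingFace repo ID or local path`
        );
    }
    return model;
}

console.log(classifyModelArg('gemma4:31b'));
console.log(classifyModelArg('google/gemma-2-27b-it'));
```

Provided no local file or directory happens to carry those names, the first value is classified as an Ollama tag and the second as a repo ID. Failing fast on the former would surface the misconfiguration once, instead of as a ~15s restart loop.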

The Fix

  • Update ProcessSupervisorService.mjs to pipe or inherit stdio (e.g., ['ignore', 'pipe', 'pipe']) and explicitly forward child stderr to writeLog_.
  • Correct the mlx task model argument in TaskDefinitions.mjs to a valid HuggingFace repo ID (e.g., google/gemma-2-27b-it, or whichever model suits the local environment).
  • Add Memory Core Chroma to the orchestrator TaskDefinitions so it gets managed alongside KB Chroma.
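
The stdio change in the first bullet could look roughly like the sketch below. It is not the code from commit 94cd337: `spawnLogged` and `writeLog` are hypothetical names (the latter standing in for the service's writeLog_), and the failing child is made up. Only the `stdio` array comes from the issue text. Written as a standalone CommonJS script.

```javascript
const { spawn } = require('node:child_process');
const readline = require('node:readline');

// Hypothetical stand-in for the service's writeLog_ method.
function writeLog(line) {
    console.log(`[${new Date().toISOString()}] ${line}`);
}

// Spawn a task with stdin ignored and stdout/stderr piped, forwarding
// every stderr line to the log together with the task name.
function spawnLogged(taskName, command, args, log = writeLog) {
    const child = spawn(command, args, { stdio: ['ignore', 'pipe', 'pipe'] });

    // Drain stdout so a chatty child cannot fill the pipe buffer and stall.
    child.stdout.resume();

    // Split stderr into lines so each log entry is one complete message.
    readline.createInterface({ input: child.stderr })
        .on('line', line => log(`[${taskName}] stderr: ${line}`));

    child.on('error', err => log(`[${taskName}] spawn failed: ${err.message}`));
    child.on('close', (code, signal) => {
        log(`[${taskName}] exited code=${code} signal=${signal}`);
    });

    return child;
}

// Usage: a child that exits with code 1 after reporting why (message is made up).
spawnLogged('demo', process.execPath, [
    '-e', 'console.error("invalid model id"); process.exit(1)'
]);
```

Piping stdout without reading it can block a child once the pipe buffer fills, which is why the sketch drains it; passing `'ignore'` for stdout and piping only stderr would be an equally reasonable choice if stdout is not needed.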

Acceptance Criteria

  • ProcessSupervisorService captures and logs child process stderr on failure.
  • mlx inference server starts successfully without looping.
  • Memory Core Chroma is automatically managed by the orchestrator.
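
One way to spot-check the last criterion after a boot is to probe the port directly. The helper below is a sketch under assumptions: `isPortOpen` is a hypothetical name, the host is assumed to be localhost, and a successful TCP connect only shows that something is listening on 8001, not that it is a healthy Chroma instance. Written as a standalone CommonJS script.

```javascript
const net = require('node:net');

// Resolve true if something accepts TCP connections on host:port,
// false on refusal, any other socket error, or timeout.
function isPortOpen(port, host = '127.0.0.1', timeoutMs = 1000) {
    return new Promise(resolve => {
        const socket = net.connect({ port, host });
        const finish = open => {
            socket.destroy();
            resolve(open);
        };
        socket.setTimeout(timeoutMs);
        socket.once('connect', () => finish(true));
        socket.once('timeout', () => finish(false));
        socket.once('error', () => finish(false));
    });
}

// 8001 is the Memory Core Chroma port named in this issue.
isPortOpen(8001).then(open => {
    console.log(`Memory Core Chroma on 8001: ${open ? 'reachable' : 'not reachable'}`);
});
```

Running this shortly after the orchestrator starts, without having launched Chroma by hand, gives a quick indication of whether the daemon is now managing the instance.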

Out of Scope

  • Major refactoring of the ProcessSupervisor logic beyond fixing the logging visibility.

Origin Session ID

188acb85-b41e-435c-94ee-0cc9944d4c97

tobiu referenced in commit 94cd337 - "fix(ai): orchestrator daemon child process failures (#11380) (#11382)" on May 15, 2026, 2:24 AM
tobiu closed this issue on May 15, 2026, 2:24 AM