Frontmatter
id: 11471
title: Orchestrator MLX task starts stale Gemma 2 model
state: Closed
labels: bug, ai, agent-task:in-progress, architecture, model-experience
assignees: neo-gpt
createdAt: May 16, 2026, 5:37 PM
updatedAt: May 16, 2026, 6:17 PM
githubUrl: https://github.com/neomjs/neo/issues/11471
author: neo-gpt
commentsCount: 0
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 16, 2026, 6:17 PM

Orchestrator MLX task starts stale Gemma 2 model

neo-gpt commented on May 16, 2026, 5:37 PM

Context

Operator ran the orchestrator directly on 2026-05-16:

node ./ai/scripts/orchestrator-daemon.mjs
...
[ERROR] [ProcessSupervisor] mlx inference stderr: 2026-05-16 17:11:37,921 - INFO - HTTP Request: GET https://huggingface.co/api/models/mlx-community/gemma-2-27b-it-4bit/revision/main "HTTP/1.1 200 OK"
...
[ERROR] [ProcessSupervisor] mlx inference stderr: 2026-05-16 17:11:39,615 - INFO - Starting httpd at 127.0.0.1 on port 11435...

The expected active local chat model is Gemma 4 31B, not Gemma 2 27B. V-B-A traced the mismatch to the repo, not to runtime hallucination or Hugging Face behavior:

  • ai/daemons/TaskDefinitions.mjs:58-63 hardcodes the orchestrator mlx child task as --model mlx-community/gemma-2-27b-it-4bit --port 11435.
  • ai/mcp/server/memory-core/config.mjs:131-135 defaults OpenAI-compatible chat model to gemma-4-31b-it.
  • ai/mcp/server/memory-core/config.template.mjs:145-148 also documents gemma-4-31b-it as the template default.
  • git blame -L 58,63 -- ai/daemons/TaskDefinitions.mjs shows the stale model was introduced in commit 94cd337c0d from PR #11382, whose commit message explicitly says "Swap mlx model to mlx-community/gemma-2-27b-it-4bit".

Duplicate sweep:

  • ask_knowledge_base(query='open ticket orchestrator mlx gemma 2 gemma 4 model mismatch', type='ticket') returned no relevant documents.
  • list_issues(state='open', limit=100) did not surface an equivalent open ticket.
  • Local resources/content/issues / resources/content/discussions grep is currently unavailable because those synced directories are absent after the ADR 0004 clean-slate purge; code grep and git provenance were used as the empirical fallback.

The Problem

The orchestrator owns the MLX inference daemon lifecycle, but its task definition is now stale relative to the Memory Core model configuration. Starting the orchestrator pulls/serves Gemma 2 27B even when the configured local model family has moved to Gemma 4 31B.

This causes two operator-facing failures:

  1. It starts the wrong model, wasting time and memory on the wrong local inference backend.
  2. It undermines health/config observability: Memory Core can report one intended OpenAI-compatible model while the supervisor starts another.
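The second failure can be made concrete with a small drift check. This is a minimal sketch, not code from the repo: modelFromArgs and detectModelDrift are hypothetical helpers, and the substring matching rule is an assumption chosen so that an org prefix such as mlx-community/ does not count as drift on its own.

```javascript
// Hypothetical helper: returns the value that follows --model in an args array.
function modelFromArgs(args) {
    const index = args.indexOf('--model');
    return index >= 0 && index + 1 < args.length ? args[index + 1] : null;
}

// Hypothetical check: compares the model a task would launch with the model
// the config surface reports. Matching by substring is an assumption.
function detectModelDrift(taskArgs, configuredModel) {
    const launched = modelFromArgs(taskArgs);
    const drift = launched === null || !launched.includes(configuredModel);
    return { launched, configured: configuredModel, drift };
}

// Args as described in this ticket for the stale task definition.
const staleArgs = ['--model', 'mlx-community/gemma-2-27b-it-4bit', '--port', '11435'];
const report = detectModelDrift(staleArgs, 'gemma-4-31b-it');
console.log(report.drift); // true
```

A check of this shape could run at startup or inside a healthcheck, so the mismatch is reported before the wrong model is downloaded.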

This is separate from #11459. #11459 covers daemon log-severity classification and skip-log throttling. This ticket covers the source model that the orchestrator launches.

The Architectural Reality

  • ai/daemons/TaskDefinitions.mjs is the child-process command factory for orchestrator-owned tasks.
  • Orchestrator calls buildTaskDefinitions() during startup, so defaults here directly determine what node ./ai/scripts/orchestrator-daemon.mjs launches.
  • Memory Core config is already the canonical local-model config surface for openAiCompatible.model and related environment overrides.
  • The orchestrator task currently duplicates model identity as a literal string instead of consuming the configured source of truth.
  • Existing unit coverage references buildTaskDefinitions() in:
    • test/playwright/unit/ai/daemons/Orchestrator.spec.mjs
    • test/playwright/unit/ai/scripts/orchestrator-daemon.spec.mjs

The Fix

Align the orchestrator MLX task with the configured OpenAI-compatible model instead of hardcoding Gemma 2.

Recommended implementation shape:

  1. Update buildTaskDefinitions() so the MLX task model defaults to the Memory Core OpenAI-compatible chat model (gemma-4-31b-it today) or an explicit env/config override, not mlx-community/gemma-2-27b-it-4bit.
  2. Preserve the existing port behavior unless a current config source already owns the MLX daemon port. Do not create an accidental host/port migration in this bug fix.
  3. Add focused unit coverage asserting the mlx task args contain the configured/default Gemma 4 model and do not contain gemma-2-27b-it-4bit.
  4. Keep #11459 log-severity classification out of scope unless the same files need a tiny helper shared by both fixes.
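Steps 1 to 3 can be sketched as follows. The names resolveMlxModel and buildMlxTaskArgs, the options shape, and the override parameter are all hypothetical; the real buildTaskDefinitions() signature and any environment variable names are not shown in this ticket. The sketch also passes the config value through unchanged; whether MLX needs a different repository id for the Gemma 4 model is not covered here.

```javascript
// Hypothetical sketch: names and shapes are illustrative, not the repo's API.
const DEFAULT_CHAT_MODEL = 'gemma-4-31b-it'; // Memory Core default cited in this ticket
const MLX_PORT = 11435;                      // existing port, intentionally unchanged

// Resolve the model with precedence: explicit override > config value > default.
function resolveMlxModel({ override, config } = {}) {
    const candidates = [override, config?.openAiCompatible?.model, DEFAULT_CHAT_MODEL];
    return candidates.find(value => typeof value === 'string' && value.trim() !== '').trim();
}

// Build only the MLX task args; other task fields are omitted from this sketch.
function buildMlxTaskArgs(options = {}) {
    return ['--model', resolveMlxModel(options), '--port', String(MLX_PORT)];
}

const args = buildMlxTaskArgs({ config: { openAiCompatible: { model: 'gemma-4-31b-it' } } });
console.log(args.join(' ')); // --model gemma-4-31b-it --port 11435
```

Keeping the port as a constant outside the resolver reflects step 2: the model source changes, the port contract does not. A unit test in the spirit of step 3 would assert that the generated args include the configured model and never include gemma-2-27b-it-4bit.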

Contract Ledger Matrix

| Target Surface | Source of Authority | Proposed Behavior | Fallback | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| buildTaskDefinitions().mlx.args | Memory Core OpenAI-compatible model config + this ticket | Starts MLX with the configured Gemma 4 model instead of stale Gemma 2 literal | Explicit env/config override if present | JSDoc/summary only if helper introduced | Unit test inspects generated args |
| Orchestrator operator startup | node ./ai/scripts/orchestrator-daemon.mjs log sample | mlx_lm.server fetches/serves the intended model | Manual override remains possible through config/env | Existing daemon docs | Optional manual dry-run/log inspection |
| Model config observability | HealthService provider observability + memory-core config | Runtime-started local model matches the model config surface | Healthcheck still reports configured model | Existing healthcheck docs | Unit plus operator log evidence |

Acceptance Criteria

  • ai/daemons/TaskDefinitions.mjs no longer contains mlx-community/gemma-2-27b-it-4bit as the default MLX task model.
  • The default MLX task model resolves to gemma-4-31b-it or the Memory Core OpenAI-compatible model config value.
  • Existing orchestrator task names, pid file names, expected command, and MLX port behavior remain unchanged unless a verified config source already owns the port.
  • Unit coverage verifies the mlx task args use the configured/default model and rejects the stale Gemma 2 literal.
  • node ./ai/scripts/orchestrator-daemon.mjs no longer attempts to fetch mlx-community/gemma-2-27b-it-4bit after restart (manual/operator validation acceptable if daemon startup is not safe inside CI).
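For the manual validation in the last criterion, captured daemon output can be scanned for fetches of the stale model. This is a minimal sketch: findStaleModelFetches is a hypothetical helper, and it assumes the fetch lines keep the "HTTP Request:" shape seen in the log sample in the Context section.

```javascript
const STALE_MODEL = 'mlx-community/gemma-2-27b-it-4bit';

// Hypothetical helper: returns every log line that records an HTTP request
// mentioning the stale model id.
function findStaleModelFetches(logText, staleModel = STALE_MODEL) {
    return logText
        .split(/\r?\n/)
        .filter(line => line.includes('HTTP Request:') && line.includes(staleModel));
}

// Two lines taken from the log sample in this ticket.
const sampleLog = [
    '[ERROR] [ProcessSupervisor] mlx inference stderr: 2026-05-16 17:11:37,921 - INFO - HTTP Request: GET https://huggingface.co/api/models/mlx-community/gemma-2-27b-it-4bit/revision/main "HTTP/1.1 200 OK"',
    '[ERROR] [ProcessSupervisor] mlx inference stderr: 2026-05-16 17:11:39,615 - INFO - Starting httpd at 127.0.0.1 on port 11435...'
].join('\n');

console.log(findStaleModelFetches(sampleLog).length); // 1
```

After the fix and a restart, the same scan over fresh output should return an empty array.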

Out of Scope

  • Reclassifying child stderr [INFO] lines from ERROR to INFO; that is #11459.
  • Throttling repeated "Skipping knowledge base sync; task already running" logs; that is #11459.
  • Changing embedding model/provider/vector dimension behavior.
  • Migrating the OpenAI-compatible host/port contract unless required to eliminate the stale model literal.
  • Reworking the full inference lifecycle service.

Related

  • #11380 / PR #11382 — introduced the stale Gemma 2 MLX task model via commit 94cd337c0d.
  • #11459 — adjacent ProcessSupervisor logging severity/throttle issue, distinct scope.
  • ai/daemons/TaskDefinitions.mjs
  • ai/mcp/server/memory-core/config.mjs
  • ai/mcp/server/memory-core/config.template.mjs

Origin Session ID: 6ec143cb-2e5b-4964-94d6-eb28cb25bde2

Handoff Retrieval Hint: query_raw_memories(query="orchestrator mlx inference gemma-2-27b gemma-4-31b TaskDefinitions model mismatch 94cd337c")

tobiu referenced in commit 97774ce - "fix(ai): align orchestrator mlx model (#11471) (#11473)" on May 16, 2026, 6:17 PM
tobiu closed this issue on May 16, 2026, 6:17 PM