Context
Operator ran the orchestrator directly on 2026-05-16:
node ./ai/scripts/orchestrator-daemon.mjs
...
[ERROR] [ProcessSupervisor] mlx inference stderr: 2026-05-16 17:11:37,921 - INFO - HTTP Request: GET https://huggingface.co/api/models/mlx-community/gemma-2-27b-it-4bit/revision/main "HTTP/1.1 200 OK"
...
[ERROR] [ProcessSupervisor] mlx inference stderr: 2026-05-16 17:11:39,615 - INFO - Starting httpd at 127.0.0.1 on port 11435...
The active expected local chat model is Gemma 4 31B, not Gemma 2 27B. V-B-A traced this to the repo, not to runtime hallucination or Hugging Face behavior:
- ai/daemons/TaskDefinitions.mjs:58-63 hardcodes the orchestrator mlx child task as --model mlx-community/gemma-2-27b-it-4bit --port 11435.
- ai/mcp/server/memory-core/config.mjs:131-135 defaults OpenAI-compatible chat model to gemma-4-31b-it.
- ai/mcp/server/memory-core/config.template.mjs:145-148 also documents gemma-4-31b-it as the template default.
- git blame -L 58,63 -- ai/daemons/TaskDefinitions.mjs shows the stale model was introduced in commit 94cd337c0d from PR #11382, whose commit message explicitly says "Swap mlx model to mlx-community/gemma-2-27b-it-4bit".
Duplicate sweep:
- ask_knowledge_base(query='open ticket orchestrator mlx gemma 2 gemma 4 model mismatch', type='ticket') returned no relevant documents.
- list_issues(state='open', limit=100) did not surface an equivalent open ticket.
- Local resources/content/issues / resources/content/discussions grep is currently unavailable because those synced directories are absent after the ADR 0004 clean-slate purge; code grep and git provenance were used as the empirical fallback.
The Problem
The orchestrator owns the MLX inference daemon lifecycle, but its task definition is now stale relative to the Memory Core model configuration. Starting the orchestrator pulls/serves Gemma 2 27B even when the configured local model family has moved to Gemma 4 31B.
This has two operator-facing failures:
- It starts the wrong model, wasting time and memory on the wrong local inference backend.
- It undermines health/config observability: Memory Core can report one intended OpenAI-compatible model while the supervisor starts another.
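The mismatch shows up in the supervisor log quoted under Context, where the Hugging Face request URL names the model actually being fetched. A minimal sketch of pulling that model id out of a log line follows; the function name and the regular expression are illustrative assumptions, not existing repo code, and the URL shape is taken only from the single sample line in this ticket.

```javascript
// Hypothetical helper: extracts the Hugging Face model id from a supervisor
// log line of the form seen in this ticket. Returns null when no such URL
// is present. The URL pattern is inferred from one sample line only.
const HF_MODEL_URL = /huggingface\.co\/api\/models\/([^\s"]+?)\/revision\//;

function fetchedModelId(logLine) {
    const match = HF_MODEL_URL.exec(logLine);
    return match ? match[1] : null;
}

// Usage: compare what was fetched against what the config surface expects.
function isExpectedModel(logLine, expectedModel) {
    const fetched = fetchedModelId(logLine);
    return fetched !== null && fetched.includes(expectedModel);
}
```

Run against the 17:11:37 line from the log sample, `fetchedModelId` would yield the Gemma 2 repository id, which is the operator-visible symptom this ticket describes.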
This is separate from #11459. #11459 covers daemon log-severity classification and skip-log throttling. This ticket covers the source model that the orchestrator launches.
The Architectural Reality
- ai/daemons/TaskDefinitions.mjs is the child-process command factory for orchestrator-owned tasks.
- Orchestrator calls buildTaskDefinitions() during startup, so defaults here directly determine what node ./ai/scripts/orchestrator-daemon.mjs launches.
- Memory Core config is already the canonical local-model config surface for openAiCompatible.model and related environment overrides.
- The orchestrator task currently duplicates model identity as a literal string instead of consuming the configured source of truth.
- Existing unit coverage references buildTaskDefinitions() in:
  - test/playwright/unit/ai/daemons/Orchestrator.spec.mjs
  - test/playwright/unit/ai/scripts/orchestrator-daemon.spec.mjs
The Fix
Align the orchestrator MLX task with the configured OpenAI-compatible model instead of hardcoding Gemma 2.
Recommended implementation shape:
- Update buildTaskDefinitions() so the MLX task model defaults to the Memory Core OpenAI-compatible chat model (gemma-4-31b-it today) or an explicit env/config override, not mlx-community/gemma-2-27b-it-4bit.
- Preserve the existing port behavior unless a current config source already owns the MLX daemon port. Do not create an accidental host/port migration in this bug fix.
- Add focused unit coverage asserting the mlx task args contain the configured/default Gemma 4 model and do not contain gemma-2-27b-it-4bit.
- Keep #11459 log-severity classification out of scope unless the same files need a tiny helper shared by both fixes.
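One possible shape for the first two bullets, as a sketch under stated assumptions: the real structure of ai/daemons/TaskDefinitions.mjs is not shown in this ticket, so the options argument, the `resolveMlxModel` helper, the `MLX_MODEL` environment variable, and the `command` field are all hypothetical. Only `buildTaskDefinitions()`, the model ids, and port 11435 come from the ticket.

```javascript
// Sketch only. Everything except buildTaskDefinitions(), the model ids, and
// port 11435 is a hypothetical name chosen for illustration.
const FALLBACK_CHAT_MODEL = 'gemma-4-31b-it'; // Memory Core default per this ticket
const MLX_PORT = 11435;                       // existing port, left unchanged

// Resolution order: explicit env override, then Memory Core config, then fallback.
// MLX_MODEL is a hypothetical environment variable name.
function resolveMlxModel({config = {}, env = {}} = {}) {
    return env.MLX_MODEL || config.openAiCompatible?.model || FALLBACK_CHAT_MODEL;
}

function buildTaskDefinitions(options = {}) {
    const model = resolveMlxModel(options);

    return {
        mlx: {
            // The command name mirrors the mlx_lm.server process named in this
            // ticket; the real task shape in TaskDefinitions.mjs may differ.
            command: 'mlx_lm.server',
            args: ['--model', model, '--port', String(MLX_PORT)]
        }
    };
}
```

Whether the Memory Core value can be passed to --model verbatim, or needs mapping to an MLX repository id, is not established by this ticket and would need checking during implementation.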
Contract Ledger Matrix
| Target Surface | Source of Authority | Proposed Behavior | Fallback | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| buildTaskDefinitions().mlx.args | Memory Core OpenAI-compatible model config + this ticket | Starts MLX with the configured Gemma 4 model instead of stale Gemma 2 literal | Explicit env/config override if present | JSDoc/summary only if helper introduced | Unit test inspects generated args |
| Orchestrator operator startup | node ./ai/scripts/orchestrator-daemon.mjs log sample | mlx_lm.server fetches/serves the intended model | Manual override remains possible through config/env | Existing daemon docs | Optional manual dry-run/log inspection |
| Model config observability | HealthService provider observability + memory-core config | Runtime-started local model matches the model config surface | Healthcheck still reports configured model | Existing healthcheck docs | Unit plus operator log evidence |
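The "Unit test inspects generated args" evidence could be backed by a check along these lines. It is written as a plain function so that it stays self-contained; in the repo it would presumably sit inside one of the existing Playwright unit specs. The `checkMlxArgs` name and its return shape are assumptions for illustration.

```javascript
// Hypothetical checker: returns a list of violations for a generated mlx args
// array. An empty list means the args satisfy the contract from this ticket.
const STALE_MODEL_FRAGMENT = 'gemma-2-27b-it-4bit';

function checkMlxArgs(args, expectedModel) {
    const problems = [];
    const flagIndex = args.indexOf('--model');
    const model = flagIndex === -1 ? undefined : args[flagIndex + 1];

    if (model !== expectedModel) {
        problems.push(`expected --model ${expectedModel}, got ${model}`);
    }

    // Reject the stale literal anywhere in the args, not only after --model.
    if (args.some(arg => String(arg).includes(STALE_MODEL_FRAGMENT))) {
        problems.push('stale Gemma 2 literal present');
    }

    return problems;
}
```

Applied to the current hardcoded args, this would report both problems; applied to args built from the configured model, it should return an empty list.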
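Acceptance Criteria
- ai/daemons/TaskDefinitions.mjs no longer contains mlx-community/gemma-2-27b-it-4bit as the default MLX task model.
- The MLX task model defaults to gemma-4-31b-it or the Memory Core OpenAI-compatible model config value.
- Unit coverage asserts that the mlx task args use the configured/default model and reject the stale Gemma 2 literal.
- node ./ai/scripts/orchestrator-daemon.mjs no longer attempts to fetch mlx-community/gemma-2-27b-it-4bit after restart (manual/operator validation acceptable if daemon startup is not safe inside CI).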
Out of Scope
- Reclassifying child stderr [INFO] lines from ERROR to INFO; that is #11459.
- Throttling repeated "Skipping knowledge base sync; task already running" logs; that is #11459.
- Changing embedding model/provider/vector dimension behavior.
- Migrating the OpenAI-compatible host/port contract unless required to eliminate the stale model literal.
- Reworking the full inference lifecycle service.
Related
- #11380 / PR #11382: introduced the stale Gemma 2 MLX task model via commit 94cd337c0d.
- #11459: adjacent ProcessSupervisor logging severity/throttle issue, distinct scope.
- ai/daemons/TaskDefinitions.mjs
- ai/mcp/server/memory-core/config.mjs
- ai/mcp/server/memory-core/config.template.mjs
Origin Session ID: 6ec143cb-2e5b-4964-94d6-eb28cb25bde2
Handoff Retrieval Hint: query_raw_memories(query="orchestrator mlx inference gemma-2-27b gemma-4-31b TaskDefinitions model mismatch 94cd337c")