Frontmatter
id: 9763
title: Migrate DreamService to MLX-Native OpenAI-Compatible Server
state: Closed
labels: enhancement, ai, agent-role:dev, performance
assignees: tobiu
createdAt: Apr 7, 2026, 9:57 PM
updatedAt: Apr 8, 2026, 1:03 AM
githubUrl: https://github.com/neomjs/neo/issues/9763
author: tobiu
commentsCount: 1
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: Apr 7, 2026, 10:01 PM

Migrate DreamService to MLX-Native OpenAI-Compatible Server

Closed | enhancement, ai, agent-role:dev, performance
tobiu commented on Apr 7, 2026, 9:57 PM

Currently, the autonomous REM sleep pipeline (DreamService.mjs) uses the Ollama provider. Calculating num_ctx dynamically for a 31B model saturates memory bandwidth and forces frequent KV cache resizing, which results in heavy prefill I/O overhead on Apple Silicon M-series chips.

Scope:

  1. Create a generic OpenAiCompatible.mjs provider inside src/ai/provider/.
  2. Refactor DreamService.mjs to target this new provider instead of Ollama, abandoning individual num_ctx manipulations and deferring dynamic KV-Cache paging to MLX.
  3. Add API connection settings to config.mjs.
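
To illustrate what item 2 removes, here is a minimal sketch of the two request bodies. Ollama's `/api/chat` endpoint accepts a per-request context size through `options.num_ctx`, while an OpenAI-style chat completion body has no such field. The helper names, the model name, and the numeric values are hypothetical and are not taken from DreamService.mjs.

```javascript
// Hypothetical helpers for illustration; not the actual DreamService.mjs code.

// Ollama's /api/chat accepts a per-request context size via options.num_ctx.
function buildOllamaBody(model, messages, numCtx) {
    return {
        model,
        messages,
        stream : false,
        options: {num_ctx: numCtx}
    };
}

// An OpenAI-style chat completion body carries no context-size field,
// so KV cache sizing is left entirely to the server.
function buildOpenAiCompatibleBody(model, messages, maxTokens) {
    return {
        model,
        messages,
        stream    : false,
        max_tokens: maxTokens
    };
}

// Placeholder prompt and values, chosen only for the example.
const exampleMessages = [{role: 'user', content: 'Summarize the latest session.'}];

console.log(buildOllamaBody('example-model', exampleMessages, 8192));
console.log(buildOpenAiCompatibleBody('example-model', exampleMessages, 1024));
```

With the second shape, the caller no longer has to estimate a context window before every call.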
tobiu added the enhancement label on Apr 7, 2026, 9:57 PM
tobiu added the ai label on Apr 7, 2026, 9:57 PM
tobiu added the agent-role:dev label on Apr 7, 2026, 9:57 PM
tobiu added the performance label on Apr 7, 2026, 9:57 PM
tobiu referenced in commit 0e315c4 - "feat: Migrate DreamService to MLX-Native OpenAI-Compatible Server (#9763)" on Apr 7, 2026, 10:00 PM
tobiu commented on Apr 7, 2026, 10:01 PM

Input from Antigravity (Gemini 3.1 Pro):

Task Completed!

The autonomous infrastructure migration to native MLX support is finished.

Summary of Modifications:

  1. A new OpenAiCompatible.mjs provider adapter has been implemented; it targets the JSON request and response format of the /v1/chat/completions endpoint.
  2. num_ctx logic has been surgically stripped from all DreamService.mjs execution loops (Tri-Vector Synthesis, Topology Conflict Reconciliation, Capability Gap Extraction).
  3. KV-cache management is now left to the MLX server instead of being driven by per-request num_ctx values.
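
For readers unfamiliar with the endpoint, the following is a rough sketch of what an adapter of this kind could look like. It is not the committed OpenAiCompatible.mjs: the class name, constructor options, and the injectable `fetchImpl` are assumptions made for illustration. Only the `/v1/chat/completions` path and the `choices[0].message.content` response field follow the OpenAI chat completion format.

```javascript
// Hypothetical sketch; not the actual src/ai/provider/OpenAiCompatible.mjs.
class OpenAiCompatibleSketch {
    constructor({baseUrl, model, apiKey = null, fetchImpl = (...args) => globalThis.fetch(...args)}) {
        this.baseUrl   = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
        this.model     = model;
        this.apiKey    = apiKey;
        this.fetchImpl = fetchImpl; // injectable, so the class can be tested offline
    }

    // Builds the URL and fetch options without performing any network I/O.
    buildRequest(messages, options = {}) {
        const headers = {'Content-Type': 'application/json'};

        if (this.apiKey) {
            headers.Authorization = `Bearer ${this.apiKey}`;
        }

        return {
            url : `${this.baseUrl}/v1/chat/completions`,
            init: {
                method: 'POST',
                headers,
                // No num_ctx here: context handling is left to the server.
                body: JSON.stringify({model: this.model, messages, stream: false, ...options})
            }
        };
    }

    // Sends the request and returns the first choice's text content.
    async chat(messages, options = {}) {
        const {url, init} = this.buildRequest(messages, options);
        const response    = await this.fetchImpl(url, init);

        if (!response.ok) {
            throw new Error(`Chat completion request failed with status ${response.status}`);
        }

        const data = await response.json();
        return data.choices?.[0]?.message?.content ?? '';
    }
}
```

Separating `buildRequest()` from `chat()` is a design choice of this sketch: the payload can be inspected or unit-tested without a running server.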

Note: The provider expects an OpenAI-compatible server (such as mlx_lm.server) to be listening on port 8000, as configured through the new openAiCompatible config setting.
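
As a sketch of how such a setting might be consumed, the fragment below resolves an endpoint URL from a config object. Only the `openAiCompatible` key and port 8000 come from the note above; the `host` and `model` fields, their values, and the `resolveChatEndpoint` helper are assumptions and may not match the real config.mjs.

```javascript
// Hypothetical config fragment; field names other than openAiCompatible are assumptions.
const sketchConfig = {
    openAiCompatible: {
        host : 'http://localhost',
        port : 8000,           // port named in the note above
        model: 'example-model' // placeholder model identifier
    }
};

// Resolves the chat completions endpoint from the host and port settings.
function resolveChatEndpoint({host, port}) {
    return new URL('/v1/chat/completions', `${host}:${port}`).href;
}

console.log(resolveChatEndpoint(sketchConfig.openAiCompatible));
```

If mlx_lm.server is used, it would need to be started on the matching port (it accepts a `--port` option), since the provider itself does not launch the server.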

Code has been committed and pushed to the active development branch. I am proceeding to close this ticket.

tobiu closed this issue on Apr 7, 2026, 10:01 PM
tobiu assigned to @tobiu on Apr 8, 2026, 1:03 AM