Frontmatter

id	11487
title	Add orchestrator maintenance backpressure
state	Closed
labels	bug, ai, architecture, performance, model-experience
assignees	neo-gpt
createdAt	May 16, 2026, 10:09 PM
updatedAt	May 16, 2026, 11:32 PM
githubUrl	https://github.com/neomjs/neo/issues/11487
author	neo-gpt
commentsCount	0
parentIssue	null
subIssues	[]
subIssuesCompleted	0
subIssuesTotal	0
blockedBy	[]
blocking	[]
closedAt	May 16, 2026, 11:32 PM

Add orchestrator maintenance backpressure

neo-gpt commented on May 16, 2026, 10:09 PM

Context

Operator logs on 2026-05-16 showed npm run ai:orchestrator restart from an already pressured laptop state, adopt existing long-lived daemons, then immediately start both periodic maintenance lanes:

[2026-05-16T19:36:21.072Z] [PID:42037] [INFO] [Orchestrator] Started. summaryInterval=600000ms kbSyncInterval=1800000ms poll=3000ms.
[2026-05-16T19:36:21.578Z] [PID:42037] [INFO] [ProcessSupervisor] Starting session summarization (periodic-sweep:600000).
[2026-05-16T19:36:21.579Z] [PID:42037] [INFO] [ProcessSupervisor] Starting knowledge base sync (periodic-sync:1800000).

During the same window, add_memory from @neo-gpt timed out/froze. A later Memory Core healthcheck showed the memory count had increased by one, so the write path may commit late while the request/response path remains unusable under pressure. The operator therefore instructed @neo-gpt to skip add_memory until the current pressure issue is fixed.

The Problem

The orchestrator currently treats each scheduled maintenance task independently. After restart, if multiple interval tasks are overdue, the first poll() can launch them in the same tick. That is safe for per-task singleton correctness but unsafe for shared-substrate pressure: summary, KB sync, graph ingestion, Dream/Sandman, and future heavy lanes compete for Chroma, SQLite, MLX, and the same laptop memory budget.

This is not solved by #11459. #11459 fixed child stderr severity and duplicate already-running skip log spam. It did not add cross-task scheduling backpressure.

This is also not a duplicate of #11065. #11065 is closed and explicitly listed cross-coordinator scheduling synchronization as out of scope: "don't spawn TWO heavy tasks at once even when both are due" was filable once contention was observed. The 2026-05-16 restart log is that observation.

The Architectural Reality

Empirical code anchors from current dev:

ai/daemons/Orchestrator.mjs:326-354 configures/recovers tasks and then calls this.poll() immediately after logging Started.
ai/daemons/Orchestrator.mjs:400-438 evaluates summary and kbSync with independent cadenceEngine.runIfDue(...) calls in the same poll() pass.
ai/daemons/services/CadenceEngine.mjs:52-54 uses pure interval due logic: intervalMs > 0 && now - lastRunAt >= intervalMs.
ai/daemons/services/CadenceEngine.mjs:64-70 executes every truthy trigger passed to runIfDue; it does not know whether another heavy task already started earlier in the poll.
ai/daemons/services/SummarizationCoordinatorService.mjs:25-43 returns a summary trigger for unread handovers or overdue interval; it does not account for KB sync or other active heavy tasks.
ai/daemons/services/ProcessSupervisorService.mjs:266-270 dedupes repeated starts for the same already-running task, but this does not prevent starting a different heavy task in the same poll.

learn/agentos/v13-path.md:115 already states the cadence model must be block-aware because graph-processing tasks can block add_memory while running. The current implementation still lacks the shared backpressure primitive that applies that direction across maintenance lanes.
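
The restart behavior described by these anchors can be reproduced in isolation. The sketch below is illustrative only: it reuses the due expression quoted from CadenceEngine.mjs:52-54 and the two intervals from the 2026-05-16 log line, while the isDue wrapper name, the task list shape, and the lastRunAt values are assumptions made up for the example, not code from the repository.

```javascript
// Due check as quoted in this ticket; `isDue` is a hypothetical wrapper name.
function isDue(intervalMs, lastRunAt, now) {
    return intervalMs > 0 && now - lastRunAt >= intervalMs;
}

// Intervals come from the logged Started line. The lastRunAt values are
// invented to model "both lanes overdue after a restart".
const now = Date.parse('2026-05-16T19:36:21.072Z');
const tasks = [
    {name: 'summary', intervalMs: 600000,  lastRunAt: now - 900000},
    {name: 'kbSync',  intervalMs: 1800000, lastRunAt: now - 2400000}
];

// Each task is evaluated independently, so nothing stops both from starting.
const startedInFirstPoll = tasks
    .filter(task => isDue(task.intervalMs, task.lastRunAt, now))
    .map(task => task.name);

console.log(startedInFirstPoll); // [ 'summary', 'kbSync' ]
```

Because each due check only sees its own interval and timestamp, the first poll after a restart has no way to notice that another heavy lane was started a moment earlier.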

The Fix

Add a narrow orchestrator maintenance backpressure primitive that gates heavy periodic work across task names.

Suggested shape:

Define a heavy-maintenance task set in the orchestrator scheduling layer. Initial candidates: summary, kbSync, primaryDevSync, dream, goldenPath; include backup only if implementation evidence shows it is materially contending, otherwise leave it out as a fast safety task.
During one poll() pass, allow at most one due heavy maintenance task to start. If a heavy task is already running at task-state level, defer other due heavy tasks.
Preserve continuous daemon supervision (chroma, memoryCoreChroma, bridgeDaemon, mlx) outside this backpressure gate.
Preserve per-task failure isolation and existing cadenceEngine.runIfDue(...) ergonomics where practical. The gate can live in Orchestrator.mjs or a small helper/service only if that reduces complexity without over-abstracting.
Log deferrals sparingly and below error severity. The log should answer "why did a due task not start?" without reintroducing per-poll noise.
Add unit tests that prove restart/overdue conditions start only one heavy task per poll while non-heavy/continuous supervision remains unaffected.
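
One possible shape for the gate, sketched under stated assumptions rather than taken from the implementation: planPoll, HEAVY_TASKS, and the input and output shapes are hypothetical names, and the sketch assumes that a deferred task stays due because its last-run timestamp is not advanced. The heavy task names are the initial candidates listed above; backup is treated as non-heavy here, matching the default the ticket suggests.

```javascript
// Hypothetical classification built from the candidate names in this ticket.
const HEAVY_TASKS = new Set(['summary', 'kbSync', 'primaryDevSync', 'dream', 'goldenPath']);

/**
 * Decides which due tasks may start in one poll pass.
 * @param {String[]} dueTasks     Names of tasks whose trigger is truthy, in evaluation order
 * @param {String[]} runningTasks Names of tasks currently running at task-state level
 * @returns {{start: String[], deferred: String[]}}
 */
function planPoll(dueTasks, runningTasks) {
    // A heavy task that is already running occupies the single heavy slot.
    let heavySlotTaken = runningTasks.some(name => HEAVY_TASKS.has(name));
    const start    = [];
    const deferred = [];

    for (const name of dueTasks) {
        if (!HEAVY_TASKS.has(name)) {
            start.push(name);      // non-heavy work is never gated
        } else if (heavySlotTaken) {
            deferred.push(name);   // assumed to stay due and be retried on a later poll
        } else {
            start.push(name);
            heavySlotTaken = true; // at most one heavy start per poll
        }
    }

    return {start, deferred};
}

console.log(planPoll(['summary', 'kbSync', 'backup'], []));
// { start: [ 'summary', 'backup' ], deferred: [ 'kbSync' ] }
console.log(planPoll(['kbSync'], ['summary']));
// { start: [], deferred: [ 'kbSync' ] }
```

Keeping the decision in a pure function like this would make the restart/overdue case easy to unit test without spawning processes. Continuous daemon supervision does not pass through it at all, so it cannot be blocked by a running heavy task.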

Contract Ledger Matrix

Target Surface	Source of Authority	Proposed Behavior	Fallback	Docs	Evidence
Orchestrator maintenance scheduler	This ticket + `learn/agentos/v13-path.md:115` + operator 2026-05-16 restart log	Heavy scheduled tasks are backpressured across task names so restart does not launch every overdue lane at once	Existing per-task singleton guard still prevents duplicate starts of the same task	JSDoc on helper/config if the task classification is not self-evident	Unit test with overdue `summary` + `kbSync` proves one start and one deferral
Deferral observability	#11459 logging discipline + this ticket	Deferrals are visible but sparse, non-error logs	Health outcome may record skipped/deferred state if existing health semantics support it	Inline comment only where needed	Unit test asserts no duplicate per-poll log flood
Continuous daemon supervision	Existing orchestrator daemon contract	`chroma`, `memoryCoreChroma`, `bridgeDaemon`, and `mlx` restart checks remain outside heavy-maintenance backpressure	Current cooldown remains authoritative	Existing docs unchanged	Existing orchestrator tests continue passing
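
The "visible but sparse" requirement in the Deferral observability row could be met by logging a deferral only when its reason changes. The following is a minimal sketch of that idea, not the project's logger: DeferralLog, its method names, the sink callback signature, and the message text are all hypothetical.

```javascript
// Hypothetical helper: remembers the last logged blocker per deferred task.
class DeferralLog {
    constructor(sink) {
        this.sink        = sink;      // function(level, message); an assumption, not the real logger API
        this.lastBlocker = new Map(); // task name -> blocker name that was last logged
    }

    // Called on every poll in which `task` is deferred because `blocker` is running.
    deferred(task, blocker) {
        if (this.lastBlocker.get(task) === blocker) {
            return; // same reason as the previous poll: stay quiet
        }
        this.lastBlocker.set(task, blocker);
        this.sink('info', `Deferred ${task}: heavy task ${blocker} is running.`);
    }

    // Called when `task` finally starts, so a later deferral is logged again.
    started(task) {
        this.lastBlocker.delete(task);
    }
}

const lines = [];
const log   = new DeferralLog((level, message) => lines.push({level, message}));

for (let poll = 0; poll < 5; poll++) {
    log.deferred('kbSync', 'summary'); // five polls, one log line
}
log.started('kbSync');
log.deferred('kbSync', 'dream');       // new reason, second log line

console.log(lines.length); // 2
```

With the 3000ms poll interval from the log, a ten-minute deferral would otherwise produce about 200 identical lines; keyed deduplication keeps one line per reason while still answering why a due task did not start.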

Acceptance Criteria

Restart/first-poll scheduling cannot start both summary and kbSync in the same poll when both are due.
A running heavy maintenance task defers other due heavy maintenance tasks instead of starting them concurrently.
Continuous daemon restart checks remain unaffected by the heavy-maintenance gate.
Deferral logs are non-error and deduped or naturally sparse enough to avoid per-poll noise.
Unit coverage proves the cross-task gate and preserves existing scheduling/failure-isolation behavior.
No changes reduce child stderr visibility or undo #11459’s child log severity classification.
No changes alter add_memory semantics directly; this ticket reduces orchestrator-created pressure around it.

Out of Scope

Rewriting Memory Core write handling or Chroma internals.
Changing the embedding model, vector dimensions, or KB sync content semantics.
Reintroducing boot-time auto-summarize/auto-dream behavior.
Implementing SandmanCoordinatorService itself; #11065 covered that lane and is already closed.
Adding adaptive latency thresholds. A simple deterministic backpressure gate is enough for this incident.

Avoided Traps

Naive interval-only scheduling: already shown to start overdue lanes together after restart.
A global stop-the-world lock: continuous daemons still need supervision and should not be blocked by a KB sync.
Hiding deferrals entirely: silent non-starts make operator diagnosis worse.
Treating #11459 as sufficient: logging fixes reduce noise, not shared resource pressure.

Related

#11459: fixed child stderr severity and already-running skip log spam.
#11065: closed SandmanCoordinatorService ticket; explicitly left cross-task heavy scheduling synchronization as future scope if observed.
#10576: prior KB sync volume/backpressure and add_memory contention evidence.
learn/agentos/v13-path.md:115: block-aware cadence direction.

Handoff Retrieval Hint: query_raw_memories(query="orchestrator restart summary kbSync add_memory freeze heavy task backpressure")

tobiu referenced in commit 1c9ed50 - "fix(ai): add orchestrator maintenance backpressure (#11487) (#11489)" on May 16, 2026, 11:32 PM

tobiu closed this issue on May 16, 2026, 11:32 PM