What is the 'Neural Link'?

The Neural Link is a bi-directional bridge that connects AI agents (like Gemini or Claude) directly to the Neo.mjs runtime. It allows agents to 'see' the application's Scene Graph, inspect component state, verify event listeners, and even mutate the running application in real-time. This turns the application into a 'glass box' for AI, enabling autonomous debugging and feature development.

Why is Neo.mjs called an 'Application Engine' instead of a framework?

Traditional frameworks are general-purpose libraries (like a Toyota) that help you organize code, but they compile away into ephemeral DOM nodes. Neo.mjs is a precision-engineered runtime (like an F1 car), similar to Unreal Engine for games. It maintains a persistent 'Scene Graph' of objects in a separate worker thread. These objects retain their identity, state, and relationships, allowing for advanced capabilities like multi-window orchestration, runtime permutation, and deep AI introspection that are impossible with 'melted plastic' DOM-based frameworks.

What is 'Context Engineering'?

Context Engineering is the practice of curating the environment and information flow for AI agents to maximize their autonomy. In Neo.mjs, this is implemented via three Model Context Protocol (MCP) servers: a Knowledge Base for semantic code understanding, a Memory Core for learning from past sessions, and a GitHub Workflow server for project management. This ecosystem allows agents to work as fully integrated members of the development team.

What is 'Object Permanence' in the context of Neo.mjs?

In Neo.mjs, UI components are persistent JavaScript objects living in the App Worker, not just transient rendering results. This 'Object Permanence' means a component (like a Dashboard) maintains its state (scroll position, user input, internal logic) even if it is detached from the DOM or moved to a different browser window. This is the 'Lego Technic' approach versus the 'Duplo' approach of traditional frameworks.

What is an 'Agent Operating System'?

Neo.mjs v11 introduced the concept of an Agent OS, where the platform itself provides the tools and interfaces for AI agents to operate. It combines a standalone, type-safe AI SDK for autonomous 'Code Execution' with the Neural Link for runtime control. This enables agents to monitor, debug, and heal the application autonomously, effectively acting as an operating system for synthetic intelligence.

How does Neo.mjs handle multi-window applications?

Neo.mjs uses shared web workers to run a single application instance across multiple browser windows. All windows share the same application state and data in real-time. Components can even move between windows while retaining their JavaScript instances. This enables desktop-class experiences like multi-window IDEs, browser-based email clients, and multi-screen control rooms.

What makes Neo.mjs different from React, Angular, or Vue?

Neo.mjs is fundamentally different in its architecture: (1) It uses True Multithreading via web workers to prevent UI jank, (2) It is an AI-Native Application Engine with built-in bridges for AI collaboration, and (3) It treats components as persistent objects in a Scene Graph, enabling native multi-window support and runtime permutation.

Frontmatter

id	11395
title	Broaden TextEmbeddingService model-unload retry to cover LM Studio 'Failed to load model' / 'Operation canceled' shape
state	Closed
labels	bug, ai, agent-task:pending, model-experience
assignees	neo-gemini-3-1-pro
createdAt	May 15, 2026, 5:26 AM
updatedAt	May 15, 2026, 10:23 AM
githubUrl	https://github.com/neomjs/neo/issues/11395
author	neo-opus-4-7
commentsCount	0
parentIssue	null
subIssues	[]
subIssuesCompleted	0
subIssuesTotal	0
blockedBy	[]
blocking	[]
closedAt	May 15, 2026, 10:23 AM

Broaden TextEmbeddingService model-unload retry to cover LM Studio 'Failed to load model' / 'Operation canceled' shape

Name: Neo.mjs Application Engine
Author: Neo.mjs

neo-opus-4-7 commented on May 15, 2026, 5:26 AM

Context

Follow-up to #11393 / PR #11394 (fix(memory-core): implement retry-on-unload for openAiCompatible embeddings). The original ticket filed at 03:13Z scoped the retry-on-unload detection to the single empirically-observed error shape at that time ("Model was unloaded while the request was still in queue.."). PR #11394 implemented the literal scope correctly + landed Cycle-1 APPROVED.

Post-filing empirical evidence (same session): at 03:17Z (~4 minutes after #11393 was filed) the same session hit a SECOND distinct LM-Studio failure shape, structurally representing the same substrate-friction class:

openAiCompatible embedding error HTTP 400:
{"error":"Failed to load model \"text-embedding-qwen3-embedding-8b\". Error: Operation canceled."}

This is captured verbatim in MESSAGE:36b84262-2e96-49c6-9a48-51528bc65fea (the self-DM fallback turn-memory from that failure occurrence). The literal "Model was unloaded" detection in PR #11394 does NOT match this error shape, so the retry path does not fire and the error propagates unchanged — failing the AGENTS.md §0 Invariant 5 reliability gate that #11393's full-class fix intended to restore.

The Problem

Two distinct LM-Studio error shapes empirically observed in a single nightshift session represent the same substrate-friction class:

Shape A ("Model was unloaded while the request was still in queue.."): emitted when LM Studio has JIT-unloaded the model due to idle timeout and a new request arrives while the unload state is still in the request queue. Currently handled by PR #11394's detection.
Shape B ("Failed to load model "<MODEL>". Error: Operation canceled."): emitted when LM Studio attempts to JIT-warm a model on a fresh request but the load operation is canceled (likely a different RAM-pressure / queueing-race condition, possibly when LM Studio's load operation itself fails partway through). NOT currently handled.

Both shapes:

Surface as HTTP 400 from /v1/embeddings
Indicate the same architectural state ("embedding model not resident, request cannot be served")
Have the same correct semantic response (retry-with-warmup-delay; the warmup-delay gives LM Studio a chance to complete the load operation that was either pending or canceled)

The current PR #11394 detection's substring match on "Model was unloaded" is the narrowest possible interpretation of LM-Studio's substrate-friction class.

The Architectural Reality

Detection callsite: ai/services/memory-core/TextEmbeddingService.mjs:115-119 (post-#11394-merge)
Current detection: err.message.includes('HTTP 400') && err.message.includes('Model was unloaded')
Empirical evidence: self-DM MESSAGE:36b84262 (same-session post-filing observation of Shape B); self-DM MESSAGE:3af300ee (session anchor of Shape A)

The Fix

Broaden the LM-Studio substrate-friction-class detection to catch BOTH error shapes (and remain extensible for any future LM-Studio variants of the same class). Proposed prescription:

// ai/services/memory-core/TextEmbeddingService.mjs — replace the current condition:
const isModelLoadError = err.message.includes('HTTP 400') && (
    err.message.includes('Model was unloaded') ||              // Shape A — JIT-unload-then-queued-request
    (err.message.includes('Failed to load model') &&            // Shape B — JIT-warm-load-canceled
     err.message.includes('Operation canceled'))
);

if (retriesLeft > 0 && isModelLoadError) {
    logger.log(`[TextEmbeddingService] embedding-provider model-load failure detected (Shape ${err.message.includes('Model was unloaded') ? 'A' : 'B'}), retrying (remaining retries: ${retriesLeft})`);
    await new Promise(r => setTimeout(r, unloadRetryDelayMs));
    return this.#postOpenAiCompatible(inputData, retriesLeft - 1);
}

The Shape-A-vs-Shape-B annotation in the log line gives operators visibility into which substrate-friction variant fired, which is useful for future tuning if one shape dominates the other.
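The detection and the log annotation can also be derived from a single classification step, so the 'Model was unloaded' check is evaluated only once. The following is a minimal sketch, not the code that landed: `classifyModelLoadError` and `formatRetryLog` are hypothetical names introduced here for illustration, and the sample message simply wraps the error body quoted in the Context section.

```javascript
// Hypothetical helper (not part of the Neo.mjs codebase): maps an error
// message to 'A', 'B', or null when the error is not retry-eligible.
function classifyModelLoadError(message) {
    if (!message.includes('HTTP 400')) {
        return null;
    }
    if (message.includes('Model was unloaded')) {
        return 'A'; // JIT-unload while the request was still queued
    }
    if (message.includes('Failed to load model') && message.includes('Operation canceled')) {
        return 'B'; // JIT-warm load was canceled
    }
    return null; // any other HTTP 400 propagates without retry
}

// Hypothetical helper: builds the log line from the already-known shape.
function formatRetryLog(shape, retriesLeft) {
    return `[TextEmbeddingService] embedding-provider model-load failure detected (Shape ${shape}), retrying (remaining retries: ${retriesLeft})`;
}

const sampleShapeB = 'openAiCompatible embedding error HTTP 400: ' +
    '{"error":"Failed to load model \\"text-embedding-qwen3-embedding-8b\\". Error: Operation canceled."}';

const detectedShape = classifyModelLoadError(sampleShapeB);

if (detectedShape !== null) {
    console.log(formatRetryLog(detectedShape, 2));
}
```

A `null` result keeps the existing behavior for every other error: the caller rethrows without retrying.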

Acceptance Criteria

AC1: Detection condition expanded to match both Shape A and Shape B error patterns as enumerated above.
AC2: Existing PR #11394 spec coverage extended with a new test case: 'first-call-fails-shape-b-second-call-succeeds path with mock client' — mock-server emits the "Failed to load model ... Operation canceled" shape on call 1, success on call 2; assertion: 2 requests total, retry fires.
AC3: Existing PR #11394 spec coverage extended with: 'exhausted-retry-final-failure path with shape b' — mock-server emits Shape B on all calls; assertion: error propagates, request count matches retry-count + 1.
AC4: Log-line includes the Shape-A-vs-Shape-B annotation per the prescription above; spec verifies the log substance via captured logger output (if existing logger-mocking pattern in the spec supports it; if not, this AC is N/A for spec coverage and remains a manual-verification observation).
AC5: Non-load-class HTTP 400 errors still propagate without retry (the existing PR #11394 spec 'propagates non-unload HTTP 400 errors immediately without retries' must continue passing after the condition expansion; it currently emits "Some other bad request error", which matches neither the Shape A nor the Shape B substrings).
AC6: No regression on the original Shape A detection — existing PR #11394 specs continue passing.
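The request-count assertions in AC2, AC3, AC5, and AC6 can be sketched with a scripted mock client. This is an illustration under assumptions, not the actual spec file: `createMockClient`, `postWithRetry`, and `runScenario` are hypothetical stand-ins, the retry delay is set to 0 to keep the run fast, and the message constants are abbreviated forms of the shapes described in this ticket.

```javascript
// Abbreviated error messages; only the matched substrings matter here.
const SHAPE_A = 'HTTP 400: {"error":"Model was unloaded while the request was still in queue.."}';
const SHAPE_B = 'HTTP 400: {"error":"Failed to load model \\"<MODEL>\\". Error: Operation canceled."}';
const OTHER   = 'HTTP 400: {"error":"Some other bad request error"}';

// Scripted mock: each call consumes the next script entry (the last entry
// repeats). A string entry is thrown as an error; anything else is returned.
function createMockClient(script) {
    const client = {
        requests: 0,
        async post() {
            const step = script[Math.min(client.requests, script.length - 1)];
            client.requests++;
            if (typeof step === 'string') {
                throw new Error(step);
            }
            return step;
        }
    };
    return client;
}

// Minimal subject under test: retries only on the two model-load shapes.
async function postWithRetry(client, retriesLeft, delayMs = 0) {
    try {
        return await client.post();
    } catch (err) {
        const isModelLoadError = err.message.includes('HTTP 400') && (
            err.message.includes('Model was unloaded') ||
            (err.message.includes('Failed to load model') &&
             err.message.includes('Operation canceled'))
        );
        if (retriesLeft > 0 && isModelLoadError) {
            await new Promise(resolve => setTimeout(resolve, delayMs));
            return postWithRetry(client, retriesLeft - 1, delayMs);
        }
        throw err;
    }
}

// Runs one scenario and reports the request count and whether it failed.
async function runScenario(script, retries) {
    const client = createMockClient(script);
    try {
        const result = await postWithRetry(client, retries);
        return {requests: client.requests, failed: false, result};
    } catch (err) {
        return {requests: client.requests, failed: true, message: err.message};
    }
}

// AC2-style scenario: Shape B on call 1, success on call 2.
runScenario([SHAPE_B, {embedding: [0.1, 0.2]}], 2)
    .then(outcome => console.log(outcome.requests, outcome.failed));
```

With this harness, the exhausted-retry case (AC3) is a script containing only `SHAPE_B`, and the no-retry case (AC5) is a script containing only `OTHER`.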

Out of Scope

Regex-based vs substring-based detection — prescription uses substring matching for consistency with PR #11394's existing pattern. Switching to regex would be a broader refactor; not load-bearing here.
LM-Studio-version-specific error-shape detection — the substrate is robust as long as LM-Studio's error-shapes stay stable. If a future LM-Studio version emits a new shape, file another follow-up; don't try to enumerate future shapes speculatively.
Daemon-managed embedding-endpoint pattern — companion to #11380 broader-scope; still out-of-scope for the narrow retry-detection fix. May become relevant as a Lane B follow-up if narrow retry continues to surface edge-shapes.
Provider-side coordination — fixing LM Studio's load-cancellation behavior is operator/LM-Studio's responsibility, not Memory Core's. We mitigate via retry-with-warmup-delay.

Avoided Traps

Treat ALL "Failed to load model" errors as retry-eligible: rejected. The "Operation canceled" co-condition is what specifically signals "load-attempt-canceled-but-retry-might-succeed". A "Failed to load model" paired with a cause other than "Operation canceled" (e.g., "Model file not found") indicates a different failure class (e.g., model evicted from disk) that retry won't fix. The substring AND-condition preserves the narrow class-shape.
Catch-all HTTP-400 retry: rejected (already enforced in PR #11394 via the "Some other bad request error" test case). Generic 400 retry would mask real configuration bugs.
Add the Shape-B detection in PR #11394 mid-Cycle-1: rejected as ticket-author goalpost-moving (per Cycle-1 review on #11394's Strategic-Fit rationale). PR #11394 implemented per literal #11393 AC1; the broader class is this follow-up's scope.
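The first trap can be made concrete by comparing the rejected broad predicate with the AND-condition on two messages. This is a small sketch under assumptions: `broadMatch` and `narrowMatch` are hypothetical names, and the "Model file not found" message is a made-up example of a non-transient cause, not an error string observed from LM Studio.

```javascript
// Rejected: treats every "Failed to load model" error as retry-eligible.
const broadMatch = message => message.includes('Failed to load model');

// Chosen: additionally requires the "Operation canceled" cause.
const narrowMatch = message =>
    message.includes('Failed to load model') &&
    message.includes('Operation canceled');

// First message: the Shape B form from this ticket.
// Second message: hypothetical non-transient cause, for illustration only.
const canceledLoad = 'Failed to load model "<MODEL>". Error: Operation canceled.';
const missingFile  = 'Failed to load model "<MODEL>". Error: Model file not found.';

for (const message of [canceledLoad, missingFile]) {
    console.log({
        message,
        broad : broadMatch(message),
        narrow: narrowMatch(message)
    });
}
```

Only the narrow predicate distinguishes the two cases: the broad one would spend retries and warmup delay on a failure that a retry cannot resolve.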

Predecessor ticket: #11393 — original narrower-scope retry-on-unload ticket.
Predecessor PR: #11394 — Cycle-1 APPROVED+Follow-Up; merged when @tobiu executes merge gate. This ticket's implementation builds on PR #11394's #postOpenAiCompatible private-method refactor.
Companion broader-scope substrate-pattern: #11380 — daemon-managed local-supporting-services; future Lane B if narrow retry continues to surface edge-shapes.
Empirical anchor self-DMs (private mailbox; A2A graph nodes): MESSAGE:3af300ee (Shape A) + MESSAGE:36b84262 (Shape B).
AGENTS.md §0 Invariant 5: "No skipping add_memory at end of turn" — both shapes break this gate; this ticket completes the substrate-friction-class coverage that #11393 started narrower.

Origin Session

Origin Session ID: e095c569-beac-4743-998f-e07d4344492e

Retrieval Hint

Search for LM Studio embedding model load Operation canceled JIT warm-load retry shape-b broader-class.

tobiu closed this issue on May 15, 2026, 10:23 AM

tobiu referenced in commit 805f779 - "fix(memory-core): broaden embedding-retry detection to LM Studio Shape B (#11395) (#11396)" on May 15, 2026, 10:23 AM