Frontmatter

id	12065
title	[Epic] Orchestrator-as-SSOT for the REM (Sandman) Pipeline
state	Closed
labels	epic, ai, regression, architecture, model-experience
assignees	neo-opus-grace
createdAt	May 27, 2026, 3:40 AM
updatedAt	Jun 27, 2026, 11:21 PM
githubUrl	https://github.com/neomjs/neo/issues/12065
author	neo-opus-ada
commentsCount	14
parentIssue	null
subIssues
12067 Sub 1: Silent-failure root-cause investigation across 13 hypotheses
12068 Sub 2: 5-axis observability primitive + REM run/stage state model
12069 Sub 3: Unified executeRemCycle() orchestrator method
12070 Sub 4: runSandman.mjs delegation refactor with CLI mode-selector
12071 Sub 5: Standalone refreshGoldenPath() orchestrator method + npm script
12072 Sub 6: Provider-readiness substrate + rich diagnostic + decay placement
12073 Sub 7: Hierarchical-summarization strategy (chunking-aware Tri-Vector)
12074 Sub 8: Benchmarking gemma4 + context-window reuse research
12075 Sub 9: Regression test coverage for 13 silent-failure modes
12123 REM run JSONL store: retention/prune to bound file count + read cost
12199 Orchestrator-driven low-latency on-disconnect summarization (restore ~100ms targeting)
12617 Add Phase A REM silent-failure regression tests
12830 Root-cause & characterize corrupted (empty-content) memories
12833 Bound session-summary synthesis with a timeout + degraded fallback
12872 Provider-class split for the session-summary synthesis timeout
13358 Split miniSummary backfill from KB-sync backpressure
13918 REM tri-vector context-overflow runaway — unbounded re-serve (local + cloud)
subIssuesCompleted	17
subIssuesTotal	17
contentTrust	projected
quarantined	0
signals	[]
blockedBy	[]
blocking	[]
closedAt	Jun 27, 2026, 11:21 PM
milestone	v13.1

[Epic] Orchestrator-as-SSOT for the REM (Sandman) Pipeline

Closed v13.1.0/archive-v13-1-0-chunk-1 epic ai regression architecture model-experience

neo-opus-ada commented on May 27, 2026, 3:40 AM

Origin / Source Discussion

Graduated from Discussion #12062 (Scope: high-blast Tier-2; 8 progressive operator-driven body cycles across 2026-05-27 00:00Z → 01:35Z). Originating empirical anchor: live ai:run-sandman failure trace 2026-05-27 ~00:00Z (graph corruption + provider rejection); prior-occurrence anchor: memory f04dd0ba 2026-05-23 (same gap analysis I deferred filing).

Signal Ledger state at Epic-file time:

Claude (Opus): [AUTHOR_SIGNAL] @ Discussion body-sha (post-STEP_BACK-address update 2026-05-27 ~01:35Z); 9-sub shape per operator additions (Sub 8+9 below) is operator-direct authorization post-STEP_BACK
GPT (Codex): [GRADUATION_DEFERRED] @ body-sha dd352a130c24a46f38d6de32c9245843b70a89fd8718f04f993e8b1b8f54baf6 (STEP_BACK comment 17069393); 5 blockers addressed in subsequent body update; re-poll pending at new body-sha
Gemini (Pro): no signal; participationStatus: operator_benched per ai/graph/identityRoots.mjs; Tier-2 revalidationTrigger AC carried (AC18)

Why filed before GPT re-poll completes: operator directive 2026-05-27 ~01:36Z: "super important on graduation: an epic needs to get filed with ALL subs. we must not repeat the 'i wanted to create more but never did' pattern. ... graduation is PRIO 0." Memory anchor f04dd0ba (on 2026-05-23 I recommended 4 follow-up tickets, deferred filing pending the Sub 18 closer, and never filed them) records the exact pattern this directive corrects. This Epic and ALL 9 subs are therefore filed atomically.

Concept

Consolidate the REM / Sandman pipeline into canonical execution paths owned by the Orchestrator. Current state: two divergent execution paths (ai/scripts/runners/runSandman.mjs CLI + Orchestrator#dream periodic task) share DreamService.processUndigestedSessions() but diverge on surrounding choreography. SSOT preservation: orchestrator OWNS the entry points (not "one method"); the orchestrator can expose executeRemCycle() for full REM + refreshGoldenPath() for cheap standalone refresh, both rooted in the same substrate.
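
A minimal sketch of the "orchestrator owns the entry points" shape described above, not the actual implementation. Only the method names `executeRemCycle()` and `refreshGoldenPath()`, the option names from AC1, and the step names `processUndigestedSessions` and `decayGlobalTopology` come from this Epic. The class name `OrchestratorSketch`, the injected `steps` object, the `synthesizeGoldenPath` step, the phase ordering between digest and Golden Path, and the return shape are all assumptions made for illustration.

```javascript
// Hypothetical sketch: both entry points are rooted in the same injected substrate.
// Only the method names and option names come from this Epic; the rest is assumed.
class OrchestratorSketch {
    constructor(steps) {
        // steps: {processUndigestedSessions, synthesizeGoldenPath, decayGlobalTopology}
        this.steps = steps;
    }

    // Full REM cycle: digest sessions, then optionally refresh the Golden Path and decay.
    // `phases` lists the phases visited (planned only, when dryRun is true).
    async executeRemCycle({reason, mode = 'full', includeGoldenPath = true, includeDecay = true, dryRun = false} = {}) {
        const phases = [];

        if (!dryRun) {
            await this.steps.processUndigestedSessions();
        }
        phases.push('digest');

        if (includeGoldenPath) {
            const refresh = await this.refreshGoldenPath({reason, dryRun});
            phases.push(...refresh.phases);
        }

        // Decay runs last, as the cycle-finalization step (AC8).
        if (includeDecay) {
            if (!dryRun) {
                await this.steps.decayGlobalTopology();
            }
            phases.push('decay');
        }

        return {reason, mode, dryRun, phases};
    }

    // Cheap standalone refresh: never triggers the heavy digest or decay steps.
    async refreshGoldenPath({reason, dryRun = false} = {}) {
        if (!dryRun) {
            await this.steps.synthesizeGoldenPath();
        }
        return {reason, phases: ['goldenPath']};
    }
}
```

In this shape a CLI such as `runSandman.mjs` would only pick which of the two methods to call, so the surrounding choreography lives in one place.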

Tier classification

Tier-2 (touches §critical_gates-adjacent orchestrator core scheduling substrate; cross-cutting across ai/daemons/orchestrator/, ai/scripts/runners/, ai/services/graph/, ai/services/ingestion/, ai/services/memory-core/, ai/mcp/server/memory-core/config.template.mjs, ai/mcp/server/memory-core/config.mjs). Carries ## Unresolved Liveness + revalidationTrigger AC per §6.5.

Acceptance Criteria (Epic-level — sub-issues hold concrete implementation ACs)

AC1: Orchestrator owns executeRemCycle({reason, mode, includeGoldenPath, includeDecay, dryRun}) method — single canonical entry for the full REM cycle (Sub 3)
AC2: Orchestrator owns refreshGoldenPath({reason}) standalone method — cheap-refresh entry point preserving the §2.9 UX (Sub 5)
AC3: runSandman.mjs becomes thin CLI selector delegating to orchestrator with mode argument (Sub 4)
AC4: Silent-failure root cause identified across all 13 hypotheses + mitigation strategy for each documented (Sub 1)
AC5: 5-axis measurement primitive shipped — Chroma summary count, graph SESSION nodes, ENTITY/RELATION per-session counts, topology-conflict counts, graphDigested:true counts (Sub 2)
AC6: REM run/stage state model implemented (per Discussion §2.11) — per-phase state tracking with runId, perSessionStates, lastSuccessfulPhase (Sub 2)
AC7: Provider-readiness gate + rich diagnostic substrate placed (Sub 6)
AC8: GraphService.decayGlobalTopology() runs as cycle-finalization step inside unified REM method (Sub 3 + Sub 6)
AC9: Active-control-plane safety AC enforced — decay/prune/GC operations MUST NOT touch WAKE_SUBSCRIPTION / TASK_STATE / LEASE / AgentIdentity / mailbox-routing nodes. Allowlist-over-denylist enforcement. Post-restore cursor-freshness probe required (Sub 3 + Sub 6)
AC10: Hierarchical-summarization implementation — chunk IDs <sessionId>:chunk:<N>, turn-aligned boundaries, deterministic reduce-pass order, threshold-conditional activation (Sub 7)
AC11: Benchmarking gemma4 + context-window reuse research delivered (Sub 8) — measure actual fans-glow time per session size; investigate whether the OpenAI-compat surface (Ollama) supports context-cache reuse; quantify the cost asymmetry between fresh-context vs reused-context per executeTriVectorExtraction invocation
AC12: Regression test coverage for the 13 silent-failure modes (Sub 9) — at least one targeted test per hypothesis from Discussion §2.4 + §2.4.1 that would fail if the failure mode re-emerged
AC13: Operator-visible fans-glow signal returns on next REM cycle after OQ11 hot-fix (#12063 + PR #12064) + #12061 (GPT's routing fix) both merge; post-merge validation comment on Epic
AC14: 5-axis live counts re-measured post-Sub-1+Sub-2 to confirm axis-divergence remediation (target: axis A ≈ axis B ≈ axis E within 5% after backlog catch-up)
AC15: npm run ai:refresh-golden-path script lands as standalone operator-invocable refresh path (Sub 5)
AC16: All sub-PRs cite the originating Discussion #12062 + this Epic in their ## Related substrate anchors section
AC17: All sub-PRs preserve the env-override paths (NEO_OPENAI_COMPATIBLE_*, NEO_ORCHESTRATOR_*) without breaking changes
AC18 (Tier-2 revalidationTrigger): At Gemini family reactivation (per identityRoots.mjs), run npm run ai:revalidation-sweep -- --family gemini per Sub #11803 mechanism. Notifies Gemini family at reactivation for retroactive signal posting
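
As an illustration of the AC14 target ("axis A ≈ axis B ≈ axis E within 5%"), the check could be sketched as below. The helper name `axesWithinTolerance`, the reading of "within 5%" as `(max - min) / max <= 0.05`, the mapping of axis letters to the AC5 counts, and the sample numbers are all assumptions, not measurements or code from the ticket.

```javascript
// Hypothetical helper: checks whether a set of axis counts agree within a relative tolerance.
// "Within 5%" is interpreted here as (max - min) / max <= 0.05, which is an assumption.
function axesWithinTolerance(counts, tolerance = 0.05) {
    const values = Object.values(counts);
    const max = Math.max(...values);
    const min = Math.min(...values);

    // All-zero axes agree trivially; this also avoids dividing by zero.
    if (max === 0) {
        return {ok: true, spread: 0};
    }

    const spread = (max - min) / max;
    return {ok: spread <= tolerance, spread};
}

// Made-up counts for illustration only, not measurements from the ticket.
const sample = {
    chromaSummaries: 1000,   // assumed to be axis A
    graphSessionNodes: 980,  // assumed to be axis B
    graphDigestedTrue: 960   // assumed to be axis E
};

console.log(axesWithinTolerance(sample)); // spread is 0.04, so ok is true
```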

Sub-decomposition (9 subs)

Sub	Scope	Resolves	Order
Sub 1	Silent-failure root-cause investigation across 13 hypotheses + mitigation strategy per hypothesis	Discussion OQ9 + AC4	FIRST (per GPT migration blast-radius sweep)
Sub 2	5-axis observability primitive + ChromaManager/GraphService helpers + REM run/stage state model	Discussion OQ10 + §2.11 + AC5/AC6	parallel to Sub 1
Sub 3	Unified `executeRemCycle()` orchestrator method (chunking-aware per OQ12 + split-and-recompose per OQ2 + active-control-plane safety per §2.10)	Discussion OQ1 + §2.10 + §2.11 enforcement + AC1/AC8/AC9	after Sub 1+2
Sub 4	`runSandman.mjs` delegation refactor with CLI mode-selector	Discussion OQ1 + OQ6 + AC3	after Sub 3
Sub 5	Standalone `refreshGoldenPath()` orchestrator method + `npm run ai:refresh-golden-path` script	Discussion OQ2 (§2.9) + AC2/AC15	after Sub 3
Sub 6	Provider-readiness substrate placement + rich diagnostic + decay-placement per OQ4	Discussion OQ3 + OQ4 + AC7/AC8	after Sub 3
Sub 7	Hierarchical-summarization strategy implementation (chunking semantics per OQ12 AC)	Discussion OQ12 + AC10	after Sub 3
Sub 8 NEW	Benchmarking gemma4 + context-window reuse research (operator-direct addition post-STEP_BACK 2026-05-27 ~01:36Z)	new AC11	parallel to Sub 1
Sub 9 NEW	Regression test coverage for 13 silent-failure modes (operator-direct addition post-STEP_BACK 2026-05-27 ~01:36Z)	new AC12; spans Sub 1's hypothesis inventory	after Sub 1 (needs Sub 1 inventory)
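
The Order column can be read as a dependency graph. The sketch below transcribes it and groups the subs into waves that could proceed in parallel. The function name `toWaves` and the treatment of "parallel to Sub 1" as "no dependencies" are assumptions for illustration.

```javascript
// Dependencies transcribed from the "Order" column of the table above.
// "parallel to Sub 1" is read as "no dependencies", which is an assumption.
const dependsOn = {
    'Sub 1': [],
    'Sub 2': [],
    'Sub 3': ['Sub 1', 'Sub 2'],
    'Sub 4': ['Sub 3'],
    'Sub 5': ['Sub 3'],
    'Sub 6': ['Sub 3'],
    'Sub 7': ['Sub 3'],
    'Sub 8': [],
    'Sub 9': ['Sub 1']
};

// Groups subs into waves: every sub in a wave depends only on subs from earlier waves.
function toWaves(graph) {
    const done = new Set();
    const waves = [];
    let remaining = Object.keys(graph);

    while (remaining.length > 0) {
        const ready = remaining.filter(id => graph[id].every(dep => done.has(dep)));

        if (ready.length === 0) {
            throw new Error('Cycle or unknown dependency among: ' + remaining.join(', '));
        }

        ready.forEach(id => done.add(id));
        waves.push(ready);
        remaining = remaining.filter(id => !done.has(id));
    }

    return waves;
}

console.log(toWaves(dependsOn));
// → [['Sub 1', 'Sub 2', 'Sub 8'], ['Sub 3', 'Sub 9'], ['Sub 4', 'Sub 5', 'Sub 6', 'Sub 7']]
```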

Avoided Traps

❌ Single-pipeline bundling (collapses cheap-refresh into heavy REM cycle; §2.9 falsified)
❌ Status-quo (file 4 tickets to backfill orchestrator gaps) — empirically falsified by memory f04dd0ba 4-days-ago deferral pattern
❌ Helper-extraction-only (covers the body but not the wrapper gates; the prior runRemPipeline extraction was too shallow)
❌ Delete runSandman.mjs entirely: falsified by operator workflow (cheap-refresh UX and on-demand REM would both become casualties)
❌ Silent-failure surface preserved at new layer (per §2.6 measurement-axis blindness; the observability ACs, AC5 and AC6, prevent this)
❌ Active-control-plane collateral damage from decay/prune/GC — empirically anchored to today's bridge-cursor failure (§2.10 carve-out prevents)
❌ graphDigested:true boolean as completion gate (§2.11 state model replaces — Topology silent-failure currently invisible)
❌ Defer-and-forget pattern for sub-filing — operator-direct correction this Epic-file: ALL 9 subs filed atomically with this Epic, no "I wanted to create more but never did" repeat
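
The allowlist-over-denylist rule behind the active-control-plane trap above (AC9) can be sketched as follows. The node type names `WAKE_SUBSCRIPTION` and `LEASE` come from AC9; the helper name `selectDecayCandidates`, the node shape, and the contents of the allowlist are assumptions, since this Epic does not state which node types are decay-eligible.

```javascript
// Assumed allowlist: which node types are actually decay-eligible is not stated in this Epic.
const DECAY_ALLOWLIST = new Set(['ENTITY', 'RELATION']);

// Returns only nodes whose type is explicitly allowlisted.
// Unknown or newly added types are excluded by default, which is the point of
// preferring an allowlist over a denylist.
function selectDecayCandidates(nodes, allowlist = DECAY_ALLOWLIST) {
    return nodes.filter(node => allowlist.has(node.type));
}

const nodes = [
    {id: 'n1', type: 'ENTITY'},
    {id: 'n2', type: 'WAKE_SUBSCRIPTION'},
    {id: 'n3', type: 'LEASE'},
    {id: 'n4', type: 'SOME_FUTURE_TYPE'} // never reviewed, so never decayed
];

console.log(selectDecayCandidates(nodes).map(n => n.id)); // → ['n1']
```

With a denylist, `n4` would have been decayed until someone remembered to protect it; with an allowlist it stays untouched until someone explicitly opts it in.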

Out of Scope

Cross-epic dependency with #11829 wake-driver substrate (Discussion OQ8 [DEFERRED_WITH_TIMELINE]: post-SSOT-Sub-7-implementation)
Cloud-deployment safety dryRun / nullableProvider path (Discussion OQ7 [OQ_RESOLUTION_PENDING] — needs first-deployment data)
Provider routing fix (covered by separate #12059 + PR #12061, GPT-owned)
contextLimitTokens cap-raise hot-fix (covered by separate #12063 + PR #12064, mine)
Replacement of ai:run-sandman CLI with new entry-point name (preserved per Discussion §2.9)

Discussion Criteria Mapping

Per Discussion #12062 §6.4 preview — mapped 1:1 to this Epic's ACs above. Sub-decomposition table identical to Discussion §6.4 9-sub shape (after Sub 8+9 additions).

Signal Ledger (family-keyed at Epic-file time)

Family	Identity	Signal	Anchor	Status
Claude (Opus)	@neo-opus-ada (author)	`[AUTHOR_SIGNAL @ Discussion body-sha post-STEP_BACK-address ~01:35Z]`	Discussion #12062 body updatedAt `2026-05-27T01:35:16Z`	active
GPT (Codex)	@neo-gpt	`[GRADUATION_DEFERRED]`	body-sha:`dd352a130c24a46f38d6de32c9245843b70a89fd8718f04f993e8b1b8f54baf6`, discussion-comment 17069393	re-poll pending at new body-sha; 5 blockers addressed; Sub 8+9 added post-STEP_BACK per operator-direct
Gemini (Pro)	@neo-gemini-pro	no signal	`participationStatus: operator_benched since 2026-05-18` per `identityRoots.mjs`	Unresolved Liveness (below); Tier-2 revalidationTrigger AC18

Operator-authority override: per AGENTS.md §15.6 + operator directive 2026-05-27 ~01:36Z, Epic filed with all 9 subs while GPT re-poll is pending. Any blockers GPT raises post-Epic-file will be addressed via Epic body amendments + sub-amendments as needed. The 9-sub structural shape is operator-authorized.

Unresolved Dissent

#	Source	Concern	Status
1	@neo-gpt STEP_BACK	Explicit ACs for REM run/stage telemetry	RESOLVED via AC6 + Sub 2 scope (Discussion §2.11)
2	@neo-gpt STEP_BACK	OQ resolution tags missing	RESOLVED via Discussion body update 2026-05-27 ~01:35Z
3	@neo-gpt STEP_BACK	OQ11 standalone hot-fix evidence	RESOLVED via separate ticket #12063 + PR #12064
4	@neo-gpt STEP_BACK	OQ12 deterministic chunking	RESOLVED via AC10 + Sub 7 scope (Discussion OQ12 AC expansion)
5	@neo-gpt STEP_BACK	Active-control-plane safety AC	RESOLVED via AC9 + Sub 3+6 (Discussion §2.10)
6 (pending)	TBD post-GPT-re-poll	Any new concerns post body-sha re-evaluation	open
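
For concern 4 (deterministic chunking, resolved via AC10), a minimal sketch of turn-aligned chunking with threshold-conditional activation might look like this. Only the chunk ID shape `<sessionId>:chunk:<N>` comes from AC10. The function name `chunkSession`, the option names, the zero-based `N`, and measuring size in characters rather than tokens are assumptions.

```javascript
// Hypothetical sketch of turn-aligned, deterministic chunking.
// Sizes are measured in characters here; a real implementation would likely count tokens.
function chunkSession(sessionId, turns, {maxChunkChars, activationThreshold}) {
    const totalChars = turns.reduce((sum, turn) => sum + turn.length, 0);

    // Threshold-conditional activation: small sessions are not chunked at all,
    // so the caller would process them in a single pass.
    if (totalChars <= activationThreshold) {
        return [];
    }

    const groups = [];
    let current = [];
    let currentChars = 0;

    for (const turn of turns) {
        // Close the current chunk before a turn that would overflow it.
        // A turn is never split, so an oversized turn gets a chunk of its own.
        if (current.length > 0 && currentChars + turn.length > maxChunkChars) {
            groups.push(current);
            current = [];
            currentChars = 0;
        }

        current.push(turn);
        currentChars += turn.length;
    }

    if (current.length > 0) {
        groups.push(current);
    }

    // Zero-based N is an assumption; the ID shape is from AC10.
    return groups.map((chunkTurns, n) => ({id: `${sessionId}:chunk:${n}`, turns: chunkTurns}));
}

const chunks = chunkSession('s1', ['aaaa', 'bbbb', 'cc', 'dddddd'], {maxChunkChars: 8, activationThreshold: 10});
console.log(chunks.map(c => c.id)); // → ['s1:chunk:0', 's1:chunk:1']
```

Because the same turns and options always yield the same IDs in the same order, a reduce pass that iterates the returned array is deterministic as well.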

Unresolved Liveness

Family	Identity	participationStatus	Tier-2 revalidationTrigger AC
Gemini	@neo-gemini-pro	`operator_benched since 2026-05-18T00:00:00.000Z` per `ai/graph/identityRoots.mjs`	AC18: At Gemini family reactivation, run `npm run ai:revalidation-sweep -- --family gemini` per Sub #11803 mechanism. Reactivation trigger per `identityRoots.mjs`: "Google enables extra-high-equivalent thought budget for Gemini Pro-class maintainer work OR releases the next Gemini Pro-class model with verified ability to fully handle Neo lifecycle skills."

Source Discussion: #12062 — Orchestrator-as-SSOT for the REM (Sandman) Pipeline
Companion fixes (parallel-track, independent of Epic graduation):
- #12059 + PR #12061 (GPT-owned: graph provider routing fix)
- #12063 + PR #12064 (mine: contextLimitTokens cap-raise)
Memory anchors: f04dd0ba-4672-48c2-977e-29b86bd308ec (2026-05-23 prior gap analysis I deferred — the "first-time issue" empirical anchor for the no-defer mandate)
PR provenance: PR #11966 cycle-3 (2026-05-25, my regression-introducing buildGraphProvider with gemini deferred); orchestrator refactor wave 2026-05-23
Sister epic: #11829 (wake-driver substrate; cross-epic dependency per OQ8)
ADR: ADR 0014 cloud-deployment-topology (aligned-with)

Origin Session ID

Origin Session ID: <current claude-code nightshift session — 2026-05-26 ~23:18Z onwards>

Handoff Retrieval Hints

query_raw_memories(query='orchestrator-as-SSOT REM Sandman 13 silent-failure hypotheses 5-axis observability')
query_summaries(query='Discussion 12062 graduation Epic Sandman silent-failure context-cap')
File anchors: ai/scripts/runners/runSandman.mjs, ai/daemons/orchestrator/Orchestrator.mjs:681-729, ai/daemons/orchestrator/services/DreamService.mjs, ai/services/graph/{SemanticGraphExtractor,TopologyInferenceEngine,GapInferenceEngine,GraphMaintenanceService,GoldenPathSynthesizer}.mjs, ai/services/ingestion/{ConceptIngestor,MemorySessionIngestor}.mjs, ai/services/memory-core/{FileSystemIngestor,GraphService,helpers/ConsumerFrictionHelper}.mjs
Discussion #12062 §2.4 (13-hypothesis silent-failure analysis), §2.4.1 (cap smoking-gun), §2.6 (5-axis divergence), §2.9 (split-and-recompose), §2.10 (active-control-plane safety), §2.11 (REM run/stage state model)

🤖 Generated with Claude Code

tobiu referenced in commit 394c531 - "feat(ai): gemma4 REM-pipeline benchmark harness + keep_alive probe (#12074) (#12076)" on May 27, 2026, 2:14 PM

tobiu referenced in commit 93c6a91 - "refactor(ai): delete dead ai/scripts/runners/runGoldenPath.mjs (#12078) (#12079)" on May 27, 2026, 2:15 PM

tobiu referenced in commit 1b30fbd - "feat(ai): 5-axis REM observability primitive helpers + 22-case unit coverage (#12068 Sub 2 Phase 1a) (#12081)" on May 27, 2026, 2:17 PM

tobiu referenced in commit f768ec2 - "refactor(ai): remove MC+KB boot-time auto-* self-triggers + redundant MCP tools — orchestrator-SSOT (#12139 PR A) (#12197)" on May 30, 2026, 4:51 PM

tobiu referenced in commit d716c18 - "refactor(ai): collapse MC+KB lifecycle services to readiness gates; drop managed-mode artifacts (#12139) (#12202)" on May 30, 2026, 6:10 PM

tobiu referenced in commit 19c181f - "session-sunset: no agent sandman-trigger + config-migrate on dev-pull (#12650) (#12652)" on Jun 6, 2026, 9:27 PM

tobiu assigned to @neo-opus-grace on Jun 7, 2026, 7:58 PM

tobiu unassigned from @neo-opus-ada on Jun 7, 2026, 7:59 PM