Frontmatter
id: 10957
title: Document v13 architectural path — slim MCP servers, orchestrator daemon, SDK migration sequence
state: Closed
labels: documentation, enhancement, ai, architecture
assignees: neo-opus-4-7
createdAt: May 8, 2026, 1:53 PM
updatedAt: May 8, 2026, 2:18 PM
githubUrl: https://github.com/neomjs/neo/issues/10957
author: neo-opus-4-7
commentsCount: 0
parentIssue: 9999
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 8, 2026, 2:18 PM

Document v13 architectural path — slim MCP servers, orchestrator daemon, SDK migration sequence

neo-opus-4-7 commented on May 8, 2026, 1:53 PM

Context

Authored 2026-05-08 in response to @tobiu's chief-architect mandate after two compounding architectural-hallucination errors over the prior session day:

  1. Factory pattern misread: I initially conflated "remote multi-tenant Memory Core" with "high-availability primary/secondary deployment". Gemini correctly flagged this in MESSAGE:0ca25e5b-d59d-4cae-a66e-c6bfd669953e — the multi-tenant remote shape is ONE endpoint with per-request RequestContextService + Mcp-Session-Id binding, not multi-instance shared-substrate.
  2. NEO_MC_PRIMARY scoping over-correction: Gemini's and my first reaction was "strip it entirely". @tobiu corrected: "isPrimary => LOCAL mcp servers. if agent harnesses spawn multiple local server instances. however: we WANT to move summarization items into daemons." The flag is still valid for LOCAL multi-harness fleets TODAY; it becomes obsolete only after daemonization lands.

@tobiu's chief-architect mandate: "how do we get from right here to v13 => path. and this needs to get documented."

The Problem

The swarm has accumulated several architectural fragments without a coordinated path:

  • 5 MCP servers with no shared base class (each Server.mjs independent, ~80% boilerplate duplication)
  • Factory pattern adoption uneven (2 of 5 servers — memory-core direct, knowledge-base via TransportService)
  • ai/services.mjs SDK boundary mature but underutilized (most service logic still in MCP server services/ directories)
  • Daemon-shaped services in ai/daemons/ (DreamService, SwarmHeartbeatService, decomposed services from #10013/#10028) without an orchestrator-daemon process to schedule them
  • Bridge-daemon (per-host singleton) specialized for wake-event delivery; no sibling daemon for summarization/sandman/golden-path
  • Summarization gated off (#9942 daemon-collision fix); operator-runs only via npm run ai:summarize-sessions
  • NEO_MC_PRIMARY in-process gate exists as workaround for the missing daemon — keeps LOCAL multi-harness fleets from racing on shared substrate
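
For readers unfamiliar with the per-host singleton idea that the bridge-daemon relies on (this issue describes it as PID-file based), here is a minimal sketch of the general technique. It is not the code in ai/scripts/bridge-daemon.mjs: the function names, the default file location, and the stale-file handling are all assumptions made for illustration.

```javascript
import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';

// Hypothetical default location; a real daemon would pick its own path.
const DEFAULT_PID_FILE = path.join(os.tmpdir(), 'example-daemon.pid');

// Returns true if a process with this PID exists (signal 0 only probes).
function isAlive(pid) {
    try {
        process.kill(pid, 0);
        return true;
    } catch (err) {
        return err.code === 'EPERM'; // exists, but owned by another user
    }
}

// Try to become the singleton for `pidFile`. Returns true on success.
function acquireSingleton(pidFile = DEFAULT_PID_FILE) {
    try {
        // The 'wx' flag fails with EEXIST if the file already exists,
        // so two processes cannot both create it.
        fs.writeFileSync(pidFile, String(process.pid), { flag: 'wx' });
        return true;
    } catch (err) {
        if (err.code !== 'EEXIST') throw err;
    }
    const owner = Number.parseInt(fs.readFileSync(pidFile, 'utf8'), 10);
    if (Number.isInteger(owner) && owner > 0 && isAlive(owner)) {
        return false; // a live holder already owns the slot
    }
    // Stale or unreadable file left by a crashed holder: take it over.
    fs.writeFileSync(pidFile, String(process.pid));
    return true;
}

// Give the slot back on clean shutdown.
function releaseSingleton(pidFile = DEFAULT_PID_FILE) {
    fs.rmSync(pidFile, { force: true });
}
```

A process that gets `false` back from `acquireSingleton()` would normally exit instead of doing the scheduled work. Note that the stale-file takeover in this sketch is not race-free: two processes that detect the same stale file at the same moment could both claim it, so a production version would need a stronger lock.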

Without a documented path, the swarm risks:

  • Iterating in conflicting directions (e.g., my Piece C in-process implementation vs the original ticket's bridge-daemon-or-sibling guidance)
  • Losing context between sessions on which architectural decisions are load-bearing vs incidental
  • Cowboy-coding without verifying the substrate (the failure mode of today's session)

The Architectural Reality

Current substrate state (verified empirically 2026-05-08):

  • ai/mcp/server/{file-system,github-workflow,knowledge-base,memory-core,neural-link}/Server.mjs — 5 MCP server entry points, no common base
  • ai/mcp/server/shared/services/RequestContextService.mjs — Factory pattern (per-request identity binding via AsyncLocalStorage)
  • ai/mcp/server/shared/services/TransportService.mjs — SSE transport setup; wraps dispatch in RequestContextService.run(...)
  • ai/services.mjs — mature SDK boundary; consumed by scripts, tests, and MC server's processPendingSummarizations
  • ai/scripts/bridge-daemon.mjs — standalone wake-event-delivery daemon, per-host singleton via PID file
  • ai/daemons/ — in-process daemon-shaped services (DreamService, SwarmHeartbeatService, services/* from #10013 decomposition)
  • ai/scripts/summarize-sessions.mjs — operator-trigger script (calls Memory_SessionService.summarizeSessions({}) via SDK)
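
The per-request identity binding attributed to RequestContextService above can be illustrated with Node's `AsyncLocalStorage`. The sketch below is a simplified stand-in, not the project's class: `RequestContext`, `handleToolCall`, `dispatch`, and the `sessionId` field are hypothetical names, and the real service's method signatures may differ.

```javascript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical stand-in for RequestContextService.
const storage = new AsyncLocalStorage();

const RequestContext = {
    // Bind `context` to everything that runs inside `fn`, including awaited work.
    run(context, fn) {
        return storage.run(Object.freeze({ ...context }), fn);
    },
    // Read the context bound to the current call chain (undefined outside run()).
    current() {
        return storage.getStore();
    }
};

// Hypothetical tool handler: it receives no identity argument and reads
// the session from the ambient context instead.
async function handleToolCall(toolName) {
    await new Promise(resolve => setImmediate(resolve)); // cross an async boundary
    const { sessionId } = RequestContext.current();
    return `${toolName} for session ${sessionId}`;
}

// Transport layer: wrap each dispatch so concurrent requests stay isolated.
function dispatch(sessionId, toolName) {
    return RequestContext.run({ sessionId }, () => handleToolCall(toolName));
}
```

With this shape, two overlapping calls such as `dispatch('s1', 'toolA')` and `dispatch('s2', 'toolA')` each see only their own session ID, which is the property that lets one endpoint serve many tenants. The edge cases worth checking in a real evaluation are code paths that escape the async chain, for example callbacks registered before `run()` was entered.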

The end-state vision (per @tobiu's mandate): slim MCP servers + mature SDK + clean daemon architecture + single-source-of-truth for summaries/sandman/golden-path (daemon-driven).

The Fix

Produce, peer-review, and merge learn/agentos/v13-path.md as the chief-architect document covering:

  1. Vision (3 load-bearing properties — slim MCP, common base, orchestrator daemon)
  2. Current State — empirically verified table comparing today vs v13 target
  3. Critical Architectural Decisions — D1 Factory pattern eval (with challenge framing) / D2 common base server class / D3 orchestrator daemon architecture / D4 SDK migration boundary / D5 NEO_MC_PRIMARY retirement path
  4. Sequenced Milestones — M1 substrate stabilization (current week) → M7 v13 release cut
  5. Tickets to file/update
  6. #9999 sub-issue audit (each open sub-issue triaged: on-trajectory / re-scope / verify)
  7. Risks
  8. Outcome Metrics (quantitative v13 readiness targets)
  9. Provenance

Doc location: learn/agentos/v13-path.md. Linked from ROADMAP.md once reviewed.
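
To make D2 (common base server class) concrete, the following sketch shows one way shared lifecycle code could be lifted out of the five independent Server.mjs files. It is an illustration only and does not describe the design the path document settles on: `BaseMcpServer`, `registerTools`, `addTool`, `callTool`, and `EchoServer` are all hypothetical names.

```javascript
// Hypothetical base class holding the lifecycle every server would share.
class BaseMcpServer {
    constructor({ name }) {
        this.name = name;
        this.tools = new Map();
    }

    // Subclasses override this hook to declare their tools.
    registerTools() {
        throw new Error(`${this.constructor.name} must implement registerTools()`);
    }

    addTool(toolName, handler) {
        this.tools.set(toolName, handler);
    }

    // Shared startup: transport setup, logging, and similar concerns would live here.
    start() {
        this.registerTools();
        return this;
    }

    // Shared dispatch with uniform error handling for unknown tools.
    async callTool(toolName, args = {}) {
        const handler = this.tools.get(toolName);
        if (!handler) {
            throw new Error(`${this.name}: unknown tool "${toolName}"`);
        }
        return handler(args);
    }
}

// A slim server then only declares what is specific to it.
class EchoServer extends BaseMcpServer {
    constructor() {
        super({ name: 'echo' });
    }

    registerTools() {
        this.addTool('echo', ({ text }) => text);
    }
}
```

Under these assumptions, `new EchoServer().start()` yields a server whose only server-specific code is the tool declaration, which is the kind of reduction the boilerplate-duplication concern points at.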

Acceptance Criteria

  • AC1: learn/agentos/v13-path.md committed via PR; covers all 9 sections enumerated in The Fix
  • AC2: Each "Critical Architectural Decision" (D1-D5) explicitly evaluated, including challenge framing (especially D1 Factory pattern — pros/cons/AsyncLocalStorage edge cases)
  • AC3: #9999 sub-issue audit triages every open sub-issue with explicit on-trajectory / re-scope / verify markers
  • AC4: Sequenced milestones M1-M7 each have: scope, owner-split (current week M1 only), exit gate
  • AC5: Outcome metrics are quantitative (LOC moved, server count, latency, reference count) — not vague
  • AC6: Peer-reviewed by @neo-gemini-3-1-pro and @neo-gpt cross-family (architectural pillar requires both)
  • AC7: ROADMAP.md updated with link to v13-path.md and v13 milestone roll-up
  • AC8: Linked from #9999 as the v13 umbrella reference; #9999's sub-issues that need re-scoping have follow-up comments per the audit

Out of Scope

  • Implementing any of the milestones (M2-M7 are separate epics/PRs)
  • Re-scoping or closing individual sub-issues of #9999 (audit identifies them; per-ticket updates are separate work)
  • Updating learn/agentos/MX.md or other foundational docs beyond ROADMAP.md cross-link
  • Filing the M2/M3/M6 epics: this ticket only enumerates them; their creation is downstream of doc approval
  • Operator deployment cookbook updates (those land per-milestone, not in the doc itself)

Avoided Traps / Gold Standards Rejected

  • Rejected: comprehensive 1000+-line specification. Path docs that try to enumerate every detail become outdated within weeks. This doc is strategic (200-300 lines), not prescriptive — milestones link to per-ticket prescriptions.
  • Rejected: rubber-stamp the Factory pattern. Per @tobiu's directive "challenge the factory pattern (probably a reasonably good solution)", D1 explicitly evaluates pros/cons/edge cases rather than assuming. Recommendation is positive but with named caveats.
  • Rejected: "strip NEO_MC_PRIMARY immediately" framing (the over-correction the swarm jumped to earlier today). D5 sequences retirement BEHIND the orchestrator landing, not parallel to it.
  • Rejected: extend bridge-daemon for summarization (my initial wrong-shape proposal). D3 explicitly separates concerns — bridge stays specialized for wake-domain; orchestrator is its sibling for scheduled work.
  • Rejected: file 5+ tickets for each milestone immediately. Milestone tickets land downstream of doc approval; pre-filing pollutes the backlog before the path is endorsed.
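
The separation argued for in D3 (bridge stays specialized, orchestrator handles scheduled work) can be sketched as a small job registry driven by a clock tick. This is a hedged illustration, not the planned orchestrator: the `Orchestrator` class, its method names, and the interval-based scheduling policy are assumptions.

```javascript
// Hypothetical orchestrator core: one process owns all scheduled jobs,
// so no MCP server instance needs an in-process "am I primary?" gate.
class Orchestrator {
    constructor() {
        this.jobs = new Map();
    }

    // Register a named job that should start at most once per `intervalMs`.
    register(name, intervalMs, run) {
        this.jobs.set(name, { intervalMs, run, nextDue: 0, running: false });
    }

    // Start every job that is due and not already in flight.
    // Returns the names of the jobs started by this tick.
    tick(now = Date.now()) {
        const started = [];
        for (const [name, job] of this.jobs) {
            if (job.running || now < job.nextDue) continue;
            job.running = true;
            job.nextDue = now + job.intervalMs;
            started.push(name);
            Promise.resolve()
                .then(() => job.run())
                .catch(err => console.error(`[orchestrator] ${name} failed:`, err))
                .finally(() => { job.running = false; });
        }
        return started;
    }
}
```

A daemon process could drive this with something like `setInterval(() => orchestrator.tick(), 1000)` after acquiring its per-host singleton slot. A summarization job body might then call `Memory_SessionService.summarizeSessions({})` through the SDK, as ai/scripts/summarize-sessions.mjs is described as doing. Because `tick()` skips jobs that are still running, a slow summarization pass cannot overlap with itself.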

Related

  • Parent epic: #9999 Cloud-Native Knowledge & Multi-Tenant Memory Core (v13 main epic)
  • Substrate triggers (today's correction chain): A2A MESSAGE:0ca25e5b-d59d-4cae-a66e-c6bfd669953e (Gemini's CRITICAL flag on the primary/secondary mental-model error)
  • Adjacent in-flight work referenced by the path:
    • #10813 Restore session summaries — re-scoped per D3/D5
    • #10956 NEO_MC_PRIMARY removal — re-scoped per D5
    • #10945 Deployment-pipeline integration coverage — M1 GPT lane
    • #10939 Phase 3 unit-row re-add — M1 Gemini lane
    • #10013 DreamService Decomposition — M4 prerequisite (already mostly shipped)
    • #10028 Finalize DreamService Decomposition — M4 work
  • Referenced from ROADMAP.md (post-approval cross-link)

Origin Session ID: 005b6edf-85d8-4980-9e17-486b6b8bed3f

Retrieval Hint: query_raw_memories(query="v13 architectural path slim MCP servers orchestrator daemon SDK migration NEO_MC_PRIMARY retirement Factory pattern common base server class")

tobiu referenced in commit a77adf0 - "docs(agentos): v13 architectural path strategy document (#10957) (#10958)" on May 8, 2026, 2:18 PM
tobiu closed this issue on May 8, 2026, 2:18 PM