Frontmatter
id: 10232
title: GraphService.initAsync should self-seed AgentIdentity + BroadcastSentinel roots on boot
state: Closed
labels: enhancement, ai, architecture, core
assignees: neo-opus-4-7
createdAt: Apr 23, 2026, 2:02 PM
updatedAt: Apr 23, 2026, 3:03 PM
githubUrl: https://github.com/neomjs/neo/issues/10232
author: neo-opus-4-7
commentsCount: 0
parentIssue: 10139
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: Apr 23, 2026, 3:03 PM

GraphService.initAsync should self-seed AgentIdentity + BroadcastSentinel roots on boot

neo-opus-4-7 commented on Apr 23, 2026, 2:02 PM

Context

Session 24aa1fa1 post-merge validation empirically hit the pattern documented in memory f73bd4ef (session 0327771f, 2026-04-22): even with correct NEO_AGENT_IDENTITY env + clean SQLite + merged cache-coherence fixes, bindAgentIdentity returns null at boot if the AgentIdentity graph node doesn't exist yet. The only current recovery is to run seedAgentIdentities.mjs manually, THEN restart the harness so bind fires against the newly-seeded graph — a two-restart recovery loop that's been observed across FOUR sessions in two days (0327771f, 15852d91, 8968b9f6, 24aa1fa1).

GraphService.initAsync already self-seeds two structural anchors (frontier + Neo-Master-Architecture) on every boot via check-then-upsert pattern. Extending this pattern to AgentIdentity + BroadcastSentinel roots eliminates the manual-seed recovery requirement entirely. This ticket complements #10176 (observability — surface identity binding state in healthcheck) but lives at a different substrate: #10176 surfaces the FAILURE post-hoc, this ticket prevents the failure.

The Problem

Current boot sequence failure mode (post-#10229, recurrence vectors closed but initial-provisioning still requires manual seed):

  1. Production DB gets wiped/fresh-provisioned (intentional reset, new dev-box setup, CI boot, schema migration, backup restore, etc.)
  2. AgentIdentity nodes are absent from the fresh SQLite
  3. MCP boot: bindAgentIdentity('neo-opus-4-7') looks up @neo-opus-4-7, returns null
  4. stdioIdentity.agentIdentityNodeId cached null for process lifetime
  5. Post-#10227, no self-heal rebind → stuck-null; every mailbox call throws
  6. Operator must: run node ai/scripts/seedAgentIdentities.mjs + restart harness + pray

The recovery loop is documented but fragile. A boot-time self-seed eliminates it structurally. The existing self-seed pattern in GraphService.initAsync:66-89 for frontier + Neo-Master-Architecture is a proven template.
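
The stuck-null sequence in steps 2 to 6 can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in: the Map replaces the SQLite-backed graph, and the caching logic is an assumption about how bindAgentIdentity and stdioIdentity behave, inferred only from the description above.

```javascript
// Hypothetical in-memory stand-in for the SQLite-backed graph (not the real GraphService).
const graph = new Map();

// Hypothetical session state; undefined means "bind has not run yet".
const stdioIdentity = {agentIdentityNodeId: undefined};

// Assumed behavior: resolve the identity node once at boot and cache the result,
// even when the result is null.
function bindAgentIdentity(name) {
    if (stdioIdentity.agentIdentityNodeId === undefined) {
        const node = graph.get(`@${name}`);
        stdioIdentity.agentIdentityNodeId = node ? node.id : null;
    }
    return stdioIdentity.agentIdentityNodeId;
}

// Steps 2-4: fresh DB, bind runs at boot, null gets cached.
const atBoot = bindAgentIdentity('neo-opus-4-7');

// Step 6, first half: the operator seeds the node while the process keeps running.
graph.set('@neo-opus-4-7', {id: '@neo-opus-4-7'});

// Step 5: with no rebind, the cached null is still returned until the process restarts.
const afterSeed = bindAgentIdentity('neo-opus-4-7');

console.log(atBoot, afterSeed); // prints: null null
```

In this model the seed alone never helps a running process, which is why the manual recovery needs the extra restart.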

The Architectural Reality

  • ai/mcp/server/memory-core/services/GraphService.mjs:66-89 — existing self-seed pattern for frontier (SYSTEM_ANCHOR) + Neo-Master-Architecture (System) + SYSTEM_TENET edge
  • ai/scripts/seedAgentIdentities.mjs:23-76 — canonical IDENTITIES array (4 entries: @neo-opus-4-7, @neo-gemini-3-1-pro, @tobiu, AGENT:*)
  • Boot sequence: Server.mjs → await GraphService.ready() → bindAgentIdentity runs after initAsync completes
  • After the upsertNode lazy-load fix (sibling ticket), upsertNode lazy-loads from SQLite, so the "already exists" check in the self-seed loop stays correct even on a cold cache

The Fix

  1. Extract IDENTITIES array into a shared module (proposed: ai/graph/identityRoots.mjs — avoids coupling GraphService to a CLI script's imports)
  2. Import into GraphService.initAsync
  3. After existing frontier + Neo-Master-Architecture seeds, iterate IDENTITIES and upsert any missing (idempotent — defensive SQLite peek preserves createdAt on existing nodes, matching seedAgentIdentities.mjs:97-109 pattern)
  4. seedAgentIdentities.mjs remains functional as a standalone CLI tool for explicit re-seeding operations (post-wipe recovery, first-ever provisioning, CI provisioning workflows). Its logic becomes a thin wrapper around the shared IDENTITIES list + same upsert pattern — no behavioral regression.
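
A minimal sketch of steps 1 to 3, under stated assumptions: the four ids come from this ticket, but the object shape, the seedIdentityRoots helper, and the Map standing in for SQLite are all hypothetical. The real upsertNode API is not reproduced here.

```javascript
// Hypothetical contents of the proposed shared module (ai/graph/identityRoots.mjs).
// The ids are the ones listed in this ticket; the object shape is an assumption.
const IDENTITIES = [
    {id: '@neo-opus-4-7'},
    {id: '@neo-gemini-3-1-pro'},
    {id: '@tobiu'},
    {id: 'AGENT:*'}
];

// Hypothetical stand-in for the SQLite-backed node table.
const store = new Map();

// Seeds any missing roots and returns how many nodes were created.
// For nodes that already exist, the stored values win, so createdAt and
// user-added properties survive a re-run.
function seedIdentityRoots(db, identities, now = Date.now()) {
    let created = 0;

    for (const identity of identities) {
        const existing = db.get(identity.id); // the "defensive peek"

        if (existing) {
            db.set(identity.id, {...identity, ...existing});
        } else {
            db.set(identity.id, {...identity, createdAt: now});
            created++;
        }
    }

    return created;
}

// First boot on a fresh store creates all four roots.
const firstBoot = seedIdentityRoots(store, IDENTITIES, 1000);

// Simulate a property added by a user between boots.
store.get('@tobiu').note = 'user-added';

// Second boot creates nothing and leaves existing data intact.
const secondBoot = seedIdentityRoots(store, IDENTITIES, 2000);

console.log(firstBoot, secondBoot, store.get('@tobiu').createdAt); // prints: 4 0 1000
```

Under this sketch both GraphService.initAsync and the seedAgentIdentities.mjs CLI could call the same helper, which is one way to keep the two entry points from drifting.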

Acceptance Criteria

  • Fresh Memory Core DB: MCP boot creates frontier + Neo-Master-Architecture + 4 AgentIdentity nodes before bindAgentIdentity fires
  • Post-wipe (e.g., rm .neo-ai-data/sqlite/*.sqlite): MCP boot re-seeds all roots automatically on next boot
  • Idempotent: boot with nodes already present preserves their existing createdAt + any user-added properties (defensive-SQLite-peek pattern from seedAgentIdentities.mjs preserved and/or relocated to the shared module)
  • seedAgentIdentities.mjs continues to work as a standalone CLI (backward-compatible — CI / operator workflows unaffected)
  • Regression test: LifecycleService.initAsync() against a fresh testDbPath (post-#10229 pattern) results in all 4 AgentIdentity nodes present
  • Empirical validation: after a harness restart with a fresh DB, healthcheck.mailboxPreview is populated without the operator running the seed script
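
The fresh-DB and post-wipe criteria could be checked along the following lines. This is a framework-agnostic sketch, not the project's actual test: boot is a hypothetical stand-in for the real init path, the Map replaces a fresh testDbPath database, and only the four ids are taken from this ticket.

```javascript
// Ids from this ticket; everything else in this block is a hypothetical stand-in.
const EXPECTED_ROOTS = ['@neo-opus-4-7', '@neo-gemini-3-1-pro', '@tobiu', 'AGENT:*'];

// Hypothetical boot routine: seeds missing roots first, then resolves the identity binding.
function boot(db, agentName) {
    for (const id of EXPECTED_ROOTS) {
        if (!db.has(id)) {
            db.set(id, {id, createdAt: Date.now()});
        }
    }

    const node = db.get(`@${agentName}`);
    return {agentIdentityNodeId: node ? node.id : null};
}

// Returns a list of failed criteria; an empty list means every check passed.
function checkFreshBoot() {
    const failures = [];
    const db = new Map(); // stands in for a fresh database

    // Fresh DB: all roots present and the binding resolved after one boot.
    const session = boot(db, 'neo-opus-4-7');
    const missing = EXPECTED_ROOTS.filter(id => !db.has(id));

    if (missing.length > 0) failures.push(`missing roots: ${missing.join(', ')}`);
    if (session.agentIdentityNodeId === null) failures.push('binding is null after boot');

    // Post-wipe: clearing the store and booting again re-seeds everything.
    db.clear();
    boot(db, 'neo-opus-4-7');

    if (db.size !== EXPECTED_ROOTS.length) failures.push('re-seed after wipe incomplete');

    return failures;
}

console.log(checkFreshBoot().length); // prints: 0
```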

Out of Scope

  • Changes to the IDENTITIES list itself (new identities still added via identityRoots.mjs modifications)
  • Handling per-tenant identities for the multi-tenant SSE path (covered by #10144 / #10016)
  • Adding a runtime identity registration MCP tool (future ticket if needed)
  • Closing the #10176 observability gap (complementary, separate concern)

Avoided Traps

  • "Remove seedAgentIdentities.mjs entirely": rejected. Keep as standalone CLI for explicit re-seed operations (post-wipe recovery, first-ever provisioning, CI workflows). Runtime auto-seed + manual CLI are complementary, not redundant.
  • "Inline IDENTITIES array in GraphService.initAsync": rejected. Two copies of the IDENTITIES array would drift; shared module is the right substrate.
  • "Put AgentIdentity seeds in a PRODUCTION-ONLY path (conditional on env var)": rejected. The same self-seed pattern should run in test fixtures too — otherwise tests using testDbPath pattern (post-#10229) couldn't exercise the AgentIdentity code paths without manually pre-seeding in every beforeEach.
  • "Parent this ticket under #10176 / merge scope": rejected after consideration. #10176 is about observability (surfacing state); this ticket is about prevention (not entering the bad state). Different substrates — keep scopes tight.

Related

  • Complements (not replaces): #10176 (healthcheck identity observability) — this ticket prevents the failure, #10176 exposes it when it does happen
  • Extends existing pattern: GraphService.initAsync:66-89 (frontier + System primer)
  • Depends on (soft): sibling ticket "upsertNode lazy-load" — ensures the self-seed upserts are safe against cold-cache stub overwrite (though self-seed's own idempotent-check pattern mitigates most of the hazard)
  • Parent epic: #10139 Mailbox A2A primitive
  • Historical narrative (memory sessions): 0327771f, 15852d91, 8968b9f6, 24aa1fa1 — four sessions across two days all hit the same "seed → restart-again" recovery loop

Origin Session ID: 24aa1fa1-9a22-498e-97f2-760c12e5a79d

Handoff Retrieval Hints

  • query_raw_memories(query="seed AgentIdentity boot bind null restart recovery loop")
  • query_raw_memories(query="GraphService initAsync frontier Neo-Master-Architecture self-seed")
  • query_summaries(query="A2A mailbox bootstrap seed binding restart")
tobiu referenced in commit cf1c704 - "feat(memory-core): GraphService boot-time identity self-seed (#10232) (#10236)" on Apr 23, 2026, 3:03 PM
tobiu closed this issue on Apr 23, 2026, 3:03 PM
tobiu referenced in commit 057130b - "feat(memory-core): healthcheck identity observability block (#10176) (#10239)" on Apr 23, 2026, 3:56 PM