Frontmatter
id: 10259
title: Identity normalization — merge aliased AgentIdentity nodes + purge test-fixture leakage + @@-prefix strip
state: Closed
labels: bug, ai, architecture, core
assignees: neo-opus-4-7
createdAt: Apr 23, 2026, 11:02 PM
updatedAt: Apr 23, 2026, 11:28 PM
githubUrl: https://github.com/neomjs/neo/issues/10259
author: neo-opus-4-7
commentsCount: 0
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: Apr 23, 2026, 11:28 PM

Identity normalization — merge aliased AgentIdentity nodes + purge test-fixture leakage + @@-prefix strip

neo-opus-4-7 commented on Apr 23, 2026, 11:02 PM

Context

SQLite identity inventory (via raw better-sqlite3 query this session) reveals three classes of non-canonical AgentIdentity state that cause mailbox routing ambiguity:

@gemini                | AgentIdentity | Gemini              | githubLogin: null  | family: null   | accountType: null
@neo-gemini-3-1-pro    | AgentIdentity | Gemini 3.1 Pro      | githubLogin: @neo-gemini-3-1-pro | family: gemini | accountType: agent
@neo-opus-4-7          | AgentIdentity | Claude Opus 4.7     | githubLogin: @neo-opus-4-7       | family: claude | accountType: agent
@opus                  | AgentIdentity | Opus                | githubLogin: null  | family: null   | accountType: null
@tobiu                 | AgentIdentity | Tobias Uhlig        | githubLogin: @tobiu              | family: null   | accountType: human
AGENT:*                | BroadcastSentinel | Broadcast         | (sentinel)
AGENT:alice            | AGENT         | Alice               | (test fixture)
AGENT:bob              | AGENT         | Bob                 | (test fixture)

Empirical impact this session (session 29f490c0-41ed-4846-a767-287552d5c3c4):

  1. Gemini's list_messages surfaced a broadcast from @opus (a stale alias of me), and her reply threading targeted @opus rather than my canonical @neo-opus-4-7. Result: her reply MESSAGE:64a3ba28-... landed in the wrong identity's inbox despite being semantically addressed to "the same physical agent who sent 'hello all'". Tobi directly flagged this as a routing gap worth fixing.

  2. Test-fixture pollution: AGENT:alice + AGENT:bob from MailboxService.spec.mjs / PermissionService.spec.mjs leaked into production SQLite. These have no githubLogin metadata and represent no real agent, but they're present in the live graph. They don't cause routing errors today but pollute graph-traversal results and identity searches.

  3. Potential @@-prefix typos: if a sender addresses to: '@@neo-opus-4-7' (double-prefix from malformed automation or accidental ID copy-paste), MailboxService.normalizeMailboxTarget() does NOT strip the extra @. The linkNodes FK guard would then cull the SENT_TO edge silently. Low-probability but defense-in-depth warranted, and tobi called it out as the bonus item.

Tobi's directive: "we should take a look into graph identities. they should match our new github account ids. plus, as a bonus, we could remove a prefixing @ if send by mistake."

The Problem

  • Aliased identities: @opus + @gemini exist as standalone AgentIdentity nodes with null metadata. They represent the same physical agents as @neo-opus-4-7 + @neo-gemini-3-1-pro but are distinct graph nodes. The #10144 convention made @neo-*-* the canonical form; the pre-#10144 aliases weren't migrated.
  • Routing semantics follow the literal SENT_BY edge. A broadcast sent by @opus (lingering from earlier sessions before the #10144 seed wrote @neo-opus-4-7) gets replies routed to @opus, not @neo-opus-4-7. The inbox of @neo-opus-4-7 never sees those replies despite being the "same agent."
  • Test fixture bleed: AGENT:alice + AGENT:bob from unit test runs persist in production SQLite. Path: test code seeded these nodes with GraphService.upsertNode(), which writes to the live file; cleanup via afterAll doesn't always run when tests fail or Playwright workers time out.
  • @@ prefix: MailboxService.normalizeMailboxTarget currently handles AGENT:@login → @login, but not @@login → @login. Edge-case but the normalization surface is the right place to add it.
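
To make the routing gap concrete, here is a minimal in-memory sketch. The edge shape ({source, type, target}), the helper names resolveReplyTarget and inboxFor, and the message IDs are hypothetical illustrations, not the actual MailboxService implementation.

```javascript
// Illustrative model only: edge shape and direction are assumptions,
// not the real GraphService schema.
const edges = [
    {source: 'MESSAGE:hello-all', type: 'SENT_BY', target: '@opus'}
];

// Hypothetical helper: a reply is addressed to whatever id the SENT_BY edge names.
function resolveReplyTarget(messageId, edgeList) {
    const sentBy = edgeList.find(e => e.source === messageId && e.type === 'SENT_BY');
    return sentBy ? sentBy.target : null;
}

// Hypothetical helper: an inbox only sees SENT_TO edges whose target matches its id exactly.
function inboxFor(identity, edgeList) {
    return edgeList
        .filter(e => e.type === 'SENT_TO' && e.target === identity)
        .map(e => e.source);
}

// The reply follows the literal SENT_BY edge, so it is addressed to the alias.
const replyTarget = resolveReplyTarget('MESSAGE:hello-all', edges);
edges.push({source: 'MESSAGE:re-hello-all', type: 'SENT_TO', target: replyTarget});

const aliasInbox     = inboxFor('@opus', edges);          // receives the reply
const canonicalInbox = inboxFor('@neo-opus-4-7', edges);  // never sees it
```

Under these assumptions the reply is only reachable through the alias node, which is the behavior the graph-level merge is meant to eliminate.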

The Architectural Reality

  • ai/graph/identityRoots.mjs — the IDENTITIES array that GraphService.initAsync self-seeds at boot (#10232). Lists the 4 canonical identities: @neo-opus-4-7, @neo-gemini-3-1-pro, @tobiu, AGENT:*. Does NOT include @opus / @gemini, confirming those are historical.
  • ai/mcp/server/memory-core/services/MailboxService.mjs:~33: normalizeMailboxTarget() function. Current branches: AGENT:@login → @login. Candidate for @@ strip.
  • ai/mcp/server/memory-core/services/GraphService.mjs: removeNodes() + db.transaction() available for atomic delete operations.
  • ai/graph/Database.mjs: removeNode() cascades to edges via SQLite FK constraint (ON DELETE CASCADE).
  • Test specs at test/playwright/unit/ai/mcp/server/memory-core/services/MailboxService.spec.mjs + PermissionService.spec.mjs use AGENT:alice / AGENT:bob as test identities. These should NOT be the source of the pollution anymore after #10229 (test-pollution fix) — but the existing AGENT:alice/AGENT:bob nodes in SQLite predate #10229 and need manual purge.

The Fix

Three parts: a migration script, one code change, and operator-invocation docs:

1. Migration script: ai/scripts/normalizeGraphIdentities.mjs

Idempotent one-shot migration. Resolves alias → canonical mapping for each known pair, re-points all edges, deletes the alias nodes. Emits dry-run output by default; --apply flag commits.

Alias map (hardcoded in the script, source of truth):

const ALIAS_MAP = {
    '@opus'   : '@neo-opus-4-7',
    '@gemini' : '@neo-gemini-3-1-pro'
};

Purge list (test-fixture pollution):

const PURGE_NODES = ['AGENT:alice', 'AGENT:bob'];

Algorithm:

  1. For each alias → canonical pair in ALIAS_MAP:
    • Verify canonical node exists in SQLite
    • For each edge where source = alias or target = alias: rewrite the edge with canonical substituted. Handle duplicate-edge collisions (if canonical already has that edge, skip the rewrite and let the alias edge be deleted with the node).
    • DELETE FROM Nodes WHERE id = '<alias>' — cascades to remaining orphan edges via FK.
  2. For each node in PURGE_NODES: DELETE FROM Nodes WHERE id = '<node>' — cascades.
  3. Emit summary: N edges rewritten, M nodes deleted, K canonical-edge-collisions encountered.

Dry-run default: prints the plan but does not execute writes. Operator invokes with --apply after reviewing the plan.
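
The algorithm above can be sketched against an in-memory stand-in for the graph. This is a hedged illustration, not the shipped script: the graph shape ({nodes: Set, edges: Array of {source, type, target}}), the function name normalizeIdentities, the summary field names, and the MESSAGE:m1 / MESSAGE:m2 ids are all assumptions; the real migration would run the equivalent steps against SQLite inside a transaction.

```javascript
const ALIAS_MAP   = {'@opus': '@neo-opus-4-7', '@gemini': '@neo-gemini-3-1-pro'};
const PURGE_NODES = ['AGENT:alice', 'AGENT:bob'];

const edgeKey = e => `${e.source}|${e.type}|${e.target}`;

// Hypothetical in-memory version of the migration. Works on copies, so the
// input graph is never mutated; `apply` decides which graph is handed back.
function normalizeIdentities(graph, {apply = false} = {}) {
    const nodes   = new Set(graph.nodes);
    let   edges   = graph.edges.map(e => ({...e}));
    const summary = {edgesRewritten: 0, nodesDeleted: 0, collisions: 0};

    // Deleting a node also drops every edge still touching it (mimics ON DELETE CASCADE).
    const deleteNode = id => {
        if (!nodes.has(id)) return; // already gone: keeps re-runs a no-op
        nodes.delete(id);
        edges = edges.filter(e => e.source !== id && e.target !== id);
        summary.nodesDeleted++;
    };

    for (const [alias, canonical] of Object.entries(ALIAS_MAP)) {
        if (!nodes.has(alias)) continue;
        if (!nodes.has(canonical)) throw new Error(`Missing canonical node ${canonical}`);

        const existing = new Set(edges.map(edgeKey));
        for (const edge of edges) {
            if (edge.source !== alias && edge.target !== alias) continue;
            const rewritten = {
                ...edge,
                source: edge.source === alias ? canonical : edge.source,
                target: edge.target === alias ? canonical : edge.target
            };
            if (existing.has(edgeKey(rewritten))) {
                summary.collisions++; // keep the alias edge; it is removed with the node
            } else {
                existing.delete(edgeKey(edge));
                existing.add(edgeKey(rewritten));
                Object.assign(edge, rewritten);
                summary.edgesRewritten++;
            }
        }
        deleteNode(alias);
    }

    PURGE_NODES.forEach(deleteNode);

    // Dry-run: report the plan, but return the untouched input graph.
    return {summary, graph: apply ? {nodes, edges} : graph};
}

const graph = {
    nodes: new Set(['@opus', '@neo-opus-4-7', '@gemini', '@neo-gemini-3-1-pro',
                    'AGENT:alice', 'AGENT:bob', 'MESSAGE:m1', 'MESSAGE:m2']),
    edges: [
        {source: 'MESSAGE:m1', type: 'SENT_TO', target: '@opus'},
        {source: 'MESSAGE:m2', type: 'SENT_TO', target: '@opus'},
        {source: 'MESSAGE:m2', type: 'SENT_TO', target: '@neo-opus-4-7'}, // collision case
        {source: 'MESSAGE:m1', type: 'SENT_BY', target: 'AGENT:alice'}
    ]
};

const dryRun  = normalizeIdentities(graph);                              // plan only
const applied = normalizeIdentities(graph, {apply: true});               // commits
const rerun   = normalizeIdentities(applied.graph, {apply: true});       // safe no-op
```

In this sketch the dry run and the applied run report the same summary; only the returned graph differs. Re-running on the already-normalized graph finds neither aliases nor purge targets, so every counter stays at zero.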

2. Code change: extend normalizeMailboxTarget() for @@-prefix

ai/mcp/server/memory-core/services/MailboxService.mjs:~33:

function normalizeMailboxTarget(to) {
    if (to?.startsWith('AGENT:@')) {
        return to.slice('AGENT:'.length);
    }
    // Strip accidental double-@ prefix (#<this-ticket>). Defense-in-depth for
    // malformed automation / ID copy-paste. Does NOT apply to `@@tobiu`-style
    // patterns where the user intent was canonical — the transformation is
    // only lossy when the first @ was accidental, so keeping this minimal.
    if (to?.startsWith('@@')) {
        return to.slice(1);
    }
    return to;
}

Unit test coverage in MailboxService.spec.mjs: add cases to the #10174 production-convention addressing describe block verifying that @@login → @login persists the SENT_TO edge to the canonical identity.
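
As a rough guide to which inputs those cases might cover, the following table-driven check exercises a copy of the proposed function in plain JavaScript. It is a sketch only: it does not use the Playwright harness of the real spec, and the case list is an assumption about useful inputs rather than the shipped test set.

```javascript
// Copy of the proposed function so the checks run standalone.
function normalizeMailboxTarget(to) {
    if (to?.startsWith('AGENT:@')) {
        return to.slice('AGENT:'.length);
    }
    if (to?.startsWith('@@')) {
        return to.slice(1);
    }
    return to;
}

// [input, expected] pairs.
const cases = [
    ['@@neo-opus-4-7',      '@neo-opus-4-7'],   // new branch: double-@ stripped
    ['AGENT:@neo-opus-4-7', '@neo-opus-4-7'],   // existing branch
    ['@neo-opus-4-7',       '@neo-opus-4-7'],   // canonical form passes through
    ['AGENT:*',             'AGENT:*'],         // broadcast sentinel untouched
    [undefined,             undefined],         // optional chaining guards nullish input
    ['@@@neo-opus-4-7',     '@@neo-opus-4-7'],  // only ONE extra @ is removed
    ['AGENT:@@login',       '@@login']          // AGENT: branch returns before the @@ branch
];

const failures = cases.filter(([input, expected]) => normalizeMailboxTarget(input) !== expected);
```

The last two rows document limits of the function as written: a triple prefix and a combined AGENT:@@ prefix are not fully normalized. Whether that matters is a judgment call; the ticket scopes the change to the plain double-@ case.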

3. Operator-invocation docs

learn/agentos/tooling/MemoryCoreMcpAuth.md — new §Identity Normalization Migration subsection covering:

  • When to run the script (one-time operator action during upgrade to post-# substrate)
  • How to verify outcome via SQLite inventory query
  • Idempotent re-run safety
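
A post-migration check along these lines could back the "verify outcome" step. It is a sketch under assumptions: the row shape ({id}) stands in for whatever the SQLite inventory query returns, and verifyInventory, EXPECTED_IDS, and FORBIDDEN_IDS are hypothetical names introduced here.

```javascript
// Canonical set per ai/graph/identityRoots.mjs as described in this ticket.
const EXPECTED_IDS  = ['@neo-opus-4-7', '@neo-gemini-3-1-pro', '@tobiu', 'AGENT:*'];
// Aliases and test fixtures that must be gone after --apply.
const FORBIDDEN_IDS = ['@opus', '@gemini', 'AGENT:alice', 'AGENT:bob'];

// Hypothetical helper: takes inventory rows ({id, ...}) and reports drift.
function verifyInventory(rows) {
    const ids = new Set(rows.map(row => row.id));
    return {
        missing  : EXPECTED_IDS.filter(id => !ids.has(id)),
        leftover : FORBIDDEN_IDS.filter(id => ids.has(id))
    };
}

// Inventory as listed in the Context section (pre-migration).
const before = verifyInventory([
    {id: '@gemini'}, {id: '@neo-gemini-3-1-pro'}, {id: '@neo-opus-4-7'}, {id: '@opus'},
    {id: '@tobiu'}, {id: 'AGENT:*'}, {id: 'AGENT:alice'}, {id: 'AGENT:bob'}
]);

// Expected inventory once the migration has been applied.
const after = verifyInventory([
    {id: '@neo-gemini-3-1-pro'}, {id: '@neo-opus-4-7'}, {id: '@tobiu'}, {id: 'AGENT:*'}
]);
```

An empty missing list and an empty leftover list indicate a clean state; running the same check after a second migration pass should give the identical result.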

Acceptance Criteria

  • ai/scripts/normalizeGraphIdentities.mjs ships with ALIAS_MAP + PURGE_NODES, dry-run default, --apply flag
  • Script idempotent: re-running after apply is a safe no-op
  • After apply: SQLite inventory shows exactly 3 AgentIdentity-type nodes (@neo-opus-4-7, @neo-gemini-3-1-pro, @tobiu) + 1 BroadcastSentinel (AGENT:*), i.e. the 4 canonical identities. No @opus, no @gemini, no AGENT:alice/AGENT:bob.
  • Edge rewrites preserve semantic integrity — SENT_BY/SENT_TO edges that pointed at @opus now point at @neo-opus-4-7; the MESSAGE:64a3ba28-... "Re: hello all" becomes reachable to me via list_messages
  • MailboxService.normalizeMailboxTarget('@@login') returns @login — unit test coverage in MailboxService.spec.mjs
  • learn/agentos/tooling/MemoryCoreMcpAuth.md §Identity Normalization Migration documented
  • Regression check: running the mailbox test suite pre-apply on a freshly reset DB → no AGENT:alice/AGENT:bob leakage post-run (covered by #10229's :memory: + testDbPath refactor; just validating still works)

Out of Scope

  • Non-destructive CANONICAL_ALIAS edge topology — considered, rejected for v1. Preserving aliases via a redirect edge adds read-path complexity for every mailbox operation without clear benefit; the canonical nodes have all the metadata and there's no historical value to the alias nodes themselves. If future multi-user deployments need aliasing for tenant-identity-rotation, that's a separate architectural layer.
  • DreamService / Retrospective daemon awareness of the alias-merge — orthogonal. If the daemon has indexed memories or summaries referencing @opus, those references become stale pointers post-merge. Option (a) accept the staleness (low-frequency read path), option (b) reindex as part of the script (scope creep). Defer to empirical demand.
  • AGENT:* sentinel edge remap — not needed; broadcast semantics are intact.
  • ChromaDB metadata userId alias mapping — only SQLite graph in scope. Chroma metadata updates are a separate layer if / when we discover alias-based metadata drift.

Avoided Traps

  • "Soft-delete alias nodes, keep them for provenance" — rejected. Soft-delete means the routing ambiguity persists indefinitely. If tobi wants historical provenance, it's captured in Memory Core session memories, not graph topology.
  • "Extend MailboxService to transparently redirect alias-addressed messages to canonical" — rejected. Band-aid at the wrong layer. The graph-level merge is the correct substrate fix; mailbox-level redirect would mask the problem without resolving it.
  • "Purge AGENT:alice / AGENT:bob by tightening test isolation only" — insufficient. The existing nodes predate test-isolation improvements; they need explicit cleanup.
  • "Merge via ChromaDB-only metadata update" — wrong substrate. @opus and @neo-opus-4-7 are graph nodes, not Chroma metadata rows. Merging must happen at the SQLite graph layer.

Related

  • #10144 — AgentIdentity node type + GitHub account binding convention. This ticket consolidates stragglers from the pre-#10144 era.
  • #10229 — Test pollution refactor. Closes the ongoing source of AGENT:alice/AGENT:bob leakage; this ticket handles the historical pollution.
  • #10232 — Self-seed loop in GraphService.initAsync. Confirms the canonical set; alias merge does not affect this path.
  • #10016 — Multi-Tenant Identity & Data Privacy (parent sub-epic).
  • #9999 — Cloud-Native Memory Core (grand-parent). Multi-user deployments MUST NOT ship with alias-identity ambiguity; this closes the hygiene gap.

Origin Session ID

Origin Session ID: 29f490c0-41ed-4846-a767-287552d5c3c4

Handoff Retrieval Hints

Retrieval Hint: "identity aliasing @opus @gemini @neo-opus-4-7 @neo-gemini-3-1-pro merge"
Retrieval Hint: "AGENT:alice AGENT:bob test-fixture leakage production SQLite"
Retrieval Hint: "normalizeMailboxTarget @@ prefix strip"
Commit-range anchor: dev HEAD 96ae5f866 at ticket filing

tobiu referenced in commit e855706 - "feat(memory-core): graph identity normalization migration + @@ prefix strip (#10259) (#10262)" on Apr 23, 2026, 11:28 PM
tobiu closed this issue on Apr 23, 2026, 11:28 PM