Frontmatter
id: 10143
title: Graph-first Memory artifacts: lift Memory + Session to first-class nodes
state: Closed
labels: epic, ai, architecture, core
assignees: []
createdAt: Apr 21, 2026, 11:28 AM
updatedAt: Jun 5, 2026, 11:52 PM
githubUrl: https://github.com/neomjs/neo/issues/10143
author: tobiu
commentsCount: 3
parentIssue: 9999
subIssues:
  • 10151 DreamService: deterministic ingestion phase for Memory + Session graph nodes
  • 10152 SemanticGraphExtractor: emit typed edges to Memory + Session nodes
  • 10153 Lazy back-fill: Memory + Session nodes for pre-migration Chroma rows
  • 10158 Memory/Session graph lifecycle telemetry + retention policy
  • 10169 Implement TAGGED_CONCEPT auto-emit via SemanticGraphExtractor on message bodies
  • 10172 Graph node-ID case canonicalization: settle memory:/MEMORY: and session:/SESSION: convention
subIssuesCompleted: 6
subIssuesTotal: 6
blockedBy: []
blocking: [x] 10139 Extend Memory Core with Explicit A2A Primitive
closedAt: May 25, 2026, 12:47 AM

Graph-first Memory artifacts: lift Memory + Session to first-class nodes

Closed, Backlog/active-chunk-6, labels: epic, ai, architecture, core
tobiu commented on Apr 21, 2026, 11:28 AM

Graph-first Memory artifacts: lift Memory + Session to first-class nodes

Context

During this session's mailbox brainstorm, an audit of ai/daemons/DreamService.mjs showed that raw memories and session summaries are not graph-addressable. They feed SemanticGraphExtractor as source material, but only the extracted entities become nodes; the memories and summaries themselves remain Chroma-only. This asymmetry blocks mailbox (IN_REPLY_TO → Memory), identity ownership (AUTHORED_BY → Memory), cross-session thread reconstruction, and #10030 concept-edge reach-back to source.

The Problem

Memory Core artifacts live in a mixed substrate: extracted concepts/classes/methods are graph nodes, while memories/sessions are Chroma rows with no graph presence. Consequences: messages can't reply to past memories via edges (the reference degrades to a best-effort metadata string); identity ownership can't be enforced structurally; thread reconstruction has no anchor node; concept-edge traversal stops at the extraction boundary.

The Architectural Reality

  • ai/daemons/DreamService.mjs line 72: summaries in Chroma neo-agent-sessions (794 rows).
  • Line 178: raw memories in Chroma neo-agent-memory (8246 rows).
  • Line 198: SemanticGraphExtractor emits extracted nodes but no Memory or Session source node.
  • Lines 215-217: graphDigested: true marker set on summaries; no anchor nodes created.
  • SQLite (better-sqlite3) backs the graph — adding Memory + Session node types is additive schema, no service boundary crossed.

The Fix

Four phases as sub-tickets (filed alongside):

  1. Memory + Session deterministic ingestion phase in DreamService (before extractor runs; keyed by Chroma IDs; no LLM cost). (#10151)
  2. SemanticGraphExtractor / Gemma4 enhancement: extracted entities emit typed provenance edges to Memory/Session nodes (MENTIONED_IN, DISCUSSED_IN, REFERENCED_BY). (#10152)
  3. Lazy back-fill migration: historical Chroma rows become nodes on-demand when first referenced by new artifacts. (#10153)
  4. Post-ship telemetry + retention policy for Memory/Session nodes. Added 2026-04-21 per Gemini 3.1 Pro's epic-review (Stage 3 coverage gap). (#10158)
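
A minimal sketch of how phases 1 and 2 fit together, assuming an in-memory stand-in for the SQLite-backed graph. `GraphSketch`, `ingestSource`, `linkProvenance`, and the lowercase `type:id` node-ID format are hypothetical names chosen for illustration (ID casing is the subject of #10172); none of them are taken from DreamService or SemanticGraphExtractor.

```javascript
// Hypothetical in-memory stand-in for the SQLite-backed graph.
// GraphSketch, ingestSource, and linkProvenance are illustrative names, not the Neo API.
class GraphSketch {
    constructor() {
        this.nodes = new Map(); // nodeId -> {id, type, properties}
        this.edges = [];        // {source, type, target}
    }

    // Phase 1: pure Chroma-ID -> graph-node mapping, no LLM call.
    // The lowercase "type:id" format is an assumption (casing is settled in #10172).
    ingestSource(type, chromaId, properties = {}) {
        const id = `${type.toLowerCase()}:${chromaId}`;

        if (!this.nodes.has(id)) {
            this.nodes.set(id, {id, type, properties});
        }

        return id;
    }

    // Phase 2: typed provenance edge from an extracted entity to its source node.
    linkProvenance(entityId, edgeType, sourceId) {
        if (!this.nodes.has(entityId) || !this.nodes.has(sourceId)) {
            throw new Error(`Refusing to create dangling edge ${entityId} -> ${sourceId}`);
        }

        this.edges.push({source: entityId, type: edgeType, target: sourceId});
    }
}

const graph = new GraphSketch();

// Ingestion runs before extraction, so the source nodes already exist.
const sessionId = graph.ingestSource('Session', 'sess-1');
const memoryId  = graph.ingestSource('Memory', 'mem-42', {sessionId});

graph.ingestSource('Memory', 'mem-42'); // re-ingesting the same Chroma ID is a no-op

// Stand-in for an entity the extractor would emit.
graph.nodes.set('concept:mailbox', {id: 'concept:mailbox', type: 'Concept', properties: {}});
graph.linkProvenance('concept:mailbox', 'MENTIONED_IN', memoryId);
graph.linkProvenance('concept:mailbox', 'DISCUSSED_IN', sessionId);
```

Because the node ID is derived only from the Chroma ID, the same mapping could serve both the live ingestion phase and the lazy back-fill path.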

Acceptance Criteria

  • Memory + Session node types in graph schema with documented properties + indices
  • DreamService emits Memory/Session nodes before extraction; extraction attaches typed edges
  • Gemma4 extraction schema validated on a live REM cycle (new edges land as expected; no regression on existing emission)
  • Lazy back-fill triggers on first reference; graph integrity check finds no dangling edges
  • Downstream consumers (#10139, #10016) can traverse AUTHORED_BY, ORIGINATES_IN, IN_REPLY_TO without special-casing absent nodes
  • Storage-growth impact measured post-shipping; pruning/archiving policy documented (owned by #10158)
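
The "no dangling edges" criterion can be illustrated with a small check over plain arrays. This is a sketch only: `findDanglingEdges`, the edge shape, and the node IDs are hypothetical, and a real check would query the SQLite graph rather than in-memory data.

```javascript
// Hypothetical integrity check: returns every edge whose source or target node is absent.
function findDanglingEdges(nodeIds, edges) {
    const known = new Set(nodeIds);

    return edges.filter(edge => !known.has(edge.source) || !known.has(edge.target));
}

const nodeIds = ['memory:mem-1', 'message:msg-1'];
const edges = [
    {source: 'message:msg-1', type: 'IN_REPLY_TO', target: 'memory:mem-1'},
    {source: 'message:msg-1', type: 'IN_REPLY_TO', target: 'memory:mem-legacy'} // never back-filled
];

const dangling = findDanglingEdges(nodeIds, edges);
console.log(dangling.map(edge => edge.target)); // [ 'memory:mem-legacy' ]
```

An empty result is what lets downstream consumers traverse AUTHORED_BY, ORIGINATES_IN, and IN_REPLY_TO without special-casing absent nodes.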

Out of Scope

  • Eager back-fill of all historical memories (cost-prohibitive at 8246-row scale; lazy is sufficient)
  • Chroma schema changes (body text stays; graph is additive topology layer)
  • Cross-tenant permission enforcement on memory/session nodes (under #10016)

Avoided Traps

  • Extending SemanticGraphExtractor to emit Memory/Session nodes inside tri-vector extraction. Rejected. Deterministic ingestion should not ride on an LLM call — it's pure Chroma-ID → graph-node mapping. Separate phase preserves extractor focus, eliminates LLM cost per memory, makes back-fill trivial.
  • Forward-only migration. Rejected. Permanent dangling edges break integrity invariants; every downstream consumer has to special-case "maybe-null target."
  • Storing body text in the graph. Rejected. Chroma is load-bearing for embeddings; body stays there. Graph holds topology + lightweight properties only.
  • Concurrency races during lazy back-fill. Rejected via mitigation. Two graph edge creations referencing the same missing historical Memory node could both trigger lazy ingestion simultaneously, causing unique-constraint violations. SQLite ingestion MUST use INSERT OR IGNORE or an equivalent mutex so concurrent trigger paths converge safely. Flagged by Gemini 3.1 Pro's epic-review on 2026-04-21 (session 7a73e53f-801a-490f-b693-b431189aa1a9); implementation detail belongs in #10153.
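
The concurrency trap can be sketched as follows, assuming the lazy ingestion involves an asynchronous Chroma lookup. `LazyBackfill`, `ensureNode`, and `fetchRow` are hypothetical names. The sketch shows the "equivalent mutex" route (concurrent callers share one pending ingestion) plus an insert guard that mirrors what `INSERT OR IGNORE` would do at the SQLite level; the actual implementation belongs in #10153.

```javascript
// Hypothetical lazy back-fill helper. fetchRow stands in for an async Chroma lookup.
class LazyBackfill {
    constructor(fetchRow) {
        this.fetchRow = fetchRow;
        this.nodes    = new Map(); // nodeId -> node
        this.inFlight = new Map(); // nodeId -> pending ingestion promise
    }

    ensureNode(nodeId) {
        if (this.nodes.has(nodeId)) {
            return Promise.resolve(this.nodes.get(nodeId));
        }

        // Mutex-equivalent: concurrent callers share one pending ingestion.
        if (!this.inFlight.has(nodeId)) {
            const pending = this.fetchRow(nodeId)
                .then(row => {
                    // Same effect as INSERT OR IGNORE: an existing node wins.
                    if (!this.nodes.has(nodeId)) {
                        this.nodes.set(nodeId, {id: nodeId, ...row});
                    }

                    return this.nodes.get(nodeId);
                })
                .finally(() => this.inFlight.delete(nodeId));

            this.inFlight.set(nodeId, pending);
        }

        return this.inFlight.get(nodeId);
    }
}

let fetchCalls = 0;
const backfill = new LazyBackfill(async () => {
    fetchCalls++;
    return {type: 'Memory'};
});

// Two edge creations reference the same missing historical node at once.
const first  = backfill.ensureNode('memory:mem-legacy');
const second = backfill.ensureNode('memory:mem-legacy');

Promise.all([first, second]).then(([a, b]) => {
    console.log(a === b, fetchCalls, backfill.nodes.size); // true 1 1
});
```

Both trigger paths converge on a single node and a single lookup, so no unique-constraint violation can arise from the second caller.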

Related

  • Parent: #9999
  • Unblocks: #10139 (mailbox), #10016 (AgentIdentity ownership edges)
  • Touches: #10030 (concept-edge reach post-shipping)
  • Epic-review: Gemini 3.1 Pro, session 7a73e53f-801a-490f-b693-b431189aa1a9, 2026-04-21
  • Files: ai/daemons/DreamService.mjs, ai/daemons/services/SemanticGraphExtractor.mjs, graph schema

Origin Session ID: 71dc3cd8-d39d-48e1-ac62-e240ca67d1a5

tobiu added the epic label on Apr 21, 2026, 11:28 AM
tobiu added the ai label on Apr 21, 2026, 11:28 AM
tobiu added the architecture label on Apr 21, 2026, 11:28 AM
tobiu added the core label on Apr 21, 2026, 11:28 AM
tobiu cross-referenced by #10151 on Apr 21, 2026, 11:31 AM
tobiu cross-referenced by #10152 on Apr 21, 2026, 11:31 AM
tobiu cross-referenced by #10153 on Apr 21, 2026, 11:31 AM
tobiu added parent issue #9999 on Apr 21, 2026, 11:32 AM
tobiu added sub-issue #10151 on Apr 21, 2026, 11:33 AM
tobiu added sub-issue #10152 on Apr 21, 2026, 11:33 AM
tobiu added sub-issue #10153 on Apr 21, 2026, 11:33 AM
tobiu marked this issue as blocking #10139 on Apr 21, 2026, 11:33 AM
tobiu cross-referenced by #10154 on Apr 21, 2026, 1:17 PM
tobiu cross-referenced by PR #10155 on Apr 21, 2026, 1:32 PM
tobiu commented on Apr 21, 2026, 3:06 PM

Input from Gemini 3.1 Pro (Antigravity):

✦ Epic Review by Gemini 3.1 Pro (Antigravity)

Stage 1 — Roadmap Fit

Perfectly aligned with the multi-agent orchestration roadmap. Lifting Memory and Session to graph nodes is structurally required to unblock AgentIdentity ownership (#10016) and the mailbox pipeline (#10139).

Stage 2 — Approach Elegance

Highly elegant. Separating the deterministic Chroma-ID-to-graph mapping (Phase 1) from the LLM-driven semantic extraction (Phase 2) prevents tying structural node creation to an expensive and non-deterministic LLM call. Relying on lazy back-fill (Phase 3) protects deployment velocity while maintaining graph invariants.

Stage 3 — Sub-Structure Coherence

⚠️

  • Missing prerequisite sub: The Epic's Acceptance Criteria include "Storage-growth impact measured post-shipping; pruning/archiving policy documented." However, none of the 3 sub-issues (#10151, #10152, #10153) explicitly claims ownership of this documentation and impact measurement. Recommend adding a 4th sub-ticket for Post-Ship Telemetry & Archival Policy.

Stage 4 — Prescription Layer

The service boundaries are exceptionally clean. #10151 handles the deterministic schema and pipeline injection, while #10152 handles the LLM prompt and edge resolution. No substrate drift.

Stage 5 — Avoided Traps Completeness

⚠️

  • Suggested addition: Add a trap for "Concurrency race conditions during lazy back-fill." In #10153, if two concurrent graph edge creations reference the same missing historical Memory node simultaneously, they might both trigger the lazy ingestion. The SQLite ingestion logic must use an INSERT OR IGNORE or equivalent mutex to prevent Unique Constraint violations.

Review verdict: Greenlight

Origin Session ID: 7a73e53f-801a-490f-b693-b431189aa1a9

tobiu cross-referenced by #10158 on Apr 21, 2026, 3:23 PM
tobiu added sub-issue #10158 on Apr 21, 2026, 3:23 PM
tobiu cross-referenced by PR #10161 on Apr 21, 2026, 4:36 PM
tobiu cross-referenced by #9999 on Apr 21, 2026, 7:03 PM
tobiu cross-referenced by PR #10165 on Apr 21, 2026, 8:59 PM
tobiu cross-referenced by #10147 on Apr 21, 2026, 10:30 PM
tobiu cross-referenced by #10139 on Apr 21, 2026, 11:02 PM
tobiu added sub-issue #10169 on Apr 21, 2026, 11:24 PM
tobiu cross-referenced by PR #10171 on Apr 22, 2026, 12:47 AM
tobiu cross-referenced by #10172 on Apr 22, 2026, 1:19 AM
tobiu added sub-issue #10172 on Apr 22, 2026, 1:19 AM
tobiu cross-referenced by #10332 on Apr 26, 2026, 1:58 AM