Context
Discovered during a 2026-04-28 runSandman cycle with empirical evidence that failed memory ingestion is silently masked as full ingestion, leading to permanently invisible memory rows in the graph. The trigger today was the post-FileSystemIngestor SQLite IN-clause overflow (separate ticket), but the corruption pattern itself is independent of any specific trigger — it's a gating bug in DreamService.processUndigestedSessions.
The Problem
When MemorySessionIngestor.syncSessionToGraph partially fails — e.g., 100 of 133 memories upserted, 33 errored on individual GraphService.upsertNode calls — those 33 errors are caught silently in a per-memory try/catch and accumulated into stats.errors. The session-level loop in DreamService then proceeds to call SemanticGraphExtractor.executeTriVectorExtraction(session), which reads session.document (text-only, no graph dependency). If the LLM extraction succeeds, DreamService marks the session graphDigested: true regardless of how many memories actually made it into the graph.
On the next REM cycle, findUndigestedSessions filters that session out (graphDigested === true). The 33 missing memories never get re-attempted. Lazy back-fill (#10153) only triggers when a future extraction emits an edge referencing a missing memory:<id> — if no such reference materialises, the memories stay orphaned permanently.
This produces the observed pattern: "latest session summaries missing" — memories silently dropped during ingestion of recent sessions, no log signal loud enough to surface, no re-attempt path.
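The masking can be reproduced in isolation. The sketch below is a simplified model under stated assumptions, not the real services: `ingestSession`, `extract`, and the Set-backed `graph` are hypothetical stand-ins for MemorySessionIngestor.syncSessionToGraph, SemanticGraphExtractor.executeTriVectorExtraction, and the graph store.

```javascript
// Simplified model of the failure mode. Every name here is a hypothetical
// stand-in for illustration, not the real service code.
function ingestSession(session, graph, failingIds) {
    const stats = {memoriesUpserted: 0, errors: []};

    for (const id of session.memoryIds) {
        try {
            if (failingIds.has(id)) {
                throw new Error('simulated upsertNode failure');
            }
            graph.add(`memory:${id}`);
            stats.memoriesUpserted++;
        } catch (e) {
            // Same shape as the per-memory catch: record the error and continue.
            stats.errors.push(`[${id}] ${e.message}`);
        }
    }

    return stats;
}

// Text-only extraction: succeeds regardless of what reached the graph.
const extract = session => session.document.length > 0;

const session = {
    id       : 's1',
    document : 'summary text',
    memoryIds: ['a', 'b', 'c', 'd'],
    meta     : {graphDigested: false}
};
const graph = new Set();

const ingestStats = ingestSession(session, graph, new Set(['c', 'd']));
const success     = extract(session);

// Current gate: only the extraction result is consulted.
if (success) {
    session.meta = {...session.meta, graphDigested: true};
}

// Next cycle: the session is filtered out, so 'c' and 'd' are never retried.
const undigested = [session].filter(s => s.meta.graphDigested !== true);

console.log(ingestStats.memoriesUpserted, ingestStats.errors.length, undigested.length); // 2 2 0
```

Two of the four memories never reach the graph, yet the session no longer appears in the undigested set.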
The Architectural Reality
Three converging conditions:
1. ai/daemons/services/MemorySessionIngestor.mjs:196-231 swallows errors in a per-memory try/catch:
    for (let i = 0; i < rawMemories.ids.length; i++) {
        try {
            // upsertNode + linkNodes
            stats.memoriesUpserted++;
        } catch (e) {
            stats.errors.push(`[${rawMemories.ids[i]}] ${e.message}`);
        }
    }
2. ai/daemons/DreamService.mjs:202-205 never reads ingestStats.errors:
    const ingestStats = await MemorySessionIngestor.syncSessionToGraph(session);
    logger.info(`[DreamService] -> Memory/Session graph ingestion took: ${ingestTime}s (${ingestStats.memoriesUpserted} upserted, ${ingestStats.memoriesSkipped} skipped)`);
   The log format omits the error count entirely.
3. ai/daemons/DreamService.mjs:224-228 gates graphDigested on LLM extraction only:
    if (success) {
        await this.sessionsCollection.update({
            ids: [session.id],
            metadatas: [{ ...session.meta, graphDigested: true }]
        });
    }
   success reflects only the return value of SemanticGraphExtractor.executeTriVectorExtraction; memory-ingestion errors don't gate this update.
SemanticGraphExtractor's own error path is structurally safe — non-fetch errors return null (line 264), so partial graph-extraction writes don't poison graphDigested. The dangerous path is exclusively MemorySessionIngestor → DreamService.
The Fix
Tighten the gate in the per-session loop of ai/daemons/DreamService.mjs:
    const ingestStats = await MemorySessionIngestor.syncSessionToGraph(session);
    const ingestErrors = ingestStats.errors?.length ?? 0;
    if (ingestErrors > 0) {
        logger.warn(`[DreamService] Session ${session.meta.sessionId} had ${ingestErrors} memory-ingestion error(s); graphDigested will NOT be set this cycle.`);
    }
    // ... existing extractor + topology + gap inference calls ...
    if (success && ingestErrors === 0) {
        await this.sessionsCollection.update({
            ids: [session.id],
            metadatas: [{ ...session.meta, graphDigested: true }]
        });
        logger.info(`[DreamService] Session ${session.meta.sessionId} marked as graphDigested in Memory Core.`);
    }
This makes ingestion errors self-healing: a session with failed memories stays undigested, is re-attempted on the next REM cycle, and typically succeeds then because the offending error (e.g., transient SQLite saturation) has cleared between runs.
Also surface the error count in the existing INFO-tier ingestion log so operators have a fast-path signal:
    logger.info(`[DreamService] -> Memory/Session graph ingestion took: ${ingestTime}s (${ingestStats.memoriesUpserted} upserted, ${ingestStats.memoriesSkipped} skipped, ${ingestErrors} errors)`);
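The tightened gate reduces to a small truth table. The helper below is a minimal sketch for illustration only: `canMarkDigested` is a hypothetical name and is not part of DreamService. It mirrors the `?? 0` fallback from the fix, so a stats object without an errors array counts as zero errors.

```javascript
// Hypothetical helper, shown only to make the gate's truth table explicit.
function canMarkDigested(success, ingestStats) {
    // Same fallback as the fix: a missing errors array counts as zero errors.
    const ingestErrors = ingestStats?.errors?.length ?? 0;

    return Boolean(success) && ingestErrors === 0;
}

console.log(canMarkDigested(true, {errors: []}));            // true
console.log(canMarkDigested(true, {errors: ['[m1] boom']})); // false
console.log(canMarkDigested(null, {errors: []}));            // false
console.log(canMarkDigested(true, {}));                      // true
```

Only the first and last cases set the flag; a failed extraction or any recorded ingestion error leaves the session eligible for the next cycle.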
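Acceptance Criteria
- DreamService.processUndigestedSessions reads ingestStats.errors.length and only sets graphDigested: true if extraction succeeded AND ingestion errors are zero
- A WARN-level log is emitted when ingestStats.errors.length > 0, including session ID and error count
- test/playwright/unit/ai/daemons/services/MemorySessionIngestor.spec.mjs extended (or new sibling spec) verifies that simulated per-memory errors block graphDigested propagation
- An upsertNode failure on 1 memory shows the session NOT marked graphDigested and re-attempted on the next cycle
- Commit references (#TICKET_ID) per AGENTS.md §3 Gate 1; type=fix, scope=ai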
Out of Scope
- Backfilling already-orphaned memories from prior runs. A separate substrate (graph-integrity audit, filed as sibling ticket) handles detection + backfill of pre-existing silent drops.
- Eliminating the per-memory try/catch itself. The catch is correct in shape; the bug is the missing downstream gate, not the catch's existence.
- Distinguishing transient from permanent ingestion errors. Future work could classify error severity (e.g., SQLITE_BUSY retryable vs schema-mismatch fatal); for now, treat any error as "session re-attempt needed."
Avoided Traps
- Re-throwing per-memory errors out of MemorySessionIngestor. Rejected. The per-memory loop should keep processing a session's remaining memories even when one fails; partial progress is real progress as long as the session isn't prematurely marked digested. Bubble the count, not the throw.
- Adding a new state value graphDigested: 'partial'. Rejected. A three-state flag adds query complexity downstream (findUndigestedSessions would need to handle 'partial' specifically). A boolean is sufficient if the gate is correct.
- Auto-retry within the same REM cycle. Rejected. A same-cycle retry is unlikely to succeed, because the SQLite saturation that made MemorySessionIngestor fail is still present. Cross-cycle retry is the correct interval: typically the delta log gets consumed, the saturation clears, and the next run succeeds.
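The cross-cycle behaviour chosen above can be sketched as a two-cycle model. This is an illustration under assumptions, not the real loop: `runCycle` and the Set-backed `graph` are hypothetical, extraction is assumed to succeed, and re-adding an already ingested memory is assumed to be harmless, as an upsert would be.

```javascript
// Hypothetical two-cycle model of "bubble the count, not the throw".
function runCycle(sessions, graph, failingIds) {
    // Only sessions not yet marked digested are picked up.
    for (const session of sessions.filter(s => s.meta.graphDigested !== true)) {
        const errors = [];

        for (const id of session.memoryIds) {
            try {
                if (failingIds.has(id)) {
                    throw new Error('simulated transient failure');
                }
                graph.add(`memory:${id}`); // assumed idempotent, like an upsert
            } catch (e) {
                errors.push(`[${id}] ${e.message}`); // keep going, keep the count
            }
        }

        const success = true; // extraction assumed to succeed in this model

        // Tightened gate: both conditions must hold.
        if (success && errors.length === 0) {
            session.meta = {...session.meta, graphDigested: true};
        }
    }
}

const sessions = [{id: 's1', memoryIds: ['a', 'b', 'c'], meta: {graphDigested: false}}];
const graph    = new Set();

runCycle(sessions, graph, new Set(['b'])); // cycle 1: 'b' fails
const afterFirst = {digested: sessions[0].meta.graphDigested, nodes: graph.size};

runCycle(sessions, graph, new Set());      // cycle 2: the saturation has cleared
const afterSecond = {digested: sessions[0].meta.graphDigested, nodes: graph.size};

console.log(afterFirst, afterSecond);
```

After the first cycle the session is still undigested with two of three memories in the graph; the second cycle fills the gap and only then sets the flag.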
Related
- Trigger context: SQLite IN-clause overflow in getDeltaLog (sibling ticket filed same session; saturated GraphLog post-FileSystemIngestor produces the per-memory upsertNode errors)
- Detection substrate: graph-integrity audit (sibling ticket — periodic SESSION→memory completeness check to surface historical silent drops)
- Adjacent: #10153 (lazy back-fill mechanism — only triggers on edge reference, doesn't cover orphaned memories that are never referenced)
- Adjacent: #10143 (Memory + Session as first-class graph nodes — shipped 2026-04-21; this ticket is a gating bug in the consumer of that substrate)
Origin Session ID: 4bb6859b-860f-440d-9055-320e20b0ee22
Retrieval Hint: MemorySessionIngestor silent per-memory error swallow graphDigested premature true partial-ingestion mask