Context
Operator ran `npm run ai:run-sandman` on 2026-05-20. The run reached GraphMaintenanceService vector apoptosis, detected 8344 orphaned nodes, then logged a DreamService error while still printing `✅ Sandman cycle complete.` and exiting with code 0.
Observed stack:
[INFO] [GraphMaintenanceService] Apoptosis detected 8344 orphaned nodes. Commencing eradication...
[ERROR] [DreamService] Failed to process undigested sessions: TypeError: Cannot read properties of null (reading 'id')
at Store.getKey (src/data/Store.mjs:554:25)
at Store.splice (src/collection/Base.mjs:1483:45)
at Store.splice (ai/graph/Store.mjs:161:30)
at Store.remove (src/collection/Base.mjs:1396:14)
at Database.removeNode (ai/graph/Database.mjs:413:18)
at GraphService.mjs:986:43
at Array.forEach (<anonymous>)
at GraphService.mjs:986:21
at Database.transaction (ai/graph/Database.mjs:474:13)
at GraphService.removeNodes (ai/services/memory-core/GraphService.mjs:985:17)
...
✅ Sandman cycle complete.
Process finished with exit code 0
The Problem
The REM maintenance path can fail during apoptosis deletion and still look successful to the caller. This is dangerous for operator runs and future automation because the process-level success signal no longer means the REM cycle completed.
There are two coupled failure surfaces:
- `GraphService.removeNodes()` can pass an invalid/null node id into `Database.removeNode()` during apoptosis cleanup.
- `DreamService.processUndigestedSessions()` catches the error, logs it, and neither rethrows nor returns a failure result; `buildScripts/ai/runSandman.mjs` then prints success and sets `process.exitCode = 0`.
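The second failure surface can be illustrated with a minimal sketch. Every name here (`failingMaintenance`, `processSwallowing`, `processRethrowing`, `runCycle`) is a hypothetical stand-in, and the calls are simplified to synchronous code, whereas the real method is awaited. It only shows why a catch-and-log handler leaves the caller reporting success.

```javascript
// Hypothetical stand-ins, simplified to synchronous calls.
function failingMaintenance() {
    throw new TypeError("Cannot read properties of null (reading 'id')");
}

// Catch-and-log: the error never reaches the caller.
function processSwallowing(log) {
    try {
        failingMaintenance();
    } catch (error) {
        log.push(`[ERROR] ${error.message}`);
    }
}

// Log-and-rethrow: the caller still sees the failure.
function processRethrowing(log) {
    try {
        failingMaintenance();
    } catch (error) {
        log.push(`[ERROR] ${error.message}`);
        throw error;
    }
}

// Caller that reports success unless an exception escapes.
function runCycle(processFn) {
    const log = [];
    let exitCode = 0;
    try {
        processFn(log);
        log.push('Sandman cycle complete.');
    } catch (error) {
        exitCode = 1;
    }
    return {exitCode, log};
}

const swallowed  = runCycle(processSwallowing);
const propagated = runCycle(processRethrowing);

console.log(swallowed.exitCode, swallowed.log.length);   // 0 2
console.log(propagated.exitCode, propagated.log.length); // 1 1
```

With the swallowing variant the log holds both the error line and the completion line, which matches the shape of the observed run.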
The Architectural Reality
- `ai/daemons/services/GraphMaintenanceService.mjs:49-53` calls `GraphService.getOrphanedNodes()` and then `GraphService.removeNodes(orphaned)`.
- `ai/services/memory-core/GraphService.mjs:952-975` selects orphan rows from SQLite `Nodes` and pushes `row.id` into the deletion list without validating the id.
- `ai/services/memory-core/GraphService.mjs:982-988` wraps `nodeIds.forEach(id => this.db.removeNode(id))` in `Database.transaction()`.
- `ai/graph/Database.mjs:407-414` calls `me.nodes.remove(nodeId)` directly.
- `src/collection/Base.mjs:1294-1297` treats `null` as an item because `typeof null === 'object'`.
- `src/data/Store.mjs:542-554` then calls `item[keyProperty]`, which throws for `item === null`.
- `ai/daemons/DreamService.mjs:248-254` catches and logs the failure without propagating it.
- `buildScripts/ai/runSandman.mjs:218-224` awaits `DreamService.processUndigestedSessions()`, then unconditionally synthesizes Golden Path, prints success, and sets exit code 0 if no exception escapes.
- `ai/graph/storage/SQLite.mjs:83-87` declares `Nodes.id TEXT PRIMARY KEY` without explicit `NOT NULL`; existing SQLite rowid-table semantics can allow historical null primary-key rows unless explicitly constrained or guarded at insert time.
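The `typeof null` step in the chain above can be reproduced in isolation. The sketch below is not the actual `Base.mjs` or `Store.mjs` source; `isItem`, `getKey`, and `resolveKey` are simplified, hypothetical reconstructions of the behavior this ticket describes, assuming a key property of `id`.

```javascript
// Hypothetical stand-ins for the checks described above.
const keyProperty = 'id';

// Any value whose typeof is 'object' counts as an item, which includes null.
function isItem(value) {
    return typeof value === 'object';
}

// Reads the key property off the item; throws a TypeError for null.
function getKey(item) {
    return item[keyProperty];
}

// Resolves a remove() argument to a key: items are looked up, ids pass through.
function resolveKey(itemOrKey) {
    return isItem(itemOrKey) ? getKey(itemOrKey) : itemOrKey;
}

console.log(resolveKey('node-1'));       // node-1
console.log(resolveKey({id: 'node-2'})); // node-2

try {
    resolveKey(null);
} catch (error) {
    // null passed the item check and was routed into getKey()
    console.log(error instanceof TypeError); // true
}
```

A string id never reaches `getKey()`, so only a null (or otherwise object-typed) entry in the deletion list triggers the crash.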
The Fix
Implement a narrow hardening pass across the owning boundaries:
- Add validation/repair handling so GraphService apoptosis never calls `Database.removeNode(null)` or otherwise routes invalid ids through `Store.remove()`.
- Prevent future invalid graph node ids from entering SQLite through `SQLite.addNodes()` / graph upsert paths, with an explicit diagnostic rather than a latent corrupt row.
- Ensure `DreamService.processUndigestedSessions()` exposes failure to callers, either by rethrowing after logging or by returning a structured failure result consumed by `runSandman`.
- Make `runSandman` fail non-zero whenever REM processing failed, while preserving the heavy-maintenance lease semantics and the held-lease early-exit behavior.
- Add focused unit coverage for the null-id apoptosis guard and the Sandman/DreamService failure propagation contract.
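One possible shape for the two id guards is sketched below. The helper names (`isValidNodeId`, `partitionNodeIds`, `assertValidNodeIds`) and the error wording are assumptions made for illustration, not existing APIs; the only rule taken from this ticket is that a node id must be a non-empty string.

```javascript
// Hypothetical helper: a node id is valid only if it is a non-empty string.
function isValidNodeId(id) {
    return typeof id === 'string' && id.length > 0;
}

// Deletion-side guard: split ids so invalid ones are reported, not removed.
function partitionNodeIds(nodeIds) {
    const valid = [], invalid = [];
    for (const id of nodeIds) {
        (isValidNodeId(id) ? valid : invalid).push(id);
    }
    return {valid, invalid};
}

// Insert-side guard: throw an explicit error before anything is persisted.
function assertValidNodeIds(nodes) {
    nodes.forEach((node, index) => {
        const id = node == null ? undefined : node.id;
        if (!isValidNodeId(id)) {
            throw new TypeError(`Invalid node id at index ${index}: ${String(id)}`);
        }
    });
}

const {valid, invalid} = partitionNodeIds(['a', null, '', 'b', undefined]);
console.log(valid);          // [ 'a', 'b' ]
console.log(invalid.length); // 3
```

In this sketch only `valid` would be handed on for removal, while `invalid` feeds the diagnostic; whether invalid rows are skipped, repaired, or treated as fatal is left open here, as it is in the ticket.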
Contract Ledger Matrix
| Target Surface | Source of Authority | Proposed Behavior | Fallback | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| `GraphService.removeNodes(nodeIds)` | This ticket + 2026-05-20 Sandman stack | Invalid/null ids are rejected or repaired before `Database.removeNode()`; no `Store.getKey(null)` TypeError | Emit explicit diagnostic naming invalid node ids / corrupt rows | JSDoc on `removeNodes()` and/or `getOrphanedNodes()` | Unit test passes `[null]` or corrupt orphan row and proves no `Store.getKey(null)` crash |
| SQLite node insert path | This ticket + schema audit | New node writes require a non-empty string id | Throw explicit invalid-node-id error before insert | JSDoc on `SQLite.addNodes()` | Unit test proves null/undefined node id is rejected before SQLite persistence |
| `DreamService.processUndigestedSessions()` | This ticket + runSandman caller contract | REM failures are observable to callers | Structured failure result if rethrow would break existing caller | JSDoc on method return/error contract | Unit test proves GraphMaintenanceService failure reaches caller |
| `npm run ai:run-sandman` | Operator CLI contract | Fatal REM failure exits non-zero and does not print successful completion | Held lease still exits 0 without mutation, as today | Script comments near exitCode handling | Script-level test or unitized dependency-injected test proves failure => exitCode 1 |
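The last two rows of the matrix could fit together roughly as follows. This is a sketch under stated assumptions: the `{ok, error}` result shape, the function names `processWithResult` and `resolveOutcome`, and the held-lease message are all hypothetical, and the ticket leaves open whether the real fix rethrows or returns a structured result.

```javascript
// Hypothetical structured result in place of a swallowed error.
function processWithResult(maintenanceFn) {
    try {
        maintenanceFn();
        return {ok: true, error: null};
    } catch (error) {
        return {ok: false, error};
    }
}

// Maps run state to an exit code and a final message.
function resolveOutcome({leaseHeld, remResult}) {
    if (leaseHeld) {
        // Held lease: exit 0 without mutation (message text is illustrative).
        return {exitCode: 0, message: 'Lease held elsewhere; skipping run.'};
    }
    if (!remResult.ok) {
        return {exitCode: 1, message: `REM processing failed: ${remResult.error.message}`};
    }
    return {exitCode: 0, message: 'Sandman cycle complete.'};
}

const failed = processWithResult(() => {
    throw new TypeError("Cannot read properties of null (reading 'id')");
});
const outcome = resolveOutcome({leaseHeld: false, remResult: failed});

console.log(outcome.exitCode); // 1
console.log(outcome.message);
```

Keeping the mapping in a pure function is one way to get the dependency-injected test named in the Evidence column without spawning the script.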
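Acceptance Criteria
- A focused unit test reproduces the `Store.getKey(null)` path from apoptosis deletion and verifies the new guard prevents the TypeError.
- Invalid node ids cannot pass through the `addNodes()` persistence path without an explicit error.
- `DreamService.processUndigestedSessions()` no longer silently swallows a GraphMaintenanceService failure from callers.
- `buildScripts/ai/runSandman.mjs` exits non-zero and does not print `✅ Sandman cycle complete.` when REM processing fails.
- Operator reruns `npm run ai:run-sandman`; the previous `Cannot read properties of null (reading 'id')` error is gone, and any future fatal REM error returns non-zero.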
Out of Scope
- Manually deleting or rewriting all current orphan rows outside the tested repair/guard path.
- Reworking vector-dimension mismatch configuration (`gemini-embedding-001` 3072 vs configured 4096); that warning is visible in the same log but is a separate configuration/collection-dimension concern.
- Changing apoptosis retention policy or protected node labels.
- Reworking heavy-maintenance lease acquisition semantics.
Avoided Traps
- Treating the exit code 0 as success: rejected. The logged DreamService error falsifies the success claim.
- Folding this into #11595: rejected. #11595 fixed string-shaped rollback payloads; this stack reaches `Store.getKey(null)` before that rollback failure class.
- Only changing `Collection.Base.isItem(null)`: rejected as an insufficient sole fix. It may avoid this TypeError, but it does not explain or prevent invalid graph ids entering apoptosis or fix the swallowed REM failure.
- Only skipping null ids in apoptosis: rejected as incomplete. It leaves Sandman able to hide future fatal REM failures behind exit code 0.
Related
- #11595 / PR #11611: adjacent prior apoptosis rollback shape bug, closed/merged.
- `ai/daemons/services/GraphMaintenanceService.mjs:49-53`
- `ai/services/memory-core/GraphService.mjs:952-988`
- `ai/graph/Database.mjs:407-414`
- `src/collection/Base.mjs:1294-1297`
- `src/data/Store.mjs:542-554`
- `ai/daemons/DreamService.mjs:248-254`
- `buildScripts/ai/runSandman.mjs:218-224`
Origin Session ID: d13c94dd-e721-4e28-ac9e-4d0b3c0f66de
Retrieval Hint: query_raw_memories("Sandman apoptosis null node Store.getKey exit code 0 DreamService")
Retrieval Hint: query_raw_memories("GraphService removeNodes null id runSandman success after error")