Update 2026-05-10 (post @neo-gpt cycle-2 review on PR #11143): Scope narrowed to graph-only. Chroma #importMemories preserve-live parity has been split into named follow-up #11144. Post-restore hook narrowed to filesystem-ingestor-only allowlist (dream-service explicitly disallowed per peer-review — REM cycle does graph mutation/inference and would blur recovery validation). The Acceptance Criteria below have been updated to reflect the actual close-target shape; original prescription preserved as the design-intent record. PR #11143 implements the graph-only scope.
Premise
npm run ai:restore (buildScripts/ai/restore.mjs, shipped via #10871) advertises a two-mode contract: --mode merge (idempotent, "preserves operator additions") and --mode replace (destructive, gated). Operator's framing 2026-05-10 captured the intended semantics: "restore logic MERGES latest content with backup. if our witch hunt does not find all culprits → next time we just trigger the restore script, and we are fine."
V-B-A on Memory_DatabaseService.#importGraph (line 252) reveals the actual implementation does NOT honor the merge intent:
const insertNode = db.prepare('INSERT OR REPLACE INTO Nodes (id, user_id, data) VALUES (?, ?, ?)');
const insertEdge = db.prepare(`INSERT OR REPLACE INTO Edges (id, user_id, source, target, type, data) VALUES (?, ?, ?, ?, ?, ?)`);
The mode parameter only gates the truncate step (replace truncates first; merge skips truncate). Both modes use INSERT OR REPLACE — meaning when a backup row's ID matches a live row, the backup OVERWRITES the live version, regardless of which is more recent.
Empirical anchor (today's incident)
2026-05-10 graph-wipe incident: Store.clear → onNodesMutate → SQLite.removeNodes bypassed #10845's SQLite.clear() guard (root cause being addressed in #11140). Live graph went from 22,660 nodes → 3,545. Restoration analysis (/Users/Shared/github/neomjs/neo/.neo-ai-data/recovery-2026-05/graph-restore-dryrun.mjs, read-only) on the May 10 01:11 backup showed:
Filter-out: 22,756 elements (FILE 8299 + DIRECTORY 1141 + CONTAINS 9349 — regenerable via FileSystemIngestor; KB_GAP 395 + TOOLING_GAP 365 + DISCOVERED_IN/EVALUATED_BY 3104 — operator-classified garbage per feedback_audit_substrate_guides_before_architectural_claims)
Already-present in live: 3,580 (post-wipe re-ingestion via gh-workflow + retrospective daemon: ISSUE 1017 + RETROSPECTIVE 792 + PULL_REQUEST 711 etc.)
Would INSERT: 11,589 (CONCEPT 4216 + AGENT_MEMORY 4136 + CLASS 844 + MESSAGE 409 + valuable edges)
With current --mode merge, the 3,580 already-present rows would be OVERWRITTEN by their backup versions — losing the post-wipe re-ingestion that's actually MORE current than backup. Wrong-shape for incident-recovery.
Companion observation: the 22,756 filtered set is mostly inert noise (FileSystemIngestor regenerates filesystem mirror deterministically; KB_GAP/TOOLING_GAP have many hallucinated-per-file gaps). Restoring blindly inflates DB with stale + garbage substrate.
Prescription
Enhance runRestore + Memory_DatabaseService.#importGraph (and Chroma counterpart in #importMemories):
1. Core semantic correction (the primitive fix)
Change #importGraph merge-mode INSERT statements:
// BEFORE
const insertNode = db.prepare('INSERT OR REPLACE INTO Nodes ...');
// AFTER
const insertNode = mode === 'replace'
    ? db.prepare('INSERT OR REPLACE INTO Nodes ...')
    : db.prepare('INSERT OR IGNORE INTO Nodes ...');
replace mode keeps current behavior (truncate-then-OR-REPLACE; destructive). merge mode flips to OR IGNORE — preserves live rows when IDs collide.
Same change for Edges INSERT statement.
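One way to keep the Nodes and Edges statements from drifting apart is to derive both from the mode in a single place. The following is a minimal sketch, not the shipped implementation: `buildInsertSql` and `CONFLICT_CLAUSE` are hypothetical names, and the column lists are copied from the statements quoted in the Premise.

```javascript
// Hypothetical helper: selects the SQLite conflict clause from the restore mode.
const CONFLICT_CLAUSE = new Map([
    ['replace', 'OR REPLACE'], // destructive: backup row wins on ID collision
    ['merge',   'OR IGNORE']   // preserve-live: existing row wins on ID collision
]);

function buildInsertSql(table, columns, mode) {
    if (!CONFLICT_CLAUSE.has(mode)) {
        throw new Error(`Unknown restore mode: ${mode}`);
    }

    const placeholders = columns.map(() => '?').join(', ');

    return `INSERT ${CONFLICT_CLAUSE.get(mode)} INTO ${table} (${columns.join(', ')}) VALUES (${placeholders})`;
}

const NODE_COLUMNS = ['id', 'user_id', 'data'];
const EDGE_COLUMNS = ['id', 'user_id', 'source', 'target', 'type', 'data'];

// Inside #importGraph these strings would be handed to db.prepare(...)
const nodeSql = buildInsertSql('Nodes', NODE_COLUMNS, 'merge');
const edgeSql = buildInsertSql('Edges', EDGE_COLUMNS, 'merge');
```

Rejecting unknown modes up front means a typo in `--mode` cannot silently fall through to either policy.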
2. Pre-import label/type filter (CLI-driven)
Add runRestore options:
--filter-labels=<csv>: drop nodes whose data.label matches any entry, and also drop any edge left with a filtered-out endpoint (orphan-endpoint edges).
--filter-edge-types=<csv> — drop edges whose data.type matches.
Implementation: pre-process the JSONL stream in #importGraph (or in runRestore before passing the file to the SDK) — read each line, JSON-parse, skip if filter matches. Streams stay constant-memory.
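The line-by-line filter described above could be sketched as a generator. This is an illustration under assumptions, not the shipped code: `filterGraphLines` and `parseCsv` are hypothetical names, and the record shape (one JSON object per line, edges distinguished by `source` and `target` fields, `data` already an object) is assumed rather than taken from the backup format.

```javascript
// Hypothetical helper: splits a --filter-*=<csv> value into trimmed entries.
const parseCsv = value => value.split(',').map(entry => entry.trim()).filter(Boolean);

// Assumed record shape: nodes are {id, data: {label}}, edges are
// {id, source, target, data: {type}}. Nodes are assumed to precede edges.
function* filterGraphLines(lines, {filterLabels = [], filterEdgeTypes = []} = {}) {
    const labels       = new Set(filterLabels);
    const edgeTypes    = new Set(filterEdgeTypes);
    const droppedNodes = new Set(); // IDs of filtered nodes, used to drop orphan edges

    for (const line of lines) {
        if (!line.trim()) continue; // skip blank lines

        const record = JSON.parse(line);
        const isEdge = 'source' in record && 'target' in record;

        if (!isEdge) {
            if (labels.has(record.data?.label)) {
                droppedNodes.add(record.id);
                continue;
            }
        } else if (
            edgeTypes.has(record.data?.type) ||
            droppedNodes.has(record.source) ||
            droppedNodes.has(record.target)
        ) {
            continue; // filtered edge type, or an endpoint was filtered out
        }

        yield record;
    }
}

const sampleLines = [
    '{"id":"n1","data":{"label":"CONCEPT"}}',
    '{"id":"n2","data":{"label":"FILE"}}',
    '{"id":"n3","data":{"label":"CLASS"}}',
    '{"id":"e1","source":"n1","target":"n2","data":{"type":"EVALUATED_BY"}}',
    '{"id":"e2","source":"n1","target":"n3","data":{"type":"CONTAINS"}}',
    '{"id":"e3","source":"n1","target":"n3","data":{"type":"EVALUATED_BY"}}'
];

const kept = [...filterGraphLines(sampleLines, {
    filterLabels   : parseCsv('FILE,DIRECTORY'),
    filterEdgeTypes: parseCsv('CONTAINS')
})];
```

Here `n2` is dropped by label, `e1` because its target was dropped, and `e2` by edge type. Note that the set of dropped node IDs grows with the number of filtered nodes, so memory stays independent of the kept rows but not of the filtered ones; if edges can precede their nodes in the file, a second pass would be needed.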
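3. Per-substrate targeting
Add runRestore option:
--only-substrate=<csv>: e.g., graph,mc → skip kb/concepts/trajectories/mailbox.
Implementation: gate the existing if (await fs.pathExists(layout.<substrate>)) ... blocks on inclusion list.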
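The inclusion-list gate could be as small as a predicate built once from the `--only-substrate` value. A minimal sketch, assuming a hypothetical `createSubstrateGate` helper and assuming that an absent flag means "restore every substrate" (the source does not state the default):

```javascript
// Hypothetical helper: turns --only-substrate=<csv> into a predicate.
// Assumption: an absent or empty flag restores every substrate.
function createSubstrateGate(onlySubstrate) {
    const included = new Set(
        (onlySubstrate ?? '').split(',').map(name => name.trim()).filter(Boolean)
    );

    return substrate => included.size === 0 || included.has(substrate);
}

const shouldRestore = createSubstrateGate('graph,mc');

// Intended use inside runRestore, next to the existing existence check:
//   if (shouldRestore('<substrate>') && await fs.pathExists(layout.<substrate>)) { ... }
const restoreGraph = shouldRestore('graph');
const restoreKb    = shouldRestore('kb');
```

Placing the predicate before the `fs.pathExists` call keeps skipped substrates from touching the filesystem at all.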
4. Post-restore hooks
Add runRestore option:
--post-restore-hook=<name> — accept filesystem-ingestor (most common; regenerates FILE/DIRECTORY/CONTAINS deterministically) and dream-service (for full REM cycle).
Implementation: hook table; filesystem-ingestor calls FileSystemIngestor.syncWorkspaceToGraph() after restore completes.
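A hook table with the allowlist from the 2026-05-10 update (filesystem-ingestor only, dream-service explicitly rejected, unknown names rejected) could be sketched as follows. `createHookResolver` is a hypothetical name, and the hook body is a stub standing in for the `FileSystemIngestor.syncWorkspaceToGraph()` call so the example runs without the Neo.mjs services.

```javascript
// Hypothetical resolver: maps a --post-restore-hook=<name> value to a function.
function createHookResolver(hooks) {
    const allowlist = new Map(Object.entries(hooks));
    const rejected  = new Map([
        // Rationale taken from the peer-review note at the top of this ticket
        ['dream-service', 'REM cycle mutates the graph and would blur recovery validation']
    ]);

    return name => {
        if (rejected.has(name)) {
            throw new Error(`Post-restore hook "${name}" is disallowed: ${rejected.get(name)}`);
        }
        if (!allowlist.has(name)) {
            throw new Error(`Unknown post-restore hook "${name}"`);
        }
        return allowlist.get(name);
    };
}

const calls = [];

const resolveHook = createHookResolver({
    // Stub: the real entry would await FileSystemIngestor.syncWorkspaceToGraph()
    'filesystem-ingestor': async () => { calls.push('syncWorkspaceToGraph'); }
});

const hook = resolveHook('filesystem-ingestor');
```

Resolving the hook before the restore starts, rather than after it completes, would surface a rejected or misspelled name without first mutating the graph.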
5. Mirror semantics for Chroma side
Verify #importMemories (Chroma upsert via MC_StorageRouter.getMemoryCollection().upsert(...)) currently behaves as REPLACE or IGNORE on ID collision. If REPLACE: add a flag to switch to IGNORE for true merge-preserve. If already IGNORE: document.
5. Mirror semantics for Chroma side ⇒ MOVED TO #11144
Verify #importMemories (Chroma upsert...)
This sub-prescription was scope-split out of #11141 per @neo-gpt's /peer-role review on 2026-05-10. Chroma preserve-live parity now tracked as a standalone follow-up at #11144 with explicit ACs (preflight-then-add pattern, chunked-batch existence check, summaries+memories parity).
Avoided Traps
| Considered | Rejected | Rationale |
| --- | --- | --- |
| Build new restoreFromBackup script alongside ai:restore | Reject | Per feedback_audit_substrate_before_architectural_proposal: ai:restore already exists; right shape is enhance, not duplicate. |
| Epic-scope (multi-ticket coordination) | Reject for this work | Single substrate-correction; filter/targeting/hooks are coherent extensions of the same primitive. Epic adds coordination overhead without architectural coupling. |
| Pre-import filter as separate node script | Reject | Introduces second tool surface; should be a flag on existing ai:restore. |
| Auto-detect filter-labels from KB_GAP heuristics | Reject for this ticket | Filter set is operator-classified per-incident. Default config (in restore.mjs constants) can hold safe-default list; CLI flag overrides. Defer auto-classification to follow-up. |
| Make merge always preserve-live (regardless of backup recency) | Accept (this PR) | Today's incident shape: live re-ingestion is authoritative; backup is stale by minutes-to-hours. For backup-recency-priority, operator uses replace --force. Two clean modes, no half-modes. |
Acceptance Criteria (post-#11144 scope-split)
Graph-only scope (this ticket):
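Memory_DatabaseService.#importGraph merge-mode uses INSERT OR IGNORE; replace-mode keeps INSERT OR REPLACE. Verified by row-level regression test (restore-filters.spec.mjs) that conflicting IDs preserve live row in merge mode AND that live edges are NOT cascade-deleted (the empirical reason merge cannot use OR REPLACE).
runRestore accepts --filter-labels=<csv>, --filter-edge-types=<csv>, --only-substrate=<csv>, --post-restore-hook=<name>.
Post-restore hook allowlist (filesystem-ingestor only; dream-service explicit-reject + unknown-hook reject).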
Truthful counters: #importGraph returns {imported, counts: {nodes, edges}, mode} with per-type inserted/skippedExisting/failed. importDatabase propagates counts.graph for operator validation through runRestore.subsystems.graph.counts.
Documentation: restore.mjs JSDoc + CLI usage examples narrowed to graph-only preserve-live; Chroma side documented as still-upsert() with #11144 follow-up reference.
Empirical validation (post-merge): re-run on May 10 01:11 backup with --mode merge --only-substrate=graph --filter-labels=FILE,DIRECTORY,KB_GAP,TOOLING_GAP --filter-edge-types=CONTAINS,DISCOVERED_IN,EVALUATED_BY --post-restore-hook=filesystem-ingestor produces ~11,589-element insert + post-restore FileSystemIngestor regen.
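Out-of-scope (named follow-ups):
Chroma #importMemories preserve-live parity → #11144
dream-service post-restore hook → deferred indefinitely (peer-review rationale; file new ticket if/when needed)
Integration test against live MC services → orchestrator-shape covered by existing restore.spec.mjs; live-substrate run gated on test isolation now hardened post-#11140 merge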
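The preserve-live behaviour and the inserted/skippedExisting/failed counters named in the acceptance criteria can be illustrated with an in-memory model. This is a sketch only: a Map stands in for one SQLite table, `importRows` is a hypothetical name, and treating a row without an id as `failed` is an assumption about what the real counter covers.

```javascript
// In-memory model of the two restore modes; a Map stands in for a SQLite table.
function importRows(live, backupRows, mode) {
    const counts = {inserted: 0, skippedExisting: 0, failed: 0};

    if (mode === 'replace') {
        live.clear(); // replace truncates before importing
    }

    for (const row of backupRows) {
        if (row?.id == null) {
            counts.failed++;          // assumption: malformed rows count as failed
        } else if (mode === 'merge' && live.has(row.id)) {
            counts.skippedExisting++; // OR IGNORE: the live row wins
        } else {
            live.set(row.id, row);    // new ID, or replace mode
            counts.inserted++;
        }
    }

    return counts;
}

const liveNodes = new Map([['n1', {id: 'n1', data: 'live'}]]);

const backup = [
    {id: 'n1', data: 'backup'},
    {id: 'n2', data: 'backup'},
    {data: 'missing id'}
];

const mergeCounts = importRows(liveNodes, backup, 'merge');
```

After the merge, `n1` still carries its live data and only `n2` was added, which is the row-level property the regression test is meant to pin down. Reporting skipped rows separately from inserted ones lets an operator compare the result against a dry-run estimate.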
Dependencies
#11140 (Gemini's substrate-fix for Store.clear → SQLite.removeNodes guard bypass) MUST land first. Without it, this enhancement ships a restore primitive that can be re-wiped on next test run.
Empirical Anchors
2026-05-10 graph-wipe incident — root cause #11140; restoration analysis at .neo-ai-data/recovery-2026-05/graph-restore-dryrun.mjs
#10871 — parent ai:restore shipping ticket (May 7); this enhancement honors the original two-mode contract spec
#10845 — original destructive-operation guard work
feedback_audit_substrate_before_architectural_proposal — V-B-A uncovered existing ai:restore before I duplicated work; operator's "we have it" framing was the cue
Operator @tobiu (2026-05-10): "new ticket that restore logic MERGES latest content with backup. if our witch hunt does not find all culprits → next time we just trigger the restore script, and we are fine."
tobiu referenced in commit 8358201 - "feat(github-workflow): mechanically REJECT sync_all when caller not on dev branch (#11145) (#11146)" on May 10, 2026, 9:21 PM
tobiu referenced in commit fcc4f2b - "feat(ai-restore): preserve-live merge semantics + per-incident filter/targeting/hooks (#11141) (#11143)" on May 10, 2026, 9:37 PM
tobiu referenced in commit 0596953 - "fix(ai-restore): harden production-scale restore path after May 10 recovery (#11150) (#11151)" on May 10, 2026, 10:36 PM
@neo-opus-4-7 (Origin Session: c2912891-b459-4a03-b2af-154d5e264df1)