LearnNewsExamplesServices
Frontmatter
id11141
titleEnhance ai:restore merge mode: preserve-live INSERT OR IGNORE semantics + label-filter + per-substrate targeting + post-restore hooks
stateClosed
labels
enhancementaiarchitecturemodel-experience
assigneesneo-opus-4-7
createdAtMay 10, 2026, 8:18 PM
updatedAtMay 10, 2026, 9:46 PM
githubUrlhttps://github.com/neomjs/neo/issues/11141
authorneo-opus-4-7
commentsCount2
parentIssuenull
subIssues[]
subIssuesCompleted0
subIssuesTotal0
blockedBy[]
blocking[]
closedAtMay 10, 2026, 9:37 PM

Enhance ai:restore merge mode: preserve-live INSERT OR IGNORE semantics + label-filter + per-substrate targeting + post-restore hooks

Closedenhancementaiarchitecturemodel-experience
neo-opus-4-7
neo-opus-4-7 commented on May 10, 2026, 8:18 PM

Update 2026-05-10 (post @neo-gpt cycle-2 review on PR #11143): Scope narrowed to graph-only. Chroma #importMemories preserve-live parity has been split into named follow-up #11144. Post-restore hook narrowed to filesystem-ingestor-only allowlist (dream-service explicitly disallowed per peer-review — REM cycle does graph mutation/inference and would blur recovery validation). The Acceptance Criteria below have been updated to reflect the actual close-target shape; original prescription preserved as the design-intent record. PR #11143 implements the graph-only scope.

Premise

npm run ai:restore (buildScripts/ai/restore.mjs, shipped via #10871) advertises a two-mode contract: --mode merge (idempotent, "preserves operator additions") and --mode replace (destructive, gated). Operator's framing 2026-05-10 captured the intended semantics: "restore logic MERGES latest content with backup. if our witch hunt does not find all culprits → next time we just trigger the restore script, and we are fine."

V-B-A on Memory_DatabaseService.#importGraph (line 252) reveals the actual implementation does NOT honor the merge intent:

const insertNode = db.prepare('INSERT OR REPLACE INTO Nodes (id, user_id, data) VALUES (?, ?, ?)');
const insertEdge = db.prepare(`INSERT OR REPLACE INTO Edges (id, user_id, source, target, type, data) VALUES (?, ?, ?, ?, ?, ?)`);

The mode parameter only gates the truncate step (replace truncates first; merge skips truncate). Both modes use INSERT OR REPLACE — meaning when a backup row's ID matches a live row, the backup OVERWRITES the live version, regardless of which is more recent.

Empirical anchor (today's incident)

2026-05-10 graph-wipe incident: Store.clear → onNodesMutate → SQLite.removeNodes bypassed #10845's SQLite.clear() guard (root cause being addressed in #11140). Live graph went from 22,660 nodes → 3,545. Restoration analysis (/Users/Shared/github/neomjs/neo/.neo-ai-data/recovery-2026-05/graph-restore-dryrun.mjs, read-only) on the May 10 01:11 backup showed:

  • Filter-out: 22,756 elements (FILE 8299 + DIRECTORY 1141 + CONTAINS 9349 — regenerable via FileSystemIngestor; KB_GAP 395 + TOOLING_GAP 365 + DISCOVERED_IN/EVALUATED_BY 3104 — operator-classified garbage per feedback_audit_substrate_guides_before_architectural_claims)
  • Already-present in live: 3,580 (post-wipe re-ingestion via gh-workflow + retrospective daemon: ISSUE 1017 + RETROSPECTIVE 792 + PULL_REQUEST 711 etc.)
  • Would INSERT: 11,589 (CONCEPT 4216 + AGENT_MEMORY 4136 + CLASS 844 + MESSAGE 409 + valuable edges)

With current --mode merge, the 3,580 already-present rows would be OVERWRITTEN by their backup versions — losing the post-wipe re-ingestion that's actually MORE current than backup. Wrong-shape for incident-recovery.

Companion observation: the 22,756 filtered set is mostly inert noise (FileSystemIngestor regenerates filesystem mirror deterministically; KB_GAP/TOOLING_GAP have many hallucinated-per-file gaps). Restoring blindly inflates DB with stale + garbage substrate.

Prescription

Enhance runRestore + Memory_DatabaseService.#importGraph (and Chroma counterpart in #importMemories):

1. Core semantic correction (the primitive fix)

Change #importGraph merge-mode INSERT statements:

// BEFORE
const insertNode = db.prepare('INSERT OR REPLACE INTO Nodes ...');

// AFTER
const insertNode = mode === 'replace'
    ? db.prepare('INSERT OR REPLACE INTO Nodes ...')
    : db.prepare('INSERT OR IGNORE INTO Nodes ...');

replace mode keeps current behavior (truncate-then-OR-REPLACE; destructive). merge mode flips to OR IGNORE — preserves live rows when IDs collide.

Same change for Edges INSERT statement.

2. Pre-import label/type filter (CLI-driven)

Add runRestore options:

  • --filter-labels=<csv> — drop nodes whose data.label matches any entry. Drops orphan-endpoint edges.
  • --filter-edge-types=<csv> — drop edges whose data.type matches.

Implementation: pre-process the JSONL stream in #importGraph (or in runRestore before passing the file to the SDK) — read each line, JSON-parse, skip if filter matches. Streams stay constant-memory.

3. Per-substrate targeting

Add runRestore option:

  • --only-substrate=<csv> — e.g., graph,mc → skip kb/concepts/trajectories/mailbox.

Implementation: gate the existing if (await fs.pathExists(layout.<substrate>)) ... blocks on inclusion list.

4. Post-restore hooks

Add runRestore option:

  • --post-restore-hook=<name> — accept filesystem-ingestor (most common; regenerates FILE/DIRECTORY/CONTAINS deterministically) and dream-service (for full REM cycle).

Implementation: hook table; filesystem-ingestor calls FileSystemIngestor.syncWorkspaceToGraph() after restore completes.

5. Mirror semantics for Chroma side

Verify #importMemories (Chroma upsert via MC_StorageRouter.getMemoryCollection().upsert(...)) currently behaves as REPLACE or IGNORE on ID collision. If REPLACE: add a flag to switch to IGNORE for true merge-preserve. If already IGNORE: document.

5. Mirror semantics for Chroma side ⇒ MOVED TO #11144

Verify #importMemories (Chroma upsert...)

This sub-prescription was scope-split out of #11141 per @neo-gpt's /peer-role review on 2026-05-10. Chroma preserve-live parity now tracked as a standalone follow-up at #11144 with explicit ACs (preflight-then-add pattern, chunked-batch existence check, summaries+memories parity).

Avoided Traps

Considered Rejected Rationale
Build new restoreFromBackup script alongside ai:restore Reject Per feedback_audit_substrate_before_architectural_proposalai:restore already exists; right shape is enhance, not duplicate.
Epic-scope (multi-ticket coordination) Reject for this work Single substrate-correction; filter/targeting/hooks are coherent extensions of the same primitive. Epic adds coordination overhead without architectural coupling.
Pre-import filter as separate node script Reject Introduces second tool surface; should be a flag on existing ai:restore.
Auto-detect filter-labels from KB_GAP heuristics Reject for this ticket Filter set is operator-classified per-incident. Default config (in restore.mjs constants) can hold safe-default list; CLI flag overrides. Defer auto-classification to follow-up.
Make merge always preserve-live (regardless of backup recency) Accept (this PR) Today's incident shape: live re-ingestion is authoritative; backup is stale by minutes-to-hours. For backup-recency-priority, operator uses replace --force. Two clean modes, no half-modes.

Acceptance Criteria (post-#11144 scope-split)

Graph-only scope (this ticket):

  • Memory_DatabaseService.#importGraph merge-mode uses INSERT OR IGNORE; replace-mode keeps INSERT OR REPLACE. Verified by row-level regression test (restore-filters.spec.mjs) that conflicting IDs preserve live row in merge mode AND that live edges are NOT cascade-deleted (the empirical reason merge cannot use OR REPLACE).
  • runRestore accepts --filter-labels=<csv>, --filter-edge-types=<csv>, --only-substrate=<csv>, --post-restore-hook=<name>.
  • Unit tests: filter drops correct elements + orphan-edge guard + FK-safe live-union check + post-hook allowlist (filesystem-ingestor only; dream-service explicit-reject + unknown-hook reject).
  • Truthful counters: #importGraph returns {imported, counts: {nodes, edges}, mode} with per-type inserted/skippedExisting/failed. importDatabase propagates counts.graph for operator validation through runRestore.subsystems.graph.counts.
  • Documentation: restore.mjs JSDoc + CLI usage examples narrowed to graph-only preserve-live; Chroma side documented as still-upsert() with #11144 follow-up reference.
  • Empirical validation (post-merge): re-run on May 10 01:11 backup with --mode merge --only-substrate=graph --filter-labels=FILE,DIRECTORY,KB_GAP,TOOLING_GAP --filter-edge-types=CONTAINS,DISCOVERED_IN,EVALUATED_BY --post-restore-hook=filesystem-ingestor produces ~11,589-element insert + post-restore FileSystemIngestor regen.

Out-of-scope (named follow-ups):

  • Chroma #importMemories preserve-live parity → #11144
  • dream-service post-restore hook → deferred indefinitely (peer-review rationale; file new ticket if/when needed)
  • Integration test against live MC services → orchestrator-shape covered by existing restore.spec.mjs; live-substrate run gated on test isolation now hardened post-#11140 merge

Dependencies

  • #11140 (Gemini's substrate-fix for Store.clear → SQLite.removeNodes guard bypass) MUST land first. Without it, this enhancement ships a restore primitive that can be re-wiped on next test run.

Empirical Anchors

  • 2026-05-10 graph-wipe incident — root cause #11140; restoration analysis at .neo-ai-data/recovery-2026-05/graph-restore-dryrun.mjs
  • #10871 — parent ai:restore shipping ticket (May 7); this enhancement honors the original two-mode contract spec
  • #10845 — original destructive-operation guard work
  • feedback_audit_substrate_before_architectural_proposal — V-B-A uncovered existing ai:restore before I duplicated work; operator's "we have it" framing was the cue
  • Operator @tobiu (2026-05-10): "new ticket that restore logic MERGES latest content with backup. if our witch hunt does not find all culprits → next time we just trigger the restore script, and we are fine."

— @neo-opus-4-7 (Origin Session: c2912891-b459-4a03-b2af-154d5e264df1)

tobiu referenced in commit 8358201 - "feat(github-workflow): mechanically REJECT sync_all when caller not on dev branch (#11145) (#11146) on May 10, 2026, 9:21 PM
tobiu referenced in commit fcc4f2b - "feat(ai-restore): preserve-live merge semantics + per-incident filter/targeting/hooks (#11141) (#11143) on May 10, 2026, 9:37 PM
tobiu closed this issue on May 10, 2026, 9:37 PM
tobiu referenced in commit 0596953 - "fix(ai-restore): harden production-scale restore path after May 10 recovery (#11150) (#11151) on May 10, 2026, 10:36 PM