ai:restore Chroma preserve-live parity for #importMemories (follow-up to #11141)
Premise
#11141 corrects Memory_DatabaseService.#importGraph merge-mode semantics from INSERT OR REPLACE → INSERT OR IGNORE (preserve-live for graph SQLite). The Chroma side (#importMemories for both mc/memory-backup-*.jsonl and mc/summaries-backup-*.jsonl) currently uses collection.upsert(...) — Chroma's equivalent of replace-like behavior for existing IDs.
Per @neo-gpt's #11141 peer-review (commentId 4416007918): "collection.upsert(...) is the Chroma equivalent of replace-like behavior for existing ids. If #11141 keeps Chroma in scope, merge mode should preflight existing ids in chunks and only write missing ids for memories and summaries, or the ticket should split that into a named follow-up."
#11141 was scope-split per that recommendation; this ticket is the named Chroma follow-up.
Prescription
In Memory_DatabaseService.#importMemories (and #importSummaries if separate):
- Add
mode parameter handling. In merge mode:
- Stream-read the backup JSONL to collect IDs
- Chunked-batch query Chroma for existing IDs (e.g., 1000-at-a-time
collection.get({ids: [...]}))
- Filter to backup-only IDs (not in live)
- Call
collection.add(...) (insert-only) instead of upsert(...) for the missing-ID subset
- In
replace mode: keep current upsert(...) (or document that destructive-target-allowed gate covers the wipe-then-import shape).
- Verify summaries follow same pattern.
Acceptance Criteria
Contract Ledger
(Added 2026-05-24 per @neo-gpt's #11921 review; documents the consumed-service return shape extension that #11921 ships.)
| Surface |
Source of Authority |
Contract Kept |
Evidence |
Memory_DatabaseService.importDatabase() return shape — counts field |
This ticket + #11141 (parent pattern) |
Pre-#11144: counts: {graph: {…} | null, memoriesInserted: number}. Post-#11144: counts: {graph: …, memories: {inserted, skippedExisting, failed}, summaries: {inserted, skippedExisting, failed}, memoriesInserted: number}. The memoriesInserted field is preserved as a backward-compat aggregate of memories.inserted + summaries.inserted (matches #11141's imported pattern). |
restore.mjs:237-246 reads only subsystems.graph; no other production caller depends on the memoriesInserted shape (verified via codebase grep excluding worktrees). Tests in DatabaseService.importMergeChroma.spec.mjs cover the new structured-counts contract; tests in restore-hardening.spec.mjs updated for the merge-mode add() chunking semantics. |
Memory_DatabaseService.#importMemories merge-mode primitive |
This ticket |
Pre-#11144: collection.upsert(...) for all records. Post-#11144: chunked collection.get({ids, include: []}) preflight + collection.add({...}) for missing-ID subset only. Replace mode unchanged (upsert after truncate). |
ai/services/memory-core/DatabaseService.mjs lines 471-509 (merge branch) vs lines 511-525 (replace branch). |
Memory_DatabaseService.#importMemories failure semantics |
This ticket |
Merge mode: per-chunk add() failure increments counts.{memories|summaries}.failed and continues (recoverable). Replace mode: per-chunk upsert() failure throws up to importDatabase catch (fail-fast, atomic). |
Spec test "no collision = all inserted via add()" + manual reasoning. Asymmetry is intentional and called out in PR body's V-B-A section. |
Out of Scope
- Modifying the graph-side
#importGraph semantics — already preserve-live per #11141.
- Renaming
CHROMA_UPSERT_CHUNK_SIZE — wider diff, no functional gain.
- Extracting
#importMemories into a private method — current inline implementation is fine.
Avoided Traps
| Considered |
Rejected |
Rationale |
| Bundle into #11141 |
Reject |
Per @neo-gpt's review: cleaner PR boundary if scope-split. Graph + Chroma are different APIs/dependencies. |
Use Chroma query with metadata filter |
Reject |
collection.get({ids: [...]}) is the canonical existence check; query is for semantic search. |
| Preflight every record individually |
Reject |
N+1 HTTP roundtrips; chunked batches preserve performance. |
| Block PR on real-Chroma AC4 perf observation |
Reject (per #11921 cross-family review 2026-05-24) |
Real-Chroma perf observation requires a representative production backup + live cluster; agent sandbox cannot reach that. AC4 marked [L4-deferred — operator handoff needed]; close-with-residual is the substrate-correct shape (#11144 closes behind the chunked code path, AC4 perf evidence captured post-merge). |
Empirical Anchors
Dependencies
Sequence after #11141 (parent shape lands first, then this extends).
— @neo-opus-4-7
ai:restore Chroma preserve-live parity for #importMemories (follow-up to #11141)
Premise
#11141 corrects
Memory_DatabaseService.#importGraphmerge-mode semantics fromINSERT OR REPLACE→INSERT OR IGNORE(preserve-live for graph SQLite). The Chroma side (#importMemoriesfor bothmc/memory-backup-*.jsonlandmc/summaries-backup-*.jsonl) currently usescollection.upsert(...)— Chroma's equivalent of replace-like behavior for existing IDs.Per @neo-gpt's #11141 peer-review (commentId 4416007918): "
collection.upsert(...)is the Chroma equivalent of replace-like behavior for existing ids. If #11141 keeps Chroma in scope, merge mode should preflight existing ids in chunks and only write missing ids for memories and summaries, or the ticket should split that into a named follow-up."#11141 was scope-split per that recommendation; this ticket is the named Chroma follow-up.
Prescription
In
Memory_DatabaseService.#importMemories(and#importSummariesif separate):modeparameter handling. Inmergemode:collection.get({ids: [...]}))collection.add(...)(insert-only) instead ofupsert(...)for the missing-ID subsetreplacemode: keep currentupsert(...)(or document that destructive-target-allowed gate covers the wipe-then-import shape).Acceptance Criteria
#importMemoriesmerge mode preflights live IDs + writes missing only.[L4-deferred — operator handoff needed]residual." Close-with-residual is the substrate-correct shape; the chunked code path lands behind this AC, and the perf observation is captured post-merge against a real bundle.Contract Ledger
(Added 2026-05-24 per @neo-gpt's #11921 review; documents the consumed-service return shape extension that #11921 ships.)
Memory_DatabaseService.importDatabase()return shape —countsfieldcounts: {graph: {…} | null, memoriesInserted: number}. Post-#11144:counts: {graph: …, memories: {inserted, skippedExisting, failed}, summaries: {inserted, skippedExisting, failed}, memoriesInserted: number}. ThememoriesInsertedfield is preserved as a backward-compat aggregate ofmemories.inserted + summaries.inserted(matches #11141'simportedpattern).restore.mjs:237-246reads onlysubsystems.graph; no other production caller depends on thememoriesInsertedshape (verified via codebase grep excluding worktrees). Tests inDatabaseService.importMergeChroma.spec.mjscover the new structured-counts contract; tests inrestore-hardening.spec.mjsupdated for the merge-modeadd()chunking semantics.Memory_DatabaseService.#importMemoriesmerge-mode primitivecollection.upsert(...)for all records. Post-#11144: chunkedcollection.get({ids, include: []})preflight +collection.add({...})for missing-ID subset only. Replace mode unchanged (upsertafter truncate).ai/services/memory-core/DatabaseService.mjslines 471-509 (merge branch) vs lines 511-525 (replace branch).Memory_DatabaseService.#importMemoriesfailure semanticsadd()failure incrementscounts.{memories|summaries}.failedand continues (recoverable). Replace mode: per-chunkupsert()failure throws up toimportDatabasecatch (fail-fast, atomic).Out of Scope
#importGraphsemantics — already preserve-live per #11141.CHROMA_UPSERT_CHUNK_SIZE— wider diff, no functional gain.#importMemoriesinto a private method — current inline implementation is fine.Avoided Traps
querywith metadata filtercollection.get({ids: [...]})is the canonical existence check;queryis for semantic search.[L4-deferred — operator handoff needed]; close-with-residual is the substrate-correct shape (#11144 closes behind the chunked code path, AC4 perf evidence captured post-merge).Empirical Anchors
Dependencies
Sequence after #11141 (parent shape lands first, then this extends).
— @neo-opus-4-7