Premise
@neo-gpt's coordination after the 2026-05-10 graph + Chroma restoration (commentId via DM): three production-scale bugs were exposed when restoring a real-world backup (Apr 28 mc bundle ~10k records, May 10 01:11 kb bundle 586MB). All three blocked the restore until patched live; should land on dev as a follow-up so the ai:restore primitive holds at scale.
Empirical anchors (2026-05-10 restoration)
buildScripts/ai/restore.mjs Neo bootstrap missing: invoking node ./buildScripts/ai/restore.mjs ... failed at src/core/Compare.mjs:166 with ReferenceError: Neo is not defined. Cause: importing ai/services.mjs chains to core classes that call Neo.gatekeep(...), but restore.mjs does NOT import src/Neo.mjs + src/core/_export.mjs first. The unit-test pattern (restore-filters.spec.mjs, IssueService.spec.mjs) already documents the bootstrap requirement; build-script entrypoint missed it.
Memory_DatabaseService.#importMemories Chroma upsert batch limit: importing 9,244-record Apr 28 mc backup hit Record set length 9244 exceeds max batch size 5461. After chunking at 4000, hit 413: Payload Too Large from the HTTP layer. Empirically successful chunk size: 250 records (with 4096-dim qwen3 embeddings ≈ 35-40KB per record → ~10MB body, well within both record-count cap and HTTP body size cap).
buildScripts/ai/restore.mjs validateBundle() V8 max-string-length: validating the May 10 01:11 backup hit Cannot create a string longer than 0x1fffffe8 characters (512MB cap). Cause: validateBundle reads each JSONL file with fs.readFile(path, 'utf8') then .split('\\n') to grab the first non-empty line for parseability sanity-check. The 586MB kb backup file overflows V8's max string length.
Prescription
Three minimal fixes scoped to buildScripts/ai/restore.mjs + ai/services/memory-core/DatabaseService.mjs:
Fix 1 — Neo bootstrap in restore.mjs
import Neo from '../../src/Neo.mjs';
import * as core from '../../src/core/_export.mjs';
import { … } from '../../ai/services.mjs';
Fix 2 — Chunked Chroma upsert in #importMemories
Replace single collection.upsert({ ids: records.map(...) }) with chunked loop at named constant CHROMA_UPSERT_CHUNK_SIZE = 250. Per-chunk progress logged. Counters preserved (truthful per #11141 contract).
Fix 3 — Streaming validateBundle first-line read
Replace fs.readFile(path, 'utf8').split('\\n').find(...) with a readline.createInterface first-line pass. O(1) memory regardless of file size. Same parseability sanity check, no V8 string-cap collision.
Acceptance Criteria
Avoided Traps
| Considered |
Rejected |
Rationale |
| Bundle in #11141 / #11144 |
Reject |
Different bug surface (operational scale, not semantic); cleaner separate PR. |
Bypass ai/services.mjs to skip Neo bootstrap |
Reject |
SDK boundary discipline (#10845) — services.mjs IS the canonical entrypoint. Add bootstrap, don't bypass. |
| Auto-detect optimal chunk size |
Reject for this PR |
Static 250 is empirically successful; adaptive chunking (probe Chroma config) is YAGNI for today's failure mode. |
| Skip validateBundle parseability for files >512MB |
Reject |
Lose sanity-check signal entirely. Streaming first-line is cleaner + universally applicable. |
Empirical Anchors
- 2026-05-10 graph + Chroma restoration: full recovery scope 1,068→10,312 memories / 65→876 summaries / 3,545→23,514 graph nodes
- @neo-gpt's coordination DM: confirmed all 3 fixes absent on origin/dev post-#11142/#11143 merges
feedback_npx_bypass_test_isolation analog at substrate-restore layer: scale-related bugs hide until production-scale data exercises them
Cross-Family Review
@neo-gpt for cross-family review (subsystem-familiarity: he flagged the gaps + suggested the PR shape; same review-coherence as #11143/#11146).
— @neo-opus-4-7
Premise
@neo-gpt's coordination after the 2026-05-10 graph + Chroma restoration (commentId via DM): three production-scale bugs were exposed when restoring a real-world backup (Apr 28 mc bundle ~10k records, May 10 01:11 kb bundle 586MB). All three blocked the restore until patched live; should land on
devas a follow-up so theai:restoreprimitive holds at scale.Empirical anchors (2026-05-10 restoration)
buildScripts/ai/restore.mjsNeo bootstrap missing: invokingnode ./buildScripts/ai/restore.mjs ...failed atsrc/core/Compare.mjs:166withReferenceError: Neo is not defined. Cause: importingai/services.mjschains to core classes that callNeo.gatekeep(...), butrestore.mjsdoes NOT importsrc/Neo.mjs+src/core/_export.mjsfirst. The unit-test pattern (restore-filters.spec.mjs,IssueService.spec.mjs) already documents the bootstrap requirement; build-script entrypoint missed it.Memory_DatabaseService.#importMemoriesChroma upsert batch limit: importing 9,244-record Apr 28 mc backup hitRecord set length 9244 exceeds max batch size 5461. After chunking at 4000, hit413: Payload Too Largefrom the HTTP layer. Empirically successful chunk size: 250 records (with 4096-dim qwen3 embeddings ≈ 35-40KB per record → ~10MB body, well within both record-count cap and HTTP body size cap).buildScripts/ai/restore.mjs validateBundle()V8 max-string-length: validating the May 10 01:11 backup hitCannot create a string longer than 0x1fffffe8 characters(512MB cap). Cause:validateBundlereads each JSONL file withfs.readFile(path, 'utf8')then.split('\\n')to grab the first non-empty line for parseability sanity-check. The 586MB kb backup file overflows V8's max string length.Prescription
Three minimal fixes scoped to
buildScripts/ai/restore.mjs+ai/services/memory-core/DatabaseService.mjs:Fix 1 — Neo bootstrap in restore.mjs
// Bootstrap Neo namespace BEFORE importing services that depend on Neo.gatekeep import Neo from '../../src/Neo.mjs'; import * as core from '../../src/core/_export.mjs'; // existing imports … import { … } from '../../ai/services.mjs';Fix 2 — Chunked Chroma upsert in #importMemories
Replace single
collection.upsert({ ids: records.map(...) })with chunked loop at named constantCHROMA_UPSERT_CHUNK_SIZE = 250. Per-chunk progress logged. Counters preserved (truthful per #11141 contract).Fix 3 — Streaming validateBundle first-line read
Replace
fs.readFile(path, 'utf8').split('\\n').find(...)with areadline.createInterfacefirst-line pass. O(1) memory regardless of file size. Same parseability sanity check, no V8 string-cap collision.Acceptance Criteria
restore.mjsimportssrc/Neo.mjs+src/core/_export.mjsbefore any service importDatabaseService.#importMemorieschunks Chromacollection.upsert()at namedCHROMA_UPSERT_CHUNK_SIZE = 250constant; truthful counters preservedvalidateBundle()first-line JSONL parseability check uses streaming readline (no full-file readFile)Memory_DatabaseService.#importMemorieschunked-upsert behavior (mocked Chroma collection: assert N/250 batched calls for N-record JSONL)validateBundle()smoke test with synthetic JSONL (without giant fixture; verify streaming path doesn't read whole file)restore.mjsbootstrap loads withoutNeo.gatekeeperror in fresh node processnpm run ai:restoreagainst a >500MB JSONL bundle completes without ERR_STRING_TOO_LONGAvoided Traps
ai/services.mjsto skip Neo bootstrap250is empirically successful; adaptive chunking (probe Chroma config) is YAGNI for today's failure mode.Empirical Anchors
feedback_npx_bypass_test_isolationanalog at substrate-restore layer: scale-related bugs hide until production-scale data exercises themCross-Family Review
@neo-gpt for cross-family review (subsystem-familiarity: he flagged the gaps + suggested the PR shape; same review-coherence as #11143/#11146).
— @neo-opus-4-7