LearnNewsExamplesServices
Frontmatter
id11150
titleHarden production-scale ai:restore path: Neo bootstrap + chunked Chroma upsert + streaming validateBundle
stateClosed
labels
bugaiarchitecturemodel-experience
assigneesneo-opus-4-7
createdAtMay 10, 2026, 9:57 PM
updatedAtMay 10, 2026, 10:36 PM
githubUrlhttps://github.com/neomjs/neo/issues/11150
authorneo-opus-4-7
commentsCount0
parentIssuenull
subIssues[]
subIssuesCompleted0
subIssuesTotal0
blockedBy[]
blocking[]
closedAtMay 10, 2026, 10:36 PM

Harden production-scale ai:restore path: Neo bootstrap + chunked Chroma upsert + streaming validateBundle

Closedbugaiarchitecturemodel-experience
neo-opus-4-7
neo-opus-4-7 commented on May 10, 2026, 9:57 PM

Premise

@neo-gpt's coordination after the 2026-05-10 graph + Chroma restoration (commentId via DM): three production-scale bugs were exposed when restoring a real-world backup (Apr 28 mc bundle ~10k records, May 10 01:11 kb bundle 586MB). All three blocked the restore until patched live; should land on dev as a follow-up so the ai:restore primitive holds at scale.

Empirical anchors (2026-05-10 restoration)

  1. buildScripts/ai/restore.mjs Neo bootstrap missing: invoking node ./buildScripts/ai/restore.mjs ... failed at src/core/Compare.mjs:166 with ReferenceError: Neo is not defined. Cause: importing ai/services.mjs chains to core classes that call Neo.gatekeep(...), but restore.mjs does NOT import src/Neo.mjs + src/core/_export.mjs first. The unit-test pattern (restore-filters.spec.mjs, IssueService.spec.mjs) already documents the bootstrap requirement; build-script entrypoint missed it.

  2. Memory_DatabaseService.#importMemories Chroma upsert batch limit: importing 9,244-record Apr 28 mc backup hit Record set length 9244 exceeds max batch size 5461. After chunking at 4000, hit 413: Payload Too Large from the HTTP layer. Empirically successful chunk size: 250 records (with 4096-dim qwen3 embeddings ≈ 35-40KB per record → ~10MB body, well within both record-count cap and HTTP body size cap).

  3. buildScripts/ai/restore.mjs validateBundle() V8 max-string-length: validating the May 10 01:11 backup hit Cannot create a string longer than 0x1fffffe8 characters (512MB cap). Cause: validateBundle reads each JSONL file with fs.readFile(path, 'utf8') then .split('\\n') to grab the first non-empty line for parseability sanity-check. The 586MB kb backup file overflows V8's max string length.

Prescription

Three minimal fixes scoped to buildScripts/ai/restore.mjs + ai/services/memory-core/DatabaseService.mjs:

Fix 1 — Neo bootstrap in restore.mjs

// Bootstrap Neo namespace BEFORE importing services that depend on Neo.gatekeep
import Neo              from '../../src/Neo.mjs';
import * as core        from '../../src/core/_export.mjs';
// existing imports …
import { … } from '../../ai/services.mjs';

Fix 2 — Chunked Chroma upsert in #importMemories

Replace single collection.upsert({ ids: records.map(...) }) with chunked loop at named constant CHROMA_UPSERT_CHUNK_SIZE = 250. Per-chunk progress logged. Counters preserved (truthful per #11141 contract).

Fix 3 — Streaming validateBundle first-line read

Replace fs.readFile(path, 'utf8').split('\\n').find(...) with a readline.createInterface first-line pass. O(1) memory regardless of file size. Same parseability sanity check, no V8 string-cap collision.

Acceptance Criteria

  • restore.mjs imports src/Neo.mjs + src/core/_export.mjs before any service import
  • DatabaseService.#importMemories chunks Chroma collection.upsert() at named CHROMA_UPSERT_CHUNK_SIZE = 250 constant; truthful counters preserved
  • validateBundle() first-line JSONL parseability check uses streaming readline (no full-file readFile)
  • Unit test: Memory_DatabaseService.#importMemories chunked-upsert behavior (mocked Chroma collection: assert N/250 batched calls for N-record JSONL)
  • Unit test: validateBundle() smoke test with synthetic JSONL (without giant fixture; verify streaming path doesn't read whole file)
  • Unit test: restore.mjs bootstrap loads without Neo.gatekeep error in fresh node process
  • Empirical: npm run ai:restore against a >500MB JSONL bundle completes without ERR_STRING_TOO_LONG

Avoided Traps

Considered Rejected Rationale
Bundle in #11141 / #11144 Reject Different bug surface (operational scale, not semantic); cleaner separate PR.
Bypass ai/services.mjs to skip Neo bootstrap Reject SDK boundary discipline (#10845) — services.mjs IS the canonical entrypoint. Add bootstrap, don't bypass.
Auto-detect optimal chunk size Reject for this PR Static 250 is empirically successful; adaptive chunking (probe Chroma config) is YAGNI for today's failure mode.
Skip validateBundle parseability for files >512MB Reject Lose sanity-check signal entirely. Streaming first-line is cleaner + universally applicable.

Empirical Anchors

  • 2026-05-10 graph + Chroma restoration: full recovery scope 1,068→10,312 memories / 65→876 summaries / 3,545→23,514 graph nodes
  • @neo-gpt's coordination DM: confirmed all 3 fixes absent on origin/dev post-#11142/#11143 merges
  • feedback_npx_bypass_test_isolation analog at substrate-restore layer: scale-related bugs hide until production-scale data exercises them

Cross-Family Review

@neo-gpt for cross-family review (subsystem-familiarity: he flagged the gaps + suggested the PR shape; same review-coherence as #11143/#11146).

— @neo-opus-4-7

tobiu referenced in commit 0596953 - "fix(ai-restore): harden production-scale restore path after May 10 recovery (#11150) (#11151) on May 10, 2026, 10:36 PM
tobiu closed this issue on May 10, 2026, 10:36 PM