LearnNewsExamplesServices
Frontmatter
id10844
titleDaily automated snapshot pipeline for Memory Core data substrate
stateClosed
labels
enhancementaiarchitecturemodel-experience
assigneesneo-gemini-3-1-pro
createdAtMay 6, 2026, 11:02 PM
updatedAtMay 7, 2026, 9:24 AM
githubUrlhttps://github.com/neomjs/neo/issues/10844
authorneo-gpt
commentsCount1
parentIssuenull
subIssues[]
subIssuesCompleted0
subIssuesTotal0
blockedBy[]
blocking[]
closedAtMay 7, 2026, 8:44 AM

Daily automated snapshot pipeline for Memory Core data substrate

Closedenhancementaiarchitecturemodel-experience
neo-gpt
neo-gpt commented on May 6, 2026, 11:02 PM

Context

Neo already has a canonical local backup primitive: npm run ai:backup invokes buildScripts/ai/backup.mjs, which writes an atomic bundle under .neo-ai-data/backups/backup-<ISO-timestamp>/. The existing script captures the Knowledge Base, Memory Core memories and summaries, the Memory Core graph, Concept Ontology JSONL, and RLAIF trajectories in one timestamped tree.

That covers the manual snapshot substrate. The remaining operational gap is automation: operators and agents can run the backup script, but the system does not yet guarantee a recent scheduled snapshot, expose the latest successful backup in healthcheck output, or document a restoration runbook for routine operations.

This ticket is preventive operational substrate / MX hardening. It turns the existing backup primitive into an observable daily safety loop without changing the backup bundle format.

The Problem

Manual backup discipline does not scale cleanly across a multi-agent runtime. A backup script can be correct and discoverable while still depending on human recall at exactly the moments when agents are focused on graph-heavy work, scheduled daemons, migrations, or deployment changes.

The current substrate also leaves three follow-up questions for operators:

  • Is there a recent successful backup?
  • How long are daily backups retained?
  • Which documented steps restore from the latest known-good bundle?

Without a queryable backup.lastSuccessful signal, agents cannot cheaply validate backup freshness before starting work that depends on Memory Core or Knowledge Base persistence.

The Architectural Reality

  • package.json:18 exposes npm run ai:backup.
  • buildScripts/ai/backup.mjs:16-24 documents the atomic bundle shape for kb/, mc/, graph/, concepts/, and trajectories.
  • buildScripts/ai/backup.mjs:35-45 and buildScripts/ai/backup.mjs:136-139 explicitly defer retention implementation.
  • ai/mcp/server/memory-core/config.mjs:232 defines .neo-ai-data/backups as the Memory Core backup/export directory.
  • ai/mcp/server/memory-core/services/HealthService.mjs:671-703 builds the Memory Core healthcheck payload, but there is currently no backup freshness block.
  • ai/mcp/server/knowledge-base/services/HealthService.mjs:181-198 builds the Knowledge Base healthcheck payload, likewise without backup freshness observability.
  • #10129 established the atomic backup bundle as the canonical manual snapshot substrate.
  • #10780 documents backup-first operational discipline before DreamMode/Sandman work; this ticket is broader scheduled automation and observability, not another manual-discipline ticket.

The Fix

Implement a daily automated snapshot pipeline that invokes the existing canonical backup script rather than duplicating backup logic.

The implementation may use either a scheduled GitHub Actions workflow or a daemon-side scheduled task, but it must preserve the existing substrate boundary:

  • buildScripts/ai/backup.mjs remains the canonical snapshot orchestrator.
  • Automation is responsible for schedule, retention, status recording, and operator-visible failure reporting.
  • Healthcheck exposes a structured backup block, at minimum backup.lastSuccessful, so agents can verify freshness without scraping filesystem state.
  • A restoration runbook documents how to identify and restore the latest known-good bundle.

Contract Ledger Matrix

Target Surface Source of Authority Proposed Behavior Fallback / Edge Case Docs Evidence
Daily backup scheduler #10129, this ticket Runs the canonical npm run ai:backup path once per day and records status metadata for the attempt. If the backup fails, no data stores are mutated; failure is recorded and surfaced to operators. Scheduled workflow or daemon docs explain where the schedule lives. Dry-run or integration evidence showing the scheduler invokes the canonical script.
.neo-ai-data/backups/backup-<ISO-timestamp>/ retention buildScripts/ai/backup.mjs retention TODO Applies a rolling retention policy, defaulting to 30 daily snapshots plus longer-lived weekly snapshots unless implementation justifies a different operator-friendly default. Freshest successful backups are never deleted by a failed sweep; malformed folders are skipped and reported. Inline retention docs plus operator runbook. Unit coverage for retention selection and deletion boundaries.
healthcheck.backup.lastSuccessful Memory Core / Knowledge Base healthcheck surfaces Healthcheck returns the timestamp of the latest successful automated backup and enough status to distinguish fresh, stale, failed, and never-run states. If metadata is missing, report never_run or equivalent without failing unrelated health dimensions. Healthcheck schema docs / JSDoc updated. Targeted healthcheck tests for fresh, stale, failed, and missing metadata cases.
Restoration runbook #10129 bundle layout Documents how to locate the latest known-good bundle and restore MC + KB + graph + concepts + trajectories consistently. If a subsystem is absent in a fresh environment, the runbook distinguishes optional-missing from failed-restore. learn/agentos/ or the nearest operator-facing deployment doc. Manual evidence or script-level smoke test using a small fixture backup.

Acceptance Criteria

  • A daily scheduled backup mechanism exists and invokes the existing npm run ai:backup / buildScripts/ai/backup.mjs path.
  • The automation writes status metadata including at least lastAttempted, lastSuccessful, and failure detail for the latest failed attempt.
  • Retention is implemented for .neo-ai-data/backups/backup-* bundles with a documented default policy.
  • Healthcheck exposes a queryable backup.lastSuccessful field and a machine-readable backup status.
  • The healthcheck backup block is tested for fresh, stale, failed, and never-run states.
  • A restoration runbook documents how to restore from a selected atomic bundle.
  • The implementation does not duplicate backup export logic outside buildScripts/ai/backup.mjs.

Out of Scope

  • Changing the atomic backup bundle format established by #10129.
  • Cloud upload, off-machine replication, or encrypted remote storage.
  • Implementing destructive restore execution as an MCP tool. This ticket requires a runbook; restore tooling can be a follow-up.
  • Defrag pre-nuke physical-copy snapshots. defragChromaDB.mjs remains a peer script with a separate purpose.
  • Reactive postmortem framing. This is preventive substrate hardening.

Avoided Traps

  • Reimplementing backup in the scheduler: the scheduler should orchestrate backup.mjs, not create a second backup code path.
  • Retention without observability: deleting old bundles is only safe if operators can also see which backup last succeeded.
  • Healthcheck as a hard dependency: backup freshness should be visible and actionable; it should not make unrelated read-only health dimensions fail by default.

Related

  • #10129 — atomic timestamped backup bundle across persistent subsystems
  • #10780 — backup-first operational discipline before DreamMode/Sandman invocation
  • #10822 — config substrate cleanup epic context

Origin Session ID: ab6b2fad-6660-4148-a0c7-474c0131a6bf

Retrieval Hint: query_raw_memories(query="daily automated backup Memory Core snapshot lastSuccessful retention restoration runbook")

tobiu referenced in commit f734912 - "feat(ai): institutionalize daily snapshot pipeline retention and observability (#10844) (#10872) on May 7, 2026, 8:35 AM
tobiu referenced in commit 1dfcec8 - "feat(ai/backup): extend bundle-meta + integrity check + mailbox (#10871) (#10876) on May 7, 2026, 9:21 AM
tobiu referenced in commit 36db57e - "feat(ai/restore): bundle-aware restore orchestrator + npm run ai:restore (#10871) (#10886) on May 7, 2026, 1:12 PM