Frontmatter
id: 11062
title: Extract BackupCoordinatorService as M4 per-task coordinator (orchestrator daily backup)
state: Closed
labels: enhancement, ai, architecture, model-experience
assignees: neo-gemini-3-1-pro
createdAt: May 9, 2026, 11:44 PM
updatedAt: May 12, 2026, 4:08 AM
githubUrl: https://github.com/neomjs/neo/issues/11062
author: neo-opus-4-7
commentsCount: 1
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 10, 2026, 1:03 AM

Extract BackupCoordinatorService as M4 per-task coordinator (orchestrator daily backup)

Closed · enhancement · ai · architecture · model-experience
neo-opus-4-7 commented on May 9, 2026, 11:44 PM

Context

Part of the M4 per-task coordinator decomposition (learn/agentos/v13-path.md:188-193). Filed as a sibling follow-up to the closure of PR #11021: Gemini's daily-backup-orchestrator-integration was superseded on 2026-05-09 by the M4 per-task coordinator architectural correction, and the follow-up coordinator ticket was named in #11021's closing comment but never filed.

Empirical anchor (v13-path.md:90):

"Daily backup with rotation cap (PRIO 0 — non-negotiable) — orchestrator-owned scheduled task; 30-day rotation cap (one month coverage); backup-success precondition for any DreamMode/Sandman task spawn (#10780 discipline + post-#11018-retraction architectural correction; BackupService extraction lands as M4 per-task coordinator)"

Empirical anchor (v13-path.md:193):

"DreamCoordinatorService / SandmanCoordinatorService / BackupService / GoldenPathCoordinatorService / GraphMaintenanceCoordinatorService — each owning 'what work is due' semantics; supervisor executes; orchestrator wires per D3.1 boundary"

Duplicate sweep before filing:

  • gh issue list --search "BackupService extraction in:title,body" — no open ticket
  • gh issue list --search "backup orchestrator schedule task daily" — only #11018 (closed; original wrong-shape) + PR #11021 (closed; superseded) + #10844 (closed; predecessor flat-shape)
  • gh issue list --label ai --search "BackupCoordinatorService" — no matches

The Problem

The Memory Core has an atomic-bundle backup primitive (buildScripts/ai/backup.mjs via npm run ai:backup) but no automated scheduling. Currently:

  1. Manual operator discipline only: backups happen when an operator remembers to run them. Gap-period regressions (e.g. DreamService corruption mid-cycle) lose recoverability.
  2. No backup-success precondition gate: DreamMode / Sandman / GoldenPath cycles can run without a recent backup, risking unrecoverable graph state.
  3. #11018 + PR #11021 implemented the wrong shape: task-shell-out logic embedded directly in Orchestrator.mjs, violating the D3.1 single-responsibility boundary established by #11041 (TaskStateService) + #11044 (ProcessSupervisorService) + #11051 (CadenceEngine). Closed-not-planned per @tobiu's architectural correction (PR #11021 retraction + my comment 4413... at https://github.com/neomjs/neo/pull/11021).

The M3.5 Orchestrator decomposition triplet (TaskStateService + ProcessSupervisorService + CadenceEngine, all merged 2026-05-09) created the per-task coordinator slot. SummarizationCoordinatorService is the canonical sibling precedent. M4 lands the remaining 5 per-task coordinators (per v13-path.md:193); BackupCoordinatorService is the highest-priority of those (PRIO 0 per v13-path.md:90).

The Architectural Reality

Sibling precedent: ai/daemons/services/SummarizationCoordinatorService.mjs (#11009). It exposes a pure-functional getDueTask({db, state, now, summarySweepIntervalMs}) method that returns a {taskName, source, reason, onSuccess?} trigger object. The Orchestrator calls it; ProcessSupervisorService executes the spawned child; TaskStateService persists state.

Existing backup mechanics:

  • buildScripts/ai/backup.mjs — atomic-bundle orchestrator (idempotent; produces .neo-ai-data/backups/backup-<ISO-timestamp>/)
  • The script is spawn-child shape with full Neo bootstrap (Neo + core/_export + InstanceManager per #11049 entry-point invariant — already correct)
  • TODO comment in backup.mjs notes retention policy is unimplemented (sibling-file lift from defragChromaDB.cleanOldBackups semantics)

Required wiring:

  • ai/daemons/Orchestrator.mjs#buildTaskDefinitions adds backup task definition (script path + PID file name + expectedCommand)
  • ai/daemons/Orchestrator.mjs config adds backupCoordinator_: BackupCoordinatorService + backupIntervalMs_ (default 24h)
  • ai/daemons/Orchestrator.mjs#runMaintenanceCycle adds runBackupCycle lane (matching runSummaryCycle + runKbSyncCycle shape)

The Fix

Step 1: Create ai/daemons/services/BackupCoordinatorService.mjs

Lift SummarizationCoordinatorService.mjs shape:

// Class-only file (entry-point invariant per #11049)
import Base from '../../../src/core/Base.mjs';

/**
 * @summary Builds the task trigger for the daily backup lane.
 *
 * Pure-functional projection. The backup lane has one wake-up source:
 * the periodic interval sweep (default 24h). Future #M4-spec extension
 * may add backup-failure-retry source or pre-DreamMode-precondition source.
 *
 * @param {Object} options
 * @param {Number} options.now Current timestamp in milliseconds.
 * @param {Number} options.lastRunAt Last backup task start timestamp.
 * @param {Number} options.intervalMs Periodic backup interval; `0` disables.
 * @returns {Object|null} A backup task trigger or null when no work is due.
 */
export function buildBackupTrigger({now, lastRunAt, intervalMs}) {
    if (intervalMs > 0 && now - lastRunAt >= intervalMs) {
        return {
            taskName: 'backup',
            source  : 'periodic-sweep',
            reason  : `periodic-sweep:${intervalMs}`
        };
    }
    return null;
}

class BackupCoordinatorService extends Base {
    static config = {
        className: 'Neo.ai.daemons.services.BackupCoordinatorService',
        singleton: true
    }

    /**
     * @param {Object} options
     * @param {Object} options.state Current orchestrator task state.
     * @param {Number} options.now Current timestamp in milliseconds.
     * @param {Number} options.backupIntervalMs Periodic backup interval (default 24h).
     * @returns {Object|null} Task trigger or null.
     */
    getDueTask({state, now, backupIntervalMs}) {
        return buildBackupTrigger({
            now,
            intervalMs: backupIntervalMs,
            lastRunAt : state.backup?.lastRunAt || 0
        });
    }
}

export default Neo.setupClass(BackupCoordinatorService);
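
To make the interval semantics concrete, the sketch below repeats buildBackupTrigger from Step 1 (without the export, so it runs standalone) and calls it at the boundaries the spec is meant to cover. The timestamps and the DAY_MS constant are illustrative assumptions, not values from the codebase.

```javascript
// Copy of the pure function from Step 1, repeated so the snippet runs standalone.
function buildBackupTrigger({now, lastRunAt, intervalMs}) {
    if (intervalMs > 0 && now - lastRunAt >= intervalMs) {
        return {
            taskName: 'backup',
            source  : 'periodic-sweep',
            reason  : `periodic-sweep:${intervalMs}`
        };
    }
    return null;
}

const DAY_MS = 24 * 60 * 60 * 1000;        // illustrative constant (24h)
const now    = Date.UTC(2026, 4, 10, 0, 0); // hypothetical "current" time

// Exactly one interval elapsed: due, because the comparison is >=.
const atBoundary = buildBackupTrigger({now, lastRunAt: now - DAY_MS, intervalMs: DAY_MS});

// One millisecond short of the interval: not due.
const justBefore = buildBackupTrigger({now, lastRunAt: now - DAY_MS + 1, intervalMs: DAY_MS});

// No recorded run (getDueTask falls back to lastRunAt = 0): due on the first cycle.
const firstRun = buildBackupTrigger({now, lastRunAt: 0, intervalMs: DAY_MS});

// An interval of 0 disables the lane regardless of elapsed time.
const disabled = buildBackupTrigger({now, lastRunAt: 0, intervalMs: 0});

console.log({atBoundary, justBefore, firstRun, disabled});
```

One consequence worth noting: because a missing lastRunAt is treated as 0, a fresh orchestrator state reports the backup as due on its first maintenance cycle.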

Step 2: Wire into Orchestrator.mjs

+ import BackupCoordinatorService from './services/BackupCoordinatorService.mjs';

  export const DEFAULT_KB_SYNC_INTERVAL_MS  = 1800000;
+ export const DEFAULT_BACKUP_INTERVAL_MS   = 86400000; // 24h

  export function buildTaskDefinitions({scriptDir = DEFAULT_SCRIPT_DIR, nodeBin = process.argv[0]} = {}) {
      return {
          summary: { ... },
          kbSync : { ... },
+         backup : {
+             label          : 'memory core backup',
+             command        : nodeBin,
+             args           : [path.resolve(scriptDir, '../../buildScripts/ai/backup.mjs')],
+             pidFileName    : 'backup.pid',
+             expectedCommand: 'backup.mjs'
+         }
      };
  }

  // In Orchestrator.config:
+ backupCoordinator_: BackupCoordinatorService,
+ backupIntervalMs_ : DEFAULT_BACKUP_INTERVAL_MS,

  // In Orchestrator.configure:
+ this.backupIntervalMs = options.backupIntervalMs ?? this.cadenceEngine.parseInterval(
+     process.env.NEO_ORCHESTRATOR_BACKUP_INTERVAL_MS,
+     DEFAULT_BACKUP_INTERVAL_MS
+ );
+ this.backupCoordinator = options.backupCoordinator || BackupCoordinatorService;

  // New cycle method:
+ runBackupCycle(now) {
+     const trigger = this.backupCoordinator.getDueTask({
+         state           : this.taskStateService.getState(),
+         now,
+         backupIntervalMs: this.backupIntervalMs
+     });
+     if (trigger) {
+         this.processSupervisorService.runTask('backup', trigger.reason);
+     }
+ }

  // In runMaintenanceCycle:
  this.runTaskCycle('summary', () => this.runSummaryCycle(now));
  this.runTaskCycle('kbSync',  () => this.runKbSyncCycle(now));
+ this.runTaskCycle('backup',  () => this.runBackupCycle(now));
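
The failure isolation that the lane wiring relies on can be sketched with stub collaborators. This is a minimal illustration under stated assumptions: the runTaskCycle below is a hypothetical try/catch wrapper (the real Orchestrator method may log or record failures differently), and the coordinator and supervisor objects are test doubles, not the real services.

```javascript
// Hypothetical lane wrapper: contains an exception so later lanes still run.
// The real Orchestrator#runTaskCycle may behave differently.
function runTaskCycle(taskName, cycleFn, failures) {
    try {
        cycleFn();
    } catch (error) {
        failures.push({taskName, message: error.message});
    }
}

// Stub supervisor standing in for ProcessSupervisorService.
const spawned    = [];
const supervisor = {
    runTask: (taskName, reason) => spawned.push({taskName, reason})
};

// Stub coordinator that always reports the backup as due.
const dueCoordinator = {
    getDueTask: () => ({
        taskName: 'backup',
        source  : 'periodic-sweep',
        reason  : 'periodic-sweep:86400000'
    })
};

// Same shape as runBackupCycle in the diff, with collaborators passed in.
function runBackupCycle({coordinator, state, now, backupIntervalMs}) {
    const trigger = coordinator.getDueTask({state, now, backupIntervalMs});
    if (trigger) {
        supervisor.runTask('backup', trigger.reason);
    }
}

const failures = [];
const now      = Date.now();

// A throwing summary lane must not prevent the backup lane from running.
runTaskCycle('summary', () => { throw new Error('summary lane failed'); }, failures);
runTaskCycle('backup', () => runBackupCycle({
    coordinator     : dueCoordinator,
    state           : {},
    now,
    backupIntervalMs: 86400000
}), failures);

console.log({failures, spawned});
```

Passing the coordinator in as a parameter mirrors the DI intent of backupCoordinator in configure: a spec can substitute a stub without touching the supervisor.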

Step 3: Retention sweep — STAYS IN buildScripts/ai/backup.mjs

Per Gemini's #11021 approach, retention sweep semantics live in backup.mjs itself (operator-runnable shape). Apply 30-day window with 3-bundle minimum keep (mirrors defragChromaDB.cleanOldBackups). This is intentionally NOT in BackupCoordinatorService because retention is the backup-script's concern — coordinator only owns "is backup due" semantics, supervisor owns spawn execution.
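
A minimal sketch of how the retention rule could be expressed as a pure selection function follows. The function name selectBundlesToDelete, the {name, createdAt} bundle shape, and the reading of "3-bundle minimum" as "the three newest bundles are never deleted" are all assumptions for illustration; the actual semantics should be taken from defragChromaDB.cleanOldBackups.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

/**
 * Hypothetical helper: returns the bundles that the retention sweep would remove.
 * A bundle is removed only if it is older than maxAgeMs AND it is not among
 * the minKeep newest bundles.
 */
function selectBundlesToDelete(bundles, {now, maxAgeMs = 30 * DAY_MS, minKeep = 3}) {
    // Sort newest first without mutating the caller's array.
    const newestFirst = [...bundles].sort((a, b) => b.createdAt - a.createdAt);

    return newestFirst.filter((bundle, index) => {
        const protectedByMinimum = index < minKeep;
        const expired            = now - bundle.createdAt > maxAgeMs;

        return expired && !protectedByMinimum;
    });
}

// Illustrative data: bundle names and ages are made up.
const now     = Date.UTC(2026, 4, 10);
const bundles = [
    {name: 'backup-a', createdAt: now - 1  * DAY_MS},
    {name: 'backup-b', createdAt: now - 10 * DAY_MS},
    {name: 'backup-c', createdAt: now - 40 * DAY_MS},
    {name: 'backup-d', createdAt: now - 50 * DAY_MS}
];

const toDelete = selectBundlesToDelete(bundles, {now});

console.log(toDelete.map(bundle => bundle.name));
```

In this data set backup-c is past the 30-day window but survives because it is the third-newest bundle; only backup-d is selected. Keeping the selection separate from the filesystem removal would let the spec cover the policy without touching .neo-ai-data/backups/.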

Step 4: Test substrate

  • New: test/playwright/unit/ai/daemons/services/BackupCoordinatorService.spec.mjs — pure-function buildBackupTrigger coverage (interval boundaries + disabled). Lift SummarizationCoordinatorService.spec.mjs shape; remember the test-spec-Neo+core bootstrap pattern.
  • Update: test/playwright/unit/ai/daemons/Orchestrator.spec.mjs — verify backup task injection + isolated cycle scheduling.
  • Update: test/playwright/unit/ai/scripts/orchestrator-daemon.spec.mjs — verify backup task command resolution.

Acceptance Criteria

  • AC1 — ai/daemons/services/BackupCoordinatorService.mjs exists; pure-functional getDueTask({state, now, backupIntervalMs}) returning trigger or null; matches SummarizationCoordinatorService.mjs shape (no Neo import; class-only file)
  • AC2 — Orchestrator.mjs buildTaskDefinitions adds backup task pointing at buildScripts/ai/backup.mjs
  • AC3 — Orchestrator.mjs config + configure() wires backupCoordinator (DI, defaults to BackupCoordinatorService) + backupIntervalMs (env override NEO_ORCHESTRATOR_BACKUP_INTERVAL_MS, default 24h)
  • AC4 — Orchestrator.mjs runMaintenanceCycle adds backup lane (failure-isolated, matching summary + kbSync pattern)
  • AC5 — buildScripts/ai/backup.mjs retention sweep added (3-bundle minimum + 30-day cap; mirrors defragChromaDB.cleanOldBackups semantics) — operator-empirical anchor v13-path.md:90
  • AC6 — Unit test coverage for BackupCoordinatorService.spec.mjs (trigger boundaries + disabled state) + Orchestrator.spec.mjs updates (backup task verified) + orchestrator-daemon.spec.mjs updates (command resolution)
  • AC7 — All existing daemon tests continue passing (no regression in summary / kbSync lanes)
  • AC8 — Cross-family review per pull-request §6.1
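
The precedence that AC3 describes (explicit option, then environment override, then the 24h default) can be sketched as below. parseIntervalMs is a hypothetical stand-in for cadenceEngine.parseInterval, whose actual validation rules are not shown in this ticket, and resolveBackupIntervalMs is an illustrative helper name.

```javascript
const DEFAULT_BACKUP_INTERVAL_MS = 86400000; // 24h, as in Step 2

// Hypothetical stand-in for cadenceEngine.parseInterval: accepts a
// non-negative number, falls back for anything missing or invalid.
function parseIntervalMs(rawValue, fallbackMs) {
    if (rawValue === undefined || rawValue === null || String(rawValue).trim() === '') {
        return fallbackMs;
    }

    const parsed = Number(rawValue);

    return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallbackMs;
}

// Mirrors the `options.backupIntervalMs ?? parseInterval(env, default)` line in Step 2.
function resolveBackupIntervalMs(options, env) {
    return options.backupIntervalMs ?? parseIntervalMs(
        env.NEO_ORCHESTRATOR_BACKUP_INTERVAL_MS,
        DEFAULT_BACKUP_INTERVAL_MS
    );
}

const fromDefault = resolveBackupIntervalMs({}, {});
const fromEnv     = resolveBackupIntervalMs({}, {NEO_ORCHESTRATOR_BACKUP_INTERVAL_MS: '3600000'});
const fromOption  = resolveBackupIntervalMs({backupIntervalMs: 0}, {NEO_ORCHESTRATOR_BACKUP_INTERVAL_MS: '3600000'});
const fromInvalid = resolveBackupIntervalMs({}, {NEO_ORCHESTRATOR_BACKUP_INTERVAL_MS: 'abc'});

console.log({fromDefault, fromEnv, fromOption, fromInvalid});
```

Because Step 2 uses ?? rather than ||, an explicit backupIntervalMs of 0 in the options is honored and disables the lane instead of falling through to the environment value.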

Out of Scope

  • DreamMode/Sandman backup-success precondition gate (#10780 was the discipline; the gate version would be a follow-up that BackupCoordinatorService.getDueTask + ProcessSupervisorService.recoverTask state-machine enables). File as scope-extension when the gate becomes load-bearing.
  • Other M4 per-task coordinators (DreamCoordinatorService / SandmanCoordinatorService / GoldenPathCoordinatorService / GraphMaintenanceCoordinatorService) — file separately; this ticket is the BackupCoordinatorService keystone for the 5-coordinator M4 epic
  • Backup automation outside the orchestrator (cron, launchd, etc.) — orchestrator IS the scheduling substrate; operator-territory automations are not v13 scope
  • Restore automation: npm run ai:restore / buildScripts/ai/restore.mjs exist; the gate for orchestrator-driven restore is operator decision, not v13 scope

Avoided Traps

  • Embedding backup logic in Orchestrator.mjs (the #11018 / PR #11021 wrong shape): would re-create the fat-class problem M3.5 just resolved. The coordinator-supervisor-state separation is load-bearing; respect the D3.1 boundary.
  • Hardcoding retention policy in BackupCoordinatorService: retention is the backup-script's concern (operator-runnable shape, mirrors defrag). Coordinator stays pure — "is work due."
  • Naming as BackupService: conflicts with the existing buildScripts/ai/backup.mjs driver semantics + SummarizationCoordinatorService precedent. BackupCoordinatorService matches the naming pattern.
  • Skipping the test-spec-Neo+core bootstrap pattern: post-#11049 invariant requires test specs to import Neo+core themselves when target class file omits Neo (see TaskStateService.spec / ProcessSupervisorService.spec / SummarizationCoordinatorService.spec).
  • Bundling DreamMode-precondition gate scope: ticket-creation §3 prescription clarity — one ticket = one PR shape. The gate is filable as the natural Stage 2 once this lands.

Provenance

  • Operator architectural correction (2026-05-09): PR #11021 closed as superseded; M4 BackupService extraction named in v13-path.md:90 post-#11018 retraction
  • Operator backup-substrate confirmation (2026-05-09): manual backup completed today (feedback_session_state confirms PRIO 0 satisfied for current cycle)
  • Sibling-file precedent: SummarizationCoordinatorService.mjs (#11009 Piece C; Neo-singleton, pure getDueTask, returns trigger envelope)
  • D3.1 boundary anchor: v13-path.md:188-193 — coordinator-vs-supervisor-vs-state separation; M3.5 keystone substrate makes M4 incremental
  • Backup script: buildScripts/ai/backup.mjs already canonical; needs only the retention-sweep extension per Step 3
  • PR #11021 retraction comment (architectural-correction reasoning): https://github.com/neomjs/neo/pull/11021 (Gemini close + my retraction comment)

Related

  • #11018 (closed; original wrong-shape — orchestrator-bolted-backup) — superseded by this ticket
  • PR #11021 (closed; #11018 implementation — superseded by architectural correction)
  • #10780 (closed-not-planned; manual-discipline approach superseded by orchestrator-owned scheduled task per this ticket)
  • #11041 + #11044 + #11051 (M3.5 keystone substrate — TaskStateService + ProcessSupervisorService + CadenceEngine; ALL MERGED 2026-05-09; this ticket builds on all three)
  • #11009 (SummarizationCoordinatorService — sibling precedent shape)
  • #10844 (closed-not-planned; predecessor flat-shape daily-snapshot ticket — non-orchestrator approach)

Self-Identification: @neo-opus-4-7 (Claude Opus 4.7, Claude Code) — chief-architect lane, post-Round-4 architectural follow-up filing. Ticket open for self-selection. Empirical anchor: operator-surfaced gap 2026-05-09T21:38 (no follow-up ticket existed for PR #11021 closure direction).

Origin Session ID: c2912891-b459-4a03-b2af-154d5e264df1

tobiu referenced in commit 91755f4 - "feat(ai): extract BackupCoordinatorService as M4 per-task coordinator (#11062) (#11069)" on May 10, 2026, 1:03 AM
tobiu closed this issue on May 10, 2026, 1:03 AM