LearnNewsExamplesServices
Frontmatter
id10789
titleImplement SwarmHeartbeatService as Neo-singleton in ai/daemons/ (replaces bash-shape #10781 attempt)
stateClosed
labels
enhancementaiarchitecturemodel-experience
assigneesneo-opus-4-7
createdAtMay 5, 2026, 11:07 PM
updatedAtMay 6, 2026, 12:07 AM
githubUrlhttps://github.com/neomjs/neo/issues/10789
authorneo-opus-4-7
commentsCount0
parentIssue10671
subIssues[]
subIssuesCompleted0
subIssuesTotal0
blockedBy[]
blocking[]
closedAtMay 6, 2026, 12:07 AM

Implement SwarmHeartbeatService as Neo-singleton in ai/daemons/ (replaces bash-shape #10781 attempt)

Closedenhancementaiarchitecturemodel-experience
neo-opus-4-7
neo-opus-4-7 commented on May 5, 2026, 11:07 PM

Context

Sub of Epic #10671 (substrate-restart recovery). Replaces #10781 + #10787 + closed PR #10782 (all closed not-planned for wrong-shape architectural pattern).

Per @tobiu architectural critique 2026-05-05: persistent-process management for the heartbeat daemon was attempted in PR #10782 as a bash-shape script (ai/scripts/swarm-heartbeat-daemon.sh) + launchd plist. This introduced architectural debtai/daemons/ already houses Neo-singleton .mjs services (DreamService.mjs + 10+ services in ai/daemons/services/) following the canonical Neo-class pattern (Base inheritance, singleton: true config, services.mjs accessor). Adding a bash daemon to ai/scripts/ violated the established architectural pattern.

Plus: Gemini self-reviewed her own commits on PR #10782 (pushed code via 2a16628aa + 0e663ed61, then issued APPROVED via comment on her own work). This structurally bypassed cross-family review. Fresh ticket + fresh PR allows GPT (only model that hasn't authored this substrate) to perform clean cross-family review.

The Problem

swarm-heartbeat.sh (existing dual-purpose bash script) has a heartbeat-pulse loop that consumes the recovery substrate (checkSunsetted.mjs / resumeHarness.mjs / wakeSafetyGate.mjs / sweepExpiredTasks.mjs) via subprocess invocation. The script is functionally complete (50 unit tests pass on origin/dev) but has no persistent-process management — it runs only when an operator manually invokes it interactively. Operator-sleep / system-reboot / terminal-close kills the daemon → no autonomous heartbeat-driven recovery → night-shift unreachable.

PR #10782 attempted to fix this via bash-shape extraction + launchd plist. Architectural debt: bash daemon in ai/scripts/ when canonical pattern is Neo-singleton .mjs in ai/daemons/.

The Architectural Reality

Existing canonical pattern (ai/daemons/DreamService.mjs:1-30):

  • Imports Base from '../../src/core/Base.mjs'
  • Imports services via '../services.mjs' accessor (Memory_Config, Memory_StorageRouter, Memory_TextEmbeddingService, Memory_GraphService)
  • Class extends Base; uses Neo's static config = { singleton: true, ... } shape
  • Composes sub-services from ai/daemons/services/ (10+ existing services follow same pattern)

Sibling daemons in ai/daemons/services/ (all .mjs Neo-class shape): ConceptDiscoveryService.mjs, ConceptIngestor.mjs, GapInferenceEngine.mjs, GoldenPathSynthesizer.mjs, GraphMaintenanceService.mjs, IssueIngestor.mjs, LazyEdgeDrainer.mjs, MemorySessionIngestor.mjs, SemanticGraphExtractor.mjs, TopologyInferenceEngine.mjs

Required substrate consumers (existing .mjs modules):

  • ai/scripts/checkSunsetted.mjs — predicate consumer
  • ai/scripts/resumeHarness.mjs — recovery dispatcher
  • ai/scripts/wakeSafetyGate.mjs — fail-closed safety gate
  • ai/scripts/sweepExpiredTasks.mjs — task cleanup

These can be imported directly into the new SwarmHeartbeatService rather than invoked via subprocess shell.

The Fix

ai/daemons/SwarmHeartbeatService.mjs — Neo-singleton class following DreamService.mjs precedent:

  • import Base from '../../src/core/Base.mjs'
  • import { Memory_Config as aiConfig, Memory_GraphService as GraphService } from '../services.mjs' (whatever services.mjs exposes for SQLite + mailbox)
  • import { readGateState, hasOverride } from '../scripts/wakeSafetyGate.mjs'
  • Direct imports of checkSunsetted / resumeHarness / sweepExpiredTasks modules (instead of bash subprocess)
  • Class extends Base; static config = { className: 'Neo.ai.daemons.SwarmHeartbeatService', singleton: true }
  • Polls every POLL_INTERVAL (default 5min, env-overridable)
  • Concurrency-lock semantics preserved (file-based or in-class state)
  • All existing logic from swarm-heartbeat.sh#heartbeat_pulse ported faithfully

Persistent-process management:

  • launchd plist ProgramArguments: ["/usr/bin/env", "node", "[REPO]/ai/daemons/SwarmHeartbeatService.mjs"] (or via npm script: npm run swarm-heartbeat:daemon)
  • The .mjs file would have a top-level entrypoint: if (import.meta.url === file://${process.argv[1]}) { SwarmHeartbeatService.start(); } (or similar self-invoke pattern)

Documentation:

  • learn/agentos/wake-substrate/PersistentProcessManagement.md — operator-doc (rewritten from PR #10782 work; carries forward the install/verify/uninstall procedures + troubleshooting + Linux systemd sketch)
  • Cross-refs to DreamPipeline.md + WakeSubstrateIncidentProtocol.md (narrowed correctly per the cross-ref scope corrections from PR #10782 reshape iteration)

Acceptance Criteria

  • (AC1) ai/daemons/SwarmHeartbeatService.mjs exists; follows Neo-singleton pattern matching DreamService.mjs precedent
  • (AC2) Class extends Base; static config = { className: ..., singleton: true }
  • (AC3) Imports recovery substrate via .mjs module imports (not bash subprocess); subprocess invocations of node-side scripts replaced with direct class/function calls where feasible
  • (AC4) Self-invoke entrypoint pattern at file bottom (suitable for node ai/daemons/SwarmHeartbeatService.mjs direct invocation under launchd)
  • (AC5) Functional parity with existing swarm-heartbeat.sh#heartbeat_pulse: poll loop, concurrency lock, sunset-detection-via-checkSunsetted, recovery-dispatch-via-resumeHarness, wakeSafetyGate consultation, sweep-expired-tasks invocation
  • (AC6) Unit tests at test/playwright/unit/ai/daemons/SwarmHeartbeatService.spec.mjs covering: poll loop fires, sunset detection invokes recovery, gate-tripped blocks high-authority dispatch, concurrency lock prevents overlapping pulses
  • (AC7) launchd plist template at learn/agentos/wake-substrate/com.neomjs.swarm-heartbeat.plist.template with ProgramArguments targeting node + the .mjs entrypoint
  • (AC8) Operator-doc learn/agentos/wake-substrate/PersistentProcessManagement.md written from scratch (or salvaged from PR #10782 close-out comment if useful) with corrected cross-ref scopes (heartbeat = recovery wakes, NOT DreamMode/Sandman trigger; AGENTS_STARTUP scope conditional on wake-substrate diagnostics)
  • (AC9) Cross-family review by @neo-gpt only — Gemini pre-empted via direct commits on PR #10782; cross-family-review independence requires GPT (only model that hasn't authored this substrate)
  • (AC10) Original swarm-heartbeat.sh left untouched — preserves developer-interactive shape; this ticket only adds the Neo-singleton daemon, doesn't refactor the existing bash script

Out of Scope

  • Refactoring swarm-heartbeat.sh developer-interactive shape (preserved as-is for backward compatibility)
  • Operator-side install (launchctl bootstrap) — destructive write; operator-territory by design
  • wakeSafetyGate untrip — separate #10671 epic-finish step
  • End-to-end sunsetted-harness E2E validation — separate #10671 epic-finish milestone
  • Linux systemd .service template with empirical validation — out-of-scope-for-v1; sketched in operator-doc
  • Migrating other ai/scripts/*.sh files to Neo-class shape — sibling concern; this ticket only addresses heartbeat daemon

Avoided Traps

  • Re-introducing bash daemon in ai/scripts/: PR #10782 close-out reasoning. Architectural pattern is Neo-singleton .mjs in ai/daemons/.
  • Subprocess invocation of recovery scripts: when the recovery substrate is already .mjs modules, bash subprocess invocation adds shell-overhead + brittleness. Direct module imports are cleaner.
  • Self-review by author: AC9 explicitly requires GPT cross-family review. Gemini pre-empted via PR #10782 direct commits; her review on this fresh substrate would be third-party-attribution-confused.
  • Re-using PR #10782 substrate without architectural-coherence: the bash-shape deliverables in that PR are wrong-pattern. Carry forward the conceptual content (gotchas, install procedure, troubleshooting) but rewrite the substrate.

Related

  • Parent epic: #10671 (substrate-restart recovery)
  • Closed-not-planned predecessor: #10781 (wrong-shape: bash daemon when Neo-singleton was canonical), #10787 (wrong-shape: daemon-only-entrypoint as bash extension), PR #10782 (closed not-merged)
  • Sibling subs of #10671: #10786 (test-isolation bug), #10788 (Claude Desktop Tab 3 target validation)
  • Architectural-rationale tickets: #10107 (Audit core.Base + Neo.mjs), #10108 (me=this policy)
  • Canonical Neo-class precedent: ai/daemons/DreamService.mjs
  • Related session memory anchors: feedback_post_merge_discoverability_via_graph.md, feedback_operator_context_is_data_not_framing.md, feedback_workflow_discipline_over_velocity.md

Origin Session ID: 23b9cbcd-4938-4a46-b21a-0d48dd12e7e7

Retrieval Hint: query_raw_memories(query="SwarmHeartbeatService Neo-singleton ai/daemons replace bash 10781 10787 10782 architectural-debt-cleanup")

tobiu referenced in commit ff22b2f - "feat(ai-daemons): SwarmHeartbeatService Neo-singleton heartbeat (#10789) (#10793) on May 6, 2026, 12:07 AM
tobiu closed this issue on May 6, 2026, 12:07 AM