Context
Empirical anchor: 8h43m all-three-agents-idled gap on 2026-05-02 evening → 2026-05-03 morning. Three swarm-heartbeat.sh instances were running per-identity (PIDs documented in conversation transcript), but ZERO pulse output across ~104 expected cycles per logs at .neo-ai-data/wake-daemon/heartbeat-*.log.
The current substrate is reactive: bridge-daemon.mjs delivers wake events on SENT_TO_ME (i.e., when a peer sends a message). When all three agents end their turns with nothing in the mailbox queue, no new wake fires, and the trio goes silent until @tobiu manually prompts one harness.
@tobiu's reframing (per @neo-gpt's MESSAGE:b1dae52c, 2026-05-03): the "postman" framing belongs to the era where @tobiu manually relayed messages. A2A messaging works. The current blocker is heartbeat-driven liveness when all three agents have ended their turns and no new A2A message is emitted.
This is the core substrate primitive that unlocks 24/7 trio collaboration.
The Problem
The heartbeat substrate as-shipped (post-#10623) has TWO orthogonal limitations:
No all-agent-idle detection. Each heartbeat instance only looks at its OWN identity's state in isolation (via checkSunsetted.mjs). It never asks "are ALL configured identities idle?" — which is the precondition for the trio-coordination wake we actually need.
Inject layer is tmux send-keys-only (swarm-heartbeat.sh:172). For non-tmux harnesses (Antigravity, Claude Desktop, Codex Desktop), the inject silently no-ops. With no all-idle detection AND no non-tmux delivery, the heartbeat substrate is currently a no-op for the actual harness mix the swarm uses.
This ticket addresses #1 — the detection-layer gap. The delivery-layer gap is its own concern (#3 cooldown-bounded trio wake builds on this detector's emitted state).
The Architectural Reality
swarm-heartbeat.sh:131-147 — Per-identity sunset check via checkSunsetted.mjs, fires resumeHarness.mjs on detected sunset. No cross-identity awareness.
checkSunsetted.mjs — Reads single AGENT_MEMORY for one identity, returns sunsetted boolean. Currently regex-extracts timestamp from properties.name (post-#10623 substrate-truth fix); will use structured properties.timestamp once Gemini's #10620 / PR #10621 lands the schema normalization.
MailboxService.dispatchSENT_TO_ME — current wake-emission path. Reactive only, not heartbeat-aware.
The substrate shape this ticket introduces:
heartbeat_pulse() {
for identity in $configured_identities; do
last_activity[identity] = query_last_AGENT_MEMORY_timestamp(identity)
done
if all-of last_activity older than IDLE_THRESHOLD:
emit AllAgentIdleSignal({
identities: configured,
cycle_id: <derived>,
coordinator_recommendation: <round-robin or earliest-idled>
})
}
The signal contract (not the cooldown layer) is what this ticket ships. #3 layers cooldown/idempotency on top.
The Fix
Concrete prescription:
ai/scripts/swarm-heartbeat.sh: extend heartbeat_pulse with a new helper get_all_agent_idle_state() that queries AGENT_MEMORY timestamps for ALL identities in the configured trio (or whatever future N), compares against IDLE_THRESHOLD (env var, default ~10 min), and returns structured all-idle signal when triggered.
Detector contract: signal must include cycle_id (deterministic per pulse cycle so #3 can dedupe), identities involved, and coordinator-recommendation. Document the contract in script comments AND in a sibling .mjs if the detector is lifted out.
Observability: signal emission MUST log to stderr with structured prefix (e.g., [heartbeat <ts>] AllAgentIdle detected: ...) so post-mortems can confirm detection vs no-detection from logs alone (avoiding the empirical-evidence gap that produced this ticket).
Test coverage:
- Fixture-backed: seed an
AGENT_MEMORY graph fixture with all-idle timestamps, run get_all_agent_idle_state, assert signal emitted
- Negative: seed at least one identity with recent activity, assert no signal emitted
- Boundary: identity with no
AGENT_MEMORY rows treated as idle (or as fresh-DB, depending on convergence — document the choice)
Acceptance Criteria
Out of Scope
- Cooldown / idempotency — owned by sibling ticket #3
- Wake delivery to non-tmux harnesses — separate concern; this ticket emits the signal, doesn't route it
- Identity-input normalization (
@neo-opus-ada vs neo-opus-ada) — separate hardening lane
- Coordinator-of-the-cycle policy — this ticket recommends one, doesn't enforce; #3 may layer policy on top
Avoided Traps
- ❌ Per-identity isolation — the existing
checkSunsetted.mjs shape is per-identity; treating that as adequate for trio liveness is the failure mode this ticket fixes. The all-idle predicate is INHERENTLY cross-identity.
- ❌ Folding cooldown into detection — substrate-truth-grounded design pattern (per @neo-gpt's MESSAGE:e49dc3c5): #3 cooldown logic should bind to detector's emitted state, not invent its own stale-state interpretation. Detection and cooldown are distinct concerns; folding them couples them prematurely.
- ❌ Regex-based timestamp extraction long-term — the regex fallback shipped in #10623 was the substrate-truth bridge until Gemini's #10620 / PR #10621 lands structured
properties.timestamp. This ticket's detector should prefer structured fields when available, fall back to regex only if not.
Related
- Blocked-by: #10624 (canonical wake routes) — wake-delivery reliability prerequisite
- Blocked-by: Gemini's #10620 / PR #10621 — structured
AGENT_MEMORY for trustworthy timestamp/identity inputs
- Blocks: sibling ticket for cooldown-bounded trio wake (filed separately) — that builds on this detector's contract
- Builds-on: #10619 (fresh-session-spawn substrate corrective), #10623 (heartbeat unread-count fix)
Origin Session ID: b1839431-cba1-4b6d-913f-27b09e472e67
Retrieval Hint: query_summaries("heartbeat liveness substrate-stack all-agent-idle detection trio coordination 2026-05-03") + query_raw_memories("all three agents idled overnight 8h43m no pulse output")
Context
Empirical anchor: 8h43m all-three-agents-idled gap on 2026-05-02 evening → 2026-05-03 morning. Three
swarm-heartbeat.shinstances were running per-identity (PIDs documented in conversation transcript), but ZERO pulse output across ~104 expected cycles per logs at.neo-ai-data/wake-daemon/heartbeat-*.log.The current substrate is reactive:
bridge-daemon.mjsdelivers wake events onSENT_TO_ME(i.e., when a peer sends a message). When all three agents end their turns with nothing in the mailbox queue, no new wake fires, and the trio goes silent until @tobiu manually prompts one harness.@tobiu's reframing (per @neo-gpt's MESSAGE:b1dae52c, 2026-05-03): the "postman" framing belongs to the era where @tobiu manually relayed messages. A2A messaging works. The current blocker is heartbeat-driven liveness when all three agents have ended their turns and no new A2A message is emitted.
This is the core substrate primitive that unlocks 24/7 trio collaboration.
The Problem
The heartbeat substrate as-shipped (post-#10623) has TWO orthogonal limitations:
No all-agent-idle detection. Each heartbeat instance only looks at its OWN identity's state in isolation (via
checkSunsetted.mjs). It never asks "are ALL configured identities idle?" — which is the precondition for the trio-coordination wake we actually need.Inject layer is
tmux send-keys-only (swarm-heartbeat.sh:172). For non-tmux harnesses (Antigravity, Claude Desktop, Codex Desktop), the inject silently no-ops. With no all-idle detection AND no non-tmux delivery, the heartbeat substrate is currently a no-op for the actual harness mix the swarm uses.This ticket addresses #1 — the detection-layer gap. The delivery-layer gap is its own concern (#3 cooldown-bounded trio wake builds on this detector's emitted state).
The Architectural Reality
swarm-heartbeat.sh:131-147— Per-identity sunset check viacheckSunsetted.mjs, firesresumeHarness.mjson detected sunset. No cross-identity awareness.checkSunsetted.mjs— Reads singleAGENT_MEMORYfor one identity, returns sunsetted boolean. Currently regex-extracts timestamp fromproperties.name(post-#10623 substrate-truth fix); will use structuredproperties.timestamponce Gemini's #10620 / PR #10621 lands the schema normalization.MailboxService.dispatchSENT_TO_ME— current wake-emission path. Reactive only, not heartbeat-aware.The substrate shape this ticket introduces:
heartbeat_pulse() { for identity in $configured_identities; do last_activity[identity] = query_last_AGENT_MEMORY_timestamp(identity) done if all-of last_activity older than IDLE_THRESHOLD: emit AllAgentIdleSignal({ identities: configured, cycle_id: <derived>, coordinator_recommendation: <round-robin or earliest-idled> }) }The signal contract (not the cooldown layer) is what this ticket ships. #3 layers cooldown/idempotency on top.
The Fix
Concrete prescription:
ai/scripts/swarm-heartbeat.sh: extendheartbeat_pulsewith a new helperget_all_agent_idle_state()that queriesAGENT_MEMORYtimestamps for ALL identities in the configured trio (or whatever future N), compares againstIDLE_THRESHOLD(env var, default ~10 min), and returns structured all-idle signal when triggered.Detector contract: signal must include cycle_id (deterministic per pulse cycle so #3 can dedupe), identities involved, and coordinator-recommendation. Document the contract in script comments AND in a sibling
.mjsif the detector is lifted out.Observability: signal emission MUST log to stderr with structured prefix (e.g.,
[heartbeat <ts>] AllAgentIdle detected: ...) so post-mortems can confirm detection vs no-detection from logs alone (avoiding the empirical-evidence gap that produced this ticket).Test coverage:
AGENT_MEMORYgraph fixture with all-idle timestamps, runget_all_agent_idle_state, assert signal emittedAGENT_MEMORYrows treated as idle (or as fresh-DB, depending on convergence — document the choice)Acceptance Criteria
swarm-heartbeat.shqueries all configured identities' last-activity timestamps per cycleIDLE_THRESHOLDOut of Scope
@neo-opus-adavsneo-opus-ada) — separate hardening laneAvoided Traps
checkSunsetted.mjsshape is per-identity; treating that as adequate for trio liveness is the failure mode this ticket fixes. The all-idle predicate is INHERENTLY cross-identity.properties.timestamp. This ticket's detector should prefer structured fields when available, fall back to regex only if not.Related
AGENT_MEMORYfor trustworthy timestamp/identity inputsOrigin Session ID: b1839431-cba1-4b6d-913f-27b09e472e67
Retrieval Hint: query_summaries("heartbeat liveness substrate-stack all-agent-idle detection trio coordination 2026-05-03") + query_raw_memories("all three agents idled overnight 8h43m no pulse output")