Sub-ticket of Epic #12440 (Liveness-first authorship substrate). Graduated from Discussion #12429 (OQ4 — the primary successor leg).
Scope
The wake-substrate liveness work — what concretely makes the swarm more live. Per the liveness-first reframe ("the imbalance is liveness, not policy; you cannot load-balance asleep agents"), this is the primary target, not a knob on the old mechanism.
Affected substrate
learn/agentos/wake-substrate/ + the harness wake-subscription / sunset path.
Contract Ledger Matrix
Backfilled by @neo-gpt during ticket-intake after the Contract Completeness Gate halted implementation. Surface anchors verified against current source before editing the ticket body: WakeSubscriptionService.unsubscribe(), WakeSubscriptionService.resync(), WakeSubscriptionService._evaluateEdgeAgainstSubscription(), session-sunset-workflow.md Step 8/9, and HealthService.buildIdentityBlock().
| Target Surface | Source of Authority | Proposed Behavior | Fallback / Edge Case | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| Harness wake-eligibility across sunset | #12440 OQ4, this ticket, and session-sunset-workflow.md Step 8/9 | Sunset must preserve future wake eligibility by routing continuity through a mailbox-only wakeSuppressed self-DM and a fresh-session recovery path, not by continuing work in the old transcript. | Old-transcript wake-shaped payloads remain stale/noise; if harness coverage for an identity is absent, the payload must name the manual/operator recovery path rather than claiming active wake eligibility. | Update learn/agentos/wake-substrate/ and sunset docs if behavior changes. | Unit/integration proof around sunset handoff plus manual or harness-level wake recovery evidence. |
| manage_wake_subscription({action: 'unsubscribe'}) / WakeSubscriptionService.unsubscribe() | Current WakeSubscriptionService.unsubscribe() removes owned WAKE_SUBSCRIPTION nodes and SUBSCRIBES_TO edges; sunset Step 9 currently mandates unsubscribe. | Fix the sunset-unsubscribe bug by making the sunset routing state explicit: either unsubscribe only after a durable reactivation path exists, or replace blanket unsubscribe with a documented dormant/resumable routing state. | Missing subscription, stale durable row, cross-owner access, or unbound caller must fail loudly or return a named no-op status; no silent loss of future wake routing. | Update sunset workflow and wake-substrate docs to match the chosen routing state. | WakeSubscriptionService.spec.mjs unsubscribe coverage plus a sunset-path regression test proving wake eligibility is not silently dropped. |
| WakeSubscriptionService.resync() / _evaluateEdgeAgainstSubscription() read-state gating | #12446 cross-link, ADR 0002 dedupe intent, current resync() GraphLog replay, and current _evaluateEdgeAgainstSubscription() SENT_TO_ME branch. | Replayed wake/sent_to_me events must not surface already-read messages as "new messages"; unread direct SENT_TO and unread per-recipient broadcast deliveries remain eligible. | Direct MESSAGE.readAt, per-recipient DELIVERED_TO.readAt, and legacy AGENT:* delivery fallback must be handled explicitly; wakeSuppressed messages remain mailbox-only and non-emitting. | Update ADR/wake-substrate docs if the replay contract changes. | Unit coverage for direct read, per-recipient broadcast read, unread still-emits, and legacy fallback cases in WakeSubscriptionService.spec.mjs or the owning bridge-resync test. |
| #12408-class bound:false health/fail-loud behavior | Current HealthService.buildIdentityBlock() projects {source, bound, nodeId, warning}; #performHealthCheck() appends warnings but can still report All features are operational. | When an env-pinned agent identity resolves to no graph node, the health/status surface must make that state fail-loud for wake/liveness readiness instead of burying it behind a healthy all-operational summary. | Unresolved SSE/proxy identities that are not env-pinned remain non-fatal observability; env-pinned bound:false must produce a degraded/error or explicit wake-readiness false signal. | Update healthcheck docs/OpenAPI if status or payload shape changes. | Unit coverage for env-pinned bound:false, non-env unresolved identity, and fully bound identity health payloads. |
| Docs/tests evidence bundle | #12440 primary liveness leg and this ticket ACs | Implementation PR must update the owning docs and prove each changed consumed surface with tests rather than relying on manual wake anecdotes. | If a sub-surface is deferred, PR body must name the residual and file/link a successor ticket before merge approval. | learn/agentos/wake-substrate/, session-sunset-workflow.md, and any MCP/OpenAPI docs touched by health payload changes. | Focused unit tests plus one integration or manual harness validation where unit proof cannot cover OS/harness wake behavior. |
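The unsubscribe row's "fail loudly or return a named no-op status" requirement can be read as a small result contract. The sketch below is illustrative only: the in-memory Map standing in for WAKE_SUBSCRIPTION nodes, the function name unsubscribeExplicit, the status strings (removed, noop_missing), and the error codes are all assumptions made for this example, not the current WakeSubscriptionService API.

```javascript
// Hypothetical sketch: every name and shape here is illustrative,
// not the real WakeSubscriptionService implementation.
class WakeRoutingError extends Error {
    constructor(code, message) {
        super(message);
        this.name = 'WakeRoutingError';
        this.code = code; // machine-readable failure reason
    }
}

/**
 * @param {Map<string, {ownerId: string}>} subscriptions stand-in for WAKE_SUBSCRIPTION nodes
 * @param {{callerId: (string|null), subscriptionId: string}} request
 * @returns {{status: string, subscriptionId: string}}
 */
function unsubscribeExplicit(subscriptions, {callerId, subscriptionId}) {
    // Unbound caller: fail loudly instead of silently doing nothing.
    if (!callerId) {
        throw new WakeRoutingError('UNBOUND_CALLER', 'Caller has no bound identity');
    }

    const subscription = subscriptions.get(subscriptionId);

    // Missing subscription: a named no-op the caller can inspect.
    if (!subscription) {
        return {status: 'noop_missing', subscriptionId};
    }

    // Cross-owner access: fail loudly, leave the routing state untouched.
    if (subscription.ownerId !== callerId) {
        throw new WakeRoutingError('CROSS_OWNER', 'Subscription is owned by another identity');
    }

    subscriptions.delete(subscriptionId);
    return {status: 'removed', subscriptionId};
}
```

A regression test can then assert on the returned status or the thrown code, so a dropped subscription is never indistinguishable from a successful one.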
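Acceptance Criteria
bound:false hardening: don't report "All features operational" while an env-pinned identity is unbound.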
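The bound:false health requirement from the ledger can be sketched as a pure function over the identity block. The {source, bound, nodeId, warning} shape comes from this ticket; the 'env' source value, the status strings, the wakeReady field, and the function name summarizeWakeReadiness are assumptions for illustration and may not match what HealthService ends up exposing.

```javascript
// Hypothetical sketch. Only the {source, bound, nodeId, warning} shape is
// taken from the ticket; everything else is an illustrative assumption.
function summarizeWakeReadiness(identity) {
    const warnings = identity.warning ? [identity.warning] : [];

    // Assumption: env-pinned identities are marked with source === 'env'.
    const envPinnedUnbound = identity.source === 'env' && identity.bound === false;

    if (envPinnedUnbound) {
        // Fail loud: never pair this state with an all-operational summary.
        return {
            status: 'degraded',
            wakeReady: false,
            summary: 'Wake readiness failed: env-pinned identity is not bound to a graph node',
            warnings
        };
    }

    // Unresolved identities that are not env-pinned stay non-fatal:
    // the warning is surfaced, but the overall status is unchanged.
    return {
        status: 'healthy',
        wakeReady: identity.bound === true,
        summary: 'All features are operational',
        warnings
    };
}
```

The three unit cases named in the ledger (env-pinned bound:false, non-env unresolved identity, fully bound identity) map directly onto three calls to a function of this shape.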
Cross-link (surfaced this session, 2026-06-03)
The 3-family-diagnosed wake emission / new-vs-read fault is an adjacent wake-substrate correctness issue: already-read messages are re-fired as "N new messages" wakes. Source: WakeSubscriptionService.resync() / _evaluateEdgeAgainstSubscription() emit wake/sent_to_me from SENT_TO edges without read-state gating, so ADR 0002's messageId + read-check dedupe is skipped. The fault is binding-independent (it manifests at both bound:true and bound:false). Fold in here or track as a sibling per @tobiu's reseed decision. Minimal AC: re-emitted wake/sent_to_me events for read messageIds must not surface in "new messages" counts (validate MESSAGE.readAt for direct messages, DELIVERED_TO.readAt per recipient, and the legacy AGENT:* fallback).
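A minimal sketch of the read-state gate described in the minimal AC follows. The event shape ({message, delivery, legacyDelivery}), the function names, and the way the legacy AGENT:* branch reads its state are assumptions for illustration; the real graph nodes and edges carry more fields, and this ticket does not specify how the legacy fallback determines read state.

```javascript
// Hypothetical sketch of read-state gating for replayed wake/sent_to_me events.
// Assumed event shape:
//   message:        {id, readAt, wakeSuppressed}   (MESSAGE node)
//   delivery:       {readAt} | null                (per-recipient DELIVERED_TO)
//   legacyDelivery: {readAt} | null                (legacy AGENT:* fallback)
function isWakeEligible({message, delivery = null, legacyDelivery = null}) {
    // wakeSuppressed messages are mailbox-only and never emit.
    if (message.wakeSuppressed) {
        return false;
    }
    // Per-recipient broadcast: this recipient's own read state decides.
    if (delivery) {
        return delivery.readAt == null;
    }
    // Legacy fallback gets its own explicit branch rather than falling through.
    if (legacyDelivery) {
        return legacyDelivery.readAt == null;
    }
    // Direct SENT_TO message: gate on MESSAGE.readAt.
    return message.readAt == null;
}

// Counts "new messages" for a replay, deduplicating by messageId first.
function countNewMessages(replayedEvents) {
    const seenMessageIds = new Set();
    let count = 0;

    for (const event of replayedEvents) {
        if (seenMessageIds.has(event.message.id)) {
            continue; // same messageId replayed twice
        }
        seenMessageIds.add(event.message.id);

        if (isWakeEligible(event)) {
            count++;
        }
    }
    return count;
}
```

With a gate like this, a replay containing only read messages yields a count of zero, which is the behavior the minimal AC asks a spec to prove.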
Rationale (from #12429)
OQ4 is the primary successor target — wake-substrate liveness changes the capacity asymmetry that drives the imbalance, which no policy knob can.
Origin: Discussion #12429 → Epic #12440. Session e886ae3e-13c0-4a94-9713-f8316e2342d0.