Frontmatter
id: 12446
title: Wake-substrate liveness hardening (the primary liveness leg)
state: Closed
labels: enhancement, ai, model-experience
assignees: neo-gpt
createdAt: Jun 3, 2026, 8:46 PM
updatedAt: Jun 4, 2026, 8:21 PM
githubUrl: https://github.com/neomjs/neo/issues/12446
author: neo-claude-opus
commentsCount: 2
parentIssue: 12440
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: Jun 4, 2026, 8:21 PM

Wake-substrate liveness hardening (the primary liveness leg)

Closed Backlog/active-chunk-17 enhancement ai model-experience
neo-claude-opus commented on Jun 3, 2026, 8:46 PM

Sub-ticket of Epic #12440 (Liveness-first authorship substrate). Graduated from Discussion #12429 (OQ4 — the primary successor leg).

Scope

This ticket covers the wake-substrate liveness work: the changes that concretely make the swarm more live. Per the liveness-first reframe ("the imbalance is liveness, not policy; you cannot load-balance asleep agents"), this is the primary target, not a knob on the old mechanism.

Affected substrate

  • learn/agentos/wake-substrate/ + the harness wake-subscription / sunset path.

Contract Ledger Matrix

Backfilled by @neo-gpt during ticket-intake after the Contract Completeness Gate halted implementation. Surface anchors verified against current source before editing the ticket body: WakeSubscriptionService.unsubscribe(), WakeSubscriptionService.resync(), WakeSubscriptionService._evaluateEdgeAgainstSubscription(), session-sunset-workflow.md Step 8/9, and HealthService.buildIdentityBlock().

| Target Surface | Source of Authority | Proposed Behavior | Fallback / Edge Case | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| Harness wake-eligibility across sunset | #12440 OQ4, this ticket, and session-sunset-workflow.md Step 8/9 | Sunset must preserve future wake eligibility by routing continuity through a mailbox-only wakeSuppressed self-DM and a fresh-session recovery path, not by continuing work in the old transcript. | Old-transcript wake-shaped payloads remain stale/noise; if harness coverage for an identity is absent, the payload must name the manual/operator recovery path rather than claiming active wake eligibility. | Update learn/agentos/wake-substrate/ and sunset docs if behavior changes. | Unit/integration proof around sunset handoff plus manual or harness-level wake recovery evidence. |
| manage_wake_subscription({action: 'unsubscribe'}) / WakeSubscriptionService.unsubscribe() | Current WakeSubscriptionService.unsubscribe() removes owned WAKE_SUBSCRIPTION nodes and SUBSCRIBES_TO edges; sunset Step 9 currently mandates unsubscribe. | Fix the sunset-unsubscribe bug by making the sunset routing state explicit: either unsubscribe only after a durable reactivation path exists, or replace blanket unsubscribe with a documented dormant/resumable routing state. | Missing subscription, stale durable row, cross-owner access, or unbound caller must fail loudly or return a named no-op status; no silent loss of future wake routing. | Update sunset workflow and wake-substrate docs to match the chosen routing state. | WakeSubscriptionService.spec.mjs unsubscribe coverage plus a sunset-path regression test proving wake eligibility is not silently dropped. |
| WakeSubscriptionService.resync() / _evaluateEdgeAgainstSubscription() read-state gating | #12446 cross-link, ADR 0002 dedupe intent, current resync() GraphLog replay, and current _evaluateEdgeAgainstSubscription() SENT_TO_ME branch. | Replayed wake/sent_to_me events must not surface already-read messages as "new messages"; unread direct SENT_TO and unread per-recipient broadcast deliveries remain eligible. | Direct MESSAGE.readAt, per-recipient DELIVERED_TO.readAt, and legacy AGENT:* delivery fallback must be handled explicitly; wakeSuppressed messages remain mailbox-only and non-emitting. | Update ADR/wake-substrate docs if the replay contract changes. | Unit coverage for direct read, per-recipient broadcast read, unread still-emits, and legacy fallback cases in WakeSubscriptionService.spec.mjs or the owning bridge-resync test. |
| #12408-class bound:false health/fail-loud behavior | Current HealthService.buildIdentityBlock() projects {source, bound, nodeId, warning}; #performHealthCheck() appends warnings but can still report All features are operational. | When an env-pinned agent identity resolves to no graph node, the health/status surface must make that state fail-loud for wake/liveness readiness instead of burying it behind a healthy all-operational summary. | Unresolved SSE/proxy identities that are not env-pinned remain non-fatal observability; env-pinned bound:false must produce a degraded/error or explicit wake-readiness false signal. | Update healthcheck docs/OpenAPI if status or payload shape changes. | Unit coverage for env-pinned bound:false, non-env unresolved identity, and fully bound identity health payloads. |
| Docs/tests evidence bundle | #12440 primary liveness leg and this ticket ACs | Implementation PR must update the owning docs and prove each changed consumed surface with tests rather than relying on manual wake anecdotes. | If a sub-surface is deferred, PR body must name the residual and file/link a successor ticket before merge approval. | learn/agentos/wake-substrate/, session-sunset-workflow.md, and any MCP/OpenAPI docs touched by health payload changes. | Focused unit tests plus one integration or manual harness validation where unit proof cannot cover OS/harness wake behavior. |
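
The unsubscribe row above asks for an explicit sunset routing state and named outcomes instead of silent loss. The following is a minimal sketch of one possible reading, not the WakeSubscriptionService implementation. The in-memory Map store, the function name unsubscribeForSunset, the hasReactivationPath flag, and the status strings are all hypothetical; the ticket leaves the choice between "unsubscribe only after a reactivation path exists" and "dormant/resumable state" open, and this sketch simply combines both to show the shape.

```javascript
// Hypothetical stand-in: `store` is a Map of subscriptionId -> {ownerId, state}.
// The real service operates on WAKE_SUBSCRIPTION nodes and SUBSCRIBES_TO edges.
function unsubscribeForSunset(store, {subscriptionId, callerId, hasReactivationPath}) {
    // Unbound caller: fail loudly instead of silently doing nothing.
    if (!callerId) {
        throw new Error('unsubscribe rejected: caller identity is unbound');
    }

    const subscription = store.get(subscriptionId);

    // Missing subscription: return a named no-op status the caller can inspect.
    if (!subscription) {
        return {status: 'noop_missing_subscription'};
    }

    // Cross-owner access: fail loudly.
    if (subscription.ownerId !== callerId) {
        throw new Error('unsubscribe rejected: subscription is owned by another identity');
    }

    // No durable reactivation path yet: park the subscription instead of deleting it,
    // so future wake routing is not dropped.
    if (!hasReactivationPath) {
        subscription.state = 'dormant';
        return {status: 'dormant'};
    }

    store.delete(subscriptionId);
    return {status: 'unsubscribed'};
}
```

A sunset step built this way would always receive either a thrown error or one of three named statuses, which is the property the regression test in the Evidence column is meant to pin down.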

Acceptance Criteria

  • Harness wake-eligibility preserved across sunset (sunset must not silently drop wake-eligibility).
  • Sunset-unsubscribe bug fixed.
  • #12408-class fail-loud-on-bound:false hardening — don't report "All features operational" while an env-pinned identity is unbound.
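
The third criterion can be illustrated with a small status function. This is a sketch under stated assumptions, not HealthService code: the identity argument mirrors the {source, bound, nodeId, warning} block named in the ledger, but the function name summarizeHealth, the 'env' source value, the 'degraded' status string, and the returned field names are hypothetical.

```javascript
// Hypothetical: `identity` follows the {source, bound, nodeId, warning} projection
// described in the ticket. The 'env' source value is an assumption.
function summarizeHealth(identity) {
    const warnings = identity.warning ? [identity.warning] : [];

    // Env-pinned identity that resolved to no graph node: fail loud.
    if (identity.source === 'env' && identity.bound === false) {
        return {
            status  : 'degraded',
            summary : 'Env-pinned identity is not bound to a graph node',
            warnings
        };
    }

    // Unresolved identities that are not env-pinned stay non-fatal:
    // the warning is surfaced, but the overall status remains healthy.
    return {
        status  : 'healthy',
        summary : 'All features are operational',
        warnings
    };
}
```

The three branches correspond to the three unit cases listed in the ledger: env-pinned bound:false, non-env unresolved identity, and fully bound identity.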

Cross-link (surfaced this session, 2026-06-03)

The 3-family-diagnosed wake emission / new-vs-read fault is an adjacent wake-substrate correctness issue: it re-fires already-read messages as "N new messages" wakes. Source: WakeSubscriptionService.resync() / _evaluateEdgeAgainstSubscription() emit wake/sent_to_me from SENT_TO edges without read-state gating (ADR 0002's messageId+read-check dedupe is skipped). The fault is binding-independent (it manifests at bound:true AND bound:false). Fold in here or track as a sibling per @tobiu's reseed decision. Minimal AC: re-emitted wake/sent_to_me for read messageIds must not surface in "new messages" counts (validate MESSAGE.readAt direct, DELIVERED_TO.readAt per-recipient, legacy AGENT:* fallback).
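
The minimal AC above amounts to a read-state gate plus messageId dedupe applied to replayed events. A minimal sketch follows, not the resync() code: the event shape ({message, deliveries}), the function names isWakeEligible and countNewMessages, and the reading of the legacy AGENT:* case as "no per-recipient delivery record, so fall back to MESSAGE.readAt" are assumptions made for illustration.

```javascript
// Hypothetical shapes (the real graph layout may differ):
//   message:    {id, readAt, wakeSuppressed}
//   deliveries: [{recipientId, readAt}], one entry per DELIVERED_TO edge
function isWakeEligible(message, deliveries, recipientId) {
    // wakeSuppressed messages stay mailbox-only and never emit a wake.
    if (message.wakeSuppressed) return false;

    // Per-recipient broadcast delivery: the delivery record owns the read state.
    const delivery = deliveries.find(d => d.recipientId === recipientId);
    if (delivery) return delivery.readAt == null;

    // Direct message, or (assumed) legacy delivery without a per-recipient record:
    // fall back to the read state on the message itself.
    return message.readAt == null;
}

function countNewMessages(replayedEvents, recipientId) {
    const seen = new Set();
    let count = 0;

    for (const {message, deliveries = []} of replayedEvents) {
        // messageId dedupe: a replay may surface the same message more than once.
        if (seen.has(message.id)) continue;
        seen.add(message.id);

        if (isWakeEligible(message, deliveries, recipientId)) count++;
    }

    return count;
}
```

With this gate in place, a replay containing only already-read messages yields a count of zero, so no "N new messages" wake would be warranted for it.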

Rationale (from #12429)

OQ4 is the primary successor target — wake-substrate liveness changes the capacity asymmetry that drives the imbalance, which no policy knob can.

Origin: Discussion #12429 → Epic #12440. Session e886ae3e-13c0-4a94-9713-f8316e2342d0.

tobiu referenced in commit 01f7b51 - "fix(memory-core): harden wake liveness surfaces (#12446) (#12510)" on Jun 4, 2026, 8:21 PM
tobiu closed this issue on Jun 4, 2026, 8:21 PM