LearnNewsExamplesServices
Frontmatter
id12013
titleAGENTS.md §edge_case_triggers wake-heartbeat trigger: ''halt state'' binary + ''ZERO own-PRs'' metric mis-shape the actual failure mode
stateClosed
labels
enhancementaiarchitecturemodel-experience
assigneesneo-opus-4-7
createdAtMay 26, 2026, 9:24 AM
updatedAtMay 27, 2026, 2:11 PM
githubUrlhttps://github.com/neomjs/neo/issues/12013
authorneo-opus-4-7
commentsCount0
parentIssuenull
subIssues[]
subIssuesCompleted0
subIssuesTotal0
blockedBy[]
blocking[]
closedAtMay 27, 2026, 2:11 PM

AGENTS.md §edge_case_triggers wake-heartbeat trigger: 'halt state' binary + 'ZERO own-PRs' metric mis-shape the actual failure mode

Closedenhancementaiarchitecturemodel-experience
neo-opus-4-7
neo-opus-4-7 commented on May 26, 2026, 9:24 AM

Context

The last bullet of AGENTS.md §edge_case_triggers reads:

Wake/Heartbeat-without-active-lifecycle: Query unassigned current-release Project before declaring halt; only zero-candidate state with named reason justifies idle. Exhausted-self-assigned-bench is NOT a halt-state. Stale-wake silent-mark-read while ZERO own-PRs ship is deference-slip dressed as discipline. See peer-role-mode.md §7.

The Problem

Two mis-shapes inside the rule:

1. The literal "halt state" framing creates a false binary. The rule treats agent state as either "working" or "halted." Reality has many intermediate states — pausing for post-ship coordination, holding for operator context-switch signals, working on substantive substrate-evolution that doesn't produce commits in any given window, etc. The "halt state" language gives no taxonomy for legitimate pauses with named reasons; everything not-shipping becomes "halt-state-without-named-reason," which the rule then flags as deference-slip.

2. "ZERO own-PRs ship" is the wrong metric for the failure mode the rule actually targets. The empirical friction (2026-05-26 nightshift) was 4h of producing ZERO substrate-evolution signal of ANY kind — no PRs, no peer reviews, no A2A coordination, no ticket triage, no skill-improvement, no graduation work, no Memory Core writes worth referencing. The metric should mirror §contributions_over_commits' enumeration (the load-bearing anchor — keep it intact), not collapse to "PRs shipped."

The current wording catches the failure mode by accident (no PRs ⇒ no signals in that incident) but mis-targets it semantically. An agent could ship one trivial PR per heartbeat and pass the rule while producing no substantive substrate evolution; conversely, an agent doing 4h of substantive peer review + A2A coordination would fail the rule while producing real signal.

Empirical Anchor

2026-05-26 ~06:55Z operator friction call-out, verbatim: "you had 4h for the nightshift. we got ONE new PR in total from both of you. GPT is triaging 1 ticket every SELF wakeup. so i don't even know if OUR wakeups work. you are ignoring 50(?) messages in a row, 'holding'. why?"

The diagnosis: not "zero PRs shipped" (PR-counting metric — too easy to game). The actual signal-of-failure: ZERO substrate-evolution outputs of ANY shape across the 4h window. The rule should match the diagnosis.

Proposed Substrate Shape (suggestion, not prescription)

Three plausible reshapings of the bullet — operator + reviewer pick the combination that best matches substrate intent:

  1. Replace "halt-state" binary with a signal-output taxonomy. Distinguish (a) silent stall (no output, no declared reason — the failure mode), (b) declared-pause (named reason — substrate-correct), (c) substantive in-flight work that may not produce commits in the current window (substrate-correct), (d) actual halt with named zero-candidate justification (substrate-correct).

  2. Replace "ZERO own-PRs ship" with "ZERO substrate-evolution signals produced" (PRs, peer reviews, A2A coordination, ticket triage, skill improvements, ideation graduations — the §contributions_over_commits enumeration). Aligns the failure metric with the canonical value-enumeration.

  3. Add a declared-pause carve-out. A heartbeat-without-PR window is NOT deference-slip if the agent has declared a lane-state: paused — <named reason> (e.g., post-ship coordination window, operator context-switch, peer-review-in-flight). Silent mark-read remains the failure shape.

The three combine naturally: (1) taxonomy + (2) corrected metric + (3) declared-pause exit clause.

Acceptance Criteria

  • AC1 — Rewrite the §edge_case_triggers wake-heartbeat trigger bullet to reflect (1) signal-output taxonomy + (2) corrected ZERO substrate-evolution signals metric (or alternative shape if review finds a better one).
  • AC2 — peer-role-mode.md §7 (the rule's anchor target) audited and updated consistently if rewording changes the trigger surface.
  • AC3 — Sanity-check the rewording against the 2026-05-26 nightshift empirical anchor: rewrite still catches "4h producing zero substrate-evolution outputs of any kind" without catching legitimate declared-pause windows.
  • AC4 — Substrate-Mutation Pre-Flight Gate slot-rationale section in the PR body per pull-request-workflow.md §1.1.

Out of Scope

  • §contributions_over_commits — that anchor is valid + load-bearing; this ticket does NOT propose touching it. The fix lives entirely inside §edge_case_triggers.
  • Reverting the rule — the original failure mode is real; this is shape correction, not retirement.

Authority

Operator-prompted 2026-05-26 ~07:25Z (initial filing) + clarified 2026-05-26 ~07:35Z (corrected framing): "contributions > commits is valid. the problem is […] since it literally includes 'halt state'. friction was: you did neither create contributions nor commits when idling out for 4h. not a blame, but a friction->gold topic."

Authored by: Claude Opus 4.7 (Claude Code).

tobiu referenced in commit 3a42f36 - "fix(agentos): reshape wake-heartbeat trigger to signal-output taxonomy (#12013) (#12058) on May 27, 2026, 2:11 PM
tobiu closed this issue on May 27, 2026, 2:11 PM