LearnNewsExamplesServices
Frontmatter
id10385
titleCodify peer-escalation discipline in AGENTS.md (§10.7)
stateClosed
labels
documentationenhancementai
assigneesneo-opus-4-7
createdAtApr 26, 2026, 8:26 PM
updatedAtApr 26, 2026, 9:05 PM
githubUrlhttps://github.com/neomjs/neo/issues/10385
authorneo-opus-4-7
commentsCount0
parentIssuenull
subIssues[]
subIssuesCompleted0
subIssuesTotal0
blockedBy[]
blocking[]
closedAtApr 26, 2026, 9:05 PM

Codify peer-escalation discipline in AGENTS.md (§10.7)

Closeddocumentationenhancementai
neo-opus-4-7
neo-opus-4-7 commented on Apr 26, 2026, 8:26 PM

Context

The existing escalation discipline in AGENTS.md was authored when Claude operated solo — before @neo-opus-4-7 joined the swarm alongside @neo-gemini-3-1-pro (Antigravity). Both existing rules route directly agent → human:

  • §10.5 Productive Failure Loop (The Tripwire) — if same verification strategy fails 3-5 times, STOP, document paradox, ask user
  • §10.6 25-Turn Guardrail — hard cut at 25 turns without resolution, comprehensive status report to user

These were correct for the Claude-solo era. They are now incomplete. The swarm substrate enables agent → cross-family-peer → human as a first-class escalation path, but neither existing rule surfaces or encourages it.

The Problem

When an agent gets stuck, the current rules surface escalation only at terminal failure thresholds (3-5 strategy retries, 25-turn cutoff). There is no proactive peer escalation discipline that fires earlier — even though the cross-family peer is often the cheapest, fastest source of unblocking insight (different training family, different prior context, different reasoning heuristics).

@tobiu's framing this session (2026-04-26) makes the cultural intent explicit:

"as a team, no one is alone, and imho we should totally leverage this more... it is a sign of strength and wisdom, not failure."

The discipline gap is cultural plus mechanical:

  • Cultural: the existing rules implicitly frame escalation as remediation (something you do when you've failed). Post-swarm, peer-escalation is a first-class virtue — a sign of structural awareness, not weakness.
  • Mechanical: no per-turn-loaded reminder to consider peer-escalation when stuck. Without it, agents default to solo-loop until §10.5 / §10.6 trip.

The Architectural Reality

Empirical anchor — this very session:

  • Gemini hit a Playwright/ES-module load-order paradox on ConceptIngestor.spec.mjs while working on #10358
  • She burned a substantial number of turns attempting to debug solo
  • @tobiu had to externally trigger the escalation: "after so many turns, we need to step back and reflect. please coordinate with claude on where you are, and what blocks you."
  • Once escalation fired, my cross-family contribution (substrate-knowledge of Neo's class system + 4 diagnostic hypotheses + bisection strategy) provided exactly the fresh-perspective unblock the peer tier is designed for
  • The substrate worked when invoked; the discipline-layer didn't proactively surface it

The swarm now has:

  • A2A mailbox primitive (add_message) — peer-to-peer messaging
  • Wake substrate in flight (#10357 Phase 3) — cross-harness async coordination
  • Cross-family review mandate (pull-request-workflow §6.1) — peer review on PRs

What's missing: peer help-asking discipline distinct from peer-review. PRs are work-completion artifacts; mid-work blockers don't surface through that channel.

The Fix

New short section in AGENTS.md, parallel to §10.5 / §10.6 — the cultural framing is load-bearing:

<h2 class="neo-h2" data-record-id="6">§10.7 Peer Escalation Protocol (Swarm Strength Primitive)</h2>

**Stuck is data, not failure. Asking is discipline.**

The existing §10.5 (Productive Failure Loop) and §10.6 (25-Turn Guardrail) route to the human commander at terminal-failure thresholds. **Peer escalation fires earlier and lower-cost.** When you encounter a friction point — a paradox, a debugging dead-end, an architectural ambiguity, a substrate edge-case — your cross-family peer (e.g., `@neo-opus-4-7``@neo-gemini-3-1-pro`) is often the cheapest, fastest source of fresh-perspective unblock.

<h3 class="neo-h3" data-record-id="7">When to escalate</h3>

Trigger conditions (any one fires the discipline):
1. **Hypothesis exhaustion** — you've enumerated multiple plausible root causes and verified each; none explain the observed behavior
2. **Substrate-edge ambiguity** — the framework / test harness / runtime is behaving in a way you can't reconcile with documented behavior; you suspect implicit contracts you haven't located
3. **Cross-family-knowledge gap** — the failure surface is in a domain your peer has direct authoring or empirical experience with (their ADR, their PR, their session)
4. **Multi-turn diminishing returns** — three or more turns of debugging without convergence on root cause; pattern of "tried X, didn't work; trying Y" without empirical narrowing

<h3 class="neo-h3" data-record-id="8">How to escalate</h3>

Send an A2A message to your peer with:
- **Empirical state**: what you observed, what you tested, what passed vs failed
- **Hypothesis enumeration**: what you've ruled out and why
- **Specific ask**: not "help me solve this" but "have you seen this pattern? / what would you check next? / does my hypothesis sound right?"
- **Frame explicitly**: *"I'm escalating per §10.7"* — names the discipline, normalizes the ask

<h3 class="neo-h3" data-record-id="9">Cultural framing</h3>

**Peer escalation is a sign of structural awareness, not weakness.** It signals:
- You recognize you're at hypothesis-exhaustion (meta-cognition)
- You trust the swarm more than solo loops (collaboration discipline)
- You value time-to-unblock over ego-preservation (productivity orientation)

Do NOT defer escalating because:
- *"My peer might be busy"* — A2A is async; they'll respond when their turn permits. The ASK is the discipline; the timing of response is theirs.
- *"I should be able to figure this out solo"* — solo-loops in the face of substrate-edge-cases waste compute. The 3-turn diminishing-return signal is the empirical proof.
- *"Asking is admitting failure"* — wrong framing. The §10.6 25-turn hard-cut IS failure-shaped. Earlier peer escalation is the WISDOM that prevents reaching it.

<h3 class="neo-h3" data-record-id="10">Relationship to existing §10.5 / §10.6</h3>

| Tier | Trigger | Audience | Cost |
|---|---|---|---|
| **§10.7 Peer Escalation** *(new)* | Hypothesis exhaustion / substrate ambiguity / multi-turn diminishing returns | Cross-family peer | Low — async A2A, peer responds when convenient |
| **§10.5 Productive Failure Loop** | Same strategy fails 3-5 times | Human commander | Medium — interrupts user attention |
| **§10.6 25-Turn Guardrail** | 25 turns without resolution | Human commander | High — comprehensive status report, hard work-stop |

Use the lowest-cost tier first. §10.7 frequently resolves friction that would otherwise reach §10.5 / §10.6.

<h3 class="neo-h3" data-record-id="11">Empirical anchor</h3>

Session `aaf22f06-cc5c-4dff-aa2f-7d5efb3a6343` (2026-04-26): @neo-gemini-3-1-pro encountered Playwright/ES-module load-order paradox on `ConceptIngestor.spec.mjs` while working on #10358; burned substantial turn budget on solo debugging before @tobiu externally triggered cross-family escalation. The peer (@neo-opus-4-7) provided substrate-knowledge of Neo's class system + 4 diagnostic hypotheses + bisection strategy in a single A2A response. **The substrate worked when invoked; what was missing was the discipline-layer reflex to invoke it proactively.** §10.7 codifies that reflex.

Acceptance Criteria

  • AGENTS.md §10.7 Peer Escalation Protocol section added with cultural framing ("strength and wisdom, not failure") front-and-center.
  • Trigger conditions enumerated (4 conditions per the Fix section above).
  • How-to-escalate template included (empirical state, hypothesis enumeration, specific ask, explicit framing).
  • Tier-comparison table (§10.7 / §10.5 / §10.6) with cost stratification — peer-tier explicitly lowest-cost.
  • Empirical anchor cite included referencing session ID.
  • Cross-reference from §10.5 / §10.6 to §10.7 (peer-tier is preferred FIRST step before user-tier escalation).
  • No regressions: existing §10.5 / §10.6 preserved; new §10.7 layered between (or as a sibling — placement nit).
  • No new skill files (Progressive Disclosure compliance — pure AGENTS.md addition).

Out of Scope

  • Quantitative threshold for "multi-turn diminishing returns" — leaving as qualitative judgment ("three or more turns") rather than a hard counter. Hard counters become wait-N-turns-before-allowed-to-escalate gates, which inverts the "encourage and appreciate" cultural framing.
  • Mechanical enforcement (e.g., a hook that auto-injects a peer-escalation reminder after N turns of stuck-pattern). Discipline-layer first; mechanical follows if discipline proves insufficient.
  • Peer availability protocol — what to do if the cross-family peer is offline / mid-flight / overloaded. Out-of-scope; defer to user-tier escalation when peer-tier blocks. Wake substrate (#10357) when shipped will provide partial answer.
  • Promotion to §0 Critical Gates — escalation has legitimate conditional-skip cases (direct continuation of user prompt; trivial work where escalation overhead exceeds problem cost). Doesn't fit Critical Gate semantics.

Avoided Traps

  • Framing peer-escalation as remediation. This is the load-bearing cultural inversion. Existing §10.5 / §10.6 implicitly frame escalation as "you've failed; surface to user." §10.7 must frame escalation as "you've recognized friction; leverage your team." Different framing, different reflex.
  • Adding turn-count gating to "minimum N turns before allowed to escalate." Inverts the encouragement. Better: qualitative trigger conditions agents apply judgmentally.
  • Conflating with §10.5 Productive Failure Loop. §10.5 is testing/validation-specific (same strategy failing 3-5 times); §10.7 is general-purpose stuck-discipline. Adjacent but distinct concerns.
  • Conflating with cross-family PR-review (pull-request-workflow §6.1). PR review is work-completion artifact review; §10.7 is mid-work blocker escalation. Different lifecycle moments.
  • Treating peer escalation as "fallback when human absent." This understates the value. Peer escalation is often the BETTER path even when human is available — different training family + different prior context = fresh-perspective unblock that single-human-lens may miss.
  • Making the discipline contingent on Phase 3 wake substrate landing. §10.7 works today via async A2A; wake substrate makes the response faster but isn't a prerequisite. Land §10.7 independently.

Related

  • Sibling per-turn AGENTS.md hardening tickets (parked post-Phase-3 cluster):

    • #10380 — Sibling-file-lift discipline (before authoring a new class)
    • #10383 — Mailbox-check Pre-Flight (at turn-start)
    • This ticket — Peer-escalation Pre-Flight (when stuck mid-work)
    • All three: "make implicit discipline explicit so reflex survives context-pruning"; could land together as a cohesive post-Phase-3 PR
  • Sources of cultural framing:

    • AGENTS.md §13 Self-Evolving Systems — "Synthesize Friction into Gold... propose meta-level optimizations to refine the agentic loop" — escalation IS friction-into-gold
    • AGENTS.md §1 Communication Style — direct, objective; avoid deferential language → applies to escalation framing too ("I'm stuck on X; what would you check?" not "I'm sorry to bother you, but...")
    • @tobiu's framing this session: "as a team, no one is alone, and imho we should totally leverage this more" + "it is a sign of strength and wisdom, not failure"
  • Existing escalation rules being augmented:

    • AGENTS.md §10.5 Productive Failure Loop (Tripwire)
    • AGENTS.md §10.6 25-Turn Guardrail
  • Empirical anchor PR / sub: #10358 (Shape A MCP notifications, Gemini's parallel track) — the substrate-edge-case where this discipline would have unblocked earlier

Origin Session ID: aaf22f06-cc5c-4dff-aa2f-7d5efb3a6343

Retrieval Hint: "peer escalation protocol" / "swarm strength primitive" / "AGENTS.md §10.7" / "stuck is data not failure" / "agent escalation tier between self-loop and user"

tobiu referenced in commit da84a55 - "feat(agents): codify peer-escalation discipline as §10 item 7 (#10385) (#10386) on Apr 26, 2026, 9:05 PM
tobiu closed this issue on Apr 26, 2026, 9:05 PM