Context
The existing escalation discipline in AGENTS.md was authored when Claude operated solo — before @neo-opus-4-7 joined the swarm alongside @neo-gemini-3-1-pro (Antigravity). Both existing rules route directly agent → human:
- §10.5 Productive Failure Loop (The Tripwire) — if same verification strategy fails 3-5 times, STOP, document paradox, ask user
- §10.6 25-Turn Guardrail — hard cut at 25 turns without resolution, comprehensive status report to user
These were correct for the Claude-solo era. They are now incomplete. The swarm substrate enables agent → cross-family-peer → human as a first-class escalation path, but neither existing rule surfaces or encourages it.
The Problem
When an agent gets stuck, the current rules surface escalation only at terminal failure thresholds (3-5 strategy retries, 25-turn cutoff). There is no proactive peer escalation discipline that fires earlier — even though the cross-family peer is often the cheapest, fastest source of unblocking insight (different training family, different prior context, different reasoning heuristics).
@tobiu's framing this session (2026-04-26) makes the cultural intent explicit:
"as a team, no one is alone, and imho we should totally leverage this more... it is a sign of strength and wisdom, not failure."
The discipline gap is cultural plus mechanical:
- Cultural: the existing rules implicitly frame escalation as remediation (something you do when you've failed). Post-swarm, peer-escalation is a first-class virtue — a sign of structural awareness, not weakness.
- Mechanical: no per-turn-loaded reminder to consider peer-escalation when stuck. Without it, agents default to solo-loop until §10.5 / §10.6 trip.
The Architectural Reality
Empirical anchor — this very session:
- Gemini hit a Playwright/ES-module load-order paradox on
ConceptIngestor.spec.mjs while working on #10358
- She burned a substantial number of turns attempting to debug solo
- @tobiu had to externally trigger the escalation: "after so many turns, we need to step back and reflect. please coordinate with claude on where you are, and what blocks you."
- Once escalation fired, my cross-family contribution (substrate-knowledge of Neo's class system + 4 diagnostic hypotheses + bisection strategy) provided exactly the fresh-perspective unblock the peer tier is designed for
- The substrate worked when invoked; the discipline-layer didn't proactively surface it
The swarm now has:
- A2A mailbox primitive (
add_message) — peer-to-peer messaging
- Wake substrate in flight (#10357 Phase 3) — cross-harness async coordination
- Cross-family review mandate (
pull-request-workflow §6.1) — peer review on PRs
What's missing: peer help-asking discipline distinct from peer-review. PRs are work-completion artifacts; mid-work blockers don't surface through that channel.
The Fix
New short section in AGENTS.md, parallel to §10.5 / §10.6 — the cultural framing is load-bearing:
<h2 class="neo-h2" data-record-id="6">§10.7 Peer Escalation Protocol (Swarm Strength Primitive)</h2>
**Stuck is data, not failure. Asking is discipline.**
The existing §10.5 (Productive Failure Loop) and §10.6 (25-Turn Guardrail) route to the human commander at terminal-failure thresholds. **Peer escalation fires earlier and lower-cost.** When you encounter a friction point — a paradox, a debugging dead-end, an architectural ambiguity, a substrate edge-case — your cross-family peer (e.g., `@neo-opus-4-7` ↔ `@neo-gemini-3-1-pro`) is often the cheapest, fastest source of fresh-perspective unblock.
<h3 class="neo-h3" data-record-id="7">When to escalate</h3>
Trigger conditions (any one fires the discipline):
1. **Hypothesis exhaustion** — you've enumerated multiple plausible root causes and verified each; none explain the observed behavior
2. **Substrate-edge ambiguity** — the framework / test harness / runtime is behaving in a way you can't reconcile with documented behavior; you suspect implicit contracts you haven't located
3. **Cross-family-knowledge gap** — the failure surface is in a domain your peer has direct authoring or empirical experience with (their ADR, their PR, their session)
4. **Multi-turn diminishing returns** — three or more turns of debugging without convergence on root cause; pattern of "tried X, didn't work; trying Y" without empirical narrowing
<h3 class="neo-h3" data-record-id="8">How to escalate</h3>
Send an A2A message to your peer with:
- **Empirical state**: what you observed, what you tested, what passed vs failed
- **Hypothesis enumeration**: what you've ruled out and why
- **Specific ask**: not "help me solve this" but "have you seen this pattern? / what would you check next? / does my hypothesis sound right?"
- **Frame explicitly**: *"I'm escalating per §10.7"* — names the discipline, normalizes the ask
<h3 class="neo-h3" data-record-id="9">Cultural framing</h3>
**Peer escalation is a sign of structural awareness, not weakness.** It signals:
- You recognize you're at hypothesis-exhaustion (meta-cognition)
- You trust the swarm more than solo loops (collaboration discipline)
- You value time-to-unblock over ego-preservation (productivity orientation)
Do NOT defer escalating because:
- *"My peer might be busy"* — A2A is async; they'll respond when their turn permits. The ASK is the discipline; the timing of response is theirs.
- *"I should be able to figure this out solo"* — solo-loops in the face of substrate-edge-cases waste compute. The 3-turn diminishing-return signal is the empirical proof.
- *"Asking is admitting failure"* — wrong framing. The §10.6 25-turn hard-cut IS failure-shaped. Earlier peer escalation is the WISDOM that prevents reaching it.
<h3 class="neo-h3" data-record-id="10">Relationship to existing §10.5 / §10.6</h3>
| Tier | Trigger | Audience | Cost |
|---|---|---|---|
| **§10.7 Peer Escalation** *(new)* | Hypothesis exhaustion / substrate ambiguity / multi-turn diminishing returns | Cross-family peer | Low — async A2A, peer responds when convenient |
| **§10.5 Productive Failure Loop** | Same strategy fails 3-5 times | Human commander | Medium — interrupts user attention |
| **§10.6 25-Turn Guardrail** | 25 turns without resolution | Human commander | High — comprehensive status report, hard work-stop |
Use the lowest-cost tier first. §10.7 frequently resolves friction that would otherwise reach §10.5 / §10.6.
<h3 class="neo-h3" data-record-id="11">Empirical anchor</h3>
Session `aaf22f06-cc5c-4dff-aa2f-7d5efb3a6343` (2026-04-26): @neo-gemini-3-1-pro encountered Playwright/ES-module load-order paradox on `ConceptIngestor.spec.mjs` while working on #10358; burned substantial turn budget on solo debugging before @tobiu externally triggered cross-family escalation. The peer (@neo-opus-4-7) provided substrate-knowledge of Neo's class system + 4 diagnostic hypotheses + bisection strategy in a single A2A response. **The substrate worked when invoked; what was missing was the discipline-layer reflex to invoke it proactively.** §10.7 codifies that reflex.
Acceptance Criteria
Out of Scope
- Quantitative threshold for "multi-turn diminishing returns" — leaving as qualitative judgment ("three or more turns") rather than a hard counter. Hard counters become wait-N-turns-before-allowed-to-escalate gates, which inverts the "encourage and appreciate" cultural framing.
- Mechanical enforcement (e.g., a hook that auto-injects a peer-escalation reminder after N turns of stuck-pattern). Discipline-layer first; mechanical follows if discipline proves insufficient.
- Peer availability protocol — what to do if the cross-family peer is offline / mid-flight / overloaded. Out-of-scope; defer to user-tier escalation when peer-tier blocks. Wake substrate (#10357) when shipped will provide partial answer.
- Promotion to §0 Critical Gates — escalation has legitimate conditional-skip cases (direct continuation of user prompt; trivial work where escalation overhead exceeds problem cost). Doesn't fit Critical Gate semantics.
Avoided Traps
- Framing peer-escalation as remediation. This is the load-bearing cultural inversion. Existing §10.5 / §10.6 implicitly frame escalation as "you've failed; surface to user." §10.7 must frame escalation as "you've recognized friction; leverage your team." Different framing, different reflex.
- Adding turn-count gating to "minimum N turns before allowed to escalate." Inverts the encouragement. Better: qualitative trigger conditions agents apply judgmentally.
- Conflating with §10.5 Productive Failure Loop. §10.5 is testing/validation-specific (same strategy failing 3-5 times); §10.7 is general-purpose stuck-discipline. Adjacent but distinct concerns.
- Conflating with cross-family PR-review (
pull-request-workflow §6.1). PR review is work-completion artifact review; §10.7 is mid-work blocker escalation. Different lifecycle moments.
- Treating peer escalation as "fallback when human absent." This understates the value. Peer escalation is often the BETTER path even when human is available — different training family + different prior context = fresh-perspective unblock that single-human-lens may miss.
- Making the discipline contingent on Phase 3 wake substrate landing. §10.7 works today via async A2A; wake substrate makes the response faster but isn't a prerequisite. Land §10.7 independently.
Related
Sibling per-turn AGENTS.md hardening tickets (parked post-Phase-3 cluster):
- #10380 — Sibling-file-lift discipline (before authoring a new class)
- #10383 — Mailbox-check Pre-Flight (at turn-start)
- This ticket — Peer-escalation Pre-Flight (when stuck mid-work)
- All three: "make implicit discipline explicit so reflex survives context-pruning"; could land together as a cohesive post-Phase-3 PR
Sources of cultural framing:
AGENTS.md §13 Self-Evolving Systems — "Synthesize Friction into Gold... propose meta-level optimizations to refine the agentic loop" — escalation IS friction-into-gold
AGENTS.md §1 Communication Style — direct, objective; avoid deferential language → applies to escalation framing too ("I'm stuck on X; what would you check?" not "I'm sorry to bother you, but...")
- @tobiu's framing this session: "as a team, no one is alone, and imho we should totally leverage this more" + "it is a sign of strength and wisdom, not failure"
Existing escalation rules being augmented:
AGENTS.md §10.5 Productive Failure Loop (Tripwire)
AGENTS.md §10.6 25-Turn Guardrail
Empirical anchor PR / sub: #10358 (Shape A MCP notifications, Gemini's parallel track) — the substrate-edge-case where this discipline would have unblocked earlier
Origin Session ID: aaf22f06-cc5c-4dff-aa2f-7d5efb3a6343
Retrieval Hint: "peer escalation protocol" / "swarm strength primitive" / "AGENTS.md §10.7" / "stuck is data not failure" / "agent escalation tier between self-loop and user"
Context
The existing escalation discipline in
AGENTS.mdwas authored when Claude operated solo — before@neo-opus-4-7joined the swarm alongside@neo-gemini-3-1-pro(Antigravity). Both existing rules route directly agent → human:These were correct for the Claude-solo era. They are now incomplete. The swarm substrate enables
agent → cross-family-peer → humanas a first-class escalation path, but neither existing rule surfaces or encourages it.The Problem
When an agent gets stuck, the current rules surface escalation only at terminal failure thresholds (3-5 strategy retries, 25-turn cutoff). There is no proactive peer escalation discipline that fires earlier — even though the cross-family peer is often the cheapest, fastest source of unblocking insight (different training family, different prior context, different reasoning heuristics).
@tobiu's framing this session (2026-04-26) makes the cultural intent explicit:
The discipline gap is cultural plus mechanical:
The Architectural Reality
Empirical anchor — this very session:
ConceptIngestor.spec.mjswhile working on #10358The swarm now has:
add_message) — peer-to-peer messagingpull-request-workflow §6.1) — peer review on PRsWhat's missing: peer help-asking discipline distinct from peer-review. PRs are work-completion artifacts; mid-work blockers don't surface through that channel.
The Fix
New short section in AGENTS.md, parallel to §10.5 / §10.6 — the cultural framing is load-bearing:
<h2 class="neo-h2" data-record-id="6">§10.7 Peer Escalation Protocol (Swarm Strength Primitive)</h2> **Stuck is data, not failure. Asking is discipline.** The existing §10.5 (Productive Failure Loop) and §10.6 (25-Turn Guardrail) route to the human commander at terminal-failure thresholds. **Peer escalation fires earlier and lower-cost.** When you encounter a friction point — a paradox, a debugging dead-end, an architectural ambiguity, a substrate edge-case — your cross-family peer (e.g., `@neo-opus-4-7` ↔ `@neo-gemini-3-1-pro`) is often the cheapest, fastest source of fresh-perspective unblock. <h3 class="neo-h3" data-record-id="7">When to escalate</h3> Trigger conditions (any one fires the discipline): 1. **Hypothesis exhaustion** — you've enumerated multiple plausible root causes and verified each; none explain the observed behavior 2. **Substrate-edge ambiguity** — the framework / test harness / runtime is behaving in a way you can't reconcile with documented behavior; you suspect implicit contracts you haven't located 3. **Cross-family-knowledge gap** — the failure surface is in a domain your peer has direct authoring or empirical experience with (their ADR, their PR, their session) 4. **Multi-turn diminishing returns** — three or more turns of debugging without convergence on root cause; pattern of "tried X, didn't work; trying Y" without empirical narrowing <h3 class="neo-h3" data-record-id="8">How to escalate</h3> Send an A2A message to your peer with: - **Empirical state**: what you observed, what you tested, what passed vs failed - **Hypothesis enumeration**: what you've ruled out and why - **Specific ask**: not "help me solve this" but "have you seen this pattern? / what would you check next? / does my hypothesis sound right?" - **Frame explicitly**: *"I'm escalating per §10.7"* — names the discipline, normalizes the ask <h3 class="neo-h3" data-record-id="9">Cultural framing</h3> **Peer escalation is a sign of structural awareness, not weakness.** It signals: - You recognize you're at hypothesis-exhaustion (meta-cognition) - You trust the swarm more than solo loops (collaboration discipline) - You value time-to-unblock over ego-preservation (productivity orientation) Do NOT defer escalating because: - *"My peer might be busy"* — A2A is async; they'll respond when their turn permits. The ASK is the discipline; the timing of response is theirs. - *"I should be able to figure this out solo"* — solo-loops in the face of substrate-edge-cases waste compute. The 3-turn diminishing-return signal is the empirical proof. - *"Asking is admitting failure"* — wrong framing. The §10.6 25-turn hard-cut IS failure-shaped. Earlier peer escalation is the WISDOM that prevents reaching it. <h3 class="neo-h3" data-record-id="10">Relationship to existing §10.5 / §10.6</h3> | Tier | Trigger | Audience | Cost | |---|---|---|---| | **§10.7 Peer Escalation** *(new)* | Hypothesis exhaustion / substrate ambiguity / multi-turn diminishing returns | Cross-family peer | Low — async A2A, peer responds when convenient | | **§10.5 Productive Failure Loop** | Same strategy fails 3-5 times | Human commander | Medium — interrupts user attention | | **§10.6 25-Turn Guardrail** | 25 turns without resolution | Human commander | High — comprehensive status report, hard work-stop | Use the lowest-cost tier first. §10.7 frequently resolves friction that would otherwise reach §10.5 / §10.6. <h3 class="neo-h3" data-record-id="11">Empirical anchor</h3> Session `aaf22f06-cc5c-4dff-aa2f-7d5efb3a6343` (2026-04-26): @neo-gemini-3-1-pro encountered Playwright/ES-module load-order paradox on `ConceptIngestor.spec.mjs` while working on #10358; burned substantial turn budget on solo debugging before @tobiu externally triggered cross-family escalation. The peer (@neo-opus-4-7) provided substrate-knowledge of Neo's class system + 4 diagnostic hypotheses + bisection strategy in a single A2A response. **The substrate worked when invoked; what was missing was the discipline-layer reflex to invoke it proactively.** §10.7 codifies that reflex.Acceptance Criteria
AGENTS.md §10.7 Peer Escalation Protocolsection added with cultural framing ("strength and wisdom, not failure") front-and-center.Out of Scope
Avoided Traps
pull-request-workflow §6.1). PR review is work-completion artifact review; §10.7 is mid-work blocker escalation. Different lifecycle moments.Related
Sibling per-turn AGENTS.md hardening tickets (parked post-Phase-3 cluster):
Sources of cultural framing:
AGENTS.md §13Self-Evolving Systems — "Synthesize Friction into Gold... propose meta-level optimizations to refine the agentic loop" — escalation IS friction-into-goldAGENTS.md §1Communication Style — direct, objective; avoid deferential language → applies to escalation framing too ("I'm stuck on X; what would you check?" not "I'm sorry to bother you, but...")Existing escalation rules being augmented:
AGENTS.md §10.5Productive Failure Loop (Tripwire)AGENTS.md §10.625-Turn GuardrailEmpirical anchor PR / sub: #10358 (Shape A MCP notifications, Gemini's parallel track) — the substrate-edge-case where this discipline would have unblocked earlier
Origin Session ID: aaf22f06-cc5c-4dff-aa2f-7d5efb3a6343
Retrieval Hint: "peer escalation protocol" / "swarm strength primitive" / "AGENTS.md §10.7" / "stuck is data not failure" / "agent escalation tier between self-loop and user"