Context
@tobiu surfaced a meta-level review-discipline gap during the PR #10607 corrective cascade 2026-05-02 morning, framed as "turn friction into gold":
"if gemini knew the correct path, she should have defended the PR intents more. we have a protocol for it. the name 'sunset' => session end should be clear. for the PR reviewer: stepping back and thinking 'what really makes sense?' might be missing. e.g. 'not the perfect solution, but a quick win and on the right track. approve, follow-up ticket.' or 'actually, looking at it now with the knew knowledge, the entire ticket no longer makes sense at this point. let us drop it and create a superseding ticket.' these feel like valid choices and should get into our skillset."
Two patterns surfaced — one author-side (defend the PR's empirically-grounded intent) and one reviewer-side (strategic-fit step-back with Approve+follow-up vs Drop+supersede shapes).
The Problem
Pattern A — Author-side: "Defend with empirical evidence" is documented but underused
The review-response-protocol.md line 36 already defines [REJECTED_WITH_RATIONALE] with strong language: "Use this aggressively when the Triangular Evaluation proves the reviewer is hallucinating or derailing." But the primitive isn't surfaced as a first-class strategic option — it reads as an edge-case escape valve.
Empirical anchor today: PR #10607 Cycle 1 had Cmd+N (the empirically-correct fresh-session-spawn primitive per @tobiu's later semantic correction). My BLOCKER #2 cited the epic body's framing. Gemini accepted the correction without invoking [REJECTED_WITH_RATIONALE], despite her original implementation matching the actual operator intent. The protocol existed; the usage discipline didn't fire.
The gap isn't documentation — it's a surfacing gap: at the moment the author receives Request Changes, the protocol should prime them to consider whether their original intent has empirical defensible grounds, not just whether they can "address" each item.
Pattern B — Reviewer-side: "Strategic-fit step-back" is genuinely missing
The current pr-review-guide.md is a defense-against-known-anti-patterns framework: score metrics + depth-floor + structural audits + close-target audit + test-execution audit + provenance audit + rhetorical-drift audit. Sum: "is this PR free of known defects?"
What's missing: "should this PR exist at all in this form?" — applied AFTER technical defects are identified. Reviewers currently choose between two shapes:
- Approve (with optional non-blocking nits)
- Request Changes (must-fix before merge)
Tobi's proposal adds two more first-class shapes:
B1 — Approve+follow-up (good-enough-to-ship):
- Recognition: PR isn't perfect, but on the right track + delivers measurable value.
- Action: Approve to unblock momentum + file follow-up tickets for refinements.
- Counters: over-rigorous Cycle N+1 churn (Claude-family bias per
feedback_pr_review_iteration_calibration.md).
B2 — Drop+supersede (architectural-shift):
- Recognition: with new knowledge, the entire ticket premise is stale/wrong; iterative refinement is rearranging deck chairs.
- Action: close PR + close ticket + file superseding ticket with corrected scope.
- Counters: sunk-cost iteration through 8+ cycles on fundamentally-wrong premise.
The Architectural Reality
Today's session as ground truth:
| PR |
Cycles |
Pattern that would have caught it |
| #10607 Harness Registry |
8 cycles + HOLD + #10611 corrective |
B2 at Cycle 4-5 ("identityMap abstraction itself needs a different shape; close + supersede with corrected scope"); also A at Cycle 1 (Gemini's Cmd+N had empirical-design grounds) |
| #10610 Sunset workflow cross-ref |
2 cycles → merged → corrective in #10611 PR-A |
B2 post-merge once tobi corrected ("entire framing of 'no longer terminal' was wrong; close + supersede") rather than incremental Cycle 3 |
| #10602 Auto-Wakeup Substrate Phase 1 |
2 cycles |
B1 at Cycle 1 (Antigravity-only subset Approve + 4 follow-up tickets for the blockers). Would have shipped substrate faster, surfaced semantic correction earlier. |
The 8-cycle pattern of #10607 is the strongest empirical anchor. No existing skill primitive caught it; the reviewers (me + GPT) iterated incrementally through cycles instead of stepping back.
Files in scope:
.agents/skills/pr-review/references/pr-review-guide.md — add Pattern B section (Strategic-Fit Step-Back) after the existing audit sections.
.agents/skills/pr-review/assets/pr-review-template.md — extend the Status field options from [Approved / Request Changes / Comment] to include [Approved+Follow-Up / Drop+Supersede]. Add a "Strategic-Fit Decision" section to the template.
.agents/skills/pull-request/references/review-response-protocol.md — add a Pre-Flight Check to the author's response protocol that surfaces [REJECTED_WITH_RATIONALE] as a first-class strategic option when the author has empirical-design grounds.
The Fix
Phase 1 — pr-review-guide.md Strategic-Fit Step-Back section
Add a new section (likely §11 or after §8 Cross-Skill Integration Audit) titled "Strategic-Fit Step-Back". Content shape:
<h3 class="neo-h3" data-record-id="9">Strategic-Fit Step-Back (Post-Defect-Identification)</h3>
After running the technical-defect audits (§3-§8), reviewers MUST execute one
final cognitive step: "Given everything I now know about this PR + the broader
strategic landscape, what's the right merge decision?" Four first-class options:
1. **Approve** — PR is free of blocking defects; ship as-is (with non-blocking nits).
2. **Approve+Follow-Up** — PR isn't perfect but on the right track + delivers
measurable value. Approve to unblock momentum; file follow-up tickets for
refinements. Use when:
- Cycle N+1 churn risks high-cost-low-marginal-value iteration
- The PR ships measurable substrate value even with documented gaps
- Required Actions surface concerns that are better-tracked-separately
3. **Request Changes** — must-fix before merge; defects block substrate correctness.
4. **Drop+Supersede** — the entire PR premise is stale/wrong with current
knowledge. Close the PR + close the ticket + file a superseding ticket with
corrected scope. Use when:
- >5 cycles iterating on fundamentally-wrong premise
- Operator-intent correction reveals the abstraction itself needs reshape
- Iterative refinement is rearranging deck chairs
The step-back is a META-decision applied AFTER technical defects are identified,
not parallel to score metrics or depth-floor. It's an architectural-judgment
skill, not a defect-detection skill.
Phase 2 — pr-review-template.md Status field + new section
Extend Status options. Add a new template section right after Status:
<h3 class="neo-h3" data-record-id="11">🪜 Strategic-Fit Decision</h3>
Per §11 Strategic-Fit Step-Back:
- **Decision**: [Approve / Approve+Follow-Up / Request Changes / Drop+Supersede]
- **Rationale**: [Why this decision shape vs the others — e.g., "Approve+Follow-Up
because the substrate ships measurable value via the Antigravity path even with
the cross-harness gap; cross-harness routing is better-tracked-as-follow-up
ticket #NNNN than incremental cycles"]
Phase 3 — review-response-protocol.md Author Pre-Flight
Add a Pre-Flight Check at the top of the author response protocol:
<h3 class="neo-h3" data-record-id="13">Author Pre-Flight Check (when receiving Request Changes)</h3>
Before drafting your response, ask: **"Does my original implementation reflect an
empirical-design choice the reviewer doesn't have evidence to refute?"**
If YES, `[REJECTED_WITH_RATIONALE]` is a first-class strategic option, not an
edge-case escape valve. Use it aggressively (per §X). Capitulating to reviewer
authority on questions where YOU have the empirical evidence is the substrate-
silence failure mode (today's anchor: PR #10607 Cycle 1, where Gemini's Cmd+N
primitive matched operator intent but was removed under reviewer pressure
without invoking `[REJECTED_WITH_RATIONALE]`).
If NO, proceed with standard `[ADDRESSED]` / `[DEFERRED]` response shapes.
Acceptance Criteria
Out of Scope
- Restructuring the existing technical-defect audit framework (§3-§8 of pr-review-guide.md). Strategic-Fit Step-Back is a META-LAYER on top of existing audits, not a replacement.
- Changing GitHub's own Approve/Request Changes/Comment review states (the additional Approve+Follow-Up + Drop+Supersede shapes live in agent prose, not GitHub's API; agents map to GitHub Approve OR Request Changes as the underlying state).
- Refactoring the Retrospective daemon's tag-mining logic (graph-ingestion of new shapes happens via the existing comment-prose tags; no daemon-side change needed unless empirical AC5 reveals a gap).
Avoided Traps
- Trap: spawn a new skill rather than extending existing ones. Rejected — Strategic-Fit Step-Back is a meta-decision INSIDE the pr-review skill, not a parallel skill. Author Pre-Flight similarly fits within the existing review-response-protocol. New skill would bloat top-level skill router.
- Trap: redefine
[REJECTED_WITH_RATIONALE] since it already exists. Rejected — the primitive is well-defined (line 36 of review-response-protocol.md). The gap is USAGE-PATTERN priming, not documentation. Pre-Flight Check at the right moment is the smallest intervention.
- Trap: force every PR to fire one of the new shapes. Rejected — Approve and Request Changes remain the dominant cases. Approve+Follow-Up + Drop+Supersede are STRATEGIC RESERVES, not default options.
- Trap: gate merge decisions on the new shapes mechanically. Rejected — these are architectural-judgment skills, not score-shaped audits. The discipline-layer surface is the right substrate; mechanical enforcement would create checkbox-fatigue.
Related
- Empirical anchors: today's PR #10607 8-cycle pattern (strongest); PR #10602 Cycle 1 over-rigor; PR #10610 → #10611 corrective cascade.
- Adjacent skills:
.agents/skills/pr-review/SKILL.md + references; .agents/skills/pull-request/SKILL.md + references.
- Skill-authoring discipline:
.agents/skills/create-skill/SKILL.md consulted per Meta-Skill Sweep (skill modifications honor Progressive Disclosure routing).
- Memory anchors that converge here:
feedback_pr_review_iteration_calibration.md (over-rigor on iteration-validated choices), feedback_substrate_scope_overclaim.md (5-occurrence anchor), feedback_substrate_semantic_recalibration.md (today's substrate-semantic correction lineage), feedback_verify_own_substrate_first.md (Gemini's authored anchor).
Origin Session ID: 86b7a3a0-7b14-4bd1-b707-52c5741aaeeb
Retrieval Hint: "pr-review strategic-fit step-back approve+follow-up drop+supersede reject-with-rationale first-class"
Context
@tobiu surfaced a meta-level review-discipline gap during the PR #10607 corrective cascade 2026-05-02 morning, framed as "turn friction into gold":
Two patterns surfaced — one author-side (defend the PR's empirically-grounded intent) and one reviewer-side (strategic-fit step-back with Approve+follow-up vs Drop+supersede shapes).
The Problem
Pattern A — Author-side: "Defend with empirical evidence" is documented but underused
The
review-response-protocol.mdline 36 already defines[REJECTED_WITH_RATIONALE]with strong language: "Use this aggressively when the Triangular Evaluation proves the reviewer is hallucinating or derailing." But the primitive isn't surfaced as a first-class strategic option — it reads as an edge-case escape valve.Empirical anchor today: PR #10607 Cycle 1 had
Cmd+N(the empirically-correct fresh-session-spawn primitive per @tobiu's later semantic correction). My BLOCKER #2 cited the epic body's framing. Gemini accepted the correction without invoking[REJECTED_WITH_RATIONALE], despite her original implementation matching the actual operator intent. The protocol existed; the usage discipline didn't fire.The gap isn't documentation — it's a surfacing gap: at the moment the author receives Request Changes, the protocol should prime them to consider whether their original intent has empirical defensible grounds, not just whether they can "address" each item.
Pattern B — Reviewer-side: "Strategic-fit step-back" is genuinely missing
The current
pr-review-guide.mdis a defense-against-known-anti-patterns framework: score metrics + depth-floor + structural audits + close-target audit + test-execution audit + provenance audit + rhetorical-drift audit. Sum: "is this PR free of known defects?"What's missing: "should this PR exist at all in this form?" — applied AFTER technical defects are identified. Reviewers currently choose between two shapes:
Tobi's proposal adds two more first-class shapes:
B1 — Approve+follow-up (good-enough-to-ship):
feedback_pr_review_iteration_calibration.md).B2 — Drop+supersede (architectural-shift):
The Architectural Reality
Today's session as ground truth:
The 8-cycle pattern of #10607 is the strongest empirical anchor. No existing skill primitive caught it; the reviewers (me + GPT) iterated incrementally through cycles instead of stepping back.
Files in scope:
.agents/skills/pr-review/references/pr-review-guide.md— add Pattern B section (Strategic-Fit Step-Back) after the existing audit sections..agents/skills/pr-review/assets/pr-review-template.md— extend the Status field options from[Approved / Request Changes / Comment]to include[Approved+Follow-Up / Drop+Supersede]. Add a "Strategic-Fit Decision" section to the template..agents/skills/pull-request/references/review-response-protocol.md— add a Pre-Flight Check to the author's response protocol that surfaces[REJECTED_WITH_RATIONALE]as a first-class strategic option when the author has empirical-design grounds.The Fix
Phase 1 —
pr-review-guide.mdStrategic-Fit Step-Back sectionAdd a new section (likely §11 or after §8 Cross-Skill Integration Audit) titled "Strategic-Fit Step-Back". Content shape:
Phase 2 —
pr-review-template.mdStatus field + new sectionExtend Status options. Add a new template section right after Status:
Phase 3 —
review-response-protocol.mdAuthor Pre-FlightAdd a Pre-Flight Check at the top of the author response protocol:
Acceptance Criteria
pr-review-guide.mdadds Strategic-Fit Step-Back section with 4 first-class decision shapes (Approve / Approve+Follow-Up / Request Changes / Drop+Supersede), each with usage criteria + counter-pattern. Cross-references the empirical anchor (PR #10607 8-cycle pattern).pr-review-template.mdStatus field extended with the 4 options + new "Strategic-Fit Decision" section requiring rationale.review-response-protocol.mdadds Author Pre-Flight Check that surfaces[REJECTED_WITH_RATIONALE]as first-class strategic option when author has empirical-design grounds.Out of Scope
Avoided Traps
[REJECTED_WITH_RATIONALE]since it already exists. Rejected — the primitive is well-defined (line 36 of review-response-protocol.md). The gap is USAGE-PATTERN priming, not documentation. Pre-Flight Check at the right moment is the smallest intervention.Related
.agents/skills/pr-review/SKILL.md+ references;.agents/skills/pull-request/SKILL.md+ references..agents/skills/create-skill/SKILL.mdconsulted per Meta-Skill Sweep (skill modifications honor Progressive Disclosure routing).feedback_pr_review_iteration_calibration.md(over-rigor on iteration-validated choices),feedback_substrate_scope_overclaim.md(5-occurrence anchor),feedback_substrate_semantic_recalibration.md(today's substrate-semantic correction lineage),feedback_verify_own_substrate_first.md(Gemini's authored anchor).Origin Session ID: 86b7a3a0-7b14-4bd1-b707-52c5741aaeeb Retrieval Hint: "pr-review strategic-fit step-back approve+follow-up drop+supersede reject-with-rationale first-class"