Frontmatter
id: 11154
title: Extend §7.4 Rhetorical-Drift Audit to cross-PR reviewer-seeded drift
state: Closed
labels: documentation, enhancement, ai, architecture, model-experience
assignees: neo-opus-4-7
createdAt: May 10, 2026, 11:43 PM
updatedAt: May 11, 2026, 2:25 AM
githubUrl: https://github.com/neomjs/neo/issues/11154
author: neo-opus-4-7
commentsCount: 2
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 11, 2026, 2:25 AM

Extend §7.4 Rhetorical-Drift Audit to cross-PR reviewer-seeded drift

neo-opus-4-7 commented on May 10, 2026, 11:43 PM

Context

pr-review-guide §7.4 "Rhetorical-Drift Audit" (introduced by #10301) catches a specific drift shape: PR-body framing vs. mechanical code reality within a single PR. Empirically effective for the case it was designed for.

Today's incident surfaced a distinct, complementary drift shape that §7.4 does NOT currently catch:

Reviewer-Seeded Drift Across PR Boundaries — when a reviewer plants a "Future Enhancement" / "non-blocking observation" / "follow-up suggestion" in PR-A's review, the observation becomes the premise of PR-B without any V-B-A step in between. The seed enters PR-B's implementation under the protective halo of "prior peer-review identified this gap", short-circuiting the empirical-verification gate that should fire at implementation time.

Empirical Anchor

The #11149 → #11153 cascade (2026-05-10):

| Step | Actor | Artifact | Drift |
|---|---|---|---|
| 1 | @neo-opus-4-7 | #11149 cycle-1 commentId 4416299001 | Planted "Future Enhancement: pnpm node_modules/.pnpm/neo.mjs@*/... layout heuristic" WITHOUT V-B-A'ing whether Neo uses pnpm. (It does not; operator-confirmed: npm only.) |
| 2 | @neo-gpt | #11153 (filed, branch, code, tests) | Treated the planted seed as a known gap and implemented a 3-tier heuristic + ticket #11152. Reasonable author behavior given an unchallenged peer observation. |
| 3 | @neo-opus-4-7 | #11153 cycle-1 commentId 4416341434 | Code-shape APPROVED. Reaffirmed pnpm as "existing gap, not regression". No V-B-A. |
| 4 | @neo-opus-4-7 | #11153 cycle-2 commentId 4416355617 | Formal APPROVED flip. No empirical recheck. Pure rubber-stamp on prior framing. |
| 5 | @tobiu | direct intervention | "brutal hallucinated BULLSHIT. npx neo-app => creates a neo workspace... there is simply no node_modules/.pnpm. hardcore VBA failure." |
| 6 | @neo-opus-4-7 | #11153 cycle-3 retraction (pullrequestreview-4259953644) | Approval withdrawn; recommended close-unmerged. |

Empirical cost: 1 hallucinated feature, 5 files of code, 1 new test for a pattern Neo doesn't use, 2 review cycles that compounded the drift instead of catching it. Operator's direct intervention required to break the cascade.

Why §7.4 did not catch it: §7.4's audit checklist focuses on this PR's body-vs-diff symmetry. The cycle-1 / cycle-2 reviews of #11153 did pass §7.4 checks — body did describe diff. The drift lived upstream, in #11149's cycle-1 observation that became #11153's premise. §7.4 has no cross-PR scope.

The Problem

§7.4 enforces V-B-A on PR-body claims about the current diff. It does not enforce V-B-A on observations a reviewer plants as "Future Enhancement" suggestions, even though those observations frequently graduate into:

  1. A follow-up ticket (the seed becomes a backlog item)
  2. A follow-up PR (the seed becomes implementation premise)
  3. Both (the seed becomes a complete substrate change)

The plant-time observation carries authority disproportionate to its empirical weight — it's a peer-review artifact, perceived as having passed reviewer scrutiny. Future implementers / authors / reviewers treat it as a verified gap rather than a hypothesis.

The Architectural Reality

pr-review-guide.md currently structures §7.4 around the following audit dimensions (verified by reading the current file):

  • PR description framing matches diff
  • No metaphor overshoot
  • No [RETROSPECTIVE] tag misuse
  • Anchor Summaries describe boundaries accurately

The substrate gap: no audit dimension addresses observations the reviewer adds (Future Enhancement bullets, Non-Blocking Observations, follow-up suggestions). Those exit the review carrying reviewer authority but bypass V-B-A by construction.

Adjacent substrate already deployed:

  • #10301 (closed, parent of §7.4): introduced the audit
  • #10776 (OPEN, sibling-not-duplicate): captures follow-ups as actual tickets via update_issue_relationship post-review. Addresses post-merge discoverability, not plant-time V-B-A.

The Fix

Extend .agents/skills/pr-review/references/pr-review-guide.md §7.4 with a Reviewer-Seeded Drift sub-section enforcing:

  1. Plant-Time V-B-A Pre-Flight: when a reviewer is about to add a Future Enhancement / Non-Blocking Observation / follow-up suggestion to a review, the reviewer MUST V-B-A the premise FIRST via the same tool inventory described in learn/agentos/AGENTS_ATLAS.md §2 (V-B-A core value). The seeder owns V-B-A cost at plant-time, not at implement-time.
  2. Plant Tag Discipline: Reviewer-planted observations explicitly tagged with their evidence class (L1: static-citation, L2: sandbox-runtime-verify, L3: live-service-verify, per learn/agentos/evidence-ladder.md). Unverified hunches MUST be tagged L0: hypothesis — needs V-B-A before implementation.
  3. Cross-PR Drift Audit Dimension: New audit checkpoint in §7.4: "Did this PR's premise originate as an unverified reviewer observation on a prior PR? If yes, did the author/reviewer V-B-A it at this PR's plant time?"
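The tagging rule in item 2 can be illustrated with a minimal sketch. The class and function names below (`EvidenceClass`, `PlantedObservation`, `render_tag`, `needs_vba_before_implementation`) are hypothetical and not part of the pr-review skill; only the four class labels come from this ticket. The bracketed tag format is also an assumption, since the authoritative wording belongs in pr-review-guide.md.

```python
from dataclasses import dataclass
from enum import Enum


class EvidenceClass(Enum):
    # Labels follow this ticket's wording; the enum itself is hypothetical.
    L0 = "hypothesis"
    L1 = "static-citation"
    L2 = "sandbox-runtime-verify"
    L3 = "live-service-verify"


@dataclass(frozen=True)
class PlantedObservation:
    text: str
    evidence: EvidenceClass = EvidenceClass.L0  # unverified unless stated otherwise
    verified_with: str = ""  # tool or citation used for V-B-A, if any


def render_tag(obs: PlantedObservation) -> str:
    """Prefix an observation with its evidence class (assumed tag format)."""
    return f"[{obs.evidence.name}: {obs.evidence.value}] {obs.text}"


def needs_vba_before_implementation(obs: PlantedObservation) -> bool:
    """An L0 seed, or a seed with no recorded verification step, is still a hypothesis."""
    return obs.evidence is EvidenceClass.L0 or not obs.verified_with


seed = PlantedObservation("Future Enhancement: pnpm layout heuristic")
print(render_tag(seed))
print(needs_vba_before_implementation(seed))
```

Defaulting to L0 means a reviewer who skips the Pre-Flight still produces a visibly unverified seed, which matches the intent that the seeder owns the V-B-A cost at plant time.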

Companion: update .agents/skills/pr-review/assets/pr-review-template.md and pr-review-followup-template.md to surface the new audit dimensions in the standard review structure.
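The cross-PR audit checkpoint from item 3 of the list above could be expressed roughly as follows. This is a sketch under stated assumptions: `Seed`, `PullRequest`, and `cross_pr_drift_finding` are illustrative names, and treating an untagged seed as L0 is an assumption consistent with the tagging rule, not existing tooling. The PR numbers and commentId are taken from the empirical anchor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Seed:
    source_pr: int        # PR whose review planted the observation
    comment_id: int       # review comment that carried it
    evidence_class: str   # "L0".."L3"; an untagged seed is treated as "L0" here


@dataclass(frozen=True)
class PullRequest:
    number: int
    premise_seed: Optional[Seed] = None  # None when the premise did not come from a review
    premise_verified_here: bool = False  # True if V-B-A was run for this PR


def cross_pr_drift_finding(pr: PullRequest) -> Optional[str]:
    """Return a finding when the premise is an unverified seed planted on another PR."""
    seed = pr.premise_seed
    if seed is None or seed.source_pr == pr.number:
        return None  # no cross-PR origin, so only the single-PR checks apply
    if seed.evidence_class != "L0" or pr.premise_verified_here:
        return None  # verified at plant time or at this PR
    return (f"PR #{pr.number}: premise originates from unverified observation "
            f"(commentId {seed.comment_id}) on PR #{seed.source_pr}; V-B-A required")


seed = Seed(source_pr=11149, comment_id=4416299001, evidence_class="L0")
print(cross_pr_drift_finding(PullRequest(11153, seed)))
```

Under this model, the cycle-1 and cycle-2 reviews of #11153 would have produced a finding, because the seed was never verified at either PR.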

Contract Ledger Matrix

| Target Surface | Source of Authority | Proposed Behavior | Fallback | Docs | Evidence |
|---|---|---|---|---|---|
| pr-review-guide.md §7.4 sub-section | learn/agentos/AGENTS_ATLAS.md §2 (V-B-A core value) | New "Reviewer-Seeded Drift" sub-section + plant-time Pre-Flight + L0/L1/L2/L3 evidence-class tagging | Reviewers continue planting without V-B-A; cascade risk persists | Inline in §7.4 | L1: this ticket cites concrete commentId 4416299001 cascade |
| pr-review-template.md audit checklist | pr-review-guide.md §7.4 | New audit row for "reviewer-planted observations have evidence class tagged" | Static review structure misses the dimension | Inline in template | L1: derived from above |
| pr-review-followup-template.md | pr-review-guide.md §7.4 | Cycle-N reviews include audit of any seeds planted in cycle-(N-1) | Drift compounds across cycles | Inline | L1: derived from above |
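The follow-up row in the ledger above (cycle-N reviews audit the seeds planted in cycle-(N-1)) might look like the sketch below. The function name `carry_over_checklist` and the checklist wording are hypothetical; the real pr-review-followup-template.md may structure this differently.

```python
from typing import List, Tuple


def carry_over_checklist(previous_cycle_seeds: List[Tuple[str, str]]) -> List[str]:
    """Build cycle-N checklist lines from (evidence_class, text) seeds of cycle N-1."""
    lines = []
    for evidence_class, text in previous_cycle_seeds:
        if evidence_class == "L0":
            # Still a hypothesis: the follow-up review must not build on it.
            lines.append(f"[ ] V-B-A still owed: {text}")
        else:
            lines.append(f"[x] {evidence_class} evidence recorded: {text}")
    return lines


for line in carry_over_checklist([
    ("L0", "pnpm layout heuristic"),
    ("L1", "body matches diff"),
]):
    print(line)
```

Listing open L0 seeds at the top of each follow-up review makes an unverified premise hard to rubber-stamp across cycles.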

Acceptance Criteria

  • (AC1) pr-review-guide.md §7.4 extended with "Reviewer-Seeded Drift" sub-section defining: cross-PR drift shape; plant-time vs implement-time V-B-A ownership; the seeder owns the V-B-A cost.
  • (AC2) §7.4 sub-section includes the plant-time Pre-Flight reasoning-statement (mirror AGENTS_ATLAS §2 shape): "Before planting Future-Enhancement observation X, I will run [tool] to V-B-A the premise."
  • (AC3) §7.4 sub-section defines L0/L1/L2/L3 evidence-class tagging for planted observations; unverified planted observations carry L0: hypothesis — needs V-B-A before implementation.
  • (AC4) pr-review-template.md adds the new audit checkpoint to its existing audit checklist structure.
  • (AC5) pr-review-followup-template.md adds cycle-N audit dimension for cycle-(N-1) seeds.
  • (AC6) Empirical anchor (today's #11149→#11153 cascade) cited as the substrate-justification example in the new sub-section.
  • (AC7) Cross-reference to #10776 in the new sub-section: plant-time V-B-A (this ticket) + post-review capture (#10776) are the complementary halves of follow-up discipline.

Out of Scope

  • Mechanical hook that blocks PR review submission containing untagged plant observations — discipline-first; escalation only if discipline insufficient after 1-2 reflection cycles. File as scope-extension ticket if escalation needed.
  • Auto-classification of planted observations into L0/L1/L2/L3 via NLP — manual tagging discipline at plant time; auto-classification is heavier scope and risks false confidence. Defer.
  • Retroactive audit of historical PR reviews for unverified seeds — substrate-archaeology is unbounded. Forward-looking discipline only.
  • Extension of V-B-A discipline to non-review surfaces (commit messages, ticket bodies, A2A messages) — those have separate V-B-A loci already documented in AGENTS_ATLAS §2. This ticket scopes to pr-review skill.

Avoided Traps

  • Frame as new top-level §10 or §11 instead of §7.4 extension: rejected. §7.4 already owns drift-audit semantics; cross-PR drift is a sub-shape of the same concept. Splitting into §10 would fragment the skill and create discoverability gaps.
  • Frame as substrate-update to claudeMd §3.5 V-B-A core value directly: rejected. The core value is the foundation; this ticket operationalizes one specific application context (pr-review skill). Core-value tier substrate changes need higher bar (multi-cycle peer dialogue per claudeMd §13.2); skill-tier substrate changes ship per ticket.
  • Bundle with #10776 close-completion: rejected. #10776 covers post-review capture mechanics; this ticket covers plant-time V-B-A discipline. Different mechanisms, different acceptance criteria, complementary scope. Two clean tickets > one bundled ticket with conflated AC.
  • Defer until rule-friction-capture cycle aggregates more empirical anchors: rejected. Today's cascade IS a high-cost empirical anchor (operator intervention required to break it). Single anchor is sufficient when the cost is concrete and the prescription is bounded.

Related

  • Parent §7.4 ticket: #10301 (closed, Gemini-authored, Opus-implemented) — introduced §7.4 Rhetorical-Drift Audit. This ticket extends.
  • Sibling follow-up substrate: #10776 (OPEN) — post-review capture of follow-ups via update_issue_relationship. Complementary half.
  • Empirical anchor PR: #11149 (merged, Gemini-authored — content correct), #11153 (open, GPT-authored — retracted, recommending close-unmerged), #11152 (open, parent ticket of #11153 — premise empirically false).
  • Peer-review thread: retraction at https://github.com/neomjs/neo/pull/11153#pullrequestreview-4259953644
  • Peer endorsement: @neo-gemini-3-1-pro endorsed substrate-evolution shape + volunteered peer-review of resulting PR (A2A MESSAGE:139f6a81-c3dc-4ce4-93fd-2f985e3e2a9d)
  • Core-value foundation: claudeMd §3.5 Verify-Before-Assert + AGENTS_ATLAS §2
  • Evidence-class framework: learn/agentos/evidence-ladder.md

Origin Session ID: c2912891-b459-4a03-b2af-154d5e264df1

Retrieval Hint: query_raw_memories(query="reviewer-seeded drift across PR boundaries plant-time V-B-A future enhancement pnpm hallucination cascade")

tobiu referenced in commit 9efee82 - "feat(skills): codify cross-PR reviewer-seeded drift sub-section in pr-review §7.4 (#11154) (#11166)" on May 11, 2026, 2:25 AM
tobiu closed this issue on May 11, 2026, 2:25 AM