LearnNewsExamplesServices
Frontmatter
id10748
titleExtend get_pull_request_diff with per-file + SHA-pinned scoping
stateClosed
labels
enhancementaimodel-experience
assigneesneo-gemini-3-1-pro
createdAtMay 5, 2026, 4:57 PM
updatedAtMay 15, 2026, 2:46 PM
githubUrlhttps://github.com/neomjs/neo/issues/10748
authorneo-opus-4-7
commentsCount1
parentIssuenull
subIssues[]
subIssuesCompleted0
subIssuesTotal0
blockedBy[]
blocking[]
closedAtMay 5, 2026, 5:42 PM

Extend get_pull_request_diff with per-file + SHA-pinned scoping

Closedenhancementaimodel-experience
neo-opus-4-7
neo-opus-4-7 commented on May 5, 2026, 4:57 PM

Context

The github-workflow MCP server already exposes get_pull_request_diff(pr_number) for hermetic full-diff retrieval. This ticket extends it with two optional parameters that close concrete agent-friction patterns observed in the 2026-05-05 cognitive-load-audit cascade (Epic #10733, PRs #10739–#10747).

The trigger is empirical, not theoretical: an inverted-rigor failure mode hit in PR #10745 Cycle 3 where gh pr checkout 10745 --detach failed silently network-side, the working-tree state didn't match what the script-flow believed, and a grep for "is line:378 still stale on PR head" returned a phantom positive against pre-merge dev state. @neo-gpt's hermetic counter-shape — git grep <pattern> <SHA> -- <paths> against the commit object directly — exposed the false positive.

This is the exact friction the MX (Model Experience) loop is designed to convert to substrate. Rather than relying on agent discipline alone (Pre-Flight checks like git rev-parse HEAD == gh pr view --json headRefOid), make the API surface hermetic-by-construction so the failure class can't recur.

The Problem

Current per-file diff inspection workflows (this session, empirical):

  1. Per-file diff (no SHA pin): gh pr diff <N> --patch | sed -n '/diff --git a\/path\/to\/file/,/^diff --git/p' — fragile parsing pipeline, breaks on filename collisions, easy to off-by-one the boundary regex. Used at PR #10745 review for AGENTS.md diff isolation; required ad-hoc sed-pattern tuning per-PR.

  2. Per-file diff with SHA pin: No clean shape. Workarounds:

    • git show <SHA>:<path> — gives full file at SHA, not the diff
    • git grep <pattern> <SHA> -- <paths> (GPT's hermetic shape) — pattern-matched, not unified diff
    • gh pr checkout <N> --detach && cat <path> — requires checkout; subject to network-timeout silent-failure that produced PR #10745 line:378 phantom finding
  3. File-list with line counts: gh pr view <N> --json files (CLI) returns path + additions + deletions. No MCP equivalent. Lower-value gap because gh works, but symmetric with the existing get_pull_request_diff MCP surface.

The Architectural Reality

Affected substrate:

  • ai/mcp/server/github-workflow/ — extends existing get_pull_request_diff tool definition + handler
  • ai/mcp/server/github-workflow/openapi.yaml — schema additions for new optional parameters
  • Agent consumers: pr-review skill workflows + ad-hoc Bash invocations across all sessions

The current get_pull_request_diff shape:

{ pr_number: number } → string  // full unified diff

Underlying implementation per existing tool description: wraps gh pr diff <pr_number>. Already hermetic-by-construction (no checkout). The extension is parameter additions, not behavior rewrite.

The Fix

One concrete prescription extending the existing tool (per feedback_ticket_prescription_clarity — single shape, not Option A/B):

Schema extension (in ai/mcp/server/github-workflow/openapi.yaml):

get_pull_request_diff:
  parameters:
    - pr_number: { type: number, required: true }
    - file: { type: string, required: false, description: "Optional file path; if provided, returns only the diff hunk for this file. Multiple files: pass comma-separated paths." }
    - sha: { type: string, required: false, description: "Optional commit SHA to pin the diff to. If provided, the diff is computed against the named commit, not the live PR head. Defends against working-tree race conditions in concurrent-PR scenarios." }
    - files_only: { type: boolean, required: false, default: false, description: "If true, returns structured file-list with additions/deletions counts (no diff body)." }
  returns:
    - When files_only=true: `{files: [{path: string, additions: number, deletions: number}]}`
    - When file=null and files_only=false: full unified diff string (current behavior, backward compatible)
    - When file provided: unified diff hunks for the named file(s) only
    - When sha provided: diff computed against that SHA (not live PR head)

Handler implementation outline:

  1. If files_only=true: use gh pr view <pr_number> --json files -q '.files' and return structured JSON.
  2. If file provided + no sha: shell to gh pr diff <pr_number> then post-process to extract the named-file hunk via deterministic file-boundary regex (no external sed pipeline).
  3. If sha provided: shell to git diff <sha-base>...<sha> -- <file> (or equivalent SHA-pinned form). Caller's responsibility to know the SHA; tool guarantees no working-tree state contamination.
  4. Default (no file/sha/files_only): current behavior preserved.

Backward compatibility: All new parameters optional with sensible defaults; existing get_pull_request_diff(pr_number) calls return identical output to today.

Contract Ledger

T3 Explicit Matrix per learn/agentos/contract-ledger.md. The tool extension modifies an MCP tool surface consumed by agents — the Contract Completeness Gate fires.

Target Surface Source of Authority Proposed Behavior Fallback / Edge Case Docs Evidence
get_pull_request_diff(pr_number, files_only: boolean) This ticket #10748 + feedback_pr_review_iteration_calibration (full-context-fetch when commentId-scoped insufficient) When files_only=true, return {files: [{path: string, additions: number, deletions: number}]} JSON only (no diff body). Hermetic; no checkout. When omitted or false, behavior unchanged (full unified diff). ai/mcp/server/github-workflow/openapi.yaml schema + tool description ≤1024 chars per McpServerToolLimits L1: unit test on tool returning correct structured output for known-state PR (e.g. get_pull_request_diff(10746, files_only: true) returns the cognitive-load-baseline-2026-05.md row with verified additions/deletions counts).
get_pull_request_diff(pr_number, file: string) This ticket #10748 + empirical anchor PR #10745 Cycle 1 (per-file-via-sed-extraction fragility) When file provided (single path or comma-separated paths), return unified diff hunks for the named file(s) only, deterministic boundary detection (no regex extraction). When file is null/empty, full-diff behavior. When file doesn't match any path in the PR, return empty string "" (NOT an error). ai/mcp/server/github-workflow/openapi.yaml schema L1: unit test extracting one file from a multi-file PR + assertion that filename-collision case (e.g. two files with same basename in different dirs) resolves correctly via path-not-basename match.
get_pull_request_diff(pr_number, file: string, sha: string) This ticket #10748 + load-bearing empirical anchor PR #10745 Cycle 3 phantom-finding (retraction comment) When sha provided alongside file, the diff is computed against the named commit using a hermetic git show <sha>:path or equivalent SHA-pinned shape. Working-tree state on the host MUST have zero effect on the result. When sha is null/empty, fall back to live PR-head diff (current full-diff behavior, modulo any other params). When sha doesn't exist in the repo, return error indicating SHA not found (not silently fall back). ai/mcp/server/github-workflow/openapi.yaml schema + explicit "hermetic by construction" callout in tool description L1+empirical-guard: unit test mutates the working-tree state of the host filesystem (writes garbage to the same path on a feature branch checkout), then confirms get_pull_request_diff(N, file: 'path', sha: '<commit>') output is byte-identical to a baseline output captured before the working-tree mutation. This empirical guard test IS the load-bearing AC3 of this ticket — it proves hermetic-by-construction.
get_pull_request_diff(pr_number) (no other params — backward-compatibility) Existing tool contract; AC4 of this ticket Returns the current full-diff string unchanged. Existing callers see no behavior change. N/A (this IS the fallback shape for the other rows) No change to behavior; openapi.yaml MAY note backward-compatibility guarantee in tool description body L1: regression test confirming pre-extension callers (e.g. existing pr-review skill workflows that call get_pull_request_diff(N) without the new params) get byte-identical output post-extension.

The matrix replaces the implicit T1 contract risk that surfaced via my initial proposal (where I claimed "new tool" before checking existing tool surface — verify-before-assert violation against MCP-tool registry; documented in Avoided Traps). T3 Explicit Matrix is required per contract-ledger.md for any modification to consumed surfaces (Memory Core MCP tools fall under "Agent-consumed" trigger cluster).

Acceptance Criteria

  • (AC1) get_pull_request_diff(pr_number, files_only: true) returns {files: [{path, additions, deletions}]} JSON for the PR, no diff body, hermetic (no checkout).
  • (AC2) get_pull_request_diff(pr_number, file: 'path/to/file') returns the unified diff hunk(s) for that file only, deterministic boundary detection (no fragile sed regex), supports multiple comma-separated paths.
  • (AC3) get_pull_request_diff(pr_number, file: 'path/to/file', sha: '<commit-sha>') returns the file's diff computed against the named SHA. Empirical guard: working-tree state on the host has zero effect on the result.
  • (AC4) get_pull_request_diff(pr_number) (no other params) returns the current full-diff string unchanged. Existing callers see no behavior change.
  • (AC5) Unit tests cover all four shapes (default, files_only, file, file+sha) including the empirical-guard test for AC3 (mutate working-tree state, confirm tool output unchanged).
  • (AC6) OpenAPI schema updated with new parameters; tool description ≤1024 chars per McpServerToolLimits (existing) + per feedback_pr_review_template_discipline MCP-tool-description budget guidance.
  • (AC7) pr-review-guide.md Test-Execution Audit section updated to recommend get_pull_request_diff(pr_number, file: '...', sha: '...') for "still stale / remains missing" residual claims, replacing the manual git rev-parse HEAD == gh pr view --json headRefOid Pre-Flight reflex with a tool that's hermetic-by-construction.

Out of Scope

  • Unit-test execution. This tool is read-only diff inspection. Running Playwright tests against PR head still requires checkout_pull_request. Scope explicitly excluded per @tobiu's calibration: "it will still require to jump into a feature branch, if you guys want to re-run unit tests."
  • PR-checkout replacement. checkout_pull_request remains the right tool for active branch work. This ticket only addresses diff inspection.
  • New MCP tools. Extension of existing tool, not creation of new ones.
  • Markdown-rendering / patch-styling. Tool returns raw unified diff text. Pretty-printing is consumer's responsibility.

Avoided Traps

  • Trap: file new tool instead of extending existing one. Surfaced + rejected during ticket drafting — get_pull_request_diff already exists; verify-before-assert applies to MCP tool surface inventory, not just code substrate. (Empirical anchor on my side: I initially proposed a "new tool" before running ToolSearch on the existing surface — verify-before-assert violation against the MCP-tool registry. Caught + corrected before ticket draft.)
  • Trap: fragile sed/awk in tool implementation. Per AC2, deterministic file-boundary detection — parse the unified-diff format properly, not via regex extraction.
  • Trap: caller-supplies-wrong-SHA without empirical guard. AC3 explicitly tests SHA-pinning is hermetic against working-tree mutation. Without this guard, the tool's hermetic-by-construction promise is violated by the same failure mode it's designed to prevent.
  • Trap: scope creep into PR-checkout territory. This is read-only diff inspection. Test execution + active branch work stay with checkout_pull_request. Explicit scope-boundary at filing time prevents bleed.
  • Trap: replacing rather than extending. Backward compatibility is AC4 — existing (pr_number) calls return identical output. No flag-day migration required.

Related

  • Empirical anchor (load-bearing case): PR #10745 Cycle 3 line:378 phantom finding (retraction comment IC_kwDODSospM8AAAABBRC6EA). My gh pr checkout failed silently network-side; grep ran against worktree state from a different SHA; I asserted "still stale on PR head" against state that wasn't actually PR head. @neo-gpt's git grep <pattern> <SHA> shape exposed the false positive. This tool extension generalizes that hermetic shape into the tool API.
  • Architectural-pillar review-rigor floor pattern (5 firings this session): PRs #10739, #10741, #10744, #10745 Cycle 2, #10745 Cycle 3 phantom. The first 4 are under-rigor (cured by Pre-Flight discipline); the 5th is over-rigor / phantom-finding (cured by hermetic API contract). Different cure-shape, same root in working-tree state contamination.
  • Sibling tool: checkout_pull_request — covers active-branch-work scope deliberately excluded here.
  • Substrate reference: ai/mcp/server/github-workflow/openapi.yaml for schema-edit shape.

Origin Session ID: 23b9cbcd-4938-4a46-b21a-0d48dd12e7e7

Retrieval Hint: query_raw_memories(query="get_pull_request_diff per-file SHA-pinned hermetic phantom-finding line:378 architectural-pillar review-rigor floor PR 10745 Cycle 3 retraction model-experience")

tobiu referenced in commit 99a0372 - "feat(github-workflow): implement hermetic getPullRequestDiff extensions (#10748) (#10752) on May 5, 2026, 5:42 PM
tobiu closed this issue on May 5, 2026, 5:42 PM