The github-workflow MCP server already exposes get_pull_request_diff(pr_number) for hermetic full-diff retrieval. This ticket extends it with two optional parameters that close concrete agent-friction patterns observed in the 2026-05-05 cognitive-load-audit cascade (Epic #10733, PRs #10739–#10747).
The trigger is empirical, not theoretical: an inverted-rigor failure mode hit in PR #10745 Cycle 3 where gh pr checkout 10745 --detach failed silently network-side, the working-tree state didn't match what the script-flow believed, and a grep for "is line:378 still stale on PR head" returned a phantom positive against pre-merge dev state. @neo-gpt's hermetic counter-shape — git grep <pattern> <SHA> -- <paths> against the commit object directly — exposed the false positive.
This is the exact friction the MX (Model Experience) loop is designed to convert to substrate. Rather than relying on agent discipline alone (Pre-Flight checks like git rev-parse HEAD == gh pr view --json headRefOid), make the API surface hermetic-by-construction so the failure class can't recur.
The Problem
Current per-file diff inspection workflows (this session, empirical):
Per-file diff (no SHA pin):gh pr diff <N> --patch | sed -n '/diff --git a\/path\/to\/file/,/^diff --git/p' — fragile parsing pipeline, breaks on filename collisions, easy to off-by-one the boundary regex. Used at PR #10745 review for AGENTS.md diff isolation; required ad-hoc sed-pattern tuning per-PR.
Per-file diff with SHA pin: No clean shape. Workarounds:
git show <SHA>:<path> — gives full file at SHA, not the diff
gh pr checkout <N> --detach && cat <path> — requires checkout; subject to network-timeout silent-failure that produced PR #10745 line:378 phantom finding
File-list with line counts:gh pr view <N> --json files (CLI) returns path + additions + deletions. No MCP equivalent. Lower-value gap because gh works, but symmetric with the existing get_pull_request_diff MCP surface.
ai/mcp/server/github-workflow/openapi.yaml — schema additions for new optional parameters
Agent consumers: pr-review skill workflows + ad-hoc Bash invocations across all sessions
The current get_pull_request_diff shape:
{ pr_number: number } → string // full unified diff
Underlying implementation per existing tool description: wraps gh pr diff <pr_number>. Already hermetic-by-construction (no checkout). The extension is parameter additions, not behavior rewrite.
The Fix
One concrete prescription extending the existing tool (per feedback_ticket_prescription_clarity — single shape, not Option A/B):
Schema extension (in ai/mcp/server/github-workflow/openapi.yaml):
get_pull_request_diff:parameters:-pr_number: { type:number, required:true }
-file: { type:string, required:false, description:"Optional file path; if provided, returns only the diff hunk for this file. Multiple files: pass comma-separated paths." }
-sha: { type:string, required:false, description:"Optional commit SHA to pin the diff to. If provided, the diff is computed against the named commit, not the live PR head. Defends against working-tree race conditions in concurrent-PR scenarios." }
-files_only: { type:boolean, required:false, default:false, description:"If true, returns structured file-list with additions/deletions counts (no diff body)." }
returns:-Whenfiles_only=true:`{files: [{path:string, additions:number, deletions:number}]}`-Whenfile=nullandfiles_only=false:fullunifieddiffstring(currentbehavior,backwardcompatible)-When file provided:unifieddiffhunksforthenamedfile(s)only-When sha provided:diffcomputedagainstthatSHA(notlivePRhead)
Handler implementation outline:
If files_only=true: use gh pr view <pr_number> --json files -q '.files' and return structured JSON.
If file provided + no sha: shell to gh pr diff <pr_number> then post-process to extract the named-file hunk via deterministic file-boundary regex (no external sed pipeline).
If sha provided: shell to git diff <sha-base>...<sha> -- <file> (or equivalent SHA-pinned form). Caller's responsibility to know the SHA; tool guarantees no working-tree state contamination.
Default (no file/sha/files_only): current behavior preserved.
Backward compatibility: All new parameters optional with sensible defaults; existing get_pull_request_diff(pr_number) calls return identical output to today.
Contract Ledger
T3 Explicit Matrix per learn/agentos/contract-ledger.md. The tool extension modifies an MCP tool surface consumed by agents — the Contract Completeness Gate fires.
This ticket #10748 + feedback_pr_review_iteration_calibration (full-context-fetch when commentId-scoped insufficient)
When files_only=true, return {files: [{path: string, additions: number, deletions: number}]} JSON only (no diff body). Hermetic; no checkout.
When omitted or false, behavior unchanged (full unified diff).
ai/mcp/server/github-workflow/openapi.yaml schema + tool description ≤1024 chars per McpServerToolLimits
L1: unit test on tool returning correct structured output for known-state PR (e.g. get_pull_request_diff(10746, files_only: true) returns the cognitive-load-baseline-2026-05.md row with verified additions/deletions counts).
When file provided (single path or comma-separated paths), return unified diff hunks for the named file(s) only, deterministic boundary detection (no regex extraction).
When file is null/empty, full-diff behavior. When file doesn't match any path in the PR, return empty string "" (NOT an error).
ai/mcp/server/github-workflow/openapi.yaml schema
L1: unit test extracting one file from a multi-file PR + assertion that filename-collision case (e.g. two files with same basename in different dirs) resolves correctly via path-not-basename match.
When sha provided alongside file, the diff is computed against the named commit using a hermetic git show <sha>:path or equivalent SHA-pinned shape. Working-tree state on the host MUST have zero effect on the result.
When sha is null/empty, fall back to live PR-head diff (current full-diff behavior, modulo any other params). When sha doesn't exist in the repo, return error indicating SHA not found (not silently fall back).
ai/mcp/server/github-workflow/openapi.yaml schema + explicit "hermetic by construction" callout in tool description
L1+empirical-guard: unit test mutates the working-tree state of the host filesystem (writes garbage to the same path on a feature branch checkout), then confirms get_pull_request_diff(N, file: 'path', sha: '<commit>') output is byte-identical to a baseline output captured before the working-tree mutation. This empirical guard test IS the load-bearing AC3 of this ticket — it proves hermetic-by-construction.
get_pull_request_diff(pr_number) (no other params — backward-compatibility)
Existing tool contract; AC4 of this ticket
Returns the current full-diff string unchanged. Existing callers see no behavior change.
N/A (this IS the fallback shape for the other rows)
No change to behavior; openapi.yaml MAY note backward-compatibility guarantee in tool description body
L1: regression test confirming pre-extension callers (e.g. existing pr-review skill workflows that call get_pull_request_diff(N) without the new params) get byte-identical output post-extension.
The matrix replaces the implicit T1 contract risk that surfaced via my initial proposal (where I claimed "new tool" before checking existing tool surface — verify-before-assert violation against MCP-tool registry; documented in Avoided Traps). T3 Explicit Matrix is required per contract-ledger.md for any modification to consumed surfaces (Memory Core MCP tools fall under "Agent-consumed" trigger cluster).
Acceptance Criteria
(AC1)get_pull_request_diff(pr_number, files_only: true) returns {files: [{path, additions, deletions}]} JSON for the PR, no diff body, hermetic (no checkout).
(AC2)get_pull_request_diff(pr_number, file: 'path/to/file') returns the unified diff hunk(s) for that file only, deterministic boundary detection (no fragile sed regex), supports multiple comma-separated paths.
(AC3)get_pull_request_diff(pr_number, file: 'path/to/file', sha: '<commit-sha>') returns the file's diff computed against the named SHA. Empirical guard: working-tree state on the host has zero effect on the result.
(AC4)get_pull_request_diff(pr_number) (no other params) returns the current full-diff string unchanged. Existing callers see no behavior change.
(AC5) Unit tests cover all four shapes (default, files_only, file, file+sha) including the empirical-guard test for AC3 (mutate working-tree state, confirm tool output unchanged).
(AC6) OpenAPI schema updated with new parameters; tool description ≤1024 chars per McpServerToolLimits (existing) + per feedback_pr_review_template_discipline MCP-tool-description budget guidance.
(AC7)pr-review-guide.md Test-Execution Audit section updated to recommend get_pull_request_diff(pr_number, file: '...', sha: '...') for "still stale / remains missing" residual claims, replacing the manual git rev-parse HEAD == gh pr view --json headRefOid Pre-Flight reflex with a tool that's hermetic-by-construction.
Out of Scope
Unit-test execution. This tool is read-only diff inspection. Running Playwright tests against PR head still requires checkout_pull_request. Scope explicitly excluded per @tobiu's calibration: "it will still require to jump into a feature branch, if you guys want to re-run unit tests."
PR-checkout replacement.checkout_pull_request remains the right tool for active branch work. This ticket only addresses diff inspection.
New MCP tools. Extension of existing tool, not creation of new ones.
Markdown-rendering / patch-styling. Tool returns raw unified diff text. Pretty-printing is consumer's responsibility.
Avoided Traps
Trap: file new tool instead of extending existing one. Surfaced + rejected during ticket drafting — get_pull_request_diff already exists; verify-before-assert applies to MCP tool surface inventory, not just code substrate. (Empirical anchor on my side: I initially proposed a "new tool" before running ToolSearch on the existing surface — verify-before-assert violation against the MCP-tool registry. Caught + corrected before ticket draft.)
Trap: fragile sed/awk in tool implementation. Per AC2, deterministic file-boundary detection — parse the unified-diff format properly, not via regex extraction.
Trap: caller-supplies-wrong-SHA without empirical guard. AC3 explicitly tests SHA-pinning is hermetic against working-tree mutation. Without this guard, the tool's hermetic-by-construction promise is violated by the same failure mode it's designed to prevent.
Trap: scope creep into PR-checkout territory. This is read-only diff inspection. Test execution + active branch work stay with checkout_pull_request. Explicit scope-boundary at filing time prevents bleed.
Trap: replacing rather than extending. Backward compatibility is AC4 — existing (pr_number) calls return identical output. No flag-day migration required.
Related
Empirical anchor (load-bearing case): PR #10745 Cycle 3 line:378 phantom finding (retraction comment IC_kwDODSospM8AAAABBRC6EA). My gh pr checkout failed silently network-side; grep ran against worktree state from a different SHA; I asserted "still stale on PR head" against state that wasn't actually PR head. @neo-gpt's git grep <pattern> <SHA> shape exposed the false positive. This tool extension generalizes that hermetic shape into the tool API.
Architectural-pillar review-rigor floor pattern (5 firings this session): PRs #10739, #10741, #10744, #10745 Cycle 2, #10745 Cycle 3 phantom. The first 4 are under-rigor (cured by Pre-Flight discipline); the 5th is over-rigor / phantom-finding (cured by hermetic API contract). Different cure-shape, same root in working-tree state contamination.
Context
The
github-workflowMCP server already exposesget_pull_request_diff(pr_number)for hermetic full-diff retrieval. This ticket extends it with two optional parameters that close concrete agent-friction patterns observed in the 2026-05-05 cognitive-load-audit cascade (Epic #10733, PRs #10739–#10747).The trigger is empirical, not theoretical: an inverted-rigor failure mode hit in PR #10745 Cycle 3 where
gh pr checkout 10745 --detachfailed silently network-side, the working-tree state didn't match what the script-flow believed, and agrepfor "is line:378 still stale on PR head" returned a phantom positive against pre-merge dev state. @neo-gpt's hermetic counter-shape —git grep <pattern> <SHA> -- <paths>against the commit object directly — exposed the false positive.This is the exact friction the MX (Model Experience) loop is designed to convert to substrate. Rather than relying on agent discipline alone (Pre-Flight checks like
git rev-parse HEAD == gh pr view --json headRefOid), make the API surface hermetic-by-construction so the failure class can't recur.The Problem
Current per-file diff inspection workflows (this session, empirical):
Per-file diff (no SHA pin):
gh pr diff <N> --patch | sed -n '/diff --git a\/path\/to\/file/,/^diff --git/p'— fragile parsing pipeline, breaks on filename collisions, easy to off-by-one the boundary regex. Used at PR #10745 review for AGENTS.md diff isolation; required ad-hoc sed-pattern tuning per-PR.Per-file diff with SHA pin: No clean shape. Workarounds:
git show <SHA>:<path>— gives full file at SHA, not the diffgit grep <pattern> <SHA> -- <paths>(GPT's hermetic shape) — pattern-matched, not unified diffgh pr checkout <N> --detach && cat <path>— requires checkout; subject to network-timeout silent-failure that produced PR #10745 line:378 phantom findingFile-list with line counts:
gh pr view <N> --json files(CLI) returnspath + additions + deletions. No MCP equivalent. Lower-value gap becauseghworks, but symmetric with the existingget_pull_request_diffMCP surface.The Architectural Reality
Affected substrate:
ai/mcp/server/github-workflow/— extends existingget_pull_request_difftool definition + handlerai/mcp/server/github-workflow/openapi.yaml— schema additions for new optional parameterspr-reviewskill workflows + ad-hoc Bash invocations across all sessionsThe current
get_pull_request_diffshape:{ pr_number: number } → string // full unified diffUnderlying implementation per existing tool description: wraps
gh pr diff <pr_number>. Already hermetic-by-construction (no checkout). The extension is parameter additions, not behavior rewrite.The Fix
One concrete prescription extending the existing tool (per
feedback_ticket_prescription_clarity— single shape, not Option A/B):Schema extension (in
ai/mcp/server/github-workflow/openapi.yaml):get_pull_request_diff: parameters: - pr_number: { type: number, required: true } - file: { type: string, required: false, description: "Optional file path; if provided, returns only the diff hunk for this file. Multiple files: pass comma-separated paths." } - sha: { type: string, required: false, description: "Optional commit SHA to pin the diff to. If provided, the diff is computed against the named commit, not the live PR head. Defends against working-tree race conditions in concurrent-PR scenarios." } - files_only: { type: boolean, required: false, default: false, description: "If true, returns structured file-list with additions/deletions counts (no diff body)." } returns: - When files_only=true: `{files: [{path: string, additions: number, deletions: number}]}` - When file=null and files_only=false: full unified diff string (current behavior, backward compatible) - When file provided: unified diff hunks for the named file(s) only - When sha provided: diff computed against that SHA (not live PR head)Handler implementation outline:
files_only=true: usegh pr view <pr_number> --json files -q '.files'and return structured JSON.fileprovided + nosha: shell togh pr diff <pr_number>then post-process to extract the named-file hunk via deterministic file-boundary regex (no external sed pipeline).shaprovided: shell togit diff <sha-base>...<sha> -- <file>(or equivalent SHA-pinned form). Caller's responsibility to know the SHA; tool guarantees no working-tree state contamination.Backward compatibility: All new parameters optional with sensible defaults; existing
get_pull_request_diff(pr_number)calls return identical output to today.Contract Ledger
T3 Explicit Matrix per
learn/agentos/contract-ledger.md. The tool extension modifies an MCP tool surface consumed by agents — the Contract Completeness Gate fires.get_pull_request_diff(pr_number, files_only: boolean)feedback_pr_review_iteration_calibration(full-context-fetch when commentId-scoped insufficient)files_only=true, return{files: [{path: string, additions: number, deletions: number}]}JSON only (no diff body). Hermetic; no checkout.false, behavior unchanged (full unified diff).ai/mcp/server/github-workflow/openapi.yamlschema + tool description ≤1024 chars perMcpServerToolLimitsget_pull_request_diff(10746, files_only: true)returns thecognitive-load-baseline-2026-05.mdrow with verified additions/deletions counts).get_pull_request_diff(pr_number, file: string)fileprovided (single path or comma-separated paths), return unified diff hunks for the named file(s) only, deterministic boundary detection (no regex extraction).fileis null/empty, full-diff behavior. Whenfiledoesn't match any path in the PR, return empty string""(NOT an error).ai/mcp/server/github-workflow/openapi.yamlschemaget_pull_request_diff(pr_number, file: string, sha: string)shaprovided alongsidefile, the diff is computed against the named commit using a hermeticgit show <sha>:pathor equivalent SHA-pinned shape. Working-tree state on the host MUST have zero effect on the result.shais null/empty, fall back to live PR-head diff (current full-diff behavior, modulo any other params). Whenshadoesn't exist in the repo, return error indicating SHA not found (not silently fall back).ai/mcp/server/github-workflow/openapi.yamlschema + explicit "hermetic by construction" callout in tool descriptionget_pull_request_diff(N, file: 'path', sha: '<commit>')output is byte-identical to a baseline output captured before the working-tree mutation. This empirical guard test IS the load-bearing AC3 of this ticket — it proves hermetic-by-construction.get_pull_request_diff(pr_number)(no other params — backward-compatibility)get_pull_request_diff(N)without the new params) get byte-identical output post-extension.The matrix replaces the implicit T1 contract risk that surfaced via my initial proposal (where I claimed "new tool" before checking existing tool surface — verify-before-assert violation against MCP-tool registry; documented in Avoided Traps). T3 Explicit Matrix is required per
contract-ledger.mdfor any modification to consumed surfaces (Memory Core MCP tools fall under "Agent-consumed" trigger cluster).Acceptance Criteria
get_pull_request_diff(pr_number, files_only: true)returns{files: [{path, additions, deletions}]}JSON for the PR, no diff body, hermetic (no checkout).get_pull_request_diff(pr_number, file: 'path/to/file')returns the unified diff hunk(s) for that file only, deterministic boundary detection (no fragile sed regex), supports multiple comma-separated paths.get_pull_request_diff(pr_number, file: 'path/to/file', sha: '<commit-sha>')returns the file's diff computed against the named SHA. Empirical guard: working-tree state on the host has zero effect on the result.get_pull_request_diff(pr_number)(no other params) returns the current full-diff string unchanged. Existing callers see no behavior change.McpServerToolLimits(existing) + perfeedback_pr_review_template_disciplineMCP-tool-description budget guidance.pr-review-guide.mdTest-Execution Audit section updated to recommendget_pull_request_diff(pr_number, file: '...', sha: '...')for "still stale / remains missing" residual claims, replacing the manualgit rev-parse HEAD == gh pr view --json headRefOidPre-Flight reflex with a tool that's hermetic-by-construction.Out of Scope
checkout_pull_request. Scope explicitly excluded per @tobiu's calibration: "it will still require to jump into a feature branch, if you guys want to re-run unit tests."checkout_pull_requestremains the right tool for active branch work. This ticket only addresses diff inspection.Avoided Traps
get_pull_request_diffalready exists; verify-before-assert applies to MCP tool surface inventory, not just code substrate. (Empirical anchor on my side: I initially proposed a "new tool" before running ToolSearch on the existing surface — verify-before-assert violation against the MCP-tool registry. Caught + corrected before ticket draft.)checkout_pull_request. Explicit scope-boundary at filing time prevents bleed.(pr_number)calls return identical output. No flag-day migration required.Related
gh pr checkoutfailed silently network-side; grep ran against worktree state from a different SHA; I asserted "still stale on PR head" against state that wasn't actually PR head. @neo-gpt'sgit grep <pattern> <SHA>shape exposed the false positive. This tool extension generalizes that hermetic shape into the tool API.checkout_pull_request— covers active-branch-work scope deliberately excluded here.ai/mcp/server/github-workflow/openapi.yamlfor schema-edit shape.Origin Session ID: 23b9cbcd-4938-4a46-b21a-0d48dd12e7e7
Retrieval Hint:
query_raw_memories(query="get_pull_request_diff per-file SHA-pinned hermetic phantom-finding line:378 architectural-pillar review-rigor floor PR 10745 Cycle 3 retraction model-experience")