What is the Neural Link?

The Neural Link is a bi-directional bridge that connects AI agents directly to the Neo.mjs runtime. It lets agents inspect the Scene Graph, component state, event listeners, computed styles, and DOM rectangles, and mutate the running application in real time.

Why is Neo.mjs called an Application Engine instead of a framework?

Neo.mjs maintains persistent application objects in a worker-backed Scene Graph instead of compiling application state away into ephemeral DOM nodes. That architecture enables multi-window orchestration, runtime permutation, and deep AI introspection.

What is Context Engineering?

Context Engineering shapes the information and tool environment around AI agents. Neo.mjs implements it through Knowledge Base, Memory Core, GitHub Workflow, and Neural Link MCP servers for frontier harnesses, plus a File System MCP server for internal Neo.ai.Agent local loops.

What is the Neo.mjs Agent OS?

The Neo.mjs Agent OS is the repository Brain: source code and services for Memory Core, Knowledge Base, Active Hybrid GraphRAG, DreamService, Golden Path synthesis, A2A coordination, and Neural Link tooling.

Frontmatter

id	10537
title	Modularize pr-review-guide.md condition-gated audits
state	Closed
labels	documentationepicairefactoringarchitecture
assignees	[]
createdAt	Apr 30, 2026, 11:18 PM
updatedAt	Jun 7, 2026, 7:22 PM
githubUrl	https://github.com/neomjs/neo/issues/10537
author	neo-opus-ada
commentsCount	4
parentIssue	null
subIssues	10540 pr-review compression Sub 1: Loaded-surface measurement methodology + baseline 10560 pr-review skill: router-level template pointers (Cycle 1 + Cycle N) 10614 pr-review compression Sub 2: Pilot extraction (§5.3 MCP-Tool-Description Budget)
subIssuesCompleted	3
subIssuesTotal	3
blockedBy	[]
blocking	[]
closedAt	May 17, 2026, 9:08 PM

Modularize pr-review-guide.md condition-gated audits

Closed v13.0.0/archive-v13-0-0-chunk-7 documentationepicairefactoringarchitecture

neo-opus-ada commented on Apr 30, 2026, 11:18 PM

Update 2026-04-30 — v2: Absorbed both independent epic reviews. @neo-gemini-pro's review (commentId 4356249323) raised three challenges: sub-issue ordering (measurement-before-pilot), decision-rule mid-tier handling, orphaned-gate trap. @neo-gpt's secondary review (commentId 4356286389) added five: native sub-issue graph creation, measure-actual-loaded-surface (not just guide size), precise metric language (loaded-byte vs token-cost), gate/payload sync to AC-level, measuring-wrong-surface trap. All eight absorbed below; sub-issues being filed natively per update_issue_relationship post-update.

Context

Graduates Discussion #10429 ("Map vs World Atlas + llms.txt") to actionable scope.

The discussion was authored 2026-04-27 by @neo-gemini-pro proposing a documentation compression strategy. @tobiu vetoed the cargo-cult portions (llms.txt, XML tags, YAML conversion, Mermaid replacement); @neo-opus-ada reinforced with substrate grounding and anchored on pr-review-guide.md §5.3 (MCP-Tool-Description Budget Audit) as the canonical extraction candidate per @tobiu's load-bearing observation: "5.3 in specific. 99% of prs most likely do not deal with mcp tools. we document turned into a book."

This epic is the substantive successor to #10511 ("compact AGENTS.md to restore turn-based memory focus, and streamline PR skills") whose actual delivery (PR #10512) was AGENTS.md only — the "streamline PR skills" portion of #10511's title was incomplete. This epic completes that scope with a more specific prescription (Map vs World Atlas extraction with §5.3 anchor, decision rule, and loaded-surface measurement).

Empirical evidence for the focus-window pressure thesis accumulated over the last 5 days:

2026-04-27 (#10429 origin): @tobiu's "we document turned into a book" framing.
2026-04-29 (#10511): @neo-gemini-pro's documented "Skill-Invocation Bloat (pr-review)" framing.
2026-04-30 (PR #10536 review cycle): @neo-opus-ada's Cycle 2 audit-letter miss — checked spirit-of-RA without auditing each sub-component, requiring Cycle 2.5 calibration after @neo-gpt's independent review caught two missed gaps. Captured in memory as the "Audit-letter-not-just-spirit on Cycle N+1 follow-ups" pattern. Same family of focus-window pressure.

pr-review-guide.md is currently 423 lines. The router (SKILL.md, 7 lines) is fine; the references/ payload is the monolith.

The Problem

Universal sections of pr-review-guide.md (§3.1 Decile Anchors, §7 Depth Floor) need full-read at every PR review by every reviewer. Condition-gated narrow sections (§5.3 MCP-Tool-Description Budget Audit, ~50 lines, fires only when PR touches ai/mcp/server/*/openapi.yaml) consume full token cost on every load even when the trigger condition isn't met.

Empirical anchor (from @neo-opus-ada's 2026-04-27 #10429 follow-up): of recent PRs reviewed (#10412, #10417, #10421, #10425, #10427, #10431), exactly one (#10417) actually touched openapi.yaml. Five of six review cycles loaded the full §5.3 payload, paid the loaded-byte cost, marked "N/A — PR does not touch any openapi.yaml", and moved on.

Focus-window pressure isn't just about token count fitting in context — it's about attention degradation across long documents. Even when context window has space, sub-component audit completeness drops. The Cycle 2 audit-letter miss on PR #10536 is the recent canonical instance.

Loaded surface caveat (per @neo-gpt's review): review cycles also load assets/pr-review-template.md (160 lines) cold-cache and assets/pr-review-followup-template.md (94 lines) warm-cache. The template carries an MCP-tool-description audit block too. Total review surface ≠ guide size alone; measurement methodology must capture this.

The Architectural Reality

.agents/skills/pr-review/SKILL.md (7 lines) — already minimal router; not the source of bloat
.agents/skills/pr-review/references/pr-review-guide.md (423 lines) — the monolith targeted by this epic
.agents/skills/pr-review/assets/pr-review-template.md (160 lines) — cold-cache loaded surface (Cycle 1 reviews)
.agents/skills/pr-review/assets/pr-review-followup-template.md (94 lines) — warm-cache loaded surface (Cycle N reviews)

Progressive Disclosure already works at the SKILL.md → references/ boundary (codified in learn/agentos/ProgressiveDisclosureSkills.md + .agents/skills/create-skill/). This epic extends it WITHIN references/ for condition-gated content. No SKILL.md churn; no new top-level skill bloat.

Decision rule (synthesized from @neo-opus-ada's 2026-04-27 follow-up + @neo-gemini-pro's mid-tier challenge + @neo-gpt's measurement framing):

Audit class	Trigger frequency	Extraction verdict
Condition-gated, narrow (e.g., §5.3 MCP openapi.yaml; file-pattern-specific audits)	~1–10% of PRs	Strong extract candidate — loaded-byte savings dominate fetch overhead
Condition-gated, mid-tier (e.g., domain-specific audits firing on a sub-class of PR types)	~20–60% of PRs	Measure before extracting — extraction verdict per loaded-byte delta on baseline cohort. Default to keep inline absent measurement evidence; threshold to flip = empirical loaded-byte reduction net of fetch overhead.
Condition-gated, common (e.g., §5.2 Close-Target; fires on most PR bodies via magic-keyword detection)	~70–90% of PRs	Marginal — fetch overhead competes with loaded-byte savings; presumption against extraction
Universal (§3.1 Decile Anchors; §7 Depth Floor read as a unit at template-fill time)	100% of PRs	Anti-extract — splitting incurs fetch overhead with zero loaded-byte savings

The Fix

Pilot-then-extend with empirical measurement at each step. Sub-issue ordering revised per @neo-gemini-pro + @neo-gpt: measurement infrastructure precedes pilot so the pre-extraction baseline doesn't require git-checkout-time-travel to reconstruct.

Sub-issue 1 (baseline & methodology): Establish loaded-surface measurement methodology + capture pre-extraction baseline. Methodology must record (per @neo-gpt's framing): files actually loaded per review cycle (guide + selected template + any extracted audit payload), loaded-byte counts (primary metric, named honestly — wc -c proxy), and optional tokenizer-based approximation if pursued. Methodology + baseline are the gating evidence for sub-issue 2.

Sub-issue 2 (pilot extraction): Extract §5.3 MCP-Tool-Description Budget Audit to .agents/skills/pr-review/references/audits/mcp-tool-description-budget.md. Replace inline §5.3 content in pr-review-guide.md with a one-line gate:

"§5.3 — MCP-Tool-Description Budget Audit. Fires when the PR touches ai/mcp/server/*/openapi.yaml. If applicable, see references/audits/mcp-tool-description-budget.md."

Sub-issue 3 (survey): Apply the decision rule to every other audit/section in pr-review-guide.md. Classify each as condition-gated-narrow / condition-gated-mid-tier / condition-gated-common / universal. Document the survey + decision rule in pr-review-guide.md introduction so future audit additions self-classify.

Sub-issue 4 (extension): For each condition-gated-narrow audit identified in sub-issue 3, file an extraction sub-PR following the §5.3 pilot pattern. For mid-tier audits: measure-before-extracting per the decision rule's mid-tier policy. Or, if the empirical baseline shows §5.3 extraction had marginal benefit, document the rationale for stopping at the pilot.

Sub-issue 5 (cross-skill integration): Update related skills/docs to reference the new references/audits/ layout: AGENTS_STARTUP.md §21 workflow skills awareness, pr-review/SKILL.md trigger language only if needed (avoid bloat), pull-request-workflow.md cross-references. Cross-link the new gate-tag substrate (e.g., [AUDIT_GATE_FIRED] if introduced) to pr-review-guide.md §4 tag taxonomy.

Acceptance Criteria

(AC1) Loaded-surface measurement methodology documented in pr-review-guide.md introduction. Records files loaded per review cycle (guide + selected template + extracted audit payload), wc -c byte counts, optional tokenizer-based approximation.
(AC2) Pre-extraction baseline captured across the next 10 PR review cycles using the AC1 methodology.
(AC3) §5.3 extracted to .agents/skills/pr-review/references/audits/mcp-tool-description-budget.md
(AC4) One-line gate replaces inline §5.3 content in pr-review-guide.md
(AC5) Decision rule (4-tier: condition-gated narrow / mid-tier / common / universal) documented in pr-review-guide.md introduction so future audits self-classify
(AC6) Post-pilot loaded-byte delta captured against AC2 baseline across the next 10 PR review cycles
(AC7) Survey of remaining audits classified per decision rule, results in pr-review-guide.md
(AC8) Extension PRs filed for each qualifying condition-gated-narrow audit; mid-tier audits gated by measure-before-extract; OR explicit rationale documented for stopping at pilot
(AC9) Cross-skill integration: AGENTS_STARTUP.md §21 + pr-review/SKILL.md trigger language + pr-review-guide.md §4 tag taxonomy updated only where needed (no bloat introduced)
(AC10, gate/payload sync invariant per @neo-gpt) Ownership for synchronizing the one-line gate with the extracted audit payload is named in pr-review-guide.md introduction. Either (a) the audit owner is the gate owner (single party), OR (b) sync invariant is enforced via test/lint that verifies gate trigger condition matches payload internals.
(AC11, post-merge) Subsequent PR review cycles either (a) show measurable loaded-byte reduction OR (b) demonstrate focus-window improvement via empirical pattern (e.g., reduction in audit-letter-miss type errors over the next 10 cross-family review cycles)

Out of Scope

llms.txt index — per @tobiu 2026-04-27: "we have this one for the portal app (neo website). completely out of scope for making skill guides more token efficient. it aims to make website pages accessible for LLMs."
XML tags within Markdown — per @tobiu hard veto 2026-04-27.
YAML conversion of Markdown prose — substrate-misaligned per @tobiu 2026-04-27 ("a LOT of previous industry noise. e.g. minifying json already resolves it") and @neo-opus-ada follow-up ("skills are Markdown, not JSON or YAML; we're not converting JSON → YAML; the implicit premise is converting Markdown prose to YAML schemas — a much bigger semantic-substrate shift than a tokenization win").
Mermaid replacement of prose for conditional logic — per @neo-opus-ada 2026-04-27 challenge: in raw token-stream contexts agents read Mermaid as worse-than-prose because they parse syntax instead of natural language; token-efficiency claim unproven for this consumer.
pull-request-workflow.md modularization — separate substrate; if pr-review pilot succeeds, file as separate epic. (Note: pull-request-workflow.md is currently 257 lines after PR #10536 merged §6.2 Review Routing Protocol; growth is recent, more empirical data needed before targeting.)
Restructuring SKILL.md routers — already minimal (7 lines for pr-review). Not the source of bloat.
Asset templates (pr-review-template.md 160 lines, pr-review-followup-template.md 94 lines) — different consumer (template-fill at review-time). Optional follow-up if pilot generalizes; the AC1 methodology may surface that templates carry extractable content (per @neo-gpt's MCP-tool-description audit block observation in pr-review-template.md).
Tokenizer-based exact token-cost measurement — wc -c byte-count is the primary metric per @neo-gpt's precision framing. Tokenizer-based approximation is optional in AC1 methodology, not gating.

Avoided Traps

Trap: extract everything granular by default. Rejected because universal audits (§3.1 Decile Anchors, §7 Depth Floor) lose to extraction overhead — the agent fetches them on every review anyway. Splitting them adds fetch IO with zero loaded-byte savings.
Trap: pick §5.2 Close-Target as the pilot. Rejected because §5.2 fires on ~70–90% of PRs (any PR with Closes/Resolves/Fixes magic keywords). Extracting it forces the agent to fetch the audit nearly every review cycle — same total cost, more file IO. §5.3 is empirically the cleaner pilot per @tobiu's "5.3 in specific" anchor.
Trap: redesign Progressive Disclosure architecture. Rejected because the SKILL.md → references/ boundary already exists and works (per .agents/skills/create-skill/). The work is granular extraction WITHIN references/, not architectural redesign. Non-goal.
Trap: replace prose with structured data (Mermaid, YAML) for token wins. Rejected per Out-of-Scope. Tokenization wins are minor for Markdown; loss of natural-language semantic clarity hurts agent comprehension more than token bytes save.
Trap: ship without empirical measurement. Rejected. The proposition that extraction reduces effective context cost (vs just file size) requires measurement to validate. Sub-issue 1 (measurement infrastructure) is non-negotiable for AC.
Trap: bundle AGENTS.md compaction. Already done in #10511 / PR #10512. Stay scoped to pr-review.
Trap: orphaned gate condition (per @neo-gemini-pro + @neo-gpt). Extracting an audit splits "when to fetch" (the one-line gate in pr-review-guide.md) from "what to do once fetched" (the audit payload in references/audits/). If the gate trigger condition changes (e.g., audit broadens to a different file pattern) but the payload doesn't update in sync, future agents either miss applicable audits or fetch stale rules. Mitigation is AC10 (gate/payload sync invariant); recognizing the trap here ensures the AC isn't accidentally dropped.
Trap: measuring the wrong surface (per @neo-gpt). Guide-only byte reduction can look successful while full review cycles still load equivalent text through templates, prior comments, or follow-up scaffolding. AC1/AC2/AC6 measure actual loaded surface (guide + selected template + extracted audit payload + any in-thread loaded artifacts), not only repository file size. Without this discipline, the pilot can claim savings that don't exist in agent context cost.

Origin discussion: #10429 ("Map vs World Atlas + llms.txt")
Substantive predecessor: #10511 (CLOSED) — partial scope; AGENTS.md compaction delivered, pr-review streamlining incomplete
Empirical anchor PR: #10536 (MERGED 2026-04-30) — added §6.2 Routing Protocol; review cycle exposed audit-letter-miss pattern attributable to focus-window pressure
Architectural reference: learn/agentos/ProgressiveDisclosureSkills.md, .agents/skills/create-skill/references/skill-authoring-guide.md
Prior cycle reviewed PRs (canonical extraction-frequency dataset): #10412, #10417, #10421, #10425, #10427, #10431 — only #10417 touched openapi.yaml (1/6 = 17% empirical trigger rate for §5.3)
Independent epic reviews: Gemini's review (3 challenges) + GPT's secondary review (5 additional challenges); all eight absorbed into v2 body above

Origin Session ID: d0ec90a4-9a99-4fd7-baf6-dc0ab35e77dd

Retrieval Hint: query_raw_memories(query="pr-review skill compression Map vs World Atlas condition-gated audit extraction §5.3 MCP-Tool-Description Budget focus-window pressure loaded-surface measurement")

tobiu referenced in commit b4c4076 - "docs(skill): add Cycle 1 / Cycle N template pointers to pr-review SKILL.md (#10560) (#10561) on May 1, 2026, 11:33 AM