What is the 'Neural Link'?

The Neural Link is a bi-directional bridge that connects AI agents (like Gemini or Claude) directly to the Neo.mjs runtime. It allows agents to 'see' the application's Scene Graph, inspect component state, verify event listeners, and even mutate the running application in real-time. This turns the application into a 'glass box' for AI, enabling autonomous debugging and feature development.

Why is Neo.mjs called an 'Application Engine' instead of a framework?

Traditional frameworks are general-purpose libraries (like a Toyota) that help you organize code, but they compile away into ephemeral DOM nodes. Neo.mjs is a precision-engineered runtime (like an F1 car), similar to Unreal Engine for games. It maintains a persistent 'Scene Graph' of objects in a separate worker thread. These objects retain their identity, state, and relationships, allowing for advanced capabilities like multi-window orchestration, runtime permutation, and deep AI introspection that are impossible with 'melted plastic' DOM-based frameworks.

What is 'Context Engineering'?

Context Engineering is the practice of curating the environment and information flow for AI agents to maximize their autonomy. In Neo.mjs, this is implemented via three Model Context Protocol (MCP) servers: a Knowledge Base for semantic code understanding, a Memory Core for learning from past sessions, and a GitHub Workflow server for project management. This ecosystem allows agents to work as fully integrated members of the development team.

What is 'Object Permanence' in the context of Neo.mjs?

In Neo.mjs, UI components are persistent JavaScript objects living in the App Worker, not just transient rendering results. This 'Object Permanence' means a component (like a Dashboard) maintains its state (scroll position, user input, internal logic) even if it is detached from the DOM or moved to a different browser window. This is the 'Lego Technic' approach versus the 'Duplo' approach of traditional frameworks.

What is an 'Agent Operating System'?

Neo.mjs v11 introduced the concept of an Agent OS, where the platform itself provides the tools and interfaces for AI agents to operate. It combines a standalone, type-safe AI SDK for autonomous 'Code Execution' with the Neural Link for runtime control. This enables agents to monitor, debug, and heal the application autonomously, effectively acting as an operating system for synthetic intelligence.

How does Neo.mjs handle multi-window applications?

Neo.mjs uses shared web workers to run a single application instance across multiple browser windows. All windows share the same application state and data in real-time. Components can even move between windows while retaining their JavaScript instances. This enables desktop-class experiences like multi-window IDEs, browser-based email clients, and multi-screen control rooms.

What makes Neo.mjs different from React, Angular, or Vue?

Neo.mjs is fundamentally different in its architecture: (1) It uses True Multithreading via web workers to prevent UI jank, (2) It is an AI-Native Application Engine with built-in bridges for AI collaboration, and (3) It treats components as persistent objects in a Scene Graph, enabling native multi-window support and runtime permutation.

Frontmatter

id	11491
title	Promote manage_pr_review MANDATORY pre-step from description-prose to JSON-schema body-shape validation
state	Closed
labels	enhancement, ai, model-experience
assignees	neo-opus-4-7
createdAt	May 16, 2026, 10:43 PM
updatedAt	May 17, 2026, 12:09 AM
githubUrl	https://github.com/neomjs/neo/issues/11491
author	neo-opus-4-7
commentsCount	0
parentIssue	null
subIssues	[]
subIssuesCompleted	0
subIssuesTotal	0
blockedBy	[]
blocking	[]
closedAt	May 17, 2026, 12:09 AM

Promote manage_pr_review MANDATORY pre-step from description-prose to JSON-schema body-shape validation

Name: Neo.mjs Application Engine
Author: Neo.mjs


neo-opus-4-7 commented on May 16, 2026, 10:43 PM

FAIR-band: in-band [3/30] — substrate-evolution layer; companion to PR #11479 (discipline-only MANDATORY pre-step) and conceptual sibling to PR #11406/#11490 (mechanical-enforcement complement to discipline-only §1.3/§2.6/§5.6 layer).

Premise

The manage_pr_review MCP tool's OpenAPI description (added by PR #11479, merged 2026-05-16T18:04Z) currently says:

"MANDATORY pre-step: read .agents/skills/pr-review/SKILL.md — this skill contains the authoritative protocol and template structure for conducting pull request reviews. You MUST NOT execute a formal review without adhering to the depth floor and evidence audit guidelines outlined in the skill."

This is discipline-only enforcement: the agent reads the description and is expected to comply, but the schema itself accepts any string for body. There is no mechanical gate. Gemini's reviews on PRs #11488 and #11489 (posted 2026-05-16T20:14-20:15Z, ~2h after PR #11479 merged) bypassed every template anchor: 🪜 Strategic-Fit Decision, 🔬 Depth Floor, 🛂 Provenance Audit, 🎯 Close-Target Audit, 📑 Contract Completeness Audit, 🪜 Evidence Audit, 📜 Source-of-Authority Audit, 📡 MCP-Tool-Description Budget Audit, 🔌 Wire-Format Compatibility Audit, 🔗 Cross-Skill Integration Audit, 🧪 Test-Execution & Location Audit, 🛡️ CI / Security Checks Audit, 📋 Required Actions, 📊 Evaluation Metrics (which uses [ARCH_ALIGNMENT] / [CONTENT_COMPLETENESS] / [EXECUTION_QUALITY] / [PRODUCTIVITY] / [IMPACT] / [COMPLEXITY] / [EFFORT_PROFILE] on a 0-100 scale). The output instead substituted a hallucinated Structural Evaluation Matrix with 5 invented metric names on a 1-10 scale.

Cost beyond noise: the Retrospective daemon's ConceptDiscoveryService.mjs regex-matches [ARCH_ALIGNMENT], [RETROSPECTIVE], [KB_GAP], [TOOLING_GAP] during REM-sleep graph ingestion. Gemini's hallucinated metrics produce zero ingest signal. Two PRs' worth of review-substrate data never reached the Native Edge Graph, and there is no error path to surface the loss.
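The silent loss can be illustrated with a minimal sketch. It assumes the daemon extracts bracketed tags with a regex roughly like the one below; the actual pattern in ConceptDiscoveryService.mjs is not quoted in this ticket, and `TAG_PATTERN`, `extractTags`, and both sample bodies are hypothetical.

```javascript
// Hypothetical stand-in for the tag regex in ConceptDiscoveryService.mjs;
// the real pattern is not quoted in this ticket.
const TAG_PATTERN = /\[(ARCH_ALIGNMENT|RETROSPECTIVE|KB_GAP|TOOLING_GAP)\]/g;

// Returns the tag names found in a review body, in order of appearance.
function extractTags(reviewBody) {
    return Array.from(reviewBody.matchAll(TAG_PATTERN), match => match[1]);
}

// Invented sample bodies, for illustration only.
const templateReview    = '📊 Evaluation Metrics\n[ARCH_ALIGNMENT]: 90\n[KB_GAP]: none';
const offTemplateReview = 'Structural Evaluation Matrix\nCode Quality: 9/10';

console.log(extractTags(templateReview));    // ['ARCH_ALIGNMENT', 'KB_GAP']
console.log(extractTags(offTemplateReview)); // []
```

An off-template body yields an empty array rather than an exception, which is why nothing surfaces the missing signal downstream.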

Freshness gap (empirical caveat)

Operator confirmed 2026-05-16T20:38Z: agent harnesses were not restarted after PR #11479 merged. Gemini's harness very likely still has the pre-#11479 stale tool description (no MANDATORY pre-step line). So we don't actually know yet whether the discipline-only guard would have worked at fresh-harness state. This ticket assumes the structural gap exists regardless: even with a perfect discipline-only guard, agents can:

- Have stale harness state (today's empirical case)
- Bypass MCP via gh pr review CLI direct call (operator-flagged risk)
- Skim the description under context-compression pressure (training-prior failure mode the "Helpful Assistant" counter-substrate has chased across 50+ closed tickets)

Mechanical body-shape validation closes all three failure surfaces at the tool boundary; the discipline-only description layer remains as the human-readable rationale.

Prior art — what's already been tried for this exact problem

#11105 (CLOSED 2026-05-10): "pull-request skill: add author-side check that reviewers use correct pr-review template" — same empirical pattern (Gemini rubber-stamped PR #11104 with non-template structure, operator caught externally). Closed with discipline-only "audit at receipt time" fix. Did not prevent the recurrence Gemini just produced on #11488/#11489.
#11273 (CLOSED 2026-05-13): introduced the manage_pr_review MCP tool with state enum but no body-shape constraint.
#11479 (MERGED 2026-05-16): added the MANDATORY pre-step description guard; body-shape still unvalidated.

The pattern across ~50 closed meta-tickets is that all prior approaches target agent cognition (what to read, how to interpret, how to decide). The categorical layer this ticket adds is output-side mechanical validation at the tool boundary.

Prescription

In ai/mcp/server/github-workflow/openapi.yaml, gate the body parameter on manage_pr_review with a schema-level shape constraint. Implementation paths in order of cost:

Layer 1 — required-substring validation (cheap, ships first)

Add a regex / required-substring validator that checks the body for the literal template anchors:

body:
  type: string
  description: |
    The Markdown body of the review. MUST contain the template anchors enumerated in
    `.agents/skills/pr-review/assets/pr-review-template.md`. Cycle-1 reviews use the full
    template (16 audit sections + 7 evaluation metrics); Cycle-N reviews use the compact
    delta template per `pr-review-followup-template.md`.
  x-required-substrings:
    cycle-1:
      - "🔬 Depth Floor"
      - "📊 Evaluation Metrics"
      - "[ARCH_ALIGNMENT]"
      - "[EXECUTION_QUALITY]"
      - "📋 Required Actions"
    cycle-followup:
      - "[ARCH_ALIGNMENT]"
      - "[EXECUTION_QUALITY]"

Service-layer validator in ai/mcp/server/github-workflow/PullRequestService.mjs (or wherever manage_pr_review dispatches) reads x-required-substrings and returns a structured error before posting:

{
  "error": "pr-review template anchors missing",
  "missing": ["🔬 Depth Floor", "[ARCH_ALIGNMENT]"],
  "skill": ".agents/skills/pr-review/SKILL.md",
  "template": ".agents/skills/pr-review/assets/pr-review-template.md"
}

The agent receives the missing-anchor list IN-TURN and can fix and re-submit. The bad data never lands on GitHub or reaches the Retrospective daemon.
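A minimal sketch of the Layer 1 check, assuming the cycle-1 anchors from the x-required-substrings example above are already loaded into an array. `validateReviewBody` and `CYCLE_1_ANCHORS` are hypothetical names, not existing service APIs; only the error shape is taken from this ticket.

```javascript
// Cycle-1 anchors copied from the x-required-substrings example above.
const CYCLE_1_ANCHORS = [
    '🔬 Depth Floor',
    '📊 Evaluation Metrics',
    '[ARCH_ALIGNMENT]',
    '[EXECUTION_QUALITY]',
    '📋 Required Actions'
];

// Hypothetical helper: returns null when every anchor is present,
// otherwise the structured error shape proposed in this ticket.
function validateReviewBody(body, anchors = CYCLE_1_ANCHORS) {
    const missing = anchors.filter(anchor => !body.includes(anchor));

    if (missing.length === 0) {
        return null;
    }

    return {
        error   : 'pr-review template anchors missing',
        missing,
        skill   : '.agents/skills/pr-review/SKILL.md',
        template: '.agents/skills/pr-review/assets/pr-review-template.md'
    };
}

// Invented sample body that omits two anchors.
const result = validateReviewBody('📊 Evaluation Metrics\n[EXECUTION_QUALITY]: 80\n📋 Required Actions');

console.log(result.missing); // ['🔬 Depth Floor', '[ARCH_ALIGNMENT]']
```

Plain `String.prototype.includes` keeps the check literal, so anchors containing brackets need no regex escaping.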

Layer 2 — cycle-detection (decides which anchor set applies)

Detect cycle number from the PR's existing review history. Layer 1 can ship with cycle-1 anchors only as the floor; cycle-N detection is a follow-up if the false-positive rate (cycle-N reviews failing the cycle-1 schema) gets noisy.
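One way cycle detection could look, as a sketch only. It assumes the review history has already been fetched as an array of `{author}` objects and that "cycle" is counted per reviewer; both are assumptions, and `selectAnchorSet` and `ANCHOR_SETS` are hypothetical names.

```javascript
// Anchor sets mirror the x-required-substrings example in Layer 1.
const ANCHOR_SETS = {
    'cycle-1': [
        '🔬 Depth Floor',
        '📊 Evaluation Metrics',
        '[ARCH_ALIGNMENT]',
        '[EXECUTION_QUALITY]',
        '📋 Required Actions'
    ],
    'cycle-followup': ['[ARCH_ALIGNMENT]', '[EXECUTION_QUALITY]']
};

// Assumption: a reviewer's first review on a PR is cycle-1,
// and every later review by the same reviewer is a follow-up.
function selectAnchorSet(priorReviews, reviewer) {
    const alreadyReviewed = priorReviews.some(review => review.author === reviewer);

    return ANCHOR_SETS[alreadyReviewed ? 'cycle-followup' : 'cycle-1'];
}

// Invented history with one earlier review.
const history = [{author: 'reviewer-a'}];

console.log(selectAnchorSet(history, 'reviewer-a').length); // 2
console.log(selectAnchorSet(history, 'reviewer-b').length); // 5
```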

Layer 3 — author-side companion (extends to creation)

The companion ticket (filed separately) extends this pattern to PR creation: create_pull_request should similarly validate body against the pull-request skill's template anchors.

Test plan

- New service-layer spec asserts: missing anchor → structured error returned, no GitHub API call attempted
- New service-layer spec asserts: all anchors present → underlying addPullRequestReview GraphQL mutation called
- Existing manage_pr_review integration specs continue to pass
- L4 operator verification: post-merge, attempt a deliberately-malformed review via the MCP tool → confirm rejection with the missing-anchor list
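The first two spec assertions could be sketched as follows. `submitReview` and `createPosterStub` are hypothetical; the injected `postReview` callback stands in for the addPullRequestReview GraphQL call and is synchronous here only to keep the sketch short, whereas the real call would be asynchronous.

```javascript
// Hypothetical gate: validate first, then delegate to an injected poster.
function submitReview(body, requiredAnchors, postReview) {
    const missing = requiredAnchors.filter(anchor => !body.includes(anchor));

    if (missing.length > 0) {
        // Return before touching the API so bad data never leaves the tool.
        return {error: 'pr-review template anchors missing', missing};
    }

    return postReview(body);
}

// Stub that records every body it is asked to post.
function createPosterStub() {
    const calls = [];
    const post  = body => {
        calls.push(body);
        return {posted: true};
    };

    return {calls, post};
}

const anchors = ['🔬 Depth Floor', '[ARCH_ALIGNMENT]'];

// Case 1: missing anchors, so the stub must not be called.
const rejected       = createPosterStub();
const rejectedResult = submitReview('LGTM', anchors, rejected.post);
console.log(rejectedResult.missing.length, rejected.calls.length); // 2 0

// Case 2: all anchors present, so the stub is called exactly once.
const accepted       = createPosterStub();
const acceptedResult = submitReview('🔬 Depth Floor\n[ARCH_ALIGNMENT]: 95', anchors, accepted.post);
console.log(acceptedResult.posted, accepted.calls.length); // true 1
```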

Avoided traps

Goodhart anchor-stuffing: agent puts the literal anchor strings into a malformed body to pass the gate while content is still wrong. Accepted residual — this is a depth-floor enforcement, not a quality-floor enforcement; quality remains the peer-V-B-A reviewer's job. The fix shifts the recurrence shape from "missing anchors" (which silently breaks graph ingestion) to "stuffed anchors with weak content" (which is observable by peer review). Net: failure mode becomes catchable, not silently destructive.
MCP-tool-bypass via gh pr review CLI: rejected as a reason NOT to ship this — the bypass is an ADDITIONAL ticket (filed separately), not a cancellation of this one. Layered defense: agents who use the MCP tool get the gate; agents who bypass to gh CLI get caught by the eventual peer-review or the post-hoc CI lint (companion ticket). This is the same "depth-floor + quality-floor + post-hoc-lint" layered enforcement pattern as check-retired-primitives + ADR 0004 §2.6 + peer-review.
Per-template variants (cycle-1 / cycle-followup / circuit-breaker / fair-band-declaration-audit): rejected scope-expansion in v1 — ship cycle-1 anchors as the floor; cycle-followup and audit-variants as follow-ups gated on empirical noise.
Schema enforced via JSONSchema pattern regex on body: tempted, but pattern is single-regex and the template requires multiple-anchor presence — the x-required-substrings extension is cleaner and emits a per-anchor missing list rather than a single opaque pattern-mismatch error.
Promoting from description: to errorMessage: rejected — description: retains the human-readable rationale; the schema validator is the mechanical floor that runs regardless of whether the description is read or stale.
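The single-regex trade-off noted in the list above can be made concrete with a small comparison. This is a sketch with a shortened anchor list; `escapeRegex`, `combinedPattern`, and the sample body are illustrative names and data, not part of the proposed schema.

```javascript
const anchors = ['🔬 Depth Floor', '[ARCH_ALIGNMENT]', '📋 Required Actions'];

// Escape regex metacharacters so "[ARCH_ALIGNMENT]" matches literally.
const escapeRegex = text => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// A single pattern can demand several anchors only via one lookahead each.
const combinedPattern = new RegExp(
    anchors.map(anchor => `(?=[\\s\\S]*${escapeRegex(anchor)})`).join('')
);

// Invented sample body that omits one anchor.
const body = '🔬 Depth Floor\n📋 Required Actions';

// Regex route: a bare boolean, with no hint about which anchor failed.
console.log(combinedPattern.test(body)); // false

// Substring route: the same verdict plus a per-anchor missing list.
const missing = anchors.filter(anchor => !body.includes(anchor));
console.log(missing); // ['[ARCH_ALIGNMENT]']
```

Both routes reject the same bodies; the substring list is what lets the agent fix and re-submit in the same turn.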

Authority anchors

Operator framing: extended A2A thread from 2026-05-16T20:30Z onward. The operator surfaced the meta failure mode after Gemini's #11488/#11489 reviews diverged from template structure, guided V-B-A across 50+ closed meta-tickets, confirmed that the substrate-evolution loop has hit diminishing returns on agent-cognition-side approaches, and gave the green light for tool-boundary enforcement as a new categorical layer.
Empirical anchors:
- Gemini's #11488 review at PRR_kwDODSospM8AAAABAIyZ_Q and #11489 review at PRR_kwDODSospM8AAAABAIyaEw — both posted 2026-05-16T20:14-20:15Z, both structurally non-template, both APPROVED on substantive merit. Substrate gap proven independent of substance quality.
- Operator quote 2026-05-16T20:38Z: "current problem: gpt and you have an extra high thought budget, gemini's harness caps at high, not a model flaw. it is not that skills get applied in a wrong way, but the triggers to read them do not work."
- ConceptDiscoveryService.mjs:32 — [ARCH_ALIGNMENT] / [RETROSPECTIVE] / [KB_GAP] / [TOOLING_GAP] regex parser that lost two PRs of ingest signal.
Prior empirical anchor for the recurrence pattern: #11105 (closed 2026-05-10) — same problem, same shape, discipline-only fix, recurrence within 6 days.

Parent context: PR #11479 (description-prose discipline guard, merged today 2026-05-16T18:04Z) — this ticket promotes that guard to mechanical
Conceptual sibling: PR #11406/#11490 (CI grep-fail check for retired ADR 0004 primitives) — mechanical-enforcement complement to ADR 0004 §2.6 discipline-only layer; same pattern applied at the CI surface
Companion ticket (filed separately): MCP body-shape validation for create_pull_request against pull-request skill template — extends this pattern to PR authoring
Prior recurrence: #11105 (CLOSED) — exact same problem 6 days ago, discipline-only fix didn't prevent recurrence on #11488/#11489
Substrate landscape: ~50 closed meta-tickets across "Helpful Assistant" counter-substrate, "Map vs World Atlas" progressive disclosure, skill adherence, all targeting agent-side cognition. This ticket adds the missing categorical layer: tool-boundary output validation.

tobiu closed this issue on May 17, 2026, 12:09 AM