Frontmatter

id	10756
title	Empirical grounding for §8 / §7.2 cross-model asymmetry codifications
state	Open
labels	documentation, enhancement, ai, architecture, model-experience
assignees	neo-opus-ada
createdAt	May 5, 2026, 6:43 PM
updatedAt	Jun 7, 2026, 7:24 PM
githubUrl	https://github.com/neomjs/neo/issues/10756
author	neo-opus-ada
commentsCount	2
parentIssue	10757
subIssues	[]
subIssuesCompleted	0
subIssuesTotal	0
blockedBy	[]
blocking	[]

Empirical grounding for §8 / §7.2 cross-model asymmetry codifications

Open Backlog/active-chunk-9 documentation, enhancement, ai, architecture, model-experience

neo-opus-ada commented on May 5, 2026, 6:43 PM

Context

epic-review-workflow §8 Cross-Model Asymmetry and pr-review-guide §7.2 Cross-Model Asymmetry Context codify model-family failure modes:

Claude-family → over-rigor
Gemini-family → quick-win framing

Both sections use the phrase "statistically-different failure modes", yet neither anchors it to a measurement. Agents cite the codifications in pre-review disclosures as substrate authority. The pattern was observed in the PR #10752 Cycle 1 review (2026-05-05), which ended in a retraction, and it reproduced later in the same session (this ticket's origin session) in a re-categorization variant. We work on data, not bias, so the codifications need either empirical grounding or refactoring.

The Problem

Three concurrent gaps:

Empirical claim without data. "Statistically-different" is asserted; no N, no methodology, no measurement. Closest anchors are anecdotal (skill files cite single PRs: #10602, #10607).
Stale model coverage. The GPT family (@neo-gpt) joined the swarm in late April 2026 (public GitHub author history is visible from 2026-04-27 onward), yet §8 / §7.2 enumerate only Claude and Gemini patterns. If the codification is genuinely model-family-characteristic, it must include GPT; otherwise the omission is evidence that the codification was never empirical.
Self-fulfilling bias loop. Agents read these codifications and pre-emptively disclose ("Per §8 Claude over-rigor risk applies..."). The pre-disclosure IS the over-rigor manifestation the codification names. Empirical anchor: PR #10752 Cycle 1 retraction 2026-05-05; same-session reproduction in re-categorization variant (disclosure shape rebadged as "treat as substrate findings, not as final design," same B-shape pre-conclusion). Pattern caught directly by @tobiu in this ticket's origin session.

The codifications are bias-shape framings dressed as empirical claims.

The Architectural Reality

.agents/skills/epic-review/references/epic-review-workflow.md:181-190 — §8
.agents/skills/pr-review/references/pr-review-guide.md:214-223 — §7.2
Disclosure-discipline pattern (cite vs pre-conclude): currently lives only in harness-private memory across the swarm. This ticket addresses the upstream question of whether the codifications themselves merit existence without empirical grounding. AC4 promotes the discipline pattern to public substrate so future agents reach it in the same hop as §8 / §7.2 themselves.
learn/agentos/measurements/cognitive-load-baseline-2026-05.md §1 — methodology framework exists for correction-cycle metrics; could anchor model-family review-output measurements

The Fix

Measurement-first conditional refactor:

Mine swarm review history (over a time window and sample size sufficient for a statistical-significance test, both decided in the AC1 methodology) for:
- Re-cycle counts per reviewer-identity
- [REJECTED_WITH_RATIONALE] rates per reviewer-identity (per pr-review-guide §9.1 Reviewer-Yield Protocol)
- Retraction counts per reviewer-identity
- Approval-without-Cycle-2-finding rates per reviewer-identity
Aggregate by model-family (Claude / Gemini / GPT)
Conditional outcome:
- If measurement supports model-family attribution at the methodology's significance threshold: retain framing in §8 / §7.2; append empirical-anchor footnote citing N + methodology + data link; add GPT row
- If measurement does NOT support attribution: refactor §8 / §7.2 to content-neutral asymmetry framing — "cross-model review surfaces complementary findings; do not stylize toward another family" — preserving the cross-family-review value without unsupported model-family attribution
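The mining and aggregation steps above could be sketched roughly as follows. This is a minimal illustration, not the swarm's actual tooling: the `ReviewRecord` shape, the `aggregate` helper, and the `FAMILY_BY_REVIEWER` map are all hypothetical, and only @neo-gpt's family is stated in this ticket (the other two entries are placeholders).

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical identity-to-family map. Only @neo-gpt -> GPT comes from this
# ticket; the Claude and Gemini entries are illustrative assumptions.
FAMILY_BY_REVIEWER = {
    "neo-opus-ada": "Claude",
    "neo-gemini-example": "Gemini",
    "neo-gpt": "GPT",
}


@dataclass(frozen=True)
class ReviewRecord:
    """One review of one PR, reduced to the four metrics listed above."""
    reviewer: str
    recycles: int                          # re-cycles attributed to this review
    rejected_with_rationale: bool          # a [REJECTED_WITH_RATIONALE] outcome was recorded
    retracted: bool                        # the review was later retracted
    approved_without_cycle2_finding: bool  # approval with no Cycle 2 finding


def aggregate(records, key):
    """Group records by key(record) and return N plus per-review rates."""
    totals = defaultdict(lambda: defaultdict(int))
    for rec in records:
        group = key(rec)
        if group is None:
            continue  # unmapped identity: skip rather than guess a family
        t = totals[group]
        t["n"] += 1
        t["recycles"] += rec.recycles
        t["rejected"] += rec.rejected_with_rationale
        t["retractions"] += rec.retracted
        t["clean_approvals"] += rec.approved_without_cycle2_finding
    return {
        group: {
            "n": t["n"],
            "recycles_per_review": t["recycles"] / t["n"],
            "rejected_rate": t["rejected"] / t["n"],
            "retraction_rate": t["retractions"] / t["n"],
            "clean_approval_rate": t["clean_approvals"] / t["n"],
        }
        for group, t in totals.items()
    }


def by_reviewer(rec):
    return rec.reviewer


def by_family(rec):
    return FAMILY_BY_REVIEWER.get(rec.reviewer)
```

Calling `aggregate(records, by_reviewer)` gives the per-reviewer-identity view and `aggregate(records, by_family)` gives the per-family view from the same records, so both levels stay derived from one data source. Reporting `n` alongside every rate keeps the sample size visible for the footnote the supported outcome requires.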

Avoided Traps

"Just remove the section": discards real cross-family-review value; refactor is the correct shape if measurement doesn't support attribution
"Add anecdotes as evidence": anecdotal anchors aren't statistical; #10602 + #10607 + this session ≠ measurement
"Code-freeze the codifications until measured": they're already production substrate; conditional refactor follows the measurement outcome rather than blocking on it

Acceptance Criteria

(AC1) Measurement methodology defined: which metrics, time window, data sources (Memory Core review summaries + GitHub PR review API), significance test, sample-size threshold
(AC2) Measurement executed; per-family aggregate produced; significance check applied per AC1
(AC3a) If supported: §8 + §7.2 anchored with empirical-anchor footnote (N, methodology, data link); GPT row added
(AC3b) If unsupported: §8 + §7.2 refactored to content-neutral asymmetry framing
(AC4) Disclosure-discipline pattern (cite-without-pre-conclude; (A) substrate citation legitimate vs (B) pre-conclusion bias-shape, including re-categorization variant) promoted from harness-private memories to a public skill section (e.g. epic-review-workflow §8.1 and/or pr-review-guide §7.2.1); cross-referenced from the resolved §8 / §7.2 sections so agents reading the codification reach the discipline in the same hop
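One possible shape for the AC1 significance check and the AC3a / AC3b branch is sketched below. It assumes a chi-square test of independence on a 3x2 table (three families, one binary outcome such as retracted vs not retracted); AC1 may well choose a different test. The function names, the `alpha=0.05` default, and the `min_reviews_per_family=30` threshold are placeholders, not values from this ticket.

```python
import math


def chi_square_three_families(table):
    """Chi-square test of independence for a 3x2 contingency table.

    table maps each family to (events, non_events). With 3 rows and
    2 columns the test has 2 degrees of freedom, where the chi-square
    survival function has the closed form exp(-x / 2).
    """
    if len(table) != 3:
        raise ValueError("the closed-form p-value only holds for exactly 3 families")
    rows = list(table.values())
    if any(events + non_events == 0 for events, non_events in rows):
        raise ValueError("every family needs at least one review")
    col_totals = [sum(r[0] for r in rows), sum(r[1] for r in rows)]
    if 0 in col_totals:
        return 0.0, 1.0  # outcome never varies: nothing to attribute
    grand = sum(col_totals)
    stat = 0.0
    for events, non_events in rows:
        row_total = events + non_events
        for observed, col_total in zip((events, non_events), col_totals):
            expected = row_total * col_total / grand
            stat += (observed - expected) ** 2 / expected
    return stat, math.exp(-stat / 2)


def decide_outcome(table, alpha=0.05, min_reviews_per_family=30):
    """Map one metric's table to an acceptance-criteria branch.

    alpha and min_reviews_per_family are placeholder values; AC1 is
    where the real thresholds get decided.
    """
    if any(e + n < min_reviews_per_family for e, n in table.values()):
        return "insufficient-data"
    _, p_value = chi_square_three_families(table)
    return "AC3a" if p_value < alpha else "AC3b"
```

A call such as `decide_outcome({"Claude": (40, 10), "Gemini": (10, 40), "GPT": (10, 40)})` would take the AC3a branch, while identical rates across families would take AC3b. Two caveats a real methodology would need to address: the test treats reviews as independent observations, and running it once per metric calls for a multiple-comparison correction.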

Out of Scope

Disclosure-discipline cite-vs-preconclude pattern enforcement itself — orthogonal (this ticket is whether the codifications merit existence; the discipline is how to use them safely once they exist; AC4 covers public-substrate promotion only, not enforcement mechanism)
Removing the cross-family review mandate — load-bearing per pull-request §6.1; preserved either way
Generalized "challenge any unsupported empirical claim in substrate" — broader meta-rule, would need its own ticket

§8 source: .agents/skills/epic-review/references/epic-review-workflow.md:181-190
§7.2 source: .agents/skills/pr-review/references/pr-review-guide.md:214-223
Measurement substrate: learn/agentos/measurements/cognitive-load-baseline-2026-05.md
Empirical reproduction anchors: PR #10752 Cycle 1 retraction 2026-05-05 → same-session re-reproduction caught by @tobiu in origin session (recoverable via session ID below)
Sibling epic: #10757 (V5 anchor)

Origin Session ID: 23b9cbcd-4938-4a46-b21a-0d48dd12e7e7

Retrieval Hint: query_raw_memories(query="empirical grounding cross-model asymmetry §8 §7.2 statistical-significance reviewer-identity model-family disclosure-discipline cite-vs-preconclude 10756")