Frontmatter
id: 10864
title: Document Codex mid-session harness restart recovery
state: Closed
labels: documentation, enhancement, developer-experience, ai, model-experience
assignees: neo-gpt
createdAt: May 7, 2026, 2:25 AM
updatedAt: May 7, 2026, 2:46 AM
githubUrl: https://github.com/neomjs/neo/issues/10864
author: neo-gpt
commentsCount: 0
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 7, 2026, 2:46 AM

Document Codex mid-session harness restart recovery

Closed | documentation, enhancement, developer-experience, ai, model-experience
neo-gpt
neo-gpt commented on May 7, 2026, 2:25 AM

Context

During the May 2026 multi-agent recovery / restart work, Codex repeatedly hit a specific diagnostic trap: native MCP tools could be healthy while shell commands inside the Codex sandbox failed against localhost, GitHub, or Chroma. That made harness-restart and Memory Core checks look broken even when the external substrate was healthy.

@tobiu explicitly asked to turn this friction into a small .codex artifact instead of growing global agent rules. The intended shape is a Codex-only edge-case card plus a single trigger line in the turn-loaded Codex reference.

Duplicate sweep performed before filing:

  • ask_knowledge_base(query='existing ticket Codex harness restart mid-session Memory Core Chroma GitHub sandbox restart doc', type='ticket') returned no relevant documents.
  • rg across resources/content/issues, resources/content/issue-archive, and resources/content/discussions found broad Chroma/restart history, but no equivalent Codex Desktop mid-session harness restart runbook ticket.
  • Current local uncommitted edits are the proposed implementation, not an existing tracked duplicate.

The Problem

When a harness restarts mid-session, agents can waste cycles by diagnosing the wrong layer:

  • A stale/coalesced wake can look like fresh work.
  • A2A mailbox state can drift from transcript wake text.
  • Native MCP calls can work while shell-sandbox curl localhost, npm run ai:mcp-client, or gh calls fail.
  • External Chroma can be healthy while Codex shell commands cannot reach loopback.
  • gh auth status can falsely report authentication failures when run inside the sandbox.

Without a concise Codex-specific trigger, the swarm repeats the same checks and risks either skipping add_memory or filing substrate tickets for sandbox artifacts.
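
As an illustration of the first bullet above, stale-wake triage could be sketched as a small filter. This is a minimal sketch, not the A2A implementation: the Wake record, its field names, the agent names, and the timestamps are all hypothetical and only serve to show the rule "old self-wakes are noise unless they carry fresh handoff content".

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Wake:
    """Hypothetical wake record; field names are illustrative, not the A2A schema."""
    sender: str
    received_at: datetime
    has_handoff_content: bool


def is_actionable(wake: Wake, me: str, restart_at: datetime) -> bool:
    """Return True if a wake deserves attention after a harness restart."""
    pre_restart_self_wake = wake.sender == me and wake.received_at < restart_at
    # A pre-restart self-wake is treated as noise unless it carries handoff content.
    return (not pre_restart_self_wake) or wake.has_handoff_content


# Made-up sample data: the restart time and wakes are placeholders.
restart = datetime(2026, 5, 7, 2, 0, tzinfo=timezone.utc)
wakes = [
    Wake("agent-a", datetime(2026, 5, 7, 1, 50, tzinfo=timezone.utc), False),
    Wake("agent-a", datetime(2026, 5, 7, 1, 55, tzinfo=timezone.utc), True),
    Wake("agent-b", datetime(2026, 5, 7, 2, 5, tzinfo=timezone.utc), False),
]
actionable = [w for w in wakes if is_actionable(w, "agent-a", restart)]
```

With this sample, the first self-wake is dropped and the other two are kept. The unread A2A mailbox would still be the source of truth; a filter like this only decides which transcript wakes are worth cross-checking against it.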

The Architectural Reality

  • .codex/CODEX.md is injected into normal repo-root Codex turns through .codex/hooks.json, so adding large recovery prose there increases always-loaded context.
  • .codex/CODEX.md should stay a small turn-based memory/router surface.
  • A separate .codex/HARNESS_RESTART.md keeps the detailed guidance trigger-based instead of always-loaded.
  • The guidance is Codex Desktop-specific. It must not move broad repo policy, normal MCP autoload behavior, or shared agent rules into .codex.

The Fix

Add a compact Codex-only restart card:

  • .codex/HARNESS_RESTART.md — mid-session harness restart checks:
    • check unread A2A first;
    • treat old pre-restart self-wakes as noise unless they contain fresh handoff content;
    • verify checkout state before changing files;
    • distinguish native MCP health from shell-sandbox localhost failures;
    • retry required localhost/GitHub checks with escalated permissions before diagnosing infrastructure as down;
    • verify external Chroma with /api/v2/heartbeat where needed;
    • end with native add_memory even if shell clients cannot reach localhost.
  • .codex/CODEX.md — one-line trigger pointing to the restart card.
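
The decision order in the checklist above can be sketched as a pure function. This is an assumption-laden illustration rather than the content of the card: the Verdict names, the Diagnosis type, and the three boolean inputs are hypothetical, and how an escalated retry is actually requested is left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    REACHABLE = "reachable from the sandboxed shell"
    RETRY_ESCALATED = "retry the check escalated before concluding anything"
    SANDBOX_ARTIFACT = "shell sandbox artifact, substrate looks healthy"
    SUSPECT_INFRASTRUCTURE = "escalated retry failed too, infrastructure is suspect"


@dataclass(frozen=True)
class Diagnosis:
    verdict: Verdict
    save_via_native_mcp: bool  # whether native add_memory can still close the turn


def diagnose(native_mcp_ok: bool,
             sandboxed_shell_ok: bool,
             escalated_shell_ok: Optional[bool] = None) -> Diagnosis:
    """Classify a localhost/GitHub check result; None means 'not retried yet'."""
    if sandboxed_shell_ok:
        verdict = Verdict.REACHABLE
    elif escalated_shell_ok is None:
        # A sandboxed failure alone says nothing about the substrate.
        verdict = Verdict.RETRY_ESCALATED
    elif escalated_shell_ok:
        verdict = Verdict.SANDBOX_ARTIFACT
    else:
        verdict = Verdict.SUSPECT_INFRASTRUCTURE
    # Native MCP health is tracked separately from shell reachability.
    return Diagnosis(verdict, save_via_native_mcp=native_mcp_ok)
```

For example, diagnose(True, False) asks for an escalated retry instead of reporting an outage, and diagnose(True, False, True) reports a sandbox artifact while still flagging that native add_memory is available.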

Contract Ledger Matrix

| Target Surface | Source of Authority | Proposed Behavior | Fallback / Edge Case | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| Codex Desktop turn-loaded reference | .codex/CODEX.md via .codex/hooks.json | Add one trigger line only | If the file is not injected, root AGENTS.md still governs baseline behavior | .codex/CODEX.md | Manual diff review: exactly one trigger line added |
| Codex restart edge-case card | .codex/HARNESS_RESTART.md | Provide concise restart diagnostics without bloating always-loaded instructions | If a restart symptom is not listed, fall back to AGENTS mailbox + Memory Core save invariants | .codex/HARNESS_RESTART.md | Markdown content review + git diff --check |

Acceptance Criteria

  • Add .codex/HARNESS_RESTART.md with a concise mid-session restart checklist.
  • Add exactly one trigger line to .codex/CODEX.md pointing to .codex/HARNESS_RESTART.md.
  • Keep restart guidance Codex-only; do not mutate AGENTS.md, shared skills, or global Atlas docs.
  • Include sandbox-vs-native-MCP distinction for localhost / Chroma / GitHub checks.
  • Include end-of-turn native add_memory reminder for shell-client failure cases.
  • Run git diff --check -- .codex/CODEX.md .codex/HARNESS_RESTART.md.
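
The "exactly one trigger line" criterion above is specified as a manual diff review; if a mechanical check were wanted, it might look like the sketch below. The helper name and the sample text are hypothetical, and the sample is not the actual content of .codex/CODEX.md.

```python
def trigger_lines(codex_md: str, card: str = ".codex/HARNESS_RESTART.md") -> list[str]:
    """Return every line of the turn-loaded reference that mentions the restart card."""
    return [line for line in codex_md.splitlines() if card in line]


# Placeholder text standing in for the real file.
sample = (
    "# Codex reference\n"
    "- Keep turns small.\n"
    "- After a mid-session harness restart, read .codex/HARNESS_RESTART.md.\n"
)
matches = trigger_lines(sample)
```

Against the real file, the input would come from something like pathlib.Path(".codex/CODEX.md").read_text(), and the criterion holds when len(matches) == 1. This complements git diff --check, which only reports whitespace errors and conflict markers.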

Out of Scope

  • Changing MCP server implementation.
  • Changing .codex/config.template.toml or .codex/config.toml.
  • Adding new global AGENTS mandates.
  • Generalizing this into a cross-harness restart manual.
  • Automating harness restart detection.

Avoided Traps / Gold Standards Rejected

  • Expanding .codex/CODEX.md into a full manual — rejected because it is injected every turn; this would increase loaded bytes for an edge case.
  • Putting this in AGENTS.md — rejected because the issue is Codex Desktop-specific and should not burden every harness.
  • Treating localhost shell failures as infrastructure failures — rejected because the observed failure class is often Codex shell sandboxing while native MCP remains healthy.
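
For the third trap above, a heartbeat probe can be worded so that a failed connection is reported as "unreachable from this process" instead of "down". This is a minimal sketch using only the Python standard library; the base URL is a parameter because the Chroma host and port are not stated in this ticket, and the result strings are hypothetical.

```python
import urllib.error
import urllib.request


def heartbeat_url(base_url: str) -> str:
    """Build the heartbeat URL; the path is the one named in this ticket."""
    return base_url.rstrip("/") + "/api/v2/heartbeat"


def probe(base_url: str, timeout: float = 2.0) -> str:
    """Describe what this process can observe, without claiming the server is down."""
    try:
        with urllib.request.urlopen(heartbeat_url(base_url), timeout=timeout) as resp:
            return "reachable" if resp.status == 200 else f"http {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, so the network path works even if the status is bad.
        return f"http {exc.code}"
    except OSError as exc:
        # URLError, timeouts, and refused connections are all OSError subclasses.
        return f"unreachable from this process ({exc.__class__.__name__})"
```

A call such as probe("http://localhost:8000") assumes Chroma listens on port 8000, which is an assumption here and not something the ticket confirms. An "unreachable" result from a sandboxed shell is a cue to retry escalated or to rely on native MCP, not evidence of an outage.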

Related

  • PR #10861 and the surrounding Phase 6 KISS audit reinforced that substrate changes should be small and empirically justified.
  • This ticket deliberately follows that KISS shape: one trigger line plus one edge-case card.

Origin Session ID: 441c9fbc-2742-4b52-b371-77247e101f08
Retrieval Hint: "Codex mid-session harness restart .codex HARNESS_RESTART native MCP shell sandbox Chroma"

tobiu referenced in commit 96c98eb - "docs(codex): document harness restart recovery (#10864) (#10865)" on May 7, 2026, 2:46 AM
tobiu closed this issue on May 7, 2026, 2:46 AM