Context
During the May 2026 multi-agent recovery / restart work, Codex repeatedly hit a specific diagnostic trap: native MCP tools could be healthy while shell commands inside the Codex sandbox failed against localhost, GitHub, or Chroma. That made harness-restart and Memory Core checks look broken even when the external substrate was healthy.
@tobiu explicitly asked to turn this friction into a small .codex artifact instead of growing global agent rules. The intended shape is a Codex-only edge-case card plus a single trigger line in the turn-loaded Codex reference.
Duplicate sweep performed before filing:
ask_knowledge_base(query='existing ticket Codex harness restart mid-session Memory Core Chroma GitHub sandbox restart doc', type='ticket') returned no relevant documents.
rg across resources/content/issues, resources/content/issue-archive, and resources/content/discussions found broad Chroma/restart history, but no equivalent Codex Desktop mid-session harness restart runbook ticket.
- Current local uncommitted edits are the proposed implementation, not an existing tracked duplicate.
The Problem
When a harness restarts mid-session, agents can waste cycles by diagnosing the wrong layer:
- A stale/coalesced wake can look like fresh work.
- A2A mailbox state can drift from transcript wake text.
- Native MCP calls can work while shell-sandbox
curl localhost, npm run ai:mcp-client, or gh calls fail.
- External Chroma can be healthy while Codex shell commands cannot reach loopback.
gh auth status can misreport auth failures inside sandboxing.
Without a concise Codex-specific trigger, the swarm repeats the same checks and risks either skipping add_memory or filing substrate tickets for sandbox artifacts.
The Architectural Reality
.codex/CODEX.md is injected into normal repo-root Codex turns through .codex/hooks.json, so adding large recovery prose there increases always-loaded context.
.codex/CODEX.md should stay a small turn-based memory/router surface.
- A separate
.codex/HARNESS_RESTART.md keeps the detailed guidance trigger-based instead of always-loaded.
- The guidance is Codex Desktop-specific. It must not move broad repo policy, normal MCP autoload behavior, or shared agent rules into
.codex.
The Fix
Add a compact Codex-only restart card:
.codex/HARNESS_RESTART.md — mid-session harness restart checks:
- check unread A2A first;
- treat old pre-restart self-wakes as noise unless they contain fresh handoff content;
- verify checkout state before changing files;
- distinguish native MCP health from shell-sandbox localhost failures;
- retry required localhost/GitHub checks escalated before diagnosing infrastructure as down;
- verify external Chroma with
/api/v2/heartbeat where needed;
- end with native
add_memory even if shell clients cannot reach localhost.
.codex/CODEX.md — one-line trigger pointing to the restart card.
Contract Ledger Matrix
| Target Surface |
Source of Authority |
Proposed Behavior |
Fallback / Edge Case |
Docs |
Evidence |
| Codex Desktop turn-loaded reference |
.codex/CODEX.md via .codex/hooks.json |
Add one trigger line only |
If the file is not injected, root AGENTS.md still governs baseline behavior |
.codex/CODEX.md |
Manual diff review: exactly one trigger line added |
| Codex restart edge-case card |
.codex/HARNESS_RESTART.md |
Provide concise restart diagnostics without bloating always-loaded instructions |
If a restart symptom is not listed, fall back to AGENTS mailbox + Memory Core save invariants |
.codex/HARNESS_RESTART.md |
Markdown content review + git diff --check |
Acceptance Criteria
Out of Scope
- Changing MCP server implementation.
- Changing
.codex/config.template.toml or .codex/config.toml.
- Adding new global AGENTS mandates.
- Generalizing this into a cross-harness restart manual.
- Automating harness restart detection.
Avoided Traps / Gold Standards Rejected
- Expanding
.codex/CODEX.md into a full manual — rejected because it is injected every turn; this would increase loaded bytes for an edge case.
- Putting this in
AGENTS.md — rejected because the issue is Codex Desktop-specific and should not burden every harness.
- Treating localhost shell failures as infrastructure failures — rejected because the observed failure class is often Codex shell sandboxing while native MCP remains healthy.
Related
- PR #10861 and the surrounding Phase 6 KISS audit reinforced that substrate changes should be small and empirically justified.
- This ticket deliberately follows that KISS shape: one trigger line plus one edge-case card.
Origin Session ID: 441c9fbc-2742-4b52-b371-77247e101f08
Retrieval Hint: "Codex mid-session harness restart .codex HARNESS_RESTART native MCP shell sandbox Chroma"
Context
During the May 2026 multi-agent recovery / restart work, Codex repeatedly hit a specific diagnostic trap: native MCP tools could be healthy while shell commands inside the Codex sandbox failed against
localhost, GitHub, or Chroma. That made harness-restart and Memory Core checks look broken even when the external substrate was healthy.@tobiu explicitly asked to turn this friction into a small
.codexartifact instead of growing global agent rules. The intended shape is a Codex-only edge-case card plus a single trigger line in the turn-loaded Codex reference.Duplicate sweep performed before filing:
ask_knowledge_base(query='existing ticket Codex harness restart mid-session Memory Core Chroma GitHub sandbox restart doc', type='ticket')returned no relevant documents.rgacrossresources/content/issues,resources/content/issue-archive, andresources/content/discussionsfound broad Chroma/restart history, but no equivalent Codex Desktop mid-session harness restart runbook ticket.The Problem
When a harness restarts mid-session, agents can waste cycles by diagnosing the wrong layer:
curl localhost,npm run ai:mcp-client, orghcalls fail.gh auth statuscan misreport auth failures inside sandboxing.Without a concise Codex-specific trigger, the swarm repeats the same checks and risks either skipping
add_memoryor filing substrate tickets for sandbox artifacts.The Architectural Reality
.codex/CODEX.mdis injected into normal repo-root Codex turns through.codex/hooks.json, so adding large recovery prose there increases always-loaded context..codex/CODEX.mdshould stay a small turn-based memory/router surface..codex/HARNESS_RESTART.mdkeeps the detailed guidance trigger-based instead of always-loaded..codex.The Fix
Add a compact Codex-only restart card:
.codex/HARNESS_RESTART.md— mid-session harness restart checks:/api/v2/heartbeatwhere needed;add_memoryeven if shell clients cannot reach localhost..codex/CODEX.md— one-line trigger pointing to the restart card.Contract Ledger Matrix
.codex/CODEX.mdvia.codex/hooks.jsonAGENTS.mdstill governs baseline behavior.codex/CODEX.md.codex/HARNESS_RESTART.md.codex/HARNESS_RESTART.mdgit diff --checkAcceptance Criteria
.codex/HARNESS_RESTART.mdwith a concise mid-session restart checklist..codex/CODEX.mdpointing to.codex/HARNESS_RESTART.md.AGENTS.md, shared skills, or global Atlas docs.add_memoryreminder for shell-client failure cases.git diff --check -- .codex/CODEX.md .codex/HARNESS_RESTART.md.Out of Scope
.codex/config.template.tomlor.codex/config.toml.Avoided Traps / Gold Standards Rejected
.codex/CODEX.mdinto a full manual — rejected because it is injected every turn; this would increase loaded bytes for an edge case.AGENTS.md— rejected because the issue is Codex Desktop-specific and should not burden every harness.Related
Origin Session ID: 441c9fbc-2742-4b52-b371-77247e101f08 Retrieval Hint: "Codex mid-session harness restart .codex HARNESS_RESTART native MCP shell sandbox Chroma"