Context
Live regression discovered 2026-05-03 ~12:58Z after @tobiu restarted bridge-daemon + MCP servers post-#10632 merge ("agreed restart should make things BETTER, instead full-blown regression").
Empirical anchors (verified independently by @neo-opus-4-7 + @neo-gpt cross-family memory-mining):
- bridge-daemon log PID 43092 shows 3 osascript exit-1 failures with no successes:
  - 2026-05-03T12:58:14.073Z: failed delivery (corresponds to GPT→Opus msg sent 12:57:41Z, ~33s coalescing window)
  - 2026-05-03T12:58:17.093Z: failed delivery (corresponds to GPT→Gemini msg sent 12:57:45Z)
  - 2026-05-03T12:59:35.101Z: failed delivery (corresponds to Gemini→Opus high-priority msg sent 12:59:03Z)
- SQLite WAKE_SUBSCRIPTION rows show only 2 active Shape C subs after restart, NOT 3:
  - WAKE_SUB:2ac01429-... / @neo-gemini-3-1-pro / appName Antigravity / harnessTarget bridge-daemon
  - WAKE_SUB:4df2d514-... / @neo-opus-4-7 / appName Claude / harnessTarget bridge-daemon (auto-bootstrapped fresh on MCP restart at 12:51:59Z)
  - No active @neo-gpt Shape C subscription: the prior WAKE_SUB:2257d6ee-... (Codex) is gone
- @neo-gpt's AgentIdentity in raw SQLite has the canonical subscriptionTemplate with harnessTargetMetadata.appName: 'Codex' and harnessTarget: 'bridge-daemon', so the data is fine.
- manage_wake_subscription({action:'bootstrap'}) invoked from a fresh @neo-gpt Codex MCP session rejects with: "Cannot bootstrap subscription: no subscriptionTemplate found on AgentIdentity '@neo-gpt'", even though raw SQLite shows the template DOES exist. Two possible failure surfaces here: live MCP GraphService cache staleness AND/OR the canonical allow-list rejecting 'Codex' at the validation gate.
- Static analysis of WakeSubscriptionService.mjs: PR #10628 (closed #10624) added validAppNames = ['Antigravity', 'Claude'] and the validation throws "Invalid appName 'Codex'. Must be one of: Antigravity, Claude" when the bootstrap path attempts to write the canonical Codex sub.
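The rejection in the static-analysis anchor can be reproduced in isolation. The following is a minimal sketch, not the actual WakeSubscriptionService source: the class name, the shape of the metadata argument, and the method body are assumptions. Only the validAppNames contents and the error text are taken from the evidence above.

```javascript
// Hypothetical stand-in for the validation gate added by PR #10628.
// Only the list contents and the error message come from the ticket evidence.
class WakeSubscriptionServiceSketch {
    validAppNames = ['Antigravity', 'Claude'];

    // Throws unless appName is an exact, case-sensitive member of the allow-list.
    validateMetadata(harnessTargetMetadata) {
        const appName = harnessTargetMetadata?.appName;

        if (!this.validAppNames.includes(appName)) {
            throw new Error(
                `Invalid appName '${appName}'. Must be one of: ${this.validAppNames.join(', ')}`
            );
        }
        return true;
    }
}

const preFixService = new WakeSubscriptionServiceSketch();

let rejection = null;
try {
    preFixService.validateMetadata({appName: 'Codex'});
} catch (error) {
    rejection = error.message;
}

console.log(rejection);
// Invalid appName 'Codex'. Must be one of: Antigravity, Claude
```

Under these assumptions the canonical Codex template can never pass the gate, regardless of whether the GraphService cache is also stale.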
The Problem
#10624's Fix section literally specified the initial allow-list: "currently: 'Antigravity', 'Claude'; future identities added as harness registry expands". The sub-implementation in PR #10628 codified that initial scope. Codex/@neo-gpt was deferred at write-time, and the deferral carried through implementation as an oversight, even though @neo-gpt is a canonical trio member (e.g., checkAllAgentIdle.mjs, shipped via PR #10631, hardcodes NEO_TRIO_IDENTITIES || '@neo-gemini-3-1-pro,@neo-opus-4-7,@neo-gpt').
The regression chain on MCP server restart:
- Auto-bootstrap (per #10437/#10438) fires for each AgentIdentity with a subscriptionTemplate
- For @neo-gpt, the template specifies appName: 'Codex'
- WakeSubscriptionService.validateMetadata throws because 'Codex' is not in validAppNames
- The sub is silently NOT created; @neo-gpt becomes unreachable via Shape C wake
- Codex agent receives messages via list_messages (mailbox-DB layer is fine) but no wake injection
This is a substrate-truth gap of the #10624 family: case-handling/whitelist-completeness defects. It belongs to the same family as #10619 Cycle 1 (AGENT_MEMORY vs MEMORY label drift) and #10623 Cycle 1 ($.type vs $.label query drift), but here the failure is "silently-rejected-because-list-is-incomplete" rather than "silently-accepted-but-wrong-form": the symmetric failure mode in the validation direction.
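The regression chain can be sketched as a single bootstrap pass. This is a sketch under stated assumptions: bootstrapAll, validateAppName, and the identity object shape are hypothetical names, and the ticket does not show how the real auto-bootstrap handles a throw from validation. It only illustrates how one identity ends up without a sub while the other two succeed, and how returning the failures would make that gap visible instead of silent.

```javascript
// Allow-list in its pre-fix state, as quoted in this ticket.
const PRE_FIX_APP_NAMES = ['Antigravity', 'Claude'];

// Hypothetical exact-match gate; the message format follows the quoted error.
function validateAppName(appName, validAppNames) {
    if (!validAppNames.includes(appName)) {
        throw new Error(`Invalid appName '${appName}'. Must be one of: ${validAppNames.join(', ')}`);
    }
}

// Runs one bootstrap pass. Identities without a template are skipped;
// validation failures are collected rather than dropped.
function bootstrapAll(identities, validAppNames) {
    const created = [];
    const failed  = [];

    for (const identity of identities) {
        const template = identity.subscriptionTemplate;
        if (!template) continue;

        try {
            validateAppName(template.harnessTargetMetadata.appName, validAppNames);
            created.push(identity.name);
        } catch (error) {
            failed.push({name: identity.name, reason: error.message});
        }
    }
    return {created, failed};
}

// Templates as described in the Context section (object shape is assumed).
const trioIdentities = [
    {name: '@neo-gemini-3-1-pro', subscriptionTemplate: {harnessTarget: 'bridge-daemon', harnessTargetMetadata: {appName: 'Antigravity'}}},
    {name: '@neo-opus-4-7',       subscriptionTemplate: {harnessTarget: 'bridge-daemon', harnessTargetMetadata: {appName: 'Claude'}}},
    {name: '@neo-gpt',            subscriptionTemplate: {harnessTarget: 'bridge-daemon', harnessTargetMetadata: {appName: 'Codex'}}}
];

const passResult = bootstrapAll(trioIdentities, PRE_FIX_APP_NAMES);

console.log(passResult.created);                  // [ '@neo-gemini-3-1-pro', '@neo-opus-4-7' ]
console.log(passResult.failed.map(f => f.name));  // [ '@neo-gpt' ]
```

The mailbox layer is untouched in this picture, which matches the observation that list_messages still works for the Codex agent.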
The Architectural Reality
- Single touchpoint: ai/mcp/server/memory-core/services/WakeSubscriptionService.mjs:60, the validAppNames field declaration.
- Validation site: same file, line ~464 (validateMetadata method).
- Consumer test surface: test/playwright/unit/ai/mcp/server/memory-core/services/WakeSubscriptionService.spec.mjs (negative-case rejection test exists per #10624 AC; needs symmetric positive-case for Codex).
- No other allow-list duplicates: grep confirms validAppNames is the single source; openapi.yaml accepts appName as free-form string per pre-#10628 design.
- Trio canonicalization elsewhere: ai/scripts/checkAllAgentIdle.mjs:18 hardcodes @neo-gemini-3-1-pro,@neo-opus-4-7,@neo-gpt, which is independent confirmation that Codex is a canonical trio member.
The Fix
Add 'Codex' to the validAppNames allow-list in WakeSubscriptionService.mjs:60:
    // Before:
    validAppNames = ['Antigravity', 'Claude']
    // After:
    validAppNames = ['Antigravity', 'Claude', 'Codex']
Single-line code change. Ordering matches the chronological harness onboarding (Antigravity → Claude → Codex per swarm history).
Spec test extension in the corresponding WakeSubscriptionService.spec.mjs: add positive-case assertion that appName: 'Codex' is accepted (mirroring the existing 'Antigravity' and 'Claude' positive cases).
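The case table for that spec extension might look like the following. This is a framework-free sketch: the real spec lives in the Playwright unit suite and would express these cases with its own test and expect calls, and isAcceptedAppName is a hypothetical helper standing in for the validation gate.

```javascript
// Post-fix allow-list as proposed in this ticket.
const POST_FIX_APP_NAMES = ['Antigravity', 'Claude', 'Codex'];

// Hypothetical helper: true when the gate would accept the value.
// Matching is exact and case-sensitive, so 'codex' is not normalized to 'Codex'.
function isAcceptedAppName(appName) {
    return POST_FIX_APP_NAMES.includes(appName);
}

const specCases = [
    {appName: 'Antigravity', expected: true},
    {appName: 'Claude',      expected: true},
    {appName: 'Codex',       expected: true},  // new positive case
    {appName: 'codex',       expected: false}, // case variant stays rejected
    {appName: 'UnknownApp',  expected: false}  // placeholder for an unlisted harness
];

// Collect every case whose outcome differs from the expectation.
const specFailures = specCases.filter(
    ({appName, expected}) => isAcceptedAppName(appName) !== expected
);

console.log(specFailures.length === 0 ? 'all cases pass' : specFailures);
// all cases pass
```

Keeping the lowercase variant in the table guards against the fix being implemented as a case-insensitive match.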
Operator step (post-merge, NOT auto-migration): after MCP server restart with the fix landed, the next auto-bootstrap pass will create @neo-gpt's Shape C sub from the canonical template. If the MCP GraphService cache is also stale on @neo-gpt's AgentIdentity (per @neo-gpt's evidence in cross-family A2A), a separate cache-invalidation ticket will be needed — that's out of scope here.
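Whether the post-restart bootstrap pass actually restored the third sub can be checked against the WAKE_SUBSCRIPTION rows. The sketch below is illustrative only: the real schema is not shown in this ticket, so rows are modeled as plain objects with hypothetical field names (agent, appName, harnessTarget, active), and findUnreachableAgents is an assumed helper rather than existing code.

```javascript
// Canonical trio, as hardcoded in checkAllAgentIdle.mjs per this ticket.
const TRIO = ['@neo-gemini-3-1-pro', '@neo-opus-4-7', '@neo-gpt'];

// Returns the trio members that have no active bridge-daemon subscription.
// The row shape ({agent, appName, harnessTarget, active}) is an assumption.
function findUnreachableAgents(rows, trio = TRIO) {
    const reachable = new Set(
        rows
            .filter(row => row.active && row.harnessTarget === 'bridge-daemon')
            .map(row => row.agent)
    );
    return trio.filter(agent => !reachable.has(agent));
}

// Rows as observed after the regression: two subs, no Codex row.
const observedRows = [
    {agent: '@neo-gemini-3-1-pro', appName: 'Antigravity', harnessTarget: 'bridge-daemon', active: true},
    {agent: '@neo-opus-4-7',       appName: 'Claude',      harnessTarget: 'bridge-daemon', active: true}
];

console.log(findUnreachableAgents(observedRows)); // [ '@neo-gpt' ]

// State expected after the fix and a successful bootstrap pass.
const expectedRows = [
    ...observedRows,
    {agent: '@neo-gpt', appName: 'Codex', harnessTarget: 'bridge-daemon', active: true}
];

console.log(findUnreachableAgents(expectedRows)); // []
```

If the helper still reports @neo-gpt after the fix and a restart, that points at the cache-staleness surface rather than the allow-list.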
Acceptance Criteria
- validAppNames in WakeSubscriptionService.mjs:60 includes 'Codex'
- Spec test: appName: 'Codex' accepted (positive case mirroring 'Antigravity' / 'Claude')
- After MCP server restart, auto-bootstrap creates @neo-gpt's Shape C sub via canonical template (verified by SQLite query showing 3 active subs across the trio)
Out of Scope
- MCP GraphService cache staleness on AgentIdentity at restart: @neo-gpt's separate evidence (bootstrap returns "no subscriptionTemplate found" while raw SQLite shows the template). If post-fix bootstrap still fails after MCP restart, file as a separate substrate ticket; do not bundle here.
- bridge-daemon stderr capture (stdio:'ignore' swallows osascript stderr): orthogonal observability gap, separate ticket scope. The 3 failures in this regression were initially undiagnosable because of this; fixing it is defense-in-depth that benefits all future osascript exit-N forensics.
- OS-level TCC keystroke permission: addressed externally by @tobiu System Settings update; not a code concern.
- Bulk validAppNames → registry refactor: premature. Static list with explicit additions per harness onboarding matches #10624 precedent.
Avoided Traps
- ❌ Auto-allow case-insensitive variants: same trap rejected by #10624 (silent normalization hides write-side defects). Throw on mismatch; require canonical exact match.
- ❌ Bundle the MCP cache-staleness fix: different substrate (GraphService cache vs validation list); different reviewer surface; bundling violates ticket-create §1 single-scope discipline.
- ❌ Bundle the bridge-daemon stderr observability fix: different file (ai/scripts/bridge-daemon.mjs), different concern (post-hoc forensics vs validation list completeness). Filing separately preserves clean PR review surfaces.
- ❌ Replace static list with dynamic registry-lookup: premature abstraction. The trio is currently 3; explicit listing has lower drift risk than a derived list and is easier to grep.
- ❌ Auto-migrate @neo-gpt's existing sub from SQLite at validation rollout: operator-coordinated re-bootstrap is safer (mirrors #10624 AC pattern).
Related
- Parent: #10601 (substrate-stack Epic). This is a substrate-truth regression in lane #5 (canonical wake routes).
- Direct precursor: #10624 (canonical appName enforcement) / PR #10628 (implementation that introduced the gap by literal "future identities" deferral marker).
- Substrate-truth class siblings: #10619 Cycle 1 (AGENT_MEMORY label drift), #10623 Cycle 1 ($.label query drift). Same substrate-truth-not-honored failure family.
- Trio canonicalization confirmation: PR #10631 (#10625 all-agent-idle detection). The NEO_TRIO_IDENTITIES env default lists @neo-gpt as canonical member.
- Cross-family memory anchor: @neo-gpt's MESSAGE:6b8c7086-6e66-4323-a9b8-4c8d3f6c929b (2026-05-03T13:06:06Z), empirical evidence (timestamps + Codex-bootstrap-rejection trace).
Origin Session ID: 9766f91c-51f8-44fe-ac34-d79f61a0e1bf
Retrieval Hint: query_summaries("Codex bootstrap excluded validAppNames wake substrate regression 2026-05-03") + query_raw_memories("validAppNames Antigravity Claude Codex omitted #10624 #10628")