Context
Child of #10647. The 2026-05-03 regression showed that the team repeatedly accepted layer-local success as full-loop wake success. The clearest example is #10644: the bridge believed it delivered to Antigravity because osascript exited 0, but the payload landed in editor/file content instead of the agent prompt surface.
The wake substrate needs a validation matrix that says exactly what must be true before anyone can call wake delivery working again.
Duplicate Sweep Notes
Creation sweep performed as part of #10647:
- Live latest-20 open GitHub issues were read with number/title/author/labels/URL. Adjacent entries included #10644 (Antigravity-specific prompt-surface bug), #10645 (Codex bootstrap), #10633/#10627 (heartbeat/recovery dependencies), #10601 (parent recovery epic), and #10517 (routing semantics). None define a cross-harness prompt-landing acceptance matrix.
- Local resource search found #10440 noting the old trap: a bootstrap check passed even though full-loop wake delivery had not been validated. That is historical adjacency, not an open matrix ticket.
- ask_knowledge_base(type: 'ticket') found no equivalent ticket for prompt-landing validation.
The Problem
"Wake delivered" is currently ambiguous. It can mean any of:
- the A2A message was persisted;
- list_messages can read it;
- a wake subscription exists;
- a coalesced/raw event was emitted;
- bridge-daemon attempted an adapter;
- osascript exited 0;
- the correct app became frontmost;
- the payload actually reached the agent prompt input;
- the agent accepted/submitted the prompt without corrupting files or spawning a wrong session.
Only the last two prove the user-visible effect. The substrate repeatedly regressed because earlier layers were treated as sufficient proof.
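The layering above can be made explicit in code. The following is a minimal sketch in Python, not code from the repository: `WakeStage`, `proves_user_visible_effect`, and `describe_claim` are hypothetical names chosen for illustration. It only encodes the ordering of the list and the rule that the last two stages alone count as proof.

```python
from enum import IntEnum


class WakeStage(IntEnum):
    """Layers a "wake delivered" claim might refer to, in pipeline order.

    Member names are illustrative, not identifiers from the real codebase.
    """

    MESSAGE_PERSISTED = 1
    MESSAGE_LISTABLE = 2
    SUBSCRIPTION_EXISTS = 3
    EVENT_EMITTED = 4
    ADAPTER_ATTEMPTED = 5
    OSASCRIPT_EXIT_ZERO = 6
    APP_FRONTMOST = 7
    PAYLOAD_IN_PROMPT_INPUT = 8
    PROMPT_ACCEPTED_CLEANLY = 9


def proves_user_visible_effect(stage: WakeStage) -> bool:
    """Only the last two stages demonstrate the user-visible effect."""
    return stage >= WakeStage.PAYLOAD_IN_PROMPT_INPUT


def describe_claim(deepest_verified: WakeStage) -> str:
    """Phrase a status line that cannot be mistaken for full-loop success."""
    if proves_user_visible_effect(deepest_verified):
        return f"wake delivery verified at {deepest_verified.name}"
    return (
        f"layer-local success only ({deepest_verified.name}); "
        "prompt landing unproven"
    )
```

Under this sketch, a bridge that has only seen osascript exit 0 would report layer-local success, which is the distinction #10644 shows was missing.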
Historical Memory Context
Relevant Memory Core anchors:
- summary_0763a9bf-1052-4a2f-99f3-a8e0e14f1671: wake delivery previously required multiple fixes across permissions, tab focus, atomic paste, raw/digest envelope, and clone state.
- summary_a8592d87-f132-42c5-af6f-0e066fd5f428: a raw/coalescing change broke another delivery path, showing that wire-format and adapter compatibility must be tested as a loop.
- summary_bf59d6c4-e250-44a2-b4b2-5bffae40ab5f: wake substrate reviews uncovered silent metadata stripping and led to mailbox protocol discipline.
The Architectural Reality
The validation surface spans:
- Memory Core A2A storage/listing (add_message, list_messages).
- WakeSubscriptionService bootstrap/active subscription state.
- Coalescing/raw wake event shape.
- Bridge daemon or native harness transport.
- Harness-specific prompt surface behavior:
  - Claude Desktop / Claude app tab or prompt field.
  - Antigravity IDE agent composer, not editor files.
  - Codex Desktop prompt/thread surface.
No single unit test can prove all of this for all harnesses. The matrix should combine deterministic unit/integration tests with explicit manual or semi-automated live validation evidence.
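As one example of a semi-automated check, the #10644 failure mode (payload written into editor/file content) can be partly detected by comparing file hashes before and after a wake attempt. This is a sketch under assumptions: `snapshot` and `changed_files` are hypothetical helpers, and the workspace root to watch is whatever directory the harness has open.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict:
    """Map each file under root (by relative path) to a SHA-256 of its bytes."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before: dict, after: dict) -> list:
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

Take one snapshot before triggering the wake and one after; a non-empty `changed_files` result suggests the payload landed in file content. The check is partial: text pasted into an unsaved editor buffer never reaches disk, so a clean result still needs the manual prompt-surface evidence described above.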
The Fix
Create a cross-harness wake validation matrix and wire it into the #10647 reactivation gate.
Minimum matrix columns:
- message persisted;
- unread/list state correct;
- subscription bootstrap and metadata correct;
- wake event emitted with expected envelope;
- adapter selected correct app/session target;
- prompt payload lands in the agent prompt/composer surface;
- no editor/file content is modified;
- no fresh session is spawned unless explicit sunset/unsubscribe permits it;
- recipient can act on the prompt or a clear blocked signal is emitted;
- evidence artifact captured in PR/comment/log.
Minimum rows:
- Claude Desktop / Claude app wake path.
- Antigravity IDE / Gemini wake path.
- Codex Desktop wake path.
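The columns and rows above could be represented as a small data structure that the reactivation gate consults. This is a minimal sketch, not the intended implementation: the column and row keys, `Result`, `Cell`, and `WakeMatrix` are all hypothetical names, and treating a PASS without an evidence reference as a blocker is an assumption drawn from the "evidence artifact captured" column.

```python
from dataclasses import dataclass, field
from enum import Enum

# Hypothetical keys mirroring the minimum columns and rows listed above.
COLUMNS = (
    "message_persisted",
    "list_state_correct",
    "subscription_metadata_correct",
    "wake_event_envelope_ok",
    "adapter_target_correct",
    "payload_in_prompt_surface",
    "no_file_content_modified",
    "no_unpermitted_session_spawn",
    "recipient_acts_or_blocked_signal",
    "evidence_captured",
)
ROWS = ("claude_desktop", "antigravity_ide", "codex_desktop")


class Result(Enum):
    PASS = "pass"
    FAIL = "fail"
    UNTESTED = "untested"


@dataclass
class Cell:
    result: Result = Result.UNTESTED
    evidence: str = ""  # reference to a PR comment or log; "" means none


def _empty_cells() -> dict:
    return {(row, col): Cell() for row in ROWS for col in COLUMNS}


@dataclass
class WakeMatrix:
    cells: dict = field(default_factory=_empty_cells)

    def record(self, row: str, column: str, result: Result, evidence: str = "") -> None:
        """Store the outcome for one harness/check pair."""
        if (row, column) not in self.cells:
            raise KeyError(f"unknown matrix cell: {row}/{column}")
        self.cells[(row, column)] = Cell(result, evidence)

    def blockers(self) -> list:
        """Cells that are not PASS, or are PASS with no evidence reference."""
        return sorted(
            key
            for key, cell in self.cells.items()
            if cell.result is not Result.PASS or not cell.evidence
        )

    def gate_open(self) -> bool:
        """The gate opens only when every cell passes with evidence."""
        return not self.blockers()
```

A fresh `WakeMatrix()` starts with every cell UNTESTED, so the gate is closed by default, and a single failing cell such as `("antigravity_ide", "payload_in_prompt_surface")` keeps it closed no matter how many earlier layers pass.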
Acceptance Criteria
Out of Scope
- Fixing any individual harness row failure. Failures should create or link specific tickets such as #10644/#10645.
- Building a full native UI automation layer for every harness in this ticket.
- Replacing #10517 HarnessPresence/wakePolicy routing semantics.
- Reactivating heartbeat.
Avoided Traps
- Trap: bridge log says delivered, therefore done. Rejected. Prompt landing is the observable effect.
- Trap: require impossible full automation before documenting the gate. Rejected. Manual evidence is acceptable where harness UI automation is brittle, but the matrix must make the gap explicit.
- Trap: test only the currently broken harness. Rejected. Cross-harness regression is the pattern.
- Trap: collapse storage/listing into wake delivery. Rejected. A2A mailbox can work while wake delivery is broken.
Related
- Parent epic: #10647.
- Grandparent epic: #10601.
- Antigravity prompt-surface regression: #10644.
- Codex bootstrap regression: #10645.
- Routing model: #10517.
- Historical full-loop validation miss: #10440.
Origin Session ID: 89b259c3-27ec-4afb-baaf-fd39b55bffe1
Retrieval Hint: wake prompt landing matrix bridge delivered not enough Antigravity file write Codex Claude harness validation.