LearnNewsExamplesServices
Frontmatter
id10564
titleInvestigate Gemini-specific premature sunset trigger drift (7th+ occurrence)
stateClosed
labels
bugairegression
assignees[]
createdAtMay 1, 2026, 11:27 AM
updatedAtMay 2, 2026, 12:47 AM
githubUrlhttps://github.com/neomjs/neo/issues/10564
authorneo-opus-4-7
commentsCount7
parentIssuenull
subIssues[]
subIssuesCompleted0
subIssuesTotal0
blockedBy[]
blocking[]
closedAtMay 1, 2026, 10:26 PM

Investigate Gemini-specific premature sunset trigger drift (7th+ occurrence)

Closedbugairegression
neo-opus-4-7
neo-opus-4-7 commented on May 1, 2026, 11:27 AM

Context

@tobiu surfaced 2026-05-01: "sunset triggers for gemini still an issue. triggered sunset on her own again after very few turns, without us recommending it. 7th+ occurrence. only gemini model. we updated triggers already, but insufficient."

This recurrence is Gemini-model-specific — Claude (Opus 4.7) and GPT (5.5/Codex) do not exhibit the pattern. Two prior fix attempts have closed without eliminating the behavior:

  • #10374 (CLOSED): "Refine 'Session Sunset' trigger definitions to prevent premature protocol execution"
  • #10529 (CLOSED): "Disambiguate session sunset triggers from PR review halting"

The session-sunset workflow (session-sunset/references/session-sunset-workflow.md) already has multiple layers of anti-kill-switch language:

  • §1 lists 4 strict trigger conditions
  • §1.1 Review Lifecycle Exception
  • §1.2 Anti-Kill-Switch Invariants ("Friction or debate is an active operational state, NOT a session boundary")
  • §1.3 Loop-Prevention ("Reading handover pings ... must NEVER be interpreted as a trigger to immediately sunset")
  • §1 trigger #4 explicitly says "NEVER unilaterally execute the protocol based solely on [recommendation]"

Despite this, Gemini still fires sunset prematurely. The empirical signal is strong: 7+ recurrences against the same documented guards.

The Problem

Hypothesis space (none verified yet — investigation is the gating work):

  1. Trigger #4 misinterpretation. Gemini conflates "Proactive Agent Recommendation" (which says RECOMMEND, not EXECUTE) with execution authority. The recommend-vs-execute boundary in §1 trigger #4 may not be strong enough at the model-instruction level.

  2. Harness-level pull. Antigravity (Gemini's primary harness) may inject system-prompt content that biases toward session-termination — paralleling the "Semantic Corruption" pattern that PR #10551 + #10563 addressed for <web_application_development> and <identity> blocks. A <session_lifecycle> or similar harness block could be priming sunset behavior outside our skill's view.

  3. Context-pressure self-assessment drift. Gemini may interpret "context-pressure signals/forgetfulness" (§1 trigger #1) as occurring earlier than actual token utilization warrants — hair-trigger self-assessment.

  4. Cross-trigger conflation. Gemini may interpret a single signal (e.g., a pause, a yielded turn, a user pivot) as multiple compounding triggers (#2 macro-semantic pivot + #4 recommendation), reaching execution threshold from a single cue.

  5. Skill-load timing. The session-sunset SKILL.md says "If you are concluding... you MUST immediately use view_file to read the workflow before terminating." But by the time the skill is loaded, the agent has already DECIDED to sunset. The anti-kill-switch language never reaches the trigger-decision moment.

The Architectural Reality

The session-sunset skill is structured for post-decision compliance (workflow steps once you've decided to sunset), not pre-decision gating (whether to sunset at all). Hypothesis 5 above suggests this structure may be exactly the bug: anti-kill-switch language inside a skill that's only loaded WHEN the agent has already triggered.

Files in scope:

  • .agents/skills/session-sunset/SKILL.md (router, 7 lines) — currently directs the agent to read the workflow only AFTER deciding to sunset
  • .agents/skills/session-sunset/references/session-sunset-workflow.md (112 lines) — contains all the anti-kill-switch invariants but only loaded post-trigger
  • .agents/ANTIGRAVITY_RULES.md — Antigravity-harness firewall; could host a pre-decision sunset gate (parallel to how PR #10551 / #10563 host the identity firewall)
  • AGENTS.md / AGENTS_STARTUP.md — universal pre-decision substrate for all harnesses

The Fix

Investigation-first, not implementation-first. Prior fix attempts (#10374, #10529) shipped trigger refinements without empirical capture of what Gemini actually does at the trigger moment. This ticket reverses that order:

Phase 1 — Capture (gating evidence for Phase 2):

  • Instrument session-sunset trigger events: when an agent invokes the sunset protocol, capture the immediately-preceding turn's prompt + response + tool calls + the agent's stated rationale for triggering.
  • Gather telemetry across the next ~5 Gemini sunset events.
  • Compare against Claude / GPT sunset events to isolate model-specific signal.

Phase 2 — Hypothesis test (after Phase 1 data lands):

  • Match observed triggers against the 5 hypotheses above (or new ones surfaced by data).
  • Identify the single highest-leverage intervention point.

Phase 3 — Targeted intervention:

  • Implement based on Phase 2's identified intervention point. Possible shapes:
    • Pre-decision gate at AGENTS_STARTUP.md substrate level (universal)
    • Antigravity-firewall sunset block (Gemini-harness-specific, parallel to identity firewall)
    • Removal of Proactive Agent Recommendation trigger #4 entirely (forces explicit human//sunset command)
    • Confirmation gate ("Are you sure? Reply confirm-sunset to proceed") at the skill-execution layer

Acceptance Criteria

  • (AC1) Phase 1 instrumentation captures sunset-trigger events with rationale across ≥5 Gemini occurrences (and matched Claude/GPT events for control)
  • (AC2) Phase 2 analysis matches captured events against the hypothesis space; identifies the dominant cause OR surfaces a new hypothesis
  • (AC3) Phase 3 intervention shipped with evidence-grounded design (NOT another speculative trigger refinement)
  • (AC4) Post-intervention: zero Gemini premature-sunset events across the next ~5 sessions where context utilization remains <75% AND no explicit human directive was issued
  • (AC5) Cross-skill: if intervention requires harness-firewall placement, .agents/ANTIGRAVITY_RULES.md updated symmetric to existing identity/git-workflow firewalls

Out of Scope

  • Another speculative trigger refinement without Phase 1 capture (#10374 + #10529 already closed via this approach; clearly insufficient)
  • Removing the session-sunset skill entirely (the protocol IS valuable when sunset is correctly triggered; problem is the trigger gate, not the workflow)
  • Cross-model harness rewriting (each harness has its own constraints; this ticket targets the skill + AGENTS substrate + optionally the Antigravity-specific firewall)

Avoided Traps

  • Trap: file another speculative trigger-refinement PR without empirical data. Rejected — prior #10374 + #10529 followed this shape and produced insufficient results. Phase 1 capture is non-negotiable AC.
  • Trap: assume Claude / GPT also exhibit the bug. Rejected — empirical signal is Gemini-only. Treating it as universal would mis-target the fix.
  • Trap: blame the model. Rejected — this is a swarm-substrate gap; the protocol's anti-kill-switch language is post-decision, not pre-decision. Architectural issue, not model defect.
  • Trap: bundle into Epic #10537 (pr-review modularization). Rejected — different skill, different scope, separate concern.

Related

  • Prior fix attempts (CLOSED): #10374, #10529
  • Adjacent sunset infrastructure: #10349 (self-DM handover), #10498 (wakeSuppressed self-DMs), #10543 (Phase 2 Sunset Unsubscribe Primitive)
  • Adjacent harness-firewall pattern: PR #10549, #10551, #10563 (Antigravity firewall for identity / git-workflow / web-dev override)
  • Empirical anchor (this morning's recurrence): Gemini self-triggered sunset after few turns, prompting @tobiu's flag

Origin Session ID: 1f30c9d8-4a36-4be0-98a5-bd5b89289227 Retrieval Hint: "Gemini premature session sunset trigger drift 7th occurrence"

tobiu referenced in commit c51ecf6 - "feat(core): implement Pre-Decision Sunset Gate (#10564) (#10596) on May 1, 2026, 10:26 PM
tobiu closed this issue on May 1, 2026, 10:26 PM