Frontmatter
id: 11507
title: Lane C of #11503 — wrap manual heavy maintenance scripts with HeavyMaintenanceLeaseService
state: Closed
labels: enhancement, ai, model-experience
assignees: neo-opus-4-7
createdAt: May 17, 2026, 1:29 AM
updatedAt: May 17, 2026, 3:08 AM
githubUrl: https://github.com/neomjs/neo/issues/11507
author: neo-opus-4-7
commentsCount: 0
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 17, 2026, 3:08 AM

Lane C of #11503 — wrap manual heavy maintenance scripts with HeavyMaintenanceLeaseService

neo-opus-4-7 commented on May 17, 2026, 1:29 AM

Resolves Lane C of #11503 umbrella. Builds on the lease primitive shipped by PR #11506 (#11505).

FAIR-band: in-band [10/30] — prio-0 per operator elevation 2026-05-16T22:55Z; closes the gh-CLI bypass surface at the maintenance-script layer (sibling to PR #11502 closing the same pattern at the pr-review surface).

Premise

PR #11506 / #11505 shipped HeavyMaintenanceLeaseService with a withHeavyMaintenanceLease(task, {owner, ...}) wrapper. Lane C wires the manual CLI scripts into that lease so they cannot collide with orchestrator-owned heavy work or with each other.

Empirical anchor (today's wedge at 19:27Z): the orchestrator's kbSync subprocess wedged, and a manual npm run ai:sync-kb would have run in parallel and collided with it had the operator triggered one. The substrate gap is real, and it is exactly the failure class Lane C closes.

Scope (per #11503 ACs 4-7 and lead-sync from @neo-gpt MESSAGE:20f48f73)

| Script | AC | Lease owner string |
| --- | --- | --- |
| buildScripts/ai/runSandman.mjs | AC4 | sandman |
| buildScripts/ai/syncKnowledgeBase.mjs | AC5 | kbSync |
| buildScripts/ai/backup.mjs | AC6 | backup |
| buildScripts/ai/syncGithubWorkflow.mjs | AC7 | syncGithubWorkflow (whole-run guarded per GPT's call; Stage 1/Stage 2 split deferred) |

Prescription

For each script, wrap the existing main() (or equivalent top-level async work) with withHeavyMaintenanceLease:

import {withHeavyMaintenanceLease} from '../../ai/services/github-workflow/...' // path TBD per services.mjs barrel

const result = await withHeavyMaintenanceLease(
    async (lease) => {
        // ... existing script body ...
    },
    {owner: 'sandman', reason: 'manual-cli'}
);

if (result.status === 'held') {
    console.log(`⏸️  Deferred: heavy-maintenance lease held by '${result.lease.owner}' (reason='${result.lease.reason}', pid=${result.lease.pid}, acquiredAt=${result.lease.acquiredAt}).`);
    console.log(`   This script will not run while another heavy-maintenance task is active.`);
    process.exit(0);
}

result.status === 'held' is a non-error exit (matches GPT's AC9 "deferred due to active heavy-maintenance lease" log shape).
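
The contract this prescription relies on (run the task when the lease is free, return status 'held' immediately otherwise, release in a finally block) can be sketched with an in-memory stand-in. This is not the HeavyMaintenanceLeaseService from PR #11506: createInMemoryLease, the 'acquired' status value, and the value field are assumptions made for illustration, and the real primitive has to coordinate across processes, which this single-process sketch does not attempt.

```javascript
// Hypothetical single-process stand-in for the shared lease.
// Only the 'held' result shape (status + lease metadata) comes from the ticket.
function createInMemoryLease() {
    let active = null;

    return async function withLease(task, {owner, reason}) {
        if (active) {
            // Someone else holds the lease: report it immediately, never wait.
            return {status: 'held', lease: active};
        }

        active = {owner, reason, pid: process.pid, acquiredAt: new Date().toISOString()};

        try {
            // 'acquired' and `value` are assumed names for the success shape.
            return {status: 'acquired', value: await task(active)};
        } finally {
            // Runs on success and on failure, so a throwing task cannot leak the lease.
            active = null;
        }
    };
}

async function demo() {
    const withLease = createInMemoryLease();

    const outer = await withLease(async () => {
        // While kbSync holds the lease, a second caller is turned away.
        const inner = await withLease(async () => 'never runs', {owner: 'backup', reason: 'manual-cli'});
        return inner.status;
    }, {owner: 'kbSync', reason: 'manual-cli'});

    console.log(outer.status, outer.value); // logs: acquired held
}

demo();
```

The inner call returns without running its task, which is the defer-and-exit behaviour the scripts build on; the finally block is what makes a later invocation succeed even if the first task threw.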

Acceptance Criteria

  • AC1: All 4 scripts (runSandman.mjs, syncKnowledgeBase.mjs, backup.mjs, syncGithubWorkflow.mjs) wrap their main body with withHeavyMaintenanceLease
  • AC2: When lease is held by another owner, scripts exit 0 with a clear stdout message naming the active owner + acquisition timestamp
  • AC3: When lease is acquired, scripts execute existing behavior unchanged + auto-release on completion (via withHeavyMaintenanceLease finally block)
  • AC4: Spec coverage matches sibling-script-test precedent: per-script test asserting both code paths (acquired + held) — at least 1 test per script
  • AC5: No regression in non-lease behavior (existing tests + smoke runs continue to work)
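
One way to make the acquired and held paths of AC4 testable without touching the real lease is to inject the wrapper. The sketch below is a minimal illustration under that assumption: runGuarded, its return shape, the two fake wrappers, and the fixture values (pid, acquiredAt) are all hypothetical and not part of the shipped scripts or of PR #11506.

```javascript
// Hypothetical helper: the lease wrapper is injected so a test can fake it.
async function runGuarded({withLease, owner, main, log = console.log}) {
    const result = await withLease(main, {owner, reason: 'manual-cli'});

    if (result.status === 'held') {
        const {owner: holder, acquiredAt} = result.lease;
        // AC2: name the active owner and the acquisition timestamp.
        log(`Deferred: heavy-maintenance lease held by '${holder}' (acquiredAt=${acquiredAt}).`);
        return {exitCode: 0, deferred: true};
    }

    // AC3: main() already ran inside the wrapper; nothing else to do here.
    return {exitCode: 0, deferred: false};
}

// Fake wrapper for the acquired path: runs the task and reports success.
// The 'acquired' status value is an assumption.
const acquiredLease = async (task, opts) => ({
    status: 'acquired',
    value: await task({owner: opts.owner})
});

// Fake wrapper for the held path: never runs the task.
// All lease field values below are made-up fixture data.
const heldLease = async () => ({
    status: 'held',
    lease: {owner: 'kbSync', reason: 'manual-cli', pid: 1234, acquiredAt: '2026-01-01T00:00:00Z'}
});
```

A per-script test would then call runGuarded once with each fake and assert that main ran (or did not run), that the exit code is 0 in both cases, and that the deferral message names the holder.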

Avoided Traps

  • Splitting syncGithubWorkflow into Stage 1 (GitHub IO, light) vs Stage 2 (Chroma writes, heavy): rejected per GPT's lead-sync (MESSAGE:20f48f73). Whole-run guard for v1; Stage-split refinement is a separate future ticket if lease hold-time becomes empirical pain.
  • Adding lease semantics to npm run ai:check-substrate-size / lint-skill-manifest / check-retired-primitives: rejected — these are pure-static-validation scripts (no Chroma/SQLite/LLM writes); not in the heavy-maintenance set.
  • Building a script-specific lock primitive in each script: rejected — that's exactly the "per-script private locks" trap #11503 explicitly named. The shared lease primitive (PR #11506) is the substrate.
  • Blocking the script on held-lease (wait-loop): rejected — withHeavyMaintenanceLease returns status: 'held' immediately; scripts should defer-and-exit, not wait. Operator can re-invoke later. Matches non-error deferral semantics from AC9.

Related

  • Parent umbrella: #11503 (Enforce heavy-maintenance mutex across Agent OS tasks)
  • Substrate dependency: PR #11506 / #11505 (shared lease primitive)
  • Conceptual sibling at different surface: PR #11502 / #11495 (CI lint for gh-pr-review CLI bypass) — same Helpful-Assistant CLI-bypass failure pattern, different substrate surface
  • Empirical anchor: today's wedge cascade at 19:27Z (the manual kb-sync + orchestrator kb-sync collision that the lease primitive and this Lane C close together)

Out of Scope

  • Lane A — adding backup to DEFAULT_HEAVY_MAINTENANCE_TASK_NAMES in Orchestrator.mjs (folded by GPT into Lane B per their MESSAGE:20f48f73; will be handled separately if not part of PR #11506)
  • Lane D — PrimaryRepoSyncService.runKbSync() nested cascade lease wiring (GPT considering during Lane B API design)
  • Lane E — observability/stale-adoption surfacing (separate ticket if needed)
tobiu referenced in commit a5c6380 - "feat(buildScripts): wrap manual heavy-maintenance scripts with shared lease — Lane C of #11503 (#11507) (#11509)" on May 17, 2026, 3:08 AM
tobiu closed this issue on May 17, 2026, 3:08 AM