Authored by Claude Opus 4.7 (Claude Code). Session f662d055-a35b-446a-83ff-5fc859604722.
Friction-to-gold follow-up from PR #11509 review cycles 1 and 2 (per @neo-gpt MESSAGE:3c42e809-d1e9-43ca-a10c-9bdc9a08eb7d routing recommendation: "file a tiny follow-up after A so the backup PR does not grow a sidecar surface").
Context
PR #11509 (Lane C of #11503) shipped consumer-side wrapping of 4 manual heavy-maintenance CLI scripts in withHeavyMaintenanceLease. Two CHANGES_REQUESTED review cycles by @neo-gpt surfaced the SAME root failure mode at two different surfaces:
- Cycle 1 (
PRR_kwDODSospM8AAAABAJIdTg): the catch path on lease-acquisition failure fell through to GraphService.decayGlobalTopology() — graph mutation while the lease was NEVER acquired
- Cycle 2 (
PRR_kwDODSospM8AAAABAJKF8w): the success path placed GraphService.decayGlobalTopology() after await withHeavyMaintenanceLease(...) — graph mutation AFTER the helper's finally released the lease
Both fail-classes are substrate-mutation outside the lease window. Both shipped initially because the consumer-side mental model of the lease primitive is non-obvious: developers (and LLMs) naturally read await wrapper(...); doX() as "doX runs AFTER wrapper's effects", but JS await unblocks BEFORE the wrapper's own finally fires. The release-in-finally contract of withHeavyMaintenanceLease is therefore counter-intuitive at the call site.
The Problem
The primitive's API surface (ai/daemons/services/HeavyMaintenanceLeaseService.mjs) currently documents WHAT the wrapper does but NOT the timing-semantics consumers need to honor for substrate-protection correctness. Specifically:
- No JSDoc on
withHeavyMaintenanceLease calls out that the finally-based release happens BEFORE the awaited promise settles
- No reference-impl example shows the "graph-mutation inside an inner-finally" pattern that Lane C's
runSandman.mjs arrived at via 2 review cycles
- The risk class is invisible from the call site — a future consumer wrapping a new heavy script will likely make the same mistake without empirical hints
The MX cost is concrete: 2 full PR review cycles + 2 commits + author + reviewer + 3rd reviewer attention. Friction → gold candidate per AGENTS.md §13.2.
Architectural Reality
ai/daemons/services/HeavyMaintenanceLeaseService.mjs:257-280 — withHeavyMaintenanceLease(task, options) definition + finally-based release
ai/daemons/services/HeavyMaintenanceLeaseService.mjs:163 — acquireHeavyMaintenanceLease core acquisition
buildScripts/ai/runSandman.mjs:176-258 — current reference-impl pattern (decay inside inner finally; landed via PR #11509 cycles 1+2)
test/playwright/unit/ai/daemons/services/HeavyMaintenanceLeaseService.spec.mjs — current spec (8/8) verifies acquisition/release contract but does not pin the consumer-side anti-pattern
test/playwright/unit/ai/buildScripts/laneCManualScriptLeaseAdoption.spec.mjs:113-160 — release-timing invariant test on runSandman.mjs (pattern-pin via offset + structural finally-block regex)
The Fix
Three coordinated edits:
JSDoc augmentation on withHeavyMaintenanceLease in ai/daemons/services/HeavyMaintenanceLeaseService.mjs:257. Add explicit timing-semantic callout above the @param task block:
⚠️ Release timing: the lease is released in this helper's finally block. JS await semantics guarantee the helper's finally runs AFTER the awaited task settles but BEFORE this helper's returned promise resolves. Consequence: code placed AFTER await withHeavyMaintenanceLease(...) runs OUTSIDE the lease window. If your task has side effects that must remain inside the lease window (e.g., graph mutation, Chroma writes, SQLite writes), place them inside an inner try { ... } finally { /* side-effect */ } of the task itself — NOT after the await. See buildScripts/ai/runSandman.mjs for the canonical pattern.
Reference-impl example in the JSDoc — a runnable code block showing the inner-finally pattern, contrasting with the wrong shape:
await withHeavyMaintenanceLease(async () => { }, {owner: 'X'});
await mutateGraph();
await withHeavyMaintenanceLease(async () => {
try {
} finally {
await mutateGraph();
}
}, {owner: 'X'});
Spec assertion in HeavyMaintenanceLeaseService.spec.mjs pinning the release-timing contract: a test that wraps a task with an inner finally calling a probe, asserts probe-call timestamp is BEFORE the release-probe timestamp (release-after-task ordering). This becomes the contract anchor for the JSDoc claim — JSDoc + spec must agree.
Contract Ledger Matrix
| Target Surface |
Source of Authority |
Proposed Behavior |
Fallback |
Docs |
Evidence |
withHeavyMaintenanceLease JSDoc |
HeavyMaintenanceLeaseService.mjs:257 + this ticket |
Explicit release-timing callout + ✓/✗ reference-impl example |
N/A |
The JSDoc IS the doc |
Static read of JSDoc + visual confirmation in ask_knowledge_base results |
withHeavyMaintenanceLease release-timing contract |
HeavyMaintenanceLeaseService.spec.mjs + this ticket |
Spec test pins "task's inner finally runs BEFORE wrapper's release" ordering |
N/A |
Spec name self-documents |
Test fail-mode: any future refactor that releases before task settles fails |
Acceptance Criteria
Out of Scope
- Re-entrancy / owner-inheritance semantics (e.g., child process of an owner-X task inheriting the parent's lease) — that's a deeper architectural change with its own design tradeoffs; file separately if needed
- Auto-detection of the wrong-shape pattern via static analysis / lint — too speculative for v1; JSDoc + reference-impl + spec is sufficient
- Refactoring
withHeavyMaintenanceLease to return-before-release — would break the release-on-task-completion guarantee; not the right substrate change
Avoided Traps
- Adding the guidance ONLY in JSDoc without a spec assertion: rejected — JSDoc rots silently when a future refactor changes the release-timing behavior. Spec assertion is the load-bearing contract.
- Adding the guidance ONLY in a separate guide /
learn/ doc: rejected — consumers reading HeavyMaintenanceLeaseService.mjs won't see it. JSDoc lives at the consumption point.
- Adding the guidance as a section in
AGENTS.md / atlas: rejected — wrong tier per substrate accretion defense (§13). Consumer-side guidance for a specific primitive belongs at the primitive's API surface, not in always-loaded substrate.
Related
- Source friction: PR #11509 cycles 1 + 2 (review IDs
PRR_kwDODSospM8AAAABAJIdTg, PRR_kwDODSospM8AAAABAJKF8w)
- Substrate primitive: PR #11506 / #11505 (HeavyMaintenanceLeaseService)
- Consumer impl reference:
buildScripts/ai/runSandman.mjs post-PR #11509 cycles 1+2
- Routing authority: @neo-gpt MESSAGE:3c42e809-d1e9-43ca-a10c-9bdc9a08eb7d ("file a tiny follow-up after A so the backup PR does not grow a sidecar surface")
- Umbrella context: #11503 (broader heavy-maintenance mutex epic; this ticket is NOT a lane, just a consumer-guidance follow-up)
Handoff Retrieval Hints
- Retrieval Hint:
withHeavyMaintenanceLease release timing finally consumer guidance
- Retrieval Hint: PR #11509 cycles 1+2 (the empirical anchor cycle)
- Retrieval Hint: Commit SHAs
375a1e90c (cycle-1 fix) + 3631fb361 (cycle-2 fix) — the consumer-side patterns this ticket documents
Origin Session ID: f662d055-a35b-446a-83ff-5fc859604722
Authored by Claude Opus 4.7 (Claude Code). Session f662d055-a35b-446a-83ff-5fc859604722.
Friction-to-gold follow-up from PR #11509 review cycles 1 and 2 (per @neo-gpt MESSAGE:3c42e809-d1e9-43ca-a10c-9bdc9a08eb7d routing recommendation: "file a tiny follow-up after A so the backup PR does not grow a sidecar surface").
Context
PR #11509 (Lane C of #11503) shipped consumer-side wrapping of 4 manual heavy-maintenance CLI scripts in
withHeavyMaintenanceLease. Two CHANGES_REQUESTED review cycles by @neo-gpt surfaced the SAME root failure mode at two different surfaces:PRR_kwDODSospM8AAAABAJIdTg): the catch path on lease-acquisition failure fell through toGraphService.decayGlobalTopology()— graph mutation while the lease was NEVER acquiredPRR_kwDODSospM8AAAABAJKF8w): the success path placedGraphService.decayGlobalTopology()afterawait withHeavyMaintenanceLease(...)— graph mutation AFTER the helper'sfinallyreleased the leaseBoth fail-classes are substrate-mutation outside the lease window. Both shipped initially because the consumer-side mental model of the lease primitive is non-obvious: developers (and LLMs) naturally read
await wrapper(...); doX()as "doX runs AFTER wrapper's effects", but JS await unblocks BEFORE the wrapper's ownfinallyfires. The release-in-finally contract ofwithHeavyMaintenanceLeaseis therefore counter-intuitive at the call site.The Problem
The primitive's API surface (
ai/daemons/services/HeavyMaintenanceLeaseService.mjs) currently documents WHAT the wrapper does but NOT the timing-semantics consumers need to honor for substrate-protection correctness. Specifically:withHeavyMaintenanceLeasecalls out that thefinally-based release happens BEFORE the awaited promise settlesrunSandman.mjsarrived at via 2 review cyclesThe MX cost is concrete: 2 full PR review cycles + 2 commits + author + reviewer + 3rd reviewer attention. Friction → gold candidate per AGENTS.md §13.2.
Architectural Reality
ai/daemons/services/HeavyMaintenanceLeaseService.mjs:257-280—withHeavyMaintenanceLease(task, options)definition + finally-based releaseai/daemons/services/HeavyMaintenanceLeaseService.mjs:163—acquireHeavyMaintenanceLeasecore acquisitionbuildScripts/ai/runSandman.mjs:176-258— current reference-impl pattern (decay inside inner finally; landed via PR #11509 cycles 1+2)test/playwright/unit/ai/daemons/services/HeavyMaintenanceLeaseService.spec.mjs— current spec (8/8) verifies acquisition/release contract but does not pin the consumer-side anti-patterntest/playwright/unit/ai/buildScripts/laneCManualScriptLeaseAdoption.spec.mjs:113-160— release-timing invariant test onrunSandman.mjs(pattern-pin via offset + structural finally-block regex)The Fix
Three coordinated edits:
JSDoc augmentation on
withHeavyMaintenanceLeaseinai/daemons/services/HeavyMaintenanceLeaseService.mjs:257. Add explicit timing-semantic callout above the@param taskblock:Reference-impl example in the JSDoc — a runnable code block showing the inner-finally pattern, contrasting with the wrong shape:
// ❌ WRONG: graph mutation happens outside the lease (post-await = post-release) await withHeavyMaintenanceLease(async () => { /* heavy work */ }, {owner: 'X'}); await mutateGraph(); // <-- lease already released by helper's finally // ✅ RIGHT: graph mutation inside inner finally runs while lease still held await withHeavyMaintenanceLease(async () => { try { /* heavy work */ } finally { await mutateGraph(); // <-- runs before await settles, before helper's finally } }, {owner: 'X'});Spec assertion in
HeavyMaintenanceLeaseService.spec.mjspinning the release-timing contract: a test that wraps a task with an inner finally calling a probe, asserts probe-call timestamp is BEFORE the release-probe timestamp (release-after-task ordering). This becomes the contract anchor for the JSDoc claim — JSDoc + spec must agree.Contract Ledger Matrix
withHeavyMaintenanceLeaseJSDocHeavyMaintenanceLeaseService.mjs:257+ this ticketask_knowledge_baseresultswithHeavyMaintenanceLeaserelease-timing contractHeavyMaintenanceLeaseService.spec.mjs+ this ticketAcceptance Criteria
withHeavyMaintenanceLeaseJSDoc inHeavyMaintenanceLeaseService.mjsincludes an explicit release-timing callout namingfinally-based release AND the post-await-runs-outside-lease consequencerunSandman.mjsas a referenced canonical patternHeavyMaintenanceLeaseService.spec.mjsempirically pins the release-timing ordering: task's innerfinallyruns BEFORE wrapper's release call (assertion via probe-timestamps or call-order array)HeavyMaintenanceLeaseService.spec.mjscontinue to pass; new test brings the count to 9/9ask_knowledge_base(query='how do I correctly use withHeavyMaintenanceLease for graph mutation')returns a synthesized answer that includes the inner-finally pattern (verifies the JSDoc augmentation actually flows into the KB synthesis layer)Out of Scope
withHeavyMaintenanceLeaseto return-before-release — would break the release-on-task-completion guarantee; not the right substrate changeAvoided Traps
learn/doc: rejected — consumers readingHeavyMaintenanceLeaseService.mjswon't see it. JSDoc lives at the consumption point.AGENTS.md/ atlas: rejected — wrong tier per substrate accretion defense (§13). Consumer-side guidance for a specific primitive belongs at the primitive's API surface, not in always-loaded substrate.Related
PRR_kwDODSospM8AAAABAJIdTg,PRR_kwDODSospM8AAAABAJKF8w)buildScripts/ai/runSandman.mjspost-PR #11509 cycles 1+2Handoff Retrieval Hints
withHeavyMaintenanceLease release timing finally consumer guidance375a1e90c(cycle-1 fix) +3631fb361(cycle-2 fix) — the consumer-side patterns this ticket documentsOrigin Session ID: f662d055-a35b-446a-83ff-5fc859604722