Context
Surfaced during post-#11592-merge V-B-A on ai:sync-github-workflow (2026-05-18 ~20:55Z). Two [ARCHIVE ANOMALY] WARN emitted for issue #7910 within a single sync run — once in reconcile phase, once in pull phase:
[WARN] 🚨 [ARCHIVE ANOMALY] Issue #7910 closedAt shift detected: moving from bucket 'v11.12.0' to 'v11.13.0'. Dry-run review required.
Operator-direction 2026-05-18 ~21:05Z (in-session paraphrase): "VBA => reducing noise (for all real synced items) !== regression bug where an item got moved, that was not changed." Cycle-1 V-B-A pattern-matched against PR #11486 (which reduced WARN noise for genuinely-changed items) and falsely concluded "known intentional behavior." Operator-corrected V-B-A re-investigated and confirmed: this is a real regression — an UNCHANGED item is getting re-bucketed every sync.
The Problem
Empirical observation
Issue #7910 (CLOSED 2025-11-29T11:41:17Z, milestone 11.12.0, currently archived at resources/content/archive/issues/v11.12.0/chunk-1/issue-7910.md) hasn't been modified since 2025-11-29T11:44:14Z. Yet every sync emits the WARN "moving from v11.12.0 to v11.13.0."
GitHub release timeline:
- v11.12.0 published 2025-11-29T11:32:31Z (release body explicitly lists "Enhance SEO generator to support middleware-compatible routes (Issue #7910)"; #7910 IS fixed in v11.12.0)
- v11.13.0 published 2025-11-29T12:34:29Z (release body lists Issues #7911, #7912; NOT #7910)
The CORRECT bucket per release-body content is v11.12.0. The syncer's inference logic computes v11.13.0.
Root cause
.sync-metadata.json for #7910 persists these fields:
    {
      "state": "CLOSED",
      "closedAt": "2025-11-29T11:41:17Z",
      "updatedAt": "2025-11-29T11:44:14Z",
      "contentHash": "79ddf88bcecdf7eacf03784e3b3a2540eb740338ece6e55c5387314d7603e843",
      "commentsTotal": 1
    }
No milestone field. The on-disk frontmatter file has milestone: 11.12.0 but it's not captured in the sync metadata.
IssueSyncer.mjs#planBuckets (ai/services/github-workflow/sync/IssueSyncer.mjs:364-385):
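    for (const issue of combined.values()) {
        let version = null;
        if (issue.state === 'CLOSED') {
            if (issue.milestone?.title) {            // undefined for #7910 in metadata
                version = issue.milestone.title.startsWith(...) ? ... : ...;
            } else if (issue.closedAt) {             // falls through here
                const closed = new Date(issue.closedAt);
                const release = (ReleaseNotesSyncer.sortedReleases || [])
                    .find(r => new Date(r.publishedAt) > closed); // finds first release AFTER close
                // For #7910: closed 11:41:17Z; v11.12.0 published 11:32:31Z (BEFORE close);
                // v11.13.0 published 12:34:29Z (AFTER close) → version = 'v11.13.0' ❌
                if (release) {
                    version = ...;
                }
            }
        }
        // ...
    }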
Lines 387-403: issue.oldVersion (v11.12.0) is derived from the cached path; version (v11.13.0) is derived from fallback inference. They differ → the WARN is emitted with "Dry-run review required" framing.
Sealed-chunk enforcement (lines ~569-602) preserves the physical archive location (the file stays at v11.12.0/). That is good safety, but the WARN keeps firing because the metadata never captures the milestone.
Why fallback inference is wrong-shape for this case
The fallback find(r => new Date(r.publishedAt) > closed) assumes release-publish-time monotonically aligns with which release contains a given issue. That assumption breaks when:
- An issue is closed BEFORE the release containing its fix is published (rare for routine issues; common for issues closed during release-cut window where the closing PR is already in the next-release branch).
- Releases are published in batches very close together (multiple releases roughly an hour apart, as happened 2025-11-29: v11.12.0 at 11:32 + v11.13.0 at 12:34 + sibling releases earlier the same day).
For #7910 specifically: the PR that fixed #7910 landed in v11.12.0's release branch, but #7910 wasn't auto-closed until AFTER v11.12.0's release publish — so the timestamp-only inference computes the WRONG bucket.
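The mis-bucketing can be reproduced in isolation with the timestamps quoted in this ticket. This is a minimal sketch, not the syncer's code: inferBucketFromClosedAt is a hypothetical standalone helper, and the tagName property is an assumed release shape (the excerpt above elides how version is derived; only publishedAt is confirmed).

```javascript
// Release publish times quoted in this ticket (only these two are listed here).
// Assumption: each release carries a tagName; the real property name is elided in the source.
const sortedReleases = [
    {tagName: 'v11.12.0', publishedAt: '2025-11-29T11:32:31Z'},
    {tagName: 'v11.13.0', publishedAt: '2025-11-29T12:34:29Z'}
];

// Hypothetical standalone version of the timestamp fallback:
// pick the first release published strictly after the issue was closed.
function inferBucketFromClosedAt(closedAt, releases) {
    const closed  = new Date(closedAt);
    const release = releases.find(r => new Date(r.publishedAt) > closed);
    return release ? release.tagName : null;
}

// #7910 was closed 11:41:17Z, i.e. after v11.12.0 was published.
const inferred = inferBucketFromClosedAt('2025-11-29T11:41:17Z', sortedReleases);
console.log(inferred); // 'v11.13.0', although the release notes place #7910 in v11.12.0
```

The result depends only on where closedAt falls between publish times, so any issue closed a few minutes after its own release goes out lands in the following bucket.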
The Architectural Reality
- Source of authority for (issue, version) pairs: GitHub release-body content (the human-curated ## 📦 Full Changelog references) > GitHub milestone field > closedAt-timestamp-vs-release-publish-timestamp inference.
- Current syncer order: milestone field > closedAt-timestamp inference. Release-body content is NOT consulted.
- Metadata gap: milestone field isn't even captured in .sync-metadata.json, so the highest-quality available signal is silently discarded.
- Sealed-chunk enforcement (per ADR 0004 §3.6 + #11288) intentionally prevents physical moves to keep historical archive stable, but emits operator-actionable WARN when bucket inference disagrees with physical location.
The Fix
Two-part fix, lowest-blast-radius first:
Part 1 — Persist milestone in sync metadata (MUST-FIX):
In the metadata serialization path (where .sync-metadata.json is written per-issue), add milestone field capture from the issue's GraphQL milestone node title. Then #planBuckets line 367 dispatch works correctly: issue.milestone?.title resolves to "11.12.0" → version = 'v11.12.0' → matches oldVersion = 'v11.12.0' → no WARN.
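A rough sketch of what Part 1 could look like, under stated assumptions: toSyncMetadata and milestoneFromMetadata are hypothetical names (the real serialization path is not shown in this ticket), and issue.comments.totalCount is an assumed input shape. The milestone is stored as a plain string or null and rehydrated into a { title } object so that the existing issue.milestone?.title check can find it.

```javascript
// Hypothetical serializer for one per-issue record in .sync-metadata.json.
function toSyncMetadata(issue, contentHash) {
    return {
        state: issue.state,
        closedAt: issue.closedAt ?? null,
        updatedAt: issue.updatedAt,
        contentHash,
        commentsTotal: issue.comments?.totalCount ?? 0, // assumed input shape
        // New field: milestone title as a string, or null when the issue has none.
        milestone: issue.milestone?.title ?? null
    };
}

// Hypothetical rehydration: turn the stored string back into the shape
// that the issue.milestone?.title check expects.
function milestoneFromMetadata(record) {
    return record.milestone ? {title: record.milestone} : null;
}

const record = toSyncMetadata({
    state: 'CLOSED',
    closedAt: '2025-11-29T11:41:17Z',
    updatedAt: '2025-11-29T11:44:14Z',
    milestone: {title: '11.12.0'},
    comments: {totalCount: 1}
}, '79ddf88bcecdf7eacf03784e3b3a2540eb740338ece6e55c5387314d7603e843');

console.log(record.milestone);                      // '11.12.0'
console.log(milestoneFromMetadata(record)?.title);  // '11.12.0'
```

Storing the title as a string keeps the JSON flat; whether the cached record should instead hold the full milestone node is a design choice this ticket leaves open.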
Part 2 — Prefer oldVersion as source-of-truth for unchanged closed issues (OPTIONAL, defense-in-depth):
For issues where issue.state === 'CLOSED' AND issue.oldVersion is a valid semver AND the issue content hasn't changed since last sync (via contentHash or updatedAt comparison), prefer oldVersion over re-inference. This matches the operator's framing: "an item that was not changed should not be moved." It eliminates the ENTIRE class of timestamp-inference instability for stable historical issues.
Part 1 alone resolves the immediate regression for #7910. Part 2 is the broader substrate-discipline that prevents future inference-instability classes.
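Part 2 might be sketched as a guard in front of the existing dispatch chain. Everything named here is an assumption for illustration: resolveBucket is a hypothetical helper, the SEMVER_BUCKET pattern is a guess at what "valid semver" means for bucket names, and inferVersion stands in for the current milestone-then-timestamp logic.

```javascript
// Assumption: archive buckets look like 'v11.12.0'; the real validity check may differ.
const SEMVER_BUCKET = /^v\d+\.\d+\.\d+$/;

// Hypothetical guard: keep the existing bucket for closed issues whose content is unchanged.
// cached is the previous .sync-metadata.json record (or undefined on first sync).
function resolveBucket(issue, cached, inferVersion) {
    const unchanged = Boolean(cached) && cached.contentHash === issue.contentHash;

    if (issue.state === 'CLOSED' && SEMVER_BUCKET.test(issue.oldVersion ?? '') && unchanged) {
        return issue.oldVersion; // unchanged item: do not move it
    }
    return inferVersion(issue);  // otherwise fall back to the existing chain
}

// Stand-in for the current inference, returning the wrong bucket seen for #7910.
const staleInference = () => 'v11.13.0';
// 'abc' / 'xyz' are placeholder hashes, not real contentHash values.
const issue7910 = {state: 'CLOSED', oldVersion: 'v11.12.0', contentHash: 'abc'};

console.log(resolveBucket(issue7910, {contentHash: 'abc'}, staleInference)); // 'v11.12.0'
console.log(resolveBucket(issue7910, {contentHash: 'xyz'}, staleInference)); // 'v11.13.0'
```

Because the guard only applies when the hash matches, a genuinely edited issue still goes through inference and can still raise the WARN.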
Contract Ledger Matrix
| Target Surface | Source of Authority | Proposed Behavior | Fallback | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| .sync-metadata.json per-issue schema | This ticket + operator-direction | Persist milestone field alongside state/closedAt/updatedAt/contentHash/commentsTotal | Re-fetch milestone on every sync (current state, but it is not persisted) | learn/agentos/GitHubWorkflow.md archive-anomaly section | Run ai:sync-github-workflow post-fix; .sync-metadata.json for #7910 includes milestone: '11.12.0'; subsequent sync run shows no ARCHIVE ANOMALY for #7910 |
| IssueSyncer.mjs#planBuckets for unchanged closed issues | This ticket + operator-direction | When oldVersion is valid semver AND content unchanged, use oldVersion directly (skip timestamp-fallback inference) | Continue current dispatch chain (milestone-then-timestamp) | JSDoc on the planBuckets function | Unit test: stable closedAt + unchanged content → version === oldVersion |
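Acceptance Criteria
- .sync-metadata.json per-issue schema persists milestone field (string or null).
- npm run ai:sync-github-workflow against the affected commit emits ZERO [ARCHIVE ANOMALY] WARN for #7910 (currently emits 2 per sync).
- #planBuckets skips timestamp-fallback re-inference for closed issues where oldVersion is valid semver AND contentHash is unchanged since last sync.
- [...] oldVersion precedence for unchanged closed issues, (c) regression scenario: issue closed BEFORE its release publish should still bucket per milestone.
- [...] pull-request-workflow.md §6.1.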
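One way to check that a sync run emits no [ARCHIVE ANOMALY] WARN for #7910 (it currently emits two) is to count matching lines in the captured sync output. A small sketch, assuming the output is available as text and the WARN keeps the format quoted in Context; countArchiveAnomalies is a hypothetical helper, not part of the syncer.

```javascript
// Hypothetical helper: count ARCHIVE ANOMALY lines for one issue in captured sync output.
// The trailing space in the marker keeps '#7910' from matching e.g. '#79100'.
function countArchiveAnomalies(logText, issueNumber) {
    const marker = `[ARCHIVE ANOMALY] Issue #${issueNumber} `;
    return logText.split('\n').filter(line => line.includes(marker)).length;
}

// The WARN line as quoted in this ticket; it currently appears twice per sync run.
const warnLine = "[WARN] 🚨 [ARCHIVE ANOMALY] Issue #7910 closedAt shift detected: moving from bucket 'v11.12.0' to 'v11.13.0'. Dry-run review required.";
const currentRun = [warnLine, warnLine].join('\n');

console.log(countArchiveAnomalies(currentRun, 7910)); // 2 (current behavior)
console.log(countArchiveAnomalies('', 7910));         // 0 (target after the fix)
```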
Out of Scope
- Reading GitHub release-body content for (issue, release) mapping. That's a much larger substrate change (parser + edge cases for unstructured release-body markdown); could be a follow-up if Part 1 + Part 2 don't fully address the bucket-inference quality concern.
- Migrating existing .sync-metadata.json files to backfill milestone for historical entries. The next full sync naturally backfills via the GraphQL milestone field on the GitHub side; no separate migration needed.
- Re-architecting the archive-anomaly WARN trigger path. The WARN is correctly designed for genuine bucket shifts (e.g., closedAt was actually edited by a maintainer); fixing the metadata gap removes the false-positive for unchanged issues without changing the WARN semantics.
Avoided Traps
- Pattern-matching V-B-A on related-PR titles — rejected. My Cycle-1 V-B-A found PR #11486 ("IssueSyncer ARCHIVE ANOMALY emits thousands of false-positive WARN during ADR 0004 clean-cut") and falsely concluded "known intentional behavior." PR #11486 reduced WARN noise during MIGRATION CLEAN-CUT for items where one bucket was an invalid semver (line 388-394 docstring). The current #7910 case has BOTH sides as valid semver — explicitly NOT covered by PR #11486's dedupe. Operator-corrected V-B-A surfaced this gap; this ticket is the substrate-aligned response.
- "Dry-run review required"-as-action — rejected. The WARN's framing implies operator-action exists (dry-run review), but there's no documented remediation path. The WARN persists across syncs regardless of operator-attention. Documenting a no-op WARN is worse than fixing the inference.
- Disabling the WARN entirely for unchanged items — rejected. The WARN is correctly designed for GENUINE shifts (e.g., closedAt was edited, milestone was changed mid-cycle). Suppressing it for "unchanged" cases would lose signal for actual maintainer-action-required bucket-shifts. Fix the root cause (metadata gap) instead.
Related
- Source V-B-A: post-#11592-merge ai:sync-github-workflow run 2026-05-18 ~20:55Z (250s duration, 2884 PRs synced; #7910 WARN emitted twice). PR-comment empirical anchor: IC_kwDODSospM8AAAABCyg58A.
- Related substrate (NOT this fix):
  - #11486: Reduced WARN noise during migration clean-cut (oldVersion invalid semver). DOES NOT cover the current case (both sides valid semver).
  - #11288: Validate archive migration + document anomaly hooks. Established the WARN substrate; doesn't address inference accuracy.
- Inference logic: ai/services/github-workflow/sync/IssueSyncer.mjs lines 305-411 (#planBuckets) + lines 565-602 (sealed-chunk reconcile).
- Substrate doc: learn/agentos/GitHubWorkflow.md line 166 (Archive Anomaly Hooks section; describes intended behavior, doesn't acknowledge the metadata-gap regression).
- Operator framing this turn: correction to my Cycle-1 V-B-A pattern-match anti-pattern (related-PR title-match ≠ substantive verification).
Origin Session ID: 1b7a3403-06f3-4862-be80-479e129656de
Retrieval Hint: query_raw_memories("IssueSyncer archive anomaly #7910 milestone metadata gap planBuckets timestamp inference regression unchanged closed issue v11.12.0 v11.13.0")