Context
This is the missing AC8 companion ticket for Epic #11187: reshape the existing PR archive corpus from legacy resources/content/pr-archive/ into the new single-root archive substrate at resources/content/archive/pulls/vN.M.K/{flat|chunk-N}/.
The gap surfaced after Gemini created the #11187 fanout (#11284-#11288). The fanout covered active discussions, read-path fallback, CI/build scripts, consumer cleanup, and validation/docs, but did not create a dedicated ticket for Epic #11187 AC8:
Reshape pr-archive/XXxx/ to archive/pulls/v*/{flat|chunk-N}/ via release-mapping.
PR #11282 already shipped the MetadataManager prerequisite that preserves pull-request mergedAt, milestone, and archiveVersion, which are required inputs for deterministic version inference.
The Problem
The issue archive has version folders, but the PR archive is currently ID-range oriented and lacks release-version grouping. Moving to the single-root archive shape requires mapping existing archived PR files to release versions before moving them under archive/pulls/.
Without a dedicated AC8 ticket, the migration can be accidentally folded into broad consumer cleanup or release-script work. That would blur the hardest part of the PR archive migration: deterministic version inference and dry-run evidence for existing historical files.
The Architectural Reality
Current and target surfaces:
Current legacy corpus: resources/content/pr-archive/ with PR ID-range layout.
Target archive corpus: resources/content/archive/pulls/vN.M.K/{flat|chunk-N}/pr-NNNN.md.
Existing active PR corpus: resources/content/pulls/pr-XXxx/pr-NNNN.md remains active-tier ID-range per Epic #11187.
Metadata prerequisite: ai/services/github-workflow/sync/MetadataManager.mjs now preserves mergedAt, milestone, and archiveVersion for pull metadata via PR #11282.
Release/archive consumers include buildScripts/release/publish.mjs, buildScripts/release/analyzeClosedSinceRelease.mjs, .github/workflows/data-sync-pipeline.yml, and downstream docs/indexing scripts handled by sibling tickets.
Duplicate sweep notes:
#11117 is closed and represents the pre-#11187 PR archive/chunking shape; it targeted pr-archive/ and is superseded by the single-root archive/pulls/ substrate.
#11286 covers CI pipelines and release scripts, not the historical PR archive corpus migration and version inference itself.
#11287 covers consumer cleanup, not the data migration plan.
No open ticket found for AC8's PR archive version-mapping migration.
The Fix
Implement a deterministic PR archive migration plan for existing pr-archive files.
The implementation should:
Infer the target release version for each archived PR using the best available metadata in deterministic order.
Prefer explicit archiveVersion when present.
Fall back to milestone/release metadata when it is structurally reliable.
Fall back to mergedAt against release-cut dates when needed.
Emit an anomaly report for PRs that cannot be mapped without ambiguity.
Support a dry-run/report mode before moving files.
Move files only after the report is reviewable and deterministic.
Use the Epic #11187 lazy archive chunking contract: flat when a version folder has ≤100 PRs, chunk-N when it exceeds 100.
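The inference order in the list above could be sketched roughly as follows. This is a minimal illustration, not the repo's implementation: `inferArchiveVersion`, `normalizeVersion`, the `releases` array shape (`{version, cutAt}`), and the assumption that `milestone` arrives as a plain string are all hypothetical. "Structurally reliable" is interpreted here as "parses as a semver-like tag and matches a known release", and the mergedAt fallback picks the first release cut at or after the merge date; both interpretations are assumptions that the real migration may define differently.

```javascript
// Hypothetical sketch: function names and data shapes are assumptions, not the repo's API.
const SEMVER = /^v?(\d+)\.(\d+)\.(\d+)$/;

// Normalize "11.2.0" or "v11.2.0" to "v11.2.0"; return null for anything else.
function normalizeVersion(value) {
    const match = typeof value === 'string' ? SEMVER.exec(value.trim()) : null;
    return match ? `v${match[1]}.${match[2]}.${match[3]}` : null;
}

// pr:       {archiveVersion?, milestone?, mergedAt?} (assumed metadata shape)
// releases: [{version: 'v11.2.0', cutAt: '<ISO date>'}] in any order (assumed shape)
// Returns {version, source}, or {version: null, source: 'anomaly', reason}.
function inferArchiveVersion(pr, releases) {
    // 1. An explicit archiveVersion wins when it parses as a version tag.
    const explicit = normalizeVersion(pr.archiveVersion);
    if (explicit) {
        return {version: explicit, source: 'archiveVersion'};
    }

    // 2. The milestone counts only when it names a known release (assumed reliability rule).
    const known = new Set(releases.map(release => release.version));
    const milestone = normalizeVersion(pr.milestone);
    if (milestone && known.has(milestone)) {
        return {version: milestone, source: 'milestone'};
    }

    // 3. mergedAt maps to the first release cut at or after the merge (assumed rule).
    const mergedAt = Date.parse(pr.mergedAt);
    if (!Number.isNaN(mergedAt)) {
        const sorted = [...releases].sort((a, b) => Date.parse(a.cutAt) - Date.parse(b.cutAt));
        const target = sorted.find(release => Date.parse(release.cutAt) >= mergedAt);
        if (target) {
            return {version: target.version, source: 'mergedAt'};
        }
        return {version: null, source: 'anomaly', reason: 'mergedAt is later than every known release cut'};
    }

    // 4. Nothing usable: surface the PR as an anomaly instead of guessing.
    return {version: null, source: 'anomaly', reason: 'no archiveVersion, reliable milestone, or mergedAt'};
}
```

Returning the `source` alongside the version keeps the dry-run report able to show which rule produced each mapping, and the anomaly branch never returns a guessed version.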
Acceptance Criteria
Migration logic maps legacy resources/content/pr-archive/**/pr-NNNN.md files into resources/content/archive/pulls/vN.M.K/{flat|chunk-N}/.
Version inference is deterministic and documented in code or migration output: archiveVersion → reliable milestone/release metadata → mergedAt against release-cut dates → anomaly report.
Dry-run/report mode lists planned moves, inferred version, inference source, and anomalies before writing files.
Fixture or targeted test coverage exercises explicit archiveVersion, milestone-based inference, mergedAt fallback, and unmappable PR anomaly cases.
Migration preserves active-tier PR layout under resources/content/pulls/pr-XXxx/.
Migration uses the Epic #11187 archive chunking contract: flat ≤100 files per version, chunk-N for >100.
Post-run validation reports legacy count, moved count, anomaly count, and target count.
PR body cites the dry-run output and explains any anomalies left for human review.
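A dry-run planner that satisfies the chunking and reporting criteria above might look like the sketch below. It is illustrative only: `planMoves`, the entry shape, and the `counts` field names are hypothetical, and the chunk numbering (1-based, filled in ascending PR-number order) is an assumption. The real semantics belong to the Epic #11187 chunking contract. The planner only computes target paths and never touches the file system.

```javascript
// Hypothetical sketch: names, entry shape, and chunk numbering are assumptions.
const CHUNK_SIZE = 100;
const ARCHIVE_ROOT = 'resources/content/archive/pulls';

// entries: [{number, legacyPath, version, source}]; version is null for anomalies.
function planMoves(entries) {
    const byVersion = new Map();
    const anomalies = [];

    // Group mappable PRs by inferred version; keep unmappable ones for human review.
    for (const entry of entries) {
        if (!entry.version) {
            anomalies.push(entry);
            continue;
        }
        if (!byVersion.has(entry.version)) {
            byVersion.set(entry.version, []);
        }
        byVersion.get(entry.version).push(entry);
    }

    const moves = [];
    for (const [version, group] of byVersion) {
        group.sort((a, b) => a.number - b.number);
        // Lazy chunking: stay flat up to 100 files, switch to chunk-N above that.
        const chunked = group.length > CHUNK_SIZE;
        group.forEach((entry, index) => {
            const folder = chunked ? `chunk-${Math.floor(index / CHUNK_SIZE) + 1}` : 'flat';
            const fileName = entry.legacyPath.split('/').pop(); // keep the legacy file name
            moves.push({
                from: entry.legacyPath,
                to: `${ARCHIVE_ROOT}/${version}/${folder}/${fileName}`,
                version,
                source: entry.source
            });
        });
    }

    // Sort by target path so repeated runs over the same input print the same report.
    moves.sort((a, b) => (a.to < b.to ? -1 : a.to > b.to ? 1 : 0));

    return {
        moves,
        anomalies,
        counts: {legacy: entries.length, planned: moves.length, anomalies: anomalies.length}
    };
}
```

Because every legacy file ends up either in `moves` or in `anomalies`, `counts.legacy` should always equal `counts.planned + counts.anomalies`, which gives the post-run validation a cheap invariant to check before and after files are written.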
Out of Scope
Active pulls/ layout changes.
Issue archive migration; that is Epic #11187 AC7 / sibling lane.
Discussion archive creation; that is Epic #11187 AC9 / sibling lane.
Broad consumer cleanup beyond what is needed to execute and verify the PR archive migration; consumer cleanup belongs to #11287.
Avoided Traps
Folding AC8 into consumer cleanup. Rejected because version inference and data movement require their own evidence trail.
Guessing release versions silently. Rejected; ambiguous PRs must surface as anomalies.
Using raw ID range as the target archive shape. Rejected because Epic #11187 selected versioned archive folders with lazy chunking for archive-tier content.
Moving files without dry-run evidence. Rejected because an archive corpus migration has a large blast radius and is prone to generated sync noise.
Contract Ledger Matrix
resources/content/archive/pulls/vN.M.K/
archiveVersion first, then reliable milestone/release metadata, then mergedAt release-cut mapping
pr-archive/
Related
resources/content/pr-archive/
resources/content/archive/pulls/
ai/services/github-workflow/shared/archivePath.mjs (owns lazy 100-item archive chunking semantics)
Origin Session ID: d6d89930-f408-42a0-b60e-ec4487a8cc46
Handoff Retrieval Hints:
query_raw_memories(query="AC8 pr-archive version mapping archive pulls migration archiveVersion mergedAt milestone")
ask_knowledge_base(query="PR archive migration archive/pulls release mapping archiveVersion mergedAt")