Context
Graduation Epic from Discussion #11180 — 3-way swarm + operator-correction convergence on type-separated archive symmetry with lazy density-tuned chunking.
Origin friction-anchor: PR #11174 (cosmetic naming-normalization of discussion-100xx/ → 100xx/) triggered operator architectural review on 2026-05-11. Cross-family ideation revealed:
- Asymmetric archive substrate: issue-archive/ uses HYBRID version+chunk; pr-archive/ uses ID-only-no-version; no discussion-archive/ exists
- Sparse-folder anti-pattern: at low type-density (~100 discussions total), 100-ID-range chunking produces 1-5-item folders
- v13 release-cap risk: ~1500 issues closed-since-v12.1 currently in active issues/ would exceed 1000-cap if archived without chunking
- Leverage opportunity: .github/workflows/prevent-reopen.yml CI-enforces closedAt immutability past 24h grace, enabling one-way archive placement
Empirical anchors verified:
- ~1070 active issue files (open + closed-since-v12.1, approaching 1000-cap)
- ~734 active PR files
- ~68 active discussion files (sparse across 23 chunked subfolders averaging 3 items/folder)
- ~3000 archived issue files (across 165 version folders, mostly ≤100 items each)
- 212 archived PR files (ID-only chunks, no version layer)
- 0 archived discussion files (substrate doesn't exist)
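As a quick sanity check on the anchors above, the following sketch recomputes the folder densities and the chunk count implied by the approximate figures. The function names `averageDensity` and `chunksNeeded` are illustrative only and do not exist in the codebase.

```javascript
// Illustrative arithmetic only; the counts are the approximate figures listed above.
function averageDensity(fileCount, folderCount) {
    return fileCount / folderCount;
}

// Number of fixed-size chunks needed to hold `itemCount` items.
function chunksNeeded(itemCount, chunkSize = 100) {
    return Math.ceil(itemCount / chunkSize);
}

console.log(averageDensity(68, 23).toFixed(1));    // 3.0  discussions per active subfolder
console.log(averageDensity(3000, 165).toFixed(1)); // 18.2 archived issues per version folder
console.log(chunksNeeded(1500));                   // 15   chunks for ~1500 closed-since-v12.1 issues
```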
[Cycle 2 body amendment 2026-05-11 per GPT epic-review]:
- Blocker 1 (Discussion #11180 body) RESOLVED by @neo-gemini-3-1-pro at 2026-05-11T08:26:38Z: Option A struck through and marked Withdrawn; full divergence matrix A through E''+S documented; OQ1-OQ3 marked [RESOLVED]; OQ2 specifically uses closedAt + answerChosenAt (raw updatedAt reserved for sync-freshness only); RESOLVED_TO_AC comment + Epic #11187 cited as canonical.
- Blocker 2 (active-tier ordinal chunking) RESOLVED by this amendment: Active-tier issues + pulls retain XXxx ID-range chunking (Math.floor(id/100)*100 deterministic lookup via LocalFileService#getIssueById). Ordinal chunk-N semantics now applies only to sealed archive chunks where closedAt immutability + version-folder boundary makes chunk membership stable. AC6 + AC7 (active-tier ordinal migration) DROPPED; AC8 renumbered to AC6. Subsequent ACs renumbered. Avoided Traps + architectural shape diagram + Lazy chunking algorithm section updated to reflect the active/archive split.
Phase 1 fully unblocked. AC1 lane (#11189 @neo-gemini-3-1-pro), AC2 lane (@neo-gpt), AC3 lane (@neo-opus-4-7) cleared to proceed.
The Problem
Three coupled architectural defects in the current substrate:
- Top-level fragmentation: 6 archive-related top-level folders (issue-archive/, pr-archive/, discussion-archive/ to-be-created + 3 active) vs cleaner consolidation under one archive/ root
- Density-blind chunking: 100-ID-range chunking forces sparse folders for low-density types (discussions) while undersizing for high-density (v13 issues exceed even chunking-safe ranges per release)
- Two-way data flow: without leveraging prevent-reopen.yml, syncer would need continuous re-validation + file moves on closedAt shifts; by leaning on it, archive placement is one-way and the sealed-chunk semantic is preserved
The Architectural Reality
File:line surfaces touched (V-B-A'd via operator reads):
- ai/mcp/server/github-workflow/config.template.mjs — substrate config
- ai/services/github-workflow/IssueService.mjs + IssueSyncer
- ai/services/github-workflow/PullRequestService.mjs + PullRequestSyncer
- ai/services/github-workflow/DiscussionService.mjs + DiscussionSyncer
- ai/services/github-workflow/SyncService.mjs — orchestrator
- ai/services/github-workflow/LocalFileService.mjs — file path resolution (deterministic ID-based; preserved for active tier)
- ai/services/github-workflow/HealthService.mjs — path-presence checks
- New archivePath() helper — lazy chunking primitive (archive-tier only)
- buildScripts/release/publish.mjs — release-cut archive logic + GH_SyncService.runFullSync()
- buildScripts/release/analyzeClosedSinceRelease.mjs — analyzer
- .github/workflows/data-sync-pipeline.yml — pages-repo sync paths (currently hardcodes issues/ + issue-archive/)
- .github/workflows/prevent-reopen.yml — leveraged as substrate-correctness primitive (closedAt immutability)
- Migration scripts (multiple, per phase)
- learn/agentos/sandman-handoff-format.md, learn/agentos/GitHubWorkflow.md — docs
The Fix
Adopt Option E''+S from Discussion #11180:
Architectural shape
resources/content/
├── issues/ (active, XXxx ID-range — deterministic ID lookup; UNCHANGED)
├── pulls/ (active, pr-XXxx ID-range — deterministic ID lookup; UNCHANGED)
├── discussions/ (active, flat — collapse current sparse XXxx; 68 items)
├── archive/
│ ├── issues/vN.M.K/ (flat ≤100; chunked >100 via chunk-N ordinal — sealed)
│ ├── pulls/vN.M.K/ (flat ≤100; chunked >100 via chunk-N ordinal — sealed)
│ └── discussions/vN.M.K/ (flat indefinitely at current density)
└── release-notes/
Active-tier (issues + pulls): UNCHANGED XXxx ID-range
- 100-ID-range folders (110xx/, 111xx/, ...) — Math.floor(id/100)*100 deterministic lookup
- Preserved per LocalFileService#getIssueById semantic — O(1) folder location from ID
- Discussions collapse to flat (at the current density of 68 items, the sparse XXxx folders are flattened)
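The ID-range lookup described above can be sketched as follows. This is a minimal illustration and not the actual LocalFileService code: `activeFolderFor` is a hypothetical name, and the folder naming (range base with the last two digits rendered as "xx", plus a pr- prefix for pulls) is inferred from the 110xx/ and pr-XXxx examples in this Epic.

```javascript
// Hypothetical helper; the real lookup lives in LocalFileService#getIssueById.
// Assumption: folder names render the ID-range base with its last two digits as "xx".
function activeFolderFor(type, id) {
    const base   = Math.floor(id / 100) * 100;       // e.g. 11180 -> 11100
    const range  = `${base / 100}xx`;                // e.g. 11100 -> "111xx"
    const prefix = type === 'pulls' ? 'pr-' : '';    // pulls use pr-XXxx per the shape above
    return `${type}/${prefix}${range}`;
}

console.log(activeFolderFor('issues', 11180)); // issues/111xx
console.log(activeFolderFor('pulls', 11174));  // pulls/pr-111xx
```

Because the folder is a pure function of the ID, no directory scan is needed, which is the property the active tier keeps.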
Archive-tier lazy chunking algorithm (sealed chunks only)
- ≤ 100 items in archive version-folder: flat (no chunk-N/ wrapper)
- > 100 items: split into 100-item ordinal-count chunks named chunk-N/ (non-numeric prefix, sequential)
- Chunks sealed once full — closedAt immutability post-prevent-reopen.yml-24h-grace makes membership stable
- New items at release-cut go to first non-full chunk or fresh chunk-N+1/
- Ordinal-count chosen for archive-tier because: (a) sealed-chunk semantic makes membership deterministic by version + insertion order, (b) avoids sparse-folder anti-pattern for low-density types (discussions)
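The flat-versus-chunked rule above can be sketched as a batch planner that runs once per release-cut. This is a sketch under stated assumptions, not the Epic's archivePath() helper: `planArchiveLayout` is a hypothetical name, chunk numbering is assumed to start at 1, `ids` is assumed to arrive in insertion order, the version strings are example values, and paths are relative to resources/content/. It does not model later appends into a first non-full chunk.

```javascript
const CHUNK_SIZE = 100; // threshold and chunk size from the algorithm above

// Hypothetical planner: maps each item ID to its archive directory for one version folder.
function planArchiveLayout(type, version, ids) {
    const base = `archive/${type}/${version}`;

    // At or below the threshold the version folder stays flat (no chunk-N/ wrapper).
    if (ids.length <= CHUNK_SIZE) {
        return new Map(ids.map(id => [id, base]));
    }

    // Above the threshold, the ordinal position (not the ID) selects the chunk.
    return new Map(ids.map((id, index) => {
        const chunkNumber = Math.floor(index / CHUNK_SIZE) + 1; // assumption: chunks start at 1
        return [id, `${base}/chunk-${chunkNumber}`];
    }));
}

const small = planArchiveLayout('discussions', 'v12.1.0', [11180, 11188]);
console.log(small.get(11180)); // archive/discussions/v12.1.0
```

Since membership depends on insertion order, this layout is only stable once the item set is sealed, which is why the rule is limited to the archive tier.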
Substrate-correctness leverage
- closedAt (and mergedAt) treated as immutable post-prevent-reopen.yml-24h-grace
- Archive placement is one-way: items moved at release-cut, never re-moved
- Sealed-chunk semantic preserved (no mid-archive rebalancing)
- closedAt-shift anomaly → flagged for human review, not auto-corrected
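A minimal sketch of the two checks implied above: whether an item is past the 24h grace window, and whether a freshly synced closedAt differs from the value recorded at archive time. The function names and record shapes are hypothetical, and both timestamps are assumed to be identically formatted ISO 8601 strings so that string equality is a valid comparison.

```javascript
const GRACE_MS = 24 * 60 * 60 * 1000; // 24h grace window, as described for prevent-reopen.yml

// True once an item has been closed for at least the grace period.
function isPastGrace(closedAt, now) {
    if (!closedAt) return false;
    return now.getTime() - new Date(closedAt).getTime() >= GRACE_MS;
}

// Compares the closedAt recorded at archive time with the freshly synced value.
// Returns an anomaly record for human review, or null when nothing changed.
// It never moves files: archive placement stays one-way.
function detectClosedAtShift(archived, synced) {
    if (archived.closedAt === synced.closedAt) return null;
    return {
        id              : archived.id,
        archivedClosedAt: archived.closedAt,
        syncedClosedAt  : synced.closedAt,
        action          : 'human-review'
    };
}

console.log(isPastGrace('2026-05-10T08:00:00Z', new Date('2026-05-11T08:00:00Z'))); // true
```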
Contract Ledger Matrix
| Target Surface | Source of Authority | Proposed Behavior | Fallback | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| archive/{type}/vN.M.K/ paths | This Epic + Discussion #11180 RESOLVED_TO_AC | Single-root containing per-type subfolders; version-folder per release | Legacy gh-workflow config fallbacks (issueSync.archiveDir, defaultArchiveVersion) retired by #11363; remaining data migration handled by AC7, not active config alias | Updated GitHubWorkflow.md + sandman-handoff-format.md | Phase 1 path-helper test coverage |
| archivePath(type, version, id, count) helper | New | Lazy chunking: flat ≤100, chunk-N/ ordinal >100 | None (single canonical path resolver) | JSDoc on helper | Phase 1 unit tests for boundary cases |
| prevent-reopen.yml substrate-leverage | Existing CI workflow | Trusted for closedAt-immutability post-24h-grace | Anomaly-detection + human-review hook for shift cases | Phase 4 docs | Phase 5 AC validation |
| data-sync-pipeline.yml push paths | Updated for new substrate | Mirrors new archive/ shape to neomjs/pages | Existing push pattern aliased during transition | YAML inline comments | Phase 4 CI run validation |
Acceptance Criteria — Phase-decomposed
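Phase 1 — Foundation (~5 sub-tickets, code-only, no data move):
- archiveRoot + per-type sub-keys (archive.issues, archive.pulls, archive.discussions); legacy issueSync.archiveDir / defaultArchiveVersion config fallbacks retired by #11363
- archivePath() helper: lazy 100-item ordinal chunking + sealed-chunk semantics (archive-tier only)
- IssueService + IssueSyncer refactor to use new helper (archive-write path); active LocalFileService#getIssueById ID-range semantic preserved
- PullRequestService + PullRequestSyncer refactor (archive-write path); active pr-XXxx ID-range semantic preserved
- DiscussionService + DiscussionSyncer refactor (also collapse active to flat)
Phase 2 — Active-tier adjustment (~1 sub-ticket, discussions only):
- discussions/XXxx/ → flat discussions/discussion-NNNN.md (68 items currently, sparse across 23 XXxx folders)
Dropped (Cycle 2 amendment per GPT epic-review Blocker 2):
- AC6 (original): Migrate active issues/XXxx/ → issues/chunk-N/ ordinal
- AC7 (original): Migrate active pulls/pr-XXxx/ → pulls/chunk-N/ ordinal
Rationale: Active-tier items churn (open/close — though prevent-reopen.yml limits reopens). Ordinal chunk-N is INSERTION-ORDER-dependent, NOT deterministic from item ID. LocalFileService#getIssueById(id) currently computes Math.floor(id/100)*100 to locate the XXxx folder — O(1) deterministic. Switching to ordinal would force a folder-scan per lookup. ID-range XXxx remains correct shape for active tier; ordinal chunk-N applies only to sealed archive chunks where closedAt immutability guarantees membership.
Phase 3 — Archive-tier reshape (~3 sub-tickets, per existing type):
- issue-archive/v*/XXxx/ → archive/issues/v*/{flat|chunk-N}/
- pr-archive/XXxx/ → archive/pulls/v*/{flat|chunk-N}/ (via release-mapping)
- archive/discussions/; populate lazily at next release-cut
Phase 4 — Release + distribution (~3 sub-tickets):
- buildScripts/release/publish.mjs archive-cutting logic for new paths
- buildScripts/release/analyzeClosedSinceRelease.mjs + other analyzers
- .github/workflows/data-sync-pipeline.yml pages-repo push paths
Phase 5 — Validation + docs (~3 sub-tickets):
- sandman-handoff-format.md + GitHubWorkflow.md + analyzer JSDoc
Phase 6 — Post-merge verification (open until 2026-09-01):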
Out of Scope
- Active-tier ordinal-chunking refactor (deferred — would require LocalFileService deterministic-lookup redesign; ID-range remains correct shape per Cycle 2 amendment)
- Active-tier consolidation under active/ root (deferred; bigger migration; operator didn't request)
- Extending prevent-reopen.yml pattern to PRs + discussions (deferred; rare events; handle edge cases manually if they surface)
- Cross-substrate identity canonicalization audit (#11181 family + #11182 Layer 4 — already separate)
- Broader release-process refactoring beyond archive-cutting (publish.mjs has other concerns)
- Migration of neomjs/pages historical content beyond next post-merge sync cycle
Avoided Traps
- Option B (always-chunk version-outer + chunk-inner): 3-way converged but operator-challenged on sparse-tree anti-pattern. Rejected: produces 1-5-item folders for discussion-archive at current density.
- Option E (lazy chunking with 800-threshold): Rejected — 800-flat folder is browsing-hostile (GitHub UI + portal app concerns).
- Option F (strictly flat, no chunking ever): Rejected — breaks at v13 issues (1500+ items would exceed 1000-cap).
- Option G (per-type fixed shape: always-chunk-or-always-flat): Rejected — hardcodes type-density assumption; less elegant than lazy detection.
- ID-range chunks under non-numeric prefix for ARCHIVE: Rejected — preserves sparse-folder problem for low-density types. Ordinal-count chunks deliver consistent 100-item density (archive-tier only).
- Ordinal chunk-N for ACTIVE tier: Rejected (Cycle 2 per GPT epic-review Blocker 2) — LocalFileService#getIssueById relies on Math.floor(id/100)*100 to locate XXxx folder in O(1). Ordinal chunk-N is insertion-order-dependent and not deterministic from ID; active tier churns; folder-scan-per-lookup is the regression. Ordinal chunking applies only to sealed archive chunks where closedAt immutability + version-folder boundary makes the chunk membership stable. Active tier retains XXxx ID-range.
- Two-way data flow with continuous closedAt re-validation: Rejected — leverage prevent-reopen.yml CI primitive for immutability.
Related
- Discussion #11180 (parent ideation; full divergence-matrix + 3-way swarm convergence + 5 operator-corrections; body updated by @neo-gemini-3-1-pro on 2026-05-11 per GPT epic-review Blocker 1)
- PR #11174 (closed; original cosmetic-naming-normalization that triggered architectural review)
- Epic #11120 (original chunking arc; this Epic supersedes its substrate direction)
- #11113, #11116, #11118, #11121, #11122 (original chunking sub-tickets; superseded)
- PR #11114 (original chunking implementation; precedent for XXxx primitive)
- PR #11186 (substrate-discipline doc: stale-magic-close-in-commit-bodies; merged)
- Discussion #11188 (Extended V-B-A divergent-thinking discipline; GPT's epic-review on this Epic IS the first positive empirical anchor for the proposed discipline)
- .github/workflows/prevent-reopen.yml (leveraged substrate primitive)
- buildScripts/release/publish.mjs (release-cut archive logic)
- .github/workflows/data-sync-pipeline.yml (pages-repo sync paths)
- ai/services/github-workflow/LocalFileService.mjs (deterministic ID-based path resolution; preserved for active tier per Cycle 2 amendment)
Origin Session ID
c2912891-b459-4a03-b2af-154d5e264df1
Handoff Retrieval Hints
- query_raw_memories(query="single root archive substrate lazy 100-item chunking E prime prime S Discussion 11180")
- ask_knowledge_base(query="archive folder structure version chunking GitHubWorkflow")
- Git commit-range anchor: git log --oneline --grep="11180" --since="2026-05-11" for graduation-context
- File:line anchors: ai/services/github-workflow/{IssueService,PullRequestService,DiscussionService,SyncService,LocalFileService}.mjs + buildScripts/release/publish.mjs + .github/workflows/data-sync-pipeline.yml