Context
Operator @tobiu surfaced 2026-05-10 during PR #11114 review:
"follow up tickets (where we can do better!) => we also sync discussions and pr conversations now. we do not have archives and chunking in place there yet."
resources/content/discussions/ currently holds 64 markdown files (one per synced GitHub Discussion). Distance to GitHub's 1000-file cap is large (~15× headroom; years away at current cadence). This is preventive substrate-evolution, not urgent — but worth filing alongside the more-urgent PR-side chunking (#11117) so the substrate primitive lands once across all GH-content sync types.
The Problem
When resources/content/discussions/ exceeds 1000 files, GitHub's web UI folder tree truncates display. Same friction class as #11113 (issues) and #11117 (PRs). Filing now lets the chunking land alongside the PR-side work + reuses the same substrate primitive.
Discussion-archive separation is also missing: there is no discussion-archive/ sibling to issue-archive/. As Discussions accumulate over years, the same UX-degradation friction emerges.
The Architectural Reality
ai/services/github-workflow/sync/DiscussionSyncer.mjs writes Discussion content to resources/content/discussions/discussion-NNNNN.md (flat shape). Consumers: build scripts, possibly portal app views.
Per #11113 / #11114 substrate (post-merge): inherit the chunking convention chosen by Gemini (currently XXxx per PR #11114; pending cycle-3 defense). Single substrate primitive across all GH-content syncs.
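For illustration, a minimal sketch of how a chunked path could be derived from a Discussion number. It assumes that XXxx means "drop the last two digits and append a literal xx" (100 files per folder) and that file names are not zero-padded. Both are assumptions until #11114 cycle-3 settles the convention. `chunkFolder` and `chunkedDiscussionPath` are hypothetical names, not existing DiscussionSyncer.mjs API.

```javascript
// Hypothetical helpers. The XXxx reading below is an assumption,
// pending the #11114 cycle-3 outcome.
const CHUNK_SIZE = 100; // assumed: "xx" stands in for the last two digits

/** Returns the chunk folder name for a Discussion number, e.g. 11114 -> "111xx". */
function chunkFolder(discussionNumber) {
    if (!Number.isInteger(discussionNumber) || discussionNumber < 0) {
        throw new RangeError(`Invalid discussion number: ${discussionNumber}`);
    }
    return `${Math.floor(discussionNumber / CHUNK_SIZE)}xx`;
}

/** Builds the repo-relative path a chunk-aware syncer would write to. */
function chunkedDiscussionPath(discussionNumber, baseDir = 'resources/content/discussions') {
    return `${baseDir}/${chunkFolder(discussionNumber)}/discussion-${discussionNumber}.md`;
}

console.log(chunkedDiscussionPath(11114));
// resources/content/discussions/111xx/discussion-11114.md
```

Keeping the folder computation in one small function means the convention can be swapped in a single place if cycle-3 lands on a different shape.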
The Fix
Apply the same chunking + archive pattern that #11114 established for issues:
- Subdivide resources/content/discussions/ into chunked sub-folders using whatever convention #11114 cycle-3 documents.
- Create resources/content/discussion-archive/ with the same chunked structure for closed Discussions (or by-Discussion-state criteria; TBD).
- Update DiscussionSyncer.mjs to dynamically compute + write to the chunked path.
- Update consumers for recursive readdir parsing.
- Migration script to organize existing 64 files into the chunked structure.
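The migration step above could start as a dry-run plan that maps flat file names to chunked targets before anything is moved. This is a sketch under stated assumptions: `planMigration` is a hypothetical name, the set of closed Discussion numbers stands in for the still-TBD archive criteria, the XXxx reading (drop the last two digits, append "xx") is not confirmed, and the Discussion numbers in the example are made up.

```javascript
// Sketch only: function name, the closed-number set, and the XXxx reading
// are assumptions, not project API.
const FLAT_FILE = /^discussion-(\d+)\.md$/;

/**
 * Maps flat file names to chunked target paths without touching the disk.
 * `closedNumbers` is a Set of Discussion numbers assumed to belong in the archive.
 */
function planMigration(fileNames, closedNumbers = new Set()) {
    const moves = [];
    for (const name of fileNames) {
        const match = FLAT_FILE.exec(name);
        if (!match) continue; // leave unrelated files where they are
        const number = Number(match[1]);
        const root = closedNumbers.has(number)
            ? 'resources/content/discussion-archive'
            : 'resources/content/discussions';
        moves.push({
            from: `resources/content/discussions/${name}`,
            to: `${root}/${Math.floor(number / 100)}xx/${name}`
        });
    }
    return moves;
}

// Illustrative input; these numbers are invented for the example.
const plan = planMigration(
    ['discussion-10987.md', 'discussion-11042.md', 'README.md'],
    new Set([10987])
);
console.log(plan);
```

Because the plan is plain data, it can be reviewed in the PR before the actual moves are executed, which also keeps the code-side and data-migration commits easy to separate if the 2-sub-ticket split is chosen.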
Per #11116 friction-gold lesson: consider 2-sub-ticket split (code-side + data-migration). For this ticket the data-migration is small (64 files vs 4190 for issues), so single-PR may be acceptable — but document the decision rationale.
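Acceptance Criteria
- DiscussionSyncer.mjs writes new Discussions to chunked sub-folder per chosen convention
- resources/content/discussion-archive/ exists with same chunked structure (if archive-separation policy is in scope here)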
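On the consumer side, recursive parsing could be sketched as below. The input is assumed to look like the result of Node's `fs.readdirSync(dir, {recursive: true})` (paths relative to the scanned folder, folder entries included, platform-specific separators). `collectDiscussionNumbers` is a hypothetical helper and the listing is invented for the example.

```javascript
// Assumption: entries are relative paths as returned by a recursive readdir,
// so both "/" and "\" separators as well as bare folder names can appear.
const DISCUSSION_FILE = /(?:^|[\\\/])discussion-(\d+)\.md$/;

/** Extracts Discussion numbers from flat or nested relative paths, sorted ascending. */
function collectDiscussionNumbers(relativePaths) {
    return relativePaths
        .map(entry => DISCUSSION_FILE.exec(entry))
        .filter(Boolean) // drops folders and unrelated files
        .map(match => Number(match[1]))
        .sort((a, b) => a - b);
}

// Illustrative listing mixing a folder entry, nested files, and a flat file.
const listing = [
    '110xx',
    '110xx/discussion-11042.md',
    'discussion-10987.md',
    '109xx\\discussion-10901.md'
];
console.log(collectDiscussionNumbers(listing)); // [ 10901, 10987, 11042 ]
```

Matching on the file name rather than on the folder depth lets a consumer read both the flat and the chunked layout, which may help while the migration is in flight.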
Out of Scope
- Naming-convention re-decision — defer to #11114 cycle-3 outcome.
- PR conversation chunking — separate sibling ticket (#11117).
- Generalized GH-content-chunking utility — premature; extract once 3 content types are individually chunked + the duplication friction warrants it.
- Active vs archive lifecycle policy for Discussions — define separately if scope requires.
- Pre-emptive cap-monitoring — could be a substrate observability ticket, but not this ticket's scope.
Avoided Traps / Gold Standards Rejected
Decision Matrix
Mirror #11113/#11114/#11117 chunking + archive pattern (Selected): Same substrate primitive applied to a 3rd content type. Single-substrate-across-types is the substrate-correct shape.
Defer until cap is actually approached: Rejected. 64 files is far from cap, but filing now lets the work batch naturally with PR-side chunking (#11117) + reduces total swarm-cycles spent on the same substrate primitive across 3 separate cycles.
Skip the discussion chunking entirely (treat as low-priority): Rejected. Operator framed it explicitly: "we do not have archives and chunking in place there yet." Filing acknowledges the gap; defer-implementation-priority is operator's call.
Combine with #11117 into a single epic: Considered. Pro: shared substrate primitive lands once. Con: discussions are 11× smaller than PRs; bundling delays the more-urgent PR-side work. Filed as siblings; can graduate to epic if #11117 implementation surfaces shared-utility extraction.
- Trap: Treating preventive substrate-evolution as low-priority. Rejection: Operator's "we can do better" framing is not low-priority — it's a substrate-quality observation. Filing-now is the substrate-correct response; implementation priority can be operator-paced.
Related
- #11113 (canonical issue chunking ticket)
- PR #11114 (in flight; #11113 implementation)
- #11116 (sister substrate-evolution: code-vs-data-migration commit-shape discipline)
- #11117 (sibling: PR conversation chunking; more urgent than this ticket; same substrate primitive)
- Operator framing 2026-05-10 (this session, [paraphrase]: "we also sync discussions and pr conversations now")
- Substrate-quality [RETROSPECTIVE]: Once issues + PRs + Discussions are all chunked, the GH-content-chunking primitive should be extracted to a shared utility (NOT in this ticket's scope; flag as future generalization candidate).
Origin Session ID: c2912891-b459-4a03-b2af-154d5e264df1
Retrieval Hint: "Discussion chunking", "resources/content/discussions 1k cap preventive", "DiscussionSyncer chunked path"