Frontmatter
id: 11118
title: Chunk resources/content/discussions/ to bypass GitHub's 1000-file folder cap (preventive sibling to #11113)
state: Closed
labels: enhancement, ai, model-experience
assignees: neo-gemini-3-1-pro
createdAt: May 10, 2026, 4:24 PM
updatedAt: May 10, 2026, 6:47 PM
githubUrl: https://github.com/neomjs/neo/issues/11118
author: neo-opus-4-7
commentsCount: 0
parentIssue: 11120
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 10, 2026, 6:47 PM

Chunk resources/content/discussions/ to bypass GitHub's 1000-file folder cap (preventive sibling to #11113)

neo-opus-4-7 commented on May 10, 2026, 4:24 PM

Context

Operator @tobiu surfaced this on 2026-05-10 during the PR #11114 review:

"follow up tickets (where we can do better!) => we also sync discussions and pr conversations now. we do not have archives and chunking in place there yet."

resources/content/discussions/ currently holds 64 markdown files (one per synced GitHub Discussion). Distance to GitHub's 1000-file cap is large (~15× headroom; years away at current cadence). This is preventive substrate-evolution, not urgent — but worth filing alongside the more-urgent PR-side chunking (#11117) so the substrate primitive lands once across all GH-content sync types.

The Problem

When resources/content/discussions/ exceeds 1000 files, GitHub's web UI truncates the folder listing at 1000 entries. Same friction class as #11113 (issues) and #11117 (PRs). Filing now lets the chunking land alongside the PR-side work and reuse the same substrate primitive.

Discussion-archive separation is also missing: there is no discussion-archive/ sibling to issue-archive/. As Discussions accumulate over years, the same UX-degradation friction emerges.

The Architectural Reality

ai/services/github-workflow/sync/DiscussionSyncer.mjs writes Discussion content to resources/content/discussions/discussion-NNNNN.md (flat shape). Consumers: build scripts, possibly portal app views.

Per #11113 / #11114 substrate (post-merge): inherit the chunking convention chosen by Gemini (currently XXxx per PR #11114; pending cycle-3 defense). Single substrate primitive across all GH-content syncs.
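To make the flat-versus-chunked distinction concrete, here is a minimal sketch of how a chunked path could be derived from a Discussion number. The function name `chunkedDiscussionPath` is hypothetical, and the folder naming (hundreds bucket plus a literal "xx") is only one assumed reading of "XXxx"; the convention that actually applies is whatever #11114 cycle-3 documents. File numbers are written unpadded here, which is also an assumption.

```javascript
// Hypothetical helper, not taken from DiscussionSyncer.mjs.
// ASSUMPTION: "XXxx" means "group by hundreds": 11118 -> folder "111xx".
const DISCUSSIONS_ROOT = 'resources/content/discussions';

function chunkedDiscussionPath(number) {
    if (!Number.isInteger(number) || number < 0) {
        throw new TypeError(`Expected a non-negative integer, got ${number}`);
    }

    // Drop the last two digits to get the bucket, then append the "xx" marker.
    const bucket = Math.floor(number / 100);
    const folder = `${bucket}xx`;

    // Repo-relative path, so forward slashes are used regardless of OS.
    return `${DISCUSSIONS_ROOT}/${folder}/discussion-${number}.md`;
}

console.log(chunkedDiscussionPath(11118));
```

Under this assumed scheme each sub-folder holds at most 100 files, so no single folder gets near the 1000-file cap.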

The Fix

Apply the same chunking + archive pattern that #11114 established for issues:

  1. Subdivide resources/content/discussions/ into chunked sub-folders using whatever convention #11114 cycle-3 documents.
  2. Create resources/content/discussion-archive/ with same chunked structure for closed Discussions (or by-Discussion-state criteria — TBD).
  3. Update DiscussionSyncer.mjs to dynamically compute + write to the chunked path.
  4. Update consumers for recursive readdir parsing.
  5. Write a migration script to move the existing 64 files into the chunked structure.
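The migration in step 5 can be sketched as a pure planning step that runs before any file is moved. `planMigration` and the `bucketOf` callback are hypothetical names; the callback stands in for the naming convention, which stays deferred to #11114 cycle-3. The file-name pattern `discussion-<digits>.md` is assumed from the flat shape described above.

```javascript
// Hypothetical dry-run planner: computes moves without touching the file system.
// bucketOf(number) returns the sub-folder name for a Discussion number.
function planMigration(fileNames, bucketOf) {
    const pattern = /^discussion-(\d+)\.md$/;
    const moves   = [];

    for (const name of fileNames) {
        const match = pattern.exec(name);

        // Leave anything that is not a synced Discussion file where it is.
        if (!match) {
            continue;
        }

        const number = Number.parseInt(match[1], 10);
        moves.push({from: name, to: `${bucketOf(number)}/${name}`});
    }

    return moves;
}

// Usage with an ASSUMED hundreds-based bucket; swap in the real convention.
const plan = planMigration(
    ['discussion-11118.md', 'README.md', 'discussion-64.md'],
    number => `${Math.floor(number / 100)}xx`
);
console.log(plan);
```

Keeping the plan separate from the move makes the 64-file migration easy to review as a list before it is applied, for example with `git mv` or Node's `fs.promises.rename`.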

Per #11116 friction-gold lesson: consider 2-sub-ticket split (code-side + data-migration). For this ticket the data-migration is small (64 files vs 4190 for issues), so single-PR may be acceptable — but document the decision rationale.

Acceptance Criteria

  • DiscussionSyncer.mjs writes new Discussions to chunked sub-folder per chosen convention
  • resources/content/discussion-archive/ exists with same chunked structure (if archive-separation policy is in scope here)
  • All consumer scripts updated for recursive readdir
  • Existing 64 Discussion markdown files migrated into chunked structure
  • Naming convention documented (inheriting from #11114 cycle-3 outcome)
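For the "recursive readdir" criterion, a consumer could collect Discussion files at any depth roughly as follows. This is a sketch, not the existing build-script code: `listMarkdownFiles` is a hypothetical name, and it assumes a Node.js version that supports the `recursive` option of `readdir` (added in v18.17.0 and v20.1.0).

```javascript
// Hypothetical consumer helper: lists every Markdown file below rootDir,
// however deep the chunked sub-folders are nested.
async function listMarkdownFiles(rootDir) {
    // Dynamic imports keep this sketch runnable as an ES module or a plain script.
    const {readdir} = await import('node:fs/promises');
    const path      = await import('node:path');

    // With recursive: true, entries are paths relative to rootDir
    // and include the sub-folders themselves.
    const entries = await readdir(rootDir, {recursive: true});

    return entries
        // ASSUMPTION: no directory name ends in ".md".
        .filter(entry => entry.endsWith('.md'))
        .map(entry => path.join(rootDir, entry))
        .sort();
}
```

A consumer that previously read the flat folder would call `await listMarkdownFiles('resources/content/discussions')` and keep its per-file parsing unchanged.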

Out of Scope

  • Naming-convention re-decision — defer to #11114 cycle-3 outcome.
  • PR conversation chunking — separate sibling ticket (#11117).
  • Generalized GH-content-chunking utility — premature; extract once 3 content types are individually chunked + the duplication friction warrants it.
  • Active vs archive lifecycle policy for Discussions — define separately if scope requires.
  • Pre-emptive cap-monitoring — could be a substrate observability ticket, but not this ticket's scope.

Avoided Traps / Gold Standards Rejected

Decision Matrix

  1. Mirror #11113/#11114/#11117 chunking + archive pattern (Selected): Same substrate primitive applied to a 3rd content type. Single-substrate-across-types is the substrate-correct shape.

  2. Defer until cap is actually approached: Rejected. 64 files is far from cap, but filing now lets the work batch naturally with PR-side chunking (#11117) + reduces total swarm-cycles spent on the same substrate primitive across 3 separate cycles.

  3. Skip the discussion chunking entirely (treat as low-priority): Rejected. Operator framed it explicitly: "we do not have archives and chunking in place there yet." Filing acknowledges the gap; defer-implementation-priority is operator's call.

  4. Combine with #11117 into a single epic: Considered. Pro: shared substrate primitive lands once. Con: discussions are 11× smaller than PRs; bundling delays the more-urgent PR-side work. Filed as siblings; can graduate to epic if #11117 implementation surfaces shared-utility extraction.

  • Trap: Treating preventive substrate-evolution as low-priority. Rejection: Operator's "we can do better" framing is not low-priority — it's a substrate-quality observation. Filing-now is the substrate-correct response; implementation priority can be operator-paced.

Related

  • #11113 (canonical issue chunking ticket)
  • PR #11114 (in flight; #11113 implementation)
  • #11116 (sister substrate-evolution: code-vs-data-migration commit-shape discipline)
  • #11117 (sibling: PR conversation chunking; more urgent than this ticket; same substrate primitive)
  • Operator framing 2026-05-10 (this session — [paraphrase]: "we also sync discussions and pr conversations now")
  • Substrate-quality [RETROSPECTIVE]: Once issues + PRs + Discussions are all chunked, the GH-content-chunking primitive should be extracted to a shared utility (NOT in this ticket's scope; flag as future generalization candidate).

Origin Session ID: c2912891-b459-4a03-b2af-154d5e264df1
Retrieval Hint: "Discussion chunking", "resources/content/discussions 1k cap preventive", "DiscussionSyncer chunked path"

tobiu referenced in commit e4ab571 - "feature(github-workflow): Sync discussions into chunked directories (#11118) (#11125)" on May 10, 2026, 6:47 PM
tobiu closed this issue on May 10, 2026, 6:47 PM