Frontmatter
id: 11379
title: Consolidate chunkPath.mjs + archivePath.mjs into universal contentPath.mjs (ADR 0004 Lane A)
state: Closed
labels: enhancement, ai, refactoring, agent-task:pending, architecture, build
assignees: neo-opus-4-7
createdAt: May 15, 2026, 1:43 AM
updatedAt: May 15, 2026, 10:19 AM
githubUrl: https://github.com/neomjs/neo/issues/11379
author: neo-opus-4-7
commentsCount: 0
parentIssue: 11372
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 15, 2026, 10:19 AM

Consolidate chunkPath.mjs + archivePath.mjs into universal contentPath.mjs (ADR 0004 Lane A)

neo-opus-4-7 commented on May 15, 2026, 1:43 AM

Context

Lane A foundation under Epic #11372 / ADR 0004 (Universal Ordinal-100 Content Architecture). Per pre-sunset coordination broadcast (#11372 comment IC_kwDODSospM8AAAABCZD5bg), Phase 1 Tasks 1+2 are pre-allocated to @neo-opus-4-7 and block downstream lanes (Lane B Tasks 3+5 / @neo-gpt; Lane C Task 8 / @neo-gemini-3-1-pro via #11361). No producer-tier sub-ticket existed; this ticket files it per §0 Invariant 7 BEFORE any code edit.

Problem

ADR 0004 §3.1 codifies a single universal contentPath() helper to replace the current pair of primitives: chunkPath.mjs (3-line ID-range, RETIRED per §2.3) and archivePath.mjs (188-line ordinal-100, retained but to be folded). The current substrate still resolves paths through these two distinct primitives, which:

  • Add cognitive load ("which helper applies here?" branching at every call site)
  • Have historically encouraged invention of parallel chunking rules (the §5.3 anti-pattern)
  • Couple active-tier code paths to String(id).padStart math that ADR 0004 §5.2 explicitly rejects

The _index.json schema (ADR 0004 §3.2) is also undefined; without it, Lane B cannot safely write the index and Lane C cannot read it.

Architectural Reality

  • ai/services/github-workflow/shared/chunkPath.mjs — 3 lines, ID-range chunking via String(id).padStart(4, '0').slice(0, -2) + 'xx'. RETIRED per ADR §2.3.
  • ai/services/github-workflow/shared/archivePath.mjs — 188 lines with validateArchiveConfig(), archivePath() default, archiveBucketDir(), and 4 internal validators (validateBucket, validateNonNegativeInteger, validatePositiveInteger, validateSegment). Has correct ordinal-100 math for archive tier already.
  • Spec: test/playwright/unit/ai/services/github-workflow/shared/archivePath.spec.mjs (to verify; the pattern lives here).
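To make the retired ID-range rule concrete, the sketch below evaluates the padStart/slice expression quoted for chunkPath.mjs and contrasts it with ordinal-100 chunk numbering. The function names legacyIdRangeChunk and ordinalChunk are hypothetical and exist only for this illustration; only the two expressions themselves come from this ticket.

```javascript
// Hypothetical wrapper around the expression quoted for chunkPath.mjs.
function legacyIdRangeChunk(id) {
    return String(id).padStart(4, '0').slice(0, -2) + 'xx';
}

// Hypothetical wrapper around the ordinal-100 formula from the contentPath() signature.
function ordinalChunk(itemIndex, itemsPerChunk = 100) {
    return `chunk-${Math.floor(itemIndex / itemsPerChunk) + 1}`;
}

console.log(legacyIdRangeChunk(42));    // '00xx'
console.log(legacyIdRangeChunk(11379)); // '113xx'
console.log(ordinalChunk(0));           // 'chunk-1'
console.log(ordinalChunk(250));         // 'chunk-3'
```

The legacy directory name is derived from the digits of the ID, while the ordinal rule depends only on the item's position, which is the distinction §5.2 draws.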

Fix (concrete prescription)

Single new file ai/services/github-workflow/shared/contentPath.mjs consolidating both primitives under ADR 0004 §3.1's universal-ordinal-100 rule.

Signature (per ADR §3.1):

import path from 'node:path';

export default function contentPath({
    contentRoot, type, version, bucket, filename,
    itemIndex, itemsPerChunk = 100,
    chunkPrefix = 'chunk-'
}) {
    // active tier:  contentRoot/type/chunk-N/filename
    // archive tier: contentRoot/archive/type/{version|bucket}/chunk-N/filename
    // Ordinal rule: itemIndex 0..99 -> chunk-1, 100..199 -> chunk-2, and so on.
    const chunkNumber = Math.floor(itemIndex / itemsPerChunk) + 1;
    const chunkDir    = `${chunkPrefix}${chunkNumber}`;
    // A version or bucket routes to the archive tier; otherwise the active tier applies.
    const bucketDir   = (version || bucket)
        ? path.join(contentRoot, 'archive', type, version || bucket)
        : path.join(contentRoot, type);
    return path.join(bucketDir, chunkDir, filename);
}
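A usage sketch may help show how the two tiers resolve. The function body is repeated so the example runs standalone. The contentRoot value, filenames, and version string are hypothetical inputs, not values taken from the repository.

```javascript
import path from 'node:path';

// Copy of the signature sketch from this ticket, repeated for a standalone run.
function contentPath({
    contentRoot, type, version, bucket, filename,
    itemIndex, itemsPerChunk = 100, chunkPrefix = 'chunk-'
}) {
    const chunkNumber = Math.floor(itemIndex / itemsPerChunk) + 1;
    const chunkDir    = `${chunkPrefix}${chunkNumber}`;
    const bucketDir   = (version || bucket)
        ? path.join(contentRoot, 'archive', type, version || bucket)
        : path.join(contentRoot, type);
    return path.join(bucketDir, chunkDir, filename);
}

// Hypothetical inputs for illustration only.
const active = contentPath({
    contentRoot: 'content', type: 'issues', filename: 'issue-11379.md', itemIndex: 0
});
const archived = contentPath({
    contentRoot: 'content', type: 'issues', version: 'v1.0.0',
    filename: 'issue-42.md', itemIndex: 100
});
const rejected = contentPath({
    contentRoot: 'content', type: 'issues', bucket: 'rejected',
    filename: 'issue-7.md', itemIndex: 205
});

// With POSIX separators:
console.log(active);   // content/issues/chunk-1/issue-11379.md
console.log(archived); // content/archive/issues/v1.0.0/chunk-2/issue-42.md
console.log(rejected); // content/archive/issues/rejected/chunk-3/issue-7.md
```

Note that the filename carries the GitHub ID only as an opaque label; the chunk directory is determined by itemIndex alone.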

Exports: default contentPath; named contentBucketDir (the bucket-dir-without-chunk variant for syncer planning); named chunkNumberFor (the chunk-number computation that archivePath.mjs delegates to, see the coexistence path); named validators re-exported from the existing archivePath.mjs internals (lifted to module-level utilities); named DEFAULT_ITEMS_PER_CHUNK = 100, DEFAULT_CHUNK_PREFIX = 'chunk-'.
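One possible shape for the named exports is sketched below. The ticket names contentBucketDir, chunkNumberFor, and the two constants but does not give their signatures, so the parameter lists here are assumptions.

```javascript
import path from 'node:path';

export const DEFAULT_ITEMS_PER_CHUNK = 100;
export const DEFAULT_CHUNK_PREFIX    = 'chunk-';

// Assumed signature: 1-based chunk number for a 0-based ordinal index.
export function chunkNumberFor(itemIndex, itemsPerChunk = DEFAULT_ITEMS_PER_CHUNK) {
    return Math.floor(itemIndex / itemsPerChunk) + 1;
}

// Assumed signature: same routing as contentPath(), minus chunk dir and filename.
export function contentBucketDir({contentRoot, type, version, bucket}) {
    const archiveSegment = version || bucket;
    return archiveSegment
        ? path.join(contentRoot, 'archive', type, archiveSegment)
        : path.join(contentRoot, type);
}

console.log(chunkNumberFor(99));  // 1
console.log(chunkNumberFor(100)); // 2
```

Keeping the chunk arithmetic in one exported function gives archivePath.mjs a single place to delegate to.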

_index.json schema — JSDoc-codified in this same helper file (the schema's logical home is the path-resolution authority):

interface ContentIndexEntry {
    type: 'issues' | 'pulls' | 'discussions' | 'release-notes';
    id: number | string;        // GitHub ID for issues/pulls/discussions; semver for releases
    version: string | null;     // null for active tier; 'v<X.Y.Z>' for archive tier
    chunkNumber: number;        // 1-based
    path: string;               // relative to contentRoot
    bucket?: string;            // non-release bucket (e.g., 'rejected') when applicable
}
type ContentIndex = ContentIndexEntry[];
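Since the schema is to be codified in JSDoc, a possible translation of the interface into typedefs follows, along with a sample index. The sample entries and the findEntry helper are hypothetical and are not part of ADR 0004.

```javascript
/**
 * @typedef {Object} ContentIndexEntry
 * @property {'issues'|'pulls'|'discussions'|'release-notes'} type
 * @property {number|string} id          GitHub ID, or semver for releases
 * @property {string|null}   version     null for active tier; 'v<X.Y.Z>' for archive tier
 * @property {number}        chunkNumber 1-based
 * @property {string}        path        relative to contentRoot
 * @property {string}        [bucket]    non-release bucket (e.g. 'rejected')
 */

/** @typedef {ContentIndexEntry[]} ContentIndex */

/** @type {ContentIndex} Hypothetical sample data. */
const sampleIndex = [
    {type: 'issues', id: 11379, version: null, chunkNumber: 1,
     path: 'issues/chunk-1/issue-11379.md'},
    {type: 'issues', id: 42, version: 'v1.0.0', chunkNumber: 2,
     path: 'archive/issues/v1.0.0/chunk-2/issue-42.md'}
];

// Hypothetical lookup helper: returns the matching entry or null.
function findEntry(index, type, id) {
    return index.find(entry => entry.type === type && entry.id === id) ?? null;
}

console.log(findEntry(sampleIndex, 'issues', 42).path);
// archive/issues/v1.0.0/chunk-2/issue-42.md
```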

Validation invariants (lifted from archivePath.mjs):

  • contentRoot, type, filename must be non-empty strings
  • type must be a safe single segment (no /, \, ..)
  • itemIndex must be a non-negative integer
  • itemsPerChunk must be a positive integer
  • Exactly zero or one of version / bucket may be supplied (both = archive disambiguation conflict)
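The invariants above could be enforced roughly as follows. The validator names validateSegment, validateNonNegativeInteger, and validatePositiveInteger match those listed for archivePath.mjs, but their bodies, the error types, and the validateNonEmptyString and validateContentPathArgs helpers are assumptions made for this sketch.

```javascript
// Hypothetical helper, not named in the ticket.
function validateNonEmptyString(value, name) {
    if (typeof value !== 'string' || value.length === 0) {
        throw new TypeError(`${name} must be a non-empty string`);
    }
}

function validateSegment(value, name) {
    validateNonEmptyString(value, name);
    if (value.includes('/') || value.includes('\\') || value.includes('..')) {
        throw new RangeError(`${name} must be a safe single path segment`);
    }
}

function validateNonNegativeInteger(value, name) {
    if (!Number.isInteger(value) || value < 0) {
        throw new RangeError(`${name} must be a non-negative integer`);
    }
}

function validatePositiveInteger(value, name) {
    if (!Number.isInteger(value) || value <= 0) {
        throw new RangeError(`${name} must be a positive integer`);
    }
}

// Hypothetical aggregate check covering all five invariants.
function validateContentPathArgs({
    contentRoot, type, filename, itemIndex, itemsPerChunk = 100, version, bucket
}) {
    validateNonEmptyString(contentRoot, 'contentRoot');
    validateSegment(type, 'type');
    validateNonEmptyString(filename, 'filename');
    validateNonNegativeInteger(itemIndex, 'itemIndex');
    validatePositiveInteger(itemsPerChunk, 'itemsPerChunk');
    if (version != null && bucket != null) {
        throw new Error('Supply version or bucket, not both');
    }
}
```

Calling validateContentPathArgs() at the top of contentPath() would reject a type such as '../issues' before any path is joined.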

Coexistence path (operator-corrected 2026-05-15: clean-cut per ADR 0004 §2.3 "RETIRED" framing, NOT deprecation theater):

  • chunkPath.mjs → left untouched in its current 3-line state. NO @deprecated JSDoc — ADR 0004 doesn't say "deprecated" anywhere; it says "RETIRED" per §2.3. Retirement of the FILE happens when call sites migrate (Lane B PR deletes it).
  • archivePath.mjs → kept functional with archivePath() + archiveBucketDir() + validateArchiveConfig() public APIs intact. Internal chunk-math delegated to chunkNumberFor() from contentPath.mjs for canonical-single-source-of-truth — this is substrate-improvement (DRY), not deprecation theater. NO @deprecated JSDoc annotations.
  • Anti-pattern explicitly rejected (per operator correction 2026-05-15): @deprecated-annotated shim wrappers that create coexistence-window theater between the new and retired primitives. Per ADR 0004 §5.3, inventing parallel chunking rules — even as "deprecated" — is the substrate-bypass anti-pattern. The clean-cut shape is: add the new helper, leave the to-be-retired files untouched, let the Lane B PR delete them when migrating call sites.

Acceptance Criteria

  • AC1: ai/services/github-workflow/shared/contentPath.mjs exists with default + named exports matching the signature above
  • AC2: _index.json schema codified in JSDoc on contentPath.mjs (single-source-of-truth for the index contract)
  • AC3: All validation invariants from archivePath.mjs preserved (segment safety, integer guards, bucket-OR-version XOR)
  • AC4: Active-tier path (no version/bucket) returns {contentRoot}/{type}/chunk-{N}/{filename} for itemIndex >= 0
  • AC5: Archive-tier path (with version or bucket) returns {contentRoot}/archive/{type}/{version|bucket}/chunk-{N}/{filename}
  • AC6: chunkPath.mjs left untouched in its current 3-line state — NO @deprecated JSDoc additions per operator-corrected ADR 0004 §2.3 clean-cut framing (2026-05-15). File deletion happens when Lane B migrates call sites.
  • AC7: archivePath.mjs keeps archivePath() + archiveBucketDir() + validateArchiveConfig() public APIs intact; internal chunk-math computation delegates to chunkNumberFor() from contentPath.mjs for canonical single-source-of-truth. NO @deprecated JSDoc annotations.
  • AC8: Unit test coverage in test/playwright/unit/ai/services/github-workflow/shared/contentPath.spec.mjs covering: active-tier basic, archive-tier-with-version, archive-tier-with-bucket, validator failure modes (each invariant), itemIndex boundary cases, chunk-number computation at itemsPerChunk boundary
  • AC9: Existing archivePath.spec.mjs continues to pass (proves the chunk-math delegation leaves archivePath.mjs behavior unchanged)
  • AC10: PR body cites ADR 0004 §3.1 + §3.2 + §6 (V-B-A pre-flight); reviewer audit confirms substrate-bypass-immune
  • AC11: No call-site mutations (those are Lane B's scope) — rg "from.*chunkPath|from.*archivePath" ai/services/github-workflow returns the SAME set pre/post merge

Out of Scope

  • Lane B (Tasks 3+5): Mutating call sites in LocalFileService, IssueSyncer, PullRequestSyncer, DiscussionSyncer to use contentPath() + maintain _index.json — separate ticket (@neo-gpt).
  • Lane C (Task 8): Consumer rewires in TicketSource, PullRequestSource, DiscussionSource, IssueIngestor; see #11361 (@neo-gemini-3-1-pro).
  • Task 4: Config audit drop of archiveDir + defaultArchiveVersion; see #11363 (@neo-gpt).
  • Task 6: ReleaseNotesSyncer introduction — future Phase 1 ticket.
  • Task 7: publish.mjs review — future Phase 1 ticket.
  • Task 9: Stale-reference cleanup in skill workflow docs — future Phase 1 ticket.
  • Task 10: Clean-slate substrate purge + re-sync — LAST in Phase 1 per ADR §3.6.

Avoided Traps

  • §5.3 (Inventing parallel chunking rules): must NOT add a second helper or branch the existing primitives. ONE function with parameterized active-vs-archive routing.
  • §5.2 (GitHub-ID-stream math): signature deliberately omits id as a positional argument; itemIndex is ordinal, ID is opaque to the path math.
  • §5.4 (Skipping prevent-reopen.yml): the sealed-chunk invariant is mentioned in JSDoc on contentPath() to anchor the immutability assumption for future readers.
  • Premature call-site migration: explicit AC11 ensures the helper lands as additive substrate, not an in-place rewrite (which would inflate Lane A blast radius and entangle with Lane B's review surface).
  • Configurable-everything regret: itemsPerChunk defaults to 100 but is parameter-overridable for tests; the runtime config-tier flexibility (archiveChunkThreshold etc.) is the config-audit ticket's decision (#11363), NOT this one.

Related

  • Authority: ADR 0004 (learn/agentos/decisions/0004-github-content-architecture.md) §3.1 (helper consolidation), §3.2 (index map substrate), §6 (V-B-A pre-flight for future authors)
  • Parent epic: #11372
  • Sibling sub-tickets:
    • #11361 (Lane C: consumer rewires; blocked on this)
    • #11363 (Task 4: config audit; parallel-safe with this)
    • #11364 (PR archiveVersion metadata cleanup; aligned but independent)
  • Lane B (Tasks 3+5): sub-ticket not yet filed — @neo-gpt to file when claiming
  • Supersedes: #11187 (via parent #11372)

Origin Session

  • Origin Session ID: e095c569-beac-4743-998f-e07d4344492e

Retrieval Hint

Search for contentPath chunkPath archivePath ADR 0004 universal ordinal-100 _index.json schema Lane A.

tobiu referenced in commit 0e4c016 - "feat(github-workflow/shared): consolidate path primitives into universal contentPath.mjs (#11379) (#11381)" on May 15, 2026, 10:19 AM
tobiu closed this issue on May 15, 2026, 10:19 AM