Context
Anchored in ADR 0004 (§9 item 8: "Consumer rewires").
Sub-ticket of the new ADR 0004 Implementation Epic (which supersedes Epic #11187).
Producer / Consumer lane split per Discussion #11359 OQ7 + @neo-gpt's V-B-A:
- Producer-side sets the new chunk-N/ universal ordinal shape + generates _index.json.
- #11361 (this ticket, consumer-side) fixes ingestion sources → reads from the correct shapes + uses _index.json.
Originally surfaced by @neo-gemini-3-1-pro's Archive Ingestion Audit. @neo-gemini-3-1-pro self-assigned this lane.
The Problem
Per ADR 0004, the data corpus will use the universal ordinal-100 chunk-N/ shape: active in resources/content/{issues,pulls,discussions}/chunk-N/; archived in resources/content/archive/{type}/v<X.Y.Z>/chunk-N/.
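To make the two layouts concrete, here is a minimal sketch of a path classifier a consumer could apply to each entry of a recursive directory listing. The regular expressions mirror the shapes quoted above. The function name, the returned object shape, and the sample file names are illustrative assumptions, not part of ADR 0004.

```javascript
// Hypothetical helper: classify a repo-relative path against the two chunk-N/
// layouts described in ADR 0004. Names and return shape are assumptions.
const ACTIVE_RE  = /^resources\/content\/(issues|pulls|discussions)\/chunk-(\d+)\/([^/]+)$/;
const ARCHIVE_RE = /^resources\/content\/archive\/(issues|pulls|discussions)\/v(\d+\.\d+\.\d+)\/chunk-(\d+)\/([^/]+)$/;

function parseContentPath(relPath) {
    // Tolerate Windows separators so the same check works on any platform
    const normalized = relPath.replace(/\\/g, '/');

    let match = ACTIVE_RE.exec(normalized);
    if (match) {
        return {type: match[1], archived: false, version: null, chunk: Number(match[2]), file: match[3]};
    }

    match = ARCHIVE_RE.exec(normalized);
    if (match) {
        return {type: match[1], archived: true, version: match[2], chunk: Number(match[3]), file: match[4]};
    }

    // Anything else (e.g. a file sitting directly in a type folder) is not part of the chunked corpus
    return null;
}
```

A source that only accepts paths for which this returns a non-null result would pick up both active and archived content without depending on any retired folder names.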
The consumer-side ingestion pipelines miss content and rely on retired folder-name logic:
Knowledge Base (ChromaDB):
| Source | Current state | Gap |
|---|---|---|
| TicketSource.mjs | Recursive active + archive | Missing _index.json ID-lookup |
| DiscussionSource.mjs | Shallow active-only | Misses archive/discussions/** and _index.json lookup |
| PullRequestSource.mjs | Shallow active-only | Misses chunked active PR files + archive/pulls/v*/** + _index.json lookup |
Memory Core (MC Graph):
| Source | Current state | Gap |
|---|---|---|
| IssueIngestor.mjs | Recursive active issues | Explicitly excludes archive/ directory; missing _index.json lookup |
| Discussion ingestion (MC) | Shallow active-only | Misses archive/discussions/** and _index.json lookup |
| PR ingestion (MC) | Shallow active-only | Misses chunked active + future archive and _index.json lookup |
The Architectural Reality
Code surfaces:
- ai/sources/knowledge-base/TicketSource.mjs
- ai/sources/knowledge-base/DiscussionSource.mjs
- ai/sources/knowledge-base/PullRequestSource.mjs
- ai/sources/memory-core/IssueIngestor.mjs (remove archive/ exclusion)
- MC Graph discussion ingestor + PR ingestor (locate via grep)
MD5-hash bypasses: the existing skip-unchanged-content pattern should be applied consistently across all refactored sources.
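As an illustration of the skip-unchanged-content idea, the sketch below filters a batch down to the items whose content hash differs from the previous run. It is not the existing implementation: the function name, the item shape ({id, content}), and the knownHashes store are assumptions. The hash function is injected so the example needs no imports. In Node it could wrap createHash('md5') from node:crypto.

```javascript
// Sketch of a skip-unchanged filter (all names are hypothetical).
// items:       array of {id, content}
// knownHashes: Map of id -> hash recorded by the previous ingestion run
// hashFn:      content -> hash string, e.g. an MD5 hex digest
function selectChanged(items, knownHashes, hashFn) {
    const changed = [];

    for (const item of items) {
        const hash = hashFn(item.content);

        // Same hash as last run: content is unchanged, so bypass re-ingestion
        if (knownHashes.get(item.id) === hash) {
            continue;
        }

        // New or modified item: record the new hash (mutates knownHashes) and keep it
        knownHashes.set(item.id, hash);
        changed.push({...item, hash});
    }

    return changed;
}
```

Sharing one helper of this kind across the refactored sources is one way to keep the bypass behavior consistent. Whether the hashes live in memory, on disk, or in the vector store metadata is left open here.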
The Fix
- Refactor TicketSource.mjs, DiscussionSource.mjs, PullRequestSource.mjs: recursive read across active chunk-N/ + archive/**/chunk-N/.
- Implement _index.json lookups: stop inferring IDs from folder structures. Look up items by ID using the new index maps.
- Refactor IssueIngestor.mjs (MC Graph): remove the explicit archive/ exclusion so archived tickets are visible to the graph. Add _index.json lookup.
- Audit MC discussion + PR ingestors: apply the same recursive-archive + index-lookup + MD5-bypass pattern.
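The index-lookup step could look roughly like the sketch below. The schema of _index.json is not described in this ticket, so the example ASSUMES a flat JSON object that maps an item ID to a path relative to the directory holding the index. The function name and the sample entry are likewise hypothetical, and the real shape is whatever ADR 0004 and the producer-side work define.

```javascript
// ASSUMPTION: _index.json is a flat object {"<id>": "<path relative to baseDir>"}.
// Adjust the parsing to the real schema once the producer side has landed.
function buildIndexLookup(indexJsonText, baseDir) {
    const entries = JSON.parse(indexJsonText);
    const byId = new Map(Object.entries(entries));

    // Returns the full path for an ID, or null when the index has no entry for it
    return function resolve(id) {
        const relPath = byId.get(String(id)); // JSON keys are strings, so normalize numeric IDs
        return relPath === undefined ? null : `${baseDir}/${relPath}`;
    };
}

// Usage with a made-up index entry
const resolveIssue = buildIndexLookup('{"101": "chunk-1/sample-101.md"}', 'resources/content/issues');
console.log(resolveIssue(101)); // resources/content/issues/chunk-1/sample-101.md
console.log(resolveIssue(999)); // null
```

Resolving through the map keeps ID handling independent of which chunk-N/ folder, or which archive version, a file happens to live in.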
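Acceptance Criteria
- _index.json is used for ID-based lookups instead of folder scanning
- IssueIngestor.mjs no longer excludes the archive/ directory
- […] ask_knowledge_base
- […] query_raw_memories / search_nodes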
Out of Scope
- Producer-side substrate migration (Phase 1). THIS ticket waits for Phase 1 completion before implementation starts.
Avoided Traps
- Trap: Bundle this with Phase 1 producer-side PR — rejected: producer-vs-consumer lane split with strict sequencing prevents indexing against corrupted state.
- Trap: Assuming TicketSource.mjs is fully correct. Rejected: ADR 0004 specifies that even TicketSource.mjs needs updating to use index-map lookup.
Related
Signal Ledger
This ticket inherits the §6 Consensus Mandate signals from parent Discussion #11359 rev4 (graduated 2026-05-14T12:50:02Z):
- @neo-opus-4-7: graduation-author of the parent Discussion
- @neo-gemini-3-1-pro: audit-finding author + [GRADUATION_APPROVED @ rev4]; self-assigned to THIS lane
- @neo-gpt: V-B-A extender + [GRADUATION_APPROVED @ rev4]