Context
The Neo.mjs skill substrate is growing, and relying entirely on top-layer SKILL.md routers runs into context-window byte limits and truncation regressions. Agents need to be able to extract specific sub-rules (the "Dense Ground Truth" bottom layer) without loading the entire monolith.
The Problem
Currently, agents can use ask_knowledge_base and query_documents to find source code, issues, and guides, but the .agents/skills/**/*.md files are not ingested into the ChromaDB Knowledge Base. As a result, an agent cannot find a specific skill rule with a single semantic query (the "O(1)" discovery path for operational ground truth) and must load the SKILL.md routers instead.
The Architectural Reality
The Knowledge Base extraction pipeline is defined in ai/services/knowledge-base/source/*.mjs. It uses a source-class pattern (e.g., IssueSource, CodeSource), and the sync is triggered via npm run ai:sync-kb. The MC SkillGraph (governance topology) is stored separately in SQLite and is not part of this pipeline.
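For orientation, the source-class pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in ai/services/knowledge-base/source/: the class names carry a Sketch suffix because they are stand-ins, and the extract() method, the chunk shape ({id, type, content}), and runSync() are all assumptions. The real sources may well be asynchronous.

```javascript
// Hypothetical stand-in for the shared base class; the real API may differ.
class BaseSourceSketch {
    // Subclasses return an array of chunk objects: {id, type, content}.
    extract() {
        throw new Error(`${this.constructor.name} must implement extract()`);
    }
}

// Stand-in for an existing source such as IssueSource.
class IssueSourceSketch extends BaseSourceSketch {
    constructor(issues) {
        super();
        this.issues = issues; // [{number, body}]
    }
    extract() {
        return this.issues.map(issue => ({
            id: `issue-${issue.number}`,
            type: 'issue',
            content: issue.body
        }));
    }
}

// Stand-in for an existing source such as CodeSource.
class CodeSourceSketch extends BaseSourceSketch {
    constructor(files) {
        super();
        this.files = files; // [{path, text}], already read from disk
    }
    extract() {
        return this.files.map(file => ({
            id: `code-${file.path}`,
            type: 'code',
            content: file.text
        }));
    }
}

// Batch sync: run every registered source once and collect its chunks.
function runSync(sources) {
    return sources.flatMap(source => source.extract());
}
```

Under this reading, adding a new content type means adding one class that honors the same contract and registering it with the sync step, without touching the existing sources.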
The Fix
Implement a new SkillSource class to ingest all .agents/skills/**/*.md files into the Knowledge Base. This ensures skill chunks are semantically discoverable via existing tools (query_documents, ask_knowledge_base) with type: 'skill'.
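One possible shape for the chunking step is sketched below, assuming skills are split at Markdown headings so each section becomes one chunk. The function names, the rule that the skill name is the directory directly under .agents/skills/, the POSIX-style path, and the source field are assumptions made for illustration. The triggerCondition and isAtlasMonolithSubRule fields are left out here because this ticket does not say how they are derived.

```javascript
const FENCE = '`'.repeat(3); // Markdown code-fence marker

// Turn a heading into a URL-style anchor, e.g. "Hard Rules" -> "hard-rules".
function toAnchor(heading) {
    return heading.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}

// Split one skill file into heading-delimited chunks tagged type: 'skill'.
function chunkSkillMarkdown(filePath, markdown) {
    // Assumption: the skill name is the directory directly under .agents/skills/.
    const match = filePath.match(/\.agents\/skills\/([^/]+)\//);
    const skillName = match ? match[1] : 'unknown';
    const chunks = [];
    let current = {heading: '', lines: []};
    let inFence = false;

    const flush = () => {
        const content = current.lines.join('\n').trim();
        if (content) {
            chunks.push({
                type: 'skill',
                skillName,
                sectionAnchor: toAnchor(current.heading),
                source: filePath,
                content
            });
        }
    };

    for (const line of markdown.split(/\r?\n/)) {
        // Track fenced code so a "# comment" inside code is not read as a heading.
        if (line.trimStart().startsWith(FENCE)) {
            inFence = !inFence;
        }
        const heading = inFence ? null : line.match(/^#{1,6}\s+(.*)$/);
        if (heading) {
            flush();
            current = {heading: heading[1], lines: [line]};
        } else {
            current.lines.push(line);
        }
    }
    flush();
    return chunks;
}
```

Splitting at headings keeps each sub-rule retrievable on its own, which is the point of the proposal: a query can return one section instead of the whole SKILL.md file. Very long sections might still need a secondary size-based split, which this sketch does not attempt.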
Contract Ledger Matrix
| Target Surface | Source of Authority | Proposed Behavior | Fallback | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| SkillSource | Knowledge Base | Ingests .agents/skills/**/*.md files | N/A | Update KB README | Discussion #11316 |
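Acceptance Criteria
- SkillSource.mjs exists in ai/services/knowledge-base/source/, extending the BaseSource pattern.
- SkillSource extracts and chunks .agents/skills/**/*.md files.
- type: 'skill' is applied to all extracted chunks.
- Each chunk carries the metadata fields skillName, sectionAnchor, triggerCondition, and isAtlasMonolithSubRule.
- SkillSource is registered and executed during npm run ai:sync-kb.
- Tests for SkillSource demonstrate correct chunking and metadata attachment.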
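A test for the metadata requirement could start from a small validator like the one below. It is only a sketch: validateSkillChunk is a hypothetical helper, not an existing function in the repository. It also assumes metadata values stay scalar (string, number, or boolean), which is what ChromaDB metadata generally expects. What triggerCondition and isAtlasMonolithSubRule should contain is not defined in this ticket, so the values in the usage example are placeholders.

```javascript
// Metadata fields this ticket names for skill chunks.
const REQUIRED_SKILL_FIELDS = [
    'skillName',
    'sectionAnchor',
    'triggerCondition',
    'isAtlasMonolithSubRule'
];
const SCALAR_TYPES = ['string', 'number', 'boolean'];

// Returns a list of problems; an empty list means the chunk passes.
function validateSkillChunk(chunk) {
    const problems = [];
    if (chunk.type !== 'skill') {
        problems.push(`type must be 'skill', got ${JSON.stringify(chunk.type)}`);
    }
    for (const field of REQUIRED_SKILL_FIELDS) {
        if (!(field in chunk)) {
            problems.push(`missing field: ${field}`);
        } else if (!SCALAR_TYPES.includes(typeof chunk[field])) {
            problems.push(`field ${field} must be a string, number, or boolean`);
        }
    }
    return problems;
}

// Usage with placeholder values:
const exampleProblems = validateSkillChunk({
    type: 'skill',
    skillName: 'pull-request',
    sectionAnchor: 'hard-rules',
    triggerCondition: 'before opening a PR',
    isAtlasMonolithSubRule: false,
    content: '## Hard Rules\n- rule one'
});
console.log(exampleProblems.length === 0 ? 'chunk ok' : exampleProblems.join('; '));
```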
Out of Scope
- Reactive file-system watchers for skills (sync remains batch-based).
- Merging KB ingestion with MC SkillGraph topology in SQLite.
- Changes to CodeSource or other existing sources.
Avoided Traps / Gold Standards Rejected
- Reusing CodeSource: Rejected because skills need specific chunk-typing (type: 'skill') and metadata that CodeSource does not provide.
- Reactive watcher: Rejected because it diverges from the existing ai:sync-kb batch-sync pattern and adds runtime overhead.
- Separate ChromaDB collection: Rejected because it over-complicates agent tools (e.g., ask_knowledge_base would need multi-collection support).
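To illustrate why a single collection is enough, the sketch below narrows a query to skill chunks with a metadata filter instead of a second collection. buildQueryArgs is a hypothetical helper; how query_documents and ask_knowledge_base actually build their queries is not shown in this ticket. The argument names (queryTexts, nResults, where) follow the ChromaDB JavaScript client's collection.query() call.

```javascript
// Build the arguments for a ChromaDB collection.query() call.
// Passing a type adds an equality filter on the chunk's "type" metadata.
function buildQueryArgs(queryText, {type, limit = 5} = {}) {
    const args = {queryTexts: [queryText], nResults: limit};
    if (type) {
        args.where = {type};
    }
    return args;
}

// Skill-only search against the shared collection:
const skillQuery = buildQueryArgs('rules for opening a pull request', {type: 'skill'});
// Unfiltered search across every chunk type:
const anyQuery = buildQueryArgs('rules for opening a pull request');

// With a real client this would be roughly:
//   const results = await collection.query(skillQuery);
```

Because the filter is just an optional argument, existing callers that pass no type keep their current behavior.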
Related
- Discussion #11316 (Skill Semantic Search)
- Discussion #11314 (Trigger-Aware Workflows)
Signal Ledger (From Discussion #11316)
- @neo-opus-4-7: APPROVED @ 11316 comments
- @neo-gpt: APPROVED @ 11316 comments
- @neo-gemini-3-1-pro: APPROVED @ 11316 body
Unresolved Dissent
(empty — positive signal)
Unresolved Liveness
(empty — positive signal)
Origin Session ID: 2c4aa4df-2628-45ae-a9c2-156fd9308f21
Retrieval Hint: "Skill Semantic Search Ingestion Pipeline SkillSource ChromaDB"