Frontmatter
id: 10097
title: [bug] ask_knowledge_base returns empty synthesis for type='src'/'ai-infrastructure' queries despite correct references
state: Closed
labels: bug, ai, architecture
assignees: tobiu
createdAt: Apr 19, 2026, 2:25 PM
updatedAt: Apr 19, 2026, 2:46 PM
githubUrl: https://github.com/neomjs/neo/issues/10097
author: tobiu
commentsCount: 0
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: Apr 19, 2026, 2:46 PM

[bug] ask_knowledge_base returns empty synthesis for type='src'/'ai-infrastructure' queries despite correct references

Closed · bug · ai · architecture
tobiu commented on Apr 19, 2026, 2:25 PM

Symptom

ask_knowledge_base with type='src' or type='ai-infrastructure' returns placeholder text in the answer field ("I don't have enough information..." / "provided context documents are empty") while the references array comes back populated with high-scoring, correctly-indexed source files. The same query with type='all' (default) synthesizes a full, grounded answer from the same underlying corpus.
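The symptom can be detected mechanically. The following is a minimal sketch, assuming the tool result is an object of the shape `{answer, references}`; the helper name `isEmptySynthesis` and the pattern list are illustrative and not part of the code base. The patterns are taken from the placeholder answers quoted in this issue.

```javascript
// Phrases from the placeholder answers quoted in this issue.
// "don.t" tolerates both straight and curly apostrophes.
const PLACEHOLDER_PATTERNS = [
    /don.t have enough information/i,
    /context documents are empty/i
];

/**
 * Hypothetical helper: returns true when a result shows the symptom described
 * above, i.e. a blank or placeholder answer alongside a populated references array.
 * Assumes a result shape of {answer: string, references: Array}.
 */
function isEmptySynthesis(result) {
    const answer     = typeof result?.answer === 'string' ? result.answer : '';
    const references = Array.isArray(result?.references) ? result.references : [];

    const isPlaceholder = answer.trim() === '' ||
        PLACEHOLDER_PATTERNS.some(pattern => pattern.test(answer));

    return isPlaceholder && references.length > 0;
}
```

A placeholder answer with no references is deliberately not flagged: that case means the retrieval found nothing, which is a different failure than the one reported here.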

Reproducers

Case 1 — type='src' (empty answer, populated refs):

ask_knowledge_base({query: 'How does the IssueSyncer sync GitHub issues to local markdown files?', type: 'src'})
# answer: "I don't have enough information from the provided documents..."
# references: [src/core/Base.mjs, src/component/Base.mjs, src/main/addon/Base.mjs, ...]

Case 2 — type='ai-infrastructure' (same symptom):

ask_knowledge_base({query: 'refetchIssuesByNumber detectStaleCommentsCounts exhaustTimelineItems', type: 'ai-infrastructure'})
# answer: "I don't have enough information to answer your question, as the provided context documents are empty."
# references: [
#   IssueSyncer.mjs (ai/mcp/server/github-workflow/services/sync/IssueSyncer.mjs, score 793),
#   IssueService.mjs (ai/mcp/server/github-workflow/services/IssueService.mjs, score 635),
#   DreamService.mjs (ai/daemons/DreamService.mjs, score 341), ...
# ]

Control — type='all', same query (full synthesis):

ask_knowledge_base({query: 'refetchIssuesByNumber detectStaleCommentsCounts IssueSyncer GitHub sync'})
# answer: Full synthesis naming IssueSyncer, FETCH_ISSUES_FOR_SYNC, the comment-sync flow
# references: [src/core/Base.mjs, ai/mcp/.../IssueSyncer.mjs, Updater.md, GitHubWorkflow.md, ...]

What's NOT the bug

  • Indexing is not broken. ApiSource.mjs:42-49 sourceMap includes 'ai': 'ai-infrastructure'. References prove the chunks are in the collection (with content-appropriate scores). type='all' synthesizes correctly using those same chunks.

Suspected Root Cause

Very likely in ai/mcp/server/knowledge-base/services/QueryService.mjs or DocumentService.mjs — the path that feeds chunk content to the synthesis LLM. Hypotheses to investigate:

  1. Type-filtered Chroma queries return only metadata + references (not chunk document content), so synthesis gets empty strings.
  2. SourceParser-emitted chunks store content under a different metadata key than LearningSource-emitted chunks, and the synthesis path reads only one.
  3. A recently changed filter clause strips the document field when a type filter is present.
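Hypotheses 1 and 2 can be checked against a raw query result before it reaches the synthesis step. The sketch below assumes the nested-array result shape of a Chroma query (`ids`, `documents`, `metadatas`, one inner array per query). The function name `findHitsWithoutContent` and the fallback metadata key `content` are hypothetical and only serve as an illustration.

```javascript
/**
 * Hypothetical diagnostic: lists the hits of a Chroma-style query result whose
 * document text is missing or blank.
 * Assumes the shape {ids: [[...]], documents: [[...]], metadatas: [[...]]}
 * and inspects the first (single) query only.
 * `fallbackKeys` are assumed metadata keys under which content might be stored instead.
 */
function findHitsWithoutContent(queryResult, fallbackKeys = ['content']) {
    const ids       = queryResult.ids?.[0]       ?? [];
    const documents = queryResult.documents?.[0] ?? [];
    const metadatas = queryResult.metadatas?.[0] ?? [];

    return ids.flatMap((id, index) => {
        const document = documents[index];

        // Hit carries usable document text: nothing to report.
        if (typeof document === 'string' && document.trim() !== '') {
            return [];
        }

        // Otherwise record which fallback metadata keys hold text (hypothesis 2).
        const metadata          = metadatas[index] ?? {};
        const contentInMetadata = fallbackKeys.filter(key =>
            typeof metadata[key] === 'string' && metadata[key].trim() !== ''
        );

        return [{id, contentInMetadata}];
    });
}
```

If every hit of a type-filtered query is reported with an empty `contentInMetadata`, that points toward hypothesis 1 or 3; a populated `contentInMetadata` points toward hypothesis 2.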

Impact

Per AGENTS.md §2.1 Anti-Hallucination Tool Hierarchy, ask_knowledge_base is the PRIMARY tool for conceptual and implementation questions. Agents that filter by type='src' for code-level questions silently get placeholder answers even though the KB holds the content, which defeats the purpose of the RAG layer. Discovered during the #10092 ticket-intake verification sweep.

Acceptance Criteria

  • Root cause identified and documented in the fix (QueryService.mjs or sibling)
  • ask_knowledge_base with type='src' returns a real synthesis matching type='all' quality
  • ask_knowledge_base with type='ai-infrastructure' returns a real synthesis
  • Playwright spec that queries with type='src' AND type='ai-infrastructure' for a known-indexed method signature; asserts the answer references the method name or nearby context (not placeholder "empty" text)
  • No regression on type='all' / type='guide' synthesis
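The core check of the requested spec could look roughly like the sketch below. It is framework-agnostic on purpose: `ask` is a hypothetical async wrapper around the ask_knowledge_base tool that resolves to `{answer, references}`, and `findUngroundedTypes` is an illustrative name, not an existing helper.

```javascript
/**
 * Hypothetical spec helper: runs the same query once per type and collects
 * the types whose answer does not mention the expected term
 * (for example a known-indexed method name).
 * `ask` is an assumed async function: ({query, type}) => {answer, references}.
 */
async function findUngroundedTypes(ask, query, expectedTerm, types = ['src', 'ai-infrastructure']) {
    const failures = [];
    const needle   = expectedTerm.toLowerCase();

    for (const type of types) {
        const result = await ask({query, type});
        const answer = String(result?.answer ?? '');

        // A placeholder answer will not contain the method name, so it lands here.
        if (!answer.toLowerCase().includes(needle)) {
            failures.push({type, answer});
        }
    }

    return failures;
}
```

Inside a Playwright spec, the returned array could then be asserted to be empty, e.g. `expect(failures).toEqual([])`. Passing `['all', 'guide']` as `types` would cover the no-regression criterion with the same helper.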

Origin Session ID

d9eb5e76-5430-45f7-b3ea-8600664d28f9

Related: Discovered during #10092 / PR #10093 verification.

tobiu added the bug label on Apr 19, 2026, 2:25 PM
tobiu added the ai label on Apr 19, 2026, 2:25 PM
tobiu added the architecture label on Apr 19, 2026, 2:25 PM
tobiu assigned to @tobiu on Apr 19, 2026, 2:25 PM
tobiu referenced in commit 2ec752d - "fix(knowledge-base): resolve relative source paths in SearchService synthesis (#10097)" on Apr 19, 2026, 2:28 PM
tobiu cross-referenced by PR #10098 on Apr 19, 2026, 2:28 PM
tobiu referenced in commit 2402e2a - "fix(knowledge-base): normalize ApiSource to emit absolute source paths (#10097)" on Apr 19, 2026, 2:33 PM
tobiu referenced in commit 3f78740 - "fix(knowledge-base): normalize all source loaders to emit neoRootDir-relative paths (#10097)" on Apr 19, 2026, 2:42 PM
tobiu closed this issue on Apr 19, 2026, 2:46 PM
tobiu referenced in commit e94e2c3 - "fix(knowledge-base): resolve relative source paths in SearchService synthesis (#10097) (#10098)" on Apr 19, 2026, 2:46 PM