Frontmatter
id: 10021
title: perf(ai): Eliminate Map-Reduce chunking bottleneck in local inference
state: Closed
labels: enhancement, ai
assignees: tobiu
createdAt: Apr 15, 2026, 10:35 AM
updatedAt: Apr 15, 2026, 10:49 AM
githubUrl: https://github.com/neomjs/neo/issues/10021
author: tobiu
commentsCount: 0
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: Apr 15, 2026, 10:49 AM

perf(ai): Eliminate Map-Reduce chunking bottleneck in local inference

tobiu commented on Apr 15, 2026, 10:35 AM

Description

The legacy Map-Reduce chunking in the SessionService summarization pipeline caused severe latency (30-60 min to process 17 sessions) during local inference with Gemma4 on M5 processors.

This issue tracks the architectural overhaul that lets the pipeline ingest the full session context natively as a single payload, so summarization no longer passes through multiple auto-regressive generation queues.
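
To illustrate why the chunked approach multiplies latency, here is a minimal sketch that counts model invocations for each strategy. Every name in it (`chunk`, `summarizeMapReduce`, `summarizeFullContext`, `createCountingModel`) is hypothetical and not taken from the Neo.mjs codebase, and the model is a synchronous stand-in that only counts calls; the real SessionService code may look quite different.

```javascript
// Hypothetical sketch: none of these names come from the Neo.mjs codebase.

// Split text into fixed-size chunks (character-based for simplicity).
function chunk(text, size) {
    const chunks = [];
    for (let i = 0; i < text.length; i += size) {
        chunks.push(text.slice(i, i + size));
    }
    return chunks;
}

// Map-reduce style: one generation call per chunk, plus one call to merge.
function summarizeMapReduce(text, generate, chunkSize) {
    const partials = chunk(text, chunkSize).map(part => generate(`Summarize: ${part}`));
    return generate(`Combine these summaries: ${partials.join('\n')}`);
}

// Full-context style: the whole session goes into a single generation call.
function summarizeFullContext(text, generate) {
    return generate(`Summarize: ${text}`);
}

// Stand-in for a model call; it only counts how often it is invoked.
function createCountingModel() {
    const model = input => {
        model.calls++;
        return `summary(${input.length} chars)`;
    };
    model.calls = 0;
    return model;
}

const session = 'x'.repeat(10000);

const legacyModel = createCountingModel();
summarizeMapReduce(session, legacyModel, 1000);

const directModel = createCountingModel();
summarizeFullContext(session, directModel);

console.log(legacyModel.calls, directModel.calls); // 11 1
```

With a 10,000-character session and 1,000-character chunks, the chunked path issues ten map calls plus one reduce call, while the full-context path issues one. Each call is a separate auto-regressive generation, so the call count is a rough proxy for the latency gap described above.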

Goals

  • Rip out the iterative map-reduce chunking loop in SessionService.summarizeSession.
  • Remove the hard memory truncation inside DreamService to enforce lossless extraction.
  • Create an end-to-end API performance benchmark asserting single-session summarization completes in under 20s.
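
The benchmark goal above could be expressed roughly as follows. This is a sketch under stated assumptions: `measure`, `assertWithinBudget`, and `fakeSummarize` are hypothetical names, the summarization call is assumed to be asynchronous, and the stub simply resolves after a short delay instead of calling a real model or API.

```javascript
// Hypothetical sketch: names and structure are illustrative, not from the Neo.mjs codebase.

const BUDGET_MS = 20000; // "under 20s" from the goal above

// Runs an async task and reports how long it took in milliseconds.
async function measure(task, now = Date.now) {
    const start  = now();
    const result = await task();
    return {result, elapsedMs: now() - start};
}

// Throws if the measured time is not strictly below the budget.
function assertWithinBudget(elapsedMs, budgetMs) {
    if (elapsedMs >= budgetMs) {
        throw new Error(`Took ${elapsedMs}ms, budget is ${budgetMs}ms`);
    }
    return true;
}

// Stand-in for a real single-session summarization request (assumed to be async).
const fakeSummarize = () => new Promise(resolve => setTimeout(() => resolve('summary'), 50));

measure(fakeSummarize).then(({result, elapsedMs}) => {
    assertWithinBudget(elapsedMs, BUDGET_MS);
    console.log(`${result} produced within budget`);
});
```

In an actual end-to-end benchmark, `fakeSummarize` would be replaced by a request against the running service, and the assertion would live inside whatever test runner the project uses.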
tobiu added the enhancement label on Apr 15, 2026, 10:35 AM
tobiu added the ai label on Apr 15, 2026, 10:35 AM
tobiu cross-referenced by PR #10019 on Apr 15, 2026, 10:35 AM
tobiu assigned to @tobiu on Apr 15, 2026, 10:36 AM
tobiu closed this issue on Apr 15, 2026, 10:49 AM