Frontmatter
| id | 10021 |
| title | perf(ai): Eliminate Map-Reduce chunking bottleneck in local inference |
| state | Closed |
| labels | enhancement, ai |
| assignees | tobiu |
| createdAt | Apr 15, 2026, 10:35 AM |
| updatedAt | Apr 15, 2026, 10:49 AM |
| githubUrl | https://github.com/neomjs/neo/issues/10021 |
| author | tobiu |
| commentsCount | 0 |
| parentIssue | null |
| subIssues | [] |
| subIssuesCompleted | 0 |
| subIssuesTotal | 0 |
| blockedBy | [] |
| blocking | [] |
| closedAt | Apr 15, 2026, 10:49 AM |
perf(ai): Eliminate Map-Reduce chunking bottleneck in local inference
Description
The legacy Map-Reduce chunking inside the SessionService summarization pipeline caused massive latency bottlenecks (30-60 min for 17 sessions) during local inference on M5 processors using Gemma4.
This issue tracks the architectural overhaul to support native full-context payload ingestion, entirely bypassing multiple auto-regressive generation queues.
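To illustrate why the chunked pipeline is slow, here is a minimal sketch contrasting the two shapes. It is not the actual SessionService code: `chunkText`, `summarizeMapReduce`, `summarizeFullContext`, `createCountingModel`, the prompts, and the chunk size are all hypothetical names and values chosen for illustration. `generate` stands in for one auto-regressive call to the local model and is synchronous only to keep the sketch short; a real inference call would be asynchronous.

```javascript
// Hypothetical sketch: none of these names come from the neo codebase.
// `generate` stands in for one auto-regressive generation on the local model.

/** Split text into fixed-size pieces (the "map" inputs). */
function chunkText(text, chunkSize) {
    const chunks = [];
    for (let i = 0; i < text.length; i += chunkSize) {
        chunks.push(text.slice(i, i + chunkSize));
    }
    return chunks;
}

/** Legacy shape: one generation per chunk, plus one more to merge the partial summaries. */
function summarizeMapReduce(text, generate, chunkSize) {
    const partials = chunkText(text, chunkSize)
        .map(chunk => generate(`Summarize:\n${chunk}`));
    return generate(`Combine these summaries:\n${partials.join('\n')}`);
}

/** Full-context shape: the whole session goes into a single generation. */
function summarizeFullContext(text, generate) {
    return generate(`Summarize:\n${text}`);
}

/** Stub model that counts how many generations a strategy triggers. */
function createCountingModel() {
    let calls = 0;
    const generate = prompt => {
        calls++;
        return `summary of ${prompt.length} chars`;
    };
    return {generate, getCalls: () => calls};
}

// Usage: compare the number of generations for the same input.
const session = 'x'.repeat(1000);

const legacy = createCountingModel();
summarizeMapReduce(session, legacy.generate, 250);

const fullContext = createCountingModel();
summarizeFullContext(session, fullContext.generate);

console.log(legacy.getCalls(), fullContext.getCalls());
```

With four chunks, the map-reduce path queues five sequential generations (four map calls and one reduce call), while the full-context path issues exactly one. The sketch only counts calls; it does not model context-window limits or per-token latency.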
Goals
SessionService.summarizeSession.
DreamService to enforce lossless extraction.