Frontmatter
id: 12073
title: Sub 7: Hierarchical-summarization strategy (chunking-aware Tri-Vector)
state: Open
labels: enhancement, ai, architecture, needs-re-triage, model-experience
assignees: []
createdAt: May 27, 2026, 3:44 AM
updatedAt: Jun 7, 2026, 12:01 AM
githubUrl: https://github.com/neomjs/neo/issues/12073
author: neo-opus-ada
commentsCount: 6
parentIssue: 12065
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []

Sub 7: Hierarchical-summarization strategy (chunking-aware Tri-Vector)

Open | Backlog/active-chunk-15 | enhancement, ai, architecture, needs-re-triage, model-experience
neo-opus-ada commented on May 27, 2026, 3:44 AM

Parent: #12065
Depends on: #12068, #12074

Sub-Issue 7: Hierarchical-Summarization Strategy (Chunking-Aware Tri-Vector)

Problem

Tri-Vector extraction (#12068) and error logging (#12074) enable semantic payloads, but large-session processing can exceed local-model context windows or degrade extraction quality. The current pipeline assumes a single-pass extraction path and does not define chunk boundaries, cross-chunk deduplication, or deterministic reduce semantics.

This blocks the Memory Core from reliably processing long sessions without either losing content or producing inconsistent graph structures.

Scope

Design and implement a hierarchical summarization strategy for Tri-Vector Memory Core extraction.

The strategy must be chunking-aware and deterministic:

  • split long session payloads into bounded chunks
  • extract Tri-Vector payloads per chunk
  • summarize/reduce chunk-level vectors into a final session-level vector
  • preserve enough metadata to diagnose how the final vector was produced
  • avoid duplicate entity/concept emission across chunk boundaries where practical
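
The splitting step above can be sketched as follows. This is a minimal illustration, not the project's implementation: the session shape ({id, turns: [{text}]}), the function names chunkSession and estimateTokens, and the 4-characters-per-token estimate are all assumptions made for the example. Only the chunk ID format and the turn-aligned rule come from the ticket.

```javascript
// Hypothetical token estimator (assumption): roughly 4 characters per token.
// A real pipeline would likely use a model-specific tokenizer.
function estimateTokens(text) {
    return Math.ceil(text.length / 4);
}

/**
 * Splits a session into turn-aligned chunks that each fit within limitTokens.
 * Returns null when a single turn alone exceeds the limit, so the caller can
 * route the session through its existing guardrail path instead of splitting
 * mid-turn. The session shape {id, turns: [{text}]} is assumed for illustration.
 */
function chunkSession(session, limitTokens) {
    const chunks = [];
    let turns = [];
    let tokens = 0;
    let startTurn = 0;

    // Closes the chunk that is currently being filled.
    const flush = endTurn => {
        chunks.push({
            id: `${session.id}:chunk:${chunks.length}`, // zero-indexed, monotonic
            startTurn,                                  // first covered turn index
            endTurn,                                    // last covered turn index
            estimatedTokens: tokens,
            turns
        });
    };

    for (let i = 0; i < session.turns.length; i++) {
        const turn = session.turns[i];
        const turnTokens = estimateTokens(turn.text);

        // An oversized turn is never split; signal the caller instead.
        if (turnTokens > limitTokens) {
            return null;
        }

        // Start a new chunk when adding this turn would exceed the limit.
        if (turns.length > 0 && tokens + turnTokens > limitTokens) {
            flush(i - 1);
            turns = [];
            tokens = 0;
            startTurn = i;
        }

        turns.push(turn);
        tokens += turnTokens;
    }

    if (turns.length > 0) {
        flush(session.turns.length - 1);
    }

    return chunks;
}
```

Because the split depends only on turn order and the estimator, the same session and limit always produce the same chunk IDs and boundaries. The startTurn/endTurn fields are one possible way to carry the source-coverage metadata the scope asks for.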

Acceptance Criteria

  1. Define a chunking strategy for large session payloads.
  2. Chunk boundaries must be deterministic.
  3. Chunk metadata must include enough information to trace source session coverage.
  4. Tri-Vector extraction must run per chunk when chunking is activated.
  5. Final session-level Tri-Vector output must be produced by a reduce/summarization step.
  6. Entity/concept duplication across chunks must be mitigated or explicitly documented.
  7. Failure of an individual chunk must be recorded in the REM state model.
  8. The implementation must preserve current behavior for small sessions.
  9. Add targeted tests for:
    • small-session single-pass behavior
    • large-session chunk activation
    • deterministic chunk IDs/order
    • failed-chunk reporting
    • cross-chunk entity/concept reconciliation
  10. Document the strategy in code comments or adjacent docs where future agents will encounter it.
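
Criteria 4, 5, 7, and 8 together describe a control flow that can be sketched as below. Everything here is hypothetical: the function name runTriVectorExtraction, the injected collaborator names, and the {mode, result} return shape exist only so the example is testable without a model. The calls are synchronous for brevity, whereas real model calls would presumably be asynchronous.

```javascript
/**
 * Hypothetical orchestration of the single-pass and map/reduce paths.
 * All collaborators are injected via deps (assumed names, not real APIs):
 *   limitTokens, estimateTokens(session), extractSinglePass(session),
 *   chunkSession(session, limit), extractChunk(chunk),
 *   reduceChunks(payloads), recordChunkFailure({sessionId, chunkId, reason})
 */
function runTriVectorExtraction(session, deps) {
    const {limitTokens} = deps;
    const thresholdUsable = Number.isFinite(limitTokens) && limitTokens > 0;

    // Small sessions, and sessions with an unusable threshold, keep the
    // existing single-pass behavior.
    if (!thresholdUsable || deps.estimateTokens(session) <= limitTokens) {
        return {mode: 'single-pass', result: deps.extractSinglePass(session)};
    }

    const chunks = deps.chunkSession(session, limitTokens);
    const payloads = [];

    // Map step: iterate in chunk:0 .. chunk:N-1 order.
    for (const chunk of chunks) {
        let payload = null;
        let reason = 'extractor returned null';

        try {
            payload = deps.extractChunk(chunk);
        } catch (error) {
            reason = error.message;
        }

        if (payload === null || payload === undefined) {
            // Record the failure, then abort before the reduce step runs.
            deps.recordChunkFailure({sessionId: session.id, chunkId: chunk.id, reason});
            return {mode: 'chunked', result: null};
        }

        payloads.push(payload);
    }

    // Reduce step: payloads are already in deterministic chunk order.
    return {mode: 'chunked', result: deps.reduceChunks(payloads)};
}
```

Injecting the extractor, reducer, and failure recorder keeps each acceptance test small: a below-threshold fixture asserts the single-pass branch, and a fixture with one failing chunk asserts that the failure was recorded and the reducer was never called.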

Contract Ledger

| Target Surface | Source of Authority | Proposed Behavior | Fallback / Edge Case | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| SemanticGraphExtractor.executeTriVectorExtraction(session, options?) | #12073 / #12065 AC10 / Discussion #12062 OQ12 | Preserve current single-pass path below safeProcessingLimitTokens; above threshold, split turn-aligned chunks, run per-chunk Tri-Vector map, then deterministic summary-reduce in chunk:0 through chunk:N-1 order. | If chunking is not activated, behavior remains current single-pass. If any chunk returns null or fails validation, abort reduce and return null while recording chunk failure. | Update method JSDoc with threshold, chunking, and failure semantics. | Unit tests for below-threshold single-pass, above-threshold chunk IDs/order, failure abort, and dedup. |
| Chunk identity / boundaries | #12073 AC2-AC4 | Chunk IDs are <sessionId>:chunk:<N>, zero-indexed and monotonic; boundaries are turn-aligned; reduce order is deterministic. | If one turn exceeds the safe limit, keep the turn intact and route through the existing guardrail/failure path rather than split mid-turn. | JSDoc / test fixture notes. | Multi-turn fixture proving no mid-turn split and deterministic chunk order. |
| Cross-chunk entity reconciliation | #12073 AC6 | Reduce pass dedups entities by (type, name) tuple before graph write. | Tuple collisions are merged as the MVP caveat; no silent duplicate graph nodes for the same tuple. | JSDoc. | Fixture with the same entity in two chunks results in one node. |
| REM run-state integration | #12088 / PR #12122 / #12073 AC7 | Per-chunk failure is recorded in per-session/per-phase state with chunk id and failure reason. | If the state writer is unavailable, fail through the existing REM state write error path; no silent loss. | PR body and any touched run-state docs. | Unit or integration proof against RemRunStateStore / DreamService state entry. |
| Activation threshold + telemetry | aiConfig.localModels.chat.safeProcessingLimitTokens and existing ConsumerFrictionHelper guardrail | Chunking activates only when estimated payload exceeds the safe processing limit; log/state records that chunking activated. | Small sessions remain single-pass. Unknown/invalid threshold falls back to current guardrail behavior. | Method JSDoc / PR body. | Below/above threshold tests plus log/state assertion where feasible. |
| Benchmark handoff | #12074 / #12073 AC9 | PR reports scoped cost evidence for map+reduce versus single-pass where feasible, or explicitly defers live Gemma cost to post-merge/operator benchmark when sandbox ceiling applies. | If no live Gemma run is available, do not claim live-model evidence; keep evidence at unit/static level with residual risk. | PR Evidence line / Post-Merge Validation. | Unit evidence plus optional operator/live benchmark. |
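
The cross-chunk reconciliation row can be illustrated with a small sketch. It assumes each chunk payload looks like {chunkId, entities: [{type, name}]}, and that the first occurrence of a tuple wins. The function name reconcileEntities and the sourceChunks field are hypothetical and not taken from the codebase.

```javascript
/**
 * Deduplicates entities across chunk payloads by their (type, name) tuple.
 * Payloads must be passed in chunk order; the first occurrence of a tuple is
 * kept and later occurrences only extend its sourceChunks trace.
 * Payload shape and field names are assumptions for illustration.
 */
function reconcileEntities(chunkPayloads) {
    // Map iterates in insertion order, so the output order is deterministic.
    const merged = new Map();

    for (const {chunkId, entities} of chunkPayloads) {
        for (const entity of entities) {
            // JSON.stringify on the tuple avoids accidental collisions that a
            // plain string concatenation of type and name could produce.
            const key = JSON.stringify([entity.type, entity.name]);
            const existing = merged.get(key);

            if (!existing) {
                merged.set(key, {...entity, sourceChunks: [chunkId]});
            } else if (!existing.sourceChunks.includes(chunkId)) {
                existing.sourceChunks.push(chunkId);
            }
        }
    }

    return [...merged.values()];
}
```

Matching here is exact, so names that differ only in case or whitespace stay separate nodes. Whether to normalize names before building the key is a design decision the ticket leaves open under its MVP caveat.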

Avoided Traps

  • Do not implement chunking as ad-hoc string slicing without source-position traceability.
  • Do not discard chunk-level failures silently.
  • Do not let chunking alter small-session behavior.
  • Do not conflate this with the embedding vector store; this ticket concerns Tri-Vector extraction payloads.
  • Do not implement an unbounded recursive summarizer without deterministic stopping criteria.
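
One way to honor the last trap is a level-by-level reduce with an explicit depth cap, sketched below. The function name hierarchicalReduce, the fanIn and maxLevels options, and their default values are assumptions chosen for illustration. The ticket does not prescribe a reduce shape beyond deterministic order and stopping criteria.

```javascript
/**
 * Reduces items level by level, combining up to fanIn neighbours per call.
 * Terminates deterministically: each level strictly shrinks the item count
 * (fanIn >= 2), and maxLevels caps the depth. Throws rather than looping
 * when the cap is hit. Names and defaults are hypothetical.
 */
function hierarchicalReduce(items, combine, {fanIn = 4, maxLevels = 8} = {}) {
    if (fanIn < 2) {
        throw new RangeError('fanIn must be at least 2');
    }
    if (items.length === 0) {
        return null;
    }

    let level = items;
    let depth = 0;

    while (level.length > 1) {
        if (depth >= maxLevels) {
            throw new Error(`reduce exceeded maxLevels (${maxLevels})`);
        }

        const next = [];

        // Groups are contiguous slices, so source order is preserved.
        for (let i = 0; i < level.length; i += fanIn) {
            next.push(combine(level.slice(i, i + fanIn), depth));
        }

        level = next;
        depth++;
    }

    return level[0];
}
```

With a single input the item is returned unchanged and combine is never called, which keeps a one-chunk session from paying for an extra summarization pass.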

Notes

This is the session-scale counterpart to Tri-Vector extraction. It should be implemented after the base extractor and logging paths exist so the chunking layer has a clear contract to wrap.

tobiu referenced in commit 17c8ec6 - "feat(orchestrator): unified executeRemCycle() with typed cycle outcome (#12069) (#12096)" on May 27, 2026, 6:10 PM