Parent: #12065
Depends on: #12068, #12074
Sub-Issue 7: Hierarchical-Summarization Strategy (Chunking-Aware Tri-Vector)
Problem
Tri-Vector extraction (#12068) and error logging (#12074) enable semantic payloads, but large-session processing can exceed local-model context windows or degrade extraction quality. The current pipeline assumes a single-pass extraction path and does not define chunk boundaries, cross-chunk deduplication, or deterministic reduce semantics.
This blocks the Memory Core from reliably processing long sessions without either losing content or producing inconsistent graph structures.
Scope
Design and implement a hierarchical summarization strategy for Tri-Vector Memory Core extraction.
The strategy must be chunking-aware and deterministic:
- split long session payloads into bounded chunks
- extract Tri-Vector payloads per chunk
- summarize/reduce chunk-level vectors into a final session-level vector
- preserve enough metadata to diagnose how the final vector was produced
- avoid duplicate entity/concept emission across chunk boundaries where practical
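A minimal sketch of the split step, to make "bounded, deterministic, traceable" concrete. Everything here is an assumption for illustration: the `Turn` and `Chunk` shapes, the `chunkSession` helper, and the rough four-characters-per-token estimator are hypothetical and not part of the existing extractor. The only details taken from this issue are turn-aligned boundaries, the `<sessionId>:chunk:<N>` ID format, and keeping an oversized turn intact.

```typescript
// Hypothetical shapes; the real session/turn types are not specified in this issue.
interface Turn {
  index: number; // position of the turn in the source session
  text: string;
}

interface Chunk {
  id: string;              // "<sessionId>:chunk:<N>", zero-indexed
  firstTurn: number;       // inclusive source turn index, for coverage tracing
  lastTurn: number;        // inclusive source turn index
  estimatedTokens: number;
  oversized: boolean;      // true when a single turn alone exceeds the limit
  turns: Turn[];
}

// Crude stand-in estimator (about 4 characters per token); an assumption,
// not the project's tokenizer.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function chunkSession(sessionId: string, turns: Turn[], limitTokens: number): Chunk[] {
  const chunks: Chunk[] = [];
  let current: Turn[] = [];
  let currentTokens = 0;

  const flush = (): void => {
    const first = current[0];
    const last = current[current.length - 1];
    if (first === undefined || last === undefined) return; // nothing buffered
    chunks.push({
      id: `${sessionId}:chunk:${chunks.length}`,
      firstTurn: first.index,
      lastTurn: last.index,
      estimatedTokens: currentTokens,
      // Multi-turn chunks are closed before they overflow, so only a lone
      // oversized turn can end up above the limit.
      oversized: currentTokens > limitTokens,
      turns: current,
    });
    current = [];
    currentTokens = 0;
  };

  for (const turn of turns) {
    const tokens = estimateTokens(turn.text);
    // Close the open chunk before it would overflow; never split inside a turn.
    if (current.length > 0 && currentTokens + tokens > limitTokens) flush();
    current.push(turn);
    currentTokens += tokens;
  }
  flush();
  return chunks;
}
```

Because the output depends only on turn order and the limit, the same session always yields the same IDs and boundaries. The `oversized` flag is one possible way to hand a too-large turn to the existing guardrail path instead of splitting it.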
Acceptance Criteria
- Define a chunking strategy for large session payloads.
- Chunk boundaries must be deterministic.
- Chunk metadata must include enough information to trace source session coverage.
- Tri-Vector extraction must run per chunk when chunking is activated.
- Final session-level Tri-Vector output must be produced by a reduce/summarization step.
- Entity/concept duplication across chunks must be mitigated or explicitly documented.
- Failure of an individual chunk must be recorded in the REM state model.
- The implementation must preserve current behavior for small sessions.
- Add targeted tests for:
  - small-session single-pass behavior
  - large-session chunk activation
  - deterministic chunk IDs/order
  - failed-chunk reporting
  - cross-chunk entity/concept reconciliation
- Document the strategy in code comments or adjacent docs where future agents will encounter it.
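The activation and failed-chunk criteria above could be sketched roughly as follows. This is an illustration under stated assumptions, not the extractor's actual API: `shouldChunk`, `mapChunks`, `ChunkInput`, `ChunkFailure`, and the `recordFailure` callback are hypothetical names, and the map step is shown synchronously for brevity even though a real model call would most likely be asynchronous.

```typescript
// Hypothetical names throughout; none of these exist in the codebase as far as this issue states.
interface ChunkInput {
  id: string;
  text: string;
}

interface ChunkFailure {
  chunkId: string;
  reason: string;
}

// Chunking activates only for a valid threshold that the estimate exceeds.
function shouldChunk(estimatedTokens: number, safeLimitTokens: unknown): boolean {
  if (
    typeof safeLimitTokens !== "number" ||
    !Number.isFinite(safeLimitTokens) ||
    safeLimitTokens <= 0
  ) {
    return false; // unknown/invalid threshold: stay on the existing path
  }
  return estimatedTokens > safeLimitTokens;
}

// Runs the extractor once per chunk, in chunk order. Returns null if any
// chunk failed, after reporting every failure through recordFailure.
function mapChunks<P>(
  chunks: ChunkInput[],
  extract: (chunk: ChunkInput) => P | null,
  recordFailure: (failure: ChunkFailure) => void,
): P[] | null {
  const payloads: P[] = [];
  let failed = false;
  for (const chunk of chunks) {
    let payload: P | null = null;
    let reason = "extractor returned null";
    try {
      payload = extract(chunk);
    } catch (err) {
      reason = err instanceof Error ? err.message : String(err);
    }
    if (payload === null) {
      recordFailure({ chunkId: chunk.id, reason });
      failed = true;
      continue; // keep going so every failing chunk is reported
    }
    payloads.push(payload);
  }
  return failed ? null : payloads;
}
```

Continuing past the first failure is a design choice in this sketch so that all failing chunk IDs reach the state record; stopping at the first failure would also satisfy the criteria as written.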
Contract Ledger
| Target Surface | Source of Authority | Proposed Behavior | Fallback / Edge Case | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| `SemanticGraphExtractor.executeTriVectorExtraction(session, options?)` | #12073 / #12065 AC10 / Discussion #12062 OQ12 | Preserve current single-pass path below `safeProcessingLimitTokens`; above threshold, split turn-aligned chunks, run per-chunk Tri-Vector map, then deterministic summary-reduce in `chunk:0` through `chunk:N-1` order. | If chunking is not activated, behavior remains current single-pass. If any chunk returns `null` or fails validation, abort reduce and return `null` while recording chunk failure. | Update method JSDoc with threshold, chunking, and failure semantics. | Unit tests for below-threshold single-pass, above-threshold chunk IDs/order, failure abort, and dedup. |
| Chunk identity / boundaries | #12073 AC2-AC4 | Chunk IDs are `<sessionId>:chunk:<N>`, zero-indexed and monotonic; boundaries are turn-aligned; reduce order is deterministic. | If one turn exceeds the safe limit, keep the turn intact and route through the existing guardrail/failure path rather than split mid-turn. | JSDoc / test fixture notes. | Multi-turn fixture proving no mid-turn split and deterministic chunk order. |
| Cross-chunk entity reconciliation | #12073 AC6 | Reduce pass dedups entities by `(type, name)` tuple before graph write. | Tuple collisions are merged as the MVP caveat; no silent duplicate graph nodes for the same tuple. | JSDoc. | Fixture with the same entity in two chunks results in one node. |
| REM run-state integration | #12088 / PR #12122 / #12073 AC7 | Per-chunk failure is recorded in per-session/per-phase state with chunk id and failure reason. | If the state writer is unavailable, fail through the existing REM state write error path; no silent loss. | PR body and any touched run-state docs. | Unit or integration proof against `RemRunStateStore` / `DreamService` state entry. |
| Activation threshold + telemetry | `aiConfig.localModels.chat.safeProcessingLimitTokens` and existing `ConsumerFrictionHelper` guardrail | Chunking activates only when estimated payload exceeds the safe processing limit; log/state records that chunking activated. | Small sessions remain single-pass. Unknown/invalid threshold falls back to current guardrail behavior. | Method JSDoc / PR body. | Below/above threshold tests plus log/state assertion where feasible. |
| Benchmark handoff | #12074 / #12073 AC9 | PR reports scoped cost evidence for map+reduce versus single-pass where feasible, or explicitly defers live Gemma cost to post-merge/operator benchmark when sandbox ceiling applies. | If no live Gemma run is available, do not claim live-model evidence; keep evidence at unit/static level with residual risk. | PR Evidence line / Post-Merge Validation. | Unit evidence plus optional operator/live benchmark. |
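For the cross-chunk reconciliation row, a rough sketch of deduplicating by `(type, name)` during the reduce pass might look like this. The `ExtractedEntity` and `GraphNode` shapes and the `reconcileEntities` helper are hypothetical; the sketch also assumes exact, case-sensitive tuple matching, since the issue does not specify any name normalization.

```typescript
// Hypothetical shapes for illustration; the real Tri-Vector payload is not defined here.
interface ExtractedEntity {
  type: string;
  name: string;
}

interface GraphNode {
  type: string;
  name: string;
  sourceChunks: string[]; // chunk IDs that emitted this tuple, in first-seen order
}

interface ChunkEntities {
  chunkId: string;
  entities: ExtractedEntity[];
}

// Merges entities across chunks so each (type, name) tuple yields one node.
// Input must already be in chunk order; output order is first-seen order.
function reconcileEntities(perChunk: ChunkEntities[]): GraphNode[] {
  const merged = new Map<string, GraphNode>();
  for (const { chunkId, entities } of perChunk) {
    for (const entity of entities) {
      // JSON-encode the tuple so separators inside names cannot collide.
      const key = JSON.stringify([entity.type, entity.name]);
      const existing = merged.get(key);
      if (existing === undefined) {
        merged.set(key, { type: entity.type, name: entity.name, sourceChunks: [chunkId] });
      } else if (existing.sourceChunks.indexOf(chunkId) === -1) {
        existing.sourceChunks.push(chunkId);
      }
    }
  }
  // Map iteration follows insertion order, which keeps the result deterministic.
  return Array.from(merged.values());
}
```

Keeping `sourceChunks` on each merged node is one way to preserve the traceability the Scope section asks for; entities that differ only in casing or spelling would still produce separate nodes under this assumption.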
Avoided Traps
- Do not implement chunking as ad-hoc string slicing without source-position traceability.
- Do not discard chunk-level failures silently.
- Do not let chunking alter small-session behavior.
- Do not conflate this with the embedding vector store; this ticket concerns Tri-Vector extraction payloads.
- Do not implement an unbounded recursive summarizer without deterministic stopping criteria.
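To illustrate the last trap, here is one possible shape for a bounded reduce with explicit stopping criteria. The `hierarchicalReduce` helper, its `fanIn` and `maxLevels` parameters, and the `combine` callback are assumptions made for this sketch; the issue does not prescribe a particular reduce topology.

```typescript
// Hypothetical helper: reduces chunk-level summaries level by level.
// Stops when one summary remains, or returns null once maxLevels is used up.
function hierarchicalReduce<S>(
  leaves: S[],
  combine: (group: S[]) => S,
  fanIn: number,
  maxLevels: number,
): S | null {
  // A fan-in below 2 would never shrink the level, so reject it up front.
  if (leaves.length === 0 || fanIn < 2) return null;

  let level = leaves;
  let depth = 0;
  while (level.length > 1) {
    if (depth >= maxLevels) return null; // refuse rather than continue past the cap
    const next: S[] = [];
    // Groups are consecutive slices, so chunk order is preserved at every level.
    for (let i = 0; i < level.length; i += fanIn) {
      next.push(combine(level.slice(i, i + fanIn)));
    }
    level = next;
    depth += 1;
  }
  return level[0] ?? null;
}
```

With a fan-in of at least 2 each level is strictly shorter than the previous one, so the loop terminates on its own; `maxLevels` is an extra guard that turns an unexpectedly deep reduce into an explicit failure the caller can record.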
Notes
This is the session-scale counterpart to Tri-Vector extraction. It should be implemented after the base extractor and logging paths exist so the chunking layer has a clear contract to wrap.