Frontmatter
id: 10651
title: Use raw memories for recent memory-mining freshness
state: Closed
labels: bug, documentation, ai, architecture
assignees: neo-gpt
createdAt: May 3, 2026, 5:03 PM
updatedAt: Jun 3, 2026, 3:34 PM
githubUrl: https://github.com/neomjs/neo/issues/10651
author: neo-gpt
commentsCount: 0
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: Jun 3, 2026, 3:34 PM

Use raw memories for recent memory-mining freshness

Closed Backlog/active-chunk-9 bug, documentation, ai, architecture
neo-gpt commented on May 3, 2026, 5:03 PM

Context

@tobiu clarified the freshness model on 2026-05-03 while reviewing #10646:

  • ask_knowledge_base(type: 'ticket') can be stale because it depends on Knowledge Base sync.
  • Local GitHub resource markdown can be stale because new tickets/discussions/PRs only appear after GitHub Workflow MCP sync.
  • The live latest-20 GitHub issue sweep in #10646 is intended to bridge exactly that freshness gap.
  • Memory Core has the same class of gap: active sessions are not necessarily summarized, so query_raw_memories can beat query_summaries for recent events.

This ticket captures the Memory Core side of that correction. The current memory-mining skill still says:

query_summaries (breadth first) — session-level topics surface cheapest. Start here.
query_raw_memories (depth second) — turn-level detail; use when a summary looked relevant but you need the reasoning trail.

That ordering is reasonable for older/stable history, but it is wrong as a blanket rule for fresh, active, or just-created incident context.

Duplicate Sweep Notes

Creation sweep performed before filing:

  • ask_knowledge_base(type: 'ticket') for active sessions / raw memories / unsummarized summaries found no equivalent open ticket.
  • Local resource search over resources/content/issues, resources/content/issue-archive, resources/content/discussions, and .agents/skills/memory-mining found adjacent but non-duplicate items:
    • #10586 — closed documentation bug correcting the false AGENTS_STARTUP.md auto-summarization claim.
    • #9997 — closed implementation improvement prioritizing latest active sessions inside the summarization pipeline.
    • #10075 — closed ticket that created memory-mining and codified the current summary-first query strategy.
    • #10556 / #10569 — broader Memory Core visibility and summarization freshness issues, not the skill-level query-order rule.
  • Live latest-20 open GitHub issue sweep on 2026-05-03 read #10650, #10649, #10648, #10647, #10646, #10645, #10644, #10643, #10635, #10633, #10627, #10605, #10604, #10601, #10537, #10517, #10494, #10484, #10476, #10462. None own this memory-mining freshness-lane correction.

The Problem

Freshness-sensitive investigations can miss the newest relevant context if agents run query_summaries first and stop as soon as it returns nothing recent.

The failure mode:

  1. A live incident unfolds across one or more active sessions.
  2. Agents correctly call add_memory, so raw turns exist.
  3. The active sessions have not yet been summarized.
  4. query_summaries returns stale or no recent context.
  5. The agent infers that memory has no prior mapping, even though query_raw_memories would have surfaced the current incident trail.
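
The failure mode above can be reproduced with a toy model. This is a minimal sketch, not the Memory Core implementation: ToyMemoryStore, its substring matching, and the explicit summarize step are hypothetical stand-ins, and only the names add_memory, query_summaries, and query_raw_memories come from this ticket.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemoryStore:
    """Hypothetical in-memory stand-in for Memory Core; not the real API."""
    raw_turns: list = field(default_factory=list)   # (session_id, text) pairs
    summaries: dict = field(default_factory=dict)   # session_id -> summary text

    def add_memory(self, session_id, text):
        # Raw turns are persisted as soon as they are added.
        self.raw_turns.append((session_id, text))

    def summarize(self, session_id):
        # Assumption: a summary exists only after an explicit summarization step.
        texts = [t for s, t in self.raw_turns if s == session_id]
        self.summaries[session_id] = " | ".join(texts)

    def query_raw_memories(self, term):
        # Naive substring match over raw turns (illustration only).
        return [t for _, t in self.raw_turns if term in t]

    def query_summaries(self, term):
        # Naive substring match over existing summaries (illustration only).
        return [s for s in self.summaries.values() if term in s]


store = ToyMemoryStore()
store.add_memory("old-session", "refactored grid selection model")
store.summarize("old-session")  # older session: summarized
store.add_memory("active-session", "bridge restart wake regression, osascript exit 1")
# The active session has raw turns but no summary yet.

summary_hits = store.query_summaries("wake regression")
raw_hits = store.query_raw_memories("wake regression")
```

In this toy setup the summary surface returns nothing for the active incident while the raw surface returns the turn, which is the gap steps 3 to 5 describe.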

This is the Memory Core analogue of #10646. In both cases, a semantically convenient cached/indexed layer is useful but not fresh enough to be the only source of truth during active swarm work.

The Architectural Reality

Relevant skill surface:

  • .agents/skills/memory-mining/SKILL.md — lightweight Progressive Disclosure router. Per create-skill, this should stay small unless trigger wording itself needs a tiny correction.
  • .agents/skills/memory-mining/references/memory-mining-protocol.md — heavy payload containing query strategy; this is the right file for the rule change.

Relevant prior tickets:

  • #10075 created the skill and deliberately prescribed query_summaries before query_raw_memories for breadth-first lookup.
  • #10586 corrected boot documentation to say that summarization does not run automatically by default.
  • #9997 improved internal summarization ordering but does not guarantee that active-session summaries exist at query time.

Pre-flight against Progressive Disclosure: this proposal should update the reference payload, not bloat the router.

The Fix

Update the memory-mining query strategy so agents choose a freshness-aware order:

  1. For older/stable history and broad conceptual exploration, query_summaries may remain the cheap breadth-first step.
  2. For recent incidents, same-day regressions, active tickets/PRs, wake/A2A coordination, or anything the user says just happened, run query_raw_memories first or in parallel with query_summaries.
  3. Document the freshness rationale explicitly: active sessions can have raw memories before summary material exists.
  4. Add a stop-condition warning: do not conclude "memory has no prior mapping" from summaries alone when the target event is recent or active-session-bound.
  5. Add example query pairs showing the correct flow for incident work:
    • query_raw_memories("bridge restart wake regression osascript exit 1")
    • query_summaries("wake substrate bridge restart regression")
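
One possible shape for the freshness-aware order, as a hedged sketch. The helper names is_freshness_sensitive, plan_queries, and run_plan are hypothetical, the same-day heuristic is an assumption, and the two tools are passed in as plain callables because their real signatures are not specified in this ticket.

```python
from datetime import date
from typing import Callable, Dict, List, Optional

QueryFn = Callable[[str], List[str]]


def is_freshness_sensitive(event_day: Optional[date], today: date, active: bool) -> bool:
    """Assumed heuristic: same-day events or active-session work count as fresh."""
    return active or (event_day is not None and event_day == today)


def plan_queries(fresh: bool) -> List[str]:
    """Order the two query surfaces named in this ticket."""
    if fresh:
        return ["query_raw_memories", "query_summaries"]
    # Stable history: summaries stay the cheap breadth-first step.
    return ["query_summaries", "query_raw_memories"]


def run_plan(plan: List[str], tools: Dict[str, QueryFn],
             queries: Dict[str, str]) -> Dict[str, List[str]]:
    """Call each surface in plan order and keep the hits per surface."""
    return {name: tools[name](queries[name]) for name in plan}


# Stand-in tools that record call order; real tools would hit Memory Core.
calls: List[str] = []
tools = {
    "query_raw_memories": lambda q: calls.append("raw") or ["turn: osascript exit 1"],
    "query_summaries": lambda q: calls.append("summary") or [],
}
queries = {
    "query_raw_memories": "bridge restart wake regression osascript exit 1",
    "query_summaries": "wake substrate bridge restart regression",
}

fresh = is_freshness_sensitive(date(2026, 5, 3), date(2026, 5, 3), active=True)
results = run_plan(plan_queries(fresh), tools, queries)
```

The sketch runs the surfaces sequentially for simplicity; the "in parallel" option from step 2 would dispatch both calls at once and merge the results.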

Acceptance Criteria

  • memory-mining-protocol.md §2 Query strategy distinguishes stable-history lookup from freshness-sensitive lookup.
  • For recent/active-session events, the protocol instructs agents to use query_raw_memories first or in parallel with query_summaries.
  • The protocol states the reason: active sessions may not be summarized yet, even when add_memory has persisted raw turns.
  • The protocol forbids treating a summary miss as a memory miss for same-day/recent incident work unless raw memories were also queried.
  • The router SKILL.md stays lightweight; any heavy guidance lives in references/memory-mining-protocol.md per Progressive Disclosure.
  • At least one concrete example covers the wake/A2A regression class from #10647 and shows raw-memory-first retrieval.
  • Related docs or references mention #10646 as the analogous GitHub/KB freshness pattern only if doing so does not create cross-skill bloat.
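
The stop-condition criterion can be expressed as a small guard. This is a sketch under stated assumptions: may_conclude_no_prior_context is a hypothetical helper, raw_hits set to None is used here to mean "raw memories were never queried", and accepting a summary-only miss for stable history is an assumption this ticket does not settle.

```python
from typing import List, Optional


def may_conclude_no_prior_context(fresh: bool,
                                  summary_hits: List[str],
                                  raw_hits: Optional[List[str]]) -> bool:
    """Return True only when 'memory has no prior mapping' is a safe conclusion.

    raw_hits is None when query_raw_memories was never run.
    """
    if summary_hits or raw_hits:
        return False  # something was found, so prior context exists
    if fresh and raw_hits is None:
        return False  # a summary miss alone is not a memory miss for fresh work
    return True


# Fresh incident, summaries empty, raw memories not yet queried: keep looking.
fresh_summary_only = may_conclude_no_prior_context(True, [], None)
# Fresh incident, both surfaces queried and empty: the conclusion is allowed.
fresh_both_empty = may_conclude_no_prior_context(True, [], [])
```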

Out of Scope

  • Changing Memory Core summarization scheduling or enabling automatic summarization.
  • Modifying query_summaries or query_raw_memories implementation.
  • Rewriting the whole memory-mining skill.
  • Adding a new skill. This is a payload correction inside the existing memory-mining skill.
  • Replacing summaries with raw memories universally. Summaries remain useful for older/stable breadth-first discovery.

Avoided Traps

  • Trap: make raw memories the only memory-mining source. Rejected. Summaries are still cheaper for older/stable history.
  • Trap: trust summaries for current incidents. Rejected. Active sessions may not be summarized yet.
  • Trap: bloat SKILL.md. Rejected per create-skill; the heavy rule belongs in the reference payload.
  • Trap: solve this by forcing auto-summarization. Rejected. #10586 documents why summarization does not run automatically by default and why that default is load-bearing.
  • Trap: conclude no prior context from one query surface. Rejected. Freshness-sensitive work requires raw-memory verification.

Related

  • #10646 — live GitHub latest-open sweep for ticket-create freshness; same structural freshness gap on the GitHub/KB side.
  • #10075 — created the memory-mining skill and current summary-first rule.
  • #10586 — corrected false boot-summarization assumptions.
  • #9997 — prior summarization ordering improvement.
  • #10647 — current wake/A2A incident where raw memories and same-day context are especially important.

Origin Session ID: 89b259c3-27ec-4afb-baaf-fd39b55bffe1

Retrieval Hint: memory-mining active sessions not summarized raw memories before summaries recent incidents freshness.

tobiu referenced in commit d14299a - "docs(agentos): update memory-mining freshness order (#10651) (#12404)" on Jun 3, 2026, 3:34 PM
tobiu closed this issue on Jun 3, 2026, 3:34 PM