Frontmatter

id	11641
title	Phase 4C — Stale-Chunk Garbage Collection Daemon: Orphan Detection + Retention Enforcement
state	Closed
labels	enhancement, ai, architecture
assignees	neo-opus-ada
createdAt	May 19, 2026, 1:57 PM
updatedAt	Jun 7, 2026, 7:13 PM
githubUrl	https://github.com/neomjs/neo/issues/11641
author	neo-opus-ada
commentsCount	3
parentIssue	11628
subIssues	[]
subIssuesCompleted	0
subIssuesTotal	0
blockedBy	[x] 11712 Add a server-stamped ingestedAt timestamp to tenant KB chunk metadata, [x] 11640 Phase 4B — Manifest Reconciliation Daemon: Tenant-State vs Chroma-Actual Sync
blocking	[]
closedAt	May 21, 2026, 1:19 PM

Phase 4C — Stale-Chunk Garbage Collection Daemon: Orphan Detection + Retention Enforcement

Closed · v13.0.0/archive-v13-0-0-chunk-12 · enhancement, ai, architecture

neo-opus-ada commented on May 19, 2026, 1:57 PM

Context

Sub-issue of Phase 4 Epic #11628 (meta-Epic #11624).

Stale-chunk garbage collection is distinct from reconciliation (Phase 4B): reconciliation diffs claimed-vs-actual state, while GC enforces a RETENTION POLICY (time-based, count-based, or version-based expiration).

The Problem

Without active GC:

Tombstoned chunks keep inflating the Chroma collection size (Chroma soft-deletes; physical reclaim needs explicit GC)
Source-config changes (parser swap, source-type removed) orphan chunks that no longer match tenant config
Per-tenant retention policies (e.g., "retain only last 90 days of historical chunks") need enforcement
A ChromaDB defrag path already exists (ai:defrag-kb precedent: 10s wall / 321→189MB / 41% reduction on ~10k chunks), but defrag is operator-triggered, not automatic per-tenant

The Fix

New daemon: ai/scripts/kb-gc-daemon.mjs (sibling to existing daemons).

Per scheduled tick (configurable; default daily):

For each active tenant:
- Identify retention-expired chunks (per aiConfig.knowledgeBase.tenantRetention policy)
- Identify config-orphaned chunks (source-config change leaves chunks unmatched)
- Identify reconciliation-tombstoned chunks past grace period (Phase 4B integration)
Garbage collection actions:
- Delete from Chroma (via VectorService.delete({ids}))
- Emit observability events (Phase 4A)
- Trigger ai:defrag-kb if cumulative deletion > threshold (10% collection size)
Cross-tenant safety: tenant A's GC cannot touch tenant B's chunks (RLS via Phase 0/1D read-side filter)

Acceptance Criteria

ai/scripts/kb-gc-daemon.mjs exists; follows existing daemon pattern
Retention policy structure: per-tenant config (time-based / count-based / version-based)
Config-orphan detection: chunks whose source-config no longer matches tenant config
Tombstone-grace-period integration with Phase 4B
Defrag trigger when cumulative deletion > threshold (default 10% collection size)
Cross-tenant safety: GC respects RLS; tenant A cannot affect tenant B
Configurable tick interval (aiConfig.knowledgeBase.gcIntervalMs; default 86400000 = 24h)
Unit tests: retention enforcement, orphan detection, RLS isolation
Integration test: tenant pushes content, retention expires, GC removes, defrag triggers
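
The configurable-interval criterion above, together with the other `gc*` keys that the Contract Ledger further down lists in Row 1, suggests a defensive config read. The following is a minimal sketch under stated assumptions, not the daemon's actual code: `GC_DEFAULTS` and `readGcConfig` are hypothetical names, and only the key names and default values come from the ticket.

```javascript
// Defaults copied from Row 1 of the Contract Ledger.
// GC_DEFAULTS and readGcConfig are hypothetical names used for illustration only.
const GC_DEFAULTS = Object.freeze({
    gcEnabled        : false,    // master opt-in
    gcIntervalMs     : 86400000, // 24h poll tick
    gcRetention      : {},       // {maxAgeMs?, maxCount?}; empty means nothing expires
    gcAutoDelete     : false,    // destructive delete stays opt-in
    gcDefragThreshold: 0.10      // 0 disables the defrag-recommended signal
});

// Reads every key defensively, so a stale config.mjs without the gc* keys
// (or with a wrongly typed value) falls back to the safe default.
function readGcConfig(knowledgeBase) {
    const source = knowledgeBase !== null && typeof knowledgeBase === 'object' ? knowledgeBase : {};
    const result = {};

    for (const [key, fallback] of Object.entries(GC_DEFAULTS)) {
        const value = source[key];
        let valid   = typeof value === typeof fallback;

        if (typeof fallback === 'number') valid = Number.isFinite(value);
        if (typeof fallback === 'object') valid = valid && value !== null && !Array.isArray(value);

        result[key] = valid ? value : fallback;
    }

    // Copy so callers cannot mutate the shared default or the caller's config object.
    result.gcRetention = {...result.gcRetention};
    return result;
}
```

Calling `readGcConfig(aiConfig.knowledgeBase)` once per tick would let the daemon check `gcEnabled` before doing any work; whether the real daemon reads config per tick or once at startup is not stated in the ticket.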

Out of Scope

Reconciliation logic → Phase 4B
Observability event collection → Phase 4A
Initial defrag substrate (ai:defrag-kb already exists; this daemon TRIGGERS it conditionally)

Contract Ledger

Added at intake by @neo-opus-ada (Claude Code) 2026-05-21 — satisfies the ticket-intake §7 Contract Completeness gate. The original-author session is inactive; per §7 the claiming maintainer authors the ledger. Tier target: T3 (Explicit Matrix). The ledger is the binding contract; the loose Acceptance Criteria above are refined by these rows.

V1 scope. V1 = retention-policy chunk expiration + physical-reclaim signalling. The intake scope-refinement dropped config-orphan detection — it overlaps #11640's config-invalidation reconciliation (the same tenantConfigVersion signal). Time- and count-based retention are substrate-backed by the ingestedAt chunk stamp (#11712, merged to dev). Full design rationale: the intake-update comment https://github.com/neomjs/neo/issues/11641#issuecomment-4506820763.

Refined 2026-05-21 per @neo-gpt's #11641 pre-branch peer review: explicit OR-expiry semantics, a deterministic count-ordering tie-breaker, {tenantId, repoSlug} count-bucketing, and a V1 defrag-recommended signal in place of spawning ai:defrag-kb (Rows 2 + 4 + 6).
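
To make the OR-expiry, bucketing, and tie-breaker semantics concrete, here is a hedged sketch of a pure classifier in the shape Row 2 of the ledger describes. It is an illustration, not the shipped `KbGarbageCollectionEngine`. Assumptions: rows look like `{id, metadata: {tenantId, repoSlug, ingestedAt}}`, `evaluatedCount` counts only rows with a usable `ingestedAt`, and `Number.isFinite` is used as a slightly stricter stand-in for the ledger's `typeof === 'number'` check (it also rejects `NaN`).

```javascript
// Sketch of the Row 2 classifier: no I/O, no clock; the caller passes `now`.
// Assumed row shape (not confirmed by the ticket): {id, metadata: {tenantId, repoSlug, ingestedAt}}.
function selectExpiredChunks({rows = [], retention = {}, now}) {
    const {maxAgeMs, maxCount} = retention || {};
    const hasAge   = Number.isFinite(maxAgeMs);
    const hasCount = Number.isFinite(maxCount);

    // Empty or absent policy: no chunk is ever retention-expired.
    if (!hasAge && !hasCount) {
        return {expiredIds: [], expiredCount: 0, evaluatedCount: 0};
    }

    // Fail-safe: rows without a numeric ingestedAt are neither aged nor ranked.
    const rankable = rows.filter(row => Number.isFinite(row.metadata?.ingestedAt));
    const expired  = new Set();

    if (hasAge) {
        for (const row of rankable) {
            if (now - row.metadata.ingestedAt > maxAgeMs) expired.add(row.id);
        }
    }

    if (hasCount) {
        const buckets = new Map();

        for (const row of rankable) {
            // A serialized tuple avoids collisions such as ('a', 'b/c') vs ('a/b', 'c').
            const key = JSON.stringify([row.metadata.tenantId, row.metadata.repoSlug]);
            if (!buckets.has(key)) buckets.set(key, []);
            buckets.get(key).push(row);
        }

        for (const bucket of buckets.values()) {
            // Newest first; equal timestamps fall back to id ascending for determinism.
            bucket.sort((a, b) =>
                b.metadata.ingestedAt - a.metadata.ingestedAt ||
                (a.id < b.id ? -1 : a.id > b.id ? 1 : 0)
            );
            // Everything ranked at or beyond maxCount is count-expired.
            bucket.slice(Math.max(0, maxCount)).forEach(row => expired.add(row.id));
        }
    }

    // The Set is the OR-union of time-expired and count-expired ids.
    const expiredIds = [...expired];
    return {expiredIds, expiredCount: expiredIds.length, evaluatedCount: rankable.length};
}
```

Because two batch-ingested chunks can share an `ingestedAt`, the id-ascending fallback is what keeps repeated runs over the same rows from expiring different chunks.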

Target Surface	Source of Authority	Proposed Behavior	Fallback / Edge Case	Docs	Evidence
`aiConfig.knowledgeBase` GC config block — 5 new keys	#11628 Phase 4C; this ticket AC; #11640 / #11642's `aiConfig.knowledgeBase` precedent	`gcEnabled` (Boolean, default `false`) — master opt-in; the daemon exits early when false. `gcIntervalMs` (Number, default `86400000` = 24h) — poll-tick interval. `gcRetention` (Object, default `{}`) — `{maxAgeMs?, maxCount?}` retention policy. `gcAutoDelete` (Boolean, default `false`) — opt-in for the destructive Chroma delete; default-off ⇒ detect + emit telemetry only. `gcDefragThreshold` (Number, default `0.10`) — the cumulative-deletion fraction above which the daemon emits a `defrag-recommended` signal; `0` disables the signal.	A stale gitignored `config.mjs` predating #11641 lacks the `gc*` keys → each read defensively against its default (the #11640 / #11642 defensive-read pattern). An empty `gcRetention {}` ⇒ no chunk is ever retention-expired (conservative — the ticket's "default conservative" handoff hint).	Yes — `ai/config.template.mjs` block + JSDoc	Unit: config-defaulting + the daemon opt-in gate
Retention-expiry classification — `KbGarbageCollectionEngine` pure core	#11712 `ingestedAt` chunk stamp; this ticket's retention AC; @neo-gpt #11641 peer review	The pure, dependency-free classifier (mirrors #11640's `KbReconciliationEngine`). `selectExpiredChunks({rows, retention, now})` → a chunk is retention-expired under OR-expiry: expired if time-expired OR count-expired (the union — the broader set). Time-expiry: `typeof metadata.ingestedAt === 'number' && now − ingestedAt > maxAgeMs`. Count-expiry: rows are bucketed by `{tenantId, repoSlug}`, each bucket sorted `ingestedAt` desc, then chunk `id` asc (a deterministic tie-breaker — batch-ingested chunks share an `ingestedAt`); a chunk ranked at or beyond `maxCount` within its bucket is count-expired. Returns `{expiredIds, expiredCount, evaluatedCount}`. No I/O, no clock — the caller passes `now`.	A chunk with a missing / non-numeric `ingestedAt` (a pre-#11712 ingest) is never flagged — fail-safe for both time (age uncomputable) and count (unrankable → excluded from the expired set), mirroring #11640's missing-`tenantConfigVersion` skip. An empty / absent `retention` policy → empty result.	Yes — JSDoc	Unit: time-expiry, count-expiry per `{tenantId, repoSlug}` bucket, the OR-union, the deterministic tie-break on equal `ingestedAt`, missing-`ingestedAt` skip, empty-policy no-op
The destructive GC delete — `knowledge-base` Chroma collection delete	this ticket AC ("GC removes"); the destructive-action conservatism principle	When `gcAutoDelete` is `true`, the daemon deletes a tenant's `expiredIds` via `collection.delete({ids})`. Tenant-scoped — `expiredIds` derive only from rows fetched with `where: {tenantId}` (the `getTenantRows` batched-`collection.get` pattern); tenant A's GC never touches tenant B's chunks (the ticket's RLS-safety AC).	`gcAutoDelete` is `false` (the default) → no delete is ever issued; the daemon detects + emits telemetry only. A `collection.delete` throw → `logger.error`, best-effort; the daemon continues to the next tenant.	Yes — daemon JSDoc	Unit: delete gated by the opt-in flag; tenant-scoped id set; delete-throw tolerance
Defrag-recommended signal — physical-reclaim observability	this ticket AC ("trigger defrag"); @neo-gpt #11641 peer review	When a tick's cumulative deletion (summed across tenants) exceeds `gcDefragThreshold` of the collection's chunk count, the daemon emits a `defrag-recommended` signal — a `logger.warn` plus a telemetry `detail` flag — surfacing that an operator should run `ai:defrag-kb`. V1 does not spawn `ai:defrag-kb` — see Row 6.	`gcDefragThreshold` is `0` → no signal. The daemon spawns no subprocess in V1 → there is no defrag-vs-ingest concurrency surface.	Yes — daemon JSDoc	Unit: the signal fires when cumulative deletion exceeds the threshold, stays silent below it
Phase 4A telemetry emission — `KBRecorderService.recordIngestionMetric`	#11639 Phase 4A; `recordIngestionMetric`'s `'tombstone'` event type	When a tenant has ≥ 1 retention-expired chunk, the daemon emits one `recordIngestionMetric({tenantId, repoSlug, eventType: 'tombstone', chunksTotal: expiredCount, chunksDeleted: <count deleted this tick — 0 when gcAutoDelete is off>, detail: {expiredCount, deletedCount, retention, gcAutoDelete, defragRecommended}})`. A clean tenant (zero expired) emits nothing.	`recordIngestionMetric` is best-effort (never throws into the caller). A GC delete is a `'tombstone'`-class event — `recordIngestionMetric`'s taxonomy has no dedicated `'gc'` type; `'tombstone'` is the honest fit (a logical deletion).	Yes — JSDoc	Unit: a `tombstone` metric is emitted for a tenant with expired chunks, suppressed for a clean one
V1 scope boundary — config-orphan detection · per-tenant retention override · auto-defrag spawn	this ticket "The Fix" / ACs; the intake scope-refinement; @neo-gpt #11641 peer review	Documented V1 deltas. (a) Config-orphan detection is dropped — #11640's `KbReconciliationService` already detects + opt-in-tombstones config-stale chunks; 4C re-detecting them is double-handling (the intake de-dup). (b) Per-tenant retention override — V1 applies one global `gcRetention` per tenant; a per-tenant override needs extending `getTenantConfig`'s fixed projection (#11637 surface) → V1.x. (c) Auto-defrag spawn — V1 emits a `defrag-recommended` signal only (Row 4); the automated `ai:defrag-kb` spawn + its defrag-vs-ingest concurrency-coordination story are V1.x (auto-spawning a whole-collection nuke-and-pave from a poll-loop daemon is a separable, concurrency-sensitive design).	V1.x is a separate follow-up ticket; the PR body "Deltas" documents each.	Yes — PR body "Deltas" + a V1.x follow-up	N/A — explicit scope boundary
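
Read together, Rows 1 and 3 to 5 describe one tick of the daemon. The sketch below shows how those rows could compose, with every collaborator injected so the flow can be exercised without Chroma. All parameter names (`tenants`, `getTenantRows`, `selectExpired`, `deleteIds`, `recordMetric`, `countChunks`) are hypothetical, `repoSlug` is left out of the metric because the ticket does not say where the daemon obtains it, and measuring the collection size at tick start is an assumption.

```javascript
// One GC tick, sketched from the Contract Ledger. All injected names are hypothetical.
async function runGcTick({config, tenants, getTenantRows, selectExpired, deleteIds, recordMetric, countChunks, logger, now}) {
    if (!config.gcEnabled) return {skipped: true}; // Row 1: master opt-in gate

    const totalChunks = await countChunks();        // assumed: collection size at tick start
    const perTenant   = [];
    let deletedTotal  = 0;

    for (const tenantId of tenants) {
        const rows = await getTenantRows(tenantId); // Row 3: ids derive only from this tenant's rows
        const {expiredIds, expiredCount} = selectExpired({rows, retention: config.gcRetention, now});

        if (expiredCount === 0) continue;           // Row 5: a clean tenant emits nothing

        let deletedCount = 0;

        if (config.gcAutoDelete) {                  // Row 3: destructive delete is opt-in
            try {
                await deleteIds(expiredIds);
                deletedCount = expiredIds.length;
            } catch (error) {
                // Best-effort: log and continue with the next tenant.
                logger.error(`GC delete failed for tenant ${tenantId}`, error);
            }
        }

        deletedTotal += deletedCount;
        perTenant.push({tenantId, expiredCount, deletedCount});
    }

    // Row 4: signal only. A threshold of 0 disables it, and nothing is spawned.
    const threshold         = config.gcDefragThreshold;
    const defragRecommended = threshold > 0 && totalChunks > 0 && deletedTotal / totalChunks > threshold;

    if (defragRecommended) {
        logger.warn(`defrag-recommended: ${deletedTotal} of ${totalChunks} chunks deleted this tick`);
    }

    // Row 5: one 'tombstone' metric per tenant that had expired chunks.
    for (const {tenantId, expiredCount, deletedCount} of perTenant) {
        await recordMetric({
            tenantId,
            eventType    : 'tombstone',
            chunksTotal  : expiredCount,
            chunksDeleted: deletedCount,
            detail       : {expiredCount, deletedCount, retention: config.gcRetention, gcAutoDelete: !!config.gcAutoDelete, defragRecommended}
        });
    }

    return {skipped: false, deletedTotal, defragRecommended, perTenant};
}
```

In this sketch the metrics are emitted after the tenant loop because `defragRecommended` depends on the cumulative deletion count across all tenants. The actual daemon may order these steps differently.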

Prior Art / Defrag-Backup Substrate Cross-References

Substrate-correct V-B-A calibration 2026-05-19: per #10129 Phase 3 peer architecture, defragChromaDB.mjs and backup.mjs are peer scripts with orthogonal responsibilities, NOT delegates. Phase 4C extends/triggers existing defrag substrate:

buildScripts/ai/defragChromaDB.mjs — 5-step "Nuke and Pave": (1) Pre-Nuke Snapshot via fs.copy() to dist/chromadb-backups/<target>/backup-<numeric-ts>/ (HNSW state preserved); (2) Extract all collections to in-memory; (3) Nuke collections via API; (4) Load (recreate + reinsert; forces HNSW rebuild); (5) Cleanup orphan UUID directories. Existing retention: keep last 3, delete others older than 7 days.
buildScripts/ai/backup.mjs — JSONL bundle peer. Operators chain ai:defrag-kb && ai:backup for compacted bundles. Phase 4C daemon triggers defrag automatically based on cumulative-deletion threshold.
Tenant-aware GC requires read-side filter — Phase 0/1D's where: {tenantId} filter on QueryService ensures cross-tenant safety; the GC daemon enumerates orphans WITHIN a tenant's scope, never cross-tenant.
Existing test substrate to extend, not duplicate: test/playwright/unit/ai/buildScripts/backup.spec.mjs (defrag-trigger logic; retention pattern); KB DatabaseService.backup.spec.mjs (export/import lifecycle).
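
The tenant-scoping point above can be illustrated with a batched, tenant-filtered fetch. This is a sketch of what the ledger calls the `getTenantRows` batched-`collection.get` pattern, not the repository's implementation. It assumes the collection object exposes `get({where, limit, offset, include})` returning `{ids, metadatas}`, which matches my reading of the Chroma JavaScript client but should be checked against the installed version; the `batchSize` default is arbitrary.

```javascript
// Fetches every row of one tenant in pages, always with a tenantId filter.
// Assumed collection API: get({where, limit, offset, include}) -> {ids, metadatas}.
async function getTenantRows(collection, tenantId, batchSize = 500) {
    // Refuse an unscoped fetch: an empty filter would read across tenants.
    if (!tenantId) throw new Error('tenantId is required; refusing an unscoped fetch');

    const rows = [];

    for (let offset = 0; ; offset += batchSize) {
        const page = await collection.get({
            where  : {tenantId},
            limit  : batchSize,
            offset,
            include: ['metadatas']
        });
        const ids = page?.ids ?? [];

        ids.forEach((id, index) => rows.push({id, metadata: page.metadatas?.[index] ?? {}}));

        // A short (or empty) page means the tenant's rows are exhausted.
        if (ids.length < batchSize) break;
    }

    return rows;
}
```

Since the expired id set is computed only from these rows, any later `collection.delete({ids})` is tenant-scoped by construction, which is the cross-tenant safety property the ticket asks for.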

Parent: #11628
Blocked-by: Phase 4A (observability), Phase 4B (reconciliation tombstones)
Existing defrag substrate: npm run ai:defrag-kb (memory anchor: 10s wall / 321→189MB / 41% reduction on ~10k chunks); peer script buildScripts/ai/defragChromaDB.mjs
Related substrate: #10129 atomic-bundle backup peer architecture
RLS substrate: Phase 0/1D read-side filter (cross-tenant GC safety)

Origin Session ID

7360e917-1733-4cdd-a6f3-5ac51c34b838

Handoff Retrieval Hints

ai:defrag-kb script is the existing defrag substrate to integrate with
Retention policies are deployment-specific; default conservative (long retention) acceptable for V1

tobiu referenced in commit 5d64a1f - "feat(ai): KB ingestion telemetry schema + recordIngestionMetric API (#11639) (#11667)" on May 20, 2026, 8:01 AM

tobiu referenced in commit d03179a - "feat(ai): stamp ingestedAt on tenant KB chunks (#11712) (#11713)" on May 21, 2026, 11:33 AM

tobiu referenced in commit 82ea006 - "feat(ai): KB garbage-collection daemon — Phase 4C (#11641) (#11715)" on May 21, 2026, 1:19 PM

tobiu closed this issue on May 21, 2026, 1:19 PM