Frontmatter

id	11640
title	Phase 4B — Manifest Reconciliation Daemon: Tenant-State vs Chroma-Actual Sync
state	Closed
labels	enhancement, ai, architecture
assignees	neo-opus-ada
createdAt	May 19, 2026, 1:57 PM
updatedAt	Jun 7, 2026, 7:13 PM
githubUrl	https://github.com/neomjs/neo/issues/11640
author	neo-opus-ada
commentsCount	2
parentIssue	11628
subIssues	[]
subIssuesCompleted	0
subIssuesTotal	0
blockedBy	[x] 11639 Phase 4A — Per-Tenant Ingestion Observability Daemon (KBRecorderService Extension)
blocking	[x] 11641 Phase 4C — Stale-Chunk Garbage Collection Daemon: Orphan Detection + Retention Enforcement
closedAt	May 21, 2026, 11:33 AM

Phase 4B — Manifest Reconciliation Daemon: Tenant-State vs Chroma-Actual Sync

Closed · v13.0.0/archive-v13-0-0-chunk-12 · enhancement, ai, architecture

neo-opus-ada commented on May 19, 2026, 1:57 PM

Context

Sub of Phase 4 Epic #11628 (meta-Epic #11624).

This ticket adds a periodic state-reconciliation daemon that catches missed tombstones and handles force-push detection beyond what the per-push payload covers.

The Problem

Phase 0/1A defines tombstone / manifest / revision-boundary deletion-signaling, but production failure modes exist:

1. Client hooks fail mid-push (some files pushed, deletes lost)
2. Force-push history rewrite (per-push revision-boundary insufficient for branch-rewrite scenarios)
3. Tenant config changes invalidate previously-pushed chunks
4. Network errors cause partial-push state

Without periodic reconciliation, a tenant's Chroma state silently drifts from the tenant's actual repo state.

The Fix

New daemon: ai/scripts/kb-reconciliation-daemon.mjs (sibling to existing daemons).

Per scheduled tick (configurable; default hourly):

1. For each active tenant:
   - Fetch tenant's claimed-state manifest (from KnowledgeBaseTenantConfig Phase 2E + last-received revision boundary)
   - Fetch Chroma's actual-state chunks for the tenant (via where: {tenantId} filter from Phase 0/1D)
   - Diff: chunks in Chroma but not in claimed-state → orphaned (queue for GC or auto-tombstone)
   - Diff: claimed-state paths not represented in Chroma → re-ingestion candidates (notify tenant via observability surface)
2. Reconciliation actions:
   - Auto-tombstone orphans older than retention threshold (configurable per-tenant)
   - Emit observability events via Phase 4A
   - Alert operator if drift exceeds threshold (Phase 4D)
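
The per-tick diff above can be sketched as a pure set comparison. This is a minimal illustration under stated assumptions, not the daemon's implementation: `diffClaimedVsActual`, the chunk fields `id` / `path` / `ingestedAt`, and the `retentionMs` parameter are hypothetical names chosen for the example.

```javascript
// Hypothetical sketch: compares a tenant's claimed paths with the chunks Chroma holds.
// Field names (id, path, ingestedAt) and retentionMs are assumptions for illustration.
function diffClaimedVsActual({claimedPaths, actualChunks, now, retentionMs}) {
    const claimed     = new Set(claimedPaths);
    const represented = new Set();
    const orphans     = [];

    for (const chunk of actualChunks) {
        if (claimed.has(chunk.path)) {
            represented.add(chunk.path);
        } else {
            // Present in Chroma but absent from the claimed state: orphaned
            orphans.push(chunk);
        }
    }

    // Claimed paths with no chunk in Chroma: re-ingestion candidates
    const reingestCandidates = claimedPaths.filter(path => !represented.has(path));

    // Only orphans older than the retention threshold are eligible for auto-tombstoning
    const tombstoneIds = orphans
        .filter(chunk => now - chunk.ingestedAt >= retentionMs)
        .map(chunk => chunk.id);

    return {orphans, reingestCandidates, tombstoneIds};
}

const DAY = 24 * 60 * 60 * 1000;

const result = diffClaimedVsActual({
    claimedPaths: ['src/a.mjs', 'src/b.mjs'],
    actualChunks: [
        {id: 'c1', path: 'src/a.mjs',   ingestedAt: 0},
        {id: 'c2', path: 'src/old.mjs', ingestedAt: 0},
        {id: 'c3', path: 'src/tmp.mjs', ingestedAt: 9 * DAY}
    ],
    now        : 10 * DAY,
    retentionMs: 7 * DAY
});

console.log(result.reingestCandidates); // [ 'src/b.mjs' ]
console.log(result.tombstoneIds);       // [ 'c2' ]
```

Keeping the diff free of I/O and passing `now` in as a parameter makes the retention rule straightforward to unit-test without a live Chroma instance.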

Acceptance Criteria

- ai/scripts/kb-reconciliation-daemon.mjs exists; follows existing daemon pattern
- Per-tenant reconciliation logic implemented (claim vs actual diff)
- Configurable tick interval (aiConfig.knowledgeBase.reconciliationIntervalMs; default 3600000 = 1h)
- Configurable per-tenant retention threshold (orphan auto-tombstone after N days)
- Reconciliation events emitted via Phase 4A observability daemon
- Alerts emitted via Phase 4D when drift exceeds threshold
- Unit tests: diff logic, auto-tombstone, retention threshold
- Integration test: simulate tenant force-push → reconciliation removes orphans

Out of Scope

- Initial observability daemon → Phase 4A
- Operator alerting infrastructure → Phase 4D
- Per-chunk garbage collection scheduling → Phase 4C (separate ticket for GC-specific concerns)

Contract Ledger

Added at intake by @neo-opus-ada (Claude Code) 2026-05-21 — satisfies the ticket-intake §7 Contract Completeness readiness gate (intake comment: https://github.com/neomjs/neo/issues/11640#issuecomment-4504572192). The original-author session is inactive; per ticket-intake §7 the claiming maintainer authors the missing ledger. Tier target: T3 (Explicit Matrix). The ledger is the binding contract; the loose Acceptance Criteria above are refined by these rows.

V1 scope: substrate-grounded. A fresh sweep of the merged Phase 2 code shows that the ticket's "claim-vs-actual path-manifest diff" assumes a persisted claimed-state manifest that Phase 2 does not store: KnowledgeBaseTenantConfig carries no path manifest, and revision boundaries (baseRevision / headRevision) are per-push payload parameters, not persisted state. A periodic daemon with no repo access and no stored manifest therefore cannot detect arbitrary drift. The substrate-real V1 reconciliation signal is config-invalidation reconciliation: the tenantConfigVersion chunk stamp (#11637 / VectorService.resolveTenantStamp) is compared against the tenant's current getTenantConfig().version. This fully covers the ticket's failure mode #3 ("tenant config changes invalidate previously-pushed chunks"); failure modes #1/#2/#4 (mid-push, force-push, partial-push) are deferred to V1.x (Row 6).
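
The config-invalidation signal can be sketched as a pure function that follows the `diffTenantChunks` contract in the ledger's Row 2. This is an illustrative sketch, not the shipped `KbReconciliationEngine`: the row shape `{id, metadata}` and the fields stored on each `staleOrphans` entry are assumptions.

```javascript
// Sketch of the config-stale diff described in the Contract Ledger (Row 2).
// Assumption: each row has the shape {id, metadata}; the shipped engine may differ.
function diffTenantChunks({rows, currentVersion, orphanVersionGap}) {
    const staleOrphans  = [];
    const actionableIds = [];

    for (const row of rows) {
        const stamp = row.metadata?.tenantConfigVersion;

        // Fail-safe: a chunk without a numeric stamp cannot be classified, so it is skipped.
        // A chunk stamped with the current (or a newer) version is not stale.
        if (typeof stamp !== 'number' || !(stamp < currentVersion)) {
            continue;
        }

        const versionGap = currentVersion - stamp;

        staleOrphans.push({id: row.id, tenantConfigVersion: stamp, versionGap});

        // Only chunks that lag by at least orphanVersionGap are auto-tombstone-eligible
        if (versionGap >= orphanVersionGap) {
            actionableIds.push(row.id);
        }
    }

    return {
        staleOrphans,
        staleCount     : staleOrphans.length,
        actionableIds,
        actionableCount: actionableIds.length
    };
}

const rows = [
    {id: 'a', metadata: {tenantConfigVersion: 5}}, // current: not stale
    {id: 'b', metadata: {tenantConfigVersion: 4}}, // gap 1: stale, not actionable
    {id: 'c', metadata: {tenantConfigVersion: 2}}, // gap 3: stale and actionable
    {id: 'd', metadata: {}}                        // no stamp: skipped
];

const diff = diffTenantChunks({rows, currentVersion: 5, orphanVersionGap: 2});

console.log(diff.staleCount, diff.actionableIds); // 2 [ 'c' ]
```

With `currentVersion: 0` the comparison `stamp < currentVersion` never holds for non-negative stamps, so the default-config tier needs no special case.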

Refined 2026-05-21 per @neo-gpt's #11640 pre-PR peer review: V1 Phase 4D integration is drift-presence/frequency alerting via reconcileEvents (Row 5) — drift-volume threshold alerting needs a chunks_total rollup metric and is V1.x (Row 6).

| Target Surface | Source of Authority | Proposed Behavior | Fallback / Edge Case | Docs | Evidence |
|---|---|---|---|---|---|
| `aiConfig.knowledgeBase` reconciliation config block: 4 new keys | #11628 Phase 4B; this ticket AC; #11642's `aiConfig.knowledgeBase` precedent | `reconciliationEnabled` (Boolean, default `false`): master opt-in; the daemon exits early when false. `reconciliationIntervalMs` (Number, default `3600000` = 1h): poll-tick interval. `reconciliationAutoTombstone` (Boolean, default `false`): opt-in for the destructive auto-tombstone action; default-off ⇒ the daemon detects + reports only. `reconciliationOrphanVersionGap` (Number, default `2`): a config-stale chunk is auto-tombstone-eligible when `currentVersion − chunk.tenantConfigVersion ≥ this`. | A stale gitignored `config.mjs` predating #11640 has no `knowledgeBase` key (or lacks the reconciliation keys) → every key is read defensively against its default (mirrors #11642's defensive read). | Yes: `ai/config.template.mjs` block + JSDoc | Unit: config-defaulting + the daemon's opt-in gate |
| Config-stale orphan detection: `KbReconciliationEngine` pure core | #11637 `tenantConfigVersion` stamp; `getTenantConfig().version` | The pure, dependency-free diff engine (mirrors #11642's `KbAlertRuleEngine`). `diffTenantChunks({rows, currentVersion, orphanVersionGap})` → a chunk is a config-stale orphan when `typeof metadata.tenantConfigVersion === 'number' && metadata.tenantConfigVersion < currentVersion`; its `versionGap = currentVersion − tenantConfigVersion`; it is actionable when `versionGap ≥ orphanVersionGap`. Returns `{staleOrphans, staleCount, actionableIds, actionableCount}`. No I/O, no clock. | `currentVersion === 0` (tenant on the yaml/default config tier, no graph node) → no chunk can be stale (`v < 0` is never true); zero orphans, no special-case. A chunk with a missing / non-numeric `tenantConfigVersion` (pre-#11637 ingest) → not flagged (fail-safe: never auto-action a chunk we cannot classify). | Yes: JSDoc | Unit: stale-detection, version-gap partition, `currentVersion: 0` no-op, missing-stamp skip |
| The auto-tombstone reconciliation action: `knowledge-base` Chroma collection delete | this ticket AC ("auto-tombstone orphans"); the destructive-action conservatism principle | When `reconciliationAutoTombstone` is `true`, the daemon deletes a tenant's `actionableIds` via `collection.delete({ids})`. The delete is tenant-scoped: `actionableIds` derive only from rows fetched with `where: {tenantId}` (the `getTenantRows` batched-`collection.get` pattern); tenant A's reconciliation never touches tenant B's chunks. | `reconciliationAutoTombstone` is `false` (the default) → no delete is ever issued; the daemon detects + emits telemetry only. A `collection.delete` throw → `logger.error`, best-effort; the daemon continues to the next tenant. | Yes: daemon JSDoc | Unit: delete gated by the opt-in flag; tenant-scoped id set; delete-throw tolerance |
| Phase 4A telemetry emission: `KBRecorderService.recordIngestionMetric` | #11639 / #11665 Phase 4A; `recordIngestionMetric`'s `'reconcile'` event type | When a tenant has ≥ 1 config-stale orphan, the daemon emits exactly one `recordIngestionMetric({tenantId, repoSlug, eventType: 'reconcile', chunksTotal: staleCount, chunksDeleted: <count tombstoned this tick; 0 when auto-tombstone is off>, detail: {staleCount, actionableCount, currentVersion, autoTombstone}})`. A clean tenant (zero orphans) emits nothing, so `reconcileEvents > 0` genuinely means "drift was found". | `recordIngestionMetric` is already best-effort (never throws into the caller). The recorder unavailable → the metric is silently dropped; reconciliation still runs. | Yes: JSDoc | Unit: a `reconcile` metric is emitted for a drifting tenant, suppressed for a clean one |
| Phase 4D alerting integration: telemetry-only seam | #11642 Phase 4D; `KbAlertRuleEngine.KNOWN_METRICS` | The ticket AC "alerts emitted via Phase 4D when drift exceeds threshold" is satisfied with no #11642 code change as drift-presence/frequency alerting. The daemon emits one `reconcile` event per tenant-with-drift per tick; `reconcileEvents` is already a `KbAlertRuleEngine.KNOWN_METRICS` field, so an operator's `aiConfig.knowledgeBase.alertRules` entry on `reconcileEvents` fires when a tenant shows persistent / frequent drift across the alert window. `chunksDeleted` is also `KNOWN_METRICS`-covered: alertable as action volume, but non-zero only when `reconciliationAutoTombstone` is on (`0` in the default detect-only posture). The existing #11642 `KbAlertingService` rolls up the events this daemon emits and fires; the reconciliation daemon does not dispatch alerts itself (no duplication of #11642's channel logic). | Drift-volume thresholding (alert when the stale-chunk count exceeds N) is not available in V1: `getTenantIngestionRollup` does not aggregate `chunks_total`, and `chunksTotal` / `staleCount` are not in `KNOWN_METRICS`. The daemon still records `chunksTotal: staleCount` + a `detail` payload (raw-row and `detail`-visible), so the data is captured, but rollup-aggregated volume alerting needs a `chunks_total` rollup metric + `KNOWN_METRICS` coverage → Row 6 (V1.x). No `alertRules` configured → telemetry is still recorded; only alert dispatch is absent. | Yes: JSDoc cross-ref | Verified by inspection (@neo-gpt #11640 pre-PR peer review, 2026-05-21): `reconcileEvents` ∈ `KNOWN_METRICS`; `getTenantIngestionRollup` SQL has no `chunks_total` aggregation |
| V1 scope boundary: manifest / force-push detection + per-tenant threshold override + drift-volume alerting | this ticket "The Fix" (failure-modes #1/#2/#4) + AC; ticket-intake "challenge prescribed fixes"; @neo-gpt #11640 pre-PR peer review | V1.x-deferred, documented. (a) Manifest / force-push / partial-push orphan detection requires a persisted claimed-state manifest or a persisted last-received revision boundary; neither exists in the Phase 2 substrate. (b) A per-tenant `orphanVersionGap` override requires extending `getTenantConfig`'s fixed 8-field projection, a #11637-surface change. (c) Drift-volume threshold alerting requires a `chunks_total` rollup metric in `getTenantIngestionRollup` + `chunksTotal` in `KbAlertRuleEngine.KNOWN_METRICS`, a #11639 / #11642-surface change. V1 ships drift-presence/frequency alerting via `reconcileEvents` (Row 5) and touches zero merged Phase 2 code (purely additive: 3 new files + the `aiConfig` template block). | V1.x is a separate follow-up ticket: it adds the persisted-manifest substrate (or a daemon-side repo walk), the per-tenant override, and the `chunks_total` rollup metric. The PR body files it. | Yes: PR body "Deltas" + a V1.x follow-up ticket | N/A: explicit scope boundary |
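
Rows 1, 3 and 4 of the ledger can be read together as one per-tenant step: read the config defensively, delete only behind the opt-in flag, and emit a single `reconcile` metric when drift exists. The sketch below illustrates that flow under assumptions. `resolveReconciliationConfig`, `reconcileTenant` and the in-memory stubs are hypothetical; only the four config keys, `collection.delete({ids})`, `logger.error` and the `recordIngestionMetric` payload fields come from the ledger.

```javascript
// Hypothetical helpers; only the config keys, collection.delete({ids}) and the
// recordIngestionMetric payload fields are taken from the ledger.
const DEFAULTS = {
    reconciliationEnabled         : false,
    reconciliationIntervalMs      : 3600000,
    reconciliationAutoTombstone   : false,
    reconciliationOrphanVersionGap: 2
};

// Reads every key defensively, so a config.mjs without the block falls back to defaults
function resolveReconciliationConfig(aiConfig) {
    const kb     = aiConfig?.knowledgeBase ?? {};
    const config = {};

    for (const [key, fallback] of Object.entries(DEFAULTS)) {
        config[key] = typeof kb[key] === typeof fallback ? kb[key] : fallback;
    }

    return config;
}

// Applies one tenant's diff result: optional delete, then one metric if drift was found.
// `diff` is assumed to have the {staleCount, actionableIds, actionableCount} shape.
async function reconcileTenant({tenantId, repoSlug, currentVersion, diff, config, collection, recorder, logger}) {
    if (diff.staleCount === 0) {
        return 0; // clean tenant: emit nothing
    }

    let chunksDeleted = 0;

    if (config.reconciliationAutoTombstone && diff.actionableCount > 0) {
        try {
            await collection.delete({ids: diff.actionableIds});
            chunksDeleted = diff.actionableCount;
        } catch (error) {
            // Best-effort: log and let the caller continue with the next tenant
            logger.error(`reconcile delete failed for ${tenantId}`, error);
        }
    }

    await recorder.recordIngestionMetric({
        tenantId,
        repoSlug,
        eventType  : 'reconcile',
        chunksTotal: diff.staleCount,
        chunksDeleted,
        detail     : {
            staleCount     : diff.staleCount,
            actionableCount: diff.actionableCount,
            currentVersion,
            autoTombstone  : config.reconciliationAutoTombstone
        }
    });

    return chunksDeleted;
}

// Usage with in-memory stubs: detect-only defaults first, then auto-tombstone enabled
const deleted = [];
const metrics = [];
const stubs   = {
    collection: {delete: async ({ids}) => { deleted.push(...ids); }},
    recorder  : {recordIngestionMetric: metric => { metrics.push(metric); }},
    logger    : console
};
const tenant = {
    tenantId: 't1', repoSlug: 'org/repo', currentVersion: 5,
    diff    : {staleCount: 2, actionableIds: ['c'], actionableCount: 1}
};

const detectOnly  = resolveReconciliationConfig({}); // config.mjs without the block
const destructive = resolveReconciliationConfig({knowledgeBase: {reconciliationAutoTombstone: true}});

reconcileTenant({...tenant, ...stubs, config: detectOnly})
    .then(() => reconcileTenant({...tenant, ...stubs, config: destructive}))
    .then(() => console.log(deleted, metrics.map(metric => metric.chunksDeleted)));
```

In the detect-only run no delete is issued and the metric carries `chunksDeleted: 0`; only the second run, with the flag enabled, reaches `collection.delete`.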

Prior Art / Backup-Restore Substrate Cross-References

Substrate-correct V-B-A calibration 2026-05-19 post-PR-#11647 merge: the backup/restore/defrag substrate I underspecified during Phase 0/1A scoping is load-bearing for this daemon's design. Per #10129 Phase 3 peer architecture:

- buildScripts/ai/backup.mjs: canonical atomic-bundle orchestrator. Layout: .neo-ai-data/backups/backup-<ISO-ts>/{kb,mc,graph,concepts,trajectories}/. The reconciliation daemon should COEXIST with the existing bundle-restore lifecycle, NOT replace it.
- buildScripts/ai/restore.mjs: merge-mode preserve-live semantics already shipped. Graph SQLite uses INSERT OR IGNORE (post-#11141 fix; pre-fix used silently-broken INSERT OR REPLACE). The Chroma side uses collection.upsert(); #11144 tracks the preserve-live parity follow-up. The daemon's tenant-state reconciliation operates ABOVE this layer: it diffs tenant-claimed vs Chroma-actual, while restore.mjs operates on the bundle vs collection axis. Both can coexist.
- buildScripts/ai/defragChromaDB.mjs: peer of backup.mjs (NOT a delegate). 5-step nuke-and-pave with private pre-nuke physical-copy snapshots at dist/chromadb-backups/<target>/. Operators chain ai:defrag-kb && ai:backup for compacted backups. Phase 4C (#11641 GC daemon) is the closer substrate sibling.
- Existing test substrate to extend, not duplicate: test/playwright/unit/ai/buildScripts/restore.spec.mjs, restore-hardening.spec.mjs, restore-filters.spec.mjs, backup.spec.mjs.

Operator framing (2026-05-19): "we have more complex restoration scripts, since our live db got wiped 2x already, and we can now merge backups and new live DB data." — merge-mode capability IS prior art, not Phase 4 new substrate. This daemon adds tenant-state reconciliation on top.

Parent: #11628
Blocked-by: Phase 4A (observability event emission), Phase 2 Epic (ingestion pipeline must exist)
Related substrate (cross-references): #10129 atomic-bundle backup orchestrator, #11141 graph preserve-live fix, #11144 Chroma preserve-live parity follow-up
Sibling backup primitives: buildScripts/ai/{backup,restore,defragChromaDB}.mjs; tests under test/playwright/unit/ai/buildScripts/
Daemon pattern precedent: ai/scripts/orchestrator-daemon.mjs
Tombstone contract: Phase 0/1A

Origin Session ID

7360e917-1733-4cdd-a6f3-5ac51c34b838

Handoff Retrieval Hints

orchestrator-daemon.mjs scheduling pattern is the architectural reference
Reconciliation = claim-vs-actual diff; the substrate-correct sibling pattern to look at is git's index-vs-working-tree-diff (conceptual reference, not code reuse)

tobiu referenced in commit 617c712 - "fix(memory-core): add .claude and .codex to FileSystemIngestor ignorePatterns (#11650) (#11651)" on May 19, 2026, 5:25 PM

tobiu referenced in commit 5d64a1f - "feat(ai): KB ingestion telemetry schema + recordIngestionMetric API (#11639) (#11667)" on May 20, 2026, 8:01 AM

tobiu referenced in commit 90a880d - "feat(ai): KB reconciliation daemon — Phase 4B (#11640) (#11710)" on May 21, 2026, 11:33 AM

tobiu closed this issue on May 21, 2026, 11:33 AM

tobiu referenced in commit d03179a - "feat(ai): stamp ingestedAt on tenant KB chunks (#11712) (#11713)" on May 21, 2026, 11:33 AM

tobiu referenced in commit 82ea006 - "feat(ai): KB garbage-collection daemon — Phase 4C (#11641) (#11715)" on May 21, 2026, 1:19 PM

tobiu referenced in commit 5626d8e - "fix(ai): mark kb-config node visibility:team for offline daemon reads (#11716) (#11717)" on May 21, 2026, 2:01 PM