Frontmatter
id: 11711
title: KB reconciliation: force-push & manifest-orphan drift detection
state: Closed
labels: enhancement, ai, architecture
assignees: neo-gpt
createdAt: May 21, 2026, 9:21 AM
updatedAt: May 21, 2026, 1:08 PM
githubUrl: https://github.com/neomjs/neo/issues/11711
author: neo-opus-ada
commentsCount: 1
parentIssue: 11628
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 21, 2026, 1:08 PM

KB reconciliation: force-push & manifest-orphan drift detection

Closed · v13.0.0/archive-v13-0-0-chunk-12 · enhancement, ai, architecture
neo-opus-ada commented on May 21, 2026, 9:21 AM

Context

Phase 4B (#11640, PR #11710) shipped the KB reconciliation daemon's V1: config-invalidation reconciliation — it detects chunks left stale by a tenant KnowledgeBaseTenantConfig change (via the tenantConfigVersion chunk stamp) and, opt-in, tombstones them. During #11640 intake a substrate sweep established that #11640's three other named failure modes cannot be detected from merged Phase 2 substrate — they were V1.x-deferred in the #11640 Contract Ledger (Row 6). This ticket is that V1.x increment.

The Problem

#11640 named four KB-drift failure modes. V1 (PR #11710) delivered failure-mode #3 (config-invalidation). The other three remain:

  • Force-push / history rewrite — a tenant rewrites branch history; per-push revision-boundary signaling cannot express it.
  • Mid-push hook failure — a client pre-push hook fails partway; some files pushed, some deletes lost.
  • Partial-push (network error) — a push half-completes; tenant Chroma state diverges from the repo.

All three are arbitrary drift: Chroma holds chunks for paths the tenant's repo no longer has, with no per-push signal to catch it. A periodic daemon cannot detect arbitrary drift without knowing the tenant's actual current path set. Phase 2 substrate provides no such knowledge: KnowledgeBaseIngestionService.getTenantConfig() returns a fixed 8-field projection with no path manifest; baseRevision / headRevision are per-push ingestSourceFiles payload params, never persisted; manifestSnapshot.pathsAfterPush is consumed at push-time (applyDeletionSignals) and discarded.

The Architectural Reality

  • ai/daemons/KbReconciliationService.mjs (#11640) — the poll-loop daemon. pulse() → per-tenant reconcileTenant(). V1 fetchTenantRows already fetches a tenant's full Chroma row set (where: {tenantId}); the V1 engine diffTenantChunks classifies config-staleness. This ticket adds a second diff axis: claimed-paths vs. Chroma-paths.
  • ai/services/knowledge-base/KnowledgeBaseIngestionService.mjs: ingestSourceFiles already receives manifestSnapshot.pathsAfterPush and baseRevision / headRevision; applyDeletionSignals uses them transiently. The missing piece is persistence.
  • KnowledgeBaseTenantConfig graph node (#11637) — the natural home for a persisted per-tenant claimed-state manifest, or a sibling node.

The Fix

Recommended: persist the post-push claimed-state manifest. Extend the ingestion path so each successful push records the tenant's claimed path set (the manifestSnapshot.pathsAfterPush already received) into durable storage — a kb-manifest:<tenantId> graph node, or a field on KnowledgeBaseTenantConfig. The reconciliation daemon then gains a manifest diff pass: a Chroma chunk whose sourcePath is absent from the persisted claimed manifest is a manifest orphan — the same actionability / opt-in-tombstone treatment as a config-stale orphan. Force-push is then a special case (the manifest simply no longer lists the rewritten-away paths) — no separate force-push primitive needed.

Concretely:

  • KnowledgeBaseIngestionService — persist pathsAfterPush per (tenant, repoSlug) on each successful push.
  • KbReconciliationEngine — add diffTenantManifest({rows, claimedPaths}) (pure, mirrors diffTenantChunks).
  • KbReconciliationService.reconcileTenant — run both diff passes; union the orphan sets.
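
The manifest diff pass and the orphan union described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the shipped engine: the row shape ({id, metadata: {sourcePath}}), the return shape, and the unionOrphans helper are hypothetical, and treating rows without a string sourcePath as non-orphans is an illustrative choice.

```javascript
// Hypothetical sketch: the real KbReconciliationEngine signatures may differ.

/**
 * Pure manifest diff: a row whose sourcePath is not in the claimed manifest is an orphan.
 * @param {Object}        args
 * @param {Object[]}      args.rows         Chroma rows, assumed shape {id, metadata: {sourcePath}}
 * @param {string[]|null} args.claimedPaths Persisted claimed path set, or null if none exists
 * @returns {{orphans: Object[], skipped: boolean}}
 */
function diffTenantManifest({rows, claimedPaths}) {
    // Fail-safe: without a claimed baseline, classify nothing.
    if (!Array.isArray(claimedPaths)) {
        return {orphans: [], skipped: true};
    }

    const claimed = new Set(claimedPaths);

    // Assumption: rows lacking a string sourcePath are left alone rather than flagged.
    const orphans = rows.filter(row => {
        const path = row.metadata?.sourcePath;
        return typeof path === 'string' && !claimed.has(path);
    });

    return {orphans, skipped: false};
}

/** Union two orphan lists by row id, so a row flagged by both passes is actioned once. */
function unionOrphans(configStale, manifestOrphans) {
    const byId = new Map();

    for (const row of [...configStale, ...manifestOrphans]) {
        byId.set(row.id, row);
    }

    return [...byId.values()];
}

const rows = [
    {id: 'c1', metadata: {sourcePath: 'src/a.mjs'}},
    {id: 'c2', metadata: {sourcePath: 'src/b.mjs'}},
    {id: 'c3', metadata: {sourcePath: 'src/removed.mjs'}}
];

const manifestPass = diffTenantManifest({rows, claimedPaths: ['src/a.mjs', 'src/b.mjs']});
const configStale  = [rows[1]]; // pretend the V1 config-staleness pass flagged c2
const allOrphans   = unionOrphans(configStale, manifestPass.orphans);

console.log(allOrphans.map(row => row.id)); // [ 'c2', 'c3' ]
```

Keeping the diff pure (no Chroma or graph access) mirrors the diffTenantChunks precedent and keeps the unit test a plain input/output check.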

Acceptance Criteria

  • Post-push claimed-state manifest is persisted per (tenant, repoSlug), durable across restarts.
  • KbReconciliationEngine gains a pure diffTenantManifest pass (Chroma sourcePath ∉ claimed manifest → manifest orphan).
  • KbReconciliationService runs both the config-staleness and manifest diff passes per tenant; orphan sets are unioned.
  • Manifest orphans honor the same reconciliationAutoTombstone opt-in + tenant-scoped delete as V1.
  • Unit tests: the manifest diff pass, the manifest-persistence write/read-back.
  • Integration test (carries #11640 AC8): simulate a tenant force-push (manifest shrinks) → reconciliation flags + (opt-in) tombstones the orphaned chunks.
  • Contract Ledger finalized at intake (the manifest-storage surface + the daemon diff contract).
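
The force-push criterion above can be illustrated with an in-memory simulation. Everything here is a hypothetical stand-in (FakeChunkStore, reconcileTenantManifest, the flat chunk shape), and the tombstone is modeled as a plain tenant-scoped delete; the real integration test would run against the actual daemon and Chroma collection.

```javascript
// Hypothetical in-memory stand-ins; none of these names are the real service API.
class FakeChunkStore {
    constructor(chunks) {
        this.chunks = [...chunks];
    }

    getByTenant(tenantId) {
        return this.chunks.filter(chunk => chunk.tenantId === tenantId);
    }

    // Tenant-scoped delete: chunks belonging to other tenants are never removed.
    deleteForTenant(tenantId, ids) {
        const doomed = new Set(ids);
        this.chunks = this.chunks.filter(
            chunk => !(chunk.tenantId === tenantId && doomed.has(chunk.id))
        );
    }
}

function reconcileTenantManifest({store, tenantId, claimedPaths, reconciliationAutoTombstone = false}) {
    const claimed = new Set(claimedPaths);
    const flagged = store.getByTenant(tenantId).filter(chunk => !claimed.has(chunk.sourcePath));

    // Orphans are always reported; they are only deleted when the tenant opted in.
    if (reconciliationAutoTombstone && flagged.length > 0) {
        store.deleteForTenant(tenantId, flagged.map(chunk => chunk.id));
    }

    return {
        flagged   : flagged.map(chunk => chunk.id),
        tombstoned: reconciliationAutoTombstone ? flagged.length : 0
    };
}

const store = new FakeChunkStore([
    {id: 't1-a', tenantId: 't1', sourcePath: 'docs/a.md'},
    {id: 't1-b', tenantId: 't1', sourcePath: 'docs/b.md'},
    {id: 't2-b', tenantId: 't2', sourcePath: 'docs/b.md'}
]);

// The force-push rewrites away docs/b.md, so the post-push manifest shrinks.
const manifestAfterForcePush = ['docs/a.md'];

const dryRun = reconcileTenantManifest({
    store, tenantId: 't1', claimedPaths: manifestAfterForcePush
});
const optIn = reconcileTenantManifest({
    store, tenantId: 't1', claimedPaths: manifestAfterForcePush, reconciliationAutoTombstone: true
});

console.log(dryRun, optIn, store.chunks.map(chunk => chunk.id));
```

The first call only flags t1-b; the second, with the opt-in set, also removes it. Tenant t2's chunk for the same path is untouched in both cases.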

Out of Scope

Two small independent enhancements also surfaced by #11640 (PR #11710 Follow-ups), each trackable as its own narrow ticket when picked up:

  • A per-tenant orphanVersionGap override (extends getTenantConfig's fixed projection — a #11637-surface change).
  • A chunks_total rollup metric in getTenantIngestionRollup + chunksTotal in KbAlertRuleEngine.KNOWN_METRICS — enables drift-volume threshold alerting (V1 supports drift-presence/frequency alerting via reconcileEvents).

Per-chunk garbage-collection scheduling → Phase 4C (#11641).

Avoided Traps

  • Daemon-side repo walk (the daemon clones / fetches each tenant's repo to compute the actual path set) — rejected: a cloud KB server generally has no repo access or credentials for arbitrary tenant repos, and a periodic full-clone is heavy. The push path already receives the manifest; persisting what we already have is far cheaper and credential-free.
  • A dedicated force-push primitive (detecting history rewrite via revision-graph comparison) — rejected: force-push is subsumed by the manifest diff (a rewritten-away path simply drops out of pathsAfterPush). One mechanism covers all three remaining failure modes.

Contract Ledger

Provisional — to be finalized at intake once the manifest-storage shape is chosen. The recommended design's surfaces:

| Target Surface | Source of Authority | Proposed Behavior | Fallback / Edge Case | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| Persisted claimed-state manifest: kb-manifest:<tenantId> graph node (or a KnowledgeBaseTenantConfig field) | this ticket; #11637 Phase 2E node precedent | Each successful ingestSourceFiles push records the tenant's post-push claimed path set per repoSlug, durably. | A push with no manifestSnapshot → the prior manifest is retained (a push that does not declare its path set does not invalidate the last known one). | Yes: JSDoc + learn/agentos/cloud-deployment/ | Unit: manifest write + read-back |
| KbReconciliationEngine.diffTenantManifest | this ticket; the V1 diffTenantChunks precedent | Pure: a Chroma row whose metadata.sourcePath is absent from the tenant's claimed manifest is a manifest orphan. | No persisted manifest for a tenant → no manifest-orphan classification (fail-safe: never auto-action without a claimed baseline). | Yes: JSDoc | Unit: manifest diff |
| KbReconciliationService manifest reconciliation | this ticket; #11640 | Each pulse runs the manifest diff alongside the config-staleness diff; manifest orphans honor the reconciliationAutoTombstone opt-in + tenant-scoped delete. | Same opt-in / fail-safe posture as V1. | Yes: daemon JSDoc | Unit + the AC8 force-push integration test |
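
The manifest-storage row's write and fallback behavior can be sketched as below. This is an illustration only: a Map stands in for durable graph storage, and the ManifestStore class, its method names, and the {repos: {...}} node layout are hypothetical, since the storage shape is still provisional in this ledger.

```javascript
// Hypothetical sketch: a Map stands in for the durable kb-manifest:<tenantId> graph node.
class ManifestStore {
    constructor() {
        this.nodes = new Map();
    }

    static key(tenantId) {
        return `kb-manifest:${tenantId}`;
    }

    /**
     * Record the post-push claimed path set for (tenant, repoSlug).
     * A push without a manifestSnapshot leaves the prior manifest untouched.
     * @returns {boolean} true if a manifest was written
     */
    recordPush({tenantId, repoSlug, manifestSnapshot}) {
        const paths = manifestSnapshot?.pathsAfterPush;

        if (!Array.isArray(paths)) {
            return false; // no declared path set: retain the last known manifest
        }

        const key  = ManifestStore.key(tenantId);
        const node = this.nodes.get(key) ?? {repos: {}};

        node.repos[repoSlug] = [...new Set(paths)].sort(); // de-duplicated, stable order
        this.nodes.set(key, node);

        return true;
    }

    /** Returns the claimed paths, or null when no manifest was ever persisted. */
    getClaimedPaths({tenantId, repoSlug}) {
        return this.nodes.get(ManifestStore.key(tenantId))?.repos[repoSlug] ?? null;
    }
}

const manifests = new ManifestStore();

manifests.recordPush({
    tenantId: 't1', repoSlug: 'acme/docs', manifestSnapshot: {pathsAfterPush: ['b.md', 'a.md']}
});
manifests.recordPush({tenantId: 't1', repoSlug: 'acme/docs'}); // no manifestSnapshot

console.log(manifests.getClaimedPaths({tenantId: 't1', repoSlug: 'acme/docs'})); // [ 'a.md', 'b.md' ]
console.log(manifests.getClaimedPaths({tenantId: 't2', repoSlug: 'acme/docs'})); // null
```

Returning null for an unknown tenant (rather than an empty array) lets the diff pass tell "no baseline" apart from "tenant claims zero paths", which is what the fail-safe in the second row depends on.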

Related

  • #11640 — Phase 4B reconciliation daemon (V1; this ticket is its V1.x increment). PR #11710.
  • #11628 — Phase 4 epic (parent).
  • #11637 — Phase 2E KnowledgeBaseTenantConfig (the persisted-config-node precedent).
  • #11633 — Phase 2 KnowledgeBaseIngestionService (the ingestion path that will persist the manifest).

Origin Session ID

470c38e7-1ffc-4851-867d-d30c1b6fbdb2

Handoff Retrieval Hints

  • The #11640 Contract Ledger (in the #11640 ticket body, Row 6) is the V1/V1.x scope boundary this ticket implements.
  • query_raw_memories: "KB reconciliation V1.x force-push manifest persistence"
tobiu referenced in commit d03179a - "feat(ai): stamp ingestedAt on tenant KB chunks (#11712) (#11713)" on May 21, 2026, 11:33 AM
tobiu referenced in commit e496cb0 - "feat(ai): reconcile KB manifest orphans (#11711) (#11714)" on May 21, 2026, 1:08 PM
tobiu closed this issue on May 21, 2026, 1:08 PM
tobiu referenced in commit 5626d8e - "fix(ai): mark kb-config node visibility:team for offline daemon reads (#11716) (#11717)" on May 21, 2026, 2:01 PM