Context
This epic graduated from Ideation Sandbox Discussion #11782 (Server-side tenant-repo ingestion for cloud Agent OS) — cross-family converged (author @neo-opus-ada + [GRADUATION_APPROVED] by @neo-gpt), operator-approved graduation 2026-05-22. #11782 graduated a design that originated as Discussion #11718's deferred D-residual, tracked-but-parked under Epic #11730.
This ticket is the reshape of #11731 — formerly the parked "server-side repo-clone ingestion exploration" sub of #11730 — into the full implementation epic. Per the operator's #11730 gate-challenge (2026-05-22), the "exploration" framing and the failure-gating premise were withdrawn: server-side tenant-repo ingestion is an additive capability with independent value, not a remediation gated behind push-based ingestion failing.
The Problem
A cloud-deployed Agent OS serving an external tenant must keep that tenant's source repositories ingested into the deployment's Knowledge Base and fresh as they evolve.
The MVP (#11726 / #11743) solved this with a push-based model: the tenant wires a hook/CI job that pushes file deltas to the deployment. That requires per-repo tenant-side wiring and gives no autonomous refresh. This epic adds the pull-based complement: the deployment acquires and refreshes tenant repos server-side.
The Architectural Reality
Model — persistent mirror + incremental refresh:
- Tenant repos are specified once in deployment config.
- Bootstrap (mirror absent): git clone (via a credentialed reference) into a persistent container volume → ingest all files.
- Update (mirror present): git fetch → git diff <lastIngestedRev>..<newHead> → ingest only changed files, tombstone deleted ones. Force-push / history rewrite (old HEAD no longer an ancestor) → full re-ingest fallback.
- A periodic refresh lane drives the update cycle.
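The bootstrap / update / fallback branching above can be sketched as a small pure decision function. This is only an illustration under stated assumptions: `planSync`, its parameter names, and the returned `mode` strings are hypothetical and do not come from the epic; the caller is assumed to have already run the fetch and the ancestor check.

```javascript
// Hypothetical helper: decides which refresh path applies for one tenant repo.
// All names here are illustrative assumptions, not the epic's actual API.
function planSync({ mirrorExists, lastIngestedRev, newHead, lastIsAncestorOfHead }) {
  if (!mirrorExists) {
    // Bootstrap: clone into the persistent volume, then ingest every file.
    return { mode: 'bootstrap', ingest: 'all' };
  }
  if (!lastIngestedRev) {
    // Mirror exists but nothing was recorded as ingested: treat as a full pass.
    return { mode: 'full-resync', ingest: 'all' };
  }
  if (lastIngestedRev === newHead) {
    // Fetch brought nothing new.
    return { mode: 'noop', ingest: 'none' };
  }
  if (!lastIsAncestorOfHead) {
    // Force-push / history rewrite: the old revision is no longer an ancestor.
    return { mode: 'full-resync', ingest: 'all' };
  }
  // Normal case: ingest only what changed between the two revisions.
  return { mode: 'incremental', ingest: 'diff', range: `${lastIngestedRev}..${newHead}` };
}
```

Keeping the decision separate from the git calls makes the force-push fallback testable without a real repository.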
Structural shape (Discussion #11782 OQ1 → Option A): a new cloud-deployable tenant-repo-sync lane + TenantRepoSyncService, built on a shared lower-level GitMirror primitive (clone-if-missing / fetch / resolve-head / ancestor-check / changed+deleted diff). TenantRepoSyncService consumes GitMirror and emits an ingestSourceFiles() envelope — reusing the existing ingestion core (#11726/#11743; the envelope already carries baseRevision/headRevision/deleted).
PrimaryRepoSyncService / primary-dev-sync is not rewritten — it stays local-only; ADR 0014's cloud/local lane separation is preserved. ADR 0014 currently lists server-side cloning as D3/out-of-scope and classifies kbSync/primary-dev-sync local-only — so this epic amends ADR 0014 to add tenant-repo-sync as a new cloud-deployable lane (additive amendment; the §5.2 anti-pattern against re-pointing kbSync at tenant content is preserved).
File placement for the new .mjs files (GitMirror, TenantRepoSyncService) is validated per-sub via structural-pre-flight at sub-implementation time.
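As a rough illustration of the changed+deleted diff feeding the ingestion envelope, the sketch below parses the tab-separated output of `git diff --name-status <base>..<head>`. The `baseRevision`, `headRevision`, and `deleted` fields are named in the text above; the function name `buildIngestEnvelope`, the `changed` field, and the `nameStatusOutput` parameter are assumptions made for this example, and the real `ingestSourceFiles()` envelope may differ.

```javascript
// Hypothetical builder: turns `git diff --name-status` text into an envelope.
// Assumes one entry per line, tab-separated, without -z (so unusual paths may be quoted by git).
function buildIngestEnvelope({ baseRevision, headRevision, nameStatusOutput }) {
  const changed = new Set();
  const deleted = new Set();
  for (const line of nameStatusOutput.split('\n')) {
    if (!line.trim()) continue;
    const [status, firstPath, secondPath] = line.split('\t');
    const kind = status[0]; // R and C carry a similarity score, e.g. "R100"
    if (kind === 'D') {
      deleted.add(firstPath);
    } else if (kind === 'R') {
      // A rename tombstones the old path and ingests the new one.
      deleted.add(firstPath);
      changed.add(secondPath);
    } else if (kind === 'C') {
      changed.add(secondPath);
    } else {
      changed.add(firstPath); // A (added), M (modified), T (type change)
    }
  }
  return {
    baseRevision,
    headRevision,
    changed: [...changed].sort(),
    deleted: [...deleted].sort(),
  };
}
```

In this sketch a full re-ingest would simply bypass the diff and list every tracked file as changed.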
The Fix — Sub-Decomposition
Six subs (per Discussion #11782's §5.2 Step-Back decomposition):
- ADR 0014 amendment / lane-taxonomy update: add tenant-repo-sync as a cloud-deployable lane (absorbs #11740).
- Tenant-repo config + Credentialed Repo-Access Contract + no-secret-leak tests: implements the contract below (Discussion #11782 OQ6).
- Persistent mirror acquisition service + volume layout: the GitMirror primitive + per-{tenantId,repoSlug} mirror paths + the deployment repo-mirror volume.
- Diff-to-ingest envelope builder: revision-diff → ingestSourceFiles() envelope, with ancestor-check + force-push/full-resync fallback.
- Scheduler/trigger lane: periodic refresh (per-repo cadence + jitter/backoff) + a manual/operator run path. Webhook deferred.
- Operator docs + health/telemetry proof: repo-freshness logging/summary; consumer surfaces (compose/volumes, health/readiness, backup docs, tenant config storage, parser docs, deletion telemetry).
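For the scheduler sub, one possible reading of "per-repo cadence + jitter/backoff" is exponential backoff on consecutive failures with a symmetric jitter band. The sketch below is an assumption-laden illustration: `nextRefreshDelayMs`, its parameters, and the default values are all hypothetical, and the epic does not prescribe a specific formula.

```javascript
// Hypothetical delay calculator for the periodic refresh lane.
// Doubles the cadence per consecutive failure, caps it, then applies +/- jitter.
function nextRefreshDelayMs({
  cadenceMs,
  consecutiveFailures = 0,
  jitterRatio = 0.1,                 // assumed default: +/-10%
  maxBackoffMs = 6 * 60 * 60 * 1000, // assumed default cap: 6 hours
  random = Math.random,              // injectable for deterministic tests
}) {
  const cap = Math.max(maxBackoffMs, cadenceMs);
  const base = Math.min(cadenceMs * 2 ** consecutiveFailures, cap);
  const jitter = base * jitterRatio * (2 * random() - 1);
  return Math.round(base + jitter);
}
```

Jitter spreads refreshes so that many repos configured with the same cadence do not all fetch at the same instant.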
Contract Ledger Matrix — Credentialed Repo-Access Contract
The deployment never stores secret material; it stores a reference. The credential is supplied to git transiently and never lands in a URL-at-rest, process args, logs, persisted state, manifests, parsed-chunk-v1 metadata, or graph-visible config.
| Target Surface | Source of Authority | Proposed Behavior | Fallback / Edge Case | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| Tenant-repo config entry | sub 2 + Discussion #11782 OQ6 | Holds cloneUrl (clean, no userinfo@), credentialRef (env-var name / deploy-key path / helper name, never the secret), repoSlug (explicit or strict-normalized {host}/{org}/{repo}). | A cloneUrl with a userinfo@ credential pattern is rejected at config load. | TenantIngestionModel.md | config-validation unit tests |
| Credential injection | GitMirror primitive (sub 3) | HTTPS → GIT_ASKPASS/credential.helper resolving credentialRef at call time; SSH → GIT_SSH_COMMAND with the referenced deploy-key. No token in URL/argv. | Resolved secret lives only in process memory + the git child's transient env. | sub-3 docs | no-leak test |
| Log / telemetry / health surfaces | sub 2 + sub 6 | Captured git stdout/stderr, errors, telemetry, health surface pass a redactor stripping token/secret patterns. | Backstop for git error text. | sub-6 docs | no-leak test (injected fake secret absent from every surface) |
| Derived artifacts (repoSlug, mirror path, persisted sync state, manifests) | sub 3 + sub 4 | Derived from the clean identity only, never from a credentialed form. | None | TenantIngestionModel.md | repoSlug / path-derivation test |
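The config-entry and redaction rows of the contract could look roughly like the following. This is a sketch under assumptions, not the contract's implementation: `validateRepoConfig` and `redact` are hypothetical names, the sketch only handles HTTPS clone URLs with a two-segment `/{org}/{repo}` path, lowercasing is one guess at what "strict-normalized" means, and the redactor strips known resolved secret values plus URL userinfo rather than generic token patterns.

```javascript
// Hypothetical config-load validation: rejects credentialed URLs, derives repoSlug.
function validateRepoConfig({ tenantId, cloneUrl, credentialRef, repoSlug }) {
  const url = new URL(cloneUrl); // throws on malformed input
  if (url.protocol !== 'https:') {
    throw new Error('this sketch only handles https clone URLs');
  }
  if (url.username || url.password) {
    // Deliberately do not echo the URL: the message itself must not leak.
    throw new Error('cloneUrl must not embed credentials (userinfo@)');
  }
  const segments = url.pathname
    .replace(/\/+$/, '')
    .replace(/\.git$/, '')
    .split('/')
    .filter(Boolean);
  if (segments.length !== 2) {
    throw new Error('expected a /{org}/{repo} path');
  }
  const derivedSlug = [url.hostname, ...segments].join('/').toLowerCase();
  return { tenantId, cloneUrl: url.href, credentialRef, repoSlug: repoSlug || derivedSlug };
}

// Hypothetical redactor: backstop applied to captured git output before logging.
function redact(text, secrets) {
  let out = text;
  for (const secret of secrets) {
    if (secret) out = out.split(secret).join('[REDACTED]');
  }
  // Also blank any userinfo that git may echo inside a URL.
  return out.replace(/(https?:\/\/)[^\/\s@]+@/g, '$1[REDACTED]@');
}
```

A no-leak test in this style injects a fake secret, runs the operation, and asserts that the secret string is absent from every captured surface.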
Discussion Criteria Mapping
Discussion #11782's resolved Open Questions → this epic's ACs:
- OQ1 (structural fork → Option A) → AC1 + sub 3.
- OQ2 (trigger → periodic + manual) → AC5 + sub 5.
- OQ3 (mirror volume / path determinism) → AC3 + sub 3.
- OQ4 (history-rewrite → ancestor-check + full-resync fallback) → AC4 + sub 4.
- OQ5 (repo de-scope → disabled/quarantined, not auto-purge) → AC6.
- OQ6 (credential contract) → AC2 + sub 2 (Contract Ledger above).
- OQ7 (multi-tenant isolation) → AC3 (per-{tenantId,repoSlug} isolation).
- OQ8 (ticket reconciliation) → this epic (the #11731 reshape) + sub 1 (absorbs #11740).
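Path determinism and per-{tenantId,repoSlug} isolation (OQ3 / OQ7) could be approached with a pure path-derivation function like the one below. Everything here is a hypothetical sketch: the `mirrorPath` name, the volume root, the allowed-character rule, and the `.git` suffix are assumptions, and the actual volume layout is left to sub 3.

```javascript
// Hypothetical mirror-path derivation from clean identity only.
// Same inputs always give the same path; unsafe segments are rejected.
function mirrorPath(volumeRoot, tenantId, repoSlug) {
  const safe = (segment) => {
    // Assumed rule: lowercase alphanumerics plus . _ - and no traversal.
    if (!/^[a-z0-9][a-z0-9._-]*$/.test(segment) || segment.includes('..')) {
      throw new Error('unsafe path segment');
    }
    return segment;
  };
  const parts = [safe(tenantId), ...repoSlug.split('/').map(safe)];
  return `${volumeRoot.replace(/\/+$/, '')}/${parts.join('/')}.git`;
}
```

Because the tenant ID is the first path segment, two tenants that reference the same upstream repository still get separate mirrors in this layout.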
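Acceptance Criteria
- tenant-repo-sync cloud-deployable lane + TenantRepoSyncService exist, distinct from local-only primary-dev-sync.
- Per-{tenantId, repoSlug} persistent mirrors on a deployment volume; paths computed from clean identity only.
- git diff → ingestSourceFiles() envelope; ancestor-check before incremental diff; force-push → full re-ingest fallback.
- tenant-repo-sync added to the §2.1 lane taxonomy as cloud-deployable.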
Out of Scope
- The push-based MVP ingestion model (#11726/#11743): remains the default; this epic is strictly additive.
- Rewriting PrimaryRepoSyncService / the primary-dev-sync lane: untouched (it MAY adopt the GitMirror primitive later only if that reduces code; not this epic's scope).
- Webhook-driven triggering: deferred as a later accelerator.
Avoided Traps
| Trap | Why rejected |
| --- | --- |
| Enhance PrimaryRepoSyncService into a unified "repo-fleet" engine (Discussion #11782 Option B) | Generalizing a dev-hardcoded, credential-free, local-only service spans local-only + cloud in one service, muddying ADR 0014's lane taxonomy + risking battle-tested machinery. |
| Full git clone on every refresh (Option D) | git fetch transfers only new objects; full clone re-transfers everything, so it is strictly dominated. |
| Re-point the local kbSync lane at tenant content | ADR 0014 §5.2 anti-pattern: re-couples the cloud deployment to a local-checkout scan model. |
Related
- Origin Discussion: #11782 (graduated; cross-family converged) — itself from Discussion #11718.
- Sub (ADR amendment): #11740 — absorbed as sub 1.
- Parent: #9999 (v13 Cloud-Native Knowledge & Multi-Tenant umbrella).
- Sibling residual epic: #11730 — this epic graduates out of #11730's #11731/#11740 slice.
- Push-based MVP this complements: #11726 / #11743.
- ADRs: 0014 (cloud deployment topology — amended by sub 1), 0005 (ADR lifecycle).
Signal Ledger
- @neo-opus-ada: author / graduation author signal (Discussion #11782).
- @neo-gpt: [GRADUATION_APPROVED] @ Discussion #11782 comment DC_kwDODSospM4BA8q0 (OQ6 credential-contract blocker cleared).
Unresolved Dissent
None — OQ1/OQ2/OQ6 converged cross-family; no DEFERRED/VETO outstanding.
Unresolved Liveness
- @neo-gemini-pro — no graduation signal; unavailable (~1 month). STATUS: pending-peer-repoll. Per ideation-sandbox §6.5 this liveness gap is preserved here; the swarm has no codified active-peer-quorum rule, so graduation proceeded on 2 active cross-family signals under explicit operator authorization (2026-05-22). A friction→gold follow-up will codify a standing active-peer-quorum rule.
Origin Session ID
39185c66-a107-46ea-b0bf-eb4fa1137257
Handoff Retrieval Hints
- query_raw_memories("server-side tenant-repo ingestion GitMirror TenantRepoSyncService credential contract #11782 graduation")
- Origin Discussion #11782; the §5.2 Step-Back at discussioncomment-17025441.