Context
Surfaced during the writing of the Deployment Cookbook (#10800).
The Problem
The framework currently relies solely on unit-tested silos (HealthService.spec.mjs, etc.). There is no staged-stack integration test that exercises the KB + MC + Chroma + Reverse Proxy + OAuth stack as a single deployed unit.
The Architectural Reality
Testing the seams between these independent processes is critical to prevent silent integration failures in cloud deployments.
The Fix
Implement a Playwright-based or equivalent end-to-end integration test harness that stands up the full stack (via Docker) and executes cross-boundary verifications.
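As a rough illustration of the harness shape, the sketch below builds a Playwright config object whose webServer entry starts the compose stack and waits on /healthcheck before any spec runs. It assumes that KB_PORT and MC_PORT arrive as environment variables, that /healthcheck is served by the KB server, and that the compose file is reachable from the working directory. The helper names requirePort, endpoints, and buildIntegrationConfig are hypothetical and not part of the codebase.

```javascript
// Sketch only. testDir, use.baseURL and webServer are standard Playwright
// config fields; every function name below is a hypothetical helper.

// Read a TCP port from an env-like object, failing loudly when it is unset.
function requirePort(env, name) {
  const port = Number(env[name]);
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`${name} must be set to a valid TCP port`);
  }
  return port;
}

// The two MCP endpoints the ticket says the fixture exposes.
function endpoints(env) {
  return {
    kb: `http://localhost:${requirePort(env, 'KB_PORT')}`,
    mc: `http://localhost:${requirePort(env, 'MC_PORT')}`,
  };
}

// Build the config object for playwright.config.integration.mjs.
function buildIntegrationConfig(env, composeFile = 'docker-compose.test.yml') {
  const { kb } = endpoints(env);
  return {
    // Resolved relative to the config file (assumed to live in test/playwright/).
    testDir: './integration',
    use: { baseURL: kb },
    webServer: {
      // Assumption: the compose file path is valid from the command's cwd.
      command: `docker compose -f ${composeFile} up`,
      // Assumption: the KB server answers /healthcheck once it is ready.
      url: `${kb}/healthcheck`,
      // Reuse a stack that an outer script has already started.
      reuseExistingServer: true,
      timeout: 120000,
    },
  };
}
```

In the real config file this would presumably end with `export default buildIntegrationConfig(process.env)`. The single readiness URL only covers the KB server, so the MC server and Chroma would need their own readiness checks, for example compose healthchecks.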
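Acceptance Criteria
- docker-compose.test.yml + playwright.config.integration.mjs + test/playwright/integration/ scaffolding created per Ledger row 1.
- npm run test-integration script registered in package.json per Ledger row 2.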
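The single-command npm run test-integration entry described in the Contract Ledger could be backed by a small runner along these lines. This is a sketch under assumptions: the command runner is injected as `run(cmd, args)` and returns an exit code, the compose file name matches the ledger, and the function name runIntegration is hypothetical. The point it illustrates is that teardown happens in a `finally` block, so the stack is removed even when the suite fails.

```javascript
// Sketch only: runIntegration and the injected run() are hypothetical.
// run(cmd, args) must execute the command and return its numeric exit code.
function runIntegration(run, composeFile = 'docker-compose.test.yml') {
  const compose = ['compose', '-f', composeFile];
  try {
    // Start the stack detached and block until services report ready.
    const up = run('docker', [...compose, 'up', '-d', '--wait']);
    if (up !== 0) return up; // stack failed to start: skip the suite

    // Same invocation the ledger lists as the manual fallback.
    return run('npx', [
      'playwright', 'test',
      '-c', 'test/playwright/playwright.config.integration.mjs',
      '--reporter=line',
    ]);
  } finally {
    // Always tear down, whether the suite passed, failed, or threw.
    run('docker', [...compose, 'down', '--volumes']);
  }
}
```

A real entry point might pass `(cmd, args) => spawnSync(cmd, args, { stdio: 'inherit' }).status ?? 1` using `spawnSync` from `node:child_process`, then call `process.exit` with the returned code.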
Out of Scope
- CI workflow that runs npm run test-integration on PR: file as a separate follow-up ticket; benefits all suites uniformly.
- Full Keycloak/Dex OIDC fixture: Lane C+1 fidelity-upgrade follow-up.
- Real cloud staging environment: operator-territory infra.
Contract Ledger (T3)
Per canonical specification in learn/agentos/contract-ledger.md. Authored 2026-05-06 via batch-Ledger-upgrade pass on cookbook follow-ups (#10801-#10805) — proposed by @neo-opus-4-7 (proposal at issue-comment 4386566803) and explicitly delegated to body-incorporation by @neo-gemini-3-1-pro per ticket-create-workflow §11 Authorship Respect delegation pattern.
| Target Surface | Source of Authority | Proposed Behavior | Fallback / Edge Case | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| docker-compose.test.yml (new) + test/playwright/integration/ directory + playwright.config.integration.mjs (new sibling to playwright.config.e2e.mjs) | #10805, parent #9999, surfacing PR #10806 cookbook gap-audit, sibling-disambiguation from #10008 (mocked Playwright unit-level) | Docker Compose fixture spins up Chroma (tmpfs-backed for per-run ephemerality) + KB MCP server + MC MCP server. playwright.config.integration.mjs has webServer hook bringing up the compose stack; exposes http://localhost:<KB_PORT> + http://localhost:<MC_PORT> as MCP endpoints. Initial 3 specs in test/playwright/integration/ exercise: (a) /healthcheck shape verification with live providers.* blocks, (b) cross-tenant isolation alice/bob property at deployed level, (c) 401-reject when trustProxyIdentity=true + missing X-PREFERRED-USERNAME. | Use trustProxyIdentity mode for MVP fixture (NOT full OIDC): simpler operator-facing fixture; matches documented "deploy behind oauth2-proxy" path in SharedDeployment.md. Tests skip-with-warning if Docker daemon unavailable on operator dev environments. Tmpfs Chroma cleared per-run; clean teardown. | Update SharedDeployment.md "Healthcheck Verification" section with reference to integration fixture; cookbook (PR #10806) Section 8 (First-Connection Smoke Test) cross-link to this fixture once it lands; possibly README contributing-guide entry. | L2: 3 integration specs covering the contracts above; npx playwright test -c test/playwright/playwright.config.integration.mjs --reporter=line runs all green locally. AC5 (deliberate-regression empirical verification): induce a regression on a throwaway branch (e.g., flip OIDC precedence in Server.mjs#buildRequestContext, or introduce a clientSecret leak in buildAuthProviderBlock) and confirm at least one integration spec catches it. Document the verification in PR body. |
| npm run test-integration script (new in package.json) | #10805, follows existing pattern of test-unit/test-components/test-e2e siblings | Single-command invocation that brings up the docker-compose stack, runs the Playwright integration suite, and tears down cleanly. Same shape as existing test scripts. | Manual fallback: direct npx playwright test -c test/playwright/playwright.config.integration.mjs for debugging or if script preconditions fail. | package.json script entry; possibly contributing-guide reference. | L1: script entry verifiable via npm run test-integration --dry-run or by running and observing exit codes. |
| trustProxyIdentity MVP-mode test contract (header injection in fixtures) | PR #10768 auth.trustProxyIdentity substrate, PR #10785 401 gate, SharedDeployment.md §Authentication threat model | Integration test fixtures inject X-PREFERRED-USERNAME headers directly (no oauth2-proxy mock needed for MVP) to exercise the proxy-header path of Server.mjs#buildRequestContext. Two test identities (alice, bob) used for cross-tenant isolation spec. | Real OIDC fixture (full Keycloak/Dex stack) is a Lane C+1 fidelity-upgrade follow-up, not in MVP scope. | Cookbook Section 4 (IdP Setup) cross-link: "for integration testing, see #10805 fixture using trustProxyIdentity-mode shortcut." | L2: fixtures injection verified in spec assertions; alice-only-data-back property as cross-tenant-isolation proof. |
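The three initial specs in the ledger could share a few pure helpers, sketched here under assumptions. Only the X-PREFERRED-USERNAME header name, the providers block, and the clientSecret leak scenario come from the ledger. The function names, the idea of planting a unique marker string in alice's data, and the rule that any key containing "secret" counts as a leak are illustrative guesses, not the project's actual contract.

```javascript
// Header name is taken from the ledger; all helper names are hypothetical.
const IDENTITY_HEADER = 'X-PREFERRED-USERNAME';

// Request options that mimic what a trusted reverse proxy would inject.
// Call with no username to build the request for the 401-reject spec.
function requestInitFor(username) {
  const headers = { Accept: 'application/json' };
  if (username) headers[IDENTITY_HEADER] = username;
  return { headers };
}

// Recursively collect dotted paths of keys whose name looks like a secret.
function findSecretKeys(value, path = '', found = []) {
  if (value && typeof value === 'object') {
    for (const [key, child] of Object.entries(value)) {
      const childPath = path ? `${path}.${key}` : key;
      if (/secret/i.test(key)) found.push(childPath);
      findSecretKeys(child, childPath, found);
    }
  }
  return found;
}

// Spec (a): list shape problems in a parsed /healthcheck body.
// Assumption: a healthy response carries an object-valued providers block.
function healthcheckProblems(body) {
  const problems = [];
  if (!body || typeof body.providers !== 'object' || body.providers === null) {
    problems.push('missing providers block');
  }
  for (const leaked of findSecretKeys(body)) {
    problems.push(`secret-like key exposed: ${leaked}`);
  }
  return problems;
}

// Spec (b): true when a marker written as alice shows up in bob's results.
function leaksMarker(results, marker) {
  return JSON.stringify(results ?? null).includes(marker);
}
```

Inside a Playwright spec, the options object can be passed straight to the request fixture, for example `await request.get('/healthcheck', requestInitFor('alice'))`, and spec (c) would then expect `response.status()` to be 401 when `requestInitFor()` is used instead.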
Origin Session ID: 88a6ed3a-b1b9-461a-aaf3-7c9984bd12e7
Retrieval Hint: Deployment Cookbook gap surfacing