Context
#10805 closed 2026-05-07 via #10893 (GPT's "Lane A vertical slice" per his intake comment 4396XXXXX). Lane A delivered exactly one integration spec — test/playwright/integration/healthcheck.spec.mjs — covering the (a) /healthcheck shape verification row of the original Contract Ledger.
The remaining two specs from the same Ledger row are uncovered:
- (b) cross-tenant isolation alice/bob property at deployed level
- (c) 401-reject when trustProxyIdentity=true + missing X-PREFERRED-USERNAME
Plus AC5 from the same ticket — deliberate-regression empirical verification — never landed. The fixture exists (composeWebServer.mjs + playwright.config.integration.mjs + npm run test-integration); it just doesn't exercise auth or cross-tenant paths.
This ticket completes the Lane B residuals @neo-gpt explicitly carried forward in his intake comment ("Cross-tenant isolation and 401-reject remain in #10805's full scope as Lane B residuals unless they fit cleanly after the substrate lands").
The Problem
Three real risks are presently uncovered at the deployed-shape level:
- Cross-tenant data leakage regressions. The trustProxyIdentity substrate (PR #10768) maps X-PREFERRED-USERNAME headers to ChromaDB userId metadata. A regression that strips the header propagation, mis-routes the metadata, or relaxes the read-time filter would silently allow alice to read bob's data, and unit tests with mocks would not catch it because mocks bypass the real Chroma metadata layer.
- Auth bypass regressions. PR #10785 installed the 401 gate that rejects requests when trustProxyIdentity=true and X-PREFERRED-USERNAME is absent. A regression that defaults the gate open (e.g., a flipped boolean, an early-return, or a config-resolution path that loses the trustProxyIdentity flag) silently disables tenant isolation in deployed environments. No spec currently exercises the rejection path under real-component conditions.
- No integration-time identity primitive. Lane A's healthcheck.spec.mjs connects via StreamableHTTPClientTransport with no headers. Future integration specs that need identity-aware behavior would each have to re-derive header injection. The fixture pattern is missing, so copy-paste rot is the predictable trajectory.
The original #10805 Ledger explicitly named both specs, but the Lane A scope split deferred them. They are not new — they are completion of pre-agreed scope.
The Architectural Reality
- Existing fixture surface (test/playwright/integration/):
  - fixtures/composeWebServer.mjs: webServer hook that spins up ai/deploy/docker-compose.test.yml and waits for the KB (:13000), MC (:13001), and Chroma (:18080) ports to be ready.
  - playwright.config.integration.mjs: fullyParallel:false, workers:1, timeout:120s, single readiness gate at http://127.0.0.1:13090/ready.
  - healthcheck.spec.mjs: single Lane A spec; no header injection; no identity context; uses StreamableHTTPClientTransport from @modelcontextprotocol/sdk/client/streamableHttp.js.
- Existing compose (ai/deploy/docker-compose.test.yml): the KB and MC envs do NOT currently set NEO_AUTH_TRUST_PROXY_IDENTITY=true. The auth substrate (PR #10768 + #10785) is dormant in the test stack.
- Auth substrate to exercise:
  - Server.mjs#buildRequestContext: header → identity resolution.
  - Server.mjs#buildAuthProviderBlock: provider-block emission inside the /healthcheck payload.
  - The 401-reject early-return path when trustProxyIdentity=true + missing header.
- Cross-tenant isolation surface:
  - MC add_memory writes through the request-bound identity context to ChromaDB metadata.
  - MC query_raw_memories filters retrieval by the request-bound identity context.
  - Property: alice writes M_alice, bob writes M_bob, alice queries → only M_alice returned, bob queries → only M_bob returned.
- MCP SDK header injection: StreamableHTTPClientTransport(URL, {requestInit: {headers: {...}}}), supported per the SDK release notes (see the #10804 thread).
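The isolation property listed above can be modeled in a few lines. This is a toy in-memory stand-in, not the Memory Core implementation: MemoryStoreModel and its methods are hypothetical names, and the real system is described as storing userId in ChromaDB metadata and filtering at read time.

```javascript
// Toy model of the cross-tenant property (hypothetical; not the MC implementation).
// Writes are tagged with the caller's identity; reads filter on that same identity.
class MemoryStoreModel {
    constructor() {
        this.records = [];
    }

    // Stand-in for add_memory: identity comes from the request context, never the payload.
    addMemory(identity, text) {
        this.records.push({userId: identity, text});
    }

    // Stand-in for query_raw_memories: substring match restricted to the caller's userId.
    query(identity, needle) {
        return this.records
            .filter(record => record.userId === identity)
            .filter(record => record.text.includes(needle))
            .map(record => record.text);
    }
}

const store = new MemoryStoreModel();
store.addMemory('alice', 'M_alice');
store.addMemory('bob', 'M_bob');

console.log(store.query('alice', 'M_')); // [ 'M_alice' ]
console.log(store.query('bob', 'M_'));   // [ 'M_bob' ]
```

Dropping the first filter in query() is the kind of regression the deployed-level spec is meant to catch: both tenants would then see both records.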
The Fix
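Three deliverables in one PR (the substrate primitive plus both consumer specs), together with the compose change that enables them and the AC5 verification procedure: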
1. Identity fixture primitive
New file: test/playwright/integration/fixtures/identityClient.mjs
Exports a small factory that returns an MCP client wired with header injection:
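    import {Client} from '@modelcontextprotocol/sdk/client/index.js';
    import {StreamableHTTPClientTransport} from '@modelcontextprotocol/sdk/client/streamableHttp.js';

    // Returns a connected MCP client; the identity header is sent only when `identity` is set.
    export async function createIdentityClient({baseUrl, identity = null, clientName = 'neo-integration-spec'}) {
        const headers   = identity ? {'X-PREFERRED-USERNAME': identity} : {};
        const transport = new StreamableHTTPClientTransport(new URL('/mcp', baseUrl), {
            requestInit: {headers}
        });
        const client = new Client({name: clientName, version: '1.0.0'}, {capabilities: {}});

        // connect() performs the MCP initialize handshake over HTTP
        await client.connect(transport);
        return client;
    }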
This is the primitive every future identity-aware integration spec re-uses. No copy-paste of header construction.
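Because every consumer has to close the client it receives, a small wrapper could keep that teardown in one place. The sketch below is an assumption layered on top of the factory, not part of the ticket: withClient is a hypothetical helper name, and it relies only on the returned object exposing an async close(). A fake client is used so the sketch runs without a server.

```javascript
// Hypothetical helper: run a callback with a client and always close it,
// even when the callback throws.
async function withClient(createClient, callback) {
    const client = await createClient();
    try {
        return await callback(client);
    } finally {
        await client.close();
    }
}

// Fake factory standing in for createIdentityClient({baseUrl, identity}).
const fakeFactory = async () => ({
    closed: false,
    async close() { this.closed = true; }
});

withClient(fakeFactory, async client => 'result')
    .then(value => console.log(value)); // result
```

In a spec, the first argument would be something like () => createIdentityClient({baseUrl, identity: 'alice'}), which keeps close() out of each individual test body.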
2. Compose update — enable auth in test stack
ai/deploy/docker-compose.test.yml envs for both KB and MC:
- NEO_AUTH_TRUST_PROXY_IDENTITY=true
Setting this flag activates the dormant substrate paths that the two new specs exercise. Lane A's healthcheck.spec.mjs is expected to keep passing because its single test does not require auth (healthcheck is the unauthenticated probe per the Server.mjs exclusion).
Important: verify that the healthcheck MCP tool remains unauthenticated under trustProxyIdentity=true. If it is gated, healthcheck.spec.mjs will need header injection too; surface this as a Required Action during cycle review.
3. Two new specs
test/playwright/integration/CrossTenantIsolation.integration.spec.mjs:
- Two clients via createIdentityClient: one for alice, one for bob.
- alice writes a memory via MC add_memory({prompt, thought, response}) with a unique sentinel string.
- bob writes a memory via MC add_memory with a different sentinel.
- alice queries via MC query_raw_memories({query: alice-sentinel}) → expect alice's memory returned, bob's NOT returned.
- bob queries via MC query_raw_memories({query: bob-sentinel}) → expect bob's memory returned, alice's NOT returned.
- Assertion: cross-tenant invisibility holds (alice's query never returns bob's sentinel string anywhere in the result set, and vice versa).
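The sentinel steps above could be backed by a pair of small helpers. This is a minimal sketch under stated assumptions: makeSentinel, containsSentinel, and assertIsolated are hypothetical names, and the result is searched as serialized JSON because the exact response shape of query_raw_memories is not specified here.

```javascript
// Hypothetical helpers for the sentinel-based isolation assertion.

// Build a per-run marker of the shape sentinel-<tenant>-<time>-<random>.
function makeSentinel(tenant) {
    const stamp  = Date.now().toString(36);
    const random = Math.random().toString(36).slice(2, 8);
    return `sentinel-${tenant}-${stamp}-${random}`;
}

// Search the whole serialized result so the marker is found wherever it appears.
function containsSentinel(result, sentinel) {
    return JSON.stringify(result ?? null).includes(sentinel);
}

// Throws unless the result holds the caller's own marker and not the other tenant's.
function assertIsolated(result, ownSentinel, foreignSentinel) {
    if (!containsSentinel(result, ownSentinel)) {
        throw new Error(`own sentinel missing: ${ownSentinel}`);
    }
    if (containsSentinel(result, foreignSentinel)) {
        throw new Error(`foreign sentinel leaked: ${foreignSentinel}`);
    }
}
```

Searching the serialized payload rather than a specific field matches the "anywhere in the result set" wording of the assertion, at the cost of being blind to where a leak occurred.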
test/playwright/integration/AuthRejection.integration.spec.mjs:
- One client via createIdentityClient({identity: null}), with no header injected.
- Attempt MC healthcheck (or any tool) → expect transport-layer 401.
- Verify Server.mjs returned the 401 status code on the underlying HTTP response (transport surface).
- Counter-positive: a second client with identity: 'alice' succeeds.
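Telling a transport-layer 401 apart from a tool-level error could look roughly like the sketch below. The helper names are hypothetical. How the MCP SDK surfaces the HTTP status is an assumption here (a numeric code property or a status mentioned in the message) and should be checked against the SDK version pinned in the repo.

```javascript
// Hypothetical classifier: separates an HTTP-level auth rejection from a tool-level error.
// Assumption: the transport error exposes the HTTP status as a numeric `code`
// or mentions it in `message`.
function isTransport401(error) {
    if (!(error instanceof Error)) {
        return false;
    }
    return error.code === 401 || /\b401\b/.test(error.message);
}

// Tool failures are described as resolving normally with `isError: true`.
function isToolLevelError(result) {
    return Boolean(result && result.isError === true);
}

// Runs an async MCP call and reports which layer, if any, rejected it.
async function classifyCall(call) {
    try {
        const result = await call();
        return isToolLevelError(result) ? 'tool-error' : 'ok';
    } catch (error) {
        return isTransport401(error) ? 'transport-401' : 'other-failure';
    }
}
```

Since client.connect() already issues an HTTP request for the initialize handshake, the rejection may surface while the fixture is connecting rather than on the first tool call, so the spec would likely need to wrap the createIdentityClient call itself.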
4. AC5 — deliberate-regression empirical verification
Per #10805 Ledger row 1 evidence column. Procedure documented in PR body:
- Branch: claude/10805-ac5-regression-proof (throwaway).
- Edit: in Server.mjs#buildRequestContext, flip the precedence so trustProxyIdentity is ignored when set.
- Run: npm run test-integration and confirm AuthRejection.integration.spec.mjs fails (the 401 gate no longer rejects; the regression is caught).
- Edit: revert the regression. Re-run and confirm green.
- Document the verification in the PR body with the precise patch + observed test failure log.
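The logic behind this procedure can be illustrated without the real stack. The sketch below is a toy under stated assumptions, not Server.mjs code: gate, regressedGate, and rejectionSpecPasses are hypothetical names, and the header key is lowercased because Node lowercases incoming header names.

```javascript
// Toy gate modeled on the behavior described for the 401 path (not Server.mjs):
// reject when trustProxyIdentity is on and the identity header is missing.
function gate(headers, config) {
    const identity = headers['x-preferred-username'] ?? null;
    if (config.trustProxyIdentity && !identity) {
        return {status: 401};
    }
    return {status: 200, identity};
}

// The deliberate regression: the flag is ignored, so the gate defaults open.
function regressedGate(headers, config) {
    const identity = headers['x-preferred-username'] ?? null;
    return {status: 200, identity};
}

// The rejection check reduced to its essence: anonymous is refused, alice is served.
function rejectionSpecPasses(gateFn) {
    const config    = {trustProxyIdentity: true};
    const anonymous = gateFn({}, config);
    const alice     = gateFn({'x-preferred-username': 'alice'}, config);
    return anonymous.status === 401 && alice.status === 200;
}

console.log(rejectionSpecPasses(gate));          // true
console.log(rejectionSpecPasses(regressedGate)); // false
```

The second result is the point of AC5: a spec that stayed green against the regressed gate would not be providing the coverage it claims.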
Contract Ledger (T3)
| Target Surface | Source of Authority | Proposed Behavior | Fallback / Edge Case | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| test/playwright/integration/fixtures/identityClient.mjs (new) | This ticket; substrate-symmetric to existing composeWebServer.mjs fixture pattern | Factory createIdentityClient({baseUrl, identity, clientName}) returning an MCP Client with X-PREFERRED-USERNAME header injected when identity is non-null. Re-usable by all future identity-aware integration specs. | identity: null produces a client with no auth header, for negative-test cases like AuthRejection.integration.spec.mjs. Caller must await client.close() in spec teardown. | Cross-link from cookbook (learn/agentos/DeploymentCookbook.md) Section 8 (First-Connection Smoke Test) once the identity-fixture is the canonical client constructor. | L2: both new specs construct via this fixture; pattern verified across 2+ consumers. |
| test/playwright/integration/CrossTenantIsolation.integration.spec.mjs (new) | This ticket; #10805 Ledger row 1 spec (b); PR #10768 auth.trustProxyIdentity substrate; #10000 cross-tenant isolation property | alice writes via MC add_memory with sentinel; bob writes with different sentinel. Each queries via query_raw_memories and observes only their own sentinel, never the other's. Assertion uses sentinel-presence in result-set as the property. | Skip-with-warning if Docker daemon unavailable per existing composeWebServer.mjs readiness gate. Test uses unique per-run sentinel strings to avoid cross-run pollution; teardown deletes both memories. | Cross-link from cookbook Section 4 (IdP Setup) and SharedDeployment.md §Authentication threat model; both gain a reference to this empirical verification spec. | L2: alice-only-data-back property as cross-tenant-isolation proof; AC5 empirical evidence. |
| test/playwright/integration/AuthRejection.integration.spec.mjs (new) | This ticket; #10805 Ledger row 1 spec (c); PR #10785 401 gate substrate | Client with identity: null calling any MC/KB MCP tool receives transport-layer 401 from Server.mjs before tool dispatch. Counter-positive: identity-bearing client succeeds against same endpoint. | Caller must distinguish transport 401 from tool-level error response (tool errors come back as {isError: true}; auth rejection happens at HTTP layer, surfaces as a transport exception). | Cross-link from SharedDeployment.md §Authentication; cookbook Section 4. | L2: 401 status verified at HTTP transport layer; counter-positive client succeeds; AC5 deliberate-regression catches a flipped gate. |
| ai/deploy/docker-compose.test.yml (mod) | This ticket; activates dormant auth substrate in test stack | Adds NEO_AUTH_TRUST_PROXY_IDENTITY=true env to both kb-server and mc-server services. Lane A's existing healthcheck.spec.mjs continues to pass (verify pre-change that healthcheck MCP tool is unauthenticated path). | If healthcheck requires auth under this flag, healthcheck.spec.mjs gains identity injection too; surface as cycle-1 RA. | Inline comment in compose file linking to PR #10768 + #10785 substrate. | L1: env present after change; L2: both new specs exercise the substrate; L3: AC5 catches regression. |
Acceptance Criteria
- test/playwright/integration/fixtures/identityClient.mjs exists and exports createIdentityClient factory per Ledger row 1.
- test/playwright/integration/CrossTenantIsolation.integration.spec.mjs ships with alice/bob sentinel-isolation property assertion.
- test/playwright/integration/AuthRejection.integration.spec.mjs ships with 401-reject + counter-positive identity-bearing-client-succeeds.
- ai/deploy/docker-compose.test.yml updated with NEO_AUTH_TRUST_PROXY_IDENTITY=true on both kb-server and mc-server.
- […] npm run test-integration locally.
- learn/agentos/DeploymentCookbook.md and learn/agentos/SharedDeployment.md cross-linked from the new specs (one-line refs in each).
Out of Scope
- Full Keycloak/Dex OIDC fixture: the original "Lane C+1" fidelity upgrade. trustProxyIdentity-mode is sufficient for MVP regression-catching per #10805 Ledger.
- CI workflow for npm run test-integration: covered by sibling Lane C ticket (filed concurrently); do not bundle.
- Heartbeat-over-time / sustained-liveness coverage: covered by sibling Lane B ticket (filed concurrently).
- OIDC discovery / JWKS path: different substrate (real-OIDC, not header-injection). Future ticket.
Avoided Traps / Gold Standards Rejected
- Rejected: re-derive header injection inside each spec. Copy-paste rot is the predictable trajectory across N specs. The fixture-primitive shape is canonical (substrate before consumer; reusable across all future integration tests).
- Rejected: separate test compose for auth scenario. Adds compose-file proliferation; hides the truth that auth is part of the deployed shape. Activating trustProxyIdentity=true in the existing compose is the elegant move: the test stack matches the production-deployed shape.
- Rejected: mock the 401 path. Defeats the purpose. Real Server.mjs reject is what production deploys do; mocking would re-create the same gap that #10805 exists to close.
- Rejected: skip AC5 because "the specs are the proof". A spec that doesn't catch a deliberately-induced regression is theater. AC5 is the property test for the test suite itself.
- Rejected: bundle CI workflow into this ticket. Cross-cutting concern (benefits all test types uniformly); deserves the dedicated Lane C ticket.
Related
- Closes scope from: #10805 (Lane B residuals + AC5).
- Parent epic: #10009, Playwright Test Coverage: Federated Cloud Topology (this ticket directly executes #10009 Tasks 1 & 3: network isolation + OIDC headers → ChromaDB userId metadata restrictions).
- Substrate dependencies: PR #10768 (trustProxyIdentity substrate), PR #10785 (401 gate), PR #10880 (Docker artifacts), PR #10893 (Lane A vertical slice).
- Sibling lanes (filed concurrently): Lane B (sustained-liveness heartbeat spec), Lane C (CI test-matrix workflow).
- Cookbook cross-link: learn/agentos/DeploymentCookbook.md Sections 4, 8.
- Architecture cross-link: learn/agentos/SharedDeployment.md §Authentication threat model.
Origin Session ID: 7e897a0b-33ce-4d6c-b1a9-a1ff93e4e571
Retrieval Hint: query_raw_memories(query="cross-tenant isolation 401-reject identity-fixture integration spec #10805 Lane B residuals")