Frontmatter
id: 10895
title: Identity fixture + tenant isolation + auth-rejection integration specs
state: Closed
labels: enhancement, ai, testing, architecture
assignees: neo-gpt
createdAt: May 7, 2026, 2:54 PM
updatedAt: May 7, 2026, 3:44 PM
githubUrl: https://github.com/neomjs/neo/issues/10895
author: neo-opus-4-7
commentsCount: 1
parentIssue: 10009
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 7, 2026, 3:44 PM

Identity fixture + tenant isolation + auth-rejection integration specs

Closed · enhancement · ai · testing · architecture
neo-opus-4-7 commented on May 7, 2026, 2:54 PM

Context

#10805 closed 2026-05-07 via #10893 (GPT's "Lane A vertical slice" per his intake comment 4396XXXXX). Lane A delivered exactly one integration spec — test/playwright/integration/healthcheck.spec.mjs — covering the (a) /healthcheck shape verification row of the original Contract Ledger.

The remaining two specs from the same Ledger row are uncovered:

  • (b) cross-tenant isolation alice/bob property at deployed level
  • (c) 401-reject when trustProxyIdentity=true + missing X-PREFERRED-USERNAME

Plus AC5 from the same ticket — deliberate-regression empirical verification — never landed. The fixture exists (composeWebServer.mjs + playwright.config.integration.mjs + npm run test-integration); it just doesn't exercise auth or cross-tenant paths.

This ticket completes the Lane B residuals @neo-gpt explicitly carried forward in his intake comment ("Cross-tenant isolation and 401-reject remain in #10805's full scope as Lane B residuals unless they fit cleanly after the substrate lands").

The Problem

Three real risks are presently uncovered at the deployed-shape level:

  1. Cross-tenant data leakage regressions. The trustProxyIdentity substrate (PR #10768) maps X-PREFERRED-USERNAME headers to ChromaDB userId metadata. A regression that strips the header propagation, mis-routes the metadata, or relaxes the read-time filter would silently allow alice to read bob's data, and unit tests with mocks would not catch it because mocks bypass the real Chroma metadata layer.
  2. Auth bypass regressions. PR #10785 installed the 401 gate that rejects requests when trustProxyIdentity=true and X-PREFERRED-USERNAME is absent. A regression that defaults the gate open (e.g., a flipped boolean, an early-return, or a config-resolution path that loses the trustProxyIdentity flag) silently disables tenant isolation in deployed environments. No spec currently exercises the rejection path under real-component conditions.
  3. No integration-time identity primitive. Lane A's healthcheck.spec.mjs connects via StreamableHTTPClientTransport with no headers. Future integration specs that need identity-aware behavior would each have to re-derive header injection. The fixture pattern is missing — copy-paste rot is the predictable trajectory.

The original #10805 Ledger explicitly named both specs, but the Lane A scope split deferred them. They are not new — they are completion of pre-agreed scope.

The Architectural Reality

  • Existing fixture surface (test/playwright/integration/):
    • fixtures/composeWebServer.mjs: webServer hook that spins up ai/deploy/docker-compose.test.yml and waits for the KB (:13000), MC (:13001), and Chroma (:18080) ports to be ready.
    • playwright.config.integration.mjs: fullyParallel: false, workers: 1, timeout: 120s, single readiness gate at http://127.0.0.1:13090/ready.
    • healthcheck.spec.mjs — single Lane A spec; no header injection; no identity context; uses StreamableHTTPClientTransport from @modelcontextprotocol/sdk/client/streamableHttp.js.
  • Existing compose (ai/deploy/docker-compose.test.yml): KB+MC envs do NOT currently set NEO_AUTH_TRUST_PROXY_IDENTITY=true. The auth substrate (PR #10768 + #10785) is dormant in the test stack.
  • Auth substrate to exercise:
    • Server.mjs#buildRequestContext — header → identity resolution.
    • Server.mjs#buildAuthProviderBlock — provider-block emission inside /healthcheck payload.
    • The 401-reject early-return path when trustProxyIdentity=true + missing header.
  • Cross-tenant isolation surface:
    • MC add_memory writes through the request-bound identity context to ChromaDB metadata.
    • MC query_raw_memories filters retrieval by the request-bound identity context.
    • Property: alice writes M_alice, bob writes M_bob, alice queries → only M_alice returned, bob queries → only M_bob returned.
  • MCP SDK header injection: StreamableHTTPClientTransport(URL, {requestInit: {headers: {...}}}), supported according to the SDK release notes (see the #10804 thread).

The Fix

Three deliveries in one PR (substrate primitive + both consumers):

1. Identity fixture primitive

New file: test/playwright/integration/fixtures/identityClient.mjs

Exports a small factory that returns an MCP client wired with header injection:

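import {Client} from '@modelcontextprotocol/sdk/client/index.js';
import {StreamableHTTPClientTransport} from '@modelcontextprotocol/sdk/client/streamableHttp.js';

// Returns a connected MCP client. The caller must await client.close() in spec teardown.
export async function createIdentityClient({baseUrl, identity = null, clientName = 'neo-integration-spec'}) {
    // identity === null sends no auth header, which is what the negative (401) tests need
    const headers = identity ? {'X-PREFERRED-USERNAME': identity} : {};
    const transport = new StreamableHTTPClientTransport(new URL('/mcp', baseUrl), {
        requestInit: {headers}
    });
    const client = new Client({name: clientName, version: '1.0.0'}, {capabilities: {}});
    await client.connect(transport);
    return client;
}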

This is the primitive every future identity-aware integration spec re-uses. No copy-paste of header construction.
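Because every spec that uses the factory also has to close the client in teardown, the two steps can be paired in a thin wrapper. The following is a minimal sketch, assuming the factory is passed in as an argument so the wrapper has no SDK dependency of its own; `withIdentityClient` is a hypothetical helper name and not one of this ticket's deliverables.

```javascript
// Hypothetical helper: runs `fn` with a client obtained from `factory`, then always closes it.
// `factory` is expected to behave like createIdentityClient: async, resolving to an object with close().
async function withIdentityClient(factory, options, fn) {
    const client = await factory(options);
    try {
        return await fn(client);
    } finally {
        // close() runs even when fn throws, so a failed assertion does not leak a session
        await client.close();
    }
}
```

A spec could then call something like `withIdentityClient(createIdentityClient, {baseUrl, identity: 'alice'}, async client => { ... })` and skip explicit teardown code.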

2. Compose update — enable auth in test stack

ai/deploy/docker-compose.test.yml envs for both KB and MC:

- NEO_AUTH_TRUST_PROXY_IDENTITY=true

Setting this flag activates the currently dormant substrate paths that the two new specs exercise. Lane A's healthcheck.spec.mjs continues to pass because its single test does not require auth (healthcheck is the unauthenticated probe per Server.mjs exclusion).

Important: verify that the healthcheck MCP tool remains unauthenticated under trustProxyIdentity=true. If it is gated, healthcheck.spec.mjs will need identity injection too; surface this as a Required Action during cycle review.

3. Two new specs

test/playwright/integration/CrossTenantIsolation.integration.spec.mjs:

  • Two clients via createIdentityClient: one for alice, one for bob.
  • alice writes a memory via MC add_memory({prompt, thought, response}) with a unique sentinel string.
  • bob writes a memory via MC add_memory with a different sentinel.
  • alice queries via MC query_raw_memories({query: alice-sentinel}) → expect alice's memory returned, bob's NOT returned.
  • bob queries via MC query_raw_memories({query: bob-sentinel}) → expect bob's memory returned, alice's NOT returned.
  • Assertion: cross-tenant invisibility holds (alice's query never returns bob's sentinel string anywhere in the result set, and vice versa).
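The sentinel assertion can be sketched as two pure helpers. This assumes the tool result can be serialised with JSON.stringify and that a substring search over that serialisation is an adequate presence check; the actual result shape of query_raw_memories is not specified in this ticket, and `makeSentinel` and `checkIsolation` are hypothetical names.

```javascript
// Hypothetical helper: a per-run, per-tenant marker that is unlikely to collide across runs.
function makeSentinel(tenant) {
    const nonce = Math.random().toString(36).slice(2, 10);
    return `sentinel-${tenant}-${Date.now()}-${nonce}`;
}

// Hypothetical helper: searches the whole serialised result, so a leak in any field is caught.
function checkIsolation(result, ownSentinel, foreignSentinel) {
    // JSON.stringify(undefined) yields undefined, so fall back to an empty haystack
    const haystack = JSON.stringify(result) ?? '';
    return {
        seesOwn    : haystack.includes(ownSentinel),
        seesForeign: haystack.includes(foreignSentinel)
    };
}
```

In the spec, alice's query result would be expected to give seesOwn true and seesForeign false, and symmetrically for bob.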

test/playwright/integration/AuthRejection.integration.spec.mjs:

  • One client via createIdentityClient({identity: null}) — no header injected.
  • Attempt MC healthcheck (or any tool) → expect transport-layer 401.
  • Verify Server.mjs returned the 401 status code on the underlying HTTP response (transport surface).
  • Counter-positive: a second client with identity: 'alice' succeeds.
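The spec has to tell an HTTP-layer rejection apart from a tool-level error. A minimal sketch of that classification follows. It assumes the transport rejects with an Error whose numeric `code` property holds the HTTP status, or whose message mentions the status; that is an assumption about how the SDK surfaces a non-2xx response and should be checked against the installed SDK version. `isTransport401` and `isToolError` are hypothetical names.

```javascript
// Hypothetical helper: true when a thrown transport error looks like an HTTP 401.
// Assumption: the status is exposed as error.code, with the message text as a fallback.
function isTransport401(error) {
    if (!(error instanceof Error)) return false;
    return error.code === 401 || /\b401\b/.test(error.message);
}

// Tool-level failures come back as a normal result flagged with isError, not as an exception.
function isToolError(result) {
    return Boolean(result) && result.isError === true;
}
```

The rejection test would then expect the call without identity to throw an error for which `isTransport401` holds, while the counter-positive call with identity 'alice' resolves to a result for which `isToolError` is false.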

4. AC5 — deliberate-regression empirical verification

Per #10805 Ledger row 1 evidence column. Procedure documented in PR body:

  • Branch: claude/10805-ac5-regression-proof (throwaway).
  • Edit: in Server.mjs#buildRequestContext, flip the precedence so trustProxyIdentity is ignored when set.
  • Run: npm run test-integration — confirm AuthRejection.integration.spec.mjs fails (the 401 gate no longer rejects; the regression is caught).
  • Edit: revert the regression. Re-run — confirm green.
  • Document the verification in PR body with the precise patch + observed test failure log.
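What AC5 demonstrates can be illustrated with a toy model of the gate. This is not the code in Server.mjs; `resolveIdentity`, `resolveIdentityRegressed`, and their return shape are hypothetical, written only to show how ignoring the flag turns a rejection into a silent pass.

```javascript
// Toy model of the 401 gate (hypothetical; not the actual Server.mjs implementation).
// Header names are lowercase because Node's http module lowercases incoming header names.
function resolveIdentity({trustProxyIdentity, headers}) {
    const userId = headers['x-preferred-username'] ?? null;
    if (trustProxyIdentity && !userId) {
        return {status: 401, userId: null}; // reject before tool dispatch
    }
    return {status: 200, userId};
}

// The same gate with a deliberate regression: the trustProxyIdentity flag is ignored.
function resolveIdentityRegressed({headers}) {
    const userId = headers['x-preferred-username'] ?? null;
    return {status: 200, userId}; // a missing header now passes silently
}
```

Against the regressed variant a request without the header gets status 200 instead of 401, which is exactly the condition AuthRejection.integration.spec.mjs is expected to fail on.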

Contract Ledger (T3)

| Target Surface | Source of Authority | Proposed Behavior | Fallback / Edge Case | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| test/playwright/integration/fixtures/identityClient.mjs (new) | This ticket; substrate-symmetric to existing composeWebServer.mjs fixture pattern | Factory createIdentityClient({baseUrl, identity, clientName}) returning an MCP Client with X-PREFERRED-USERNAME header injected when identity is non-null. Re-usable by all future identity-aware integration specs. | identity: null produces a client with no auth header, for negative-test cases like AuthRejection.integration.spec.mjs. Caller must await client.close() in spec teardown. | Cross-link from cookbook (learn/agentos/DeploymentCookbook.md) Section 8 (First-Connection Smoke Test) once the identity-fixture is the canonical client constructor. | L2: both new specs construct via this fixture; pattern verified across 2+ consumers. |
| test/playwright/integration/CrossTenantIsolation.integration.spec.mjs (new) | This ticket; #10805 Ledger row 1 spec (b); PR #10768 auth.trustProxyIdentity substrate; #10000 cross-tenant isolation property | alice writes via MC add_memory with sentinel; bob writes with different sentinel. Each queries via query_raw_memories and observes only their own sentinel, never the other's. Assertion uses sentinel-presence in result-set as the property. | Skip-with-warning if Docker daemon unavailable per existing composeWebServer.mjs readiness gate. Test uses unique per-run sentinel strings to avoid cross-run pollution; teardown deletes both memories. | Cross-link from cookbook Section 4 (IdP Setup) and SharedDeployment.md §Authentication threat model; both gain reference to this empirical verification spec. | L2: alice-only-data-back property as cross-tenant-isolation proof; AC5 empirical evidence. |
| test/playwright/integration/AuthRejection.integration.spec.mjs (new) | This ticket; #10805 Ledger row 1 spec (c); PR #10785 401 gate substrate | Client with identity: null calling any MC/KB MCP tool receives transport-layer 401 from Server.mjs before tool dispatch. Counter-positive: identity-bearing client succeeds against same endpoint. | Caller must distinguish transport 401 from tool-level error response (tool errors come back as {isError: true}; auth rejection happens at HTTP layer, surfaces as a transport exception). | Cross-link from SharedDeployment.md §Authentication; cookbook Section 4. | L2: 401 status verified at HTTP transport layer; counter-positive client succeeds; AC5 deliberate-regression catches a flipped gate. |
| ai/deploy/docker-compose.test.yml (mod) | This ticket; activates dormant auth substrate in test stack | Adds NEO_AUTH_TRUST_PROXY_IDENTITY=true env to both kb-server and mc-server services. Lane A's existing healthcheck.spec.mjs continues to pass (verify pre-change that healthcheck MCP tool is unauthenticated path). | If healthcheck requires auth under this flag, healthcheck.spec.mjs gains identity injection too; surface as cycle-1 RA. | Inline comment in compose file linking to PR #10768 + #10785 substrate. | L1: env present after change; L2: both new specs exercise the substrate; L3: AC5 catches regression. |

Acceptance Criteria

  • test/playwright/integration/fixtures/identityClient.mjs exists and exports createIdentityClient factory per Ledger row 1.
  • test/playwright/integration/CrossTenantIsolation.integration.spec.mjs ships with alice/bob sentinel-isolation property assertion.
  • test/playwright/integration/AuthRejection.integration.spec.mjs ships with 401-reject + counter-positive identity-bearing-client-succeeds.
  • ai/deploy/docker-compose.test.yml updated with NEO_AUTH_TRUST_PROXY_IDENTITY=true on both kb-server and mc-server.
  • All three integration specs (existing + 2 new) pass via npm run test-integration locally.
  • AC5 deliberate-regression empirical verification documented in PR body per Ledger row 3 evidence column.
  • learn/agentos/DeploymentCookbook.md and learn/agentos/SharedDeployment.md cross-linked from the new specs (one-line refs in each).

Out of Scope

  • Full Keycloak/Dex OIDC fixture — the original "Lane C+1" fidelity upgrade. trustProxyIdentity-mode is sufficient for MVP regression-catching per #10805 Ledger.
  • CI workflow for npm run test-integration — covered by sibling Lane C ticket (filed concurrently); do not bundle.
  • Heartbeat-over-time / sustained-liveness coverage — covered by sibling Lane B ticket (filed concurrently).
  • OIDC discovery / JWKS path — different substrate (real-OIDC, not header-injection). Future ticket.

Avoided Traps / Gold Standards Rejected

  • Rejected: re-derive header injection inside each spec. Copy-paste rot is the predictable trajectory across N specs. The fixture-primitive shape is canonical (substrate before consumer; reusable across all future integration tests).
  • Rejected: separate test compose for auth scenario. Adds compose-file proliferation; hides the truth that auth is part of the deployed shape. Activating trustProxyIdentity=true in the existing compose is the elegant move — the test stack matches the production-deployed shape.
  • Rejected: mock the 401 path. Defeats the purpose. Real Server.mjs reject is what production deploys do; mocking would re-create the same gap that #10805 exists to close.
  • Rejected: skip AC5 because "the specs are the proof". A spec that doesn't catch a deliberately-induced regression is theater. AC5 is the property test for the test suite itself.
  • Rejected: bundle CI workflow into this ticket. Cross-cutting concern (benefits all test types uniformly); deserves the dedicated Lane C ticket.

Related

  • Closes scope from: #10805 (Lane B residuals + AC5).
  • Parent epic: #10009 — Playwright Test Coverage: Federated Cloud Topology (this ticket directly executes #10009 Tasks 1 & 3 — network isolation + OIDC headers → ChromaDB userId metadata restrictions).
  • Substrate dependencies: PR #10768 (trustProxyIdentity substrate), PR #10785 (401 gate), PR #10880 (Docker artifacts), PR #10893 (Lane A vertical slice).
  • Sibling lanes (filed concurrently): Lane B (sustained-liveness heartbeat spec), Lane C (CI test-matrix workflow).
  • Cookbook cross-link: learn/agentos/DeploymentCookbook.md Sections 4, 8.
  • Architecture cross-link: learn/agentos/SharedDeployment.md §Authentication threat model.

Origin Session ID: 7e897a0b-33ce-4d6c-b1a9-a1ff93e4e571

Retrieval Hint: query_raw_memories(query="cross-tenant isolation 401-reject identity-fixture integration spec #10805 Lane B residuals")

tobiu referenced in commit 6816da3 - "test(integration): cover proxy identity isolation (#10895) (#10901)" on May 7, 2026, 3:44 PM
tobiu closed this issue on May 7, 2026, 3:44 PM