Frontmatter
id: 10917
title: CrossTenantIsolation integration spec: alice add_memory returns isError post-substrate-fix
state: Closed
labels: bug, ai, testing
assignees: neo-gpt
createdAt: May 7, 2026, 7:30 PM
updatedAt: May 7, 2026, 9:08 PM
githubUrl: https://github.com/neomjs/neo/issues/10917
author: neo-opus-4-7
commentsCount: 2
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 7, 2026, 9:08 PM

CrossTenantIsolation integration spec: alice add_memory returns isError post-substrate-fix

neo-opus-4-7 commented on May 7, 2026, 7:30 PM

Context

Surfaced 2026-05-07 by Lane C #10899 integration row CI on rebased head 56ed56e92 after the full substrate-fix cascade (#10904 + #10914 + #10916) merged. The basic healthcheck and AuthRejection specs PASS (proving the substrate is sound), but CrossTenantIsolation.integration.spec.mjs:43 fails when alice's add_memory MCP tool call returns result.isError === true (server-side application error, not transport).

Per agreement with @tobiu, deferring this with a NEO_TEST_SKIP_CI skip-guard to unblock Lane C, with this dedicated ticket for separate investigation.

The Problem

test/playwright/integration/CrossTenantIsolation.integration.spec.mjs:20:5 "alice and bob only retrieve their own tenant-tagged memories"

Symptom: alice's first add_memory call returns {isError: true} from MC's tool dispatch. The error surfaces at line 43 in callJsonTool via the expect(result.isError).not.toBe(true) assertion in mcpClient.mjs:78.

What we know:

  • Empirical evidence: Lane C run 25511367136 integration job, post-substrate-fix.
  • AuthRejection test PASSES — so the trust-proxy-identity gate is firing correctly (alice's identity-bearing client gets through; identity-less client gets 401).
  • Basic healthcheck (Lane A) PASSES — so the per-session McpServer flow is sound.
  • Sustained-liveness composability (5s/1s window) PASSES — so assertSustainedHealth works.
  • This SPECIFIC failure is application-layer: alice CONNECTS successfully + sends add_memory request + MC returns tool-level error.
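
The distinction drawn in the list above, between a failure at the transport or auth layer and a failure reported by the tool itself, can be sketched roughly as follows. This is only an illustration and not code from the repo: `classifyToolCall` is a hypothetical name, and the sketch assumes that transport and auth failures (such as the 401) reject the promise, while application failures resolve with a result whose `isError` flag is `true`.

```javascript
// Hypothetical helper (not part of the repo): classify where a tool call failed.
// Assumption: transport/auth failures reject the promise, while application
// failures resolve with a result whose isError flag is true.
async function classifyToolCall(invoke) {
    let result;

    try {
        result = await invoke();
    } catch (error) {
        // Connection refused, HTTP 401 from the identity gate, and similar.
        return {layer: 'transport', detail: error.message};
    }

    if (result && result.isError === true) {
        // The server accepted the request, but the tool reported a failure.
        return {layer: 'application', detail: result};
    }

    return {layer: 'none', detail: result};
}
```

Under these assumptions, alice's failing add_memory call would land in the `application` branch, which matches the observation that she connects successfully before the error appears.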

What we don't know:

  • The actual error message in result.content (CI log doesn't surface it; needs docker compose logs mc-server capture or in-test logging of the error response).
  • Whether the issue is in add_memory schema validation, ChromaDB write under tenant-scoped userId, or some other application-layer cause.

The Architectural Reality

  • test/playwright/integration/CrossTenantIsolation.integration.spec.mjs — Lane A's spec from #10901, tests the userId-scoped tenant isolation on MC.
  • The add_memory tool path: writes Chroma collection metadata-tagged with the auth-resolved userId. With trust-proxy-identity active and alice's X-PREFERRED-USERNAME: alice header, the userId should resolve to alice.
  • Possible failure surfaces: schema validation, Chroma collection-create-on-first-write, embedding service, the per-session McpServer's request-context propagation.
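
As a rough picture of the invariant the spec is checking, the sketch below models a store whose records are tagged with the auth-resolved userId and whose reads are filtered by it. It is an in-memory stand-in under stated assumptions, not MC's implementation: `resolveUserId` and `TenantMemoryStore` are hypothetical names, no Chroma or embedding call is involved, and the case-insensitive header lookup is an assumption.

```javascript
// Hypothetical sketch, not MC's implementation: an in-memory stand-in for a
// store whose records are tagged with the auth-resolved userId.

// Assumption: the identity arrives in the X-PREFERRED-USERNAME header and
// header names are matched case-insensitively.
function resolveUserId(headers) {
    const entry = Object.entries(headers)
        .find(([name]) => name.toLowerCase() === 'x-preferred-username');

    if (!entry || !entry[1]) {
        throw new Error('401: no identity header');
    }

    return entry[1];
}

class TenantMemoryStore {
    constructor() {
        this.records = [];
    }

    // Tag every write with the caller's userId.
    addMemory(headers, text) {
        const userId = resolveUserId(headers);
        this.records.push({userId, text});
        return {isError: false};
    }

    // Filter every read by the caller's userId.
    queryMemories(headers) {
        const userId = resolveUserId(headers);
        return this.records
            .filter(record => record.userId === userId)
            .map(record => record.text);
    }
}
```

In this model, a write from alice followed by a write from bob leaves each of them able to read back only their own entry, which is the behaviour the spec title describes.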

The Fix (Investigation Required)

This is an investigation-shaped ticket — root cause not yet diagnosed.

Phase 1: capture the actual result.content[0].text error message via either:

  • (a) Add docker compose logs mc-server step to Lane C workflow on integration failure (workflow-side instrumentation).
  • (b) Modify mcpClient.mjs:callJsonTool to log the error result before assertion (test-side instrumentation).
  • (c) Reproduce locally with Docker available + read MC log directly.
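
Option (b) could look roughly like the sketch below. It is not the current contents of mcpClient.mjs: `describeToolError` and `callJsonToolWithDiagnostics` are hypothetical names, the `client` is assumed to expose `callTool({name, arguments})`, and successful results are assumed to carry JSON in `content[0].text`.

```javascript
// Hypothetical helper: pull readable text out of a tool result.
// Assumption: the result has a `content` array of parts, and text parts
// carry a `text` string.
function describeToolError(result) {
    const parts = Array.isArray(result?.content) ? result.content : [];
    const text = parts
        .filter(part => part && part.type === 'text' && typeof part.text === 'string')
        .map(part => part.text)
        .join('\n');

    return text || '(no text content in error result)';
}

// Hypothetical variant of callJsonTool. Assumptions: `client` exposes
// callTool({name, arguments}) and successful results hold JSON in content[0].text.
async function callJsonToolWithDiagnostics(client, name, args) {
    const result = await client.callTool({name, arguments: args});

    if (result.isError === true) {
        const message = describeToolError(result);
        // Log before throwing so the text is written to the test output.
        console.error(`[${name}] tool-level error: ${message}`);
        throw new Error(`${name} returned isError: ${message}`);
    }

    return JSON.parse(result.content[0].text);
}
```

Throwing with the server's text in the message means the failure report itself would carry the Phase 1 evidence, without depending on a separate log capture step.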

Phase 2: based on Phase 1 evidence, determine fix shape (could be application bug in MC's add_memory, test-spec setup gap, schema validation, or something else).

Acceptance Criteria

  • Phase 1 diagnostic completes: actual error message from MC's add_memory response surfaced.
  • Phase 2 fix lands based on Phase 1 evidence (likely 1 commit + reverting the NEO_TEST_SKIP_CI skip-guard from the spec).
  • CrossTenantIsolation integration spec passes in CI (npm run test-integration with all servers up).

Out of Scope

  • HeartbeatPropagation uptime-equality bug (filed as separate sibling ticket).
  • Refactoring the test to use a different tenant-isolation testing pattern.

Related

  • Surfacing context: Lane C CI run 25511367136 integration job — first end-to-end run with all 5 substrate fixes applied.
  • Predecessor substrate work: #10915 / #10916 (per-session McpServer + NEO_AUTH_* binding); #10913 / #10914 (Chroma bash TCP probe); etc.
  • Originating spec: #10895 / #10901 Lane A (GPT authored the cross-tenant isolation spec).
  • Skip-guard PR: TBD (filed concurrently with this ticket).

Origin Session ID: 7e897a0b-33ce-4d6c-b1a9-a1ff93e4e571

Retrieval Hint: query_raw_memories(query="CrossTenantIsolation alice add_memory isError post-substrate Lane C deferred application")

tobiu referenced in commit ad04351 - "test(integration): defer 2 application-spec failures pending investigation (#10917) (#10919)" on May 7, 2026, 7:58 PM
tobiu referenced in commit 4fb4bca - "feat(ci): test matrix workflow gating PRs on unit + integration suites (#10897) (#10899)" on May 7, 2026, 8:19 PM
tobiu referenced in commit 9c95647 - "fix(test): provision integration embeddings (#10917) (#10927)" on May 7, 2026, 9:08 PM
tobiu closed this issue on May 7, 2026, 9:08 PM