LearnNewsExamplesServices
Frontmatter
id11652
titleSubstrate-layer guard: refuse canonical Chroma collection delete without UNIT_TEST_MODE
stateClosed
labels
bugaitestingarchitecture
assigneesneo-opus-ada
createdAtMay 19, 2026, 5:35 PM
updatedAtMay 20, 2026, 2:48 AM
githubUrlhttps://github.com/neomjs/neo/issues/11652
authorneo-opus-ada
commentsCount0
parentIssuenull
subIssues[]
subIssuesCompleted0
subIssuesTotal0
blockedBy[]
blocking[]
closedAtMay 20, 2026, 2:48 AM

Substrate-layer guard: refuse canonical Chroma collection delete without UNIT_TEST_MODE

Closed v13.0.0/archive-v13-0-0-chunk-12 bugaitestingarchitecture
neo-opus-ada
neo-opus-ada commented on May 19, 2026, 5:35 PM

Context

A second Memory Core wipe incident has occurred: live MC went from 11,286 memories → 1,102 memories between the 2026-05-16T13-08-06.565Z and 2026-05-17T... backup snapshots. Summaries went from 980 → 81 over the same window. Restored from the 5/16 bundle this turn (12,266 records back via npm run ai:restore -- ... --mode merge --only-substrate=mc).

Forensic context: prior wipe (#10867) was solved by adding a hardcoded safety check at the test-helper layer (test/playwright/unit/ai/services/memory-core/util.mjs:16-18 throws on canonical-collection-name match). That guard fires only when the test routes through cleanupChromaManager(). The current recurrence happened despite that guard — most plausible attack surface: an agent invoking npx playwright <spec> directly (no playwright.config.unit.mjs loaded), so process.env.UNIT_TEST_MODE is undefined, ai/mcp/server/memory-core/config.mjs:228-229 falls through to canonical names, and any test (or service-startup path) that touches Chroma destructive APIs operates on the live collection.

Operator framing (2026-05-19 ~15:30Z): "gemini does frequently use npx playwright without our config files. so our tests must get hardened to still never delete live dbs."

The Problem

Defense-in-depth is currently asymmetric:

  • Config layer (config.mjs:228-229) — toggles collection names on UNIT_TEST_MODE. Correct under npm run test-unit. Bypassed under npx playwright (config file not loaded).
  • Test-helper layer (util.mjs:16-18) — guards cleanupChromaManager(). Bypassed if any code path calls ChromaManager.client.deleteCollection() directly, or if the test forgets the helper.
  • Substrate layer (ai/services/memory-core/managers/ChromaManager.mjs) — NO GUARD. The underlying chromadb-js client is exposed unwrapped (ChromaManager.client.deleteCollection({name})). Any caller — production code, test, daemon, ingest path — can drop any collection by name, including the canonical ones.

The pattern works for the happy path but has zero defense against a caller who acquires ChromaManager.client and operates on it without going through the cleanup helper.

The Architectural Reality

Layer File Current behavior Gap
Config ai/mcp/server/memory-core/config.mjs:228-229 process.env.UNIT_TEST_MODE === 'true' ? 'test-...' : 'neo-agent-memory' Only triggers when env var loaded; npx playwright bypasses
Test helper test/playwright/unit/ai/services/memory-core/util.mjs:16-18 Throws if collectionsConfig.memory === 'neo-agent-memory' Only fires inside the helper; direct client.deleteCollection skips
ChromaManager ai/services/memory-core/managers/ChromaManager.mjs No deleteCollection method; client exposed directly No substrate-level invariant
Existing destructive-op guard Shared_DestructiveOperationGuard (#10845) Wraps known destructive paths (restore.mjs, etc.) Not wired to chroma collection delete

KB side has the same shape: ai/services/knowledge-base/DatabaseService.mjs:372 truncateDatabase does route through DestructiveOperationGuard for the wipe-for-restore path, but ChromaManager.client.deleteCollection is still reachable bare.

The Fix

Add a substrate-layer invariant at ChromaManager: refuse deleteCollection against canonical collection names UNLESS process.env.UNIT_TEST_MODE === 'true' is set in the environment of the calling process.

Concretely:

  1. ai/services/memory-core/managers/ChromaManager.mjs — add a guarded async deleteCollection({name}) method that:
    • Throws CANONICAL_COLLECTION_GUARDED if name matches the canonical set (neo-agent-memory, neo-agent-sessions, neo-agent-knowledge-base, plus any other canonical names sourced from config.mjs).
    • Bypass only if process.env.UNIT_TEST_MODE === 'true' (test path).
    • Bypass only via explicit operator-acknowledged token if the call originates from a documented destructive recovery flow (e.g. restore.mjs --mode replace already gates via DestructiveOperationGuard).
  2. Forbid direct ChromaManager.client.deleteCollection access — either wrap the client behind a Proxy in the manager so the underlying deleteCollection is intercepted at the JS level, OR add a CI lint rule (grep test) refusing the pattern ChromaManager.client.deleteCollection outside the manager itself.
  3. Mirror at the KB side (ai/services/knowledge-base/managers/ChromaManager.mjs if it has one — verify during impl) so the invariant applies uniformly across memory-core + knowledge-base subsystems.
  4. Add UNIT_TEST_MODE-required check at cleanupChromaManager — current util.mjs:16-18 only checks the collection name; also check process.env.UNIT_TEST_MODE === 'true'. If not set, refuse even non-canonical names (covers the npx playwright-without-config scenario where the collection name falls through to canonical and the test author thought they were on a test-prefixed name).
  5. Spec coverage — new test test/playwright/unit/ai/services/memory-core/managers/ChromaManager.canonicalGuard.spec.mjs that:
    • Calls the guarded delete against a canonical name without UNIT_TEST_MODE → expects throw
    • Calls against a canonical name with UNIT_TEST_MODE=true → expects pass
    • Calls against a test-prefixed name without UNIT_TEST_MODE → expects throw (defensive: the prefix-rule already gives test isolation, but if env var is missing the caller shouldn't be in a test path)

Acceptance Criteria

  • ChromaManager.deleteCollection({name}) method exists and routes all destructive collection-delete operations through a single guarded path.
  • Guard throws CANONICAL_COLLECTION_GUARDED when name matches the canonical set AND process.env.UNIT_TEST_MODE !== 'true', regardless of caller.
  • ChromaManager.client either Proxy-wrapped or client.deleteCollection made unreachable from outside the manager (lint rule + CI grep gate accepted as alternative if Proxy adds too much overhead).
  • KB-side ChromaManager (if exists) mirrors the same invariant.
  • cleanupChromaManager() adds process.env.UNIT_TEST_MODE === 'true' precondition; throws if absent.
  • New spec ChromaManager.canonicalGuard.spec.mjs covers 3 scenarios above.
  • Manual verification: npx playwright test/playwright/unit/ai/services/memory-core/<any-spec>.spec.mjs without npm run test-unit wrapper → tests refuse to touch canonical collections (the guard fires).
  • Memory Core MCP daemon startup is unaffected (it reads canonical names directly from config.mjs and never invokes deleteCollection; verify production path).

Out of Scope

  • Forensic recovery of the 2026-05-17 wipe (separate operator-orchestrated investigation; restore already executed this turn).
  • Backup retention policy adjustments (#11628 Phase 4 covers retention framing).
  • Cross-substrate destructive-op guard unification beyond memory-core + KB Chroma surfaces (e.g., SQLite-layer guard already covered by #10845 DestructiveOperationGuard).
  • Test-skill triggering hardening (#11529 closed; this ticket is one layer below).

Avoided Traps

Trap Why rejected
Only extending util.mjs test-helper check Same shape as #10867 fix that just failed in production. Doesn't defend against direct ChromaManager.client.deleteCollection callers. Substrate-layer invariant is required.
Removing ChromaManager.client accessor entirely Production read paths legitimately need the client for non-destructive ops. Proxy-wrap or method-level guard is the right granularity.
Lock canonical names behind a --force CLI flag instead of env var --force requires CLI wiring; tests don't have it. process.env.UNIT_TEST_MODE === 'true' is the existing convention and applies symmetrically.
Add a runtime "are we in a test process?" heuristic Heuristics drift. Explicit env var is the invariant — if it's not set, you're not in a test.
Skip the lint/CI gate and rely on developer discipline Empirically failed twice (#10867 + this incident). The guard must be mechanically enforced, not advisory.

Related

  • Prior incident: #10867 (closed; insufficient depth, only addressed test-helper layer)
  • Skill-trigger hardening: #11529 (closed; meta layer above this)
  • Existing destructive-op guard: #10845 (DestructiveOperationGuard infrastructure to extend)
  • Restore tooling that fires the existing guard correctly: buildScripts/ai/restore.mjs (#10129 / #11141 / #11144)

Origin Session ID

7360e917-1733-4cdd-a6f3-5ac51c34b838

Handoff Retrieval Hints

  • query_raw_memories({query: 'memory core wipe npx playwright canonical collection guard'})
  • ask_knowledge_base({query: 'ChromaManager deleteCollection guard canonical names UNIT_TEST_MODE', type: 'src'})
  • Empirical anchor: live MC count 1,102 → 12,842 post-restore on 2026-05-19; backup backup-2026-05-16T13-08-06.565Z contained 11,286+980 records
  • Existing guard precedent: DestructiveOperationGuard.assertDestructiveTargetAllowed in ai/services/shared/DestructiveOperationGuard.mjs
tobiu referenced in commit 7022a1a - "fix(ai): substrate guard refusing canonical Chroma collection deletes (#11652) (#11656) on May 20, 2026, 2:48 AM
tobiu closed this issue on May 20, 2026, 2:48 AM