Frontmatter
id: 10897
title: CI workflow: matrix(unit, integration) test runner
state: Closed
labels: enhancement, ai, testing, build
assignees: neo-opus-4-7
createdAt: May 7, 2026, 2:56 PM
updatedAt: May 7, 2026, 8:19 PM
githubUrl: https://github.com/neomjs/neo/issues/10897
author: neo-opus-4-7
commentsCount: 0
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 7, 2026, 8:19 PM

CI workflow: matrix(unit, integration) test runner

neo-opus-4-7 commented on May 7, 2026, 2:56 PM

Context

#10805 closed 2026-05-07 with the test-integration harness merged via #10893. The CI execution side was explicitly deferred per the #10805 Out-of-Scope section: "CI workflow that runs npm run test-integration on PR — file as separate follow-up ticket; benefits all suites uniformly."

Empirical state of .github/workflows/:

close-inactive-issues.yml
codeql-analysis.yml
data-sync-pipeline.yml
npm-publish.yml
prevent-reopen.yml

Zero workflows run any npm test* script on PR. Every test type — unit, components, integration, e2e — is local-only today. The integration harness from #10805 is the most recent example: a substantial Docker stack + 1 spec, with no CI gate.

This is the follow-up that #10805 named, scoped as a single test.yml workflow with a matrix dimension that future test types plug into symmetrically, rather than a one-off test-integration.yml that locks in the asymmetry.

The Problem

  1. Regressions land in dev undetected. Lane A's spec (#10893) is one example; PRs that touch MCP server code can break it without anyone noticing until next manual npm run test-integration. Same risk applies to all 200+ unit specs.
  2. Asymmetric workflow files. A standalone test-integration.yml would lock in a per-test-type proliferation pattern. Every future addition (unit, components, e2e, whitebox-e2e) would mean a new workflow file. The matrix shape resolves this once.
  3. No shared setup pattern. Each test type currently has its own playwright.config.*.mjs; a per-workflow setup duplicates Node version, npm install, and bundle-parse5 across files. Matrix with shared setup step is canonical.
  4. Docker availability handling. Integration tests skip with warning when Docker unavailable per composeWebServer.mjs readiness gate. CI runner (ubuntu-latest) has Docker pre-installed; a matrix job for integration runs the real stack. Local opt-out preserved.

The Architectural Reality

  • GitHub Actions runner availability:
    • ubuntu-latest has Docker + Docker Compose pre-installed; supports the ai/deploy/docker-compose.test.yml stack out-of-box.
    • Node.js setup via actions/setup-node@v4 with cached package-lock.json.
    • Concurrency controls available via concurrency.group to cancel superseded PR runs.
  • Existing test scripts (package.json):
    • test-unit: playwright test -c test/playwright/playwright.config.unit.mjs
    • test-components: playwright test -c test/playwright/playwright.config.component.mjs
    • test-integration: playwright test -c test/playwright/playwright.config.integration.mjs
    • test-e2e: playwright test -c test/playwright/playwright.config.e2e.mjs
  • Existing CI patterns to mirror:
    • data-sync-pipeline.yml — uses Node 24, npm install, has a clear concurrency block.
    • codeql-analysis.yml — matrix-on-language pattern; same shape we want for matrix-on-test-type.
  • Worktree bootstrap concern: CI clones fresh, so .neo-ai-data/ and the gitignored MCP config.mjs files do NOT exist. Per the worktree-bootstrap memory anchor, a fresh checkout needs node ai/scripts/bootstrapWorktree.mjs --link-data --install (or an equivalent CI-friendly init). CI is not a worktree, however, so the --link-data step would be a no-op there. Test scripts may instead need an npm prepare-equivalent bootstrap; npm run prepare already exists in package.json.
  • Empirical install times (from the worktree-bootstrap notes, adjacent to the ChromaDB defrag perf memory anchor): npm install takes ~17s on a populated cache and minutes when cold. CI is cold-cache by default, so use the actions/setup-node npm cache to bring this down.

The Fix

Single new workflow file: .github/workflows/test.yml

name: Tests
on:
  pull_request:
    branches: [dev]
  push:
    branches: [dev]

concurrency:
  group: test-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        suite: [unit, integration]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 24
          cache: npm
      - run: npm ci
      - run: npm run prepare  # bundle-parse5 + setup steps
      - name: Run ${{ matrix.suite }} tests
        run: npm run test-${{ matrix.suite }}
        env:
          # Integration stack needs a longer timeout on cold runners; set for every matrix row (harmless for unit)
          NEO_INTEGRATION_STACK_TIMEOUT_MS: 240000
      - name: Upload test artifacts on failure
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: test-results-${{ matrix.suite }}-${{ github.run_id }}
          path: test/playwright/test-results/

Why this shape:

  • Matrix-on-suite is the elegance — adding components, e2e, or future whitebox-e2e to the matrix is a one-line change. Zero per-suite workflow proliferation.
  • fail-fast: false — failing one suite shouldn't kill visibility into the others. Each suite reports independently.
  • concurrency group with cancel-in-progress: true — superseded PR pushes cancel the prior run; saves CI minutes.
  • pull_request + push to dev — covers PR review gating AND post-merge verification. Doesn't run on main PRs (per data-sync-pipeline.yml precedent — only dev is the active branch).
  • Artifact upload on failure — the test-results/ directory has Playwright reports; uploading lets reviewers inspect without re-running locally.
  • NEO_INTEGRATION_STACK_TIMEOUT_MS set to 240000: a cold Docker build on CI is slower than local, and the existing default of 210000 may be tight on a cold runner. The override could be scoped to the integration row (for example by guarding a dedicated step with if: matrix.suite == 'integration') if we want to be precise; a step-level env applied to every row keeps it simple at small cost.
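
As a sketch of the one-line extension described above, the strategy block below adds components to the suite list. It assumes the existing test-components script from package.json runs unattended on ubuntu-latest, which this ticket does not verify; everything else in the job stays as proposed.

```yaml
# Fragment of the proposed .github/workflows/test.yml (jobs.test.strategy only).
# Assumption: test-components needs no extra setup on the runner.
strategy:
  fail-fast: false
  matrix:
    # Each entry expands to `npm run test-<suite>`; "components" is the only addition.
    suite: [unit, integration, components]
```

The new row would surface as a test (components) check alongside test (unit) and test (integration), with no additional workflow file.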

Sequencing note re: sibling lanes

This workflow gates on npm run test-integration working — which assumes Lane A (#10895) and Lane B (#10896) specs are landed and passing locally. The CI workflow itself can land first (gating on the existing single Lane A spec); Lanes A + B's new specs flow through the same CI on their respective PRs.

Branch-protection follow-up (out-of-scope here, file separately)

Once this workflow is green on a few PRs, configure branch protection on dev to require the test (unit) + test (integration) checks. That's a repo-admin operation, not a code change, so file it as a separate follow-up.

Contract Ledger (T3)

| Target Surface | Source of Authority | Proposed Behavior | Fallback / Edge Case | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| .github/workflows/test.yml (new) | This ticket; #10805 explicit out-of-scope follow-up; @tobiu's lead-coordination directive 2026-05-07 | GitHub Actions matrix workflow running npm run test-{unit,integration} on PR-to-dev and push-to-dev. Matrix-on-suite for symmetric coverage; fail-fast: false for independent suite reporting; concurrency-cancel for superseded runs; artifact upload on failure. | If integration suite Docker stack times out, suite fails with diagnostic output (composeWebServer logs surface via Playwright webServer.stdout: 'pipe'). Unit suite remains independent. Future suite types (components, e2e, whitebox-e2e) plug into matrix as one-line additions. | One-line README contributing-guide entry pointing to CI green requirement; cookbook (learn/agentos/DeploymentCookbook.md) Section 8 cross-link. | L1: workflow file exists and parses (GitHub Actions linter on push). L3: workflow runs successfully on a test PR including this very change; both matrix rows green. |
| Branch protection on dev requiring test (unit) + test (integration) checks (out-of-scope here) | This ticket marks it as the natural follow-up; repo-admin operation | Once workflow green on several PRs, require both check-conclusions for merge. Closes the gate. | Admin override available per GitHub native control. | Cookbook + contributing-guide reference. | Out-of-scope; separate follow-up ticket. |

Acceptance Criteria

  • .github/workflows/test.yml exists per Ledger row 1 with matrix-on-suite shape (unit, integration).
  • Workflow runs npm ci + npm run prepare + npm run test-${suite} per matrix row.
  • concurrency.group + cancel-in-progress: true configured.
  • actions/setup-node@v4 with cache: npm configured.
  • actions/upload-artifact@v4 configured for failure path.
  • Workflow runs on PR-to-dev AND push-to-dev per existing data-sync-pipeline.yml precedent.
  • First green run observed on this very ticket's PR (both matrix rows pass) — documented in PR body with run URL per L3 evidence.
  • No changes required to package.json test scripts, playwright.config.*.mjs files, or composeWebServer.mjs (the workflow consumes existing surfaces).

Out of Scope

  • Branch protection configuration on dev — repo-admin operation, not a code change. Filed as separate follow-up once workflow is green on multiple PRs.
  • Adding components / e2e / whitebox-e2e to the matrix — extension is a one-line addition once the matrix shape is in place. File as separate follow-ups per test-type as their stability matures.
  • Caching node_modules/ or Playwright browsers — possible optimization later; first version uses default setup-node cache for npm only.
  • Self-hosted runners — public ubuntu-latest sufficient for current scale.
  • Test result aggregation / dashboards — Playwright artifact upload covers reviewer-side inspection; aggregation is future ticket.
  • CI for npm run build-all / production bundle — different concern (bundle gating); separate follow-up.

Avoided Traps / Gold Standards Rejected

  • Rejected: standalone .github/workflows/test-integration.yml. Locks in per-test-type workflow proliferation. Every future test type would mean a new workflow file. Matrix-on-suite resolves this once.
  • Rejected: bundle CI workflow into Lane A or Lane B PR. Cross-cutting concern; benefits all test types uniformly. Filing separately keeps each lane's PR focused on the substrate it owns.
  • Rejected: fail-fast: true on the matrix. Would cancel running suites if one fails; reviewers lose visibility. fail-fast: false is correct for parallel diagnostic info.
  • Rejected: npm install (instead of npm ci). npm ci is reproducible-from-lockfile; npm install is mutating. Reproducibility is the CI virtue.
  • Rejected: skip Docker availability probe in CI. composeWebServer.mjs already has the readiness gate; on ubuntu-latest, Docker is always available so the probe is a no-op. No special-casing needed.
  • Rejected: per-matrix-row if conditions for env vars. Adds complexity; the global env override (NEO_INTEGRATION_STACK_TIMEOUT_MS: 240000) is harmless for unit tests and keeps the workflow file simple.
  • Rejected: trigger on push: main. Per data-sync-pipeline.yml precedent, dev is the active branch; main is the slower release branch. Nothing to gate on main PRs that wasn't already gated when they landed in dev.
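
For comparison, the rejected row-scoped variant could look roughly like the fragment below. It is a sketch rather than part of the proposed workflow: the step names are illustrative, and it shows why the approach was passed over, since the single run step has to be split in two just to scope one environment variable.

```yaml
# Fragment of jobs.test.steps (illustrative, not the proposed workflow).
- name: Run ${{ matrix.suite }} tests
  # Every row except integration keeps the default stack timeout.
  if: matrix.suite != 'integration'
  run: npm run test-${{ matrix.suite }}
- name: Run integration tests with raised stack timeout
  if: matrix.suite == 'integration'
  run: npm run test-integration
  env:
    NEO_INTEGRATION_STACK_TIMEOUT_MS: 240000
```

Each additional suite with its own settings would add another guarded step, which is the complexity the step-level env in the proposed workflow avoids.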

Related

  • Closes scope from: #10805 (explicit out-of-scope follow-up — "CI workflow that runs npm run test-integration on PR").
  • Sibling lanes (filed concurrently): #10895 Lane A (tenant isolation + auth rejection), #10896 Lane B (sustained-liveness heartbeat).
  • Substrate dependencies: PR #10880 (Docker artifacts), PR #10893 (Lane A vertical slice + test-integration script).
  • CI pattern precedent: .github/workflows/data-sync-pipeline.yml (Node 24 + npm + concurrency), .github/workflows/codeql-analysis.yml (matrix shape).
  • Cookbook cross-link target: learn/agentos/DeploymentCookbook.md Section 8.
  • Operator framing: lead-coordination handoff 2026-05-07 — "deployment pipelines + heartbeats" → this ticket gates the deployment-pipeline tests in CI.

Origin Session ID: 7e897a0b-33ce-4d6c-b1a9-a1ff93e4e571

Retrieval Hint: query_raw_memories(query="CI workflow matrix unit integration test runner github actions follow-up #10805")

tobiu referenced in commit 2721678 - "fix(deploy): add --ignore-scripts to npm ci in Dockerfile (#10902) (#10904)" on May 7, 2026, 3:43 PM
tobiu referenced in commit 30ad551 - "feat(testing): NEO_TEST_SKIP_CI guard for heavy-SLM + auth specs (#10903) (#10907)" on May 7, 2026, 5:58 PM
tobiu referenced in commit 4fb4bca - "feat(ci): test matrix workflow gating PRs on unit + integration suites (#10897) (#10899)" on May 7, 2026, 8:19 PM
tobiu closed this issue on May 7, 2026, 8:19 PM