What is the Neural Link?

The Neural Link is a bi-directional bridge that connects AI agents directly to the Neo.mjs runtime. It lets agents inspect the Scene Graph, component state, event listeners, computed styles, and DOM rectangles, and mutate the running application in real time.

Why is Neo.mjs called an Application Engine instead of a framework?

Neo.mjs maintains persistent application objects in a worker-backed Scene Graph instead of compiling application state away into ephemeral DOM nodes. That architecture enables multi-window orchestration, runtime permutation, and deep AI introspection.

What is Context Engineering?

Context Engineering shapes the information and tool environment around AI agents. Neo.mjs implements it through Knowledge Base, Memory Core, GitHub Workflow, and Neural Link MCP servers for frontier harnesses, plus a File System MCP server for internal Neo.ai.Agent local loops.

What is the Neo.mjs Agent OS?

The Neo.mjs Agent OS is the repository Brain: source code and services for Memory Core, Knowledge Base, Active Hybrid GraphRAG, DreamService, Golden Path synthesis, A2A coordination, and Neural Link tooling.

Cloud-Native KB Ingestion — Overview

Status — Phase 3B operational substrate. This guide describes the cloud-native Knowledge Base ingestion substrate (Epic #11624): tenant identity, ingestion contracts, server-side stamping, read-side filtering, and the Phase 2 push/bulk facades.

What "cloud-native KB ingestion" means

Neo's Knowledge Base (KB) is a Chroma-backed vector index over source content — guides, ADRs, skills, API docs, tickets, PRs, test specs. In a single-repo deployment, the KB indexes exactly one repository: Neo's own.

A cloud-native deployment runs Agent OS as a service for external tenant workspaces. Each workspace is a distinct tenant with its own source content (its own repos, possibly in languages and layouts unlike Neo's). The cloud deployment must let every tenant ingest its own content into the KB while continuing to serve Neo's curated content (guides, ADRs, skills) to all tenants.

The substrate that makes this possible is the per-tenant ingestion contract: every chunk in the KB carries an authoritative identity tuple, content is server-stamped at write time, and reads are tenant-scoped. No tenant can read another tenant's private content; every tenant can read Neo's team-visible curated corpus.

Read this guide after Why Deploy the Agent OS, the Day-0 Tutorial, Tenant Ingestion Model, Configuration, and Security. At that point the reader already knows why the Brain exists, how to stand it up, and which identity boundary protects it; this page explains the contract that keeps arbitrary tenant content useful without letting it blur together.

The contract split (Phase 0/1)

Phase 0/1 defines the data contracts cloud ingestion is built on. PRs #11647, #11659, #11661, and #11662 are merged:

Contract	Ticket / PR	What it provides
`parsed-chunk-v1` schema	#11629 / PR #11647	The ingest-side chunk shape every Source/Parser emits. Rejects records carrying an `embedding` field (that field belongs only to the restore-only `backup-record-v1` sibling).
Path-identity tuple	#11629 / PR #11647	`{tenantId, repoSlug, rootKind, sourcePath}` in chunk metadata — replaces the legacy single-`neoRootDir`-relative `source` string. Neo's curated content uses `tenantId: 'neo-shared'`, `repoSlug: 'neo'`.
Deletion-signaling contract	#11629 / PR #11647	Tombstone + manifest + revision-boundary mechanisms for propagating source deletions into the index.
`SourceRegistry`	#11658 / PR #11659	Data-driven Source/Parser registration. `useDefaultSources` / `useDefaultParsers` gate Neo's curated sources; `customSources` / `customParsers` register tenant-supplied classes; `rawRepoSource` explicitly enables the built-in raw repo fallback.
Per-source path externalization	#11660 / PR #11661	`aiConfig.sourcePaths` — each default Source class reads its input path from config, so a tenant whose layout differs from Neo's can reuse the curated Source classes.
Write-side tenant stamping	#11631 / PR #11662	`VectorService.embed` server-stamps `{tenantId, repoSlug, visibility, originAgentIdentity}` from the authenticated ingestion context; client-supplied identity fields are overwritten or rejected. Tenant-aware Chroma IDs prevent byte-identical chunks from different tenants colliding.

The split was deliberate: contracts before transport. The schemas and registry stabilized first; the Phase 2 ingestion endpoints were built against that frozen contract surface.

Topology anchor — one unified Chroma store

A cloud deployment keeps all its vector data in one ChromaDB store named unified, with the same layout locally and in the cloud. A single Chroma daemon serves three collections — the Knowledge Base, your agents' memories, and their session summaries — and two MCP servers (Knowledge Base and Memory Core) read from it. Cloud ingestion only ever writes to the Knowledge Base collection.

The Memory Core's edge graph is not stored in Chroma — it lives in a separate SQLite database.

There is no separate Chroma instance per tenant. Every piece of content is tagged with its owner when it's written and filtered by owner when it's read, so each tenant sees its own content plus Neo's shared library — never another tenant's private data.

Default-source inheritance

A zero-config Neo deployment behaves identically with or without the cloud-ingestion substrate: useDefaultSources defaults to true, rawRepoSource defaults to false, the SourceRegistry auto-registers Neo's 10 curated Source classes, and aiConfig.sourcePaths carries Neo's default layout. A cloud tenant opting out of Neo's curated content sets useDefaultSources: false; a tenant whose repo layout differs overrides only the sourcePaths keys it needs; a tenant with no known shape can opt into rawRepoSource as a day-0 fallback. Inheritance is the default; divergence is opt-in and granular.

Registry contract split — Source vs Parser

A Source locates and reads content from a territory (a directory tree, an external workspace) and emits knowledge chunks into the full-corpus build.
A Parser transforms a specific file format into chunk content (e.g. a .proto parser, an ES5-aware parser).

SourceRegistry holds both. Neo's curated Sources auto-register; tenant Sources/Parsers register declaratively via aiConfig.customSources / aiConfig.customParsers or programmatically via SourceRegistry.registerSource(...). The Phase 2 ingestion facades can invoke a registered Parser during an ingestion call; with no parser match, the raw-text fallback ingests the whole file as one chunk.

Relationship to Neo's curated content

Neo's own guides, ADRs, skills, and API docs remain in the KB under tenantId: 'neo-shared', visibility: 'team' — readable by every tenant. A cloud tenant's content is stamped with the tenant's own tenantId and a visibility of team (tenant-internal) or private. Read-side filtering (#11632) resolves each query against the requester's authenticated identity: a tenant sees its own content plus neo-shared, never another tenant's private chunks.

Where to go next

Security — tenant-isolation invariants, write-side stamping + spoof-rejection, parser-execution boundary framing, and the KB-as-cache vs MC-as-store recovery model.
Day-0 Tutorial - linear PoC walkthrough for a fresh operator or agent: remote MCP healthcheck, MC/KB connection, tenant ingestion, client-side parsing, bulk path, and backup/redeploy handoff.
llama.cpp Profile - OpenAI-compatible cloud-provider profile for self-hosted llama.cpp, including chat+embedding residency and handoff smoke.
Migration Path — how an existing single-repo Neo deployment upgrades to the cloud-ingestion substrate with zero config changes.
Tenant Ingestion Model — the operator-facing model for tenant repo identity, credential boundaries, parser dispatch, source-family inventory, and push-vs-bulk ingestion choices.
Configuration — the aiConfig keys and the KnowledgeBaseTenantConfig / kb-config.yaml tenant-config storage.
Custom Sources / Custom Parsers — authoring a Source for the full-corpus build, or a Parser for the push path.
Hook Wiring — the ingest_source_files / ai:ingest-tenant facades and git-hook patterns; runnable companions live under ai/examples/cloud-deployment/.