Frontmatter
| id | 9889 |
| title | feat: Implement NL Action Recorder — log Neural Link tool calls to nl_action_log |
| state | Closed |
| labels | enhancement, ai, testing, architecture |
| assignees | tobiu |
| createdAt | Apr 11, 2026, 8:58 PM |
| updatedAt | Apr 12, 2026, 12:01 PM |
| githubUrl | https://github.com/neomjs/neo/issues/9889 |
| author | tobiu |
| commentsCount | 3 |
| parentIssue | null |
| subIssues | [] |
| subIssuesCompleted | 0 |
| subIssuesTotal | 0 |
| blockedBy | [] |
| blocking | [ ] 9890 feat: DreamService 4th REM Vector — executeNLActionDigest() |
| closedAt | Apr 12, 2026, 12:01 PM |
feat: Implement NL Action Recorder — log Neural Link tool calls to nl_action_log

the HOF wrapper around callTool for recording is clean. one thing we learned from building similar instrumentation: the JSON.stringify(args) in the finally block can fail silently if args contains circular references or BigInt values. wrapping it in a try/catch with a fallback to a truncated string representation prevents lost log entries.

for the sequence_id grouping by turn counter, make sure parallel tool calls within the same turn get the same sequence_id. if the agent fires multiple tools concurrently (which MCP supports), they should be grouped together.

the Playwright test synthesis from action logs (mentioned as future DreamService work) is the really exciting part. we've been doing something similar where recorded user interactions become executable test specs. the key insight was that not every action sequence makes a good test, you need to filter for sequences that end in an observable state change (DOM mutation, API response) rather than sequences that just navigate.
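To make the serialization point above concrete, here is a minimal sketch of a guarded stringify. The names `safeStringify` and `MAX_LEN` are hypothetical and do not come from the issue; the sketch only assumes standard `JSON.stringify` behavior (it throws on circular structures and on `BigInt` values).

```javascript
// Hypothetical helper, not part of the proposed RecorderService API.
const MAX_LEN = 2000; // assumed cap for the fallback string

function safeStringify(value, maxLen = MAX_LEN) {
    const seen = new WeakSet();
    try {
        const json = JSON.stringify(value, (key, val) => {
            // BigInt has no JSON form, so store it as a tagged string
            if (typeof val === 'bigint') return `${val}n`;
            if (typeof val === 'object' && val !== null) {
                // Simplification: any object met twice is marked, which also
                // covers repeated references that are not strictly circular
                if (seen.has(val)) return '[Circular]';
                seen.add(val);
            }
            return val;
        });
        // JSON.stringify(undefined) yields undefined; keep the column non-null
        return json === undefined ? 'null' : json;
    } catch (err) {
        // Last resort (e.g. a throwing toJSON): truncated string representation
        return Object.prototype.toString.call(value).slice(0, maxLen);
    }
}

const cyclic = { n: 1 };
cyclic.self = cyclic;
console.log(safeStringify(cyclic));      // {"n":1,"self":"[Circular]"}
console.log(safeStringify({ big: 10n })); // {"big":"10n"}
```

Because the helper always returns a string, a logging call placed in a `finally` block cannot replace the tool's own result or error with a serialization failure.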

for the Playwright test synthesis from action logs (the future DreamService direction), we've been building exactly this:
- the test generation engine that converts interaction sequences into executable Playwright specs: https://github.com/assrt-ai/assrt-freestyle/blob/main/src/core/freestyle.ts
- the agentic test execution that filters for meaningful state-change sequences: https://github.com/assrt-ai/assrt-freestyle/blob/main/src/core/agent.ts
- the element targeting with multi-strategy selectors (relevant for mapping NL tool calls to DOM elements): https://github.com/mediar-ai/terminator/blob/main/crates/terminator/src/element.rs

Input from Antigravity (Gemini 3.1 Pro):
✦ Thanks for validating the direction and sharing these implementation insights, @m13v!
Regarding your feedback on the logger:
- JSON Serialization Safety: You are absolutely right about `JSON.stringify(args)`. We will wrap it in a `try/catch` block and provide a fallback truncated string representation to ensure we don't drop log entries due to circular references or `BigInt`s silently failing.
- Concurrent `sequence_id` Grouping: Good catch. Since MCP supports concurrent tool calls, we will ensure that the `sequence_id` turn counter is evaluated at the start of the agent's turn boundary and shared across all parallel invocations so they group correctly under the exact same sequence step.
- Element Targeting: Thanks for sharing the structural approach in `element.rs`. The `ExploredElementDetail` and trait abstractions look very clean. Building strict state-tracking attributes (`is_keyboard_focusable`, `bounds`, `is_toggled`) directly into the element payload aligns perfectly with what we need for reliably mapping Neural Link components down to DOM targets.

Regarding Playwright test synthesis: your insight about filtering specifically for sequences that end in an observable state change (rather than pure traversal or polling) is the missing link for generating high-value regression suites instead of noise.
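The concurrent `sequence_id` grouping from the list above could look roughly like the sketch below. How the server detects a turn boundary is not specified in this thread, so the sketch assumes the caller already knows which calls belong to one turn. `beginTurn`, `runTurn`, and the three-argument `callTool` are hypothetical names for illustration only.

```javascript
// Hypothetical sketch: the turn id is read once, then shared by all parallel calls.
let turnCounter = 0;

function beginTurn() {
    turnCounter += 1;
    return turnCounter;
}

// `calls` is an array of { name, args }; `callTool` is injected so the
// example does not depend on the real ToolService.
function runTurn(agentId, calls, callTool) {
    const turnId     = beginTurn();              // evaluated once per turn
    const sequenceId = `${agentId}_${turnId}`;   // same format as the issue body

    // Every concurrent invocation receives the identical sequenceId
    return Promise.all(
        calls.map(({ name, args }) => callTool(name, args, sequenceId))
    );
}
```

The design point is that the counter is incremented at the turn boundary rather than inside each invocation, so two tools fired concurrently can never end up in different groups.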
I attempted to explore your `freestyle.ts` and `agent.ts` links, but it seems the `assrt-ai/assrt-freestyle` repository is currently private (returning a 404). Even without the source, the overarching concept of state-change filtration will heavily influence our approach to `DreamService` ingestion.
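As a rough illustration of the state-change filtration idea, the sketch below groups logged rows by `sequence_id` and keeps only sequences whose last action succeeded and used a mutating tool. Which tools count as state-changing is an assumption here: the set reuses the two tool names mentioned in this issue, and `hypothetical_read_tool` is a placeholder, not a real Neural Link tool.

```javascript
// Assumption: these two tools mutate observable state; the real list may differ.
const STATE_CHANGING_TOOLS = new Set(['simulate_event', 'set_instance_properties']);

// Group rows by sequence_id and order each group by timestamp
function groupBySequence(rows) {
    const groups = new Map();
    for (const row of rows) {
        if (!groups.has(row.sequence_id)) groups.set(row.sequence_id, []);
        groups.get(row.sequence_id).push(row);
    }
    for (const actions of groups.values()) {
        actions.sort((a, b) => a.timestamp - b.timestamp);
    }
    return groups;
}

// Keep sequences that end in a successful, state-changing action
function selectTestCandidates(rows) {
    const candidates = [];
    for (const [sequenceId, actions] of groupBySequence(rows)) {
        const last = actions[actions.length - 1];
        if (last.success === 1 && STATE_CHANGING_TOOLS.has(last.tool)) {
            candidates.push({ sequenceId, actions });
        }
    }
    return candidates;
}

const sample = [
    { sequence_id: 's1', timestamp: 1, tool: 'hypothetical_read_tool', success: 1 },
    { sequence_id: 's1', timestamp: 2, tool: 'simulate_event',         success: 1 },
    { sequence_id: 's2', timestamp: 3, tool: 'hypothetical_read_tool', success: 1 }
];
console.log(selectTestCandidates(sample).map(c => c.sequenceId)); // [ 's1' ]
```

Sequences that only read or navigate are dropped, which keeps pure traversal and polling out of any generated test suite.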
Summary
Add a `RecorderService` to the Neural Link MCP server that intercepts every `callTool()` invocation and persists structured action logs to a new `nl_action_log` table in `memory-core.sqlite`. This is the foundational primitive for the Karpathy Loop (discussion #9887), enabling RLAIF dataset generation, automated Playwright test synthesis, and downstream fine-tuning pipelines.

Motivation
Without a structured record of what Neural Link tools agents invoke, in what order, and with what outcome (success/failure/duration/reward), the following capabilities are blocked:
- `TEST` nodes → `CLASS` nodes via `VALIDATES` edges in the Native Edge Graph

Architectural Context
Call Chain (confirmed by codebase audit):
    AI Agent (MCP stdio)
      → neural-link/Server.mjs [CallToolRequestSchema handler]
      → neural-link/services/toolService.mjs [serviceMapping dispatch]
      → ai/mcp/ToolService.mjs :: callTool()   ← 🎯 INTERCEPT POINT
      → {ComponentService, InstanceService, RuntimeService, ...}
      → Bridge.mjs [WebSocket hub]
      → Browser App Worker

Storage Decision: The NL MCP server and Memory Core MCP server are separate processes (each has its own `mcp-server.mjs` entry point). The cleanest approach is for `RecorderService` to open a dedicated `better-sqlite3` handle directly to `memory-core.sqlite`. Since WAL mode is already enforced (`PRAGMA journal_mode = WAL` in `SQLiteVectorManager.initAsync()`), readers never block the writer, and writes from the two processes are serialized by SQLite's own locking, so no application-level coordination is needed.

Session Context Available:
- `ConnectionService.sessionData` maps `appWorkerId → { appName, connectedAt }`.
- The `sessionId` parameter already flows through every NL tool invocation's args.
- `ConnectionService.agentId` (`agent-{uuid}`) is set once at NL server startup and cleanly identifies the agent process.

`sequence_id` Design: The `Server.mjs` `CallToolRequestSchema` handler fires once per MCP invocation. A module-level turn counter incremented there groups all tool calls within one agent step under the same `sequence_id = agentId + '_' + turnCounter`.

Schema: `nl_action_log`

    CREATE TABLE IF NOT EXISTS nl_action_log (
        id          TEXT PRIMARY KEY,     -- crypto.randomUUID()
        agent_id    TEXT NOT NULL,        -- ConnectionService.agentId (process-level identity)
        session_id  TEXT,                 -- appWorkerId from tool args (target App Worker)
        sequence_id TEXT NOT NULL,        -- agent_id + '_' + turn_counter (groups one agent turn)
        timestamp   INTEGER NOT NULL,     -- Date.now()
        tool        TEXT NOT NULL,        -- e.g. 'simulate_event', 'set_instance_properties'
        args        TEXT NOT NULL,        -- JSON.stringify(args), full arg payload
        result      TEXT,                 -- JSON.stringify(result) or error message
        success     INTEGER DEFAULT 0,    -- 1 = success, 0 = thrown error
        duration_ms INTEGER,              -- wall-clock latency in ms
        app_name    TEXT,                 -- resolved from ConnectionService.sessionData
        reward      REAL DEFAULT NULL     -- NULL until set by DreamService RLAIF scorer (future)
    );

    CREATE INDEX IF NOT EXISTS idx_nl_action_log_sequence  ON nl_action_log(sequence_id);
    CREATE INDEX IF NOT EXISTS idx_nl_action_log_session   ON nl_action_log(session_id);
    CREATE INDEX IF NOT EXISTS idx_nl_action_log_timestamp ON nl_action_log(timestamp);

New Files
`ai/mcp/server/neural-link/services/RecorderService.mjs`

- `Neo.core.Base`, singleton
- `better-sqlite3` handle to `memory-core.sqlite` path (read from NL `config.mjs`)
- `nl_action_log` table + indexes in `initAsync()` if absent
- `log(entry)`: synchronous `INSERT` (fire-and-forget, never throws)
- `querySequences({ sinceTimestamp, minSuccessRate, limit })`: for `DreamService` ingestion
- `pruneOlderThan(days)`: housekeeping, callable from Sandman REM cycle

Modified Files
`ai/mcp/server/neural-link/services/toolService.mjs`

HOF wrapper around `callTool` that intercepts all 33 tool invocations:

    const _callTool = toolService.callTool.bind(toolService);

    const callTool = async (name, args) => {
        const t0        = Date.now();
        // Turn id is read through getCurrentTurnId(), which Server.mjs exposes (see below)
        const seqId     = `${ConnectionService.agentId}_${getCurrentTurnId()}`;
        const sessionId = args?.sessionId ?? ConnectionService.getDefaultSessionId();
        const appName   = ConnectionService.sessionData.get(sessionId)?.appName ?? null;
        let result, success = 0;

        try {
            result  = await _callTool(name, args);
            success = 1;
            return result;
        } catch (err) {
            result = { error: err.message };
            throw err;
        } finally {
            // NOTE: JSON.stringify throws on circular references and BigInt values.
            // A throw inside finally would replace the tool's own result or error,
            // so these two calls need a guard.
            RecorderService.log({
                agent_id   : ConnectionService.agentId,
                session_id : sessionId,
                sequence_id: seqId,
                timestamp  : t0,
                tool       : name,
                args       : JSON.stringify(args ?? {}),
                result     : JSON.stringify(result ?? null),
                success,
                duration_ms: Date.now() - t0,
                app_name   : appName
            });
        }
    };

`ai/mcp/server/neural-link/Server.mjs`

- `let _turnId = 0`
- `CallToolRequestSchema` handler (before the health check gate)
- `getCurrentTurnId()` for `toolService.mjs` to read

`ai/mcp/server/neural-link/config.mjs`

- `memoryCoreDbPath` config key pointing to the same SQLite file as Memory Core
- `aiConfig.engines.neo.dataDir + filename`

Test
`test/playwright/unit/ai/neural-link/RecorderService.spec.mjs`

Follows the exact isolation pattern from `DreamService.spec.mjs`:

- `tmp/` SQLite DB (unique per `process.pid + Date.now()`)
- `beforeAll`: configure `aiConfig` to point to tmp path, init `SystemLifecycleService`
- `afterAll`: close DB handle, `unlinkSync` tmp files
- `log()` inserts a row with correct fields
- `querySequences()` groups and filters by `sequence_id`
- `success` flag is `1` on clean call, `0` on thrown error
- `app_name` is populated from `ConnectionService.sessionData`
- `reward` is `NULL` on initial insert
- `pruneOlderThan(0)` deletes all rows

Out of Scope (follow-up tickets)
These are explicitly deferred to preserve ticket atomicity:
- `DreamService.executeNLActionDigest()`: 4th REM ingestion vector that reads `nl_action_log` and synthesizes Playwright test scaffolds
- `test/playwright/e2e/generated/`
- `TEST` node upsert + `VALIDATES` edges

Acceptance Criteria
- `nl_action_log` table + indexes created on NL server startup if absent
- `sequence_id` correctly groups all tools fired within the same MCP turn
- `app_name` is populated from `ConnectionService.sessionData` when available
- `reward` column is `NULL` on insert (reserved for future RLAIF scorer)
- `RecorderService.log()` never throws: errors are swallowed internally to avoid breaking tool execution
- `better-sqlite3` WAL-mode concurrent write confirmed safe (existing `PRAGMA journal_mode = WAL` in `SQLiteVectorManager`)

A2A Context Bridge
Avoided Pitfalls:
- `.sqlite` file: the `nl_action_log` table must be co-located in `memory-core.sqlite` to enable future JOIN queries between action sequences and session memories without cross-DB coordination.
- `log()` on every call: establish once in `initAsync()` and reuse the handle.
- `log()`: the synchronous `better-sqlite3` `.run()` is the correct pattern here (already established by `SQLiteVectorManager`). Async embedding is not needed for structured relational data.
- `RecorderService` must be `initAsync()`-aware: it must not block the MCP server startup on DB unavailability. Degrade gracefully with a warning log if the memory-core path is not configured.
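
The pitfalls above (one reused handle, a synchronous insert, a `log()` that never throws, graceful degradation when the DB is unavailable) can be sketched as follows. This is not the actual `RecorderService`: the class name `RecorderSketch`, the `init(db)` method, and the injected `makeId` function are hypothetical. The `db` argument is assumed to expose `prepare(sql)` returning a statement with `run(params)`, which is the shape of a `better-sqlite3` database; it is injected so the example stays dependency-free.

```javascript
// Hypothetical sketch, not the real service. A real implementation would
// extend Neo.core.Base and do this work in initAsync().
class RecorderSketch {
    constructor(makeId) {
        this.makeId     = makeId; // e.g. () => crypto.randomUUID() in a real service
        this.insertStmt = null;   // prepared once, reused by every log() call
    }

    // Returns false instead of throwing, so server startup is never blocked
    init(db) {
        if (!db) {
            console.warn('[RecorderSketch] no database configured, recording disabled');
            return false;
        }
        try {
            this.insertStmt = db.prepare(
                `INSERT INTO nl_action_log
                    (id, agent_id, session_id, sequence_id, timestamp, tool,
                     args, result, success, duration_ms, app_name)
                 VALUES
                    (@id, @agent_id, @session_id, @sequence_id, @timestamp, @tool,
                     @args, @result, @success, @duration_ms, @app_name)`
            );
            return true;
        } catch (err) {
            console.warn(`[RecorderSketch] init failed: ${err.message}`);
            return false;
        }
    }

    // Synchronous, fire-and-forget: every failure is swallowed and reported as false
    log(entry) {
        if (!this.insertStmt) return false;
        try {
            this.insertStmt.run({ id: this.makeId(), ...entry });
            return true;
        } catch (err) {
            console.warn(`[RecorderSketch] log failed: ${err.message}`);
            return false;
        }
    }
}
```

Returning a boolean rather than throwing keeps the recorder out of the tool's control flow: the `callTool` wrapper can ignore the return value, while a test can still assert on it.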