Context
Post-#11461 (ADR 0004 Phase 1 Task 10 clean-slate purge) merged 2026-05-16, the canonical way to rehydrate resources/content/ is via the gh-workflow sync_all orchestration. Two invocation surfaces exist today:
- The
gh-workflow MCP sync_all tool — agent-invokable but bound to MCP request-response timing
- The orchestrator-daemon's internal scheduled sweeps — automated but NOT including a full gh-workflow sync (the daemon's
kbSyncInterval schedules KB sync + summarization only)
Empirically tested 2026-05-16 post-#11461 merge: clean-slate sync_all via MCP exceeds the MCP request timeout (8,502 issues + 2,816 PRs + 165 discussions + 166 release notes via GraphQL pagination = many minutes). The request-response model breaks on initial cold-cache emission; the operator is left invoking the daemon or waiting for periodic sweeps.
Problem
There's no ai:* package.json script for invoking the gh-workflow sync as a long-lived CLI process. Adjacent scripts exist (ai:sync-kb via syncKnowledgeBase.mjs, ai:run-sandman, ai:defrag-kb, etc.) but the gh-workflow analog is missing.
Naming note (per operator 2026-05-16): the natural-sounding name ai:sync-all is misleading — it would suggest syncing all substrates (KB, memory, gh-workflow, etc.) but the script scope is the gh-workflow server only. Use ai:sync-github-workflow to mirror the existing ai:mcp-server-github-workflow naming convention — same prefix-component, explicit scope.
This blocks clean test workflows post-clean-slate purge: the operator wants a way to invoke the gh-workflow sync from CLI with full stderr/stdout visibility + no MCP timeout ceiling. Currently the only options are:
- Manually spin up the gh-workflow MCP server + craft a tool call (heavy)
- Wait for orchestrator-daemon's KB sweep (which does NOT include gh-workflow sync_all)
- Operator-side script-authoring at the moment of test (substrate-tool friction)
Architectural Reality
The CLI pattern is already well-established. buildScripts/ai/syncKnowledgeBase.mjs (invoked via ai:sync-kb) is the direct parallel — same shape:
import Neo from '../../src/Neo.mjs';
import * as core from '../../src/core/_export.mjs';
import InstanceManager from '../../src/manager/Instance.mjs';
import KB_Config from '../../ai/mcp/server/knowledge-base/config.mjs';
import KB_DatabaseService from '../../ai/services/knowledge-base/DatabaseService.mjs';
async function syncKnowledgeBase() {
KB_Config.data.debug = true;
await KB_LifecycleService.ready();
await KB_ChromaManager.ready();
await KB_DatabaseService.ready();
const result = await KB_DatabaseService.syncDatabase();
console.log('✅ Synchronization Complete:', result);
process.exit(0);
}
GH_SyncService.runFullSync() is the established orchestration entry point (referenced at ai/services/github-workflow/SyncService.mjs:87). The MCP sync_all tool already delegates to it (per toolService.mjs). A CLI wrapper just provides the second invocation surface.
The Fix
Create buildScripts/ai/syncGithubWorkflow.mjs mirroring syncKnowledgeBase.mjs's pattern:
import Neo from '../../src/Neo.mjs';
import * as core from '../../src/core/_export.mjs';
import InstanceManager from '../../src/manager/Instance.mjs';
import GH_Config from '../../ai/mcp/server/github-workflow/config.mjs';
import GH_SyncService from '../../ai/services/github-workflow/SyncService.mjs';
async function syncGithubWorkflow() {
GH_Config.data.debug = true;
try {
const result = await GH_SyncService.runFullSync();
console.log('✅ Sync complete:', result);
process.exit(0);
} catch (e) {
console.error('❌ Sync failed:', e);
process.exit(1);
}
}
syncGithubWorkflow();
Add to package.json (alphabetical position between ai:summarize-sessions and ai:sync-kb):
"ai:sync-github-workflow": "node ./buildScripts/ai/syncGithubWorkflow.mjs"
Acceptance Criteria
Out of Scope
- Refactoring
GH_SyncService.runFullSync() itself (already correct post-#11461).
- Adding a flag-based filter (e.g.,
--type issues to sync only one content type). Future ticket if friction surfaces.
- Wiring
ai:sync-github-workflow into the orchestrator-daemon's scheduled tasks (deliberately on-demand only — the daemon's existing KB sweep + summarization shouldn't auto-fire heavy syncs).
- Changing the MCP tool's request-timeout behavior. The MCP tool is for delta-syncs (the "cached for fast no-ops" path per its own description); the CLI script is for full cold-cache emission.
- Adding a branch-guard mirroring the MCP tool's dev-branch precondition. The CLI script runs in operator's main checkout (which is on dev by convention); branch guard is the MCP tool's protection against agents invoking from feature-branch worktrees. CLI invocation is operator-side and trusted.
Avoided Traps
- Naming
ai:sync-all: rejected per operator 2026-05-16 — scope is gh-workflow only, not "all substrates". Use ai:sync-github-workflow for explicit scope (parallels existing ai:mcp-server-github-workflow naming).
- Adding the script to the daemon's scheduled sweeps: rejected per "Out of Scope" — sync should remain operator-on-demand, not automatic. Periodic daemon sync would defeat the regeneratable-cache principle (ADR 0004 §1.3) by re-emitting unchanged corpus repeatedly.
- Using the Agent SDK / AgentOrchestrator pattern: surveyed
runAgent.mjs — that's for the full agent harness, not service-method invocation. syncKnowledgeBase.mjs is the correct architectural parallel: direct service singleton + Neo namespace bootstrap.
- Wrapping the MCP tool via stdio: rejected — spawning an MCP server just to call its
sync_all tool from CLI adds process overhead + still hits transport request-response semantics. Direct service method call is simpler.
- Filtering via env vars to limit scope: rejected for v1 — defer until friction surfaces. The natural CLI flag pattern (e.g.,
--type pulls) can be added later when there's empirical demand.
Related
- Parent context: ADR 0004 Phase 1 Task 10 close-out (#11451 / merged via PR #11461)
- Authority artifact:
learn/agentos/decisions/0004-github-content-architecture.md §1.3 (regeneratable cache) + §3.6 (clean-slate purge)
- Architectural parallel:
buildScripts/ai/syncKnowledgeBase.mjs (the established pattern this script mirrors) + ai:sync-kb npm script
- MCP tool:
mcp__neo-mjs-github-workflow__sync_all (existing invocation surface; this enabler adds the CLI dual)
- Empirical anchor for the gap: 2026-05-16 post-#11461-merge sync_all attempt via MCP — request timed out before sync emission began; operator-side direction was to file this enabler ticket.
Origin Session ID: 0064efde-455e-4ecd-a26f-574381b3766a
Retrieval Hint: query_raw_memories(query="ai:sync-github-workflow npm script CLI wrapper GH_SyncService runFullSync MCP timeout enabler")
Context
Post-#11461 (ADR 0004 Phase 1 Task 10 clean-slate purge) merged 2026-05-16, the canonical way to rehydrate
resources/content/is via the gh-workflowsync_allorchestration. Two invocation surfaces exist today:gh-workflowMCPsync_alltool — agent-invokable but bound to MCP request-response timingkbSyncIntervalschedules KB sync + summarization only)Empirically tested 2026-05-16 post-#11461 merge: clean-slate
sync_allvia MCP exceeds the MCP request timeout (8,502 issues + 2,816 PRs + 165 discussions + 166 release notes via GraphQL pagination = many minutes). The request-response model breaks on initial cold-cache emission; the operator is left invoking the daemon or waiting for periodic sweeps.Problem
There's no
ai:*package.json script for invoking the gh-workflow sync as a long-lived CLI process. Adjacent scripts exist (ai:sync-kbviasyncKnowledgeBase.mjs,ai:run-sandman,ai:defrag-kb, etc.) but the gh-workflow analog is missing.Naming note (per operator 2026-05-16): the natural-sounding name
ai:sync-allis misleading — it would suggest syncing all substrates (KB, memory, gh-workflow, etc.) but the script scope is the gh-workflow server only. Useai:sync-github-workflowto mirror the existingai:mcp-server-github-workflownaming convention — same prefix-component, explicit scope.This blocks clean test workflows post-clean-slate purge: the operator wants a way to invoke the gh-workflow sync from CLI with full stderr/stdout visibility + no MCP timeout ceiling. Currently the only options are:
Architectural Reality
The CLI pattern is already well-established.
buildScripts/ai/syncKnowledgeBase.mjs(invoked viaai:sync-kb) is the direct parallel — same shape:import Neo from '../../src/Neo.mjs'; import * as core from '../../src/core/_export.mjs'; import InstanceManager from '../../src/manager/Instance.mjs'; import KB_Config from '../../ai/mcp/server/knowledge-base/config.mjs'; import KB_DatabaseService from '../../ai/services/knowledge-base/DatabaseService.mjs'; // ... async function syncKnowledgeBase() { KB_Config.data.debug = true; await KB_LifecycleService.ready(); await KB_ChromaManager.ready(); await KB_DatabaseService.ready(); const result = await KB_DatabaseService.syncDatabase(); console.log('✅ Synchronization Complete:', result); process.exit(0); }GH_SyncService.runFullSync()is the established orchestration entry point (referenced atai/services/github-workflow/SyncService.mjs:87). The MCPsync_alltool already delegates to it (pertoolService.mjs). A CLI wrapper just provides the second invocation surface.The Fix
Create
buildScripts/ai/syncGithubWorkflow.mjsmirroringsyncKnowledgeBase.mjs's pattern:import Neo from '../../src/Neo.mjs'; import * as core from '../../src/core/_export.mjs'; import InstanceManager from '../../src/manager/Instance.mjs'; import GH_Config from '../../ai/mcp/server/github-workflow/config.mjs'; import GH_SyncService from '../../ai/services/github-workflow/SyncService.mjs'; async function syncGithubWorkflow() { GH_Config.data.debug = true; try { const result = await GH_SyncService.runFullSync(); console.log('✅ Sync complete:', result); process.exit(0); } catch (e) { console.error('❌ Sync failed:', e); process.exit(1); } } syncGithubWorkflow();Add to
package.json(alphabetical position betweenai:summarize-sessionsandai:sync-kb):"ai:sync-github-workflow": "node ./buildScripts/ai/syncGithubWorkflow.mjs"Acceptance Criteria
buildScripts/ai/syncGithubWorkflow.mjs— bootstrap Neo namespace + importGH_SyncService+ invokerunFullSync()+ log result + exit persyncKnowledgeBase.mjs's established pattern.package.jsonadds"ai:sync-github-workflow": "node ./buildScripts/ai/syncGithubWorkflow.mjs"next toai:sync-kb(alphabetical placement preserves existing ordering).npm run ai:sync-github-workflowinvokes the full sync end-to-end with visible stderr/stdout progress; no MCP timeout ceiling; exit code 0 on success / 1 on failure.sync_alltool) + cites authority anchors (ADR 0004 §1.3 + §3.6, #11451, PR #11461).Out of Scope
GH_SyncService.runFullSync()itself (already correct post-#11461).--type issuesto sync only one content type). Future ticket if friction surfaces.ai:sync-github-workflowinto the orchestrator-daemon's scheduled tasks (deliberately on-demand only — the daemon's existing KB sweep + summarization shouldn't auto-fire heavy syncs).Avoided Traps
ai:sync-all: rejected per operator 2026-05-16 — scope is gh-workflow only, not "all substrates". Useai:sync-github-workflowfor explicit scope (parallels existingai:mcp-server-github-workflownaming).runAgent.mjs— that's for the full agent harness, not service-method invocation.syncKnowledgeBase.mjsis the correct architectural parallel: direct service singleton + Neo namespace bootstrap.sync_alltool from CLI adds process overhead + still hits transport request-response semantics. Direct service method call is simpler.--type pulls) can be added later when there's empirical demand.Related
learn/agentos/decisions/0004-github-content-architecture.md§1.3 (regeneratable cache) + §3.6 (clean-slate purge)buildScripts/ai/syncKnowledgeBase.mjs(the established pattern this script mirrors) +ai:sync-kbnpm scriptmcp__neo-mjs-github-workflow__sync_all(existing invocation surface; this enabler adds the CLI dual)Origin Session ID: 0064efde-455e-4ecd-a26f-574381b3766a
Retrieval Hint:
query_raw_memories(query="ai:sync-github-workflow npm script CLI wrapper GH_SyncService runFullSync MCP timeout enabler")