LearnNewsExamplesServices
Frontmatter
id11469
titleAdd `ai:sync-github-workflow` npm script wrapping `GH_SyncService.runFullSync()` for CLI invocation
stateClosed
labels
enhancementaiarchitecturebuildmodel-experience
assigneesneo-opus-4-7
createdAtMay 16, 2026, 5:28 PM
updatedAtMay 16, 2026, 6:58 PM
githubUrlhttps://github.com/neomjs/neo/issues/11469
authorneo-opus-4-7
commentsCount0
parentIssuenull
subIssues[]
subIssuesCompleted0
subIssuesTotal0
blockedBy[]
blocking[]
closedAtMay 16, 2026, 6:58 PM

Add ai:sync-github-workflow npm script wrapping GH_SyncService.runFullSync() for CLI invocation

Closedenhancementaiarchitecturebuildmodel-experience
neo-opus-4-7
neo-opus-4-7 commented on May 16, 2026, 5:28 PM

Context

Post-#11461 (ADR 0004 Phase 1 Task 10 clean-slate purge) merged 2026-05-16, the canonical way to rehydrate resources/content/ is via the gh-workflow sync_all orchestration. Two invocation surfaces exist today:

  1. The gh-workflow MCP sync_all tool — agent-invokable but bound to MCP request-response timing
  2. The orchestrator-daemon's internal scheduled sweeps — automated but NOT including a full gh-workflow sync (the daemon's kbSyncInterval schedules KB sync + summarization only)

Empirically tested 2026-05-16 post-#11461 merge: clean-slate sync_all via MCP exceeds the MCP request timeout (8,502 issues + 2,816 PRs + 165 discussions + 166 release notes via GraphQL pagination = many minutes). The request-response model breaks on initial cold-cache emission; the operator is left invoking the daemon or waiting for periodic sweeps.

Problem

There's no ai:* package.json script for invoking the gh-workflow sync as a long-lived CLI process. Adjacent scripts exist (ai:sync-kb via syncKnowledgeBase.mjs, ai:run-sandman, ai:defrag-kb, etc.) but the gh-workflow analog is missing.

Naming note (per operator 2026-05-16): the natural-sounding name ai:sync-all is misleading — it would suggest syncing all substrates (KB, memory, gh-workflow, etc.) but the script scope is the gh-workflow server only. Use ai:sync-github-workflow to mirror the existing ai:mcp-server-github-workflow naming convention — same prefix-component, explicit scope.

This blocks clean test workflows post-clean-slate purge: the operator wants a way to invoke the gh-workflow sync from CLI with full stderr/stdout visibility + no MCP timeout ceiling. Currently the only options are:

  • Manually spin up the gh-workflow MCP server + craft a tool call (heavy)
  • Wait for orchestrator-daemon's KB sweep (which does NOT include gh-workflow sync_all)
  • Operator-side script-authoring at the moment of test (substrate-tool friction)

Architectural Reality

The CLI pattern is already well-established. buildScripts/ai/syncKnowledgeBase.mjs (invoked via ai:sync-kb) is the direct parallel — same shape:

import Neo                  from '../../src/Neo.mjs';
import * as core            from '../../src/core/_export.mjs';
import InstanceManager      from '../../src/manager/Instance.mjs';
import KB_Config            from '../../ai/mcp/server/knowledge-base/config.mjs';
import KB_DatabaseService   from '../../ai/services/knowledge-base/DatabaseService.mjs';
// ...

async function syncKnowledgeBase() {
    KB_Config.data.debug = true;
    await KB_LifecycleService.ready();
    await KB_ChromaManager.ready();
    await KB_DatabaseService.ready();
    const result = await KB_DatabaseService.syncDatabase();
    console.log('✅ Synchronization Complete:', result);
    process.exit(0);
}

GH_SyncService.runFullSync() is the established orchestration entry point (referenced at ai/services/github-workflow/SyncService.mjs:87). The MCP sync_all tool already delegates to it (per toolService.mjs). A CLI wrapper just provides the second invocation surface.

The Fix

Create buildScripts/ai/syncGithubWorkflow.mjs mirroring syncKnowledgeBase.mjs's pattern:

import Neo               from '../../src/Neo.mjs';
import * as core         from '../../src/core/_export.mjs';
import InstanceManager   from '../../src/manager/Instance.mjs';
import GH_Config         from '../../ai/mcp/server/github-workflow/config.mjs';
import GH_SyncService    from '../../ai/services/github-workflow/SyncService.mjs';

async function syncGithubWorkflow() {
    GH_Config.data.debug = true;
    try {
        const result = await GH_SyncService.runFullSync();
        console.log('✅ Sync complete:', result);
        process.exit(0);
    } catch (e) {
        console.error('❌ Sync failed:', e);
        process.exit(1);
    }
}

syncGithubWorkflow();

Add to package.json (alphabetical position between ai:summarize-sessions and ai:sync-kb):

"ai:sync-github-workflow": "node ./buildScripts/ai/syncGithubWorkflow.mjs"

Acceptance Criteria

  • New file: buildScripts/ai/syncGithubWorkflow.mjs — bootstrap Neo namespace + import GH_SyncService + invoke runFullSync() + log result + exit per syncKnowledgeBase.mjs's established pattern.
  • package.json adds "ai:sync-github-workflow": "node ./buildScripts/ai/syncGithubWorkflow.mjs" next to ai:sync-kb (alphabetical placement preserves existing ordering).
  • Manual verification: npm run ai:sync-github-workflow invokes the full sync end-to-end with visible stderr/stdout progress; no MCP timeout ceiling; exit code 0 on success / 1 on failure.
  • No regressions on the MCP tool path — both invocation surfaces continue to work; this is additive.
  • JSDoc module header documents the dual-invocation rationale (CLI dual to MCP sync_all tool) + cites authority anchors (ADR 0004 §1.3 + §3.6, #11451, PR #11461).

Out of Scope

  • Refactoring GH_SyncService.runFullSync() itself (already correct post-#11461).
  • Adding a flag-based filter (e.g., --type issues to sync only one content type). Future ticket if friction surfaces.
  • Wiring ai:sync-github-workflow into the orchestrator-daemon's scheduled tasks (deliberately on-demand only — the daemon's existing KB sweep + summarization shouldn't auto-fire heavy syncs).
  • Changing the MCP tool's request-timeout behavior. The MCP tool is for delta-syncs (the "cached for fast no-ops" path per its own description); the CLI script is for full cold-cache emission.
  • Adding a branch-guard mirroring the MCP tool's dev-branch precondition. The CLI script runs in operator's main checkout (which is on dev by convention); branch guard is the MCP tool's protection against agents invoking from feature-branch worktrees. CLI invocation is operator-side and trusted.

Avoided Traps

  • Naming ai:sync-all: rejected per operator 2026-05-16 — scope is gh-workflow only, not "all substrates". Use ai:sync-github-workflow for explicit scope (parallels existing ai:mcp-server-github-workflow naming).
  • Adding the script to the daemon's scheduled sweeps: rejected per "Out of Scope" — sync should remain operator-on-demand, not automatic. Periodic daemon sync would defeat the regeneratable-cache principle (ADR 0004 §1.3) by re-emitting unchanged corpus repeatedly.
  • Using the Agent SDK / AgentOrchestrator pattern: surveyed runAgent.mjs — that's for the full agent harness, not service-method invocation. syncKnowledgeBase.mjs is the correct architectural parallel: direct service singleton + Neo namespace bootstrap.
  • Wrapping the MCP tool via stdio: rejected — spawning an MCP server just to call its sync_all tool from CLI adds process overhead + still hits transport request-response semantics. Direct service method call is simpler.
  • Filtering via env vars to limit scope: rejected for v1 — defer until friction surfaces. The natural CLI flag pattern (e.g., --type pulls) can be added later when there's empirical demand.

Related

  • Parent context: ADR 0004 Phase 1 Task 10 close-out (#11451 / merged via PR #11461)
  • Authority artifact: learn/agentos/decisions/0004-github-content-architecture.md §1.3 (regeneratable cache) + §3.6 (clean-slate purge)
  • Architectural parallel: buildScripts/ai/syncKnowledgeBase.mjs (the established pattern this script mirrors) + ai:sync-kb npm script
  • MCP tool: mcp__neo-mjs-github-workflow__sync_all (existing invocation surface; this enabler adds the CLI dual)
  • Empirical anchor for the gap: 2026-05-16 post-#11461-merge sync_all attempt via MCP — request timed out before sync emission began; operator-side direction was to file this enabler ticket.

Origin Session ID: 0064efde-455e-4ecd-a26f-574381b3766a

Retrieval Hint: query_raw_memories(query="ai:sync-github-workflow npm script CLI wrapper GH_SyncService runFullSync MCP timeout enabler")

tobiu closed this issue on May 16, 2026, 6:58 PM