Frontmatter
id: 11580
title: Fix sync-data hook for automated commits and repo-local branch checks
state: Closed
labels: bug, ai, build, model-experience
assignees: []
createdAt: May 18, 2026, 12:40 PM
updatedAt: May 18, 2026, 1:54 PM
githubUrl: https://github.com/neomjs/neo/issues/11580
author: neo-gpt
commentsCount: 0
parentIssue: null
subIssues: []
subIssuesCompleted: 0
subIssuesTotal: 0
blockedBy: []
blocking: []
closedAt: May 18, 2026, 1:54 PM

Fix sync-data hook for automated commits and repo-local branch checks

neo-gpt commented on May 18, 2026, 12:40 PM

Context

Operator report on 2026-05-18: running GitHub Workflow sync from Claude's repo root failed during the Husky pre-commit path with:

chore: gh workflow sync [skip ci]
[neo] Error: Sync-data leakage detected.
Branch 'codex/11578-unit-setup-main-localstorage' is not a designated data-sync branch (e.g., 'chore/sync-*' or 'agent/sync-*').
The following data files are staged for commit:
- resources/content/discussions/chunk-1/discussion-10040.md

The branch named in the error is @neo-gpt's unrelated Neo checkout branch, not the caller's repo branch. The guard is therefore creating two failures at once: it blocks the sanctioned automated sync path, and it can report / enforce against the wrong worktree context.

The Problem

The sync-data guard was introduced to prevent accidental resources/content/** leakage on feature branches, but the current implementation is over-broad in the automated sync path and under-specified for multi-checkout / multi-harness contexts.

Bug 1: automated sync commits should not be blocked by the same accidental-human-staging guard. SyncService.runFullSync() has an explicit pushToRepoAfterSync path that stages resources/content, commits chore: ticket sync [skip ci], rebases, and pushes. That path is the sanctioned data-sync writer. The current hook still runs and rejects staged resources/content/issues/ or resources/content/discussions/ unless the active branch starts with chore/sync- or agent/sync-.

Bug 2: when sync is invoked from another repo / harness context, the branch check must honor the actual sync target repository, not a different agent's checkout. The observed error named codex/11578-unit-setup-main-localstorage, which was @neo-gpt's active branch in /Users/Shared/codex/neomjs/neo, while the operator was acting from Claude's repo root.

The Architectural Reality

V-B-A evidence from current source:

  • .husky/pre-commit:1 runs node ./buildScripts/util/check-chore-sync.mjs before npx lint-staged.
  • buildScripts/util/check-chore-sync.mjs:7 reads the current branch with git rev-parse --abbrev-ref HEAD and no explicit target-root argument.
  • buildScripts/util/check-chore-sync.mjs:14-18 only allows branches prefixed chore/sync- or agent/sync-.
  • buildScripts/util/check-chore-sync.mjs:24 reads staged files with git diff --cached --name-only, again with no explicit target-root argument.
  • buildScripts/util/check-chore-sync.mjs:35-40 flags resources/content/issues/ and resources/content/discussions/ as violations.
  • ai/services/github-workflow/SyncService.mjs:152-167 is the sanctioned auto-push path: it uses aiConfig.projectRoot, runs git add resources/content, then runs git commit -m "chore: ticket sync [skip ci]" without --no-verify or a hook-scoped bypass.
  • ai/mcp/server/github-workflow/toolService.mjs:29-31 detects the MCP sync_all branch by running git branch --show-current in config.projectRoot. If the server config is bound to a different checkout than the caller's intended repo, the guard reports that other checkout's branch.
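
The guard behavior described in the evidence above can be approximated as a pure function. This is a hypothetical reconstruction based only on the bullets in this issue, not the actual contents of check-chore-sync.mjs; the function name `findLeakedFiles`, the constant names, and the `src/Main.mjs` sample path are assumptions made for illustration.

```javascript
// Hypothetical reconstruction of the guard's decision, derived from the
// evidence list above. The real script may be structured differently.
const SYNC_BRANCH_PREFIXES = ['chore/sync-', 'agent/sync-'];
const GUARDED_PATH_PREFIXES = [
    'resources/content/issues/',
    'resources/content/discussions/'
];

// Returns the staged files that would be reported as violations.
function findLeakedFiles(branch, stagedFiles) {
    const isSyncBranch = SYNC_BRANCH_PREFIXES.some(prefix => branch.startsWith(prefix));

    if (isSyncBranch) {
        return []; // designated data-sync branches may stage data files
    }

    return stagedFiles.filter(file =>
        GUARDED_PATH_PREFIXES.some(prefix => file.startsWith(prefix))
    );
}

// The reported failure: an unrelated feature branch plus one staged discussion file.
const leaked = findLeakedFiles('codex/11578-unit-setup-main-localstorage', [
    'resources/content/discussions/chunk-1/discussion-10040.md',
    'src/Main.mjs' // assumed sample path, ignored by the guard
]);

console.log(leaked); // only the discussion file is reported
```

Note that nothing in this decision takes a repository root or a "who is committing" signal as input, which is why both the sanctioned auto-sync commit and the wrong-checkout case fall through to the same rejection.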

The older sync-leakage tickets (#11115, #11133, #11148, #11360) are adjacent but not equivalent. They established the need for sync-data safeguards; this bug is about the safeguard now rejecting valid automated sync commits and leaking branch context across repo/harness boundaries.

The Fix

Preserve the accidental-leakage protection, but make it context-aware and automation-aware:

  1. Add an explicit sanctioned-sync bypass for SyncService.runFullSync() auto-commit operations. Acceptable shapes include git commit --no-verify for the auto-generated sync commit, or a narrow environment variable such as NEO_SYNC_AUTOCOMMIT=1 that check-chore-sync.mjs honors only when the commit message / staged paths match the sync contract.
  2. Make check-chore-sync.mjs operate on an explicit repository root. It should resolve git rev-parse --show-toplevel, compare it against the intended sync root when one is provided, and print that root in failures.
  3. Ensure the branch check used by the GitHub Workflow MCP path cannot silently bind to another agent's unrelated checkout. If config.projectRoot and the caller / sync target disagree, fail with a clear root-mismatch error before staging or committing.
  4. Update or add unit coverage for both surfaces: sanctioned auto-sync commit bypass and wrong-root rejection / diagnostics.
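
A minimal sketch of how items 1 to 3 could combine into a single decision, assuming the environment-variable shape of the bypass. The function name `evaluateSyncGuard`, the result shape, and the `reason` strings are hypothetical. Because Git invokes the pre-commit hook before the commit message is obtained, this sketch validates the bypass against staged paths only; a message check would need a different hook or an extra signal from SyncService.

```javascript
// Sketch only: names, result shape, and bypass handling are assumptions.
const DATA_SYNC_BRANCHES = ['chore/sync-', 'agent/sync-'];
const LEAK_PATHS = ['resources/content/issues/', 'resources/content/discussions/'];
const SANCTIONED_TREE = 'resources/content/';

function evaluateSyncGuard({inspectedRoot, expectedRoot, branch, stagedFiles, env = {}}) {
    // 1. Root identity first: never judge a checkout the caller did not intend.
    //    A real implementation would likely normalize both paths before comparing.
    if (expectedRoot && expectedRoot !== inspectedRoot) {
        return {ok: false, reason: 'root-mismatch'};
    }

    // 2. Narrow bypass: the flag alone is not enough, every staged file
    //    must also live inside the sanctioned sync-data tree.
    const onlySyncData = stagedFiles.length > 0 &&
        stagedFiles.every(file => file.startsWith(SANCTIONED_TREE));

    if (env.NEO_SYNC_AUTOCOMMIT === '1' && onlySyncData) {
        return {ok: true, reason: 'sanctioned-auto-sync'};
    }

    // 3. Existing rule: data-sync branches pass, other branches must not stage data files.
    if (DATA_SYNC_BRANCHES.some(prefix => branch.startsWith(prefix))) {
        return {ok: true, reason: 'data-sync-branch'};
    }

    const leakedFiles = stagedFiles.filter(file =>
        LEAK_PATHS.some(prefix => file.startsWith(prefix))
    );

    return leakedFiles.length === 0
        ? {ok: true, reason: 'no-data-files'}
        : {ok: false, reason: 'leakage', leakedFiles};
}

// Sanctioned auto-sync from a feature branch passes; the same staging without the flag fails.
const staged = ['resources/content/discussions/chunk-1/discussion-10040.md'];
const base = {inspectedRoot: '/repo', branch: 'codex/11578-unit-setup-main-localstorage', stagedFiles: staged};

console.log(evaluateSyncGuard({...base, env: {NEO_SYNC_AUTOCOMMIT: '1'}}).reason);
console.log(evaluateSyncGuard(base).reason);
```

Checking the root before the bypass is a deliberate ordering in this sketch: a sanctioned writer pointed at the wrong checkout should still fail.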

Contract Ledger Matrix

| Target Surface | Source of Authority | Proposed Behavior | Fallback | Docs | Evidence |
| --- | --- | --- | --- | --- | --- |
| Husky pre-commit sync-data guard | .husky/pre-commit, buildScripts/util/check-chore-sync.mjs | Continue blocking accidental resources/content/{issues,discussions}/ staging on normal branches, but do not block sanctioned auto-sync commits | Fail closed with branch + repo-root diagnostics | Hook error text and script JSDoc | Unit test / shell fixture for feature-branch human commit vs sanctioned sync commit |
| GitHub Workflow auto-push path | ai/services/github-workflow/SyncService.mjs | Auto-generated sync commits can commit/push their own staged content without being rejected by the local leakage guard | Roll back or log failure if root/branch contract cannot be verified | Inline comment near commit invocation | Unit test around command construction or integration fixture with hook enabled |
| MCP sync_all repo-root branch detection | ai/mcp/server/github-workflow/toolService.mjs | Branch/root validation is anchored to the real sync target repository and cannot report another agent checkout's branch | Reject before staging with root-mismatch message | OpenAPI / tool description if behavior changes | Unit test for injected projectRoot / caller-root mismatch |

Acceptance Criteria

  • Automated SyncService.runFullSync() commits with generated messages of the form chore: <name> sync [skip ci] (for example, chore: ticket sync [skip ci]) are not rejected by check-chore-sync.mjs when staging only the sanctioned sync-data paths.
  • Normal developer / agent commits on non-data-sync branches are still rejected when they stage resources/content/issues/ or resources/content/discussions/ accidentally.
  • The hook error prints the repository root it inspected, the branch it inspected, and the staged files it used for the decision.
  • A sync invocation from a different checkout cannot evaluate @neo-gpt's /Users/Shared/codex/neomjs/neo branch unless that is explicitly the configured sync target; root mismatch fails before git add / git commit.
  • Tests cover the sanctioned auto-sync path and the wrong-root / unrelated-checkout diagnostic.
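
The diagnostics criterion above (root, branch, and staged files in the hook error) could be met by a formatter along these lines. This is a sketch: the function name `formatGuardFailure` and the "Repository root inspected" line are assumptions, while the other lines reuse the error text quoted in the Context section.

```javascript
// Hypothetical formatter; only the added root line differs from the
// error text quoted in the Context section of this issue.
function formatGuardFailure({root, branch, leakedFiles}) {
    return [
        '[neo] Error: Sync-data leakage detected.',
        `Repository root inspected: ${root}`,
        `Branch '${branch}' is not a designated data-sync branch (e.g., 'chore/sync-*' or 'agent/sync-*').`,
        'The following data files are staged for commit:',
        ...leakedFiles.map(file => `- ${file}`)
    ].join('\n');
}

console.log(formatGuardFailure({
    root: '/Users/Shared/codex/neomjs/neo',
    branch: 'codex/11578-unit-setup-main-localstorage',
    leakedFiles: ['resources/content/discussions/chunk-1/discussion-10040.md']
}));
```

With the root printed, the failure reported in this issue would have shown immediately that the hook inspected a different checkout than the one the operator was working in.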

Out of Scope

  • Removing the sync-data leakage guard entirely.
  • Relaxing normal feature-branch protection for human-authored resources/content/** commits.
  • Reworking the broader sync architecture or archive/chunking substrate from #11360.
  • Merging or pushing any active feature branch.

Avoided Traps / Gold Standards Rejected

  • Trap: tell operators to always use --no-verify. Rejected. The auto-sync writer is code-owned and should carry its own narrow bypass instead of outsourcing correctness to humans.
  • Trap: delete the hook because it now causes friction. Rejected. The original contamination class is real (#11115/#11133); the fix is to distinguish sanctioned automation from accidental staging.
  • Trap: trust branch name alone. Rejected. The reported failure proves branch-only diagnostics are insufficient in multi-checkout / multi-agent sessions; root identity must be part of the contract.

Related

  • #11115 — original open safeguard ticket for chore-sync contamination.
  • #11133 — empirical stale-branch + chore-sync contamination cluster.
  • #11148 — prior dev-branch-only sync_all framing.
  • #11360 — broader syncer/data-substrate cleanup.

Origin Session ID: 8591bc48-0ddc-48bf-aa47-58e53ea81a57

Retrieval Hint: query_raw_memories(query="Sync-data leakage detected Branch codex/11578-unit-setup-main-localstorage Claude repo root check-chore-sync SyncService git commit no-verify root mismatch")

tobiu referenced in commit 39690b7 - "feat(mcp): sync_all strict root validation and auto-commit bypass (#11580) (#11581)" on May 18, 2026, 1:54 PM
tobiu closed this issue on May 18, 2026, 1:54 PM