Commit Graph

4 Commits

Author SHA1 Message Date
claude code agent 227
3d7f434b0c fix(git-sync): red-team hardening — 12 confirmed sync-breaking bugs + regression tests
A 10-agent red-team pass on the two-way Docmost<->git sync surfaced 16 ranked
findings (9 others triaged out as already-defended). Wrote a reproduction test
per finding (each asserts the CORRECT behavior, so it fails on the bug), then
fixed the production code so every repro goes green. All confirmed bugs:

Round-trip data loss (markdown-converter.ts + docmost-schema.ts mirror):
- #1 editor-ext node types silently dropped on export — ported the 8 missing
  canon nodes (footnoteReference/footnotesList/footnoteDefinition, htmlEmbed,
  status, pageEmbed, transclusionSource/Reference) into the git-sync schema
  mirror and added converter cases that emit their schema-matching HTML instead
  of flattening unknown nodes to '' (this was the critical data-loss flagged in
  review #1679: footnotes/htmlEmbed lost on sync). Snapshot surface updated.
- #2 top-level image lost width/height/align/attachmentId — now emits an HTML
  <img> (like video/diagrams) when it carries layout attrs; bare images stay
  ![](src). Image node parses width/height as strings so they re-import.
- #3 code block containing a ``` fence corrupted on round-trip — outer fence is
  now widened to (longest-inner-backtick-run + 1).
- #16 deep nesting threw RangeError (page never synced) — added a depth guard
  (MAX_NODE_DEPTH=400) so the converter never overflows the stack.

Push/layout/cycle (engine):
- #4 disambiguation ' ~slugId' suffix corrupted Docmost titles + order-dependent
  layout — deterministic, order-independent sibling disambiguation; suffix is
  stripped from a path-derived title ONLY when the new name is exactly the old
  title plus the suffix (never a genuine retitle ending in ' ~token').
- #6 retry-adopt by (parent,title) clobbered the wrong duplicate-title sibling —
  ambiguous (parent,title) is no longer adopted (falls back to fresh create).
- #12 a new child under a new parent was created at ROOT — creates are ordered
  parent-before-child with an in-memory created-id map for parent resolution.
- #13 git conflict markers could reach Docmost — bodies are scanned and the
  marker lines stripped (a '=======' line is only treated as a conflict
  separator inside a <<<<<<< ... >>>>>>> block, so setext headings are safe).
- #15 a divergent `docmost` mirror was escalated by runPush but dropped by
  runCycle — RunCycleResult now forwards divergentDocmost to the orchestrator.

Server (merge / lock / provenance):
- #9 3-way merge lost a human's block edit when git inserted an adjacent block —
  finer-grained diff3 region merge (via lcs) preserves non-overlapping human
  edits; genuine same-block conflicts still resolve git-wins.
- #10 single-writer race — module-static liveLocks closes the same-process TOCTOU
  window, and a heartbeat refresh that cannot confirm the lock now aborts the
  cycle at its next write checkpoint (cooperative AbortSignal threaded through
  runCycle). Cross-process fencing tokens remain a follow-up.
- #14 sticky-agent provenance overrode an explicit actor='git-sync' write,
  blinding the listener loop-guard — resolveSource now lets an explicit actor
  win over the sticky-agent fallback (explicit agent still wins).

Verified: git-sync vitest 617 pass (+1 expected-fail), server unit jest 1541
pass, server tsc clean. A review pass over the fixes caught and corrected a
title-suffix over-strip, an inert abort signal, a document-wide conflict-marker
strip, and two leaf-atom content-holes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 01:29:02 +03:00
claude_code
4213a12180 fix(git-sync): address PR #119 review (#1571)
Resolve the code-review findings from comment #1571 on PR #119.

Engine (packages/git-sync):
- Idempotent CREATE on retry: before createPage, look the page up in the
  live Docmost tree by (parentPageId, title) and ADOPT it instead of
  duplicating when a prior cycle created it but failed to persist the
  pageId back to disk. Only trust a COMPLETE tree for the lookup; fall
  back to createPage otherwise. Covered by new tests incl. a complete=false
  regression-lock.
- Route applyPullActions diagnostics through an injected logger instead of
  bare console (thread log from the cycle).
- Add a timeout to the git execFile chokepoint (runRaw) so a hung git
  subprocess cannot wedge a sync cycle.
- Translate remaining Russian code comments to English.
- Remove dead standalone-CLI code (parseArgs/PushParsedArgs,
  parseSettings/envSchema, loadSettingsOrExit + config-errors.ts) and the
  matching index exports/specs; keep the Settings type.
- Fix the dangling docs link in package.json.
- Add a schema-surface snapshot guard so any drift in the vendored
  document schema is a loud, must-review CI failure (+ provenance header).

Server (apps/server):
- Add a configurable watchdog timeout to the spawned git http-backend so a
  stalled push cannot hold the per-space lock forever
  (GIT_SYNC_BACKEND_TIMEOUT_MS).
- Close the in-process TOCTOU window in SpaceLockService.withSpaceLock by
  reserving the slot synchronously before acquire.
- Add tests: removePage git-sync provenance (both branches), ensureServable
  force-push-protection git configs, and the phase-B+ datasource methods.

Docs / build:
- AGENTS.md: list git-sync as the fifth workspace package and note the
  three schema mirrors; fix the dangling git-sync-plan.md backlog link.
- pnpm-lock.yaml: add the missing @docmost/git-sync workspace link so
  pnpm install --frozen-lockfile (CI default) succeeds.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 00:18:20 +03:00
claude code agent 227
c76f255b1c refactor(git-sync): internalize the engine — first-class ESM, no vendoring bridge (#119 review)
Closes the architecture item from the #119 review: drop the "vendored from
docmost-sync" framing and the CJS↔ESM `Function('import()')` bridge so the engine
is a normal first-class gitmost package.

Part 1 — vendoring markers removed (prose only, zero behavior change): reworded
"VENDORED into gitmost" / "vendored from docmost-sync" / "Engine LOGIC is
byte-identical" / "it's a port" comments across the engine. Behavior-bearing
strings are untouched: BOT_AUTHOR_NAME/EMAIL and the `Docmost-Sync-Source:`
provenance trailers (changing them would break git authorship + the loop-guard).

Part 2 — the package is now ESM (matching the sibling @docmost/mcp): `type: module`,
tsconfig Node16, `.js` extensions on relative imports, and a static
`import { marked }` replacing the `new Function('return import(...)')` /
`loadMarked` hack — the bridge is GONE from the package. The CommonJS NestJS
server loads the now-ESM engine via a new `git-sync.loader.ts` that mirrors the
existing `docmost-client.loader.ts` mcp loader exactly (Function-indirected
dynamic import + cached promise + retry-on-reject). The 4 server consumers
(orchestrator/datasource/vault-registry/git-http-backend) call `await loadGitSync()`
for value exports; types stay `import type` (erased). The converter-gate spec —
which needs the real converter — loads the package's TS source via a jest
moduleNameMapper + isolatedModules (documented in that spec); the other git-sync
specs mock the loader.

Verified: engine builds pure ESM (no Function/require leftover), vitest 614,
editor-ext build, server + client tsc, full server jest 1397/0. Live stand
smoke-test: server starts clean on the ESM engine (no ERR_REQUIRE_ESM), a real
sync cycle runs through the loader, and the basic e2e suite is 12/12 (clone via
git-http-backend, push, pull, delete, 3-way merge — all through the new loader).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 00:18:20 +03:00
claude code agent 227
b0fc49cf9d refactor(git-sync): move the PULL->PUSH cycle into the engine as runCycle (PR #119 review, arch #1)
The reconcile choreography (ensureRepo -> merge-check -> ensureBranch ->
checkout('docmost') -> pull -> push) was hand-rolled in the app orchestrator's
driveCycle, duplicating an order the vendored engine owns and could drift from on
upgrade — the failure mode is data clobber. Lift it into @docmost/git-sync as a
single entry point, `runCycle(deps)`. The orchestrator now calls runCycle and
keeps only the lock (its caller) and the gitmost-specific delete-cap POLICY,
injected as the `resolveApplyClient` hook (the engine does the dry-run, hands the
hook the planned delete count — Infinity if planning failed — and uses whatever
client it returns for the apply). driveCycle drops from ~150 lines to ~30.

Tests:
- engine test/cycle.test.ts: composition (merge-in-progress short-circuit;
  ensureRepo->ensureBranch->checkout staging order before the pull; the cap hook
  is consulted with the planned count; no dry-run when no hook).
- engine test/cycle-roundtrip.test.ts: runCycle against a REAL VaultGit in a temp
  repo with a faked Docmost client — a git-originated CREATE flows pull->push and
  the assigned pageId is written back; an unresolved merge short-circuits before
  any client call.
- orchestrator spec rewired to mock runCycle and assert the wiring + the
  resolveApplyClient cap policy (the engine-internal cycle-order/merge tests moved
  to the engine).

Validated end to end on a live stand (real Postgres/Redis + server): a git clone
-> edit -> push over the /git remote round-trips the change into the Docmost page
through the refactored cycle.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 00:18:20 +03:00