857a0064f714a61192a9b1f3c701e005863890f3
28 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
857a0064f7 |
fix(git-sync): make reconcile import truly idempotent — stop runaway whole-body duplication
The live Yjs document materializes the editor-schema default `indent: 0` on
every paragraph/heading (and on the paragraph inside every list item, callout,
and table cell), but a body re-imported from git — parsed from clean markdown —
carries no indent attribute. So every live block's merge key differed from the
same block coming back from git: the three-way merge could anchor on nothing,
and the trailing unit that git's export already contained (but the merge could
not match against the byte-identical live tail) was re-appended on every
reconcile cycle. Each grown export then diverged from the last-pushed base by one
more unit — a self-sustaining, unbounded whole-body duplication loop with no
client connected.
The prior fix (
|
||
|
|
daf6c9ea16 |
fix(git-sync): propagate remote custom-event handler errors instead of 30s timeout
When a git-sync body write (gitSyncWriteBody) is routed to the collab instance
that owns the doc, the handler runs remotely inside handleRedisMessage and CAN
throw (markdown->ProseMirror transform). Previously the throw was uncaught: the
customEventComplete reply was never published, so the origin's writePageBody
promise only rejected after customEventTTL (~30s) as a generic 'TIMEOUT', and an
unhandledRejection escaped the async messageBuffer listener on the owning
instance.
Now the owner wraps handleEventLocally in try/catch and, on throw, publishes a
customEventComplete carrying an `error` field on the same correlation channel.
The origin's pendingReplies holds {resolve, reject} and rejects promptly with the
real Error. The TTL TIMEOUT remains as the fallback for a genuinely lost reply.
The no-throw and local (same-instance) paths are unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
9e69d917ee |
fix(git-sync): converge git-ingest with open editor sessions — stop silent revert/data-loss on live pages
A git push to a page with an OPEN editor was silently reverted: the git commit landed and the DB body updated, but the page in the browser stayed on the old content and the editor's next autosave overwrote the git change. Root cause (distributed, not in the merge): writeBody applied the body merge via collabGateway.openDirectConnection on whichever instance/process runs git-sync (the api/worker). When an editor is connected to a DIFFERENT collab instance/process, that opens a SEPARATE, detached Y.Doc. The merge landed in the detached doc + DB, but the live editor's Y.Doc never received the Yjs update; its debounced autosave then persisted its STALE state over the DB, reverting the git change (and, for concurrent edits to different paragraphs, losing the git side). In one process the bug is invisible because the direct connection already shares the editor's doc. Fix: route the body write through the existing custom-event channel (the same mechanism comment-marks and updatePageContent use) so the merge runs on the instance that OWNS the live doc. Its update is then broadcast to every connection (Document.handleUpdate) and the editor's CRDT converges on the merged result. New CollaborationGateway.writePageBody dispatches to a new gitSyncWriteBody handler (builds incoming/base docs before opening the connection — crash-safe — then 3-way/2-way merges into the live fragment); without redis it runs locally on the single (owning) instance. writeBody now just forwards the converted ProseMirror bodies + service userId. Evidence: - git-ingest-convergence.spec.ts: deterministic two-Y.Doc repro. PATH B (undelivered update) asserts the LOSS (the bug); PATH A (update delivered, as the owner-routed write does) asserts the git change SURVIVES and that concurrent edits to different paragraphs both survive. - collaboration.handler.git-sync.spec.ts: exercises the real gitSyncWriteBody against a shared doc wired to a connected "editor" doc (models the owning-instance broadcast) — editor converges, concurrent edit preserved, crash-safe on transform failure. - gitmost-datasource.service.spec.ts: writeBody now routes via writePageBody (RED before this change — it called openDirectConnection). Honest scope: the failure is cross-instance; full multi-instance convergence needs a live Hocuspocus + redis and is not provable in a unit test, so the convergence invariant is captured at the Yjs update-exchange level. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
2594828758 |
fix(git-sync): idempotent first-block reconciliation — stop start-of-doc content duplicating every sync cycle
The block-level body merge keyed each block by its full attribute set, including the per-block UniqueID the editor stamps on every heading/paragraph. A body arriving from git is parsed from clean markdown and carries no block ids, so a live block (id present) never matched the same block coming from git (no id). The three-way merge's LCS could not anchor on it, and an incoming block with no matching anchor — content inserted at the TOP of the page — was re-added on every push/pull cycle: a non-convergent, unbounded duplication loop. Exclude the volatile 'id' attribute from the block comparison key (serializeXmlNode) so blocks compare by content across the git round-trip. The merge keeps the live block INSTANCE (and its id, and any in-flight edit) for an anchor — picks are by index, not key — so identity is preserved while reconciliation becomes idempotent. Mirrors canonicalize.ts, which already strips the regenerated block id from the round-trip idempotency comparison. Adds a RED-before-fix repro modelling the live-id vs git-no-id asymmetry and asserting no block growth across cycles. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
e777ebcf4f |
feat(git-sync): remove the per-cycle delete cap; deletes apply + are logged every cycle
The delete cap (GIT_SYNC_MAX_DELETES_PER_CYCLE, default 5) was a defense-in-depth guard that SUPPRESSED a cycle's deletions when the planned count exceeded the limit. In practice it was a crutch over engine correctness that also blocked legitimate deletes: deleting a folder with many child pages is a normal action, and git-sync deletes are SOFT (Trash, reversible), so a blocking limit has little upside and real downside. There is also no user-facing surface to "confirm" a large delete from a background sync — the only channel is the operator log. So: drop the cap entirely. Deletes apply unconditionally; every cycle already logs its full push plan, per-action `delete: <pageId>` lines, and completion counts through the engine `log`, so what was deleted (and what was skipped) is always recorded. Engine correctness (the reconcile/layout/round-trip tests) is what prevents phantom deletions — not a blocking cap. Removed: orchestrator `resolveApplyClient` cap hook + `maxDeletes`, `getGitSyncMaxDeletesPerCycle`, the `GIT_SYNC_MAX_DELETES_PER_CYCLE` env/validation/.env.example, and the cap tests. (The engine's generic optional `resolveApplyClient` hook is left as an unused extension point.) server tsc clean, git-sync + environment jest 174. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
5125296bfa |
fix(git-sync): subpages round-trips (was {{SUBPAGES}} literal) + exhaustive all-node round-trip test
subpages exported to the literal `{{SUBPAGES}}`, which has no markdown/HTML
inverse, so on re-import it came back as a plain paragraph holding the visible
text "{{SUBPAGES}}" — the embed rendered as that literal string on the page
after a sync (round-trip data loss, seen live). It now emits the schema-matching
`<div data-type="subpages">` like every other embed node, so the schema's
parseHTML rebuilds the subpages node. Also dropped the leaf-atom content-hole
in the subpages renderHTML.
New committed regression coverage:
- packages/git-sync/test/roundtrip-all-nodes.test.ts — exhaustive serialize ->
deserialize round trip for ALL 40 node/mark types; each asserts the node/mark
survives and no `{{...}}` literal leaks. This is the test that caught subpages.
- §13.1 gate (git-sync-converter-gate.spec.ts): subpages added to the green
corpus (round-trips through the REAL server schema).
- Corrected two PR-authored tests that asserted the old {{SUBPAGES}} loss as
"by design" — they now assert the fixed round trip.
Also folds in review #1679 coverage-gap tests (no prod change): orchestrator
pollTick/enabledSpaces, datasource 3-way merge dispatch, page.repo
last_updated_source provenance SQL.
git-sync vitest 659 (+1 expected-fail), server tsc clean, server specs green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
a40a00d5c5 |
feat(git-sync): per-space toggle for conflict-marker handling on push (#13)
Red-team #13 (conflict markers reaching Docmost) is now a per-space policy exposed as a UI toggle, instead of a hardcoded behavior. New boolean `gitSync.autoMergeConflicts` (default FALSE), mirroring the existing per-space `gitSync.enabled` flag end-to-end (jsonb space settings -> update-space DTO -> space.service -> client types -> space settings form switch): - OFF (default, safe): a page whose committed body still has unresolved git conflict markers is NOT pushed — it is recorded as a per-page push FAILURE ("unresolved conflict markers — resolve in git first"). Recording a failure (not a soft skip) deliberately HOLDS refs/docmost/last-pushed so the conflict commit is never marked pushed and a later pull cannot clobber the user's in-progress resolution; the page retries until the conflict is resolved in git. - ON: the marker lines are stripped and both sides' content is pushed (the prior behavior), so the conflict becomes visible/fixable inside Docmost. The engine Settings carries `autoMergeConflicts`; runPush threads it into the update AND create paths. The orchestrator's buildSettings reads the per-space flag from jsonb (strict opt-in like `enabled`, default false). Tests: redteam-push-cycle #13 rewritten (default -> not pushed + failure + refs held; ON -> strip-and-push); space.service + edit-space-form + orchestrator specs extended. git-sync vitest 618, server jest space+git-sync 163, client edit-space-form 11, server/client tsc clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
81c0226be7 |
docs(git-sync): document GIT_SYNC_BACKEND_TIMEOUT_MS, drop dead consts, fix dangling plan refs
Address the non-red-team documentation/cleanup items from review #1679: - Document the GIT_SYNC_BACKEND_TIMEOUT_MS watchdog (git http-backend) in .env.example and add it to the environment validation schema — it was used (getGitSyncBackendTimeoutMs, default 120000) but undocumented/unvalidated. - Remove the dead GIT_SYNC_DEBOUNCE_MS_DEFAULT / GIT_SYNC_POLL_INTERVAL_MS_DEFAULT exports (never imported; environment.service is the single source of defaults). - Redirect the dangling `plan §X.Y` comment references to issue #194 (the git-sync spec moved there when docs/git-sync-plan.md was deleted by this PR). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
d5079aa1d8 |
fix(git-sync): red-team hardening — 12 confirmed sync-breaking bugs + regression tests
A 10-agent red-team pass on the two-way Docmost<->git sync surfaced 16 ranked findings (9 others triaged out as already-defended). Wrote a reproduction test per finding (each asserts the CORRECT behavior, so it fails on the bug), then fixed the production code so every repro goes green. All confirmed bugs: Round-trip data loss (markdown-converter.ts + docmost-schema.ts mirror): - #1 editor-ext node types silently dropped on export — ported the 8 missing canon nodes (footnoteReference/footnotesList/footnoteDefinition, htmlEmbed, status, pageEmbed, transclusionSource/Reference) into the git-sync schema mirror and added converter cases that emit their schema-matching HTML instead of flattening unknown nodes to '' (this was the critical data-loss flagged in review #1679: footnotes/htmlEmbed lost on sync). Snapshot surface updated. - #2 top-level image lost width/height/align/attachmentId — now emits an HTML <img> (like video/diagrams) when it carries layout attrs; bare images stay . Image node parses width/height as strings so they re-import. - #3 code block containing a ``` fence corrupted on round-trip — outer fence is now widened to (longest-inner-backtick-run + 1). - #16 deep nesting threw RangeError (page never synced) — added a depth guard (MAX_NODE_DEPTH=400) so the converter never overflows the stack. Push/layout/cycle (engine): - #4 disambiguation ' ~slugId' suffix corrupted Docmost titles + order-dependent layout — deterministic, order-independent sibling disambiguation; suffix is stripped from a path-derived title ONLY when the new name is exactly the old title plus the suffix (never a genuine retitle ending in ' ~token'). - #6 retry-adopt by (parent,title) clobbered the wrong duplicate-title sibling — ambiguous (parent,title) is no longer adopted (falls back to fresh create). - #12 a new child under a new parent was created at ROOT — creates are ordered parent-before-child with an in-memory created-id map for parent resolution. - #13 git conflict markers could reach Docmost — bodies are scanned and the marker lines stripped (a '=======' line is only treated as a conflict separator inside a <<<<<<< ... >>>>>>> block, so setext headings are safe). - #15 a divergent `docmost` mirror was escalated by runPush but dropped by runCycle — RunCycleResult now forwards divergentDocmost to the orchestrator. Server (merge / lock / provenance): - #9 3-way merge lost a human's block edit when git inserted an adjacent block — finer-grained diff3 region merge (via lcs) preserves non-overlapping human edits; genuine same-block conflicts still resolve git-wins. - #10 single-writer race — module-static liveLocks closes the same-process TOCTOU window, and a heartbeat refresh that cannot confirm the lock now aborts the cycle at its next write checkpoint (cooperative AbortSignal threaded through runCycle). Cross-process fencing tokens remain a follow-up. - #14 sticky-agent provenance overrode an explicit actor='git-sync' write, blinding the listener loop-guard — resolveSource now lets an explicit actor win over the sticky-agent fallback (explicit agent still wins). Verified: git-sync vitest 617 pass (+1 expected-fail), server unit jest 1541 pass, server tsc clean. A review pass over the fixes caught and corrected a title-suffix over-strip, an inert abort signal, a document-wide conflict-marker strip, and two leaf-atom content-holes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
28d2560dfd |
fix(git-sync): address PR #119 review (#1571)
Resolve the code-review findings from comment #1571 on PR #119. Engine (packages/git-sync): - Idempotent CREATE on retry: before createPage, look the page up in the live Docmost tree by (parentPageId, title) and ADOPT it instead of duplicating when a prior cycle created it but failed to persist the pageId back to disk. Only trust a COMPLETE tree for the lookup; fall back to createPage otherwise. Covered by new tests incl. a complete=false regression-lock. - Route applyPullActions diagnostics through an injected logger instead of bare console (thread log from the cycle). - Add a timeout to the git execFile chokepoint (runRaw) so a hung git subprocess cannot wedge a sync cycle. - Translate remaining Russian code comments to English. - Remove dead standalone-CLI code (parseArgs/PushParsedArgs, parseSettings/envSchema, loadSettingsOrExit + config-errors.ts) and the matching index exports/specs; keep the Settings type. - Fix the dangling docs link in package.json. - Add a schema-surface snapshot guard so any drift in the vendored document schema is a loud, must-review CI failure (+ provenance header). Server (apps/server): - Add a configurable watchdog timeout to the spawned git http-backend so a stalled push cannot hold the per-space lock forever (GIT_SYNC_BACKEND_TIMEOUT_MS). - Close the in-process TOCTOU window in SpaceLockService.withSpaceLock by reserving the slot synchronously before acquire. - Add tests: removePage git-sync provenance (both branches), ensureServable force-push-protection git configs, and the phase-B+ datasource methods. Docs / build: - AGENTS.md: list git-sync as the fifth workspace package and note the three schema mirrors; fix the dangling git-sync-plan.md backlog link. - pnpm-lock.yaml: add the missing @docmost/git-sync workspace link so pnpm install --frozen-lockfile (CI default) succeeds. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
5da12e89f9 |
refactor(git-sync): internalize the engine — first-class ESM, no vendoring bridge (#119 review)
Closes the architecture item from the #119 review: drop the "vendored from docmost-sync" framing and the CJS↔ESM `Function('import()')` bridge so the engine is a normal first-class gitmost package. Part 1 — vendoring markers removed (prose only, zero behavior change): reworded "VENDORED into gitmost" / "vendored from docmost-sync" / "Engine LOGIC is byte-identical" / "it's a port" comments across the engine. Behavior-bearing strings are untouched: BOT_AUTHOR_NAME/EMAIL and the `Docmost-Sync-Source:` provenance trailers (changing them would break git authorship + the loop-guard). Part 2 — the package is now ESM (matching the sibling @docmost/mcp): `type: module`, tsconfig Node16, `.js` extensions on relative imports, and a static `import { marked }` replacing the `new Function('return import(...)')` / `loadMarked` hack — the bridge is GONE from the package. The CommonJS NestJS server loads the now-ESM engine via a new `git-sync.loader.ts` that mirrors the existing `docmost-client.loader.ts` mcp loader exactly (Function-indirected dynamic import + cached promise + retry-on-reject). The 4 server consumers (orchestrator/datasource/vault-registry/git-http-backend) call `await loadGitSync()` for value exports; types stay `import type` (erased). The converter-gate spec — which needs the real converter — loads the package's TS source via a jest moduleNameMapper + isolatedModules (documented in that spec); the other git-sync specs mock the loader. Verified: engine builds pure ESM (no Function/require leftover), vitest 614, editor-ext build, server + client tsc, full server jest 1397/0. Live stand smoke-test: server starts clean on the ESM engine (no ERR_REQUIRE_ESM), a real sync cycle runs through the loader, and the basic e2e suite is 12/12 (clone via git-http-backend, push, pull, delete, 3-way merge — all through the new loader). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
71375e25ee |
chore(git-sync): drop now-unused dirname import (PR #119 review)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
dc7a0ec9f5 |
refactor(git-sync): move the PULL->PUSH cycle into the engine as runCycle (PR #119 review, arch #1)
The reconcile choreography (ensureRepo -> merge-check -> ensureBranch ->
checkout('docmost') -> pull -> push) was hand-rolled in the app orchestrator's
driveCycle, duplicating an order the vendored engine owns and could drift from on
upgrade — the failure mode is data clobber. Lift it into @docmost/git-sync as a
single entry point, `runCycle(deps)`. The orchestrator now calls runCycle and
keeps only the lock (its caller) and the gitmost-specific delete-cap POLICY,
injected as the `resolveApplyClient` hook (the engine does the dry-run, hands the
hook the planned delete count — Infinity if planning failed — and uses whatever
client it returns for the apply). driveCycle drops from ~150 lines to ~30.
Tests:
- engine test/cycle.test.ts: composition (merge-in-progress short-circuit;
ensureRepo->ensureBranch->checkout staging order before the pull; the cap hook
is consulted with the planned count; no dry-run when no hook).
- engine test/cycle-roundtrip.test.ts: runCycle against a REAL VaultGit in a temp
repo with a faked Docmost client — a git-originated CREATE flows pull->push and
the assigned pageId is written back; an unresolved merge short-circuits before
any client call.
- orchestrator spec rewired to mock runCycle and assert the wiring + the
resolveApplyClient cap policy (the engine-internal cycle-order/merge tests moved
to the engine).
Validated end to end on a live stand (real Postgres/Redis + server): a git clone
-> edit -> push over the /git remote round-trips the change into the Docmost page
through the refactored cycle.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
085a30575f |
test(git-sync): cover ingestExternalPush in the orchestrator spec (PR #119 review)
Closes the test-coverage warning that the smart-HTTP push ingest path was unexercised. Adds 5 cases: receive-pack streams BEFORE the Docmost cycle; a held lock throws GitSyncLockHeldError and runs neither the receive-pack nor the cycle; a post-push cycle error is swallowed (the push is durable, poll retries) while the lock is still released; a missing service user runs the receive-pack but skips the immediate cycle; and a globally-disabled git-sync refuses without touching the lock. (The 503/Retry-After mapping in git-http.service is the sibling warning; its spec is in the repo's pre-existing set of jest suites that can't load locally via the react-dom/tiptap transform chain, so that case is left for CI.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
95bc9fe98d |
refactor(git-sync): extract SpaceLockService from the orchestrator (PR #119 review, arch #2)
The per-space single-writer lock — Redis CAS leader lock (SET NX PX, DEL-CAS and PEXPIRE-CAS Lua), the in-process mutex, the per-process instanceId and the heartbeat — lived inline in GitSyncOrchestrator. Extract it into a dedicated @Injectable() SpaceLockService exposing one narrow surface, withSpaceLock(spaceId, fn), so the lock is the orchestrator's only Redis-lock touch-point and is testable in isolation. The orchestrator now injects SpaceLockService and both consumers (runOnce, ingestExternalPush) go through spaceLock.withSpaceLock — behavior unchanged (same sentinel returns, same 503-on-lock-held contract). Orchestrator drops 591→472 lines. Adds space-lock.service.spec.ts asserting the lock SEMANTICS against a fake Redis (the test-coverage warning from the review): the SET NX/PX args, the DEL-CAS and PEXPIRE-CAS Lua + ARGV[1]=instanceId, plus the lock-held / in-progress / throw- still-releases paths. The orchestrator spec is unchanged in count and stays green (it now builds the real SpaceLockService over its mock Redis). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
cca0bfe306 |
docs(git-sync): remove dangling references to the deleted git-sync-plan doc (PR #119 review)
The implementation spec docs/git-sync-plan.md was removed as completed, but ~44 code comments still cited it as "plan §N". Strip those citations (comments only), keeping each comment grammatical. The vendored engine's own "SPEC §N" references point at a different, still-present spec and are left untouched. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
fb357cd52e |
refactor(git-sync): extract shared buildLcsTable for the two block diffs (PR #119 review)
The two-way block diff (yjs-body-merge.diffBlocks) and the three-way merge planner (three-way-merge.lcsPairs) built the identical backward-filled LCS DP table inline. Extract it to lcs.ts (buildLcsTable); each caller keeps its own traceback. No behavior change — merge specs unchanged and green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
177d8a31d4 |
fix(git-sync): hold refs on suppressed deletes + stamp delete/restore provenance (PR #119 review)
Two stability warnings from the #119 review: 1. delete-cap no longer drops deletions forever. When planned deletes exceed GIT_SYNC_MAX_DELETES_PER_CYCLE the apply client's deletePage now THROWS instead of resolving to a no-op. A throw is recorded by the engine as a per-page failure, so `refs/docmost/last-pushed` is NOT advanced past the commit that dropped the files — the next cycle re-diffs from the un-advanced ref and re-plans the same deletes (a transient over-cap is retried, not silently dropped and then recreated by the next pull). Previously a resolving no-op let the engine count `deleted++` with no failure, advance the ref, and never replay the deletions. 2. git-sync soft-delete and restore now stamp provenance. deletePage routes GIT_SYNC_PROVENANCE through pageService.removePage, and restorePage stamps lastUpdatedSource='git-sync' on the restore update — so the page-change listener's loop-guard (skip when lastUpdatedSource==='git-sync') recognizes both as its own writes instead of scheduling a wasted echo cycle. Done via a backward-compatible optional `lastUpdatedSource` param on pageRepo.removePage/restorePage (omitted for ordinary user deletes/restores). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
8fa32e8438 |
docs(git-sync): document GIT_SYNC_* env vars; fix stale/non-English comments (PR #119 review)
Addresses the documentation/convention warnings from the #119 review: - .env.example: add the GIT-SYNC block (9 GIT_SYNC_* vars with defaults), noting GIT_SYNC_SERVICE_USER_ID is required when sync is enabled. - yjs-body-merge.ts: translate the Russian review note in the docstring to English (comments-only-in-English rule). - persistence.extension.ts: correct the stale "git-sync writes are full-body replaces" rationale — a git-sync write is now a block-level merge into the live doc, which is why it is debounced like a human edit rather than snapshotted. - history-item.tsx: the GitSyncBadge version is created on the PUSH path (writing the git body back into the doc), not by the pull — fix the comment. - edit-space-form.tsx: log the raw error in the git-sync toggle catch instead of swallowing it (AGENTS.md). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
fa89cba023 |
feat(git-sync): three-way body merge using the last-synced base (no edit loss)
Upgrades the 2-way body merge to a real diff3 three-way merge (review #5), so a block ONLY the human changed is KEPT when git changed a DIFFERENT block — the 2-way merge would revert it to git's stale version. Engine: the push update loop reads the last-synced pre-image (`git.showFileAtRef(refs/docmost/last-pushed, path)`) and passes it as the optional `baseMarkdown` to `client.importPageMarkdown` (the common ancestor). Server: gitmost-datasource converts base+incoming, and writeBody runs a block- level diff3 (new three-way-merge.ts `diff3Plan`): live-only change -> keep live, git-only change -> take git, both-changed -> git wins (conflict policy), inserts/ deletes from either side preserved. Without a base (createPage) it falls back to the 2-way merge. Crash-safety unchanged (docs built before the connection opens). Tests: three-way-merge.spec.ts (14 — every diff3 case incl. the cross-block preservation and conflict policy), yjs-body-merge 3-way (real Y.Docs: human's block instance preserved while git's block is applied), plus an engine test that the base is forwarded from showFileAtRef. Existing push assertions updated for the new base arg. git-sync 589 pass; server merge/datasource/gate 62 pass; typecheck clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
3386bf2865 |
fix(git-sync): merge git body into the live doc block-by-block (no clobber)
Supersedes the active-session "defer" guard with a real merge (review #5 — "запись делать через мерж", not skip-while-editing). writeBody no longer does delete-all + re-insert (which discarded a concurrent editor's in-flight changes on every sync). It now diffs the live body against the incoming git body at TOP-LEVEL BLOCK granularity (LCS over a canonical structural serialization) and applies only the minimal inserts/deletes: - a block a human is editing is left UNTOUCHED when git changed a DIFFERENT block; - an unchanged resync is a complete 0-op write; - Yjs CRDT-merges the minimal ops with concurrent edits. New yjs-body-merge.ts (mergeXmlFragments + cloneXmlNode + diffBlocks) is pure-Yjs and unit-tested with real Y.Docs (8 tests): identical->0 ops, edit-one-block keeps the other block instances, append/delete keep neighbours, marks survive the cross-doc clone. Crash-safety kept: the incoming doc is built before the connection opens, so a transform failure can't empty the body. Removed: the ActiveEditSessionError defer path and the now-unused CollaborationGateway.getActiveEditorCount. Honest limitation: this is a 2-way merge — for a block BOTH sides changed since the last sync, git wins (no common ancestor to decide). A full 3-way merge would need the last-synced base plumbed from the engine; the dominant cases (unchanged resync, edits to different blocks) are now lossless. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
181a8330f3 |
fix(git-sync): don't clobber pages with a live editing session; crash-safe body write
Review finding #5: the git -> page body write (writeBody) did a full-body replace
(delete-all + re-insert) on the shared Yjs doc. Applied while a human is editing
the page, it discarded their in-flight changes; and TiptapTransformer.toYdoc ran
AFTER the fragment was cleared, so a conversion failure could leave the page with
an empty body.
Fixes:
- Active-session guard: CollaborationGateway.getActiveEditorCount(documentName)
reports live human (websocket) editor sessions for a doc, excluding server-side
direct connections. writeBody now throws ActiveEditSessionError when an editor
is connected. The engine's push loop already isolates each importPageMarkdown in
try/catch and does not advance the loop-guard on failure, so the write is simply
retried on the next poll once the editor disconnects — never a clobber.
- Crash-safe conversion: build the replacement Yjs update BEFORE opening the
connection / clearing the fragment, so a transform failure can never leave the
body empty.
Also updates the server-side converter gate spec to the corrected round-trip
shape: the block-image hoist no longer leaves a leading empty paragraph (the
git-sync converter fix in
|
||
|
|
04032ae677 |
feat(git-sync): serve spaces over smart-HTTP (gitmost as a two-way git host)
Expose each git-sync-enabled space as a clonable/pushable git repo over HTTP, so `git clone https://<user>:<pass>@<host>/git/<spaceId>.git` works and external pushes flow back into Docmost pages — gitmost itself acts as the git host (no external GitHub/Gitea, no SSH). Transport: shell out to `git http-backend` (CGI; git is already in the runtime image) which implements the full smart-HTTP protocol (info/refs, upload-pack, receive-pack, protocol v2). A raw Fastify route `/git/*` (mounted at the root, outside the `/api` prefix) bridges the request/response to the CGI; passthrough content-type parsers for the git media types stream the raw body to stdin. Reuse the existing engine: clients push the vault's `main` branch, whose commits beyond `refs/docmost/last-pushed` the engine already reconciles into Docmost. - http/git-http.service.ts — auth (HTTP Basic -> AuthService.verifyUserCredentials), self-resolved workspace (DomainMiddleware does not run for this raw route), per-space gating (global + per-space gitSync flags, 404 hides existence), CASL authz (Read=fetch, Manage=push), dispatch. - http/git-http-backend.service.ts — spawn `git http-backend`, binary-safe CGI response parsing (Status/headers/body), stream to the socket. - http/git-http.helpers.ts — pure path parse, service->kind mapping, gate decision (unit-tested); rejects literal and percent-encoded path traversal. - orchestrator: extract reusable withSpaceLock (CAS-guarded lock heartbeat so a long push cannot let the lock expire mid-cycle) and add ingestExternalPush (receive-pack + Docmost cycle under one lock; 503 on contention). - vault-registry: ensureServable() — ensureRepo + idempotent receive.denyCurrentBranch =updateInstead / denyNonFastForwards / http.receivepack / http.uploadpack. - env: GIT_SYNC_HTTP_ENABLED (defaults to GIT_SYNC_ENABLED) + validation. - main.ts: register the /git/* route and the git content-type parsers. Tests: pure helpers, CGI parsing, and the GitHttpService handler (auth/gate/authz + workspace resolution). Server tsc + git-sync/env suites green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
d9d1d54aaa |
test(git-sync): add reviewer-requested coverage across engine, server, client
Implements the test cases called out in the PR #119 review threads (code-review, test-strategy report, red-team) — TESTS ONLY, no production code changes. packages/git-sync (vitest): - lib converter/markdown gaps: pageBreak data-loss (it.fails repro), subpages lossy round-trip, nested/fenced callouts, ol->taskList bridge, column.width number<->string drift, empty details. - engine units: parentFolderFile, planReconciliation swap/chained move, buildVaultLayout last-resort-by-id, firstDivergence, applyPushActions / applyPullActions failure isolation. - real temp-git integration: diffNameStatus -z rename+add/modify alignment, copy-line behavior, per-invocation committer identity (no leak into repo/global config). - ENFORCED type-level GitSyncClient contract via vitest typecheck over a *.test-d.ts file (tsconfig.vitest.json; build tsconfig untouched). apps/server (jest): - orchestrator: delete-cap neutralization + fail-safe, Redis lock / mutex skip ladder + release-on-throw, merge guard, pull/push order, remote template substitution, poll lifecycle. - page-change listener: loop-guard, debounce coalescing, id resolution, error swallowing. - vault registry, controller authz (trigger + status), env validation/getters, page.service git-sync provenance stamping, persistence precedence (agent > git-sync > user) + no boundary snapshot, space.service audit-delta, space.repo jsonb-merge, converter-gate corpus extension (mention/math/details/marks). apps/client (vitest + testing-library): - history-item git-sync badge: render gating + non-clickable. - edit-space-form toggle: initial state, optimistic payload, rollback on error, disabled states. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
593f181bbc |
fix(git-sync): address review — configurable poll, always-on loop-guard, cleanup
Comprehensive-review follow-ups (APPROVE WITH SUGGESTIONS; no critical issues):
- poll interval is now actually configurable: replaced the hardcoded
@Interval('git-sync-poll', 15000) with a dynamic SchedulerRegistry interval
registered in onModuleInit from getGitSyncPollIntervalMs() (cleared in
onModuleDestroy); /status and the real cadence now share one config source.
Boots logging 'poll interval registered (Nms)'.
- loop-guard now ALWAYS applies: the lastUpdatedSource==='git-sync' skip was
nested inside the !spaceId/!workspaceId branch, so structural self-writes
(CREATE/MOVE/RESTORE/SOFT_DELETE, which carry spaceId+workspaceId) bypassed it
and re-triggered cycles. Fetch the page row once, guard unconditionally, then
resolve space/workspace.
- remove the dead PAGE_CONTENT_UPDATED subscription (it's a BullMQ job, never an
EventEmitter event; body edits arrive via PAGE_UPDATED).
- fix the stale datasource comment (PageService DOES stamp 'git-sync' now).
- env getters: parseInt radix 10 + NaN/<=0 fallback for poll/debounce (+ max
deletes), with 6 new environment.service.spec tests.
tsc clean; jest 723 pass; live cycle re-verified post-refactor (ran, push
applied, unflagged 92-page space untouched).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
8373360a67 |
fix(git-sync): branch choreography + strict scoping + delete cap (Phase B hardening)
Fixes found by the live pull/push e2e: - CRITICAL: driveCycle never checked out the 'docmost' branch before applyPullActions, so Docmost content was written straight onto 'main', clobbering local file edits before push could diff them. Now checkout 'docmost' before pull (applyPullActions commits there then checks out main + merges) — mirrors the engine's pull main(). Round-trip now works both ways. - add an unresolved-merge guard (SPEC §9): skip the cycle if the vault is mid-merge instead of failing on checkout. - SAFETY: enabledSpaces() is now STRICT opt-in — only spaces with settings.gitSync.enabled===true; removed the all-spaces fallback that synced every space (incl. a 92-page one) the moment GIT_SYNC_ENABLED flipped. - SAFETY: per-cycle delete cap (GIT_SYNC_MAX_DELETES_PER_CYCLE, default 5): dry-run the push, and if planned deletes exceed the cap, run the apply with deletePage neutralized — phantom absence-deletions from a non-convergent vault can't soft-delete real pages. Fails safe if the dry-run throws. - fix manual trigger: TriggerGitSyncDto.spaceId needs @IsUUID or the global whitelist ValidationPipe strips it (arrived undefined -> vault 'undefined'). Live-verified on an isolated flagged space: push (vault file edit -> Docmost content, stamped lastUpdatedSource='git-sync') and pull (Docmost rename -> vault file + meta) both work; an unrelated 92-page space stayed untouched throughout. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
e2493cafa9 |
feat(git-sync): GitSyncModule orchestrator + config + listener (Phase A.4b/B)
Control plane wiring (plan §5-§11): - PageService create/update/movePage now honor provenance actor 'git-sync' (stamp lastUpdatedSource='git-sync'), closing the A.4a gap. - EnvironmentService: GIT_SYNC_ENABLED / DATA_DIR / REMOTE_TEMPLATE / POLL_INTERVAL_MS / DEBOUNCE_MS / SERVICE_USER_ID (required-if-enabled) / SSH_KEY_PATH + validation. - VaultRegistryService: per-space vault path + cached VaultGit. - GitSyncOrchestrator: per-space Redis leader-lock (SET NX PX + CAS-Lua release, randomUUID instanceId) + in-process mutex; runOnce drives the vendored engine PULL (readExisting->computePullActions->applyPullActions) then PUSH (runPush) with the bound native GitSyncClient + VaultGit; @Interval poll-safety gated on GIT_SYNC_ENABLED; imports plain ScheduleModule (TelemetryModule owns forRoot). - PageChangeListener: @OnEvent PAGE_* -> per-space debounce -> runOnce, with a best-effort lastUpdatedSource==='git-sync' loop-guard. - GitSyncController: admin POST /api/git-sync/trigger + GET /status (ops/e2e). - GitSyncModule registered in app.module. Enabled-space enumeration uses settings.gitSync.enabled, falling back to all live spaces until Phase C writes the flag (master gate = GIT_SYNC_ENABLED). tsc clean; 713 tests/71 suites pass; dev server hot-reloaded the module (route live, DI graph boots). Live pull/push round-trip verified next. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
5a4d9f84d7 |
feat(git-sync): native GitmostDataSource + 'git-sync' provenance (Phase A.4a)
Native data plane for git-sync (plan §3, §8.1):
- provenance: widen actor to 'user'|'agent'|'git-sync' (jwt-payload,
auth-provenance decorator); PersistenceExtension resolves lastUpdatedSource
with precedence agent > git-sync > user, debounced history (like a human edit,
not the agent's immediate snapshot).
- GitmostDataSourceService implements @docmost/git-sync's GitSyncClient natively:
reads via PageRepo/SpaceRepo (listSpaceTree complete:true, getPageJson), writes
via PageService (create/removePage soft-delete/movePage with computed fractional
position/update-rename/restore) + the writeBody linchpin through collab
openDirectConnection('page.'+id, {actor:'git-sync'}) mirroring
collaboration.handler withYdocConnection 'replace'. bind({workspaceId,userId})
returns the context-bound client for the orchestrator.
- 10 unit/contract tests (mapping + soft-delete + move-position), tsc clean.
Known gap (closed in A.4b): PageService.create/update/movePage only branch on
actor==='agent'; git-sync provenance is already passed through so the row source
marker propagates once PageService honors 'git-sync'. Module/orchestrator/config
come next.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|