fix(git-sync): converge git-ingest with open editor sessions — stop silent revert/data-loss on live pages

A git push to a page with an OPEN editor was silently reverted: the git
commit landed and the DB body updated, but the page in the browser stayed
on the old content and the editor's next autosave overwrote the git change.

Root cause (distributed, not in the merge): writeBody applied the body
merge via collabGateway.openDirectConnection on whichever instance/process
runs git-sync (the api/worker). When an editor is connected to a DIFFERENT
collab instance/process, that opens a SEPARATE, detached Y.Doc. The merge
landed in the detached doc + DB, but the live editor's Y.Doc never received
the Yjs update; its debounced autosave then persisted its STALE state over
the DB, reverting the git change (and, for concurrent edits to different
paragraphs, losing the git side). In a single-process deployment the bug
is invisible because the direct connection already shares the editor's doc.
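
The failure mode above can be reproduced with a toy model. This is only a sketch: it uses plain arrays instead of Yjs, and `Instance`, `open`, `autosave`, and `simulate` are hypothetical names that do not exist in the codebase. It shows why a write applied on the non-owning instance is reverted by the editor's later autosave, and why applying the same write on the owning instance is not.

```typescript
// Toy model, NOT Yjs: each instance keeps its own in-memory copy of a page.
type Body = string[]; // one string per block

class Instance {
  private docs = new Map<string, Body>();
  constructor(private db: Map<string, Body>) {}

  // Load (or reuse) this instance's in-memory copy of the page.
  open(pageId: string): Body {
    let doc = this.docs.get(pageId);
    if (!doc) {
      doc = [...(this.db.get(pageId) ?? [])];
      this.docs.set(pageId, doc);
    }
    return doc;
  }

  // Persist whatever this instance currently holds, stale or not.
  autosave(pageId: string): void {
    this.db.set(pageId, [...this.open(pageId)]);
  }
}

function simulate(routeToOwner: boolean): Body {
  const db = new Map<string, Body>([["p1", ["intro", "old text"]]]);
  const collab = new Instance(db); // owns the editor's live doc
  const worker = new Instance(db); // runs git-sync

  collab.open("p1"); // the editor connects to the collab instance

  // git-sync applies the incoming body on the chosen instance.
  const target = routeToOwner ? collab : worker;
  target.open("p1")[1] = "text from git";
  target.autosave("p1");

  // The editor's debounced autosave fires afterwards.
  collab.autosave("p1");
  return db.get("p1") ?? [];
}
```

With `simulate(false)` the worker mutates its own detached copy, so the final autosave writes the stale body back; with `simulate(true)` the owning copy already contains the git change.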

Fix: route the body write through the existing custom-event channel (the
same mechanism comment-marks and updatePageContent use) so the merge runs
on the instance that OWNS the live doc. Its update is then broadcast to
every connection (Document.handleUpdate) and the editor's CRDT converges on
the merged result. The new CollaborationGateway.writePageBody dispatches to
a new gitSyncWriteBody handler, which builds the incoming/base docs before
opening the connection (so a failed transform cannot touch the live body)
and then merges 3-way, or 2-way without a base, into the live fragment.
Without Redis it runs locally on the single (owning) instance. writeBody
now just forwards the converted ProseMirror bodies and the service userId.
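
The 3-way/2-way merge rule can be sketched as follows. This is an illustration under assumptions, not the code in yjs-body-merge: blocks are modelled as `{ id, text }` records matched by a hypothetical `id`, insertions and deletions are ignored, and `mergeBlocks` and `indexById` are made-up names.

```typescript
type Block = { id: string; text: string };

// Index blocks by their (hypothetical) stable id.
function indexById(blocks: Block[]): Map<string, string> {
  const byId = new Map<string, string>();
  for (const b of blocks) byId.set(b.id, b.text);
  return byId;
}

// Merge the incoming git body into the live body.
// With `base`: keep a block only the human changed, take a block git
// changed (conflicts resolve to git). Without `base`: plain 2-way merge.
function mergeBlocks(
  live: Block[],
  git: Block[],
  base?: Block[],
): { merged: Block[]; ops: number } {
  const gitById = indexById(git);
  const baseById = base ? indexById(base) : null;
  let ops = 0;

  const merged = live.map((block) => {
    const incoming = gitById.get(block.id);
    // Block absent from git, or already identical: nothing to apply.
    if (incoming === undefined || incoming === block.text) return block;
    // Git left this block as it was in base, so the difference is the
    // human's edit: keep the live version.
    if (baseById && baseById.get(block.id) === incoming) return block;
    ops++;
    return { id: block.id, text: incoming };
  });

  return { merged, ops };
}
```

An unchanged resync reports `ops === 0`. Dropping `base` makes the same input overwrite the human's block as well, which is why the last-synced version matters.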

Evidence:
- git-ingest-convergence.spec.ts: deterministic two-Y.Doc repro. PATH B
  (undelivered update) asserts the LOSS (the bug); PATH A (update delivered,
  as the owner-routed write does) asserts the git change SURVIVES and that
  concurrent edits to different paragraphs both survive.
- collaboration.handler.git-sync.spec.ts: exercises the real gitSyncWriteBody
  against a shared doc wired to a connected "editor" doc (models the
  owning-instance broadcast) — editor converges, concurrent edit preserved,
  crash-safe on transform failure.
- gitmost-datasource.service.spec.ts: writeBody now routes via writePageBody
  (RED before this change — it called openDirectConnection).
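
The "crash-safe on transform failure" property comes down to ordering: convert first, mutate second. A minimal sketch, assuming a hypothetical `toBlocks` converter that throws on unsupported input (the real handler uses a Yjs transform, which is not reproduced here):

```typescript
// Hypothetical converter: throws when the input is not a document.
function toBlocks(json: unknown): string[] {
  const content = (json as { content?: unknown } | null)?.content;
  if (!Array.isArray(content)) throw new Error("unsupported document");
  return content.map((node) => JSON.stringify(node));
}

// Crash-safe: the conversion runs before the live body is touched, so a
// throw leaves `live` exactly as it was.
function writeBodyCrashSafe(live: string[], incoming: unknown): void {
  const target = toBlocks(incoming);
  live.splice(0, live.length, ...target);
}

// Unsafe ordering, for contrast: the live body is cleared first, so a
// failed conversion leaves the page empty.
function writeBodyUnsafe(live: string[], incoming: unknown): void {
  live.length = 0;
  live.push(...toBlocks(incoming));
}
```

Calling both with a malformed input shows the difference: the crash-safe version rethrows with the page intact, while the unsafe one rethrows after emptying it.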

Honest scope: the failure is cross-instance; full multi-instance convergence
needs a live Hocuspocus + redis and is not provable in a unit test, so the
convergence invariant is captured at the Yjs update-exchange level.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
claude code agent 227
2026-06-26 08:11:59 +03:00
parent 2594828758
commit 9e69d917ee
6 changed files with 522 additions and 137 deletions


@@ -1,5 +1,4 @@
 import { Injectable, Logger, NotFoundException } from '@nestjs/common';
-import { TiptapTransformer } from '@hocuspocus/transformer';
 import { generateJitteredKeyBetween } from 'fractional-indexing-jittered';
 import type {
   GitSyncClient,
@@ -12,8 +11,6 @@ import { InjectKysely } from 'nestjs-kysely';
 import { KyselyDB } from '@docmost/db/types/kysely.types';
 import { PageService } from '../../../core/page/services/page.service';
 import { CollaborationGateway } from '../../../collaboration/collaboration.gateway';
-import { tiptapExtensions } from '../../../collaboration/collaboration.util';
-import { mergeXmlFragments, mergeXmlFragments3Way } from './yjs-body-merge';
 import { AuthProvenanceData } from '../../../common/decorators/auth-provenance.decorator';
 /**
@@ -387,15 +384,27 @@ export class GitmostDataSourceService {
   // --- linchpin: native body write (§3.3) -----------------------------------
   /**
-   * In-process body write — no loopback websocket, no service-user token. Mirrors
-   * the collab handler's 'replace' operation exactly: open a direct connection,
-   * drop the existing fragment, apply the converted doc, then disconnect.
+   * In-process body write — no loopback websocket, no service-user token.
    *
-   * The `{ actor: 'git-sync', user: { id: userId } }` context flows into
+   * Routes the write through `CollaborationGateway.writePageBody`, which applies
+   * the block-level MERGE on the instance that OWNS the live Y.Doc (via the
+   * custom-event channel) rather than opening a direct connection on this
+   * (api/worker) instance. That distinction is load-bearing: when an editor is
+   * connected to a different collab instance/process, a direct connection here
+   * mutates a SEPARATE, detached doc the editor never sees — the editor's next
+   * autosave then silently REVERTS the git change (data loss). Running on the
+   * owning instance broadcasts the merge as a Yjs update so the editor converges
+   * (see CollaborationGateway.writePageBody for the full rationale).
+   *
+   * The merge itself stays a block-level reconcile, not a full-body replace
+   * (review #5): only changed blocks are touched, concurrently-edited blocks are
+   * left untouched, and an unchanged resync is a 0-op write. With a `base` (the
+   * last-synced version) it is a THREE-WAY merge so a block ONLY the human
+   * changed is kept and a block ONLY git changed is taken (conflicts -> git);
+   * without a base (e.g. createPage) it falls back to the 2-way merge. The
+   * `{ actor: 'git-sync', user: { id: userId } }` context flows into
    * PersistenceExtension.onStoreDocument, which persists ydoc+content+textContent,
-   * stamps `lastUpdatedSource = 'git-sync'`, and broadcasts `page.updated`. The
-   * service user (`user.id`) stays the responsible `lastUpdatedById`; the actor
-   * marks provenance.
+   * stamps `lastUpdatedSource = 'git-sync'`, and broadcasts `page.updated`.
    */
   private async writeBody(
     pageId: string,
@@ -404,51 +413,10 @@ export class GitmostDataSourceService {
     baseProsemirrorJson?: unknown,
   ): Promise<void> {
     const documentName = `page.${pageId}`;
-    // Build the incoming (and base) Yjs docs BEFORE opening the connection /
-    // touching the live doc. If a transform throws (a malformed/unsupported doc)
-    // we must NOT have mutated the live body — otherwise a conversion failure
-    // could leave the page empty (review #5 — crash-safe conversion).
-    const targetDoc = TiptapTransformer.toYdoc(
+    await this.collabGateway.writePageBody(documentName, {
       prosemirrorJson,
-      'default',
-      tiptapExtensions,
-    );
-    const baseDoc =
-      baseProsemirrorJson != null
-        ? TiptapTransformer.toYdoc(baseProsemirrorJson, 'default', tiptapExtensions)
-        : null;
-    const conn = await this.collabGateway.openDirectConnection(documentName, {
-      actor: 'git-sync',
-      // PersistenceExtension reads `context.user.id` for lastUpdatedById, so the
-      // service user is required on the context (unlike the bare `{ actor }`
-      // sketch in issue #194).
-      user: { id: userId },
+      baseProsemirrorJson,
+      userId,
     });
-    try {
-      await conn.transact((doc) => {
-        const liveFrag = doc.getXmlFragment('default');
-        const targetFrag = targetDoc.getXmlFragment('default');
-        // Block-level MERGE rather than a full-body replace (review #5): diff the
-        // live body against the incoming git body and apply only the blocks that
-        // actually changed; concurrently-edited blocks are left untouched and an
-        // unchanged resync is a 0-op write. With a `base` (the last-synced
-        // version) do a THREE-WAY merge so a block ONLY the human changed is kept
-        // and a block ONLY git changed is taken (conflicts -> git). Without a base
-        // (e.g. createPage), fall back to the 2-way merge.
-        if (baseDoc) {
-          mergeXmlFragments3Way(
-            liveFrag,
-            targetFrag,
-            baseDoc.getXmlFragment('default'),
-          );
-        } else {
-          mergeXmlFragments(liveFrag, targetFrag);
-        }
-      });
-    } finally {
-      await conn.disconnect();
-    }
   }
 }