fix(git-sync): push 503 starvation + concurrent-edit marker leak/silent loss

Bug #1 (push 503 starvation): an external receive-pack that briefly overlapped a poll cycle immediately 503'd because the per-space single-writer lock was held. Add a BOUNDED retry-acquire on the PUSH path only (SpaceLockService.withSpaceLock acquireRetry: capped exponential backoff up to ~5s); a transient overlap now waits and succeeds, a genuinely stuck cycle still 503s after the bound. The poll cycle passes no retry (immediate skip). Push result stays deterministic: the receive-pack only runs once the lock is held, so a 503 never leaves a half-applied ref.

Bug #2 (concurrent-edit marker leak + silent same-block loss):
- Marker leak (a): the push UPDATE path stripped markers for the body sent to Docmost but left raw <<<<<<</>>>>>>> committed on the published `main` vault forever (autoMergeConflicts ON). Now the cleaned body is written back to the vault file + recorded in writtenBack so runPush commits it on `main` and the vault converges to clean bytes.
- Marker leak (b): pin merge.conflictStyle=merge in ensureRepo and teach stripConflictMarkers/hasConflictMarkers about the diff3 `|||||||` base section (drop the marker AND the stale base region) so diff3/zdiff3 conflicts can never leak `|||||||` + base content into a page. Also scrub the 3-way merge BASE markdown.
- Silent same-block loss: the block 3-way merge still resolves same-block conflicts deterministically to git, but it is no longer silent: diff3Plan now reports a conflict count (mergeXmlFragments3WayWithStats), gitSyncWriteBody logs it, and the persistence boundary-snapshot now fires for git-sync writes over a non-git-sync baseline so the human's pre-merge content is preserved in page history (recoverable). Full both-preserved persisted-conflict UI remains the deferred redesign.

Tests: space-lock bounded-retry (success/stuck/poll-immediate); push vault-clean + diff3 ||||||| strip; ensureRepo conflictStyle pin; diff3Plan/3-way conflict counts; persistence git-sync boundary snapshot.

Server tsc clean; git-sync vitest + server collaboration/git-sync jest all green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
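
To illustrate the diff3 handling described under "Marker leak (b)", here is a minimal sketch of a marker stripper that also drops the `|||||||` base region. It is not the project's stripConflictMarkers: the function names carry a `Sketch` suffix to mark them as hypothetical, and keeping both the "ours" and "theirs" sides in the cleaned output is an assumption, since the commit message only states that the markers and the stale base region are dropped.

```typescript
// Hypothetical sketch, NOT the real stripConflictMarkers/hasConflictMarkers.
// Git writes markers as exactly seven characters, optionally followed by a label.
const OURS = /^<{7}( |$)/;
const BASE = /^\|{7}( |$)/; // only present with conflictStyle diff3/zdiff3
const SPLIT = /^={7}$/;
const THEIRS = /^>{7}( |$)/;

type State = 'normal' | 'ours' | 'base' | 'theirs';

// True if any line looks like an opening, base, or closing conflict marker.
function hasConflictMarkersSketch(text: string): boolean {
  return text
    .split('\n')
    .some((l) => OURS.test(l) || BASE.test(l) || THEIRS.test(l));
}

// Removes marker lines and the whole base region.
// Assumption: both sides' content is kept, in order.
function stripConflictMarkersSketch(text: string): string {
  const out: string[] = [];
  let state: State = 'normal';
  for (const line of text.split('\n')) {
    if (state === 'normal' && OURS.test(line)) { state = 'ours'; continue; }
    if (state === 'ours' && BASE.test(line)) { state = 'base'; continue; }
    if ((state === 'ours' || state === 'base') && SPLIT.test(line)) {
      state = 'theirs';
      continue;
    }
    if (state === 'theirs' && THEIRS.test(line)) { state = 'normal'; continue; }
    if (state === 'base') continue; // drop stale base content
    out.push(line);
  }
  return out.join('\n');
}
```

Pinning the style itself corresponds to the git setting `merge.conflictStyle`, whose documented values are `merge`, `diff3`, and `zdiff3`; with `merge`, git does not emit the `|||||||` section at all, so the stripper above acts as a second line of defence.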
@@ -22,6 +22,11 @@ import { EnvironmentService } from '../../environment/environment.service';
 import { GitmostDataSourceService } from './gitmost-datasource.service';
 import { VaultRegistryService } from './vault-registry.service';
 import { SpaceLockService } from './space-lock.service';
+import {
+  GIT_SYNC_PUSH_LOCK_RETRY_BASE_MS,
+  GIT_SYNC_PUSH_LOCK_RETRY_MAX_MS,
+  GIT_SYNC_PUSH_LOCK_RETRY_TOTAL_MS,
+} from '../git-sync.constants';
 
 /** A space the poll loop should reconcile: its id + the workspace it lives in. */
 interface EnabledSpace {
@@ -244,7 +249,9 @@ export class GitSyncOrchestrator implements OnModuleInit, OnModuleDestroy {
     }
     const serviceUserId = this.environmentService.getGitSyncServiceUserId();
 
-    const result = await this.spaceLock.withSpaceLock(spaceId, async (signal) => {
+    const result = await this.spaceLock.withSpaceLock(
+      spaceId,
+      async (signal) => {
       // 1) Stream the receive-pack to the client (durable commits land on main).
       // Pass the lost-lock signal so the receive-pack child is killed if the lock
       // lapses mid-write (no concurrent working-tree writer across replicas).
@@ -273,7 +280,23 @@ export class GitSyncOrchestrator implements OnModuleInit, OnModuleDestroy {
         );
       }
       return;
-    });
+      },
+      // BOUNDED retry-acquire (push path only): a push that briefly overlaps a
+      // poll cycle waits a moment (capped backoff up to the budget) instead of
+      // immediately 503-ing — the cycle releases the lock in well under a second
+      // for most spaces, so this turns a transient overlap into a SUCCESS rather
+      // than a spurious failure. A genuinely long/stuck cycle still skips after
+      // the bound -> GitSyncLockHeldError -> 503, and git retries the whole push
+      // (the receive-pack only runs once the lock is held, so there is never a
+      // half-applied ref on a 503).
+      {
+        acquireRetry: {
+          timeoutMs: GIT_SYNC_PUSH_LOCK_RETRY_TOTAL_MS,
+          baseMs: GIT_SYNC_PUSH_LOCK_RETRY_BASE_MS,
+          maxMs: GIT_SYNC_PUSH_LOCK_RETRY_MAX_MS,
+        },
+      },
+    );
 
     // The lock was held (in-progress or another replica) — surface to the caller
     // so the HTTP handler can answer 503 and let git retry.
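
The SpaceLockService side of this change is not part of the hunks shown, so the following is only a sketch of how a bounded retry-acquire with capped exponential backoff could behave. The `AcquireRetry` field names mirror the `acquireRetry` options in the diff; `backoffDelays`, `acquireWithRetry`, `tryAcquire`, and the numbers in the usage note are hypothetical, and the real constants are not shown in this commit page.

```typescript
// Hypothetical sketch of a bounded retry-acquire; not the real SpaceLockService.
interface AcquireRetry {
  timeoutMs: number; // total wait budget across all retries
  baseMs: number;    // first backoff delay
  maxMs: number;     // cap for any single delay
}

// Delay schedule: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs,
// with the last delay trimmed so the sum never exceeds timeoutMs.
function backoffDelays(retry: AcquireRetry): number[] {
  const delays: number[] = [];
  if (retry.baseMs <= 0 || retry.maxMs <= 0) return delays; // nothing sane to wait on
  let elapsed = 0;
  let delay = retry.baseMs;
  while (elapsed < retry.timeoutMs) {
    const wait = Math.min(delay, retry.maxMs, retry.timeoutMs - elapsed);
    delays.push(wait);
    elapsed += wait;
    delay *= 2;
  }
  return delays;
}

// Tries once immediately; without a retry policy (the poll path) it gives up
// right away, with one it keeps trying until the budget is spent.
async function acquireWithRetry(
  tryAcquire: () => Promise<boolean>,
  retry?: AcquireRetry,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  if (await tryAcquire()) return true;
  if (!retry) return false; // immediate skip
  for (const wait of backoffDelays(retry)) {
    await sleep(wait);
    if (await tryAcquire()) return true;
  }
  return false; // caller would surface this as "lock held" (503 on the push path)
}
```

With illustrative values `{ timeoutMs: 5000, baseMs: 100, maxMs: 1000 }`, the schedule is 100, 200, 400, 800, 1000, 1000, 1000, 500 ms, which sums to the 5000 ms budget. Because the receive-pack callback only runs after `acquireWithRetry` returns true, giving up at the bound leaves no partial ref update behind.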