fix(git-sync): red-team hardening — 12 confirmed sync-breaking bugs + regression tests

A 10-agent red-team pass on the two-way Docmost<->git sync surfaced 16 ranked
findings (9 others triaged out as already-defended). Wrote a reproduction test
per finding (each asserts the CORRECT behavior, so it fails on the bug), then
fixed the production code so every repro goes green. All confirmed bugs:

Round-trip data loss (markdown-converter.ts + docmost-schema.ts mirror):
- #1 editor-ext node types silently dropped on export — ported the 8 missing
  canon nodes (footnoteReference/footnotesList/footnoteDefinition, htmlEmbed,
  status, pageEmbed, transclusionSource/Reference) into the git-sync schema
  mirror and added converter cases that emit their schema-matching HTML instead
  of flattening unknown nodes to '' (this was the critical data-loss flagged in
  review #1679: footnotes/htmlEmbed lost on sync). Snapshot surface updated.
- #2 top-level image lost width/height/align/attachmentId — now emits an HTML
  <img> (like video/diagrams) when it carries layout attrs; bare images stay
  ![](src). Image node parses width/height as strings so they re-import.
- #3 code block containing a ``` fence corrupted on round-trip — outer fence is
  now widened to (longest-inner-backtick-run + 1).
- #16 deep nesting threw RangeError (page never synced) — added a depth guard
  (MAX_NODE_DEPTH=400) so the converter never overflows the stack.
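
The fence-widening rule from #3 can be sketched as follows. This is a minimal
illustration, not the converter's actual code: `fenceCodeBlock` is a
hypothetical name, and the minimum of three backticks is an assumption taken
from CommonMark rather than from this change.

```typescript
// Hypothetical helper (the name is not taken from markdown-converter.ts).
function fenceCodeBlock(code: string, lang: string = ""): string {
  // Find the longest run of consecutive backticks anywhere in the code.
  let longestRun = 0;
  for (const run of code.match(/`+/g) ?? []) {
    longestRun = Math.max(longestRun, run.length);
  }
  // Assumption: never go below the three backticks CommonMark requires.
  // Otherwise use one more backtick than the longest inner run, so no inner
  // run can close the outer fence early.
  const fence = "`".repeat(Math.max(3, longestRun + 1));
  return fence + lang + "\n" + code + "\n" + fence;
}

// A code block whose content itself contains a three-backtick fence.
const tripleFence = "`".repeat(3);
const nestedSample = tripleFence + "js\nconsole.log(1);\n" + tripleFence;
const wrappedSample = fenceCodeBlock(nestedSample, "markdown");
```

Here `wrappedSample` opens and closes with four backticks, so the inner
three-backtick lines stay part of the code block content on re-import.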

Push/layout/cycle (engine):
- #4 disambiguation ' ~slugId' suffix corrupted Docmost titles + order-dependent
  layout — deterministic, order-independent sibling disambiguation; suffix is
  stripped from a path-derived title ONLY when the new name is exactly the old
  title plus the suffix (never a genuine retitle ending in ' ~token').
- #6 retry-adopt by (parent,title) clobbered the wrong duplicate-title sibling —
  ambiguous (parent,title) is no longer adopted (falls back to fresh create).
- #12 a new child under a new parent was created at ROOT — creates are ordered
  parent-before-child with an in-memory created-id map for parent resolution.
- #13 git conflict markers could reach Docmost — bodies are scanned and the
  marker lines stripped (a '=======' line is only treated as a conflict
  separator inside a <<<<<<< ... >>>>>>> block, so setext headings are safe).
- #15 a divergent `docmost` mirror was escalated by runPush but dropped by
  runCycle — RunCycleResult now forwards divergentDocmost to the orchestrator.
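
The suffix rule from #4 can be worked through on two inputs. The regex and
`stripDisambiguationSuffix` mirror the diff below; `titleToPush` is a
hypothetical wrapper added only for illustration (the diff inlines this logic
in `classifyRenameMoves`).

```typescript
const DISAMBIGUATION_SUFFIX_RE = / ~[^/~]+$/;

function stripDisambiguationSuffix(title: string): string {
  return title.replace(DISAMBIGUATION_SUFFIX_RE, "");
}

// Hypothetical wrapper: returns the title that would be pushed to Docmost.
function titleToPush(
  oldTitle: string | undefined,
  newTitle: string | undefined,
): string | undefined {
  // Cosmetic only when the new name is exactly the old title plus a suffix.
  const isCosmetic =
    typeof newTitle === "string" &&
    typeof oldTitle === "string" &&
    newTitle !== oldTitle &&
    stripDisambiguationSuffix(newTitle) === oldTitle;
  return isCosmetic ? oldTitle : newTitle;
}

// Pure disambiguation rename: the real title is unchanged.
const cosmeticResult = titleToPush("Report", "Report ~a1");
// Genuine retitle that happens to end in ' ~token': pushed as written.
const retitleResult = titleToPush("Budget ~draft", "Budget ~final");
```

`cosmeticResult` is "Report" and `retitleResult` is "Budget ~final": stripping
" ~final" yields "Budget", which is not the old title, so nothing is stripped.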

Server (merge / lock / provenance):
- #9 3-way merge lost a human's block edit when git inserted an adjacent block —
  finer-grained diff3 region merge (via lcs) preserves non-overlapping human
  edits; genuine same-block conflicts still resolve git-wins.
- #10 single-writer race — module-static liveLocks closes the same-process TOCTOU
  window, and a heartbeat refresh that cannot confirm the lock now aborts the
  cycle at its next write checkpoint (cooperative AbortSignal threaded through
  runCycle). Cross-process fencing tokens remain a follow-up.
- #14 sticky-agent provenance overrode an explicit actor='git-sync' write,
  blinding the listener loop-guard — resolveSource now lets an explicit actor
  win over the sticky-agent fallback (explicit agent still wins).
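
The cooperative abort from #10 might look roughly like the sketch below. Only
`AbortController` and `AbortSignal` are real platform APIs here;
`runWriteSteps`, `CycleOutcome`, and the simulated heartbeat failure are
hypothetical stand-ins, not the names or shape of the actual runCycle code.

```typescript
// Hypothetical result shape for the illustration.
interface CycleOutcome {
  completed: number;
  aborted: boolean;
}

// Runs write steps in order, checking the signal before each one.
function runWriteSteps(
  steps: Array<() => void>,
  signal: AbortSignal,
): CycleOutcome {
  let completed = 0;
  for (const step of steps) {
    // Write checkpoint: stop before the next write once the lock is unconfirmed.
    if (signal.aborted) return { completed, aborted: true };
    step();
    completed++;
  }
  return { completed, aborted: false };
}

const controller = new AbortController();
const written: string[] = [];
const outcome = runWriteSteps(
  [
    () => { written.push("page-a"); },
    () => {
      written.push("page-b");
      // Simulated heartbeat refresh that could not confirm the lock.
      controller.abort();
    },
    () => { written.push("page-c"); },
  ],
  controller.signal,
);
```

The abort is cooperative: the write already in flight ("page-b") finishes, and
the cycle stops at the next checkpoint, so "page-c" is never written.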

Verified: git-sync vitest 617 pass (+1 expected-fail), server unit jest 1541
pass, server tsc clean. A review pass over the fixes caught and corrected a
title-suffix over-strip, an inert abort signal, a document-wide conflict-marker
strip, and two leaf-atom content-holes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
claude code agent 227
2026-06-26 01:29:02 +03:00
parent 142ed3a825
commit 3d7f434b0c
20 changed files with 1621 additions and 135 deletions


@@ -133,8 +133,27 @@ export function classifyRenameMoves(
return renamesMoves.map((rm) => {
const newParent = deps.resolveParentPageId(rm.newPath, "current");
const oldParent = deps.resolveParentPageId(rm.oldPath, "prev");
-const newTitle = deps.metaAt(rm.newPath, "current")?.title;
-const oldTitle = deps.metaAt(rm.oldPath, "prev")?.title;
// Strip the cosmetic ` ~<slugId>` disambiguation suffix before comparing
// titles: it is a LOCAL filesystem artifact (`buildVaultLayout` appends it to
// a colliding sibling's stem), NOT part of the page's real title. A pure
// disambiguation file-rename ('Report.md' -> 'Report ~a1.md') must therefore
// NOT be pushed to Docmost as a title change (red-team #4b), and any title we
// DO push must carry the real title ('Report'), never the suffixed form.
const rawNewTitle = deps.metaAt(rm.newPath, "current")?.title;
const rawOldTitle = deps.metaAt(rm.oldPath, "prev")?.title;
// A PURE disambiguation rename only APPENDS a cosmetic ` ~<suffix>` to the
// SAME title (layout.ts), so the real Docmost title is unchanged. Strip the
// suffix ONLY when the new name is exactly the old title plus that suffix —
// never blindly strip a genuine retitle whose new title legitimately ends in
// ` ~token` (e.g. "Budget ~draft" -> "Budget ~final"), which would corrupt
// the title in Docmost / drop a real rename (review finding).
const isCosmeticDisambiguation =
typeof rawNewTitle === "string" &&
typeof rawOldTitle === "string" &&
rawNewTitle !== rawOldTitle &&
stripDisambiguationSuffix(rawNewTitle) === rawOldTitle;
const newTitle = isCosmeticDisambiguation ? rawOldTitle : rawNewTitle;
const oldTitle = rawOldTitle;
const out: RenameMoveActionClassified = {
pageId: rm.pageId,
@@ -646,7 +665,11 @@ export async function applyPushActions(
// Push the CLEAN body only (no `gitmost_id` frontmatter): the frontmatter
// is engine metadata, never page content. The server converts the markdown
// it receives verbatim, so stripping here keeps the id out of Docmost.
-const body = parsePageFile(await deps.readFile(u.path)).body;
// Also strip any git conflict markers — they must NEVER reach Docmost
// (SPEC §9, red-team #13); content on both sides is preserved.
const body = stripConflictMarkers(
parsePageFile(await deps.readFile(u.path)).body,
);
// The last-synced version of this file (pre-image) is the common ancestor
// for a 3-way merge against the live page, so concurrent human edits are
// not clobbered (review #5). Null when the file is new at last-pushed. Its
@@ -689,6 +712,10 @@ export async function applyPushActions(
// folder, so (parentPageId, title) identifies the page; a match means a prior
// cycle already created it, so we ADOPT instead of duplicating.
let liveByParentTitle: Map<string, string> | null = null;
// A (parentPageId, title) that more than ONE live page shares is AMBIGUOUS:
// adopting one of them would silently overwrite an arbitrary, possibly-unrelated
// sibling (red-team #6). Such keys are recorded here and EXCLUDED from adoption.
const ambiguousAdoptKeys = new Set<string>();
if (actions.creates.length > 0) {
const live = await client.listSpaceTree(deps.spaceId);
// Only trust a COMPLETE tree for retry-adopt: a truncated tree could miss an
@@ -699,32 +726,56 @@ export async function applyPushActions(
liveByParentTitle = new Map();
for (const n of live.pages) {
const key = `${n.parentPageId ?? " root"} ${n.title ?? ""}`;
-// Keep the FIRST node for a key (the layout makes this unique in practice).
-if (!liveByParentTitle.has(key)) liveByParentTitle.set(key, n.id);
// First node claims the key; a SECOND match marks it ambiguous so neither
// is ever adopted-over (the create falls back to a fresh createPage).
if (liveByParentTitle.has(key)) ambiguousAdoptKeys.add(key);
else liveByParentTitle.set(key, n.id);
}
}
}
-for (const c of actions.creates) {
// Order creates PARENT-before-CHILD (red-team #12): a child whose parent is
// ALSO a fresh create must run AFTER its parent so the parent's just-assigned
// pageId is available to parent it (otherwise it is placed at the space ROOT).
const orderedCreates = orderCreatesParentFirst(actions.creates);
// Track pageIds assigned (or adopted) to each create's PATH in THIS batch, so a
// child can resolve its freshly-created parent's id without depending on the
// on-disk write-back being observable yet (red-team #12).
const createdIdByPath = new Map<string, string>();
for (const c of orderedCreates) {
try {
const text = await deps.readFile(c.path);
-const { body } = parsePageFile(text);
// Conflict markers must never reach Docmost (SPEC §9, red-team #13); strip
// them from the create body too, preserving both sides' content.
const body = stripConflictMarkers(parsePageFile(text).body);
// Derive create args from the PATH (native-Obsidian, SPEC §5): title from
// the filename, parent from the enclosing folder's folder-note, space from
// the run (the vault's space). `parentPageId: null` -> created at ROOT.
const title = titleFromPath(c.path);
// Resolve the parent from the PATH (SPEC §5). Prefer an id assigned to the
// parent's folder-note EARLIER in this same batch — a freshly-created parent
// whose on-disk write-back may not be observable yet (red-team #12; creates
// are ordered parent-before-child so the parent already ran).
const parentFile = parentFolderFile(c.path);
const parentPageId =
-(await resolveParentPageIdViaTree(deps, c.path, "current")) ?? undefined;
(parentFile !== null ? createdIdByPath.get(parentFile) : undefined) ??
(await resolveParentPageIdViaTree(deps, c.path, "current")) ??
undefined;
// Retry-adopt (#1 idempotency): a prior cycle already created this page in
// Docmost but failed to persist the pageId back to the file, so it was
// re-seen as a create. Adopt the existing page instead of duplicating it:
// write the id back (file becomes tracked) and push the body as an UPDATE
-// (idempotent — targets by pageId). Do NOT call createPage again.
// (idempotent — targets by pageId). Do NOT call createPage again. SKIP
// adoption when the (parent, title) is AMBIGUOUS — adopting an arbitrary
// duplicate-title sibling would silently overwrite it (red-team #6).
const adoptKey = `${parentPageId ?? " root"} ${title}`;
-const existingId = liveByParentTitle?.get(adoptKey);
const existingId = ambiguousAdoptKeys.has(adoptKey)
? undefined
: liveByParentTitle?.get(adoptKey);
if (existingId) {
const rewritten = serializePageFile(existingId, body);
await deps.writeFile(c.path, rewritten);
writtenBack.push({ path: c.path, pageId: existingId });
createdIdByPath.set(c.path, existingId);
const adopted = await client.importPageMarkdown(existingId, body, null);
pushed.push({
pageId: existingId,
@@ -749,6 +800,7 @@ export async function applyPushActions(
const rewritten = serializePageFile(assignedPageId, body);
await deps.writeFile(c.path, rewritten);
writtenBack.push({ path: c.path, pageId: assignedPageId });
createdIdByPath.set(c.path, assignedPageId);
// §10 loop-guard data for the created page (hash the pushed BODY).
pushed.push({
pageId: assignedPageId,
@@ -942,6 +994,35 @@ export function parentFolderFile(path: string): string | null {
return folderNote;
}
/**
* Order CREATE actions so a create whose parent folder-note is ALSO being created
* appears AFTER its parent (red-team #12). A child created before its fresh parent
* cannot resolve the parent's pageId and would be placed at the space ROOT.
* Topological over the `parentFolderFile` relation, restricted to paths within the
* create set; an `inProgress` guard makes a malformed parent cycle safe.
*/
export function orderCreatesParentFirst(creates: CreateAction[]): CreateAction[] {
const byPath = new Map<string, CreateAction>();
for (const c of creates) byPath.set(c.path, c);
const ordered: CreateAction[] = [];
const visited = new Set<string>();
const inProgress = new Set<string>();
const visit = (c: CreateAction): void => {
if (visited.has(c.path) || inProgress.has(c.path)) return;
inProgress.add(c.path);
const parent = parentFolderFile(c.path);
if (parent !== null && parent !== c.path) {
const parentCreate = byPath.get(parent);
if (parentCreate) visit(parentCreate);
}
inProgress.delete(c.path);
visited.add(c.path);
ordered.push(c);
};
for (const c of creates) visit(c);
return ordered;
}
/**
* Whether a vault path is a Docmost PAGE file (design §"Adoption"): a `.md` file
* with NO dot-segment anywhere in its path. This excludes `.obsidian/` config,
@@ -955,6 +1036,51 @@ export function isPageFile(path: string): boolean {
return !path.split("/").some((seg) => seg.startsWith("."));
}
/**
* Git conflict-marker scan + strip (SPEC §9 — conflict markers must NEVER reach
* Docmost). A body is treated as conflicted only when it carries BOTH a begin
* (`<<<<<<<`) and an end (`>>>>>>>`) marker line, so a legitimate Markdown setext
* heading underline (`=======`) is not mistaken for a conflict. When conflicted,
* the three marker line types are removed while BOTH sides' content is preserved
* (no data loss): the marker SYNTAX never reaches Docmost, but the human's content
* does — where the conflict is visible and fixable rather than silently dropped.
*/
const CONFLICT_BEGIN_RE = /^<{7}/m;
const CONFLICT_END_RE = /^>{7}/m;
const CONFLICT_BEGIN_LINE_RE = /^<{7}/;
const CONFLICT_SEP_LINE_RE = /^={7}/;
const CONFLICT_END_LINE_RE = /^>{7}/;
export function hasConflictMarkers(body: string): boolean {
return CONFLICT_BEGIN_RE.test(body) && CONFLICT_END_RE.test(body);
}
function stripConflictMarkers(body: string): string {
if (!hasConflictMarkers(body)) return body;
// Remove ONLY the three marker line types, and treat a `=======` line as a
// conflict separator ONLY when we are between a `<<<<<<<` begin and a `>>>>>>>`
// end — so a legitimate Markdown setext heading underline (`=======`) outside a
// conflict block is preserved (review finding). Both conflict sides' content is
// kept; only the marker SYNTAX is dropped.
let inBlock = false;
const out: string[] = [];
for (const line of body.split("\n")) {
if (CONFLICT_BEGIN_LINE_RE.test(line)) {
inBlock = true;
continue;
}
if (CONFLICT_END_LINE_RE.test(line)) {
inBlock = false;
continue;
}
if (inBlock && CONFLICT_SEP_LINE_RE.test(line)) {
continue;
}
out.push(line);
}
return out.join("\n");
}
/** The last path segment of a forward-slash path (the folder/file base name). */
function baseSegment(path: string): string {
const slash = path.lastIndexOf("/");
@@ -974,6 +1100,20 @@ function titleFromPath(path: string): string {
return base.endsWith(".md") ? base.slice(0, -3) : base;
}
/**
* The exact ` ~<slugId>` disambiguation suffix `buildVaultLayout`/`disambiguate`
* append to a colliding sibling's file stem (layout.ts): a single trailing
* ` ~<one path component>` (no slash, no further `~`). It is a COSMETIC, local
* filesystem artifact — never part of the page's real Docmost title — so it is
* stripped before a path-derived title is compared/pushed (red-team #4b).
*/
const DISAMBIGUATION_SUFFIX_RE = / ~[^/~]+$/;
/** Remove a single trailing ` ~<slugId>` disambiguation suffix, if present. */
function stripDisambiguationSuffix(title: string): string {
return title.replace(DISAMBIGUATION_SUFFIX_RE, "");
}
/**
* Build the synthetic `DocmostMdMeta` the planner/classifier consume, from the
* NATIVE format: `pageId` from the `gitmost_id` frontmatter, `title` from the