feat(sync): FS->Docmost push #1 — diff/ref primitives + pure planner + apply (fakes)

First slice of the push direction (SPEC §6), mirroring pull: VaultGit primitives +
pure planner + thin injectable apply, exercised via fakes (no live destructive run).

- git.ts: diffNameStatus (--name-status -M -z, NUL-parsed, rename-aware),
  revParse/readRef/updateRef (refs/docmost/last-pushed), showFileAtRef (recover a
  deleted file's pre-image pageId)
- push.ts computePushActions (pure): A/M/D/R -> create/update/delete/renamesMoves;
  delete only when pageId is recovered from the pre-image, else skipped (§8 guard —
  no spurious Docmost delete)
- push.ts applyPushActions (fakes): update via importPageMarkdown (collab/Yjs path,
  §2 — never a raw jsonb overwrite); create via createPage then write the assigned
  pageId back into the file meta (body preserved); delete via deletePage (soft, §8);
  renamesMoves deferred; advances last-pushed
- tests (+26): diffNameStatus A/M/D/rename, ref round-trip, showFileAtRef; pure
  classification incl. §8 no-pageid skip; apply with fakes (collab-path update,
  pageid write-back, soft-delete, deferred moves)
- 683 -> 709 green; build clean; corpus STABLE
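
For illustration, the NUL-delimited record walk that the git.ts bullet describes can be sketched standalone. This is a minimal sketch under stated assumptions, not the VaultGit method itself: `parseNameStatusZ`, `DiffRow`, `ChangeKind`, and the sample string are hypothetical names and data introduced here.

```typescript
// Hypothetical standalone sketch of parsing `git diff --name-status -M -z` output.
type ChangeKind = "A" | "M" | "D" | "R" | "C";

interface DiffRow {
  status: ChangeKind;
  path: string;
  oldPath?: string;
  score?: number;
}

function parseNameStatusZ(stdout: string): DiffRow[] {
  // With -z every field (status code and each path) is its own NUL-terminated token.
  const tokens = stdout.split("\0").filter((t) => t.length > 0);
  const parsed: DiffRow[] = [];
  let i = 0;
  while (i < tokens.length) {
    const raw = tokens[i++] ?? "";
    const letter = raw.charAt(0);
    if (letter === "R" || letter === "C") {
      // Rename/copy records carry two paths: source first, destination second.
      const oldPath = tokens[i++];
      const path = tokens[i++];
      if (oldPath === undefined || path === undefined) break; // malformed tail
      const score = Number.parseInt(raw.slice(1), 10);
      parsed.push({
        status: letter,
        path,
        oldPath,
        ...(Number.isFinite(score) ? { score } : {}),
      });
    } else if (letter === "A" || letter === "M" || letter === "D") {
      const path = tokens[i++];
      if (path === undefined) break; // malformed tail
      parsed.push({ status: letter, path });
    } else {
      i++; // unknown status: skip its single path token to stay aligned
    }
  }
  return parsed;
}

// Made-up sample output: one add, one modify, one rename (Cyrillic name), one delete.
const sample =
  "A\0new.md\0M\0docs/edited.md\0R100\0old/Страница.md\0new/Страница.md\0D\0gone.md\0";
const rows = parseNameStatusZ(sample);
console.log(rows);
```

The rename arrives as a single row with both paths, which is the property the planner relies on to tell a move from a delete plus create.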

Deferred (next increment): move/rename apply, loop-guard (§10), watcher/debounce,
remote push, live main wiring, empty-spaceId create guard, per-page error isolation.
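
The §8 delete guard from the planner bullet above can be illustrated in isolation. The following is a simplified stand-in, not `computePushActions` itself: `planDeletes`, `PageMeta`, `MetaLookup`, and the lookup table are hypothetical names introduced for this sketch, and only the `D` rule is modeled.

```typescript
// Simplified, hypothetical stand-in for the planner's delete rule (SPEC §8 guard).
type TreeSide = "current" | "prev";

interface PageMeta {
  pageId?: string;
}

type MetaLookup = (path: string, side: TreeSide) => PageMeta | null;

interface DeletePlan {
  deletes: { pageId: string }[];
  skipped: { path: string; reason: string }[];
}

function planDeletes(deletedPaths: string[], metaAt: MetaLookup): DeletePlan {
  const result: DeletePlan = { deletes: [], skipped: [] };
  for (const path of deletedPaths) {
    // The file is gone from the current tree, so identity comes from the pre-image.
    const pageId = metaAt(path, "prev")?.pageId;
    if (pageId) {
      result.deletes.push({ pageId });
    } else {
      // No recoverable pageId: the file was never a tracked page, so no delete is issued.
      result.skipped.push({ path, reason: "no recoverable pageId in pre-image" });
    }
  }
  return result;
}

// Fake pre-image as a plain lookup table, the way a test might inject it.
const preImage: Record<string, PageMeta> = {
  "wiki/tracked.md": { pageId: "page-123" },
  "wiki/notes.md": {},
};
const fakeMetaAt: MetaLookup = (path, side) =>
  side === "prev" ? (preImage[path] ?? null) : null;

const plan = planDeletes(
  ["wiki/tracked.md", "wiki/notes.md", "wiki/never-seen.md"],
  fakeMetaAt,
);
console.log(plan);
```

Because the lookup is injected, the rule runs with no git or filesystem access, which is what keeps the planner pure and testable with fakes.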

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
vvzvlad
2026-06-17 02:32:15 +03:00
parent 480f4c3747
commit 9c6283aa8e
5 changed files with 1131 additions and 0 deletions


@@ -32,6 +32,22 @@ export const BOT_AUTHOR_EMAIL = "docmost-sync@local";
/** Default branch the vault repo is initialized on. */
export const DEFAULT_BRANCH = "main";
/**
* One row of `git diff --name-status` (SPEC §6 "FS → Docmost"). `status` is the
* single-letter change code (`-M` rename detection on), `path` is the (new) file
* path; for a rename/copy (`R`/`C`) `oldPath` is the source and `path` is the
* destination, with `score` carrying git's similarity index (0–100).
*/
export interface DiffEntry {
status: "A" | "M" | "D" | "R" | "C";
/** New (destination) path. For A/M/D it is the only path. */
path: string;
/** Source path — present only for R/C. */
oldPath?: string;
/** Rename/copy similarity score (0–100) — present only for R/C. */
score?: number;
}
/** Result of a `merge`: whether it succeeded cleanly or left conflict markers. */
export interface MergeResult {
/** True when the merge applied cleanly (fast-forward or clean 3-way). */
@@ -419,6 +435,122 @@ export class VaultGit {
}
return r.stdout.split("\0").filter((p) => p.length > 0);
}
/**
* Diff two refs with `--name-status -M -z` and parse the NUL-delimited output
* (SPEC §6: the FS→Docmost push direction diffs `main` against
* `refs/docmost/last-pushed`). Rename detection is ON (`-M`), so a moved/renamed
* file is reported as a single `R` row with both its old and new path instead
* of a delete+add pair — that distinction is what lets the push planner tell a
* move from a delete+create (SPEC §8 "Move vs delete").
*
* `-z` makes git emit NUL-delimited RAW UTF-8 records (the Russian wiki has
* Cyrillic file names) with NO quoting/escaping. The record shape differs by
* status:
* - A/M/D: `status\0path\0`
* - R/C: `Rnnn\0oldPath\0newPath\0` (nnn = similarity score, e.g. `R100`)
* We read the RAW stdout (not the trimming `run()` helper, which would mangle
* the NUL bytes), split on `\0`, drop the trailing empty entry, and walk the
* tokens pulling 1 or 2 path tokens per status. Paths are returned verbatim.
*/
async diffNameStatus(
fromRef: string,
toRef: string,
): Promise<DiffEntry[]> {
const r = await this.runRaw([
"diff",
"--name-status",
"-M",
"-z",
fromRef,
toRef,
]);
if (r.code !== 0) {
const detail = (r.stderr || r.stdout || "").trim();
throw new Error(`git diff --name-status failed: ${detail}`);
}
// Tokens alternate: <status> <path...> <status> <path...> ... With `-z`,
// each token (status code AND each path) is its own NUL-delimited field.
const tokens = r.stdout.split("\0").filter((t) => t.length > 0);
const entries: DiffEntry[] = [];
let i = 0;
while (i < tokens.length) {
const raw = tokens[i++];
// The status token is e.g. `A`, `M`, `D`, or `R100` / `C075`. The leading
// letter is the change kind; any trailing digits are the similarity score.
const letter = raw[0] as DiffEntry["status"];
if (letter === "R" || letter === "C") {
const score = Number.parseInt(raw.slice(1), 10);
const oldPath = tokens[i++];
const path = tokens[i++];
if (oldPath === undefined || path === undefined) break; // malformed tail
entries.push({
status: letter,
path,
oldPath,
...(Number.isFinite(score) ? { score } : {}),
});
} else if (letter === "A" || letter === "M" || letter === "D") {
const path = tokens[i++];
if (path === undefined) break; // malformed tail
entries.push({ status: letter, path });
} else {
// Unknown/other status (e.g. T type-change, U unmerged) — consume one
// path token defensively so the walk stays aligned, but do not emit it
// (the push planner only handles A/M/D/R/C).
i++;
}
}
return entries;
}
/**
* Resolve a ref/commit-ish to its full SHA, or `null` if it does not exist.
* `rev-parse --verify --quiet` exits non-zero (and prints nothing) for an
* unknown ref, so a non-zero exit maps cleanly to `null`. Used to read
* `refs/docmost/last-pushed` (SPEC §5) — which is absent before the first push.
*/
async revParse(ref: string): Promise<string | null> {
const r = await this.runRaw(["rev-parse", "--verify", "--quiet", ref]);
if (r.code !== 0) return null;
const sha = r.stdout.trim();
return sha.length > 0 ? sha : null;
}
/**
* Read a ref to its SHA, or `null` if unset. Thin alias over `revParse`,
* named for the push direction's marker `refs/docmost/last-pushed` (SPEC §5:
* "what from `main` is already reflected in Docmost").
*/
async readRef(ref: string): Promise<string | null> {
return this.revParse(ref);
}
/**
* Point `ref` at `target` (`git update-ref <ref> <target>`). Used to advance
* `refs/docmost/last-pushed` to the just-pushed `main` commit after a push
* (SPEC §6 step 3 / §5). `target` may be a SHA or any commit-ish git accepts.
*/
async updateRef(ref: string, target: string): Promise<void> {
await this.run(["update-ref", ref, target]);
}
/**
* Read a file's content at a specific ref (`git show <ref>:<path>`), or `null`
* if the path does not exist there. Used by the push direction to read the
* PRE-IMAGE of a DELETED file (e.g. at `refs/docmost/last-pushed`) so its
* `docmost:meta` — and therefore its `pageId` — can be recovered to translate
* the deletion into a `delete_page` (SPEC §6/§8: only TRACKED files, i.e. ones
* that had a pageId, are deleted in Docmost). A non-zero exit (path absent at
* that ref) maps to `null` rather than throwing.
*/
async showFileAtRef(ref: string, path: string): Promise<string | null> {
// `git show <ref>:<path>` requires the path relative to the repo root; pass
// it verbatim (forward-slash, matching `listTrackedFiles` / diff output).
const r = await this.runRaw(["show", `${ref}:${path}`]);
if (r.code !== 0) return null;
return r.stdout;
}
}
/**

src/push.ts Normal file

@@ -0,0 +1,381 @@
/**
* Push cycle: vault -> Docmost (SPEC §6 "FS → Docmost"), FIRST increment.
*
* This module mirrors the structure of `src/pull.ts`: a set of VaultGit diff/ref
* primitives (in `src/git.ts`), a PURE planner (`computePushActions`) that turns
* a git diff into a classified action set with NO IO, and a THIN injectable
* applier (`applyPushActions`) exercised in tests via fakes only.
*
* Direction is vault -> Docmost. The diff is `main` against
* `refs/docmost/last-pushed` (SPEC §6 step 2); each `A`/`M`/`D`/`R` row is
* translated into a Docmost mutation by `pageId` identity (SPEC §4):
* - A without pageId -> create_page (then write the assigned pageId back).
* - A with pageId -> update (restored/copied file; the page already exists).
* - M -> update content (collab/Yjs path, SPEC §2/§15.6).
* - D -> delete_page (pageId recovered from the PRE-IMAGE meta).
* - R -> rename/move (RECORDED ONLY here; see the TODO below).
*
* SCOPE OF THIS INCREMENT — what is intentionally NOT here yet (next increment),
* left as explicit TODO markers:
* - TODO(next-increment): move/rename APPLY — resolving move-vs-rename and the
* new parentPageId, then calling `move_page` / `rename_page` (SPEC §6/§8).
* `computePushActions` already CLASSIFIES R into `renamesMoves`, and
* `applyPushActions` returns them as `deferred` without any client call.
* - TODO(next-increment): loop-guard (SPEC §10) — record the `updatedAt` from
* each write response + provenance trailer so the next pull does not pull our
* own write back; suppress self-writes by body hash.
* - TODO(next-increment): FS-watcher + debounce (SPEC §7.1) that commits on
* `main` and triggers a push.
* - TODO(next-increment): `git push` to the git remote (SPEC §6 step 1/§7.2,
* pull-rebase-push with retry).
* - TODO(next-increment): fast-forward the `docmost` mirror branch after a push
* (SPEC §6 step 3) — only `refs/docmost/last-pushed` is advanced here.
* - TODO(next-increment): a runnable live `main()` wired to a real Docmost.
* There is deliberately NO CLI entrypoint in this file: nothing here can run
* a destructive write against a real Docmost. `applyPushActions` is reached
* only through tests with fakes.
*/
import type { DocmostClient } from "docmost-client";
import {
parseDocmostMarkdown,
serializeDocmostMarkdownBody,
type DocmostMdMeta,
} from "docmost-client";
import type { DiffEntry, VaultGit } from "./git.js";
// Re-export so callers/tests can import the diff row shape from either module.
export type { DiffEntry } from "./git.js";
/** A page to CREATE in Docmost (new local file, meta has no pageId yet). */
export interface CreateAction {
/** Vault-relative path of the new file. */
path: string;
}
/** A page whose CONTENT changed (meta carries the existing pageId). */
export interface UpdateAction {
pageId: string;
/** Vault-relative path of the changed file. */
path: string;
}
/** A page to soft-delete in Docmost (Trash, SPEC §8). */
export interface DeleteAction {
pageId: string;
}
/** A renamed/moved page (same pageId, new path). Resolution DEFERRED. */
export interface RenameMoveAction {
pageId: string;
oldPath: string;
newPath: string;
}
/** The classified set of push actions (PURE output of `computePushActions`). */
export interface PushActions {
creates: CreateAction[];
updates: UpdateAction[];
deletes: DeleteAction[];
renamesMoves: RenameMoveAction[];
/**
* Diff rows that could NOT be classified into an action, with a reason — e.g.
* a deleted file whose PRE-IMAGE meta carried no recoverable pageId (the
* untracked-file guard, SPEC §8: only files that were tracked with a pageId
* are deleted in Docmost). Carried so the caller can log them.
*/
skipped: { path: string; status: DiffEntry["status"]; reason: string }[];
}
/**
* Which tree a `metaAt` lookup reads the file's `docmost:meta` from:
* - `current`: the current `main` tree (the live file content) — used for
* A/M/R, where the file still exists.
* - `prev`: the last-pushed PRE-IMAGE (e.g. `refs/docmost/last-pushed:<path>`)
* — used for D, where the file is gone from `main` but its pageId must be
* recovered from the version Docmost last knew (SPEC §6/§8).
*/
export type MetaSide = "current" | "prev";
/** Input to the PURE planner. `metaAt` is injected (no IO inside the planner). */
export interface PushActionsInput {
/** Diff rows of `main` vs `refs/docmost/last-pushed` (SPEC §6 step 2). */
changes: DiffEntry[];
/**
* Resolve a file's `docmost:meta` at a given side, or `null` if the file is
* absent there / has no parseable meta. PURE injection: the real `main` reads
* the working tree (current) or `git show <last-pushed>:<path>` (prev); tests
* pass a plain lookup.
*/
metaAt: (path: string, side: MetaSide) => DocmostMdMeta | null;
}
/**
* PURE push planner (SPEC §4/§6/§8). Classifies each diff row into a Docmost
* action by `pageId` identity, with NO IO (the `metaAt` resolver is injected).
*
* Classification rules:
* - `A` (added):
* - current meta has NO pageId -> CREATE (a brand-new local file; the
* page does not exist in Docmost yet).
* - current meta HAS a pageId -> UPDATE (a restored/copied file whose
* page already exists; we push its content rather than create a dup).
* - `M` (modified): current meta has a pageId -> UPDATE content. (If a modified
* file somehow lost its pageId it is skipped — there is nothing to target.)
* - `D` (deleted): recover the pageId from the PRE-IMAGE meta (`metaAt(path,
* 'prev')`) -> DELETE. If no pageId can be recovered, SKIP with a reason
* (untracked-file guard, SPEC §8: never delete an untracked page).
* - `R` (renamed/moved): same pageId (from current meta), path changed ->
* RENAME/MOVE. Resolution of move-vs-rename + the new parentPageId is
* DEFERRED to the next increment; here we only record oldPath/newPath/
* pageId. If the renamed file has no recoverable pageId it is SKIPPED.
* (`C` copy is treated the same as `R` for recording purposes.)
*/
export function computePushActions(input: PushActionsInput): PushActions {
const { changes, metaAt } = input;
const actions: PushActions = {
creates: [],
updates: [],
deletes: [],
renamesMoves: [],
skipped: [],
};
for (const change of changes) {
switch (change.status) {
case "A": {
const meta = metaAt(change.path, "current");
const pageId = meta?.pageId;
if (pageId) {
// Added but already carries a pageId (restored/copied file): the page
// exists in Docmost, so push content as an UPDATE — never a duplicate.
actions.updates.push({ pageId, path: change.path });
} else {
// Brand-new local file -> create the page, then write the assigned
// pageId back into its meta (done in `applyPushActions`).
actions.creates.push({ path: change.path });
}
break;
}
case "M": {
const meta = metaAt(change.path, "current");
const pageId = meta?.pageId;
if (pageId) {
actions.updates.push({ pageId, path: change.path });
} else {
// A modified file with no pageId has no Docmost target to update.
actions.skipped.push({
path: change.path,
status: "M",
reason: "modified file has no pageId in meta",
});
}
break;
}
case "D": {
// The file is gone from `main`; recover its pageId from the PRE-IMAGE
// (the version last pushed to Docmost) so we delete the RIGHT page.
const prevMeta = metaAt(change.path, "prev");
const pageId = prevMeta?.pageId;
if (pageId) {
actions.deletes.push({ pageId });
} else {
// Untracked-file guard (SPEC §8): a file with no recoverable pageId was
// never a Docmost page — do NOT translate its removal into a delete.
actions.skipped.push({
path: change.path,
status: "D",
reason: "deleted file has no recoverable pageId (pre-image meta)",
});
}
break;
}
case "R":
case "C": {
// Same page, new path. Identity comes from the CURRENT (post-rename) meta
// since the file still exists. RESOLUTION (move vs rename, parentPageId)
// is deferred — record oldPath/newPath/pageId only.
const meta = metaAt(change.path, "current");
const pageId = meta?.pageId;
const oldPath = change.oldPath ?? change.path;
if (pageId) {
actions.renamesMoves.push({
pageId,
oldPath,
newPath: change.path,
});
} else {
actions.skipped.push({
path: change.path,
status: change.status,
reason: "renamed/moved file has no pageId in meta",
});
}
break;
}
default: {
// Unreachable for A/M/D/R/C; defensive for any future status.
actions.skipped.push({
path: change.path,
status: change.status,
reason: `unhandled diff status ${change.status}`,
});
}
}
}
return actions;
}
// --- thin apply (create/update/delete), fakes-only in this increment ---------
/** The marker the push direction advances after a successful push (SPEC §5/§6). */
export const LAST_PUSHED_REF = "refs/docmost/last-pushed";
/**
* Injectable IO for `applyPushActions`. The real `main` (NEXT increment) wires
* these to the live client, `node:fs/promises`, and the vault git wrapper; this
* increment drives them only through FAKES in tests (no live destructive run).
* - `client`: the create/update/delete subset of `DocmostClient`.
* - `readFile`/`writeFile`: read a changed file's body / write a file back
* (by vault-relative path; the applier does not resolve absolute paths so
* fakes stay trivial).
* - `git`: only `updateRef` is used here (advance `refs/docmost/last-pushed`).
*/
export interface ApplyPushDeps {
client: Pick<
DocmostClient,
"importPageMarkdown" | "createPage" | "deletePage"
>;
/** Read a changed file's full text by its vault-relative path. */
readFile: (path: string) => Promise<string>;
/** Write a file's full text by its vault-relative path. */
writeFile: (path: string, text: string) => Promise<void>;
git: Pick<VaultGit, "updateRef">;
}
/** A file whose meta was rewritten with a freshly-assigned pageId (post-create). */
export interface WrittenBackPage {
path: string;
pageId: string;
}
/** Structured outcome of `applyPushActions` (counts + write-backs + deferred). */
export interface ApplyPushResult {
created: number;
updated: number;
deleted: number;
/**
* Files whose `docmost:meta` was rewritten with the pageId Docmost assigned on
* create — these now need a FOLLOW-UP commit (the meta on disk changed). The
* commit itself is the caller's job (NEXT increment); recorded here so it is
* not lost.
*/
writtenBack: WrittenBackPage[];
/** Rename/move actions NOT executed this increment (apply is deferred). */
deferred: RenameMoveAction[];
/** Diff rows the planner could not classify (carried through for logging). */
skipped: PushActions["skipped"];
/** Whether `refs/docmost/last-pushed` was advanced (only when `pushedCommit` is given). */
lastPushedAdvanced: boolean;
}
/**
* THIN IO applier for the COMMON push cases (create/update/delete). Exercised
* via FAKES only in this increment — there is no live wiring.
*
* - UPDATE: read the file's full text, then
* `client.importPageMarkdown(pageId, fullMarkdown)`. This is the collab/Yjs
* write path (SPEC §2/§15.6), NEVER a raw jsonb overwrite. The full
* self-contained markdown (meta + body) is sent as-is; `importPageMarkdown` parses the meta/body itself.
* - CREATE: derive title/spaceId/parentPageId from the file's current meta,
* `client.createPage(...)`, take the assigned pageId from the result, and
* write it BACK into the file's `docmost:meta` (re-serialized via
* `serializeDocmostMarkdownBody`, body preserved) so the file becomes
* tracked. The write-back is recorded in `writtenBack` (a follow-up commit
* is needed — NEXT increment).
* - DELETE: `client.deletePage(pageId)` — soft-delete to Trash (SPEC §8).
* - RENAME/MOVE: NOT executed — returned as `deferred` (NEXT increment).
*
* After applying, if a `pushedCommit` is given, advance
* `refs/docmost/last-pushed` to it (SPEC §6 step 3). Fast-forwarding the
* `docmost` branch and the loop-guard are DEFERRED (see the module TODO list).
*
* @param pushedCommit The `main` commit just reflected into Docmost (SHA or
* commit-ish). When omitted, the ref is NOT advanced (e.g. a dry plan).
*/
export async function applyPushActions(
deps: ApplyPushDeps,
actions: PushActions,
pushedCommit?: string,
): Promise<ApplyPushResult> {
const { client, git } = deps;
let created = 0;
let updated = 0;
let deleted = 0;
const writtenBack: WrittenBackPage[] = [];
// 1. UPDATES — collab/Yjs write path (SPEC §2/§15.6), never a raw overwrite.
for (const u of actions.updates) {
const fullMarkdown = await deps.readFile(u.path);
await client.importPageMarkdown(u.pageId, fullMarkdown);
updated++;
}
// 2. CREATES — create the page, then write the assigned pageId back to meta so
// the file becomes tracked (SPEC §4 "write the assigned pageId back").
for (const c of actions.creates) {
const text = await deps.readFile(c.path);
const { meta, body } = parseDocmostMarkdown(text);
// Derive create args from the file's current meta. A new local file may have
// a partial meta (e.g. title/spaceId only); spaceId is required by Docmost.
const title = meta?.title ?? "";
const spaceId = meta?.spaceId ?? "";
const parentPageId = meta?.parentPageId ?? undefined;
const result = await client.createPage(title, body, spaceId, parentPageId);
// `createPage` returns `{ data: { id, ... }, success }`; the assigned pageId
// is at `result.data.id`.
const assignedPageId: string | undefined = result?.data?.id;
if (assignedPageId) {
// Re-serialize the file with the pageId written into meta, body preserved.
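const newMeta: DocmostMdMeta = {
...meta,
// Set after the spread so an explicit `version: undefined` in `meta`
// cannot clobber the default.
version: meta?.version ?? 1,
pageId: assignedPageId,
};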
const rewritten = serializeDocmostMarkdownBody(newMeta, body);
await deps.writeFile(c.path, rewritten);
writtenBack.push({ path: c.path, pageId: assignedPageId });
}
created++;
}
// 3. DELETES: soft-delete to Trash (SPEC §8), reversible.
for (const d of actions.deletes) {
await client.deletePage(d.pageId);
deleted++;
}
// 4. RENAME/MOVE — DEFERRED (NEXT increment): no client call. Returned as
// `deferred` so the caller can see what still needs the move/rename apply.
// 5. Advance `refs/docmost/last-pushed` to the pushed `main` commit (SPEC §6
// step 3 / §5). TODO(next-increment): fast-forward the `docmost` mirror
// branch (Docmost already contains these changes) and record the `updatedAt`
// from each write response for the loop-guard (SPEC §10).
let lastPushedAdvanced = false;
if (pushedCommit) {
await git.updateRef(LAST_PUSHED_REF, pushedCommit);
lastPushedAdvanced = true;
}
return {
created,
updated,
deleted,
writtenBack,
deferred: actions.renamesMoves,
skipped: actions.skipped,
lastPushedAdvanced,
};
}