b47751349f373e2aa434d5d7b59aaa8a85635fca
195 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
b47751349f |
fix(git-sync): kill spurious marker-leaking conflict, concurrent-edit loss, flapping HEAD
Three more git-sync QA defects from the 2nd live pass on PR #119, plus a callout-fidelity nit: 1. SPURIOUS conflict leaked raw markers into canonical main (root cause). On an ordinary round-trip the only difference between the docmost mirror (normalize- on-write) and a user's raw push is trailing/empty-line normalization, which made git's line-based docmost->main merge CONFLICT, and the wedge fix then committed the file WITH literal <<<<<<< / ======= / >>>>>>> markers onto main (git and the DB silently diverged for cycles). Fix: on a conflict, normalize trailing/empty lines on BOTH sides (showStage :2:/:3:) before comparing — a trailing-only diff is recognized as spurious and resolved to the clean normalized form. A GENUINE same-block conflict is auto-resolved to OURS (git wins, mirroring the live-doc 3-way rule); the docmost side stays on the `docmost` branch + page history. Raw markers NEVER reach main again. 2. Concurrent UI<->git edit silently lost the UI side. The git->Docmost 3-way merge ran against a live Y.Doc that hadn't yet received the user's debounced in-flight edit, so git clean-applied (no conflict detected) and the edit vanished even on a different block. Fix: flush the pending debounced store before the merge so the in-flight edit is drained into the live doc first — a different-block edit is merged, a same-block one is detected and pinned to history (recoverable). 3. Smart-HTTP HEAD flapped to the read-only `docmost` mirror (~1/4 of clones). The engine transiently checks out `docmost` mid-pull and the host advertises whatever HEAD resolves to. Fix: VaultGit.pinHeadToMain(); the cycle restores HEAD->main in a finally; and the upload-pack ref advertisement is served HEAD-pinned under the per-space lock so it can never observe a mid-cycle HEAD. 4. (callout) clampCalloutType now mirrors the editor's GITHUB_ALERT_TYPE_MAP for non-schema aliases (tip->success, caution->danger, important->info) instead of flatly collapsing to info. The editor schema genuinely supports only the six banner types, so unknown types still fall back to info (by design). Tests: deterministic real-git trailing-blank round-trip (no conflict, no markers, in sync over 2 cycles) + genuine-conflict no-marker-leak; HEAD advertisement stability; pre/post-flush concurrent-edit survival; serveReadAdvertisement lock pin; widened callout-alias coverage. Engine vitest + server tsc + collaboration / git-http / orchestrator specs all green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
b7e5cb6970 |
fix(git-sync): push 503 starvation + concurrent-edit marker leak/silent loss
Bug #1 (push 503 starvation): an external receive-pack that briefly overlapped a poll cycle immediately 503'd because the per-space single-writer lock was held. Add a BOUNDED retry-acquire on the PUSH path only (SpaceLockService .withSpaceLock acquireRetry: capped exponential backoff up to ~5s); a transient overlap now waits and succeeds, a genuinely stuck cycle still 503s after the bound. The poll cycle passes no retry (immediate skip). Push result stays deterministic: the receive-pack only runs once the lock is held, so a 503 never leaves a half-applied ref. Bug #2 (concurrent-edit marker leak + silent same-block loss): - Marker leak (a): the push UPDATE path stripped markers for the body sent to Docmost but left raw <<<<<<</>>>>>>> committed on the published `main` vault forever (autoMergeConflicts ON). Now the cleaned body is written back to the vault file + recorded in writtenBack so runPush commits it on `main` and the vault converges to clean bytes. - Marker leak (b): pin merge.conflictStyle=merge in ensureRepo and teach stripConflictMarkers/hasConflictMarkers about the diff3 `|||||||` base section (drop the marker AND the stale base region) so diff3/zdiff3 conflicts can never leak `|||||||` + base content into a page. Also scrub the 3-way merge BASE markdown. - Silent same-block loss: the block 3-way merge still resolves same-block conflicts deterministically to git, but it is no longer silent: diff3Plan now reports a conflict count (mergeXmlFragments3WayWithStats), gitSyncWriteBody logs it, and the persistence boundary-snapshot now fires for git-sync writes over a non-git-sync baseline so the human's pre-merge content is preserved in page history (recoverable). Full both-preserved persisted-conflict UI remains the deferred redesign. Tests: space-lock bounded-retry (success/stuck/poll-immediate); push vault-clean + diff3 ||||||| strip; ensureRepo conflictStyle pin; diff3Plan/3-way conflict counts; persistence git-sync boundary snapshot. Server tsc clean; git-sync vitest + server collaboration/git-sync jest all green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
906733b5c8 |
fix(git-sync): address PR #119 review #4 — symlink guard, dead-code cull, changelog + warnings/suggestions
Blocking (review id 2514): - [security] Forbid symlinks in vaults. ensureServable now sets core.symlinks=false in each vault's local git config (a pushed symlink is checked out as a plain file, never a real link), and the engine cycle wraps every read/write/mkdir in an lstat/realpath guard (new path-guard.ts) that refuses a path that is — or traverses — a symlink, or whose realpath escapes the vault root. Prevents a writer from publishing /etc/passwd or the server .env, or writing outside the vault. Adds unit tests (path-guard.test.ts) + a read-guard integration test (cycle.test.ts) + real lstat/realpath in the roundtrip integration test. - [simplification] Delete dead lib/diff.ts + test/diff.test.ts and drop the now-unused @fellow/prosemirror-recreate-transform dependency. - [documentation] Add a CHANGELOG [Unreleased] → Added entry for git-sync. Warnings: - [test-coverage] Cover the CREATE-branch conflict-markers guard (a new .md with markers and no gitmost_id is recorded as a create failure, never created). Suggestions: - [stability] Bound each `git config` in ensureServable with a timeout. - [authz] Trigger endpoint resolves spaceId workspace-scoped and 404s a foreign space before any vault directory is created. - [stability] Attribute git-initiated moves to the service account (lastUpdatedById), via an optional actor param on PageService.movePage. - [documentation] Document the per-space autoMergeConflicts toggle in AGENTS.md. - [test-coverage] Cover the unterminated `:::` callout fence fallback. - [simplification] Move test-only roundtrip-helpers.ts out of src/ into test/. Architecture: - Move the Yjs/ProseMirror merge primitives (yjs-body-merge, three-way-merge, lcs + specs) into collaboration/merge/, breaking the collaboration → integrations/git-sync dependency cycle this PR introduced. - Port the schema-surface drift gate to packages/mcp (the mcp schema mirror had none); pins 52 entries. Deferred (with rationale in the review thread): the incremental-pull perf warning (correctness-neutral; needs a high-water-mark design + its own tests on the data-loss-critical path) and the redis-sync rolling-deploy mixed-version edge (the deficient behavior is in already-released old-instance code; the new code is correct on both sides; impact is a transient rollout-window artifact). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
22e3fcdeba |
fix(git-sync): address PR #119 review #2 — throttle /git Basic auth, fix mcp schema drift + warnings/tests
Must-fix:
- Throttle the raw /git HTTP-Basic path: it bypasses Nest/ThrottlerGuard, so
verifyUserCredentials (bcrypt) ran unthrottled. Wrap it in the SAME
FailedLoginLimiter the /mcp path uses (5/60s; per-IP, per-IP+email, global
per-email keys; atomic tryReserve BEFORE bcrypt; success resets, non-credential
errors release). The (threshold+1)-th attempt now gets 429 pre-bcrypt. Sweep
timer + onModuleDestroy mirror McpService.
- Fix the mcp schema mirror drift: packages/mcp details `open` attr now reads via
hasAttribute (matches editor-ext canon + git-sync copy); getAttribute dropped a
bare `<details open>` state. (build/ is gitignored — rebuilt locally.)
Tests added:
- /git brute-force throttle: pre-bcrypt 429 on the 6th failure; success resets;
non-credential error releases the budget.
- git-http-backend lost-lock AbortSignal: already-aborted -> no spawn + 500;
live abort mid-request -> SIGTERM + response closed.
- orchestrator divergentDocmost -> WARN + flag surfaced in status (+ clean case).
- pollTick re-entrancy guard skips an overlapping tick.
- datasource NotFound early-throws (getPageJson/move/rename) + updatedAt:undefined
stale-read branch (importPageMarkdown/createPage).
Suggestions:
- space.repo updateGitSyncSettings: parameterize the jsonb key (`${prefKey}::text`)
instead of sql.raw (latent-injection footgun); value stays sql.lit. Spec updated.
- pollTick re-entrancy guard (private `polling` flag).
- page-change.listener docstring: honest about the move/rename/delete over-skip
(loop-guard keys only on lastUpdatedSource) -> ~poll-interval latency, not loss.
- AGENTS.md: document the root /git smart-HTTP route + GitSyncModule.
- Remove redundant redteam-provenance.spec.ts (covered e2e in
persistence.extension.spec.ts:145).
- Extract the duplicated SIGTERM->SIGKILL+finish block (watchdog + abort) into
terminateChild; centralize watchdog-timer teardown in done().
Architecture (deferred, documented): mcp schema header now carries the three-copy
keep-in-sync + schema-core note; the editor-ext contract test documents that the
mcp copy and attribute-behaviour drift (details `open`) are not mechanically
covered yet.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
7179f8a5b2 |
fix(git-sync): address PR #119 review — close 403/404 space-existence leak + warnings/tests/arch
Security (must-fix):
- /git smart-HTTP gate: an authenticated NON-member of a git-sync space now gets
404 (not 403), so the 403<->404 difference can no longer be used to brute-force
which spaces exist / have git-sync enabled. 403 is reserved for a MEMBER who
lacks the required role (existence already known). New gate input
userIsSpaceMember; decision-table + service specs extended.
Config (must-fix):
- Remove the dead GIT_SYNC_SSH_KEY_PATH knob (getter + validation field + two
.env.example lines) — it had zero consumers and advertised a nonexistent push
capability.
Stability/docs (warnings):
- Wire the lost-lock AbortSignal into runReceivePack -> git http-backend so the
receive-pack child is killed if the per-space lock lapses mid-write.
- Raise the divergent-`docmost` (invariant §5) push refusal from info -> warn and
surface divergentDocmost in the run status (/status).
- Comment the stale read-after-debounced-collab-write updatedAt in
importPageMarkdown (deferred §10 loop-guard must not trust it).
- Fix the Dockerfile comment: the loader uses require.resolve + dynamic import(),
it deliberately does NOT require('@docmost/git-sync').
- Merge the two near-identical space toggle handlers into one parameterized
handler; add the 2 missing en-US i18n keys for the auto-merge switch (ru-RU not
maintained for these git-sync strings, mirrored).
Tests:
- isGitSyncHttpEnabled() default-branch (unset -> isGitSyncEnabled fallback).
- agentSourceFields 'git-sync' case (source stamped, chat key omitted).
- editor-ext name-level schema contract (vendored mirror superset of editor-ext
node/mark types) + the new shared resolver + non-member 404 gate cases.
Architecture:
- Extract resolveRequestWorkspace shared by DomainMiddleware + GitHttpService
(the two real self-hosted/cloud copies; McpService has no cloud branch).
- Document the in-process setInterval multi-replica limitation + BullMQ/fencing
future direction (deferred, not implemented).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
fe4adf23a0 |
fix(git-sync): unwedge per-page conflicts, preserve callout types, flush collab on disconnect
Addresses QA findings on PR #119 (issues #235/#236). SYNC-WEDGE (HIGH): one same-line conflict on one page froze sync for the WHOLE space in both directions forever. The pull's docmost->main merge left the vault mid-merge, so every later cycle's isMergeInProgress() check returned skipped:"merge-in-progress" and skipped the entire space with no recovery. - pull.ts now COMMITS a conflicting merge with markers in place (commitMerge): cleanly-merged pages land, the conflicted page carries its markers on main and is isolated by the existing push-side conflict-marker skip (markers never reach Docmost), and the next cycle is no longer wedged. conflictedPaths is surfaced. - cycle.ts now RECOVERS a vault left mid-merge by a prior/pre-fix cycle: it aborts the stale merge (merge --abort, hard-reset fallback) and continues, instead of skipping the space forever. - git.ts: listUnmergedPaths / commitMerge / abortMerge / resetHardToHead. CALLOUT TYPE FIDELITY: git-sync's CALLOUT_TYPES was missing "note" and "default" (editor-canonical types), so [!note]/[!default] callouts flattened to [!info] on every round-trip. Aligned the list with @docmost/editor-ext getValidCalloutType. LOSS-ON-FAST-CLOSE: editing a page then closing the tab inside the collab debounce window (~3-18s) lost the edit, because with unloadImmediately:false Hocuspocus does not flush the debounced onStoreDocument on the last-client disconnect. PersistenceExtension.onDisconnect now flushes the pending store (debouncer.executeNow) on the last disconnect only, with no redundant write. DUPLICATION re-verify (#1): the schema-default merge-key normalization is intact; faithful toYdoc-based reproduction shows callout + rich content resync with 0 ops and no growth/strip across cycles -> the re-report was leftover vault data, not a live regression. Locked with a callout regression spec. Tests: git-sync 688 pass (incl. real-VaultGit wedge-recovery integration); server git-sync+collaboration 285 pass; new callout merge/fidelity + onDisconnect-flush specs. tsc --noEmit clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
32e99c6e42 |
fix(editor,git-sync): parse details open as a boolean so open state survives render/round-trip
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
e48d7720e9 |
fix(git-sync): propagate nested details open; drop dead delete-cap wiring; cover lost-lock abort + lose-prone atom round-trips
Addresses review 1863 (delta) on PR #119. MUST-FIX: - detailsToHtml (the raw-HTML path used for a details nested inside columns/spanned cells) now emits `<details${open}>`, mirroring the top-level case, so `open` no longer silently drops every round trip. - Remove the dead `resolveApplyClient` delete-cap hook from the engine `runCycle`: the orchestrator stopped passing it, so the hook + its dry-run pass were inert. Deletes are soft (Trash) + always logged and engine convergence is the guard, so no cap is re-added — just the dead wiring removed. TEST COVERAGE: - space-lock: heartbeat refresh CAS-miss (eval -> 0) and Redis-error (eval throws) both abort the in-flight fn's signal. - cycle: a pre-aborted signal (and an abort during the pull read) throws before the push apply / first destructive phase. - converter: htmlEmbed source VALUE + height survive; encode/decode UTF-8 symmetry and '' -> ''; footnote definition body + ref/def id match; transclusionReference both ids survive; fix the bad transclusionSource fixture (wrong `pageId` attr + empty content -> schema `id` + a block child); nested details `open` parity test. - orchestrator: autoMergeConflicts:true reaches engine settings; default false on a missing settings row. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
b5ce63a956 |
feat(git-sync): Obsidian-native callouts (> [!type]) instead of :::type
Callouts now export as Obsidian's blockquote-callout syntax — `> [!type]` opener plus a `>`-prefixed body — so they render as real callouts when the vault is opened in Obsidian, instead of `:::type` (Docusaurus-style) which Obsidian shows as a plain blockquote. - Export (markdown-converter `case "callout"`): `> [!type]` + each body line blockquote-prefixed (a blank line becomes a bare `>` so the callout is not split). Nested callouts naturally become `> > [!type]`. - Import (preprocessCallouts): a new branch recognizes `> [!type]` openers and the contiguous `>`-prefixed body, strips one blockquote level and recurses (so nested callouts work), emitting the same callout div the `:::` path produces. The legacy `:::type` parser is KEPT so existing vaults keep importing. A plain blockquote (no `[!type]`) stays a blockquote. Tests: 4 converter golden tests updated to the new `> [!type]` output; 4 new import tests (simple, nested, round-trip, plain-blockquote-untouched). The §13.1 gate still round-trips callout losslessly through the real server schema. git-sync vitest 675 (+1 expected-fail), gate 27. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
abd6e3948b |
fix(git-sync): preserve subpages.recursive and details.open on round trip
Found proactively by deepening the round-trip test from node-TYPE survival to ATTRIBUTE fidelity (distinctive attr values per node). Two real losses (the other 3 candidates — mathInline/mathBlock/pageEmbed — were verified to be correct; the probe had used wrong attr names): - subpages `recursive`: the converter emitted a bare div and the schema mirror didn't model the attr, so a recursive subpages reverted to non-recursive on a round trip. Now emits `data-recursive="true"` and the mirror parses it back (matching @docmost/editor-ext). - details `open`: the `open` (collapsed/expanded) state lives on the details node, but the converter emitted the `<details>` wrapper from the summary case without it, so the state was dropped. The wrapper now carries `open`. The round-trip test now also asserts attribute fidelity (12 cases) so these are locked. Schema-surface snapshot updated for the new subpages attr. git-sync vitest 671 (+1 expected-fail), §13.1 gate 27. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
5125296bfa |
fix(git-sync): subpages round-trips (was {{SUBPAGES}} literal) + exhaustive all-node round-trip test
subpages exported to the literal `{{SUBPAGES}}`, which has no markdown/HTML
inverse, so on re-import it came back as a plain paragraph holding the visible
text "{{SUBPAGES}}" — the embed rendered as that literal string on the page
after a sync (round-trip data loss, seen live). It now emits the schema-matching
`<div data-type="subpages">` like every other embed node, so the schema's
parseHTML rebuilds the subpages node. Also dropped the leaf-atom content-hole
in the subpages renderHTML.
New committed regression coverage:
- packages/git-sync/test/roundtrip-all-nodes.test.ts — exhaustive serialize ->
deserialize round trip for ALL 40 node/mark types; each asserts the node/mark
survives and no `{{...}}` literal leaks. This is the test that caught subpages.
- §13.1 gate (git-sync-converter-gate.spec.ts): subpages added to the green
corpus (round-trips through the REAL server schema).
- Corrected two PR-authored tests that asserted the old {{SUBPAGES}} loss as
"by design" — they now assert the fixed round trip.
Also folds in review #1679 coverage-gap tests (no prod change): orchestrator
pollTick/enabledSpaces, datasource 3-way merge dispatch, page.repo
last_updated_source provenance SQL.
git-sync vitest 659 (+1 expected-fail), server tsc clean, server specs green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
a40a00d5c5 |
feat(git-sync): per-space toggle for conflict-marker handling on push (#13)
Red-team #13 (conflict markers reaching Docmost) is now a per-space policy exposed as a UI toggle, instead of a hardcoded behavior. New boolean `gitSync.autoMergeConflicts` (default FALSE), mirroring the existing per-space `gitSync.enabled` flag end-to-end (jsonb space settings -> update-space DTO -> space.service -> client types -> space settings form switch): - OFF (default, safe): a page whose committed body still has unresolved git conflict markers is NOT pushed — it is recorded as a per-page push FAILURE ("unresolved conflict markers — resolve in git first"). Recording a failure (not a soft skip) deliberately HOLDS refs/docmost/last-pushed so the conflict commit is never marked pushed and a later pull cannot clobber the user's in-progress resolution; the page retries until the conflict is resolved in git. - ON: the marker lines are stripped and both sides' content is pushed (the prior behavior), so the conflict becomes visible/fixable inside Docmost. The engine Settings carries `autoMergeConflicts`; runPush threads it into the update AND create paths. The orchestrator's buildSettings reads the per-space flag from jsonb (strict opt-in like `enabled`, default false). Tests: redteam-push-cycle #13 rewritten (default -> not pushed + failure + refs held; ON -> strip-and-push); space.service + edit-space-form + orchestrator specs extended. git-sync vitest 618, server jest space+git-sync 163, client edit-space-form 11, server/client tsc clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
d5079aa1d8 |
fix(git-sync): red-team hardening — 12 confirmed sync-breaking bugs + regression tests
A 10-agent red-team pass on the two-way Docmost<->git sync surfaced 16 ranked findings (9 others triaged out as already-defended). Wrote a reproduction test per finding (each asserts the CORRECT behavior, so it fails on the bug), then fixed the production code so every repro goes green. All confirmed bugs: Round-trip data loss (markdown-converter.ts + docmost-schema.ts mirror): - #1 editor-ext node types silently dropped on export — ported the 8 missing canon nodes (footnoteReference/footnotesList/footnoteDefinition, htmlEmbed, status, pageEmbed, transclusionSource/Reference) into the git-sync schema mirror and added converter cases that emit their schema-matching HTML instead of flattening unknown nodes to '' (this was the critical data-loss flagged in review #1679: footnotes/htmlEmbed lost on sync). Snapshot surface updated. - #2 top-level image lost width/height/align/attachmentId — now emits an HTML <img> (like video/diagrams) when it carries layout attrs; bare images stay . Image node parses width/height as strings so they re-import. - #3 code block containing a ``` fence corrupted on round-trip — outer fence is now widened to (longest-inner-backtick-run + 1). - #16 deep nesting threw RangeError (page never synced) — added a depth guard (MAX_NODE_DEPTH=400) so the converter never overflows the stack. Push/layout/cycle (engine): - #4 disambiguation ' ~slugId' suffix corrupted Docmost titles + order-dependent layout — deterministic, order-independent sibling disambiguation; suffix is stripped from a path-derived title ONLY when the new name is exactly the old title plus the suffix (never a genuine retitle ending in ' ~token'). - #6 retry-adopt by (parent,title) clobbered the wrong duplicate-title sibling — ambiguous (parent,title) is no longer adopted (falls back to fresh create). - #12 a new child under a new parent was created at ROOT — creates are ordered parent-before-child with an in-memory created-id map for parent resolution. - #13 git conflict markers could reach Docmost — bodies are scanned and the marker lines stripped (a '=======' line is only treated as a conflict separator inside a <<<<<<< ... >>>>>>> block, so setext headings are safe). - #15 a divergent `docmost` mirror was escalated by runPush but dropped by runCycle — RunCycleResult now forwards divergentDocmost to the orchestrator. Server (merge / lock / provenance): - #9 3-way merge lost a human's block edit when git inserted an adjacent block — finer-grained diff3 region merge (via lcs) preserves non-overlapping human edits; genuine same-block conflicts still resolve git-wins. - #10 single-writer race — module-static liveLocks closes the same-process TOCTOU window, and a heartbeat refresh that cannot confirm the lock now aborts the cycle at its next write checkpoint (cooperative AbortSignal threaded through runCycle). Cross-process fencing tokens remain a follow-up. - #14 sticky-agent provenance overrode an explicit actor='git-sync' write, blinding the listener loop-guard — resolveSource now lets an explicit actor win over the sticky-agent fallback (explicit agent still wins). Verified: git-sync vitest 617 pass (+1 expected-fail), server unit jest 1541 pass, server tsc clean. A review pass over the fixes caught and corrected a title-suffix over-strip, an inert abort signal, a document-wide conflict-marker strip, and two leaf-atom content-holes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
b536a41ad3 |
chore(git-sync): drop stray build/ artifacts re-introduced during rebase
build/ is gitignored and compiled in CI/Docker; a few files leaked back into the tree while replaying commits onto develop. Remove them so the package keeps a single source of truth (src/). |
||
|
|
28d2560dfd |
fix(git-sync): address PR #119 review (#1571)
Resolve the code-review findings from comment #1571 on PR #119. Engine (packages/git-sync): - Idempotent CREATE on retry: before createPage, look the page up in the live Docmost tree by (parentPageId, title) and ADOPT it instead of duplicating when a prior cycle created it but failed to persist the pageId back to disk. Only trust a COMPLETE tree for the lookup; fall back to createPage otherwise. Covered by new tests incl. a complete=false regression-lock. - Route applyPullActions diagnostics through an injected logger instead of bare console (thread log from the cycle). - Add a timeout to the git execFile chokepoint (runRaw) so a hung git subprocess cannot wedge a sync cycle. - Translate remaining Russian code comments to English. - Remove dead standalone-CLI code (parseArgs/PushParsedArgs, parseSettings/envSchema, loadSettingsOrExit + config-errors.ts) and the matching index exports/specs; keep the Settings type. - Fix the dangling docs link in package.json. - Add a schema-surface snapshot guard so any drift in the vendored document schema is a loud, must-review CI failure (+ provenance header). Server (apps/server): - Add a configurable watchdog timeout to the spawned git http-backend so a stalled push cannot hold the per-space lock forever (GIT_SYNC_BACKEND_TIMEOUT_MS). - Close the in-process TOCTOU window in SpaceLockService.withSpaceLock by reserving the slot synchronously before acquire. - Add tests: removePage git-sync provenance (both branches), ensureServable force-push-protection git configs, and the phase-B+ datasource methods. Docs / build: - AGENTS.md: list git-sync as the fifth workspace package and note the three schema mirrors; fix the dangling git-sync-plan.md backlog link. - pnpm-lock.yaml: add the missing @docmost/git-sync workspace link so pnpm install --frozen-lockfile (CI default) succeeds. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
52959de2f3 |
chore(mcp): drop build/ + node_modules leftovers after rebase
These files (build/lib/footnote-analyze.js, build/lib/footnote-lex.js from the merged footnote work, and the y-prosemirror node_modules symlink) survived the rebase because this branch's earlier "stop committing build/ and node_modules" commit predated them. They are gitignored (packages/mcp/build/) and generated / symlinked, so untrack them to keep the branch consistent with that decision. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
5da12e89f9 |
refactor(git-sync): internalize the engine — first-class ESM, no vendoring bridge (#119 review)
Closes the architecture item from the #119 review: drop the "vendored from docmost-sync" framing and the CJS↔ESM `Function('import()')` bridge so the engine is a normal first-class gitmost package. Part 1 — vendoring markers removed (prose only, zero behavior change): reworded "VENDORED into gitmost" / "vendored from docmost-sync" / "Engine LOGIC is byte-identical" / "it's a port" comments across the engine. Behavior-bearing strings are untouched: BOT_AUTHOR_NAME/EMAIL and the `Docmost-Sync-Source:` provenance trailers (changing them would break git authorship + the loop-guard). Part 2 — the package is now ESM (matching the sibling @docmost/mcp): `type: module`, tsconfig Node16, `.js` extensions on relative imports, and a static `import { marked }` replacing the `new Function('return import(...)')` / `loadMarked` hack — the bridge is GONE from the package. The CommonJS NestJS server loads the now-ESM engine via a new `git-sync.loader.ts` that mirrors the existing `docmost-client.loader.ts` mcp loader exactly (Function-indirected dynamic import + cached promise + retry-on-reject). The 4 server consumers (orchestrator/datasource/vault-registry/git-http-backend) call `await loadGitSync()` for value exports; types stay `import type` (erased). The converter-gate spec — which needs the real converter — loads the package's TS source via a jest moduleNameMapper + isolatedModules (documented in that spec); the other git-sync specs mock the loader. Verified: engine builds pure ESM (no Function/require leftover), vitest 614, editor-ext build, server + client tsc, full server jest 1397/0. Live stand smoke-test: server starts clean on the ESM engine (no ERR_REQUIRE_ESM), a real sync cycle runs through the loader, and the basic e2e suite is 12/12 (clone via git-http-backend, push, pull, delete, 3-way merge — all through the new loader). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
f6d22a59a6 |
fix(git-sync): screen non-page files out of PUSH (CRITICAL — review)
Self-review of phase 3 caught a data-corruption regression: nativeMeta always supplies the run's spaceId, so the planner's 'create-without-spaceId' skip — which had doubled as the only filter for non-page files — went dead. An ADDED .obsidian/*.json, attachment, or dotfile (committed to the vault, no .gitignore) would then be classified as a CREATE: a junk Docmost page, plus a gitmost_id frontmatter written INTO the file, corrupting it. Fix: isPageFile(path) — a .md file with NO dot-segment anywhere — and filter the diff to page files at the very top of computePushActions, BEFORE any classification, so non-page A/M/D/R are ignored (design §Адопция). 2 unit tests pin it (.obsidian/json, attachment, dotfile, dot-segment, .md dotfile all ignored; real pages still created). 614 engine tests green. Also: refreshed stale docmost:meta comments to gitmost_id (review SUGGESTION), and documented the deferred adoption frontmatter-preservation gap (review WARNING) in page-file.ts + the design doc (do NOT roll native onto a real vault with Obsidian properties until phase 4 round-trips them). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
d255afa611 |
feat(git-sync): phase 3 — PUSH reads native gitmost_id + derives title/parent from path
PUSH now consumes the native-Obsidian format end-to-end: - identity from the gitmost_id frontmatter (parsePageFile), not docmost:meta; - title from the FILENAME, parentPageId from the enclosing folder's folder-note (parentFolderFile is now FOLDER-NOTE aware: a child's parent is dir/dir.md, and a folder-note's own parent is one level up), spaceId from the run (every vault file belongs to the vault's space); - CREATE derives title/parent/space from path + run and writes the assigned pageId back as gitmost_id frontmatter (serializePageFile); - UPDATE pushes the STRIPPED body (current + 3-way-merge base), so the frontmatter never leaks into Docmost content; the loop-guard hashes the body. The PURE delete-sensitive classifier (computePushActions/classifyRenameMoves) is UNCHANGED — only the injected IO resolvers (metaAt, parent, create write-back) switched source. nativeMeta always carries the run spaceId, so the legacy 'create-without-spaceId' skip no longer fires through runPush. Tests rewritten to native fixtures + folder-note parent paths; the noop case is now a child under a renamed parent folder (filename=title, so a path-only-noop needs an ancestor rename). parentFolderFile tests cover leaf/folder-note/nested/ dotted. 612 engine tests green; engine rebuilt. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
73c5c44301 |
feat(git-sync): phase 2b — PULL writes native gitmost_id frontmatter
PULL now serializes each page as the native-Obsidian format (serializePageFile: a minimal gitmost_id frontmatter + the fixpoint markdown body) instead of the heavy docmost:meta envelope. title/parent/space are derived (filename / folder / repo), so only the pageId is persisted. readExisting recovers identity from the gitmost_id frontmatter (parsePageFile) instead of docmost:meta. Extracted stabilizePageBody() (the export->import->export fixpoint, no meta) so the native writer and the legacy serializer share the same deterministic body — re-pulls of an unchanged page stay byte-identical (loop-guard). Tests: read-existing fixtures rewritten to gitmost_id; apply-pull asserts the written text is native frontmatter and carries NO docmost:meta (regression guard). 611 engine tests green. NOTE: PUSH still reads docmost:meta — the end-to-end cycle is intentionally NOT runnable until phase 3 (PUSH reads frontmatter + derives title/parent from path) lands; no vault is wiped/deployed until then. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
8c42c4f0d6 |
feat(git-sync): phase 2a — folder-note layout (parent -> Folder/Folder.md)
Native-Obsidian structure: a page WITH children now lives at its folder-note <name>/<name>.md (LostPaul Folder Notes convention) with its children alongside; a leaf stays <name>.md. Folder-notes claim their canonical path before a same-named child, so the child (a leaf) is the one disambiguated, never the folder-note — a folder X/ always contains its own note X. Format-agnostic and safe in isolation: only the destination PATH changes, the file content/serialization is untouched, so an existing parent relocates via the move-by-id path (no delete). The frontmatter format flip (pull+push) is next. 6 new layout unit tests (leaf / parent / nested / child-named-as-parent / twin-parents / childless). 611 engine tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
071eae4e2a |
feat(git-sync): drop legacy docmost:meta back-compat (vaults wipe+rebuild)
Per owner: test data, no migration. parsePageFile no longer reads the old docmost:meta block — a file without a gitmost_id frontmatter is simply un-tracked (adopt). Vaults are a cache: rm -rf on the transition, rebuilt native from Docmost. Simplifies the format work (no fallback). Doc updated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
a91405632e |
feat(git-sync): native-Obsidian format — phase 1 = page-file (frontmatter gitmost_id)
Pivot the thin-meta design to "the vault IS a native Obsidian vault": clean markdown + a minimal YAML frontmatter `gitmost_id:` (the durable pageId, travels with the file so identity survives any move); folders mirror the page tree with the parent's body as a folder-note `<Folder>/<Folder>.md` (LostPaul Folder Notes convention); links as `[[wikilinks]]` (basename-resolved → reparent never breaks a link, only retitle does); collisions disambiguated Obsidian-style; `.obsidian/` and non-page files left untouched (no .gitignore). Verified the conventions against the Obsidian/Folder-Notes docs. Replaces the abandoned `.gitmost/index.json` sidecar (path-keyed → fragile to git-undetected renames; the in-file id is self-sufficient): removes vault-index.ts. Adds lib/page-file.ts — parsePageFile/serializePageFile (frontmatter id + clean body) with a LEGACY `docmost:meta` fallback for migration. 6 unit tests; engine suite green. Not yet wired into pull/push — no behavior change. Design doc rewritten to the native-Obsidian format. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
aa1ee64b7a |
feat(git-sync): thin-meta phase 1 — the .gitmost/index.json sidecar module
Pure read/write/lookup for the vault sidecar index that will hold page identity (pageId) + collision token (slugId) keyed by file path, so the .md files can be clean markdown. parseVaultIndex is tolerant (missing/garbage/bad entries degrade to empty/skipped — never crashes a cycle); serializeVaultIndex is deterministic (sorted keys -> stable diffs, no churn). Lookups (pageIdAt, pathForPageId reverse, trackedPageIds) + mutations (set/remove/move). NOT wired into pull/push yet — no behavior change. 5 unit tests; engine suite green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
40ca04eb08 |
fix(git-sync): never trash a page whose pageId still exists in the tree (cross-cycle move) + browser e2e
Follow-up to
|
||
|
|
c3dbee9fbf |
fix(git-sync): never trash a page that only MOVED (pageId-identity, not git rename heuristics) — data loss
CRITICAL data-loss bug: creating pages in Docmost (which start UNTITLED) and then typing a title could soft-delete OTHER pages. Untitled pages all serialize to the `_` fallback filename; the layout disambiguates them (`_.md`, `_ ~slug.md`). Retitling one frees the bare `_` and another untitled page's file relocates into it. git's rename detection (`-M`) can't see the move (the tiny meta-only files are too dissimilar), so `git diff` reports it as DELETE(old) + ADD/MODIFY(new). The push took the DELETE literally and trashed a live page. Root cause is that the push trusted git's path-level rename heuristic for page IDENTITY. Identity is the pageId. Fix: before emitting any delete, coalesce by pageId — a pageId that is BOTH deleted (pre-image) AND present on the surviving side (current meta of an ADD or a MODIFY, since a relocation into an occupied path shows as M) is one page that MOVED, classified as a rename/move and NEVER a delete. Reproduced + verified on a live stand: 4 untitled pages + retitle one trashed a different page before; after the fix, retitling one (and stress-retitling all) trashes nothing. Engine suite 598 green; 3 new computePushActions cases (ghost D+A move -> rename; real delete still deletes; unrelated D+A stay delete+update). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
dc7a0ec9f5 |
refactor(git-sync): move the PULL->PUSH cycle into the engine as runCycle (PR #119 review, arch #1)
The reconcile choreography (ensureRepo -> merge-check -> ensureBranch ->
checkout('docmost') -> pull -> push) was hand-rolled in the app orchestrator's
driveCycle, duplicating an order the vendored engine owns and could drift from on
upgrade — the failure mode is data clobber. Lift it into @docmost/git-sync as a
single entry point, `runCycle(deps)`. The orchestrator now calls runCycle and
keeps only the lock (its caller) and the gitmost-specific delete-cap POLICY,
injected as the `resolveApplyClient` hook (the engine does the dry-run, hands the
hook the planned delete count — Infinity if planning failed — and uses whatever
client it returns for the apply). driveCycle drops from ~150 lines to ~30.
Tests:
- engine test/cycle.test.ts: composition (merge-in-progress short-circuit;
ensureRepo->ensureBranch->checkout staging order before the pull; the cap hook
is consulted with the planned count; no dry-run when no hook).
- engine test/cycle-roundtrip.test.ts: runCycle against a REAL VaultGit in a temp
repo with a faked Docmost client — a git-originated CREATE flows pull->push and
the assigned pageId is written back; an unresolved merge short-circuits before
any client call.
- orchestrator spec rewired to mock runCycle and assert the wiring + the
resolveApplyClient cap policy (the engine-internal cycle-order/merge tests moved
to the engine).
Validated end to end on a live stand (real Postgres/Redis + server): a git clone
-> edit -> push over the /git remote round-trips the change into the Docmost page
through the refactored cycle.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
cca0bfe306 |
docs(git-sync): remove dangling references to the deleted git-sync-plan doc (PR #119 review)
The implementation spec docs/git-sync-plan.md was removed as completed, but ~44 code comments still cited it as "plan §N". Strip those citations (comments only), keeping each comment grammatical. The vendored engine's own "SPEC §N" references point at a different, still-present spec and are left untouched. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
807ff1f5f5 |
chore(mcp): stop committing build/ and node_modules; build in CI/Docker
Same hygiene fix as git-sync (review #2), applied to packages/mcp which had the identical pre-existing problem: committed build/ (20 files) + node_modules (28, pnpm symlinks with a baked /home/claude store path). - git rm --cached packages/mcp/{build,node_modules}. - .gitignore: add packages/mcp/build/ (packages/*/node_modules/ already covers it). - Build where consumed: apps/server `pretest` and the CI Test workflow now build @docmost/mcp too. The Dockerfile builder already runs `pnpm build` (nx builds mcp) and already COPYs packages/mcp/build into the runtime image. Verified: wiped build/, rebuilt via `pnpm --filter @docmost/mcp build`; the mcp server suites (96 tests) pass against the freshly-built, non-committed output. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
fa89cba023 |
feat(git-sync): three-way body merge using the last-synced base (no edit loss)
Upgrades the 2-way body merge to a real diff3 three-way merge (review #5), so a block ONLY the human changed is KEPT when git changed a DIFFERENT block — the 2-way merge would revert it to git's stale version. Engine: the push update loop reads the last-synced pre-image (`git.showFileAtRef(refs/docmost/last-pushed, path)`) and passes it as the optional `baseMarkdown` to `client.importPageMarkdown` (the common ancestor). Server: gitmost-datasource converts base+incoming, and writeBody runs a block- level diff3 (new three-way-merge.ts `diff3Plan`): live-only change -> keep live, git-only change -> take git, both-changed -> git wins (conflict policy), inserts/ deletes from either side preserved. Without a base (createPage) it falls back to the 2-way merge. Crash-safety unchanged (docs built before the connection opens). Tests: three-way-merge.spec.ts (14 — every diff3 case incl. the cross-block preservation and conflict policy), yjs-body-merge 3-way (real Y.Docs: human's block instance preserved while git's block is applied), plus an engine test that the base is forwarded from showFileAtRef. Existing push assertions updated for the new base arg. git-sync 589 pass; server merge/datasource/gate 62 pass; typecheck clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
98253cf614 |
chore(git-sync): stop committing build/ and node_modules; build in CI/Docker
Review finding #2: packages/git-sync/build/ (the COMPILED engine) and the package's node_modules/ were committed. Prod executed the committed build/ while CI/tests ran src/ and never rebuilt it — so a fix in src/ could pass tests while stale compiled code shipped (a silent src/prod skew). The committed node_modules were pnpm symlinks with a baked machine-local store path (/home/claude/...), useless and misleading for everyone else. - git rm --cached packages/git-sync/{build,node_modules} (42 + 31 files). - .gitignore: ignore packages/*/node_modules/ and packages/git-sync/build/. - Build the package where it is actually consumed: apps/server `pretest` now builds @docmost/git-sync (its suite imports the built build/index.js), and the CI Test workflow gains an explicit "Build git-sync" step. The Dockerfile builder already runs `pnpm build` (nx builds the package) and now COPYs the fresh build/. Verified: wiped build/, rebuilt via `pnpm --filter @docmost/git-sync build`, then the server converter gate (26/26, imports the rebuilt package) and the git-sync suite (588 passed) both pass against the freshly-built, non-committed output. NOTE: packages/mcp/ has the same committed-build/node_modules pattern (pre-existing, out of this PR's scope) and should get the same treatment in a follow-up. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
d06cf97ed6 |
test(git-sync): exhaustive converter coverage + fix 3 round-trip data-loss bugs
Coder↔reviewer design loop (9 rounds, reviewer verdict: exhaustive) produced 92 specs; implemented +123 tests (465 -> 588 passing). The new round-trip coverage exposed three genuine data-loss bugs in the Markdown<->ProseMirror converter, all now FIXED (round-trip is lossless for these): 1. pageBreak was lost on export (no converter case -> rendered to "" and the node vanished). Now emits <div data-type="pageBreak"></div>, which the schema parses back -> round-trips. 2. A block image between blocks left an empty <p> artifact after import-hoisting, producing a phantom blank-gap diff on every sync. markdownToProseMirror now strips content-less paragraphs after generateJSON — with a schema-validity guard that keeps the obligatory single empty paragraph in `content: "block+"` containers (tableCell/tableHeader/blockquote/column/callout/doc), so empty cells/quotes never become an invalid `content: []`. 3. The `code` mark combined with another mark was not byte-stable (emitted nested HTML that the schema's `code` `excludes:"_"` collapsed on import). The converter now emits code-only when `code` co-occurs, matching the editor. New coverage spans media/diagram/details/columns/math/mention attribute round-trips, converter emission branches, git error paths, and engine decision branches. A dedicated test pins the empty-container schema validity (the review catch on the bug-2 fix). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
d9d1d54aaa |
test(git-sync): add reviewer-requested coverage across engine, server, client
Implements the test cases called out in the PR #119 review threads (code-review, test-strategy report, red-team) — TESTS ONLY, no production code changes. packages/git-sync (vitest): - lib converter/markdown gaps: pageBreak data-loss (it.fails repro), subpages lossy round-trip, nested/fenced callouts, ol->taskList bridge, column.width number<->string drift, empty details. - engine units: parentFolderFile, planReconciliation swap/chained move, buildVaultLayout last-resort-by-id, firstDivergence, applyPushActions / applyPullActions failure isolation. - real temp-git integration: diffNameStatus -z rename+add/modify alignment, copy-line behavior, per-invocation committer identity (no leak into repo/global config). - ENFORCED type-level GitSyncClient contract via vitest typecheck over a *.test-d.ts file (tsconfig.vitest.json; build tsconfig untouched). apps/server (jest): - orchestrator: delete-cap neutralization + fail-safe, Redis lock / mutex skip ladder + release-on-throw, merge guard, pull/push order, remote template substitution, poll lifecycle. - page-change listener: loop-guard, debounce coalescing, id resolution, error swallowing. - vault registry, controller authz (trigger + status), env validation/getters, page.service git-sync provenance stamping, persistence precedence (agent > git-sync > user) + no boundary snapshot, space.service audit-delta, space.repo jsonb-merge, converter-gate corpus extension (mention/math/details/marks). apps/client (vitest + testing-library): - history-item git-sync badge: render gating + non-clickable. - edit-space-form toggle: initial state, optimistic payload, rollback on error, disabled states. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
70bd0dba4d |
feat(git-sync): vendor IO engine (pull/push/git/settings) with GitSyncClient seam (Phase A.3)
Vendor the IO engine from docmost-sync into packages/git-sync/src/engine: - git.ts (VaultGit, execFile shell-out — verbatim) - pull.ts (readExisting, computePullActions, applyPullActions) - push.ts (classifyRenameMoves, computePushActions, applyPushActions, runPush) - settings.ts adapted (pure parseSettings + Settings type; no process.env binding — the server builds Settings from EnvironmentService later), config-errors.ts. CLI main()/import.meta entrypoints dropped (server drives in-process). Client seam: new engine/client.types.ts defines GitSyncClient; pull.ts/push.ts now use Pick<GitSyncClient, ...> instead of the non-vendored DocmostClient. Engine logic byte-identical except a zod4-compat fix in config-errors (zod4 dropped the issue.received==='undefined' signal; match /received undefined/ on the message). Ported the engine unit tests (compute/apply pull+push actions, classify-rename- moves, run-push, settings, config-errors) incl. real-git temp-repo tests: 431 pass / 3 expected-fail (was 314/3). REST/CLI-coupled upstream tests skipped (noted). CJS build clean. No apps/server wiring yet (next step). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
b0cd4bd6cf |
feat(git-sync): CommonJS build + §13.1 editor-ext idempotency gate (Phase A.2)
Make @docmost/git-sync natively consumable by the CommonJS server (and jest):
build to CommonJS (tsconfig module CommonJS, drop type:module, strip .js from
relative imports), and lazy-load the only ESM-only dep (marked) via the dynamic
Function('import()') trick (mirrors docmost-client.loader.ts) with a require()
fallback so vitest's evaluator works too. git-sync tests stay green (314 pass,
3 expected fail).
Add the §13.1 idempotency gate (apps/server .../git-sync-converter-gate.spec.ts):
13 editor-ext docs (paragraphs/headings, marks, links, bullet/ordered/task lists,
blockquote, callouts, code block, hr, table, nested mix) round-trip
content(editor-ext) -> convertProseMirrorToMarkdown -> markdownToProseMirror ->
TiptapTransformer.toYdoc/fromYdoc(tiptapExtensions) -> canonicalize and assert
docsCanonicallyEqual. All green => the vendored converter's docmost-schema is
schema-compatible with editor-ext (no node/mark/attr loss), which the plan §13.1
requires before Phase B. The one intrinsic markdown-image lossiness (width/height
/align can't ride plain ) is isolated in a KNOWN DIVERGENCE block, not
hidden. Server tsc clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
56ab17fbc2 |
feat(git-sync): vendor pure converter + engine into @docmost/git-sync (Phase A.1)
First step of docs/git-sync-plan.md. New workspace package @docmost/git-sync vendoring the PURE parts from docmost-sync (HEAD b03eb35): - lib: markdown-converter, markdown-document, canonicalize, docmost-schema, node-ops, diff, and an extracted markdown-to-prosemirror (only the pure marked->HTML->generateJSON path from upstream collaboration.ts; no websocket). - engine (pure, no IO): reconcile, layout, sanitize, stabilize, loop-guard. Ported the upstream pure-module + round-trip corpus tests (vitest): 314 pass, 3 expected upstream known-limitation fails. tsc clean. No server wiring yet. docmost-schema inlines getStyleProperty (as packages/mcp does — @tiptap/core 3.20.4 doesn't export it). IO engine (pull/push/git/settings) deferred to later Phase A/B steps; the editor-ext idempotency gate (plan §13.1) is the next step. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
c5109aa2a3 |
Merge pull request 'feat(footnotes): author-inline footnotes + deterministic server canonicalization (#228)' (#232) from feat/228-inline-footnotes into develop
Reviewed-on: #232 |
||
|
|
c4ed4a4855 |
fix(footnotes): strip bare definitions on rebuild; MCP full-doc + zip-import canonicalize tests (#228)
Review #6 (approve-with-comments) follow-ups: 1. canonicalize step 7 now strips bare footnoteDefinitions at ANY depth (stripFootnoteDefinitionsDeep), not just footnotesList, in BOTH copies. A definition hand-authored outside a list (e.g. nested in a callout via a raw-JSON write path) was left in place while a copy was also added to the rebuilt list -> duplicate, idempotent, self-perpetuating. Runs only in the rebuild path (after the lists are stripped); the fast-path / placement-keep branch is untouched. Added a shared-corpus case (bare def nested in a callout) to pin it in both mirrors. 2. markdown-clipboard: removed the dead top-level footnoteReference check in canonicalizePastedFootnotes (an inline atom is never a top-level slice child; only the descendants scan can find it). Test coverage: 4. New MCP binding tests (full-doc-write-canonicalize.test.mjs): update_page_json and copy_page_content canonicalize the persisted full doc, asserted via a new `replacePage` seam (symmetric to the existing `mutatePage` seam) so no live collab socket is needed. Routed both writers through the seam. 5. New server spec (file-import-task.service.footnote-canonicalize.spec.ts): the zip-import path (processGenericImport) canonicalizes footnotes — real markdown->HTML->JSON via a real ImportService over a temp-dir .md file, DB trx stubbed to capture the persisted page content. FileImportTaskService had no spec before. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
9c1f952b2f |
fix(footnotes): guard insert against nested/bare definitions, skip definitions-only paste, doc + reorder fixes (#228)
Must-fix: - insertInlineFootnote could glue a footnoteReference inside an EXISTING definition (nested footnotesList, or a bare footnoteDefinition with no list wrapper), which canonicalize then dropped as an orphan — silently losing the definition's prose. Now: (a) the body/notes boundary is computed from the first top-level block that IS or CONTAINS (recursively) a footnotesList/ footnoteDefinition, not just a top-level list; and (b) the insertNodesAfterAnchor core skips footnotesList/footnoteDefinition subtrees entirely (skipSubtreeTypes), so an anchor whose only match is inside a definition -> inserted:false (clean abort, no write). Added tests: nested-definition, bare-definition, and body-before-nested-list-still-inserts. - editor-ext footnote-canonicalize header listed `markdownToProseMirror` among the canonicalizing MCP paths; it is the NON-canonicalizing primitive. Replaced with `markdownToProseMirrorCanonical` (+ note that the plain primitive is for comment bodies) and added copy_page_content. - Client paste: canonicalizePastedFootnotes now skips a definitions-ONLY paste (no footnoteReference anywhere) — canonicalizing it would strip the reference-less list and yield an EMPTY paste. Added a test. Suggestions: - docmost_transform now runs validateDocStructure/validateDocUrls on the RAW transform output BEFORE canonicalizeFootnotes (mirrors updatePageJson), so a too-deep doc gives the intended max-depth error instead of a stack overflow. - docmost_transform tool description now states the RESULT is footnote-canonical (dryRun diff may show tidy-ups; idempotent after first run). - insertFootnote: dropped the dead `result ? … : undefined` ternaries and the `as any` casts (result is always set by the time we return; the not-found path throws and aborts mutatePage). `const r = result!;`. Tests / architecture: - Added a LIVE-plugin golden case: the real footnoteSyncPlugin leaves a list with non-empty content after it in place, and canonicalize agrees (placement parity is now a driven property, not a hand-set expected). - Added generateFootnoteId uuidv7 shape + uniqueness test. - Item 9: added the ENFORCEMENT-RULE comments at the server parseProsemirrorContent and the MCP canonicalizer header (any NEW full-doc persist path MUST canonicalize; fragments/append/prepend and comment bodies MUST NOT). Kept per-call-site over a brittle grep CI test (the replace-vs-fragment + comment-vs-page nuance makes a single wrapper unsafe). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
3fd66b4245 |
fix(footnotes): don't canonicalize comment bodies (data loss); canonicalize only page write paths (#228)
Must-fix (REAL DATA LOSS): - markdownToProseMirror is reused for COMMENT bodies (createComment/updateComment). It unconditionally canonicalized, so a comment carrying a standalone footnote definition ([^1]: text with no matching reference) had its whole footnotesList stripped (referenceIds.length===0 -> stripFootnotesListsDeep) — the text vanished. Fix: markdownToProseMirror no longer canonicalizes (content-preserving primitive); a new markdownToProseMirrorCanonical wraps it for the PAGE write paths (markdown import via importPageMarkdown, update_page markdown via updatePageContentRealtime). Comment callers keep the non-canonicalizing primitive. Updated the now-false header comment and added create/update-comment inline notes. Added collaboration tests: comment path PRESERVES a reference-less definition; page path still drops it AND still reorders real footnotes. Updated the page-import canonicalization test to use the canonical variant. Suggestions / architecture: - #2: collapsed transforms.footnoteDefinition onto the shared makeFootnoteDefinition factory (adds only the inner paragraph block id); kept the dependency direction transforms -> footnote-authoring (no circular import, mirror stays pure). - #3: confirmed docmost_transform auto-canonicalization is documented (inline comment, tool description, CHANGELOG) — no code change. - #4: copyPageContent is a FULL-document write (replacePageContent of a type:"doc"); added a defensive canonicalizeFootnotes pass (no-op on already-canonical source). - CHANGELOG entry refined to list the FULL-document write paths (incl. copy_page_content) and to state canonicalization is NOT applied to comment bodies. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
40d1cdfc77 |
refactor(review): address #230 third review — callout dedup, ticket/type tidy
Approve-with-comments follow-ups (no blockers): - callout: unify the GitHub-callout feature ticket on #192 (the callout-paste feature the CHANGELOG already tracks); #218 is the public-share security work. Fixed the code comment and test reference. - export/utils.spec: pin current behavior of a leading-dot name (".gitignore" -> "") — same bug class as #204 but unreachable via the sole caller, so document not change. - share.types: narrow ISharedPage to the actual /shares/page-info allowlist (page -> Pick of id/slugId/title/icon/content; trimmed share; dropped the spurious `extends IShare`). Verified all three consumers (shared-page, link-view, mention-view) read only allowlist fields. - editor-ext: extract shared CALLOUT_TYPES / normalizeCalloutType / renderCalloutHtml into callout-common.marked.ts; both tokenizers (`:::type` and `> [!type]`) now share the renderer + type dict while staying separate. Eliminates the byte-identical renderer + duplicated type list. - share.service: extract named predicate shareIdGrantsAccess(requestedShareId, resolvedShare) for the id-or-key fast path (naming only, no control-flow change); kept narrower than resolveReadableSharePage's id-only gate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
a77a0bc92b |
fix(footnotes): re-review #232 — refuse footnoteRef into codeBlock/definition, deep-strip nested lists, docs + cross-copy guard (#228)
Must-fix: - REAL BUG: insertInlineFootnote could splice a footnoteReference (inline atom) into a codeBlock or an existing footnoteDefinition, persisting a schema-invalid doc (insert_footnote skips validateDocStructure). Now the search is bounded to the BODY (before the first footnotesList) and the insertNodesAfterAnchor core refuses textblocks that can't hold the atom (codeBlock); when the only match is in such a place the insert returns inserted:false and the write aborts cleanly. Reachable via docmost_transform too. Added codeBlock / definition / fall-through tests. - Fixed the deepEqualJson doc comment in both copies: arrays are order-SENSITIVE (correctness depends on it), only object keys are order-insensitive. - README.ru.md MCP tool count 38 -> 39 (lines 36/47/63), matching README.md/AGENTS. - CHANGELOG [Unreleased] Added entry for insert_footnote + server-side footnote canonicalization on non-editor write paths (#228). Suggestions: - canonicalize step 5/7 now strips footnotesList at ANY depth (both copies), so a schema-valid list nested in a callout/blockquote can't leave duplicate defs. - Exclude the test-only footnote-corpus.ts fixture from the editor-ext build (tsconfig), so it no longer ships in dist/. - Removed the duplicate manual canonicalize cases from the MCP unit test (the shared corpus covers them via full deepEqual); kept idempotence + immutability. - insertInlineFootnote dedup key now keys off the inline array directly (footnoteContentKey({ content: inline })) instead of a throwaway node. Tests / architecture: - New client-wrapper test (#9): overrides a small mutatePage seam to assert the not-found path throws and persists NOTHING, and the success path shapes footnoteId/reused/message/verify and writes the right content. Fixed the misleading comment in footnote-write.test.mjs. - B: cross-copy corpus parity guard test (loads both corpora, asserts deep-equal) so a typo in one copy can't pass both suites green. - A: declined — the full-vs-fragment decision lives at the call site, so a prepareDocForPersist wrapper would be a bare alias for canonicalizeFootnotes; kept the existing per-call-site comments instead. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
07ebd8c63e |
fix(footnotes): address PR #232 review — fragment-safe canonicalization, plugin placement parity, dead-code removal (#228)
Must-fix: - Move canonicalizeFootnotes OUT of parseProsemirrorContent. It now runs only on FULL writes (createPage, updatePageContent operation==='replace'), never on an append/prepend fragment (a fragment would lose definition-only footnotes or synthesize a bogus empty list). Add a server binding spec. - Match the live plugin's list PLACEMENT: a single already-canonical footnotesList is left exactly where it sits (the plugin never repositions a sole correct list), so the first write no longer reorders content that follows the list. Applied to BOTH the editor-ext copy and the MCP mirror; pinned by a shared golden corpus case with content after the list. - Fix MCP tool count 38 -> 39 (README x3, AGENTS.md) and the transformJs param help (add canonicalizeFootnotes/insertInlineFootnote). Simplifications: - Remove the dead duplicate re-id mechanism (deriveFootnoteId/suffix/occurrence) from the PURE canonicalizer in both copies — references are never renamed, so the derived ids were never requested; first-wins-drop is the real behaviour. This also makes the editor-ext footnote-util note about "no cross-package copy" true again. - Remove the sentinel round-trip in insertInlineFootnote: a generalized insertNodesAfterAnchor core inserts the footnoteReference node directly. - Drop the redundant per-definition deep clone in step 4 (shallow id-normalizing copy; out is already deep-cloned). Docs / architecture: - Correct the editor-ext copy's "It exists because…" header to its real consumers (server import, page.service create/update, client paste). - Note markdownToProseMirror reuse for create/update comment in collaboration.ts. - A: shared golden JSON corpus exercised by BOTH the editor-ext copy and the MCP mirror (footnote-corpus.ts / .mjs) so "the two copies behave identically" is checkable. - C: split the MCP canonicalizer into a pure mirror + footnote-authoring.ts. - B: import services persist via a different path, so left one-line consolidation comments at the call sites rather than folding (does not fall out cleanly). Tests: insertFootnote wrapper guards + docmost_transform dryRun auto-canonicalize (MCP mock), page.service create/update + append/prepend binding (server jest), shared corpus incl. nested-container reference. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
30cb9d293c |
feat(footnotes): inline authoring + deterministic server-side canonicalization
Make footnotes author-inline: the agent/tool inserts a footnote at its point of use (anchor + text) and the numbering plus the bottom list are DERIVED deterministically server-side. The agent has no access to footnotesList and cannot desync — out-of-order lists, orphan definitions, and raw trailing [^id] blocks become structurally impossible. editor-ext: - canonicalizeFootnotes(docJSON) -> docJSON: a pure, EditorView-free port of footnoteSyncPlugin's end-state. Distinct reference ids in document order are the source of truth; exactly one trailing footnotesList holds one definition per referenced id in reference order (reusing the existing node or synthesizing an empty one); orphans dropped; duplicate definitions resolved deterministically (first wins, never lost); idempotent. - Unit tests + a golden parity suite: on every editor-reachable steady state the live footnoteSyncPlugin's JSON is a canonicalize no-op (byte-for-byte parity), and the canonicalizer additionally repairs the out-of-order list a non-editor write produces. mcp: - footnote-canonicalize.ts: behavioural mirror of the editor-ext canonicalizer (the MCP package is intentionally decoupled from the editor barrel, like footnote-lex/docmost-schema), plus footnoteContentKey for content dedup. - Auto-canonicalize on EVERY write path: markdownToProseMirror (fixes import ordering), update_page_json, and after every docmost_transform. Idempotent, so it is a no-op when footnotes are already canonical. - insert_footnote tool + insertInlineFootnote: anchor + markdown text -> a mark-safe footnoteReference and a content-dedup'd definition; the list and numbering are derived. Same-content footnotes reuse one number/definition. - canonicalizeFootnotes + insertInlineFootnote exposed as docmost_transform sandbox helpers. Tests: editor-ext 157 green; MCP 325 green; server + client tsc clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
2d36641f28 |
test(coverage): add regression tests for issues #192, #206, #204
Additive test coverage across server, editor-ext, client and mcp. #192 — AiChatService.stream integration (Section 3, against real Postgres): - new apps/server/test/integration/ai-chat-stream.int-spec.ts drives the real streamText through a seeded ai/test MockLanguageModelV3 and a real Node ServerResponse, covering: onError persists an assistant error record (status 'error' + partial answer + provider cause in metadata); external MCP client closed exactly once on BOTH onFinish and onError; anti-tamper — history is rebuilt from the DB transcript, not from body.messages. #206 — red-team findings (most already fixed+tested in #212): - mdrt-2 (UNFIXED, data loss): turndown.dataloss.test.ts documents that pageBreak / transclusionReference / mention are silently dropped on Markdown export (characterization + it.fails for the desired survive-export contract). - persist-6 (UNFIXED, data loss): persistence-store.spec.ts adds an it.failing documenting that a momentarily-empty live doc overwrites non-empty content (left unfixed — a store-side empty-guard is a behaviour change). #204 — test-strategy plan, highest-priority subset: - Phase 1: mcp-clients.lease.spec.ts covers the external MCP client lease/refcount/eviction lifecycle (leak / premature-close / double-close). - Phase 2 data-integrity pure functions: editor-ext table-utils (transpose/moveRow/convert round-trip) and math tokenizer false-positive guard; client emoji-menu (+ it.fails for the unguarded localStorage JSON.parse bug), sort-cells, normalizeTableColumnWidths; mcp htmlEmbed/ pageBreak markdown data-loss + footnote-diff; server export getInternalLinkPageName extensionless-path bug — FIXED (small/clear) + tested. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
22852be2e2 |
fix(qa): resolve UI bugs from #216 and #218
Public sharing (#218): - Bind public-share content to the requested shareId. getSharedPage now enforces dto.shareId (forwarded from /share/:shareId/p/:slug): the page must be reachable THROUGH that exact share (its own share, or an includeSubPages ancestor that contains it). A forged/mismatched shareId 404s instead of rendering off the slug alone and no longer leaks the real canonical key via redirect. A request with no shareId keeps the legacy slug-capability path. - Trim /shares/page-info: drop internal metadata (creatorId, spaceId, workspaceId, contributorIds, lastUpdated*, parent/position, lock/template flags, timestamps) from the anonymous payload. - Default share-to-web includeSubPages to false (opt-in), so enabling a share no longer silently exposes the whole sub-tree (#216). Editor (#218): - Harden the new-page pre-sync window: the body editor is kept read-only until the collab provider is Connected and synced, so early keystrokes can't land only in local ProseMirror and then be clobbered by the server's empty doc. - Surface a "Connecting… (read-only)" affordance during the static phase so input isn't silently swallowed. Other: - Breadcrumb: resolve from the page's own ancestor data (/pages/breadcrumbs) instead of waiting for the lazily-built sidebar tree, so deep pages don't render a blank breadcrumb for seconds. - Pasting GitHub `> [!type]` callouts now converts to a callout node instead of a literal blockquote (new marked extension wired into markdownToHtml). Tests: editor-sync-state gate (client), getSharedPage share-binding (server), github-callout markdown conversion (editor-ext). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
bd62d906bb |
test(e2e): anchor top-level mcp comment on existing page text
With the image fix in place, the mcp e2e ran through every section and
failed only at the last one (comments): create_comment was hardened to
require an inline "selection" (exact text to anchor on) for a top-level
comment, but the test created one without a selection ("an inline
'selection' ... is required for a top-level comment").
Pass an inline selection ("Добавленный абзац.", a plain paragraph
re-imported in section 5 and still present at the comments stage). The
reply is unchanged: it carries a parentCommentId, so it is a reply and
needs no selection.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
deeec50b5f |
test(e2e): fix remaining server config and mcp image failures
Follow-up to the first e2e fix: with nanoid/editRes.edits resolved, the suites failed one layer deeper. Both layers were never green since the e2e jobs were added (non-blocking in CI), so the failures had stacked up. server e2e (jest-e2e.json) — align module resolution/transform with the working unit/integration jest configs so AppModule's full import graph loads: - moduleFileExtensions: add "tsx" (React-Email .tsx templates are pulled in via the auth controller chain). - transform: ^.+\.(t|j)s$ -> ^.+\.(t|j)sx?$ so .tsx is transformed. - moduleNameMapper: add ^src/(.*)$ -> <rootDir>/../src/$1 (code imports via the absolute 'src/...' alias). Verified locally: the module graph now fully resolves (only env vars, supplied by CI, remain). mcp e2e (test-e2e.mjs) — insert_image/replace_image accept only http(s) URLs the server fetches; the test passed local file paths and died with "Invalid image URL". Serve the PNG bytes over a throwaway 127.0.0.1 HTTP server (the Docmost server runs on the same CI host) and pass URLs. The featPng negative test is untouched: replaceImage checks the attachmentId and throws before fetching, so its local path is never validated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
7eefdad512 |
test(e2e): fix failing server and mcp e2e suites
Two unrelated CI failures on the 0.94.0 release PR: - server e2e: jest-e2e.json lacked transformIgnorePatterns, so the ESM-only nanoid@5 package was loaded as CommonJS and crashed with "Cannot use import statement outside a module". Add the same node_modules whitelist already present in the unit and integration jest configs (nanoid|uuid|image-dimensions|marked|happy-dom|lib0). - mcp e2e: test-e2e.mjs read editRes.edits, but editPageText() returns the per-edit results under `applied` (not `edits`), so editRes.edits was undefined and .every() threw. Read editRes.applied instead. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
d32ad73158 |
test(editor-ext): cover the transclusionSource NO_REASSIGN branch in id dedupe (#206)
A colliding transclusionSource id is deliberately NOT reassigned (its id is a cross-reference key), while a missing id is still filled. Add coverage for both: two sources sharing an id keep it (red if the NO_REASSIGN guard is removed), and a source with no id gets a fresh one. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |