Compare commits

...

6 Commits

Author SHA1 Message Date
agent_coder e9e4c1028d perf(editor): cut per-keystroke work on the typing hot path (#343)
The editor lagged while typing (worse with doc size, and under collaboration the
same cost is paid for every REMOTE keystroke). ProseMirror itself was fine — the
overhead was the surrounding work done on every transaction. Behavior is 1:1;
only WHEN work runs changed.

- getJSON() off the keystroke path: `onUpdate` no longer serializes the whole doc
  synchronously — the serialization now runs inside a 3s debounce (new hook
  use-page-content-cache.ts), flushed on unmount so the last snapshot isn't lost.
- footnote numbering: merged 3 per-docChanged O(n) doc walks into one, and
  short-circuit the whole-doc renumber when the doc has no footnotes and the
  transaction didn't insert one (step-slice scan — covers typing/paste/collab).
- toolbar: replaced per-keystroke `editor.can().undo()/.redo()` dry-runs with
  cheap history-depth reads (Yjs undoManager stack length / pm-history depth).
- render side-effect bug: `remote.attach()` moved out of the render body into a
  useEffect.
- debounced the TOC all-headings rescan and memoized the slash-command suggestion
  build (was rebuilt twice per keystroke).
- node menus (image/video/audio/pdf/callout/subpages): the per-transaction
  selectors early-return a cheap isActive check instead of running getAttributes +
  multiple alignment probes while their node type is inactive (shouldShow still
  controls display — appears exactly when it did).
- code blocks: the global selectionUpdate listener is now added only for mermaid
  blocks (the only consumer of the selected state), eliminating N listeners +
  N setStates per caret move for normal code blocks.

Deferred (documented, collab hot-path risk): full conditional menu MOUNTING
(menu-less-frame risk on same-tx context switch) and code-block re-tokenization
debounce / language-persist (self-dispatching meta tx + node-attr writes interact
with collab/undo). The route split from #342 already keeps lowlight off startup.

Gate: editor-ext build + 252/252 tests, client editor tests pass, tsc --noEmit 0,
client build ok. New tests: footnote no-footnote-doc → 0 traversals + numbering
unchanged; page-content-cache onUpdate-no-sync-getJSON + flush-on-unmount.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-04 22:49:48 +03:00
agent_vscode 382e5196da Merge pull request 'fix(docker): toolchain python3/make/g++ для нативной сборки re2' (#353) from fix/docker-re2-toolchain into develop 2026-07-04 22:11:49 +03:00
agent_vscode 76e0c08cec fix(docker): install python3/make/g++ toolchain for re2 native build
The develop image build broke at `pnpm install --frozen-lockfile`: the new
native dependency re2@1.25.0 (packages/mcp, search_in_page #330) always
compiles from source under pnpm — its prebuilt-binary downloader
(install-artifact-from-github) cannot identify the GitHub repo because pnpm
does not populate npm_package_repository_*/npm_package_json env vars ("No
github repository was identified. Building locally ..."), and node:22-slim
ships no python3/make/g++ for the node-gyp fallback.

- builder stage: add a cache-friendly apt layer with python3 make g++
  before COPY; the stage is discarded so the toolchain may stay.
- installer stage: install the toolchain, run the prod install as the node
  user via `su node -c`, and purge the toolchain — all in one RUN layer so
  the final image stays slim and node_modules ownership needs no extra
  chown layer; USER node is restored right after.

Fixes the failed run 28715009124 (develop docker build); release.yml uses
the same Dockerfile and is covered too.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-04 22:09:40 +03:00
vvzvlad 8978d69f3e Merge pull request 'fix(converter): стабильность round-trip image/медиа — «» ≡ absent (класс defaults-instability)' (#350) from fix/media-roundtrip-stability into develop
Reviewed-on: #350
2026-07-04 21:30:12 +03:00
agent_coder c192f2a2e1 test(prosemirror-markdown): pin the third state — explicit "" converges once, then idempotent
Reviewer addition to the round-trip stability matrix: besides "attr absent" and
"attr has a real value", a string attr in the empty-string class has a third,
degenerate state — a LITERAL "" (a user types alt/title/name in the editor then
deletes it, and Tiptap persists `attr: ""`, distinct from never-set). The fix's
`getAttribute(...) || null` coercion normalizes such a stored "" to the default
on the FIRST round-trip (a one-time "" -> null diff) and is byte-stable from the
SECOND round-trip on.

Adds a convergence contract to the reusable matrix helper (emptyStringClass flag
+ runConvergenceCase): pass 1 must converge the attr to its schema default (NOT
asserted byte-stable vs the "" input — that is the intended one-time
normalization); pass 2 must deep-equal pass 1 (idempotent thereafter). Driven for
every empty-string-class attr across image + the media family (image/drawio
alt+title, video alt via aria-label, pdf/attachment name, attachment mime).
Documents the one-time normalization so a future sync/QA diff does not flag the
single "" -> null change as converter corruption.

Gate: package suite 33 files / 682 tests passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-04 21:17:17 +03:00
agent_coder 2ce672709a fix(prosemirror-markdown): stabilize image round-trip — "" ≡ absent on parse (empty-string class)
A stored image authored without `alt` gained a phantom `alt: ""` on every
round-trip (`markdownToProseMirror(convertProseMirrorToMarkdown(doc))`): `marked`
renders `![](src)` as `<img alt="">`, and the stock tiptap Image `alt` parseHTML
(`getAttribute("alt")`) materialized the empty string where the original had no
attribute. That false diff is a real GS-EDIT-REVERT churn source — an agent /
git-sync touch of a page with an image mutates the stored JSON (`absent -> ""`),
producing phantom diffs that can overwrite live edits.

Fix is PARSE-SIDE ("" ≡ absent), so the RAW round-trip is idempotent — not only
the canonical form (history / stored JSON diff on the raw shape; masking it only
in canonicalize would leave that noise). `image.alt`/`title` parseHTML now coerce
`getAttribute(...) || null`, plus defense-in-depth `|| null` across the at-risk
empty-string class (video aria-label, drawio/excalidraw title+alt, pdf name,
attachment name+mime) matching the existing `image.caption || null` precedent.

NOTE — image `align` is NOT changed: it round-trips correctly (center via the
schema default "center", left/right via the `<!--img {...}-->` comment). Its
`toBeUndefined()` in the git-sync gate is canonical-form normalization, not a loss.

Intentional divergence from editor-ext: editor-ext's literal `alt` parseHTML
returns "" verbatim, but this coercion CONVERGES on editor-ext's real STORED
shape (an image inserted without alt has no `alt` attribute -> re-parses absent,
never ""), so the round-trip is idempotent and matches real documents.

Adds a reusable, node-agnostic round-trip-stability matrix helper
(test/roundtrip-stability.helper.ts) — given a node + attr spec it enumerates
default/non-default combos and asserts byte-stability of BOTH the raw and the
canonical round-trip (the documented numeric width/height→string coercion encoded
as an explicit allowed normalization) — driven over image + the whole media
family (video/audio/pdf/attachment/embed/drawio/excalidraw). The only raw
empty-string instability it found was image.alt; the family was already stable.

Gate: package suite 33 files / 672 tests passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-04 20:51:34 +03:00
19 changed files with 1197 additions and 74 deletions
+16 -2
View File
@@ -5,6 +5,13 @@ RUN npm install -g pnpm@10.4.0
FROM base AS builder
# re2 (packages/mcp) always compiles from source under pnpm (the prebuilt-binary
# download cannot identify the GitHub repo), so node-gyp needs python3/make/g++.
# This stage is discarded, so the toolchain can stay installed.
RUN apt-get update \
&& apt-get install -y --no-install-recommends python3 make g++ \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY . .
@@ -57,9 +64,16 @@ COPY --from=builder /app/patches /app/patches
RUN chown -R node:node /app
USER node
# Toolchain is needed transiently to compile re2 during the prod install; install
# and purge it in one layer to keep the final image slim. The install itself runs
# as the node user via su to keep node_modules ownership without a costly chown layer.
RUN apt-get update \
&& apt-get install -y --no-install-recommends python3 make g++ \
&& su node -c "pnpm install --frozen-lockfile --prod" \
&& apt-get purge -y --auto-remove python3 make g++ \
&& rm -rf /var/lib/apt/lists/*
RUN pnpm install --frozen-lockfile --prod
USER node
RUN mkdir -p /app/data/storage
@@ -46,6 +46,13 @@ export function AudioMenu({ editor }: EditorMenuProps) {
return null;
}
// #343 PART 1: skip getAttributes unless an audio node is active. The menu
// only shows for an active audio node (shouldShow), so the null state while
// inactive is never rendered — behavior unchanged.
if (!ctx.editor.isActive("audio")) {
return null;
}
const audioAttrs = ctx.editor.getAttributes("audio");
return {
@@ -43,8 +43,15 @@ export function CalloutMenu({ editor }: EditorMenuProps) {
return null;
}
// #343 PART 1: skip the per-type isActive() probes unless a callout is
// active. The menu only shows for an active callout (shouldShow), so the
// null state while inactive is never rendered — behavior unchanged.
if (!ctx.editor.isActive("callout")) {
return null;
}
return {
isCallout: ctx.editor.isActive("callout"),
isCallout: true,
isInfo: ctx.editor.isActive("callout", { type: "info" }),
isNote: ctx.editor.isActive("callout", { type: "note" }),
isSuccess: ctx.editor.isActive("callout", { type: "success" }),
@@ -22,6 +22,12 @@ export default function CodeBlockView(props: NodeViewProps) {
const [isSelected, setIsSelected] = useState(false);
useEffect(() => {
// #343 PART 6: `isSelected` only drives the mermaid source's visibility (the
// `hidden` prop below). For every non-mermaid code block it is never read,
// so skip the per-block `selectionUpdate` listener entirely — otherwise N
// code blocks each add a global listener + a setState on every caret move.
if (language !== "mermaid") return;
const updateSelection = () => {
const { state } = editor;
const { from, to } = state.selection;
@@ -32,11 +38,14 @@ export default function CodeBlockView(props: NodeViewProps) {
setIsSelected(isNodeSelected);
};
// Initialize on attach so switching a block's language to "mermaid" reflects
// the current selection immediately (the listener was not running before).
updateSelection();
editor.on("selectionUpdate", updateSelection);
return () => {
editor.off("selectionUpdate", updateSelection);
};
}, [editor, getPos(), node.nodeSize]);
}, [editor, getPos(), node.nodeSize, language]);
function changeLanguage(language: string) {
setLanguageValue(language);
@@ -1,5 +1,7 @@
import type { Editor } from "@tiptap/react";
import { useEditorState } from "@tiptap/react";
import { undoDepth, redoDepth } from "@tiptap/pm/history";
import { yUndoPluginKey } from "@tiptap/y-tiptap";
export interface ToolbarState {
isBold: boolean;
@@ -16,14 +18,45 @@ export interface ToolbarState {
canRedo: boolean;
}
// Undo/redo come from either StarterKit's history or the Yjs collaboration
// history extension. During the brief moment a page is rendered with the
// static editor (mainExtensions only, undoRedo disabled), neither is loaded
// and editor.can().undo/redo is undefined.
function safeCan(editor: Editor, command: "undo" | "redo"): boolean {
const can = editor.can() as Record<string, unknown>;
const fn = can[command];
return typeof fn === "function" ? (fn as () => boolean)() : false;
// Undo/redo availability, computed WITHOUT `editor.can().undo()/.redo()`.
//
// `editor.can()` runs the command as a dry-run (building a throwaway state +
// transaction) — the most expensive work in this selector, and it ran on every
// keystroke (and every REMOTE keystroke under collaboration). Instead we read
// the history stack depth directly, which is a cheap plugin-state lookup and
// mirrors exactly what the undo/redo commands themselves check:
//
// - Collaboration (Yjs): the yjs UndoManager's undo/redo stack lengths — the
// same `undoStack.length === 0` / `redoStack.length === 0` guard the
// Collaboration extension's undo/redo commands use.
// - Plain history (templates / non-collab): prosemirror-history's undoDepth /
// redoDepth, which back the UndoRedo extension.
//
// When neither history backend is installed (the pre-sync static editor —
// mainExtensions only, undoRedo disabled), both fall through to 0 -> false,
// matching the previous `safeCan` behavior.
function historyAvailability(editor: Editor): {
canUndo: boolean;
canRedo: boolean;
} {
const state = editor.state;
// Collaboration history (Yjs) takes precedence when present.
const yState = yUndoPluginKey.getState(state) as
| { undoManager?: { undoStack: unknown[]; redoStack: unknown[] } }
| undefined;
if (yState?.undoManager) {
return {
canUndo: yState.undoManager.undoStack.length > 0,
canRedo: yState.undoManager.redoStack.length > 0,
};
}
// Plain prosemirror-history (returns 0 when the history plugin is absent).
return {
canUndo: undoDepth(state) > 0,
canRedo: redoDepth(state) > 0,
};
}
export function useToolbarState(editor: Editor | null): ToolbarState | null {
@@ -31,6 +64,7 @@ export function useToolbarState(editor: Editor | null): ToolbarState | null {
editor,
selector: (ctx) => {
if (!ctx.editor) return null;
const { canUndo, canRedo } = historyAvailability(ctx.editor);
return {
isBold: ctx.editor.isActive("bold"),
isItalic: ctx.editor.isActive("italic"),
@@ -42,8 +76,8 @@ export function useToolbarState(editor: Editor | null): ToolbarState | null {
isBulletList: ctx.editor.isActive("bulletList"),
isOrderedList: ctx.editor.isActive("orderedList"),
isTaskList: ctx.editor.isActive("taskList"),
canUndo: safeCan(ctx.editor, "undo"),
canRedo: safeCan(ctx.editor, "redo"),
canUndo,
canRedo,
};
},
});
@@ -38,6 +38,14 @@ export function ImageMenu({ editor }: EditorMenuProps) {
return null;
}
// #343 PART 1: skip the expensive per-keystroke work (getAttributes + the
// alignment isActive() probes) unless an image is actually active. The
// menu is only shown when an image is active (see shouldShow), so a null
// state while inactive is never rendered — behavior is unchanged.
if (!ctx.editor.isActive("image")) {
return null;
}
const imageAttrs = ctx.editor.getAttributes("image");
return {
@@ -25,6 +25,13 @@ export function PdfMenu({ editor }: EditorMenuProps) {
return null;
}
// #343 PART 1: skip getAttributes unless a pdf node is active. The menu
// only shows for an active pdf node (shouldShow), so the null state while
// inactive is never rendered — behavior unchanged.
if (!ctx.editor.isActive("pdf")) {
return null;
}
const pdfAttrs = ctx.editor.getAttributes("pdf");
return {
@@ -70,7 +70,14 @@ export const SubpagesMenu = React.memo(
// toggle without re-rendering on every keystroke.
const isRecursive = useEditorState({
editor,
selector: (ctx) => ctx.editor?.getAttributes("subpages")?.recursive ?? false,
// #343 PART 1: skip getAttributes unless a subpages node is active. The
// menu only shows for an active subpages node (shouldShow), so the value
// is only read then; getAttributes on an inactive node returns the default
// (recursive === false) anyway, so this is behavior-preserving.
selector: (ctx) =>
ctx.editor?.isActive("subpages")
? (ctx.editor.getAttributes("subpages")?.recursive ?? false)
: false,
});
return (
@@ -4,6 +4,7 @@ import React, { FC, useEffect, useRef, useState } from "react";
import classes from "./table-of-contents.module.css";
import clsx from "clsx";
import { Box, Text, Title } from "@mantine/core";
import { useDebouncedCallback } from "@mantine/hooks";
import { useTranslation } from "react-i18next";
type TableOfContentsProps = {
@@ -79,13 +80,21 @@ export const TableOfContents: FC<TableOfContentsProps> = (props) => {
setHeadingDOMNodes(result.nodes);
};
// Debounce the update-driven rescan: `$nodes("heading")` scans every heading
// in the document, and it previously ran on EVERY keystroke while the TOC
// panel was open. The panel is derived UI, so recomputing ~300ms after typing
// settles keeps it correct without doing an all-headings scan per keystroke
// (#343, PART 7). `useDebouncedCallback` returns a stable reference and always
// invokes the latest `handleUpdate`.
const debouncedHandleUpdate = useDebouncedCallback(handleUpdate, 300);
useEffect(() => {
props.editor?.on("update", handleUpdate);
props.editor?.on("update", debouncedHandleUpdate);
return () => {
props.editor?.off("update", handleUpdate);
props.editor?.off("update", debouncedHandleUpdate);
};
}, [props.editor]);
}, [props.editor, debouncedHandleUpdate]);
useEffect(
() => {
@@ -31,6 +31,13 @@ export function VideoMenu({ editor }: EditorMenuProps) {
return null;
}
// #343 PART 1: skip getAttributes + alignment isActive() probes unless a
// video is active. The menu only shows for an active video (shouldShow),
// so the null state while inactive is never rendered — behavior unchanged.
if (!ctx.editor.isActive("video")) {
return null;
}
const videoAttrs = ctx.editor.getAttributes("video");
return {
@@ -6,6 +6,23 @@ import getSuggestionItems from '@/features/editor/components/slash-menu/menu-ite
export const slashMenuPluginKey = new PluginKey('slash-command');
// getSuggestionItems fuzzy-matches EVERY command against the query (plus its
// wrong-keyboard-layout remaps) and, while the slash menu is open, is invoked
// TWICE per keystroke: once by the synchronous `allow` gate below and once by
// the popup's `items` builder. A synchronous gating predicate can't be
// debounced without breaking the suggestion decoration/activation, so instead we
// memoize the LAST query's result: the two same-query calls in one keystroke
// build the list only once, and the cache invalidates the moment the query
// changes — so there is no stale-state risk (#343, PART 7).
let lastQuery: string | null = null;
let lastResult: ReturnType<typeof getSuggestionItems> | null = null;
function suggestionItemsForQuery(query: string) {
if (query === lastQuery && lastResult) return lastResult;
lastQuery = query;
lastResult = getSuggestionItems({ query });
return lastResult;
}
// @ts-ignore
const Command = Extension.create({
name: 'slash-command',
@@ -38,7 +55,7 @@ const Command = Extension.create({
// non-matching queries while keeping multi-word matches (e.g.
// "/Heading 1") working.
const query = state.doc.textBetween(range.from + 1, range.to);
const groups = getSuggestionItems({ query });
const groups = suggestionItemsForQuery(query);
const hasMatches = Object.values(groups).some(
(items) => items.length > 0,
);
@@ -61,7 +78,9 @@ const Command = Extension.create({
const SlashCommand = Command.configure({
suggestion: {
items: getSuggestionItems,
// Share the per-query memo with `allow` so the pair of same-query calls in a
// single keystroke rebuilds the list once (#343, PART 7).
items: ({ query }: { query: string }) => suggestionItemsForQuery(query),
render: renderItems,
},
});
@@ -0,0 +1,100 @@
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { renderHook, act } from "@testing-library/react";
import type { MutableRefObject } from "react";
import type { Editor } from "@tiptap/react";
// Mock the app entry so importing the hook doesn't boot the whole app; the hook
// only needs queryClient's cache read/write, which we stub here. Declared via
// vi.hoisted so the spies exist before the hoisted vi.mock factory runs.
const { getQueryData, setQueryData } = vi.hoisted(() => ({
getQueryData: vi.fn(() => undefined as unknown),
setQueryData: vi.fn(),
}));
vi.mock("@/main.tsx", () => ({
queryClient: { getQueryData, setQueryData },
}));
import { usePageContentCache } from "./use-page-content-cache";
const SNAPSHOT = { type: "doc", content: [] };
function makeFakeEditor(overrides: Partial<Editor> = {}): Editor {
return {
isEmpty: false,
isDestroyed: false,
getJSON: vi.fn(() => SNAPSHOT),
...overrides,
} as unknown as Editor;
}
describe("usePageContentCache (#343 PART 3) — getJSON off the keystroke path", () => {
beforeEach(() => {
vi.useFakeTimers();
vi.clearAllMocks();
// A cached page exists so the write path runs.
getQueryData.mockReturnValue({ id: "p1", content: {} });
});
afterEach(() => {
vi.useRealTimers();
});
it("onUpdate (calling the debounced fn) does NOT call getJSON synchronously", () => {
const editor = makeFakeEditor();
const editorRef = { current: editor } as MutableRefObject<Editor | null>;
const { result } = renderHook(() =>
usePageContentCache(editorRef, "slug-1", 3000),
);
// Simulate a keystroke's onUpdate -> only schedules the debounce.
act(() => {
result.current();
result.current();
result.current();
});
// The whole-doc serialization must NOT have happened yet.
expect(editor.getJSON).not.toHaveBeenCalled();
expect(setQueryData).not.toHaveBeenCalled();
// Once the debounce window elapses, getJSON runs exactly once (not per call).
act(() => vi.advanceTimersByTime(3000));
expect(editor.getJSON).toHaveBeenCalledTimes(1);
expect(setQueryData).toHaveBeenCalledWith(["pages", "slug-1"], {
id: "p1",
content: SNAPSHOT,
});
});
it("flushes the pending snapshot on unmount so the last edit isn't lost", () => {
const editor = makeFakeEditor();
const editorRef = { current: editor } as MutableRefObject<Editor | null>;
const { result, unmount } = renderHook(() =>
usePageContentCache(editorRef, "slug-1", 3000),
);
act(() => result.current());
expect(editor.getJSON).not.toHaveBeenCalled();
// Navigation/unmount must flush (not drop) the pending write.
act(() => unmount());
expect(editor.getJSON).toHaveBeenCalledTimes(1);
expect(setQueryData).toHaveBeenCalledTimes(1);
});
it("skips the write when the editor is destroyed (flush racing teardown)", () => {
const editor = makeFakeEditor({ isDestroyed: true });
const editorRef = { current: editor } as MutableRefObject<Editor | null>;
const { result } = renderHook(() =>
usePageContentCache(editorRef, "slug-1", 3000),
);
act(() => result.current());
act(() => vi.advanceTimersByTime(3000));
expect(editor.getJSON).not.toHaveBeenCalled();
expect(setQueryData).not.toHaveBeenCalled();
});
});
@@ -0,0 +1,50 @@
import type { MutableRefObject } from "react";
import { useDebouncedCallback } from "@mantine/hooks";
import type { Editor } from "@tiptap/react";
import { queryClient } from "@/main.tsx";
import { IPage } from "@/features/page/types/page.types.ts";
/**
* Off-keystroke local page-cache updater (issue #343, PART 3).
*
* The editor's `onUpdate` fires on every keystroke — and, under collaboration,
* on every REMOTE keystroke too. Serializing the WHOLE document with
* `editor.getJSON()` on that hot path is expensive, and the previous 3s debounce
* only guarded the cache WRITE, not the serialization: `getJSON()` still ran per
* keystroke.
*
* This hook moves the serialization INSIDE the debounced callback, so the
* full-doc traversal happens at most once per `delay`, not per keystroke. Call
* the returned function from `onUpdate` (it only schedules the debounce); the
* `getJSON()` snapshot is taken when the debounce fires.
*
* On unmount/navigation the pending snapshot is FLUSHED (via `flushOnUnmount`)
* so the last edits within the debounce window aren't lost from the local cache.
* The source of truth is collab/Yjs, but the cache must not go stale.
*
* IMPORTANT: call this hook BEFORE `useEditor`. React runs effect cleanups in
* declaration order on unmount, so the debounce's flush cleanup must be declared
* before `useEditor`'s teardown to run while the editor is still alive; the
* `isDestroyed` guard keeps a flush that still races teardown safe (it skips).
*/
export function usePageContentCache(
editorRef: MutableRefObject<Editor | null>,
slugId: string | undefined,
delay = 3000,
) {
return useDebouncedCallback(
() => {
const e = editorRef.current;
if (!e || e.isDestroyed || e.isEmpty) return;
const pageData = queryClient.getQueryData<IPage>(["pages", slugId]);
if (pageData) {
// getJSON() (full-doc serialization) runs HERE, off the keystroke path.
queryClient.setQueryData(["pages", slugId], {
...pageData,
content: e.getJSON(),
});
}
},
{ delay, flushOnUnmount: true },
);
}
+20 -19
View File
@@ -62,7 +62,7 @@ import ExcalidrawMenu from "./components/excalidraw/excalidraw-menu-lazy";
import DrawioMenu from "./components/drawio/drawio-menu";
import { useCollabToken } from "@/features/auth/queries/auth-query.tsx";
import SearchAndReplaceDialog from "@/features/editor/components/search-and-replace/search-and-replace-dialog.tsx";
import { useDebouncedCallback, useDocumentVisibility } from "@mantine/hooks";
import { useDocumentVisibility } from "@mantine/hooks";
import { useIdle } from "@/hooks/use-idle.ts";
import { queryClient } from "@/main.tsx";
import { IPage } from "@/features/page/types/page.types.ts";
@@ -79,6 +79,7 @@ import { PageEditMode } from "@/features/user/types/user.types.ts";
import { jwtDecode } from "jwt-decode";
import { searchSpotlight } from "@/features/search/constants.ts";
import { useEditorScroll } from "./hooks/use-editor-scroll";
import { usePageContentCache } from "./hooks/use-page-content-cache";
import { useScrollRestoreOnSwap } from "./hooks/use-scroll-position";
import { useSwapHeightReservation } from "./hooks/use-swap-height-reservation";
import { EditorLinkMenu } from "@/features/editor/components/link/link-menu";
@@ -267,8 +268,13 @@ export default function PageEditor({
}
}, [isIdle, documentState, providersReady, resetIdle]);
// Attach here, to make sure the connection gets properly established
providersRef.current?.remote.attach();
// Attach the remote provider once it's ready (and again after a pageId swap
// recreates it) to make sure the connection gets properly established. This
// used to run in the render body — a side effect during render (#343, PART 7).
// `attach()` is idempotent, so re-running it on these deps is safe.
useEffect(() => {
providersRef.current?.remote.attach();
}, [providersReady, pageId]);
const extensions = useMemo(() => {
if (!providersReady || !providersRef.current || !currentUser?.user) {
@@ -283,6 +289,12 @@ export default function PageEditor({
];
}, [providersReady, currentUser?.user]);
// getJSON() serialization + cache write live in the hook, off the keystroke
// path, and flush on unmount so the last snapshot survives navigation (#343).
// MUST be declared before useEditor: React runs effect cleanups in declaration
// order on unmount, so the flush must run before the editor is torn down.
const debouncedUpdateContent = usePageContentCache(editorRef, slugId);
const editor = useEditor(
{
extensions,
@@ -353,11 +365,11 @@ export default function PageEditor({
editorRef.current = editor;
}
},
onUpdate({ editor }) {
if (editor.isEmpty) return;
const editorJson = editor.getJSON();
//update local page cache to reduce flickers
debouncedUpdateContent(editorJson);
onUpdate() {
// Only schedule the debounce here — the whole-doc getJSON() serialization
// happens INSIDE the debounced callback (see usePageContentCache), so it
// no longer runs synchronously on every (local or remote) keystroke.
debouncedUpdateContent();
},
},
[pageId, editable, extensions],
@@ -403,17 +415,6 @@ export default function PageEditor({
};
}, [editor, pageId, editorIsEditable]);
const debouncedUpdateContent = useDebouncedCallback((newContent: any) => {
const pageData = queryClient.getQueryData<IPage>(["pages", slugId]);
if (pageData) {
queryClient.setQueryData(["pages", slugId], {
...pageData,
content: newContent,
});
}
}, 3000);
const handleActiveCommentEvent = (event) => {
const { commentId, resolved } = event.detail;
@@ -0,0 +1,134 @@
import { describe, it, expect, vi, afterEach } from 'vitest';
import { getSchema } from '@tiptap/core';
import { Document } from '@tiptap/extension-document';
import { Paragraph } from '@tiptap/extension-paragraph';
import { Text } from '@tiptap/extension-text';
import { EditorState } from '@tiptap/pm/state';
import { Node as PMNode } from '@tiptap/pm/model';
import { FootnoteReference } from './footnote-reference';
import { FootnotesList } from './footnotes-list';
import { FootnoteDefinition } from './footnote-definition';
import {
footnoteNumberingPlugin,
footnoteNumberingPluginKey,
getFootnoteNumber,
} from './footnote-numbering';
import {
FOOTNOTE_REFERENCE_NAME,
FOOTNOTES_LIST_NAME,
FOOTNOTE_DEFINITION_NAME,
} from './footnote-util';
const extensions = [
Document,
Paragraph,
Text,
FootnoteReference,
FootnotesList,
FootnoteDefinition,
];
const schema = getSchema(extensions);
function makeState(docJson: any): EditorState {
return EditorState.create({
doc: PMNode.fromJSON(schema, docJson),
plugins: [footnoteNumberingPlugin()],
});
}
const withTwoFootnotes = {
type: 'doc',
content: [
{
type: 'paragraph',
content: [
{ type: 'text', text: 'a' },
{ type: FOOTNOTE_REFERENCE_NAME, attrs: { id: 'x' } },
{ type: 'text', text: 'b' },
{ type: FOOTNOTE_REFERENCE_NAME, attrs: { id: 'y' } },
],
},
{
type: FOOTNOTES_LIST_NAME,
content: [
{
type: FOOTNOTE_DEFINITION_NAME,
attrs: { id: 'x' },
content: [{ type: 'paragraph' }],
},
{
type: FOOTNOTE_DEFINITION_NAME,
attrs: { id: 'y' },
content: [{ type: 'paragraph' }],
},
],
},
],
};
describe('footnote numbering plugin — short-circuit (#343 PART 5)', () => {
afterEach(() => vi.restoreAllMocks());
it('does ZERO document traversals on a docChanged transaction when the doc has no footnotes', () => {
const state = makeState({
type: 'doc',
content: [{ type: 'paragraph', content: [{ type: 'text', text: 'hi' }] }],
});
// Only count traversals caused by the transaction, not the initial build.
const descendantsSpy = vi.spyOn(PMNode.prototype, 'descendants');
const before = footnoteNumberingPluginKey.getState(state);
// A real content edit (docChanged) that introduces no footnote node.
const next = state.apply(state.tr.insertText('!', 3));
const after = footnoteNumberingPluginKey.getState(next);
// The plugin never walked the document...
expect(descendantsSpy).not.toHaveBeenCalled();
// ...and reused the exact same (empty) state object — proof it short-circuited.
expect(after).toBe(before);
expect(after?.hasFootnotes).toBe(false);
});
it('rebuilds (numbering appears) the first time a footnote is inserted into a footnote-free doc', () => {
const state = makeState({
type: 'doc',
content: [{ type: 'paragraph', content: [{ type: 'text', text: 'hi' }] }],
});
expect(footnoteNumberingPluginKey.getState(state)?.hasFootnotes).toBe(false);
const ref = schema.nodes[FOOTNOTE_REFERENCE_NAME].create({ id: 'x' });
const next = state.apply(state.tr.insert(3, ref));
const after = footnoteNumberingPluginKey.getState(next);
expect(after?.hasFootnotes).toBe(true);
expect(getFootnoteNumber(next, 'x')).toBe(1);
});
});
describe('footnote numbering plugin — numbering unchanged with footnotes (#343 PART 5)', () => {
it('numbers references in document order via the single merged walk', () => {
const state = makeState(withTwoFootnotes);
expect(getFootnoteNumber(state, 'x')).toBe(1);
expect(getFootnoteNumber(state, 'y')).toBe(2);
});
it('produces a decoration for every reference and matching definition', () => {
const state = makeState(withTwoFootnotes);
const decos = footnoteNumberingPluginKey.getState(state)?.decorations;
// 2 references + 2 definitions = 4 number decorations.
expect(decos?.find().length).toBe(4);
});
it('keeps numbering current after an edit while footnotes exist', () => {
const state = makeState(withTwoFootnotes);
// Insert a NEW reference (id "z") before the others: it must become #1 and
// shift x -> #2, y -> #3 (deterministic document-order numbering).
const ref = schema.nodes[FOOTNOTE_REFERENCE_NAME].create({ id: 'z' });
const next = state.apply(state.tr.insert(1, ref));
expect(getFootnoteNumber(next, 'z')).toBe(1);
expect(getFootnoteNumber(next, 'x')).toBe(2);
expect(getFootnoteNumber(next, 'y')).toBe(3);
});
});
@@ -1,11 +1,9 @@
import { EditorState, Plugin, PluginKey } from '@tiptap/pm/state';
import { EditorState, Plugin, PluginKey, Transaction } from '@tiptap/pm/state';
import { Decoration, DecorationSet } from '@tiptap/pm/view';
import { Node as ProseMirrorNode } from '@tiptap/pm/model';
import { Node as ProseMirrorNode, Slice } from '@tiptap/pm/model';
import {
FOOTNOTE_DEFINITION_NAME,
FOOTNOTE_REFERENCE_NAME,
computeFootnoteNumbers,
computeFootnoteRefCounts,
} from './footnote-util';
export const footnoteNumberingPluginKey = new PluginKey<FootnoteNumberingState>(
@@ -27,8 +25,22 @@ interface FootnoteNumberingState {
refCounts: Map<string, number>;
/** Decorations rendering those numbers (refs + definitions). */
decorations: DecorationSet;
/** Whether the document contains ANY footnote reference/definition node.
* Cached so `apply` can skip the whole-doc walk on every keystroke in the
* common case (documents with no footnotes), recomputing only once a
* transaction actually inserts a footnote node (#343, PART 5). */
hasFootnotes: boolean;
}
/** Reusable empty state for footnote-free documents — avoids reallocating an
* empty map/decoration set on every keystroke while there are no footnotes. */
const EMPTY_STATE: FootnoteNumberingState = {
numbers: new Map(),
refCounts: new Map(),
decorations: DecorationSet.empty,
hasFootnotes: false,
};
/**
* Build the decoration set for footnote numbers. Pure function of the document:
* walk references in document order, assign 1-based numbers, then attach a
@@ -41,50 +53,101 @@ export function buildFootnoteDecorations(doc: ProseMirrorNode): DecorationSet {
return buildFootnoteNumberingState(doc).decorations;
}
function numberDecoration(pos: number, nodeSize: number, num: number): Decoration {
return Decoration.node(pos, pos + nodeSize, {
'data-footnote-number': String(num),
style: `--footnote-number: "${num}";`,
});
}
/**
* Compute both the number map AND the decorations for `doc` in a single walk.
* The plugin caches the result so NodeViews can read numbers without
* recomputing.
* Compute the number map, reference counts AND the decorations for `doc` in a
* SINGLE document walk (previously three separate O(n) traversals per
* docChanged — computeFootnoteNumbers + computeFootnoteRefCounts + a decoration
* pass, #343 PART 5). The plugin caches the result so NodeViews can read numbers
* without recomputing.
*
* References are numbered and decorated as they are encountered (document
* order). Definition positions are collected during the same walk and decorated
* afterwards from the completed number map — so a definition that appears before
* its reference in document order still resolves to the correct number, and the
* output is identical to the previous three-pass implementation. (Decoration
* insertion order does not matter: DecorationSet.create indexes by position.)
*/
function buildFootnoteNumberingState(
doc: ProseMirrorNode,
): FootnoteNumberingState {
const numbers = computeFootnoteNumbers(doc);
const refCounts = computeFootnoteRefCounts(doc);
const numbers = new Map<string, number>();
const refCounts = new Map<string, number>();
const decorations: Decoration[] = [];
const definitions: { id: string; pos: number; nodeSize: number }[] = [];
let n = 0;
let hasFootnotes = false;
doc.descendants((node, pos) => {
if (node.type.name === FOOTNOTE_REFERENCE_NAME) {
const num = numbers.get(node.attrs.id);
if (num != null) {
decorations.push(
Decoration.node(pos, pos + node.nodeSize, {
'data-footnote-number': String(num),
style: `--footnote-number: "${num}";`,
}),
);
}
}
if (node.type.name === FOOTNOTE_DEFINITION_NAME) {
const num = numbers.get(node.attrs.id);
if (num != null) {
decorations.push(
Decoration.node(pos, pos + node.nodeSize, {
'data-footnote-number': String(num),
style: `--footnote-number: "${num}";`,
}),
);
const typeName = node.type.name;
if (typeName === FOOTNOTE_REFERENCE_NAME) {
hasFootnotes = true;
const id = node.attrs.id;
if (id) {
if (!numbers.has(id)) numbers.set(id, ++n);
refCounts.set(id, (refCounts.get(id) ?? 0) + 1);
decorations.push(numberDecoration(pos, node.nodeSize, numbers.get(id)!));
}
} else if (typeName === FOOTNOTE_DEFINITION_NAME) {
hasFootnotes = true;
const id = node.attrs.id;
if (id != null) definitions.push({ id, pos, nodeSize: node.nodeSize });
}
});
if (!hasFootnotes) return EMPTY_STATE;
for (const def of definitions) {
const num = numbers.get(def.id);
if (num != null) {
decorations.push(numberDecoration(def.pos, def.nodeSize, num));
}
}
return {
numbers,
refCounts,
decorations: DecorationSet.create(doc, decorations),
hasFootnotes: true,
};
}
/**
* Cheap check: does any of a transaction's inserted content contain a footnote
* reference/definition node? Footnote nodes can only ENTER the document through
* replace steps (ReplaceStep / ReplaceAroundStep both expose a `.slice`), so
* scanning only the inserted slices — O(change size), not O(doc) — is sufficient
* to detect a newly-added footnote. Mark/attr steps never introduce nodes.
* Lets `apply` keep skipping the whole-doc walk until a footnote first appears.
*/
function transactionInsertsFootnote(tr: Transaction): boolean {
for (const step of tr.steps) {
const slice = (step as unknown as { slice?: Slice }).slice;
if (!slice || slice.content.size === 0) continue;
let found = false;
slice.content.descendants((node) => {
if (found) return false;
const typeName = node.type.name;
if (
typeName === FOOTNOTE_REFERENCE_NAME ||
typeName === FOOTNOTE_DEFINITION_NAME
) {
found = true;
return false;
}
return true;
});
if (found) return true;
}
return false;
}
/**
* Read the cached footnote number for `id` from the numbering plugin's state.
* This is the source NodeViews should use instead of calling
@@ -126,6 +189,13 @@ export function footnoteNumberingPlugin(): Plugin {
// the number map NodeViews read stays current on every edit while
// non-doc transactions (selection, etc.) reuse the cache for free.
if (!tr.docChanged) return old;
// Short-circuit the whole-doc walk while the document has no footnotes:
// if there were none and this transaction did not INSERT one, there is
// still nothing to number, so reuse the empty state (#343, PART 5). Once
// a footnote exists we always rebuild (covers renumbering/deletion).
if (!old.hasFootnotes && !transactionInsertsFootnote(tr)) {
return old;
}
return buildFootnoteNumberingState(tr.doc);
},
},
@@ -635,13 +635,17 @@ const Attachment = Node.create({
},
name: {
default: null,
parseHTML: (el: HTMLElement) => el.getAttribute("data-attachment-name"),
// Empty-string-vs-absent idempotency (GS-EDIT-REVERT class): "" -> default.
parseHTML: (el: HTMLElement) =>
el.getAttribute("data-attachment-name") || null,
renderHTML: (attrs: Record<string, any>) =>
attrs.name ? { "data-attachment-name": attrs.name } : {},
},
mime: {
default: null,
parseHTML: (el: HTMLElement) => el.getAttribute("data-attachment-mime"),
// Empty-string-vs-absent idempotency (GS-EDIT-REVERT class): "" -> default.
parseHTML: (el: HTMLElement) =>
el.getAttribute("data-attachment-mime") || null,
renderHTML: (attrs: Record<string, any>) =>
attrs.mime ? { "data-attachment-mime": attrs.mime } : {},
},
@@ -689,7 +693,10 @@ const Video = Node.create({
},
alt: {
default: null,
parseHTML: (el: HTMLElement) => el.getAttribute("aria-label"),
// Empty-string-vs-absent idempotency: coerce "" back to the default so a
// stray empty `aria-label` never materializes `alt: ""` on a video stored
// with no alt (same GS-EDIT-REVERT class as the image `alt` fix).
parseHTML: (el: HTMLElement) => el.getAttribute("aria-label") || null,
renderHTML: (attrs: Record<string, any>) =>
attrs.alt ? { "aria-label": attrs.alt } : {},
},
@@ -864,13 +871,15 @@ const diagramAttributes = () => ({
},
title: {
default: null,
parseHTML: (el: HTMLElement) => el.getAttribute("data-title"),
// Empty-string-vs-absent idempotency (GS-EDIT-REVERT class): "" -> default.
parseHTML: (el: HTMLElement) => el.getAttribute("data-title") || null,
renderHTML: (attrs: Record<string, any>) =>
attrs.title ? { "data-title": attrs.title } : {},
},
alt: {
default: null,
parseHTML: (el: HTMLElement) => el.getAttribute("data-alt"),
// Empty-string-vs-absent idempotency (GS-EDIT-REVERT class): "" -> default.
parseHTML: (el: HTMLElement) => el.getAttribute("data-alt") || null,
renderHTML: (attrs: Record<string, any>) =>
attrs.alt ? { "data-alt": attrs.alt } : {},
},
@@ -1106,7 +1115,8 @@ const Pdf = Node.create({
},
name: {
default: null,
parseHTML: (el: HTMLElement) => el.getAttribute("data-name"),
// Empty-string-vs-absent idempotency (GS-EDIT-REVERT class): "" -> default.
parseHTML: (el: HTMLElement) => el.getAttribute("data-name") || null,
renderHTML: (attrs: Record<string, any>) =>
attrs.name ? { "data-name": attrs.name } : {},
},
@@ -1491,6 +1501,29 @@ export const docmostExtensions = [
...parent.height,
parseHTML: (el: HTMLElement) => el.getAttribute("height"),
},
// Empty-string-vs-absent idempotency (GS-EDIT-REVERT class). `marked`
// renders `![](src)` as `<img alt="">`, so the stock Image `alt`
// parseHTML (`getAttribute("alt")`) materializes `alt: ""` on an image
// that was stored with NO alt (attr absent). That is a false diff against
// the editor-stored form (a no-alt image has alt ABSENT, not ""), so a
// git-sync / ai-chat touch of a page with a plain image produced phantom
// churn. Coerce an empty string back to the attr's default (null) so the
// import is idempotent. A real alt survives verbatim (`|| undefined` keeps
// the truthy value; the default fills the empty case). `title` is coerced
// the same way for the whole class, even though `marked` does not
// currently emit `title=""` — defence in depth against any path that does.
// NOTE: this DIVERGES from editor-ext's literal image `alt` parseHTML
// (`getAttribute("alt")`, which returns "" verbatim), but CONVERGES on
// editor-ext's real STORED shape: an editor image inserted without alt
// renders with no `alt` attribute and re-parses as absent, never "".
alt: {
...parent.alt,
parseHTML: (el: HTMLElement) => el.getAttribute("alt") || null,
},
title: {
...parent.title,
parseHTML: (el: HTMLElement) => el.getAttribute("title") || null,
},
};
},
}).configure({ inline: false }),
@@ -0,0 +1,443 @@
/**
* Reusable round-trip-STABILITY matrix helper (fixtures-first).
*
* A single stored node authored WITHOUT a given string attribute (attr
* absent / undefined) must not gain a phantom EMPTY-STRING value after a
* markdown round-trip — the "empty-string-vs-absent" churn class. This helper,
* given a node spec, drives a matrix of attribute combinations through the REAL
* converter (`convertProseMirrorToMarkdown` -> `markdownToProseMirror`) and
* asserts byte-stability on two contours:
*
* 1. RAW round-trip: for the node under test, every attribute the round-trip
* materializes must equal what the INPUT authored — an authored attr keeps
* its value, an ABSENT attr may only reappear at its SCHEMA DEFAULT. If an
* absent attr comes back as a NON-default value (e.g. `alt: ""` where the
* default is `null`), that is an instability and is reported precisely as
* `type.attr: absent -> "<got>"`. This is the contour git-sync / stored
* JSON diffs on, so masking it only in `canonicalize` would leave the noise.
*
* 2. CANONICAL round-trip: `canonicalizeContent(original)` must deep-equal
* `canonicalizeContent(roundtrip)` (a second, semantic contour).
*
* The ONLY normalization the helper treats as allowed (not an instability) is
* the DOCUMENTED numeric width/height/size/aspectRatio -> string coercion the
* converter performs on purpose (a stored numeric `640` re-parses via
* `getAttribute` as the string `"640"`). It is encoded here as an explicit
* per-spec `numericStringAttrs` set applied to BOTH contours, NOT a silent skip.
*
* The helper is node-type agnostic: image and the whole media family share the
* `align !== "center"` predicate + `<!--name {…}-->` comment machinery, so one
* matrix guards the shared class.
*/
import { getSchema } from "@tiptap/core";
import {
convertProseMirrorToMarkdown,
markdownToProseMirror,
canonicalizeContent,
docmostExtensions,
} from "../src/lib/index.js";
import { firstDivergence } from "./roundtrip-helpers.js";
/** One attribute's two probe values. */
export interface AttrMatrixEntry {
/** Attribute name on the node. */
attr: string;
/**
* The "default" pick. `undefined` means the attribute is OMITTED entirely
* (the absent case — the one that can materialize an empty string on import).
* A concrete value is authored verbatim.
*/
default: unknown;
/** A representative NON-default value to exercise (must survive verbatim). */
nonDefault: unknown;
/**
* Marks the attr as a member of the EMPTY-STRING class the fix targets: a
* string attr whose schema default is `null`/absent and whose parseHTML
* coerces `"" -> default` (image/drawio `alt`+`title`, video `alt` via
* aria-label, pdf/attachment `name`, attachment `mime`). Set true to also
* drive the THIRD-STATE convergence case (see runConvergenceCase) for this
* attr. Attrs whose default is NOT null (e.g. embed `provider`, default "")
* or that are not `""`-coerced (control attrs) are left unset.
*/
emptyStringClass?: boolean;
}
/** A node type + the attribute matrix to sweep for it. */
export interface NodeStabilitySpec {
/** Node type (e.g. "image", "video"). */
type: string;
/** Attributes always present on the node (e.g. `{ src: "/i.png" }`). */
baseAttrs?: Record<string, unknown>;
/** Attributes to sweep at default and non-default. */
attrMatrix: AttrMatrixEntry[];
/**
* Attributes whose numeric -> string coercion on round-trip is DOCUMENTED and
* intentional; compared modulo `String(x)` on both sides. Defaults to the
* converter's known sizing set.
*/
numericStringAttrs?: string[];
}
/** A single unstable finding, legible enough to tie a gate-lock to. */
export interface Instability {
type: string;
attr: string;
/** What the input authored: the literal value, or the ABSENT sentinel. */
authored: unknown | typeof ABSENT;
/** What the round-trip produced. */
got: unknown;
/** What a stable round-trip should have produced (authored value or default). */
expected: unknown;
}
/** One matrix cell's result. */
export interface ComboResult {
label: string;
authored: Record<string, unknown>;
/** RAW-contour instabilities on the node under test. */
raw: Instability[];
/** CANONICAL-contour divergence (path + values) or null when equal. */
canonical: { path: string; a: unknown; b: unknown } | null;
/** True when the node type failed to round-trip at all (structural loss). */
missing: boolean;
md: string;
}
/** Whole-matrix report for one node spec. */
export interface MatrixReport {
type: string;
combos: ComboResult[];
}
/** Sentinel marking an attribute the input did NOT author. */
export const ABSENT = Symbol("ABSENT");
const DEFAULT_NUMERIC_STRING_ATTRS = [
"width",
"height",
"size",
"aspectRatio",
];
// The ProseMirror schema the converter targets — its attribute `default`s are
// the authoritative "what an absent attr should re-materialize as" oracle.
const schema = getSchema(docmostExtensions);
/** Read the schema default for every attribute of a node type. */
function schemaDefaults(type: string): Record<string, unknown> {
const specAttrs = (schema.nodes[type]?.spec?.attrs ?? {}) as Record<
string,
{ default: unknown }
>;
const out: Record<string, unknown> = {};
for (const [k, v] of Object.entries(specAttrs)) out[k] = v.default;
return out;
}
/** Find the first node of a given type anywhere in a PM doc tree. */
function findFirst(node: any, type: string): any {
if (node && node.type === type) return node;
for (const child of node?.content ?? []) {
const hit = findFirst(child, type);
if (hit) return hit;
}
return null;
}
/** Coerce a scalar for the documented numeric->string comparison. */
const numStr = (x: unknown): unknown => (x == null ? x : String(x));
/**
* Enumerate the cartesian product of the matrix: every attribute independently
* at its default (index 0) or non-default (index 1) pick. The all-default
* corner is included (the baseline). Small by construction (2^N over a handful
* of at-risk string attrs).
*/
function enumerateCombos(matrix: AttrMatrixEntry[]): number[][] {
let combos: number[][] = [[]];
for (let i = 0; i < matrix.length; i++) {
const next: number[][] = [];
for (const c of combos) {
next.push([...c, 0]);
next.push([...c, 1]);
}
combos = next;
}
return combos;
}
/** Build the authored attrs for one combo pick vector. */
function authoredAttrs(
spec: NodeStabilitySpec,
picks: number[],
): Record<string, unknown> {
const attrs: Record<string, unknown> = { ...(spec.baseAttrs ?? {}) };
spec.attrMatrix.forEach((entry, i) => {
if (picks[i] === 1) {
attrs[entry.attr] = entry.nonDefault;
} else if (entry.default !== undefined) {
attrs[entry.attr] = entry.default;
}
// default === undefined -> OMIT the attr entirely (the absent case).
});
return attrs;
}
/** Human-readable label for a combo (which attrs are at non-default). */
function comboLabel(spec: NodeStabilitySpec, picks: number[]): string {
const on = spec.attrMatrix
.filter((_, i) => picks[i] === 1)
.map((e) => e.attr);
return on.length === 0 ? "<all-default>" : on.join("+");
}
/**
* Run the full stability matrix for one node spec and return a structured
* report (does NOT throw — the caller asserts, so a failure can print the whole
* report). Every combo runs the real export->import pipeline once.
*/
export async function runStabilityMatrix(
spec: NodeStabilitySpec,
): Promise<MatrixReport> {
const numericStringAttrs = new Set(
spec.numericStringAttrs ?? DEFAULT_NUMERIC_STRING_ATTRS,
);
const defaults = schemaDefaults(spec.type);
const combos: ComboResult[] = [];
for (const picks of enumerateCombos(spec.attrMatrix)) {
const authored = authoredAttrs(spec, picks);
const doc = { type: "doc", content: [{ type: spec.type, attrs: authored }] };
const md = convertProseMirrorToMarkdown(doc);
const rt = await markdownToProseMirror(md);
const node = findFirst(rt, spec.type);
const result: ComboResult = {
label: comboLabel(spec, picks),
authored,
raw: [],
canonical: null,
missing: node == null,
md,
};
if (node != null) {
// RAW contour: every materialized attr must equal the authored value, or
// (for an absent attr) the schema default — modulo the documented numeric
// string coercion.
const rtAttrs = (node.attrs ?? {}) as Record<string, unknown>;
for (const key of Object.keys(rtAttrs)) {
const authoredHas = Object.prototype.hasOwnProperty.call(authored, key);
const expected = authoredHas ? authored[key] : defaults[key];
let got = rtAttrs[key];
let exp = expected;
if (numericStringAttrs.has(key)) {
got = numStr(got);
exp = numStr(exp);
}
if (firstDivergence(got, exp) !== null) {
result.raw.push({
type: spec.type,
attr: key,
authored: authoredHas ? authored[key] : ABSENT,
got: rtAttrs[key],
expected,
});
}
}
// CANONICAL contour: canonical forms deep-equal, modulo the same numeric
// string coercion (applied to both trees so a documented coercion is not
// counted as a divergence).
const ca = normalizeNumeric(canonicalizeContent(doc), numericStringAttrs);
const cb = normalizeNumeric(canonicalizeContent(rt), numericStringAttrs);
result.canonical = firstDivergence(ca, cb);
}
combos.push(result);
}
return { type: spec.type, combos };
}
/**
* Deep-copy a canonical tree, coercing the documented numeric->string attrs to
* their string form so an intentional `640 -> "640"` coercion is not reported
* as a canonical divergence. Only touches the listed attribute keys.
*/
function normalizeNumeric(node: any, attrs: Set<string>): any {
if (Array.isArray(node)) return node.map((n) => normalizeNumeric(n, attrs));
if (node === null || typeof node !== "object") return node;
const out: Record<string, unknown> = {};
for (const key of Object.keys(node)) {
if (key === "attrs" && node.attrs && typeof node.attrs === "object") {
const a: Record<string, unknown> = {};
for (const [k, v] of Object.entries(node.attrs)) {
a[k] = attrs.has(k) ? numStr(v) : v;
}
out.attrs = a;
} else {
out[key] = normalizeNumeric(node[key], attrs);
}
}
return out;
}
/** Flatten a report to just its unstable combos (for a terse assertion). */
export function unstableCombos(report: MatrixReport): ComboResult[] {
return report.combos.filter(
(c) => c.missing || c.raw.length > 0 || c.canonical !== null,
);
}
// ---------------------------------------------------------------------------
// THIRD STATE: an EXPLICITLY-STORED empty string on a string attr.
//
// The matrix above sweeps TWO states per string attr: absent/default and a
// non-default value — and asserts FIRST-pass byte-stability for both. There is
// a third, degenerate state the matrix does NOT cover: the attr stored as a
// LITERAL `""`. This is DISTINCT from "the node never had the attr": a user
// types an alt in the editor, then deletes it, and Tiptap's
// `updateAttributes({ alt: "" })` persists a literal `alt: ""` in the stored
// JSON. There is no absent-vs-"" distinction in the DOM once serialized, so the
// fix's `getAttribute("alt") || null` coercion canonicalizes BOTH to the
// default (`null`).
//
// Consequence — and this is CORRECT, not a bug: a doc carrying an explicit `""`
// converges to the default on the FIRST round-trip (a ONE-TIME diff: `"" ->
// null`), then is byte-stable from the SECOND round-trip on (idempotent). So
// this state must be pinned with a DIFFERENT contract than the matrix's:
// - do NOT assert first-pass byte-stability (the first pass legitimately
// changes `""` -> default), and
// - DO assert the first pass converges to the default AND the second pass is
// idempotent (rt2 deep-equals rt1).
//
// A future sync/QA pass diffing stored pages will see this one-time `"" -> null`
// normalization exactly once per affected node; it is the converter canon, not
// corruption, and must not be flagged as data loss.
// ---------------------------------------------------------------------------
/** Result of the third-state ("explicit empty string") convergence probe. */
export interface ConvergenceResult {
type: string;
attr: string;
/** The schema default the attr must converge to on pass 1 (null / absent). */
expectedDefault: unknown;
/** rt1's materialized value for the attr — must equal `expectedDefault`. */
firstPassValue: unknown;
/** True when the node round-tripped AND rt1 converged the attr to default. */
convergedToDefault: boolean;
/** rt1-vs-rt2 divergence; MUST be null (idempotent from pass 2 on). */
secondPassDivergence: { path: string; a: unknown; b: unknown } | null;
/** True when the node type failed to round-trip at all (structural loss). */
missing: boolean;
}
/** Round-trip a full PM doc through the real converter once. */
async function roundtripDoc(doc: any): Promise<any> {
return markdownToProseMirror(convertProseMirrorToMarkdown(doc));
}
/**
* Third-state convergence probe for one string attr of the empty-string class.
*
* (a) builds a doc with the attr EXPLICITLY set to `""` (baseAttrs + `""`),
* (b) rt1 = roundtrip(doc); asserts rt1's attr equals the schema default — the
* documented ONE-TIME `"" -> default` normalization (NOT byte-stable vs the
* `""` input, so first-pass stability is deliberately NOT asserted here),
* (c) rt2 = roundtrip(rt1); asserts rt2 deep-equals rt1 — idempotent from the
* second round-trip on.
*
* Returns a structured result (does NOT throw) so the caller can assert and
* print. Reusable across the whole node family: drive it for every attr flagged
* `emptyStringClass` on every spec (see convergenceCasesFor / the test driver).
*/
export async function runConvergenceCase(
spec: NodeStabilitySpec,
attr: string,
): Promise<ConvergenceResult> {
const expectedDefault = schemaDefaults(spec.type)[attr];
// (a) The degenerate third state: attr persisted as a LITERAL "".
const authored = { ...(spec.baseAttrs ?? {}), [attr]: "" };
const doc = { type: "doc", content: [{ type: spec.type, attrs: authored }] };
// (b) First round-trip: "" must normalize to the default (a one-time diff).
const rt1 = await roundtripDoc(doc);
const node1 = findFirst(rt1, spec.type);
const firstPassValue = node1?.attrs?.[attr];
const convergedToDefault =
node1 != null && firstDivergence(firstPassValue, expectedDefault) === null;
// (c) Second round-trip: must be byte-stable (rt2 deep-equals rt1). We compare
// the WHOLE docs — both are converter OUTPUTS already in the same materialized
// form (numeric attrs are strings on both sides), so no numeric normalization
// is needed here, unlike the raw/canonical contours above.
const rt2 = node1 != null ? await roundtripDoc(rt1) : rt1;
const secondPassDivergence =
node1 != null ? firstDivergence(rt1, rt2) : null;
return {
type: spec.type,
attr,
expectedDefault,
firstPassValue,
convergedToDefault,
secondPassDivergence,
missing: node1 == null,
};
}
/** The attrs of a spec flagged as members of the empty-string class. */
export function convergenceCasesFor(spec: NodeStabilitySpec): string[] {
return spec.attrMatrix
.filter((e) => e.emptyStringClass)
.map((e) => e.attr);
}
/** True when a convergence result honours the "converges once, then stable" contract. */
export function convergenceOk(r: ConvergenceResult): boolean {
return !r.missing && r.convergedToDefault && r.secondPassDivergence === null;
}
/** Render a convergence result as a legible one-liner for a failed assertion. */
export function formatConvergence(r: ConvergenceResult): string {
if (r.missing) return `${r.type}.${r.attr}: DID-NOT-ROUND-TRIP`;
const parts: string[] = [];
if (!r.convergedToDefault) {
parts.push(
`pass1 did NOT converge: got ${JSON.stringify(r.firstPassValue)} (expected default ${JSON.stringify(r.expectedDefault)})`,
);
}
if (r.secondPassDivergence) {
parts.push(
`pass2 NOT idempotent @ ${r.secondPassDivergence.path}: ${JSON.stringify(r.secondPassDivergence.a)} vs ${JSON.stringify(r.secondPassDivergence.b)}`,
);
}
const status = parts.length === 0 ? "converges-once-then-stable" : parts.join("; ");
return `${r.type}.${r.attr}: ${status}`;
}
/** Render a report as a legible multi-line string for a failed assertion. */
export function formatReport(report: MatrixReport): string {
const lines: string[] = [`node "${report.type}":`];
for (const c of report.combos) {
const flags: string[] = [];
if (c.missing) flags.push("DID-NOT-ROUND-TRIP");
for (const i of c.raw) {
const authored =
i.authored === ABSENT ? "absent" : JSON.stringify(i.authored);
flags.push(
`RAW ${i.type}.${i.attr}: ${authored} -> ${JSON.stringify(i.got)} (expected ${JSON.stringify(i.expected)})`,
);
}
if (c.canonical) {
flags.push(
`CANON @ ${c.canonical.path}: ${JSON.stringify(c.canonical.a)} vs ${JSON.stringify(c.canonical.b)}`,
);
}
const status = flags.length === 0 ? "stable" : flags.join("; ");
lines.push(` [${c.label}] ${status}`);
}
return lines.join("\n");
}
@@ -0,0 +1,164 @@
import { describe, expect, it } from "vitest";
import {
runStabilityMatrix,
unstableCombos,
formatReport,
runConvergenceCase,
convergenceCasesFor,
convergenceOk,
formatConvergence,
type NodeStabilitySpec,
} from "./roundtrip-stability.helper.js";
// ---------------------------------------------------------------------------
// Round-trip STABILITY matrix for image + the media family.
//
// Guards the "empty-string-vs-absent" churn class (GS-EDIT-REVERT family): a
// stored node authored WITHOUT a string attr (alt/title/caption/aria-label/...)
// must not gain a phantom `attr: ""` after `markdownToProseMirror(convert…)`.
// Each spec sweeps the at-risk string attrs at DEFAULT (absent) and at a real
// NON-default value; the helper asserts both the RAW round-trip (attrs equal the
// input's, modulo the documented numeric width/height/size/aspectRatio -> string
// coercion) and the CANONICAL round-trip (canonical forms deep-equal).
//
// The image + media family share the `align !== "center"` predicate and the
// `<!--name {…}-->` comment machinery, so one matrix guards the shared class.
// align is NOT part of this class (it round-trips correctly) and is not swept.
// ---------------------------------------------------------------------------
const SPECS: NodeStabilitySpec[] = [
{
// Image carries the most at-risk string attrs. `alt` is the one marked
// materializes as `<img alt="">` on `![](src)` import (the real bug); title
// and caption are covered as the same class. attachmentId is a string attr
// that must stay absent when unset (control).
type: "image",
baseAttrs: { src: "/i.png" },
attrMatrix: [
{ attr: "alt", default: undefined, nonDefault: "a real alt text", emptyStringClass: true },
{ attr: "title", default: undefined, nonDefault: "a real title", emptyStringClass: true },
{ attr: "caption", default: undefined, nonDefault: "a real caption" },
{ attr: "attachmentId", default: undefined, nonDefault: "att-42" },
],
},
{
// Video's `alt` rides the `aria-label` attribute (media aria-label at risk).
type: "video",
baseAttrs: { src: "/v.mp4" },
attrMatrix: [
{ attr: "alt", default: undefined, nonDefault: "a clip", emptyStringClass: true },
{ attr: "attachmentId", default: undefined, nonDefault: "att-1" },
],
},
{
// Audio carries no alt/title; attachmentId is its only optional string attr.
type: "audio",
baseAttrs: { src: "/a.mp3" },
attrMatrix: [
{ attr: "attachmentId", default: undefined, nonDefault: "att-2" },
],
},
{
// pdf: link-form media. `name` (filename) is its at-risk string attr.
type: "pdf",
baseAttrs: { src: "/d.pdf" },
attrMatrix: [
{ attr: "name", default: undefined, nonDefault: "report.pdf", emptyStringClass: true },
{ attr: "attachmentId", default: undefined, nonDefault: "att-3" },
],
},
{
// attachment: link-form media (file card). `name` + `mime` string attrs.
type: "attachment",
baseAttrs: { url: "/f.zip" },
attrMatrix: [
{ attr: "name", default: undefined, nonDefault: "bundle.zip", emptyStringClass: true },
{ attr: "mime", default: undefined, nonDefault: "application/zip", emptyStringClass: true },
{ attr: "attachmentId", default: undefined, nonDefault: "att-4" },
],
},
{
// embed: link-form media. `provider` is its at-risk string attr (schema
// default ""). embed's numeric width/height defaults (800/600) are a SEPARATE,
// documented limitation OUTSIDE the empty-string class: they are not in
// canonicalize's KNOWN_DEFAULTS, so an ABSENT width/height re-imports as the
// 800/600 default and diverges canonically (see the note in canonicalize.ts).
// That is canonicalize-owned and out of scope here, so we author the
// dimensions at their defaults (as real editor embeds carry them) to keep this
// guard focused on the empty-string/provider class.
// provider's schema default is "" (NOT null), so a re-imported "" is the
// correct value, not a phantom — it is outside the null-default empty-string
// class. We author it at its "" default (the default pick) so the sweep still
// asserts a non-default provider ("youtube") round-trips, without tripping the
// canonicalize KNOWN_DEFAULTS gap for embed's non-null defaults.
type: "embed",
baseAttrs: { src: "https://example.com/x", width: 800, height: 600 },
attrMatrix: [
{ attr: "provider", default: "", nonDefault: "youtube" },
],
},
{
// drawio: image-form diagram. `title` + `alt` string attrs (data-title/-alt).
type: "drawio",
baseAttrs: { src: "blob:drawio" },
attrMatrix: [
{ attr: "title", default: undefined, nonDefault: "flow chart", emptyStringClass: true },
{ attr: "alt", default: undefined, nonDefault: "an alt", emptyStringClass: true },
{ attr: "attachmentId", default: undefined, nonDefault: "att-5" },
],
},
{
// excalidraw: image-form diagram, same shared diagramAttributes set.
type: "excalidraw",
baseAttrs: { src: "blob:excalidraw" },
attrMatrix: [
{ attr: "title", default: undefined, nonDefault: "sketch", emptyStringClass: true },
{ attr: "alt", default: undefined, nonDefault: "an alt", emptyStringClass: true },
{ attr: "attachmentId", default: undefined, nonDefault: "att-6" },
],
},
];
describe("round-trip stability matrix (image + media family)", () => {
for (const spec of SPECS) {
it(`${spec.type}: no attr materializes an empty-string / phantom value`, async () => {
const report = await runStabilityMatrix(spec);
const unstable = unstableCombos(report);
// On failure, print the WHOLE matrix so which (attr, value) combos are
// unstable is legible.
expect(unstable, `\n${formatReport(report)}\n`).toEqual([]);
});
}
});
// ---------------------------------------------------------------------------
// THIRD STATE: an attr EXPLICITLY stored as a literal "" (GS-EDIT-REVERT: a user
// typed alt/title/name/... then deleted it, so Tiptap persisted `attr: ""` — a
// value DISTINCT from "attr was never set"). Unlike the absent case above, this
// state is NOT first-pass byte-stable: the fix's `"" -> default` coercion is a
// deliberate ONE-TIME normalization on the FIRST sync round-trip, stable
// thereafter. We therefore assert a DIFFERENT contract — "converges to default
// on pass 1, then idempotent from pass 2 on" — for every empty-string-class attr
// across the whole node family (image/video/pdf/attachment/drawio/excalidraw).
//
// IMPORTANT for a future sync/QA pass: the pass-1 `"" -> null` diff is the
// converter canon, not corruption. It appears at most once per affected node and
// must NOT be flagged as "the converter is losing/corrupting page data".
// ---------------------------------------------------------------------------
describe("round-trip third state: explicit empty string converges once, then idempotent", () => {
for (const spec of SPECS) {
for (const attr of convergenceCasesFor(spec)) {
it(`${spec.type}.${attr}: "" normalizes to default on pass 1, byte-stable from pass 2`, async () => {
const r = await runConvergenceCase(spec, attr);
// Pass 1 must converge "" -> the schema default (the one-time diff) and
// pass 2 (roundtrip of pass-1 output) must be byte-stable. formatConvergence
// prints exactly which half failed.
expect(convergenceOk(r), `\n${formatConvergence(r)}\n`).toBe(true);
// Spell the contract out explicitly so the intent is legible in the test:
expect(r.convergedToDefault, `\n${formatConvergence(r)}\n`).toBe(true);
expect(r.firstPassValue).toEqual(r.expectedDefault);
expect(r.secondPassDivergence, `\n${formatConvergence(r)}\n`).toBeNull();
});
}
}
});