Compare commits
7 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| c5bff2d84a | |||
| 80fc30633b | |||
| e17d5bc060 | |||
| 2c2d60a5dc | |||
| 1417209915 | |||
| f555fc87da | |||
| d6d1195abd |
@@ -201,7 +201,7 @@ pnpm workspace (`pnpm@10.4.0`) orchestrated by **Nx**. Four workspace packages:
|
||||
| `apps/client` | `client` | React 18 + Vite + Mantine 8 + TanStack Query + Jotai | SPA frontend |
|
||||
| `packages/editor-ext` | `@docmost/editor-ext` | Tiptap/ProseMirror | Shared Tiptap node/mark extensions, imported by both the client and the server |
|
||||
| `packages/mcp` | `@docmost/mcp` | MCP SDK, Tiptap, Yjs | Standalone MCP server, also bundled into the server at `/mcp`. Consumes the shared converter/schema from `@docmost/prosemirror-markdown` (#293) — it no longer carries its own vendored converter/schema copy |
|
||||
| `packages/prosemirror-markdown` | `@docmost/prosemirror-markdown` | Tiptap, marked, jsdom | The single, canonical ProseMirror↔Markdown converter + Docmost schema mirror (#293). Consumed by `mcp` and `git-sync`; there is exactly ONE copy of the converter now |
|
||||
| `packages/prosemirror-markdown` | `@docmost/prosemirror-markdown` | Tiptap, marked, jsdom | The single, canonical ProseMirror↔Markdown converter + Docmost schema mirror (#293). Consumed by `mcp`, `git-sync`, AND `apps/server` (server-side markdown import/export, #345); there is exactly ONE copy of the converter now |
|
||||
|
||||
`build` targets are Nx-cached and dependency-ordered (`dependsOn: ["^build"]`), so `editor-ext` builds before the apps. `nx.json` sets `affected.defaultBase: main`.
|
||||
|
||||
@@ -284,7 +284,7 @@ The API server is a Fastify app with a global `/api` prefix (`main.ts` excludes
|
||||
### Client structure
|
||||
Vite SPA. Code is organized by feature under `apps/client/src/features/*` (mirrors the server domains: `page`, `space`, `comment`, `ai-chat`, `editor`, …). Conventions:
|
||||
- **TanStack Query** for server state (one `queries/` file per feature), **Jotai** atoms for local/shared UI state, **Mantine 8** + CSS modules (`*.module.css`) + `postcss-preset-mantine` for UI.
|
||||
- The editor is Tiptap; shared node/mark extensions live in `packages/editor-ext` and are imported by **both the client and the server** (collaboration, import/export) — editor schema changes often need to be made in `editor-ext`, not just the client. The ProseMirror↔Markdown converter and its Docmost schema mirror now live in a SINGLE package, `@docmost/prosemirror-markdown` (#293), consumed by both `mcp` and `git-sync` — do NOT reintroduce a per-package copy. `editor-ext` is the upstream source of the Tiptap schema; the package's `docmost-schema.ts` mirrors it and a serializer-contract test (`packages/prosemirror-markdown/test/serializer-contract.test.ts`) guards the boundary (every schema node must have a converter case), so a drift surfaces as a failing test rather than silent divergence.
|
||||
- The editor is Tiptap; shared node/mark extensions live in `packages/editor-ext` and are imported by **both the client and the server** (collaboration, schema, `canonicalizeFootnotes`) — editor schema changes often need to be made in `editor-ext`, not just the client. Server-side markdown import/export no longer lives in `editor-ext`: it goes through the canonical converter (#345, see below). The ProseMirror↔Markdown converter and its Docmost schema mirror now live in a SINGLE package, `@docmost/prosemirror-markdown` (#293), consumed by `mcp`, `git-sync`, and `apps/server` (#345) — do NOT reintroduce a per-package copy. `editor-ext` is the upstream source of the Tiptap schema; the package's `docmost-schema.ts` mirrors it and a serializer-contract test (`packages/prosemirror-markdown/test/serializer-contract.test.ts`) guards the boundary (every schema node must have a converter case), so a drift surfaces as a failing test rather than silent divergence.
|
||||
- API access goes through `apps/client/src/lib/api-client.ts` (axios). The `@` alias maps to `apps/client/src`.
|
||||
- Runtime config is injected at build time by `vite.config.ts` via `define` (`APP_URL`, `COLLAB_URL`, `APP_VERSION`, …) — these come from the root `.env`, not from `import.meta.env`.
|
||||
|
||||
|
||||
@@ -12,14 +12,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
||||
|
||||
### Added
|
||||
|
||||
- **Save intentional page versions.** Press `Cmd/Ctrl+S` (or use the page menu)
|
||||
to save a named version of a page. The history panel now distinguishes
|
||||
intentional versions (a "Saved" / "Agent version" badge) from automatic
|
||||
snapshots, dims autosaves, and offers an "Only versions" filter. Automatic
|
||||
snapshots switched from a fixed interval to a trailing idle-flush with a
|
||||
max-wait ceiling, and a boundary snapshot is pinned whenever the editing source
|
||||
changes (e.g. a person's edits followed by the AI agent). (#370)
|
||||
|
||||
- **Place several images side by side in a row.** A new "Inline (side by
|
||||
side)" alignment mode in the image bubble menu renders consecutive inline
|
||||
images as a row that wraps onto the next line on narrow screens. The row is
|
||||
|
||||
@@ -1385,14 +1385,5 @@
|
||||
"The commented text changed since this suggestion was made; it was not applied.": "The commented text changed since this suggestion was made; it was not applied.",
|
||||
"Dismiss": "Dismiss",
|
||||
"Suggestion dismissed": "Suggestion dismissed",
|
||||
"Failed to dismiss suggestion": "Failed to dismiss suggestion",
|
||||
"Save version": "Save version",
|
||||
"Ctrl+S": "Ctrl+S",
|
||||
"Version saved": "Version saved",
|
||||
"Already saved as the latest version": "Already saved as the latest version",
|
||||
"Agent version": "Agent version",
|
||||
"Boundary": "Boundary",
|
||||
"Autosave": "Autosave",
|
||||
"Only versions": "Only versions",
|
||||
"No saved versions yet.": "No saved versions yet."
|
||||
"Failed to dismiss suggestion": "Failed to dismiss suggestion"
|
||||
}
|
||||
|
||||
@@ -1248,14 +1248,5 @@
|
||||
"The commented text changed since this suggestion was made; it was not applied.": "Прокомментированный текст изменился после создания предложения; оно не было применено.",
|
||||
"Dismiss": "Не применять",
|
||||
"Suggestion dismissed": "Предложение отклонено",
|
||||
"Failed to dismiss suggestion": "Не удалось отклонить предложение",
|
||||
"Save version": "Сохранить версию",
|
||||
"Ctrl+S": "Ctrl+S",
|
||||
"Version saved": "Версия сохранена",
|
||||
"Already saved as the latest version": "Уже сохранено как последняя версия",
|
||||
"Agent version": "Версия агента",
|
||||
"Boundary": "Граница",
|
||||
"Autosave": "Автосейв",
|
||||
"Only versions": "Только версии",
|
||||
"No saved versions yet.": "Пока нет сохранённых версий."
|
||||
"Failed to dismiss suggestion": "Не удалось отклонить предложение"
|
||||
}
|
||||
|
||||
@@ -1,19 +1,10 @@
|
||||
import { atom } from "jotai";
|
||||
import { Editor } from "@tiptap/core";
|
||||
import type { HocuspocusProvider } from "@hocuspocus/provider";
|
||||
import { PageEditMode } from "@/features/user/types/user.types.ts";
|
||||
import type { DictationUnavailableReason } from "@/features/dictation/dictation-status";
|
||||
|
||||
export const pageEditorAtom = atom<Editor | null>(null);
|
||||
|
||||
// #370 — the active page's collab provider, published by the page editor so the
|
||||
// header menu can emit the "save-version" stateless signal (Cmd+S / button).
|
||||
// Null when the page is read-only / collab isn't connected. A typed initial
|
||||
// value (rather than an explicit generic) keeps jotai's overload resolution on
|
||||
// the writable PrimitiveAtom branch.
|
||||
const initialCollabProvider: HocuspocusProvider | null = null;
|
||||
export const collabProviderAtom = atom(initialCollabProvider);
|
||||
|
||||
export const titleEditorAtom = atom<Editor | null>(null);
|
||||
|
||||
export const readOnlyEditorAtom = atom<Editor | null>(null);
|
||||
|
||||
@@ -31,18 +31,11 @@ import { useAtom, useAtomValue, useSetAtom } from "jotai";
|
||||
import useCollaborationUrl from "@/features/editor/hooks/use-collaboration-url";
|
||||
import { currentUserAtom } from "@/features/user/atoms/current-user-atom";
|
||||
import {
|
||||
collabProviderAtom,
|
||||
currentPageEditModeAtom,
|
||||
dictationAvailabilityAtom,
|
||||
pageEditorAtom,
|
||||
yjsConnectionStatusAtom,
|
||||
} from "@/features/editor/atoms/editor-atoms";
|
||||
import { notifications } from "@mantine/notifications";
|
||||
import {
|
||||
VERSION_SAVED_MESSAGE_TYPE,
|
||||
type VersionSavedMessage,
|
||||
saveVersionPending,
|
||||
} from "@/features/page-history/version-messages";
|
||||
import { asideStateAtom } from "@/components/layouts/global/hooks/atoms/sidebar-atom";
|
||||
import {
|
||||
activeCommentIdAtom,
|
||||
@@ -130,7 +123,6 @@ export default function PageEditor({
|
||||
|
||||
const [currentUser] = useAtom(currentUserAtom);
|
||||
const [, setEditor] = useAtom(pageEditorAtom);
|
||||
const setCollabProvider = useSetAtom(collabProviderAtom);
|
||||
const [, setAsideState] = useAtom(asideStateAtom);
|
||||
const [, setActiveCommentId] = useAtom(activeCommentIdAtom);
|
||||
const [showCommentPopup, setShowCommentPopup] = useAtom(showCommentPopupAtom);
|
||||
@@ -188,24 +180,6 @@ export default function PageEditor({
|
||||
const onStatelessHandler = ({ payload }: onStatelessParameters) => {
|
||||
try {
|
||||
const message = JSON.parse(payload);
|
||||
// #370 — a version was saved somewhere; live-refresh the history panel
|
||||
// on every client. Only the client that pressed Save (tracked by the
|
||||
// module-level flag) shows the confirmation toast.
|
||||
if (message?.type === VERSION_SAVED_MESSAGE_TYPE) {
|
||||
const versionMsg = message as VersionSavedMessage;
|
||||
queryClient.invalidateQueries({
|
||||
queryKey: ["page-history-list"],
|
||||
});
|
||||
if (saveVersionPending.current) {
|
||||
saveVersionPending.current = false;
|
||||
notifications.show({
|
||||
message: versionMsg.alreadySaved
|
||||
? t("Already saved as the latest version")
|
||||
: t("Version saved"),
|
||||
});
|
||||
}
|
||||
return;
|
||||
}
|
||||
if (message?.type !== "page.updated" || !message.updatedAt) return;
|
||||
const pageData = queryClient.getQueryData<IPage>(["pages", slugId]);
|
||||
if (pageData) {
|
||||
@@ -263,16 +237,12 @@ export default function PageEditor({
|
||||
|
||||
local.on("synced", onLocalSyncedHandler);
|
||||
providersRef.current = { socket, local, remote };
|
||||
// #370 — publish the provider so the header menu can emit save-version.
|
||||
setCollabProvider(remote);
|
||||
setProvidersReady(true);
|
||||
} else {
|
||||
setCollabProvider(providersRef.current.remote);
|
||||
setProvidersReady(true);
|
||||
}
|
||||
// Only destroy on final unmount
|
||||
return () => {
|
||||
setCollabProvider(null);
|
||||
providersRef.current?.socket.destroy();
|
||||
providersRef.current?.remote.destroy();
|
||||
providersRef.current?.local.destroy();
|
||||
|
||||
@@ -1,11 +1,4 @@
|
||||
import {
|
||||
Text,
|
||||
Group,
|
||||
UnstyledButton,
|
||||
Avatar,
|
||||
Tooltip,
|
||||
Badge,
|
||||
} from "@mantine/core";
|
||||
import { Text, Group, UnstyledButton, Avatar, Tooltip } from "@mantine/core";
|
||||
import { CustomAvatar } from "@/components/ui/custom-avatar.tsx";
|
||||
import { AgentAvatarStack } from "@/components/ui/agent-avatar-stack.tsx";
|
||||
import { formattedDate } from "@/lib/time";
|
||||
@@ -14,59 +7,36 @@ import clsx from "clsx";
|
||||
import { IPageHistory } from "@/features/page-history/types/page.types";
|
||||
import { memo, useCallback } from "react";
|
||||
import { useSetAtom } from "jotai";
|
||||
import { useTranslation } from "react-i18next";
|
||||
import { historyAtoms } from "@/features/page-history/atoms/history-atoms.ts";
|
||||
|
||||
const MAX_VISIBLE_AVATARS = 5;
|
||||
|
||||
/**
|
||||
* #370 — map a snapshot's intentionality tier to its badge. `version: true`
|
||||
* marks the intentional points (manual / agent); autosaves (boundary / idle /
|
||||
* legacy null) are non-versions and get dimmed in the list.
|
||||
*/
|
||||
type HistoryKindMeta = { labelKey: string; color: string; version: boolean };
|
||||
export function historyKindMeta(kind?: string | null): HistoryKindMeta {
|
||||
switch (kind) {
|
||||
case "manual":
|
||||
return { labelKey: "Saved", color: "blue", version: true };
|
||||
case "agent":
|
||||
return { labelKey: "Agent version", color: "violet", version: true };
|
||||
case "boundary":
|
||||
return { labelKey: "Boundary", color: "gray", version: false };
|
||||
default: // "idle" | null | undefined (legacy autosave)
|
||||
return { labelKey: "Autosave", color: "gray", version: false };
|
||||
}
|
||||
}
|
||||
|
||||
interface HistoryItemProps {
|
||||
historyItem: IPageHistory;
|
||||
// The previous snapshot for diff/restore is resolved by id from the FULL list
|
||||
// in the parent (resolvePrevSnapshotId), so the item only needs to report its
|
||||
// own id — never a list index (which would be the filtered-view index).
|
||||
onSelect: (id: string) => void;
|
||||
onHover?: (id: string) => void;
|
||||
index: number;
|
||||
onSelect: (id: string, index: number) => void;
|
||||
onHover?: (id: string, index: number) => void;
|
||||
onHoverEnd?: () => void;
|
||||
isActive: boolean;
|
||||
}
|
||||
|
||||
const HistoryItem = memo(function HistoryItem({
|
||||
historyItem,
|
||||
index,
|
||||
onSelect,
|
||||
onHover,
|
||||
onHoverEnd,
|
||||
isActive,
|
||||
}: HistoryItemProps) {
|
||||
const setHistoryModalOpen = useSetAtom(historyAtoms);
|
||||
const { t } = useTranslation();
|
||||
const kindMeta = historyKindMeta(historyItem.kind);
|
||||
|
||||
const handleClick = useCallback(() => {
|
||||
onSelect(historyItem.id);
|
||||
}, [onSelect, historyItem.id]);
|
||||
onSelect(historyItem.id, index);
|
||||
}, [onSelect, historyItem.id, index]);
|
||||
|
||||
const handleMouseEnter = useCallback(() => {
|
||||
onHover?.(historyItem.id);
|
||||
}, [onHover, historyItem.id]);
|
||||
onHover?.(historyItem.id, index);
|
||||
}, [onHover, historyItem.id, index]);
|
||||
|
||||
const contributors = historyItem.contributors;
|
||||
const hasContributors = contributors && contributors.length > 0;
|
||||
@@ -79,20 +49,8 @@ const HistoryItem = memo(function HistoryItem({
|
||||
onMouseEnter={handleMouseEnter}
|
||||
onMouseLeave={onHoverEnd}
|
||||
className={clsx(classes.history, { [classes.active]: isActive })}
|
||||
// #370 — dim autosnapshots so intentional versions stand out.
|
||||
style={{ opacity: kindMeta.version ? 1 : 0.55 }}
|
||||
>
|
||||
<Group gap={6} wrap="nowrap" justify="space-between">
|
||||
<Text size="sm">{formattedDate(new Date(historyItem.createdAt))}</Text>
|
||||
<Badge
|
||||
size="xs"
|
||||
radius="sm"
|
||||
variant={kindMeta.version ? "filled" : "light"}
|
||||
color={kindMeta.color}
|
||||
>
|
||||
{t(kindMeta.labelKey)}
|
||||
</Badge>
|
||||
</Group>
|
||||
<Text size="sm">{formattedDate(new Date(historyItem.createdAt))}</Text>
|
||||
|
||||
<Group gap={6} wrap="nowrap" mt={4}>
|
||||
{hasContributors ? (
|
||||
|
||||
@@ -9,7 +9,7 @@ import {
|
||||
historyAtoms,
|
||||
} from "@/features/page-history/atoms/history-atoms";
|
||||
import { useAtom, useSetAtom } from "jotai";
|
||||
import { useCallback, useEffect, useMemo, useRef, useState } from "react";
|
||||
import { useCallback, useEffect, useMemo, useRef } from "react";
|
||||
import {
|
||||
Button,
|
||||
ScrollArea,
|
||||
@@ -17,12 +17,9 @@ import {
|
||||
Divider,
|
||||
Loader,
|
||||
Center,
|
||||
Switch,
|
||||
Text,
|
||||
} from "@mantine/core";
|
||||
import { useTranslation } from "react-i18next";
|
||||
import { useHistoryRestore } from "@/features/page-history/hooks";
|
||||
import { resolvePrevSnapshotId } from "@/features/page-history/utils/resolve-prev-snapshot";
|
||||
|
||||
const PREFETCH_DELAY_MS = 150;
|
||||
|
||||
@@ -50,23 +47,6 @@ function HistoryList({ pageId }: Props) {
|
||||
[pageHistoryData],
|
||||
);
|
||||
|
||||
// #370 — "only versions" filter: hide autosnapshots (idle/boundary/legacy
|
||||
// null), keep only intentional points (manual/agent). Filtering is over the
|
||||
// already-loaded pages; the diff/restore still targets the true previous
|
||||
// snapshot, so items carry their index within the FULL list.
|
||||
const [onlyVersions, setOnlyVersions] = useState(false);
|
||||
const isVersion = useCallback(
|
||||
(kind?: string | null) => kind === "manual" || kind === "agent",
|
||||
[],
|
||||
);
|
||||
const visibleItems = useMemo(
|
||||
() =>
|
||||
onlyVersions
|
||||
? historyItems.filter((item) => isVersion(item.kind))
|
||||
: historyItems,
|
||||
[historyItems, onlyVersions, isVersion],
|
||||
);
|
||||
|
||||
const loadMoreRef = useRef<HTMLDivElement>(null);
|
||||
const prefetchTimeoutRef = useRef<ReturnType<typeof setTimeout> | null>(null);
|
||||
|
||||
@@ -80,13 +60,11 @@ function HistoryList({ pageId }: Props) {
|
||||
}, []);
|
||||
|
||||
const handleHover = useCallback(
|
||||
(historyId: string) => {
|
||||
(historyId: string, index: number) => {
|
||||
clearPrefetchTimeout();
|
||||
prefetchTimeoutRef.current = setTimeout(() => {
|
||||
prefetchPageHistory(historyId);
|
||||
// The true previous snapshot in the FULL list (not the previous visible
|
||||
// one under the "only versions" filter).
|
||||
const prevId = resolvePrevSnapshotId(historyItems, historyId);
|
||||
const prevId = historyItems[index + 1]?.id;
|
||||
if (prevId) {
|
||||
prefetchPageHistory(prevId);
|
||||
}
|
||||
@@ -100,11 +78,9 @@ function HistoryList({ pageId }: Props) {
|
||||
}, [clearPrefetchTimeout]);
|
||||
|
||||
const handleSelect = useCallback(
|
||||
(id: string) => {
|
||||
(id: string, index: number) => {
|
||||
setActiveHistoryId(id);
|
||||
// Baseline = true previous snapshot in the FULL list, so the "only
|
||||
// versions" filter never diffs/restores against the wrong item.
|
||||
setActiveHistoryPrevId(resolvePrevSnapshotId(historyItems, id));
|
||||
setActiveHistoryPrevId(historyItems[index + 1]?.id ?? "");
|
||||
},
|
||||
[historyItems, setActiveHistoryId, setActiveHistoryPrevId],
|
||||
);
|
||||
@@ -152,27 +128,12 @@ function HistoryList({ pageId }: Props) {
|
||||
|
||||
return (
|
||||
<div>
|
||||
<Group px="xs" py={6} justify="flex-end">
|
||||
<Switch
|
||||
size="xs"
|
||||
checked={onlyVersions}
|
||||
onChange={(e) => setOnlyVersions(e.currentTarget.checked)}
|
||||
label={t("Only versions")}
|
||||
/>
|
||||
</Group>
|
||||
|
||||
<ScrollArea h={620} w="100%" type="scroll" scrollbarSize={5}>
|
||||
{onlyVersions && visibleItems.length === 0 && (
|
||||
<Center py="md">
|
||||
<Text size="sm" c="dimmed">
|
||||
{t("No saved versions yet.")}
|
||||
</Text>
|
||||
</Center>
|
||||
)}
|
||||
{visibleItems.map((historyItem) => (
|
||||
{historyItems.map((historyItem, index) => (
|
||||
<HistoryItem
|
||||
key={historyItem.id}
|
||||
historyItem={historyItem}
|
||||
index={index}
|
||||
onSelect={handleSelect}
|
||||
onHover={handleHover}
|
||||
onHoverEnd={clearPrefetchTimeout}
|
||||
|
||||
@@ -24,10 +24,6 @@ export interface IPageHistory {
|
||||
updatedAt: string;
|
||||
lastUpdatedBy: IPageHistoryUser;
|
||||
contributors?: IPageHistoryUser[];
|
||||
// #370 — intentionality tier: 'manual'/'agent' are versions (intentional
|
||||
// points), 'idle'/'boundary' are autosnapshots; null/undefined = legacy
|
||||
// autosave. Derived server-side, drives the history badge + "versions" filter.
|
||||
kind?: "manual" | "agent" | "idle" | "boundary" | null;
|
||||
// Provenance markers copied off the page row when the snapshot was saved.
|
||||
// `'agent'` marks a version written by the AI agent; `lastUpdatedAiChatId`
|
||||
// (when present) deep-links to the chat that produced the edit.
|
||||
|
||||
@@ -1,42 +0,0 @@
|
||||
import { describe, it, expect } from "vitest";
|
||||
import { resolvePrevSnapshotId } from "./resolve-prev-snapshot";
|
||||
|
||||
// #370 F4 — the risky client path: with the "only versions" filter active, diff
|
||||
// and restore must still baseline against the TRUE previous snapshot in the FULL
|
||||
// list, never the previous VISIBLE version (which would skip the autosnapshots
|
||||
// between two versions). These pin that the resolution is by FULL-list order.
|
||||
describe("resolvePrevSnapshotId", () => {
|
||||
// Newest-first, as the history list stores it: a version, then two autosaves,
|
||||
// then an older version.
|
||||
const full = [
|
||||
{ id: "v2", kind: "manual" },
|
||||
{ id: "a2", kind: "idle" },
|
||||
{ id: "a1", kind: "boundary" },
|
||||
{ id: "v1", kind: "manual" },
|
||||
{ id: "a0", kind: null },
|
||||
];
|
||||
|
||||
it("returns the immediate FULL-list successor, not the previous visible version", () => {
|
||||
// Selecting v2 while filtered to versions-only must baseline against a2 (the
|
||||
// real chronological predecessor), NOT v1 (the previous visible version).
|
||||
expect(resolvePrevSnapshotId(full, "v2")).toBe("a2");
|
||||
});
|
||||
|
||||
it("resolves an autosnapshot's predecessor by full-list order", () => {
|
||||
expect(resolvePrevSnapshotId(full, "a1")).toBe("v1");
|
||||
});
|
||||
|
||||
it("returns '' for the oldest item (no predecessor)", () => {
|
||||
expect(resolvePrevSnapshotId(full, "a0")).toBe("");
|
||||
});
|
||||
|
||||
it("returns '' for an id not in the list", () => {
|
||||
expect(resolvePrevSnapshotId(full, "missing")).toBe("");
|
||||
});
|
||||
|
||||
it("does not depend on a filtered subset — same result whatever is visible", () => {
|
||||
// The helper only ever sees the full list; a filtered view cannot change the
|
||||
// baseline it computes.
|
||||
expect(resolvePrevSnapshotId(full, "v1")).toBe("a0");
|
||||
});
|
||||
});
|
||||
@@ -1,22 +0,0 @@
|
||||
/**
|
||||
* #370 — resolve the TRUE previous snapshot for a history item.
|
||||
*
|
||||
* The history panel can be filtered to "only versions" (manual/agent), but diff
|
||||
* and restore must always compare against the immediately-preceding snapshot in
|
||||
* the FULL, unfiltered list — NOT the previous VISIBLE item. Comparing against
|
||||
* the previous visible version would silently skip the autosnapshots between two
|
||||
* versions and diff/restore the wrong baseline.
|
||||
*
|
||||
* Given the full (newest-first) list and an item id, this returns the id of the
|
||||
* item right after it in the full list (its chronological predecessor), or "" if
|
||||
* it is the oldest / not found. Pure and list-order-preserving so it can be unit
|
||||
* tested without mounting the component.
|
||||
*/
|
||||
export function resolvePrevSnapshotId(
|
||||
fullItems: ReadonlyArray<{ id: string }>,
|
||||
id: string,
|
||||
): string {
|
||||
const index = fullItems.findIndex((item) => item.id === id);
|
||||
if (index === -1) return "";
|
||||
return fullItems[index + 1]?.id ?? "";
|
||||
}
|
||||
@@ -1,28 +0,0 @@
|
||||
/**
|
||||
* #370 — page-version stateless wire formats. Kept in one place so the client
|
||||
* emitter (Save hotkey / button) and the client listener (page-editor) agree
|
||||
* with the server (PersistenceExtension) on the message shapes.
|
||||
*/
|
||||
|
||||
/** Client → server: "save a version now". The server derives the tier
|
||||
* (manual/agent) from the signed connection actor, never from this payload. */
|
||||
export const SAVE_VERSION_MESSAGE_TYPE = "save-version";
|
||||
|
||||
/** Server → all clients: a version was saved (or promoted / already existed). */
|
||||
export const VERSION_SAVED_MESSAGE_TYPE = "version.saved";
|
||||
|
||||
export interface VersionSavedMessage {
|
||||
type: typeof VERSION_SAVED_MESSAGE_TYPE;
|
||||
historyId: string;
|
||||
kind: "manual" | "agent";
|
||||
/** True when the latest snapshot was already a manual version (a no-op save). */
|
||||
alreadySaved: boolean;
|
||||
}
|
||||
|
||||
/**
|
||||
* Cross-component coordination flag so only the client that pressed Save shows
|
||||
* the confirmation toast, while every other client silently refreshes its
|
||||
* history panel on the broadcast. A module-level ref avoids stale-closure
|
||||
* pitfalls in the editor's long-lived stateless handler.
|
||||
*/
|
||||
export const saveVersionPending = { current: false };
|
||||
@@ -3,7 +3,6 @@ import {
|
||||
IconArrowRight,
|
||||
IconArrowsHorizontal,
|
||||
IconClockHour4,
|
||||
IconDeviceFloppy,
|
||||
IconDots,
|
||||
IconEye,
|
||||
IconEyeOff,
|
||||
@@ -18,7 +17,7 @@ import {
|
||||
IconTrash,
|
||||
IconWifiOff,
|
||||
} from "@tabler/icons-react";
|
||||
import React, { useCallback, useEffect, useRef, useState } from "react";
|
||||
import React, { useEffect, useRef, useState } from "react";
|
||||
import { useAsideTriggerProps } from "@/hooks/use-toggle-aside.tsx";
|
||||
import { useAtom, useAtomValue } from "jotai";
|
||||
import { historyAtoms } from "@/features/page-history/atoms/history-atoms.ts";
|
||||
@@ -40,14 +39,9 @@ import { Trans, useTranslation } from "react-i18next";
|
||||
import ExportModal from "@/components/common/export-modal";
|
||||
import { htmlToMarkdown } from "@docmost/editor-ext";
|
||||
import {
|
||||
collabProviderAtom,
|
||||
pageEditorAtom,
|
||||
yjsConnectionStatusAtom,
|
||||
} from "@/features/editor/atoms/editor-atoms.ts";
|
||||
import {
|
||||
SAVE_VERSION_MESSAGE_TYPE,
|
||||
saveVersionPending,
|
||||
} from "@/features/page-history/version-messages.ts";
|
||||
import { formattedDate } from "@/lib/time.ts";
|
||||
import { PageEditModeToggle } from "@/features/user/components/page-state-pref.tsx";
|
||||
import MovePageModal from "@/features/page/components/move-page-modal.tsx";
|
||||
@@ -78,34 +72,9 @@ export default function PageHeaderMenu({ readOnly }: PageHeaderMenuProps) {
|
||||
});
|
||||
const isDeleted = !!page?.deletedAt;
|
||||
const [workspace] = useAtom(workspaceAtom);
|
||||
const collabProvider = useAtomValue(collabProviderAtom);
|
||||
// Community public-sharing entry point (replaces the removed EE PageShareModal)
|
||||
const workspaceSharingDisabled = workspace?.settings?.sharing?.disabled === true;
|
||||
|
||||
// #370 — explicit "save a version" (Cmd+S / Save button). One path for the
|
||||
// human; the server derives the tier from the signed actor. Readers can't save
|
||||
// (the button is hidden and the collab connection is read-only server-side).
|
||||
const handleSaveVersion = useCallback(() => {
|
||||
if (readOnly || !collabProvider) return;
|
||||
// Flag this client as the initiator so only it shows the confirmation toast;
|
||||
// a safety timeout clears it if no broadcast comes back (e.g. offline).
|
||||
saveVersionPending.current = true;
|
||||
window.setTimeout(() => {
|
||||
saveVersionPending.current = false;
|
||||
}, 5000);
|
||||
collabProvider.sendStateless(
|
||||
JSON.stringify({ type: SAVE_VERSION_MESSAGE_TYPE }),
|
||||
);
|
||||
}, [readOnly, collabProvider]);
|
||||
|
||||
// mod+S must also block the browser's "Save page" dialog. `triggerOnContent-
|
||||
// Editable` + empty ignore-list so it fires while typing in the editor/title.
|
||||
useHotkeys(
|
||||
[["mod+S", handleSaveVersion, { preventDefault: true }]],
|
||||
[],
|
||||
true,
|
||||
);
|
||||
|
||||
useHotkeys(
|
||||
[
|
||||
[
|
||||
@@ -164,16 +133,15 @@ export default function PageHeaderMenu({ readOnly }: PageHeaderMenuProps) {
|
||||
</ActionIcon>
|
||||
</Tooltip>
|
||||
|
||||
<PageActionMenu readOnly={readOnly} onSaveVersion={handleSaveVersion} />
|
||||
<PageActionMenu readOnly={readOnly} />
|
||||
</>
|
||||
);
|
||||
}
|
||||
|
||||
interface PageActionMenuProps {
|
||||
readOnly?: boolean;
|
||||
onSaveVersion?: () => void;
|
||||
}
|
||||
function PageActionMenu({ readOnly, onSaveVersion }: PageActionMenuProps) {
|
||||
function PageActionMenu({ readOnly }: PageActionMenuProps) {
|
||||
const { t } = useTranslation();
|
||||
const [, setHistoryModalOpen] = useAtom(historyAtoms);
|
||||
const clipboard = useClipboard({ timeout: 500 });
|
||||
@@ -334,20 +302,6 @@ function PageActionMenu({ readOnly, onSaveVersion }: PageActionMenuProps) {
|
||||
</Group>
|
||||
</Menu.Item>
|
||||
|
||||
{!readOnly && (
|
||||
<Menu.Item
|
||||
leftSection={<IconDeviceFloppy size={16} />}
|
||||
onClick={onSaveVersion}
|
||||
rightSection={
|
||||
<Text size="xs" c="dimmed">
|
||||
{t("Ctrl+S")}
|
||||
</Text>
|
||||
}
|
||||
>
|
||||
{t("Save version")}
|
||||
</Menu.Item>
|
||||
)}
|
||||
|
||||
<Menu.Item
|
||||
leftSection={<IconHistory size={16} />}
|
||||
onClick={openHistoryModal}
|
||||
|
||||
@@ -23,7 +23,7 @@
|
||||
"migration:reset": "tsx src/database/migrate.ts down-to NO_MIGRATIONS",
|
||||
"migration:codegen": "kysely-codegen --dialect=postgres --camel-case --env-file=../../.env --out-file=./src/database/types/db.d.ts",
|
||||
"lint": "eslint \"{src,apps,libs,test}/**/*.ts\" --fix",
|
||||
"pretest": "pnpm --filter @docmost/editor-ext build",
|
||||
"pretest": "pnpm --filter @docmost/editor-ext build && pnpm --filter @docmost/prosemirror-markdown build",
|
||||
"test": "jest",
|
||||
"test:int": "jest --config test/jest-integration.json",
|
||||
"test:watch": "jest --watch",
|
||||
@@ -43,6 +43,7 @@
|
||||
"@clickhouse/client": "^1.18.2",
|
||||
"@docmost/mcp": "workspace:*",
|
||||
"@docmost/pdf-inspector": "1.9.6",
|
||||
"@docmost/prosemirror-markdown": "workspace:*",
|
||||
"@fastify/cookie": "^11.0.2",
|
||||
"@fastify/multipart": "^10.0.0",
|
||||
"@fastify/static": "^9.1.3",
|
||||
@@ -175,7 +176,7 @@
|
||||
"/node_modules/"
|
||||
],
|
||||
"transform": {
|
||||
"happy-dom.+\\.js$": [
|
||||
"(happy-dom.+|prosemirror-markdown/build/.+)\\.js$": [
|
||||
"babel-jest",
|
||||
{
|
||||
"presets": [
|
||||
@@ -193,7 +194,7 @@
|
||||
"^.+\\.(t|j)sx?$": "ts-jest"
|
||||
},
|
||||
"transformIgnorePatterns": [
|
||||
"/node_modules/(?!(\\.pnpm/)?(nanoid|uuid|image-dimensions|marked|happy-dom|lib0)(@|/))"
|
||||
"/node_modules/(?!(\\.pnpm/)?(nanoid|uuid|image-dimensions|marked|happy-dom|lib0|@docmost/prosemirror-markdown)(@|/))"
|
||||
],
|
||||
"collectCoverageFrom": [
|
||||
"**/*.(t|j)s"
|
||||
@@ -204,7 +205,8 @@
|
||||
"^@docmost/db/(.*)$": "<rootDir>/database/$1",
|
||||
"^@docmost/transactional/(.*)$": "<rootDir>/integrations/transactional/$1",
|
||||
"^@docmost/ee/(.*)$": "<rootDir>/ee/$1",
|
||||
"^src/(.*)$": "<rootDir>/$1"
|
||||
"^src/(.*)$": "<rootDir>/$1",
|
||||
"^@tiptap/react$": "<rootDir>/../test/stubs/tiptap-react.js"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -43,7 +43,6 @@ import {
|
||||
Column,
|
||||
Status,
|
||||
addUniqueIdsToDoc,
|
||||
htmlToMarkdown,
|
||||
TransclusionSource,
|
||||
TransclusionReference,
|
||||
FootnoteReference,
|
||||
@@ -51,6 +50,7 @@ import {
|
||||
FootnoteDefinition,
|
||||
PageEmbed,
|
||||
} from '@docmost/editor-ext';
|
||||
import { convertProseMirrorToMarkdown } from '@docmost/prosemirror-markdown';
|
||||
import { generateText, getSchema, JSONContent } from '@tiptap/core';
|
||||
import { generateHTML, generateJSON } from '../common/helpers/prosemirror/html';
|
||||
// @tiptap/html library works best for generating prosemirror json state but not HTML
|
||||
@@ -239,6 +239,10 @@ export function prosemirrorNodeToYElement(node: any): Y.XmlElement | Y.XmlText {
|
||||
}
|
||||
|
||||
export function jsonToMarkdown(tiptapJson: any): string {
|
||||
const html = jsonToHtml(tiptapJson);
|
||||
return htmlToMarkdown(html);
|
||||
// Direct ProseMirror JSON -> Markdown via the canonical converter
|
||||
// (`@docmost/prosemirror-markdown`) — no HTML intermediate, no second
|
||||
// editor-ext markdown layer. Same serializer as the page/space export and the
|
||||
// git-sync vault writer, so every server PM->MD path emits identical canonical
|
||||
// markdown (issue #345).
|
||||
return convertProseMirrorToMarkdown(tiptapJson);
|
||||
}
|
||||
|
||||
@@ -1,29 +1,3 @@
|
||||
/**
|
||||
* #370 — page-history intentionality tiers. Domain of `page_history.kind`.
|
||||
* - 'manual' / 'agent' → Tier 1 versions (intentional points)
|
||||
* - 'idle' / 'boundary' → Tier 0 autosnapshots (safety net)
|
||||
* A legacy `null` kind is treated as an autosave.
|
||||
*/
|
||||
export type PageHistoryKind = 'manual' | 'agent' | 'idle' | 'boundary';
|
||||
|
||||
/**
|
||||
* #370 — trailing idle-flush windows. A page's pending idle snapshot is
|
||||
* re-armed on every store and fires this long after edits go quiet, so a burst
|
||||
* of edits collapses into a single autosnapshot instead of one-per-store. Human
|
||||
* sessions are noisier and less risky, so they flush less often than the agent.
|
||||
*/
|
||||
export const IDLE_INTERVAL_USER = 60 * 60 * 1000; // 60m
|
||||
export const IDLE_INTERVAL_AGENT = 15 * 60 * 1000; // 15m
|
||||
|
||||
/**
|
||||
* #370 — max-wait ceiling for the idle flush. Pure trailing debounce starves the
|
||||
* safety net: hocuspocus stores at least every ~45s, so a CONTINUOUS editing
|
||||
* session would re-arm the trailing timer forever and never take an idle
|
||||
* snapshot until edits finally go quiet (up to IDLE_INTERVAL_USER = 60m). This
|
||||
* ceiling bounds the actual wait from the FIRST edit of a burst, so an idle
|
||||
* snapshot fires at least this often during a long unbroken session — restoring
|
||||
* a recovery point cadence closer to the old heuristic without one-per-store
|
||||
* noise. Mirrors hocuspocus's own maxDebounce idea.
|
||||
*/
|
||||
export const IDLE_MAX_WAIT_USER = 10 * 60 * 1000; // 10m
|
||||
export const IDLE_MAX_WAIT_AGENT = 5 * 60 * 1000; // 5m
|
||||
export const HISTORY_INTERVAL = 5 * 60 * 1000;
|
||||
export const HISTORY_FAST_INTERVAL = 60 * 1000;
|
||||
export const HISTORY_FAST_THRESHOLD = 5 * 60 * 1000;
|
||||
|
||||
@@ -1,93 +1,84 @@
|
||||
import { computeHistoryJob, resolveSource } from './persistence.extension';
|
||||
import {
|
||||
IDLE_INTERVAL_AGENT,
|
||||
IDLE_INTERVAL_USER,
|
||||
IDLE_MAX_WAIT_AGENT,
|
||||
IDLE_MAX_WAIT_USER,
|
||||
computeHistoryJob,
|
||||
resolveSource,
|
||||
} from './persistence.extension';
|
||||
import {
|
||||
HISTORY_FAST_INTERVAL,
|
||||
HISTORY_FAST_THRESHOLD,
|
||||
HISTORY_INTERVAL,
|
||||
} from '../constants';
|
||||
|
||||
// A fixed clock + fixed createdAt make pageAge deterministic.
|
||||
const NOW = 1_700_000_000_000;
|
||||
const PAGE_ID = '550e8400-e29b-41d4-a716-446655440000';
|
||||
|
||||
const page = { id: PAGE_ID };
|
||||
// Build a minimal page whose age (NOW - createdAt) is exactly `ageMs`.
|
||||
const pageAged = (ageMs: number) => ({
|
||||
id: PAGE_ID,
|
||||
createdAt: new Date(NOW - ageMs),
|
||||
});
|
||||
|
||||
describe('computeHistoryJob (#370 — shared trailing idle pipeline)', () => {
|
||||
it('human edit → user idle window, bare page.id job', () => {
|
||||
// Humans and the agent now share ONE idle job per page (jobId = page.id).
|
||||
// The agent's old delay=0 fast path is GONE — intentional agent points now
|
||||
// arrive via the explicit save-version signal, not a zero-delay snapshot.
|
||||
const { jobId, delay } = computeHistoryJob(page, 'user');
|
||||
expect(delay).toBe(IDLE_INTERVAL_USER);
|
||||
describe('computeHistoryJob', () => {
|
||||
it('agent edit → delay MUST be 0 and job id is source-keyed', () => {
|
||||
// INVARIANT (§15 H2 / persistence.extension): the agent delay MUST stay 0.
|
||||
// The worker re-reads the page row at run time, so any non-zero delay risks
|
||||
// snapshotting content a later human edit has already overwritten. This is
|
||||
// the load-bearing assertion of this spec — do not relax it.
|
||||
const { jobId, delay } = computeHistoryJob(pageAged(0), 'agent', NOW);
|
||||
expect(delay).toBe(0);
|
||||
expect(jobId).toBe(`${PAGE_ID}-agent`);
|
||||
});
|
||||
|
||||
it('agent edit on an OLD page is still delay 0 (age never applies to agents)', () => {
|
||||
// Even when the page is far older than the fast threshold, the agent path
|
||||
// must short-circuit to 0 — age-based debounce is a human-only concern.
|
||||
const { jobId, delay } = computeHistoryJob(
|
||||
pageAged(HISTORY_FAST_THRESHOLD + 60_000),
|
||||
'agent',
|
||||
NOW,
|
||||
);
|
||||
expect(delay).toBe(0);
|
||||
expect(jobId).toBe(`${PAGE_ID}-agent`);
|
||||
});
|
||||
|
||||
it('human edit on a YOUNG page (age < threshold) → fast interval, bare job id', () => {
|
||||
const { jobId, delay } = computeHistoryJob(
|
||||
pageAged(HISTORY_FAST_THRESHOLD - 1),
|
||||
'user',
|
||||
NOW,
|
||||
);
|
||||
expect(delay).toBe(HISTORY_FAST_INTERVAL);
|
||||
expect(jobId).toBe(PAGE_ID);
|
||||
});
|
||||
|
||||
it('agent edit → agent idle window (shorter), still the bare page.id job', () => {
|
||||
const { jobId, delay } = computeHistoryJob(page, 'agent');
|
||||
expect(delay).toBe(IDLE_INTERVAL_AGENT);
|
||||
// No `-agent` suffix anymore: the agent joins the common idle pipeline.
|
||||
it('human edit on an OLD page (age > threshold) → standard interval', () => {
|
||||
const { jobId, delay } = computeHistoryJob(
|
||||
pageAged(HISTORY_FAST_THRESHOLD + 1),
|
||||
'user',
|
||||
NOW,
|
||||
);
|
||||
expect(delay).toBe(HISTORY_INTERVAL);
|
||||
expect(jobId).toBe(PAGE_ID);
|
||||
});
|
||||
|
||||
it('agent flushes sooner than a human', () => {
|
||||
expect(IDLE_INTERVAL_AGENT).toBeLessThan(IDLE_INTERVAL_USER);
|
||||
it('boundary: pageAge EXACTLY === threshold takes the slow branch (the `<` is strict)', () => {
|
||||
// Off-by-one guard: the condition is `pageAge < HISTORY_FAST_THRESHOLD`, so
|
||||
// an age of exactly the threshold is NOT "fast" — it must use HISTORY_INTERVAL.
|
||||
const { delay } = computeHistoryJob(
|
||||
pageAged(HISTORY_FAST_THRESHOLD),
|
||||
'user',
|
||||
NOW,
|
||||
);
|
||||
expect(delay).toBe(HISTORY_INTERVAL);
|
||||
});
|
||||
|
||||
it('treats any non-"agent" source string as human (keys strictly on === agent)', () => {
|
||||
const { jobId, delay } = computeHistoryJob(page, 'user');
|
||||
expect(delay).toBe(IDLE_INTERVAL_USER);
|
||||
it('treats any non-"agent" source string as human', () => {
|
||||
// resolveSource only ever yields 'agent' | 'user', but guard the contract:
|
||||
// the agent branch keys strictly on === 'agent'.
|
||||
const { jobId, delay } = computeHistoryJob(pageAged(0), 'user', NOW);
|
||||
expect(delay).toBe(HISTORY_FAST_INTERVAL);
|
||||
expect(jobId).toBe(PAGE_ID);
|
||||
});
|
||||
|
||||
// #370 review round-1 WARNING: the max-wait ceiling prevents autosnapshot
|
||||
// starvation during a continuous editing session (the trailing timer would
|
||||
// otherwise re-arm forever and never fire).
|
||||
describe('max-wait ceiling', () => {
|
||||
const T0 = 1_000_000; // arbitrary fixed epoch for deterministic tests
|
||||
|
||||
it('once a burst is armed, delay clamps to the remaining max-wait budget', () => {
|
||||
// 1 minute into the burst the USER interval (60m) far exceeds the remaining
|
||||
// max-wait budget (10m - 1m = 9m), so the delay is clamped DOWN to that
|
||||
// remaining budget — the full interval is NOT used once a ceiling applies.
|
||||
const { delay } = computeHistoryJob(page, 'user', T0, T0 + 60_000);
|
||||
expect(delay).toBe(IDLE_MAX_WAIT_USER - 60_000);
|
||||
});
|
||||
|
||||
it('never waits longer than the max-wait budget from the burst start', () => {
|
||||
// A store arriving right at the ceiling → delay 0 (fire promptly).
|
||||
const { delay } = computeHistoryJob(
|
||||
page,
|
||||
'user',
|
||||
T0,
|
||||
T0 + IDLE_MAX_WAIT_USER,
|
||||
);
|
||||
expect(delay).toBe(0);
|
||||
});
|
||||
|
||||
it('past the ceiling never returns a negative delay', () => {
|
||||
const { delay } = computeHistoryJob(
|
||||
page,
|
||||
'user',
|
||||
T0,
|
||||
T0 + IDLE_MAX_WAIT_USER + 5 * 60_000,
|
||||
);
|
||||
expect(delay).toBe(0);
|
||||
});
|
||||
|
||||
it('the agent ceiling is shorter than the user ceiling', () => {
|
||||
expect(IDLE_MAX_WAIT_AGENT).toBeLessThan(IDLE_MAX_WAIT_USER);
|
||||
const { delay } = computeHistoryJob(
|
||||
page,
|
||||
'agent',
|
||||
T0,
|
||||
T0 + IDLE_MAX_WAIT_AGENT,
|
||||
);
|
||||
expect(delay).toBe(0);
|
||||
});
|
||||
|
||||
it('without a burstStart there is no ceiling (backward-compatible)', () => {
|
||||
expect(computeHistoryJob(page, 'user').delay).toBe(IDLE_INTERVAL_USER);
|
||||
expect(computeHistoryJob(page, 'agent').delay).toBe(IDLE_INTERVAL_AGENT);
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
describe('resolveSource (truth table)', () => {
|
||||
|
||||
@@ -40,12 +40,11 @@ describe('PersistenceExtension.onStoreDocument — Approach-A boundary snapshot'
|
||||
let pageHistoryRepo: {
|
||||
saveHistory: jest.Mock;
|
||||
findPageLastHistory: jest.Mock;
|
||||
updateHistoryKind: jest.Mock;
|
||||
};
|
||||
let aiQueue: { add: jest.Mock };
|
||||
let historyQueue: { add: jest.Mock; remove: jest.Mock };
|
||||
let historyQueue: { add: jest.Mock };
|
||||
let notificationQueue: { add: jest.Mock };
|
||||
let collabHistory: { addContributors: jest.Mock; popContributors: jest.Mock };
|
||||
let collabHistory: { addContributors: jest.Mock };
|
||||
let transclusionService: {
|
||||
syncPageTransclusions: jest.Mock;
|
||||
syncPageReferences: jest.Mock;
|
||||
@@ -94,22 +93,13 @@ describe('PersistenceExtension.onStoreDocument — Approach-A boundary snapshot'
|
||||
pageHistoryRepo = {
|
||||
saveHistory: jest.fn().mockImplementation(async () => {
|
||||
callOrder.push('saveHistory');
|
||||
return { id: 'history-1' };
|
||||
}),
|
||||
findPageLastHistory: jest.fn().mockResolvedValue(null),
|
||||
updateHistoryKind: jest.fn().mockResolvedValue(undefined),
|
||||
};
|
||||
aiQueue = { add: jest.fn().mockResolvedValue(undefined) };
|
||||
historyQueue = {
|
||||
add: jest.fn().mockResolvedValue(undefined),
|
||||
// #370 — enqueuePageHistory now removes any pending idle job before re-adding.
|
||||
remove: jest.fn().mockResolvedValue(undefined),
|
||||
};
|
||||
historyQueue = { add: jest.fn().mockResolvedValue(undefined) };
|
||||
notificationQueue = { add: jest.fn().mockResolvedValue(undefined) };
|
||||
collabHistory = {
|
||||
addContributors: jest.fn().mockResolvedValue(undefined),
|
||||
popContributors: jest.fn().mockResolvedValue([]),
|
||||
};
|
||||
collabHistory = { addContributors: jest.fn().mockResolvedValue(undefined) };
|
||||
transclusionService = {
|
||||
syncPageTransclusions: jest.fn().mockResolvedValue(undefined),
|
||||
syncPageReferences: jest.fn().mockResolvedValue(undefined),
|
||||
@@ -175,50 +165,6 @@ describe('PersistenceExtension.onStoreDocument — Approach-A boundary snapshot'
|
||||
expect(pageRepo.updatePage.mock.calls[0][0].lastUpdatedSource).toBe('user');
|
||||
});
|
||||
|
||||
// #370 review round-1 SUGGESTION: the boundary was GENERALIZED from a
|
||||
// user→agent special-case to ANY lastUpdatedSource transition. These pin the
|
||||
// generalized behaviour it was rebuilt for.
|
||||
describe('generalized boundary — any source transition', () => {
|
||||
// Same persisted page but with an explicit prior source.
|
||||
const pageWithPriorSource = (prior: string | null) => ({
|
||||
...persistedHumanPage('NEW CONTENT'),
|
||||
lastUpdatedSource: prior,
|
||||
});
|
||||
|
||||
it('agent→user transition fires the boundary (pins the prior agent revision)', async () => {
|
||||
const document = ydocFor(doc('NEW CONTENT'));
|
||||
pageRepo.findById.mockResolvedValue(pageWithPriorSource('agent'));
|
||||
pageHistoryRepo.findPageLastHistory.mockResolvedValue(null);
|
||||
|
||||
await ext.onStoreDocument(buildData(document, 'user') as any);
|
||||
|
||||
expect(pageHistoryRepo.saveHistory).toHaveBeenCalledTimes(1);
|
||||
expect(callOrder).toEqual(['saveHistory', 'updatePage']);
|
||||
expect(pageRepo.updatePage.mock.calls[0][0].lastUpdatedSource).toBe('user');
|
||||
});
|
||||
|
||||
it('git→user transition fires the boundary (git-sync overwrite is a source change)', async () => {
|
||||
const document = ydocFor(doc('NEW CONTENT'));
|
||||
pageRepo.findById.mockResolvedValue(pageWithPriorSource('git'));
|
||||
pageHistoryRepo.findPageLastHistory.mockResolvedValue(null);
|
||||
|
||||
await ext.onStoreDocument(buildData(document, 'user') as any);
|
||||
|
||||
expect(pageHistoryRepo.saveHistory).toHaveBeenCalledTimes(1);
|
||||
expect(callOrder).toEqual(['saveHistory', 'updatePage']);
|
||||
});
|
||||
|
||||
it('a null prior source (first-ever edit) does NOT fire the boundary', async () => {
|
||||
const document = ydocFor(doc('NEW CONTENT'));
|
||||
pageRepo.findById.mockResolvedValue(pageWithPriorSource(null));
|
||||
|
||||
await ext.onStoreDocument(buildData(document, 'agent') as any);
|
||||
|
||||
expect(pageHistoryRepo.saveHistory).not.toHaveBeenCalled();
|
||||
expect(pageRepo.updatePage).toHaveBeenCalledTimes(1);
|
||||
});
|
||||
});
|
||||
|
||||
it('idempotency: unchanged content → no updatePage, no history, no queues', async () => {
|
||||
// The Y.Doc content equals the persisted content deeply → early skip.
|
||||
// A Y.Doc round-trip normalizes attrs (e.g. paragraph indent), so derive
|
||||
@@ -523,125 +469,4 @@ describe('PersistenceExtension.onStoreDocument — Approach-A boundary snapshot'
|
||||
// Contributors keyed by the UUID so they match the PAGE_HISTORY job (page.id).
|
||||
expect(collabHistory.addContributors.mock.calls[0][0]).toBe(PAGE_ID);
|
||||
});
|
||||
|
||||
// #370 — explicit save-version (Cmd+S / agent save tool) over the stateless
|
||||
// seam. The tier is derived from the SIGNED connection actor, the store path
|
||||
// is reused, and promote-not-dup avoids duplicating heavy content rows.
|
||||
describe('save-version (#370)', () => {
|
||||
const emitSave = (document: any, actor: 'user' | 'agent') =>
|
||||
ext.onStateless({
|
||||
connection: {
|
||||
readOnly: false,
|
||||
context: { user: { id: USER_ID, name: 'Alice' }, actor },
|
||||
} as any,
|
||||
documentName: `page.${PAGE_ID}`,
|
||||
document: document as any,
|
||||
payload: JSON.stringify({ type: 'save-version' }),
|
||||
} as any);
|
||||
|
||||
// findById returns a page whose content already equals the live doc, so the
|
||||
// store path is a no-op and we isolate the versioning decision.
|
||||
const pageMatchingDoc = (document: any) => ({
|
||||
...persistedHumanPage('IGNORED'),
|
||||
content: TiptapTransformer.fromYdoc(document, 'default'),
|
||||
});
|
||||
|
||||
it('human save with no prior snapshot → writes a manual version + broadcasts', async () => {
|
||||
const document = ydocFor(doc('VERSION ME'));
|
||||
pageRepo.findById.mockResolvedValue(pageMatchingDoc(document));
|
||||
pageHistoryRepo.findPageLastHistory.mockResolvedValue(null);
|
||||
|
||||
await emitSave(document, 'user');
|
||||
|
||||
expect(pageHistoryRepo.saveHistory).toHaveBeenCalledTimes(1);
|
||||
expect(pageHistoryRepo.saveHistory.mock.calls[0][1]).toEqual(
|
||||
expect.objectContaining({ kind: 'manual' }),
|
||||
);
|
||||
// The pending idle autosnapshot is cancelled by the explicit version.
|
||||
expect(historyQueue.remove).toHaveBeenCalledWith(PAGE_ID);
|
||||
const msg = JSON.parse(
|
||||
(document as any).broadcastStateless.mock.calls[(document as any).broadcastStateless.mock.calls.length - 1][0],
|
||||
);
|
||||
expect(msg).toMatchObject({
|
||||
type: 'version.saved',
|
||||
kind: 'manual',
|
||||
alreadySaved: false,
|
||||
});
|
||||
});
|
||||
|
||||
it('agent save derives kind=agent from the signed actor', async () => {
|
||||
const document = ydocFor(doc('AGENT VERSION'));
|
||||
pageRepo.findById.mockResolvedValue(pageMatchingDoc(document));
|
||||
pageHistoryRepo.findPageLastHistory.mockResolvedValue(null);
|
||||
|
||||
await emitSave(document, 'agent');
|
||||
|
||||
expect(pageHistoryRepo.saveHistory.mock.calls[pageHistoryRepo.saveHistory.mock.calls.length - 1][1]).toEqual(
|
||||
expect.objectContaining({ kind: 'agent' }),
|
||||
);
|
||||
});
|
||||
|
||||
it('promote-not-dup: latest snapshot is an autosave with identical content → upgrades in place', async () => {
|
||||
const document = ydocFor(doc('SAME'));
|
||||
const page = pageMatchingDoc(document);
|
||||
pageRepo.findById.mockResolvedValue(page);
|
||||
pageHistoryRepo.findPageLastHistory.mockResolvedValue({
|
||||
id: 'auto-1',
|
||||
content: page.content,
|
||||
kind: 'idle',
|
||||
});
|
||||
|
||||
await emitSave(document, 'user');
|
||||
|
||||
// No heavy new content row — the existing autosave is promoted to manual.
|
||||
expect(pageHistoryRepo.updateHistoryKind).toHaveBeenCalledWith(
|
||||
'auto-1',
|
||||
'manual',
|
||||
expect.anything(),
|
||||
);
|
||||
expect(pageHistoryRepo.saveHistory).not.toHaveBeenCalled();
|
||||
const msg = JSON.parse(
|
||||
(document as any).broadcastStateless.mock.calls[(document as any).broadcastStateless.mock.calls.length - 1][0],
|
||||
);
|
||||
expect(msg).toMatchObject({ historyId: 'auto-1', alreadySaved: false });
|
||||
});
|
||||
|
||||
it('no-op when the latest snapshot is already a manual version of this content', async () => {
|
||||
const document = ydocFor(doc('ALREADY SAVED'));
|
||||
const page = pageMatchingDoc(document);
|
||||
pageRepo.findById.mockResolvedValue(page);
|
||||
pageHistoryRepo.findPageLastHistory.mockResolvedValue({
|
||||
id: 'ver-1',
|
||||
content: page.content,
|
||||
kind: 'manual',
|
||||
});
|
||||
|
||||
await emitSave(document, 'user');
|
||||
|
||||
expect(pageHistoryRepo.updateHistoryKind).not.toHaveBeenCalled();
|
||||
expect(pageHistoryRepo.saveHistory).not.toHaveBeenCalled();
|
||||
const msg = JSON.parse(
|
||||
(document as any).broadcastStateless.mock.calls[(document as any).broadcastStateless.mock.calls.length - 1][0],
|
||||
);
|
||||
expect(msg).toMatchObject({ alreadySaved: true, kind: 'manual' });
|
||||
});
|
||||
|
||||
it('a read-only connection cannot save a version', async () => {
|
||||
const document = ydocFor(doc('READER'));
|
||||
pageRepo.findById.mockResolvedValue(pageMatchingDoc(document));
|
||||
|
||||
await ext.onStateless({
|
||||
connection: {
|
||||
readOnly: true,
|
||||
context: { user: { id: USER_ID }, actor: 'user' },
|
||||
} as any,
|
||||
documentName: `page.${PAGE_ID}`,
|
||||
document: document as any,
|
||||
payload: JSON.stringify({ type: 'save-version' }),
|
||||
} as any);
|
||||
|
||||
expect(pageHistoryRepo.saveHistory).not.toHaveBeenCalled();
|
||||
expect(pageHistoryRepo.updateHistoryKind).not.toHaveBeenCalled();
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
@@ -36,11 +36,9 @@ import {
|
||||
import { Page } from '@docmost/db/types/entity.types';
|
||||
import { CollabHistoryService } from '../services/collab-history.service';
|
||||
import {
|
||||
IDLE_INTERVAL_AGENT,
|
||||
IDLE_INTERVAL_USER,
|
||||
IDLE_MAX_WAIT_AGENT,
|
||||
IDLE_MAX_WAIT_USER,
|
||||
PageHistoryKind,
|
||||
HISTORY_FAST_INTERVAL,
|
||||
HISTORY_FAST_THRESHOLD,
|
||||
HISTORY_INTERVAL,
|
||||
} from '../constants';
|
||||
import { TransclusionService } from '../../core/page/transclusion/transclusion.service';
|
||||
import { observeCollabStore } from '../../integrations/metrics/metrics.registry';
|
||||
@@ -53,16 +51,6 @@ import { observeCollabStore } from '../../integrations/metrics/metrics.registry'
|
||||
*/
|
||||
export const INTENTIONAL_CLEAR_MESSAGE_TYPE = 'intentional-clear';
|
||||
|
||||
/**
|
||||
* #370 — wire format of the client→server "save a version" signal. Sent by the
|
||||
* human (Cmd+S / Save button) and by the agent's explicit save tool over the
|
||||
* SAME stateless channel. The intentionality tier ('manual' vs 'agent') is
|
||||
* derived SERVER-SIDE from the signed connection actor, never from this
|
||||
* payload, so a version's type is unforgeable. The document is taken from the
|
||||
* connection (not the payload), so the signal cannot be aimed at another page.
|
||||
*/
|
||||
export const SAVE_VERSION_MESSAGE_TYPE = 'save-version';
|
||||
|
||||
/**
|
||||
* #251 — how long an intentional-clear signal stays "pending" before it is
|
||||
* ignored. The signal is set on the clearing keystroke but consumed by the
|
||||
@@ -99,39 +87,35 @@ export function resolveSource(
|
||||
}
|
||||
|
||||
/**
|
||||
* #370 — compute the BullMQ job id + delay for a page's trailing idle-flush
|
||||
* autosnapshot. Pure so the timing is unit-testable.
|
||||
* Compute the BullMQ job id + delay for a page-history snapshot job. Pure so
|
||||
* the data-loss-sensitive timing arithmetic is unit-testable; `now` is injected
|
||||
* (caller passes `Date.now()`) for determinism.
|
||||
*
|
||||
* Both humans and the agent now share ONE idle pipeline (the agent's old
|
||||
* `delay=0` fast path is gone — intentional agent points arrive via the
|
||||
* explicit save-version signal instead). The job id is the bare `page.id`, so a
|
||||
* page has at most one pending idle job; the caller removes-and-re-adds it on
|
||||
* every store to keep it debounced to the trailing edge of an edit burst. The
|
||||
* window differs by source only: the agent flushes sooner than a human.
|
||||
* - Agent edits: delay 0 and a source-keyed job id `${page.id}-agent`. The
|
||||
* delay MUST stay 0 — the worker re-reads the page row at run time, so any
|
||||
* delay risks reading content a later human edit has already overwritten
|
||||
* (mis-tagged snapshot). 0 minimizes that window. The `-agent` suffix keeps
|
||||
* the job from coalescing with the bare-page.id human job.
|
||||
* - Human edits: age-based debounce so rapid human edits coalesce into one
|
||||
* snapshot; job id is the bare `page.id`.
|
||||
*
|
||||
* BullMQ forbids ':' in custom job ids (Redis key separator), so '-' is used;
|
||||
* page.id is a UUID, so `${page.id}-agent` cannot collide with a human job.
|
||||
*/
|
||||
export function computeHistoryJob(
|
||||
page: Pick<Page, 'id'>,
|
||||
page: Pick<Page, 'id' | 'createdAt'>,
|
||||
source: string,
|
||||
// Epoch ms of the FIRST edit in the current burst (when the pending idle job
|
||||
// was first armed). Used to enforce the max-wait ceiling so a continuous
|
||||
// editing session cannot re-arm the trailing timer forever. `now` is injectable
|
||||
// for tests; both default to a live clock / no ceiling when omitted.
|
||||
burstStart?: number,
|
||||
now: number = Date.now(),
|
||||
now: number,
|
||||
): { jobId: string; delay: number } {
|
||||
const isAgent = source === 'agent';
|
||||
const interval = isAgent ? IDLE_INTERVAL_AGENT : IDLE_INTERVAL_USER;
|
||||
const maxWait = isAgent ? IDLE_MAX_WAIT_AGENT : IDLE_MAX_WAIT_USER;
|
||||
|
||||
let delay = interval;
|
||||
if (burstStart !== undefined) {
|
||||
// Time already elapsed since the burst's first edit; the snapshot must fire
|
||||
// no later than `maxWait` after that, so shrink the trailing delay to the
|
||||
// remaining budget (never negative, so BullMQ fires it promptly).
|
||||
const remaining = burstStart + maxWait - now;
|
||||
delay = Math.max(0, Math.min(interval, remaining));
|
||||
}
|
||||
return { jobId: page.id, delay };
|
||||
const pageAge = now - new Date(page.createdAt).getTime();
|
||||
const delay = isAgent
|
||||
? 0
|
||||
: pageAge < HISTORY_FAST_THRESHOLD
|
||||
? HISTORY_FAST_INTERVAL
|
||||
: HISTORY_INTERVAL;
|
||||
const jobId = isAgent ? `${page.id}-agent` : page.id;
|
||||
return { jobId, delay };
|
||||
}
|
||||
|
||||
@Injectable()
|
||||
@@ -143,11 +127,6 @@ export class PersistenceExtension implements Extension {
|
||||
// coalescing window" per document and OR it across all edits in the window,
|
||||
// so the snapshot is marked 'agent' regardless of who wrote last.
|
||||
private agentTouched: Map<string, boolean> = new Map();
|
||||
// #370 — epoch ms of the FIRST edit in the current idle-flush burst, per page.
|
||||
// Set when the pending idle job is first armed (empty entry), read to enforce
|
||||
// the max-wait ceiling in computeHistoryJob, and cleared when the idle job is
|
||||
// consumed/cancelled so the next burst starts a fresh window.
|
||||
private idleBurstStart: Map<string, number> = new Map();
|
||||
// #251 — per-document "intentional clear pending" flags. Keyed by
|
||||
// documentName, value = expiry timestamp (ms). Set by onStateless when the
|
||||
// client reports a deliberate clear; consumed once by the next
|
||||
@@ -347,19 +326,20 @@ export class PersistenceExtension implements Extension {
|
||||
//this.logger.debug('Contributors error:' + err?.['message']);
|
||||
}
|
||||
|
||||
// #370 — boundary snapshot on ANY source transition. When the store
|
||||
// flips the page's provenance (user↔agent↔git), pin the OUTGOING
|
||||
// state as its own history version BEFORE the incoming source
|
||||
// overwrites it. `page` still holds the OLD content/provenance here,
|
||||
// so saveHistory(page) captures the pre-transition state tagged with
|
||||
// its own source, kind='boundary'. The incoming content is snapshotted
|
||||
// later by the debounced idle job. Skip if the page is effectively
|
||||
// empty or if the latest existing snapshot already equals this state
|
||||
// (the shared isDeepStrictEqual gate — avoids duplicates). Generalizing
|
||||
// beyond the old user→agent special-case also covers git-sync for free.
|
||||
// Approach A — boundary snapshot before the agent's first edit.
|
||||
// When this store is the agent's and the page's currently persisted
|
||||
// state was authored by a human, pin that human state as its own
|
||||
// history version BEFORE the agent overwrites it. `page` still holds
|
||||
// the OLD content/provenance here, so saveHistory(page) captures the
|
||||
// pre-agent state tagged 'user'. The agent's new content is
|
||||
// snapshotted later by the debounced PAGE_HISTORY job ('agent'). Skip
|
||||
// if the prior state is already agent-authored (boundary already
|
||||
// pinned on the user->agent transition), if the page is effectively
|
||||
// empty, or if the latest existing snapshot already equals this human
|
||||
// state (avoid duplicates).
|
||||
if (
|
||||
page.lastUpdatedSource &&
|
||||
page.lastUpdatedSource !== lastUpdatedSource
|
||||
lastUpdatedSource === 'agent' &&
|
||||
page.lastUpdatedSource !== 'agent'
|
||||
) {
|
||||
// pageHistory.pageId is uuid-typed; use page.id (never the doc-name
|
||||
// slugId) so a `page.<slugId>` doc cannot throw 22P02 here (#260).
|
||||
@@ -367,13 +347,15 @@ export class PersistenceExtension implements Extension {
|
||||
page.id,
|
||||
{ includeContent: true, trx },
|
||||
);
|
||||
const baselineMissing =
|
||||
const humanBaselineMissing =
|
||||
!lastHistory ||
|
||||
!isDeepStrictEqual(lastHistory.content, page.content);
|
||||
if (!isEmptyParagraphDoc(page.content as any) && baselineMissing) {
|
||||
if (
|
||||
!isEmptyParagraphDoc(page.content as any) &&
|
||||
humanBaselineMissing
|
||||
) {
|
||||
await this.pageHistoryRepo.saveHistory(page, {
|
||||
contributorIds: page.contributorIds ?? undefined,
|
||||
kind: 'boundary',
|
||||
trx,
|
||||
});
|
||||
}
|
||||
@@ -498,14 +480,6 @@ export class PersistenceExtension implements Extension {
|
||||
return; // unrelated / malformed stateless message
|
||||
}
|
||||
|
||||
// #370 — explicit "save a version" (human Cmd+S / agent save tool). Edit
|
||||
// rights are already enforced by the readOnly reject above (a reader can't
|
||||
// create a version), exactly as intentional-clear requires.
|
||||
if (message?.type === SAVE_VERSION_MESSAGE_TYPE) {
|
||||
await this.handleSaveVersion(data);
|
||||
return;
|
||||
}
|
||||
|
||||
if (message?.type !== INTENTIONAL_CLEAR_MESSAGE_TYPE) return;
|
||||
|
||||
this.intentionalClear.set(
|
||||
@@ -514,117 +488,6 @@ export class PersistenceExtension implements Extension {
|
||||
);
|
||||
}
|
||||
|
||||
/**
|
||||
* #370 — persist an intentional version from the live in-memory ydoc.
|
||||
*
|
||||
* One stateless path serves BOTH the human and the agent; the tier is derived
|
||||
* SERVER-SIDE from the signed connection actor ('agent' → 'agent', anything
|
||||
* else → 'manual'), so the version type cannot be spoofed by the client. We
|
||||
* take the fresh ydoc from the collab process memory and run it through the
|
||||
* EXISTING store path first (so pages.content/ydoc reflect the exact content
|
||||
* being versioned — a REST endpoint would race the up-to-10s-stale page row),
|
||||
* then snapshot it into page_history with the intentional kind.
|
||||
*
|
||||
* Promote-not-dup: if the latest history row already holds this exact content
|
||||
* and it is an autosave (idle/boundary/legacy-null), upgrade its kind in place
|
||||
* instead of duplicating a heavy content row; if it is already 'manual', it is
|
||||
* a no-op (the client shows an "already saved" toast). Otherwise a fresh
|
||||
* version row is written, popping the aggregated contributors from Redis.
|
||||
*/
|
||||
private async handleSaveVersion(data: onStatelessPayload): Promise<void> {
|
||||
const { connection, document, documentName } = data;
|
||||
const context = connection?.context;
|
||||
const pageId = getPageId(documentName);
|
||||
// Unforgeable: 'agent' only for a signed agent connection, else 'manual'.
|
||||
const kind: PageHistoryKind =
|
||||
context?.actor === 'agent' ? 'agent' : 'manual';
|
||||
|
||||
// Flush the live ydoc through the normal store path so the page row + ydoc
|
||||
// hold exactly what we are about to version (also fires the idle enqueue we
|
||||
// supersede below, plus any source-transition boundary). onStoreDocument
|
||||
// only needs document/documentName/context.
|
||||
await this.onStoreDocument({
|
||||
document,
|
||||
documentName,
|
||||
context,
|
||||
} as onStoreDocumentPayload);
|
||||
|
||||
let result:
|
||||
| { historyId: string; kind: PageHistoryKind; alreadySaved: boolean }
|
||||
| undefined;
|
||||
|
||||
await executeTx(this.db, async (trx) => {
|
||||
const page = await this.pageRepo.findById(pageId, {
|
||||
withLock: true,
|
||||
includeContent: true,
|
||||
trx,
|
||||
});
|
||||
if (!page) return;
|
||||
// Never version an effectively-empty page (mirrors the processor's
|
||||
// first-history guard); there is nothing intentional to pin.
|
||||
if (isEmptyParagraphDoc(page.content as any)) return;
|
||||
|
||||
const lastHistory = await this.pageHistoryRepo.findPageLastHistory(
|
||||
page.id,
|
||||
{ includeContent: true, trx },
|
||||
);
|
||||
|
||||
if (
|
||||
lastHistory &&
|
||||
isDeepStrictEqual(lastHistory.content, page.content)
|
||||
) {
|
||||
// Content is already snapshotted. Promote-not-dup.
|
||||
if (lastHistory.kind === 'manual') {
|
||||
result = {
|
||||
historyId: lastHistory.id,
|
||||
kind: 'manual',
|
||||
alreadySaved: true,
|
||||
};
|
||||
return;
|
||||
}
|
||||
await this.pageHistoryRepo.updateHistoryKind(
|
||||
lastHistory.id,
|
||||
kind,
|
||||
trx,
|
||||
);
|
||||
result = { historyId: lastHistory.id, kind, alreadySaved: false };
|
||||
return;
|
||||
}
|
||||
|
||||
// Fresh version row. Pop the contributors aggregated since the last
|
||||
// snapshot (SPOP); restore them if the write fails so they aren't lost.
|
||||
const contributorIds = await this.collabHistory.popContributors(page.id);
|
||||
try {
|
||||
const saved = await this.pageHistoryRepo.saveHistory(page, {
|
||||
contributorIds,
|
||||
kind,
|
||||
trx,
|
||||
});
|
||||
result = { historyId: saved.id, kind, alreadySaved: false };
|
||||
} catch (err) {
|
||||
await this.collabHistory.addContributors(page.id, contributorIds);
|
||||
throw err;
|
||||
}
|
||||
});
|
||||
|
||||
// Housekeeping: this explicit version supersedes the page's pending idle
|
||||
// autosnapshot, so cancel it (delayed job → remove() just deletes it) and
|
||||
// end the current idle burst so the next edit starts a fresh max-wait window.
|
||||
await this.historyQueue.remove(pageId).catch(() => undefined);
|
||||
this.idleBurstStart.delete(pageId);
|
||||
|
||||
if (result) {
|
||||
document.broadcastStateless(
|
||||
JSON.stringify({
|
||||
type: 'version.saved',
|
||||
historyId: result.historyId,
|
||||
kind: result.kind,
|
||||
alreadySaved: result.alreadySaved,
|
||||
}),
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
async onChange(data: onChangePayload) {
|
||||
const documentName = data.documentName;
|
||||
const userId = data.context?.user?.id;
|
||||
@@ -682,45 +545,17 @@ export class PersistenceExtension implements Extension {
|
||||
page: Page,
|
||||
lastUpdatedSource: string,
|
||||
): Promise<void> {
|
||||
// #370 — trailing idle debounce with a max-wait ceiling. One pending idle
|
||||
// job per page (jobId = page.id); on every store we remove the pending
|
||||
// delayed job and re-add it, so the snapshot lands `delay` after edits go
|
||||
// quiet rather than once per store (precedent: workspace.service.ts).
|
||||
// remove() on a delayed job simply deletes it (0 if absent, no throw); if the
|
||||
// job is already ACTIVE and the remove is a no-op, the add still de-dups and
|
||||
// the processor's isDeepStrictEqual gate collapses the duplicate content.
|
||||
//
|
||||
// The FIRST arm of a burst records `burstStart`; computeHistoryJob shrinks
|
||||
// the delay to the remaining max-wait budget from that point, so a continuous
|
||||
// session cannot re-arm the trailing timer forever and starve the snapshot.
|
||||
// A burst marker older than THIS TIER's max-wait means the previous idle job
|
||||
// has already fired — start a fresh window instead of firing immediately on
|
||||
// the next edit. Must use the SAME source-specific max-wait computeHistoryJob
|
||||
// uses (agent 5m / user 10m): a hardcoded USER ceiling would leave an agent
|
||||
// burst's marker stale for 5..10m, forcing delay=0 on every store in that
|
||||
// window and writing one idle row per store — exactly the per-store bloat the
|
||||
// debounce exists to prevent, on the continuous-agent path.
|
||||
const maxWait =
|
||||
lastUpdatedSource === 'agent' ? IDLE_MAX_WAIT_AGENT : IDLE_MAX_WAIT_USER;
|
||||
const now = Date.now();
|
||||
let burstStart = this.idleBurstStart.get(page.id);
|
||||
if (burstStart === undefined || now - burstStart >= maxWait) {
|
||||
burstStart = now;
|
||||
this.idleBurstStart.set(page.id, burstStart);
|
||||
}
|
||||
|
||||
// Job id + delay arithmetic lives in the pure `computeHistoryJob` (see its
|
||||
// doc comment for the agent-delay-0 / age-based-debounce invariants).
|
||||
const { jobId, delay } = computeHistoryJob(
|
||||
page,
|
||||
lastUpdatedSource,
|
||||
burstStart,
|
||||
now,
|
||||
Date.now(),
|
||||
);
|
||||
|
||||
await this.historyQueue.remove(jobId).catch(() => undefined);
|
||||
|
||||
await this.historyQueue.add(
|
||||
QueueJob.PAGE_HISTORY,
|
||||
{ pageId: page.id, kind: 'idle' } as IPageHistoryJob,
|
||||
{ pageId: page.id } as IPageHistoryJob,
|
||||
{ jobId, delay },
|
||||
);
|
||||
}
|
||||
|
||||
@@ -66,15 +66,6 @@ describe('HistoryProcessor.process', () => {
|
||||
notificationQueue = { add: jest.fn().mockResolvedValue(undefined) };
|
||||
generalQueue = { add: jest.fn().mockResolvedValue(undefined) };
|
||||
|
||||
// #370 F3 — the processor now serializes its find+save under a page-row lock
|
||||
// via executeTx. A db whose transaction().execute(fn) runs fn with a trx stub
|
||||
// drives the real executeTx() helper without a database.
|
||||
const db = {
|
||||
transaction: () => ({
|
||||
execute: (fn: (trx: any) => Promise<any>) => fn({ __trx: true }),
|
||||
}),
|
||||
};
|
||||
|
||||
// WorkerHost's constructor reads `this.worker`; passing repos positionally
|
||||
// matches the constructor and avoids the Nest DI container.
|
||||
proc = new HistoryProcessor(
|
||||
@@ -82,7 +73,6 @@ describe('HistoryProcessor.process', () => {
|
||||
pageRepo as any,
|
||||
collabHistory as any,
|
||||
watcherService as any,
|
||||
db as any,
|
||||
notificationQueue as any,
|
||||
generalQueue as any,
|
||||
);
|
||||
@@ -136,26 +126,15 @@ describe('HistoryProcessor.process', () => {
|
||||
await proc.process(buildJob());
|
||||
|
||||
expect(collabHistory.popContributors).toHaveBeenCalledWith(PAGE_ID);
|
||||
// #370 F3/F9 — the snapshot decision runs under a page-row lock. Pin the lock
|
||||
// structurally so a refactor that drops withLock/trx (silently reintroducing
|
||||
// the TOCTOU double-insert) turns this red. The tx stub is { __trx: true }.
|
||||
expect(pageRepo.findById).toHaveBeenCalledWith(
|
||||
PAGE_ID,
|
||||
expect.objectContaining({ withLock: true, trx: { __trx: true } }),
|
||||
);
|
||||
// #370 F7 — addPageWatchers MUST receive the trx, or its FK-check runs on a
|
||||
// separate connection and self-deadlocks against our FOR UPDATE. Asserting
|
||||
// the trx arg here is exactly what would have caught that regression.
|
||||
expect(watcherService.addPageWatchers).toHaveBeenCalledWith(
|
||||
['u1', 'u2'],
|
||||
PAGE_ID,
|
||||
SPACE_ID,
|
||||
WORKSPACE_ID,
|
||||
{ __trx: true },
|
||||
);
|
||||
expect(pageHistoryRepo.saveHistory).toHaveBeenCalledWith(
|
||||
expect.objectContaining({ id: PAGE_ID }),
|
||||
{ contributorIds: ['u1', 'u2'], kind: 'idle', trx: { __trx: true } },
|
||||
{ contributorIds: ['u1', 'u2'] },
|
||||
);
|
||||
expect(generalQueue.add).toHaveBeenCalledWith(
|
||||
QueueJob.PAGE_BACKLINKS,
|
||||
@@ -207,48 +186,6 @@ describe('HistoryProcessor.process', () => {
|
||||
]);
|
||||
});
|
||||
|
||||
it('COMMIT failure (throw outside the tx callback) → contributors RESTORED', async () => {
|
||||
// #370 F8 — a commit-time failure throws OUTSIDE the callback, so the inner
|
||||
// try/catch does not run; the outer catch must restore the popped set (else a
|
||||
// BullMQ retry writes an unattributed version). Use a db whose execute() runs
|
||||
// the callback THEN throws, simulating a commit abort.
|
||||
pageHistoryRepo.findPageLastHistory.mockResolvedValue({
|
||||
content: { type: 'doc', content: [] },
|
||||
});
|
||||
const commitFail = {
|
||||
transaction: () => ({
|
||||
execute: async (fn: (trx: any) => Promise<any>) => {
|
||||
await fn({ __trx: true }); // callback succeeds (saveHistory ok)
|
||||
throw new Error('commit aborted'); // ...but the COMMIT fails
|
||||
},
|
||||
}),
|
||||
};
|
||||
const procCommitFail = new HistoryProcessor(
|
||||
pageHistoryRepo as any,
|
||||
pageRepo as any,
|
||||
collabHistory as any,
|
||||
watcherService as any,
|
||||
commitFail as any,
|
||||
notificationQueue as any,
|
||||
generalQueue as any,
|
||||
);
|
||||
jest
|
||||
.spyOn(procCommitFail['logger'], 'error')
|
||||
.mockImplementation(() => undefined);
|
||||
|
||||
await expect(procCommitFail.process(buildJob())).rejects.toThrow(
|
||||
'commit aborted',
|
||||
);
|
||||
// The inner catch did NOT run (save succeeded), so only the outer catch can
|
||||
// restore — assert it did.
|
||||
expect(collabHistory.addContributors).toHaveBeenCalledWith(PAGE_ID, [
|
||||
'u1',
|
||||
'u2',
|
||||
]);
|
||||
// And the post-snapshot queue work must NOT have run (we rethrew).
|
||||
expect(generalQueue.add).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it('backlinks + notification queue failures are swallowed (history still committed)', async () => {
|
||||
pageHistoryRepo.findPageLastHistory.mockResolvedValue({
|
||||
content: { type: 'doc', content: [] },
|
||||
|
||||
@@ -19,9 +19,6 @@ import { isDeepStrictEqual } from 'node:util';
|
||||
import { CollabHistoryService } from '../services/collab-history.service';
|
||||
import { WatcherService } from '../../core/watcher/watcher.service';
|
||||
import { isEmptyParagraphDoc } from '../collaboration.util';
|
||||
import { InjectKysely } from 'nestjs-kysely';
|
||||
import { KyselyDB } from '@docmost/db/types/kysely.types';
|
||||
import { executeTx } from '@docmost/db/utils';
|
||||
|
||||
@Processor(QueueName.HISTORY_QUEUE)
|
||||
export class HistoryProcessor extends WorkerHost implements OnModuleDestroy {
|
||||
@@ -32,7 +29,6 @@ export class HistoryProcessor extends WorkerHost implements OnModuleDestroy {
|
||||
private readonly pageRepo: PageRepo,
|
||||
private readonly collabHistory: CollabHistoryService,
|
||||
private readonly watcherService: WatcherService,
|
||||
@InjectKysely() private readonly db: KyselyDB,
|
||||
@InjectQueue(QueueName.NOTIFICATION_QUEUE) private notificationQueue: Queue,
|
||||
@InjectQueue(QueueName.GENERAL_QUEUE) private generalQueue: Queue,
|
||||
) {
|
||||
@@ -45,9 +41,6 @@ export class HistoryProcessor extends WorkerHost implements OnModuleDestroy {
|
||||
try {
|
||||
const { pageId } = job.data;
|
||||
|
||||
// Read the page WITHOUT a lock first, only to bail early on the two cheap
|
||||
// no-write cases (page gone / empty first snapshot) without opening a
|
||||
// transaction. The authoritative check-then-write happens locked below.
|
||||
const page = await this.pageRepo.findById(pageId, {
|
||||
includeContent: true,
|
||||
});
|
||||
@@ -58,109 +51,40 @@ export class HistoryProcessor extends WorkerHost implements OnModuleDestroy {
|
||||
return;
|
||||
}
|
||||
|
||||
// #370 F3 — the snapshot decision (findPageLastHistory → saveHistory) must
|
||||
// be serialized against manual-save/boundary writers, which run under a
|
||||
// page-row lock in onStoreDocument. Without it, this processor and a
|
||||
// concurrent manual-save each read the same lastHistory (MVCC), both see
|
||||
// content != lastHistory, and both insert — producing two page_history rows
|
||||
// with IDENTICAL content (one 'idle', one 'manual'), defeating
|
||||
// promote-not-dup and the version-vs-autosave split. Taking the same
|
||||
// page-row lock makes the second writer observe the first's committed row so
|
||||
// the isDeepStrictEqual gate collapses the duplicate. Only the read+write
|
||||
// is transacted; the post-snapshot queue work stays outside.
|
||||
let contributorIds: string[] = [];
|
||||
let snapshotWritten = false;
|
||||
let lastHistoryContent: unknown;
|
||||
// #370 F8 — the contributor set popped from Redis (destructive SPOP) must be
|
||||
// restored if the snapshot does not durably land. The inner try/catch only
|
||||
// covers a throw INSIDE the callback; a COMMIT failure (connection drop,
|
||||
// serialization/deadlock abort on commit — the transient class the epic
|
||||
// already retries) throws OUTSIDE it, rolling the snapshot back while the
|
||||
// pop is already gone. We track the popped set here and restore it in the
|
||||
// outer catch so a BullMQ retry re-attributes the version. addContributors
|
||||
// is an idempotent Redis SADD, so a double-restore is harmless.
|
||||
let poppedForRestore: string[] = [];
|
||||
const lastHistory = await this.pageHistoryRepo.findPageLastHistory(
|
||||
pageId,
|
||||
{ includeContent: true },
|
||||
);
|
||||
|
||||
try {
|
||||
await executeTx(this.db, async (trx) => {
|
||||
const lockedPage = await this.pageRepo.findById(pageId, {
|
||||
includeContent: true,
|
||||
withLock: true,
|
||||
trx,
|
||||
});
|
||||
if (!lockedPage) return;
|
||||
|
||||
const lastHistory = await this.pageHistoryRepo.findPageLastHistory(
|
||||
pageId,
|
||||
{ includeContent: true, trx },
|
||||
);
|
||||
lastHistoryContent = lastHistory?.content;
|
||||
|
||||
if (!lastHistory && isEmptyParagraphDoc(lockedPage.content as any)) {
|
||||
this.logger.debug(
|
||||
`Skipping first history for page ${pageId}: empty content`,
|
||||
);
|
||||
return;
|
||||
}
|
||||
|
||||
if (
|
||||
lastHistory &&
|
||||
isDeepStrictEqual(lastHistory.content, lockedPage.content)
|
||||
) {
|
||||
return; // already snapshotted at this content — nothing to write
|
||||
}
|
||||
|
||||
contributorIds = await this.collabHistory.popContributors(pageId);
|
||||
poppedForRestore = contributorIds;
|
||||
try {
|
||||
// Pass `trx` so the watcher insert's FK check (FOR KEY SHARE on
|
||||
// pages[pageId]) runs on the SAME connection that already holds the
|
||||
// FOR UPDATE lock from findById — otherwise it takes the FK lock on a
|
||||
// separate pool connection and self-deadlocks against our own tx.
|
||||
await this.watcherService.addPageWatchers(
|
||||
contributorIds,
|
||||
pageId,
|
||||
lockedPage.spaceId,
|
||||
lockedPage.workspaceId,
|
||||
trx,
|
||||
);
|
||||
|
||||
// #370 — every job on this queue is a trailing idle-flush autosnapshot.
|
||||
await this.pageHistoryRepo.saveHistory(lockedPage, {
|
||||
contributorIds,
|
||||
kind: job.data.kind ?? 'idle',
|
||||
trx,
|
||||
});
|
||||
snapshotWritten = true;
|
||||
this.logger.debug(`History created for page: ${pageId}`);
|
||||
} catch (err) {
|
||||
await this.collabHistory.addContributors(pageId, contributorIds);
|
||||
poppedForRestore = [];
|
||||
throw err;
|
||||
}
|
||||
});
|
||||
} catch (err) {
|
||||
// A throw here means the tx did NOT commit (callback threw, or the commit
|
||||
// itself failed and rolled back). If we popped contributors and the inner
|
||||
// catch did not already restore them, restore now so the retry keeps
|
||||
// attribution. snapshotWritten is irrelevant: it is set before commit, so
|
||||
// it can be true even when the commit rolled the snapshot back.
|
||||
if (poppedForRestore.length) {
|
||||
await this.collabHistory.addContributors(pageId, poppedForRestore);
|
||||
}
|
||||
throw err;
|
||||
}
|
||||
|
||||
// No snapshot written (page vanished / empty-first / unchanged content) →
|
||||
// clear the contributor set for the skip cases and stop.
|
||||
if (!snapshotWritten) {
|
||||
if (!lastHistoryContent && isEmptyParagraphDoc(page.content as any)) {
|
||||
await this.collabHistory.clearContributors(pageId);
|
||||
}
|
||||
if (!lastHistory && isEmptyParagraphDoc(page.content as any)) {
|
||||
this.logger.debug(
|
||||
`Skipping first history for page ${pageId}: empty content`,
|
||||
);
|
||||
await this.collabHistory.clearContributors(pageId);
|
||||
return;
|
||||
}
|
||||
|
||||
{
|
||||
if (
|
||||
!lastHistory ||
|
||||
!isDeepStrictEqual(lastHistory.content, page.content)
|
||||
) {
|
||||
const contributorIds = await this.collabHistory.popContributors(pageId);
|
||||
|
||||
try {
|
||||
await this.watcherService.addPageWatchers(
|
||||
contributorIds,
|
||||
pageId,
|
||||
page.spaceId,
|
||||
page.workspaceId,
|
||||
);
|
||||
|
||||
await this.pageHistoryRepo.saveHistory(page, { contributorIds });
|
||||
this.logger.debug(`History created for page: ${pageId}`);
|
||||
} catch (err) {
|
||||
await this.collabHistory.addContributors(pageId, contributorIds);
|
||||
throw err;
|
||||
}
|
||||
|
||||
const mentions = extractMentions(page.content);
|
||||
const pageMentions = extractPageMentions(mentions);
|
||||
const internalLinkSlugIds = extractInternalLinkSlugIds(page.content);
|
||||
@@ -178,7 +102,7 @@ export class HistoryProcessor extends WorkerHost implements OnModuleDestroy {
|
||||
);
|
||||
});
|
||||
|
||||
if (contributorIds.length > 0 && lastHistoryContent) {
|
||||
if (contributorIds.length > 0 && lastHistory?.content) {
|
||||
await this.notificationQueue
|
||||
.add(QueueJob.PAGE_UPDATED, {
|
||||
pageId,
|
||||
|
||||
@@ -52,7 +52,9 @@ import {
|
||||
INTERNAL_LINK_REGEX,
|
||||
extractPageSlugId,
|
||||
} from '../../../integrations/export/utils';
|
||||
import { markdownToHtml, canonicalizeFootnotes } from '@docmost/editor-ext';
|
||||
import { canonicalizeFootnotes } from '@docmost/editor-ext';
|
||||
import { markdownToProseMirror } from '@docmost/prosemirror-markdown';
|
||||
import { normalizeForeignMarkdown } from '../../../integrations/import/utils/foreign-markdown';
|
||||
import { WatcherService } from '../../watcher/watcher.service';
|
||||
import { sql } from 'kysely';
|
||||
import { TransclusionService } from '../transclusion/transclusion.service';
|
||||
@@ -1301,8 +1303,14 @@ export class PageService {
|
||||
|
||||
switch (format) {
|
||||
case 'markdown': {
|
||||
const html = await markdownToHtml(content as string);
|
||||
prosemirrorJson = htmlToJson(html as string);
|
||||
// Canonical markdown -> ProseMirror JSON directly via
|
||||
// `@docmost/prosemirror-markdown` (issue #345) — no HTML intermediate,
|
||||
// no editor-ext markdown layer. Foreign markdown surfaces the strict
|
||||
// parser rejects (GFM `[^id]` reference footnotes) are normalized to the
|
||||
// canonical inline form first.
|
||||
prosemirrorJson = await markdownToProseMirror(
|
||||
normalizeForeignMarkdown(content as string),
|
||||
);
|
||||
break;
|
||||
}
|
||||
case 'html': {
|
||||
|
||||
@@ -1,27 +0,0 @@
|
||||
import { type Kysely } from 'kysely';
|
||||
|
||||
/**
|
||||
* #370 — page-versioning intentionality tier on a history snapshot.
|
||||
*
|
||||
* Adds `page_history.kind`, the three-tier "how intentional was this snapshot"
|
||||
* marker that lets versions (intentional points) be told apart from autosaves:
|
||||
* - 'manual' — a human explicitly saved a version (Cmd+S / Save button)
|
||||
* - 'agent' — the AI agent explicitly saved a version
|
||||
* - 'idle' — trailing idle-flush autosnapshot (safety net)
|
||||
* - 'boundary' — autosnapshot pinned on a source transition (user↔agent↔git)
|
||||
*
|
||||
* Nullable with NO default (mirrors last_updated_source in the agent-provenance
|
||||
* migration): legacy rows predate the marker and read back as `null`, which the
|
||||
* client renders as a plain autosave. Stored as a short varchar to stay
|
||||
* forward-compatible without an enum migration.
|
||||
*/
|
||||
export async function up(db: Kysely<any>): Promise<void> {
|
||||
await db.schema
|
||||
.alterTable('page_history')
|
||||
.addColumn('kind', 'varchar(20)', (col) => col)
|
||||
.execute();
|
||||
}
|
||||
|
||||
export async function down(db: Kysely<any>): Promise<void> {
|
||||
await db.schema.alterTable('page_history').dropColumn('kind').execute();
|
||||
}
|
||||
@@ -13,7 +13,6 @@ import { jsonArrayFrom, jsonObjectFrom } from 'kysely/helpers/postgres';
|
||||
import { ExpressionBuilder, sql } from 'kysely';
|
||||
import { DB } from '@docmost/db/types/db';
|
||||
import { resolveAgentProvenance } from '../agent-provenance';
|
||||
import { PageHistoryKind } from '../../../collaboration/constants';
|
||||
|
||||
/**
|
||||
* Role-resolution subquery for a page-history row's bound AI chat (#300). Joins
|
||||
@@ -47,9 +46,6 @@ export class PageHistoryRepo {
|
||||
'lastUpdatedById',
|
||||
'lastUpdatedSource',
|
||||
'lastUpdatedAiChatId',
|
||||
// #370 — intentionality tier ('manual' | 'agent' | 'idle' | 'boundary');
|
||||
// null on legacy rows (= autosave). Selected so callers can read/promote it.
|
||||
'kind',
|
||||
'contributorIds',
|
||||
'spaceId',
|
||||
'workspaceId',
|
||||
@@ -89,15 +85,9 @@ export class PageHistoryRepo {
|
||||
|
||||
async saveHistory(
|
||||
page: Page,
|
||||
opts?: {
|
||||
contributorIds?: string[];
|
||||
// #370 — intentionality tier for this snapshot. Omitted → null (legacy
|
||||
// autosave semantics). Callers derive it server-side, never from a client.
|
||||
kind?: PageHistoryKind;
|
||||
trx?: KyselyTransaction;
|
||||
},
|
||||
): Promise<PageHistory> {
|
||||
return await this.insertPageHistory(
|
||||
opts?: { contributorIds?: string[]; trx?: KyselyTransaction },
|
||||
): Promise<void> {
|
||||
await this.insertPageHistory(
|
||||
{
|
||||
pageId: page.id,
|
||||
slugId: page.slugId,
|
||||
@@ -109,7 +99,6 @@ export class PageHistoryRepo {
|
||||
// Copy the provenance marker off the page row, as for lastUpdatedById.
|
||||
lastUpdatedSource: page.lastUpdatedSource,
|
||||
lastUpdatedAiChatId: page.lastUpdatedAiChatId,
|
||||
kind: opts?.kind ?? null,
|
||||
contributorIds: opts?.contributorIds,
|
||||
spaceId: page.spaceId,
|
||||
workspaceId: page.workspaceId,
|
||||
@@ -118,25 +107,6 @@ export class PageHistoryRepo {
|
||||
);
|
||||
}
|
||||
|
||||
/**
|
||||
* #370 — promote an existing snapshot's intentionality tier in place. Used by
|
||||
* the manual-save "promote-not-dup" path: when the latest history row already
|
||||
* holds the exact content being versioned, we upgrade its `kind` instead of
|
||||
* duplicating a heavy content row.
|
||||
*/
|
||||
async updateHistoryKind(
|
||||
pageHistoryId: string,
|
||||
kind: PageHistoryKind,
|
||||
trx?: KyselyTransaction,
|
||||
): Promise<void> {
|
||||
const db = dbOrTx(this.db, trx);
|
||||
await db
|
||||
.updateTable('pageHistory')
|
||||
.set({ kind })
|
||||
.where('id', '=', pageHistoryId)
|
||||
.execute();
|
||||
}
|
||||
|
||||
async findPageHistoryByPageId(pageId: string, pagination: PaginationOptions) {
|
||||
const query = this.db
|
||||
.selectFrom('pageHistory')
|
||||
|
||||
-1
@@ -280,7 +280,6 @@ export interface PageHistory {
|
||||
createdAt: Generated<Timestamp>;
|
||||
icon: string | null;
|
||||
id: Generated<string>;
|
||||
kind: string | null;
|
||||
lastUpdatedAiChatId: string | null;
|
||||
lastUpdatedById: string | null;
|
||||
lastUpdatedSource: string | null;
|
||||
|
||||
@@ -0,0 +1,145 @@
|
||||
// export.service.ts imports the ESM-only @sindresorhus/slugify (not in jest's
|
||||
// transform allowlist). It is irrelevant to the markdown-serialization path under
|
||||
// test (only used for page-mention link slugs on the DB path), so it is mocked
|
||||
// out to keep the module graph loadable under ts-jest (mirrors the import specs).
|
||||
jest.mock('@sindresorhus/slugify', () => ({
|
||||
__esModule: true,
|
||||
default: (input: string) => String(input),
|
||||
}));
|
||||
|
||||
import { convertProseMirrorToMarkdown } from '@docmost/prosemirror-markdown';
|
||||
import { ExportService } from './export.service';
|
||||
import { ExportFormat } from './dto/export-dto';
|
||||
|
||||
/**
|
||||
* STEP 1 golden test for issue #345: server MARKDOWN export runs DIRECTLY through
|
||||
* the canonical converter (`convertProseMirrorToMarkdown`) — no HTML intermediate
|
||||
* and no `@docmost/editor-ext` markdown layer — so the emitted markdown is in the
|
||||
* canonical package forms and is byte-identical to the git-sync vault body.
|
||||
*
|
||||
* These are the goldens the swap has to satisfy: they assert the CANONICAL
|
||||
* surface (callout `> [!type]`, inline footnote `^[…]`, lossless image
|
||||
* `<!--img …-->`) rather than the old editor-ext forms (`:::type`, `[^id]`,
|
||||
* lossy ``).
|
||||
*
|
||||
* `exportPage(..., singlePage=false)` takes no DB path (no mention rewriting), so
|
||||
* the service is constructed with null collaborators and only the pure
|
||||
* PM -> Markdown path is exercised.
|
||||
*/
|
||||
|
||||
function makeService(): ExportService {
|
||||
return new ExportService(
|
||||
null as any, // pageRepo
|
||||
null as any, // pagePermissionRepo
|
||||
null as any, // db
|
||||
null as any, // storageService
|
||||
null as any, // environmentService
|
||||
null as any, // domainService
|
||||
);
|
||||
}
|
||||
|
||||
// A representative page exercising the node types whose canonical markdown form
|
||||
// changed with the move off the editor-ext layer: callout, inline footnote, and a
|
||||
// lossless image carrying width/align attrs that the old layer dropped.
|
||||
const REPRESENTATIVE_DOC = {
|
||||
type: 'doc',
|
||||
content: [
|
||||
{
|
||||
type: 'paragraph',
|
||||
content: [
|
||||
{ type: 'text', text: 'Body ' },
|
||||
{ type: 'footnoteReference', attrs: { id: 'fn-1' } },
|
||||
{ type: 'text', text: ' end.' },
|
||||
],
|
||||
},
|
||||
{
|
||||
type: 'callout',
|
||||
attrs: { type: 'info', icon: null },
|
||||
content: [
|
||||
{
|
||||
type: 'paragraph',
|
||||
content: [{ type: 'text', text: 'Heads up' }],
|
||||
},
|
||||
],
|
||||
},
|
||||
{
|
||||
type: 'image',
|
||||
attrs: {
|
||||
src: '/files/pic.png',
|
||||
alt: 'Pic',
|
||||
width: 320,
|
||||
align: 'left',
|
||||
},
|
||||
},
|
||||
{
|
||||
type: 'footnotesList',
|
||||
content: [
|
||||
{
|
||||
type: 'footnoteDefinition',
|
||||
attrs: { id: 'fn-1' },
|
||||
content: [
|
||||
{
|
||||
type: 'paragraph',
|
||||
content: [{ type: 'text', text: 'the note' }],
|
||||
},
|
||||
],
|
||||
},
|
||||
],
|
||||
},
|
||||
],
|
||||
};
|
||||
|
||||
describe('ExportService — markdown export via the canonical converter (#345)', () => {
|
||||
it('emits canonical callout, inline footnote and lossless image forms', async () => {
|
||||
const service = makeService();
|
||||
const md = (await service.exportPage(ExportFormat.Markdown, {
|
||||
title: '',
|
||||
content: REPRESENTATIVE_DOC,
|
||||
} as any)) as string;
|
||||
|
||||
// Callout: Obsidian `> [!type]`, NOT the legacy `:::type`.
|
||||
expect(md).toContain('> [!info]');
|
||||
expect(md).not.toContain(':::');
|
||||
|
||||
// Inline footnote: `^[…]`, NOT the reference `[^id]` form.
|
||||
expect(md).toContain('^[the note]');
|
||||
expect(md).not.toMatch(/\[\^/);
|
||||
|
||||
// Lossless image: trailing `<!--img …-->` carrying the dropped attrs.
|
||||
expect(md).toContain('');
|
||||
expect(md).toContain('<!--img');
|
||||
expect(md).toContain('"width":"320"');
|
||||
expect(md).toContain('"align":"left"');
|
||||
});
|
||||
|
||||
it('export body is byte-identical to the git-sync vault serializer (export == vault)', async () => {
|
||||
const service = makeService();
|
||||
// A title-less page: exportPage prepends NO heading, so the whole output is
|
||||
// the page BODY — exactly what git-sync serializes (git-sync stores the title
|
||||
// in frontmatter / the filename, never as an in-body H1).
|
||||
const exported = (await service.exportPage(ExportFormat.Markdown, {
|
||||
title: '',
|
||||
content: REPRESENTATIVE_DOC,
|
||||
} as any)) as string;
|
||||
|
||||
// The git-sync vault writer feeds this SAME converter (git-sync
|
||||
// `stabilizePageBody` = convertProseMirrorToMarkdown(content) at the
|
||||
// fixpoint). For an already-stable doc the single pass IS the fixpoint, so
|
||||
// the two are byte-identical by construction — assert it.
|
||||
const vaultBody = convertProseMirrorToMarkdown(REPRESENTATIVE_DOC);
|
||||
expect(exported).toBe(vaultBody);
|
||||
});
|
||||
|
||||
it('prepends the page title as an H1 heading (the one documented export/vault delta)', async () => {
|
||||
const service = makeService();
|
||||
const md = (await service.exportPage(ExportFormat.Markdown, {
|
||||
title: 'My Page',
|
||||
content: { type: 'doc', content: [] },
|
||||
} as any)) as string;
|
||||
|
||||
// Export makes standalone files, so it prepends the title as an H1. This is
|
||||
// the ONE deliberate difference from the vault body (which carries the title
|
||||
// in frontmatter). The body below the heading still serializes canonically.
|
||||
expect(md.startsWith('# My Page')).toBe(true);
|
||||
});
|
||||
});
|
||||
@@ -37,7 +37,7 @@ import {
|
||||
getAttachmentIds,
|
||||
getProsemirrorContent,
|
||||
} from '../../common/helpers/prosemirror/utils';
|
||||
import { htmlToMarkdown } from '@docmost/editor-ext';
|
||||
import { convertProseMirrorToMarkdown } from '@docmost/prosemirror-markdown';
|
||||
|
||||
type AllowedAttachment = { id: string; fileName: string; filePath: string };
|
||||
|
||||
@@ -79,9 +79,8 @@ export class ExportService {
|
||||
prosemirrorJson.content.unshift(titleNode);
|
||||
}
|
||||
|
||||
const pageHtml = jsonToHtml(prosemirrorJson);
|
||||
|
||||
if (format === ExportFormat.HTML) {
|
||||
const pageHtml = jsonToHtml(prosemirrorJson);
|
||||
return `<!DOCTYPE html>
|
||||
<html>
|
||||
<head>
|
||||
@@ -92,11 +91,14 @@ export class ExportService {
|
||||
}
|
||||
|
||||
if (format === ExportFormat.Markdown) {
|
||||
const newPageHtml = pageHtml.replace(
|
||||
/<colgroup[^>]*>[\s\S]*?<\/colgroup>/gim,
|
||||
'',
|
||||
);
|
||||
return htmlToMarkdown(newPageHtml);
|
||||
// Direct ProseMirror JSON -> Markdown via the canonical converter
|
||||
// (`@docmost/prosemirror-markdown`). This is the SAME serializer the
|
||||
// git-sync vault writer feeds (see git-sync `stabilizePageBody`), so an
|
||||
// exported page body is byte-identical to its vault representation — no
|
||||
// HTML intermediate, no second markdown layer, no format drift (issue
|
||||
// #345). The old `<colgroup>` scrub is gone with the HTML step: the
|
||||
// converter emits GFM tables directly and never produces `<colgroup>`.
|
||||
return convertProseMirrorToMarkdown(prosemirrorJson);
|
||||
}
|
||||
|
||||
return;
|
||||
|
||||
+144
-77
@@ -17,6 +17,22 @@ jest.mock('image-dimensions', () => ({
|
||||
__esModule: true,
|
||||
imageDimensionsFromData: () => undefined,
|
||||
}));
|
||||
// FileImportTaskService -> PageService -> collaboration.gateway ->
|
||||
// metrics.registry imports `prom-client`, which is not resolvable in this
|
||||
// workspace's node_modules (types-only stub, no runtime entry). Metrics are
|
||||
// disabled on this path, so a virtual no-op mock keeps the module graph loadable.
|
||||
jest.mock(
|
||||
'prom-client',
|
||||
() => ({
|
||||
collectDefaultMetrics: () => undefined,
|
||||
Registry: class {},
|
||||
Histogram: class {},
|
||||
Gauge: class {},
|
||||
Counter: class {},
|
||||
Summary: class {},
|
||||
}),
|
||||
{ virtual: true },
|
||||
);
|
||||
|
||||
import { promises as fs } from 'fs';
|
||||
import * as os from 'os';
|
||||
@@ -26,14 +42,17 @@ import { ImportService } from './import.service';
|
||||
|
||||
/**
|
||||
* Binding test for issue #228 / review #5: FileImportTaskService.processGenericImport
|
||||
* is a NON-editor write path (markdownToHtml -> processHTML -> JSON, never runs
|
||||
* footnoteSyncPlugin), so it canonicalizes footnotes before persisting. This pins
|
||||
* that binding — the same one import.service has a spec for — which previously had
|
||||
* NO spec at all.
|
||||
* is a NON-editor write path, so a zip-imported `.md` page ends up with canonical
|
||||
* footnotes before persisting: ordered by first reference, reused refs deduped,
|
||||
* orphan definitions dropped.
|
||||
*
|
||||
* The markdown -> HTML -> ProseMirror conversion is REAL (a real ImportService,
|
||||
* its createYdoc stubbed); the filesystem is a real temp dir with one .md file;
|
||||
* the DB transaction is stubbed to capture the persisted page content.
|
||||
* Since #345 the `.md` parse runs `normalizeForeignMarkdown` ->
|
||||
* `markdownToProseMirror` -> `jsonToHtml` (feeding the shared HTML attachment /
|
||||
* link pipeline) -> `processHTML` -> `canonicalizeFootnotes`. The parser assigns
|
||||
* fresh `fn-*` ids, so we assert by definition BODY order rather than the source
|
||||
* labels. The conversion is REAL (a real ImportService, its createYdoc stubbed);
|
||||
* the filesystem is a real temp dir with one .md file; the DB transaction is
|
||||
* stubbed to capture the persisted page content.
|
||||
*/
|
||||
|
||||
// Out-of-order references (c, a, b), a REUSED reference ([^a] twice), and an
|
||||
@@ -49,13 +68,14 @@ const MARKDOWN = [
|
||||
'[^z]: orphan note',
|
||||
].join('\n');
|
||||
|
||||
function footnoteListIds(content: any): string[] {
|
||||
/** Definition body texts of the (single) footnotesList, in list order. */
|
||||
function footnoteListBodies(content: any): string[] {
|
||||
const list = (content?.content ?? []).find(
|
||||
(n: any) => n.type === 'footnotesList',
|
||||
);
|
||||
return (list?.content ?? [])
|
||||
.filter((n: any) => n.type === 'footnoteDefinition')
|
||||
.map((n: any) => n.attrs?.id);
|
||||
.map((n: any) => n.content?.[0]?.content?.[0]?.text);
|
||||
}
|
||||
|
||||
// A permissive chainable stub for the spaces lookup (selectFrom(...).select(...)
|
||||
@@ -71,80 +91,127 @@ function chainable(result: any): any {
|
||||
return proxy;
|
||||
}
|
||||
|
||||
/**
|
||||
* Run one markdown file through the REAL zip-import pipeline
|
||||
* (`processGenericImport` -> `markdownToProseMirror` -> `jsonToHtml` ->
|
||||
* `processHTML`/`htmlToJson`) and return the persisted page `content`. This is
|
||||
* the server-specific PM->HTML->PM hop that the package's own PM<->MD tests do
|
||||
* NOT cover.
|
||||
*/
|
||||
async function runZipImport(markdown: string): Promise<any> {
|
||||
const extractDir = await fs.mkdtemp(path.join(os.tmpdir(), 'fit-canon-'));
|
||||
await fs.writeFile(path.join(extractDir, 'note.md'), markdown, 'utf-8');
|
||||
|
||||
const importService = new ImportService(
|
||||
{} as any,
|
||||
{} as any,
|
||||
{} as any,
|
||||
{} as any,
|
||||
);
|
||||
jest
|
||||
.spyOn(importService as any, 'createYdoc')
|
||||
.mockResolvedValue(Buffer.from([]) as any);
|
||||
|
||||
let captured: any = null;
|
||||
const trx = {
|
||||
insertInto: (table: string) => ({
|
||||
values: (v: any) => {
|
||||
if (table === 'pages') captured = v;
|
||||
return { execute: async () => {} };
|
||||
},
|
||||
}),
|
||||
};
|
||||
const db: any = {
|
||||
selectFrom: () => chainable({ slug: 'space-slug' }),
|
||||
transaction: () => ({ execute: (fn: any) => fn(trx) }),
|
||||
};
|
||||
|
||||
const importAttachmentService = {
|
||||
processAttachments: async ({ html }: any) => html,
|
||||
};
|
||||
const service = new FileImportTaskService(
|
||||
{} as any, // storageService
|
||||
importService as any,
|
||||
{ nextPagePosition: async () => 'a0' } as any,
|
||||
{ insertBacklink: jest.fn() } as any,
|
||||
db,
|
||||
importAttachmentService as any,
|
||||
{ emit: jest.fn() } as any,
|
||||
{ logBatchWithContext: jest.fn() } as any,
|
||||
);
|
||||
|
||||
const fileTask: any = {
|
||||
id: 'task-1',
|
||||
source: 'generic',
|
||||
spaceId: 'space-1',
|
||||
workspaceId: 'ws-1',
|
||||
creatorId: 'user-1',
|
||||
};
|
||||
|
||||
try {
|
||||
await service.processGenericImport({ extractDir, fileTask });
|
||||
expect(captured).toBeTruthy();
|
||||
return captured.content;
|
||||
} finally {
|
||||
await fs.rm(extractDir, { recursive: true, force: true });
|
||||
}
|
||||
}
|
||||
|
||||
/** Find the first node of a given type anywhere in a PM content tree. */
|
||||
function findFirst(node: any, type: string): any {
|
||||
if (!node || typeof node !== 'object') return null;
|
||||
if (node.type === type) return node;
|
||||
for (const child of node.content ?? []) {
|
||||
const hit = findFirst(child, type);
|
||||
if (hit) return hit;
|
||||
}
|
||||
return null;
|
||||
}
|
||||
|
||||
describe('FileImportTaskService.processGenericImport — footnote canonicalization (#228)', () => {
|
||||
it('orders footnotes by first reference, dedupes reuse, and drops orphans on zip import', async () => {
|
||||
const extractDir = await fs.mkdtemp(path.join(os.tmpdir(), 'fit-canon-'));
|
||||
await fs.writeFile(path.join(extractDir, 'note.md'), MARKDOWN, 'utf-8');
|
||||
|
||||
// Real ImportService for the html -> JSON conversion; stub the yjs encode.
|
||||
const importService = new ImportService(
|
||||
{} as any,
|
||||
{} as any,
|
||||
{} as any,
|
||||
{} as any,
|
||||
const content = await runZipImport(MARKDOWN);
|
||||
// Definitions ordered by FIRST REFERENCE (C, A, B), NOT the markdown
|
||||
// definition order (A, B, C). Ids are the parser's fresh `fn-*`, so pin
|
||||
// the BODIES.
|
||||
expect(footnoteListBodies(content)).toEqual(['note C', 'note A', 'note B']);
|
||||
// Orphan [^z] dropped; reused [^a] collapses to one definition; one list.
|
||||
expect(footnoteListBodies(content)).not.toContain('orphan note');
|
||||
const lists = (content.content ?? []).filter(
|
||||
(n: any) => n.type === 'footnotesList',
|
||||
);
|
||||
jest
|
||||
.spyOn(importService as any, 'createYdoc')
|
||||
.mockResolvedValue(Buffer.from([]) as any);
|
||||
expect(lists).toHaveLength(1);
|
||||
expect(
|
||||
footnoteListBodies(content).filter((b) => b === 'note A'),
|
||||
).toHaveLength(1);
|
||||
});
|
||||
|
||||
let captured: any = null;
|
||||
const trx = {
|
||||
insertInto: (table: string) => ({
|
||||
values: (v: any) => {
|
||||
if (table === 'pages') captured = v;
|
||||
return { execute: async () => {} };
|
||||
},
|
||||
}),
|
||||
};
|
||||
const db: any = {
|
||||
selectFrom: () => chainable({ slug: 'space-slug' }),
|
||||
transaction: () => ({ execute: (fn: any) => fn(trx) }),
|
||||
};
|
||||
// #345 F4: the zip path routes markdown through jsonToHtml -> processHTML ->
|
||||
// htmlToJson (the shared HTML attachment pipeline). #345's headline is LOSSLESS
|
||||
// image width/align via the `<!--img {...}-->` comment; a callout carries its
|
||||
// `type`. This asserts those survive the PM->HTML->PM hop — the one hop the
|
||||
// package's PM<->MD suite does not exercise.
|
||||
it('preserves image width/align and callout type through the PM->HTML->PM hop', async () => {
|
||||
const md = [
|
||||
'# Doc',
|
||||
'',
|
||||
' <!--img {"width":"320","align":"left"}-->',
|
||||
'',
|
||||
':::warning',
|
||||
'Careful now.',
|
||||
':::',
|
||||
].join('\n');
|
||||
|
||||
const importAttachmentService = {
|
||||
processAttachments: async ({ html }: any) => html,
|
||||
};
|
||||
const backlinkRepo = { insertBacklink: jest.fn() };
|
||||
const eventEmitter = { emit: jest.fn() };
|
||||
const auditService = { logBatchWithContext: jest.fn() };
|
||||
const content = await runZipImport(md);
|
||||
|
||||
const pageService = { nextPagePosition: async () => 'a0' };
|
||||
const image = findFirst(content, 'image');
|
||||
expect(image).toBeTruthy();
|
||||
// The lossless sizing/alignment must survive the HTML hop.
|
||||
expect(String(image.attrs?.width)).toBe('320');
|
||||
expect(image.attrs?.align).toBe('left');
|
||||
|
||||
const service = new FileImportTaskService(
|
||||
{} as any, // storageService
|
||||
importService as any,
|
||||
pageService as any,
|
||||
backlinkRepo as any,
|
||||
db,
|
||||
importAttachmentService as any,
|
||||
eventEmitter as any,
|
||||
auditService as any,
|
||||
);
|
||||
|
||||
const fileTask: any = {
|
||||
id: 'task-1',
|
||||
source: 'generic',
|
||||
spaceId: 'space-1',
|
||||
workspaceId: 'ws-1',
|
||||
creatorId: 'user-1',
|
||||
};
|
||||
|
||||
try {
|
||||
await service.processGenericImport({ extractDir, fileTask });
|
||||
|
||||
expect(captured).toBeTruthy();
|
||||
const content = captured.content;
|
||||
// Reference order is c, a, b (NOT the markdown definition order a, b, c).
|
||||
expect(footnoteListIds(content)).toEqual(['c', 'a', 'b']);
|
||||
// Orphan [^z] dropped; reused [^a] collapses to one definition; one list.
|
||||
expect(footnoteListIds(content)).not.toContain('z');
|
||||
const lists = (content.content ?? []).filter(
|
||||
(n: any) => n.type === 'footnotesList',
|
||||
);
|
||||
expect(lists).toHaveLength(1);
|
||||
expect(footnoteListIds(content).filter((id) => id === 'a')).toHaveLength(1);
|
||||
} finally {
|
||||
await fs.rm(extractDir, { recursive: true, force: true });
|
||||
}
|
||||
const callout = findFirst(content, 'callout');
|
||||
expect(callout).toBeTruthy();
|
||||
expect(callout.attrs?.type).toBe('warning');
|
||||
});
|
||||
});
|
||||
|
||||
@@ -1,6 +1,9 @@
|
||||
import { Inject, Injectable, Logger } from '@nestjs/common';
|
||||
import * as path from 'path';
|
||||
import { jsonToText } from '../../../collaboration/collaboration.util';
|
||||
import {
|
||||
jsonToHtml,
|
||||
jsonToText,
|
||||
} from '../../../collaboration/collaboration.util';
|
||||
import { InjectKysely } from 'nestjs-kysely';
|
||||
import { KyselyDB } from '@docmost/db/types/kysely.types';
|
||||
import {
|
||||
@@ -18,9 +21,11 @@ import { generateSlugId } from '../../../common/helpers';
|
||||
import { v7 } from 'uuid';
|
||||
import { generateJitteredKeyBetween } from 'fractional-indexing-jittered';
|
||||
import { FileTask, InsertablePage } from '@docmost/db/types/entity.types';
|
||||
import { markdownToHtml, canonicalizeFootnotes } from '@docmost/editor-ext';
|
||||
import { canonicalizeFootnotes } from '@docmost/editor-ext';
|
||||
import { markdownToProseMirror } from '@docmost/prosemirror-markdown';
|
||||
import { getProsemirrorContent } from '../../../common/helpers/prosemirror/utils';
|
||||
import { formatImportHtml } from '../utils/import-formatter';
|
||||
import { normalizeForeignMarkdown } from '../utils/foreign-markdown';
|
||||
import {
|
||||
buildAttachmentCandidates,
|
||||
collectMarkdownAndHtmlFiles,
|
||||
@@ -461,7 +466,18 @@ export class FileImportTaskService {
|
||||
content = await fs.readFile(absPath, 'utf-8');
|
||||
|
||||
if (page.fileExtension.toLowerCase() === '.md') {
|
||||
content = await markdownToHtml(content);
|
||||
// Parse markdown with the single canonical converter
|
||||
// (`@docmost/prosemirror-markdown`), after normalizing foreign
|
||||
// reference footnotes, then serialize to HTML so the shared HTML
|
||||
// pipeline below (processAttachments + formatImportHtml +
|
||||
// processHTML) keeps handling `.md` and `.html` imports
|
||||
// uniformly. The markdown PARSE no longer goes through the
|
||||
// editor-ext markdown layer (issue #345) — the drift source is
|
||||
// gone. The PM -> HTML -> PM hop that follows is lossless
|
||||
// plumbing for attachment/link resolution, NOT a second parse.
|
||||
content = jsonToHtml(
|
||||
await markdownToProseMirror(normalizeForeignMarkdown(content)),
|
||||
);
|
||||
}
|
||||
} catch (err: any) {
|
||||
if (err?.code === 'ENOENT') {
|
||||
@@ -500,10 +516,12 @@ export class FileImportTaskService {
|
||||
this.importService.extractTitleAndRemoveHeading(pmState);
|
||||
|
||||
// Canonicalize footnote topology on this non-editor write path
|
||||
// (markdownToHtml/processHTML never runs footnoteSyncPlugin), so a
|
||||
// zip-imported page's footnotes are reference-ordered, deduped, and
|
||||
// (the HTML pipeline's processHTML never runs footnoteSyncPlugin), so
|
||||
// a zip-imported page's footnotes are reference-ordered, deduped, and
|
||||
// orphan-free like the editor's invariant (issue #228). Pure +
|
||||
// idempotent + shape-safe; a footnote-free doc is unchanged.
|
||||
// idempotent + shape-safe; a footnote-free doc is unchanged. (For a
|
||||
// `.md` file the package parser already yields canonical footnotes,
|
||||
// so this is a no-op there.)
|
||||
// (Future consolidation, architecture B: like import.service, this
|
||||
// path persists directly rather than via PageService — a shared
|
||||
// "prepare JSON for persist" helper would centralize this call.)
|
||||
|
||||
+27
-31
@@ -12,13 +12,19 @@ import { canonicalizeFootnotes } from '@docmost/editor-ext';
|
||||
|
||||
/**
|
||||
* Integration-ish test for the USER-FACING markdown import path
|
||||
* (`ImportService.importPage`). It exercises the REAL markdown -> HTML -> JSON
|
||||
* conversion and asserts that the stored page content has its footnotes
|
||||
* canonicalized — the gap that issue #228 fixes: the import path builds
|
||||
* ProseMirror JSON directly (never running the editor's footnoteSyncPlugin), so
|
||||
* before this wiring the stored footnotes kept the markdown's physical
|
||||
* definition order (out of order vs. references), retained orphan definitions,
|
||||
* and did not collapse reused references.
|
||||
* (`ImportService.importPage`). It exercises the REAL markdown -> ProseMirror
|
||||
* conversion and asserts the stored page's footnotes are canonical: ordered by
|
||||
* FIRST REFERENCE (not markdown definition order), reused references deduped to a
|
||||
* single definition, and orphan definitions dropped.
|
||||
*
|
||||
* Since #345 the markdown parse runs through the canonical package
|
||||
* (`normalizeForeignMarkdown` -> `markdownToProseMirror`), which owns this
|
||||
* canonicalization: the input's GFM `[^id]` reference footnotes are normalized to
|
||||
* inline `^[…]`, and the parser assigns fresh sequential ids (`fn-*`) in
|
||||
* reference order while merging identical bodies — so we assert by definition
|
||||
* BODY order, not by the source labels. `canonicalizeFootnotes` remains wired as
|
||||
* an idempotent safety net (issue #228) and is a no-op on this already-canonical
|
||||
* output.
|
||||
*
|
||||
* The DB/ydoc side-effects are stubbed: `getNewPagePosition` (DB query) and
|
||||
* `createYdoc` (Yjs encode) are spied, and `pageRepo.insertPage` captures the
|
||||
@@ -67,24 +73,14 @@ function makeService() {
|
||||
}
|
||||
|
||||
/** List the footnote-definition ids of the (single) footnotesList, in order. */
|
||||
function footnoteListIds(content: any): string[] {
|
||||
/** Definition body texts of the (single) footnotesList, in list order. */
|
||||
function footnoteListBodies(content: any): string[] {
|
||||
const list = (content.content ?? []).find(
|
||||
(n: any) => n.type === 'footnotesList',
|
||||
);
|
||||
if (!list) return [];
|
||||
return (list.content ?? [])
|
||||
return (list?.content ?? [])
|
||||
.filter((n: any) => n.type === 'footnoteDefinition')
|
||||
.map((n: any) => n.attrs?.id);
|
||||
}
|
||||
|
||||
function definitionText(content: any, id: string): string | undefined {
|
||||
const list = (content.content ?? []).find(
|
||||
(n: any) => n.type === 'footnotesList',
|
||||
);
|
||||
const def = (list?.content ?? []).find(
|
||||
(n: any) => n.type === 'footnoteDefinition' && n.attrs?.id === id,
|
||||
);
|
||||
return def?.content?.[0]?.content?.[0]?.text;
|
||||
.map((n: any) => n.content?.[0]?.content?.[0]?.text);
|
||||
}
|
||||
|
||||
describe('ImportService.importPage — footnote canonicalization (#228)', () => {
|
||||
@@ -101,23 +97,23 @@ describe('ImportService.importPage — footnote canonicalization (#228)', () =>
|
||||
const content = getCaptured().content;
|
||||
expect(content).toBeTruthy();
|
||||
|
||||
// Reference order is c, a, b (NOT the markdown definition order a, b, c).
|
||||
expect(footnoteListIds(content)).toEqual(['c', 'a', 'b']);
|
||||
|
||||
// Definitions preserved and attached to the right ids.
|
||||
expect(definitionText(content, 'c')).toBe('note C');
|
||||
expect(definitionText(content, 'a')).toBe('note A');
|
||||
expect(definitionText(content, 'b')).toBe('note B');
|
||||
// Definitions ordered by FIRST REFERENCE (C, A, B) — NOT the markdown
|
||||
// definition order (A, B, C) — with the orphan [^z] dropped and the reused
|
||||
// [^a] collapsed to a single definition. (Ids are the parser's fresh `fn-*`,
|
||||
// so we pin the BODIES.)
|
||||
expect(footnoteListBodies(content)).toEqual(['note C', 'note A', 'note B']);
|
||||
|
||||
// Orphan definition [^z] is dropped.
|
||||
expect(footnoteListIds(content)).not.toContain('z');
|
||||
expect(footnoteListBodies(content)).not.toContain('orphan note');
|
||||
|
||||
// Reused [^a] yields exactly ONE definition, and exactly one list.
|
||||
const lists = (content.content ?? []).filter(
|
||||
(n: any) => n.type === 'footnotesList',
|
||||
);
|
||||
expect(lists).toHaveLength(1);
|
||||
expect(footnoteListIds(content).filter((id) => id === 'a')).toHaveLength(1);
|
||||
expect(
|
||||
footnoteListBodies(content).filter((b) => b === 'note A'),
|
||||
).toHaveLength(1);
|
||||
});
|
||||
|
||||
it('is idempotent: canonicalizing the stored output again is a no-op', async () => {
|
||||
@@ -134,6 +130,6 @@ describe('ImportService.importPage — footnote canonicalization (#228)', () =>
|
||||
// time must not change it (safe to wire into every write path).
|
||||
const second = canonicalizeFootnotes(stored);
|
||||
expect(second).toEqual(stored);
|
||||
expect(footnoteListIds(second)).toEqual(['c', 'a', 'b']);
|
||||
expect(footnoteListBodies(second)).toEqual(['note C', 'note A', 'note B']);
|
||||
});
|
||||
});
|
||||
|
||||
@@ -17,7 +17,9 @@ import {
|
||||
import { generateJitteredKeyBetween } from 'fractional-indexing-jittered';
|
||||
import { TiptapTransformer } from '@hocuspocus/transformer';
|
||||
import * as Y from 'yjs';
|
||||
import { markdownToHtml, canonicalizeFootnotes } from '@docmost/editor-ext';
|
||||
import { canonicalizeFootnotes } from '@docmost/editor-ext';
|
||||
import { markdownToProseMirror } from '@docmost/prosemirror-markdown';
|
||||
import { normalizeForeignMarkdown } from '../utils/foreign-markdown';
|
||||
import {
|
||||
FileTaskStatus,
|
||||
FileTaskType,
|
||||
@@ -85,11 +87,13 @@ export class ImportService {
|
||||
|
||||
const extracted = this.extractTitleAndRemoveHeading(prosemirrorState);
|
||||
const title = extracted.title;
|
||||
// Imported markdown/HTML is built via markdownToHtml -> htmlToJson, which
|
||||
// never runs the editor's footnoteSyncPlugin, so the footnote topology keeps
|
||||
// the source's PHYSICAL definition order (out of order vs. references),
|
||||
// retains orphan definitions, and is not deduped. Canonicalize before
|
||||
// persisting so the stored page matches the editor's invariant (issue #228).
|
||||
// The markdown path now canonicalizes footnotes itself (the package parser),
|
||||
// but the HTML path (processHTML -> htmlToJson) does NOT run the editor's
|
||||
// footnoteSyncPlugin, so an imported HTML doc can keep its source's PHYSICAL
|
||||
// definition order (out of order vs. references), retain orphan definitions,
|
||||
// and not be deduped. Canonicalize before persisting so the stored page
|
||||
// matches the editor's invariant (issue #228); it is an idempotent no-op on
|
||||
// the already-canonical markdown output.
|
||||
// Pure + idempotent + shape-safe: a doc with no footnotes is unchanged.
|
||||
// (Future consolidation, architecture B: this import path persists directly
|
||||
// via pageRepo.insertPage rather than through PageService.createPage, so the
|
||||
@@ -133,12 +137,15 @@ export class ImportService {
|
||||
}
|
||||
|
||||
async processMarkdown(markdownInput: string): Promise<any> {
|
||||
try {
|
||||
const html = await markdownToHtml(markdownInput);
|
||||
return this.processHTML(html);
|
||||
} catch (err) {
|
||||
throw err;
|
||||
}
|
||||
// Canonical markdown -> ProseMirror JSON directly via
|
||||
// `@docmost/prosemirror-markdown` (issue #345) — no HTML intermediate and no
|
||||
// second editor-ext markdown layer. Foreign markdown surfaces the strict
|
||||
// canonical parser does not accept (GFM `[^id]` reference footnotes) are
|
||||
// rewritten to the canonical inline form by `normalizeForeignMarkdown` first.
|
||||
// The HTML-cleanup pass (`normalizeImportHtml`) is intentionally skipped here:
|
||||
// it targets foreign *HTML* (Notion/XWiki), which only ever arrives on the
|
||||
// `.html` path (`processHTML`), never as canonical markdown.
|
||||
return markdownToProseMirror(normalizeForeignMarkdown(markdownInput));
|
||||
}
|
||||
|
||||
async processHTML(htmlInput: string): Promise<any> {
|
||||
|
||||
@@ -0,0 +1,218 @@
|
||||
import {
|
||||
convertProseMirrorToMarkdown,
|
||||
markdownToProseMirror,
|
||||
} from '@docmost/prosemirror-markdown';
|
||||
import { normalizeForeignMarkdown } from './foreign-markdown';
|
||||
|
||||
/**
|
||||
* STEP 2 goldens for issue #345: the foreign-markdown normalizer that runs at the
|
||||
* import boundary BEFORE the strict canonical parser (`markdownToProseMirror`).
|
||||
*
|
||||
* Two layers:
|
||||
* 1. PURE string→string cases pinning the normalizer's own behavior (GFM
|
||||
* reference footnotes → inline `^[…]`).
|
||||
* 2. END-TO-END acceptance: for a foreign corpus, `normalizeForeignMarkdown`
|
||||
* then `markdownToProseMirror` then `convertProseMirrorToMarkdown` must leave
|
||||
* NO literal `[^id]` / `:::` garbage in the document and must re-export in the
|
||||
* canonical forms.
|
||||
*/
|
||||
|
||||
describe('normalizeForeignMarkdown — GFM reference footnotes', () => {
|
||||
it('inlines a single-line reference footnote and drops its definition', () => {
|
||||
const out = normalizeForeignMarkdown(
|
||||
'A note[^1] here.\n\n[^1]: The definition.',
|
||||
);
|
||||
expect(out).toBe('A note^[The definition.] here.\n');
|
||||
});
|
||||
|
||||
it('inlines every reference to a reused id (downstream dedups)', () => {
|
||||
const out = normalizeForeignMarkdown(
|
||||
'X[^a] and Y[^a].\n\n[^a]: shared.',
|
||||
);
|
||||
expect(out).toBe('X^[shared.] and Y^[shared.].\n');
|
||||
});
|
||||
|
||||
it('joins indented continuation lines of a definition with a space', () => {
|
||||
const out = normalizeForeignMarkdown(
|
||||
'See[^n].\n\n[^n]: line one\n line two',
|
||||
);
|
||||
expect(out).toBe('See^[line one line two].\n');
|
||||
});
|
||||
|
||||
it('never rewrites a reference inside a fenced code block', () => {
|
||||
const out = normalizeForeignMarkdown(
|
||||
'```\ncode[^1] here\n```\n\n[^1]: def.',
|
||||
);
|
||||
expect(out).toContain('code[^1] here');
|
||||
// The (now orphaned) definition line is still removed.
|
||||
expect(out).not.toContain('[^1]: def.');
|
||||
});
|
||||
|
||||
it('never rewrites a reference inside an INLINE-code span (backticks)', () => {
|
||||
// The `[^1]` inside backticks is literal code and must survive verbatim;
|
||||
// the one outside is rewritten. (Bug #1: only fenced blocks were protected.)
|
||||
const out = normalizeForeignMarkdown(
|
||||
'Use `arr[^1]` in code but note[^1] in prose.\n\n[^1]: def.',
|
||||
);
|
||||
expect(out).toBe('Use `arr[^1]` in code but note^[def.] in prose.\n');
|
||||
});
|
||||
|
||||
it('escapes brackets in a body so an unbalanced ] cannot truncate the footnote', () => {
|
||||
// A foreign definition body with a stray `]` would, unescaped, close the
|
||||
// canonical `^[...]` early and leak the tail as text (bug #2). The body's
|
||||
// brackets are backslash-escaped so the footnote stays whole.
|
||||
const out = normalizeForeignMarkdown(
|
||||
'Ref[^1] here.\n\n[^1]: see item ] and [more] later',
|
||||
);
|
||||
expect(out).toBe('Ref^[see item \\] and \\[more\\] later] here.\n');
|
||||
// The tokenizer must see exactly one unescaped closing bracket (our own).
|
||||
expect(out.match(/(?<!\\)\]/g)).toHaveLength(1);
|
||||
});
|
||||
|
||||
it('leaves a reference with no matching definition literal (no body to inline)', () => {
|
||||
const out = normalizeForeignMarkdown('Dangling[^x] ref.');
|
||||
expect(out).toBe('Dangling[^x] ref.');
|
||||
});
|
||||
|
||||
it('returns the input unchanged when there are no reference footnotes', () => {
|
||||
const md = '# Title\n\nJust text with `inline code` and a [link](/x).';
|
||||
expect(normalizeForeignMarkdown(md)).toBe(md);
|
||||
});
|
||||
|
||||
it('does NOT touch callout surfaces — the canonical parser handles them', () => {
|
||||
const callouts = ':::info\nHi\n:::\n\n> [!warning]\n> Careful';
|
||||
expect(normalizeForeignMarkdown(callouts)).toBe(callouts);
|
||||
});
|
||||
|
||||
it('strips a leading YAML front-matter block (Obsidian/Hugo/git-sync files)', () => {
|
||||
const out = normalizeForeignMarkdown(
|
||||
'---\ntitle: My Page\ntags: [a, b]\n---\n\n# Heading\n\nBody.',
|
||||
);
|
||||
expect(out).toBe('# Heading\n\nBody.');
|
||||
// The front-matter must not leak into the body as a setext heading.
|
||||
expect(out).not.toContain('title: My Page');
|
||||
expect(out).not.toContain('---');
|
||||
});
|
||||
|
||||
it('does not strip a horizontal rule that is not leading front-matter', () => {
|
||||
const md = 'Intro paragraph.\n\n---\n\nAfter the rule.';
|
||||
expect(normalizeForeignMarkdown(md)).toBe(md);
|
||||
});
|
||||
|
||||
it('is linear on a document with thousands of definitions (no quadratic blowup)', () => {
|
||||
// F2(a): the pass-2 rewrite must be O(text), not O(text × defs). Build a
|
||||
// pathological doc (many defs + many plain text lines) and assert it
|
||||
// completes well under a second — a quadratic implementation took ~14s.
|
||||
const N = 4000;
|
||||
const refs = Array.from({ length: N }, (_, i) => `line ${i} plain text`).join('\n');
|
||||
const defs = Array.from({ length: N }, (_, i) => `[^n${i}]: def ${i}`).join('\n');
|
||||
const doc = `start[^n0] and[^n${N - 1}] end\n\n${refs}\n\n${defs}`;
|
||||
const t0 = Date.now();
|
||||
const out = normalizeForeignMarkdown(doc);
|
||||
const elapsed = Date.now() - t0;
|
||||
expect(elapsed).toBeLessThan(2000);
|
||||
// Sanity: the two real references were still inlined.
|
||||
expect(out).toContain('^[def 0]');
|
||||
expect(out).toContain(`^[def ${N - 1}]`);
|
||||
});
|
||||
|
||||
it('is bounded on a long unclosed backtick run (no inline-split ReDoS)', () => {
|
||||
// F2(b): a huge unterminated backtick run must not cause quadratic
|
||||
// backtracking in the inline-code split. Oversized lines skip the split
|
||||
// entirely (left untouched), so this returns promptly.
|
||||
const line = 'x' + '`'.repeat(200000);
|
||||
const doc = `${line}\n\n[^1]: def`;
|
||||
const t0 = Date.now();
|
||||
normalizeForeignMarkdown(doc);
|
||||
expect(Date.now() - t0).toBeLessThan(2000);
|
||||
});
|
||||
|
||||
it('does not crash or slow down on thousands of prefix-chain definition ids', () => {
|
||||
// F7: the rewrite must use a FIXED generic scanner, not an alternation built
|
||||
// from the ids. A `(a|aa|aaa|…)` alternation over prefix-chain ids blows the
|
||||
// V8 regex compiler (FATAL RegExpCompiler Allocation failed — uncatchable,
|
||||
// kills the process). A fixed scanner has no id-dependent compilation cost.
|
||||
const N = 4000;
|
||||
const ids = Array.from({ length: N }, (_, i) => 'a'.repeat(i + 1));
|
||||
const defs = ids.map((id) => `[^${id}]: body ${id.length}`).join('\n');
|
||||
const doc = `ref[^${ids[0]}] and[^${ids[N - 1]}] end\n\n${defs}`;
|
||||
const t0 = Date.now();
|
||||
const out = normalizeForeignMarkdown(doc);
|
||||
expect(Date.now() - t0).toBeLessThan(2000);
|
||||
// Prefix disambiguation is correct: [^a] and [^aaaa...] inline their OWN body.
|
||||
expect(out).toContain('^[body 1]');
|
||||
expect(out).toContain(`^[body ${N}]`);
|
||||
});
|
||||
|
||||
it('strips a CRLF (Windows) front-matter block, not just LF', () => {
|
||||
// F9: the line-anchored regex needs LF after the opening `---`, so a Windows
|
||||
// file (`---\r\n…`) would slip past the strip and leak the front-matter into
|
||||
// the body. normalizeForeignMarkdown normalizes CRLF -> LF first.
|
||||
const out = normalizeForeignMarkdown(
|
||||
'---\r\ntitle: Foo\r\ntags: [a]\r\n---\r\n\r\n# Heading\r\n\r\nBody.',
|
||||
);
|
||||
expect(out).toBe('# Heading\n\nBody.');
|
||||
expect(out).not.toContain('title: Foo');
|
||||
expect(out).not.toContain('---');
|
||||
});
|
||||
|
||||
it('strips front-matter whose value contains a triple-dash (line-anchored)', () => {
|
||||
// F8: the block must close only on a `\n---` LINE, not the first inline
|
||||
// `---`. A value like `title: Q1 --- Q2` must not truncate the front-matter
|
||||
// and leak the rest (author/closing ---) into the body.
|
||||
const out = normalizeForeignMarkdown(
|
||||
'---\ntitle: Q1 --- Q2 results\nauthor: bob\n---\n\nReal body.',
|
||||
);
|
||||
expect(out).toBe('Real body.');
|
||||
expect(out).not.toContain('author: bob');
|
||||
expect(out).not.toContain('Q2 results');
|
||||
});
|
||||
});
|
||||
|
||||
describe('foreign markdown import acceptance (normalizer + canonical parser)', () => {
|
||||
const FOREIGN = [
|
||||
'# Doc',
|
||||
'',
|
||||
'Body refs [^c] and [^a] and [^b] and again [^a].',
|
||||
'',
|
||||
':::info',
|
||||
'A legacy callout.',
|
||||
':::',
|
||||
'',
|
||||
'| h1 | h2 |',
|
||||
'| --- | --- |',
|
||||
'| 1 | 2 |',
|
||||
'',
|
||||
'[^a]: note A',
|
||||
'[^b]: note B',
|
||||
'[^c]: note C',
|
||||
'[^z]: orphan note',
|
||||
].join('\n');
|
||||
|
||||
it('leaves no literal [^id] or ::: in the imported doc and re-exports canonically', async () => {
|
||||
const normalized = normalizeForeignMarkdown(FOREIGN);
|
||||
const doc = await markdownToProseMirror(normalized);
|
||||
const reexport = convertProseMirrorToMarkdown(doc);
|
||||
|
||||
// No foreign garbage leaks into the document.
|
||||
expect(reexport).not.toMatch(/\[\^/); // no reference footnote refs/defs
|
||||
expect(reexport).not.toContain(':::'); // no legacy callout fences
|
||||
|
||||
// Canonical forms are present.
|
||||
expect(reexport).toContain('^[note C]');
|
||||
expect(reexport).toContain('> [!info]');
|
||||
expect(reexport).toContain('| h1 | h2 |');
|
||||
|
||||
// Footnotes: ordered by first reference (C, A, B), reused [^a] deduped to one,
|
||||
// orphan [^z] dropped (it had no reference after normalization).
|
||||
const list = doc.content.find((n: any) => n.type === 'footnotesList');
|
||||
const bodies = list.content.map(
|
||||
(d: any) => d.content[0].content[0].text,
|
||||
);
|
||||
expect(bodies).toEqual(['note C', 'note A', 'note B']);
|
||||
expect(bodies).not.toContain('orphan note');
|
||||
expect(
|
||||
doc.content.filter((n: any) => n.type === 'footnotesList'),
|
||||
).toHaveLength(1);
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,265 @@
|
||||
/**
|
||||
* Foreign-markdown normalizer — an input-liberal / output-canonical adapter that
|
||||
* runs at the IMPORT boundary, BEFORE the canonical parser
|
||||
* (`markdownToProseMirror` from `@docmost/prosemirror-markdown`).
|
||||
*
|
||||
* The canonical parser is deliberately STRICT: it only understands Docmost's
|
||||
* canonical markdown surface (Obsidian-style `> [!type]` callouts, Pandoc/Obsidian
|
||||
* inline footnotes `^[body]`, lossless ` <!--img {...}-->` images, …).
|
||||
* Import, however, ingests FOREIGN files (GitHub/GFM, Notion, old Docmost
|
||||
* exports). Those use surfaces the canonical parser does not accept, most notably
|
||||
* GitHub-flavoured *reference* footnotes:
|
||||
*
|
||||
* Text with a note[^1] and another[^long].
|
||||
*
|
||||
* [^1]: The first definition.
|
||||
* [^long]: A second one.
|
||||
*
|
||||
* Left untouched, the parser does NOT recognise `[^id]` (it only parses `^[body]`),
|
||||
* so the reference leaks as literal text — and worse, the trailing `[^id]: def`
|
||||
* line is a valid CommonMark *link-reference definition*, so `[^id]` is silently
|
||||
* rendered as a bogus link. This normalizer rewrites reference footnotes into the
|
||||
* canonical inline form so the parser materialises real footnote nodes.
|
||||
*
|
||||
* This is a TEXT pre-pass, NOT a second parser fork: it does not re-implement any
|
||||
* converter logic. Callout surfaces (`:::type` and `> [!type]`) are intentionally
|
||||
* NOT touched here — the canonical parser already accepts BOTH natively (its
|
||||
* `preprocessCallouts` pass), so normalizing them would be redundant and would
|
||||
* only risk degrading the parser's nesting/code-fence-aware handling.
|
||||
*/
|
||||
|
||||
/** Matches a fenced code block delimiter (``` or ~~~), capturing the marker run. */
|
||||
const CODE_FENCE_RE = /^(\s*)(`{3,}|~{3,})/;
|
||||
|
||||
/**
|
||||
* Matches a GFM footnote DEFINITION line: `[^id]: body`. The id is any run of
|
||||
* non-`]` characters; the body is the remainder of the line (possibly empty).
|
||||
*/
|
||||
const FOOTNOTE_DEF_RE = /^\[\^([^\]]+)\]:[ \t]?(.*)$/;
|
||||
|
||||
/** True when a line is a code-fence delimiter that toggles fenced-code state. */
|
||||
function fenceMarker(line: string): string | null {
|
||||
const m = line.match(CODE_FENCE_RE);
|
||||
return m ? m[2] : null;
|
||||
}
|
||||
|
||||
/** True when a line is indented (leading space/tab) and not blank — a continuation. */
|
||||
function isIndentedContinuation(line: string): boolean {
|
||||
return /^[ \t]+\S/.test(line);
|
||||
}
|
||||
|
||||
function escapeRegExp(value: string): string {
|
||||
return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
|
||||
}
|
||||
|
||||
/**
|
||||
* Backslash-escape any square bracket in a footnote body before it is wrapped in
|
||||
* `^[...]`. The canonical inline-footnote tokenizer scans the body with bracket
|
||||
* balancing and closes on the first UNMATCHED `]`, so an unbalanced bracket in a
|
||||
* foreign definition (e.g. `[^1]: see item ] later`) would otherwise truncate the
|
||||
* footnote and leak the tail as literal text. Escaping every `[`/`]` makes the
|
||||
* body an inert run of characters — the tokenizer then closes only on our own
|
||||
* closing `]`. (A balanced `[link](url)` inside a body still round-trips because
|
||||
* the escaped form renders the literal brackets, which is the safe reading for a
|
||||
* footnote body; the alternative — brittle balance tracking — risks worse.)
|
||||
*/
|
||||
function escapeFootnoteBody(body: string): string {
|
||||
return body.replace(/[[\]]/g, '\\$&');
|
||||
}
|
||||
|
||||
/**
|
||||
* Rewrite every `[^id]` reference on a line to its `^[body]` form, but ONLY in the
|
||||
* text OUTSIDE inline-code spans. A `[^id]` inside backticks is literal code
|
||||
* content and must be preserved verbatim (a footnote ref never lives inside code).
|
||||
* We split the line on inline-code spans (paired backtick runs) and rewrite only
|
||||
* the non-code segments.
|
||||
*/
|
||||
// Above this length a single line is not split into inline-code spans (see
|
||||
// below). A genuine markdown line carrying a footnote reference is never tens of
|
||||
// KB; the cap only bypasses the inline-code protection for pathological lines.
|
||||
const INLINE_SPLIT_MAX_LINE = 8192;
|
||||
|
||||
function rewriteRefsOutsideInlineCode(
|
||||
line: string,
|
||||
replace: (text: string) => string,
|
||||
): string {
|
||||
// The inline-code split alternation `(`+)(?:(?!\1)[\s\S])*\1` backtracks
|
||||
// quadratically on a long UNCLOSED backtick run (its middle can consume the
|
||||
// rest of the line, then fail to find a closing run and retry from each
|
||||
// position). On an untrusted import this is a request-thread ReDoS. A real
|
||||
// footnote line is short, so for an oversized line we skip the inline-code
|
||||
// protection entirely and leave the line UNTOUCHED (rewriting it wholesale
|
||||
// could corrupt a `[^id]` that legitimately lives inside inline code). This is
|
||||
// a conservative bypass: an over-8KB line simply does not get its reference
|
||||
// footnotes inlined — acceptable for a pathological input.
|
||||
if (line.length > INLINE_SPLIT_MAX_LINE) return line;
|
||||
|
||||
// Alternation: an inline-code span (one or more backticks, then anything up to
|
||||
// the SAME run of backticks) OR a run of non-backtick text. Unterminated
|
||||
// backticks fall through as ordinary text (matched by the second branch on the
|
||||
// leftover), so a stray backtick never swallows the rest of the line.
|
||||
const parts = line.match(/(`+)(?:(?!\1)[\s\S])*\1|[^`]+|`+/g);
|
||||
if (!parts) return line;
|
||||
return parts
|
||||
.map((seg) => (seg.startsWith('`') ? seg : replace(seg)))
|
||||
.join('');
|
||||
}
|
||||
|
||||
/**
|
||||
* Convert GFM reference footnotes (`[^id]` + `[^id]: def`) into canonical inline
|
||||
* footnotes (`^[def]`).
|
||||
*
|
||||
* - Definitions are collected first (a leading `[^id]: text` line plus any
|
||||
* immediately-following indented continuation lines, joined with a space) and
|
||||
* removed from the output.
|
||||
* - Each in-text reference `[^id]` for which a definition was found is replaced by
|
||||
* `^[def]`. References with no matching definition are left literal (there is no
|
||||
* body to inline; the parser fails them open the same way).
|
||||
* - Code is respected on both passes: `[^id]` inside a fenced ``` / ~~~ block is
|
||||
* never rewritten and a `[^id]:` line inside a fence is never a definition; and
|
||||
* on the rewrite pass a `[^id]` inside an INLINE-code span (backticks) is left
|
||||
* literal too.
|
||||
* - The inlined body is bracket-escaped so an unbalanced `[`/`]` in a foreign
|
||||
* definition cannot truncate the resulting `^[...]` footnote.
|
||||
*
|
||||
* Deduplication / reference-ordering / orphan-dropping of the resulting footnotes
|
||||
* is handled downstream by the canonical parser (`assembleFootnotes`); this pass
|
||||
* only changes the surface syntax.
|
||||
*/
|
||||
function convertReferenceFootnotes(markdown: string): string {
|
||||
const lines = markdown.split('\n');
|
||||
|
||||
// Pass 1: collect definitions and mark their lines for removal.
|
||||
const defs = new Map<string, string>();
|
||||
const dropped = new Array<boolean>(lines.length).fill(false);
|
||||
let inFence = false;
|
||||
let fence = '';
|
||||
|
||||
for (let i = 0; i < lines.length; i++) {
|
||||
const line = lines[i];
|
||||
const marker = fenceMarker(line);
|
||||
if (inFence) {
|
||||
if (marker && marker[0] === fence[0] && marker.length >= fence.length) {
|
||||
inFence = false;
|
||||
fence = '';
|
||||
}
|
||||
continue;
|
||||
}
|
||||
if (marker) {
|
||||
inFence = true;
|
||||
fence = marker;
|
||||
continue;
|
||||
}
|
||||
|
||||
const def = line.match(FOOTNOTE_DEF_RE);
|
||||
if (!def) continue;
|
||||
|
||||
const id = def[1];
|
||||
const body: string[] = [def[2].trim()];
|
||||
dropped[i] = true;
|
||||
|
||||
// Consume immediately-following indented continuation lines (GFM lazy
|
||||
// continuation is not supported by design — keep it simple and predictable).
|
||||
let j = i + 1;
|
||||
while (j < lines.length && isIndentedContinuation(lines[j])) {
|
||||
body.push(lines[j].trim());
|
||||
dropped[j] = true;
|
||||
j++;
|
||||
}
|
||||
i = j - 1;
|
||||
|
||||
// Last definition wins for a duplicated id (matches CommonMark link-ref
|
||||
// semantics closely enough for a foreign-input adapter).
|
||||
defs.set(id, body.filter((s) => s.length > 0).join(' '));
|
||||
}
|
||||
|
||||
if (defs.size === 0) {
|
||||
return markdown;
|
||||
}
|
||||
|
||||
// ONE fixed, generic scanner regex — NOT one built from the definition ids.
|
||||
// It matches ANY `[^id]` shape, and the replacer decides per match via a map
|
||||
// lookup whether that id is a real definition (replace) or not (leave as-is).
|
||||
// This is genuinely O(total text) with no per-document regex compilation.
|
||||
//
|
||||
// Do NOT rebuild this as an alternation over `[...defs.keys()]`: a giant
|
||||
// `(id1|id2|...)` alternation over thousands of ids can blow the V8 regex
|
||||
// compiler's stack — a fatal, UNCATCHABLE "RegExpCompiler Allocation failed"
|
||||
// on prefix-chain ids (`a`, `aa`, `aaa`, ...) that kills the whole process
|
||||
// (worse than the earlier per-def thread-hang). A fixed scanner has no
|
||||
// id-dependent compilation cost and cannot blow up.
|
||||
const refRe = /\[\^([^\]]+)\]/g;
|
||||
const rewriteSegment = (segment: string): string =>
|
||||
segment.replace(refRe, (whole, id: string) => {
|
||||
const body = defs.get(id);
|
||||
// Only real definitions are inlined; an unknown id is left literal (same as
|
||||
// the old per-def loop, which simply never matched it).
|
||||
return body === undefined ? whole : `^[${escapeFootnoteBody(body)}]`;
|
||||
});
|
||||
|
||||
// Pass 2: rewrite in-text references, skipping fenced code and dropped lines.
|
||||
const out: string[] = [];
|
||||
inFence = false;
|
||||
fence = '';
|
||||
for (let i = 0; i < lines.length; i++) {
|
||||
if (dropped[i]) continue;
|
||||
let line = lines[i];
|
||||
|
||||
const marker = fenceMarker(line);
|
||||
if (inFence) {
|
||||
out.push(line);
|
||||
if (marker && marker[0] === fence[0] && marker.length >= fence.length) {
|
||||
inFence = false;
|
||||
fence = '';
|
||||
}
|
||||
continue;
|
||||
}
|
||||
if (marker) {
|
||||
inFence = true;
|
||||
fence = marker;
|
||||
out.push(line);
|
||||
continue;
|
||||
}
|
||||
|
||||
line = rewriteRefsOutsideInlineCode(line, rewriteSegment);
|
||||
out.push(line);
|
||||
}
|
||||
|
||||
return out.join('\n');
|
||||
}
|
||||
|
||||
/**
|
||||
* Strip a single leading YAML front-matter block (`---\n…\n---`). Foreign files
|
||||
* from Obsidian / Hugo / Jekyll / Notion — and Docmost's OWN git-sync page files
|
||||
* — open with front-matter that the canonical parser does not consume, so
|
||||
* without this it leaks into the body (and `title: Foo` above the closing `---`
|
||||
* renders as a setext `<h2>` that `extractTitleAndRemoveHeading` can hijack as
|
||||
* the page title). It is a no-op for front-matter-free input.
|
||||
*
|
||||
* LINE-ANCHORED (the same shape the canonical parser uses in
|
||||
* prosemirror-markdown/page-file.ts): the block opens only on `---\n` at the
|
||||
* very start and closes only on a `\n---` line. The retired `markdownToHtml`
|
||||
* strip closed on the FIRST `---` ANYWHERE (an unanchored close), so a value
|
||||
* containing a triple-dash (e.g. `title: Q1 --- Q2`) truncated the front-matter
|
||||
* and leaked the rest into the body. An optional leading BOM is tolerated.
|
||||
*/
|
||||
const YAML_FRONT_MATTER_RE = /^\uFEFF?---\n[\s\S]*?\n---\n?/;
|
||||
|
||||
/**
|
||||
* Normalize a foreign markdown string into Docmost's canonical markdown surface
|
||||
* so the strict canonical parser accepts it losslessly: normalize line endings,
|
||||
* strip a leading YAML front-matter block, then rewrite GFM reference footnotes
|
||||
* into inline footnotes. Add further fixture-driven foreign-surface cases here as
|
||||
* they are found.
|
||||
*/
|
||||
export function normalizeForeignMarkdown(markdown: string): string {
|
||||
if (!markdown) return markdown;
|
||||
// Normalize CRLF -> LF FIRST. The line-anchored front-matter regex requires a
|
||||
// bare `\n` after the opening `---`, and convertReferenceFootnotes splits on
|
||||
// `\n`; a Windows/CRLF foreign file (`---\r\n…`) would otherwise slip past the
|
||||
// front-matter strip and leak into the body. The canonical parser
|
||||
// (page-file.ts parsePageFile) normalizes the same way before its FRONTMATTER_RE.
|
||||
const src = markdown.replace(/\r\n/g, '\n');
|
||||
const withoutFrontMatter = src.replace(YAML_FRONT_MATTER_RE, '').trimStart();
|
||||
return convertReferenceFootnotes(withoutFrontMatter);
|
||||
}
|
||||
@@ -20,10 +20,6 @@ export interface IStripeSeatsSyncJob {
|
||||
|
||||
export interface IPageHistoryJob {
|
||||
pageId: string;
|
||||
// #370 — intentionality tier the worker stamps on the snapshot. All jobs on
|
||||
// this queue are trailing idle-flush autosnapshots, so this is 'idle' (absent
|
||||
// → treated as 'idle' by the processor).
|
||||
kind?: 'idle';
|
||||
}
|
||||
|
||||
/**
|
||||
|
||||
@@ -0,0 +1,23 @@
|
||||
// Jest stub for @tiptap/react.
|
||||
//
|
||||
// The server export/import code paths transitively import editor-ext, whose node
|
||||
// extensions import from `@tiptap/react`. The real module re-exports all of
|
||||
// `@tiptap/core` (headless, safe under node) AND adds React view helpers
|
||||
// (`ReactNodeViewRenderer`, …) that eagerly pull in react-dom — which throws
|
||||
// `navigator is not defined` under jest's node environment.
|
||||
//
|
||||
// So this stub DELEGATES to the real `@tiptap/core` (keeping `mergeAttributes`,
|
||||
// `Node`, `Mark`, `nodeInputRule`, … working — they are used by
|
||||
// `jsonToHtml`/`htmlToJson` on the server) and overrides ONLY the React view
|
||||
// helpers with no-ops. Those helpers are referenced solely inside `addNodeView()`
|
||||
// — code that runs only in a live browser editor, never on the server; if any
|
||||
// were actually invoked here it would (correctly) surface as a test failure.
|
||||
const core = require('@tiptap/core');
|
||||
|
||||
module.exports = {
|
||||
...core,
|
||||
ReactNodeViewRenderer: () => () => ({}),
|
||||
NodeViewWrapper: () => null,
|
||||
NodeViewContent: () => null,
|
||||
ReactRenderer: class {},
|
||||
};
|
||||
Generated
+3
@@ -543,6 +543,9 @@ importers:
|
||||
'@docmost/pdf-inspector':
|
||||
specifier: 1.9.6
|
||||
version: 1.9.6
|
||||
'@docmost/prosemirror-markdown':
|
||||
specifier: workspace:*
|
||||
version: link:../../packages/prosemirror-markdown
|
||||
'@fastify/cookie':
|
||||
specifier: ^11.0.2
|
||||
version: 11.0.2
|
||||
|
||||
Reference in New Issue
Block a user