fix(footnotes): address PR #232 review — fragment-safe canonicalization, plugin placement parity, dead-code removal (#228)

Must-fix:
- Move canonicalizeFootnotes OUT of parseProsemirrorContent. It now runs only
  on FULL writes (createPage, updatePageContent operation==='replace'), never on
  an append/prepend fragment (a fragment would lose definition-only footnotes or
  synthesize a bogus empty list). Add a server binding spec.
- Match the live plugin's list PLACEMENT: a single already-canonical
  footnotesList is left exactly where it sits (the plugin never repositions a
  sole correct list), so the first write no longer reorders content that follows
  the list. Applied to BOTH the editor-ext copy and the MCP mirror; pinned by a
  shared golden corpus case with content after the list.
- Fix MCP tool count 38 -> 39 (README x3, AGENTS.md) and the transformJs param
  help (add canonicalizeFootnotes/insertInlineFootnote).
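A minimal sketch of why fragments must not be canonicalized, assuming a simplified flat-document stand-in rather than the real canonicalizer. `isFullWrite` and `canonicalizeSketch` are hypothetical names used only for illustration; only the "no references, no list" and "missing definition is synthesized empty" rules are modelled.

```javascript
// Hypothetical gate: only whole-document writes are canonicalized.
function isFullWrite(kind, operation) {
  return kind === "create" || (kind === "update" && operation === "replace");
}

// Simplified stand-in for the real canonicalizer (flat documents only).
function canonicalizeSketch(doc) {
  const refIds = [];
  const defs = new Map();
  for (const block of doc.content) {
    for (const child of block.content ?? []) {
      if (child.type === "footnoteReference" && !refIds.includes(child.attrs.id)) {
        refIds.push(child.attrs.id);
      }
      if (child.type === "footnoteDefinition" && !defs.has(child.attrs.id)) {
        defs.set(child.attrs.id, child);
      }
    }
  }
  const body = doc.content.filter((n) => n.type !== "footnotesList");
  if (refIds.length === 0) return { ...doc, content: body }; // no refs: no list
  const list = {
    type: "footnotesList",
    content: refIds.map(
      (id) =>
        defs.get(id) ?? {
          type: "footnoteDefinition",
          attrs: { id },
          content: [{ type: "paragraph" }], // synthesized empty definition
        },
    ),
  };
  return { ...doc, content: [...body, list] };
}

// An appended fragment that carries only a definition (its reference is on the page).
const definitionOnlyFragment = {
  type: "doc",
  content: [
    {
      type: "footnotesList",
      content: [
        {
          type: "footnoteDefinition",
          attrs: { id: "a" },
          content: [{ type: "paragraph", content: [{ type: "text", text: "note" }] }],
        },
      ],
    },
  ],
};

// An appended fragment that carries only a reference (its definition is on the page).
const referenceOnlyFragment = {
  type: "doc",
  content: [
    {
      type: "paragraph",
      content: [
        { type: "text", text: "claim" },
        { type: "footnoteReference", attrs: { id: "a" } },
      ],
    },
  ],
};

const lostDefinition = canonicalizeSketch(definitionOnlyFragment);
const bogusList = canonicalizeSketch(referenceOnlyFragment);
```

In this sketch the first fragment comes back with no content at all (the definition is gone), and the second gains a list holding an empty definition for `a`. Both outcomes are wrong for a fragment, which is why the gate admits full writes only.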

Simplifications:
- Remove the dead duplicate re-id mechanism (deriveFootnoteId/suffix/occurrence)
  from the PURE canonicalizer in both copies — references are never renamed, so
  the derived ids were never requested; first-wins-drop is the real behaviour.
  This also makes the editor-ext footnote-util note about "no cross-package copy"
  true again.
- Remove the sentinel round-trip in insertInlineFootnote: a generalized
  insertNodesAfterAnchor core inserts the footnoteReference node directly.
- Drop the redundant per-definition deep clone in step 4 (shallow id-normalizing
  copy; out is already deep-cloned).
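The first-wins rule and the shallow id-normalizing copy can be sketched roughly as follows. The helper names `pickFirstDefinitions` and `orderDefinitions` are hypothetical (the commit keeps this logic inline in `canonicalizeFootnotes`); the loop bodies follow the diff.

```javascript
// First definition per id wins; later duplicates share the id and are ignored.
function pickFirstDefinitions(defNodes) {
  const defById = new Map();
  for (const d of defNodes) {
    const id = d?.attrs?.id;
    if (id && !defById.has(id)) defById.set(id, d);
  }
  return defById;
}

// One definition per referenced id, in reference order. Existing nodes are
// shallow-copied (content is shared, attrs are rebuilt with the id); a missing
// definition is synthesized empty. Unreferenced definitions are never added.
function orderDefinitions(referenceIds, defById) {
  return referenceIds.map((id) => {
    const existing = defById.get(id);
    return existing
      ? { ...existing, attrs: { ...(existing.attrs ?? {}), id } }
      : { type: "footnoteDefinition", attrs: { id }, content: [{ type: "paragraph" }] };
  });
}

const text = (t) => [{ type: "paragraph", content: [{ type: "text", text: t }] }];
const defNodes = [
  { type: "footnoteDefinition", attrs: { id: "a" }, content: text("first a") },
  { type: "footnoteDefinition", attrs: { id: "b" }, content: text("only b") },
  { type: "footnoteDefinition", attrs: { id: "a" }, content: text("second a") }, // duplicate: dropped
  { type: "footnoteDefinition", attrs: { id: "c" }, content: text("orphan c") }, // orphan: dropped
];

const ordered = orderDefinitions(["b", "a", "d"], pickFirstDefinitions(defNodes));
```

Here `ordered` holds `b`, the first `a`, and an empty `d`, in that order. The shallow copy is safe only because the caller has already deep-cloned the document, as the commit message notes.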

Docs / architecture:
- Correct the editor-ext copy's "It exists because…" header to its real
  consumers (server import, page.service create/update, client paste).
- Note in collaboration.ts that markdownToProseMirror is also reused for comment
  create/update.
- A: shared golden JSON corpus exercised by BOTH the editor-ext copy and the MCP
  mirror (footnote-corpus.ts / .mjs) so "the two copies behave identically" is
  checkable.
- C: split the MCP canonicalizer into a pure mirror + footnote-authoring.ts.
- B: import services persist via a different path, so left one-line consolidation
  comments at the call sites rather than folding (does not fall out cleanly).
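As a rough illustration of item A, the sketch below runs one corpus of `{ name, input, expected }` cases against two canonicalizer copies and also checks idempotence. `SKETCH_CORPUS`, `standInCanonicalize`, and `runCorpus` are hypothetical stand-ins; the real suites use the shared corpus files and their own test runners, and the stand-in models only the "no references, no list" rule.

```javascript
// Stand-in corpus with the { name, input, expected } shape used by the suites.
const SKETCH_CORPUS = [
  {
    name: "no references: the list is removed",
    input: {
      type: "doc",
      content: [
        { type: "paragraph", content: [{ type: "text", text: "body" }] },
        { type: "footnotesList", content: [] },
      ],
    },
    expected: {
      type: "doc",
      content: [{ type: "paragraph", content: [{ type: "text", text: "body" }] }],
    },
  },
];

// True when the node or any descendant is a footnoteReference.
const hasReference = (node) =>
  node?.type === "footnoteReference" ||
  (Array.isArray(node?.content) && node.content.some(hasReference));

// Stand-in for one copy; the real canonicalizers do far more than this.
function standInCanonicalize(doc) {
  if (hasReference(doc)) return doc;
  return { ...doc, content: doc.content.filter((n) => n.type !== "footnotesList") };
}

// JSON.stringify equality is key-order sensitive; adequate for this sketch only.
const same = (a, b) => JSON.stringify(a) === JSON.stringify(b);

// Returns a list of failure labels; an empty list means the copy matches.
function runCorpus(label, canonicalize, corpus) {
  const failures = [];
  for (const { name, input, expected } of corpus) {
    if (!same(canonicalize(input), expected)) failures.push(`${label}: ${name}`);
    if (!same(canonicalize(expected), expected)) {
      failures.push(`${label}: ${name} (not idempotent)`);
    }
  }
  return failures;
}

const failures = [
  ...runCorpus("editor-ext copy", standInCanonicalize, SKETCH_CORPUS),
  ...runCorpus("MCP mirror", standInCanonicalize, SKETCH_CORPUS),
];
```

Because both suites pin the same expected outputs, a behavioural drift in either copy shows up as a failure in that copy's suite without the packages importing each other.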

Tests: insertFootnote wrapper guards + docmost_transform dryRun auto-canonicalize
(MCP mock), page.service create/update + append/prepend binding (server jest),
shared corpus incl. nested-container reference.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
a
2026-06-27 20:23:16 +03:00
parent fa929c9e86
commit 07ebd8c63e
23 changed files with 3262 additions and 453 deletions


@@ -7,6 +7,7 @@ import { FootnoteReference } from './footnote-reference';
import { FootnotesList } from './footnotes-list';
import { FootnoteDefinition } from './footnote-definition';
import { canonicalizeFootnotes } from './footnote-canonicalize';
import { FOOTNOTE_CORPUS } from './footnote-corpus';
import {
collectReferenceIds,
computeFootnoteNumbers,
@@ -325,3 +326,21 @@ describe('canonicalizeFootnotes golden parity with footnoteSyncPlugin', () => {
expect(new Set(defOrder(steady))).toEqual(new Set(defOrder(canon)));
});
});
/**
* SHARED golden corpus: this editor-ext copy of `canonicalizeFootnotes` and the
* MCP mirror (`packages/mcp/src/lib/footnote-canonicalize.ts`) are BOTH run
* against the identical { input -> expected } corpus. Pinning the same expected
* outputs in both suites makes "the two pure copies behave identically" a
* checkable property without coupling the packages (architecture item A). The
* MCP mirror of these assertions lives in `test/unit/footnote-corpus.test.mjs`.
*/
describe('canonicalizeFootnotes shared golden corpus (editor-ext copy)', () => {
for (const { name, input, expected } of FOOTNOTE_CORPUS) {
it(`matches the corpus expected output: ${name}`, () => {
expect(canonicalizeFootnotes(input)).toEqual(expected);
// Idempotent on the corpus too.
expect(canonicalizeFootnotes(expected)).toEqual(expected);
});
}
});


@@ -2,7 +2,6 @@ import {
FOOTNOTE_REFERENCE_NAME,
FOOTNOTES_LIST_NAME,
FOOTNOTE_DEFINITION_NAME,
deriveFootnoteId,
} from './footnote-util';
/**
@@ -11,14 +10,20 @@ import {
* `appendTransaction` that only runs inside a ProseMirror `EditorView`, this is
* a PURE function over ProseMirror JSON: `canonicalizeFootnotes(doc) -> doc`.
*
* It exists because every NON-editor write path (the MCP `markdownToProseMirror`
* importer, `update_page_json`, `docmost_transform`, the future git-sync writer)
* builds ProseMirror JSON directly via `TiptapTransformer`/`updateYFragment`,
* which NEVER runs the editor's plugins — so the canonical footnote topology was
* never enforced on those writes. That is the root cause of the symptom in the
* issue: footnotes rendered out of order (`1, 4, 2, 3, …`), a raw trailing
* `[^id]: …` block, and orphan definitions, all of which are simply the result
* of content written PAST the canonicalizer.
* It exists because the NON-editor write paths served by THIS copy build
* ProseMirror JSON directly (never running the editor's plugins), so the
* canonical footnote topology was never enforced on those writes. The consumers
* of this editor-ext copy are: the server markdown/HTML import
* (`markdownToHtml -> htmlToJson` in import.service / file-import-task.service),
* `PageService` create/update (`parseProsemirrorContent` for the JSON/markdown/
* HTML REST write paths), and the client markdown PASTE path
* (`markdown-clipboard.ts`). (The MCP package mirrors this canonicalizer in
* `packages/mcp/src/lib/footnote-canonicalize.ts` for its own write paths:
* `markdownToProseMirror`, `update_page_json`, `docmost_transform`,
* `insert_footnote`; see that file's header.) Writes that bypass the
* canonicalizer are the root cause of the symptom in the issue: footnotes
* rendered out of order (`1, 4, 2, 3, …`), a raw trailing `[^id]: …` block, and
* orphan definitions.
*
* The desired end-state (identical to the plugin's) is:
*
@@ -31,12 +36,14 @@ import {
* or synthesizing an empty one when missing. The list sits after the last
* meaningful block (only trailing empty paragraphs may follow it).
* 3. Orphan definitions (no matching reference) are dropped.
* 4. Duplicate DEFINITIONS (two nodes sharing an id) are resolved
* deterministically: the first keeps the id; each later duplicate is re-id'd
* via `deriveFootnoteId` (never random) so it is never silently lost — and,
* lacking a matching reference, it then falls under the orphan policy and is
* dropped. This matches the editor's never-lose-by-collision rule and the
* importer's first-wins rule (both converge to "one definition per id").
* 4. Duplicate DEFINITIONS (two nodes sharing an id) are resolved first-wins:
* the first definition for an id is kept; later duplicates carry the SAME
* id, so they can never be referenced separately and are simply dropped.
* This matches the importer's first-wins rule ("one definition per id").
* (The LIVE editor instead re-id's a duplicate definition so a paste/collab
* merge cannot silently lose live user data; the artifacts this copy
* sanitizes are agent/import-authored, so first-wins is the right policy —
* see footnote-sync.ts `resolveCollisions`.)
* 5. Idempotent: a document that already satisfies the invariant is returned
* structurally unchanged (the existing definition/list nodes are reused
* verbatim), so re-running the canonicalizer — or running it on a write that
@@ -47,10 +54,18 @@ import {
* PHYSICAL order of existing definition nodes to keep their Yjs/CRDT subtree
* identity stable across collaborators (numbering is decoration-derived, so the
* displayed numbers are correct regardless of physical order). This function has
* no live CRDT to protect, so it physically REORDERS the list into reference
* order — which is exactly the repair the out-of-order import needs. On every
* editor-reachable steady state (where the list is already reference-ordered) the
* two agree byte-for-byte; see the golden test.
* no live CRDT to protect, so when a REPAIR is needed it physically REORDERS the
* list into reference order — which is exactly the fix the out-of-order import
* needs.
*
* Placement PARITY with the plugin: when the document is already in the canonical
* single-list state, this function leaves that list EXACTLY where it sits (it
* does not move it to the end). The plugin behaves the same — it treats one
* footnotesList holding the canonical definition set as canonical regardless of
* whether content follows it (footnote-sync.ts: `primaryList` falls back to the
* last list and `noChangeNeeded` stays true). So on every editor-reachable steady
* state the two agree byte-for-byte, including when non-empty content follows the
* list; see the golden parity test and the shared corpus.
*
* Pure: deep-clones its input, never mutates the caller's object, and is
* deterministic (no `Math.random`/`Date.now`).
@@ -76,62 +91,69 @@ export function canonicalizeFootnotes<T = any>(doc: T): T {
const defNodes: any[] = [];
collectDefinitions(out, defNodes);
// 3) Resolve the id topology deterministically. The first definition for an id
// keeps it; a later duplicate is re-id'd to a fresh derived id (never lost),
// which — having no matching reference — is dropped as an orphan in step 4.
const taken = new Set<string>(referenceIds);
// 3) First definition per id wins. Later duplicates carry the SAME id, so they
// can never be referenced separately and would be orphans — they are simply
// dropped (first-wins; see the file header, item 4).
const defById = new Map<string, any>();
for (const d of defNodes) {
const id = d?.attrs?.id;
if (id) taken.add(id);
}
const occurrenceOf = new Map<string, number>();
const seenDefIds = new Set<string>();
// finalId -> definition node (the node reference inside `out`).
const defByFinalId = new Map<string, any>();
for (const d of defNodes) {
const origId = d?.attrs?.id;
if (!origId) continue;
if (!seenDefIds.has(origId)) {
seenDefIds.add(origId);
defByFinalId.set(origId, d);
} else {
const next = (occurrenceOf.get(origId) ?? 1) + 1;
occurrenceOf.set(origId, next);
const newId = deriveFootnoteId(origId, next, taken);
taken.add(newId);
defByFinalId.set(newId, d);
}
if (id && !defById.has(id)) defById.set(id, d);
}
// 4) Build the ordered definition list: one per referenced id, in REFERENCE
// order, reusing the existing node (content preserved, id normalized) or
// synthesizing an empty definition. Definitions whose final id is NOT
// referenced are orphans and are simply never added.
// synthesizing an empty definition. Definitions whose id is NOT referenced
// are orphans and are simply never added. The reused node is SHALLOW-copied
// (id normalized): `out` is already a deep clone and the old lists are cut,
// so a second per-definition deep clone is needless.
const orderedDefs: any[] = [];
for (const id of referenceIds) {
const existing = defByFinalId.get(id);
const existing = defById.get(id);
if (existing) {
const node = cloneJson(existing);
node.attrs = { ...(node.attrs ?? {}), id };
orderedDefs.push(node);
orderedDefs.push({
...existing,
attrs: { ...(existing.attrs ?? {}), id },
});
} else {
orderedDefs.push(emptyDefinition(id));
}
}
// 5) Strip every existing top-level footnotesList; we rebuild a single one.
const top: any[] = out.content.filter(
(n: any) => !(n && n.type === FOOTNOTES_LIST_NAME),
);
// 6) No references -> there must be NO list at all.
// 5) No references -> there must be NO list at all.
if (referenceIds.length === 0) {
out.content = top;
out.content = out.content.filter(
(n: any) => !(n && n.type === FOOTNOTES_LIST_NAME),
);
return out;
}
// 7) Insert exactly one footnotesList after the last meaningful (non-empty
// paragraph) block, so it coexists with a trailing-node empty paragraph.
// 6) Placement parity with the live plugin: when the document is ALREADY in the
// canonical single-list state, leave that list exactly where it sits instead
// of cutting and re-inserting it at the end. The plugin never repositions a
// sole correct list (footnote-sync.ts), so moving it here would silently
// reorder any user content that follows the list on the first write. The doc
// is in that state when there is exactly one top-level footnotesList, every
// definition in the doc is referenced (no orphans / duplicates: the def count
// equals the canonical count), and the list already holds exactly the
// canonical definitions in reference order.
const topLevelLists = out.content.filter(
(n: any) => n && n.type === FOOTNOTES_LIST_NAME,
);
if (
topLevelLists.length === 1 &&
defNodes.length === orderedDefs.length &&
deepEqualJson(topLevelLists[0].content, orderedDefs)
) {
return out;
}
// 7) Otherwise rebuild: strip every footnotesList and re-insert exactly one
// after the last meaningful (non-empty paragraph) block, so it coexists with
// a trailing-node empty paragraph. This both repairs a non-canonical doc and
// (in the import case) physically reorders the list into reference order.
const top: any[] = out.content.filter(
(n: any) => !(n && n.type === FOOTNOTES_LIST_NAME),
);
let insertAt = top.length;
while (insertAt > 0 && isEmptyParagraph(top[insertAt - 1])) insertAt--;
top.splice(insertAt, 0, { type: FOOTNOTES_LIST_NAME, content: orderedDefs });
@@ -139,6 +161,36 @@ export function canonicalizeFootnotes<T = any>(doc: T): T {
return out;
}
/**
* Deep equality over plain JSON (objects/arrays/primitives). Object key order is
* ignored; array element order is significant.
* Used to detect an already-canonical footnotesList so its physical position is
* preserved (placement parity with the live plugin).
*/
function deepEqualJson(a: any, b: any): boolean {
if (a === b) return true;
if (a == null || b == null || typeof a !== typeof b) return false;
if (Array.isArray(a) || Array.isArray(b)) {
if (!Array.isArray(a) || !Array.isArray(b) || a.length !== b.length) {
return false;
}
for (let i = 0; i < a.length; i++) {
if (!deepEqualJson(a[i], b[i])) return false;
}
return true;
}
if (typeof a === 'object') {
const ka = Object.keys(a);
const kb = Object.keys(b);
if (ka.length !== kb.length) return false;
for (const k of ka) {
if (!Object.prototype.hasOwnProperty.call(b, k)) return false;
if (!deepEqualJson(a[k], b[k])) return false;
}
return true;
}
return false;
}
/** A fresh empty definition node for a referenced id with no definition. */
function emptyDefinition(id: string): any {
return {

File diff suppressed because it is too large


@@ -656,7 +656,8 @@ export function createDocmostMcpServer(config) {
"parenthesized function). It receives a clone of the live doc and " +
"ctx (comments, log, consume(id), helpers: blockText/walk/getList/" +
"insertMarkerAfter/setCalloutRange/noteItem/mdToInlineNodes/" +
"commentsToFootnotes) and must return a {type:'doc'} node."),
"commentsToFootnotes/canonicalizeFootnotes/insertInlineFootnote) " +
"and must return a {type:'doc'} node."),
dryRun: z
.boolean()
.optional()


@@ -344,7 +344,16 @@ function extractFootnotes(markdown) {
section: `<section data-footnotes>${inner}</section>`,
};
}
/** Convert markdown to a ProseMirror doc using the full Docmost schema. */
/**
* Convert markdown to a ProseMirror doc using the full Docmost schema.
*
* NOTE: besides the page-import write paths, this is also reused for comment
* bodies (createComment / updateComment). For an ordinary comment the
* canonicalize call below is a no-op (a comment carries no footnotes), so the
* reuse is safe; the only theoretical effect is if footnote markup were ever
* authored INSIDE a comment — a narrow case where canonicalizing the comment's
* own (self-contained) footnotes is still the correct behaviour.
*/
export async function markdownToProseMirror(markdownContent) {
const withCallouts = await preprocessCallouts(markdownContent);
const { body, section } = extractFootnotes(withCallouts);


@@ -0,0 +1,88 @@
/**
* Inline-authoring helpers for footnotes (MCP).
*
* These build/identify footnote DEFINITION nodes for the author-inline tool
* (`insertInlineFootnote` in transforms.ts): a content key to de-duplicate notes
* by text, a definition-node factory, and a fresh uuidv7-style id generator.
*
* Split out of `footnote-canonicalize.ts` so that module stays a pure MIRROR of
* the editor-ext canonicalizer (compositionally symmetric to the editor-ext
* copy, which keeps its authoring helpers in `footnote-util.ts`). The pure
* canonicalizer has no dependency on these.
*/
const FOOTNOTE_DEFINITION_NAME = "footnoteDefinition";
function cloneJson(v) {
if (typeof structuredClone === "function")
return structuredClone(v);
return JSON.parse(JSON.stringify(v));
}
/**
* Normalized content key for de-duplicating footnote DEFINITIONS by their text.
*
* Two definitions with the same key are the SAME footnote — so the inline
* authoring tool reuses one id (one number, one definition, several references)
* instead of minting a second definition. Key = plaintext (whitespace-collapsed,
* trimmed) PLUS a signature of each text run's mark types (sorted within the
* run), so two notes that read the same but differ in formatting (one bold, one
* plain) are NOT merged.
* Conservative: only an exact match merges.
*/
export function footnoteContentKey(defNode) {
const parts = [];
const visit = (n) => {
if (!n || typeof n !== "object")
return;
if (n.type === "text" && typeof n.text === "string") {
const marks = Array.isArray(n.marks)
? n.marks.map((m) => m?.type).filter(Boolean).sort().join(",")
: "";
parts.push(`${n.text}${marks}`);
}
if (Array.isArray(n.content))
for (const c of n.content)
visit(c);
};
visit(defNode);
// Collapse the assembled text's whitespace and trim, keeping the mark
// signature attached so formatting differences still distinguish notes.
return parts
.join("")
.replace(/[ \t\r\n]+/g, " ")
.trim();
}
/**
* Build a footnoteDefinition node from inline ProseMirror nodes, keyed by id.
*/
export function makeFootnoteDefinition(id, inlineNodes) {
const content = Array.isArray(inlineNodes) ? cloneJson(inlineNodes) : [];
return {
type: FOOTNOTE_DEFINITION_NAME,
attrs: { id },
content: [{ type: "paragraph", content }],
};
}
/**
* Generate a uuidv7-style id (time-ordered), matching editor-ext's
* `generateFootnoteId`. Used for a genuinely-new inline footnote id.
*/
export function generateFootnoteId() {
const now = Date.now();
const timeHex = now.toString(16).padStart(12, "0");
const rand = (length) => {
let s = "";
for (let i = 0; i < length; i++)
s += Math.floor(Math.random() * 16).toString(16);
return s;
};
const versioned = "7" + rand(3);
const variantNibble = (8 + Math.floor(Math.random() * 4)).toString(16);
const variant = variantNibble + rand(3);
return (timeHex.slice(0, 8) +
"-" +
timeHex.slice(8, 12) +
"-" +
versioned +
"-" +
variant +
"-" +
rand(12));
}


@@ -1,5 +1,5 @@
/**
* Server-side footnote canonicalizer + inline authoring helper (MCP mirror).
* Server-side footnote canonicalizer (MCP mirror — PURE).
*
* `canonicalizeFootnotes(doc)` is a pure ProseMirror-JSON port of the editor's
* `footnoteSyncPlugin` end-state, identical in behaviour to
@@ -8,7 +8,13 @@
* `docmost-schema.ts` nodes are mirrored: the MCP package is deliberately
* decoupled from the browser/React-heavy editor barrel and operates on plain
* JSON. The editor-ext copy owns the golden test against the live plugin; this
* copy must stay behaviourally identical.
* copy must stay behaviourally identical (a SHARED golden corpus, exercised by
* both test suites, pins that — see `test/unit/footnote-corpus.mjs`).
*
* This module is the pure MIRROR only. The inline-authoring helpers
* (`footnoteContentKey`, `makeFootnoteDefinition`, `generateFootnoteId`) used by
* `insertInlineFootnote` live in the sibling `footnote-authoring.ts`, so this
* file is compositionally symmetric to the editor-ext copy.
*
* Why it exists: every NON-editor write path (markdown import, update_page_json,
* docmost_transform, insert_footnote) builds ProseMirror JSON directly, so the
@@ -26,32 +32,6 @@ function cloneJson(v) {
return structuredClone(v);
return JSON.parse(JSON.stringify(v));
}
/**
* Deterministic unique id for the k-th (k >= 2) duplicate of an id during
* collision resolution. Pure function of (originalId, occurrence, taken) — no
* Math.random/Date.now — mirroring editor-ext's `deriveFootnoteId`. Kept local
* (the importer's first-wins de-dup means duplicates are rare here, but the
* canonicalizer must still resolve them deterministically).
*/
export function deriveFootnoteId(originalId, occurrence, taken) {
let candidate = `${originalId}__${occurrence}`;
let n = 0;
while (taken.has(candidate)) {
n += 1;
candidate = `${originalId}__${occurrence}${suffix(n)}`;
}
return candidate;
}
function suffix(n) {
let out = "";
let x = n;
while (x > 0) {
const rem = (x - 1) % 25;
out = String.fromCharCode(98 + rem) + out; // 98 = 'b'
x = Math.floor((x - 1) / 25);
}
return out;
}
function isEmptyParagraph(node) {
return (!!node &&
node.type === "paragraph" &&
@@ -89,6 +69,41 @@ function emptyDefinition(id) {
content: [{ type: "paragraph" }],
};
}
/**
* Deep equality over plain JSON (objects/arrays/primitives). Object key order is
* ignored; array element order is significant.
* Used to detect an already-canonical footnotesList so its physical position is
* preserved (placement parity with the live plugin).
*/
function deepEqualJson(a, b) {
if (a === b)
return true;
if (a == null || b == null || typeof a !== typeof b)
return false;
if (Array.isArray(a) || Array.isArray(b)) {
if (!Array.isArray(a) || !Array.isArray(b) || a.length !== b.length) {
return false;
}
for (let i = 0; i < a.length; i++) {
if (!deepEqualJson(a[i], b[i]))
return false;
}
return true;
}
if (typeof a === "object") {
const ka = Object.keys(a);
const kb = Object.keys(b);
if (ka.length !== kb.length)
return false;
for (const k of ka) {
if (!Object.prototype.hasOwnProperty.call(b, k))
return false;
if (!deepEqualJson(a[k], b[k]))
return false;
}
return true;
}
return false;
}
/**
* Canonicalize footnotes in a ProseMirror-JSON document. See the file header and
* the editor-ext twin for the full contract. Pure (deep-clones input,
@@ -101,52 +116,57 @@ export function canonicalizeFootnotes(doc) {
return doc;
}
const out = cloneJson(doc);
// 1) Distinct reference ids in document order (deep — refs can live in
// callouts, tables, list items, ...). The ordering/numbering truth.
const referenceIds = [];
collectReferenceIds(out, referenceIds, new Set());
// 2) Every definition node in document order (deep).
const defNodes = [];
collectDefinitions(out, defNodes);
const taken = new Set(referenceIds);
// 3) First definition per id wins; later duplicates carry the SAME id, so they
// cannot be referenced separately and would be orphans — they are dropped.
const defById = new Map();
for (const d of defNodes) {
const id = d?.attrs?.id;
if (id)
taken.add(id);
}
const occurrenceOf = new Map();
const seenDefIds = new Set();
const defByFinalId = new Map();
for (const d of defNodes) {
const origId = d?.attrs?.id;
if (!origId)
continue;
if (!seenDefIds.has(origId)) {
seenDefIds.add(origId);
defByFinalId.set(origId, d);
}
else {
const next = (occurrenceOf.get(origId) ?? 1) + 1;
occurrenceOf.set(origId, next);
const newId = deriveFootnoteId(origId, next, taken);
taken.add(newId);
defByFinalId.set(newId, d);
}
if (id && !defById.has(id))
defById.set(id, d);
}
// 4) Build the ordered definition list: one per referenced id, in REFERENCE
// order, reusing the existing node (shallow-copied, id normalized — `out` is
// already deep-cloned and the old lists are cut) or synthesizing an empty
// one. Definitions whose id is not referenced are orphans and never added.
const orderedDefs = [];
for (const id of referenceIds) {
const existing = defByFinalId.get(id);
const existing = defById.get(id);
if (existing) {
const node = cloneJson(existing);
node.attrs = { ...(node.attrs ?? {}), id };
orderedDefs.push(node);
orderedDefs.push({
...existing,
attrs: { ...(existing.attrs ?? {}), id },
});
}
else {
orderedDefs.push(emptyDefinition(id));
}
}
const top = out.content.filter((n) => !(n && n.type === FOOTNOTES_LIST_NAME));
// 5) No references -> there must be NO list at all.
if (referenceIds.length === 0) {
out.content = top;
out.content = out.content.filter((n) => !(n && n.type === FOOTNOTES_LIST_NAME));
return out;
}
// 6) Placement parity with the live plugin: when the document is ALREADY in the
// canonical single-list state, leave that list exactly where it sits rather
// than cutting and re-inserting it at the end (the plugin never repositions a
// sole correct list, so moving it would silently reorder any content that
// follows the list on the first write).
const topLevelLists = out.content.filter((n) => n && n.type === FOOTNOTES_LIST_NAME);
if (topLevelLists.length === 1 &&
defNodes.length === orderedDefs.length &&
deepEqualJson(topLevelLists[0].content, orderedDefs)) {
return out;
}
// 7) Otherwise rebuild: strip every footnotesList and re-insert exactly one
// after the last meaningful (non-empty paragraph) block.
const top = out.content.filter((n) => !(n && n.type === FOOTNOTES_LIST_NAME));
let insertAt = top.length;
while (insertAt > 0 && isEmptyParagraph(top[insertAt - 1]))
insertAt--;
@@ -154,73 +174,3 @@ export function canonicalizeFootnotes(doc) {
out.content = top;
return out;
}
/**
* Normalized content key for de-duplicating footnote DEFINITIONS by their text.
*
* Two definitions with the same key are the SAME footnote — so the inline
* authoring tool reuses one id (one number, one definition, several references)
* instead of minting a second definition. Key = plaintext (whitespace-collapsed,
* trimmed) PLUS a signature of the inline mark types in order, so two notes that
* read the same but differ in formatting (one bold, one plain) are NOT merged.
* Conservative: only an exact match merges.
*/
export function footnoteContentKey(defNode) {
const parts = [];
const visit = (n) => {
if (!n || typeof n !== "object")
return;
if (n.type === "text" && typeof n.text === "string") {
const marks = Array.isArray(n.marks)
? n.marks.map((m) => m?.type).filter(Boolean).sort().join(",")
: "";
parts.push(`${n.text}${marks}`);
}
if (Array.isArray(n.content))
for (const c of n.content)
visit(c);
};
visit(defNode);
// Collapse the assembled text's whitespace and trim, keeping the mark
// signature attached so formatting differences still distinguish notes.
return parts
.join("")
.replace(/[ \t\r\n]+/g, " ")
.trim();
}
/**
* Build a footnoteDefinition node from inline ProseMirror nodes, keyed by id.
*/
export function makeFootnoteDefinition(id, inlineNodes) {
const content = Array.isArray(inlineNodes) ? cloneJson(inlineNodes) : [];
return {
type: FOOTNOTE_DEFINITION_NAME,
attrs: { id },
content: [{ type: "paragraph", content }],
};
}
/**
* Generate a uuidv7-style id (time-ordered), matching editor-ext's
* `generateFootnoteId`. Used for a genuinely-new inline footnote id.
*/
export function generateFootnoteId() {
const now = Date.now();
const timeHex = now.toString(16).padStart(12, "0");
const rand = (length) => {
let s = "";
for (let i = 0; i < length; i++)
s += Math.floor(Math.random() * 16).toString(16);
return s;
};
const versioned = "7" + rand(3);
const variantNibble = (8 + Math.floor(Math.random() * 4)).toString(16);
const variant = variantNibble + rand(3);
return (timeHex.slice(0, 8) +
"-" +
timeHex.slice(8, 12) +
"-" +
versioned +
"-" +
variant +
"-" +
rand(12));
}


@@ -14,7 +14,8 @@
* - `marks` arrays are preserved verbatim when fragments are split/reordered.
*/
import { blockPlainText } from "./node-ops.js";
import { canonicalizeFootnotes, footnoteContentKey, makeFootnoteDefinition, generateFootnoteId, } from "./footnote-canonicalize.js";
import { canonicalizeFootnotes } from "./footnote-canonicalize.js";
import { footnoteContentKey, makeFootnoteDefinition, generateFootnoteId, } from "./footnote-authoring.js";
export { canonicalizeFootnotes } from "./footnote-canonicalize.js";
/** Deep-clone a JSON-serializable value without mutating the original. */
function clone(value) {
@@ -85,6 +86,19 @@ export function getList(doc, predicate) {
* false when the anchor text was not found in any in-scope block.
*/
export function insertMarkerAfter(doc, anchor, marker, opts = {}) {
// A plain marker is a leading-space-padded unmarked text run.
return insertNodesAfterAnchor(doc, anchor, () => [{ type: "text", text: " " + marker }], opts);
}
/**
* Mark-safe insertion CORE: split the inline text run that holds the END of
* `anchor` (preserving the surrounding marks) and splice the nodes produced by
* `makeMiddle()` in at the split point. `insertMarkerAfter` (plain text marker)
* and `insertInlineFootnote` (a `footnoteReference` node) are both thin callers —
* the only difference is WHAT is inserted (a space-padded text run vs. a node
* that should hug the preceding word), which is exactly what `makeMiddle`
* decides. Operates on a clone; returns `{ doc, inserted }`.
*/
function insertNodesAfterAnchor(doc, anchor, makeMiddle, opts = {}) {
const out = clone(doc);
if (!isObject(out) || !Array.isArray(out.content) || !anchor) {
return { doc: out, inserted: false };
@@ -138,8 +152,9 @@ export function insertMarkerAfter(doc, anchor, marker, opts = {}) {
if (before.length > 0) {
parts.push({ ...n, text: before, marks: [...marks] });
}
// Marker is a PLAIN run: no marks copied. Leading space separates it.
parts.push({ type: "text", text: " " + marker });
// The inserted nodes are caller-decided (a space-padded marker run,
// or a node that hugs the word). They carry no copied marks.
parts.push(...makeMiddle());
if (after.length > 0) {
parts.push({ ...n, text: after, marks: [...marks] });
}
@@ -473,8 +488,6 @@ export function commentsToFootnotes(doc, comments, opts = {}) {
const synced = setCalloutRange(working, definitions.length);
return { doc: synced.doc, consumed };
}
/** A NUL-delimited sentinel that cannot occur in real prose. */
const INLINE_FOOTNOTE_SENTINEL = "\u0000IFN\u0000";
/**
* AUTHOR-INLINE footnote insertion. The caller supplies WHERE (anchorText) and
* WHAT (markdown text); numbering and the bottom list are derived server-side by
@@ -488,10 +501,10 @@ const INLINE_FOOTNOTE_SENTINEL = "\u0000IFN\u0000";
* minted and a new definition added. Conservative — only an exact content match
* merges.
*
* Mechanics: the marker is inserted with the same mark-safe `insertMarkerAfter`
* split used elsewhere, via a sentinel that is then replaced by a real
* `footnoteReference` node (dropping the inserted leading space so the marker
* attaches to the preceding word). The whole document is then canonicalized.
* Mechanics: the `footnoteReference` node is inserted DIRECTLY at the anchor via
* the same mark-safe split as `insertMarkerAfter` (the shared
* `insertNodesAfterAnchor` core), so it hugs the preceding word with no text
* sentinel round-trip. The whole document is then canonicalized.
*
* Operates on a clone of `doc`. When the anchor is not found, returns the input
* unchanged with `inserted:false`.
@@ -518,14 +531,13 @@ export function insertInlineFootnote(doc, opts) {
}
if (footnoteId == null)
footnoteId = generateFootnoteId();
// Insert a sentinel marker after the anchor (mark-safe split).
const r = insertMarkerAfter(doc, (opts.anchorText ?? "").trimEnd(), INLINE_FOOTNOTE_SENTINEL);
// Insert the footnoteReference node directly after the anchor (mark-safe
// split); it hugs the preceding word with no leading space.
const r = insertNodesAfterAnchor(doc, (opts.anchorText ?? "").trimEnd(), () => [{ type: "footnoteReference", attrs: { id: footnoteId } }]);
if (!r.inserted) {
return { doc: clone(doc), inserted: false, footnoteId, reused };
}
let working = r.doc;
// Replace the sentinel run with a real footnoteReference node.
replaceSentinelWithReference(working, footnoteId);
// Add a NEW definition (canonicalize will order/place it); a reused id needs
// no new definition (the existing one is shared).
if (!reused) {
@@ -535,48 +547,6 @@ export function insertInlineFootnote(doc, opts) {
working = canonicalizeFootnotes(working);
return { doc: working, inserted: true, footnoteId, reused };
}
/**
* Replace the lone sentinel text run (created by insertMarkerAfter as
* `" " + sentinel`) with a footnoteReference node, dropping the leading space so
* the marker attaches to the preceding word. Mutates `doc` in place.
*/
function replaceSentinelWithReference(doc, footnoteId) {
let done = false;
const visit = (container) => {
if (done || !isObject(container) || !Array.isArray(container.content))
return;
const arr = container.content;
for (let i = 0; i < arr.length; i++) {
const n = arr[i];
if (isObject(n) &&
n.type === "text" &&
typeof n.text === "string" &&
n.text.includes(INLINE_FOOTNOTE_SENTINEL)) {
const idx = n.text.indexOf(INLINE_FOOTNOTE_SENTINEL);
// Text before the sentinel, with a single trailing space (the one
// insertMarkerAfter prepended) stripped so the ref hugs the word.
const before = n.text.slice(0, idx).replace(/ $/, "");
const after = n.text.slice(idx + INLINE_FOOTNOTE_SENTINEL.length);
const marks = Array.isArray(n.marks) ? n.marks : [];
const parts = [];
if (before.length > 0)
parts.push({ ...n, text: before, marks: [...marks] });
parts.push({ type: "footnoteReference", attrs: { id: footnoteId } });
if (after.length > 0)
parts.push({ ...n, text: after, marks: [...marks] });
arr.splice(i, 1, ...parts);
done = true;
return;
}
}
for (const child of arr) {
visit(child);
if (done)
return;
}
};
visit(doc);
}
/**
* Append a definition node so the canonicalizer can order/place it: into the
* first existing footnotesList, or a new trailing list when none exists.

@@ -912,7 +912,8 @@ server.registerTool(
"parenthesized function). It receives a clone of the live doc and " +
"ctx (comments, log, consume(id), helpers: blockText/walk/getList/" +
"insertMarkerAfter/setCalloutRange/noteItem/mdToInlineNodes/" +
"commentsToFootnotes) and must return a {type:'doc'} node.",
"commentsToFootnotes/canonicalizeFootnotes/insertInlineFootnote) " +
"and must return a {type:'doc'} node.",
),
dryRun: z
.boolean()

@@ -393,7 +393,16 @@ function extractFootnotes(markdown: string): {
};
}
/** Convert markdown to a ProseMirror doc using the full Docmost schema. */
/**
* Convert markdown to a ProseMirror doc using the full Docmost schema.
*
* NOTE: besides the page-import write paths, this is also reused for comment
* bodies (createComment / updateComment). For an ordinary comment the
* canonicalize call below is a no-op (a comment carries no footnotes), so the
* reuse is safe; the only theoretical effect is if footnote markup were ever
* authored INSIDE a comment — a narrow case where canonicalizing the comment's
* own (self-contained) footnotes is still the correct behaviour.
*/
export async function markdownToProseMirror(
markdownContent: string,
): Promise<any> {

@@ -0,0 +1,91 @@
/**
* Inline-authoring helpers for footnotes (MCP).
*
* These build/identify footnote DEFINITION nodes for the author-inline tool
* (`insertInlineFootnote` in transforms.ts): a content key to de-duplicate notes
* by text, a definition-node factory, and a fresh uuidv7-style id generator.
*
* Split out of `footnote-canonicalize.ts` so that module stays a pure MIRROR of
* the editor-ext canonicalizer (compositionally symmetric to the editor-ext
* copy, which keeps its authoring helpers in `footnote-util.ts`). The pure
* canonicalizer has no dependency on these.
*/
const FOOTNOTE_DEFINITION_NAME = "footnoteDefinition";
function cloneJson<T>(v: T): T {
if (typeof structuredClone === "function") return structuredClone(v);
return JSON.parse(JSON.stringify(v)) as T;
}
/**
* Normalized content key for de-duplicating footnote DEFINITIONS by their text.
*
* Two definitions with the same key are the SAME footnote — so the inline
* authoring tool reuses one id (one number, one definition, several references)
* instead of minting a second definition. Key = each text run's text followed by
* its mark types (sorted, comma-joined), concatenated in document order and then
* whitespace-collapsed and trimmed, so two notes that read the same but differ
* in formatting (one bold, one plain) are NOT merged.
* Conservative: only an exact match merges.
*/
export function footnoteContentKey(defNode: any): string {
const parts: string[] = [];
const visit = (n: any): void => {
if (!n || typeof n !== "object") return;
if (n.type === "text" && typeof n.text === "string") {
const marks = Array.isArray(n.marks)
? n.marks.map((m: any) => m?.type).filter(Boolean).sort().join(",")
: "";
parts.push(`${n.text}${marks}`);
}
if (Array.isArray(n.content)) for (const c of n.content) visit(c);
};
visit(defNode);
// Collapse the assembled text's whitespace and trim, keeping the mark
// signature attached so formatting differences still distinguish notes.
return parts
.join("")
.replace(/[ \t\r\n]+/g, " ")
.trim();
}
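
The reuse rule described above (same content key means same footnote id) can be sketched as follows. This is a minimal, self-contained illustration and not the module's actual code: `PMNode`, `contentKey` and `findReusableId` are hypothetical names, and the key logic is a simplified copy of the function above.

```typescript
// Sketch only: PMNode, contentKey and findReusableId are illustrative names.
interface PMNode {
  type: string;
  text?: string;
  attrs?: { id?: string };
  marks?: { type: string }[];
  content?: PMNode[];
}

// Text of each run followed by its sorted mark types, then whitespace-collapsed.
function contentKey(node: PMNode): string {
  const parts: string[] = [];
  const visit = (n: PMNode): void => {
    if (n.type === "text" && typeof n.text === "string") {
      const marks = (n.marks ?? []).map((m) => m.type).sort().join(",");
      parts.push(`${n.text}${marks}`);
    }
    for (const c of n.content ?? []) visit(c);
  };
  visit(node);
  return parts.join("").replace(/[ \t\r\n]+/g, " ").trim();
}

// Return the id of an existing definition with an identical key, else null.
function findReusableId(defs: PMNode[], candidate: PMNode): string | null {
  const key = contentKey(candidate);
  for (const d of defs) {
    const id = d.attrs?.id;
    if (id && contentKey(d) === key) return id;
  }
  return null;
}

// Helper that builds a one-paragraph definition, optionally bold.
const def = (id: string, text: string, bold = false): PMNode => ({
  type: "footnoteDefinition",
  attrs: { id },
  content: [
    {
      type: "paragraph",
      content: [bold ? { type: "text", text, marks: [{ type: "bold" }] } : { type: "text", text }],
    },
  ],
});

const existing = [def("a", "See RFC 9562.")];
// Extra whitespace collapses away, so this matches definition "a".
const sameText = findReusableId(existing, def("new", "See   RFC 9562. "));
// Same words but bold: the mark signature differs, so nothing is reused.
const boldText = findReusableId(existing, def("new", "See RFC 9562.", true));
```

Under these assumptions `sameText` resolves to the existing id and `boldText` to `null`, which is the conservative "only an exact match merges" behaviour the comment describes.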
/**
* Build a footnoteDefinition node from inline ProseMirror nodes, keyed by id.
*/
export function makeFootnoteDefinition(id: string, inlineNodes: any[]): any {
const content = Array.isArray(inlineNodes) ? cloneJson(inlineNodes) : [];
return {
type: FOOTNOTE_DEFINITION_NAME,
attrs: { id },
content: [{ type: "paragraph", content }],
};
}
/**
* Generate a uuidv7-style id (time-ordered), matching editor-ext's
* `generateFootnoteId`. Used for a genuinely-new inline footnote id.
*/
export function generateFootnoteId(): string {
const now = Date.now();
const timeHex = now.toString(16).padStart(12, "0");
const rand = (length: number) => {
let s = "";
for (let i = 0; i < length; i++)
s += Math.floor(Math.random() * 16).toString(16);
return s;
};
const versioned = "7" + rand(3);
const variantNibble = (8 + Math.floor(Math.random() * 4)).toString(16);
const variant = variantNibble + rand(3);
return (
timeHex.slice(0, 8) +
"-" +
timeHex.slice(8, 12) +
"-" +
versioned +
"-" +
variant +
"-" +
rand(12)
);
}
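
To make the id layout above easier to inspect, here is a deterministic sketch of the same 8-4-4-4-12 shape with the clock and the randomness injected. `uuidV7Like` and its parameters are hypothetical (the real generator takes no arguments), and the variant nibble is fixed at `8` here, whereas the function above draws it from `8`..`b`.

```typescript
// Sketch only: uuidV7Like is an illustrative, injectable variant of the layout.
function uuidV7Like(nowMs: number, randHex: (length: number) => string): string {
  const timeHex = nowMs.toString(16).padStart(12, "0"); // 48-bit millisecond timestamp
  const versioned = "7" + randHex(3); // version nibble 7 + 12 random bits
  const variant = "8" + randHex(3); // fixed variant nibble for determinism
  return [timeHex.slice(0, 8), timeHex.slice(8, 12), versioned, variant, randHex(12)].join("-");
}

// A stand-in "random" source that always yields zeros.
const zeros = (length: number): string => "0".repeat(length);

const sample = uuidV7Like(0x0123456789ab, zeros);
// sample === "01234567-89ab-7000-8000-000000000000"

// Shape check that also accepts ids with real random digits.
const UUID_V7_SHAPE = /^[0-9a-f]{8}-[0-9a-f]{4}-7[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/;
const shapeOk = UUID_V7_SHAPE.test(sample);

// Because the timestamp leads, a later time sorts after an earlier one.
const timeOrdered = uuidV7Like(1, zeros) < uuidV7Like(2, zeros);
```

The leading timestamp is what makes the ids time-ordered: ids minted in different milliseconds compare lexicographically in creation order, while ids from the same millisecond are ordered only by their random tail.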

@@ -1,5 +1,5 @@
/**
* Server-side footnote canonicalizer + inline authoring helper (MCP mirror).
* Server-side footnote canonicalizer (MCP mirror — PURE).
*
* `canonicalizeFootnotes(doc)` is a pure ProseMirror-JSON port of the editor's
* `footnoteSyncPlugin` end-state, identical in behaviour to
@@ -8,7 +8,13 @@
* `docmost-schema.ts` nodes are mirrored: the MCP package is deliberately
* decoupled from the browser/React-heavy editor barrel and operates on plain
* JSON. The editor-ext copy owns the golden test against the live plugin; this
* copy must stay behaviourally identical.
* copy must stay behaviourally identical (a SHARED golden corpus, exercised by
* both test suites, pins that — see `test/unit/footnote-corpus.mjs`).
*
* This module is the pure MIRROR only. The inline-authoring helpers
* (`footnoteContentKey`, `makeFootnoteDefinition`, `generateFootnoteId`) used by
* `insertInlineFootnote` live in the sibling `footnote-authoring.ts`, so this
* file is compositionally symmetric to the editor-ext copy.
*
* Why it exists: every NON-editor write path (markdown import, update_page_json,
* docmost_transform, insert_footnote) builds ProseMirror JSON directly, so the
@@ -28,38 +34,6 @@ function cloneJson<T>(v: T): T {
return JSON.parse(JSON.stringify(v)) as T;
}
/**
* Deterministic unique id for the k-th (k >= 2) duplicate of an id during
* collision resolution. Pure function of (originalId, occurrence, taken) — no
* Math.random/Date.now — mirroring editor-ext's `deriveFootnoteId`. Kept local
* (the importer's first-wins de-dup means duplicates are rare here, but the
* canonicalizer must still resolve them deterministically).
*/
export function deriveFootnoteId(
originalId: string,
occurrence: number,
taken: Set<string> | ReadonlySet<string>,
): string {
let candidate = `${originalId}__${occurrence}`;
let n = 0;
while (taken.has(candidate)) {
n += 1;
candidate = `${originalId}__${occurrence}${suffix(n)}`;
}
return candidate;
}
function suffix(n: number): string {
let out = "";
let x = n;
while (x > 0) {
const rem = (x - 1) % 25;
out = String.fromCharCode(98 + rem) + out; // 98 = 'b'
x = Math.floor((x - 1) / 25);
}
return out;
}
function isEmptyParagraph(node: any): boolean {
return (
!!node &&
@@ -98,6 +72,36 @@ function emptyDefinition(id: string): any {
};
}
/**
* Deep equality over plain JSON (objects/arrays/primitives). Object KEY order is
* ignored; array element order is significant.
* Used to detect an already-canonical footnotesList so its physical position is
* preserved (placement parity with the live plugin).
*/
function deepEqualJson(a: any, b: any): boolean {
if (a === b) return true;
if (a == null || b == null || typeof a !== typeof b) return false;
if (Array.isArray(a) || Array.isArray(b)) {
if (!Array.isArray(a) || !Array.isArray(b) || a.length !== b.length) {
return false;
}
for (let i = 0; i < a.length; i++) {
if (!deepEqualJson(a[i], b[i])) return false;
}
return true;
}
if (typeof a === "object") {
const ka = Object.keys(a);
const kb = Object.keys(b);
if (ka.length !== kb.length) return false;
for (const k of ka) {
if (!Object.prototype.hasOwnProperty.call(b, k)) return false;
if (!deepEqualJson(a[k], b[k])) return false;
}
return true;
}
return false;
}
/**
* Canonicalize footnotes in a ProseMirror-JSON document. See the file header and
* the editor-ext twin for the full contract. Pure (deep-clones input,
@@ -113,131 +117,72 @@ export function canonicalizeFootnotes<T = any>(doc: T): T {
}
const out = cloneJson(doc) as any;
// 1) Distinct reference ids in document order (deep — refs can live in
// callouts, tables, list items, ...). The ordering/numbering truth.
const referenceIds: string[] = [];
collectReferenceIds(out, referenceIds, new Set<string>());
// 2) Every definition node in document order (deep).
const defNodes: any[] = [];
collectDefinitions(out, defNodes);
const taken = new Set<string>(referenceIds);
// 3) First definition per id wins; later duplicates carry the SAME id, so they
// cannot be referenced separately and would be orphans — they are dropped.
const defById = new Map<string, any>();
for (const d of defNodes) {
const id = d?.attrs?.id;
if (id) taken.add(id);
}
const occurrenceOf = new Map<string, number>();
const seenDefIds = new Set<string>();
const defByFinalId = new Map<string, any>();
for (const d of defNodes) {
const origId = d?.attrs?.id;
if (!origId) continue;
if (!seenDefIds.has(origId)) {
seenDefIds.add(origId);
defByFinalId.set(origId, d);
} else {
const next = (occurrenceOf.get(origId) ?? 1) + 1;
occurrenceOf.set(origId, next);
const newId = deriveFootnoteId(origId, next, taken);
taken.add(newId);
defByFinalId.set(newId, d);
}
if (id && !defById.has(id)) defById.set(id, d);
}
// 4) Build the ordered definition list: one per referenced id, in REFERENCE
// order, reusing the existing node (shallow-copied, id normalized — `out` is
// already deep-cloned and the old lists are cut) or synthesizing an empty
// one. Definitions whose id is not referenced are orphans and never added.
const orderedDefs: any[] = [];
for (const id of referenceIds) {
const existing = defByFinalId.get(id);
const existing = defById.get(id);
if (existing) {
const node = cloneJson(existing);
node.attrs = { ...(node.attrs ?? {}), id };
orderedDefs.push(node);
orderedDefs.push({
...existing,
attrs: { ...(existing.attrs ?? {}), id },
});
} else {
orderedDefs.push(emptyDefinition(id));
}
}
const top: any[] = out.content.filter(
(n: any) => !(n && n.type === FOOTNOTES_LIST_NAME),
);
// 5) No references -> there must be NO list at all.
if (referenceIds.length === 0) {
out.content = top;
out.content = out.content.filter(
(n: any) => !(n && n.type === FOOTNOTES_LIST_NAME),
);
return out;
}
// 6) Placement parity with the live plugin: when the document is ALREADY in the
// canonical single-list state, leave that list exactly where it sits rather
// than cutting and re-inserting it at the end (the plugin never repositions a
// sole correct list, so moving it would silently reorder any content that
// follows the list on the first write).
const topLevelLists = out.content.filter(
(n: any) => n && n.type === FOOTNOTES_LIST_NAME,
);
if (
topLevelLists.length === 1 &&
defNodes.length === orderedDefs.length &&
deepEqualJson(topLevelLists[0].content, orderedDefs)
) {
return out;
}
// 7) Otherwise rebuild: strip every footnotesList and re-insert exactly one
// after the last meaningful (non-empty paragraph) block.
const top: any[] = out.content.filter(
(n: any) => !(n && n.type === FOOTNOTES_LIST_NAME),
);
let insertAt = top.length;
while (insertAt > 0 && isEmptyParagraph(top[insertAt - 1])) insertAt--;
top.splice(insertAt, 0, { type: FOOTNOTES_LIST_NAME, content: orderedDefs });
out.content = top;
return out;
}
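
Steps 1 to 7 above can be traced on small inputs with the following condensed sketch. It is an illustration under stated assumptions, not the shipped canonicalizer: `canonicalizeSketch` and its helpers are hypothetical names, the empty-definition shape is a guess, and the "already canonical" check uses `JSON.stringify` (key-order sensitive) in place of `deepEqualJson`.

```typescript
// Sketch only: a condensed, hypothetical version of the steps above.
interface PMNode {
  type: string;
  attrs?: { id?: string };
  text?: string;
  content?: PMNode[];
}

const LIST = "footnotesList";
const DEF = "footnoteDefinition";
const REF = "footnoteReference";

// Deep, document-order collection of every node of one type.
function collect(node: PMNode, type: string, out: PMNode[]): void {
  if (node.type === type) out.push(node);
  for (const child of node.content ?? []) collect(child, type, out);
}

function isEmptyParagraph(n: PMNode | undefined): boolean {
  return !!n && n.type === "paragraph" && (n.content ?? []).length === 0;
}

function canonicalizeSketch(doc: PMNode): PMNode {
  const out = JSON.parse(JSON.stringify(doc)) as PMNode; // pure: work on a clone
  const refs: PMNode[] = [];
  collect(out, REF, refs);
  const refIds: string[] = []; // 1) distinct reference ids, document order
  for (const r of refs) {
    const id = r.attrs?.id;
    if (id && !refIds.includes(id)) refIds.push(id);
  }
  const defs: PMNode[] = []; // 2) every definition, document order
  collect(out, DEF, defs);
  const defById = new Map<string, PMNode>(); // 3) first definition per id wins
  for (const d of defs) {
    const id = d.attrs?.id;
    if (id && !defById.has(id)) defById.set(id, d);
  }
  // 4) one definition per referenced id, in reference order.
  //    ASSUMPTION: the synthesized empty definition's shape is a guess.
  const ordered: PMNode[] = refIds.map(
    (id) => defById.get(id) ?? { type: DEF, attrs: { id }, content: [{ type: "paragraph" }] },
  );
  const top = out.content ?? [];
  const lists = top.filter((n) => n.type === LIST);
  const rest = top.filter((n) => n.type !== LIST);
  if (refIds.length === 0) { // 5) no references: no list at all
    out.content = rest;
    return out;
  }
  if ( // 6) a sole, already-correct list is left where it sits
    lists.length === 1 &&
    defs.length === ordered.length &&
    JSON.stringify(lists[0]?.content) === JSON.stringify(ordered)
  ) {
    return out;
  }
  let insertAt = rest.length; // 7) rebuild after the last non-empty block
  while (insertAt > 0 && isEmptyParagraph(rest[insertAt - 1])) insertAt--;
  rest.splice(insertAt, 0, { type: LIST, content: ordered });
  out.content = rest;
  return out;
}

const ref = (id: string): PMNode => ({ type: REF, attrs: { id } });
const txt = (text: string): PMNode => ({ type: "text", text });
const para = (...content: PMNode[]): PMNode => ({ type: "paragraph", content });
const def = (id: string, text: string): PMNode => ({ type: DEF, attrs: { id }, content: [para(txt(text))] });

// Already canonical, with content AFTER the list: the list keeps its position.
const kept = canonicalizeSketch({
  type: "doc",
  content: [para(txt("x"), ref("a")), { type: LIST, content: [def("a", "A")] }, para(txt("after"))],
});

// Wrong order plus an orphan "z": rebuilt in reference order, orphan dropped.
const rebuilt = canonicalizeSketch({
  type: "doc",
  content: [
    para(txt("x"), ref("b"), ref("a")),
    { type: LIST, content: [def("a", "A"), def("z", "Z"), def("b", "B")] },
  ],
});
```

In the first document the list stays between the two paragraphs, which is the placement-parity case; in the second the list is rebuilt with definitions `b`, `a`, and running the sketch again on its own output changes nothing.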
/**
* Normalized content key for de-duplicating footnote DEFINITIONS by their text.
*
* Two definitions with the same key are the SAME footnote — so the inline
* authoring tool reuses one id (one number, one definition, several references)
* instead of minting a second definition. Key = plaintext (whitespace-collapsed,
* trimmed) PLUS a signature of the inline mark types in order, so two notes that
* read the same but differ in formatting (one bold, one plain) are NOT merged.
* Conservative: only an exact match merges.
*/
export function footnoteContentKey(defNode: any): string {
const parts: string[] = [];
const visit = (n: any): void => {
if (!n || typeof n !== "object") return;
if (n.type === "text" && typeof n.text === "string") {
const marks = Array.isArray(n.marks)
? n.marks.map((m: any) => m?.type).filter(Boolean).sort().join(",")
: "";
parts.push(`${n.text}${marks}`);
}
if (Array.isArray(n.content)) for (const c of n.content) visit(c);
};
visit(defNode);
// Collapse the assembled text's whitespace and trim, keeping the mark
// signature attached so formatting differences still distinguish notes.
return parts
.join("")
.replace(/[ \t\r\n]+/g, " ")
.trim();
}
/**
* Build a footnoteDefinition node from inline ProseMirror nodes, keyed by id.
*/
export function makeFootnoteDefinition(id: string, inlineNodes: any[]): any {
const content = Array.isArray(inlineNodes) ? cloneJson(inlineNodes) : [];
return {
type: FOOTNOTE_DEFINITION_NAME,
attrs: { id },
content: [{ type: "paragraph", content }],
};
}
/**
* Generate a uuidv7-style id (time-ordered), matching editor-ext's
* `generateFootnoteId`. Used for a genuinely-new inline footnote id.
*/
export function generateFootnoteId(): string {
const now = Date.now();
const timeHex = now.toString(16).padStart(12, "0");
const rand = (length: number) => {
let s = "";
for (let i = 0; i < length; i++)
s += Math.floor(Math.random() * 16).toString(16);
return s;
};
const versioned = "7" + rand(3);
const variantNibble = (8 + Math.floor(Math.random() * 4)).toString(16);
const variant = variantNibble + rand(3);
return (
timeHex.slice(0, 8) +
"-" +
timeHex.slice(8, 12) +
"-" +
versioned +
"-" +
variant +
"-" +
rand(12)
);
}

@@ -15,12 +15,12 @@
*/
import { blockPlainText } from "./node-ops.js";
import { canonicalizeFootnotes } from "./footnote-canonicalize.js";
import {
canonicalizeFootnotes,
footnoteContentKey,
makeFootnoteDefinition,
generateFootnoteId,
} from "./footnote-canonicalize.js";
} from "./footnote-authoring.js";
export { canonicalizeFootnotes } from "./footnote-canonicalize.js";
@@ -113,6 +113,30 @@ export function insertMarkerAfter(
anchor: string,
marker: string,
opts: InsertMarkerOptions = {},
): { doc: any; inserted: boolean } {
// A plain marker is a leading-space-padded unmarked text run.
return insertNodesAfterAnchor(
doc,
anchor,
() => [{ type: "text", text: " " + marker }],
opts,
);
}
/**
* Mark-safe insertion CORE: split the inline text run that holds the END of
* `anchor` (preserving the surrounding marks) and splice the nodes produced by
* `makeMiddle()` in at the split point. `insertMarkerAfter` (plain text marker)
* and `insertInlineFootnote` (a `footnoteReference` node) are both thin callers —
* the only difference is WHAT is inserted (a space-padded text run vs. a node
* that should hug the preceding word), which is exactly what `makeMiddle`
* decides. Operates on a clone; returns `{ doc, inserted }`.
*/
function insertNodesAfterAnchor(
doc: any,
anchor: string,
makeMiddle: () => any[],
opts: InsertMarkerOptions = {},
): { doc: any; inserted: boolean } {
const out = clone(doc);
if (!isObject(out) || !Array.isArray(out.content) || !anchor) {
@@ -174,8 +198,9 @@ export function insertMarkerAfter(
if (before.length > 0) {
parts.push({ ...n, text: before, marks: [...marks] });
}
// Marker is a PLAIN run: no marks copied. Leading space separates it.
parts.push({ type: "text", text: " " + marker });
// The inserted nodes are caller-decided (a space-padded marker run,
// or a node that hugs the word). They carry no copied marks.
parts.push(...makeMiddle());
if (after.length > 0) {
parts.push({ ...n, text: after, marks: [...marks] });
}
@@ -587,9 +612,6 @@ export interface InsertInlineFootnoteResult {
reused: boolean;
}
/** A NUL-delimited sentinel that cannot occur in real prose. */
const INLINE_FOOTNOTE_SENTINEL = "\u0000IFN\u0000";
/**
* AUTHOR-INLINE footnote insertion. The caller supplies WHERE (anchorText) and
* WHAT (markdown text); numbering and the bottom list are derived server-side by
@@ -603,10 +625,10 @@ const INLINE_FOOTNOTE_SENTINEL = "\u0000IFN\u0000";
* minted and a new definition added. Conservative — only an exact content match
* merges.
*
* Mechanics: the marker is inserted with the same mark-safe `insertMarkerAfter`
* split used elsewhere, via a sentinel that is then replaced by a real
* `footnoteReference` node (dropping the inserted leading space so the marker
* attaches to the preceding word). The whole document is then canonicalized.
* Mechanics: the `footnoteReference` node is inserted DIRECTLY at the anchor via
* the same mark-safe split as `insertMarkerAfter` (the shared
* `insertNodesAfterAnchor` core), so it hugs the preceding word with no text
* sentinel round-trip. The whole document is then canonicalized.
*
* Operates on a clone of `doc`. When the anchor is not found, returns the input
* unchanged with `inserted:false`.
@@ -639,16 +661,18 @@ export function insertInlineFootnote(
}
if (footnoteId == null) footnoteId = generateFootnoteId();
// Insert a sentinel marker after the anchor (mark-safe split).
const r = insertMarkerAfter(doc, (opts.anchorText ?? "").trimEnd(), INLINE_FOOTNOTE_SENTINEL);
// Insert the footnoteReference node directly after the anchor (mark-safe
// split); it hugs the preceding word with no leading space.
const r = insertNodesAfterAnchor(
doc,
(opts.anchorText ?? "").trimEnd(),
() => [{ type: "footnoteReference", attrs: { id: footnoteId } }],
);
if (!r.inserted) {
return { doc: clone(doc), inserted: false, footnoteId, reused };
}
let working = r.doc;
// Replace the sentinel run with a real footnoteReference node.
replaceSentinelWithReference(working, footnoteId);
// Add a NEW definition (canonicalize will order/place it); a reused id needs
// no new definition (the existing one is shared).
if (!reused) {
@@ -660,47 +684,6 @@ export function insertInlineFootnote(
return { doc: working, inserted: true, footnoteId, reused };
}
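
The mark-safe split that both `insertMarkerAfter` and `insertInlineFootnote` now share can be sketched on a single run of inline nodes. This is a simplified illustration, assuming the anchor lies entirely inside one text run; `insertAfterAnchor` and the `Inline` type are hypothetical, and the real core walks the whole document and honours extra options.

```typescript
// Sketch only: a single-paragraph, hypothetical version of the shared core.
interface Inline {
  type: string;
  text?: string;
  marks?: { type: string }[];
  attrs?: Record<string, unknown>;
}

function insertAfterAnchor(
  inlines: Inline[],
  anchor: string,
  makeMiddle: () => Inline[],
): { content: Inline[]; inserted: boolean } {
  if (!anchor) return { content: [...inlines], inserted: false };
  for (let i = 0; i < inlines.length; i++) {
    const n = inlines[i];
    if (!n || n.type !== "text" || typeof n.text !== "string") continue;
    const idx = n.text.indexOf(anchor);
    if (idx < 0) continue;
    const cut = idx + anchor.length; // split right after the END of the anchor
    const before = n.text.slice(0, cut);
    const after = n.text.slice(cut);
    const marks = n.marks ?? [];
    const parts: Inline[] = [];
    // Both halves keep the original run's marks; the middle is caller-decided.
    if (before.length > 0) parts.push({ ...n, text: before, marks: [...marks] });
    parts.push(...makeMiddle());
    if (after.length > 0) parts.push({ ...n, text: after, marks: [...marks] });
    return {
      content: [...inlines.slice(0, i), ...parts, ...inlines.slice(i + 1)],
      inserted: true,
    };
  }
  return { content: [...inlines], inserted: false };
}

const run: Inline[] = [{ type: "text", text: "Hello world again", marks: [{ type: "bold" }] }];

// A footnote reference node hugs the preceding word (no leading space).
const asRef = insertAfterAnchor(run, "world", () => [
  { type: "footnoteReference", attrs: { id: "fn1" } },
]);

// A plain marker is a space-padded, unmarked text run.
const asMarker = insertAfterAnchor(run, "world", () => [{ type: "text", text: " [1]" }]);

// An anchor that does not occur leaves the content untouched.
const missing = insertAfterAnchor(run, "nowhere", () => [{ type: "text", text: "!" }]);
```

The only difference between the two callers is the `makeMiddle` callback, which is why the sentinel round-trip became unnecessary: the reference node is produced at the split point directly instead of being substituted for placeholder text afterwards.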
/**
* Replace the lone sentinel text run (created by insertMarkerAfter as
* `" " + sentinel`) with a footnoteReference node, dropping the leading space so
* the marker attaches to the preceding word. Mutates `doc` in place.
*/
function replaceSentinelWithReference(doc: any, footnoteId: string): void {
let done = false;
const visit = (container: any): void => {
if (done || !isObject(container) || !Array.isArray(container.content)) return;
const arr = container.content;
for (let i = 0; i < arr.length; i++) {
const n = arr[i];
if (
isObject(n) &&
n.type === "text" &&
typeof n.text === "string" &&
n.text.includes(INLINE_FOOTNOTE_SENTINEL)
) {
const idx = n.text.indexOf(INLINE_FOOTNOTE_SENTINEL);
// Text before the sentinel, with a single trailing space (the one
// insertMarkerAfter prepended) stripped so the ref hugs the word.
const before = n.text.slice(0, idx).replace(/ $/, "");
const after = n.text.slice(idx + INLINE_FOOTNOTE_SENTINEL.length);
const marks = Array.isArray(n.marks) ? n.marks : [];
const parts: any[] = [];
if (before.length > 0) parts.push({ ...n, text: before, marks: [...marks] });
parts.push({ type: "footnoteReference", attrs: { id: footnoteId } });
if (after.length > 0) parts.push({ ...n, text: after, marks: [...marks] });
arr.splice(i, 1, ...parts);
done = true;
return;
}
}
for (const child of arr) {
visit(child);
if (done) return;
}
};
visit(doc);
}
/**
* Append a definition node so the canonicalizer can order/place it: into the
* first existing footnotesList, or a new trailing list when none exists.

@@ -0,0 +1,152 @@
// Mock-HTTP orchestration tests for the footnote WRITE wrappers on DocmostClient
// (issue #228):
// - insertFootnote (#11): the required-argument guards reject BEFORE any write,
// and never touch the collab/mutate path.
// - transformPage / docmost_transform (#13): the auto-canonicalize step
// (`result = canonicalizeFootnotes(raw)`) runs after every transform, so a
// transform that introduces an orphan footnote definition is silently tidied
// away — observable as an EMPTY diff in a dryRun preview.
//
// These stand a local http.createServer in for Docmost and only exercise plain
// HTTP routes (login / comments / pages.info), deliberately avoiding the live
// Hocuspocus collab WebSocket: the insertFootnote guards short-circuit before it,
// and docmost_transform's dryRun preview never opens it. The full collab mutate
// path (abort-via-throw on a missing anchor, the reused/message response branch)
// is covered at the pure level by insertInlineFootnote in
// test/unit/footnote-canonicalize.test.mjs.
import { test, after } from "node:test";
import assert from "node:assert/strict";
import http from "node:http";
import { DocmostClient } from "../../build/client.js";
function readBody(req) {
return new Promise((resolve) => {
let raw = "";
req.on("data", (c) => (raw += c));
req.on("end", () => resolve(raw));
});
}
function startServer(handler) {
return new Promise((resolve) => {
const server = http.createServer(handler);
server.listen(0, "127.0.0.1", () => {
const { port } = server.address();
resolve({ server, baseURL: `http://127.0.0.1:${port}/api` });
});
});
}
function sendJson(res, status, obj, extraHeaders = {}) {
res.writeHead(status, { "Content-Type": "application/json", ...extraHeaders });
res.end(JSON.stringify(obj));
}
const openServers = [];
async function spawn(handler) {
const { server, baseURL } = await startServer(handler);
openServers.push(server);
return { baseURL };
}
after(async () => {
await Promise.all(openServers.map((s) => new Promise((r) => s.close(r))));
});
const ref = (id) => ({ type: "footnoteReference", attrs: { id } });
const def = (id, text) => ({
type: "footnoteDefinition",
attrs: { id },
content: [{ type: "paragraph", content: [{ type: "text", text }] }],
});
// ---------------------------------------------------------------------------
// #11 insertFootnote guards: missing anchorText / text reject and never write.
// ---------------------------------------------------------------------------
test("insertFootnote rejects a missing anchorText before any write", async () => {
const otherRoutes = [];
const { baseURL } = await spawn(async (req, res) => {
await readBody(req);
if (req.url === "/api/auth/login") {
return sendJson(res, 200, { success: true }, {
"Set-Cookie": "authToken=t; Path=/; HttpOnly",
});
}
otherRoutes.push(req.url);
sendJson(res, 404, { message: "not found" });
});
const client = new DocmostClient(baseURL, "user@example.com", "pw");
await assert.rejects(
() => client.insertFootnote("page-1", " ", "a note"),
/anchorText is required/i,
);
assert.deepEqual(otherRoutes, [], "must not hit any write route");
});
test("insertFootnote rejects an empty text before any write", async () => {
const otherRoutes = [];
const { baseURL } = await spawn(async (req, res) => {
await readBody(req);
if (req.url === "/api/auth/login") {
return sendJson(res, 200, { success: true }, {
"Set-Cookie": "authToken=t; Path=/; HttpOnly",
});
}
otherRoutes.push(req.url);
sendJson(res, 404, { message: "not found" });
});
const client = new DocmostClient(baseURL, "user@example.com", "pw");
await assert.rejects(
() => client.insertFootnote("page-1", "anchor", " "),
/text is required/i,
);
assert.deepEqual(otherRoutes, [], "must not hit any write route");
});
// ---------------------------------------------------------------------------
// #13 docmost_transform auto-canonicalization: a transform that adds an orphan
// footnote definition produces NO net change (the canonicalizer drops it), so a
// dryRun preview reports an empty diff. Without the auto-canonicalize step the
// orphan would survive and the diff would be non-empty.
// ---------------------------------------------------------------------------
test("transformPage dryRun auto-canonicalizes footnotes (orphan def is dropped)", async () => {
// A page already in canonical footnote state (refs b,a; defs b,a).
const pageContent = {
type: "doc",
content: [
{ type: "paragraph", content: [{ type: "text", text: "x" }, ref("b"), ref("a")] },
{ type: "footnotesList", content: [def("b", "B"), def("a", "A")] },
],
};
const { baseURL } = await spawn(async (req, res) => {
await readBody(req);
if (req.url === "/api/auth/login") {
return sendJson(res, 200, { success: true }, {
"Set-Cookie": "authToken=t; Path=/; HttpOnly",
});
}
if (req.url === "/api/comments") {
return sendJson(res, 200, { data: { items: [], meta: { nextCursor: null } } });
}
if (req.url === "/api/pages/info") {
return sendJson(res, 200, {
data: { id: "page-1", slugId: "s", title: "P", spaceId: "sp", content: pageContent },
});
}
sendJson(res, 404, { message: "not found" });
});
const client = new DocmostClient(baseURL, "user@example.com", "pw");
// The transform appends an ORPHAN definition (id "z", no matching reference).
const transformJs = `(doc) => {
const list = doc.content.find((n) => n.type === "footnotesList");
list.content.push({
type: "footnoteDefinition",
attrs: { id: "z" },
content: [{ type: "paragraph", content: [{ type: "text", text: "orphan" }] }],
});
return doc;
}`;
const result = await client.transformPage("page-1", transformJs, { dryRun: true });
assert.equal(result.pushed, false);
// Auto-canonicalize dropped the orphan, so the doc is unchanged => empty diff.
assert.equal(result.diff.summary.inserted, 0, "orphan def must be canonicalized away");
assert.equal(result.diff.summary.deleted, 0);
});

@@ -1,10 +1,8 @@
import { test } from "node:test";
import assert from "node:assert/strict";
import {
canonicalizeFootnotes,
footnoteContentKey,
} from "../../build/lib/footnote-canonicalize.js";
import { canonicalizeFootnotes } from "../../build/lib/footnote-canonicalize.js";
import { footnoteContentKey } from "../../build/lib/footnote-authoring.js";
import { insertInlineFootnote } from "../../build/lib/transforms.js";
import { markdownToProseMirror } from "../../build/lib/collaboration.js";

File diff suppressed because it is too large.

@@ -0,0 +1,19 @@
// Runs the MCP mirror of `canonicalizeFootnotes` against the SHARED golden
// corpus (the same { input -> expected } cases the editor-ext copy is tested
// against in footnote-canonicalize.test.ts). Pinning identical expected outputs
// in both suites makes "the editor-ext copy and the MCP mirror behave
// identically" a checkable property without coupling the two packages
// (architecture item A). The corpus data is mirrored in footnote-corpus.mjs.
import { test } from "node:test";
import assert from "node:assert/strict";
import { canonicalizeFootnotes } from "../../build/lib/footnote-canonicalize.js";
import { FOOTNOTE_CORPUS } from "./footnote-corpus.mjs";
for (const { name, input, expected } of FOOTNOTE_CORPUS) {
test(`shared corpus (MCP mirror): ${name}`, () => {
assert.deepEqual(canonicalizeFootnotes(input), expected);
// Idempotent on the corpus too.
assert.deepEqual(canonicalizeFootnotes(expected), expected);
});
}
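
The pattern used by this suite (one corpus of `{ name, input, expected }` cases, checked for both the expected output and idempotence) can be expressed as a small generic runner. This is a hedged sketch: `CorpusCase` and `checkCorpus` are hypothetical names, the toy normalizer stands in for `canonicalizeFootnotes`, and comparison is by `JSON.stringify` rather than `assert.deepEqual`.

```typescript
// Sketch only: a generic corpus runner; names are illustrative.
interface CorpusCase<T> {
  name: string;
  input: T;
  expected: T;
}

// Returns a list of failure messages; an empty list means every case passed.
function checkCorpus<T>(fn: (x: T) => T, corpus: CorpusCase<T>[]): string[] {
  const failures: string[] = [];
  const same = (a: T, b: T): boolean => JSON.stringify(a) === JSON.stringify(b);
  for (const { name, input, expected } of corpus) {
    if (!same(fn(input), expected)) failures.push(`${name}: output differs`);
    // A canonical form must be a fixed point of the function.
    if (!same(fn(expected), expected)) failures.push(`${name}: not idempotent`);
  }
  return failures;
}

// Toy normalizer standing in for the canonicalizer: dedupe and sort numbers.
const dedupeSorted = (xs: number[]): number[] => Array.from(new Set(xs)).sort((a, b) => a - b);

const toyCorpus: CorpusCase<number[]>[] = [
  { name: "duplicates", input: [3, 1, 3], expected: [1, 3] },
  { name: "already canonical", input: [1, 2], expected: [1, 2] },
];

// Two independent implementations can be run against the SAME corpus.
const passing = checkCorpus(dedupeSorted, toyCorpus);
const failing = checkCorpus((xs: number[]) => xs, toyCorpus); // identity is wrong for "duplicates"
```

Running two separately maintained implementations through the same case list is what turns "the copies behave identically" into something a test can check, without either package importing the other.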