fix(footnotes): don't canonicalize comment bodies (data loss); canonicalize only page write paths (#228)

Must-fix (REAL DATA LOSS):
- markdownToProseMirror is reused for COMMENT bodies (createComment/updateComment).
  It unconditionally canonicalized, so a comment carrying a standalone footnote
  definition ([^1]: text with no matching reference) had its whole footnotesList
  stripped (referenceIds.length===0 -> stripFootnotesListsDeep) — the text
  vanished. Fix: markdownToProseMirror no longer canonicalizes (content-preserving
  primitive); a new markdownToProseMirrorCanonical wraps it for the PAGE write
  paths (markdown import via importPageMarkdown, update_page markdown via
  updatePageContentRealtime). Comment callers keep the non-canonicalizing
  primitive. Updated the now-false header comment and added create/update-comment
  inline notes. Added collaboration tests: comment path PRESERVES a reference-less
  definition; page path still drops it AND still reorders real footnotes. Updated
  the page-import canonicalization test to use the canonical variant.

Suggestions / architecture:
- #2: collapsed transforms.footnoteDefinition onto the shared
  makeFootnoteDefinition factory (adds only the inner paragraph block id); kept
  the dependency direction transforms -> footnote-authoring (no circular import,
  mirror stays pure).
- #3: confirmed docmost_transform auto-canonicalization is documented (inline
  comment, tool description, CHANGELOG) — no code change.
- #4: copyPageContent is a FULL-document write (replacePageContent of a
  type:"doc"); added a defensive canonicalizeFootnotes pass (no-op on
  already-canonical source).
- CHANGELOG entry refined to list the FULL-document write paths (incl.
  copy_page_content) and to state canonicalization is NOT applied to comment
  bodies.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
a
2026-06-27 22:17:15 +03:00
parent a77a0bc92b
commit 3fd66b4245
9 changed files with 158 additions and 58 deletions

View File

@@ -49,12 +49,15 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
edit the list, or desync, and a same-content note reuses one definition. Under
the hood, the editor's footnote-integrity invariant (one trailing list,
numbering by first reference, no orphans/duplicates, no raw `[^id]`) is now
enforced as a pure `canonicalizeFootnotes(doc)` on the write paths that bypass
the editor's plugins: server markdown/HTML import, `PageService` create and
full-document (`replace`) updates, the client markdown paste, and the MCP
`markdownToProseMirror` / `update_page_json` / `docmost_transform` /
`insert_footnote` paths. It is idempotent (a no-op once canonical) and is
deliberately NOT applied to append/prepend fragments. (#228)
enforced as a pure `canonicalizeFootnotes(doc)` on the FULL-document write paths
that bypass the editor's plugins: server markdown/HTML import, `PageService`
create and full-document (`replace`) updates, the client markdown paste, and the
MCP markdown page-import / `update_page` (markdown) / `update_page_json` /
`docmost_transform` / `insert_footnote` / `copy_page_content` paths. It is
idempotent (a no-op once canonical) and is deliberately NOT applied to
append/prepend fragments, nor to COMMENT bodies — a comment may legitimately
contain a standalone footnote definition, which canonicalization would drop.
(#228)
### Fixed