feat(mcp): serve embedded community MCP server at /mcp
Replace the removed enterprise EE MCP (private apps/server/src/ee submodule,
license-gated /mcp route) with our docmost-mcp, vendored as an isolated ESM
workspace package and served by the server over HTTP — no enterprise license.
Backend:
- Add packages/mcp (@docmost/mcp): vendored docmost-mcp refactored into a
side-effect-free createDocmostMcpServer() factory (38 tools preserved),
stdio entry kept in stdio.ts, Streamable-HTTP session manager in http.ts.
- Add apps/server McpModule: @Post/@Get/@Delete('mcp') (served at /mcp via the
existing global-prefix exclude), @SkipTransform + reply.hijack to bridge raw
Fastify req/res into the SDK transport. The module dynamically imports the
ESM-only package from CommonJS via a Function-indirected import resolved with
require.resolve + file:// URL. Gated by the workspace ai.mcp toggle, a
service-account (MCP_DOCMOST_EMAIL/PASSWORD/API_URL) and optional MCP_TOKEN;
per-session idle eviction (MCP_SESSION_IDLE_MS).
- Drop the enterprise license check on mcpEnabled in workspace.service.
- Dockerfile: copy packages/mcp into the production image.
- .env.example: document MCP_DOCMOST_*, MCP_TOKEN, MCP_SESSION_IDLE_MS.
Frontend:
- Recreate the community "AI & MCP" workspace-settings panel (mcp-settings.tsx):
admin-only toggle on settings.ai.mcp with optimistic update, copyable
${APP_URL}/mcp URL; wired into workspace-settings page. Reuses existing i18n.
Fixes:
- Pin packages/mcp tiptap deps to 3.20.4 (matching the client) and inline
getStyleProperty, preventing a duplicate @tiptap/core@3.26.1 from leaking into
the client editor via pnpm shamefully-hoist (was breaking apps/client tsc).
357
packages/mcp/README.md
Normal file
@@ -0,0 +1,357 @@
# Docmost MCP Server

**English** · [Русский](README.ru.md)

A Model Context Protocol (MCP) server for [Docmost](https://docmost.com/) that lets
AI agents **read, search, write, restructure, review, version, comment on, illustrate
and publish** documentation — safely, against a live instance, without an enterprise
license.

> **Written by an agent, for agents.** A human edits a document with their eyes and hands:
> they read it, click into the editor, and retype. An agent works differently — it is far
> better at *writing a small function that fixes the text* than at re-reading and
> re-emitting a whole document. So this server is built around the way a model actually
> wants to edit: address a block by id, run a find/replace, or hand the server a
> `(doc, ctx) => doc` transform and let the model *program* the change. `docmost_transform`
> is that interface. Other Docmost MCPs are human-shaped — they expose "open the page" and
> "replace the page"; this one exposes the editing primitives a model is good at.

It exposes **38 tools** built around three ideas that the other Docmost MCPs do not
combine:

1. **Surgical, token-cheap edits.** Address a single block by id and patch it, or run
   a find/replace, instead of round-tripping a whole ~100 KB document through the model.
2. **Safe live writes.** Every mutation goes through Docmost's real-time collaboration
   layer (the same WebSocket the web editor uses), serialized per page, so it never
   clobbers a concurrent human edit and is confirmed persisted before the tool returns.
3. **A real safety net.** Version history, a Docmost-equivalent diff, a one-call
   restore, and a dry-run preview for scripted rewrites — so an agent can edit
   boldly and you can always see and undo what it did.
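
To make the first idea concrete, here is a minimal sketch of what "address a single block by id and patch it" amounts to on a ProseMirror JSON tree. The `PMNode` type and the `patchById` helper are hypothetical illustrations, not this package's API; the only detail taken from this README is that blocks carry an `attrs.id`.

```typescript
// Hypothetical minimal shape of a ProseMirror JSON node (illustration only).
type PMNode = {
  type: string;
  attrs?: { id?: string; level?: number };
  content?: PMNode[];
  text?: string;
};

// Return a copy of `node` in which the block whose attrs.id equals `id`
// is replaced by `replacement`. The input tree is never mutated.
function patchById(node: PMNode, id: string, replacement: PMNode): PMNode {
  if (node.attrs?.id === id) return replacement;
  if (!node.content) return node;
  return {
    ...node,
    content: node.content.map((child) => patchById(child, id, replacement)),
  };
}

const doc: PMNode = {
  type: "doc",
  content: [
    { type: "heading", attrs: { id: "h1", level: 1 }, content: [{ type: "text", text: "Intro" }] },
    { type: "paragraph", attrs: { id: "p1" }, content: [{ type: "text", text: "Old wording." }] },
  ],
};

// Only the small replacement block has to be produced by the model,
// not the whole document.
const patched = patchById(doc, "p1", {
  type: "paragraph",
  attrs: { id: "p1" },
  content: [{ type: "text", text: "New wording." }],
});
```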

---

## Why this server (vs. the alternatives)

There are several Docmost MCPs. Here is a capability-by-capability comparison.
"Official" is Docmost's built-in MCP; the others are the community projects on GitHub.

| Capability | **This server** | Official (built-in) | MrMartiniMo/docmost-mcp | cyborgx0x/mcp-docmost | aleksvin8888 / isak-landin |
| --- | :---: | :---: | :---: | :---: | :---: |
| **Enterprise license required** | **No** | **Yes** | No | No | No |
| Authentication | email + password, **auto re-auth** | API key | email + password | cookie `authToken` (copy from DevTools) | Docmost API / **direct PostgreSQL** |
| Read page as Markdown | ✅ | ✅ | ✅ | ✅ | ✅ (read-only) |
| **Lossless Markdown round-trip** (export / import, keeps comment anchors) | ✅ | — | — | — | — |
| Read **lossless ProseMirror JSON** (with block ids) | ✅ | — | — | — | — |
| **Compact page outline** (cheap block-id lookup) | ✅ | — | — | — | — |
| **Fetch a single block** (by id or index) | ✅ | — | — | — | — |
| Create / move / delete pages | ✅ | ✅ | ✅ | ✅ | — |
| **Per-block edits** (patch/insert/delete by id) | ✅ | — | — | — | — |
| **Surgical find/replace** (structure-preserving) | ✅ | — | — | — | — |
| **Scripted JS transform** (sandboxed, dry-run diff) | ✅ | — | — | — | — |
| **Structured table editing** (row / cell CRUD) | ✅ | — | — | — | — |
| Page **version history** | ✅ | — | — | ✅ | — |
| **Diff two versions** | ✅ | — | — | — | — |
| **Restore a version** (revertible) | ✅ | — | — | — | — |
| **Comments** (CRUD + inline anchoring) | ✅ | — | — | ✅ | — |
| **Poll for new comments** since a timestamp | ✅ | — | — | — | — |
| **Images** (insert / replace) | ✅ | — | — | — | — |
| **Public share links** (create / revoke / list) | ✅ | — | — | — | — |
| Export to HTML / PDF | — | — | — | ✅ | — |
| **Safe real-time-collab writes** (no clobber, confirmed) | ✅ | n/a | ✅ | — | n/a (read-only) |

### What that means in practice

- **No enterprise tax.** Docmost's official MCP is an enterprise feature: it needs an
  active enterprise license. This server is MIT and talks to *any* self-hosted Docmost
  over the standard API + collaboration socket, with nothing but an account email and
  password.

- **Token-efficient editing.** Most Docmost MCPs (and the official one) only offer
  "replace the whole page" writes — the agent must download the entire document, mutate
  it, and upload it back, paying for the full document **twice** on every tiny fix.
  This server lets the agent change exactly one block (`patch_node` / `insert_node` /
  `delete_node`), do a structure-preserving find/replace (`edit_page_text`), or copy a
  whole page server-side (`copy_page_content`) — **without the document ever passing
  through the model**.

- **Writes that don't fight the editor.** Naive REST writes race with whatever a human
  is typing and can silently overwrite their edits, or fail against Docmost's debounced
  save. This server applies every change through the live collaboration document
  (Hocuspocus/Yjs), reading and writing **synchronously inside one sync tick** so no
  concurrent edit can interleave, serializing writes **per page** with a mutex, and
  **waiting for the server to acknowledge persistence** before returning. If the socket
  drops mid-write, the tool errors instead of falsely reporting success.

- **Agent-native editing model.** Human-facing servers expose "open the page" and "replace
  the page", because that mirrors how a person works. A model edits better by *programming*
  the change — addressing blocks by id, running a find/replace, or supplying a
  `(doc, ctx) => doc` transform (`docmost_transform`, with a dry-run diff before it
  commits). This server is shaped around that, which is why it has editing primitives the
  others simply don't.

- **An editing safety net the others lack.** `list_page_history` → `diff_page_versions`
  → `restore_page_version` give an agent (and you) a full view-and-undo loop. The diff
  uses the *same* `recreateTransform → ChangeSet → simplifyChanges` pipeline Docmost's
  own history viewer uses, so what you see matches the product.

- **Convenience over cookie-scraping.** Some community servers authenticate by making
  you copy a session cookie out of your browser's DevTools (it expires), or by reaching
  **directly into the PostgreSQL database**. This server logs in with credentials and
  **transparently re-authenticates on a 401/403** (with in-flight de-duplication), so
  long-running agents don't die when a token expires. It also respects Docmost's own
  access control, because it goes through the API and the collaboration server like a
  normal user.
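
The re-authentication behaviour in the last bullet can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual client: `AuthSession`, `Login` and `ApiRequest` are hypothetical names, and the real HTTP calls are replaced by functions supplied by the caller.

```typescript
// Hypothetical stand-ins for the real network calls (assumptions for this sketch).
type Login = () => Promise<string>; // resolves to a fresh token
type ApiRequest<T> = (token: string) => Promise<{ status: number; body: T }>;

class AuthSession {
  private login: Login;
  private token: string | null = null;
  private inFlight: Promise<string> | null = null;

  constructor(login: Login) {
    this.login = login;
  }

  // All concurrent callers share one pending login instead of starting their own.
  private refresh(): Promise<string> {
    if (!this.inFlight) {
      this.inFlight = this.login()
        .then((token) => {
          this.token = token;
          return token;
        })
        .finally(() => {
          this.inFlight = null;
        });
    }
    return this.inFlight;
  }

  // Run a request; on a 401/403 log in again once and retry the request.
  async call<T>(request: ApiRequest<T>): Promise<T> {
    const token = this.token ?? (await this.refresh());
    let res = await request(token);
    if (res.status === 401 || res.status === 403) {
      res = await request(await this.refresh());
    }
    if (res.status >= 400) {
      throw new Error(`request failed with status ${res.status}`);
    }
    return res.body;
  }
}
```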

---

## Tools

All 38 tools, grouped by what you'd reach for them.

### Exploration & retrieval

- **`get_workspace`** — Information about the current Docmost workspace.
- **`list_spaces`** — All spaces in the workspace.
- **`list_pages`** — Recent pages in a space, ordered by `updatedAt` desc (default 50,
  max 100). Use `search` for lookups in large spaces.
- **`search`** — Full-text search across pages and content (bounded by `limit`, max 100).
- **`get_page`** — A page's content as clean **Markdown** (convenient, but a *lossy*
  view — block ids and exact table/callout structure are approximated).
- **`get_page_json`** — A page's **lossless ProseMirror/TipTap JSON**, including every
  block's `attrs.id` and the `slugId` used in URLs. This is what the per-block editing
  tools consume.
- **`get_outline`** — A compact outline of a page's top-level blocks
  (`{index, type, id, level, firstText}`; tables add row/column counts and their
  header-cell texts, lists add item counts) **without** the document body. The cheap way
  to locate a section or table and grab its block id before `get_node` / `patch_node` /
  `insert_node`.
- **`get_node`** — Fetch a single block's full ProseMirror subtree (lossless) without
  pulling the whole page. Address it by a block id (from `get_outline` / `get_page_json`),
  or by `#<index>` for a top-level block — use the `#<index>` form for tables/rows/cells,
  which carry no id.
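
As a rough illustration of why an outline is so much cheaper than the full page, the sketch below derives `{index, type, id, level, firstText}` entries from a ProseMirror JSON document. It is a simplified stand-in, not the server's implementation: the `PMNode` type, the `outline` function and the 40-character cut-off are assumptions, and the table and list counts mentioned above are left out.

```typescript
// Hypothetical minimal shape of a ProseMirror JSON node (illustration only).
type PMNode = {
  type: string;
  attrs?: { id?: string; level?: number };
  content?: PMNode[];
  text?: string;
};

type OutlineEntry = {
  index: number;
  type: string;
  id: string | null;
  level: number | null;
  firstText: string;
};

// Concatenate the text of a node's subtree, depth-first.
function textOf(node: PMNode): string {
  if (node.text !== undefined) return node.text;
  return (node.content ?? []).map(textOf).join("");
}

// One entry per top-level block; the body is reduced to a short text preview.
function outline(doc: PMNode, maxLen = 40): OutlineEntry[] {
  return (doc.content ?? []).map((block, index) => ({
    index,
    type: block.type,
    id: block.attrs?.id ?? null,
    level: block.attrs?.level ?? null,
    firstText: textOf(block).slice(0, maxLen),
  }));
}
```

An agent would scan such entries for the heading or table it needs, then pass the `id` (or `#<index>` when there is no id) to `get_node` or `patch_node`.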

### Page lifecycle

- **`create_page`** — Create a page from Markdown and place it in the hierarchy (optional
  `parentPageId`) in one call. Uses Docmost's import API for clean Markdown→ProseMirror.
- **`rename_page`** — Change a page's title only, without touching or resending content.
- **`move_page`** — Re-parent a page (nest it, or move to root); supports fractional-index
  positioning. Returns only on a *positively confirmed* success.
- **`delete_page`** — Delete a single page.
- **`copy_page_content`** — Replace one page's body with a copy of another's, **entirely
  server-side** — the document never passes through the model. The target keeps its own
  title and slug (so its URL is preserved).

### Editing

- **`edit_page_text`** — Surgical find/replace inside a page's text. Preserves **all**
  structure: block ids, marks, links, callouts, tables. The preferred tool for fixing
  wording, typos, numbers and names.
- **`patch_node`** — Replace a single block addressed by its `attrs.id` (from
  `get_page_json`), without resending the document.
- **`insert_node`** — Insert a block before/after another (by `attrs.id` or anchor text),
  or append at the end.
- **`delete_node`** — Remove a single block by its `attrs.id`.
- **`update_page_json`** — Replace a page's entire content with a ProseMirror document
  (bulk rewrites, or when nodes lack ids). `content` is optional — omit it to update only
  the title. Keeps the block ids you pass in, so heading anchors and history stay stable.
- **`docmost_transform`** — The agent-native editing interface: instead of retyping a
  document, the agent **writes a function that fixes it**. Edit a page by running an
  arbitrary **`(doc, ctx) => doc` JavaScript transform** against its *live* ProseMirror
  document. Runs **sandboxed** (no `require`/`process`/`fs`/network, 5 s timeout).
  **Dry-run by default**: returns a diff preview without writing; set `dryRun:false` to
  apply atomically. `ctx` exposes the page's comments and a toolbox of helpers (`walk`,
  `getList`, `blockText`, `insertMarkerAfter`, `setCalloutRange`, `commentsToFootnotes`, …)
  for multi-step, coordinated rewrites such as renumbering, or turning inline comments
  into numbered footnotes.
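
To give a feel for the `(doc, ctx) => doc` shape, here is a small renumbering transform. It is only a sketch: the signatures of the real `ctx` helpers are not documented in this README, so the example defines its own local `walk` and ignores `ctx` entirely. The type annotations are there for readability; the tool itself takes plain JavaScript, so they would be dropped before submitting the function.

```typescript
// Hypothetical minimal shape of a ProseMirror JSON node (illustration only).
type PMNode = {
  type: string;
  attrs?: { id?: string; level?: number };
  content?: PMNode[];
  text?: string;
};

// Local helper for this sketch (not the server's ctx.walk): visit every node depth-first.
function walk(node: PMNode, visit: (n: PMNode) => void): void {
  visit(node);
  for (const child of node.content ?? []) walk(child, visit);
}

// Renumber level-2 headings of the form "N. Title" so they run 1, 2, 3, ...
const renumberHeadings = (doc: PMNode, _ctx: unknown): PMNode => {
  // Work on a deep copy so the input document is left intact.
  const copy: PMNode = JSON.parse(JSON.stringify(doc));
  let n = 0;
  walk(copy, (node) => {
    if (node.type !== "heading" || node.attrs?.level !== 2) return;
    const first = node.content?.[0];
    if (!first || first.text === undefined) return;
    n += 1;
    first.text = `${n}. ${first.text.replace(/^\d+\.\s*/, "")}`;
  });
  return copy;
};
```

Because only text inside existing nodes changes, block ids survive the rewrite. With the default dry run, the tool would return a diff of the heading texts before anything is written.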

### Tables

- **`table_get`** — Read a table as a matrix: `{rows, cols, cells (text[][]), cellIds}`
  (a paragraph id per cell, or `null`). Address the table by `#<index>` (from
  `get_outline`) or any block id inside it. Use `cellIds` with `patch_node` for
  rich-formatted cell edits.
- **`table_insert_row`** — Insert a row of plain-text cells, padded to the table's column
  count (passing more cells than columns is an error). `index` is the 0-based insert
  position (0 inserts before the header); omit it to append at the end.
- **`table_delete_row`** — Delete the row at a 0-based `index`. Refuses to delete a table's
  only row; deleting row 0 promotes the next row to header.
- **`table_update_cell`** — Set the plain-text content of cell `[row, col]` (0-based). For
  rich formatting, `patch_node` the cell's paragraph id from `table_get`.
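
The matrix view and the row-padding rule can be sketched on plain data as below. This is an assumption-laden illustration rather than the server's code: `TableMatrix` omits `cellIds`, and `tableToMatrix`, `padRow` and `insertRow` are hypothetical helper names.

```typescript
// Hypothetical minimal shape of a ProseMirror JSON node (illustration only).
type PMNode = { type: string; content?: PMNode[]; text?: string };

type TableMatrix = { rows: number; cols: number; cells: string[][] };

// Concatenate the text of a node's subtree, depth-first.
function textOf(node: PMNode): string {
  if (node.text !== undefined) return node.text;
  return (node.content ?? []).map(textOf).join("");
}

// Read a table node (rows of cells) as a matrix of plain-text cells.
function tableToMatrix(table: PMNode): TableMatrix {
  const cells = (table.content ?? []).map((row) => (row.content ?? []).map(textOf));
  const cols = cells.reduce((max, row) => Math.max(max, row.length), 0);
  return { rows: cells.length, cols, cells };
}

// Pad a new row with empty cells up to the column count.
// Passing more cells than columns is treated as an error.
function padRow(values: string[], cols: number): string[] {
  if (values.length > cols) {
    throw new Error(`row has ${values.length} cells but the table has ${cols} columns`);
  }
  const row = values.slice();
  while (row.length < cols) row.push("");
  return row;
}

// Insert at a 0-based index (0 lands before the header); omit the index to append.
function insertRow(m: TableMatrix, values: string[], index: number = m.rows): TableMatrix {
  const cells = [...m.cells.slice(0, index), padRow(values, m.cols), ...m.cells.slice(index)];
  return { rows: cells.length, cols: m.cols, cells };
}
```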

### Markdown round-trip

- **`export_page_markdown`** — Export a page to a single self-contained, **lossless
  Docmost-flavoured Markdown** file: a meta header, the body with inline comment anchors
  and diagrams, and a trailing comments-thread block. Built for a download → edit body →
  `import_page_markdown` round-trip that preserves everything, including comment highlights.
- **`import_page_markdown`** — Replace a page's content from a Docmost-flavoured Markdown
  file produced by `export_page_markdown`, restoring comment-highlight anchors and diagrams
  from their inline HTML. (Comment *threads* in the file are not re-created on the server —
  only the page body and inline comment marks are written; manage threads via the comment
  tools/UI.)

### Images

- **`insert_image`** — Upload a local image and insert it in one step: append it, drop it
  in place of a text placeholder (`replaceText`), or put it after a given block
  (`afterText`). Preserves all other block ids.
- **`replace_image`** — Swap an existing image. Uploads the new file as a **fresh
  attachment** (clean URL that renders and busts browser caches), then re-points every
  node referencing the old attachment (recursively, including callouts/tables) via the
  live document, preserving comments, alignment and alt text. (In-place overwrite is
  deliberately avoided — some Docmost versions corrupt the attachment on overwrite.)

### Comments

- **`create_comment`** — Add a page comment, optionally **anchored inline** to an exact
  span of text (the first occurrence is wrapped in a comment mark).
- **`list_comments`** — List a page's comments (content returned as Markdown).
- **`update_comment`** — Edit an existing comment.
- **`delete_comment`** — Delete a comment.
- **`check_new_comments`** — Find comments created after a given ISO-8601 timestamp across
  a space, optionally scoped to a page subtree — ideal for an agent that watches a doc for
  feedback.
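
A watching agent typically keeps a timestamp cursor between polls. The sketch below shows that pattern on plain data; the `CommentRecord` shape and both helper names are hypothetical, and only the "created after an ISO-8601 timestamp" rule comes from the tool description above.

```typescript
// Hypothetical comment shape for this sketch; the real tool returns more fields.
type CommentRecord = { id: string; pageId: string; createdAt: string };

// Keep only comments created strictly after the given ISO-8601 timestamp.
function commentsSince(comments: CommentRecord[], sinceIso: string): CommentRecord[] {
  const since = Date.parse(sinceIso);
  if (isNaN(since)) throw new Error(`not a valid ISO-8601 timestamp: ${sinceIso}`);
  return comments.filter((c) => Date.parse(c.createdAt) > since);
}

// The next poll should start from the newest comment seen so far.
function nextCursor(comments: CommentRecord[], currentIso: string): string {
  let latest = currentIso;
  for (const c of comments) {
    if (Date.parse(c.createdAt) > Date.parse(latest)) latest = c.createdAt;
  }
  return latest;
}
```

Starting each poll from `nextCursor(...)` rather than from the wall clock avoids missing comments that arrive while the previous poll is still being processed.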

### Versioning & history

- **`list_page_history`** — A page's saved versions (Docmost auto-snapshots on save),
  newest first, cursor-paginated. Each item's id is the `historyId`.
- **`diff_page_versions`** — Diff two versions (or a version against the live page).
  Returns inserted/deleted text, integrity counts (images, links, tables, callouts,
  footnote markers), and a human-readable Markdown summary — computed with the same
  pipeline Docmost's own history viewer uses.
- **`restore_page_version`** — Write a saved version back as the current content. Docmost
  has no restore endpoint, so this creates a **new** snapshot — the restore is itself
  revertible.
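
The real diff relies on the `recreateTransform → ChangeSet → simplifyChanges` pipeline, which is not reproduced here. As a much coarser stand-in, the sketch below compares two versions block by block using their ids, which is enough to see which blocks a change touched. The `Block` shape and `coarseBlockDiff` are hypothetical names introduced for this example.

```typescript
// Hypothetical flattened view of a version: one entry per block with an id.
type Block = { id: string; text: string };

type BlockDiff = { inserted: string[]; deleted: string[]; changed: string[] };

// Compare two versions by block id: which ids appeared, vanished, or changed text.
function coarseBlockDiff(before: Block[], after: Block[]): BlockDiff {
  const oldText = new Map<string, string>();
  for (const b of before) oldText.set(b.id, b.text);
  const newIds = new Set<string>();
  for (const b of after) newIds.add(b.id);

  const inserted: string[] = [];
  const changed: string[] = [];
  for (const b of after) {
    if (!oldText.has(b.id)) inserted.push(b.id);
    else if (oldText.get(b.id) !== b.text) changed.push(b.id);
  }
  const deleted = before.filter((b) => !newIds.has(b.id)).map((b) => b.id);
  return { inserted, deleted, changed };
}
```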

### Sharing

- **`share_page`** — Make a page publicly accessible (idempotent) and return its public
  URL (`<app>/share/<key>/p/<slugId>`); optional search-engine indexing.
- **`unshare_page`** — Revoke a page's public share.
- **`list_shares`** — All public shares in the workspace, with titles and public URLs.

---

## Choosing the right editing tool

This same guidance is also delivered at runtime via the MCP server `instructions` field,
so capable clients steer the model automatically.

- **Text fixes** (wording, typos, numbers): `edit_page_text`.
- **One block** (paragraph/heading/callout/table cell): `patch_node` / `insert_node` /
  `delete_node`, addressing the node by its `attrs.id` from `get_page_json`.
- **Images**: `insert_image` / `replace_image`.
- **A new page**: `create_page`.
- **Bulk rewrite, or nodes without ids**: `update_page_json`.
- **Multi-step / scripted rewrite** (renumbering, footnotes, coordinated edits):
  `docmost_transform` — preview with `dryRun`, then apply.
- **Copy a whole page's content from another page** (server-side): `copy_page_content`.
- **Rename a page** (title only): `rename_page`.
- **Reads**: `get_page` (Markdown) / `get_page_json` (lossless ProseMirror with ids).
- **Review changes**: `list_page_history` → `diff_page_versions` → `restore_page_version`.
- **Comments**: `create_comment` (with optional inline anchoring) / `list_comments` /
  `update_comment` / `delete_comment` / `check_new_comments`.
- **Navigate a page cheaply** (find a section/table, grab a block id): `get_outline` →
  `get_node`.
- **Tables** (add/remove a row, set a cell): `table_get` / `table_insert_row` /
  `table_delete_row` / `table_update_cell`.
- **Round-trip a page as Markdown** (download, edit, re-upload losslessly with comments):
  `export_page_markdown` / `import_page_markdown`.

---

## How it works (technical details)

- **Safe real-time-collaboration writes.** Content mutations are applied through Docmost's
  collaboration WebSocket (Hocuspocus + Yjs). The server connects, waits for the initial
  sync so its local doc mirrors the authoritative server doc (including edits not yet in
  the debounced REST snapshot), then **reads → transforms → writes synchronously** in one
  tick so no remote update can interleave, and **waits for persistence acknowledgement**
  before returning.
- **Per-page write serialization.** A per-`pageId` async mutex ensures two MCP writes to
  the same page never overlap; different pages never block each other.
- **Transparent re-authentication.** Login uses email/password; expired tokens are
  refreshed automatically on the first 401/403 (covering JSON, multipart upload, and the
  collaboration-token path), with in-flight login de-duplication so a burst of calls
  triggers a single re-login.
- **Lossless and lossy reads.** `get_page_json` returns the exact ProseMirror tree with
  block ids; `get_page` returns clean Markdown for convenience.
- **Full Docmost schema.** Markdown↔ProseMirror conversion supports callouts (including
  nested), task lists (bullet *and* numbered checklists), tables, math blocks, embeds,
  highlights, sub/superscript and more, with defensive caps against pathological input.
- **Structured tables & lossless Markdown round-trip.** Tables can be edited as a matrix
  (read, insert/delete rows, set cells by `[row,col]`) without resending the document, and
  a page can be exported to and re-imported from a self-contained Docmost-flavoured
  Markdown file that preserves inline comment anchors and diagrams.
- **Token-optimized responses.** API responses are filtered down to the fields agents
  actually need, and large collections (spaces, pages, comments, history) are paginated.
- **Hardened runtime.** Global handlers keep a stray socket error from tearing down the
  stdio server; `move_page` requires a positively confirmed success; the diff engine
  falls back to a coarse block diff rather than hard-failing on a pathological document.
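
The per-page write serialization described in this list can be approximated with a promise chain per `pageId`. This is one common way to build an async mutex, shown as a sketch; `PageLocks` and `runExclusive` are hypothetical names and the package may implement it differently.

```typescript
// One promise chain per pageId: each write waits for the previous write to the same page.
class PageLocks {
  private tails = new Map<string, Promise<void>>();

  runExclusive<T>(pageId: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(pageId) ?? Promise.resolve();
    const result = previous.then(task);
    // Swallow the outcome on the chain itself so a failed write does not block later ones.
    const tail: Promise<void> = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(pageId, tail);
    // Drop the entry once no newer write has been queued behind this one.
    tail.then(() => {
      if (this.tails.get(pageId) === tail) this.tails.delete(pageId);
    });
    return result;
  }
}
```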

---

## Installation

```bash
npm install
npm run build
```

## Configuration

The server requires three environment variables:

- `DOCMOST_API_URL` — full URL to your Docmost API (e.g. `https://docs.example.com/api`).
- `DOCMOST_EMAIL` — account email for authentication.
- `DOCMOST_PASSWORD` — account password.
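
A start-up check for these variables might look like the sketch below. It is an illustration only: `readConfig` and `DocmostConfig` are hypothetical names, and the environment is passed in as a plain object so the example does not depend on Node typings.

```typescript
type DocmostConfig = { apiUrl: string; email: string; password: string };

// Read the three required variables and fail fast with one clear message
// that lists everything that is missing.
function readConfig(env: Record<string, string | undefined>): DocmostConfig {
  const required = ["DOCMOST_API_URL", "DOCMOST_EMAIL", "DOCMOST_PASSWORD"];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`missing environment variables: ${missing.join(", ")}`);
  }
  return {
    apiUrl: env["DOCMOST_API_URL"] as string,
    email: env["DOCMOST_EMAIL"] as string,
    password: env["DOCMOST_PASSWORD"] as string,
  };
}
```

In a Node process this would be called as `readConfig(process.env)` before any connection is attempted.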

## Usage with Claude Desktop / a generic MCP client

Add the server to your MCP configuration (e.g. `claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "docmost-local": {
      "command": "node",
      "args": ["./build/index.js"],
      "env": {
        "DOCMOST_API_URL": "http://localhost:3000/api",
        "DOCMOST_EMAIL": "test@docmost.com",
        "DOCMOST_PASSWORD": "test"
      }
    }
  }
}
```

## Development

```bash
# Watch mode
npm run watch

# Build
npm run build

# Tests (unit + mock; the live end-to-end suite needs a running Docmost)
npm test
npm run test:e2e
```

## Lineage & acknowledgements

This project began as a fork of [MrMartiniMo/docmost-mcp](https://github.com/MrMartiniMo/docmost-mcp)
(by Moritz Krause) and extends it substantially — adding per-block node editing,
surgical text edits, the sandboxed `docmost_transform`, version history / diff / restore,
comments, image insert/replace, public sharing, server-side page copy, dual
JSON/Markdown reads, transparent re-authentication and significant hardening. The comment
tools were ported from upstream PR #3 by Max Nikitin. Thanks to both.

## License

MIT
371
packages/mcp/README.ru.md
Normal file
@@ -0,0 +1,371 @@
# Docmost MCP Server

[English](README.md) · **Русский**

A Model Context Protocol (MCP) server for [Docmost](https://docmost.com/) that lets
AI agents **read, search, write, restructure, review, version, comment on, illustrate
and publish** documentation — safely, on a live instance and without an enterprise
license.

> **Written by an agent, for agents.** A human edits a document with eyes and hands: they
> read it, open the editor, and retype. An agent works differently — it finds it much
> easier to *write a small function that fixes the text* than to re-read and re-emit the
> whole document. So the server is built around how a model actually prefers to edit:
> address a block by id, run a find/replace, or pass a `(doc, ctx) => doc` transform and
> let the model *program* the change. `docmost_transform` is exactly that interface. Other
> Docmost MCPs are "built for humans" — they offer "open the page" and "replace the page";
> this one offers the editing primitives a model is strong at.

The server provides **38 tools** built around three ideas that other Docmost MCPs do not
combine:

1. **Targeted, token-efficient edits.** Address an individual block by id and patch it,
   or run a find/replace, instead of pushing an entire ~100 KB document through the
   model.
2. **Safe writes to the live document.** Every mutation goes through the real-time
   collaboration layer (the same WebSocket the web editor uses) and is serialized per
   page, so it never overwrites a concurrent human edit and is confirmed as saved before
   the tool returns.
3. **A real safety net.** Version history, a Docmost-equivalent diff, a one-call restore,
   and a dry-run preview for scripted edits — so an agent can edit boldly and you can
   always see and roll back what it did.

---

## Why this server (compared with the alternatives)

Several Docmost MCPs exist. Below is a capability-by-capability comparison.
"Official" is Docmost's built-in MCP; the rest are community projects on GitHub.

| Capability | **This server** | Official (built-in) | MrMartiniMo/docmost-mcp | cyborgx0x/mcp-docmost | aleksvin8888 / isak-landin |
| --- | :---: | :---: | :---: | :---: | :---: |
| **Enterprise license required** | **No** | **Yes** | No | No | No |
| Authentication | email + password, **auto re-auth** | API key | email + password | cookie `authToken` (copy from DevTools) | Docmost API / **direct PostgreSQL** |
| Read page as Markdown | ✅ | ✅ | ✅ | ✅ | ✅ (read-only) |
| **Lossless Markdown round-trip** (export/import, keeps comment anchors) | ✅ | — | — | — | — |
| Read **lossless ProseMirror JSON** (with block ids) | ✅ | — | — | — | — |
| **Compact page outline** (cheap block-id lookup) | ✅ | — | — | — | — |
| **Fetch a single block** (by id or index) | ✅ | — | — | — | — |
| Create / move / delete pages | ✅ | ✅ | ✅ | ✅ | — |
| **Per-block edits** (patch/insert/delete by id) | ✅ | — | — | — | — |
| **Surgical find/replace** (structure-preserving) | ✅ | — | — | — | — |
| **Scripted JS transform** (sandbox, dry-run diff) | ✅ | — | — | — | — |
| **Structured table editing** (row/cell CRUD) | ✅ | — | — | — | — |
| Page **version history** | ✅ | — | — | ✅ | — |
| **Diff two versions** | ✅ | — | — | — | — |
| **Restore a version** (revertible) | ✅ | — | — | — | — |
| **Comments** (CRUD + inline anchoring) | ✅ | — | — | ✅ | — |
| **Poll for new comments** since a point in time | ✅ | — | — | — | — |
| **Images** (insert / replace) | ✅ | — | — | — | — |
| **Public links** (create / revoke / list) | ✅ | — | — | — | — |
| Export to HTML / PDF | — | — | — | ✅ | — |
| **Safe writes via real-time collab** (no clobber, confirmed) | ✅ | n/a | ✅ | — | n/a (read-only) |

### What this gives you in practice

- **No enterprise tax.** Docmost's official MCP is an enterprise feature: it requires an
  active enterprise license. This server is MIT-licensed and works with *any* self-hosted
  Docmost through the standard API + collaboration socket, with nothing but an account
  email and password.

- **Token savings when editing.** Most Docmost MCPs (including the official one) offer
  only "replace the whole page" writes — the agent has to download the entire document,
  change it, and upload it back, paying for the whole document **twice** on every small
  fix. This server lets the agent change exactly one block (`patch_node` /
  `insert_node` / `delete_node`), run a structure-preserving find/replace
  (`edit_page_text`), or copy a page server-side (`copy_page_content`) — **and the
  document never passes through the model**.

- **Writes that don't fight the editor.** A naive REST write conflicts with whatever a
  human is typing at that moment and can silently overwrite their edits or fail on
  Docmost's debounced save. This server applies every change through the live
  collaboration document (Hocuspocus/Yjs), reading and writing **synchronously within a
  single sync tick** so that no concurrent edit can slip in between, serializes writes
  **per page** with a mutex, and **waits for the server to confirm the save** before
  returning. If the socket drops mid-write, the tool returns an error rather than a false
  success.

- **Agent-oriented editing model.** "Human-shaped" servers offer "open the page" and
  "replace the page" because that mirrors how a person works. A model edits better by
  *programming* the change — addressing blocks by id, running a find/replace, or passing
  a `(doc, ctx) => doc` transform (`docmost_transform`, with a dry-run diff before the
  commit). This server is built around that, which is why it has editing primitives the
  others simply do not have.

- **An editing safety net the others lack.** `list_page_history` →
  `diff_page_versions` → `restore_page_version` give the agent (and you) a complete
  "view and roll back" loop. The diff uses *the same*
  `recreateTransform → ChangeSet → simplifyChanges` pipeline as Docmost's built-in
  history viewer, so the result matches the product.

- **Convenience instead of digging out cookies.** Some community servers authenticate by
  making you copy a session cookie from your browser's DevTools (it expires), or reach
  **directly into the PostgreSQL database**. This server logs in with credentials and
  **transparently re-authenticates on a 401/403** (with de-duplication of concurrent
  logins), so long-lived agents don't crash when a token has expired. It also respects
  Docmost's access control, because it goes through the API and the collaboration server
  like an ordinary user.

---

## Tools

All 38 tools, grouped by the tasks you would reach for them for.

### Reading & search

- **`get_workspace`** — Information about the current Docmost workspace.
- **`list_spaces`** — All spaces in the workspace.
- **`list_pages`** — Recent pages in a space, in descending `updatedAt` order (default
  50, maximum 100). Use `search` for lookups in large spaces.
- **`search`** — Full-text search across pages and content (bounded by `limit`, maximum
  100).
- **`get_page`** — A page's content as clean **Markdown** (convenient, but a *lossy*
  view — block ids and the exact structure of tables/callouts are approximated).
- **`get_page_json`** — A page's **lossless ProseMirror/TipTap JSON**, including every
  block's `attrs.id` and the `slugId` used in URLs. This is what the per-block editing
  tools consume.
- **`get_outline`** — A compact outline of a page's top-level blocks
  (`{index, type, id, level, firstText}`; tables add row/column counts and header-cell
  texts, lists add item counts) **without** the document body. The cheap way to find a
  section or table and get its block id before `get_node` / `patch_node` / `insert_node`.
- **`get_node`** — Fetch the full ProseMirror subtree of a single block (lossless)
  without pulling the whole page. Address it by block id (from `get_outline` /
  `get_page_json`) or with the `#<index>` form for a top-level block — use `#<index>` for
  tables/rows/cells, which have no id.
|
||||
|
||||
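The two addressing forms accepted by `get_node` can be illustrated with a small resolver. This is a sketch under assumptions: `parseNodeRef` and `findTopLevel` are hypothetical helpers, and the outline is assumed to be an array of entries that carry the `index` and `id` fields listed for `get_outline`.

```javascript
// Hypothetical helper: classify a node reference as "#<index>" or a block id.
function parseNodeRef(ref) {
  if (!ref) throw new Error("empty node reference");
  const m = /^#(\d+)$/.exec(ref);
  if (m) return { kind: "index", index: Number(m[1]) };
  return { kind: "id", id: ref };
}

// Resolve a reference against outline-shaped entries ({index, type, id, ...}).
// Matching on the entry's own `index` field avoids assuming a numbering base.
function findTopLevel(outline, ref) {
  const parsed = parseNodeRef(ref);
  const hit = parsed.kind === "index"
    ? outline.find((b) => b.index === parsed.index)
    : outline.find((b) => b.id === parsed.id);
  return hit ?? null;
}
```

A table, which has no id, is then reachable as `findTopLevel(outline, "#1")`, while a heading can be found by its block id.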
### Page lifecycle

- **`create_page`**: Create a page from Markdown and place it in the hierarchy (optional
  `parentPageId`) in one call. Uses Docmost's import API for a clean Markdown→ProseMirror
  conversion.
- **`rename_page`**: Change only the page title, without touching or resending the content.
- **`move_page`**: Change a page's parent (nest it or move it to the root); supports
  fractional-index positioning. Reports success only on a *positively confirmed* result.
- **`delete_page`**: Delete a single page.
- **`copy_page_content`**: Replace one page's body with a copy of another's, **entirely
  server-side**; the document never passes through the model. The target page keeps its own
  title and slug (the URL does not change).

### Editing

- **`edit_page_text`**: Surgical find/replace within the page text. Preserves **all**
  structure: block ids, marks, links, callouts, tables. The preferred tool for fixing
  wording, typos, numbers, and names.
- **`patch_node`**: Replace a single block addressed by `attrs.id` (from `get_page_json`)
  without resending the document.
- **`insert_node`**: Insert a block before/after another (by `attrs.id` or by anchor text),
  or append it at the end.
- **`delete_node`**: Delete a single block by its `attrs.id`.
- **`update_page_json`**: Replace the entire page content with a ProseMirror document (bulk
  rewrites, or when nodes have no id). `content` is optional: omit it to change only the
  title. Preserves the block ids you pass in, so heading anchors and history stay stable.
- **`docmost_transform`**: An agent-oriented editing interface: instead of retyping the
  document, the agent **writes a function that fixes it**. Edits a page by running an
  arbitrary **JS transform `(doc, ctx) => doc`** against its *live* ProseMirror document.
  Runs in a **sandbox** (no `require`/`process`/`fs`/network, 5 s timeout). **Dry-run by
  default**: returns a diff preview without writing; set `dryRun:false` to apply atomically.
  `ctx` gives access to the page's comments and a set of helpers (`walk`, `getList`,
  `blockText`, `insertMarkerAfter`, `setCalloutRange`, `commentsToFootnotes`, …) for
  multi-step coordinated rewrites, such as renumbering or turning inline comments into
  numbered footnotes.

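To make the `(doc, ctx) => doc` shape concrete, here is a minimal transform sketch. The signatures of the `ctx` helpers are not given in this section, so the example walks the tree by hand and ignores `ctx`; it assumes only the usual ProseMirror JSON layout (`type`, `content`, `text`, `attrs`). The name `renameTerm` and the replaced words are illustrative.

```javascript
// A transform in the `(doc, ctx) => doc` shape: replace a term in every text node.
// `ctx` is accepted but unused, because its helper signatures are not shown here.
const renameTerm = (doc, ctx) => {
  const visit = (node) => {
    if (node.type === "text" && typeof node.text === "string") {
      return { ...node, text: node.text.replaceAll("workspace", "team space") };
    }
    if (Array.isArray(node.content)) {
      // Spreading the node keeps attrs (including block ids) untouched.
      return { ...node, content: node.content.map(visit) };
    }
    return node;
  };
  return visit(doc);
};
```

Because the function returns a new tree and copies every node's `attrs`, block ids survive the rewrite, which is the property the dry-run diff would be expected to confirm before applying.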
### Tables

- **`table_get`**: Read a table as a matrix: `{rows, cols, cells (text[][]), cellIds}` (a
  paragraph id per cell, or `null`). Address the table via `#<index>` (from `get_outline`)
  or any block id inside it. Use `cellIds` with `patch_node` for formatted cell edits.
- **`table_insert_row`**: Insert a row of plain-text cells, padded to the table's column
  count (passing more cells than columns is an error). `index` is the 0-based insert
  position (0 inserts before the header); omit it to append at the end.
- **`table_delete_row`**: Delete the row at 0-based `index`. Refuses to delete the table's
  only row; deleting row 0 makes the next row the header.
- **`table_update_cell`**: Set the text content of cell `[row, col]` (0-based). For
  formatting, use `patch_node` on the cell's paragraph id from `table_get`.

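The row-insertion rules (pad short rows, reject long ones, append when `index` is omitted) can be sketched on a plain text matrix. `padRow` and `insertRow` are hypothetical helpers, not the server's implementation, and rejecting an out-of-range `index` is an assumption of this sketch.

```javascript
// Hypothetical sketch of the padding rule for a new table row.
function padRow(cells, cols) {
  if (cells.length > cols) {
    throw new Error(`row has ${cells.length} cells but the table has ${cols} columns`);
  }
  return [...cells, ...Array(cols - cells.length).fill("")];
}

// Insert into a text matrix; omit `index` to append, 0 inserts before the header.
function insertRow(matrix, cells, index = matrix.length) {
  if (!Number.isInteger(index) || index < 0 || index > matrix.length) {
    throw new RangeError(`index ${index} is out of range`); // assumed behavior
  }
  const cols = matrix[0]?.length ?? cells.length;
  const next = matrix.slice(); // leave the input matrix untouched
  next.splice(index, 0, padRow(cells, cols));
  return next;
}
```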
### Markdown: export and import

- **`export_page_markdown`**: Export a page to a single self-contained, **lossless Markdown
  file in the Docmost dialect**: a meta header, the body with inline comment anchors and
  diagrams, and a trailing block of comment threads. Designed for the "download → edit the
  body → `import_page_markdown`" loop, which preserves everything, including comment
  highlights.
- **`import_page_markdown`**: Replace a page's content from a Docmost-dialect Markdown file
  produced by `export_page_markdown`, restoring comment highlight anchors and diagrams from
  their inline HTML. (Comment threads in the file are not recreated on the server; only the
  page body and inline comment marks are written. Manage threads through the comment tools
  or the UI.)

### Images

- **`insert_image`**: Upload a local image and insert it in one step: append it at the end,
  put it in place of a text placeholder (`replaceText`), or place it after a given block
  (`afterText`). Preserves the ids of all other blocks.
- **`replace_image`**: Replace an existing image. Uploads the new file as a **new
  attachment** (a clean URL that renders and busts the browser cache), then repoints every
  node that referenced the old attachment (recursively, including callouts and tables)
  through the live document, preserving comments, alignment, and alt text. (In-place
  overwrite is deliberately avoided: some Docmost versions corrupt the attachment on
  overwrite.)

### Comments

- **`create_comment`**: Add a comment to a page, optionally **anchored inline** to an exact
  text fragment (the first occurrence is wrapped in a comment mark).
- **`list_comments`**: List a page's comments (content is returned as Markdown).
- **`update_comment`**: Edit an existing comment.
- **`delete_comment`**: Delete a comment.
- **`check_new_comments`**: Find comments created after a given ISO-8601 timestamp, per
  space, optionally scoped to a page subtree. Ideal for an agent that watches a document
  for feedback.

### Versions and history

- **`list_page_history`**: A page's saved versions (Docmost auto-snapshots on every save),
  newest first, with cursor pagination. Each item's id is the `historyId`.
- **`diff_page_versions`**: Diff two versions (or a version against the live page). Returns
  inserted/deleted text, integrity counters (images, links, tables, callouts, footnote
  markers), and a human-readable Markdown summary, computed by the same pipeline that
  Docmost's built-in history viewer uses.
- **`restore_page_version`**: Write a saved version back as the current content. Docmost has
  no restore endpoint, so this creates a **new** snapshot, which means the restore itself is
  also revertible.

### Sharing

- **`share_page`**: Make a page publicly accessible (idempotent) and return its public URL
  (`<app>/share/<key>/p/<slugId>`); search-engine indexing is optional.
- **`unshare_page`**: Revoke public access to a page.
- **`list_shares`**: All public links in the workspace, with titles and public URLs.

---

## Choosing an editing tool

The same guidance is served at runtime through the MCP server's `instructions` field, so
compatible clients steer the model automatically.

- **Text edits** (wording, typos, numbers): `edit_page_text`.
- **A single block** (paragraph/heading/callout/table cell): `patch_node` / `insert_node` /
  `delete_node`, addressing the node by its `attrs.id` from `get_page_json`.
- **Images**: `insert_image` / `replace_image`.
- **New page**: `create_page`.
- **Bulk rewrite or nodes without an id**: `update_page_json`.
- **Multi-step / scripted rewrite** (renumbering, footnotes, coordinated edits):
  `docmost_transform`; preview with `dryRun`, then apply.
- **Copy a whole page's content from another page** (server-side): `copy_page_content`.
- **Rename a page** (title only): `rename_page`.
- **Reading**: `get_page` (Markdown) / `get_page_json` (lossless ProseMirror with ids).
- **Reviewing changes**: `list_page_history` → `diff_page_versions` →
  `restore_page_version`.
- **Comments**: `create_comment` (with optional inline anchoring) / `list_comments` /
  `update_comment` / `delete_comment` / `check_new_comments`.
- **Cheap page navigation** (find a section/table, get a block id): `get_outline` →
  `get_node`.
- **Tables** (add/delete a row, set a cell): `table_get` / `table_insert_row` /
  `table_delete_row` / `table_update_cell`.
- **Round-trip a page through Markdown** (download, edit, re-upload losslessly, with
  comments): `export_page_markdown` / `import_page_markdown`.

---

## How it works (technical details)

- **Safe writes through real-time collaboration.** Content mutations are applied over
  Docmost's collaboration WebSocket (Hocuspocus + Yjs). The server connects, waits for the
  initial sync so the local document reflects the authoritative server one (including edits
  not yet in the debounced REST snapshot), then **reads → transforms → writes
  synchronously** in a single tick so that no remote update can interleave, and **waits for
  save confirmation** before returning.
- **Per-page write serialization.** An async mutex keyed by `pageId` guarantees that two MCP
  writes to the same page never overlap; different pages do not block each other.
- **Transparent re-authentication.** Login by email/password; expired tokens are refreshed
  automatically on the first 401/403 (covering JSON, multipart upload, and the
  collaboration-token path), with concurrent logins deduplicated so that a burst of calls
  triggers a single re-login.
- **Lossless and lossy reads.** `get_page_json` returns the exact ProseMirror tree with
  block ids; `get_page` returns clean Markdown for convenience.
- **Full Docmost schema.** Markdown↔ProseMirror conversion supports callouts (including
  nested ones), task lists (bulleted *and* numbered checklists), tables, math blocks,
  embeds, highlight, sub/superscript, and more, with guard limits against pathological
  input.
- **Structural tables and lossless Markdown round-trip.** Tables can be edited as a matrix
  (read, insert/delete rows, set cells by `[row, col]`) without resending the document, and
  a page can be exported and re-imported as a self-contained Docmost-dialect Markdown file
  that preserves inline comment anchors and diagrams.
- **Token-optimized responses.** API responses are trimmed to the fields agents actually
  need, and large collections (spaces, pages, comments, history) are paginated.
- **Hardened runtime.** Global handlers keep a stray socket error from crashing the stdio
  server; `move_page` requires positively confirmed success; the diff engine falls back to
  a coarse block-level diff instead of failing on a pathological document.

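Per-page write serialization can be sketched as a promise chain per key. This is one common way to build an async keyed mutex, not necessarily how the package does it; `createKeyedMutex` and `runExclusive` are hypothetical names.

```javascript
// Hypothetical sketch: serialize async work per key by chaining promises.
function createKeyedMutex() {
  const tails = new Map(); // key -> promise that settles when the last task ends

  return function runExclusive(key, task) {
    const prev = tails.get(key) ?? Promise.resolve();
    // Start only after the previous task for this key has settled.
    const run = prev.then(() => task());
    // The stored tail never rejects, so one failed write does not block the queue.
    const tail = run.catch(() => {}).finally(() => {
      if (tails.get(key) === tail) tails.delete(key); // drop idle keys
    });
    tails.set(key, tail);
    return run; // callers still see the task's own result or error
  };
}
```

Two calls with the same `pageId` run strictly one after the other, while calls for different pages proceed independently because each key has its own chain.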
---

## Installation

```bash
npm install
npm run build
```

## Configuration

The server needs three environment variables:

- `DOCMOST_API_URL`: the full URL of your Docmost API (for example,
  `https://docs.example.com/api`).
- `DOCMOST_EMAIL`: the email of the account used for authentication.
- `DOCMOST_PASSWORD`: the account password.

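A startup check for these variables might look like the sketch below. `loadConfig` is a hypothetical helper; the returned field names (`apiUrl`, `email`, `password`) mirror what `createDocmostMcpServer(config)` reads, and trimming trailing slashes from the URL is an assumption of this example.

```javascript
// Hypothetical loader: collect the three documented variables or fail loudly.
function loadConfig(env) {
  const names = ["DOCMOST_API_URL", "DOCMOST_EMAIL", "DOCMOST_PASSWORD"];
  const missing = names.filter((n) => !env[n]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    apiUrl: env.DOCMOST_API_URL.replace(/\/+$/, ""), // assumed: drop trailing slashes
    email: env.DOCMOST_EMAIL,
    password: env.DOCMOST_PASSWORD,
  };
}
```

In a stdio entry point this would typically be called as `loadConfig(process.env)` before the server is created.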
## Using with Claude Desktop / any MCP client

Add the server to your MCP configuration (for example, `claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "docmost-local": {
      "command": "node",
      "args": ["./build/index.js"],
      "env": {
        "DOCMOST_API_URL": "http://localhost:3000/api",
        "DOCMOST_EMAIL": "test@docmost.com",
        "DOCMOST_PASSWORD": "test"
      }
    }
  }
}
```

## Development

```bash
# Watch mode
npm run watch

# Build
npm run build

# Tests (unit + mock; the live end-to-end suite requires a running Docmost)
npm test
npm run test:e2e
```

## Origin and credits

The project began as a fork of
[MrMartiniMo/docmost-mcp](https://github.com/MrMartiniMo/docmost-mcp) (by Moritz Krause)
and extends it substantially: it adds block-level node editing, surgical text edits, the
`docmost_transform` sandbox, version history / diff / restore, comments, image
insert/replace, public share links, server-side page copying, dual JSON/Markdown reads,
transparent re-authentication, and significant hardening.
The comment tools were ported from upstream PR #3 by Max Nikitin. Thanks to both.

## License

MIT
89
packages/mcp/TEST-PLAN.md
Normal file
@@ -0,0 +1,89 @@
# Docmost MCP - Test Plan (editing & image tools)

Manual/E2E test plan for every content-mutating tool, with special focus on
images and image replacement. Executed against a live Docmost instance
(`docs.vvzvlad.xyz`) and verified visually in Chrome (public share + authenticated
editor).

## How to run the automated part

```
DOCMOST_API_URL=https://<host>/api \
DOCMOST_EMAIL=<email> \
DOCMOST_PASSWORD=<password> \
node test-e2e.mjs
```

`test-e2e.mjs` creates a throwaway page, exercises every code path (including the
image upload/insert/replace cycle) and deletes the page afterwards. Collab writes
are debounced server-side, so the script waits ~16 s before reading back via REST.

## Test matrix

| # | Tool / path | What is checked | Expected |
|---|-------------|-----------------|----------|
| 1 | `create_page` | title with spaces, slugId returned | page created, title intact |
| 2 | `update_page` (markdown) | headings, **bold**/*italic*/~~strike~~/`code`/link, nested bullet + ordered lists, blockquote, code block, `:::callout:::`, table | all structures survive re-import |
| 3 | `get_page_json` | lossless ProseMirror, block ids, callout/table nodes | present (note: reads the **debounced** REST snapshot; recent collab writes may lag a few seconds) |
| 4 | `edit_page_text` | surgical replace; block ids + marks preserved; ambiguous match rejected; missing match reported | edits applied, ids stable, errors correct |
| 5 | `update_page_json` | full lossless write; custom block ids preserved; existing content (text edits, images, callout, table) not lost | round-trips intact |
| 6 | `upload_image` | uploads attachment, returns node | src is a **clean** `/api/files/<id>/<file>` URL, served `200 image/*` |
| 7 | `insert_image` (append / `replaceText` / `afterText`) | three placements | image lands in the right place, all other block ids preserved |
| 8 | **`replace_image`** | swap an existing figure for new bytes; comments/align/alt preserved; **the new URL must actually serve the image** | new image renders (`200`), old node repointed |

## Image-specific assertions (the recurring bug area)

For every uploaded/inserted/replaced image, assert at the HTTP level that the
`src` actually serves bytes; this is what catches "broken image" regressions:

* `GET <src>` → `200`, `Content-Type: image/*`, body starts with the image magic
  (`89 50 4E 47` for PNG, etc.).
* `src` does **not** contain a `?v=` query (see "Known pitfalls").
* After `replace_image`: the returned `newAttachmentId` **differs** from the old
  one (replacement uses a fresh attachment → fresh URL), and `GET <new src>` → `200`.
* The old image node on the page is repointed to the new attachmentId.

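The response-level assertions can be bundled into one checker. The sketch below is illustrative only: `startsWithPngMagic` and `checkImageResponse` are hypothetical helpers, it covers the PNG signature alone (other formats would need their own magic bytes), and it assumes the caller has already fetched the status, content type, and body bytes.

```javascript
// Check the PNG signature prefix named above (89 50 4E 47) on raw bytes.
function startsWithPngMagic(bytes) {
  const magic = [0x89, 0x50, 0x4e, 0x47];
  return bytes.length >= magic.length && magic.every((b, i) => bytes[i] === b);
}

// Apply the response-level assertions to an already-fetched image.
// Returns a list of problems; an empty list means every check passed.
function checkImageResponse({ status, contentType, body, src }) {
  const problems = [];
  if (status !== 200) problems.push(`status ${status}`);
  if (!/^image\//.test(contentType ?? "")) problems.push(`content-type ${contentType}`);
  if (src.includes("?v=")) problems.push("src carries a ?v= query");
  if (!startsWithPngMagic(body)) problems.push("body does not start with PNG magic");
  return problems;
}
```

Returning a list rather than throwing on the first failure makes it easier to see every broken property of an image in a single test run.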
## Browser verification (Chrome)

Open the page (public `/share/<key>/p/<slug>` URL, or the authenticated editor)
and check each `<img>`:

```js
[...document.querySelectorAll('.ProseMirror img')].map(im => ({
  src: im.getAttribute('src'),
  loaded: im.naturalWidth > 0,   // 0 ⇒ broken
}));
```

`loaded === true` (naturalWidth > 0) means the image really rendered; `0` means a
broken/empty figure.

## Known pitfalls (root-caused during testing)

1. **In-place attachment overwrite corrupts the file (HTTP 500).**
   Uploading with an existing `attachmentId` (`POST /files/upload` + `attachmentId`)
   overwrites the bytes in place. On this Docmost the attachment then returns
   **500 for every URL** (clean, `?v=`, any filename) → broken image. Therefore
   `replace_image` must upload a **new** attachment and repoint the nodes; the new
   id yields a new URL that both renders and busts the browser cache. The old
   attachment is left as an unreferenced orphan: Docmost exposes **no HTTP API to
   delete a single content attachment** (verified against the attachment
   controller/service and by probing ~20 route variants live, all 404; an
   attachment unlinked from a page stays reachable with no auto-GC). Attachments
   are removed only by cascade (page/space/user deletion). This matches Docmost's
   own editor, which also orphans attachments on image removal/replacement.

2. **`?v=<hash>` cache-buster is unnecessary and was a red herring.**
   The file endpoint serves `…/file.png?v=<hash>` exactly like the clean URL
   (`200 image/*`), as verified at the HTTP layer, on the public share, and in the
   authenticated editor. The broken images people saw came from pitfall #1, not
   from `?v=`. Image `src` is kept clean (`/api/files/<id>/<file>`); cache-busting
   on replace is achieved by the new attachment id.

3. **REST snapshot lag.** `get_page_json` reads the debounced DB snapshot, so a
   write made moments earlier may not be visible yet. Wait (~16 s) before reading
   back, and never feed a possibly-stale snapshot straight into `update_page_json`.

4. **Callout type narrowing (minor, open).** A `:::warning` callout is imported as
   `type: "info"`: the markdown→callout conversion does not carry non-`info`
   types through. Cosmetic; tracked separately.

2159
packages/mcp/build/client.js
Normal file
File diff suppressed because it is too large
92
packages/mcp/build/http.js
Normal file
@@ -0,0 +1,92 @@
import { randomUUID } from "node:crypto";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
import { isInitializeRequest } from "@modelcontextprotocol/sdk/types.js";
import { createDocmostMcpServer } from "./index.js";
/**
 * Build a stateful Streamable-HTTP handler for the Docmost MCP server. The
 * embedding host (the gitmost NestJS server) bridges its raw Node req/res into
 * `handleRequest`. One McpServer + transport is created per MCP session and
 * kept alive between requests, keyed by the `mcp-session-id` header.
 */
export function createMcpHttpHandler(config) {
    // One transport (and one McpServer) per MCP session, keyed by session id.
    const transports = {};
    // Last activity timestamp per session id, used for idle eviction.
    const lastSeen = {};
    // Idle session TTL (ms): a session with no activity for this long is evicted.
    // Defaults to 30 min; overridable via MCP_SESSION_IDLE_MS.
    const idleTtlMs = (() => {
        const parsed = parseInt(process.env.MCP_SESSION_IDLE_MS ?? "", 10);
        return Number.isFinite(parsed) && parsed > 0 ? parsed : 30 * 60 * 1000;
    })();
    // Periodically close transports idle longer than the TTL. transport.close()
    // triggers its onclose, which removes it from `transports`; we also drop the
    // lastSeen entry. unref() so this timer never keeps the process alive.
    const sweepIntervalMs = 5 * 60 * 1000;
    const sweepTimer = setInterval(() => {
        const now = Date.now();
        for (const sid of Object.keys(transports)) {
            if (now - (lastSeen[sid] ?? 0) > idleTtlMs) {
                void transports[sid].close();
                delete lastSeen[sid];
            }
        }
    }, sweepIntervalMs);
    sweepTimer.unref();
    async function handleRequest(req, res, parsedBody) {
        const sessionId = req.headers["mcp-session-id"];
        const method = (req.method || "GET").toUpperCase();
        let transport = sessionId ? transports[sessionId] : undefined;
        if (method === "POST" && !transport) {
            // A new session may only be created by an initialize request without a
            // session id.
            if (sessionId || !isInitializeRequest(parsedBody)) {
                res.statusCode = 400;
                res.setHeader("Content-Type", "application/json");
                res.end(JSON.stringify({
                    jsonrpc: "2.0",
                    error: {
                        code: -32000,
                        message: "Bad Request: no valid session ID provided",
                    },
                    id: null,
                }));
                return;
            }
            transport = new StreamableHTTPServerTransport({
                sessionIdGenerator: () => randomUUID(),
                onsessioninitialized: (sid) => {
                    transports[sid] = transport;
                    lastSeen[sid] = Date.now();
                },
            });
            transport.onclose = () => {
                const sid = transport.sessionId;
                if (sid && transports[sid])
                    delete transports[sid];
            };
            const server = createDocmostMcpServer(config);
            await server.connect(transport);
            await transport.handleRequest(req, res, parsedBody);
            return;
        }
        if (!transport) {
            res.statusCode = 400;
            res.setHeader("Content-Type", "application/json");
            res.end(JSON.stringify({
                jsonrpc: "2.0",
                error: {
                    code: -32000,
                    message: "Bad Request: no valid session ID provided",
                },
                id: null,
            }));
            return;
        }
        // Routing to an existing transport: refresh its idle timestamp.
        if (sessionId)
            lastSeen[sessionId] = Date.now();
        await transport.handleRequest(req, res, parsedBody);
    }
    return { handleRequest };
}
777
packages/mcp/build/index.js
Normal file
@@ -0,0 +1,777 @@
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";
import { readFileSync } from "fs";
import { fileURLToPath } from "url";
import { dirname, join } from "path";
import { DocmostClient } from "./client.js";
// Read version from package.json
const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);
const packageJson = JSON.parse(readFileSync(join(__dirname, "../package.json"), "utf-8"));
const VERSION = packageJson.version;
// --- Modern McpServer Implementation ---
// Editing guide surfaced to MCP clients in the initialize result so they can
// pick the right tool by intent and avoid resending whole documents.
const SERVER_INSTRUCTIONS = "Docmost editing guide — choose the tool by intent: fix wording/typos/numbers (text inside blocks) -> edit_page_text (no node id needed). Change ONE block (paragraph/heading/callout/table cell/etc.) structurally -> patch_node (address by attrs.id from get_page_json). Add a block -> insert_node (before/after a block by attrs.id or by anchor text, or append). Remove a block -> delete_node (by attrs.id). Images -> insert_image (place a local image file) / replace_image (swap an existing image file). New page -> create_page (Markdown). Bulk/structural rewrite or nodes without an id -> update_page_json (full ProseMirror replace; prefer the granular tools above to avoid resending the whole ~100KB+ document). Copy/replace a page's whole content from another page (server-side, no document through the model) -> copy_page_content. Rename a page (title only) -> rename_page. Read -> get_page (Markdown, lossy) or get_page_json (lossless ProseMirror with block ids). Comments -> create_comment (an inline comment anchors to its selection text), list_comments, update_comment, delete_comment, check_new_comments. Tip: read block ids via get_page_json, then use patch_node/insert_node/delete_node so you never resend the full document. " +
    "Complex/scripted rewrite (multiple coordinated edits, footnotes, renumbering) -> docmost_transform: write a JS `(doc, ctx) => doc` transform, preview the diff with dryRun (default), then apply with dryRun:false; ctx.helpers includes commentsToFootnotes for turning inline comments into numbered footnotes. " +
    "Review what changed -> diff_page_versions (compare a historyId to current, or two history versions). See a page's saved versions -> list_page_history. Undo a bad edit -> restore_page_version (writes a past version back as current; itself revertible). " +
    "Lossless markdown round-trip (download, edit, re-upload, incl. comment anchors) -> export_page_markdown / import_page_markdown.";
// Helper to format JSON responses
const jsonContent = (data) => ({
    content: [{ type: "text", text: JSON.stringify(data, null, 2) }],
});
/**
 * Create a fully configured Docmost MCP server. Side-effect-free: it does not
 * read environment variables and does not connect any transport; the caller
 * decides how to expose it (stdio or HTTP). The client talks to Docmost over
 * REST + the collaboration WebSocket using the provided service-account
 * credentials and auto-re-authenticates.
 */
export function createDocmostMcpServer(config) {
    const docmostClient = new DocmostClient(config.apiUrl, config.email, config.password);
    const server = new McpServer({
        name: "docmost-mcp",
        version: VERSION,
    }, { instructions: SERVER_INSTRUCTIONS });
    // Tool: get_workspace
    server.registerTool("get_workspace", {
        description: "Get the current Docmost workspace",
    }, async () => {
        const workspace = await docmostClient.getWorkspace();
        return jsonContent(workspace);
    });
    // Tool: list_spaces
    server.registerTool("list_spaces", {
        description: "List all available spaces in Docmost",
    }, async () => {
        const spaces = await docmostClient.getSpaces();
        return jsonContent(spaces);
    });
    // Tool: list_pages
    server.registerTool("list_pages", {
        description: "List most recent pages in a space ordered by updatedAt (descending). " +
            "Returns a bounded list (default 50, max 100) — use search for lookups " +
            "in large spaces.",
        inputSchema: {
            spaceId: z.string().optional(),
            limit: z
                .number()
                .int()
                .min(1)
                .max(100)
                .optional()
                .describe("Max pages to return (default 50, max 100)"),
        },
    }, async ({ spaceId, limit }) => {
        const result = await docmostClient.listPages(spaceId, limit ?? 50);
        return jsonContent(result);
    });
    // Tool: get_page
    server.registerTool("get_page", {
        description: "Get page details with content converted to Markdown. The conversion is " +
            "LOSSY (block ids, exact table/callout structure are approximated); for a " +
            "lossless representation use get_page_json.",
        inputSchema: {
            pageId: z.string().min(1),
        },
    }, async ({ pageId }) => {
        const page = await docmostClient.getPage(pageId);
        return jsonContent(page);
    });
    // Tool: get_page_json
    server.registerTool("get_page_json", {
        description: "Get page details with the raw ProseMirror JSON content (lossless: " +
            "includes block ids, callouts, tables, link/image attributes) plus the " +
            "slugId used in URLs. Use together with update_page_json for precise " +
            "structural edits, or edit_page_text for simple text fixes.",
        inputSchema: {
            pageId: z.string().min(1),
        },
    }, async ({ pageId }) => {
        const page = await docmostClient.getPageJson(pageId);
        return jsonContent(page);
    });
    // Tool: get_outline
    server.registerTool("get_outline", {
        description: "Return a COMPACT outline of a page's top-level blocks ({index, type, " +
            "id, level, firstText}; tables add rows/cols/header; lists add item " +
            "count) WITHOUT the full document body. Use it to locate sections/tables " +
            "and grab block ids cheaply before get_node / patch_node / insert_node.",
        inputSchema: {
            pageId: z.string().min(1),
        },
    }, async ({ pageId }) => {
        const result = await docmostClient.getOutline(pageId);
        return jsonContent(result);
    });
    // Tool: get_node
    server.registerTool("get_node", {
        description: "Fetch a single node's full ProseMirror subtree (lossless) without " +
            "pulling the whole document. `nodeId` is a block id from get_outline/" +
            "get_page_json (works for headings/paragraphs/callouts/images), OR " +
            "`#<index>` to fetch a top-level block by its outline index — use the " +
            "`#<index>` form for tables/rows/cells, which carry no id.",
        inputSchema: {
            pageId: z.string().min(1),
            nodeId: z.string().min(1),
        },
    }, async ({ pageId, nodeId }) => {
        const result = await docmostClient.getNode(pageId, nodeId);
        return jsonContent(result);
    });
    // Tool: table_get
    server.registerTool("table_get", {
        description: "Read a table as a matrix. Returns {rows, cols, cells (text[][]), " +
            "cellIds (paragraph id per cell, or null)}. `table` = `#<index>` from " +
            "get_outline, or any block id inside the table. Use cellIds with " +
            "patch_node for rich-formatted cell edits. `cols` is the FIRST row's " +
            "width; ragged tables may vary per row, so use the per-row length of " +
            "`cells` for each row.",
        inputSchema: {
            pageId: z.string().min(1),
            table: z.string().min(1),
        },
    }, async ({ pageId, table }) => {
        const result = await docmostClient.getTable(pageId, table);
        return jsonContent(result);
    });
    // Tool: table_insert_row
    server.registerTool("table_insert_row", {
        description: "Insert a row of plain-text cells into a table. `table` = `#<index>` or " +
            "a block id inside it. `cells` = text per column (padded to the table's " +
            "column count; error if more cells than columns). `index` = 0-based " +
            "insert position (0 inserts before the header); omit to append at the end.",
        inputSchema: {
            pageId: z.string().min(1),
            table: z.string().min(1),
            cells: z.array(z.string()),
            index: z.number().int().optional(),
        },
    }, async ({ pageId, table, cells, index }) => {
        const result = await docmostClient.tableInsertRow(pageId, table, cells, index);
        return jsonContent(result);
    });
    // Tool: table_delete_row
    server.registerTool("table_delete_row", {
        description: "Delete the row at 0-based `index` from a table (`table` = `#<index>` or " +
            "a block id inside it). Refuses to delete the table's only row. An " +
            "out-of-range `index` throws. Deleting `index` 0 removes the header row, " +
            "and the next row becomes the new header.",
        inputSchema: {
            pageId: z.string().min(1),
            table: z.string().min(1),
            index: z.number().int(),
        },
    }, async ({ pageId, table, index }) => {
        const result = await docmostClient.tableDeleteRow(pageId, table, index);
        return jsonContent(result);
    });
    // Tool: table_update_cell
    server.registerTool("table_update_cell", {
        description: "Set the plain-text content of cell [row,col] (0-based) in a table " +
            "(`table` = `#<index>` or a block id inside it). Replaces the cell's " +
            "content with a single text paragraph; for rich formatting use patch_node " +
            "on the cell's paragraph id from table_get.",
        inputSchema: {
            pageId: z.string().min(1),
            table: z.string().min(1),
            row: z.number().int(),
            col: z.number().int(),
            text: z.string(),
        },
    }, async ({ pageId, table, row, col, text }) => {
        const result = await docmostClient.tableUpdateCell(pageId, table, row, col, text);
        return jsonContent(result);
    });
// Tool: create_page
|
||||
server.registerTool("create_page", {
|
||||
description: "Create a new page with content (automatically moves it to the correct hierarchy).",
|
||||
inputSchema: {
|
||||
title: z.string().min(1).describe("Title of the page"),
|
||||
content: z.string().min(1).describe("Markdown content"),
|
||||
spaceId: z.string().min(1),
|
||||
parentPageId: z
|
||||
.string()
|
||||
.optional()
|
||||
.describe("Optional parent page ID to nest under"),
|
||||
},
|
||||
}, async ({ title, content, spaceId, parentPageId }) => {
|
||||
const result = await docmostClient.createPage(title, content, spaceId, parentPageId);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: update_page_json
|
||||
server.registerTool("update_page_json", {
|
||||
description: "Replace a page's content with a raw ProseMirror JSON document " +
|
||||
"(lossless write: preserves the block ids, callouts, tables and " +
|
||||
"attributes you pass in). Typical flow: get_page_json -> modify the " +
|
||||
"JSON -> update_page_json. Keep existing node ids intact so heading " +
|
||||
"anchors and history stay stable. `content` is OPTIONAL: omit it to " +
|
||||
"update only the title (though prefer rename_page for a title-only " +
|
||||
"change). Supplying neither content nor title is an error.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1).describe("ID of the page to update"),
|
||||
content: z
|
||||
.any()
|
||||
.optional()
|
||||
.describe('ProseMirror document: {"type":"doc","content":[...]}. Omit to rename only.'),
|
||||
title: z.string().optional().describe("Optional new title"),
|
||||
},
|
||||
}, async ({ pageId, content, title }) => {
|
||||
// Only parse/validate the document when it was actually supplied; when it
|
||||
// is omitted, pass it straight through so the client performs a title-only
|
||||
// update (supplying neither content nor title is an error).
|
||||
let doc;
|
||||
if (content === undefined || content === null) {
|
||||
doc = undefined;
|
||||
}
|
||||
else if (typeof content === "string") {
|
||||
try {
|
||||
doc = JSON.parse(content);
|
||||
}
|
||||
catch {
|
||||
throw new Error("content was a string but not valid JSON");
|
||||
}
|
||||
}
|
||||
else {
|
||||
doc = content;
|
||||
}
|
||||
const result = await docmostClient.updatePageJson(pageId, doc, title);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: export_page_markdown
|
||||
server.registerTool("export_page_markdown", {
|
||||
description: "Export a page to a single self-contained, lossless Docmost-flavoured " +
|
||||
"Markdown file (custom extensions): YAML-free meta header, body with " +
|
||||
"inline comment anchors and diagrams, and a trailing comments-thread " +
|
||||
"block. Designed for a download -> edit body -> import_page_markdown " +
|
||||
"round-trip that preserves everything, including comment highlights. " +
|
||||
"Comment THREADS are preserved in the file but are not re-pushed to the " +
|
||||
"server on import.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1),
|
||||
},
|
||||
}, async ({ pageId }) => {
|
||||
const md = await docmostClient.exportPageMarkdown(pageId);
|
||||
return { content: [{ type: "text", text: md }] };
|
||||
});
|
||||
// Tool: import_page_markdown
|
||||
server.registerTool("import_page_markdown", {
|
||||
description: "Replace a page's content from a self-contained Docmost-flavoured " +
|
||||
"Markdown file produced by export_page_markdown. Restores comment " +
|
||||
"highlight anchors and diagrams from their inline HTML. NOTE: comment " +
|
||||
"thread records are NOT created/updated/deleted on the server by this " +
|
||||
"tool — only the page body + inline comment marks are written; manage " +
|
||||
"comment threads via the comment tools/UI.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1),
|
||||
markdown: z.string().min(1),
|
||||
},
|
||||
}, async ({ pageId, markdown }) => {
|
||||
const res = await docmostClient.importPageMarkdown(pageId, markdown);
|
||||
return jsonContent(res);
|
||||
});
|
||||
// Tool: copy_page_content
|
||||
server.registerTool("copy_page_content", {
|
||||
description: "Replace targetPageId's content with a copy of sourcePageId's content, " +
|
||||
"entirely server-side — the document is NOT sent through the model. The " +
|
||||
"target keeps its own title and slug; only its body is replaced. Ideal " +
|
||||
"for 'make page A's content equal to B' or 'replace A with B but keep A's URL'.",
|
||||
inputSchema: {
|
||||
sourcePageId: z.string().min(1).describe("Page to copy content FROM"),
|
||||
targetPageId: z
|
||||
.string()
|
||||
.min(1)
|
||||
.describe("Page whose content is REPLACED (title/slug kept)"),
|
||||
},
|
||||
}, async ({ sourcePageId, targetPageId }) => {
|
||||
const result = await docmostClient.copyPageContent(sourcePageId, targetPageId);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: rename_page
|
||||
server.registerTool("rename_page", {
|
||||
description: "Rename a page (change its title only) without touching or resending " +
|
||||
"its content.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1).describe("ID of the page to rename"),
|
||||
title: z.string().min(1).describe("New title"),
|
||||
},
|
||||
}, async ({ pageId, title }) => {
|
||||
const result = await docmostClient.renamePage(pageId, title);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: edit_page_text
|
||||
server.registerTool("edit_page_text", {
|
||||
description: "Surgical find/replace inside a page's text. Preserves ALL structure: " +
|
||||
"block ids, marks, links, callouts, tables. Each `find` must match " +
|
||||
"exactly once (or set replaceAll). A match must lie inside one " +
|
||||
"formatting run; if the target text crosses bold/link boundaries the " +
|
||||
"tool reports it — use a shorter fragment or update_page_json then. " +
|
||||
"This is the preferred tool for fixing wording, typos, numbers, names.",
|
||||
inputSchema: {
|
||||
pageId: z.string().describe("ID of the page to edit"),
|
||||
edits: z
|
||||
.array(z.object({
|
||||
find: z.string().describe("Exact text to find"),
|
||||
replace: z.string().describe("Replacement text (may be empty)"),
|
||||
replaceAll: z
|
||||
.boolean()
|
||||
.optional()
|
||||
.describe("Replace every occurrence (default: must match once)"),
|
||||
}))
|
||||
.min(1)
|
||||
.describe("List of find/replace operations, applied in order"),
|
||||
},
|
||||
}, async ({ pageId, edits }) => {
|
||||
const result = await docmostClient.editPageText(pageId, edits);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: patch_node
|
||||
server.registerTool("patch_node", {
|
||||
description: "Replace a single block identified by its attrs.id WITHOUT resending the " +
|
||||
"whole document. Get the block id from get_page_json, then pass a " +
|
||||
"ProseMirror node to put in its place. Cheaper and safer than " +
|
||||
"update_page_json for one-block structural edits.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1),
|
||||
nodeId: z.string().min(1),
|
||||
node: z
|
||||
.any()
|
||||
.describe("ProseMirror node JSON to put in place of the node with this id"),
|
||||
},
|
||||
}, async ({ pageId, nodeId, node }) => {
|
||||
let parsedNode;
|
||||
if (typeof node === "string") {
|
||||
try {
|
||||
parsedNode = JSON.parse(node);
|
||||
}
|
||||
catch {
|
||||
throw new Error("node was a string but not valid JSON");
|
||||
}
|
||||
}
|
||||
else {
|
||||
parsedNode = node;
|
||||
}
|
||||
const result = await docmostClient.patchNode(pageId, nodeId, parsedNode);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: insert_node
|
||||
server.registerTool("insert_node", {
|
||||
description: "Insert a block before/after another block (by attrs.id or anchor text) " +
|
||||
"or append at the end. Get anchor block ids from get_page_json. Avoids " +
|
||||
"resending the whole document. Can also insert table structure: to add a " +
|
||||
"tableRow, pass a tableRow node with position before/after and anchor " +
|
||||
"INSIDE the target table — anchorNodeId of any block/cell in it, or " +
|
||||
"anchorText matching the table; to add a tableCell/tableHeader, use " +
|
||||
"anchorNodeId of a block inside the target row (anchorText only resolves " +
|
||||
"top-level blocks, so it cannot target a row). Note: append is top-level " +
|
||||
"only and rejects structural table nodes.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1),
|
||||
node: z.any(),
|
||||
position: z.enum(["before", "after", "append"]),
|
||||
anchorNodeId: z.string().optional(),
|
||||
anchorText: z.string().optional(),
|
||||
},
|
||||
}, async ({ pageId, node, position, anchorNodeId, anchorText }) => {
|
||||
let parsedNode;
|
||||
if (typeof node === "string") {
|
||||
try {
|
||||
parsedNode = JSON.parse(node);
|
||||
}
|
||||
catch {
|
||||
throw new Error("node was a string but not valid JSON");
|
||||
}
|
||||
}
|
||||
else {
|
||||
parsedNode = node;
|
||||
}
|
||||
const result = await docmostClient.insertNode(pageId, parsedNode, {
|
||||
position,
|
||||
anchorNodeId,
|
||||
anchorText,
|
||||
});
|
||||
return jsonContent(result);
|
||||
});
|
||||
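Note on the three handlers above (update_page_json, patch_node, insert_node): each accepts its ProseMirror payload either as an already-parsed object or as a JSON-encoded string, and each repeats the same parse-or-pass-through branch. A minimal sketch of that shared logic follows. The helper name `coerceJsonArg` and its `allowMissing` option are hypothetical and not part of the package; rejecting a missing value by default is also an assumption, since patch_node and insert_node currently forward whatever they receive.

```javascript
// Hypothetical helper (illustrative only, not part of @docmost/mcp).
// Normalizes a tool argument that may arrive as an object or a JSON string.
function coerceJsonArg(value, label, { allowMissing = false } = {}) {
  if (value === undefined || value === null) {
    // update_page_json treats a missing `content` as "title-only update".
    if (allowMissing) return undefined;
    throw new Error(`${label} is required`);
  }
  if (typeof value === "string") {
    try {
      return JSON.parse(value);
    } catch {
      // Same message shape as the handlers above.
      throw new Error(`${label} was a string but not valid JSON`);
    }
  }
  // Already an object (or other parsed JSON value): pass through untouched.
  return value;
}
```

With such a helper, update_page_json would call `coerceJsonArg(content, "content", { allowMissing: true })`, while the node-based tools would call `coerceJsonArg(node, "node")`.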
// Tool: delete_node
|
||||
server.registerTool("delete_node", {
|
||||
description: "Remove a single block by its attrs.id (from get_page_json) WITHOUT " +
|
||||
"resending the whole document.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1),
|
||||
nodeId: z.string().min(1),
|
||||
},
|
||||
}, async ({ pageId, nodeId }) => {
|
||||
const result = await docmostClient.deleteNode(pageId, nodeId);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: insert_image
|
||||
server.registerTool("insert_image", {
|
||||
description: "Upload a local image and insert it into a page in one step. By default " +
|
||||
"appends the image at the end of the page. With replaceText, replaces the " +
|
||||
"first top-level block whose text contains that string (handy for " +
|
||||
'swapping a text placeholder like "[image: foo.png]" for the real image). ' +
|
||||
"With afterText, inserts the image right after the first block containing " +
|
||||
"that string. Preserves all other block ids.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1),
|
||||
filePath: z
|
||||
.string()
|
||||
.min(1)
|
||||
.describe("Absolute local path to the image file"),
|
||||
align: z.enum(["left", "center", "right"]).optional(),
|
||||
alt: z.string().optional(),
|
||||
replaceText: z
|
||||
.string()
|
||||
.optional()
|
||||
.describe("Replace the first top-level block whose text contains this string with the image"),
|
||||
afterText: z
|
||||
.string()
|
||||
.optional()
|
||||
.describe("Insert the image right after the first top-level block whose text contains this string"),
|
||||
},
|
||||
}, async ({ pageId, filePath, align, alt, replaceText, afterText }) => {
|
||||
const result = await docmostClient.insertImage(pageId, filePath, {
|
||||
align,
|
||||
alt,
|
||||
replaceText,
|
||||
afterText,
|
||||
});
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: replace_image
|
||||
server.registerTool("replace_image", {
|
||||
description: "Replace an existing image on a page: uploads the new file as a NEW " +
|
||||
"attachment (fresh clean URL that renders and busts browser caches), then " +
|
||||
"repoints every image node referencing the old attachmentId (recursively, " +
|
||||
"incl. callouts/tables) via the live document, preserving comments, " +
|
||||
"alignment and alt. The old attachment is left as an unreferenced orphan " +
|
||||
"(Docmost has no API to delete a single attachment; it is removed only when " +
|
||||
"the page/space is deleted). In-place byte overwrite is avoided because some " +
|
||||
"Docmost versions corrupt the attachment (HTTP 500) on overwrite.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1),
|
||||
attachmentId: z
|
||||
.string()
|
||||
.min(1)
|
||||
.describe("attachmentId of the image currently in the page to replace"),
|
||||
filePath: z
|
||||
.string()
|
||||
.min(1)
|
||||
.describe("Absolute local path to the new image file"),
|
||||
align: z.enum(["left", "center", "right"]).optional(),
|
||||
alt: z.string().optional(),
|
||||
},
|
||||
}, async ({ pageId, attachmentId, filePath, align, alt }) => {
|
||||
const result = await docmostClient.replaceImage(pageId, attachmentId, filePath, {
|
||||
align,
|
||||
alt,
|
||||
});
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: share_page
|
||||
server.registerTool("share_page", {
|
||||
description: "Make a page publicly accessible (idempotent) and return its public " +
|
||||
"URL. The URL format is <app>/share/<key>/p/<slugId>.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1).describe("ID of the page to share"),
|
||||
searchIndexing: z
|
||||
.boolean()
|
||||
.optional()
|
||||
.describe("Allow search engines to index the page (default true)"),
|
||||
},
|
||||
}, async ({ pageId, searchIndexing }) => {
|
||||
const result = await docmostClient.sharePage(pageId, searchIndexing ?? true);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: unshare_page
|
||||
server.registerTool("unshare_page", {
|
||||
description: "Remove the public share of a page (revokes the public URL).",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1).describe("ID of the page to unshare"),
|
||||
},
|
||||
}, async ({ pageId }) => {
|
||||
const result = await docmostClient.unsharePage(pageId);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: list_shares
|
||||
server.registerTool("list_shares", {
|
||||
description: "List all public shares in the workspace with page titles and public URLs.",
|
||||
}, async () => {
|
||||
const result = await docmostClient.listShares();
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: move_page
|
||||
server.registerTool("move_page", {
|
||||
description: "Move a page to a new parent (nesting) or root. Essential for organizing pages created via 'create_page'.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1),
|
||||
parentPageId: z
|
||||
.string()
|
||||
.nullable()
|
||||
.optional()
|
||||
.describe("Target parent page ID. Pass 'null' or empty string to move to root."),
|
||||
position: z
|
||||
.string()
|
||||
.min(5)
|
||||
.optional()
|
||||
.describe("Fractional-index position key (min 5 chars); omit to append at the end."),
|
||||
},
|
||||
}, async ({ pageId, parentPageId, position }) => {
|
||||
const finalParentId = parentPageId === "" || parentPageId === "null" ? null : parentPageId;
|
||||
// Cheap cycle guard: a page cannot be moved directly under itself.
|
||||
// (Deeper descendant-cycle detection is intentionally out of scope.)
|
||||
if (finalParentId !== null && finalParentId === pageId) {
|
||||
throw new Error("cannot move a page under itself");
|
||||
}
|
||||
const result = await docmostClient.movePage(pageId, finalParentId || null, position);
|
||||
// Require POSITIVE confirmation: the live /pages/move success shape is
|
||||
// exactly { success: true, status: 200 }. An empty body, a 204, or any odd
|
||||
// shape lacking success === true must NOT be reported as a successful move,
|
||||
// so we surface the raw API result instead of declaring success.
|
||||
if (!(result && typeof result === "object" && result.success === true)) {
|
||||
throw new Error(`Failed to move page ${pageId}: ${JSON.stringify(result)}`);
|
||||
}
|
||||
return jsonContent({
|
||||
message: `Successfully moved page ${pageId} to parent ${finalParentId || "root"}`,
|
||||
result,
|
||||
});
|
||||
});
|
||||
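Note on move_page above: the handler does two small things before and after the API call. It maps several "no parent" spellings to null, and it treats only a result carrying `success === true` as a confirmed move. The sketch below isolates both checks so they can be exercised without a Docmost server. The function names are hypothetical and exist only for illustration.

```javascript
// Hypothetical helpers mirroring the checks in the move_page handler.

// "", "null", null and undefined all mean "move to the space root".
function normalizeParentId(parentPageId) {
  if (parentPageId === undefined || parentPageId === null) return null;
  if (parentPageId === "" || parentPageId === "null") return null;
  return parentPageId;
}

// Positive confirmation: an empty body, a bare string, or an object
// without success === true is NOT reported as a successful move.
function isConfirmedSuccess(result) {
  return Boolean(result) && typeof result === "object" && result.success === true;
}
```

Requiring an explicit `success === true` is stricter than "the request did not throw", which is the point of the comment in the handler: an empty 204-style body should surface as a failure rather than a silent success.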
// Tool: delete_page
|
||||
server.registerTool("delete_page", {
|
||||
description: "Delete a single page by ID.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1),
|
||||
},
|
||||
}, async ({ pageId }) => {
|
||||
await docmostClient.deletePage(pageId);
|
||||
return {
|
||||
content: [
|
||||
{ type: "text", text: `Successfully deleted page ${pageId}` },
|
||||
],
|
||||
};
|
||||
});
|
||||
// --- Comment tools (ported from upstream PR #3 by Max Nikitin) ---
|
||||
// Tool: list_comments
|
||||
server.registerTool("list_comments", {
|
||||
description: "List all comments on a page (paginated). Content is returned as Markdown.",
|
||||
inputSchema: {
|
||||
pageId: z.string().describe("ID of the page"),
|
||||
},
|
||||
}, async ({ pageId }) => {
|
||||
const comments = await docmostClient.listComments(pageId);
|
||||
return jsonContent(comments);
|
||||
});
|
||||
// Tool: create_comment
|
||||
server.registerTool("create_comment", {
|
||||
description: "Create a new comment on a page. Content is provided as Markdown and " +
|
||||
"automatically converted to the required format.",
|
||||
inputSchema: {
|
||||
pageId: z.string().describe("ID of the page to comment on"),
|
||||
content: z.string().min(1).describe("Comment content in Markdown format"),
|
||||
type: z
|
||||
.enum(["page", "inline"])
|
||||
.optional()
|
||||
.describe("Comment type: 'page' for general page comment (default), 'inline' for text selection comment"),
|
||||
selection: z
|
||||
.string()
|
||||
// Enforce the 250-char cap documented in the description below.
|
||||
.max(250)
|
||||
.optional()
|
||||
.describe("For an inline comment, the EXACT text in the page to anchor/highlight the comment on (the first occurrence of this text is wrapped in a comment mark). Max 250 chars. Required when type is 'inline'."),
|
||||
parentCommentId: z
|
||||
.string()
|
||||
.optional()
|
||||
.describe("Parent comment ID to create a reply (max 2 nesting levels)"),
|
||||
},
|
||||
}, async ({ pageId, content, type, selection, parentCommentId }) => {
|
||||
const result = await docmostClient.createComment(pageId, content, type || "page", selection, parentCommentId);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: update_comment
|
||||
server.registerTool("update_comment", {
|
||||
description: "Update an existing comment's content. Only the comment creator can " +
|
||||
"update it. Content is provided as Markdown.",
|
||||
inputSchema: {
|
||||
commentId: z.string().min(1).describe("ID of the comment to update"),
|
||||
content: z
|
||||
.string()
|
||||
.min(1)
|
||||
.describe("New comment content in Markdown format"),
|
||||
},
|
||||
}, async ({ commentId, content }) => {
|
||||
const result = await docmostClient.updateComment(commentId, content);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: delete_comment
|
||||
server.registerTool("delete_comment", {
|
||||
description: "Delete a comment. Only the comment creator or space admin can delete it.",
|
||||
inputSchema: {
|
||||
commentId: z.string().min(1).describe("ID of the comment to delete"),
|
||||
},
|
||||
}, async ({ commentId }) => {
|
||||
await docmostClient.deleteComment(commentId);
|
||||
return {
|
||||
content: [
|
||||
{
|
||||
type: "text",
|
||||
text: `Successfully deleted comment ${commentId}`,
|
||||
},
|
||||
],
|
||||
};
|
||||
});
|
||||
// Tool: check_new_comments
|
||||
server.registerTool("check_new_comments", {
|
||||
description: "Check for new comments across pages in a space since a given timestamp. " +
|
||||
"Optionally scope to a page subtree (folder). Returns only comments " +
|
||||
"created after the specified time.",
|
||||
inputSchema: {
|
||||
spaceId: z.string().describe("Space ID to check for new comments"),
|
||||
since: z
|
||||
.string()
|
||||
.min(1)
|
||||
.describe("ISO 8601 timestamp — only return comments created after this time (e.g. '2026-03-10T00:00:00Z')"),
|
||||
parentPageId: z
|
||||
.string()
|
||||
.optional()
|
||||
.describe("Optional root page ID to scope the check to a subtree (folder). " +
|
||||
"Only pages under this parent will be checked."),
|
||||
},
|
||||
}, async ({ spaceId, since, parentPageId }) => {
|
||||
// Reject an unparseable timestamp up front: otherwise the comparison
|
||||
// against NaN silently treats every comment as "not new" and the tool
|
||||
// returns zero results without signalling the bad input.
|
||||
if (Number.isNaN(Date.parse(since))) {
|
||||
throw new Error(`Invalid 'since' timestamp: ${JSON.stringify(since)} — expected an ISO 8601 date (e.g. '2026-03-10T00:00:00Z')`);
|
||||
}
|
||||
const result = await docmostClient.checkNewComments(spaceId, since, parentPageId);
|
||||
return jsonContent(result);
|
||||
});
|
||||
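Note on the `since` check above: `Date.parse` returning a number only proves the string is parseable by the runtime. For strings outside the ECMAScript date-time format, parsing falls back to implementation-specific rules, so inputs such as "March 10, 2026" may be accepted even though the error message asks for ISO 8601. If stricter validation were wanted, one option is to gate on a pattern first. The sketch below is an assumption, not the package's behaviour: the name `parseSinceStrict` and the choice to require a full date-time with an explicit offset are illustrative.

```javascript
// Hypothetical stricter validator for the `since` argument.
// Accepts only YYYY-MM-DDTHH:mm[:ss[.sss]] followed by Z or a +/-HH:mm offset.
const ISO_8601_RE =
  /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2}(\.\d{1,3})?)?(Z|[+-]\d{2}:\d{2})$/;

function parseSinceStrict(since) {
  const fail = () =>
    new Error(`Invalid 'since' timestamp: ${JSON.stringify(since)}`);
  // Shape check first, so runtime-specific fallback parsing never applies.
  if (typeof since !== "string" || !ISO_8601_RE.test(since)) throw fail();
  // Range check second: the pattern alone would allow month 13 or hour 99.
  const ms = Date.parse(since);
  if (Number.isNaN(ms)) throw fail();
  return ms;
}
```

Whether to tighten the check is a design choice: the lenient form is friendlier to callers that pass a bare date, while the strict form makes the accepted input match the documented contract.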
// Tool: search
|
||||
server.registerTool("search", {
|
||||
description: "Search for pages and content. Results are bounded by `limit` " +
|
||||
"(default applied by the client, max 100).",
|
||||
inputSchema: {
|
||||
query: z.string().min(1).describe("Search query"),
|
||||
limit: z
|
||||
.number()
|
||||
.int()
|
||||
.min(1)
|
||||
.max(100)
|
||||
.optional()
|
||||
.describe("Max results to return (max 100)"),
|
||||
},
|
||||
}, async ({ query, limit }) => {
|
||||
// The tool exposes no spaceId filter, so pass undefined for the client's
|
||||
// optional spaceId parameter and forward limit into its correct slot.
|
||||
const result = await docmostClient.search(query, undefined, limit);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: docmost_transform
|
||||
server.registerTool("docmost_transform", {
|
||||
description: "Edit a page by running an arbitrary JS transform `(doc, ctx) => doc` " +
|
||||
"against its LIVE ProseMirror document, with a diff preview and page " +
|
||||
"history as the safety net. By default dryRun=true: returns a diff " +
|
||||
"preview WITHOUT writing. Set dryRun=false to apply (atomic, won't " +
|
||||
"clobber concurrent edits). `doc` is the lossless ProseMirror document " +
|
||||
"({type:'doc',content:[...]}); return a new doc of the same shape. " +
|
||||
"`ctx` gives you: comments (the page's comments, each {id, content " +
|
||||
"(markdown), selection, type}); log (array; console.log pushes to it); " +
|
||||
"consume(id) (mark a comment id as consumed — those are deleted when " +
|
||||
"deleteComments=true after a successful apply); and helpers: " +
|
||||
"blockText(node) (plain text), walk(node, fn) (depth-first over all " +
|
||||
"nodes incl. callouts/tables/lists), getList(doc, predicate) (find a " +
|
||||
"node even without attrs.id), insertMarkerAfter(doc, anchor, marker, " +
|
||||
"{beforeBlock}) (insert a plain unmarked text run after anchor, " +
|
||||
"mark-safe), setCalloutRange(doc, n) (sync a [1]…[K] callout range to " +
|
||||
"[1]…[n]), noteItem(inlineNodes) (wrap inline nodes in a listItem with a " +
|
||||
"fresh id), mdToInlineNodes(markdown) (comment markdown -> inline nodes), " +
|
||||
"and commentsToFootnotes(doc, comments, {notesHeading}) (turn inline " +
|
||||
"comments into numbered footnotes). Footnote convention: markers are " +
|
||||
"plain '[N]' text in the body; the notes are an orderedList under a " +
|
||||
"heading whose text is 'Примечания переводчика' (Russian for Translator's notes). The transform runs " +
|
||||
"sandboxed (no require/process/fs/network, 5s timeout) and must return a " +
|
||||
"{type:'doc'} node.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1),
|
||||
transformJs: z
|
||||
.string()
|
||||
.min(1)
|
||||
.describe("A JS function `(doc, ctx) => doc` (expression-arrow or " +
|
||||
"parenthesized function). It receives a clone of the live doc and " +
|
||||
"ctx (comments, log, consume(id), helpers: blockText/walk/getList/" +
|
||||
"insertMarkerAfter/setCalloutRange/noteItem/mdToInlineNodes/" +
|
||||
"commentsToFootnotes) and must return a {type:'doc'} node."),
|
||||
dryRun: z
|
||||
.boolean()
|
||||
.optional()
|
||||
.default(true)
|
||||
.describe("Preview only (no write) when true (default)."),
|
||||
deleteComments: z
|
||||
.boolean()
|
||||
.optional()
|
||||
.default(false)
|
||||
.describe("After a successful apply, delete every comment id passed to " +
|
||||
"ctx.consume(id)."),
|
||||
},
|
||||
}, async ({ pageId, transformJs, dryRun, deleteComments }) => {
|
||||
const result = await docmostClient.transformPage(pageId, transformJs, {
|
||||
dryRun,
|
||||
deleteComments,
|
||||
});
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: diff_page_versions
|
||||
server.registerTool("diff_page_versions", {
|
||||
description: "Diff two versions of a page and return a Docmost-equivalent change set " +
|
||||
"(inserted/deleted text, integrity counts for images/links/tables/" +
|
||||
"callouts/footnote markers, and a human-readable markdown summary). " +
|
||||
"`from`/`to` each accept a historyId, or 'current'/omitted for the page's " +
|
||||
"current content (defaults: from=current, to=current — pass a historyId " +
|
||||
"from list_page_history to compare against the live page).",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1),
|
||||
from: z
|
||||
.string()
|
||||
.optional()
|
||||
.describe("historyId, or 'current'/omit for current content"),
|
||||
to: z
|
||||
.string()
|
||||
.optional()
|
||||
.describe("historyId, or 'current'/omit for current content"),
|
||||
},
|
||||
}, async ({ pageId, from, to }) => {
|
||||
const result = await docmostClient.diffPageVersions(pageId, from, to);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: list_page_history
|
||||
server.registerTool("list_page_history", {
|
||||
description: "List a page's saved versions (Docmost auto-snapshots on every save), " +
|
||||
"newest first, cursor-paginated. Returns { items, nextCursor }; each " +
|
||||
"item's id is the historyId to pass to diff_page_versions or " +
|
||||
"restore_page_version.",
|
||||
inputSchema: {
|
||||
pageId: z.string().min(1),
|
||||
cursor: z
|
||||
.string()
|
||||
.optional()
|
||||
.describe("Pagination cursor from a previous nextCursor"),
|
||||
},
|
||||
}, async ({ pageId, cursor }) => {
|
||||
const result = await docmostClient.listPageHistory(pageId, cursor);
|
||||
return jsonContent(result);
|
||||
});
|
||||
// Tool: restore_page_version
|
||||
server.registerTool("restore_page_version", {
|
||||
description: "Restore a page to a saved version: writes that version's content back " +
|
||||
"as the page's current content (Docmost has no restore endpoint, so " +
|
||||
"this creates a NEW history snapshot — the restore is itself revertible). " +
|
||||
"Get the historyId from list_page_history.",
|
||||
inputSchema: {
|
||||
historyId: z.string().min(1),
|
||||
},
|
||||
}, async ({ historyId }) => {
|
||||
const result = await docmostClient.restorePageVersion(historyId);
|
||||
return jsonContent(result);
|
||||
});
|
||||
return server;
|
||||
}
|
||||
74
packages/mcp/build/lib/auth-utils.js
Normal file
@@ -0,0 +1,74 @@
|
||||
import axios from "axios";
|
||||
export async function getCollabToken(baseUrl, apiToken) {
|
||||
try {
|
||||
const response = await axios.post(`${baseUrl}/auth/collab-token`, {}, {
|
||||
headers: {
|
||||
Authorization: `Bearer ${apiToken}`,
|
||||
"Content-Type": "application/json",
|
||||
},
|
||||
});
|
||||
// console.error('Collab Token Response:', response.data);
|
||||
// Response is wrapped in { data: { token: ... } }
|
||||
return response.data.data?.token || response.data.token;
|
||||
}
|
||||
catch (error) {
|
||||
if (axios.isAxiosError(error)) {
|
||||
// Attach the HTTP status to the plain Error so callers (e.g.
|
||||
// getCollabTokenWithReauth) can still detect a 401/403 after the
|
||||
// original AxiosError has been wrapped away.
|
||||
// Avoid leaking the full server response body by default; include only
|
||||
// status + statusText. Append the body only when DEBUG is set.
|
||||
let message = `Failed to get collab token: ${error.response ? `${error.response.status} ${error.response.statusText}` : error.message}`;
|
||||
if (process.env.DEBUG) {
|
||||
message += ` - ${JSON.stringify(error.response?.data)}`;
|
||||
}
|
||||
const err = new Error(message);
|
||||
err.status = error.response?.status;
|
||||
throw err;
|
||||
}
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
export async function performLogin(baseUrl, email, password) {
|
||||
try {
|
||||
const response = await axios.post(`${baseUrl}/auth/login`, {
|
||||
email,
|
||||
password,
|
||||
});
|
||||
// Extract token from Set-Cookie header
|
||||
const cookies = response.headers["set-cookie"];
|
||||
if (!cookies) {
|
||||
throw new Error("No Set-Cookie header found in login response");
|
||||
}
|
||||
// Match the cookie name exactly to avoid matching a future
|
||||
// authTokenRefresh cookie (startsWith would catch it).
|
||||
const authCookie = cookies.find((c) => {
|
||||
const kv = c.split(";")[0];
|
||||
return kv.slice(0, kv.indexOf("=")) === "authToken";
|
||||
});
|
||||
if (!authCookie) {
|
||||
throw new Error("No authToken cookie found in login response");
|
||||
}
|
||||
// Take everything after the FIRST "=" up to the first ";".
|
||||
// Splitting on "=" would truncate base64 values containing "=" padding.
|
||||
const kv = authCookie.split(";")[0];
|
||||
const token = kv.slice(kv.indexOf("=") + 1);
|
||||
return token;
|
||||
}
|
||||
catch (error) {
|
||||
// Avoid leaking the full server response body by default; log only the
|
||||
// HTTP status. Log the verbose body only when DEBUG is set.
|
||||
if (axios.isAxiosError(error)) {
|
||||
if (process.env.DEBUG) {
|
||||
console.error("Login failed:", error.response?.data);
|
||||
}
|
||||
else {
|
||||
console.error("Login failed:", error.response?.status ?? error.message);
|
||||
}
|
||||
}
|
||||
else {
|
||||
console.error("Login failed:", error.message);
|
||||
}
|
||||
throw error;
|
||||
}
|
||||
}
|
||||
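Note on performLogin above: the token is read from the Set-Cookie headers with two rules stated in its comments. The cookie name must match exactly, so a longer name such as authTokenRefresh is not picked up, and the value is everything after the first "=", so base64 padding survives. The sketch below isolates that parsing in a standalone function for testing. The name `extractCookieValue` is hypothetical, and the `.trim()` call is an added assumption that is not in the original code.

```javascript
// Hypothetical standalone version of the cookie parsing in performLogin.
// `setCookieHeaders` is the array of raw Set-Cookie header strings.
function extractCookieValue(setCookieHeaders, name) {
  for (const header of setCookieHeaders ?? []) {
    // Keep only the leading "name=value" pair; attributes follow the first ";".
    const pair = header.split(";")[0].trim();
    const eq = pair.indexOf("=");
    if (eq === -1) continue;
    // Exact name comparison: "authTokenRefresh" must not match "authToken".
    if (pair.slice(0, eq) === name) {
      // Slice after the FIRST "=" so "=" padding in the value is preserved.
      return pair.slice(eq + 1);
    }
  }
  return undefined;
}
```

For example, given the headers `["authTokenRefresh=zzz; Path=/", "authToken=abc==; HttpOnly"]`, looking up `authToken` skips the first entry and yields the padded value from the second.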
553
packages/mcp/build/lib/collaboration.js
Normal file
@@ -0,0 +1,553 @@
|
||||
import { HocuspocusProvider } from "@hocuspocus/provider";
|
||||
import { TiptapTransformer } from "@hocuspocus/transformer";
|
||||
import * as Y from "yjs";
|
||||
import WebSocket from "ws";
|
||||
import { marked } from "marked";
|
||||
import { generateJSON } from "@tiptap/html";
|
||||
import { JSDOM } from "jsdom";
|
||||
import { docmostExtensions } from "./docmost-schema.js";
|
||||
import { withPageLock } from "./page-lock.js";
|
||||
import { sanitizeForYjs, findUnstorableAttr } from "./node-ops.js";
|
||||
// Setup DOM environment for Tiptap HTML parsing in Node.js
|
||||
const dom = new JSDOM("<!DOCTYPE html><html><body></body></html>");
|
||||
global.window = dom.window;
|
||||
global.document = dom.window.document;
|
||||
// @ts-ignore
|
||||
global.Element = dom.window.Element;
|
||||
// @ts-ignore
|
||||
global.WebSocket = WebSocket;
|
||||
// Navigator is read-only in newer Node versions and already exists
|
||||
// global.navigator = dom.window.navigator;
|
||||
/**
|
||||
* Hard ceiling above which we skip callout preprocessing entirely. The linear
|
||||
* scanner below has no quadratic blow-up, but we still cap input defensively so
|
||||
* a pathological multi-megabyte payload cannot tie up the event loop; in that
|
||||
* case the markdown is passed through verbatim (callouts are simply not
|
||||
* detected) rather than risking a slow scan.
|
||||
*/
|
||||
const MAX_CALLOUT_PREPROCESS_BYTES = 4 * 1024 * 1024; // 4 MiB; compared against string length (UTF-16 code units), so only approximately bytes
|
||||
/** Matches an opening callout fence: `:::type` (type captured, lower-cased). */
|
||||
const CALLOUT_OPEN_RE = /^:::\s*(\w+)\s*$/;
|
||||
/** Matches a bare closing callout fence: `:::`. */
|
||||
const CALLOUT_CLOSE_RE = /^:::\s*$/;
|
||||
/** Matches the start/end of a code fence (``` or ~~~), capturing the marker. */
|
||||
const CODE_FENCE_RE = /^(\s*)(`{3,}|~{3,})/;
|
||||
/**
|
||||
* Pre-process Docmost-flavoured markdown: convert `:::type ... :::`
|
||||
* callout blocks (the syntax our markdown export produces) into HTML
|
||||
* divs that the callout extension parses. The inner content is rendered
|
||||
* through marked as regular markdown.
|
||||
*
|
||||
* Implemented as a single linear pass over the lines (no quadratic regex
|
||||
* rescan). It:
|
||||
* - tracks fenced code regions (```...``` and ~~~...~~~) and never treats a
|
||||
* `:::` line that lives inside a code fence as a callout delimiter, so a
|
||||
* callout body that itself contains a fenced code block with a `:::` line is
|
||||
* no longer corrupted;
|
||||
* - matches an opening `:::type` line with the next CLOSING `:::` at the SAME
|
||||
* nesting level, supporting NESTED callouts via a depth counter (an inner
|
||||
* `:::type` opens a deeper level and consumes a matching `:::`);
|
||||
* - emits the same `<div data-type="callout" data-callout-type="TYPE">` output
|
||||
* (inner rendered through marked) as the previous regex implementation.
|
||||
*/
|
||||
async function preprocessCallouts(markdown) {
|
||||
// Defensive cap: skip preprocessing for pathologically large inputs.
|
||||
if (markdown.length > MAX_CALLOUT_PREPROCESS_BYTES) {
|
||||
return markdown;
|
||||
}
|
||||
// Recursively transform a slice of lines, converting top-level callouts in
|
||||
// that slice into <div> blocks and rendering their inner content (which may
|
||||
// itself contain nested callouts) through this same function.
|
||||
const transform = async (lines) => {
|
||||
const out = [];
|
||||
let inCodeFence = false;
|
||||
let codeFenceMarker = ""; // the exact run of backticks/tildes that opened it
|
||||
let i = 0;
|
||||
while (i < lines.length) {
|
||||
const line = lines[i];
|
||||
// Inside a code fence, only its matching closing fence is significant;
|
||||
// everything else (including `:::` lines) is copied through verbatim.
|
||||
if (inCodeFence) {
|
||||
out.push(line);
|
||||
const fence = line.match(CODE_FENCE_RE);
|
||||
if (fence && fence[2].startsWith(codeFenceMarker[0]) &&
|
||||
fence[2].length >= codeFenceMarker.length) {
|
||||
inCodeFence = false;
|
||||
codeFenceMarker = "";
|
||||
}
|
||||
i++;
|
||||
continue;
|
||||
}
|
||||
// A code fence opening outside any callout body: enter code-fence mode.
|
||||
const fenceOpen = line.match(CODE_FENCE_RE);
|
||||
if (fenceOpen) {
|
||||
inCodeFence = true;
|
||||
codeFenceMarker = fenceOpen[2];
|
||||
out.push(line);
|
||||
i++;
|
||||
continue;
|
||||
}
|
||||
// An opening callout fence: scan forward (with code-fence and nested
|
||||
// callout awareness) for its matching closing `:::` at the same level.
|
||||
const open = line.match(CALLOUT_OPEN_RE);
|
||||
if (open) {
|
||||
const type = open[1].toLowerCase();
|
||||
const bodyLines = [];
|
||||
let depth = 1;
|
||||
let innerInCodeFence = false;
|
||||
let innerCodeFenceMarker = "";
|
||||
let j = i + 1;
|
||||
for (; j < lines.length; j++) {
|
||||
const bl = lines[j];
|
||||
if (innerInCodeFence) {
|
||||
const f = bl.match(CODE_FENCE_RE);
|
||||
if (f && f[2].startsWith(innerCodeFenceMarker[0]) &&
|
||||
f[2].length >= innerCodeFenceMarker.length) {
|
||||
innerInCodeFence = false;
|
||||
innerCodeFenceMarker = "";
|
||||
}
|
||||
bodyLines.push(bl);
|
||||
continue;
|
||||
}
|
||||
const innerFence = bl.match(CODE_FENCE_RE);
|
||||
if (innerFence) {
|
||||
innerInCodeFence = true;
|
||||
innerCodeFenceMarker = innerFence[2];
|
||||
bodyLines.push(bl);
|
||||
continue;
|
||||
}
|
||||
if (CALLOUT_OPEN_RE.test(bl)) {
|
||||
depth++;
|
||||
bodyLines.push(bl);
|
||||
continue;
|
||||
}
|
||||
if (CALLOUT_CLOSE_RE.test(bl)) {
|
||||
depth--;
|
||||
if (depth === 0)
|
||||
break; // matching close for THIS callout
|
||||
bodyLines.push(bl);
|
||||
continue;
|
||||
}
|
||||
bodyLines.push(bl);
|
||||
}
|
||||
if (j < lines.length) {
|
||||
// Found the matching closing fence: render the body (recursively, so
|
||||
// nested callouts are handled) and emit the callout div.
|
||||
const inner = await transform(bodyLines);
|
||||
const renderedInner = await marked.parse(inner);
|
||||
out.push(`\n<div data-type="callout" data-callout-type="${type}">${renderedInner}</div>\n`);
|
||||
i = j + 1; // skip past the closing `:::`
|
||||
continue;
|
||||
}
|
||||
// No matching close (unterminated callout): treat the opener as a
|
||||
// literal line and continue, preserving the original text.
|
||||
out.push(line);
|
||||
i++;
|
||||
continue;
|
||||
}
|
||||
out.push(line);
|
||||
i++;
|
||||
}
|
||||
return out.join("\n");
|
||||
};
|
||||
return transform(markdown.split("\n"));
|
||||
}
|
||||
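The code-fence tracking used by the scanner above can be sketched in isolation. The example below reuses the three regular expressions from this file but is otherwise a simplified, hypothetical helper (`findCalloutDelimiters`): it only lists the `:::` lines that would count as callout delimiters and does not render anything.

```javascript
const CALLOUT_OPEN_RE = /^:::\s*(\w+)\s*$/;
const CALLOUT_CLOSE_RE = /^:::\s*$/;
const CODE_FENCE_RE = /^(\s*)(`{3,}|~{3,})/;

// Hypothetical helper: report callout delimiter lines, ignoring any `:::`
// line that sits inside a fenced code block.
function findCalloutDelimiters(markdown) {
  const found = [];
  let marker = ""; // non-empty while we are inside a code fence
  markdown.split("\n").forEach((line, index) => {
    const fence = line.match(CODE_FENCE_RE);
    if (marker) {
      // Only a fence of the same character and at least the same length closes.
      if (fence && fence[2][0] === marker[0] && fence[2].length >= marker.length) {
        marker = "";
      }
      return;
    }
    if (fence) {
      marker = fence[2];
      return;
    }
    const open = line.match(CALLOUT_OPEN_RE);
    if (open) {
      found.push({ index, kind: "open", type: open[1].toLowerCase() });
    } else if (CALLOUT_CLOSE_RE.test(line)) {
      found.push({ index, kind: "close" });
    }
  });
  return found;
}

const FENCE = "`".repeat(3);
const calloutSample = [":::info", FENCE, ":::", FENCE, ":::"].join("\n");
console.log(findCalloutDelimiters(calloutSample));
// Reports the opener on line 0 and the closer on line 4; the `:::` on line 2
// is inside the code fence and is skipped.
```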
/**
|
||||
* Bridge marked's checkbox lists to TipTap task lists.
|
||||
*
|
||||
* marked renders GitHub task list items (`- [x] done`) as a plain
|
||||
* `<ul><li><p><input type="checkbox" checked> text</p></li></ul>` WITHOUT the
|
||||
* markup TipTap's TaskList/TaskItem extensions parse. This rewrites such lists
|
||||
* into the shape those extensions expect:
|
||||
* TaskList parseHTML matches `ul[data-type="taskList"]`,
|
||||
* TaskItem matches `li[data-type="taskItem"]`,
|
||||
* the checked state is read from `data-checked === "true"`.
|
||||
*
|
||||
* A list is only converted when it has at least one `<li>` and EVERY direct
|
||||
* `<li>` contains a checkbox input. Both `<ul>` and `<ol>` are considered: a
|
||||
* numbered checklist (`1. [x] a`, which marked renders as an `<ol>` of checkbox
|
||||
* `<li>`s) would otherwise lose its task state. TipTap task lists are unordered,
|
||||
* so a matching `<ol>` is emitted as `data-type="taskList"` exactly like a
|
||||
* `<ul>`. Mixed or ordinary lists (including ordinary `<ol>` lists) are left
|
||||
* untouched so they keep rendering as bullet/numbered lists. The marked `<p>`
|
||||
* wrapper is kept inside the `<li>` because TaskItem content allows paragraphs.
|
||||
*/
|
||||
function bridgeTaskLists(html) {
|
||||
// Cheap early-out: if the markup contains no checkbox input at all there is
|
||||
// nothing to bridge, so skip the expensive JSDOM parse entirely. This is the
|
||||
// common case (most pages have no task lists).
|
||||
if (!/type=["']?checkbox/i.test(html)) {
|
||||
return html;
|
||||
}
|
||||
// Defensive cap (consistent with preprocessCallouts): skip the bridge for
|
||||
// pathologically large inputs rather than running a second expensive JSDOM
|
||||
// parse on a multi-megabyte payload. The markup is passed through verbatim.
|
||||
if (html.length > MAX_CALLOUT_PREPROCESS_BYTES) {
|
||||
return html;
|
||||
}
|
||||
const dom = new JSDOM(html);
|
||||
const document = dom.window.document;
|
||||
// Collect the checkbox(es) that belong to THIS <li> directly: either direct
|
||||
// child <input type="checkbox"> elements or ones inside the <li>'s direct <p>
|
||||
// child (the shape marked emits: `<li><p><input type="checkbox"> text</p></li>`).
|
||||
// Checkboxes nested deeper (e.g. inside a child <ul>/<ol>) are excluded so a
|
||||
// bullet <li> that merely contains a nested task sublist is not misdetected.
|
||||
// Raw inline HTML can put more than one checkbox in a single <li>; we gather
|
||||
// ALL of them so none survive into the converted item.
|
||||
const directCheckboxes = (li) => {
|
||||
const found = [];
|
||||
for (const child of Array.from(li.children)) {
|
||||
if (child.tagName === "INPUT" &&
|
||||
child.getAttribute("type") === "checkbox") {
|
||||
found.push(child);
|
||||
continue;
|
||||
}
|
||||
if (child.tagName === "P") {
|
||||
for (const inp of Array.from(child.querySelectorAll(":scope > input[type='checkbox']"))) {
|
||||
found.push(inp);
|
||||
}
|
||||
}
|
||||
}
|
||||
return found;
|
||||
};
|
||||
// Both <ul> and <ol> are candidates: an <ol> whose every direct <li> carries
|
||||
// its own checkbox is a numbered checklist that must also become a taskList.
|
||||
const lists = Array.from(document.querySelectorAll("ul, ol"));
|
||||
for (const list of lists) {
|
||||
// Only consider DIRECT child <li> elements; nested lists are handled by
|
||||
// their own iteration of the outer loop.
|
||||
const items = Array.from(list.children).filter((child) => child.tagName === "LI");
|
||||
if (items.length === 0)
|
||||
continue;
|
||||
const itemCheckboxes = items.map((li) => directCheckboxes(li));
|
||||
// Convert only when every direct <li> carries at least one OWN checkbox.
|
||||
if (!itemCheckboxes.every((boxes) => boxes.length > 0))
|
||||
continue;
|
||||
// A numbered checklist arrives as an <ol>. We must NOT leave the tag as
|
||||
// <ol> while tagging it data-type="taskList": generateJSON would then match
|
||||
// BOTH the orderedList rule (tag ol) and the taskList rule (data-type),
|
||||
// emitting a phantom empty orderedList beside the real taskList. So rename a
|
||||
// qualifying <ol> to a <ul> — move its <li> children over and replace it —
|
||||
// leaving only the taskList rule to match. Already-<ul> lists are unchanged.
|
||||
let target = list;
|
||||
if (list.tagName === "OL") {
|
||||
const ul = document.createElement("ul");
|
||||
// Carry over existing attributes (e.g. class) so nothing is silently lost.
|
||||
for (const attr of Array.from(list.attributes)) {
|
||||
ul.setAttribute(attr.name, attr.value);
|
||||
}
|
||||
// Move every child node (including the <li>s we collected) into the <ul>.
|
||||
while (list.firstChild) {
|
||||
ul.appendChild(list.firstChild);
|
||||
}
|
||||
list.replaceWith(ul);
|
||||
target = ul;
|
||||
}
|
||||
target.setAttribute("data-type", "taskList");
|
||||
items.forEach((li, index) => {
|
||||
const boxes = itemCheckboxes[index];
|
||||
// The first checkbox determines the checked state (matches the previous
|
||||
// single-checkbox behaviour); any extras only need removing.
|
||||
const input = boxes[0] ?? null;
|
||||
li.setAttribute("data-type", "taskItem");
|
||||
const checked = input != null &&
|
||||
(input.hasAttribute("checked") || input.checked);
|
||||
li.setAttribute("data-checked", checked ? "true" : "false");
|
||||
// Remove ALL direct checkbox inputs so none survive into the content
|
||||
// (a raw-inline-HTML <li> may carry more than one).
|
||||
for (const box of boxes) {
|
||||
box.remove();
|
||||
}
|
||||
});
|
||||
}
|
||||
return document.body.innerHTML;
|
||||
}
|
||||
/** Convert markdown to a ProseMirror doc using the full Docmost schema. */
|
||||
export async function markdownToProseMirror(markdownContent) {
|
||||
const withCallouts = await preprocessCallouts(markdownContent);
|
||||
const html = await marked.parse(withCallouts);
|
||||
const bridged = bridgeTaskLists(html);
|
||||
return generateJSON(bridged, docmostExtensions);
|
||||
}
|
||||
/**
|
||||
* Build the collaboration WebSocket URL from an API base URL:
|
||||
* switch http(s)->ws(s), strip a trailing /api, mount on /collab.
|
||||
* Shared by the live read and the mutate path so both target the same socket.
|
||||
*/
|
||||
export function buildCollabWsUrl(baseUrl) {
|
||||
let wsUrl = baseUrl.replace(/^http/, "ws");
|
||||
try {
|
||||
const urlObj = new URL(wsUrl);
|
||||
if (urlObj.pathname.endsWith("/api") || urlObj.pathname.endsWith("/api/")) {
|
||||
urlObj.pathname = urlObj.pathname.replace(/\/api\/?$/, "");
|
||||
}
|
||||
urlObj.pathname = urlObj.pathname.replace(/\/$/, "") + "/collab";
|
||||
// Drop any query/hash from the base URL so it is not carried into the
|
||||
// collaboration ws URL.
|
||||
urlObj.search = "";
|
||||
urlObj.hash = "";
|
||||
wsUrl = urlObj.toString();
|
||||
}
|
||||
catch (e) {
|
||||
// Fallback if URL parsing fails
|
||||
if (!wsUrl.endsWith("/collab")) {
|
||||
wsUrl = wsUrl.replace(/\/$/, "") + "/collab";
|
||||
}
|
||||
}
|
||||
return wsUrl;
|
||||
}
|
||||
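As an illustration of the mapping performed by buildCollabWsUrl, here is a condensed sketch of the happy path (without the parse-failure fallback). `toCollabUrl` is a hypothetical name and the hosts are placeholders.

```javascript
// Condensed sketch of the URL mapping: http(s) -> ws(s), drop a trailing
// /api, mount on /collab, and discard any query string or hash.
function toCollabUrl(baseUrl) {
  const url = new URL(baseUrl.replace(/^http/, "ws"));
  url.pathname =
    url.pathname.replace(/\/api\/?$/, "").replace(/\/$/, "") + "/collab";
  url.search = "";
  url.hash = "";
  return url.toString();
}

console.log(toCollabUrl("https://docs.example.com/api"));
// wss://docs.example.com/collab
console.log(toCollabUrl("http://localhost:3000/api/?x=1"));
// ws://localhost:3000/collab
console.log(toCollabUrl("https://example.com/wiki/api"));
// wss://example.com/wiki/collab
```

Replacing only the leading `http` keeps the trailing `s` of `https`, which is what turns it into `wss`.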
/**
|
||||
* Encode a ProseMirror doc to a Yjs document, sanitizing it first and turning
|
||||
* the opaque yjs "Unexpected content type" failure into a descriptive error.
|
||||
*
|
||||
* `sanitizeForYjs` strips `undefined` node/mark attributes (the common cause of
|
||||
* the failure); if `toYdoc` still throws, `findUnstorableAttr` is used to point
|
||||
* at the offending attribute path.
|
||||
*/
|
||||
export function buildYDoc(doc) {
|
||||
const safe = sanitizeForYjs(doc);
|
||||
try {
|
||||
return TiptapTransformer.toYdoc(safe, "default", docmostExtensions);
|
||||
}
|
||||
catch (e) {
|
||||
const bad = findUnstorableAttr(safe);
|
||||
throw new Error(`Failed to encode document to Yjs (toYdoc): ${e instanceof Error ? e.message : String(e)}.${bad ? ` Offending attribute: ${bad}.` : " A node/mark attribute likely holds a value Yjs cannot store (e.g. undefined)."}`);
|
||||
}
|
||||
}
|
||||
/**
|
||||
* Validate that a doc is Yjs-encodable by building (and discarding) a Y.Doc.
|
||||
* Throws the same descriptive error as the apply path when it is not. Used by
|
||||
* the dry-run preview so it fails identically to apply.
|
||||
*/
|
||||
export function assertYjsEncodable(doc) {
|
||||
buildYDoc(doc);
|
||||
}
|
||||
/** Time we wait for the initial handshake/sync before giving up. */
|
||||
const CONNECT_TIMEOUT_MS = 25000;
|
||||
/** Time we wait for the server to acknowledge our write before giving up. */
|
||||
const PERSIST_TIMEOUT_MS = 20000;
|
||||
/**
|
||||
* Safely mutate the live content of a page over the collaboration websocket.
|
||||
*
|
||||
* This is the single safe write path for every MCP content mutation. It:
|
||||
* 1. serializes per-page writes through withPageLock (no two MCP writes on
|
||||
* the same page overlap);
|
||||
* 2. connects to Hocuspocus and waits for the initial sync so the local ydoc
|
||||
* mirrors the authoritative server doc — INCLUDING edits/comments/images
|
||||
* that are not yet in the debounced REST snapshot;
|
||||
* 3. inside onSynced, SYNCHRONOUSLY reads the live doc, runs `transform`, and
|
||||
* writes the result back — with no `await` between read and write so no
|
||||
* remote update can interleave and clobber concurrent human edits;
|
||||
* 4. waits for the server to acknowledge the write (unsyncedChanges -> 0)
|
||||
* before resolving, so the next operation observes our change.
|
||||
*
|
||||
* `transform` receives the live ProseMirror doc and returns the NEW full
|
||||
* ProseMirror doc to write, or `null` to abort with no write (a no-op). If
|
||||
* `transform` throws, the error is propagated to the caller (not swallowed).
|
||||
*
|
||||
* Returns the doc that was written, or the live doc when the transform aborted.
|
||||
*/
|
||||
export async function mutatePageContent(pageId, collabToken, baseUrl, transform) {
|
||||
return withPageLock(pageId, () => {
|
||||
if (process.env.DEBUG) {
|
||||
console.error(`Starting realtime content mutate for page ${pageId}`);
|
||||
// Token prefix is sensitive; only log it under DEBUG.
|
||||
console.error(`Token prefix: ${collabToken ? collabToken.substring(0, 5) : "NONE"}...`);
|
||||
}
|
||||
const ydoc = new Y.Doc();
|
||||
const wsUrl = buildCollabWsUrl(baseUrl);
|
||||
if (process.env.DEBUG)
|
||||
console.error(`Connecting to WebSocket: ${wsUrl}`);
|
||||
return new Promise((resolve, reject) => {
|
||||
let provider;
|
||||
let applied = false; // onSynced may fire again on reconnect — apply once.
|
||||
let settled = false;
|
||||
// Set true on disconnect/close so a reconnect-driven unsyncedChanges->0
|
||||
// cannot be mistaken for a successful persist of our write.
|
||||
let connectionLost = false;
|
||||
let connectTimer;
|
||||
let persistTimer;
|
||||
let unsyncedHandler;
|
||||
const cleanup = () => {
|
||||
if (connectTimer)
|
||||
clearTimeout(connectTimer);
|
||||
if (persistTimer)
|
||||
clearTimeout(persistTimer);
|
||||
if (provider) {
|
||||
if (unsyncedHandler) {
|
||||
try {
|
||||
provider.off("unsyncedChanges", unsyncedHandler);
|
||||
}
|
||||
catch (err) { }
|
||||
}
|
||||
try {
|
||||
provider.destroy();
|
||||
}
|
||||
catch (err) { }
|
||||
}
|
||||
};
|
||||
const finish = (err, value) => {
|
||||
if (settled)
|
||||
return;
|
||||
settled = true;
|
||||
cleanup();
|
||||
if (err)
|
||||
reject(err);
|
||||
else
|
||||
resolve(value);
|
||||
};
|
||||
connectTimer = setTimeout(() => {
|
||||
finish(new Error("Connection timeout to collaboration server"));
|
||||
}, CONNECT_TIMEOUT_MS);
|
||||
// Resolve once the server has acknowledged our update. The provider
|
||||
// increments unsyncedChanges when our local update is sent and
|
||||
// decrements it when the server replies with a SyncStatus(applied=true);
|
||||
// reaching 0 means the authoritative in-memory ydoc on the server now
|
||||
// contains our write.
|
||||
const waitForPersistence = () => {
|
||||
if (settled)
|
||||
return;
|
||||
// A missing provider is a failure, not a success: without it the write
|
||||
// can never have been acknowledged. Only an actual unsyncedChanges===0
|
||||
// on a live provider counts as persisted.
|
||||
if (!provider) {
|
||||
finish(new Error("collab provider gone before persistence"));
|
||||
return;
|
||||
}
|
||||
if (provider.unsyncedChanges === 0) {
|
||||
finish(null, lastWrittenDoc);
|
||||
return;
|
||||
}
|
||||
persistTimer = setTimeout(() => {
|
||||
finish(new Error("Timeout waiting for collaboration server to persist the update"));
|
||||
}, PERSIST_TIMEOUT_MS);
|
||||
unsyncedHandler = (data) => {
|
||||
// Only treat unsyncedChanges->0 as success when the connection is
|
||||
// still up. A transient disconnect + reconnect handshake can drive
|
||||
// the counter back to 0 without our write being re-transmitted; in
|
||||
// that case let the disconnect/close error win instead.
|
||||
if (data.number === 0 && !connectionLost) {
|
||||
finish(null, lastWrittenDoc);
|
||||
}
|
||||
};
|
||||
provider.on("unsyncedChanges", unsyncedHandler);
|
||||
};
|
||||
let lastWrittenDoc;
|
||||
provider = new HocuspocusProvider({
|
||||
url: wsUrl,
|
||||
name: `page.${pageId}`,
|
||||
document: ydoc,
|
||||
token: collabToken,
|
||||
// @ts-ignore - Required for Node.js environment
|
||||
WebSocketPolyfill: WebSocket,
|
||||
onConnect: () => {
|
||||
if (process.env.DEBUG)
|
||||
console.error("WS Connect");
|
||||
},
|
||||
// An unexpected disconnect/close while we are still waiting (during the
|
||||
// connect-wait before onSynced, or during the persistence wait after the
|
||||
// write) means the update will never be acknowledged — surface it now
|
||||
// instead of hanging until the connect/persist timeout fires. `finish`
|
||||
// is idempotent via the `settled` flag, so the onClose that our own
|
||||
// cleanup()->provider.destroy() triggers (after settled=true is set) is
|
||||
// a harmless no-op and cannot cause a double-resolve.
|
||||
onDisconnect: () => {
|
||||
if (process.env.DEBUG)
|
||||
console.error("WS Disconnect");
|
||||
// Mark BEFORE finish so the unsyncedChanges handler (if it races)
|
||||
// sees the connection as lost and won't report a false success.
|
||||
connectionLost = true;
|
||||
finish(new Error("Collaboration connection closed before the update was persisted/synced"));
|
||||
},
|
||||
onClose: () => {
|
||||
if (process.env.DEBUG)
|
||||
console.error("WS Close");
|
||||
// Mark BEFORE finish so the unsyncedChanges handler (if it races)
|
||||
// sees the connection as lost and won't report a false success.
|
||||
connectionLost = true;
|
||||
finish(new Error("Collaboration connection closed before the update was persisted/synced"));
|
||||
},
|
||||
onSynced: () => {
|
||||
if (applied || settled)
|
||||
return;
|
||||
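applied = true;
// The handshake/sync has completed, so the connect timeout no longer applies;
// from here on only the persist timeout should be able to fire.
if (connectTimer)
clearTimeout(connectTimer);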
|
||||
if (process.env.DEBUG)
|
||||
console.error("Connected and synced!");
|
||||
// CRITICAL: everything between reading the live doc and writing it
|
||||
// back must stay synchronous (no await). While the JS event loop is
|
||||
// not yielded, no incoming remote update can interleave, so any
|
||||
// already-synced concurrent edits are preserved in liveDoc.
|
||||
let newDoc;
|
||||
try {
|
||||
let liveDoc = TiptapTransformer.fromYdoc(ydoc, "default");
|
||||
if (!liveDoc ||
|
||||
typeof liveDoc !== "object" ||
|
||||
!Array.isArray(liveDoc.content)) {
|
||||
liveDoc = { type: "doc", content: [] };
|
||||
}
|
||||
newDoc = transform(liveDoc);
|
||||
if (newDoc == null) {
|
||||
// Transform aborted — write nothing, return the live doc.
|
||||
lastWrittenDoc = liveDoc;
|
||||
finish(null, liveDoc);
|
||||
return;
|
||||
}
|
||||
const tempDoc = buildYDoc(newDoc);
|
||||
// Fetch the fragment immediately before the transact that mutates
|
||||
// it, rather than reusing a handle grabbed across the transform.
|
||||
const fragment = ydoc.getXmlFragment("default");
|
||||
ydoc.transact(() => {
|
||||
if (fragment.length > 0) {
|
||||
fragment.delete(0, fragment.length);
|
||||
}
|
||||
Y.applyUpdate(ydoc, Y.encodeStateAsUpdate(tempDoc));
|
||||
});
|
||||
}
|
||||
catch (e) {
|
||||
// Includes errors thrown by transform (e.g. "afterText not found",
|
||||
// "text not found"): propagate them verbatim to the caller.
|
||||
finish(e instanceof Error ? e : new Error(String(e)));
|
||||
return;
|
||||
}
|
||||
lastWrittenDoc = newDoc;
|
||||
if (process.env.DEBUG)
|
||||
console.error("Content written, waiting for server to persist...");
|
||||
waitForPersistence();
|
||||
},
|
||||
onAuthenticationFailed: () => {
|
||||
finish(new Error("Authentication failed for collaboration connection"));
|
||||
},
|
||||
});
|
||||
});
|
||||
});
|
||||
}
|
||||
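The `finish`/`settled` guard in mutatePageContent is a general settle-once pattern: several event sources (timeout, close, sync) may try to resolve or reject, and only the first one wins. A minimal standalone sketch follows; `createSettler` and `withTimeout` are hypothetical names, not part of this module.

```javascript
// Hypothetical helper: wrap resolve/reject so only the first call has any
// effect. Returns true when the call settled the promise, false otherwise.
function createSettler(resolve, reject, cleanup) {
  let settled = false;
  return (err, value) => {
    if (settled) return false; // later callers are harmless no-ops
    settled = true;
    cleanup();
    if (err) reject(err);
    else resolve(value);
    return true;
  };
}

// Hypothetical usage: race an operation against a timeout.
function withTimeout(start, timeoutMs) {
  return new Promise((resolve, reject) => {
    let timer;
    const finish = createSettler(resolve, reject, () => clearTimeout(timer));
    timer = setTimeout(() => finish(new Error("timeout")), timeoutMs);
    start(finish);
  });
}

withTimeout((finish) => {
  finish(null, "written");
  finish(new Error("late close event")); // ignored: already settled
}, 1000).then((value) => console.log(value)); // written
```

Running cleanup inside the guard is what makes it safe for the cleanup itself to trigger further events, as the destroy-then-onClose sequence described above does.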
/**
|
||||
* Replace the live content of a page over the collaboration websocket.
|
||||
* Accepts a ready ProseMirror JSON document; the caller controls whether
|
||||
* it was produced from markdown (ids regenerate) or edited in place
|
||||
* (existing block ids preserved).
|
||||
*
|
||||
* This is an intentional full replace (used by update_page / update_page_json),
|
||||
* but now runs under the per-page lock and waits for server persistence via
|
||||
* mutatePageContent.
|
||||
*/
|
||||
export async function replacePageContent(pageId, prosemirrorDoc, collabToken, baseUrl) {
|
||||
// Fail fast on a bad document instead of deferring the failure into the
|
||||
// collaboration write (where TiptapTransformer.toYdoc(undefined) used to
|
||||
// throw). The transform must return a valid ProseMirror doc.
|
||||
if (prosemirrorDoc == null ||
|
||||
typeof prosemirrorDoc !== "object" ||
|
||||
prosemirrorDoc.type !== "doc") {
|
||||
throw new Error("replacePageContent: invalid ProseMirror document");
|
||||
}
|
||||
await mutatePageContent(pageId, collabToken, baseUrl, () => prosemirrorDoc);
|
||||
}
|
||||
/**
|
||||
* Markdown update path (kept for backwards compatibility).
|
||||
* NOTE: this re-imports the whole document — block ids are regenerated.
|
||||
* Tables and `:::type ... :::` callout blocks survive thanks to the full schema.
|
||||
*/
|
||||
export async function updatePageContentRealtime(pageId, markdownContent, collabToken, baseUrl) {
|
||||
const tiptapJson = await markdownToProseMirror(markdownContent);
|
||||
await mutatePageContent(pageId, collabToken, baseUrl, () => tiptapJson);
|
||||
}
|
||||
273
packages/mcp/build/lib/diff.js
Normal file
@@ -0,0 +1,273 @@
|
||||
/**
|
||||
* Headless, Docmost-equivalent document diff.
|
||||
*
|
||||
* Docmost's history editor computes a change set with the exact pipeline below
|
||||
* (recreateTransform -> ChangeSet.addSteps -> simplifyChanges) and renders it as
|
||||
* editor decorations. This module runs the SAME computation but serializes the
|
||||
* result to text + integrity counts instead of decorations, so a diff can be
|
||||
* previewed without a browser.
|
||||
*
|
||||
* recreateTransform here comes from @fellow/prosemirror-recreate-transform, the
|
||||
* maintained published fork of the MIT prosemirror-recreate-steps source that
|
||||
* Docmost vendors in @docmost/editor-ext; it exposes the identical
|
||||
* recreateTransform(fromDoc, toDoc, { complexSteps, wordDiffs, simplifyDiff })
|
||||
* signature.
|
||||
*
|
||||
* If recreateTransform / the changeset throws on a pathological document pair,
|
||||
* we fall back to a coarse block-level text diff so the tool never hard-fails.
|
||||
*/
|
||||
import { getSchema } from "@tiptap/core";
|
||||
import { Node } from "@tiptap/pm/model";
|
||||
import { ChangeSet, simplifyChanges } from "@tiptap/pm/changeset";
|
||||
import { recreateTransform } from "@fellow/prosemirror-recreate-transform";
|
||||
import { docmostExtensions } from "./docmost-schema.js";
|
||||
/** Build the schema once; it is pure and reused across calls. */
|
||||
const schema = getSchema(docmostExtensions);
|
||||
/** Recursively concatenate the plain text of a JSON node. */
|
||||
function plainText(node) {
|
||||
if (!node || typeof node !== "object")
|
||||
return "";
|
||||
let out = "";
|
||||
if (typeof node.text === "string")
|
||||
out += node.text;
|
||||
if (Array.isArray(node.content)) {
|
||||
for (const child of node.content)
|
||||
out += plainText(child);
|
||||
}
|
||||
return out;
|
||||
}
|
||||
/** Count nodes in a JSON doc that satisfy `pred` (recursive). */
|
||||
function countNodes(doc, pred) {
|
||||
let n = 0;
|
||||
const visit = (node) => {
|
||||
if (!node || typeof node !== "object")
|
||||
return;
|
||||
if (pred(node))
|
||||
n++;
|
||||
if (Array.isArray(node.content))
|
||||
for (const c of node.content)
|
||||
visit(c);
|
||||
};
|
||||
visit(doc);
|
||||
return n;
|
||||
}
|
||||
/**
|
||||
* Count UNIQUE links in a JSON doc by their `href`. A single link can be split
|
||||
* across several adjacent text runs (e.g. a "link+bold" run followed by a "link"
|
||||
* run); counting link-bearing runs would over-count it. Walking the tree and
|
||||
* collecting hrefs into a Set keys each distinct link once. Link marks with a
|
||||
* missing/empty href are bucketed under a single "" key so a malformed link is
|
||||
* still counted as one.
|
||||
*/
|
||||
function countUniqueLinks(doc) {
|
||||
const hrefs = new Set();
|
||||
const visit = (node) => {
|
||||
if (!node || typeof node !== "object")
|
||||
return;
|
||||
if (node.type === "text" && Array.isArray(node.marks)) {
|
||||
for (const m of node.marks) {
|
||||
if (m && m.type === "link") {
|
||||
const href = m.attrs && typeof m.attrs.href === "string" ? m.attrs.href : "";
|
||||
hrefs.add(href);
|
||||
}
|
||||
}
|
||||
}
|
||||
if (Array.isArray(node.content))
|
||||
for (const c of node.content)
|
||||
visit(c);
|
||||
};
|
||||
visit(doc);
|
||||
return hrefs.size;
|
||||
}
|
||||
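To make the over-counting problem concrete, the sketch below restates the walk compactly (under the hypothetical name `uniqueLinkCount`) and runs it on a made-up document in which one link is split into a bold run and a plain run.

```javascript
// Compact restatement of the href-collecting walk so the example is standalone.
function uniqueLinkCount(node, hrefs = new Set()) {
  if (!node || typeof node !== "object") return hrefs.size;
  if (node.type === "text" && Array.isArray(node.marks)) {
    for (const mark of node.marks) {
      if (mark && mark.type === "link") {
        const href = mark.attrs && typeof mark.attrs.href === "string" ? mark.attrs.href : "";
        hrefs.add(href);
      }
    }
  }
  if (Array.isArray(node.content)) {
    for (const child of node.content) uniqueLinkCount(child, hrefs);
  }
  return hrefs.size;
}

const link = (href) => ({ type: "link", attrs: { href } });
// Made-up document: the first link spans two text runs (bold + plain).
const linkDoc = {
  type: "doc",
  content: [{
    type: "paragraph",
    content: [
      { type: "text", text: "Doc", marks: [link("https://a.example"), { type: "bold" }] },
      { type: "text", text: "most", marks: [link("https://a.example")] },
      { type: "text", text: " and " },
      { type: "text", text: "MCP", marks: [link("https://b.example")] },
    ],
  }],
};

console.log(uniqueLinkCount(linkDoc)); // 2 (three link-bearing runs, two links)
```

One consequence of keying on `href` is that two separate links to the same URL also count once.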
/**
|
||||
* Parse the ordered list of integers from `[N]` footnote markers found in the
* BODY only: every top-level block before the first heading whose text equals
* `notesHeading` (diffDocs defaults it to "Примечания переводчика", Russian for
* "Translator's notes"); if there is no such heading, the whole doc. Returned
* in reading order.
|
||||
*/
|
||||
function footnoteMarkers(doc, notesHeading) {
|
||||
const top = Array.isArray(doc?.content) ? doc.content : [];
|
||||
const notesIdx = top.findIndex((n) => n &&
|
||||
n.type === "heading" &&
|
||||
plainText(n).trim() === notesHeading);
|
||||
const bodyBlocks = notesIdx >= 0 ? top.slice(0, notesIdx) : top;
|
||||
const markers = [];
|
||||
const re = /\[(\d+)\]/g;
|
||||
for (const block of bodyBlocks) {
|
||||
const text = plainText(block);
|
||||
let m;
|
||||
re.lastIndex = 0;
|
||||
while ((m = re.exec(text)) !== null) {
|
||||
markers.push(Number(m[1]));
|
||||
}
|
||||
}
|
||||
return markers;
|
||||
}
|
||||
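The marker extraction can be tried on plain strings. This is a small sketch using `String.prototype.matchAll` instead of the `exec` loop; `markerNumbers` is a hypothetical helper and the sentences are invented.

```javascript
// Hypothetical helper: collect the numbers of `[N]` markers in reading order.
function markerNumbers(text) {
  return Array.from(text.matchAll(/\[(\d+)\]/g), (m) => Number(m[1]));
}

const markersBefore = markerNumbers("First claim [1], second claim [2], third [3].");
const markersAfter = markerNumbers("First claim [1], third [3].");
// Markers present in the old text but missing from the new one.
const lostMarkers = markersBefore.filter((n) => !markersAfter.includes(n));

console.log(markersBefore, markersAfter, lostMarkers);
// [ 1, 2, 3 ] [ 1, 3 ] [ 2 ]
```

Comparing the two ordered lists is what lets an edit that drops or reorders a footnote reference show up in the integrity section.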
/** Compute the [old,new] integrity tuples for two JSON docs. */
|
||||
function computeIntegrity(oldDoc, newDoc, notesHeading) {
|
||||
const images = [
|
||||
countNodes(oldDoc, (n) => n.type === "image"),
|
||||
countNodes(newDoc, (n) => n.type === "image"),
|
||||
];
|
||||
const links = [
|
||||
countUniqueLinks(oldDoc),
|
||||
countUniqueLinks(newDoc),
|
||||
];
|
||||
const tables = [
|
||||
countNodes(oldDoc, (n) => n.type === "table"),
|
||||
countNodes(newDoc, (n) => n.type === "table"),
|
||||
];
|
||||
const callouts = [
|
||||
countNodes(oldDoc, (n) => n.type === "callout"),
|
||||
countNodes(newDoc, (n) => n.type === "callout"),
|
||||
];
|
||||
const fns = [
|
||||
footnoteMarkers(oldDoc, notesHeading),
|
||||
footnoteMarkers(newDoc, notesHeading),
|
||||
];
|
||||
return { images, links, tables, callouts, footnoteMarkers: fns };
|
||||
}
|
||||
/**
|
||||
* Resolve the lead text of the top-level block in a ProseMirror Node that
|
||||
* contains the given document position. Returns "" when out of range.
|
||||
*/
|
||||
function blockContextAt(node, pos) {
|
||||
try {
|
||||
const clamped = Math.max(0, Math.min(pos, node.content.size));
|
||||
const $pos = node.resolve(clamped);
|
||||
// depth 1 is the top-level block in a doc node.
|
||||
const block = $pos.depth >= 1 ? $pos.node(1) : $pos.node(0);
|
||||
const text = block.textContent || "";
|
||||
return text.length > 80 ? text.slice(0, 77) + "..." : text;
|
||||
}
|
||||
catch {
|
||||
return "";
|
||||
}
|
||||
}
|
||||
/** Truncate a string for the markdown summary. */
|
||||
function truncate(s, n = 120) {
|
||||
return s.length > n ? s.slice(0, n - 3) + "..." : s;
|
||||
}
|
||||
/**
|
||||
* Coarse fallback: a block-by-block plain-text diff. Used only when the precise
|
||||
* changeset pipeline throws, so the tool degrades gracefully instead of failing.
|
||||
*/
|
||||
function coarseDiff(oldDoc, newDoc) {
|
||||
const oldBlocks = Array.isArray(oldDoc?.content) ? oldDoc.content : [];
|
||||
const newBlocks = Array.isArray(newDoc?.content) ? newDoc.content : [];
|
||||
const oldTexts = oldBlocks.map(plainText);
|
||||
const newTexts = newBlocks.map(plainText);
|
||||
const oldSet = new Set(oldTexts);
|
||||
const newSet = new Set(newTexts);
|
||||
const changes = [];
|
||||
for (const t of oldTexts) {
|
||||
if (!newSet.has(t) && t.trim() !== "") {
|
||||
changes.push({ op: "delete", block: truncate(t, 80), text: t });
|
||||
}
|
||||
}
|
||||
for (const t of newTexts) {
|
||||
if (!oldSet.has(t) && t.trim() !== "") {
|
||||
changes.push({ op: "insert", block: truncate(t, 80), text: t });
|
||||
}
|
||||
}
|
||||
return changes;
|
||||
}
|
||||
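The fallback is a set-membership comparison of block texts. A standalone sketch on plain string arrays (hypothetical name `coarseTextDiff`, invented sample blocks) shows both what it reports and what it cannot see.

```javascript
// Hypothetical restatement of the set-based fallback on plain block texts.
function coarseTextDiff(oldTexts, newTexts) {
  const oldSet = new Set(oldTexts);
  const newSet = new Set(newTexts);
  const changes = [];
  for (const t of oldTexts) {
    if (!newSet.has(t) && t.trim() !== "") changes.push({ op: "delete", text: t });
  }
  for (const t of newTexts) {
    if (!oldSet.has(t) && t.trim() !== "") changes.push({ op: "insert", text: t });
  }
  return changes;
}

console.log(coarseTextDiff(["Intro", "Old paragraph", ""], ["Intro", "New paragraph"]));
// One delete ("Old paragraph") and one insert ("New paragraph").

console.log(coarseTextDiff(["A", "B"], ["B", "A"]));
// [] because a pure reorder is invisible to a set comparison.
```

Because membership ignores position and multiplicity, reordered or duplicated blocks produce no entries, which is acceptable for a degraded mode but is worth keeping in mind when reading a fallback diff.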
/** Build the human-readable unified-ish markdown summary. */
|
||||
function renderMarkdown(result, fellBack) {
|
||||
const lines = [];
|
||||
const { summary, integrity, changes } = result;
|
||||
lines.push(`# Diff: ${summary.inserted} inserted / ${summary.deleted} deleted (${summary.blocksChanged} blocks changed)`);
|
||||
if (fellBack) {
|
||||
lines.push("");
|
||||
lines.push("> note: precise diff failed; coarse block-level diff shown.");
|
||||
}
|
||||
lines.push("");
|
||||
lines.push("## Integrity (old -> new)");
|
||||
lines.push(`- images: ${integrity.images[0]} -> ${integrity.images[1]}`);
|
||||
lines.push(`- links: ${integrity.links[0]} -> ${integrity.links[1]}`);
|
||||
lines.push(`- tables: ${integrity.tables[0]} -> ${integrity.tables[1]}`);
|
||||
lines.push(`- callouts: ${integrity.callouts[0]} -> ${integrity.callouts[1]}`);
|
||||
lines.push(`- footnoteMarkers: [${integrity.footnoteMarkers[0].join(", ")}] -> [${integrity.footnoteMarkers[1].join(", ")}]`);
|
||||
lines.push("");
|
||||
lines.push("## Changes");
|
||||
if (changes.length === 0) {
|
||||
lines.push("(no textual changes)");
|
||||
}
|
||||
else {
|
||||
for (const c of changes) {
|
||||
const sign = c.op === "insert" ? "+" : "-";
|
||||
const ctx = c.block ? ` @ ${truncate(c.block, 60)}` : "";
|
||||
lines.push(`${sign} ${truncate(c.text)}${ctx}`);
|
||||
}
|
||||
}
|
||||
return lines.join("\n");
|
||||
}
|
||||
/**
|
||||
* Diff two ProseMirror JSON documents the way Docmost's history editor does and
|
||||
* serialize the result to text + integrity counts.
|
||||
*
|
||||
* @param oldDocJson the earlier document
|
||||
* @param newDocJson the later document
|
||||
* @param notesHeading heading delimiting body from notes for footnote counting
* (the default is Russian for "Translator's notes")
|
||||
*/
|
||||
export function diffDocs(oldDocJson, newDocJson, notesHeading = "Примечания переводчика") {
|
||||
const integrity = computeIntegrity(oldDocJson, newDocJson, notesHeading);
|
||||
let changes = [];
|
||||
let inserted = 0;
|
||||
let deleted = 0;
|
||||
let fellBack = false;
|
||||
const changedBlocks = new Set();
|
||||
try {
|
||||
const oldNode = Node.fromJSON(schema, oldDocJson);
|
||||
const newNode = Node.fromJSON(schema, newDocJson);
|
||||
const tr = recreateTransform(oldNode, newNode, {
|
||||
complexSteps: false,
|
||||
wordDiffs: true,
|
||||
simplifyDiff: true,
|
||||
});
|
||||
const changeSet = ChangeSet.create(oldNode).addSteps(tr.doc, tr.mapping.maps, []);
|
||||
const simplified = simplifyChanges(changeSet.changes, newNode);
|
||||
for (const change of simplified) {
|
||||
// Deleted text lives in the OLD doc coordinate range [fromA, toA).
|
||||
if (change.toA > change.fromA) {
|
||||
const text = oldNode.textBetween(change.fromA, change.toA, "\n", " ");
|
||||
if (text.length > 0) {
|
||||
deleted += text.length;
|
||||
const block = blockContextAt(oldNode, change.fromA);
|
||||
changes.push({ op: "delete", block, text });
|
||||
if (block)
|
||||
changedBlocks.add("d:" + block);
|
||||
}
|
||||
}
|
||||
// Inserted text lives in the NEW doc coordinate range [fromB, toB).
|
||||
if (change.toB > change.fromB) {
|
||||
const text = newNode.textBetween(change.fromB, change.toB, "\n", " ");
|
||||
if (text.length > 0) {
|
||||
inserted += text.length;
|
||||
const block = blockContextAt(newNode, change.fromB);
|
||||
changes.push({ op: "insert", block, text });
|
||||
if (block)
|
||||
changedBlocks.add("i:" + block);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
catch {
|
||||
// Pathological pair: degrade to a coarse block-level diff so we never throw.
|
||||
fellBack = true;
|
||||
changes = coarseDiff(oldDocJson, newDocJson);
|
||||
for (const c of changes) {
|
||||
if (c.op === "insert")
|
||||
inserted += c.text.length;
|
||||
else
|
||||
deleted += c.text.length;
|
||||
if (c.block)
|
||||
changedBlocks.add(c.op[0] + ":" + c.block);
|
||||
}
|
||||
}
|
||||
const partial = {
|
||||
summary: { inserted, deleted, blocksChanged: changedBlocks.size },
|
||||
integrity,
|
||||
changes,
|
||||
};
|
||||
return { ...partial, markdown: renderMarkdown(partial, fellBack) };
|
||||
}
|
||||
999
packages/mcp/build/lib/docmost-schema.js
Normal file
@@ -0,0 +1,999 @@
|
||||
/**
|
||||
* Full TipTap extension set matching the real Docmost document schema.
|
||||
*
|
||||
* The default StarterKit-only schema silently destroys Docmost-specific
|
||||
* nodes (callout, table) and drops attributes it does not know about
|
||||
* (node ids, image sizing, link targets). Every code path that converts
|
||||
* to or from ProseMirror JSON must use THIS set, otherwise a round-trip
|
||||
* loses content.
|
||||
*/
|
||||
import StarterKit from "@tiptap/starter-kit";
|
||||
import Image from "@tiptap/extension-image";
|
||||
import TaskList from "@tiptap/extension-task-list";
|
||||
import TaskItem from "@tiptap/extension-task-item";
|
||||
import Highlight from "@tiptap/extension-highlight";
|
||||
import Subscript from "@tiptap/extension-subscript";
|
||||
import Superscript from "@tiptap/extension-superscript";
|
||||
import { Node, Extension, Mark } from "@tiptap/core";
|
||||
// Inlined from @tiptap/core's getStyleProperty (added after 3.20.x) so this
|
||||
// package can stay on the same @tiptap/core version as the editor and avoid a
|
||||
// duplicate-tiptap version split in the monorepo. Reads a single declaration
|
||||
// from an element's inline `style` attribute, last-wins, case-insensitive.
|
||||
function getStyleProperty(element, propertyName) {
|
||||
const styleAttr = element.getAttribute("style");
|
||||
if (!styleAttr) {
|
||||
return null;
|
||||
}
|
||||
const decls = styleAttr.split(";").map((decl) => decl.trim()).filter(Boolean);
|
||||
const target = propertyName.toLowerCase();
|
||||
for (let i = decls.length - 1; i >= 0; i -= 1) {
|
||||
const decl = decls[i];
|
||||
const colonIndex = decl.indexOf(":");
|
||||
if (colonIndex === -1) {
|
||||
continue;
|
||||
}
|
||||
const prop = decl.slice(0, colonIndex).trim().toLowerCase();
|
||||
if (prop === target) {
|
||||
return decl.slice(colonIndex + 1).trim();
|
||||
}
|
||||
}
|
||||
return null;
|
||||
}
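As a rough illustration of the last-wins, case-insensitive lookup described above, the sketch below runs the same function against a stub object. The `fakeElement` helper and the sample style strings are assumptions made for this example; in the real code path the argument is a DOM element.

```javascript
// Same logic as getStyleProperty above, repeated so the snippet runs alone.
function getStyleProperty(element, propertyName) {
  const styleAttr = element.getAttribute("style");
  if (!styleAttr) {
    return null;
  }
  const decls = styleAttr.split(";").map((decl) => decl.trim()).filter(Boolean);
  const target = propertyName.toLowerCase();
  for (let i = decls.length - 1; i >= 0; i -= 1) {
    const decl = decls[i];
    const colonIndex = decl.indexOf(":");
    if (colonIndex === -1) {
      continue;
    }
    const prop = decl.slice(0, colonIndex).trim().toLowerCase();
    if (prop === target) {
      return decl.slice(colonIndex + 1).trim();
    }
  }
  return null;
}

// Hypothetical stand-in for a DOM element: only getAttribute("style") is used.
const fakeElement = (style) => ({
  getAttribute: (name) => (name === "style" ? style : null),
});

const el = fakeElement(
  "color: red; COLOR: blue; background: url(http://example.com/a.png)"
);

// The later declaration wins and the property name is matched case-insensitively.
const color = getStyleProperty(el, "color");
// Only the first colon splits name from value, so the URL survives intact.
const background = getStyleProperty(el, "Background");
// Absent properties and elements without a style attribute both give null.
const missing = getStyleProperty(el, "width");
const noStyle = getStyleProperty(fakeElement(null), "color");

console.log({ color, background, missing, noStyle });
```

Note that declarations are split on every `;`, so a value that itself contains a semicolon (for example a `data:` URL) would be cut short by this approach.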
|
||||
/** Allowed Docmost callout types; anything else falls back to "info". */
|
||||
const CALLOUT_TYPES = ["info", "warning", "danger", "success"];
|
||||
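// Guard on typeof so a truthy non-string (e.g. a number from hand-written JSON)
// falls back to "info" instead of throwing on value.toLowerCase().
export const clampCalloutType = (value) => typeof value === "string" && CALLOUT_TYPES.includes(value.toLowerCase())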
|
||||
? value.toLowerCase()
|
||||
: "info";
|
||||
/**
|
||||
* Allowlist guard for CSS color values imported from HTML.
|
||||
*
|
||||
* Docmost interpolates stored mark colors straight into an inline style
|
||||
* attribute (e.g. style="background-color: ${color}" / "color: ${color}").
|
||||
* An unsanitized value such as `red; --x: url(...)` or `red"><script>` would
|
||||
* let a crafted document break out of the style attribute. We therefore only
|
||||
* accept a narrow, well-formed subset of CSS <color> syntax and reject (-> null)
|
||||
* anything else.
|
||||
*
|
||||
* Accepted forms:
|
||||
* - named colors: letters only, e.g. "red", "rebeccapurple"
|
||||
* - hex: #rgb, #rgba, #rrggbb, #rrggbbaa
|
||||
* - functional notation: rgb()/rgba()/hsl()/hsla() containing only
|
||||
* digits, %, ., commas, spaces and slashes
|
||||
*/
|
||||
const SAFE_COLOR_RE = /^(?:[a-zA-Z]+|#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})|(?:rgb|rgba|hsl|hsla)\([0-9.,%/\s]+\))$/;
|
||||
export const sanitizeCssColor = (value) => {
|
||||
if (typeof value !== "string")
|
||||
return null;
|
||||
const color = value.trim();
|
||||
return color && SAFE_COLOR_RE.test(color) ? color : null;
|
||||
};
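To make the allowlist concrete, the sketch below repeats the guard and feeds it a few inputs. The sample values are illustrative assumptions, not taken from the project's tests; they show that the subset is deliberately narrow (for example, `deg` units and `var()` are rejected even though they are valid CSS).

```javascript
// Copied from the definitions above so the snippet runs on its own.
const SAFE_COLOR_RE = /^(?:[a-zA-Z]+|#(?:[0-9a-fA-F]{3,4}|[0-9a-fA-F]{6}|[0-9a-fA-F]{8})|(?:rgb|rgba|hsl|hsla)\([0-9.,%/\s]+\))$/;
const sanitizeCssColor = (value) => {
  if (typeof value !== "string") return null;
  const color = value.trim();
  return color && SAFE_COLOR_RE.test(color) ? color : null;
};

// Illustrative inputs that fit the accepted subset (whitespace is trimmed).
const accepted = ["red", " #FFaa00 ", "rgb(255, 0, 0)", "hsl(120 50% 50% / 0.5)"];

// Illustrative inputs that fall outside it, including the breakout attempts
// mentioned in the comment above and some valid-but-unsupported CSS.
const rejected = [
  "red; --x: url(x)",    // extra declaration smuggled in
  'red"><script>',       // attribute breakout attempt
  "#12345",              // hex with an unsupported digit count
  "hsl(120deg 50% 50%)", // unit letters are not in the allowed character set
  "var(--brand)",        // custom properties are not accepted
  42,                    // non-string input
];

const acceptedResults = accepted.map(sanitizeCssColor);
const rejectedResults = rejected.map(sanitizeCssColor);

console.log(acceptedResults);
console.log(rejectedResults);
```

The named-color branch accepts any run of letters rather than checking a list of real color names; an unknown word cannot escape the style attribute, so the guard trades strict validity for simplicity.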
|
||||
/** Docmost callout (info/warning/danger/success banner). */
|
||||
const Callout = Node.create({
|
||||
name: "callout",
|
||||
group: "block",
|
||||
content: "block+",
|
||||
defining: true,
|
||||
addAttributes() {
|
||||
return {
|
||||
// Read the type from data-callout-type so generateJSON(html) preserves
|
||||
// it; without an explicit parseHTML every imported callout became "info".
|
||||
type: {
|
||||
default: "info",
|
||||
parseHTML: (el) => clampCalloutType(el.getAttribute("data-callout-type")),
|
||||
renderHTML: (attrs) => ({
|
||||
"data-callout-type": clampCalloutType(attrs.type),
|
||||
}),
|
||||
},
|
||||
icon: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-icon"),
|
||||
renderHTML: (attrs) => attrs.icon ? { "data-icon": attrs.icon } : {},
|
||||
},
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="callout"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return ["div", { "data-type": "callout", ...HTMLAttributes }, 0];
|
||||
},
|
||||
});
|
||||
/** Minimal table family: enough for schema round-trips and HTML parsing. */
|
||||
const Table = Node.create({
|
||||
name: "table",
|
||||
group: "block",
|
||||
content: "tableRow+",
|
||||
isolating: true,
|
||||
parseHTML() {
|
||||
return [{ tag: "table" }];
|
||||
},
|
||||
renderHTML() {
|
||||
return ["table", ["tbody", 0]];
|
||||
},
|
||||
});
|
||||
const TableRow = Node.create({
|
||||
name: "tableRow",
|
||||
content: "(tableCell | tableHeader)*",
|
||||
parseHTML() {
|
||||
return [{ tag: "tr" }];
|
||||
},
|
||||
renderHTML() {
|
||||
return ["tr", 0];
|
||||
},
|
||||
});
|
||||
const cellAttributes = () => ({
|
||||
colspan: { default: 1 },
|
||||
rowspan: { default: 1 },
|
||||
colwidth: { default: null },
|
||||
backgroundColor: { default: null },
|
||||
backgroundColorName: { default: null },
|
||||
// Column alignment so GFM aligned tables (|:--|:-:|--:|) round-trip.
|
||||
align: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("align") || el.style.textAlign || null,
|
||||
renderHTML: (attrs) => attrs.align ? { align: attrs.align } : {},
|
||||
},
|
||||
});
|
||||
const TableCell = Node.create({
|
||||
name: "tableCell",
|
||||
content: "block+",
|
||||
isolating: true,
|
||||
addAttributes: cellAttributes,
|
||||
parseHTML() {
|
||||
return [{ tag: "td" }];
|
||||
},
|
||||
renderHTML() {
|
||||
return ["td", 0];
|
||||
},
|
||||
});
|
||||
const TableHeader = Node.create({
|
||||
name: "tableHeader",
|
||||
content: "block+",
|
||||
isolating: true,
|
||||
addAttributes: cellAttributes,
|
||||
parseHTML() {
|
||||
return [{ tag: "th" }];
|
||||
},
|
||||
renderHTML() {
|
||||
return ["th", 0];
|
||||
},
|
||||
});
|
||||
/**
|
||||
* Attributes Docmost stores on standard nodes that the stock extensions
|
||||
* do not declare. Without these, Node.fromJSON silently drops them —
|
||||
* including the block ids that heading anchors rely on.
|
||||
*/
|
||||
const DocmostAttributes = Extension.create({
|
||||
name: "docmostAttributes",
|
||||
addGlobalAttributes() {
|
||||
return [
|
||||
{
|
||||
types: ["heading", "paragraph"],
|
||||
attributes: {
|
||||
id: { default: null },
|
||||
indent: { default: null },
|
||||
textAlign: { default: null },
|
||||
},
|
||||
},
|
||||
{
|
||||
types: ["image"],
|
||||
attributes: {
|
||||
align: { default: null },
|
||||
attachmentId: { default: null },
|
||||
aspectRatio: { default: null },
|
||||
height: { default: null },
|
||||
placeholder: { default: null },
|
||||
size: { default: null },
|
||||
width: { default: null },
|
||||
},
|
||||
},
|
||||
{
|
||||
types: ["orderedList"],
|
||||
attributes: { type: { default: null } },
|
||||
},
|
||||
{
|
||||
types: ["link"],
|
||||
attributes: { internal: { default: null }, title: { default: null } },
|
||||
},
|
||||
];
|
||||
},
|
||||
});
|
||||
/**
|
||||
* Docmost inline comment mark. Anchors a comment thread to a text range via
|
||||
* `commentId`. Without it, any document containing comment highlights fails to
|
||||
* round-trip through the schema ("There is no mark type comment in this schema"),
|
||||
* which breaks update_page_json and edit_page_text on every commented page.
|
||||
* Mirrors Docmost's @docmost/editor-ext comment mark (commentId / resolved).
|
||||
*/
|
||||
const Comment = Mark.create({
|
||||
name: "comment",
|
||||
exitable: true,
|
||||
inclusive: false,
|
||||
addAttributes() {
|
||||
return {
|
||||
commentId: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-comment-id"),
|
||||
renderHTML: (attrs) => attrs.commentId ? { "data-comment-id": attrs.commentId } : {},
|
||||
},
|
||||
resolved: {
|
||||
default: false,
|
||||
parseHTML: (el) => el.getAttribute("data-resolved") === "true",
|
||||
renderHTML: (attrs) => attrs.resolved ? { "data-resolved": "true" } : {},
|
||||
},
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: "span[data-comment-id]" }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return ["span", { class: "comment-mark", ...HTMLAttributes }, 0];
|
||||
},
|
||||
});
|
||||
/**
|
||||
* Text color mark. The markdown-converter emits colored text as
|
||||
* <span style="color: ...">, but with no mark parsing it back the color was
|
||||
* silently dropped on import. This mirrors TipTap's @tiptap/extension-text-style
|
||||
* `textStyle` mark (the name Docmost expects) and carries a single `color`
|
||||
* attribute. The parsed color is passed through the allowlist guard so a crafted
|
||||
* style cannot break out of the attribute when Docmost re-renders it.
|
||||
*/
|
||||
const TextStyle = Mark.create({
|
||||
name: "textStyle",
|
||||
addAttributes() {
|
||||
return {
|
||||
color: {
|
||||
default: null,
|
||||
parseHTML: (el) => sanitizeCssColor(el.style.color || el.getAttribute("data-color")),
|
||||
renderHTML: (attrs) => {
|
||||
const color = sanitizeCssColor(attrs.color);
|
||||
return color ? { style: `color: ${color}` } : {};
|
||||
},
|
||||
},
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [
|
||||
{
|
||||
tag: "span",
|
||||
// Only claim a plain colored span. Do NOT match spans that are already a
|
||||
// comment mark (data-comment-id) or a mention node (data-type=mention),
|
||||
// otherwise importing such HTML would silently drop the comment/mention.
|
||||
getAttrs: (el) => el.style.color &&
|
||||
!el.getAttribute("data-comment-id") &&
|
||||
el.getAttribute("data-type") !== "mention"
|
||||
? {}
|
||||
: false,
|
||||
},
|
||||
];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return ["span", HTMLAttributes, 0];
|
||||
},
|
||||
});
|
||||
/**
|
||||
* Passthrough definitions for the remaining Docmost-specific nodes.
|
||||
*
|
||||
* TiptapTransformer.toYdoc (the write path every mutation uses) throws
|
||||
* "Unknown node type: X" for any node not registered here, so editing ANY
|
||||
* page that contains one of these nodes used to fail outright. The read path
|
||||
* (fromYdoc) accepts them, which is why they appear in real documents.
|
||||
*
|
||||
* Each node below mirrors the real @docmost/editor-ext definition's name,
|
||||
* group, content, inline/atom flags and attribute keys (with the same data-*
|
||||
* HTML mapping) so that a fromYdoc -> transform -> toYdoc round-trip both
|
||||
* validates and preserves attributes faithfully. Interactive concerns
|
||||
* (node views, commands, keyboard shortcuts, input rules, suggestion plugins)
|
||||
* are intentionally omitted: the MCP server never renders these nodes, it only
|
||||
* needs the schema to accept and carry them. The Callout node above is the
|
||||
* pattern these follow.
|
||||
*/
|
||||
/** Docmost @mention (user/page reference). Inline atom. */
|
||||
const Mention = Node.create({
|
||||
name: "mention",
|
||||
group: "inline",
|
||||
inline: true,
|
||||
selectable: true,
|
||||
atom: true,
|
||||
draggable: true,
|
||||
addAttributes() {
|
||||
return {
|
||||
id: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-id"),
|
||||
renderHTML: (attrs) => attrs.id ? { "data-id": attrs.id } : {},
|
||||
},
|
||||
label: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-label"),
|
||||
renderHTML: (attrs) => attrs.label ? { "data-label": attrs.label } : {},
|
||||
},
|
||||
entityType: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-entity-type"),
|
||||
renderHTML: (attrs) => attrs.entityType ? { "data-entity-type": attrs.entityType } : {},
|
||||
},
|
||||
entityId: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-entity-id"),
|
||||
renderHTML: (attrs) => attrs.entityId ? { "data-entity-id": attrs.entityId } : {},
|
||||
},
|
||||
slugId: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-slug-id"),
|
||||
renderHTML: (attrs) => attrs.slugId ? { "data-slug-id": attrs.slugId } : {},
|
||||
},
|
||||
creatorId: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-creator-id"),
|
||||
renderHTML: (attrs) => attrs.creatorId ? { "data-creator-id": attrs.creatorId } : {},
|
||||
},
|
||||
anchorId: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-anchor-id"),
|
||||
renderHTML: (attrs) => attrs.anchorId ? { "data-anchor-id": attrs.anchorId } : {},
|
||||
},
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: 'span[data-type="mention"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
// Leaf (atom) node: no content hole, ProseMirror rejects one when serializing.
return ["span", { "data-type": "mention", ...HTMLAttributes }];
|
||||
},
|
||||
});
|
||||
/** Inline KaTeX expression. Carries the LaTeX source in `text`. */
|
||||
const MathInline = Node.create({
|
||||
name: "mathInline",
|
||||
group: "inline",
|
||||
inline: true,
|
||||
atom: true,
|
||||
addAttributes() {
|
||||
return {
|
||||
text: { default: "" },
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: 'span[data-type="mathInline"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return [
|
||||
"span",
|
||||
{ "data-type": "mathInline", "data-katex": "true" },
|
||||
`${HTMLAttributes.text ?? ""}`,
|
||||
];
|
||||
},
|
||||
});
|
||||
/** Block KaTeX expression. Carries the LaTeX source in `text`. */
|
||||
const MathBlock = Node.create({
|
||||
name: "mathBlock",
|
||||
group: "block",
|
||||
atom: true,
|
||||
isolating: true,
|
||||
addAttributes() {
|
||||
return {
|
||||
text: { default: "" },
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="mathBlock"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return [
|
||||
"div",
|
||||
{ "data-type": "mathBlock", "data-katex": "true" },
|
||||
`${HTMLAttributes.text ?? ""}`,
|
||||
];
|
||||
},
|
||||
});
|
||||
/** Collapsible <details> wrapper: summary + content children. */
|
||||
const Details = Node.create({
|
||||
name: "details",
|
||||
group: "block",
|
||||
content: "detailsSummary detailsContent",
|
||||
defining: true,
|
||||
isolating: true,
|
||||
addAttributes() {
|
||||
return {
|
||||
open: {
|
||||
default: false,
|
||||
// A bare `open` attribute reads back as "" (falsy), so test for presence.
parseHTML: (el) => el.hasAttribute("open"),
|
||||
renderHTML: (attrs) => attrs.open ? { open: "" } : {},
|
||||
},
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: "details" }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return ["details", { ...HTMLAttributes }, 0];
|
||||
},
|
||||
});
|
||||
/** Clickable summary line of a <details> block. */
|
||||
const DetailsSummary = Node.create({
|
||||
name: "detailsSummary",
|
||||
group: "block",
|
||||
content: "inline*",
|
||||
defining: true,
|
||||
isolating: true,
|
||||
selectable: false,
|
||||
parseHTML() {
|
||||
return [{ tag: "summary" }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return ["summary", { "data-type": "detailsSummary", ...HTMLAttributes }, 0];
|
||||
},
|
||||
});
|
||||
/** Body of a <details> block. Permissive content so fromYdoc output validates. */
|
||||
const DetailsContent = Node.create({
|
||||
name: "detailsContent",
|
||||
group: "block",
|
||||
// Docmost declares block* (an empty details body is valid); block+ would
|
||||
// reject a collapsed/empty details on round-trip.
|
||||
content: "block*",
|
||||
defining: true,
|
||||
selectable: false,
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="detailsContent"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return ["div", { "data-type": "detailsContent", ...HTMLAttributes }, 0];
|
||||
},
|
||||
});
|
||||
/** File attachment card (non-image upload). Block atom. */
|
||||
const Attachment = Node.create({
|
||||
name: "attachment",
|
||||
group: "block",
|
||||
inline: false,
|
||||
isolating: true,
|
||||
atom: true,
|
||||
defining: true,
|
||||
draggable: true,
|
||||
addAttributes() {
|
||||
return {
|
||||
url: {
|
||||
default: "",
|
||||
parseHTML: (el) => el.getAttribute("data-attachment-url"),
|
||||
renderHTML: (attrs) => ({
|
||||
"data-attachment-url": attrs.url ?? "",
|
||||
}),
|
||||
},
|
||||
name: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-attachment-name"),
|
||||
renderHTML: (attrs) => attrs.name ? { "data-attachment-name": attrs.name } : {},
|
||||
},
|
||||
mime: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-attachment-mime"),
|
||||
renderHTML: (attrs) => attrs.mime ? { "data-attachment-mime": attrs.mime } : {},
|
||||
},
|
||||
size: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-attachment-size"),
|
||||
renderHTML: (attrs) => attrs.size != null ? { "data-attachment-size": attrs.size } : {},
|
||||
},
|
||||
attachmentId: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-attachment-id"),
|
||||
renderHTML: (attrs) => attrs.attachmentId
|
||||
? { "data-attachment-id": attrs.attachmentId }
|
||||
: {},
|
||||
},
|
||||
// Docmost declares `placeholder` (a transient upload key, not rendered
|
||||
// to HTML). Carry it so a round-trip never hits "Unsupported attribute".
|
||||
placeholder: { default: null },
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="attachment"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
// Leaf (atom) node: no content hole, ProseMirror rejects one when serializing.
return ["div", { "data-type": "attachment", ...HTMLAttributes }];
|
||||
},
|
||||
});
|
||||
/** Uploaded <video> player. Block atom. */
|
||||
const Video = Node.create({
|
||||
name: "video",
|
||||
group: "block",
|
||||
isolating: true,
|
||||
atom: true,
|
||||
defining: true,
|
||||
draggable: true,
|
||||
addAttributes() {
|
||||
return {
|
||||
src: {
|
||||
default: "",
|
||||
parseHTML: (el) => el.getAttribute("src"),
|
||||
renderHTML: (attrs) => ({ src: attrs.src ?? "" }),
|
||||
},
|
||||
alt: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("aria-label"),
|
||||
renderHTML: (attrs) => attrs.alt ? { "aria-label": attrs.alt } : {},
|
||||
},
|
||||
attachmentId: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-attachment-id"),
|
||||
renderHTML: (attrs) => attrs.attachmentId
|
||||
? { "data-attachment-id": attrs.attachmentId }
|
||||
: {},
|
||||
},
|
||||
width: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("width"),
|
||||
renderHTML: (attrs) => attrs.width != null ? { width: attrs.width } : {},
|
||||
},
|
||||
height: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("height"),
|
||||
renderHTML: (attrs) => attrs.height != null ? { height: attrs.height } : {},
|
||||
},
|
||||
size: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-size"),
|
||||
renderHTML: (attrs) => attrs.size != null ? { "data-size": attrs.size } : {},
|
||||
},
|
||||
align: {
|
||||
default: "center",
|
||||
parseHTML: (el) => el.getAttribute("data-align"),
|
||||
renderHTML: (attrs) => attrs.align ? { "data-align": attrs.align } : {},
|
||||
},
|
||||
aspectRatio: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-aspect-ratio"),
|
||||
renderHTML: (attrs) => attrs.aspectRatio != null
|
||||
? { "data-aspect-ratio": attrs.aspectRatio }
|
||||
: {},
|
||||
},
|
||||
// Docmost declares `placeholder` (a transient upload key, not rendered
|
||||
// to HTML). Carry it so a round-trip never hits "Unsupported attribute".
|
||||
placeholder: { default: null },
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: "video" }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return ["video", { controls: "true", ...HTMLAttributes }];
|
||||
},
|
||||
});
|
||||
/**
|
||||
* Defensive passthrough for a `youtube` node. Docmost itself has no dedicated
|
||||
* youtube node (YouTube is handled via `embed`), but the converter read path
|
||||
* references this type, so accept it as a generic block atom that preserves
|
||||
* its src so legacy/external documents survive a round-trip.
|
||||
*/
|
||||
const Youtube = Node.create({
|
||||
name: "youtube",
|
||||
group: "block",
|
||||
inline: false,
|
||||
isolating: true,
|
||||
atom: true,
|
||||
defining: true,
|
||||
draggable: true,
|
||||
addAttributes() {
|
||||
return {
|
||||
src: {
|
||||
default: "",
|
||||
parseHTML: (el) => el.getAttribute("data-src"),
|
||||
renderHTML: (attrs) => ({
|
||||
"data-src": attrs.src ?? "",
|
||||
}),
|
||||
},
|
||||
width: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-width"),
|
||||
renderHTML: (attrs) => attrs.width != null ? { "data-width": attrs.width } : {},
|
||||
},
|
||||
height: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-height"),
|
||||
renderHTML: (attrs) => attrs.height != null ? { "data-height": attrs.height } : {},
|
||||
},
|
||||
align: {
|
||||
default: "center",
|
||||
parseHTML: (el) => el.getAttribute("data-align"),
|
||||
renderHTML: (attrs) => attrs.align ? { "data-align": attrs.align } : {},
|
||||
},
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="youtube"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
// Leaf (atom) node: no content hole, ProseMirror rejects one when serializing.
return ["div", { "data-type": "youtube", ...HTMLAttributes }];
|
||||
},
|
||||
});
|
||||
/** Generic embed (provider iframe). Block atom. */
|
||||
const Embed = Node.create({
|
||||
name: "embed",
|
||||
group: "block",
|
||||
inline: false,
|
||||
isolating: true,
|
||||
atom: true,
|
||||
defining: true,
|
||||
draggable: true,
|
||||
addAttributes() {
|
||||
return {
|
||||
src: {
|
||||
default: "",
|
||||
parseHTML: (el) => el.getAttribute("data-src"),
|
||||
renderHTML: (attrs) => ({
|
||||
"data-src": attrs.src ?? "",
|
||||
}),
|
||||
},
|
||||
provider: {
|
||||
default: "",
|
||||
parseHTML: (el) => el.getAttribute("data-provider"),
|
||||
renderHTML: (attrs) => ({
|
||||
"data-provider": attrs.provider ?? "",
|
||||
}),
|
||||
},
|
||||
align: {
|
||||
default: "center",
|
||||
parseHTML: (el) => el.getAttribute("data-align"),
|
||||
renderHTML: (attrs) => ({
|
||||
"data-align": attrs.align ?? "center",
|
||||
}),
|
||||
},
|
||||
width: {
|
||||
default: 800,
|
||||
parseHTML: (el) => el.getAttribute("data-width"),
|
||||
renderHTML: (attrs) => ({
|
||||
"data-width": attrs.width,
|
||||
}),
|
||||
},
|
||||
height: {
|
||||
default: 600,
|
||||
parseHTML: (el) => el.getAttribute("data-height"),
|
||||
renderHTML: (attrs) => ({
|
||||
"data-height": attrs.height,
|
||||
}),
|
||||
},
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="embed"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
// Leaf (atom) node: no content hole, ProseMirror rejects one when serializing.
return ["div", { "data-type": "embed", ...HTMLAttributes }];
|
||||
},
|
||||
});
|
||||
/** Shared attribute set for drawio/excalidraw diagram nodes. */
|
||||
const diagramAttributes = () => ({
|
||||
src: {
|
||||
default: "",
|
||||
parseHTML: (el) => el.getAttribute("data-src"),
|
||||
renderHTML: (attrs) => ({
|
||||
"data-src": attrs.src ?? "",
|
||||
}),
|
||||
},
|
||||
title: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-title"),
|
||||
renderHTML: (attrs) => attrs.title ? { "data-title": attrs.title } : {},
|
||||
},
|
||||
alt: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-alt"),
|
||||
renderHTML: (attrs) => attrs.alt ? { "data-alt": attrs.alt } : {},
|
||||
},
|
||||
width: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-width"),
|
||||
renderHTML: (attrs) => attrs.width != null ? { "data-width": attrs.width } : {},
|
||||
},
|
||||
height: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-height"),
|
||||
renderHTML: (attrs) => attrs.height != null ? { "data-height": attrs.height } : {},
|
||||
},
|
||||
size: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-size"),
|
||||
renderHTML: (attrs) => attrs.size != null ? { "data-size": attrs.size } : {},
|
||||
},
|
||||
aspectRatio: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-aspect-ratio"),
|
||||
renderHTML: (attrs) => attrs.aspectRatio != null
|
||||
? { "data-aspect-ratio": attrs.aspectRatio }
|
||||
: {},
|
||||
},
|
||||
align: {
|
||||
default: "center",
|
||||
parseHTML: (el) => el.getAttribute("data-align"),
|
||||
renderHTML: (attrs) => attrs.align ? { "data-align": attrs.align } : {},
|
||||
},
|
||||
attachmentId: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-attachment-id"),
|
||||
renderHTML: (attrs) => attrs.attachmentId ? { "data-attachment-id": attrs.attachmentId } : {},
|
||||
},
|
||||
});
|
||||
/** draw.io diagram. Block atom (image-backed). */
|
||||
const Drawio = Node.create({
|
||||
name: "drawio",
|
||||
group: "block",
|
||||
inline: false,
|
||||
isolating: true,
|
||||
atom: true,
|
||||
defining: true,
|
||||
draggable: true,
|
||||
addAttributes: diagramAttributes,
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="drawio"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
// Leaf (atom) node: no content hole, ProseMirror rejects one when serializing.
return ["div", { "data-type": "drawio", ...HTMLAttributes }];
|
||||
},
|
||||
});
|
||||
/** Excalidraw diagram. Block atom (image-backed). */
|
||||
const Excalidraw = Node.create({
|
||||
name: "excalidraw",
|
||||
group: "block",
|
||||
inline: false,
|
||||
isolating: true,
|
||||
atom: true,
|
||||
defining: true,
|
||||
draggable: true,
|
||||
addAttributes: diagramAttributes,
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="excalidraw"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
// Leaf (atom) node: no content hole, ProseMirror rejects one when serializing.
return ["div", { "data-type": "excalidraw", ...HTMLAttributes }];
|
||||
},
|
||||
});
|
||||
/** Multi-column layout container holding one or more `column` children. */
|
||||
const Columns = Node.create({
|
||||
name: "columns",
|
||||
group: "block",
|
||||
content: "column+",
|
||||
defining: true,
|
||||
isolating: true,
|
||||
addAttributes() {
|
||||
return {
|
||||
layout: {
|
||||
default: "two_equal",
|
||||
parseHTML: (el) => el.getAttribute("data-layout"),
|
||||
renderHTML: (attrs) => attrs.layout ? { "data-layout": attrs.layout } : {},
|
||||
},
|
||||
widthMode: {
|
||||
default: "normal",
|
||||
parseHTML: (el) => el.getAttribute("data-width-mode") || "normal",
|
||||
renderHTML: (attrs) => attrs.widthMode && attrs.widthMode !== "normal"
|
||||
? { "data-width-mode": attrs.widthMode }
|
||||
: {},
|
||||
},
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="columns"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return ["div", { "data-type": "columns", ...HTMLAttributes }, 0];
|
||||
},
|
||||
});
|
||||
/** Single column within a `columns` layout. */
|
||||
const Column = Node.create({
|
||||
name: "column",
|
||||
group: "block",
|
||||
content: "block+",
|
||||
defining: true,
|
||||
isolating: true,
|
||||
selectable: false,
|
||||
addAttributes() {
|
||||
return {
|
||||
width: {
|
||||
default: null,
|
||||
parseHTML: (el) => {
|
||||
const value = el.getAttribute("data-width");
|
||||
return value ? parseFloat(value) : null;
|
||||
},
|
||||
renderHTML: (attrs) => attrs.width ? { "data-width": attrs.width } : {},
|
||||
},
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="column"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return ["div", { "data-type": "column", ...HTMLAttributes }, 0];
|
||||
},
|
||||
});
|
||||
/**
|
||||
* Subpages listing block (auto-generated index of child pages). Docmost
|
||||
* declares no attributes; the markdown-converter has a `case "subpages"`, so
|
||||
* the read path can emit it and toYdoc must accept it. Block atom.
|
||||
*/
|
||||
const Subpages = Node.create({
|
||||
name: "subpages",
|
||||
group: "block",
|
||||
inline: false,
|
||||
isolating: true,
|
||||
atom: true,
|
||||
defining: true,
|
||||
draggable: true,
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="subpages"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
// Leaf (atom) node: no content hole, ProseMirror rejects one when serializing.
return ["div", { "data-type": "subpages", ...HTMLAttributes }];
|
||||
},
|
||||
});
|
||||
/** Uploaded <audio> player. Block atom. Mirrors Docmost audio attrs. */
|
||||
const Audio = Node.create({
|
||||
name: "audio",
|
||||
group: "block",
|
||||
inline: false,
|
||||
isolating: true,
|
||||
atom: true,
|
||||
defining: true,
|
||||
draggable: true,
|
||||
addAttributes() {
|
||||
return {
|
||||
src: {
|
||||
default: "",
|
||||
parseHTML: (el) => el.getAttribute("src"),
|
||||
renderHTML: (attrs) => ({ src: attrs.src ?? "" }),
|
||||
},
|
||||
attachmentId: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-attachment-id"),
|
||||
renderHTML: (attrs) => attrs.attachmentId
|
||||
? { "data-attachment-id": attrs.attachmentId }
|
||||
: {},
|
||||
},
|
||||
size: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-size"),
|
||||
renderHTML: (attrs) => attrs.size != null ? { "data-size": attrs.size } : {},
|
||||
},
|
||||
// Transient upload key Docmost declares with rendered:false; carried so
|
||||
// a round-trip never hits "Unsupported attribute".
|
||||
placeholder: { default: null },
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: "audio" }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return ["audio", { controls: "true", ...HTMLAttributes }];
|
||||
},
|
||||
});
|
||||
/** Embedded PDF viewer. Block atom. Mirrors Docmost pdf attrs. */
|
||||
const Pdf = Node.create({
|
||||
name: "pdf",
|
||||
group: "block",
|
||||
inline: false,
|
||||
isolating: true,
|
||||
atom: true,
|
||||
defining: true,
|
||||
draggable: true,
|
||||
addAttributes() {
|
||||
return {
|
||||
src: {
|
||||
default: "",
|
||||
parseHTML: (el) => el.getAttribute("src"),
|
||||
renderHTML: (attrs) => ({ src: attrs.src ?? "" }),
|
||||
},
|
||||
name: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-name"),
|
||||
renderHTML: (attrs) => attrs.name ? { "data-name": attrs.name } : {},
|
||||
},
|
||||
attachmentId: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-attachment-id"),
|
||||
renderHTML: (attrs) => attrs.attachmentId
|
||||
? { "data-attachment-id": attrs.attachmentId }
|
||||
: {},
|
||||
},
|
||||
size: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("data-size"),
|
||||
renderHTML: (attrs) => attrs.size != null ? { "data-size": attrs.size } : {},
|
||||
},
|
||||
width: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("width"),
|
||||
renderHTML: (attrs) => attrs.width != null ? { width: attrs.width } : {},
|
||||
},
|
||||
height: {
|
||||
default: null,
|
||||
parseHTML: (el) => el.getAttribute("height"),
|
||||
renderHTML: (attrs) => attrs.height != null ? { height: attrs.height } : {},
|
||||
},
|
||||
// Transient upload key Docmost declares with rendered:false; carried so
|
||||
// a round-trip never hits "Unsupported attribute".
|
||||
placeholder: { default: null },
|
||||
};
|
||||
},
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="pdf"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
// Leaf (atom) node: omit the content hole (0), which ProseMirror rejects for leaf nodes.
return ["div", { "data-type": "pdf", ...HTMLAttributes }];
|
||||
},
|
||||
});
|
||||
/** Page break (print/export divider). Block atom; Docmost declares no attrs. */
|
||||
const PageBreak = Node.create({
|
||||
name: "pageBreak",
|
||||
group: "block",
|
||||
inline: false,
|
||||
isolating: true,
|
||||
atom: true,
|
||||
defining: true,
|
||||
draggable: true,
|
||||
parseHTML() {
|
||||
return [{ tag: 'div[data-type="pageBreak"]' }];
|
||||
},
|
||||
renderHTML({ HTMLAttributes }) {
|
||||
return ["div", { "data-type": "pageBreak", ...HTMLAttributes }];
|
||||
},
|
||||
});
|
||||
/**
|
||||
* Full extension list. Image is block-level (matches Docmost); the
|
||||
* ProseMirror DOM parser hoists <img> found inside <p> automatically.
|
||||
* StarterKit v3 already bundles the link extension, configured here.
|
||||
*/
|
||||
export const docmostExtensions = [
|
||||
StarterKit.configure({
|
||||
codeBlock: {},
|
||||
heading: {},
|
||||
link: { openOnClick: false },
|
||||
}),
|
||||
Image.configure({ inline: false }),
|
||||
TaskList,
|
||||
TaskItem.configure({ nested: true }),
|
||||
// Highlight stores its color unescaped and Docmost interpolates it into
|
||||
// style="background-color: ${color}". Wrap the color attribute's parseHTML
|
||||
// with the same allowlist guard used by textStyle so a crafted import color
|
||||
// cannot break out of the style attribute. Multicolor behavior is preserved.
|
||||
Highlight.extend({
|
||||
addAttributes() {
|
||||
const parent = this.parent?.() ?? {};
|
||||
return {
|
||||
...parent,
|
||||
color: {
|
||||
...parent.color,
|
||||
parseHTML: (el) => sanitizeCssColor(el.getAttribute("data-color") ||
|
||||
getStyleProperty(el, "background-color") ||
|
||||
el.style.backgroundColor),
|
||||
},
|
||||
};
|
||||
},
|
||||
}).configure({ multicolor: true }),
|
||||
Subscript,
|
||||
Superscript,
|
||||
// StarterKit does not provide a textStyle mark, so register ours; without it
|
||||
// generateJSON drops <span style="color: ...">, defeating the color import.
|
||||
TextStyle,
|
||||
Comment,
|
||||
Callout,
|
||||
Table,
|
||||
TableRow,
|
||||
TableCell,
|
||||
TableHeader,
|
||||
Mention,
|
||||
MathInline,
|
||||
MathBlock,
|
||||
Details,
|
||||
DetailsSummary,
|
||||
DetailsContent,
|
||||
Attachment,
|
||||
Video,
|
||||
Youtube,
|
||||
Embed,
|
||||
Drawio,
|
||||
Excalidraw,
|
||||
Columns,
|
||||
Column,
|
||||
Subpages,
|
||||
Audio,
|
||||
Pdf,
|
||||
PageBreak,
|
||||
DocmostAttributes,
|
||||
];
|
||||
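The Highlight override above depends on an allowlist guard so that a crafted color cannot escape `style="background-color: ..."`. The project's `sanitizeCssColor` is not shown in this part of the diff, so the following is only a minimal sketch of what such a guard might look like; the function name `allowlistCssColor` and the accepted patterns are assumptions, not the project's implementation.

```javascript
// Hypothetical allowlist guard (NOT the project's sanitizeCssColor).
// Accepts hex colors, bare color keywords and simple rgb()/hsl() forms;
// anything else (quotes, semicolons, url(...), etc.) is rejected as null.
function allowlistCssColor(value) {
  if (typeof value !== "string") return null;
  const color = value.trim();
  const hex = /^#(?:[0-9a-f]{3,4}|[0-9a-f]{6}|[0-9a-f]{8})$/i;
  const named = /^[a-z]{3,30}$/i;
  const func = /^(?:rgb|rgba|hsl|hsla)\(\s*[0-9.,%\s/]+\)$/i;
  return hex.test(color) || named.test(color) || func.test(color)
    ? color
    : null;
}

const safeHex = allowlistCssColor("#ff0");
const safeFunc = allowlistCssColor("rgb(255, 0, 0)");
const breakout = allowlistCssColor('red"; onmouseover="alert(1)');
const extraDecl = allowlistCssColor("red; position: fixed");
```

Because anything outside the allowlist becomes `null`, a rejected color falls back to a plain highlight instead of reaching the style attribute.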
87
packages/mcp/build/lib/filters.js
Normal file
@@ -0,0 +1,87 @@
|
||||
/**
|
||||
* Filter functions to extract only relevant information from API responses
|
||||
* for better agent consumption
|
||||
*/
|
||||
export function filterWorkspace(data) {
|
||||
return {
|
||||
id: data.id,
|
||||
name: data.name,
|
||||
description: data.description,
|
||||
defaultSpaceId: data.defaultSpaceId,
|
||||
createdAt: data.createdAt,
|
||||
updatedAt: data.updatedAt,
|
||||
deletedAt: data.deletedAt,
|
||||
};
|
||||
}
|
||||
export function filterSpace(space) {
|
||||
return {
|
||||
id: space.id,
|
||||
name: space.name,
|
||||
description: space.description,
|
||||
slug: space.slug,
|
||||
visibility: space.visibility,
|
||||
createdAt: space.createdAt,
|
||||
updatedAt: space.updatedAt,
|
||||
deletedAt: space.deletedAt,
|
||||
};
|
||||
}
|
||||
export function filterGroup(group) {
|
||||
return {
|
||||
id: group.id,
|
||||
name: group.name,
|
||||
description: group.description,
|
||||
workspaceId: group.workspaceId,
|
||||
createdAt: group.createdAt,
|
||||
updatedAt: group.updatedAt,
|
||||
deletedAt: group.deletedAt,
|
||||
};
|
||||
}
|
||||
export function filterPage(page, content, subpages) {
|
||||
return {
|
||||
id: page.id,
|
||||
slugId: page.slugId,
|
||||
title: page.title,
|
||||
parentPageId: page.parentPageId,
|
||||
spaceId: page.spaceId,
|
||||
isLocked: page.isLocked,
|
||||
createdAt: page.createdAt,
|
||||
updatedAt: page.updatedAt,
|
||||
deletedAt: page.deletedAt,
|
||||
// Include converted markdown content if valid string (even empty)
|
||||
...(typeof content === "string" && { content }),
|
||||
// Include subpages if provided
|
||||
...(subpages &&
|
||||
subpages.length > 0 && {
|
||||
subpages: subpages.map((p) => ({ id: p.id, title: p.title })),
|
||||
}),
|
||||
};
|
||||
}
|
||||
export function filterComment(comment, markdownContent) {
|
||||
return {
|
||||
id: comment.id,
|
||||
pageId: comment.pageId,
|
||||
content: markdownContent ?? comment.content,
|
||||
selection: comment.selection || null,
|
||||
type: comment.type || "page",
|
||||
parentCommentId: comment.parentCommentId || null,
|
||||
creatorId: comment.creatorId,
|
||||
creatorName: comment.creator?.name || null,
|
||||
createdAt: comment.createdAt,
|
||||
editedAt: comment.editedAt || null,
|
||||
resolvedAt: comment.resolvedAt || null,
|
||||
resolvedById: comment.resolvedById || null,
|
||||
};
|
||||
}
|
||||
export function filterSearchResult(result) {
|
||||
return {
|
||||
id: result.id,
|
||||
title: result.title,
|
||||
parentPageId: result.parentPageId,
|
||||
createdAt: result.createdAt,
|
||||
updatedAt: result.updatedAt,
|
||||
rank: result.rank,
|
||||
highlight: result.highlight,
|
||||
spaceId: result.space?.id,
|
||||
spaceName: result.space?.name,
|
||||
};
|
||||
}
|
||||
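`filterPage` above uses conditional object spread so that optional keys are left out entirely instead of being emitted as `undefined`. A reduced sketch of that pattern follows; `pickPageFields` and the sample values are hypothetical and only illustrate the idiom.

```javascript
// Hypothetical reduced page filter; field names are illustrative only.
function pickPageFields(page, content, subpages) {
  return {
    id: page.id,
    title: page.title,
    // "" is a valid (empty) body, so test the type rather than truthiness.
    ...(typeof content === "string" && { content }),
    // Spreading false/undefined adds nothing, so the key is omitted entirely.
    ...(subpages &&
      subpages.length > 0 && {
        subpages: subpages.map((p) => ({ id: p.id, title: p.title })),
      }),
  };
}

const emptyBody = pickPageFields({ id: "p1", title: "Home" }, "", []);
const noBody = pickPageFields({ id: "p1", title: "Home" }, undefined, [
  { id: "p2", title: "Child", spaceId: "s1" },
]);
```

Here `emptyBody` keeps `content: ""` but has no `subpages` key, while `noBody` has no `content` key and its subpages are trimmed to `id` and `title`.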
100
packages/mcp/build/lib/json-edit.js
Normal file
@@ -0,0 +1,100 @@
|
||||
/**
|
||||
* Surgical text edits on a ProseMirror document without re-importing it.
|
||||
*
|
||||
* Each edit replaces an exact substring inside individual text nodes,
|
||||
* preserving every node id, mark and attribute around it. This is the
|
||||
* safe alternative to a full markdown re-import for small wording fixes.
|
||||
*/
|
||||
/** Collect plain text of the whole document (for span-detection hints). */
|
||||
function collectText(node) {
|
||||
let out = "";
|
||||
if (node.type === "text")
|
||||
out += node.text || "";
|
||||
for (const child of node.content || [])
|
||||
out += collectText(child);
|
||||
return out;
|
||||
}
|
||||
function countOccurrences(haystack, needle) {
|
||||
if (!needle)
|
||||
return 0;
|
||||
let count = 0;
|
||||
let idx = haystack.indexOf(needle);
|
||||
while (idx !== -1) {
|
||||
count++;
|
||||
idx = haystack.indexOf(needle, idx + needle.length);
|
||||
}
|
||||
return count;
|
||||
}
|
||||
/**
|
||||
* Apply text edits to a deep copy of a ProseMirror doc; returns `{ doc, results }`.
|
||||
* Throws a descriptive error when an edit matches zero times or matches
|
||||
* multiple times without replaceAll — so the caller can refine `find`.
|
||||
*/
|
||||
export function applyTextEdits(doc, edits) {
|
||||
const copy = JSON.parse(JSON.stringify(doc));
|
||||
const results = [];
|
||||
for (const edit of edits) {
|
||||
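if (typeof edit.find !== "string" || !edit.find)
throw new Error("edit.find must be a non-empty string");
// A missing replacement would insert "undefined" (concatenation) or "," (join(undefined)).
if (typeof edit.replace !== "string")
throw new Error('edit.replace must be a string (use "" to delete text)');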
|
||||
// Count matches inside individual text nodes first.
|
||||
let nodeMatches = 0;
|
||||
(function count(node) {
|
||||
if (node.type === "text" && node.text) {
|
||||
nodeMatches += countOccurrences(node.text, edit.find);
|
||||
}
|
||||
for (const child of node.content || [])
|
||||
count(child);
|
||||
})(copy);
|
||||
if (nodeMatches === 0) {
|
||||
// Distinguish "text not present" from "text spans formatting runs".
|
||||
const fullText = collectText(copy);
|
||||
if (fullText.includes(edit.find)) {
|
||||
throw new Error(`Edit "${truncate(edit.find)}": the text exists in the document but spans ` +
|
||||
`multiple formatting runs (bold/link/italic boundaries). Use a shorter ` +
|
||||
`fragment that stays inside one run, or use update_page_json for ` +
|
||||
`structural changes.`);
|
||||
}
|
||||
throw new Error(`Edit "${truncate(edit.find)}": text not found in the document.`);
|
||||
}
|
||||
if (nodeMatches > 1 && !edit.replaceAll) {
|
||||
throw new Error(`Edit "${truncate(edit.find)}": matches ${nodeMatches} times. ` +
|
||||
`Provide a longer, unique fragment or set replaceAll: true.`);
|
||||
}
|
||||
// Perform the replacement(s).
|
||||
let done = 0;
|
||||
(function replace(node) {
|
||||
if (node.type === "text" && node.text && node.text.includes(edit.find)) {
|
||||
if (edit.replaceAll) {
|
||||
done += countOccurrences(node.text, edit.find);
|
||||
node.text = node.text.split(edit.find).join(edit.replace);
|
||||
}
|
||||
else if (done === 0) {
|
||||
// Avoid String.replace: its second arg treats $&, $1, $`, $', $$ as
|
||||
// special patterns, expanding them instead of inserting literally.
|
||||
// Splice the first occurrence by index to keep the replacement literal.
|
||||
const idx = node.text.indexOf(edit.find);
|
||||
node.text =
|
||||
node.text.slice(0, idx) +
|
||||
edit.replace +
|
||||
node.text.slice(idx + edit.find.length);
|
||||
done = 1;
|
||||
}
|
||||
}
|
||||
for (const child of node.content || [])
|
||||
replace(child);
|
||||
})(copy);
|
||||
results.push({ find: edit.find, replacements: done });
|
||||
}
|
||||
// Drop text nodes that became empty (ProseMirror forbids empty text nodes).
|
||||
(function prune(node) {
|
||||
if (Array.isArray(node.content)) {
|
||||
node.content = node.content.filter((child) => !(child.type === "text" && child.text === ""));
|
||||
for (const child of node.content)
|
||||
prune(child);
|
||||
}
|
||||
})(copy);
|
||||
return { doc: copy, results };
|
||||
}
|
||||
function truncate(s) {
|
||||
return s.length > 60 ? s.slice(0, 57) + "..." : s;
|
||||
}
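The "spans multiple formatting runs" error in `applyTextEdits` comes from comparing two views of the same document: the concatenated text and the individual text nodes. A minimal sketch of that check, assuming a hypothetical two-run paragraph (`sampleDoc`) and a helper name (`foundInOneTextNode`) that is not part of the file:

```javascript
// Concatenate the text of every text node under `node`.
function collectText(node) {
  let out = node.type === "text" ? node.text || "" : "";
  for (const child of node.content || []) out += collectText(child);
  return out;
}

// True when some single text node contains `find` (hypothetical helper).
function foundInOneTextNode(node, find) {
  if (node.type === "text" && (node.text || "").includes(find)) return true;
  return (node.content || []).some((child) => foundInOneTextNode(child, find));
}

// Hypothetical document: "Hello " is plain text, "world" is bold.
const sampleDoc = {
  type: "doc",
  content: [
    {
      type: "paragraph",
      content: [
        { type: "text", text: "Hello " },
        { type: "text", text: "world", marks: [{ type: "bold" }] },
      ],
    },
  ],
};

// Present in the flattened text, but in no single run: the edit must be refused.
const spansRuns =
  collectText(sampleDoc).includes("Hello world") &&
  !foundInOneTextNode(sampleDoc, "Hello world");
```

A shorter fragment such as `"world"` stays inside one run, so it could be edited without touching the bold mark.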
|
||||
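The single-replacement branch above splices by index instead of calling `String.prototype.replace`, because `replace` expands patterns such as `$&` in its replacement argument even when the search value is a plain string. A small sketch of the difference; the helper name `replaceFirstLiteral` is an assumption made for illustration:

```javascript
// Replace the first occurrence of `find` with `replacement`, inserted literally.
function replaceFirstLiteral(text, find, replacement) {
  const idx = text.indexOf(find);
  if (idx === -1) return text;
  return text.slice(0, idx) + replacement + text.slice(idx + find.length);
}

// String.replace treats "$&" as "the matched substring".
const viaReplace = "cost: X".replace("X", "$&5");
// The index splice keeps the replacement text exactly as given.
const viaSplice = replaceFirstLiteral("cost: X", "X", "$&5");
```

With these inputs `viaReplace` is `"cost: X5"`, whereas `viaSplice` is `"cost: $&5"`, which is what a caller editing text that mentions `$&` would expect.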
795
packages/mcp/build/lib/markdown-converter.js
Normal file
@@ -0,0 +1,795 @@
|
||||
/**
|
||||
* Convert ProseMirror/TipTap JSON content to Markdown
|
||||
* Supports all Docmost-specific node types and extensions
|
||||
*/
|
||||
export function convertProseMirrorToMarkdown(content) {
|
||||
if (!content || !content.content)
|
||||
return "";
|
||||
// Escape a value interpolated into an HTML double-quoted attribute value
// (textAlign, colors, image src, math `text`, all data-* attrs, etc.). In the
// ATTRIBUTE context only the quote that delimits the value and the ampersand
// that starts an entity are special, so we escape ONLY & and ". We
// deliberately do NOT escape < or >: the HTML re-parser (parse5/jsdom via
// @tiptap/html) does NOT decode &lt;/&gt; back inside attribute values, so
// escaping them would corrupt the stored data (e.g. a math node's LaTeX
// `a < b`) and ACCUMULATE escapes on every round-trip
// (`a < b` -> `a &lt; b` -> `a &amp;lt; b`). Escaping & and " keeps the value
// inert against attribute injection while staying idempotent.
// NOTE: the value is always wrapped in double quotes, so " is the only
// delimiter; ' is NOT special in a double-quoted value, and parse5 does not
// decode &#39; back inside attribute values, so escaping ' would (like < >)
// corrupt the value and accumulate &amp; on every round-trip. Escaping & and "
// is idempotent (parse5 decodes them back).
|
||||
const escapeAttr = (value) => String(value)
|
||||
.replace(/&/g, "&amp;")
.replace(/"/g, "&quot;");
|
||||
// Escape a value placed as HTML element TEXT content (between tags), where
|
||||
// <, >, and & are all significant. Used for text rendered inside raw-HTML
|
||||
// blocks (table cells / columns) so stored characters cannot inject markup.
|
||||
const escapeHtmlText = (value) => String(value)
|
||||
.replace(/&/g, "&amp;")
.replace(/</g, "&lt;")
.replace(/>/g, "&gt;");
|
||||
// Percent-encode characters that would break out of a markdown URL target
|
||||
// (...) — whitespace/newlines and parentheses — so a stored src stays a
|
||||
// single inert token (used for image/video/youtube srcs).
|
||||
const encodeMdUrl = (value) => String(value || "")
|
||||
.replace(/\s/g, (c) => (c === " " ? "%20" : encodeURIComponent(c)))
|
||||
.replace(/\(/g, "%28")
|
||||
.replace(/\)/g, "%29");
|
||||
const processNode = (node) => {
|
||||
const type = node.type;
|
||||
const nodeContent = node.content || [];
|
||||
switch (type) {
|
||||
case "doc":
|
||||
return nodeContent.map(processNode).join("\n\n");
|
||||
case "paragraph":
|
||||
const text = nodeContent.map(processNode).join("");
|
||||
const align = node.attrs?.textAlign;
|
||||
if (align && align !== "left") {
|
||||
return `<div align="${escapeAttr(align)}">${text}</div>`;
|
||||
}
|
||||
return text || "";
|
||||
case "heading":
|
||||
const level = node.attrs?.level || 1;
|
||||
const headingText = nodeContent.map(processNode).join("");
|
||||
return "#".repeat(level) + " " + headingText;
|
||||
case "text":
|
||||
let textContent = node.text || "";
|
||||
// Apply marks (bold, italic, code, etc.)
|
||||
if (node.marks) {
|
||||
// Markdown code spans (`...`) cannot carry inner formatting, so when a
|
||||
// run has the `code` mark alongside ANY other mark, backtick syntax
|
||||
// would leak literal ** / []() into the code text. In that case emit
|
||||
// nested HTML (<code> innermost, the other marks wrapping it as HTML)
|
||||
// so the output is at least well-formed and re-parseable.
|
||||
//
|
||||
// NOTE: this does NOT round-trip both marks. The schema's `code` mark
|
||||
// has `excludes: "_"` (it excludes every other mark), so on import the
|
||||
// co-occurring mark is always dropped — the run comes back as `code`
|
||||
// only. We keep the emission simple and accept that the other mark is
|
||||
// lost; preserving both is impossible while `code` excludes them.
|
||||
// Only use the backtick form when `code` is the sole mark.
|
||||
const markTypes = node.marks.map((m) => m.type);
|
||||
const hasCode = markTypes.includes("code");
|
||||
const codeCombined = hasCode && markTypes.length > 1;
|
||||
for (const mark of node.marks) {
|
||||
switch (mark.type) {
|
||||
case "bold":
|
||||
textContent = codeCombined
|
||||
? `<strong>${textContent}</strong>`
|
||||
: `**${textContent}**`;
|
||||
break;
|
||||
case "italic":
|
||||
textContent = codeCombined
|
||||
? `<em>${textContent}</em>`
|
||||
: `*${textContent}*`;
|
||||
break;
|
||||
case "code":
|
||||
// When combined with another mark, wrap as <code> so the
|
||||
// surrounding HTML marks can nest around it; otherwise use the
|
||||
// plain backtick span.
|
||||
textContent = codeCombined
|
||||
? `<code>${textContent}</code>`
|
||||
: `\`${textContent}\``;
|
||||
break;
|
||||
case "link": {
|
||||
const href = mark.attrs?.href || "";
|
||||
const title = mark.attrs?.title;
|
||||
if (codeCombined) {
|
||||
// Emit an HTML anchor so it can wrap the nested <code>.
|
||||
const safeHref = escapeAttr(href);
|
||||
if (title) {
|
||||
textContent = `<a href="${safeHref}" title="${escapeAttr(String(title))}">${textContent}</a>`;
|
||||
}
|
||||
else {
|
||||
textContent = `<a href="${safeHref}">${textContent}</a>`;
|
||||
}
|
||||
}
|
||||
else if (title) {
|
||||
// Emit the optional markdown link title; escape an embedded
|
||||
// double-quote so it cannot terminate the title string early.
|
||||
const safeTitle = String(title).replace(/"/g, '\\"');
|
||||
textContent = `[${textContent}](${href} "${safeTitle}")`;
|
||||
}
|
||||
else {
|
||||
textContent = `[${textContent}](${href})`;
|
||||
}
|
||||
break;
|
||||
}
|
||||
case "strike":
|
||||
textContent = codeCombined
|
||||
? `<s>${textContent}</s>`
|
||||
: `~~${textContent}~~`;
|
||||
break;
|
||||
case "underline":
|
||||
textContent = `<u>${textContent}</u>`;
|
||||
break;
|
||||
case "subscript":
|
||||
textContent = `<sub>${textContent}</sub>`;
|
||||
break;
|
||||
case "superscript":
|
||||
textContent = `<sup>${textContent}</sup>`;
|
||||
break;
|
||||
case "highlight": {
|
||||
// Preserve a null/empty color as a plain highlight (a bare
|
||||
// <mark> with no background-color); only emit the style when a
|
||||
// color is actually set, so a plain highlight is not forced to
|
||||
// yellow on export.
|
||||
const color = mark.attrs?.color;
|
||||
textContent = color
|
||||
? `<mark style="background-color: ${escapeAttr(color)}">${textContent}</mark>`
|
||||
: `<mark>${textContent}</mark>`;
|
||||
break;
|
||||
}
|
||||
case "textStyle":
|
||||
if (mark.attrs?.color) {
|
||||
textContent = `<span style="color: ${escapeAttr(mark.attrs.color)}">${textContent}</span>`;
|
||||
}
|
||||
break;
|
||||
case "comment": {
|
||||
// Emit the inline comment anchor so highlights round-trip. The
|
||||
// schema's Comment mark parses span[data-comment-id] (attrs
|
||||
// commentId/resolved).
|
||||
const cid = mark.attrs?.commentId;
|
||||
if (cid) {
|
||||
const resolvedAttr = mark.attrs?.resolved
|
||||
? ` data-resolved="true"`
|
||||
: "";
|
||||
textContent = `<span data-comment-id="${escapeAttr(cid)}"${resolvedAttr}>${textContent}</span>`;
|
||||
}
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
return textContent;
|
||||
case "codeBlock":
|
||||
const language = node.attrs?.language || "";
|
||||
// Strip ALL trailing newlines so the export is idempotent: marked
|
||||
// re-adds exactly one trailing "\n" on import, so trimming only one
|
||||
// here would let the text grow by "\n" on each round-trip. Removing
|
||||
// every trailing newline makes repeated cycles stable.
|
||||
const code = nodeContent
|
||||
.map(processNode)
|
||||
.join("")
|
||||
.replace(/\n+$/, "");
|
||||
return "```" + language + "\n" + code + "\n```";
|
||||
case "bulletList":
|
||||
return nodeContent
|
||||
.map((item) => processListItem(item, "-"))
|
||||
.join("\n");
|
||||
case "orderedList":
|
||||
return nodeContent
|
||||
.map((item, index) => processListItem(item, `${index + 1}.`))
|
||||
.join("\n");
|
||||
case "taskList":
|
||||
return nodeContent.map((item) => processTaskItem(item)).join("\n");
|
||||
case "taskItem":
|
||||
// Delegate to the same helper used by taskList so multi-block and
|
||||
// nested task items render and indent consistently.
|
||||
return processTaskItem(node);
|
||||
case "listItem":
|
||||
return nodeContent.map(processNode).join("\n");
|
||||
case "blockquote":
|
||||
// Prefix EVERY line of EVERY child with "> " and separate block-level
|
||||
// children with a blank ">" line so code blocks / multi-paragraph
|
||||
// quotes round-trip correctly.
|
||||
return nodeContent
|
||||
.map((n) => processNode(n)
|
||||
.split("\n")
|
||||
.map((line) => (line.length ? `> ${line}` : ">"))
|
||||
.join("\n"))
|
||||
.join("\n>\n");
|
||||
case "horizontalRule":
|
||||
return "---";
|
||||
case "hardBreak":
|
||||
// Two trailing spaces before the newline encode a markdown hard break;
|
||||
// a bare "\n" would be reimported as a soft break and lost.
|
||||
return " \n";
|
||||
case "image":
|
||||
const imgAlt = node.attrs?.alt || "";
|
||||
// Neutralize characters that could break out of the markdown image
|
||||
// URL: spaces/newlines and parentheses would terminate the (...) target
|
||||
// and let a stored src inject following markdown/HTML. Percent-encode
|
||||
// them so the URL stays a single inert token.
|
||||
const imgSrc = encodeMdUrl(node.attrs?.src);
|
||||
// No "caption" attribute exists in the Docmost image schema, so we do
|
||||
// not emit one (the previous caption branch was dead).
|
||||
return `![${imgAlt}](${imgSrc})`;
|
||||
case "video": {
|
||||
// Emit the schema-matching <video> element so generateJSON rebuilds the
|
||||
// node with its attrs intact. The schema's parseHTML reads src/aria-label
|
||||
// from the standard attributes and the remaining attrs from data-*.
|
||||
const attrs = node.attrs || {};
|
||||
const parts = [`src="${escapeAttr(attrs.src ?? "")}"`];
|
||||
if (attrs.alt)
|
||||
parts.push(`aria-label="${escapeAttr(attrs.alt)}"`);
|
||||
if (attrs.attachmentId)
|
||||
parts.push(`data-attachment-id="${escapeAttr(attrs.attachmentId)}"`);
|
||||
if (attrs.width != null)
|
||||
parts.push(`width="${escapeAttr(attrs.width)}"`);
|
||||
if (attrs.height != null)
|
||||
parts.push(`height="${escapeAttr(attrs.height)}"`);
|
||||
if (attrs.size != null)
|
||||
parts.push(`data-size="${escapeAttr(attrs.size)}"`);
|
||||
if (attrs.align)
|
||||
parts.push(`data-align="${escapeAttr(attrs.align)}"`);
|
||||
if (attrs.aspectRatio != null)
|
||||
parts.push(`data-aspect-ratio="${escapeAttr(attrs.aspectRatio)}"`);
|
||||
// Wrap in a block <div> so marked treats it as a block (a bare <video>
|
||||
// is inline-level HTML and marked wraps it in <p>, leaving a spurious
|
||||
// empty paragraph beside the hoisted block atom). The wrapper has no
|
||||
// data-type, so the schema parser ignores it and just hoists the video.
|
||||
return `<div><video ${parts.join(" ")}></video></div>`;
|
||||
}
|
||||
case "youtube": {
|
||||
// Emit the schema-matching div[data-type="youtube"]; the schema reads
|
||||
// src from data-src and width/height/align from data-* attributes.
|
||||
const attrs = node.attrs || {};
|
||||
const parts = [
|
||||
`data-type="youtube"`,
|
||||
`data-src="${escapeAttr(attrs.src ?? "")}"`,
|
||||
];
|
||||
if (attrs.width != null)
|
||||
parts.push(`data-width="${escapeAttr(attrs.width)}"`);
|
||||
if (attrs.height != null)
|
||||
parts.push(`data-height="${escapeAttr(attrs.height)}"`);
|
||||
if (attrs.align)
|
||||
parts.push(`data-align="${escapeAttr(attrs.align)}"`);
|
||||
return `<div ${parts.join(" ")}></div>`;
|
||||
}
|
||||
case "table": {
|
||||
// A GFM pipe table cannot represent merged cells. If ANY cell carries
|
||||
// colspan>1 or rowspan>1, a pipe table would corrupt the grid on
|
||||
// re-import, so emit the WHOLE table as raw HTML <table> instead: the
|
||||
// schema's table family parseHTML (tag table/tr/td/th, with colspan/
|
||||
// rowspan read from the same-named HTML attrs and align via parseHTML)
|
||||
// round-trips it faithfully. Otherwise keep the lighter GFM pipe table.
|
||||
const tableRows = nodeContent;
|
||||
if (tableRows.length === 0)
|
||||
return "";
|
||||
const hasSpan = tableRows.some((row) => (row.content || []).some((cell) => (cell.attrs?.colspan ?? 1) > 1 || (cell.attrs?.rowspan ?? 1) > 1));
|
||||
if (hasSpan) {
|
||||
// Render each cell's block children to HTML (marked does NOT parse
|
||||
// markdown inside a raw HTML block, so emitting markdown here would
|
||||
// leak literal ** / `` into the cell). blockToHtml mirrors the schema
|
||||
// HTML so inner formatting re-parses into the right marks/nodes.
|
||||
const renderHtmlCell = (cell) => {
|
||||
const tag = cell.type === "tableHeader" ? "th" : "td";
|
||||
const a = cell.attrs || {};
|
||||
const cellParts = [];
|
||||
if ((a.colspan ?? 1) > 1)
|
||||
cellParts.push(`colspan="${escapeAttr(a.colspan)}"`);
|
||||
if ((a.rowspan ?? 1) > 1)
|
||||
cellParts.push(`rowspan="${escapeAttr(a.rowspan)}"`);
|
||||
if (a.align)
|
||||
cellParts.push(`align="${escapeAttr(a.align)}"`);
|
||||
const open = cellParts.length
|
||||
? `<${tag} ${cellParts.join(" ")}>`
|
||||
: `<${tag}>`;
|
||||
const inner = (cell.content || [])
|
||||
.map((block) => blockToHtml(block))
|
||||
.join("");
|
||||
return `${open}${inner}</${tag}>`;
|
||||
};
|
||||
const htmlRows = tableRows
|
||||
.map((row) => `<tr>${(row.content || []).map(renderHtmlCell).join("")}</tr>`)
|
||||
.join("");
|
||||
return `<table><tbody>${htmlRows}</tbody></table>`;
|
||||
}
|
||||
// No merged cells: emit a GFM table (header row + separator) so the
|
||||
// markdown can be parsed back into a table on re-import.
|
||||
const rows = tableRows.map(processNode);
|
||||
const headerCells = tableRows[0]?.content || [];
|
||||
const columns = headerCells.length || 1;
|
||||
// Derive alignment markers (:--, :-:, --:) from each header cell.
|
||||
const markers = Array.from({ length: columns }, (_, i) => {
|
||||
const align = headerCells[i]?.attrs?.align;
|
||||
switch (align) {
|
||||
case "left":
|
||||
return ":--";
|
||||
case "center":
|
||||
return ":-:";
|
||||
case "right":
|
||||
return "--:";
|
||||
default:
|
||||
return "---";
|
||||
}
|
||||
});
|
||||
const separator = "| " + markers.join(" | ") + " |";
|
||||
return [rows[0], separator, ...rows.slice(1)].join("\n");
|
||||
}
|
||||
case "tableRow":
|
||||
return "| " + nodeContent.map(processNode).join(" | ") + " |";
|
||||
case "tableCell":
|
||||
case "tableHeader": {
|
||||
// Join multiple block children with a space (not "") so adjacent blocks
|
||||
// like a paragraph followed by a list don't collide into "line1- a".
|
||||
// Then collapse newlines and escape pipes so a cell containing "|" or a
|
||||
// line break cannot corrupt the surrounding GFM row.
|
||||
return nodeContent
|
||||
.map(processNode)
|
||||
.join(" ")
|
||||
.replace(/\r?\n/g, " ")
|
||||
.replace(/\|/g, "\\|");
|
||||
}
|
||||
case "callout":
|
||||
const calloutType = node.attrs?.type || "info";
|
||||
const calloutContent = nodeContent.map(processNode).join("\n");
|
||||
return `:::${calloutType.toLowerCase()}\n${calloutContent}\n:::`;
|
||||
case "details":
|
||||
return nodeContent.map(processNode).join("\n");
|
||||
case "detailsSummary":
|
||||
const summaryText = nodeContent.map(processNode).join("");
|
||||
return `<details>\n<summary>${summaryText}</summary>\n`;
|
||||
case "detailsContent":
|
||||
const detailsText = nodeContent.map(processNode).join("\n");
|
||||
return `${detailsText}\n</details>`;
|
||||
case "mathInline": {
|
||||
// The schema's `text` attribute has no parseHTML, so TipTap's default
|
||||
// parser reads it from the `text` HTML attribute (NOT the element's text
|
||||
// content). Emit span[data-type="mathInline"] carrying the LaTeX in a
|
||||
// `text="..."` attribute so it round-trips. marked cannot parse $...$
|
||||
// back, so the previous form was lossy.
|
||||
const inlineMath = node.attrs?.text || "";
|
||||
return `<span data-type="mathInline" data-katex="true" text="${escapeAttr(inlineMath)}"></span>`;
|
||||
}
|
||||
case "mathBlock": {
|
||||
// Same as mathInline: the LaTeX must ride in the `text` HTML attribute
|
||||
// for the schema's default parser to recover it.
|
||||
const blockMath = node.attrs?.text || "";
|
||||
return `<div data-type="mathBlock" data-katex="true" text="${escapeAttr(blockMath)}"></div>`;
|
||||
}
|
||||
case "mention": {
|
||||
// Emit span[data-type="mention"] with the schema's data-* attributes so
|
||||
// generateJSON rebuilds the mention node instead of leaving "@label"
|
||||
// plain text that cannot re-parse.
|
||||
const attrs = node.attrs || {};
|
||||
const parts = [`data-type="mention"`];
|
||||
if (attrs.id)
|
||||
parts.push(`data-id="${escapeAttr(attrs.id)}"`);
|
||||
if (attrs.label)
|
||||
parts.push(`data-label="${escapeAttr(attrs.label)}"`);
|
||||
if (attrs.entityType)
|
||||
parts.push(`data-entity-type="${escapeAttr(attrs.entityType)}"`);
|
||||
if (attrs.entityId)
|
||||
parts.push(`data-entity-id="${escapeAttr(attrs.entityId)}"`);
|
||||
if (attrs.slugId)
|
||||
parts.push(`data-slug-id="${escapeAttr(attrs.slugId)}"`);
|
||||
if (attrs.creatorId)
|
||||
parts.push(`data-creator-id="${escapeAttr(attrs.creatorId)}"`);
|
||||
if (attrs.anchorId)
|
||||
parts.push(`data-anchor-id="${escapeAttr(attrs.anchorId)}"`);
|
||||
// Keep the label as visible text content too; the schema reads attrs
|
||||
// from data-*, so the inner text is purely cosmetic and harmless.
|
||||
const mentionLabel = attrs.label || attrs.id || "";
|
||||
// The label is visible element TEXT content here (the data-* attrs above
|
||||
// carry the real values), so escape it for the text context, not attrs.
|
||||
return `<span ${parts.join(" ")}>@${escapeHtmlText(mentionLabel)}</span>`;
|
||||
}
|
||||
case "attachment": {
|
||||
// BUG FIX: the old code read node.attrs.fileName / node.attrs.src, but
|
||||
// the schema stores name/url (plus mime/size/attachmentId). Emit the
|
||||
// schema-matching div[data-type="attachment"] with data-attachment-*
|
||||
// attrs so the node round-trips instead of degrading to a markdown link.
|
||||
const attrs = node.attrs || {};
|
||||
const parts = [
|
||||
`data-type="attachment"`,
|
||||
`data-attachment-url="${escapeAttr(attrs.url ?? "")}"`,
|
||||
];
|
||||
if (attrs.name)
|
||||
parts.push(`data-attachment-name="${escapeAttr(attrs.name)}"`);
|
||||
if (attrs.mime)
|
||||
parts.push(`data-attachment-mime="${escapeAttr(attrs.mime)}"`);
|
||||
if (attrs.size != null)
|
||||
parts.push(`data-attachment-size="${escapeAttr(attrs.size)}"`);
|
||||
if (attrs.attachmentId)
|
||||
parts.push(`data-attachment-id="${escapeAttr(attrs.attachmentId)}"`);
|
||||
return `<div ${parts.join(" ")}></div>`;
|
||||
}
|
||||
case "drawio":
|
||||
case "excalidraw": {
|
||||
// Emit the schema-matching div[data-type=...] carrying the diagram's
|
||||
// attrs as data-* (the schema's diagramAttributes reads src/title/alt/
|
||||
// width/height/size/aspectRatio/align/attachmentId from data-*), so the
|
||||
// diagram round-trips instead of degrading to a lossy placeholder.
|
||||
const attrs = node.attrs || {};
|
||||
const parts = [
|
||||
`data-type="${type}"`,
|
||||
`data-src="${escapeAttr(attrs.src ?? "")}"`,
|
||||
];
|
||||
if (attrs.title != null)
|
||||
parts.push(`data-title="${escapeAttr(attrs.title)}"`);
|
||||
if (attrs.alt != null)
|
||||
parts.push(`data-alt="${escapeAttr(attrs.alt)}"`);
|
||||
if (attrs.width != null)
|
||||
parts.push(`data-width="${escapeAttr(attrs.width)}"`);
|
||||
if (attrs.height != null)
|
||||
parts.push(`data-height="${escapeAttr(attrs.height)}"`);
|
||||
if (attrs.size != null)
|
||||
parts.push(`data-size="${escapeAttr(attrs.size)}"`);
|
||||
if (attrs.aspectRatio != null)
|
||||
parts.push(`data-aspect-ratio="${escapeAttr(attrs.aspectRatio)}"`);
|
||||
if (attrs.align)
|
||||
parts.push(`data-align="${escapeAttr(attrs.align)}"`);
|
||||
if (attrs.attachmentId)
|
||||
parts.push(`data-attachment-id="${escapeAttr(attrs.attachmentId)}"`);
|
||||
return `<div ${parts.join(" ")}></div>`;
|
||||
}
|
||||
case "embed": {
|
||||
// Emit the schema-matching div[data-type="embed"]; the schema reads
|
||||
// src/provider/align/width/height from data-* attributes so the node
|
||||
// (and its provider iframe info) survives the round-trip.
|
||||
const attrs = node.attrs || {};
|
||||
const parts = [
|
||||
`data-type="embed"`,
|
||||
`data-src="${escapeAttr(attrs.src ?? "")}"`,
|
||||
`data-provider="${escapeAttr(attrs.provider ?? "")}"`,
|
||||
];
|
||||
if (attrs.align)
|
||||
parts.push(`data-align="${escapeAttr(attrs.align)}"`);
|
||||
if (attrs.width != null)
|
||||
parts.push(`data-width="${escapeAttr(attrs.width)}"`);
|
||||
if (attrs.height != null)
|
||||
parts.push(`data-height="${escapeAttr(attrs.height)}"`);
|
||||
return `<div ${parts.join(" ")}></div>`;
|
||||
}
|
||||
case "audio": {
|
||||
// Emit the schema-matching <audio> element (was emitting nothing). The
|
||||
// schema reads src from src and attachmentId/size from data-*.
|
||||
const attrs = node.attrs || {};
|
||||
const parts = [`src="${escapeAttr(attrs.src ?? "")}"`];
|
||||
if (attrs.attachmentId)
|
||||
parts.push(`data-attachment-id="${escapeAttr(attrs.attachmentId)}"`);
|
||||
if (attrs.size != null)
|
||||
parts.push(`data-size="${escapeAttr(attrs.size)}"`);
|
||||
// Wrap in a block <div> for the same reason as video: a bare <audio> is
|
||||
// inline-level HTML that marked would wrap in <p>.
|
||||
return `<div><audio ${parts.join(" ")}></audio></div>`;
|
||||
}
|
||||
case "pdf": {
|
||||
// Emit the schema-matching div[data-type="pdf"] (was emitting nothing).
|
||||
// The schema reads src/width/height from standard attrs and name/
|
||||
// attachmentId/size from data-*.
|
||||
const attrs = node.attrs || {};
|
||||
const parts = [
|
||||
`data-type="pdf"`,
|
||||
`src="${escapeAttr(attrs.src ?? "")}"`,
|
||||
];
|
||||
if (attrs.name)
|
||||
parts.push(`data-name="${escapeAttr(attrs.name)}"`);
|
||||
if (attrs.attachmentId)
|
||||
parts.push(`data-attachment-id="${escapeAttr(attrs.attachmentId)}"`);
|
||||
if (attrs.size != null)
|
||||
parts.push(`data-size="${escapeAttr(attrs.size)}"`);
|
||||
if (attrs.width != null)
|
||||
parts.push(`width="${escapeAttr(attrs.width)}"`);
|
||||
if (attrs.height != null)
|
||||
parts.push(`height="${escapeAttr(attrs.height)}"`);
|
||||
return `<div ${parts.join(" ")}></div>`;
|
||||
}
|
||||
case "columns": {
|
||||
// Emit the schema-matching div[data-type="columns"] wrapper so the
|
||||
// multi-column layout survives. Without a case the children were
|
||||
// concatenated with no separator and the text merged. The schema reads
|
||||
// layout from data-layout and widthMode from data-width-mode. The whole
|
||||
// block is raw HTML, so render children via blockToHtml (NOT markdown,
|
||||
// which marked would not re-parse inside a raw HTML block).
|
||||
const attrs = node.attrs || {};
|
||||
const parts = [`data-type="columns"`];
|
||||
if (attrs.layout)
|
||||
parts.push(`data-layout="${escapeAttr(attrs.layout)}"`);
|
||||
if (attrs.widthMode && attrs.widthMode !== "normal")
|
||||
parts.push(`data-width-mode="${escapeAttr(attrs.widthMode)}"`);
|
||||
const inner = nodeContent.map((n) => blockToHtml(n)).join("");
|
||||
return `<div ${parts.join(" ")}>${inner}</div>`;
|
||||
}
|
||||
case "column": {
|
||||
// Emit the schema-matching div[data-type="column"]; the schema reads the
|
||||
// column width from data-width. Children are rendered as HTML so their
|
||||
// formatting survives inside this raw HTML block.
|
||||
const attrs = node.attrs || {};
|
||||
const parts = [`data-type="column"`];
|
||||
if (attrs.width)
|
||||
parts.push(`data-width="${escapeAttr(attrs.width)}"`);
|
||||
const inner = nodeContent.map((n) => blockToHtml(n)).join("");
|
||||
return `<div ${parts.join(" ")}>${inner}</div>`;
|
||||
}
|
||||
case "subpages":
|
||||
return "{{SUBPAGES}}";
|
||||
default:
|
||||
// Fallback: process children
|
||||
return nodeContent.map(processNode).join("");
|
||||
}
|
||||
};
|
||||
// Render inline content (text runs + their marks) to HTML. Used by the raw
|
||||
// HTML fallbacks (spanned tables, columns) where marked will NOT re-parse
|
||||
// markdown, so backtick/asterisk/bracket syntax would otherwise leak as
|
||||
// literal characters. Each mark is mirrored to the HTML the schema's parseHTML
|
||||
// accepts so it re-imports as the matching ProseMirror mark.
|
||||
const inlineToHtml = (inlineNodes) => (inlineNodes || [])
|
||||
.map((n) => {
|
||||
if (n.type === "hardBreak")
|
||||
return "<br>";
|
||||
if (n.type !== "text") {
|
||||
// Inline atoms (mention, mathInline) already emit schema HTML.
|
||||
return processNode(n);
|
||||
}
|
||||
let t = escapeHtmlText(n.text || "");
|
||||
for (const mark of n.marks || []) {
|
||||
switch (mark.type) {
|
||||
case "bold":
|
||||
t = `<strong>${t}</strong>`;
|
||||
break;
|
||||
case "italic":
|
||||
t = `<em>${t}</em>`;
|
||||
break;
|
||||
case "code":
|
||||
t = `<code>${t}</code>`;
|
||||
break;
|
||||
case "strike":
|
||||
t = `<s>${t}</s>`;
|
||||
break;
|
||||
case "underline":
|
||||
t = `<u>${t}</u>`;
|
||||
break;
|
||||
case "subscript":
|
||||
t = `<sub>${t}</sub>`;
|
||||
break;
|
||||
case "superscript":
|
||||
t = `<sup>${t}</sup>`;
|
||||
break;
|
||||
case "link":
|
||||
t = `<a href="${escapeAttr(mark.attrs?.href || "")}">${t}</a>`;
|
||||
break;
|
||||
case "highlight":
|
||||
t = mark.attrs?.color
|
||||
? `<mark style="background-color: ${escapeAttr(mark.attrs.color)}">${t}</mark>`
|
||||
: `<mark>${t}</mark>`;
|
||||
break;
|
||||
case "textStyle":
|
||||
if (mark.attrs?.color)
|
||||
t = `<span style="color: ${escapeAttr(mark.attrs.color)}">${t}</span>`;
|
||||
break;
|
||||
case "comment":
|
||||
// Inline comment anchor inside a raw-HTML container (columns /
|
||||
// spanned table cells), so commented text there also round-trips.
|
||||
if (mark.attrs?.commentId) {
|
||||
const r = mark.attrs?.resolved ? ` data-resolved="true"` : "";
|
||||
t = `<span data-comment-id="${escapeAttr(mark.attrs.commentId)}"${r}>${t}</span>`;
|
||||
}
|
||||
break;
|
||||
}
|
||||
}
|
||||
return t;
|
||||
})
|
||||
.join("");
|
||||
// Emit the schema-matching <img> for an image node. Shared so the image is
|
||||
// emitted as real HTML wherever a raw-HTML container needs it (inside a column
|
||||
// or a spanned table cell), where markdown `![alt](src)` would NOT be re-parsed
|
||||
// and would survive as literal text. The Image extension reads src/alt from
|
||||
// the standard attributes; the Docmost extra attrs (width/height/align/size/
|
||||
// attachmentId/aspectRatio) are global attributes read from same-named DOM
|
||||
// attributes, so emit them by name.
|
||||
const imageToHtml = (node) => {
|
||||
const attrs = node.attrs || {};
|
||||
const parts = [`src="${escapeAttr(attrs.src ?? "")}"`];
|
||||
if (attrs.alt)
|
||||
parts.push(`alt="${escapeAttr(attrs.alt)}"`);
|
||||
if (attrs.title)
|
||||
parts.push(`title="${escapeAttr(attrs.title)}"`);
|
||||
if (attrs.width != null)
|
||||
parts.push(`width="${escapeAttr(attrs.width)}"`);
|
||||
if (attrs.height != null)
|
||||
parts.push(`height="${escapeAttr(attrs.height)}"`);
|
||||
if (attrs.align)
|
||||
parts.push(`align="${escapeAttr(attrs.align)}"`);
|
||||
if (attrs.size != null)
|
||||
parts.push(`data-size="${escapeAttr(attrs.size)}"`);
|
||||
if (attrs.attachmentId)
|
||||
parts.push(`data-attachment-id="${escapeAttr(attrs.attachmentId)}"`);
|
||||
if (attrs.aspectRatio != null)
|
||||
parts.push(`data-aspect-ratio="${escapeAttr(attrs.aspectRatio)}"`);
|
||||
return `<img ${parts.join(" ")}>`;
|
||||
};
|
||||
// Emit the schema-matching div[data-type="callout"] for a callout node. The
|
||||
// schema reads the banner type from data-callout-type. Children are rendered
|
||||
// as HTML so they survive inside a raw-HTML container.
|
||||
const calloutToHtml = (node) => {
|
||||
const type = (node.attrs?.type || "info").toLowerCase();
|
||||
const inner = (node.content || []).map(blockToHtml).join("");
|
||||
return `<div data-type="callout" data-callout-type="${escapeAttr(type)}">${inner}</div>`;
|
||||
};
|
||||
// Emit a schema-matching <details> tree. The schema parses <details>,
|
||||
// summary[data-type="detailsSummary"], and div[data-type="detailsContent"].
|
||||
const detailsToHtml = (node) => {
|
||||
const inner = (node.content || []).map(blockToHtml).join("");
|
||||
return `<details>${inner}</details>`;
|
||||
};
|
||||
const detailsSummaryToHtml = (node) => `<summary data-type="detailsSummary">${inlineToHtml(node.content || [])}</summary>`;
|
||||
const detailsContentToHtml = (node) => {
|
||||
const inner = (node.content || []).map(blockToHtml).join("");
|
||||
return `<div data-type="detailsContent">${inner}</div>`;
|
||||
};
|
||||
// Emit the schema-matching taskList/taskItem HTML. bridgeTaskLists (in
|
||||
// collaboration.ts) recognizes ul[data-type="taskList"] with
|
||||
// li[data-type="taskItem"][data-checked]; emitting that directly here keeps
|
||||
// task lists inside columns/cells from degrading to literal "- [ ]" text.
|
||||
const taskListToHtml = (node) => {
|
||||
const items = (node.content || [])
|
||||
.map((it) => {
|
||||
const checked = it.attrs?.checked ? "true" : "false";
|
||||
return `<li data-type="taskItem" data-checked="${checked}">${blockChildrenToHtml(it)}</li>`;
|
||||
})
|
||||
.join("");
|
||||
return `<ul data-type="taskList">${items}</ul>`;
|
||||
};
|
||||
// Render a block node to HTML for the raw-HTML containers (spanned tables,
|
||||
// columns). marked does NOT re-parse markdown inside a raw-HTML block, so
|
||||
// EVERY block type that can appear inside a column or a spanned cell must be
|
||||
// emitted as schema-matching HTML here — never as markdown, or it would land
|
||||
// as literal text on re-import. Nodes whose processNode case already produces
|
||||
// schema-matching HTML (math/media/embed/attachment/nested columns/spanned
|
||||
// table) are delegated to processNode; the markdown-emitting cases
|
||||
// (image/blockquote/callout/details/hr/taskList) get explicit HTML here.
|
||||
const blockToHtml = (block) => {
|
||||
const children = block.content || [];
|
||||
switch (block.type) {
|
||||
case "paragraph":
|
||||
return `<p>${inlineToHtml(children)}</p>`;
|
||||
case "heading": {
|
||||
const level = block.attrs?.level || 1;
|
||||
return `<h${level}>${inlineToHtml(children)}</h${level}>`;
|
||||
}
|
||||
case "bulletList":
|
||||
return `<ul>${children
|
||||
.map((li) => `<li>${blockChildrenToHtml(li)}</li>`)
|
||||
.join("")}</ul>`;
|
||||
case "orderedList":
|
||||
return `<ol>${children
|
||||
.map((li) => `<li>${blockChildrenToHtml(li)}</li>`)
|
||||
.join("")}</ol>`;
|
||||
case "codeBlock": {
|
||||
const lang = block.attrs?.language || "";
|
||||
// The code itself is element TEXT content (between <code> tags), so it
|
||||
// must escape < > & — NOT the attribute escaper. The language rides in
|
||||
// a class ATTRIBUTE, so it uses escapeAttr.
|
||||
const code = escapeHtmlText(children
|
||||
.map(processNode)
|
||||
.join("")
|
||||
.replace(/\n+$/, ""));
|
||||
const cls = lang ? ` class="language-${escapeAttr(lang)}"` : "";
|
||||
return `<pre><code${cls}>${code}</code></pre>`;
|
||||
}
|
||||
case "image":
|
||||
return imageToHtml(block);
|
||||
case "blockquote":
|
||||
return `<blockquote>${children.map(blockToHtml).join("")}</blockquote>`;
|
||||
case "horizontalRule":
|
||||
return "<hr>";
|
||||
case "callout":
|
||||
return calloutToHtml(block);
|
||||
case "details":
|
||||
return detailsToHtml(block);
|
||||
case "detailsSummary":
|
||||
return detailsSummaryToHtml(block);
|
||||
case "detailsContent":
|
||||
return detailsContentToHtml(block);
|
||||
case "taskList":
|
||||
return taskListToHtml(block);
|
||||
case "taskItem":
|
||||
// A bare taskItem (outside a taskList) still needs a wrapping list so
|
||||
// the schema parses it; wrap it in a single-item taskList.
|
||||
return taskListToHtml({ content: [block] });
|
||||
// table (incl. spanned), columns/column, math, media, embed, attachment,
|
||||
// mention, etc. already emit schema-matching HTML from processNode.
|
||||
case "table":
|
||||
case "columns":
|
||||
case "column":
|
||||
case "mathBlock":
|
||||
case "video":
|
||||
case "audio":
|
||||
case "pdf":
|
||||
case "youtube":
|
||||
case "embed":
|
||||
case "attachment":
|
||||
case "drawio":
|
||||
case "excalidraw":
|
||||
return processNode(block);
|
||||
default:
|
||||
// Any still-unhandled block type: NEVER fall back to markdown inside a
|
||||
// raw-HTML block (it would become literal text). Wrap its rendered
|
||||
// children in a <div> so their content is preserved; if it has no block
|
||||
// children, render its inline content instead.
|
||||
if (children.length && children.some((c) => c.type !== "text")) {
|
||||
return `<div>${children.map(blockToHtml).join("")}</div>`;
|
||||
}
|
||||
return `<div>${inlineToHtml(children)}</div>`;
|
||||
}
|
||||
};
|
||||
// Render the block children of a list item to HTML (a listItem holds block+
|
||||
// content). Mirrors processListItem but for the HTML fallback path.
|
||||
const blockChildrenToHtml = (item) => (item.content || []).map((b) => blockToHtml(b)).join("");
|
||||
// Indent the rendered children of a list item under a marker prefix.
|
||||
// Each child block is a (possibly multi-line) string. The very first physical
|
||||
// line of the first child carries the marker (e.g. "- " or "1. "); EVERY
|
||||
// other line — the remaining lines of the first child AND all lines of every
|
||||
// subsequent child (nested lists, code blocks, extra paragraphs) — is indented
|
||||
// to align under the marker. Without indenting these continuation lines, the
|
||||
// 2nd/3rd line of a nested child collapses to column 0 and escapes the list.
|
||||
//
|
||||
// The continuation indent MUST equal the LIST marker width, which is not the
|
||||
// same as the visible prefix width:
|
||||
// - bullet "- " -> 2 columns
|
||||
// - task "- [ ] " -> marker is still "- " (the "[ ] " is content) -> 2 columns
|
||||
// - ordered "1. "/"10. " -> 3/4 columns, scaling with the number's digits
|
||||
// CommonMark anchors nested content to the marker column, so an ordered item
|
||||
// indented to only 2 columns would be re-parsed as sibling or loose content on
|
||||
// re-import. Callers therefore pass the exact indent width to use.
|
||||
const indentItemChildren = (childStrings, prefix, indentWidth) => {
|
||||
const indent = " ".repeat(indentWidth);
|
||||
const lines = [];
|
||||
childStrings.forEach((child, childIndex) => {
|
||||
child.split("\n").forEach((line, lineIndex) => {
|
||||
if (childIndex === 0 && lineIndex === 0) {
|
||||
// First physical line of the first block gets the marker.
|
||||
lines.push(`${prefix} ${line}`);
|
||||
}
|
||||
else {
|
||||
// Indent every continuation line by the marker width; keep blank
|
||||
// lines blank rather than emitting trailing whitespace.
|
||||
lines.push(line.length ? `${indent}${line}` : "");
|
||||
}
|
||||
});
|
||||
});
|
||||
return lines.join("\n");
|
||||
};
|
||||
const processListItem = (item, prefix) => {
|
||||
const itemContent = item.content || [];
|
||||
const childStrings = itemContent.map(processNode);
|
||||
if (childStrings.length === 0)
|
||||
return prefix;
|
||||
// The rendered marker is `${prefix} ` (prefix + one space), so its width —
|
||||
// and thus the continuation indent — is prefix.length + 1. This is correct
|
||||
// for both bullet ("-" -> 2) and ordered ("1." -> 3, "10." -> 4) markers,
|
||||
// since for those the visible prefix IS the list marker.
|
||||
return indentItemChildren(childStrings, prefix, prefix.length + 1);
|
||||
};
|
||||
const processTaskItem = (item) => {
|
||||
const checked = item.attrs?.checked || false;
|
||||
const checkbox = checked ? "[x]" : "[ ]";
|
||||
const prefix = `- ${checkbox}`;
|
||||
const itemContent = item.content || [];
|
||||
const childStrings = itemContent.map(processNode);
|
||||
// An empty task item still needs its checkbox marker; without this guard
|
||||
// the indent below produces "" and the "- [ ]"/"- [x]" row disappears.
|
||||
if (childStrings.length === 0)
|
||||
return prefix;
|
||||
// The list marker for a task item is just "- " (2 columns); the "[ ] "/"[x] "
|
||||
// checkbox is item content, NOT part of the marker. So the continuation
|
||||
// indent is a fixed 2 — do NOT derive it from the wider prefix.length.
|
||||
return indentItemChildren(childStrings, prefix, 2);
|
||||
};
|
||||
return processNode(content).trim();
|
||||
}
|
||||
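The continuation-indent rule used by `processListItem` and `processTaskItem` can be shown in isolation. The following is a minimal sketch, not the converter itself: `indentUnderMarker` is a hypothetical stand-alone helper that mirrors the behaviour described for `indentItemChildren`, and the sample strings are invented for illustration.

```javascript
// Hypothetical stand-alone helper (an assumption, not part of the package):
// the first physical line gets the marker, every other non-blank line is
// indented by `indentWidth` columns, and blank lines stay blank.
function indentUnderMarker(childStrings, prefix, indentWidth) {
  const indent = " ".repeat(indentWidth);
  const lines = [];
  childStrings.forEach((child, childIndex) => {
    child.split("\n").forEach((line, lineIndex) => {
      if (childIndex === 0 && lineIndex === 0) {
        lines.push(`${prefix} ${line}`);
      } else {
        lines.push(line.length ? `${indent}${line}` : "");
      }
    });
  });
  return lines.join("\n");
}

// Ordered item: the marker "10. " is 4 columns wide (prefix.length + 1).
const ordered = indentUnderMarker(["first line\nsecond line", "- nested"], "10.", 4);

// Task item: the visible prefix is "- [ ]", but the list marker is only "- ",
// so the continuation indent is a fixed 2 columns.
const task = indentUnderMarker(["todo", "more detail"], "- [ ]", 2);

console.log(ordered);
console.log(task);
```

Passing the indent width explicitly, rather than deriving it from the prefix inside the helper, is what lets the task-item caller use 2 columns even though its visible prefix is wider.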
104
packages/mcp/build/lib/markdown-document.js
Normal file
@@ -0,0 +1,104 @@
|
||||
/**
|
||||
* Self-contained Docmost-flavoured Markdown document (custom extensions).
|
||||
*
|
||||
* A single `.md` file that packages everything needed to losslessly round-trip
|
||||
* a page through "download -> edit body -> re-upload":
|
||||
* - a leading `docmost:meta` block: a one-line JSON object with page identity;
|
||||
* - the Markdown body (carrying inline comment anchors and diagrams as HTML);
|
||||
* - a trailing `docmost:comments` block: a one-line JSON array of comment
|
||||
* threads.
|
||||
*
|
||||
* Both metadata blocks are HTML comments on purpose: `marked`/`generateJSON`
|
||||
* drop HTML comments, so even if the WHOLE file were ever fed straight to the
|
||||
* importer without first stripping the blocks, the metadata cannot leak into the
|
||||
* document. (A fenced ```docmost-comments``` block would WRONGLY become a
|
||||
* codeBlock node, so a fenced block is deliberately NOT used.)
|
||||
*
|
||||
* The delimiter literals may legitimately appear in the BODY too (e.g. a user
|
||||
* re-pastes an exported `.md` into a page, or a page documents this very
|
||||
* format). To stay robust, parsing treats only the FINAL, document-ending
|
||||
* `docmost:comments` block as metadata: it is the last `<!-- docmost:comments`
|
||||
* opener whose closing `-->` sits at the very end of the file. Any earlier
|
||||
* literal occurrence is left in the body untouched.
|
||||
*
|
||||
* NOTE on comments: in this version the comment THREAD records are preserved in
|
||||
* the file but are NOT pushed back to the server on import — only the inline
|
||||
* comment marks (anchors) embedded in the body are restored. Managing comment
|
||||
* records stays with the comment tools/UI.
|
||||
*/
|
||||
// Match the leading meta block (allow leading whitespace). Capture group 1 is
|
||||
// the JSON text between the markers.
|
||||
const META_RE = /^\s*<!--\s*docmost:meta\s*\n([\s\S]*?)\n-->/;
|
||||
// Match a `docmost:comments` opener. Used globally to scan for the LAST opener
|
||||
// rather than end-anchoring a single regex (which would mis-capture across a
|
||||
// literal opener that appears earlier in the body).
|
||||
const COMMENTS_OPEN_RE = /<!--[ \t]*docmost:comments[ \t]*\r?\n/g;
|
||||
/**
|
||||
* Assemble the full self-contained markdown file: meta block, body, and the
|
||||
* comments block. The meta block is always emitted; the comments block is always
|
||||
* emitted too (with `[]` when there are no comments) so the format stays uniform
|
||||
* and parsing stays simple.
|
||||
*/
|
||||
export function serializeDocmostMarkdown(meta, body, comments) {
|
||||
const metaJson = JSON.stringify(meta);
|
||||
const commentsJson = JSON.stringify(Array.isArray(comments) ? comments : []);
|
||||
const trimmedBody = (body ?? "").trim();
|
||||
return (`<!-- docmost:meta\n${metaJson}\n-->\n\n` +
|
||||
`${trimmedBody}\n\n` +
|
||||
`<!-- docmost:comments\n${commentsJson}\n-->\n`);
|
||||
}
|
||||
/**
|
||||
* Split a self-contained file back into its parts. Tolerant: if the meta or
|
||||
* comments block is missing (e.g. a hand-written plain-markdown file), the
|
||||
* corresponding value is returned as `null` and the whole input is treated as
|
||||
* the body. This never throws on a MISSING block; only a `JSON.parse` failure
|
||||
* inside a block that IS present is surfaced as a thrown Error with a clear
|
||||
* message. Robust to `\r\n` line endings.
|
||||
*/
|
||||
export function parseDocmostMarkdown(full) {
|
||||
// Normalize line endings so the anchored regexes work regardless of CRLF.
|
||||
const normalized = (full ?? "").replace(/\r\n/g, "\n");
|
||||
// Extract the leading meta block (start-anchored — already unambiguous).
|
||||
let meta = null;
|
||||
let metaEnd = 0;
|
||||
const metaMatch = normalized.match(META_RE);
|
||||
if (metaMatch) {
|
||||
try {
|
||||
meta = JSON.parse(metaMatch[1]);
|
||||
}
|
||||
catch (e) {
|
||||
throw new Error(`Invalid docmost:meta JSON block: ${e instanceof Error ? e.message : String(e)}`);
|
||||
}
|
||||
// Body starts right after the matched meta block.
|
||||
metaEnd = (metaMatch.index ?? 0) + metaMatch[0].length;
|
||||
}
|
||||
// Find the LAST `<!-- docmost:comments` opener; the real file-level block is
|
||||
// the final one whose closing `-->` ends the document. Any earlier literal
|
||||
// occurrence inside the body (e.g. a re-pasted export) is left in the body.
|
||||
let lastOpenStart = -1;
|
||||
let lastOpenEnd = -1;
|
||||
let m;
|
||||
COMMENTS_OPEN_RE.lastIndex = 0;
|
||||
while ((m = COMMENTS_OPEN_RE.exec(normalized)) !== null) {
|
||||
lastOpenStart = m.index;
|
||||
lastOpenEnd = m.index + m[0].length;
|
||||
}
|
||||
let comments = null;
|
||||
let bodyEnd = normalized.length;
|
||||
if (lastOpenStart !== -1) {
|
||||
const rest = normalized.slice(lastOpenEnd);
|
||||
const close = rest.match(/\r?\n-->[ \t]*\r?\n?\s*$/); // closer must end the doc
|
||||
if (close) {
|
||||
const jsonText = rest.slice(0, close.index);
|
||||
try {
|
||||
comments = JSON.parse(jsonText);
|
||||
}
|
||||
catch (e) {
|
||||
throw new Error(`Invalid docmost:comments JSON block: ${e instanceof Error ? e.message : String(e)}`);
|
||||
}
|
||||
bodyEnd = lastOpenStart; // strip from the opener to end of document
|
||||
}
|
||||
}
|
||||
const body = normalized.slice(metaEnd, bodyEnd).trim();
|
||||
return { meta, body, comments };
|
||||
}
|
||||
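The "only the final, document-ending `docmost:comments` block is metadata" rule is easiest to see on a file whose body also contains the delimiter literal. The following is a simplified sketch under stated assumptions: `splitDocmostFile` is a hypothetical condensed restatement of the parsing rule (without the wrapped error messages), and the `pageId` and `id` keys in the sample JSON are invented for illustration.

```javascript
// Hypothetical condensed restatement of the parsing rule (illustration only).
function splitDocmostFile(full) {
  const text = String(full ?? "").replace(/\r\n/g, "\n");
  let meta = null;
  let bodyStart = 0;
  const metaMatch = text.match(/^\s*<!--\s*docmost:meta\s*\n([\s\S]*?)\n-->/);
  if (metaMatch) {
    meta = JSON.parse(metaMatch[1]);
    bodyStart = metaMatch[0].length;
  }
  // Scan for the LAST comments opener; earlier ones belong to the body.
  const openRe = /<!--[ \t]*docmost:comments[ \t]*\n/g;
  let openStart = -1;
  let openEnd = -1;
  let m;
  while ((m = openRe.exec(text)) !== null) {
    openStart = m.index;
    openEnd = m.index + m[0].length;
  }
  let comments = null;
  let bodyEnd = text.length;
  if (openStart !== -1) {
    const rest = text.slice(openEnd);
    const close = rest.match(/\n-->\s*$/); // the closer must end the document
    if (close) {
      comments = JSON.parse(rest.slice(0, close.index));
      bodyEnd = openStart;
    }
  }
  return { meta, body: text.slice(bodyStart, bodyEnd).trim(), comments };
}

// A body that quotes an older export, followed by the real trailing block.
const file =
  '<!-- docmost:meta\n{"pageId":"p1"}\n-->\n\n' +
  "Body text quoting an old export:\n\n" +
  "<!-- docmost:comments\n[]\n-->\n\n" +
  "More body text.\n\n" +
  '<!-- docmost:comments\n[{"id":"c1"}]\n-->\n';

const parsed = splitDocmostFile(file);
console.log(parsed.meta, parsed.comments);
console.log(parsed.body);
```

The quoted block stays in `body` untouched, while only the trailing block is parsed into `comments`.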
770
packages/mcp/build/lib/node-ops.js
Normal file
@@ -0,0 +1,770 @@
|
||||
/**
|
||||
* Pure, network-free helpers for manipulating a ProseMirror/TipTap document
|
||||
* tree by node id.
|
||||
*
|
||||
* A ProseMirror node here is a plain JSON object of the shape produced by
|
||||
* Docmost: `{ type, attrs?, content?, text?, marks? }`. Children live in the
|
||||
* `content` array; a node carries a stable id in `attrs.id`. Callouts and
|
||||
* table cells hold their children in `content` just like any other block, so a
|
||||
* single recursive walk reaches them all.
|
||||
*
|
||||
* Every exported function operates on a DEEP CLONE of the input document and
|
||||
* returns the new document. The input doc and any `newNode`/`node` argument are
|
||||
* never mutated. All functions are defensively null-safe: missing/non-array
|
||||
* `content`, non-object nodes, and absent `attrs` are tolerated.
|
||||
*/
|
||||
/** Deep-clone a JSON-serializable value without mutating the original. */
|
||||
function clone(value) {
|
||||
if (typeof structuredClone === "function") {
|
||||
return structuredClone(value);
|
||||
}
|
||||
// Fallback for environments without structuredClone.
|
||||
return JSON.parse(JSON.stringify(value));
|
||||
}
|
||||
/** True if `value` is a non-null object (and not an array). */
|
||||
function isObject(value) {
|
||||
return value != null && typeof value === "object" && !Array.isArray(value);
|
||||
}
|
||||
/** True if `node` carries the given id in `node.attrs.id`. */
|
||||
function matchesId(node, nodeId) {
|
||||
return isObject(node) && isObject(node.attrs) && node.attrs.id === nodeId;
|
||||
}
|
||||
/**
|
||||
* Recursively concatenate all text contained in a node.
|
||||
*
|
||||
* Text nodes contribute their `text` string; container nodes contribute the
|
||||
* joined `blockPlainText` of their `content` children. Returns "" for nullish
|
||||
* or non-object inputs.
|
||||
*/
|
||||
export function blockPlainText(node) {
|
||||
if (!isObject(node))
|
||||
return "";
|
||||
let out = "";
|
||||
if (typeof node.text === "string") {
|
||||
out += node.text;
|
||||
}
|
||||
if (Array.isArray(node.content)) {
|
||||
for (const child of node.content) {
|
||||
out += blockPlainText(child);
|
||||
}
|
||||
}
|
||||
return out;
|
||||
}
|
||||
/** Keep at most the first `n` chars of `text`, appending "…" when cut. */
|
||||
function truncate(text, n) {
|
||||
return text.length > n ? text.slice(0, n) + "…" : text;
|
||||
}
|
||||
/**
|
||||
* Build a COMPACT outline of the TOP-LEVEL blocks of `doc` (the entries in
|
||||
* `doc.content`). Deliberately does NOT recurse into paragraphs, list items, or
|
||||
* table cells — compactness is the point; use `getNodeByRef` to drill into a
|
||||
* specific block.
|
||||
*
|
||||
* Each entry carries `{ index, type, id, firstText }`, plus type-specific
|
||||
* extras: headings add `level`; tables add `rows`/`cols` and the first row's
|
||||
* cell texts as `header`; list blocks (types ending in "List") add `items`.
|
||||
* `firstText` is the block's plain text truncated to 100 chars. Null-safe:
|
||||
* a missing or non-object doc/content yields `[]`.
|
||||
*/
|
||||
export function buildOutline(doc) {
|
||||
if (!isObject(doc) || !Array.isArray(doc.content))
|
||||
return [];
|
||||
const out = [];
|
||||
for (let i = 0; i < doc.content.length; i++) {
|
||||
const block = doc.content[i];
|
||||
const type = isObject(block) ? block.type : undefined;
|
||||
const entry = {
|
||||
index: i,
|
||||
type,
|
||||
id: isObject(block) && isObject(block.attrs) ? block.attrs.id ?? null : null,
|
||||
firstText: truncate(blockPlainText(block), 100),
|
||||
};
|
||||
if (type === "heading") {
|
||||
entry.level = isObject(block.attrs) ? block.attrs.level ?? null : null;
|
||||
}
|
||||
else if (type === "table") {
|
||||
const headerRow = block.content?.[0]?.content ?? [];
|
||||
entry.rows = block.content?.length ?? 0;
|
||||
entry.cols = block.content?.[0]?.content?.length ?? 0;
|
||||
entry.header = headerRow.map((cell) => truncate(blockPlainText(cell), 40));
|
||||
}
|
||||
else if (typeof type === "string" && type.endsWith("List")) {
|
||||
entry.items = block.content?.length ?? 0;
|
||||
}
|
||||
out.push(entry);
|
||||
}
|
||||
return out;
|
||||
}
|
||||
/**
|
||||
* Resolve a single node by reference and return `{ node, path, type }`, or
|
||||
* `null` when nothing matches.
|
||||
*
|
||||
* - `ref` of the form `#<n>` (e.g. `#2`) selects the TOP-LEVEL block at index
|
||||
* `n` in `doc.content`. It is the only way to address a table; tableRow/
|
||||
* tableCell nodes (also without `attrs.id`) are reached via the returned table.
|
||||
* - Otherwise `ref` is treated as a block id: the FIRST node anywhere in the
|
||||
* tree with `attrs.id === ref` is returned.
|
||||
*
|
||||
* `path` is the array of child indices from the doc root down to the node
|
||||
* (so a top-level block is `[index]`). The returned `node` is a DEEP CLONE,
|
||||
* so callers can mutate it without touching the input doc. Null-safe.
|
||||
*/
|
||||
export function getNodeByRef(doc, ref) {
|
||||
if (!isObject(doc))
|
||||
return null;
|
||||
// "#<n>": index into the top-level content array.
|
||||
const indexMatch = typeof ref === "string" ? ref.match(/^#(\d+)$/) : null;
|
||||
if (indexMatch) {
|
||||
const index = Number(indexMatch[1]);
|
||||
const block = Array.isArray(doc.content) ? doc.content[index] : undefined;
|
||||
if (!isObject(block))
|
||||
return null;
|
||||
return { node: clone(block), path: [index], type: block.type };
|
||||
}
|
||||
// Otherwise: depth-first search for the first node with attrs.id === ref.
|
||||
const search = (node, trail) => {
|
||||
if (!isObject(node))
|
||||
return null;
|
||||
if (Array.isArray(node.content)) {
|
||||
for (let i = 0; i < node.content.length; i++) {
|
||||
const child = node.content[i];
|
||||
const path = [...trail, i];
|
||||
if (matchesId(child, ref)) {
|
||||
return { node: clone(child), path, type: child.type };
|
||||
}
|
||||
const hit = search(child, path);
|
||||
if (hit != null)
|
||||
return hit;
|
||||
}
|
||||
}
|
||||
return null;
|
||||
};
|
||||
return search(doc, []);
|
||||
}
|
||||
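The two reference forms accepted by `getNodeByRef` can be illustrated with a small document. This is a minimal sketch, not the exported function: `resolveRef` is a hypothetical condensed version that returns only `{ path, type }` and skips the deep clone of the node, and the sample ids are invented.

```javascript
// Hypothetical condensed resolver (illustration only; no node clone returned).
function resolveRef(doc, ref) {
  // "#<n>" addresses a top-level block by index.
  const indexMatch = typeof ref === "string" ? ref.match(/^#(\d+)$/) : null;
  if (indexMatch) {
    const index = Number(indexMatch[1]);
    const block = Array.isArray(doc?.content) ? doc.content[index] : undefined;
    return block ? { path: [index], type: block.type } : null;
  }
  // Anything else is an id: depth-first search for the first attrs.id match.
  const search = (node, trail) => {
    const children = Array.isArray(node?.content) ? node.content : [];
    for (let i = 0; i < children.length; i++) {
      const path = [...trail, i];
      if (children[i]?.attrs?.id === ref) return { path, type: children[i].type };
      const hit = search(children[i], path);
      if (hit) return hit;
    }
    return null;
  };
  return search(doc, []);
}

const sampleDoc = {
  type: "doc",
  content: [
    { type: "heading", attrs: { id: "h1", level: 1 }, content: [{ type: "text", text: "Title" }] },
    {
      type: "table", // no attrs.id, so only "#1" can address it
      content: [
        {
          type: "tableRow",
          content: [
            {
              type: "tableCell",
              content: [
                { type: "paragraph", attrs: { id: "p-in-cell" }, content: [{ type: "text", text: "cell" }] },
              ],
            },
          ],
        },
      ],
    },
  ],
};

const byIndex = resolveRef(sampleDoc, "#1");
const byId = resolveRef(sampleDoc, "p-in-cell");
console.log(byIndex, byId);
```

The id lookup descends through the table row and cell, so a paragraph inside a cell is still addressable by id even though its table ancestors are not.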
/**
|
||||
* Replace EVERY node whose `attrs.id === nodeId` with a deep clone of
|
||||
* `newNode`, anywhere in the tree (including inside callouts and table cells).
|
||||
*
|
||||
* Operates on a clone of `doc`; returns `{ doc, replaced }` where `replaced`
|
||||
* is the number of nodes substituted. A fresh clone of `newNode` is used for
|
||||
* each match so they do not share references.
|
||||
*/
|
||||
export function replaceNodeById(doc, nodeId, newNode) {
|
||||
const out = clone(doc);
|
||||
let replaced = 0;
|
||||
// Walk a content array, replacing direct matches and recursing into the
|
||||
// (possibly new) children of non-matching nodes.
|
||||
const walkContent = (content) => {
|
||||
for (let i = 0; i < content.length; i++) {
|
||||
const child = content[i];
|
||||
if (matchesId(child, nodeId)) {
|
||||
content[i] = clone(newNode);
|
||||
replaced++;
|
||||
// Do not recurse into a freshly substituted node.
|
||||
continue;
|
||||
}
|
||||
if (isObject(child) && Array.isArray(child.content)) {
|
||||
walkContent(child.content);
|
||||
}
|
||||
}
|
||||
};
|
||||
if (isObject(out) && Array.isArray(out.content)) {
|
||||
walkContent(out.content);
|
||||
}
|
||||
return { doc: out, replaced };
|
||||
}
|
||||
/**
|
||||
* Remove EVERY node whose `attrs.id === nodeId` from its parent `content`
|
||||
* array, anywhere in the tree (recursive, including callouts and tables).
|
||||
*
|
||||
* Operates on a clone of `doc`; returns `{ doc, deleted }` where `deleted` is
|
||||
* the number of nodes removed.
|
||||
*/
|
||||
export function deleteNodeById(doc, nodeId) {
|
||||
const out = clone(doc);
|
||||
let deleted = 0;
|
||||
// Build a filtered copy of a content array, dropping matches and recursing
|
||||
// into the surviving children.
|
||||
const walkContent = (content) => {
|
||||
const kept = [];
|
||||
for (const child of content) {
|
||||
if (matchesId(child, nodeId)) {
|
||||
deleted++;
|
||||
continue;
|
||||
}
|
||||
if (isObject(child) && Array.isArray(child.content)) {
|
||||
child.content = walkContent(child.content);
|
||||
}
|
||||
kept.push(child);
|
||||
}
|
||||
return kept;
|
||||
};
|
||||
if (isObject(out) && Array.isArray(out.content)) {
|
||||
out.content = walkContent(out.content);
|
||||
}
|
||||
return { doc: out, deleted };
|
||||
}
|
||||
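The clone-then-edit contract shared by `replaceNodeById` and `deleteNodeById` can be sketched for the delete case. This is an illustration under assumptions: `removeById` is a hypothetical condensed version that clones via JSON for portability, and the sample document (with a deliberately duplicated id) is invented.

```javascript
// Hypothetical condensed delete-by-id (illustration only).
function removeById(doc, nodeId) {
  // JSON round-trip clone: sufficient for plain JSON documents like this one.
  const out = JSON.parse(JSON.stringify(doc));
  let deleted = 0;
  const walk = (content) => {
    const kept = [];
    for (const child of content) {
      if (child?.attrs?.id === nodeId) {
        deleted++;
        continue; // drop the match, do not descend into it
      }
      if (Array.isArray(child?.content)) child.content = walk(child.content);
      kept.push(child);
    }
    return kept;
  };
  if (Array.isArray(out?.content)) out.content = walk(out.content);
  return { doc: out, deleted };
}

const original = {
  type: "doc",
  content: [
    { type: "paragraph", attrs: { id: "dup" }, content: [{ type: "text", text: "a" }] },
    {
      type: "callout",
      content: [
        { type: "paragraph", attrs: { id: "dup" }, content: [{ type: "text", text: "b" }] },
        { type: "paragraph", attrs: { id: "keep" }, content: [{ type: "text", text: "c" }] },
      ],
    },
  ],
};

const result = removeById(original, "dup");
console.log(result.deleted, result.doc.content.length, original.content.length);
```

Both nodes carrying the id are removed, including the one nested in the callout, while `original` keeps its two top-level blocks.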
/**
 * Deep-clone `doc` and strip every node/mark attribute whose value is strictly
 * `undefined`, so the result is safe to hand to Yjs (which throws an opaque
 * "Unexpected content type" when asked to store an `undefined` attribute value).
 *
 * Only `undefined` keys are removed; `null`, `false`, `0`, and `""` are all
 * legitimate JSON-storable values and are preserved. Operates on a clone and
 * returns it; the input is never mutated. Defensively null-safe like the rest
 * of the file.
 */
export function sanitizeForYjs(doc) {
    const out = clone(doc);
    // Drop every key whose value is strictly `undefined` from an attrs object.
    const stripUndefined = (attrs) => {
        if (!isObject(attrs))
            return;
        for (const key of Object.keys(attrs)) {
            if (attrs[key] === undefined) {
                delete attrs[key];
            }
        }
    };
    const walk = (node) => {
        if (!isObject(node))
            return;
        stripUndefined(node.attrs);
        if (Array.isArray(node.marks)) {
            for (const mark of node.marks) {
                if (isObject(mark))
                    stripUndefined(mark.attrs);
            }
        }
        if (Array.isArray(node.content)) {
            for (const child of node.content) {
                walk(child);
            }
        }
    };
    walk(out);
    return out;
}
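The difference between `undefined` and the other falsy attribute values can be seen in a small standalone sketch. `stripUndefinedAttrs` is a hypothetical stand-in for `sanitizeForYjs`, written without this file's `clone`/`isObject` helpers and assuming a runtime that provides `structuredClone` (for example Node 17 or later); it is an illustration, not the module's code.

```javascript
// Hypothetical stand-in for sanitizeForYjs: removes only `undefined` attrs.
function stripUndefinedAttrs(doc) {
    const out = structuredClone(doc); // structuredClone keeps `undefined` values
    const strip = (attrs) => {
        if (attrs == null || typeof attrs !== "object") return;
        for (const key of Object.keys(attrs)) {
            if (attrs[key] === undefined) delete attrs[key];
        }
    };
    const visit = (node) => {
        if (node == null || typeof node !== "object") return;
        strip(node.attrs);
        if (Array.isArray(node.marks)) {
            for (const mark of node.marks) strip(mark && mark.attrs);
        }
        if (Array.isArray(node.content)) {
            for (const child of node.content) visit(child);
        }
    };
    visit(out);
    return out;
}

const input = {
    type: "doc",
    content: [
        { type: "paragraph", attrs: { id: "p1", indent: undefined, align: null } },
    ],
};
const cleaned = stripUndefinedAttrs(input);

console.log(Object.keys(cleaned.content[0].attrs)); // [ 'id', 'align' ]
console.log("indent" in input.content[0].attrs);    // true (input untouched)
```

`align: null` survives because `null` is a storable value; only the `undefined` key is dropped, and only on the clone.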
/**
 * Diagnostics helper: walk the tree and return a human-readable path string for
 * the FIRST attribute value (in any `node.attrs` or `mark.attrs`) that Yjs
 * cannot store — i.e. `undefined`, a `function`, a `symbol`, or a `bigint`
 * (e.g. `content[3].content[0].attrs.indent (undefined)`). Returns `null` when
 * every attribute is storable. Null-safe.
 */
export function findUnstorableAttr(doc) {
    const isUnstorable = (value) => {
        if (value === undefined)
            return "undefined";
        const t = typeof value;
        if (t === "function")
            return "function";
        if (t === "symbol")
            return "symbol";
        if (t === "bigint")
            return "bigint";
        return null;
    };
    // Check an attrs object; return the offending sub-path or null.
    const checkAttrs = (attrs, basePath) => {
        if (!isObject(attrs))
            return null;
        for (const key of Object.keys(attrs)) {
            const kind = isUnstorable(attrs[key]);
            if (kind != null)
                return `${basePath}.${key} (${kind})`;
        }
        return null;
    };
    const walk = (node, path) => {
        if (!isObject(node))
            return null;
        const attrHit = checkAttrs(node.attrs, `${path}.attrs`);
        if (attrHit != null)
            return attrHit;
        if (Array.isArray(node.marks)) {
            for (let i = 0; i < node.marks.length; i++) {
                const markHit = checkAttrs(node.marks[i]?.attrs, `${path}.marks[${i}].attrs`);
                if (markHit != null)
                    return markHit;
            }
        }
        if (Array.isArray(node.content)) {
            for (let i = 0; i < node.content.length; i++) {
                const childHit = walk(node.content[i], `${path}.content[${i}]`);
                if (childHit != null)
                    return childHit;
            }
        }
        return null;
    };
    // The root doc node carries no useful index, so paths start directly at
    // `attrs` / `content[i]` with no `doc` prefix.
    if (!isObject(doc))
        return null;
    const attrHit = checkAttrs(doc.attrs, "attrs");
    if (attrHit != null)
        return attrHit;
    if (Array.isArray(doc.content)) {
        for (let i = 0; i < doc.content.length; i++) {
            const childHit = walk(doc.content[i], `content[${i}]`);
            if (childHit != null)
                return childHit;
        }
    }
    return null;
}
/**
 * Table structural node types and the container each must live directly inside.
 * Used by `insertNodeRelative` to splice rows/cells into the correct ancestor
 * rather than blindly into the anchor's direct parent (which would corrupt the
 * table's nesting).
 */
const STRUCTURAL_TYPES = new Set(["tableRow", "tableCell", "tableHeader"]);
const REQUIRED_CONTAINER = {
    tableRow: "table",
    tableCell: "tableRow",
    tableHeader: "tableRow",
};
/**
 * Locate an anchor and return its ancestor chain (from `doc` down to and
 * including the matched node). Each chain entry is `{ node, index }` where
 * `index` is the node's position inside its parent's `content` array (the root
 * doc has index -1). Returns `null` when the anchor cannot be resolved.
 */
function findAnchorChain(doc, opts) {
    if (!isObject(doc))
        return null;
    // DFS by id anywhere in the tree, accumulating the path.
    if (opts.anchorNodeId != null) {
        const targetId = opts.anchorNodeId;
        const search = (node, index, trail) => {
            if (!isObject(node))
                return null;
            const here = [...trail, { node, index }];
            if (matchesId(node, targetId))
                return here;
            if (Array.isArray(node.content)) {
                for (let i = 0; i < node.content.length; i++) {
                    const hit = search(node.content[i], i, here);
                    if (hit != null)
                        return hit;
                }
            }
            return null;
        };
        return search(doc, -1, []);
    }
    // By text: only top-level blocks are scanned (same rule as the JSON path).
    if (opts.anchorText != null && Array.isArray(doc.content)) {
        for (let i = 0; i < doc.content.length; i++) {
            if (blockPlainText(doc.content[i]).includes(opts.anchorText)) {
                return [
                    { node: doc, index: -1 },
                    { node: doc.content[i], index: i },
                ];
            }
        }
    }
    return null;
}
/**
 * Insert a deep clone of `node` relative to an anchor.
 *
 * - position "append": push the node onto the top-level `doc.content`.
 * - position "before"/"after": locate the anchor and splice the node into the
 *   anchor's parent `content` array immediately before / after it.
 *
 * Anchor resolution for before/after:
 * - if `anchorNodeId` is given, find the node with `attrs.id === anchorNodeId`
 *   anywhere in the tree (recursive);
 * - otherwise, if `anchorText` is given, scan only TOP-LEVEL `doc.content`
 *   blocks and pick the first whose `blockPlainText` includes `anchorText`.
 *
 * Operates on a clone of `doc`; returns `{ doc, inserted }`. `inserted` is
 * false when the anchor could not be resolved (the doc is returned unchanged
 * apart from being cloned).
 */
export function insertNodeRelative(doc, node, opts) {
    const out = clone(doc);
    const fresh = clone(node);
    // Defensive: stay null-safe like the other exports — a missing opts means
    // there is nothing actionable to do.
    if (!isObject(opts))
        return { doc: out, inserted: false };
    const isStructural = isObject(node) && STRUCTURAL_TYPES.has(node.type);
    // "append": top-level push.
    if (opts.position === "append") {
        // Structural table nodes (tableRow/tableCell/tableHeader) cannot live at the
        // top level — appending one would produce invalid nesting.
        if (isStructural) {
            throw new Error(`insert_node: cannot append a ${node.type} at the top level; use ` +
                `position before/after with an anchor inside the target table`);
        }
        if (isObject(out)) {
            if (!Array.isArray(out.content))
                out.content = [];
            out.content.push(fresh);
            return { doc: out, inserted: true };
        }
        return { doc: out, inserted: false };
    }
    const offset = opts.position === "after" ? 1 : 0;
    // Structural insert (before/after a tableRow/tableCell/tableHeader): splice
    // into the nearest enclosing table/tableRow rather than the anchor's direct
    // parent, so the row/cell lands at the correct level of the table.
    if (isStructural) {
        const containerType = REQUIRED_CONTAINER[node.type];
        const chain = findAnchorChain(out, opts);
        // Anchor not resolved at all — keep the existing "anchor not found" path.
        if (chain == null)
            return { doc: out, inserted: false };
        // Find the DEEPEST ancestor (including the anchor itself) of the required
        // container type.
        let containerIdx = -1;
        for (let i = chain.length - 1; i >= 0; i--) {
            if (isObject(chain[i].node) && chain[i].node.type === containerType) {
                containerIdx = i;
                break;
            }
        }
        if (containerIdx === -1) {
            throw new Error(`insert_node: cannot insert a ${node.type} here — the anchor is not ` +
                `inside a ${containerType}. Anchor on a cell's text or a block id ` +
                `that lives inside the target table.`);
        }
        const container = chain[containerIdx].node;
        if (!Array.isArray(container.content))
            container.content = [];
        if (containerIdx === chain.length - 1) {
            // The matched container IS the anchor node itself (e.g. anchorText
            // resolved to the table block): append/prepend within it.
            const at = opts.position === "after" ? container.content.length : 0;
            container.content.splice(at, 0, fresh);
        }
        else {
            // The immediate child on the path leading to the anchor is the row/cell
            // to splice next to.
            const enclosingChildIndex = chain[containerIdx + 1].index;
            container.content.splice(enclosingChildIndex + offset, 0, fresh);
        }
        return { doc: out, inserted: true };
    }
    // Resolve by id anywhere in the tree: splice into the parent content array.
    if (opts.anchorNodeId != null) {
        let inserted = false;
        const walkContent = (content) => {
            for (let i = 0; i < content.length; i++) {
                const child = content[i];
                if (matchesId(child, opts.anchorNodeId)) {
                    content.splice(i + offset, 0, fresh);
                    inserted = true;
                    return;
                }
                if (isObject(child) && Array.isArray(child.content)) {
                    walkContent(child.content);
                    if (inserted)
                        return;
                }
            }
        };
        if (isObject(out) && Array.isArray(out.content)) {
            walkContent(out.content);
        }
        return { doc: out, inserted };
    }
    // Resolve by text: only top-level doc.content blocks are scanned.
    if (opts.anchorText != null && isObject(out) && Array.isArray(out.content)) {
        for (let i = 0; i < out.content.length; i++) {
            if (blockPlainText(out.content[i]).includes(opts.anchorText)) {
                out.content.splice(i + offset, 0, fresh);
                return { doc: out, inserted: true };
            }
        }
    }
    return { doc: out, inserted: false };
}
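The structural branch above boils down to "walk the ancestor chain upward until the required container type is found, then splice next to the child that leads to the anchor". A minimal sketch of that lookup follows. `resolveSplice` and the hand-built `chain` are hypothetical illustrations (the real code builds the chain with `findAnchorChain` and throws instead of returning `null`).

```javascript
// Hypothetical sketch of the container lookup used for structural inserts.
const REQUIRED = { tableRow: "table", tableCell: "tableRow", tableHeader: "tableRow" };

// chain: [{ node, index }] from the doc root down to the anchor node.
function resolveSplice(chain, newType, position) {
    const containerType = REQUIRED[newType];
    const offset = position === "after" ? 1 : 0;
    for (let i = chain.length - 1; i >= 0; i--) {
        if (chain[i].node.type !== containerType) continue;
        const container = chain[i].node;
        if (i === chain.length - 1) {
            // The anchor is the container itself: prepend or append inside it.
            return { container, at: offset ? container.content.length : 0 };
        }
        // Otherwise splice beside the child on the path to the anchor.
        return { container, at: chain[i + 1].index + offset };
    }
    return null; // the source throws an "anchor is not inside a ..." error here
}

const para = { type: "paragraph", attrs: { id: "p1" }, content: [] };
const cell = { type: "tableCell", content: [para] };
const row0 = { type: "tableRow", content: [{ type: "tableCell", content: [] }] };
const row1 = { type: "tableRow", content: [cell] };
const table = { type: "table", content: [row0, row1] };
const doc = { type: "doc", content: [table] };

// Anchor: paragraph "p1" inside the second row's only cell.
const chain = [
    { node: doc, index: -1 },
    { node: table, index: 0 },
    { node: row1, index: 1 },
    { node: cell, index: 0 },
    { node: para, index: 0 },
];

const rowTarget = resolveSplice(chain, "tableRow", "after");
const cellTarget = resolveSplice(chain, "tableCell", "before");
console.log(rowTarget.container.type, rowTarget.at);   // table 2
console.log(cellTarget.container.type, cellTarget.at); // tableRow 0
```

A new row anchored on a paragraph id therefore lands in the table's `content` (after row 1), not inside the cell that holds the paragraph.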
// ===========================================================================
// Table editing helpers
//
// A Docmost table is a ProseMirror subtree with NO ids on the structural nodes:
//   table -> { type:"table", content:[tableRow...] }
//   row   -> { type:"tableRow", content:[tableCell|tableHeader...] }
//   cell  -> { type:"tableCell"|"tableHeader", attrs:{colspan,rowspan,colwidth},
//              content:[paragraph...] }
//   para  -> { type:"paragraph", attrs:{id,indent}, content:[textNode...] }
// Only paragraphs/headings carry an `attrs.id`, so a cell is addressed via the
// id of the paragraph inside it. The helpers below all operate on a DEEP CLONE
// of the input doc (via `clone`) and never mutate their inputs.
// ===========================================================================
/**
 * Collect EVERY `attrs.id` present anywhere in `node` into `used`. Used to seed
 * `makeFreshId` so generated paragraph ids never collide with existing ones.
 */
function collectIds(node, used) {
    if (!isObject(node))
        return;
    if (isObject(node.attrs) && typeof node.attrs.id === "string") {
        used.add(node.attrs.id);
    }
    if (Array.isArray(node.content)) {
        for (const child of node.content)
            collectIds(child, used);
    }
}
/**
 * Fresh-id generator: returns a random Docmost-style id (12 chars from
 * lowercase `a-z0-9`) that is not already in `used`, and records it. On the
 * rare collision the id is regenerated. Callers rely on uniqueness, not on the
 * exact string, so randomness is fine — and unlike a module-local counter it
 * needs no reset and cannot become predictable across calls.
 */
function makeFreshId(used) {
    const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
    let id;
    do {
        id = "";
        for (let i = 0; i < 12; i++) {
            id += alphabet[Math.floor(Math.random() * alphabet.length)];
        }
    } while (used.has(id) || id === "");
    used.add(id);
    return id;
}
/**
 * Resolve a table reference against an ALREADY-CLONED doc and return the LIVE
 * table node (a reference inside `rootClone`, so the caller may mutate it) plus
 * its index path. Returns null when no table matches.
 *
 * - `#<n>`: the top-level block at index `n`, only if its `type === "table"`.
 * - otherwise: DFS for the node with `attrs.id === tableRef`, then walk UP its
 *   ancestor chain to the nearest `type === "table"` ancestor.
 */
function locateTable(rootClone, tableRef) {
    if (!isObject(rootClone))
        return null;
    // "#<n>": index into the top-level content array; must be a table.
    const indexMatch = typeof tableRef === "string" ? tableRef.match(/^#(\d+)$/) : null;
    if (indexMatch) {
        const index = Number(indexMatch[1]);
        const block = Array.isArray(rootClone.content)
            ? rootClone.content[index]
            : undefined;
        if (isObject(block) && block.type === "table") {
            return { table: block, path: [index] };
        }
        return null;
    }
    // Otherwise: DFS for attrs.id === tableRef, tracking the ancestor chain, then
    // climb to the nearest enclosing table.
    const search = (node, trail) => {
        if (!isObject(node))
            return null;
        if (Array.isArray(node.content)) {
            for (let i = 0; i < node.content.length; i++) {
                const child = node.content[i];
                const here = [...trail, { node: child, index: i }];
                if (matchesId(child, tableRef)) {
                    // Walk UP to the nearest table ancestor (including the match itself).
                    for (let j = here.length - 1; j >= 0; j--) {
                        if (isObject(here[j].node) && here[j].node.type === "table") {
                            return {
                                table: here[j].node,
                                path: here.slice(0, j + 1).map((e) => e.index),
                            };
                        }
                    }
                    return null; // id found but no enclosing table
                }
                const hit = search(child, here);
                if (hit != null)
                    return hit;
            }
        }
        return null;
    };
    return search(rootClone, []);
}
/** Build the plain-text → single-paragraph cell content used by all writers. */
function makeCellParagraph(id, text) {
    return {
        type: "paragraph",
        attrs: { id, indent: 0 },
        // Empty string → a paragraph with an empty content array.
        content: text ? [{ type: "text", text }] : [],
    };
}
/**
 * Read a table as a matrix. Returns null when `tableRef` resolves to no table.
 *
 * - `rows`/`cols`: the table's row count and the column count of its FIRST row.
 *   Tables may be ragged (rows of differing length), so `cols` reflects only
 *   row 0; use the per-row length of `cells`/`cellIds` for each row's actual
 *   width.
 * - `cells`: `string[][]` of each cell's `blockPlainText`.
 * - `cellIds`: `(string|null)[][]` of each cell's FIRST paragraph id (or null),
 *   so callers can `patch_node` a cell for rich-formatted edits.
 * - `path`: index path of the table within the doc.
 */
export function readTable(doc, tableRef) {
    const root = clone(doc);
    const located = locateTable(root, tableRef);
    if (located == null)
        return null;
    const { table, path } = located;
    const rowNodes = Array.isArray(table.content) ? table.content : [];
    const rows = rowNodes.length;
    const cols = rowNodes[0]?.content?.length ?? 0;
    const cells = [];
    const cellIds = [];
    for (const rowNode of rowNodes) {
        const cellNodes = Array.isArray(rowNode?.content) ? rowNode.content : [];
        const rowText = [];
        const rowIds = [];
        for (const cellNode of cellNodes) {
            rowText.push(blockPlainText(cellNode));
            // The cell's first paragraph carries the id used for patch_node.
            const firstPara = Array.isArray(cellNode?.content)
                ? cellNode.content[0]
                : undefined;
            const id = isObject(firstPara) && isObject(firstPara.attrs)
                ? firstPara.attrs.id ?? null
                : null;
            rowIds.push(id);
        }
        cells.push(rowText);
        cellIds.push(rowIds);
    }
    return { rows, cols, cells, cellIds, path };
}
/**
 * Insert a row of plain-text cells into a table. Returns `{ doc, inserted }`;
 * `inserted` is false only when the table cannot be located.
 *
 * The row is padded to the table's column count (`cells[i] ?? ""`), where the
 * column count is the length of the WIDEST existing row; supplying MORE cells
 * than columns throws. Each new cell copies `colwidth` for its column from the
 * header row when present, gets a fresh-id paragraph, and `colspan: 1,
 * rowspan: 1` attrs. `index` (when an integer in `[0, rows]`) splices the row
 * there; otherwise the row is appended at the end. A row landing at index 0
 * inherits each header cell's type, since it becomes the new header row.
 */
export function insertTableRow(doc, tableRef, cells, index) {
    const out = clone(doc);
    const located = locateTable(out, tableRef);
    if (located == null)
        return { doc: out, inserted: false };
    const { table } = located;
    if (!Array.isArray(table.content))
        table.content = [];
    const rows = table.content.length;
    const headerRow = table.content[0];
    const headerCells = Array.isArray(headerRow?.content) ? headerRow.content : [];
    // Column count is the WIDEST existing row, so the guard below stays
    // meaningful for ragged tables and the new row matches the table's width.
    // Fall back to the supplied cell count only when the table has no rows.
    let colCount = 0;
    for (const r of table.content) {
        if (isObject(r) && Array.isArray(r.content))
            colCount = Math.max(colCount, r.content.length);
    }
    if (colCount === 0)
        colCount = Array.isArray(cells) ? cells.length : 0;
    if (Array.isArray(cells) && cells.length > colCount) {
        throw new Error(`table_insert_row: got ${cells.length} cell(s) but the table has ${colCount} column(s)`);
    }
    // Resolve the landing index up front so the cell-type decision and the splice
    // below agree: a valid integer in [0, rows] splices there, else we append.
    const landingIndex = typeof index === "number" && Number.isInteger(index) && index >= 0 && index <= rows
        ? index
        : rows;
    // Seed the id generator with every id already in the doc so the new cell
    // paragraph ids are unique within the whole document.
    const used = new Set();
    collectIds(out, used);
    const newCells = [];
    for (let i = 0; i < colCount; i++) {
        const text = (Array.isArray(cells) ? cells[i] : undefined) ?? "";
        const attrs = { colspan: 1, rowspan: 1 };
        // Copy this column's colwidth from the header row's cell when present.
        const colwidth = headerCells[i]?.attrs?.colwidth;
        if (colwidth !== undefined)
            attrs.colwidth = colwidth;
        // A row landing at index 0 becomes the new header row, so inherit the
        // current header cell's type per column (Docmost uses "tableHeader" there);
        // every other position is a plain data cell.
        const cellType = landingIndex === 0 ? headerCells[i]?.type ?? "tableCell" : "tableCell";
        newCells.push({
            type: cellType,
            attrs,
            content: [makeCellParagraph(makeFreshId(used), text)],
        });
    }
    const newRow = { type: "tableRow", content: newCells };
    // Splice at the resolved landing index (append when index was omitted/invalid).
    table.content.splice(landingIndex, 0, newRow);
    return { doc: out, inserted: true };
}
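Two decisions inside `insertTableRow` are easy to get wrong in a ragged table: how wide the new row should be, and where it lands when `index` is missing or out of range. The sketch below isolates both rules. `widestRowLength` and `resolveLandingIndex` are hypothetical helper names introduced only for illustration; the source inlines this logic.

```javascript
// Hypothetical helpers mirroring two decisions made inside insertTableRow.

// Column count of a possibly ragged table: the widest existing row wins.
function widestRowLength(table) {
    let cols = 0;
    for (const row of Array.isArray(table.content) ? table.content : []) {
        if (row && Array.isArray(row.content)) {
            cols = Math.max(cols, row.content.length);
        }
    }
    return cols;
}

// A valid integer in [0, rows] is honored; anything else means "append".
function resolveLandingIndex(index, rows) {
    return Number.isInteger(index) && index >= 0 && index <= rows ? index : rows;
}

const emptyCell = () => ({ type: "tableCell", content: [] });
const ragged = {
    type: "table",
    content: [
        { type: "tableRow", content: [emptyCell(), emptyCell()] },
        { type: "tableRow", content: [emptyCell(), emptyCell(), emptyCell()] },
    ],
};

console.log(widestRowLength(ragged));           // 3
console.log(resolveLandingIndex(0, 2));         // 0 (becomes the new first row)
console.log(resolveLandingIndex(2, 2));         // 2 (explicit append)
console.log(resolveLandingIndex(5, 2));         // 2 (out of range, so append)
console.log(resolveLandingIndex(undefined, 2)); // 2 (omitted, so append)
```

Note that an out-of-range `index` is not an error for inserts; it silently appends, whereas `deleteTableRow` throws on an out-of-range index.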
/**
 * Delete the row at 0-based `index` from a table. Returns `{ doc, deleted }`.
 * `deleted` is false only when the table cannot be located. Throws on an
 * out-of-range index, and refuses to delete the table's only row.
 */
export function deleteTableRow(doc, tableRef, index) {
    const out = clone(doc);
    const located = locateTable(out, tableRef);
    if (located == null)
        return { doc: out, deleted: false };
    const { table } = located;
    if (!Array.isArray(table.content))
        table.content = [];
    const rows = table.content.length;
    if (!Number.isInteger(index) || index < 0 || index >= rows) {
        throw new Error(`table_delete_row: row index ${index} out of range (table has ${rows} row(s))`);
    }
    if (rows <= 1) {
        throw new Error("table_delete_row: refusing to delete the only row of the table");
    }
    table.content.splice(index, 1);
    return { doc: out, deleted: true };
}
/**
 * Set the plain-text content of cell `[row, col]` (0-based) to `text`. Returns
 * `{ doc, updated }`; `updated` is false only when the table cannot be located.
 * Throws when `row`/`col` is out of range. The cell's own attrs (colspan/
 * rowspan/colwidth) are preserved; its content becomes a single text paragraph
 * that reuses the cell's existing first-paragraph id when present, else a fresh
 * one.
 */
export function updateTableCell(doc, tableRef, row, col, text) {
    const out = clone(doc);
    const located = locateTable(out, tableRef);
    if (located == null)
        return { doc: out, updated: false };
    const { table } = located;
    const rowNodes = Array.isArray(table.content) ? table.content : [];
    const rows = rowNodes.length;
    const rowNode = rowNodes[row];
    const cols = isObject(rowNode) && Array.isArray(rowNode.content)
        ? rowNode.content.length
        : 0;
    if (!Number.isInteger(row) ||
        row < 0 ||
        row >= rows ||
        !Number.isInteger(col) ||
        col < 0 ||
        col >= cols) {
        throw new Error(`table_update_cell: cell [${row},${col}] out of range`);
    }
    const cellNode = rowNode.content[col];
    // Reuse the cell's existing first-paragraph id, or mint a fresh unique one.
    const existingPara = Array.isArray(cellNode?.content)
        ? cellNode.content[0]
        : undefined;
    let id = isObject(existingPara) && isObject(existingPara.attrs)
        ? existingPara.attrs.id
        : undefined;
    if (typeof id !== "string" || id.length === 0) {
        const used = new Set();
        collectIds(out, used);
        id = makeFreshId(used);
    }
    cellNode.content = [makeCellParagraph(id, text)];
    return { doc: out, updated: true };
}
31
packages/mcp/build/lib/page-lock.js
Normal file
@@ -0,0 +1,31 @@
/**
 * Per-page async mutex.
 *
 * Content writes over the collaboration websocket must never overlap for the
 * same page: two concurrent full-document replaces would race on the live Yjs
 * fragment. We serialize them with a per-pageId promise chain — each new
 * operation waits for the previous one on that page to settle (success or
 * failure) before it runs. Different pages never block each other.
 */
const chains = new Map();
// The returned promise carries the real result/rejection of `fn` and MUST be
// awaited/handled by the caller; only the internal chaining tail swallows
// errors (purely to gate ordering).
export function withPageLock(pageId, fn) {
    // Wait for the previous op on this page; swallow its error so a failure does
    // not poison the queue for the next caller.
    const prev = (chains.get(pageId) ?? Promise.resolve()).catch(() => { });
    const run = prev.then(fn);
    // The tail used for chaining must also swallow errors (it only gates order).
    const tail = run.catch(() => { });
    chains.set(pageId, tail);
    // Drop the map entry once this op is the tail and has settled, to avoid an
    // unbounded map of resolved promises.
    tail.then(() => {
        if (chains.get(pageId) === tail) {
            chains.delete(pageId);
        }
    });
    // Callers get the real result/rejection of fn.
    return run;
}
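The ordering guarantees described above can be checked with a standalone sketch. `withLockSketch` is a hypothetical copy of the promise-chain pattern (renamed so it is not mistaken for the module's export), and the page ids, delays, and `demo` driver are made up for illustration.

```javascript
// Hypothetical standalone copy of the per-page promise-chain pattern.
const lockChains = new Map();

function withLockSketch(pageId, fn) {
    const prev = (lockChains.get(pageId) ?? Promise.resolve()).catch(() => {});
    const run = prev.then(fn);
    const tail = run.catch(() => {}); // gates ordering only, never rejects
    lockChains.set(pageId, tail);
    tail.then(() => {
        if (lockChains.get(pageId) === tail) lockChains.delete(pageId);
    });
    return run; // carries fn's real result or rejection
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function demo() {
    const order = [];
    const ops = [
        // Slow write on page-a: later page-a ops must wait for it.
        withLockSketch("page-a", async () => { await sleep(30); order.push("a1"); }),
        // A failing write must not block the op queued behind it.
        withLockSketch("page-a", async () => { order.push("a2"); throw new Error("write failed"); }),
        withLockSketch("page-a", async () => { order.push("a3"); }),
        // A different page is never blocked by page-a.
        withLockSketch("page-b", async () => { order.push("b1"); }),
    ];
    const results = await Promise.allSettled(ops);
    return { order, statuses: results.map((r) => r.status) };
}

const demoResult = demo();
demoResult.then(({ order, statuses }) => {
    console.log(order.join(" "));    // b1 a1 a2 a3
    console.log(statuses.join(" ")); // fulfilled rejected fulfilled fulfilled
});
```

The second operation rejects for its own caller, yet the third still runs, because only the internal tail swallows the error.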
405
packages/mcp/build/lib/transforms.js
Normal file
@@ -0,0 +1,405 @@
/**
 * Pure, network-free transform primitives for a ProseMirror/TipTap document
 * tree, plus one higher-level orchestration (commentsToFootnotes).
 *
 * A ProseMirror node here is a plain JSON object of the shape produced by
 * Docmost: `{ type, attrs?, content?, text?, marks? }`. Children live in the
 * `content` array; callouts, tables, lists all hold their children in
 * `content`, so a single recursive walk reaches them all.
 *
 * Conventions (matching node-ops.ts):
 * - functions that produce a new document deep-clone their input and return a
 *   `{ doc, ... }` object; the caller's objects are never mutated.
 * - functions are defensively null-safe.
 * - `marks` arrays are preserved verbatim when fragments are split/reordered.
 */
import { blockPlainText } from "./node-ops.js";
/** Deep-clone a JSON-serializable value without mutating the original. */
function clone(value) {
    if (typeof structuredClone === "function") {
        return structuredClone(value);
    }
    // Fallback for environments without structuredClone.
    return JSON.parse(JSON.stringify(value));
}
/** True if `value` is a non-null object (and not an array). */
function isObject(value) {
    return value != null && typeof value === "object" && !Array.isArray(value);
}
/**
 * Plain text of a node (re-export of node-ops' blockPlainText so transform
 * authors have a single import surface). Recurses through nested content.
 */
export function blockText(node) {
    return blockPlainText(node);
}
/**
 * Depth-first visit of every node in the tree, including the root and the
 * nested content of callouts, tables, lists, etc. `fn` is called once per node.
 * Null-safe: a nullish or non-object node is ignored.
 */
export function walk(node, fn) {
    if (!isObject(node))
        return;
    fn(node);
    if (Array.isArray(node.content)) {
        for (const child of node.content) {
            walk(child, fn);
        }
    }
}
/**
 * Find the FIRST node (depth-first) matching `predicate`, anywhere in the tree.
 * Works even when the node carries no `attrs.id` (it searches the raw tree, not
 * an id index). Returns the live node reference inside `doc` (NOT a clone), or
 * null when nothing matches. Typical use: `getList(doc, n => n.type ===
 * "orderedList")`.
 */
export function getList(doc, predicate) {
    let found = null;
    walk(doc, (node) => {
        if (found == null && predicate(node)) {
            found = node;
        }
    });
    return found;
}
/**
 * Insert `marker` as a PLAIN (unmarked) text run right after the first
 * occurrence of `anchor`.
 *
 * The text run that contains the END of the anchor is SPLIT at the anchor end,
 * so all existing marks (links, bold, ...) on the surrounding text are
 * preserved, while the inserted marker run carries NO marks. The marker is
 * inserted as a leading-space-padded run (`" " + marker`) so it visually
 * separates from the preceding word.
 *
 * The anchor is matched against the concatenated plain text of each top-level
 * block (so an anchor that spans several text/mark runs still matches). The
 * insertion happens inside the inline content array that holds the anchor's
 * final character. When `opts.beforeBlock` is a number, only top-level blocks
 * with an index below it are in scope.
 *
 * Operates on a clone of `doc`; returns `{ doc, inserted }`. `inserted` is
 * false when the anchor text was not found in any in-scope block.
 */
export function insertMarkerAfter(doc, anchor, marker, opts = {}) {
    const out = clone(doc);
    if (!isObject(out) || !Array.isArray(out.content) || !anchor) {
        return { doc: out, inserted: false };
    }
    const limit = typeof opts.beforeBlock === "number"
        ? Math.min(opts.beforeBlock, out.content.length)
        : out.content.length;
    for (let b = 0; b < limit; b++) {
        const block = out.content[b];
        if (!isObject(block))
            continue;
        // Quick reject: skip blocks whose plain text cannot contain the anchor.
        if (!blockPlainText(block).includes(anchor))
            continue;
        // Walk the inline content arrays inside this block, tracking a running
        // character offset so we can locate the inline array + text run that holds
        // the END of the anchor's first occurrence.
        let inserted = false;
        let offset = 0; // characters of plain text seen so far in this block
        const anchorEnd = blockPlainText(block).indexOf(anchor) + anchor.length;
        // Recurse into inline-bearing containers (paragraph, heading, table cell,
        // callout child paragraphs, ...). We only split inside an array of inline
        // nodes (text/inline atoms); the FIRST array whose cumulative range covers
        // anchorEnd receives the split + marker.
        const visit = (container) => {
            if (inserted || !isObject(container) || !Array.isArray(container.content)) {
                return;
            }
            const inline = container.content;
            // Detect whether this array is an inline array (contains text nodes).
            const hasText = inline.some((n) => isObject(n) && n.type === "text");
            if (hasText) {
                for (let i = 0; i < inline.length; i++) {
                    const n = inline[i];
                    const len = isObject(n) ? blockPlainText(n).length : 0;
                    const runStart = offset;
                    const runEnd = offset + len;
                    // The run that contains the anchor end (anchorEnd lands inside this
                    // run, i.e. runStart < anchorEnd <= runEnd) is the split point.
                    if (!inserted &&
                        isObject(n) &&
                        n.type === "text" &&
                        typeof n.text === "string" &&
                        anchorEnd > runStart &&
                        anchorEnd <= runEnd) {
                        const cut = anchorEnd - runStart; // split index within this text run
                        const before = n.text.slice(0, cut);
                        const after = n.text.slice(cut);
                        const marks = Array.isArray(n.marks) ? n.marks : [];
                        const parts = [];
                        if (before.length > 0) {
                            parts.push({ ...n, text: before, marks: [...marks] });
                        }
                        // Marker is a PLAIN run: no marks copied. Leading space separates it.
                        parts.push({ type: "text", text: " " + marker });
                        if (after.length > 0) {
                            parts.push({ ...n, text: after, marks: [...marks] });
                        }
                        inline.splice(i, 1, ...parts);
                        inserted = true;
                        return;
                    }
                    offset = runEnd;
                }
            }
            else {
                // Not an inline array: recurse into children (e.g. callout -> paragraph).
                for (const child of inline) {
                    visit(child);
                    if (inserted)
                        return;
                }
            }
        };
        visit(block);
        if (inserted) {
            return { doc: out, inserted: true };
        }
        // If the block matched in plain text but we could not split (e.g. anchor
        // lands inside an atom), fall through to the next block rather than failing.
    }
    return { doc: out, inserted: false };
}
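The heart of `insertMarkerAfter` is the split of a single text run. The sketch below isolates that step on one run; `splitRunWithMarker` is a hypothetical helper name, and the `cut` offset is supplied directly rather than derived from an anchor search.

```javascript
// Hypothetical helper isolating the run-splitting step of insertMarkerAfter.
// `cut` is the index inside run.text where the anchor ends.
function splitRunWithMarker(run, cut, marker) {
    const marks = Array.isArray(run.marks) ? run.marks : [];
    const before = run.text.slice(0, cut);
    const after = run.text.slice(cut);
    const parts = [];
    if (before.length > 0) {
        parts.push({ ...run, text: before, marks: [...marks] });
    }
    // The marker run is plain: no marks are copied onto it.
    parts.push({ type: "text", text: " " + marker });
    if (after.length > 0) {
        parts.push({ ...run, text: after, marks: [...marks] });
    }
    return parts;
}

const boldRun = { type: "text", text: "Hello world", marks: [{ type: "bold" }] };
const splitParts = splitRunWithMarker(boldRun, 5, "[1]");

console.log(splitParts.map((p) => JSON.stringify(p.text)).join(" | "));
// "Hello" | " [1]" | " world"
console.log(splitParts.map((p) => (p.marks ? p.marks.length : 0)).join(","));
// 1,0,1
```

Both halves keep the bold mark while the marker in the middle stays unmarked, which is why a footnote marker dropped into a link or bold span does not inherit its styling.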
/**
 * In the disclaimer callout, replace a `[1]…[K]` range marker with `[1]…[n]`.
 *
 * Docmost translations use a callout that states the footnote range, e.g.
 * "[1]…[5]". When the number of notes changes, this rewrites the trailing
 * number of any `[1]…[K]` (or `[1]...[K]`, ASCII ellipsis) occurrence found in a
 * callout's text nodes to `[1]…[n]`. Operates on a clone; returns
 * `{ doc, changed }` where `changed` is the number of text nodes rewritten.
 */
export function setCalloutRange(doc, n) {
    const out = clone(doc);
    let changed = 0;
    // Match "[1]" + (… or ...) + "[<digits>]"; rewrite the last number to n.
    const rangeRe = /(\[1\]\s*(?:…|\.\.\.)\s*\[)\d+(\])/g;
    walk(out, (node) => {
        if (node.type === "callout") {
            walk(node, (inner) => {
                if (inner.type === "text" &&
                    typeof inner.text === "string" &&
                    rangeRe.test(inner.text)) {
                    rangeRe.lastIndex = 0;
                    inner.text = inner.text.replace(rangeRe, `$1${n}$2`);
                    changed++;
                }
                rangeRe.lastIndex = 0;
            });
        }
    });
    return { doc: out, changed };
}
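The range rewrite can be tried on plain strings. This is a sketch only: it reuses the same pattern shape as the regex above, and `rewriteRange` is a hypothetical name introduced for the example.

```javascript
// Same pattern shape as above: "[1]" + (… or ...) + "[<digits>]".
const rangeRe = /(\[1\]\s*(?:…|\.\.\.)\s*\[)\d+(\])/g;

// Hypothetical helper: rewrite the trailing number of every range to `n`.
function rewriteRange(text, n) {
  // A replacer function is used instead of a "$1<n>$2" template so the new
  // number can never be read as part of a two-digit "$nn" group reference.
  return text.replace(rangeRe, (_match, head, tail) => head + n + tail);
}

const unicodeEllipsis = rewriteRange("Notes [1]…[5] were added by the translator.", 7);
const asciiEllipsis = rewriteRange("See [1]...[12] below.", 3);
const untouched = rewriteRange("Footnote [2] only.", 9);
```

Calling `replace` with a global regex starts from index 0 on its own, so the manual `lastIndex` resets are only needed when `test()` is called on the same regex first, as in the function above.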
|
||||
/**
|
||||
* Generate a short random id for a new block's `attrs.id`. Docmost uses nanoid;
|
||||
* a base36 random string is sufficient here (uniqueness within one document).
|
||||
*/
|
||||
function freshId() {
|
||||
return (Math.random().toString(36).slice(2, 12) +
|
||||
Math.random().toString(36).slice(2, 6));
|
||||
}
|
||||
/**
|
||||
* Wrap inline ProseMirror nodes in a list item:
|
||||
* { type:"listItem", content:[{ type:"paragraph", attrs:{id}, content: inlineNodes }] }
|
||||
* with a fresh random block id on the paragraph. The inline nodes are cloned so
|
||||
* the result shares no references with the caller's input.
|
||||
*/
|
||||
export function noteItem(inlineNodes) {
|
||||
const content = Array.isArray(inlineNodes) ? clone(inlineNodes) : [];
|
||||
return {
|
||||
type: "listItem",
|
||||
content: [
|
||||
{
|
||||
type: "paragraph",
|
||||
attrs: { id: freshId() },
|
||||
content,
|
||||
},
|
||||
],
|
||||
};
|
||||
}
|
||||
/**
|
||||
* Convert a comment's markdown (e.g. `**Lead.** body...`) into inline
|
||||
* ProseMirror nodes.
|
||||
*
|
||||
* A leading `комментарий: ` (Russian for "comment: "; case-insensitive) or `N. ` numeric prefix is
|
||||
* stripped first. Then a minimal bold-split is applied: a leading
|
||||
* `**bold lead**` run becomes a text node with a bold mark, and the remainder
|
||||
* becomes a plain text node. This keeps the conversion synchronous (the
|
||||
* transform sandbox runs synchronously) and dependency-free; the existing
|
||||
* async markdownToProseMirror is intentionally NOT used here.
|
||||
*/
|
||||
export function mdToInlineNodes(markdown) {
|
||||
let md = typeof markdown === "string" ? markdown : "";
|
||||
// Strip a leading "комментарий: " prefix (case-insensitive) or a "N. " prefix.
|
||||
md = md.replace(/^\s*комментарий\s*:\s*/i, "");
|
||||
md = md.replace(/^\s*\d+\.\s+/, "");
|
||||
md = md.trim();
|
||||
if (md === "")
|
||||
return [];
|
||||
const nodes = [];
|
||||
// Leading bold lead: **...** at the very start.
|
||||
const leadMatch = /^\*\*([^*]+)\*\*\s*/.exec(md);
|
||||
if (leadMatch) {
|
||||
const leadText = leadMatch[1];
|
||||
nodes.push({
|
||||
type: "text",
|
||||
text: leadText,
|
||||
marks: [{ type: "bold" }],
|
||||
});
|
||||
const rest = md.slice(leadMatch[0].length);
|
||||
if (rest.length > 0) {
|
||||
// Preserve the separating space that followed the bold lead.
|
||||
const sep = /^\*\*[^*]+\*\*(\s*)/.exec(md);
|
||||
const spacing = sep ? sep[1] : "";
|
||||
nodes.push({ type: "text", text: spacing + rest });
|
||||
}
|
||||
return nodes;
|
||||
}
|
||||
// No bold lead: split any inline **bold** spans into bold text nodes and
|
||||
// emit the text around them as plain text nodes.
|
||||
return splitInlineBold(md);
|
||||
}
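The prefix-stripping step is easy to check on its own. The sketch below assumes the same two regular expressions as the function above; `stripNotePrefix` is a hypothetical helper name used only here.

```javascript
// Hypothetical helper mirroring the two prefix-stripping steps.
function stripNotePrefix(markdown) {
  let md = typeof markdown === "string" ? markdown : "";
  md = md.replace(/^\s*комментарий\s*:\s*/i, ""); // "комментарий: " label
  md = md.replace(/^\s*\d+\.\s+/, ""); // "N. " list number
  return md.trim();
}

const labelled = stripNotePrefix("Комментарий: **Lead.** body");
const numbered = stripNotePrefix("3. plain note");
const decimal = stripNotePrefix("  1.5 is a number, not a prefix");
const notAString = stripNotePrefix(undefined);
```

The numeric pattern requires whitespace after the dot, which is why a decimal such as `1.5` at the start of a note is left alone.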
|
||||
/**
|
||||
* Split a string with inline `**bold**` spans into text nodes, bolding the
|
||||
* spans. Used as the no-lead fallback in mdToInlineNodes.
|
||||
*/
|
||||
function splitInlineBold(text) {
|
||||
const nodes = [];
|
||||
const re = /\*\*([^*]+)\*\*/g;
|
||||
let last = 0;
|
||||
let m;
|
||||
while ((m = re.exec(text)) !== null) {
|
||||
if (m.index > last) {
|
||||
nodes.push({ type: "text", text: text.slice(last, m.index) });
|
||||
}
|
||||
nodes.push({ type: "text", text: m[1], marks: [{ type: "bold" }] });
|
||||
last = m.index + m[0].length;
|
||||
}
|
||||
if (last < text.length) {
|
||||
nodes.push({ type: "text", text: text.slice(last) });
|
||||
}
|
||||
return nodes.length > 0 ? nodes : [{ type: "text", text }];
|
||||
}
|
||||
/**
|
||||
* Turn inline comments into numbered footnotes.
|
||||
*
|
||||
* For each inline comment that carries a `selection`:
|
||||
* 1. insert a placeholder marker (a NUL-delimited "\u0000FN<i>\u0000"
|
||||
* sentinel) right after the selection text in the BODY (before the
|
||||
* notes heading);
|
||||
* 2. build a note list item from the comment's markdown content.
|
||||
*
|
||||
* Then RENUMBER every footnote marker in the body by reading order: existing
|
||||
* `[N]` markers and the new "\u0000FN<i>\u0000" placeholders are both replaced by a
|
||||
* sequential `[seq]`, and the notes orderedList is reordered so each note lines
|
||||
* up with its marker's reading-order position. Finally the disclaimer callout
|
||||
* range is synced to the new note count.
|
||||
*
|
||||
* Returns `{ doc, consumed }` where `consumed` lists the ids of comments that
|
||||
* were successfully anchored (their selection was found and a placeholder
|
||||
* inserted). Operates on a clone of `doc`.
|
||||
*/
|
||||
export function commentsToFootnotes(doc, comments, opts = {}) {
|
||||
let working = clone(doc);
|
||||
const notesHeading = opts.notesHeading ?? "Примечания переводчика";
|
||||
const top = Array.isArray(working.content) ? working.content : [];
|
||||
const notesIdx = top.findIndex((n) => isObject(n) && n.type === "heading" && blockText(n).trim() === notesHeading);
|
||||
if (notesIdx < 0) {
|
||||
throw new Error(`heading "${notesHeading}" not found`);
|
||||
}
|
||||
// The notes orderedList lives at or after the heading.
|
||||
const notesList = top
|
||||
.slice(notesIdx)
|
||||
.find((n) => isObject(n) && n.type === "orderedList");
|
||||
if (!notesList) {
|
||||
throw new Error("notes orderedList not found");
|
||||
}
|
||||
const consumed = [];
|
||||
const noteByPh = new Map();
|
||||
(Array.isArray(comments) ? comments : []).forEach((c, i) => {
|
||||
if (!c || !c.selection)
|
||||
return;
|
||||
// Collision-proof sentinel delimited by NUL control chars, which never occur
|
||||
// in real Docmost prose — so the renumber regex below cannot mistake any body
|
||||
// text (e.g. "Press F1 for help", model "FN2") for a placeholder. The NUL is
|
||||
// transient: the placeholder round-trips within this function (insertMarkerAfter
|
||||
// inserts it, the renumber pass replaces it with "[N]"), so it never persists
|
||||
// in a returned/pushed document.
|
||||
const ph = `\u0000FN${i}\u0000`;
|
||||
// insertMarkerAfter returns a NEW cloned doc; reassign `working` and refresh
|
||||
// the `top` / `notesList` references that point into it.
|
||||
const r = insertMarkerAfter(working, c.selection.trimEnd(), ph, {
|
||||
beforeBlock: notesIdx,
|
||||
});
|
||||
if (!r.inserted)
|
||||
return;
|
||||
working = r.doc;
|
||||
noteByPh.set(ph, noteItem(mdToInlineNodes(c.content)));
|
||||
consumed.push(c.id);
|
||||
});
|
||||
// Re-resolve references into the (possibly re-cloned) working doc.
|
||||
const top2 = Array.isArray(working.content) ? working.content : [];
|
||||
const notesList2 = top2
|
||||
.slice(notesIdx)
|
||||
.find((n) => isObject(n) && n.type === "orderedList");
|
||||
if (!notesList2) {
|
||||
throw new Error("notes orderedList not found");
|
||||
}
|
||||
const oldNotes = Array.isArray(notesList2.content)
|
||||
? notesList2.content
|
||||
: [];
|
||||
const newNotes = [];
|
||||
let seq = 0;
|
||||
// Match either an existing "[N]" marker or a NUL-delimited "\u0000FN<i>\u0000"
|
||||
// placeholder, in reading order across the body (blocks before the notes heading).
|
||||
const re = /\[(\d+)\]|\u0000FN(\d+)\u0000/g;
|
||||
// Same range regex setCalloutRange uses to detect the disclaimer callout's
|
||||
// "[1]…[K]" range; used here to decide whether a top-level callout is the
|
||||
// disclaimer (skip) or an ordinary callout (renumber normally).
|
||||
const disclaimerRangeRe = /(\[1\]\s*(?:…|\.\.\.)\s*\[)\d+(\])/;
|
||||
for (let i = 0; i < notesIdx; i++) {
|
||||
// Skip ONLY the disclaimer callout: its "[1]…[K]" range is NOT a footnote
|
||||
// marker and is synced separately by setCalloutRange. Renumbering it here
|
||||
// would consume note slots and corrupt the sequence. Other top-level
|
||||
// callouts may carry legitimate "[N]" body markers and are renumbered.
|
||||
if (isObject(top2[i]) &&
|
||||
top2[i].type === "callout" &&
|
||||
disclaimerRangeRe.test(blockText(top2[i]))) {
|
||||
continue;
|
||||
}
|
||||
walk(top2[i], (node) => {
|
||||
if (node.type !== "text" || typeof node.text !== "string")
|
||||
return;
|
||||
node.text = node.text.replace(re, (_m, oldNum, phIdx) => {
|
||||
if (oldNum != null) {
|
||||
const note = oldNotes[Number(oldNum) - 1];
|
||||
// Every existing body marker MUST map to a real note. An out-of-range
|
||||
// marker means the document is internally inconsistent; fail loudly
|
||||
// rather than silently dropping the note and desyncing the callout.
|
||||
if (note === undefined) {
|
||||
throw new Error(`footnote [${oldNum}] has no matching note (notes list has ${oldNotes.length} items); document is inconsistent`);
|
||||
}
|
||||
newNotes.push(note);
|
||||
}
|
||||
else {
|
||||
newNotes.push(noteByPh.get(`\u0000FN${phIdx}\u0000`));
|
||||
}
|
||||
return `[${++seq}]`;
|
||||
});
|
||||
});
|
||||
}
|
||||
// Reorder the notes list IN PLACE on `working` first, THEN sync the callout
|
||||
// range. setCalloutRange clones `working`, so the reordered notes (mutated
|
||||
// before the clone) are carried into its result automatically. No null-filter
|
||||
// here: marker count and note count must stay exactly equal (the out-of-range
|
||||
// guard above guarantees no undefined entry is ever pushed).
|
||||
notesList2.content = newNotes;
|
||||
const synced = setCalloutRange(working, notesList2.content.length);
|
||||
return { doc: synced.doc, consumed };
|
||||
}
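The renumbering pass can be modelled on plain strings. This is a simplified sketch, assuming the body is an array of strings and notes are opaque values; `renumber` and its parameters are hypothetical names, not the package's API.

```javascript
// Hypothetical, simplified model of the reading-order renumber pass.
function renumber(bodyTexts, oldNotes, newNotesByPlaceholder) {
  // Either an existing "[N]" marker or a NUL-delimited placeholder.
  const re = /\[(\d+)\]|\u0000FN(\d+)\u0000/g;
  const notes = [];
  let seq = 0;
  const texts = bodyTexts.map((text) =>
    text.replace(re, (_m, oldNum, phIdx) => {
      if (oldNum != null) {
        const note = oldNotes[Number(oldNum) - 1];
        // Fail loudly on a marker that points past the end of the notes list.
        if (note === undefined) throw new Error(`footnote [${oldNum}] has no matching note`);
        notes.push(note);
      } else {
        notes.push(newNotesByPlaceholder.get(`\u0000FN${phIdx}\u0000`));
      }
      return `[${++seq}]`;
    }),
  );
  return { texts, notes };
}

const placeholders = new Map([["\u0000FN0\u0000", "new note"]]);
const result = renumber(
  ["Intro \u0000FN0\u0000 then old [1].", "Later [2]."],
  ["first", "second"],
  placeholders,
);
```

Because markers are visited in document order, the notes array ends up in the same order as the rewritten `[1]`, `[2]`, `[3]` markers, so marker count and note count stay equal.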
|
||||
40
packages/mcp/build/stdio.js
Executable file
@@ -0,0 +1,40 @@
|
||||
#!/usr/bin/env node
|
||||
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
|
||||
import { createDocmostMcpServer } from "./index.js";
|
||||
// Standalone stdio entrypoint. This restores the original behavior of the
|
||||
// package when run as a CLI (`docmost-mcp`): it reads credentials from the
|
||||
// environment and serves the MCP protocol over stdin/stdout. The factory in
|
||||
// index.ts stays side-effect-free; all the process/transport lifecycle lives
|
||||
// here.
|
||||
const API_URL = process.env.DOCMOST_API_URL;
|
||||
const EMAIL = process.env.DOCMOST_EMAIL;
|
||||
const PASSWORD = process.env.DOCMOST_PASSWORD;
|
||||
if (!API_URL || !EMAIL || !PASSWORD) {
|
||||
console.error("Error: DOCMOST_API_URL, DOCMOST_EMAIL, and DOCMOST_PASSWORD environment variables are required.");
|
||||
process.exit(1);
|
||||
}
|
||||
async function run() {
|
||||
// Global safety nets so a stray rejection/exception cannot silently kill
|
||||
// the stdio server. Per-tool errors still flow through the SDK and are not
|
||||
// affected by these handlers; these only catch errors raised OUTSIDE a tool
|
||||
// call (e.g. a transient ws/collab socket "error" event). Such errors must
|
||||
// NOT tear down the whole stdio server, so we log only and keep running.
|
||||
// Genuine startup failures are still fatal via run().catch(...) below.
|
||||
process.on("unhandledRejection", (reason) => {
|
||||
console.error("Unhandled promise rejection:", reason);
|
||||
});
|
||||
process.on("uncaughtException", (error) => {
|
||||
console.error("Uncaught exception:", error);
|
||||
});
|
||||
const server = createDocmostMcpServer({
|
||||
apiUrl: API_URL,
|
||||
email: EMAIL,
|
||||
password: PASSWORD,
|
||||
});
|
||||
const transport = new StdioServerTransport();
|
||||
await server.connect(transport);
|
||||
}
|
||||
run().catch((error) => {
|
||||
console.error("Fatal error running server:", error);
|
||||
process.exit(1);
|
||||
});
|
||||
13
packages/mcp/mcp_config_example.json
Normal file
@@ -0,0 +1,13 @@
|
||||
{
|
||||
"mcpServers": {
|
||||
"docmost-local": {
|
||||
"command": "node",
"args": ["./build/stdio.js"],
|
||||
"env": {
|
||||
"DOCMOST_API_URL": "http://localhost:3000/api",
|
||||
"DOCMOST_EMAIL": "test@docmost.com",
|
||||
"DOCMOST_PASSWORD": "test"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
17
packages/mcp/node_modules/.bin/marked
generated
vendored
Executable file
@@ -0,0 +1,17 @@
|
||||
#!/bin/sh
|
||||
basedir=$(dirname "$(echo "$0" | sed -e 's,\\,/,g')")
|
||||
|
||||
case `uname` in
|
||||
*CYGWIN*) basedir=`cygpath -w "$basedir"`;;
|
||||
esac
|
||||
|
||||
if [ -z "$NODE_PATH" ]; then
|
||||
export NODE_PATH="/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/marked@17.0.5/node_modules/marked/bin/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/marked@17.0.5/node_modules/marked/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/marked@17.0.5/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/node_modules"
|
||||
else
|
||||
export NODE_PATH="/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/marked@17.0.5/node_modules/marked/bin/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/marked@17.0.5/node_modules/marked/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/marked@17.0.5/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/node_modules:$NODE_PATH"
|
||||
fi
|
||||
if [ -x "$basedir/node" ]; then
|
||||
exec "$basedir/node" "$basedir/../marked/bin/marked.js" "$@"
|
||||
else
|
||||
exec node "$basedir/../marked/bin/marked.js" "$@"
|
||||
fi
|
||||
17
packages/mcp/node_modules/.bin/tsc
generated
vendored
Executable file
@@ -0,0 +1,17 @@
|
||||
#!/bin/sh
|
||||
basedir=$(dirname "$(echo "$0" | sed -e 's,\\,/,g')")
|
||||
|
||||
case `uname` in
|
||||
*CYGWIN*) basedir=`cygpath -w "$basedir"`;;
|
||||
esac
|
||||
|
||||
if [ -z "$NODE_PATH" ]; then
|
||||
export NODE_PATH="/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/typescript@5.9.3/node_modules/typescript/bin/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/typescript@5.9.3/node_modules/typescript/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/typescript@5.9.3/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/node_modules"
|
||||
else
|
||||
export NODE_PATH="/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/typescript@5.9.3/node_modules/typescript/bin/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/typescript@5.9.3/node_modules/typescript/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/typescript@5.9.3/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/node_modules:$NODE_PATH"
|
||||
fi
|
||||
if [ -x "$basedir/node" ]; then
|
||||
exec "$basedir/node" "$basedir/../typescript/bin/tsc" "$@"
|
||||
else
|
||||
exec node "$basedir/../typescript/bin/tsc" "$@"
|
||||
fi
|
||||
17
packages/mcp/node_modules/.bin/tsserver
generated
vendored
Executable file
@@ -0,0 +1,17 @@
|
||||
#!/bin/sh
|
||||
basedir=$(dirname "$(echo "$0" | sed -e 's,\\,/,g')")
|
||||
|
||||
case `uname` in
|
||||
*CYGWIN*) basedir=`cygpath -w "$basedir"`;;
|
||||
esac
|
||||
|
||||
if [ -z "$NODE_PATH" ]; then
|
||||
export NODE_PATH="/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/typescript@5.9.3/node_modules/typescript/bin/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/typescript@5.9.3/node_modules/typescript/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/typescript@5.9.3/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/node_modules"
|
||||
else
|
||||
export NODE_PATH="/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/typescript@5.9.3/node_modules/typescript/bin/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/typescript@5.9.3/node_modules/typescript/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/typescript@5.9.3/node_modules:/Users/vvzvlad/Data/Projects/gitmost/node_modules/.pnpm/node_modules:$NODE_PATH"
|
||||
fi
|
||||
if [ -x "$basedir/node" ]; then
|
||||
exec "$basedir/node" "$basedir/../typescript/bin/tsserver" "$@"
|
||||
else
|
||||
exec node "$basedir/../typescript/bin/tsserver" "$@"
|
||||
fi
|
||||
1
packages/mcp/node_modules/@fellow/prosemirror-recreate-transform
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@fellow+prosemirror-recreate-transform@1.2.3/node_modules/@fellow/prosemirror-recreate-transform
|
||||
1
packages/mcp/node_modules/@hocuspocus/provider
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@hocuspocus+provider@3.4.4_y-protocols@1.0.6_yjs@13.6.30__yjs@13.6.30/node_modules/@hocuspocus/provider
|
||||
1
packages/mcp/node_modules/@hocuspocus/transformer
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@hocuspocus+transformer@3.4.4_@tiptap+core@3.20.4_@tiptap+pm@3.20.4__@tiptap+pm@3.20.4__d2104a828d218219abc1c54b602a69ac/node_modules/@hocuspocus/transformer
|
||||
1
packages/mcp/node_modules/@modelcontextprotocol/sdk
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@modelcontextprotocol+sdk@1.29.0_@cfworker+json-schema@4.1.1_zod@3.25.76/node_modules/@modelcontextprotocol/sdk
|
||||
1
packages/mcp/node_modules/@tiptap/core
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@tiptap+core@3.20.4_@tiptap+pm@3.20.4/node_modules/@tiptap/core
|
||||
1
packages/mcp/node_modules/@tiptap/extension-highlight
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@tiptap+extension-highlight@3.20.4_@tiptap+core@3.20.4_@tiptap+pm@3.20.4_/node_modules/@tiptap/extension-highlight
|
||||
1
packages/mcp/node_modules/@tiptap/extension-image
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@tiptap+extension-image@3.20.4_@tiptap+core@3.20.4_@tiptap+pm@3.20.4_/node_modules/@tiptap/extension-image
|
||||
1
packages/mcp/node_modules/@tiptap/extension-link
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@tiptap+extension-link@3.20.4_@tiptap+core@3.20.4_@tiptap+pm@3.20.4__@tiptap+pm@3.20.4/node_modules/@tiptap/extension-link
|
||||
1
packages/mcp/node_modules/@tiptap/extension-subscript
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@tiptap+extension-subscript@3.20.4_@tiptap+core@3.20.4_@tiptap+pm@3.20.4__@tiptap+pm@3.20.4/node_modules/@tiptap/extension-subscript
|
||||
1
packages/mcp/node_modules/@tiptap/extension-superscript
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@tiptap+extension-superscript@3.20.4_@tiptap+core@3.20.4_@tiptap+pm@3.20.4__@tiptap+pm@3.20.4/node_modules/@tiptap/extension-superscript
|
||||
1
packages/mcp/node_modules/@tiptap/extension-task-item
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@tiptap+extension-task-item@3.20.4_@tiptap+extension-list@3.20.4_@tiptap+core@3.20.4_@t_f120fce1a3d9fc85461b67496f03c362/node_modules/@tiptap/extension-task-item
|
||||
1
packages/mcp/node_modules/@tiptap/extension-task-list
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@tiptap+extension-task-list@3.20.4_@tiptap+extension-list@3.20.4_@tiptap+core@3.20.4_@t_c94f69f56aee3556ec680ab7491aa1d4/node_modules/@tiptap/extension-task-list
|
||||
1
packages/mcp/node_modules/@tiptap/html
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@tiptap+html@3.20.4_@tiptap+core@3.20.4_@tiptap+pm@3.20.4__@tiptap+pm@3.20.4_happy-dom@20.8.9/node_modules/@tiptap/html
|
||||
1
packages/mcp/node_modules/@tiptap/starter-kit
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@tiptap+starter-kit@3.20.4/node_modules/@tiptap/starter-kit
|
||||
1
packages/mcp/node_modules/@types/form-data
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@types+form-data@2.5.2/node_modules/@types/form-data
|
||||
1
packages/mcp/node_modules/@types/jsdom
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@types+jsdom@27.0.0/node_modules/@types/jsdom
|
||||
1
packages/mcp/node_modules/@types/node
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../../node_modules/.pnpm/@types+node@20.19.43/node_modules/@types/node
|
||||
1
packages/mcp/node_modules/axios
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../node_modules/.pnpm/axios@1.16.0/node_modules/axios
|
||||
1
packages/mcp/node_modules/form-data
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../node_modules/.pnpm/form-data@4.0.5/node_modules/form-data
|
||||
1
packages/mcp/node_modules/jsdom
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../node_modules/.pnpm/jsdom@27.4.0_@noble+hashes@2.0.1/node_modules/jsdom
|
||||
1
packages/mcp/node_modules/marked
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../node_modules/.pnpm/marked@17.0.5/node_modules/marked
|
||||
1
packages/mcp/node_modules/typescript
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../node_modules/.pnpm/typescript@5.9.3/node_modules/typescript
|
||||
1
packages/mcp/node_modules/ws
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../node_modules/.pnpm/ws@8.20.1/node_modules/ws
|
||||
1
packages/mcp/node_modules/yjs
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../node_modules/.pnpm/yjs@13.6.30/node_modules/yjs
|
||||
1
packages/mcp/node_modules/zod
generated
vendored
Symbolic link
@@ -0,0 +1 @@
|
||||
../../../node_modules/.pnpm/zod@3.25.76/node_modules/zod
|
||||
63
packages/mcp/package.json
Normal file
@@ -0,0 +1,63 @@
|
||||
{
|
||||
"name": "@docmost/mcp",
|
||||
"version": "1.0.0",
|
||||
"description": "A Model Context Protocol (MCP) server for Docmost, allowing AI agents to manage documentation spaces and pages.",
|
||||
"private": true,
|
||||
"type": "module",
|
||||
"main": "./build/index.js",
|
||||
"exports": {
|
||||
".": "./build/index.js",
|
||||
"./http": "./build/http.js"
|
||||
},
|
||||
"bin": {
|
||||
"docmost-mcp": "./build/stdio.js"
|
||||
},
|
||||
"scripts": {
|
||||
"build": "tsc",
|
||||
"start": "node build/stdio.js",
|
||||
"watch": "tsc --watch",
|
||||
"pretest": "tsc",
|
||||
"test": "node --test \"test/unit/*.test.mjs\" \"test/mock/*.test.mjs\"",
|
||||
"test:unit": "node --test \"test/unit/*.test.mjs\"",
|
||||
"test:mock": "node --test \"test/mock/*.test.mjs\"",
|
||||
"test:e2e": "node test-e2e.mjs"
|
||||
},
|
||||
"keywords": [
|
||||
"mcp",
|
||||
"docmost",
|
||||
"documentation",
|
||||
"ai",
|
||||
"agent"
|
||||
],
|
||||
"author": "Moritz Krause",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"@fellow/prosemirror-recreate-transform": "^1.2.3",
|
||||
"@hocuspocus/provider": "^3.4.4",
|
||||
"@hocuspocus/transformer": "^3.4.4",
|
||||
"@modelcontextprotocol/sdk": "^1.25.3",
|
||||
"@tiptap/core": "3.20.4",
|
||||
"@tiptap/extension-highlight": "3.20.4",
|
||||
"@tiptap/extension-image": "3.20.4",
|
||||
"@tiptap/extension-link": "3.20.4",
|
||||
"@tiptap/extension-subscript": "3.20.4",
|
||||
"@tiptap/extension-superscript": "3.20.4",
|
||||
"@tiptap/extension-task-item": "3.20.4",
|
||||
"@tiptap/extension-task-list": "3.20.4",
|
||||
"@tiptap/html": "3.20.4",
|
||||
"@tiptap/starter-kit": "3.20.4",
|
||||
"@types/jsdom": "^27.0.0",
|
||||
"axios": "^1.6.0",
|
||||
"form-data": "^4.0.0",
|
||||
"jsdom": "^27.4.0",
|
||||
"marked": "^17.0.1",
|
||||
"ws": "^8.19.0",
|
||||
"yjs": "^13.6.29",
|
||||
"zod": "^3.22.0"
|
||||
},
|
||||
"devDependencies": {
|
||||
"@types/form-data": "^2.5.0",
|
||||
"@types/node": "^20.0.0",
|
||||
"typescript": "^5.0.0"
|
||||
}
|
||||
}
|
||||
2577
packages/mcp/src/client.ts
Normal file
File diff suppressed because it is too large
106
packages/mcp/src/http.ts
Normal file
@@ -0,0 +1,106 @@
|
||||
import { randomUUID } from "node:crypto";
|
||||
import { IncomingMessage, ServerResponse } from "node:http";
|
||||
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
|
||||
import { isInitializeRequest } from "@modelcontextprotocol/sdk/types.js";
|
||||
import { createDocmostMcpServer, DocmostMcpConfig } from "./index.js";
|
||||
|
||||
/**
|
||||
* Build a stateful Streamable-HTTP handler for the Docmost MCP server. The
|
||||
* embedding host (the gitmost NestJS server) bridges its raw Node req/res into
|
||||
* `handleRequest`. One McpServer + transport is created per MCP session and
|
||||
* kept alive between requests, keyed by the `mcp-session-id` header.
|
||||
*/
|
||||
export function createMcpHttpHandler(config: DocmostMcpConfig) {
|
||||
// One transport (and one McpServer) per MCP session, keyed by session id.
|
||||
const transports: Record<string, StreamableHTTPServerTransport> = {};
|
||||
// Last activity timestamp per session id, used for idle eviction.
|
||||
const lastSeen: Record<string, number> = {};
|
||||
|
||||
// Idle session TTL (ms): a session with no activity for this long is evicted.
|
||||
// Defaults to 30 min; overridable via MCP_SESSION_IDLE_MS.
|
||||
const idleTtlMs = (() => {
|
||||
const parsed = parseInt(process.env.MCP_SESSION_IDLE_MS ?? "", 10);
|
||||
return Number.isFinite(parsed) && parsed > 0 ? parsed : 30 * 60 * 1000;
|
||||
})();
|
||||
|
||||
// Periodically close transports idle longer than the TTL. transport.close()
|
||||
// triggers its onclose, which removes it from `transports`; we also drop the
|
||||
// lastSeen entry. unref() so this timer never keeps the process alive.
|
||||
const sweepIntervalMs = 5 * 60 * 1000;
|
||||
const sweepTimer = setInterval(() => {
|
||||
const now = Date.now();
|
||||
for (const sid of Object.keys(transports)) {
|
||||
if (now - (lastSeen[sid] ?? 0) > idleTtlMs) {
|
||||
void transports[sid].close();
|
||||
delete lastSeen[sid];
|
||||
}
|
||||
}
|
||||
}, sweepIntervalMs);
|
||||
sweepTimer.unref();
|
||||
|
||||
async function handleRequest(
|
||||
req: IncomingMessage,
|
||||
res: ServerResponse,
|
||||
parsedBody?: unknown,
|
||||
): Promise<void> {
|
||||
const sessionId = req.headers["mcp-session-id"] as string | undefined;
|
||||
const method = (req.method || "GET").toUpperCase();
|
||||
let transport = sessionId ? transports[sessionId] : undefined;
|
||||
|
||||
if (method === "POST" && !transport) {
|
||||
// A new session may only be created by an initialize request without a
|
||||
// session id.
|
||||
if (sessionId || !isInitializeRequest(parsedBody)) {
|
||||
res.statusCode = 400;
|
||||
res.setHeader("Content-Type", "application/json");
|
||||
res.end(
|
||||
JSON.stringify({
|
||||
jsonrpc: "2.0",
|
||||
error: {
|
||||
code: -32000,
|
||||
message: "Bad Request: no valid session ID provided",
|
||||
},
|
||||
id: null,
|
||||
}),
|
||||
);
|
||||
return;
|
||||
}
|
||||
transport = new StreamableHTTPServerTransport({
|
||||
sessionIdGenerator: () => randomUUID(),
|
||||
onsessioninitialized: (sid: string) => {
|
||||
transports[sid] = transport!;
|
||||
lastSeen[sid] = Date.now();
|
||||
},
|
||||
});
|
||||
transport.onclose = () => {
|
||||
const sid = transport!.sessionId;
|
||||
if (sid) { delete transports[sid]; delete lastSeen[sid]; } // also drop the idle timestamp so it cannot leak
|
||||
};
|
||||
const server = createDocmostMcpServer(config);
|
||||
await server.connect(transport);
|
||||
await transport.handleRequest(req, res, parsedBody);
|
||||
return;
|
||||
}
|
||||
|
||||
if (!transport) {
|
||||
res.statusCode = 400;
|
||||
res.setHeader("Content-Type", "application/json");
|
||||
res.end(
|
||||
JSON.stringify({
|
||||
jsonrpc: "2.0",
|
||||
error: {
|
||||
code: -32000,
|
||||
message: "Bad Request: no valid session ID provided",
|
||||
},
|
||||
id: null,
|
||||
}),
|
||||
);
|
||||
return;
|
||||
}
|
||||
// Routing to an existing transport: refresh its idle timestamp.
|
||||
if (sessionId) lastSeen[sessionId] = Date.now();
|
||||
await transport.handleRequest(req, res, parsedBody);
|
||||
}
|
||||
|
||||
return { handleRequest };
|
||||
}
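The idle-eviction idea can be exercised without timers or real transports. The sketch below is written in plain JavaScript and assumes an injectable clock; `createIdleRegistry` and its methods are hypothetical names, not part of the handler above.

```javascript
// Hypothetical registry with an injectable clock so the sweep is testable.
function createIdleRegistry(ttlMs, now = Date.now) {
  const sessions = new Map(); // session id -> { close, lastSeen }
  return {
    add(id, close) {
      sessions.set(id, { close, lastSeen: now() });
    },
    touch(id) {
      const session = sessions.get(id);
      if (session) session.lastSeen = now();
    },
    sweep() {
      const evicted = [];
      for (const [id, session] of sessions) {
        if (now() - session.lastSeen > ttlMs) {
          session.close();
          sessions.delete(id); // deleting the current entry during Map iteration is safe
          evicted.push(id);
        }
      }
      return evicted;
    },
    size: () => sessions.size,
  };
}

let clock = 0;
const registry = createIdleRegistry(1000, () => clock);
const closed = [];
registry.add("a", () => closed.push("a"));
registry.add("b", () => closed.push("b"));
clock = 900;
registry.touch("b"); // "b" was active recently
clock = 1500;
const evicted = registry.sweep();
```

Keeping the close callback and the timestamp in one record means a session's bookkeeping is removed in a single step, which is one way to avoid a timestamp outliving its transport.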
|
||||
1088
packages/mcp/src/index.ts
Normal file
File diff suppressed because it is too large
86
packages/mcp/src/lib/auth-utils.ts
Normal file
@@ -0,0 +1,86 @@
import axios from "axios";

export async function getCollabToken(
  baseUrl: string,
  apiToken: string,
): Promise<string> {
  try {
    const response = await axios.post(
      `${baseUrl}/auth/collab-token`,
      {},
      {
        headers: {
          Authorization: `Bearer ${apiToken}`,
          "Content-Type": "application/json",
        },
      },
    );

    // console.error('Collab Token Response:', response.data);
    // Response is wrapped in { data: { token: ... } }
    return response.data.data?.token || response.data.token;
  } catch (error) {
    if (axios.isAxiosError(error)) {
      // Attach the HTTP status to the plain Error so callers (e.g.
      // getCollabTokenWithReauth) can still detect a 401/403 after the
      // original AxiosError has been wrapped away.
      // Avoid leaking the full server response body by default; include only
      // status + statusText. Append the body only when DEBUG is set.
      // A network-level failure carries no response at all, so fall back to
      // the Axios message instead of printing "undefined undefined".
      const detail = error.response
        ? `${error.response.status} ${error.response.statusText}`
        : error.message;
      let message = `Failed to get collab token: ${detail}`;
      if (process.env.DEBUG) {
        message += ` - ${JSON.stringify(error.response?.data)}`;
      }
      const err: any = new Error(message);
      err.status = error.response?.status;
      throw err;
    }
    throw error;
  }
}

export async function performLogin(
  baseUrl: string,
  email: string,
  password: string,
): Promise<string> {
  try {
    const response = await axios.post(`${baseUrl}/auth/login`, {
      email,
      password,
    });

    // Extract token from Set-Cookie header
    const cookies = response.headers["set-cookie"];
    if (!cookies) {
      throw new Error("No Set-Cookie header found in login response");
    }
    // Match the cookie name exactly to avoid matching a future
    // authTokenRefresh cookie (startsWith would catch it).
    const authCookie = cookies.find((c: string) => {
      const kv = c.split(";")[0];
      return kv.slice(0, kv.indexOf("=")) === "authToken";
    });
    if (!authCookie) {
      throw new Error("No authToken cookie found in login response");
    }

    // Take everything after the FIRST "=" up to the first ";".
    // Splitting on "=" would truncate base64 values containing "=" padding.
    const kv = authCookie.split(";")[0];
    const token = kv.slice(kv.indexOf("=") + 1);
    return token;
  } catch (error: any) {
    // Avoid leaking the full server response body by default; log only the
    // HTTP status. Log the verbose body only when DEBUG is set.
    if (axios.isAxiosError(error)) {
      if (process.env.DEBUG) {
        console.error("Login failed:", error.response?.data);
      } else {
        console.error("Login failed:", error.response?.status);
      }
    } else {
      console.error("Login failed:", error.message);
    }
    throw error;
  }
}
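The cookie handling in `performLogin` rests on two rules: match the cookie name exactly, and take everything after the first `=` as the value. A minimal standalone sketch of those two rules follows; `readCookieValue` and the sample header values are hypothetical and only illustrate the parsing, they are not part of the package.

```typescript
/** Hypothetical helper: value of the cookie called `name`, or undefined. */
function readCookieValue(
  setCookie: string[],
  name: string,
): string | undefined {
  for (const header of setCookie) {
    const pair = header.split(";")[0] ?? ""; // "name=value" without attributes
    const eq = pair.indexOf("=");
    if (eq < 0) continue; // malformed entry without "="
    if (pair.slice(0, eq).trim() !== name) continue; // exact name match only
    return pair.slice(eq + 1); // keeps any "=" padding inside the value
  }
  return undefined;
}

// Made-up Set-Cookie values for illustration.
const sampleHeaders = [
  "authTokenRefresh=r3fr3sh; Path=/; HttpOnly",
  "authToken=abc.def==; Path=/; HttpOnly",
];
console.log(readCookieValue(sampleHeaders, "authToken")); // abc.def==
```

A `startsWith("authToken")` test would have returned the refresh cookie here, and splitting the pair on `=` would have dropped the trailing padding.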
618
packages/mcp/src/lib/collaboration.ts
Normal file
@@ -0,0 +1,618 @@
import { HocuspocusProvider } from "@hocuspocus/provider";
import { TiptapTransformer } from "@hocuspocus/transformer";
import * as Y from "yjs";
import WebSocket from "ws";
import { marked } from "marked";
import { generateJSON } from "@tiptap/html";
import { JSDOM } from "jsdom";
import { docmostExtensions } from "./docmost-schema.js";
import { withPageLock } from "./page-lock.js";
import { sanitizeForYjs, findUnstorableAttr } from "./node-ops.js";

// Setup DOM environment for Tiptap HTML parsing in Node.js
const dom = new JSDOM("<!DOCTYPE html><html><body></body></html>");
global.window = dom.window as any;
global.document = dom.window.document;
// @ts-ignore
global.Element = dom.window.Element;
// @ts-ignore
global.WebSocket = WebSocket;
// Navigator is read-only in newer Node versions and already exists
// global.navigator = dom.window.navigator;

/**
 * Hard ceiling above which we skip callout preprocessing entirely. The linear
 * scanner below has no quadratic blow-up, but we still cap input defensively so
 * a pathological multi-megabyte payload cannot tie up the event loop; in that
 * case the markdown is passed through verbatim (callouts are simply not
 * detected) rather than risking a slow scan.
 */
const MAX_CALLOUT_PREPROCESS_BYTES = 4 * 1024 * 1024; // 4 MB

/** Matches an opening callout fence: `:::type` (type captured, lower-cased). */
const CALLOUT_OPEN_RE = /^:::\s*(\w+)\s*$/;
/** Matches a bare closing callout fence: `:::`. */
const CALLOUT_CLOSE_RE = /^:::\s*$/;
/** Matches the start/end of a code fence (``` or ~~~), capturing the marker. */
const CODE_FENCE_RE = /^(\s*)(`{3,}|~{3,})/;

/**
 * Pre-process Docmost-flavoured markdown: convert `:::type ... :::`
 * callout blocks (the syntax our markdown export produces) into HTML
 * divs that the callout extension parses. The inner content is rendered
 * through marked as regular markdown.
 *
 * Implemented as a single linear pass over the lines (no quadratic regex
 * rescan). It:
 *  - tracks fenced code regions (```...``` and ~~~...~~~) and never treats a
 *    `:::` line that lives inside a code fence as a callout delimiter, so a
 *    callout body that itself contains a fenced code block with a `:::` line is
 *    no longer corrupted;
 *  - matches an opening `:::type` line with the next CLOSING `:::` at the SAME
 *    nesting level, supporting NESTED callouts via a depth counter (an inner
 *    `:::type` opens a deeper level and consumes a matching `:::`);
 *  - emits the same `<div data-type="callout" data-callout-type="TYPE">` output
 *    (inner rendered through marked) as the previous regex implementation.
 */
async function preprocessCallouts(markdown: string): Promise<string> {
  // Defensive cap: skip preprocessing for pathologically large inputs.
  if (markdown.length > MAX_CALLOUT_PREPROCESS_BYTES) {
    return markdown;
  }

  // Recursively transform a slice of lines, converting top-level callouts in
  // that slice into <div> blocks and rendering their inner content (which may
  // itself contain nested callouts) through this same function.
  const transform = async (lines: string[]): Promise<string> => {
    const out: string[] = [];
    let inCodeFence = false;
    let codeFenceMarker = ""; // the exact run of backticks/tildes that opened it
    let i = 0;

    while (i < lines.length) {
      const line = lines[i];

      // Inside a code fence, only its matching closing fence is significant;
      // everything else (including `:::` lines) is copied through verbatim.
      if (inCodeFence) {
        out.push(line);
        const fence = line.match(CODE_FENCE_RE);
        if (
          fence &&
          fence[2].startsWith(codeFenceMarker[0]) &&
          fence[2].length >= codeFenceMarker.length
        ) {
          inCodeFence = false;
          codeFenceMarker = "";
        }
        i++;
        continue;
      }

      // A code fence opening outside any callout body: enter code-fence mode.
      const fenceOpen = line.match(CODE_FENCE_RE);
      if (fenceOpen) {
        inCodeFence = true;
        codeFenceMarker = fenceOpen[2];
        out.push(line);
        i++;
        continue;
      }

      // An opening callout fence: scan forward (with code-fence and nested
      // callout awareness) for its matching closing `:::` at the same level.
      const open = line.match(CALLOUT_OPEN_RE);
      if (open) {
        const type = open[1].toLowerCase();
        const bodyLines: string[] = [];
        let depth = 1;
        let innerInCodeFence = false;
        let innerCodeFenceMarker = "";
        let j = i + 1;
        for (; j < lines.length; j++) {
          const bl = lines[j];
          if (innerInCodeFence) {
            const f = bl.match(CODE_FENCE_RE);
            if (
              f &&
              f[2].startsWith(innerCodeFenceMarker[0]) &&
              f[2].length >= innerCodeFenceMarker.length
            ) {
              innerInCodeFence = false;
              innerCodeFenceMarker = "";
            }
            bodyLines.push(bl);
            continue;
          }
          const innerFence = bl.match(CODE_FENCE_RE);
          if (innerFence) {
            innerInCodeFence = true;
            innerCodeFenceMarker = innerFence[2];
            bodyLines.push(bl);
            continue;
          }
          if (CALLOUT_OPEN_RE.test(bl)) {
            depth++;
            bodyLines.push(bl);
            continue;
          }
          if (CALLOUT_CLOSE_RE.test(bl)) {
            depth--;
            if (depth === 0) break; // matching close for THIS callout
            bodyLines.push(bl);
            continue;
          }
          bodyLines.push(bl);
        }

        if (j < lines.length) {
          // Found the matching closing fence: render the body (recursively, so
          // nested callouts are handled) and emit the callout div.
          const inner = await transform(bodyLines);
          const renderedInner = await marked.parse(inner);
          out.push(
            `\n<div data-type="callout" data-callout-type="${type}">${renderedInner}</div>\n`,
          );
          i = j + 1; // skip past the closing `:::`
          continue;
        }
        // No matching close (unterminated callout): treat the opener as a
        // literal line and continue, preserving the original text.
        out.push(line);
        i++;
        continue;
      }

      out.push(line);
      i++;
    }

    return out.join("\n");
  };

  return transform(markdown.split("\n"));
}
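The core of the scanner is the search for the `:::` line that closes a given opener, counting nested openers and ignoring anything inside a code fence. Below is a reduced sketch of only that search. `findCalloutClose` is a hypothetical helper written for illustration, with its own copies of the regular expressions; it is not the function used by `preprocessCallouts`.

```typescript
const OPEN_RE = /^:::\s*(\w+)\s*$/;
const CLOSE_RE = /^:::\s*$/;
const FENCE_RE = /^\s*(`{3,}|~{3,})/;

/**
 * Index of the `:::` line that closes the callout opened at `openIdx`,
 * or -1 when the callout is unterminated. (Illustrative helper.)
 */
function findCalloutClose(lines: string[], openIdx: number): number {
  let depth = 1;
  let fence = ""; // marker of the code fence we are inside, "" when outside
  for (let j = openIdx + 1; j < lines.length; j++) {
    const line = lines[j] ?? "";
    const marker = line.match(FENCE_RE)?.[1] ?? "";
    if (fence) {
      // Inside a code fence only a matching closing fence is significant.
      if (marker && marker[0] === fence[0] && marker.length >= fence.length) {
        fence = "";
      }
      continue;
    }
    if (marker) {
      fence = marker;
      continue;
    }
    if (OPEN_RE.test(line)) depth++;
    else if (CLOSE_RE.test(line) && --depth === 0) return j;
  }
  return -1;
}

const sampleLines = [
  ":::info", //    0: outer opener
  "outer text", // 1
  ":::warning", // 2: nested opener
  "inner text", // 3
  ":::", //        4: closes the nested callout
  "~~~", //        5: code fence opens
  ":::", //        6: inside the fence, ignored
  "~~~", //        7: code fence closes
  ":::", //        8: closes the outer callout
];
console.log(findCalloutClose(sampleLines, 0)); // 8
console.log(findCalloutClose(sampleLines, 2)); // 4
```

Returning -1 corresponds to the unterminated case above, where the opener is kept as a literal line.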

/**
 * Bridge marked's checkbox lists to TipTap task lists.
 *
 * marked renders GitHub task list items (`- [x] done`) as a plain
 * `<ul><li><p><input type="checkbox" checked> text</p></li></ul>` WITHOUT the
 * markup TipTap's TaskList/TaskItem extensions parse. This rewrites such lists
 * into the shape those extensions expect:
 *   TaskList parseHTML matches `ul[data-type="taskList"]`,
 *   TaskItem matches `li[data-type="taskItem"]`,
 *   the checked state is read from `data-checked === "true"`.
 *
 * A list is only converted when it has at least one `<li>` and EVERY direct
 * `<li>` contains a checkbox input. Both `<ul>` and `<ol>` are considered: a
 * numbered checklist (`1. [x] a`, which marked renders as an `<ol>` of checkbox
 * `<li>`s) would otherwise lose its task state. TipTap task lists are unordered,
 * so a matching `<ol>` is emitted as `data-type="taskList"` exactly like a
 * `<ul>`. Mixed or ordinary lists (including ordinary `<ol>` lists) are left
 * untouched so they keep rendering as bullet/numbered lists. The marked `<p>`
 * wrapper is kept inside the `<li>` because TaskItem content allows paragraphs.
 */
function bridgeTaskLists(html: string): string {
  // Cheap early-out: if the markup contains no checkbox input at all there is
  // nothing to bridge, so skip the expensive JSDOM parse entirely. This is the
  // common case (most pages have no task lists).
  if (!/type=["']?checkbox/i.test(html)) {
    return html;
  }
  // Defensive cap (consistent with preprocessCallouts): skip the bridge for
  // pathologically large inputs rather than running a second expensive JSDOM
  // parse on a multi-megabyte payload. The markup is passed through verbatim.
  if (html.length > MAX_CALLOUT_PREPROCESS_BYTES) {
    return html;
  }
  const dom = new JSDOM(html);
  const document = dom.window.document;
  // Collect the checkbox(es) that belong to THIS <li> directly: either direct
  // child <input type="checkbox"> elements or ones inside the <li>'s direct <p>
  // child (the shape marked emits: `<li><p><input type="checkbox"> text</p></li>`).
  // Checkboxes nested deeper (e.g. inside a child <ul>/<ol>) are excluded so a
  // bullet <li> that merely contains a nested task sublist is not misdetected.
  // Raw inline HTML can put more than one checkbox in a single <li>; we gather
  // ALL of them so none survive into the converted item.
  const directCheckboxes = (li: Element): Element[] => {
    const found: Element[] = [];
    for (const child of Array.from(li.children)) {
      if (
        child.tagName === "INPUT" &&
        child.getAttribute("type") === "checkbox"
      ) {
        found.push(child);
        continue;
      }
      if (child.tagName === "P") {
        for (const inp of Array.from(
          child.querySelectorAll(":scope > input[type='checkbox']"),
        )) {
          found.push(inp);
        }
      }
    }
    return found;
  };
  // Both <ul> and <ol> are candidates: an <ol> whose every direct <li> carries
  // its own checkbox is a numbered checklist that must also become a taskList.
  const lists = Array.from(document.querySelectorAll("ul, ol"));
  for (const list of lists) {
    // Only consider DIRECT child <li> elements; nested lists are handled by
    // their own iteration of the outer loop.
    const items = Array.from(list.children).filter(
      (child) => child.tagName === "LI",
    );
    if (items.length === 0) continue;
    const itemCheckboxes = items.map((li) => directCheckboxes(li));
    // Convert only when every direct <li> carries at least one OWN checkbox.
    if (!itemCheckboxes.every((boxes) => boxes.length > 0)) continue;

    // A numbered checklist arrives as an <ol>. We must NOT leave the tag as
    // <ol> while tagging it data-type="taskList": generateJSON would then match
    // BOTH the orderedList rule (tag ol) and the taskList rule (data-type),
    // emitting a phantom empty orderedList beside the real taskList. So rename a
    // qualifying <ol> to a <ul> (move its <li> children over and replace it),
    // leaving only the taskList rule to match. Already-<ul> lists are unchanged.
    let target: Element = list;
    if (list.tagName === "OL") {
      const ul = document.createElement("ul");
      // Carry over existing attributes (e.g. class) so nothing is silently lost.
      for (const attr of Array.from(list.attributes)) {
        ul.setAttribute(attr.name, attr.value);
      }
      // Move every child node (including the <li>s we collected) into the <ul>.
      while (list.firstChild) {
        ul.appendChild(list.firstChild);
      }
      list.replaceWith(ul);
      target = ul;
    }

    target.setAttribute("data-type", "taskList");
    items.forEach((li, index) => {
      const boxes = itemCheckboxes[index];
      // The first checkbox determines the checked state (matches the previous
      // single-checkbox behaviour); any extras only need removing.
      const input = boxes[0] ?? null;
      li.setAttribute("data-type", "taskItem");
      const checked =
        input != null &&
        (input.hasAttribute("checked") || (input as any).checked);
      li.setAttribute("data-checked", checked ? "true" : "false");
      // Remove ALL direct checkbox inputs so none survive into the content
      // (a raw-inline-HTML <li> may carry more than one).
      for (const box of boxes) {
        box.remove();
      }
    });
  }
  return document.body.innerHTML;
}

/** Convert markdown to a ProseMirror doc using the full Docmost schema. */
export async function markdownToProseMirror(
  markdownContent: string,
): Promise<any> {
  const withCallouts = await preprocessCallouts(markdownContent);
  const html = await marked.parse(withCallouts);
  const bridged = bridgeTaskLists(html);
  return generateJSON(bridged, docmostExtensions);
}

/**
 * Build the collaboration WebSocket URL from an API base URL:
 * switch http(s)->ws(s), strip a trailing /api, mount on /collab.
 * Shared by the live read and the mutate path so both target the same socket.
 */
export function buildCollabWsUrl(baseUrl: string): string {
  let wsUrl = baseUrl.replace(/^http/, "ws");
  try {
    const urlObj = new URL(wsUrl);
    if (urlObj.pathname.endsWith("/api") || urlObj.pathname.endsWith("/api/")) {
      urlObj.pathname = urlObj.pathname.replace(/\/api\/?$/, "");
    }
    urlObj.pathname = urlObj.pathname.replace(/\/$/, "") + "/collab";
    // Drop any query/hash from the base URL so it is not carried into the
    // collaboration ws URL.
    urlObj.search = "";
    urlObj.hash = "";
    wsUrl = urlObj.toString();
  } catch (e) {
    // Fallback if URL parsing fails
    if (!wsUrl.endsWith("/collab")) {
      wsUrl = wsUrl.replace(/\/$/, "") + "/collab";
    }
  }
  return wsUrl;
}
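To make the URL derivation concrete, here is a condensed restatement of the happy path with a few worked inputs. `toCollabWsUrl` is a hypothetical stand-in that omits the parse-failure fallback, and the hostnames are made up for illustration.

```typescript
/** Condensed, illustrative version of the steps in buildCollabWsUrl. */
function toCollabWsUrl(baseUrl: string): string {
  // http -> ws and https -> wss: only the leading "http" is replaced.
  const url = new URL(baseUrl.replace(/^http/, "ws"));
  // Strip a trailing /api (with or without slash), then any trailing slash.
  const path = url.pathname.replace(/\/api\/?$/, "").replace(/\/$/, "");
  url.pathname = path + "/collab";
  // Query string and fragment are not carried over.
  url.search = "";
  url.hash = "";
  return url.toString();
}

console.log(toCollabWsUrl("https://docs.example.com/api"));
// wss://docs.example.com/collab
console.log(toCollabWsUrl("http://localhost:3000/api/?debug=1#top"));
// ws://localhost:3000/collab
console.log(toCollabWsUrl("https://example.com/wiki/api"));
// wss://example.com/wiki/collab
```

The third case shows that a deployment under a sub-path keeps its prefix; only the trailing `/api` segment is swapped for `/collab`.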

/**
 * Encode a ProseMirror doc to a Yjs document, sanitizing it first and turning
 * the opaque yjs "Unexpected content type" failure into a descriptive error.
 *
 * `sanitizeForYjs` strips `undefined` node/mark attributes (the common cause of
 * the failure); if `toYdoc` still throws, `findUnstorableAttr` is used to point
 * at the offending attribute path.
 */
export function buildYDoc(doc: any): Y.Doc {
  const safe = sanitizeForYjs(doc);
  try {
    return TiptapTransformer.toYdoc(safe, "default", docmostExtensions);
  } catch (e) {
    const bad = findUnstorableAttr(safe);
    throw new Error(
      `Failed to encode document to Yjs (toYdoc): ${e instanceof Error ? e.message : String(e)}.${bad ? ` Offending attribute: ${bad}.` : " A node/mark attribute likely holds a value Yjs cannot store (e.g. undefined)."}`,
    );
  }
}

/**
 * Validate that a doc is Yjs-encodable by building (and discarding) a Y.Doc.
 * Throws the same descriptive error as the apply path when it is not. Used by
 * the dry-run preview so it fails identically to apply.
 */
export function assertYjsEncodable(doc: any): void {
  buildYDoc(doc);
}

/** Time we wait for the initial handshake/sync before giving up. */
const CONNECT_TIMEOUT_MS = 25000;
/** Time we wait for the server to acknowledge our write before giving up. */
const PERSIST_TIMEOUT_MS = 20000;

/**
 * Safely mutate the live content of a page over the collaboration websocket.
 *
 * This is the single safe write path for every MCP content mutation. It:
 *  1. serializes per-page writes through withPageLock (no two MCP writes on
 *     the same page overlap);
 *  2. connects to Hocuspocus and waits for the initial sync so the local ydoc
 *     mirrors the authoritative server doc, INCLUDING edits/comments/images
 *     that are not yet in the debounced REST snapshot;
 *  3. inside onSynced, SYNCHRONOUSLY reads the live doc, runs `transform`, and
 *     writes the result back, with no `await` between read and write so no
 *     remote update can interleave and clobber concurrent human edits;
 *  4. waits for the server to acknowledge the write (unsyncedChanges -> 0)
 *     before resolving, so the next operation observes our change.
 *
 * `transform` receives the live ProseMirror doc and returns the NEW full
 * ProseMirror doc to write, or `null` to abort with no write (a no-op). If
 * `transform` throws, the error is propagated to the caller (not swallowed).
 *
 * Returns the doc that was written, or the live doc when the transform aborted.
 */
export async function mutatePageContent(
  pageId: string,
  collabToken: string,
  baseUrl: string,
  transform: (liveDoc: any) => any | null,
): Promise<any> {
  return withPageLock(pageId, () => {
    if (process.env.DEBUG) {
      console.error(`Starting realtime content mutate for page ${pageId}`);
      // Token prefix is sensitive; only log it under DEBUG.
      console.error(
        `Token prefix: ${collabToken ? collabToken.substring(0, 5) : "NONE"}...`,
      );
    }

    const ydoc = new Y.Doc();
    const wsUrl = buildCollabWsUrl(baseUrl);
    if (process.env.DEBUG) console.error(`Connecting to WebSocket: ${wsUrl}`);

    return new Promise<any>((resolve, reject) => {
      let provider: HocuspocusProvider | undefined;
      let applied = false; // onSynced may fire again on reconnect; apply once.
      let settled = false;
      // Set true on disconnect/close so a reconnect-driven unsyncedChanges->0
      // cannot be mistaken for a successful persist of our write.
      let connectionLost = false;
      let connectTimer: ReturnType<typeof setTimeout> | undefined;
      let persistTimer: ReturnType<typeof setTimeout> | undefined;
      let unsyncedHandler: ((data: { number: number }) => void) | undefined;
      // The doc to resolve with: the written doc, or the live doc on abort.
      let lastWrittenDoc: any;

      const cleanup = () => {
        if (connectTimer) clearTimeout(connectTimer);
        if (persistTimer) clearTimeout(persistTimer);
        if (provider) {
          if (unsyncedHandler) {
            try {
              provider.off("unsyncedChanges", unsyncedHandler);
            } catch (err) {
              // Best-effort teardown: ignore errors while detaching.
            }
          }
          try {
            provider.destroy();
          } catch (err) {
            // Best-effort teardown: ignore errors while destroying.
          }
        }
      };

      const finish = (err: Error | null, value?: any) => {
        if (settled) return;
        settled = true;
        cleanup();
        if (err) reject(err);
        else resolve(value);
      };

      connectTimer = setTimeout(() => {
        finish(new Error("Connection timeout to collaboration server"));
      }, CONNECT_TIMEOUT_MS);

      // Resolve once the server has acknowledged our update. The provider
      // increments unsyncedChanges when our local update is sent and
      // decrements it when the server replies with a SyncStatus(applied=true);
      // reaching 0 means the authoritative in-memory ydoc on the server now
      // contains our write.
      const waitForPersistence = () => {
        if (settled) return;
        // A missing provider is a failure, not a success: without it the write
        // can never have been acknowledged. Only an actual unsyncedChanges===0
        // on a live provider counts as persisted.
        if (!provider) {
          finish(new Error("collab provider gone before persistence"));
          return;
        }
        if (provider.unsyncedChanges === 0) {
          finish(null, lastWrittenDoc);
          return;
        }
        persistTimer = setTimeout(() => {
          finish(
            new Error(
              "Timeout waiting for collaboration server to persist the update",
            ),
          );
        }, PERSIST_TIMEOUT_MS);
        unsyncedHandler = (data: { number: number }) => {
          // Only treat unsyncedChanges->0 as success when the connection is
          // still up. A transient disconnect + reconnect handshake can drive
          // the counter back to 0 without our write being re-transmitted; in
          // that case let the disconnect/close error win instead.
          if (data.number === 0 && !connectionLost) {
            finish(null, lastWrittenDoc);
          }
        };
        provider.on("unsyncedChanges", unsyncedHandler);
      };

      provider = new HocuspocusProvider({
        url: wsUrl,
        name: `page.${pageId}`,
        document: ydoc,
        token: collabToken,
        // @ts-ignore - Required for Node.js environment
        WebSocketPolyfill: WebSocket,
        onConnect: () => {
          if (process.env.DEBUG) console.error("WS Connect");
        },
        // An unexpected disconnect/close while we are still waiting (during the
        // connect-wait before onSynced, or during the persistence wait after the
        // write) means the update will never be acknowledged, so surface it now
        // instead of hanging until the connect/persist timeout fires. `finish`
        // is idempotent via the `settled` flag, so the onClose that our own
        // cleanup()->provider.destroy() triggers (after settled=true is set) is
        // a harmless no-op and cannot cause a double-resolve.
        onDisconnect: () => {
          if (process.env.DEBUG) console.error("WS Disconnect");
          // Mark BEFORE finish so the unsyncedChanges handler (if it races)
          // sees the connection as lost and won't report a false success.
          connectionLost = true;
          finish(
            new Error(
              "Collaboration connection closed before the update was persisted/synced",
            ),
          );
        },
        onClose: () => {
          if (process.env.DEBUG) console.error("WS Close");
          // Mark BEFORE finish so the unsyncedChanges handler (if it races)
          // sees the connection as lost and won't report a false success.
          connectionLost = true;
          finish(
            new Error(
              "Collaboration connection closed before the update was persisted/synced",
            ),
          );
        },
        onSynced: () => {
          if (applied || settled) return;
          applied = true;
          // The handshake/sync phase is over: stop the connect timer so it
          // cannot fire during the persistence wait and report a misleading
          // "Connection timeout". PERSIST_TIMEOUT_MS governs from here on.
          if (connectTimer) clearTimeout(connectTimer);
          if (process.env.DEBUG) console.error("Connected and synced!");

          // CRITICAL: everything between reading the live doc and writing it
          // back must stay synchronous (no await). While the JS event loop is
          // not yielded, no incoming remote update can interleave, so any
          // already-synced concurrent edits are preserved in liveDoc.
          let newDoc: any;
          try {
            let liveDoc = TiptapTransformer.fromYdoc(ydoc, "default");
            if (
              !liveDoc ||
              typeof liveDoc !== "object" ||
              !Array.isArray(liveDoc.content)
            ) {
              liveDoc = { type: "doc", content: [] };
            }

            newDoc = transform(liveDoc);

            if (newDoc == null) {
              // Transform aborted: write nothing, return the live doc.
              lastWrittenDoc = liveDoc;
              finish(null, liveDoc);
              return;
            }

            const tempDoc = buildYDoc(newDoc);
            // Fetch the fragment immediately before the transact that mutates
            // it, rather than reusing a handle grabbed across the transform.
            const fragment = ydoc.getXmlFragment("default");
            ydoc.transact(() => {
              if (fragment.length > 0) {
                fragment.delete(0, fragment.length);
              }
              Y.applyUpdate(ydoc, Y.encodeStateAsUpdate(tempDoc));
            });
          } catch (e) {
            // Includes errors thrown by transform (e.g. "afterText not found",
            // "text not found"): propagate them verbatim to the caller.
            finish(e instanceof Error ? e : new Error(String(e)));
            return;
          }

          lastWrittenDoc = newDoc;
          if (process.env.DEBUG)
            console.error("Content written, waiting for server to persist...");
          waitForPersistence();
        },
        onAuthenticationFailed: () => {
          finish(
            new Error("Authentication failed for collaboration connection"),
          );
        },
      });
    });
  });
}
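Several independent events (sync, acknowledgement, timeout, disconnect, close) compete to end the operation above, and the `settled` flag makes sure only the first one wins. The pattern can be sketched on its own, without any websocket. `settleOnce` and `Outcome` are hypothetical names introduced here for illustration only.

```typescript
type Outcome<T> = { ok: true; value: T } | { ok: false; error: Error };

/** Wrap a result callback so that only the first outcome is delivered. */
function settleOnce<T>(
  onSettled: (outcome: Outcome<T>) => void,
  cleanup: () => void,
): (err: Error | null, value?: T) => void {
  let settled = false;
  return (err, value) => {
    if (settled) return; // late events (e.g. close after destroy) are ignored
    settled = true;
    cleanup(); // release timers/handlers exactly once
    if (err) onSettled({ ok: false, error: err });
    else onSettled({ ok: true, value: value as T });
  };
}

// Usage: a success followed by a late "connection closed" error.
const outcomes: Outcome<string>[] = [];
let cleanups = 0;
const finishOnce = settleOnce<string>(
  (outcome) => outcomes.push(outcome),
  () => {
    cleanups++;
  },
);
finishOnce(null, "written-doc");
finishOnce(new Error("connection closed")); // ignored: already settled
console.log(outcomes.length, cleanups); // 1 1
```

Running cleanup inside the guard is what makes it safe for teardown to trigger further events, such as a close callback fired by destroying the provider.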

/**
 * Replace the live content of a page over the collaboration websocket.
 * Accepts a ready ProseMirror JSON document; the caller controls whether
 * it was produced from markdown (ids regenerate) or edited in place
 * (existing block ids preserved).
 *
 * This is an intentional full replace (used by update_page / update_page_json),
 * but now runs under the per-page lock and waits for server persistence via
 * mutatePageContent.
 */
export async function replacePageContent(
  pageId: string,
  prosemirrorDoc: any,
  collabToken: string,
  baseUrl: string,
): Promise<void> {
  // Fail fast on a bad document instead of deferring the failure into the
  // collaboration write (where TiptapTransformer.toYdoc(undefined) used to
  // throw). The transform must return a valid ProseMirror doc.
  if (
    prosemirrorDoc == null ||
    typeof prosemirrorDoc !== "object" ||
    prosemirrorDoc.type !== "doc"
  ) {
    throw new Error("replacePageContent: invalid ProseMirror document");
  }
  await mutatePageContent(pageId, collabToken, baseUrl, () => prosemirrorDoc);
}

/**
 * Markdown update path (kept for backwards compatibility).
 * NOTE: this re-imports the whole document, so block ids are regenerated.
 * Tables and :::callout::: blocks survive thanks to the full schema.
 */
export async function updatePageContentRealtime(
  pageId: string,
  markdownContent: string,
  collabToken: string,
  baseUrl: string,
): Promise<void> {
  const tiptapJson = await markdownToProseMirror(markdownContent);
  await mutatePageContent(pageId, collabToken, baseUrl, () => tiptapJson);
}
319
packages/mcp/src/lib/diff.ts
Normal file
@@ -0,0 +1,319 @@
/**
 * Headless, Docmost-equivalent document diff.
 *
 * Docmost's history editor computes a change set with the exact pipeline below
 * (recreateTransform -> ChangeSet.addSteps -> simplifyChanges) and renders it as
 * editor decorations. This module runs the SAME computation but serializes the
 * result to text + integrity counts instead of decorations, so a diff can be
 * previewed without a browser.
 *
 * recreateTransform here comes from @fellow/prosemirror-recreate-transform, the
 * maintained published fork of the MIT prosemirror-recreate-steps source that
 * Docmost vendors in @docmost/editor-ext; it exposes the identical
 * recreateTransform(fromDoc, toDoc, { complexSteps, wordDiffs, simplifyDiff })
 * signature.
 *
 * If recreateTransform / the changeset throws on a pathological document pair,
 * we fall back to a coarse block-level text diff so the tool never hard-fails.
 */

import { getSchema } from "@tiptap/core";
import { Node } from "@tiptap/pm/model";
import { ChangeSet, simplifyChanges } from "@tiptap/pm/changeset";
import { recreateTransform } from "@fellow/prosemirror-recreate-transform";
import { docmostExtensions } from "./docmost-schema.js";

/** A single inserted/deleted change with its containing-block context. */
export interface DiffChange {
  op: "insert" | "delete";
  /** Lead (plain) text of the block that contains the change, for context. */
  block: string;
  /** The inserted or deleted text. */
  text: string;
}

/** Integrity counts as [old, new] tuples; footnoteMarkers as [oldList, newList]. */
export interface DiffIntegrity {
  images: [number, number];
  links: [number, number];
  tables: [number, number];
  callouts: [number, number];
  footnoteMarkers: [number[], number[]];
}

export interface DiffResult {
  summary: { inserted: number; deleted: number; blocksChanged: number };
  integrity: DiffIntegrity;
  changes: DiffChange[];
  /** Human-readable unified-ish summary. */
  markdown: string;
}

/** Build the schema once; it is pure and reused across calls. */
const schema = getSchema(docmostExtensions);

/** Recursively concatenate the plain text of a JSON node. */
function plainText(node: any): string {
  if (!node || typeof node !== "object") return "";
  let out = "";
  if (typeof node.text === "string") out += node.text;
  if (Array.isArray(node.content)) {
    for (const child of node.content) out += plainText(child);
  }
  return out;
}

/** Count nodes in a JSON doc that satisfy `pred` (recursive). */
function countNodes(doc: any, pred: (node: any) => boolean): number {
  let n = 0;
  const visit = (node: any): void => {
    if (!node || typeof node !== "object") return;
    if (pred(node)) n++;
    if (Array.isArray(node.content)) for (const c of node.content) visit(c);
  };
  visit(doc);
  return n;
}

/**
 * Count UNIQUE links in a JSON doc by their `href`. A single link can be split
 * across several adjacent text runs (e.g. a "link+bold" run followed by a "link"
 * run); counting link-bearing runs would over-count it. Walking the tree and
 * collecting hrefs into a Set keys each distinct link once. Link marks with a
 * missing/empty href are bucketed under a single "" key so a malformed link is
 * still counted as one.
 */
function countUniqueLinks(doc: any): number {
  const hrefs = new Set<string>();
  const visit = (node: any): void => {
    if (!node || typeof node !== "object") return;
    if (node.type === "text" && Array.isArray(node.marks)) {
      for (const m of node.marks) {
        if (m && m.type === "link") {
          const href = m.attrs && typeof m.attrs.href === "string" ? m.attrs.href : "";
          hrefs.add(href);
        }
      }
    }
    if (Array.isArray(node.content)) for (const c of node.content) visit(c);
  };
  visit(doc);
  return hrefs.size;
}
|
||||
|
||||
/**
|
||||
* Parse the ordered list of integers from `[N]` footnote markers found in the
* BODY only: every top-level block before the first heading whose trimmed text
* equals `notesHeading` (by default "Примечания переводчика", Russian for
* "Translator's notes"); if no such heading exists, the whole doc is scanned.
* Markers are returned in reading order.
|
||||
*/
|
||||
function footnoteMarkers(doc: any, notesHeading: string): number[] {
|
||||
const top: any[] = Array.isArray(doc?.content) ? doc.content : [];
|
||||
const notesIdx = top.findIndex(
|
||||
(n) =>
|
||||
n &&
|
||||
n.type === "heading" &&
|
||||
plainText(n).trim() === notesHeading,
|
||||
);
|
||||
const bodyBlocks = notesIdx >= 0 ? top.slice(0, notesIdx) : top;
|
||||
const markers: number[] = [];
|
||||
const re = /\[(\d+)\]/g;
|
||||
for (const block of bodyBlocks) {
|
||||
const text = plainText(block);
|
||||
let m: RegExpExecArray | null;
|
||||
re.lastIndex = 0;
|
||||
while ((m = re.exec(text)) !== null) {
|
||||
markers.push(Number(m[1]));
|
||||
}
|
||||
}
|
||||
return markers;
|
||||
}
|
||||
|
||||
/** Compute the [old,new] integrity tuples for two JSON docs. */
|
||||
function computeIntegrity(
|
||||
oldDoc: any,
|
||||
newDoc: any,
|
||||
notesHeading: string,
|
||||
): DiffIntegrity {
|
||||
const images: [number, number] = [
|
||||
countNodes(oldDoc, (n) => n.type === "image"),
|
||||
countNodes(newDoc, (n) => n.type === "image"),
|
||||
];
|
||||
const links: [number, number] = [
|
||||
countUniqueLinks(oldDoc),
|
||||
countUniqueLinks(newDoc),
|
||||
];
|
||||
const tables: [number, number] = [
|
||||
countNodes(oldDoc, (n) => n.type === "table"),
|
||||
countNodes(newDoc, (n) => n.type === "table"),
|
||||
];
|
||||
const callouts: [number, number] = [
|
||||
countNodes(oldDoc, (n) => n.type === "callout"),
|
||||
countNodes(newDoc, (n) => n.type === "callout"),
|
||||
];
|
||||
const fns: [number[], number[]] = [
|
||||
footnoteMarkers(oldDoc, notesHeading),
|
||||
footnoteMarkers(newDoc, notesHeading),
|
||||
];
|
||||
return { images, links, tables, callouts, footnoteMarkers: fns };
|
||||
}
|
||||
|
||||
/**
|
||||
* Resolve the lead text of the top-level block in a ProseMirror Node that
* contains the given document position. The position is clamped into the
* document range first; returns "" if it still cannot be resolved.
|
||||
*/
|
||||
function blockContextAt(node: Node, pos: number): string {
|
||||
try {
|
||||
const clamped = Math.max(0, Math.min(pos, node.content.size));
|
||||
const $pos = node.resolve(clamped);
|
||||
// depth 1 is the top-level block in a doc node.
|
||||
const block = $pos.depth >= 1 ? $pos.node(1) : $pos.node(0);
|
||||
const text = block.textContent || "";
|
||||
return text.length > 80 ? text.slice(0, 77) + "..." : text;
|
||||
} catch {
|
||||
return "";
|
||||
}
|
||||
}
|
||||
|
||||
/** Truncate a string for the markdown summary. */
|
||||
function truncate(s: string, n = 120): string {
|
||||
return s.length > n ? s.slice(0, n - 3) + "..." : s;
|
||||
}
|
||||
|
||||
/**
|
||||
* Coarse fallback: a block-by-block plain-text diff. Used only when the precise
|
||||
* changeset pipeline throws, so the tool degrades gracefully instead of failing.
|
||||
*/
|
||||
function coarseDiff(oldDoc: any, newDoc: any): DiffChange[] {
|
||||
const oldBlocks: any[] = Array.isArray(oldDoc?.content) ? oldDoc.content : [];
|
||||
const newBlocks: any[] = Array.isArray(newDoc?.content) ? newDoc.content : [];
|
||||
const oldTexts = oldBlocks.map(plainText);
|
||||
const newTexts = newBlocks.map(plainText);
|
||||
const oldSet = new Set(oldTexts);
|
||||
const newSet = new Set(newTexts);
|
||||
const changes: DiffChange[] = [];
|
||||
for (const t of oldTexts) {
|
||||
if (!newSet.has(t) && t.trim() !== "") {
|
||||
changes.push({ op: "delete", block: truncate(t, 80), text: t });
|
||||
}
|
||||
}
|
||||
for (const t of newTexts) {
|
||||
if (!oldSet.has(t) && t.trim() !== "") {
|
||||
changes.push({ op: "insert", block: truncate(t, 80), text: t });
|
||||
}
|
||||
}
|
||||
return changes;
|
||||
}
|
||||
|
||||
/** Build the human-readable unified-ish markdown summary. */
|
||||
function renderMarkdown(
|
||||
result: Omit<DiffResult, "markdown">,
|
||||
fellBack: boolean,
|
||||
): string {
|
||||
const lines: string[] = [];
|
||||
const { summary, integrity, changes } = result;
|
||||
lines.push(
|
||||
`# Diff: ${summary.inserted} inserted / ${summary.deleted} deleted (${summary.blocksChanged} blocks changed)`,
|
||||
);
|
||||
if (fellBack) {
|
||||
lines.push("");
|
||||
lines.push("> note: precise diff failed; coarse block-level diff shown.");
|
||||
}
|
||||
lines.push("");
|
||||
lines.push("## Integrity (old -> new)");
|
||||
lines.push(`- images: ${integrity.images[0]} -> ${integrity.images[1]}`);
|
||||
lines.push(`- links: ${integrity.links[0]} -> ${integrity.links[1]}`);
|
||||
lines.push(`- tables: ${integrity.tables[0]} -> ${integrity.tables[1]}`);
|
||||
lines.push(`- callouts: ${integrity.callouts[0]} -> ${integrity.callouts[1]}`);
|
||||
lines.push(
|
||||
`- footnoteMarkers: [${integrity.footnoteMarkers[0].join(", ")}] -> [${integrity.footnoteMarkers[1].join(", ")}]`,
|
||||
);
|
||||
lines.push("");
|
||||
lines.push("## Changes");
|
||||
if (changes.length === 0) {
|
||||
lines.push("(no textual changes)");
|
||||
} else {
|
||||
for (const c of changes) {
|
||||
const sign = c.op === "insert" ? "+" : "-";
|
||||
const ctx = c.block ? ` @ ${truncate(c.block, 60)}` : "";
|
||||
lines.push(`${sign} ${truncate(c.text)}${ctx}`);
|
||||
}
|
||||
}
|
||||
return lines.join("\n");
|
||||
}
|
||||
|
||||
/**
|
||||
* Diff two ProseMirror JSON documents the way Docmost's history editor does and
|
||||
* serialize the result to text + integrity counts.
|
||||
*
|
||||
* @param oldDocJson the earlier document
|
||||
* @param newDocJson the later document
|
||||
* @param notesHeading heading delimiting body from notes for footnote counting
|
||||
*/
|
||||
export function diffDocs(
|
||||
oldDocJson: any,
|
||||
newDocJson: any,
|
||||
notesHeading: string = "Примечания переводчика",
|
||||
): DiffResult {
|
||||
const integrity = computeIntegrity(oldDocJson, newDocJson, notesHeading);
|
||||
|
||||
let changes: DiffChange[] = [];
|
||||
let inserted = 0;
|
||||
let deleted = 0;
|
||||
let fellBack = false;
|
||||
const changedBlocks = new Set<string>();
|
||||
|
||||
try {
|
||||
const oldNode = Node.fromJSON(schema, oldDocJson);
|
||||
const newNode = Node.fromJSON(schema, newDocJson);
|
||||
const tr = recreateTransform(oldNode, newNode, {
|
||||
complexSteps: false,
|
||||
wordDiffs: true,
|
||||
simplifyDiff: true,
|
||||
});
|
||||
const changeSet = ChangeSet.create(oldNode).addSteps(
|
||||
tr.doc,
|
||||
tr.mapping.maps,
|
||||
[],
|
||||
);
|
||||
const simplified = simplifyChanges(changeSet.changes, newNode);
|
||||
|
||||
for (const change of simplified) {
|
||||
// Deleted text lives in the OLD doc coordinate range [fromA, toA).
|
||||
if (change.toA > change.fromA) {
|
||||
const text = oldNode.textBetween(change.fromA, change.toA, "\n", " ");
|
||||
if (text.length > 0) {
|
||||
deleted += text.length;
|
||||
const block = blockContextAt(oldNode, change.fromA);
|
||||
changes.push({ op: "delete", block, text });
|
||||
if (block) changedBlocks.add("d:" + block);
|
||||
}
|
||||
}
|
||||
// Inserted text lives in the NEW doc coordinate range [fromB, toB).
|
||||
if (change.toB > change.fromB) {
|
||||
const text = newNode.textBetween(change.fromB, change.toB, "\n", " ");
|
||||
if (text.length > 0) {
|
||||
inserted += text.length;
|
||||
const block = blockContextAt(newNode, change.fromB);
|
||||
changes.push({ op: "insert", block, text });
|
||||
if (block) changedBlocks.add("i:" + block);
|
||||
}
|
||||
}
|
||||
}
|
||||
} catch {
|
||||
// Pathological pair: degrade to a coarse block-level diff so we never throw.
|
||||
fellBack = true;
|
||||
changes = coarseDiff(oldDocJson, newDocJson);
|
||||
for (const c of changes) {
|
||||
if (c.op === "insert") inserted += c.text.length;
|
||||
else deleted += c.text.length;
|
||||
if (c.block) changedBlocks.add(c.op[0] + ":" + c.block);
|
||||
}
|
||||
}
|
||||
|
||||
const partial: Omit<DiffResult, "markdown"> = {
|
||||
summary: { inserted, deleted, blocksChanged: changedBlocks.size },
|
||||
integrity,
|
||||
changes,
|
||||
};
|
||||
return { ...partial, markdown: renderMarkdown(partial, fellBack) };
|
||||
}
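Reviewer sketch: the `[old, new]` integrity tuples returned by `diffDocs` lend themselves to a quick "did this edit lose anything?" check. The helper below is hypothetical (`integrityRegressions` is not part of this package), and the `DiffIntegrity` interface is copied locally only so the snippet runs on its own. It assumes that a lower count in the new document, or a footnote marker that disappears, is worth flagging.

```typescript
// Local mirror of the DiffIntegrity shape from the diff above, copied so the
// sketch is self-contained.
interface DiffIntegrity {
  images: [number, number];
  links: [number, number];
  tables: [number, number];
  callouts: [number, number];
  footnoteMarkers: [number[], number[]];
}

// Hypothetical helper (an assumption, not part of @docmost/mcp): describe
// every count that dropped and every footnote marker that went missing.
function integrityRegressions(i: DiffIntegrity): string[] {
  const problems: string[] = [];
  const counts: Array<[string, [number, number]]> = [
    ["images", i.images],
    ["links", i.links],
    ["tables", i.tables],
    ["callouts", i.callouts],
  ];
  for (const [name, [before, after]] of counts) {
    if (after < before) problems.push(`${name}: ${before} -> ${after}`);
  }
  // Compare markers as multisets: each old marker must still appear once.
  const [oldMarkers, newMarkers] = i.footnoteMarkers;
  const remaining = [...newMarkers];
  const missing: number[] = [];
  for (const m of oldMarkers) {
    const at = remaining.indexOf(m);
    if (at === -1) missing.push(m);
    else remaining.splice(at, 1);
  }
  if (missing.length > 0) {
    problems.push(`footnoteMarkers lost: ${missing.join(", ")}`);
  }
  return problems;
}

const sample: DiffIntegrity = {
  images: [2, 2],
  links: [5, 4],
  tables: [1, 1],
  callouts: [0, 1],
  footnoteMarkers: [[1, 2, 3], [1, 3]],
};
const regressions = integrityRegressions(sample);
console.log(regressions);
```

Added elements (the extra callout here) are deliberately not reported; a stricter caller could compare the tuples for equality instead.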
|
||||
1090
packages/mcp/src/lib/docmost-schema.ts
Normal file
File diff suppressed because it is too large
93
packages/mcp/src/lib/filters.ts
Normal file
@@ -0,0 +1,93 @@
|
||||
/**
|
||||
* Filter functions to extract only relevant information from API responses
|
||||
* for better agent consumption
|
||||
*/
|
||||
|
||||
export function filterWorkspace(data: any) {
|
||||
return {
|
||||
id: data.id,
|
||||
name: data.name,
|
||||
description: data.description,
|
||||
defaultSpaceId: data.defaultSpaceId,
|
||||
createdAt: data.createdAt,
|
||||
updatedAt: data.updatedAt,
|
||||
deletedAt: data.deletedAt,
|
||||
};
|
||||
}
|
||||
|
||||
export function filterSpace(space: any) {
|
||||
return {
|
||||
id: space.id,
|
||||
name: space.name,
|
||||
description: space.description,
|
||||
slug: space.slug,
|
||||
visibility: space.visibility,
|
||||
createdAt: space.createdAt,
|
||||
updatedAt: space.updatedAt,
|
||||
deletedAt: space.deletedAt,
|
||||
};
|
||||
}
|
||||
|
||||
export function filterGroup(group: any) {
|
||||
return {
|
||||
id: group.id,
|
||||
name: group.name,
|
||||
description: group.description,
|
||||
workspaceId: group.workspaceId,
|
||||
createdAt: group.createdAt,
|
||||
updatedAt: group.updatedAt,
|
||||
deletedAt: group.deletedAt,
|
||||
};
|
||||
}
|
||||
|
||||
export function filterPage(page: any, content?: string, subpages?: any[]) {
|
||||
return {
|
||||
id: page.id,
|
||||
slugId: page.slugId,
|
||||
title: page.title,
|
||||
parentPageId: page.parentPageId,
|
||||
spaceId: page.spaceId,
|
||||
isLocked: page.isLocked,
|
||||
createdAt: page.createdAt,
|
||||
updatedAt: page.updatedAt,
|
||||
deletedAt: page.deletedAt,
|
||||
// Include converted markdown content if valid string (even empty)
|
||||
...(typeof content === "string" && { content }),
|
||||
// Include subpages if provided
|
||||
...(subpages &&
|
||||
subpages.length > 0 && {
|
||||
subpages: subpages.map((p) => ({ id: p.id, title: p.title })),
|
||||
}),
|
||||
};
|
||||
}
|
||||
|
||||
export function filterComment(comment: any, markdownContent?: string) {
|
||||
return {
|
||||
id: comment.id,
|
||||
pageId: comment.pageId,
|
||||
content: markdownContent ?? comment.content,
|
||||
selection: comment.selection || null,
|
||||
type: comment.type || "page",
|
||||
parentCommentId: comment.parentCommentId || null,
|
||||
creatorId: comment.creatorId,
|
||||
creatorName: comment.creator?.name || null,
|
||||
createdAt: comment.createdAt,
|
||||
editedAt: comment.editedAt || null,
|
||||
resolvedAt: comment.resolvedAt || null,
|
||||
resolvedById: comment.resolvedById || null,
|
||||
};
|
||||
}
|
||||
|
||||
export function filterSearchResult(result: any) {
|
||||
return {
|
||||
id: result.id,
|
||||
title: result.title,
|
||||
parentPageId: result.parentPageId,
|
||||
createdAt: result.createdAt,
|
||||
updatedAt: result.updatedAt,
|
||||
rank: result.rank,
|
||||
highlight: result.highlight,
|
||||
spaceId: result.space?.id,
|
||||
spaceName: result.space?.name,
|
||||
};
|
||||
}
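Reviewer sketch: `filterPage` relies on conditional object spread, which is easy to misread. The snippet below isolates that idiom in a simplified stand-in (`pickPage` is a hypothetical name, not a function in this package) to show that an empty-string `content` is kept while an omitted `content` and an empty `subpages` array are left out of the result.

```typescript
// Simplified stand-in for filterPage: only the conditional-spread logic.
function pickPage(
  page: { id: string; title: string },
  content?: string,
  subpages?: Array<{ id: string; title: string; spaceId?: string }>,
) {
  return {
    id: page.id,
    title: page.title,
    // Spreading `false` adds nothing; spreading an object adds its keys.
    ...(typeof content === "string" && { content }),
    ...(subpages &&
      subpages.length > 0 && {
        subpages: subpages.map((p) => ({ id: p.id, title: p.title })),
      }),
  };
}

const page = { id: "p1", title: "Intro" };
const withEmpty = pickPage(page, ""); // empty string is still a string
const withoutContent = pickPage(page); // content omitted entirely
const withNoSubpages = pickPage(page, "text", []); // empty list is dropped

console.log(
  "content" in withEmpty,
  "content" in withoutContent,
  "subpages" in withNoSubpages,
);
```

The `typeof content === "string"` guard is what distinguishes "page has no text" from "content was not requested"; a plain truthiness check would conflate the two.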
|
||||
127
packages/mcp/src/lib/json-edit.ts
Normal file
@@ -0,0 +1,127 @@
|
||||
/**
|
||||
* Surgical text edits on a ProseMirror document without re-importing it.
|
||||
*
|
||||
* Each edit replaces an exact substring inside individual text nodes,
|
||||
* preserving every node id, mark and attribute around it. This is the
|
||||
* safe alternative to a full markdown re-import for small wording fixes.
|
||||
*/
|
||||
|
||||
export interface TextEdit {
|
||||
find: string;
|
||||
replace: string;
|
||||
/** Replace every occurrence; otherwise the edit must match exactly once. */
|
||||
replaceAll?: boolean;
|
||||
}
|
||||
|
||||
export interface TextEditResult {
|
||||
find: string;
|
||||
replacements: number;
|
||||
}
|
||||
|
||||
/** Collect plain text of the whole document (for span-detection hints). */
|
||||
function collectText(node: any): string {
|
||||
let out = "";
|
||||
if (node.type === "text") out += node.text || "";
|
||||
for (const child of node.content || []) out += collectText(child);
|
||||
return out;
|
||||
}
|
||||
|
||||
function countOccurrences(haystack: string, needle: string): number {
|
||||
if (!needle) return 0;
|
||||
let count = 0;
|
||||
let idx = haystack.indexOf(needle);
|
||||
while (idx !== -1) {
|
||||
count++;
|
||||
idx = haystack.indexOf(needle, idx + needle.length);
|
||||
}
|
||||
return count;
|
||||
}
|
||||
|
||||
/**
|
||||
* Apply text edits to a ProseMirror doc (mutates a deep copy, returns it).
|
||||
* Throws a descriptive error when an edit matches zero times or matches
|
||||
* multiple times without replaceAll — so the caller can refine `find`.
|
||||
*/
|
||||
export function applyTextEdits(
|
||||
doc: any,
|
||||
edits: TextEdit[],
|
||||
): { doc: any; results: TextEditResult[] } {
|
||||
const copy = JSON.parse(JSON.stringify(doc));
|
||||
const results: TextEditResult[] = [];
|
||||
|
||||
for (const edit of edits) {
|
||||
if (!edit.find) throw new Error("edit.find must be a non-empty string");
|
||||
|
||||
// Count matches inside individual text nodes first.
|
||||
let nodeMatches = 0;
|
||||
(function count(node: any) {
|
||||
if (node.type === "text" && node.text) {
|
||||
nodeMatches += countOccurrences(node.text, edit.find);
|
||||
}
|
||||
for (const child of node.content || []) count(child);
|
||||
})(copy);
|
||||
|
||||
if (nodeMatches === 0) {
|
||||
// Distinguish "text not present" from "text spans formatting runs".
|
||||
const fullText = collectText(copy);
|
||||
if (fullText.includes(edit.find)) {
|
||||
throw new Error(
|
||||
`Edit "${truncate(edit.find)}": the text exists in the document but spans ` +
|
||||
`multiple formatting runs (bold/link/italic boundaries). Use a shorter ` +
|
||||
`fragment that stays inside one run, or use update_page_json for ` +
|
||||
`structural changes.`,
|
||||
);
|
||||
}
|
||||
throw new Error(
|
||||
`Edit "${truncate(edit.find)}": text not found in the document.`,
|
||||
);
|
||||
}
|
||||
|
||||
if (nodeMatches > 1 && !edit.replaceAll) {
|
||||
throw new Error(
|
||||
`Edit "${truncate(edit.find)}": matches ${nodeMatches} times. ` +
|
||||
`Provide a longer, unique fragment or set replaceAll: true.`,
|
||||
);
|
||||
}
|
||||
|
||||
// Perform the replacement(s).
|
||||
let done = 0;
|
||||
(function replace(node: any) {
|
||||
if (node.type === "text" && node.text && node.text.includes(edit.find)) {
|
||||
if (edit.replaceAll) {
|
||||
done += countOccurrences(node.text, edit.find);
|
||||
node.text = node.text.split(edit.find).join(edit.replace);
|
||||
} else if (done === 0) {
|
||||
// Avoid String.replace: its second arg treats $&, $1, $`, $', $$ as
|
||||
// special patterns, expanding them instead of inserting literally.
|
||||
// Splice the first occurrence by index to keep the replacement literal.
|
||||
const idx = node.text.indexOf(edit.find);
|
||||
node.text =
|
||||
node.text.slice(0, idx) +
|
||||
edit.replace +
|
||||
node.text.slice(idx + edit.find.length);
|
||||
done = 1;
|
||||
}
|
||||
}
|
||||
for (const child of node.content || []) replace(child);
|
||||
})(copy);
|
||||
|
||||
results.push({ find: edit.find, replacements: done });
|
||||
}
|
||||
|
||||
// Drop text nodes that became empty (ProseMirror forbids empty text nodes).
|
||||
(function prune(node: any) {
|
||||
if (Array.isArray(node.content)) {
|
||||
node.content = node.content.filter(
|
||||
(child: any) => !(child.type === "text" && child.text === ""),
|
||||
);
|
||||
for (const child of node.content) prune(child);
|
||||
}
|
||||
})(copy);
|
||||
|
||||
return { doc: copy, results };
|
||||
}
|
||||
|
||||
function truncate(s: string): string {
|
||||
return s.length > 60 ? s.slice(0, 57) + "..." : s;
|
||||
}
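Reviewer sketch: the comment in `applyTextEdits` about avoiding `String.replace` can be demonstrated in isolation. The helper name `replaceFirstLiteral` below is hypothetical; it only mirrors the index-splice approach used in the diff above, so a replacement containing `$&` is inserted verbatim.

```typescript
// Replace the first occurrence of `find` without interpreting `$` patterns
// in the replacement (same index-splice idea as in applyTextEdits).
function replaceFirstLiteral(
  text: string,
  find: string,
  replacement: string,
): string {
  const idx = text.indexOf(find);
  if (idx === -1) return text;
  return text.slice(0, idx) + replacement + text.slice(idx + find.length);
}

const source = "Price: X";

// String.replace expands "$&" to the matched substring, even for a string
// (non-regex) pattern.
const viaReplace = source.replace("X", "$&5");

// The splice keeps the replacement text literal.
const viaSplice = replaceFirstLiteral(source, "X", "$&5");

// split/join, used for replaceAll in the diff, is literal as well.
const viaSplitJoin = "a-a".split("a").join("$1");

console.log(viaReplace); // Price: X5
console.log(viaSplice); // Price: $&5
console.log(viaSplitJoin); // $1-$1
```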
|
||||
861
packages/mcp/src/lib/markdown-converter.ts
Normal file
@@ -0,0 +1,861 @@
|
||||
/**
|
||||
* Convert ProseMirror/TipTap JSON content to Markdown
|
||||
* Supports all Docmost-specific node types and extensions
|
||||
*/
|
||||
export function convertProseMirrorToMarkdown(content: any): string {
|
||||
if (!content || !content.content) return "";
|
||||
|
||||
// Escape a value interpolated into a double-quoted HTML attribute value
// (textAlign, colors, image src, math `text`, all data-* attrs, etc.). Inside
// a double-quoted attribute only the delimiting quote and the ampersand that
// starts an entity are special, so escape ONLY & and ".
//
// Deliberately NOT escaped:
// - < and >: the HTML re-parser (parse5/jsdom via @tiptap/html) does NOT
// decode &lt; / &gt; back inside attribute values, so escaping them would
// corrupt the stored data (e.g. a math node's LaTeX `a < b`) and ACCUMULATE
// escapes on every round-trip (`a < b` -> `a &lt; b` -> `a &amp;lt; b`).
// - ': the value is always wrapped in double quotes, so ' is not a delimiter,
// and an escaped apostrophe is likewise not decoded back, so escaping it
// would corrupt the value and accumulate &amp; in the same way.
//
// Escaping & and " keeps the value inert against attribute injection while
// staying idempotent (parse5 decodes both back on re-import).
const escapeAttr = (value: unknown): string =>
String(value)
.replace(/&/g, "&amp;")
.replace(/"/g, "&quot;");
|
||||
|
||||
// Escape a value placed as HTML element TEXT content (between tags), where
|
||||
// <, >, and & are all significant. Used for text rendered inside raw-HTML
|
||||
// blocks (table cells / columns) so stored characters cannot inject markup.
|
||||
const escapeHtmlText = (value: unknown): string =>
|
||||
String(value)
|
||||
.replace(/&/g, "&amp;")
.replace(/</g, "&lt;")
.replace(/>/g, "&gt;");
|
||||
|
||||
// Percent-encode characters that would break out of a markdown URL target
|
||||
// (...) — whitespace/newlines and parentheses — so a stored src stays a
|
||||
// single inert token (used for image/video/youtube srcs).
|
||||
const encodeMdUrl = (value: unknown): string =>
|
||||
String(value || "")
|
||||
.replace(/\s/g, (c: string) => (c === " " ? "%20" : encodeURIComponent(c)))
|
||||
.replace(/\(/g, "%28")
|
||||
.replace(/\)/g, "%29");
|
||||
|
||||
const processNode = (node: any): string => {
|
||||
const type = node.type;
|
||||
const nodeContent = node.content || [];
|
||||
|
||||
switch (type) {
|
||||
case "doc":
|
||||
return nodeContent.map(processNode).join("\n\n");
|
||||
|
||||
case "paragraph":
|
||||
const text = nodeContent.map(processNode).join("");
|
||||
const align = node.attrs?.textAlign;
|
||||
if (align && align !== "left") {
|
||||
return `<div align="${escapeAttr(align)}">${text}</div>`;
|
||||
}
|
||||
return text || "";
|
||||
|
||||
case "heading":
|
||||
const level = node.attrs?.level || 1;
|
||||
const headingText = nodeContent.map(processNode).join("");
|
||||
return "#".repeat(level) + " " + headingText;
|
||||
|
||||
case "text":
|
||||
let textContent = node.text || "";
|
||||
// Apply marks (bold, italic, code, etc.)
|
||||
if (node.marks) {
|
||||
// Markdown code spans (`...`) cannot carry inner formatting, so when a
|
||||
// run has the `code` mark alongside ANY other mark, backtick syntax
|
||||
// would leak literal ** / []() into the code text. In that case emit
|
||||
// nested HTML (<code> innermost, the other marks wrapping it as HTML)
|
||||
// so the output is at least well-formed and re-parseable.
|
||||
//
|
||||
// NOTE: this does NOT round-trip both marks. The schema's `code` mark
|
||||
// has `excludes: "_"` (it excludes every other mark), so on import the
|
||||
// co-occurring mark is always dropped — the run comes back as `code`
|
||||
// only. We keep the emission simple and accept that the other mark is
|
||||
// lost; preserving both is impossible while `code` excludes them.
|
||||
// Only use the backtick form when `code` is the sole mark.
|
||||
const markTypes = node.marks.map((m: any) => m.type);
|
||||
const hasCode = markTypes.includes("code");
|
||||
const codeCombined = hasCode && markTypes.length > 1;
|
||||
for (const mark of node.marks) {
|
||||
switch (mark.type) {
|
||||
case "bold":
|
||||
textContent = codeCombined
|
||||
? `<strong>${textContent}</strong>`
|
||||
: `**${textContent}**`;
|
||||
break;
|
||||
case "italic":
|
||||
textContent = codeCombined
|
||||
? `<em>${textContent}</em>`
|
||||
: `*${textContent}*`;
|
||||
break;
|
||||
case "code":
|
||||
// When combined with another mark, wrap as <code> so the
|
||||
// surrounding HTML marks can nest around it; otherwise use the
|
||||
// plain backtick span.
|
||||
textContent = codeCombined
|
||||
? `<code>${textContent}</code>`
|
||||
: `\`${textContent}\``;
|
||||
break;
|
||||
case "link": {
|
||||
const href = mark.attrs?.href || "";
|
||||
const title = mark.attrs?.title;
|
||||
if (codeCombined) {
|
||||
// Emit an HTML anchor so it can wrap the nested <code>.
|
||||
const safeHref = escapeAttr(href);
|
||||
if (title) {
|
||||
textContent = `<a href="${safeHref}" title="${escapeAttr(String(title))}">${textContent}</a>`;
|
||||
} else {
|
||||
textContent = `<a href="${safeHref}">${textContent}</a>`;
|
||||
}
|
||||
} else if (title) {
|
||||
// Emit the optional markdown link title; escape an embedded
|
||||
// double-quote so it cannot terminate the title string early.
|
||||
const safeTitle = String(title).replace(/"/g, '\\"');
|
||||
textContent = `[${textContent}](${href} "${safeTitle}")`;
|
||||
} else {
|
||||
textContent = `[${textContent}](${href})`;
|
||||
}
|
||||
break;
|
||||
}
|
||||
case "strike":
|
||||
textContent = codeCombined
|
||||
? `<s>${textContent}</s>`
|
||||
: `~~${textContent}~~`;
|
||||
break;
|
||||
case "underline":
|
||||
textContent = `<u>${textContent}</u>`;
|
||||
break;
|
||||
case "subscript":
|
||||
textContent = `<sub>${textContent}</sub>`;
|
||||
break;
|
||||
case "superscript":
|
||||
textContent = `<sup>${textContent}</sup>`;
|
||||
break;
|
||||
case "highlight": {
|
||||
// Preserve a null/empty color as a plain highlight (a bare
|
||||
// <mark> with no background-color); only emit the style when a
|
||||
// color is actually set, so a plain highlight is not forced to
|
||||
// yellow on export.
|
||||
const color = mark.attrs?.color;
|
||||
textContent = color
|
||||
? `<mark style="background-color: ${escapeAttr(color)}">${textContent}</mark>`
|
||||
: `<mark>${textContent}</mark>`;
|
||||
break;
|
||||
}
|
||||
case "textStyle":
|
||||
if (mark.attrs?.color) {
|
||||
textContent = `<span style="color: ${escapeAttr(mark.attrs.color)}">${textContent}</span>`;
|
||||
}
|
||||
break;
|
||||
case "comment": {
|
||||
// Emit the inline comment anchor so highlights round-trip. The
|
||||
// schema's Comment mark parses span[data-comment-id] (attrs
|
||||
// commentId/resolved).
|
||||
const cid = mark.attrs?.commentId;
|
||||
if (cid) {
|
||||
const resolvedAttr = mark.attrs?.resolved
|
||||
? ` data-resolved="true"`
|
||||
: "";
|
||||
textContent = `<span data-comment-id="${escapeAttr(cid)}"${resolvedAttr}>${textContent}</span>`;
|
||||
}
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
return textContent;
|
||||
|
||||
case "codeBlock":
|
||||
const language = node.attrs?.language || "";
|
||||
// Strip ALL trailing newlines so the export is idempotent: marked
|
||||
// re-adds exactly one trailing "\n" on import, so trimming only one
|
||||
// here would let the text grow by "\n" on each round-trip. Removing
|
||||
// every trailing newline makes repeated cycles stable.
|
||||
const code = nodeContent
|
||||
.map(processNode)
|
||||
.join("")
|
||||
.replace(/\n+$/, "");
|
||||
return "```" + language + "\n" + code + "\n```";
|
||||
|
||||
case "bulletList":
|
||||
return nodeContent
|
||||
.map((item: any) => processListItem(item, "-"))
|
||||
.join("\n");
|
||||
|
||||
case "orderedList":
|
||||
return nodeContent
|
||||
.map((item: any, index: number) =>
|
||||
processListItem(item, `${index + 1}.`),
|
||||
)
|
||||
.join("\n");
|
||||
|
||||
case "taskList":
|
||||
return nodeContent.map((item: any) => processTaskItem(item)).join("\n");
|
||||
|
||||
case "taskItem":
|
||||
// Delegate to the same helper used by taskList so multi-block and
|
||||
// nested task items render and indent consistently.
|
||||
return processTaskItem(node);
|
||||
|
||||
case "listItem":
|
||||
return nodeContent.map(processNode).join("\n");
|
||||
|
||||
case "blockquote":
|
||||
// Prefix EVERY line of EVERY child with "> " and separate block-level
|
||||
// children with a blank ">" line so code blocks / multi-paragraph
|
||||
// quotes round-trip correctly.
|
||||
return nodeContent
|
||||
.map((n: any) =>
|
||||
processNode(n)
|
||||
.split("\n")
|
||||
.map((line: string) => (line.length ? `> ${line}` : ">"))
|
||||
.join("\n"),
|
||||
)
|
||||
.join("\n>\n");
|
||||
|
||||
case "horizontalRule":
|
||||
return "---";
|
||||
|
||||
case "hardBreak":
|
||||
// Two trailing spaces before the newline encode a markdown hard break;
|
||||
// a bare "\n" would be reimported as a soft break and lost.
|
||||
return " \n";
|
||||
|
||||
case "image":
|
||||
const imgAlt = node.attrs?.alt || "";
|
||||
// Neutralize characters that could break out of the markdown image
|
||||
// URL: spaces/newlines and parentheses would terminate the (...) target
|
||||
// and let a stored src inject following markdown/HTML. Percent-encode
|
||||
// them so the URL stays a single inert token.
|
||||
const imgSrc = encodeMdUrl(node.attrs?.src);
|
||||
// No "caption" attribute exists in the Docmost image schema, so we do
|
||||
// not emit one (the previous caption branch was dead).
|
||||
return `![${imgAlt}](${imgSrc})`;
|
||||
|
||||
case "video": {
|
||||
// Emit the schema-matching <video> element so generateJSON rebuilds the
|
||||
// node with its attrs intact. The schema's parseHTML reads src/aria-label
|
||||
// from the standard attributes and the remaining attrs from data-*.
|
||||
const attrs = node.attrs || {};
|
||||
const parts: string[] = [`src="${escapeAttr(attrs.src ?? "")}"`];
|
||||
if (attrs.alt) parts.push(`aria-label="${escapeAttr(attrs.alt)}"`);
|
||||
if (attrs.attachmentId)
|
||||
parts.push(
|
||||
`data-attachment-id="${escapeAttr(attrs.attachmentId)}"`,
|
||||
);
|
||||
if (attrs.width != null)
|
||||
parts.push(`width="${escapeAttr(attrs.width)}"`);
|
||||
if (attrs.height != null)
|
||||
parts.push(`height="${escapeAttr(attrs.height)}"`);
|
||||
if (attrs.size != null)
|
||||
parts.push(`data-size="${escapeAttr(attrs.size)}"`);
|
||||
if (attrs.align)
|
||||
parts.push(`data-align="${escapeAttr(attrs.align)}"`);
|
||||
if (attrs.aspectRatio != null)
|
||||
parts.push(`data-aspect-ratio="${escapeAttr(attrs.aspectRatio)}"`);
|
||||
// Wrap in a block <div> so marked treats it as a block (a bare <video>
|
||||
// is inline-level HTML and marked wraps it in <p>, leaving a spurious
|
||||
// empty paragraph beside the hoisted block atom). The wrapper has no
|
||||
// data-type, so the schema parser ignores it and just hoists the video.
|
||||
return `<div><video ${parts.join(" ")}></video></div>`;
|
||||
}
|
||||
|
||||
case "youtube": {
|
||||
// Emit the schema-matching div[data-type="youtube"]; the schema reads
|
||||
// src from data-src and width/height/align from data-* attributes.
|
||||
const attrs = node.attrs || {};
|
||||
const parts: string[] = [
|
||||
`data-type="youtube"`,
|
||||
`data-src="${escapeAttr(attrs.src ?? "")}"`,
|
||||
];
|
||||
if (attrs.width != null)
|
||||
parts.push(`data-width="${escapeAttr(attrs.width)}"`);
|
||||
if (attrs.height != null)
|
||||
parts.push(`data-height="${escapeAttr(attrs.height)}"`);
|
||||
if (attrs.align)
|
||||
parts.push(`data-align="${escapeAttr(attrs.align)}"`);
|
||||
return `<div ${parts.join(" ")}></div>`;
|
||||
}
|
||||
|
||||
case "table": {
|
||||
// A GFM pipe table cannot represent merged cells. If ANY cell carries
|
||||
// colspan>1 or rowspan>1, a pipe table would corrupt the grid on
|
||||
// re-import, so emit the WHOLE table as raw HTML <table> instead: the
|
||||
// schema's table family parseHTML (tag table/tr/td/th, with colspan/
|
||||
// rowspan read from the same-named HTML attrs and align via parseHTML)
|
||||
// round-trips it faithfully. Otherwise keep the lighter GFM pipe table.
|
||||
const tableRows: any[] = nodeContent;
|
||||
if (tableRows.length === 0) return "";
|
||||
const hasSpan = tableRows.some((row: any) =>
|
||||
(row.content || []).some(
|
||||
(cell: any) =>
|
||||
(cell.attrs?.colspan ?? 1) > 1 || (cell.attrs?.rowspan ?? 1) > 1,
|
||||
),
|
||||
);
|
||||
|
||||
if (hasSpan) {
|
||||
// Render each cell's block children to HTML (marked does NOT parse
|
||||
// markdown inside a raw HTML block, so emitting markdown here would
|
||||
// leak literal ** / `` into the cell). blockToHtml mirrors the schema
|
||||
// HTML so inner formatting re-parses into the right marks/nodes.
|
||||
const renderHtmlCell = (cell: any): string => {
|
||||
const tag = cell.type === "tableHeader" ? "th" : "td";
|
||||
const a = cell.attrs || {};
|
||||
const cellParts: string[] = [];
|
||||
if ((a.colspan ?? 1) > 1)
|
||||
cellParts.push(`colspan="${escapeAttr(a.colspan)}"`);
|
||||
if ((a.rowspan ?? 1) > 1)
|
||||
cellParts.push(`rowspan="${escapeAttr(a.rowspan)}"`);
|
||||
if (a.align) cellParts.push(`align="${escapeAttr(a.align)}"`);
|
||||
const open = cellParts.length
|
||||
? `<${tag} ${cellParts.join(" ")}>`
|
||||
: `<${tag}>`;
|
||||
const inner = (cell.content || [])
|
||||
.map((block: any) => blockToHtml(block))
|
||||
.join("");
|
||||
return `${open}${inner}</${tag}>`;
|
||||
};
|
||||
const htmlRows = tableRows
|
||||
.map(
|
||||
(row: any) =>
|
||||
`<tr>${(row.content || []).map(renderHtmlCell).join("")}</tr>`,
|
||||
)
|
||||
.join("");
|
||||
return `<table><tbody>${htmlRows}</tbody></table>`;
|
||||
}
|
||||
|
||||
// No merged cells: emit a GFM table (header row + separator) so the
|
||||
// markdown can be parsed back into a table on re-import.
|
||||
const rows = tableRows.map(processNode);
|
||||
const headerCells = tableRows[0]?.content || [];
|
||||
const columns = headerCells.length || 1;
|
||||
// Derive alignment markers (:--, :-:, --:) from each header cell.
|
||||
const markers = Array.from({ length: columns }, (_, i) => {
|
||||
const align = headerCells[i]?.attrs?.align;
|
||||
switch (align) {
|
||||
case "left":
|
||||
return ":--";
|
||||
case "center":
|
||||
return ":-:";
|
||||
case "right":
|
||||
return "--:";
|
||||
default:
|
||||
return "---";
|
||||
}
|
||||
});
|
||||
const separator = "| " + markers.join(" | ") + " |";
|
||||
return [rows[0], separator, ...rows.slice(1)].join("\n");
|
||||
}
|
||||
|
||||
case "tableRow":
|
||||
return "| " + nodeContent.map(processNode).join(" | ") + " |";
|
||||
|
||||
case "tableCell":
|
||||
case "tableHeader": {
|
||||
// Join multiple block children with a space (not "") so adjacent blocks
|
||||
// like a paragraph followed by a list don't collide into "line1- a".
|
||||
// Then collapse newlines and escape pipes so a cell containing "|" or a
|
||||
// line break cannot corrupt the surrounding GFM row.
|
||||
return nodeContent
|
||||
.map(processNode)
|
||||
.join(" ")
|
||||
.replace(/\r?\n/g, " ")
|
||||
.replace(/\|/g, "\\|");
|
||||
}
|
||||
|
||||
case "callout":
|
||||
const calloutType = node.attrs?.type || "info";
|
||||
const calloutContent = nodeContent.map(processNode).join("\n");
|
||||
return `:::${calloutType.toLowerCase()}\n${calloutContent}\n:::`;
|
||||
|
||||
case "details":
|
||||
return nodeContent.map(processNode).join("\n");
|
||||
|
||||
case "detailsSummary":
|
||||
const summaryText = nodeContent.map(processNode).join("");
|
||||
return `<details>\n<summary>${summaryText}</summary>\n`;
|
||||
|
||||
case "detailsContent":
|
||||
const detailsText = nodeContent.map(processNode).join("\n");
|
||||
return `${detailsText}\n</details>`;
|
||||
|
||||
case "mathInline": {
|
||||
// The schema's `text` attribute has no parseHTML, so TipTap's default
|
||||
// parser reads it from the `text` HTML attribute (NOT the element's text
|
||||
// content). Emit span[data-type="mathInline"] carrying the LaTeX in a
|
||||
// `text="..."` attribute so it round-trips. marked cannot parse $...$
|
||||
// back, so the previous form was lossy.
|
||||
const inlineMath = node.attrs?.text || "";
|
||||
return `<span data-type="mathInline" data-katex="true" text="${escapeAttr(inlineMath)}"></span>`;
|
||||
}
|
||||
|
||||
case "mathBlock": {
|
||||
// Same as mathInline: the LaTeX must ride in the `text` HTML attribute
|
||||
// for the schema's default parser to recover it.
|
||||
const blockMath = node.attrs?.text || "";
|
||||
return `<div data-type="mathBlock" data-katex="true" text="${escapeAttr(blockMath)}"></div>`;
|
||||
}
|
||||
|
||||
case "mention": {
|
||||
// Emit span[data-type="mention"] with the schema's data-* attributes so
|
||||
// generateJSON rebuilds the mention node instead of leaving "@label"
|
||||
// plain text that cannot re-parse.
|
||||
const attrs = node.attrs || {};
|
||||
const parts: string[] = [`data-type="mention"`];
|
||||
if (attrs.id) parts.push(`data-id="${escapeAttr(attrs.id)}"`);
|
||||
if (attrs.label)
|
||||
parts.push(`data-label="${escapeAttr(attrs.label)}"`);
|
||||
if (attrs.entityType)
|
||||
parts.push(`data-entity-type="${escapeAttr(attrs.entityType)}"`);
|
||||
if (attrs.entityId)
|
||||
parts.push(`data-entity-id="${escapeAttr(attrs.entityId)}"`);
|
||||
if (attrs.slugId)
|
||||
parts.push(`data-slug-id="${escapeAttr(attrs.slugId)}"`);
|
||||
if (attrs.creatorId)
|
||||
parts.push(`data-creator-id="${escapeAttr(attrs.creatorId)}"`);
|
||||
if (attrs.anchorId)
|
||||
parts.push(`data-anchor-id="${escapeAttr(attrs.anchorId)}"`);
|
||||
// Keep the label as visible text content too; the schema reads attrs
|
||||
// from data-*, so the inner text is purely cosmetic and harmless.
|
||||
const mentionLabel = attrs.label || attrs.id || "";
|
||||
// The label is visible element TEXT content here (the data-* attrs above
|
||||
// carry the real values), so escape it for the text context, not attrs.
|
||||
return `<span ${parts.join(" ")}>@${escapeHtmlText(mentionLabel)}</span>`;
|
||||
}
|
||||
|
||||
case "attachment": {
|
||||
// BUG FIX: the old code read node.attrs.fileName / node.attrs.src, but
|
||||
// the schema stores name/url (plus mime/size/attachmentId). Emit the
|
||||
// schema-matching div[data-type="attachment"] with data-attachment-*
|
||||
// attrs so the node round-trips instead of degrading to a markdown link.
|
||||
const attrs = node.attrs || {};
|
||||
const parts: string[] = [
|
||||
`data-type="attachment"`,
|
||||
`data-attachment-url="${escapeAttr(attrs.url ?? "")}"`,
|
||||
];
|
||||
if (attrs.name)
|
||||
parts.push(`data-attachment-name="${escapeAttr(attrs.name)}"`);
|
||||
if (attrs.mime)
|
||||
parts.push(`data-attachment-mime="${escapeAttr(attrs.mime)}"`);
|
||||
if (attrs.size != null)
|
||||
parts.push(`data-attachment-size="${escapeAttr(attrs.size)}"`);
|
||||
if (attrs.attachmentId)
|
||||
parts.push(
|
||||
`data-attachment-id="${escapeAttr(attrs.attachmentId)}"`,
|
||||
);
|
||||
return `<div ${parts.join(" ")}></div>`;
|
||||
}
|
||||
|
||||
case "drawio":
|
||||
case "excalidraw": {
|
||||
// Emit the schema-matching div[data-type=...] carrying the diagram's
|
||||
// attrs as data-* (the schema's diagramAttributes reads src/title/alt/
|
||||
// width/height/size/aspectRatio/align/attachmentId from data-*), so the
|
||||
// diagram round-trips instead of degrading to a lossy placeholder.
|
||||
const attrs = node.attrs || {};
|
||||
const parts: string[] = [
|
||||
`data-type="${type}"`,
|
||||
`data-src="${escapeAttr(attrs.src ?? "")}"`,
|
||||
];
|
||||
if (attrs.title != null)
|
||||
parts.push(`data-title="${escapeAttr(attrs.title)}"`);
|
||||
if (attrs.alt != null) parts.push(`data-alt="${escapeAttr(attrs.alt)}"`);
|
||||
if (attrs.width != null)
|
||||
parts.push(`data-width="${escapeAttr(attrs.width)}"`);
|
||||
if (attrs.height != null)
|
||||
parts.push(`data-height="${escapeAttr(attrs.height)}"`);
|
||||
if (attrs.size != null)
|
||||
parts.push(`data-size="${escapeAttr(attrs.size)}"`);
|
||||
if (attrs.aspectRatio != null)
|
||||
parts.push(`data-aspect-ratio="${escapeAttr(attrs.aspectRatio)}"`);
|
||||
if (attrs.align)
|
||||
parts.push(`data-align="${escapeAttr(attrs.align)}"`);
|
||||
if (attrs.attachmentId)
|
||||
parts.push(
|
||||
`data-attachment-id="${escapeAttr(attrs.attachmentId)}"`,
|
||||
);
|
||||
return `<div ${parts.join(" ")}></div>`;
|
||||
}
|
||||
|
||||
case "embed": {
|
||||
// Emit the schema-matching div[data-type="embed"]; the schema reads
|
||||
// src/provider/align/width/height from data-* attributes so the node
|
||||
// (and its provider iframe info) survives the round-trip.
|
||||
const attrs = node.attrs || {};
|
||||
const parts: string[] = [
|
||||
`data-type="embed"`,
|
||||
`data-src="${escapeAttr(attrs.src ?? "")}"`,
|
||||
`data-provider="${escapeAttr(attrs.provider ?? "")}"`,
|
||||
];
|
||||
if (attrs.align)
|
||||
parts.push(`data-align="${escapeAttr(attrs.align)}"`);
|
||||
if (attrs.width != null)
|
||||
parts.push(`data-width="${escapeAttr(attrs.width)}"`);
|
||||
if (attrs.height != null)
|
||||
parts.push(`data-height="${escapeAttr(attrs.height)}"`);
|
||||
return `<div ${parts.join(" ")}></div>`;
|
||||
}
|
||||
|
||||
case "audio": {
|
||||
// Emit the schema-matching <audio> element (was emitting nothing). The
|
||||
// schema reads src from src and attachmentId/size from data-*.
|
||||
const attrs = node.attrs || {};
|
||||
const parts: string[] = [`src="${escapeAttr(attrs.src ?? "")}"`];
|
||||
if (attrs.attachmentId)
|
||||
parts.push(
|
||||
`data-attachment-id="${escapeAttr(attrs.attachmentId)}"`,
|
||||
);
|
||||
if (attrs.size != null)
|
||||
parts.push(`data-size="${escapeAttr(attrs.size)}"`);
|
||||
// Wrap in a block <div> for the same reason as video: a bare <audio> is
|
||||
// inline-level HTML that marked would wrap in <p>.
|
||||
return `<div><audio ${parts.join(" ")}></audio></div>`;
|
||||
}
|
||||
|
||||
case "pdf": {
|
||||
// Emit the schema-matching div[data-type="pdf"] (was emitting nothing).
|
||||
// The schema reads src/width/height from standard attrs and name/
|
||||
// attachmentId/size from data-*.
|
||||
const attrs = node.attrs || {};
|
||||
const parts: string[] = [
|
||||
`data-type="pdf"`,
|
||||
`src="${escapeAttr(attrs.src ?? "")}"`,
|
||||
];
|
||||
if (attrs.name) parts.push(`data-name="${escapeAttr(attrs.name)}"`);
|
||||
if (attrs.attachmentId)
|
||||
parts.push(
|
||||
`data-attachment-id="${escapeAttr(attrs.attachmentId)}"`,
|
||||
);
|
||||
if (attrs.size != null)
|
||||
parts.push(`data-size="${escapeAttr(attrs.size)}"`);
|
||||
if (attrs.width != null)
|
||||
parts.push(`width="${escapeAttr(attrs.width)}"`);
|
||||
if (attrs.height != null)
|
||||
parts.push(`height="${escapeAttr(attrs.height)}"`);
|
||||
return `<div ${parts.join(" ")}></div>`;
|
||||
}
|
||||
|
||||
case "columns": {
|
||||
// Emit the schema-matching div[data-type="columns"] wrapper so the
|
||||
// multi-column layout survives. Without a case the children were
|
||||
// concatenated with no separator and the text merged. The schema reads
|
||||
// layout from data-layout and widthMode from data-width-mode. The whole
|
||||
// block is raw HTML, so render children via blockToHtml (NOT markdown,
|
||||
// which marked would not re-parse inside a raw HTML block).
|
||||
const attrs = node.attrs || {};
|
||||
const parts: string[] = [`data-type="columns"`];
|
||||
if (attrs.layout)
|
||||
parts.push(`data-layout="${escapeAttr(attrs.layout)}"`);
|
||||
if (attrs.widthMode && attrs.widthMode !== "normal")
|
||||
parts.push(`data-width-mode="${escapeAttr(attrs.widthMode)}"`);
|
||||
const inner = nodeContent.map((n: any) => blockToHtml(n)).join("");
|
||||
return `<div ${parts.join(" ")}>${inner}</div>`;
|
||||
}
|
||||
|
||||
case "column": {
|
||||
// Emit the schema-matching div[data-type="column"]; the schema reads the
|
||||
// column width from data-width. Children are rendered as HTML so their
|
||||
// formatting survives inside this raw HTML block.
|
||||
const attrs = node.attrs || {};
|
||||
const parts: string[] = [`data-type="column"`];
|
||||
if (attrs.width)
|
||||
parts.push(`data-width="${escapeAttr(attrs.width)}"`);
|
||||
const inner = nodeContent.map((n: any) => blockToHtml(n)).join("");
|
||||
return `<div ${parts.join(" ")}>${inner}</div>`;
|
||||
}
|
||||
|
||||
case "subpages":
|
||||
return "{{SUBPAGES}}";
|
||||
|
||||
default:
|
||||
// Fallback: process children
|
||||
return nodeContent.map(processNode).join("");
|
||||
}
|
||||
};
|
||||
|
||||
// Render inline content (text runs + their marks) to HTML. Used by the raw
|
||||
// HTML fallbacks (spanned tables, columns) where marked will NOT re-parse
|
||||
// markdown, so backtick/asterisk/bracket syntax would otherwise leak as
|
||||
// literal characters. Each mark is mirrored to the HTML the schema's parseHTML
|
||||
// accepts so it re-imports as the matching ProseMirror mark.
|
||||
const inlineToHtml = (inlineNodes: any[]): string =>
|
||||
(inlineNodes || [])
|
||||
.map((n: any) => {
|
||||
if (n.type === "hardBreak") return "<br>";
|
||||
if (n.type !== "text") {
|
||||
// Inline atoms (mention, mathInline) already emit schema HTML.
|
||||
return processNode(n);
|
||||
}
|
||||
let t = escapeHtmlText(n.text || "");
|
||||
for (const mark of n.marks || []) {
|
||||
switch (mark.type) {
|
||||
case "bold":
|
||||
t = `<strong>${t}</strong>`;
|
||||
break;
|
||||
case "italic":
|
||||
t = `<em>${t}</em>`;
|
||||
break;
|
||||
case "code":
|
||||
t = `<code>${t}</code>`;
|
||||
break;
|
||||
case "strike":
|
||||
t = `<s>${t}</s>`;
|
||||
break;
|
||||
case "underline":
|
||||
t = `<u>${t}</u>`;
|
||||
break;
|
||||
case "subscript":
|
||||
t = `<sub>${t}</sub>`;
|
||||
break;
|
||||
case "superscript":
|
||||
t = `<sup>${t}</sup>`;
|
||||
break;
|
||||
case "link":
|
||||
t = `<a href="${escapeAttr(mark.attrs?.href || "")}">${t}</a>`;
|
||||
break;
|
||||
case "highlight":
|
||||
t = mark.attrs?.color
|
||||
? `<mark style="background-color: ${escapeAttr(mark.attrs.color)}">${t}</mark>`
|
||||
: `<mark>${t}</mark>`;
|
||||
break;
|
||||
case "textStyle":
|
||||
if (mark.attrs?.color)
|
||||
t = `<span style="color: ${escapeAttr(mark.attrs.color)}">${t}</span>`;
|
||||
break;
|
||||
case "comment":
|
||||
// Inline comment anchor inside a raw-HTML container (columns /
|
||||
// spanned table cells), so commented text there also round-trips.
|
||||
if (mark.attrs?.commentId) {
|
||||
const r = mark.attrs?.resolved ? ` data-resolved="true"` : "";
|
||||
t = `<span data-comment-id="${escapeAttr(mark.attrs.commentId)}"${r}>${t}</span>`;
|
||||
}
|
||||
break;
|
||||
}
|
||||
}
|
||||
return t;
|
||||
})
|
||||
.join("");
|
||||
|
||||
// Emit the schema-matching <img> for an image node. Shared so the image is
|
||||
// emitted as real HTML wherever a raw-HTML container needs it (inside a column
|
||||
// or a spanned table cell), where markdown image syntax would NOT be re-parsed
|
||||
// and would survive as literal text. The Image extension reads src/alt from
|
||||
// the standard attributes; the Docmost extra attrs (width/height/align/size/
|
||||
// attachmentId/aspectRatio) are global attributes read from same-named DOM
|
||||
// attributes, so emit them by name.
|
||||
const imageToHtml = (node: any): string => {
|
||||
const attrs = node.attrs || {};
|
||||
const parts: string[] = [`src="${escapeAttr(attrs.src ?? "")}"`];
|
||||
if (attrs.alt) parts.push(`alt="${escapeAttr(attrs.alt)}"`);
|
||||
if (attrs.title) parts.push(`title="${escapeAttr(attrs.title)}"`);
|
||||
if (attrs.width != null) parts.push(`width="${escapeAttr(attrs.width)}"`);
|
||||
if (attrs.height != null) parts.push(`height="${escapeAttr(attrs.height)}"`);
|
||||
if (attrs.align) parts.push(`align="${escapeAttr(attrs.align)}"`);
|
||||
if (attrs.size != null) parts.push(`data-size="${escapeAttr(attrs.size)}"`);
|
||||
if (attrs.attachmentId)
|
||||
parts.push(`data-attachment-id="${escapeAttr(attrs.attachmentId)}"`);
|
||||
if (attrs.aspectRatio != null)
|
||||
parts.push(`data-aspect-ratio="${escapeAttr(attrs.aspectRatio)}"`);
|
||||
return `<img ${parts.join(" ")}>`;
|
||||
};
|
||||
|
||||
// Emit the schema-matching div[data-type="callout"] for a callout node. The
|
||||
// schema reads the banner type from data-callout-type. Children are rendered
|
||||
// as HTML so they survive inside a raw-HTML container.
|
||||
const calloutToHtml = (node: any): string => {
|
||||
const type = (node.attrs?.type || "info").toLowerCase();
|
||||
const inner = (node.content || []).map(blockToHtml).join("");
|
||||
return `<div data-type="callout" data-callout-type="${escapeAttr(type)}">${inner}</div>`;
|
||||
};
|
||||
|
||||
// Emit a schema-matching <details> tree. The schema parses <details>,
|
||||
// summary[data-type="detailsSummary"], and div[data-type="detailsContent"].
|
||||
const detailsToHtml = (node: any): string => {
|
||||
const inner = (node.content || []).map(blockToHtml).join("");
|
||||
return `<details>${inner}</details>`;
|
||||
};
|
||||
const detailsSummaryToHtml = (node: any): string =>
|
||||
`<summary data-type="detailsSummary">${inlineToHtml(node.content || [])}</summary>`;
|
||||
const detailsContentToHtml = (node: any): string => {
|
||||
const inner = (node.content || []).map(blockToHtml).join("");
|
||||
return `<div data-type="detailsContent">${inner}</div>`;
|
||||
};
|
||||
|
||||
// Emit the schema-matching taskList/taskItem HTML. bridgeTaskLists (in
|
||||
// collaboration.ts) recognizes ul[data-type="taskList"] with
|
||||
// li[data-type="taskItem"][data-checked]; emitting that directly here keeps
|
||||
// task lists inside columns/cells from degrading to literal "- [ ]" text.
|
||||
const taskListToHtml = (node: any): string => {
|
||||
const items = (node.content || [])
|
||||
.map((it: any) => {
|
||||
const checked = it.attrs?.checked ? "true" : "false";
|
||||
return `<li data-type="taskItem" data-checked="${checked}">${blockChildrenToHtml(it)}</li>`;
|
||||
})
|
||||
.join("");
|
||||
return `<ul data-type="taskList">${items}</ul>`;
|
||||
};
|
||||
|
||||
// Render a block node to HTML for the raw-HTML containers (spanned tables,
|
||||
// columns). marked does NOT re-parse markdown inside a raw-HTML block, so
|
||||
// EVERY block type that can appear inside a column or a spanned cell must be
|
||||
// emitted as schema-matching HTML here — never as markdown, or it would land
|
||||
// as literal text on re-import. Nodes whose processNode case already produces
|
||||
// schema-matching HTML (math/media/embed/attachment/nested columns/spanned
|
||||
// table) are delegated to processNode; the markdown-emitting cases
|
||||
// (image/blockquote/callout/details/hr/taskList) get explicit HTML here.
|
||||
const blockToHtml = (block: any): string => {
|
||||
const children = block.content || [];
|
||||
switch (block.type) {
|
||||
case "paragraph":
|
||||
return `<p>${inlineToHtml(children)}</p>`;
|
||||
case "heading": {
|
||||
const level = block.attrs?.level || 1;
|
||||
return `<h${level}>${inlineToHtml(children)}</h${level}>`;
|
||||
}
|
||||
case "bulletList":
|
||||
return `<ul>${children
|
||||
.map((li: any) => `<li>${blockChildrenToHtml(li)}</li>`)
|
||||
.join("")}</ul>`;
|
||||
case "orderedList":
|
||||
return `<ol>${children
|
||||
.map((li: any) => `<li>${blockChildrenToHtml(li)}</li>`)
|
||||
.join("")}</ol>`;
|
||||
case "codeBlock": {
|
||||
const lang = block.attrs?.language || "";
|
||||
// The code itself is element TEXT content (between <code> tags), so it
|
||||
// must escape < > & — NOT the attribute escaper. The language rides in
|
||||
// a class ATTRIBUTE, so it uses escapeAttr.
|
||||
const code = escapeHtmlText(
|
||||
children
|
||||
.map(processNode)
|
||||
.join("")
|
||||
.replace(/\n+$/, ""),
|
||||
);
|
||||
const cls = lang ? ` class="language-${escapeAttr(lang)}"` : "";
|
||||
return `<pre><code${cls}>${code}</code></pre>`;
|
||||
}
|
||||
case "image":
|
||||
return imageToHtml(block);
|
||||
case "blockquote":
|
||||
return `<blockquote>${children.map(blockToHtml).join("")}</blockquote>`;
|
||||
case "horizontalRule":
|
||||
return "<hr>";
|
||||
case "callout":
|
||||
return calloutToHtml(block);
|
||||
case "details":
|
||||
return detailsToHtml(block);
|
||||
case "detailsSummary":
|
||||
return detailsSummaryToHtml(block);
|
||||
case "detailsContent":
|
||||
return detailsContentToHtml(block);
|
||||
case "taskList":
|
||||
return taskListToHtml(block);
|
||||
case "taskItem":
|
||||
// A bare taskItem (outside a taskList) still needs a wrapping list so
|
||||
// the schema parses it; wrap it in a single-item taskList.
|
||||
return taskListToHtml({ content: [block] });
|
||||
// table (incl. spanned), columns/column, math, media, embed, attachment,
|
||||
// mention, etc. already emit schema-matching HTML from processNode.
|
||||
case "table":
|
||||
case "columns":
|
||||
case "column":
|
||||
case "mathBlock":
|
||||
case "video":
|
||||
case "audio":
|
||||
case "pdf":
|
||||
case "youtube":
|
||||
case "embed":
|
||||
case "attachment":
|
||||
case "drawio":
|
||||
case "excalidraw":
|
||||
return processNode(block);
|
||||
default:
|
||||
// Any still-unhandled block type: NEVER fall back to markdown inside a
|
||||
// raw-HTML block (it would become literal text). Wrap its rendered
|
||||
// children in a <div> so their content is preserved; if it has no block
|
||||
// children, render its inline content instead.
|
||||
if (children.length && children.some((c: any) => c.type !== "text")) {
|
||||
return `<div>${children.map(blockToHtml).join("")}</div>`;
|
||||
}
|
||||
return `<div>${inlineToHtml(children)}</div>`;
|
||||
}
|
||||
};
|
||||
|
||||
// Render the block children of a list item to HTML (a listItem holds block+
|
||||
// content). Mirrors processListItem but for the HTML fallback path.
|
||||
const blockChildrenToHtml = (item: any): string =>
|
||||
(item.content || []).map((b: any) => blockToHtml(b)).join("");
|
||||
|
||||
// Indent the rendered children of a list item under a marker prefix.
|
||||
// Each child block is a (possibly multi-line) string. The very first physical
|
||||
// line of the first child carries the marker (e.g. "- " or "1. "); EVERY
|
||||
// other line — the remaining lines of the first child AND all lines of every
|
||||
// subsequent child (nested lists, code blocks, extra paragraphs) — is indented
|
||||
// to align under the marker. Without indenting these continuation lines, the
|
||||
// 2nd/3rd line of a nested child collapses to column 0 and escapes the list.
|
||||
//
|
||||
// The continuation indent MUST equal the LIST marker width, which is not the
|
||||
// same as the visible prefix width:
|
||||
// - bullet "- " -> 2 columns
|
||||
// - task "- [ ] " -> marker is still "- " (the "[ ] " is content), 2 columns
|
||||
// - ordered "1. "/"10. " -> 3/4 columns, scaling with the number's digits
|
||||
// CommonMark anchors nested content to the marker column, so an ordered item
|
||||
// indented to only 2 columns would be re-parsed as sibling or loose content on
|
||||
// re-import. Callers therefore pass the exact indent width to use.
|
||||
const indentItemChildren = (
|
||||
childStrings: string[],
|
||||
prefix: string,
|
||||
indentWidth: number,
|
||||
): string => {
|
||||
const indent = " ".repeat(indentWidth);
|
||||
const lines: string[] = [];
|
||||
childStrings.forEach((child, childIndex) => {
|
||||
child.split("\n").forEach((line, lineIndex) => {
|
||||
if (childIndex === 0 && lineIndex === 0) {
|
||||
// First physical line of the first block gets the marker.
|
||||
lines.push(`${prefix} ${line}`);
|
||||
} else {
|
||||
// Indent every continuation line by the marker width; keep blank
|
||||
// lines blank rather than emitting trailing whitespace.
|
||||
lines.push(line.length ? `${indent}${line}` : "");
|
||||
}
|
||||
});
|
||||
});
|
||||
return lines.join("\n");
|
||||
};
|
||||
|
||||
const processListItem = (item: any, prefix: string): string => {
|
||||
const itemContent = item.content || [];
|
||||
const childStrings = itemContent.map(processNode);
|
||||
if (childStrings.length === 0) return prefix;
|
||||
// The rendered marker is `${prefix} ` (prefix + one space), so its width —
|
||||
// and thus the continuation indent — is prefix.length + 1. This is correct
|
||||
// for both bullet ("-" -> 2) and ordered ("1." -> 3, "10." -> 4) markers,
|
||||
// since for those the visible prefix IS the list marker.
|
||||
return indentItemChildren(childStrings, prefix, prefix.length + 1);
|
||||
};
|
||||
|
||||
const processTaskItem = (item: any): string => {
|
||||
const checked = item.attrs?.checked || false;
|
||||
const checkbox = checked ? "[x]" : "[ ]";
|
||||
const prefix = `- ${checkbox}`;
|
||||
const itemContent = item.content || [];
|
||||
const childStrings = itemContent.map(processNode);
|
||||
// An empty task item still needs its checkbox marker; without this guard
|
||||
// the indent below produces "" and the "- [ ]"/"- [x]" row disappears.
|
||||
if (childStrings.length === 0) return prefix;
|
||||
// The list marker for a task item is just "- " (2 columns); the "[ ] "/"[x] "
|
||||
// checkbox is item content, NOT part of the marker. So the continuation
|
||||
// indent is a fixed 2 — do NOT derive it from the wider prefix.length.
|
||||
return indentItemChildren(childStrings, prefix, 2);
|
||||
};
|
||||
|
||||
return processNode(content).trim();
|
||||
}
|
||||
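Note on the list-indent rule in the file above: the continuation indent is the list-marker width, not the visible prefix width. The snippet below is a minimal standalone sketch of that rule, not the package's code. `indentUnderMarker` is a hypothetical name that mirrors the shape of `indentItemChildren`, and the sample strings are invented for illustration.

```typescript
// Sketch only: indent continuation lines of a list item by the marker width.
// `indentUnderMarker` is a hypothetical stand-in for indentItemChildren.
function indentUnderMarker(
  children: string[],
  prefix: string,
  indentWidth: number,
): string {
  const indent = " ".repeat(indentWidth);
  const lines: string[] = [];
  children.forEach((child, childIndex) => {
    child.split("\n").forEach((line, lineIndex) => {
      if (childIndex === 0 && lineIndex === 0) {
        // Only the first physical line of the first child carries the marker.
        lines.push(`${prefix} ${line}`);
      } else {
        // Every other line aligns under the marker; blank lines stay blank.
        lines.push(line.length ? `${indent}${line}` : "");
      }
    });
  });
  return lines.join("\n");
}

// Ordered item: the marker "10. " is 4 columns wide (prefix.length + 1).
const ordered = indentUnderMarker(
  ["Install", "- pnpm\n- node"],
  "10.",
  "10.".length + 1,
);

// Task item: the marker is still "- " (2 columns); the checkbox is content.
const task = indentUnderMarker(["Ship it", "- follow-up"], "- [ ]", 2);

console.log(ordered);
console.log(task);
```

Deriving the task indent from the prefix length instead would give 6 columns. The source comments explain why the code uses a fixed 2: the checkbox counts as item content, and CommonMark anchors nested content to the marker column.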
136
packages/mcp/src/lib/markdown-document.ts
Normal file
@@ -0,0 +1,136 @@
|
||||
/**
|
||||
* Self-contained Docmost-flavoured Markdown document (custom extensions).
|
||||
*
|
||||
* A single `.md` file that packages everything needed to losslessly round-trip
|
||||
* a page through "download -> edit body -> re-upload":
|
||||
* - a leading `docmost:meta` block: a one-line JSON object with page identity;
|
||||
* - the Markdown body (carrying inline comment anchors and diagrams as HTML);
|
||||
* - a trailing `docmost:comments` block: a one-line JSON array of comment
|
||||
* threads.
|
||||
*
|
||||
* Both metadata blocks are HTML comments on purpose: `marked`/`generateJSON`
|
||||
* drop HTML comments, so even if the WHOLE file were ever fed straight to the
|
||||
* importer without first stripping the blocks, the metadata cannot leak into the
|
||||
* document. (A fenced ```docmost-comments``` block would WRONGLY become a
|
||||
* codeBlock node, so a fenced block is deliberately NOT used.)
|
||||
*
|
||||
* The delimiter literals may legitimately appear in the BODY too (e.g. a user
|
||||
* re-pastes an exported `.md` into a page, or a page documents this very
|
||||
* format). To stay robust, parsing treats only the FINAL, document-ending
|
||||
* `docmost:comments` block as metadata: it is the last `<!-- docmost:comments`
|
||||
* opener whose closing `-->` sits at the very end of the file. Any earlier
|
||||
* literal occurrence is left in the body untouched.
|
||||
*
|
||||
* NOTE on comments: in this version the comment THREAD records are preserved in
|
||||
* the file but are NOT pushed back to the server on import — only the inline
|
||||
* comment marks (anchors) embedded in the body are restored. Managing comment
|
||||
* records stays with the comment tools/UI.
|
||||
*/
|
||||
|
||||
export interface DocmostMdMeta {
|
||||
version: number;
|
||||
pageId?: string;
|
||||
slugId?: string;
|
||||
title?: string;
|
||||
spaceId?: string;
|
||||
parentPageId?: string | null;
|
||||
}
|
||||
|
||||
// Match the leading meta block (allow leading whitespace). Capture group 1 is
|
||||
// the JSON text between the markers.
|
||||
const META_RE = /^\s*<!--\s*docmost:meta\s*\n([\s\S]*?)\n-->/;
|
||||
// Match a `docmost:comments` opener. Used globally to scan for the LAST opener
|
||||
// rather than end-anchoring a single regex (which would mis-capture across a
|
||||
// literal opener that appears earlier in the body).
|
||||
const COMMENTS_OPEN_RE = /<!--[ \t]*docmost:comments[ \t]*\r?\n/g;
|
||||
|
||||
/**
|
||||
* Assemble the full self-contained markdown file: meta block, body, and the
|
||||
* comments block. The meta block is always emitted; the comments block is always
|
||||
* emitted too (with `[]` when there are no comments) so the format stays uniform
|
||||
* and parsing stays simple.
|
||||
*/
|
||||
export function serializeDocmostMarkdown(
|
||||
meta: DocmostMdMeta,
|
||||
body: string,
|
||||
comments: any[],
|
||||
): string {
|
||||
const metaJson = JSON.stringify(meta);
|
||||
const commentsJson = JSON.stringify(Array.isArray(comments) ? comments : []);
|
||||
const trimmedBody = (body ?? "").trim();
|
||||
return (
|
||||
`<!-- docmost:meta\n${metaJson}\n-->\n\n` +
|
||||
`${trimmedBody}\n\n` +
|
||||
`<!-- docmost:comments\n${commentsJson}\n-->\n`
|
||||
);
|
||||
}
|
||||
|
||||
/**
|
||||
* Split a self-contained file back into its parts. Tolerant: if the meta or
|
||||
* comments block is missing (e.g. a hand-written plain-markdown file), the
|
||||
* corresponding value is returned as `null` and the whole input is treated as
|
||||
* the body. This never throws on a MISSING block; only a `JSON.parse` failure
|
||||
* inside a block that IS present is surfaced as a thrown Error with a clear
|
||||
* message. Robust to `\r\n` line endings.
|
||||
*/
|
||||
export function parseDocmostMarkdown(full: string): {
|
||||
meta: DocmostMdMeta | null;
|
||||
body: string;
|
||||
comments: any[] | null;
|
||||
} {
|
||||
// Normalize line endings so the anchored regexes work regardless of CRLF.
|
||||
const normalized = (full ?? "").replace(/\r\n/g, "\n");
|
||||
|
||||
// Extract the leading meta block (start-anchored — already unambiguous).
|
||||
let meta: DocmostMdMeta | null = null;
|
||||
let metaEnd = 0;
|
||||
const metaMatch = normalized.match(META_RE);
|
||||
if (metaMatch) {
|
||||
try {
|
||||
meta = JSON.parse(metaMatch[1]);
|
||||
} catch (e) {
|
||||
throw new Error(
|
||||
`Invalid docmost:meta JSON block: ${
|
||||
e instanceof Error ? e.message : String(e)
|
||||
}`,
|
||||
);
|
||||
}
|
||||
// Body starts right after the matched meta block.
|
||||
metaEnd = (metaMatch.index ?? 0) + metaMatch[0].length;
|
||||
}
|
||||
|
||||
// Find the LAST `<!-- docmost:comments` opener; the real file-level block is
|
||||
// the final one whose closing `-->` ends the document. Any earlier literal
|
||||
// occurrence inside the body (e.g. a re-pasted export) is left in the body.
|
||||
let lastOpenStart = -1;
|
||||
let lastOpenEnd = -1;
|
||||
let m: RegExpExecArray | null;
|
||||
COMMENTS_OPEN_RE.lastIndex = 0;
|
||||
while ((m = COMMENTS_OPEN_RE.exec(normalized)) !== null) {
|
||||
lastOpenStart = m.index;
|
||||
lastOpenEnd = m.index + m[0].length;
|
||||
}
|
||||
|
||||
let comments: any[] | null = null;
|
||||
let bodyEnd = normalized.length;
|
||||
if (lastOpenStart !== -1) {
|
||||
const rest = normalized.slice(lastOpenEnd);
|
||||
const close = rest.match(/\r?\n-->[ \t]*\r?\n?\s*$/); // closer must end the doc
|
||||
if (close) {
|
||||
const jsonText = rest.slice(0, close.index);
|
||||
try {
|
||||
comments = JSON.parse(jsonText);
|
||||
} catch (e) {
|
||||
throw new Error(
|
||||
`Invalid docmost:comments JSON block: ${
|
||||
e instanceof Error ? e.message : String(e)
|
||||
}`,
|
||||
);
|
||||
}
|
||||
bodyEnd = lastOpenStart; // strip from the opener to end of document
|
||||
}
|
||||
}
|
||||
|
||||
const body = normalized.slice(metaEnd, bodyEnd).trim();
|
||||
return { meta, body, comments };
|
||||
}
|
||||
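Note on the "last opener wins" rule in the file above: the sketch below is a condensed illustration under stated assumptions, not the package's implementation. `splitTrailingComments` is a hypothetical helper. It skips the meta block and assumes line endings are already normalized to `\n`. The sample document is invented to show a pasted export whose literal `docmost:comments` block must stay in the body.

```typescript
// Sketch only: treat just the final, document-ending comments block as metadata.
// `splitTrailingComments` is a hypothetical helper, not part of @docmost/mcp.
function splitTrailingComments(text: string): {
  body: string;
  comments: unknown[] | null;
} {
  // A fresh regex per call avoids shared lastIndex state on a global pattern.
  const openRe = /<!--[ \t]*docmost:comments[ \t]*\n/g;
  let start = -1;
  let end = -1;
  let m: RegExpExecArray | null;
  while ((m = openRe.exec(text)) !== null) {
    start = m.index; // keep overwriting so the LAST opener wins
    end = m.index + m[0].length;
  }
  if (start === -1) return { body: text.trim(), comments: null };

  const rest = text.slice(end);
  const close = rest.match(/\n-->\s*$/); // the closer must end the document
  if (!close) return { body: text.trim(), comments: null };

  const comments = JSON.parse(rest.slice(0, close.index)) as unknown[];
  return { body: text.slice(0, start).trim(), comments };
}

// An earlier literal block (a re-pasted export) followed by the real one.
const sample =
  'Intro\n\n<!-- docmost:comments\n["pasted"]\n-->\n\nMore text\n\n' +
  '<!-- docmost:comments\n[{"id":"c1"}]\n-->\n';

const parsed = splitTrailingComments(sample);
console.log(parsed.body);
console.log(JSON.stringify(parsed.comments));
```

A single end-anchored regex with a lazy capture would start at the first opener and capture across the pasted block. Scanning for the last opener is how the source avoids that mis-capture.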
897
packages/mcp/src/lib/node-ops.ts
Normal file
@@ -0,0 +1,897 @@
|
||||
/**
|
||||
* Pure, network-free helpers for manipulating a ProseMirror/TipTap document
|
||||
* tree by node id.
|
||||
*
|
||||
* A ProseMirror node here is a plain JSON object of the shape produced by
|
||||
* Docmost: `{ type, attrs?, content?, text?, marks? }`. Children live in the
|
||||
* `content` array; a node carries a stable id in `attrs.id`. Callouts and
|
||||
* table cells hold their children in `content` just like any other block, so a
|
||||
* single recursive walk reaches them all.
|
||||
*
|
||||
* Every exported function operates on a DEEP CLONE of the input document and
|
||||
* returns the new document. The input doc and any `newNode`/`node` argument are
|
||||
* never mutated. All functions are defensively null-safe: missing or non-array
|
||||
* `content`, non-object nodes, and absent `attrs` are tolerated.
|
||||
*/
|
||||
|
||||
/** Deep-clone a JSON-serializable value without mutating the original. */
|
||||
function clone<T>(value: T): T {
|
||||
if (typeof structuredClone === "function") {
|
||||
return structuredClone(value);
|
||||
}
|
||||
// Fallback for environments without structuredClone.
|
||||
return JSON.parse(JSON.stringify(value)) as T;
|
||||
}
|
||||
|
||||
/** True if `value` is a non-null object (and not an array). */
|
||||
function isObject(value: any): value is Record<string, any> {
|
||||
return value != null && typeof value === "object" && !Array.isArray(value);
|
||||
}
|
||||
|
||||
/** True if `node` carries the given id in `node.attrs.id`. */
|
||||
function matchesId(node: any, nodeId: string): boolean {
|
||||
return isObject(node) && isObject(node.attrs) && node.attrs.id === nodeId;
|
||||
}
|
||||
|
||||
/**
|
||||
* Recursively concatenate all text contained in a node.
|
||||
*
|
||||
* Text nodes contribute their `text` string; container nodes contribute the
|
||||
* joined `blockPlainText` of their `content` children. Returns "" for nullish
|
||||
* or non-object inputs.
|
||||
*/
|
||||
export function blockPlainText(node: any): string {
|
||||
if (!isObject(node)) return "";
|
||||
let out = "";
|
||||
if (typeof node.text === "string") {
|
||||
out += node.text;
|
||||
}
|
||||
if (Array.isArray(node.content)) {
|
||||
for (const child of node.content) {
|
||||
out += blockPlainText(child);
|
||||
}
|
||||
}
|
||||
return out;
|
||||
}
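To illustrate the recursion described above, here is a minimal self-contained sketch. The `PMNode` type and the `plainText` name are assumptions made for this example only; the logic mirrors `blockPlainText` and shows that text from nested children is concatenated with no separator between blocks.

```typescript
// Simplified node shape, assumed for illustration only.
interface PMNode {
  type?: string;
  text?: string;
  content?: PMNode[];
}

// Depth-first concatenation of every `text` string under `node`.
function plainText(node: PMNode | null | undefined): string {
  if (node == null || typeof node !== "object") return "";
  let out = typeof node.text === "string" ? node.text : "";
  if (Array.isArray(node.content)) {
    for (const child of node.content) out += plainText(child);
  }
  return out;
}

// A callout holding two paragraphs: children live in `content` like any block.
const callout: PMNode = {
  type: "callout",
  content: [
    {
      type: "paragraph",
      content: [
        { type: "text", text: "Hello, " },
        { type: "text", text: "world" },
      ],
    },
    { type: "paragraph", content: [{ type: "text", text: "!" }] },
  ],
};

const calloutText = plainText(callout);
```

Because no separator is inserted, adjacent paragraphs run together; callers that need line breaks would have to add them themselves.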
|
||||
|
||||
/** Truncate `text` to at most `n` chars, appending an ellipsis when cut. */
|
||||
function truncate(text: string, n: number): string {
|
||||
return text.length > n ? text.slice(0, n) + "…" : text;
|
||||
}
|
||||
|
||||
/** One compact outline entry for a single top-level block. */
|
||||
export interface OutlineEntry {
|
||||
index: number;
|
||||
type: string | undefined;
|
||||
id: string | null;
|
||||
firstText: string;
|
||||
/** Present for headings only. */
|
||||
level?: number | null;
|
||||
/** Present for tables only. */
|
||||
rows?: number;
|
||||
cols?: number;
|
||||
header?: string[];
|
||||
/** Present for list blocks only (bulletList/orderedList/taskList). */
|
||||
items?: number;
|
||||
}
|
||||
|
||||
/**
|
||||
* Build a COMPACT outline of the TOP-LEVEL blocks of `doc` (the entries in
|
||||
* `doc.content`). Deliberately does NOT recurse into paragraphs, list items, or
|
||||
* table cells — compactness is the point; use `getNodeByRef` to drill into a
|
||||
* specific block.
|
||||
*
|
||||
* Each entry carries `{ index, type, id, firstText }`, plus type-specific
|
||||
* extras: headings add `level`; tables add `rows`/`cols` and the first row's
|
||||
* cell texts as `header`; list blocks (types ending in "List") add `items`.
|
||||
* `firstText` is the block's plain text truncated to 100 chars. Null-safe:
|
||||
* a missing or non-object doc/content yields `[]`.
|
||||
*/
|
||||
export function buildOutline(doc: any): OutlineEntry[] {
|
||||
if (!isObject(doc) || !Array.isArray(doc.content)) return [];
|
||||
|
||||
const out: OutlineEntry[] = [];
|
||||
for (let i = 0; i < doc.content.length; i++) {
|
||||
const block = doc.content[i];
|
||||
const type = isObject(block) ? block.type : undefined;
|
||||
const entry: OutlineEntry = {
|
||||
index: i,
|
||||
type,
|
||||
id: isObject(block) && isObject(block.attrs) ? block.attrs.id ?? null : null,
|
||||
firstText: truncate(blockPlainText(block), 100),
|
||||
};
|
||||
|
||||
if (type === "heading") {
|
||||
entry.level = isObject(block.attrs) ? block.attrs.level ?? null : null;
|
||||
} else if (type === "table") {
|
||||
const headerRow = block.content?.[0]?.content ?? [];
|
||||
entry.rows = block.content?.length ?? 0;
|
||||
entry.cols = block.content?.[0]?.content?.length ?? 0;
|
||||
entry.header = headerRow.map((cell: any) =>
|
||||
truncate(blockPlainText(cell), 40),
|
||||
);
|
||||
} else if (typeof type === "string" && type.endsWith("List")) {
|
||||
entry.items = block.content?.length ?? 0;
|
||||
}
|
||||
|
||||
out.push(entry);
|
||||
}
|
||||
return out;
|
||||
}
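The type-specific extras are easier to see on a small document. The sketch below is a reduced re-implementation written for this note (it omits `id`, `firstText` and `header`, and the names `sketchOutline` and `Entry` are hypothetical); it only demonstrates which extra fields each top-level block type receives.

```typescript
// Simplified node and entry shapes, assumed for illustration only.
interface PMNode {
  type?: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
}
interface Entry {
  index: number;
  type?: string;
  level?: unknown;
  rows?: number;
  cols?: number;
  items?: number;
}

// One entry per TOP-LEVEL block; nested blocks are deliberately not visited.
function sketchOutline(doc: PMNode): Entry[] {
  const blocks = Array.isArray(doc.content) ? doc.content : [];
  return blocks.map((block, index) => {
    const entry: Entry = { index, type: block.type };
    if (block.type === "heading") {
      entry.level = block.attrs?.level ?? null;
    } else if (block.type === "table") {
      entry.rows = block.content?.length ?? 0;
      entry.cols = block.content?.[0]?.content?.length ?? 0;
    } else if (/List$/.test(block.type ?? "")) {
      entry.items = block.content?.length ?? 0;
    }
    return entry;
  });
}

const row: PMNode = {
  type: "tableRow",
  content: [{ type: "tableCell" }, { type: "tableCell" }, { type: "tableCell" }],
};
const entries = sketchOutline({
  type: "doc",
  content: [
    { type: "heading", attrs: { id: "h1", level: 2 } },
    {
      type: "bulletList",
      content: [{ type: "listItem" }, { type: "listItem" }, { type: "listItem" }],
    },
    { type: "table", content: [row, row] },
  ],
});
```

Here `entries` has exactly three items even though the list and table contain nested nodes, which is the compactness property the doc comment describes.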
|
||||
|
||||
/**
|
||||
* Resolve a single node by reference and return `{ node, path, type }`, or
|
||||
* `null` when nothing matches.
|
||||
*
|
||||
* - `ref` of the form `#<n>` (e.g. `#2`) selects the TOP-LEVEL block at index
|
||||
* `n` in `doc.content`. This is the only way to address table/tableRow/
|
||||
* tableCell nodes, which carry no `attrs.id`.
|
||||
* - Otherwise `ref` is treated as a block id: the FIRST node anywhere in the
|
||||
* tree with `attrs.id === ref` is returned.
|
||||
*
|
||||
* `path` is the array of child indices from the doc root down to the node
|
||||
* (so a top-level block is `[index]`). The returned `node` is a DEEP CLONE,
|
||||
* so callers can mutate it without touching the input doc. Null-safe.
|
||||
*/
|
||||
export function getNodeByRef(
|
||||
doc: any,
|
||||
ref: string,
|
||||
): { node: any; path: number[]; type: string | undefined } | null {
|
||||
if (!isObject(doc)) return null;
|
||||
|
||||
// "#<n>": index into the top-level content array.
|
||||
const indexMatch = typeof ref === "string" ? ref.match(/^#(\d+)$/) : null;
|
||||
if (indexMatch) {
|
||||
const index = Number(indexMatch[1]);
|
||||
const block = Array.isArray(doc.content) ? doc.content[index] : undefined;
|
||||
if (!isObject(block)) return null;
|
||||
return { node: clone(block), path: [index], type: block.type };
|
||||
}
|
||||
|
||||
// Otherwise: depth-first search for the first node with attrs.id === ref.
|
||||
const search = (
|
||||
node: any,
|
||||
trail: number[],
|
||||
): { node: any; path: number[]; type: string | undefined } | null => {
|
||||
if (!isObject(node)) return null;
|
||||
if (Array.isArray(node.content)) {
|
||||
for (let i = 0; i < node.content.length; i++) {
|
||||
const child = node.content[i];
|
||||
const path = [...trail, i];
|
||||
if (matchesId(child, ref)) {
|
||||
return { node: clone(child), path, type: child.type };
|
||||
}
|
||||
const hit = search(child, path);
|
||||
if (hit != null) return hit;
|
||||
}
|
||||
}
|
||||
return null;
|
||||
};
|
||||
|
||||
return search(doc, []);
|
||||
}
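The two addressing modes can be sketched as follows. This is a simplified stand-in written for illustration (`resolveRef`, `PMNode` and `Hit` are hypothetical names), and unlike `getNodeByRef` it returns the live node rather than a deep clone; the point is only to show how `#<n>` and id references resolve to a `path`.

```typescript
// Simplified node shape, assumed for illustration only.
interface PMNode {
  type?: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
}
type Hit = { node: PMNode; path: number[] } | null;

function resolveRef(doc: PMNode, ref: string): Hit {
  // "#<n>" addresses a top-level block by position.
  const m = /^#(\d+)$/.exec(ref);
  if (m) {
    const index = Number(m[1]);
    const block = doc.content?.[index];
    return block ? { node: block, path: [index] } : null;
  }
  // Anything else is treated as an id and searched depth-first.
  const search = (node: PMNode, trail: number[]): Hit => {
    const children = Array.isArray(node.content) ? node.content : [];
    for (let i = 0; i < children.length; i++) {
      const path = [...trail, i];
      if (children[i].attrs?.id === ref) return { node: children[i], path };
      const hit = search(children[i], path);
      if (hit) return hit;
    }
    return null;
  };
  return search(doc, []);
}

const sampleDoc: PMNode = {
  type: "doc",
  content: [
    { type: "paragraph", attrs: { id: "p1" } },
    {
      type: "table",
      content: [
        {
          type: "tableRow",
          content: [
            {
              type: "tableCell",
              content: [{ type: "paragraph", attrs: { id: "p2" } }],
            },
          ],
        },
      ],
    },
  ],
};

const byIndex = resolveRef(sampleDoc, "#1"); // the table, which has no id
const byId = resolveRef(sampleDoc, "p2"); // the paragraph inside the cell
```

The table itself is reachable only through `#1`, while the paragraph inside its cell resolves by id to a four-element path (table, row, cell, paragraph).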
|
||||
|
||||
/**
|
||||
* Replace EVERY node whose `attrs.id === nodeId` with a deep clone of
|
||||
* `newNode`, anywhere in the tree (including inside callouts and table cells).
|
||||
*
|
||||
* Operates on a clone of `doc`; returns `{ doc, replaced }` where `replaced`
|
||||
* is the number of nodes substituted. A fresh clone of `newNode` is used for
|
||||
* each match so they do not share references.
|
||||
*/
|
||||
export function replaceNodeById(
|
||||
doc: any,
|
||||
nodeId: string,
|
||||
newNode: any,
|
||||
): { doc: any; replaced: number } {
|
||||
const out = clone(doc);
|
||||
let replaced = 0;
|
||||
|
||||
// Walk a content array, replacing direct matches and recursing into the
|
||||
// (possibly new) children of non-matching nodes.
|
||||
const walkContent = (content: any[]): void => {
|
||||
for (let i = 0; i < content.length; i++) {
|
||||
const child = content[i];
|
||||
if (matchesId(child, nodeId)) {
|
||||
content[i] = clone(newNode);
|
||||
replaced++;
|
||||
// Do not recurse into a freshly substituted node.
|
||||
continue;
|
||||
}
|
||||
if (isObject(child) && Array.isArray(child.content)) {
|
||||
walkContent(child.content);
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
if (isObject(out) && Array.isArray(out.content)) {
|
||||
walkContent(out.content);
|
||||
}
|
||||
return { doc: out, replaced };
|
||||
}
|
||||
|
||||
/**
|
||||
* Remove EVERY node whose `attrs.id === nodeId` from its parent `content`
|
||||
* array, anywhere in the tree (recursive, including callouts and tables).
|
||||
*
|
||||
* Operates on a clone of `doc`; returns `{ doc, deleted }` where `deleted` is
|
||||
* the number of nodes removed.
|
||||
*/
|
||||
export function deleteNodeById(
|
||||
doc: any,
|
||||
nodeId: string,
|
||||
): { doc: any; deleted: number } {
|
||||
const out = clone(doc);
|
||||
let deleted = 0;
|
||||
|
||||
// Build a filtered copy of a content array, dropping matches and recursing into the
|
||||
// surviving children.
|
||||
const walkContent = (content: any[]): any[] => {
|
||||
const kept: any[] = [];
|
||||
for (const child of content) {
|
||||
if (matchesId(child, nodeId)) {
|
||||
deleted++;
|
||||
continue;
|
||||
}
|
||||
if (isObject(child) && Array.isArray(child.content)) {
|
||||
child.content = walkContent(child.content);
|
||||
}
|
||||
kept.push(child);
|
||||
}
|
||||
return kept;
|
||||
};
|
||||
|
||||
if (isObject(out) && Array.isArray(out.content)) {
|
||||
out.content = walkContent(out.content);
|
||||
}
|
||||
return { doc: out, deleted };
|
||||
}
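A short sketch of the clone-then-filter pattern used by the delete helper, under a simplified node type. `removeById` and `PMNode` are names invented for this example, and a JSON round-trip stands in for the file's `clone`. It shows that every matching node is removed, at any depth, while the caller's document is left untouched.

```typescript
// Simplified node shape, assumed for illustration only.
interface PMNode {
  type?: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
}

function removeById(doc: PMNode, id: string): { doc: PMNode; deleted: number } {
  // Work on a copy so the input is never mutated.
  const out: PMNode = JSON.parse(JSON.stringify(doc));
  let deleted = 0;
  const prune = (content: PMNode[]): PMNode[] => {
    const kept: PMNode[] = [];
    for (const child of content) {
      if (child.attrs?.id === id) {
        deleted++;
        continue; // drop the match together with its subtree
      }
      if (Array.isArray(child.content)) child.content = prune(child.content);
      kept.push(child);
    }
    return kept;
  };
  if (Array.isArray(out.content)) out.content = prune(out.content);
  return { doc: out, deleted };
}

const original: PMNode = {
  type: "doc",
  content: [
    { type: "paragraph", attrs: { id: "dup" } },
    {
      type: "callout",
      content: [
        { type: "paragraph", attrs: { id: "dup" } },
        { type: "paragraph", attrs: { id: "keep" } },
      ],
    },
  ],
};

const result = removeById(original, "dup");
```

Returning the count alongside the new document lets a caller distinguish "nothing matched" from "removed N duplicates" without diffing the trees.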
|
||||
|
||||
/**
|
||||
* Deep-clone `doc` and strip every node/mark attribute whose value is strictly
|
||||
* `undefined`, so the result is safe to hand to Yjs (which throws an opaque
|
||||
* "Unexpected content type" when asked to store an `undefined` attribute value).
|
||||
*
|
||||
* Only `undefined` keys are removed; `null`, `false`, `0`, and `""` are all
|
||||
* legitimate JSON-storable values and are preserved. Operates on a clone and
|
||||
* returns it; the input is never mutated. Defensively null-safe like the rest
|
||||
* of the file.
|
||||
*/
|
||||
export function sanitizeForYjs(doc: any): any {
|
||||
const out = clone(doc);
|
||||
|
||||
// Drop every key whose value is strictly `undefined` from an attrs object.
|
||||
const stripUndefined = (attrs: any): void => {
|
||||
if (!isObject(attrs)) return;
|
||||
for (const key of Object.keys(attrs)) {
|
||||
if (attrs[key] === undefined) {
|
||||
delete attrs[key];
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
const walk = (node: any): void => {
|
||||
if (!isObject(node)) return;
|
||||
stripUndefined(node.attrs);
|
||||
if (Array.isArray(node.marks)) {
|
||||
for (const mark of node.marks) {
|
||||
if (isObject(mark)) stripUndefined(mark.attrs);
|
||||
}
|
||||
}
|
||||
if (Array.isArray(node.content)) {
|
||||
for (const child of node.content) {
|
||||
walk(child);
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
walk(out);
|
||||
return out;
|
||||
}
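The "only `undefined` is removed" rule is worth seeing on concrete values. The helper below is a non-mutating variant written for this example (the real function deletes keys on a clone); `stripUndefinedCopy` and the attribute names are hypothetical.

```typescript
type Attrs = Record<string, unknown>;

// Copy `attrs`, skipping keys whose value is strictly `undefined`.
function stripUndefinedCopy(attrs: Attrs): Attrs {
  const out: Attrs = {};
  for (const key of Object.keys(attrs)) {
    if (attrs[key] !== undefined) out[key] = attrs[key];
  }
  return out;
}

const cleaned = stripUndefinedCopy({
  id: "p1",
  indent: undefined, // removed
  textAlign: null, // kept: null is a storable value
  level: 0, // kept
  checked: false, // kept
  lang: "", // kept
});
```

Note that a truthiness check (`if (attrs[key])`) would be wrong here, since it would also drop `null`, `0`, `false` and `""`.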
|
||||
|
||||
/**
|
||||
* Diagnostics helper: walk the tree and return a human-readable path string for
|
||||
* the FIRST attribute value (in any `node.attrs` or `mark.attrs`) that Yjs
|
||||
* cannot store — i.e. `undefined`, a `function`, a `symbol`, or a `bigint`
|
||||
* (e.g. `content[3].content[0].attrs.indent (undefined)`). Returns `null` when
|
||||
* every attribute is storable. Null-safe.
|
||||
*/
|
||||
export function findUnstorableAttr(doc: any): string | null {
|
||||
const isUnstorable = (value: any): string | null => {
|
||||
if (value === undefined) return "undefined";
|
||||
const t = typeof value;
|
||||
if (t === "function") return "function";
|
||||
if (t === "symbol") return "symbol";
|
||||
if (t === "bigint") return "bigint";
|
||||
return null;
|
||||
};
|
||||
|
||||
// Check an attrs object; return the offending sub-path or null.
|
||||
const checkAttrs = (attrs: any, basePath: string): string | null => {
|
||||
if (!isObject(attrs)) return null;
|
||||
for (const key of Object.keys(attrs)) {
|
||||
const kind = isUnstorable(attrs[key]);
|
||||
if (kind != null) return `${basePath}.${key} (${kind})`;
|
||||
}
|
||||
return null;
|
||||
};
|
||||
|
||||
const walk = (node: any, path: string): string | null => {
|
||||
if (!isObject(node)) return null;
|
||||
const attrHit = checkAttrs(node.attrs, `${path}.attrs`);
|
||||
if (attrHit != null) return attrHit;
|
||||
if (Array.isArray(node.marks)) {
|
||||
for (let i = 0; i < node.marks.length; i++) {
|
||||
const markHit = checkAttrs(
|
||||
node.marks[i]?.attrs,
|
||||
`${path}.marks[${i}].attrs`,
|
||||
);
|
||||
if (markHit != null) return markHit;
|
||||
}
|
||||
}
|
||||
if (Array.isArray(node.content)) {
|
||||
for (let i = 0; i < node.content.length; i++) {
|
||||
const childHit = walk(node.content[i], `${path}.content[${i}]`);
|
||||
if (childHit != null) return childHit;
|
||||
}
|
||||
}
|
||||
return null;
|
||||
};
|
||||
|
||||
// The root doc node carries no useful index, so paths carry no root prefix: they start at "attrs" or "content[i]".
|
||||
if (!isObject(doc)) return null;
|
||||
const attrHit = checkAttrs(doc.attrs, "attrs");
|
||||
if (attrHit != null) return attrHit;
|
||||
if (Array.isArray(doc.content)) {
|
||||
for (let i = 0; i < doc.content.length; i++) {
|
||||
const childHit = walk(doc.content[i], `content[${i}]`);
|
||||
if (childHit != null) return childHit;
|
||||
}
|
||||
}
|
||||
return null;
|
||||
}
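To show the shape of the diagnostic path, here is a reduced sketch that checks node attributes only (marks are omitted for brevity, and `firstBadAttr` and `PMNode` are names assumed for this example).

```typescript
// Simplified node shape, assumed for illustration only.
interface PMNode {
  type?: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
}

// Return a path like "content[1].content[0].attrs.indent (undefined)" for the
// first attribute whose type cannot be stored, or null if none is found.
function firstBadAttr(node: PMNode, path: string = ""): string | null {
  const prefix = path ? path + "." : "";
  const attrs = node.attrs ?? {};
  for (const key of Object.keys(attrs)) {
    const t = typeof attrs[key];
    if (t === "undefined" || t === "function" || t === "symbol" || t === "bigint") {
      return `${prefix}attrs.${key} (${t})`;
    }
  }
  const children = Array.isArray(node.content) ? node.content : [];
  for (let i = 0; i < children.length; i++) {
    const hit = firstBadAttr(children[i], `${prefix}content[${i}]`);
    if (hit !== null) return hit;
  }
  return null;
}

const badDoc: PMNode = {
  type: "doc",
  content: [
    { type: "paragraph", attrs: { id: "ok" } },
    {
      type: "callout",
      content: [{ type: "paragraph", attrs: { id: "x", indent: undefined } }],
    },
  ],
};

const offending = firstBadAttr(badDoc);
```

The returned string doubles as an error message, so a failed write can point at the exact attribute rather than surfacing an opaque storage error.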
|
||||
|
||||
/**
|
||||
* Table structural node types and the container each must live directly inside.
|
||||
* Used by `insertNodeRelative` to splice rows/cells into the correct ancestor
|
||||
* rather than blindly into the anchor's direct parent (which would corrupt the
|
||||
* table's nesting).
|
||||
*/
|
||||
const STRUCTURAL_TYPES = new Set(["tableRow", "tableCell", "tableHeader"]);
|
||||
const REQUIRED_CONTAINER: Record<string, string> = {
|
||||
tableRow: "table",
|
||||
tableCell: "tableRow",
|
||||
tableHeader: "tableRow",
|
||||
};
|
||||
|
||||
/**
|
||||
* Locate an anchor and return its ancestor chain (from `doc` down to and
|
||||
* including the matched node). Each chain entry is `{ node, index }` where
|
||||
* `index` is the node's position inside its parent's `content` array (the root
|
||||
* doc has index -1). Returns `null` when the anchor cannot be resolved.
|
||||
*/
|
||||
function findAnchorChain(
|
||||
doc: any,
|
||||
opts: InsertOptions,
|
||||
): { node: any; index: number }[] | null {
|
||||
if (!isObject(doc)) return null;
|
||||
|
||||
// DFS by id anywhere in the tree, accumulating the path.
|
||||
if (opts.anchorNodeId != null) {
|
||||
const targetId = opts.anchorNodeId;
|
||||
const search = (
|
||||
node: any,
|
||||
index: number,
|
||||
trail: { node: any; index: number }[],
|
||||
): { node: any; index: number }[] | null => {
|
||||
if (!isObject(node)) return null;
|
||||
const here = [...trail, { node, index }];
|
||||
if (matchesId(node, targetId)) return here;
|
||||
if (Array.isArray(node.content)) {
|
||||
for (let i = 0; i < node.content.length; i++) {
|
||||
const hit = search(node.content[i], i, here);
|
||||
if (hit != null) return hit;
|
||||
}
|
||||
}
|
||||
return null;
|
||||
};
|
||||
return search(doc, -1, []);
|
||||
}
|
||||
|
||||
// By text: only top-level blocks are scanned (same rule as the JSON path).
|
||||
if (opts.anchorText != null && Array.isArray(doc.content)) {
|
||||
for (let i = 0; i < doc.content.length; i++) {
|
||||
if (blockPlainText(doc.content[i]).includes(opts.anchorText)) {
|
||||
return [
|
||||
{ node: doc, index: -1 },
|
||||
{ node: doc.content[i], index: i },
|
||||
];
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return null;
|
||||
}
|
||||
|
||||
/** Options controlling where `insertNodeRelative` places the new node. */
|
||||
export interface InsertOptions {
|
||||
position: "before" | "after" | "append";
|
||||
/** Resolve the anchor by node id anywhere in the tree (preferred). */
|
||||
anchorNodeId?: string;
|
||||
/** Fallback: first TOP-LEVEL block whose plain text includes this string. */
|
||||
anchorText?: string;
|
||||
}
|
||||
|
||||
/**
|
||||
* Insert a deep clone of `node` relative to an anchor.
|
||||
*
|
||||
* - position "append": push the node onto the top-level `doc.content`.
|
||||
* - position "before"/"after": locate the anchor and splice the node into the
|
||||
* anchor's parent `content` array immediately before / after it.
|
||||
*
|
||||
* Anchor resolution for before/after:
|
||||
* - if `anchorNodeId` is given, find the node with `attrs.id === anchorNodeId`
|
||||
* anywhere in the tree (recursive);
|
||||
* - otherwise, if `anchorText` is given, scan only TOP-LEVEL `doc.content`
|
||||
* blocks and pick the first whose `blockPlainText` includes `anchorText`.
|
||||
*
|
||||
* Operates on a clone of `doc`; returns `{ doc, inserted }`. `inserted` is
|
||||
* false when the anchor could not be resolved (the doc is returned unchanged
|
||||
* apart from being cloned).
|
||||
*/
|
||||
export function insertNodeRelative(
|
||||
doc: any,
|
||||
node: any,
|
||||
opts: InsertOptions,
|
||||
): { doc: any; inserted: boolean } {
|
||||
const out = clone(doc);
|
||||
const fresh = clone(node);
|
||||
|
||||
// Defensive: stay null-safe like the other exports — a missing opts means
|
||||
// there is nothing actionable to do.
|
||||
if (!isObject(opts)) return { doc: out, inserted: false };
|
||||
|
||||
const isStructural = isObject(node) && STRUCTURAL_TYPES.has(node.type);
|
||||
|
||||
// "append": top-level push.
|
||||
if (opts.position === "append") {
|
||||
// Structural table nodes (tableRow/tableCell/tableHeader) cannot live at the
|
||||
// top level — appending one would produce invalid nesting.
|
||||
if (isStructural) {
|
||||
throw new Error(
|
||||
`insert_node: cannot append a ${node.type} at the top level; use ` +
|
||||
`position before/after with an anchor inside the target table`,
|
||||
);
|
||||
}
|
||||
if (isObject(out)) {
|
||||
if (!Array.isArray(out.content)) out.content = [];
|
||||
out.content.push(fresh);
|
||||
return { doc: out, inserted: true };
|
||||
}
|
||||
return { doc: out, inserted: false };
|
||||
}
|
||||
|
||||
const offset = opts.position === "after" ? 1 : 0;
|
||||
|
||||
// Structural insert (before/after a tableRow/tableCell/tableHeader): splice
|
||||
// into the nearest enclosing table/tableRow rather than the anchor's direct
|
||||
// parent, so the row/cell lands at the correct level of the table.
|
||||
if (isStructural) {
|
||||
const containerType = REQUIRED_CONTAINER[node.type];
|
||||
const chain = findAnchorChain(out, opts);
|
||||
// Anchor not resolved at all — keep the existing "anchor not found" path.
|
||||
if (chain == null) return { doc: out, inserted: false };
|
||||
|
||||
// Find the DEEPEST ancestor (including the anchor itself) of the required
|
||||
// container type.
|
||||
let containerIdx = -1;
|
||||
for (let i = chain.length - 1; i >= 0; i--) {
|
||||
if (isObject(chain[i].node) && chain[i].node.type === containerType) {
|
||||
containerIdx = i;
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
if (containerIdx === -1) {
|
||||
throw new Error(
|
||||
`insert_node: cannot insert a ${node.type} here — the anchor is not ` +
|
||||
`inside a ${containerType}. Anchor on a cell's text or a block id ` +
|
||||
`that lives inside the target table.`,
|
||||
);
|
||||
}
|
||||
|
||||
const container = chain[containerIdx].node;
|
||||
if (!Array.isArray(container.content)) container.content = [];
|
||||
|
||||
if (containerIdx === chain.length - 1) {
|
||||
// The matched container IS the anchor node itself (e.g. anchorText
|
||||
// resolved to the table block): append/prepend within it.
|
||||
const at = opts.position === "after" ? container.content.length : 0;
|
||||
container.content.splice(at, 0, fresh);
|
||||
} else {
|
||||
// The immediate child on the path leading to the anchor is the row/cell
|
||||
// to splice next to.
|
||||
const enclosingChildIndex = chain[containerIdx + 1].index;
|
||||
container.content.splice(enclosingChildIndex + offset, 0, fresh);
|
||||
}
|
||||
return { doc: out, inserted: true };
|
||||
}
|
||||
|
||||
// Resolve by id anywhere in the tree: splice into the parent content array.
|
||||
if (opts.anchorNodeId != null) {
|
||||
let inserted = false;
|
||||
const walkContent = (content: any[]): void => {
|
||||
for (let i = 0; i < content.length; i++) {
|
||||
const child = content[i];
|
||||
if (matchesId(child, opts.anchorNodeId as string)) {
|
||||
content.splice(i + offset, 0, fresh);
|
||||
inserted = true;
|
||||
return;
|
||||
}
|
||||
if (isObject(child) && Array.isArray(child.content)) {
|
||||
walkContent(child.content);
|
||||
if (inserted) return;
|
||||
}
|
||||
}
|
||||
};
|
||||
if (isObject(out) && Array.isArray(out.content)) {
|
||||
walkContent(out.content);
|
||||
}
|
||||
return { doc: out, inserted };
|
||||
}
|
||||
|
||||
// Resolve by text: only top-level doc.content blocks are scanned.
|
||||
if (opts.anchorText != null && isObject(out) && Array.isArray(out.content)) {
|
||||
for (let i = 0; i < out.content.length; i++) {
|
||||
if (blockPlainText(out.content[i]).includes(opts.anchorText)) {
|
||||
out.content.splice(i + offset, 0, fresh);
|
||||
return { doc: out, inserted: true };
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return { doc: out, inserted: false };
|
||||
}
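The structural branch is the least obvious part of the function above, so here is a sketch of just the container lookup. It works on a chain of `{ type, index }` entries instead of live nodes; `spliceTarget` and `ChainEntry` are hypothetical names, and where the real function throws, this sketch returns `null`.

```typescript
// One ancestor-chain entry: node type plus its index in the parent's content.
interface ChainEntry {
  type: string;
  index: number;
}

const REQUIRED: Record<string, string> = {
  tableRow: "table",
  tableCell: "tableRow",
  tableHeader: "tableRow",
};

// Find the deepest ancestor of the required container type and the position
// inside it where a new row/cell should be spliced.
function spliceTarget(
  chain: ChainEntry[],
  nodeType: string,
  position: "before" | "after",
): { depth: number; at: number | "start" | "end" } | null {
  const containerType = REQUIRED[nodeType];
  for (let depth = chain.length - 1; depth >= 0; depth--) {
    if (chain[depth].type !== containerType) continue;
    if (depth === chain.length - 1) {
      // The anchor IS the container: insert at its start or end.
      return { depth, at: position === "after" ? "end" : "start" };
    }
    // Otherwise splice next to the child on the path toward the anchor.
    const childIndex = chain[depth + 1].index;
    return { depth, at: childIndex + (position === "after" ? 1 : 0) };
  }
  return null;
}

// Anchor: a paragraph in cell 0 of row 1 of the table at top-level index 2.
const chain: ChainEntry[] = [
  { type: "doc", index: -1 },
  { type: "table", index: 2 },
  { type: "tableRow", index: 1 },
  { type: "tableCell", index: 0 },
  { type: "paragraph", index: 0 },
];

const rowTarget = spliceTarget(chain, "tableRow", "after");
const cellTarget = spliceTarget(chain, "tableCell", "before");
```

With the same paragraph anchor, a new row lands in the table after row 1, while a new cell lands in row 1 before cell 0; neither ends up inside the cell that holds the anchor.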
|
||||
|
||||
// ===========================================================================
|
||||
// Table editing helpers
|
||||
//
|
||||
// A Docmost table is a ProseMirror subtree with NO ids on the structural nodes:
|
||||
// table -> { type:"table", content:[tableRow...] }
|
||||
// row -> { type:"tableRow", content:[tableCell|tableHeader...] }
|
||||
// cell -> { type:"tableCell"|"tableHeader", attrs:{colspan,rowspan,colwidth},
|
||||
// content:[paragraph...] }
|
||||
// para -> { type:"paragraph", attrs:{id,indent}, content:[textNode...] }
|
||||
// Only paragraphs/headings carry an `attrs.id`, so a cell is addressed via the
|
||||
// id of the paragraph inside it. The helpers below all operate on a DEEP CLONE
|
||||
// of the input doc (via `clone`) and never mutate their inputs.
|
||||
// ===========================================================================
|
||||
|
||||
/**
|
||||
* Collect EVERY `attrs.id` present anywhere in `node` into `used`. Used to seed
|
||||
* `makeFreshId` so generated paragraph ids never collide with existing ones.
|
||||
*/
|
||||
function collectIds(node: any, used: Set<string>): void {
|
||||
if (!isObject(node)) return;
|
||||
if (isObject(node.attrs) && typeof node.attrs.id === "string") {
|
||||
used.add(node.attrs.id);
|
||||
}
|
||||
if (Array.isArray(node.content)) {
|
||||
for (const child of node.content) collectIds(child, used);
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Fresh-id generator: returns a random Docmost-style id (12 chars from
|
||||
* lowercase `a-z0-9`) that is not already in `used`, and records it. On the
|
||||
* rare collision the id is regenerated. Callers rely on uniqueness, not on the
|
||||
* exact string, so randomness is fine — and unlike a module-local counter it
|
||||
* needs no reset and cannot become predictable across calls.
|
||||
*/
|
||||
function makeFreshId(used: Set<string>): string {
|
||||
const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
|
||||
let id: string;
|
||||
do {
|
||||
id = "";
|
||||
for (let i = 0; i < 12; i++) {
|
||||
id += alphabet[Math.floor(Math.random() * alphabet.length)];
|
||||
}
|
||||
} while (used.has(id) || id === "");
|
||||
used.add(id);
|
||||
return id;
|
||||
}
|
||||
|
||||
/**
|
||||
* Resolve a table reference against an ALREADY-CLONED doc and return the LIVE
|
||||
* table node (a reference inside `rootClone`, so the caller may mutate it) plus
|
||||
* its index path. Returns null when no table matches.
|
||||
*
|
||||
* - `#<n>`: the top-level block at index `n`, only if its `type === "table"`.
|
||||
* - otherwise: DFS for the node with `attrs.id === tableRef`, then walk UP its
|
||||
* ancestor chain to the nearest `type === "table"` ancestor.
|
||||
*/
|
||||
function locateTable(
|
||||
rootClone: any,
|
||||
tableRef: string,
|
||||
): { table: any; path: number[] } | null {
|
||||
if (!isObject(rootClone)) return null;
|
||||
|
||||
// "#<n>": index into the top-level content array; must be a table.
|
||||
const indexMatch = typeof tableRef === "string" ? tableRef.match(/^#(\d+)$/) : null;
|
||||
if (indexMatch) {
|
||||
const index = Number(indexMatch[1]);
|
||||
const block = Array.isArray(rootClone.content)
|
||||
? rootClone.content[index]
|
||||
: undefined;
|
||||
if (isObject(block) && block.type === "table") {
|
||||
return { table: block, path: [index] };
|
||||
}
|
||||
return null;
|
||||
}
|
||||
|
||||
// Otherwise: DFS for attrs.id === tableRef, tracking the ancestor chain, then
|
||||
// climb to the nearest enclosing table.
|
||||
const search = (
|
||||
node: any,
|
||||
trail: { node: any; index: number }[],
|
||||
): { table: any; path: number[] } | null => {
|
||||
if (!isObject(node)) return null;
|
||||
if (Array.isArray(node.content)) {
|
||||
for (let i = 0; i < node.content.length; i++) {
|
||||
const child = node.content[i];
|
||||
const here = [...trail, { node: child, index: i }];
|
||||
if (matchesId(child, tableRef)) {
|
||||
// Walk UP to the nearest table ancestor (including the match itself).
|
||||
for (let j = here.length - 1; j >= 0; j--) {
|
||||
if (isObject(here[j].node) && here[j].node.type === "table") {
|
||||
return {
|
||||
table: here[j].node,
|
||||
path: here.slice(0, j + 1).map((e) => e.index),
|
||||
};
|
||||
}
|
||||
}
|
||||
return null; // id found but no enclosing table
|
||||
}
|
||||
const hit = search(child, here);
|
||||
if (hit != null) return hit;
|
||||
}
|
||||
}
|
||||
return null;
|
||||
};
|
||||
|
||||
return search(rootClone, []);
|
||||
}
|
||||
|
||||
/** Build the plain-text → single-paragraph cell content used by all writers. */
|
||||
function makeCellParagraph(id: string, text: string): any {
|
||||
return {
|
||||
type: "paragraph",
|
||||
attrs: { id, indent: 0 },
|
||||
// Empty string → a paragraph with an empty content array.
|
||||
content: text ? [{ type: "text", text }] : [],
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Read a table as a matrix. Returns null when `tableRef` resolves to no table.
|
||||
*
|
||||
* - `rows`/`cols`: the table's row count and the column count of its FIRST row.
|
||||
* Tables may be ragged (rows of differing length), so `cols` reflects only
|
||||
* row 0; use the per-row length of `cells`/`cellIds` for each row's actual
|
||||
* width.
|
||||
* - `cells`: `string[][]` of each cell's `blockPlainText`.
|
||||
* - `cellIds`: `(string|null)[][]` of each cell's FIRST paragraph id (or null),
|
||||
* so callers can `patch_node` a cell for rich-formatted edits.
|
||||
* - `path`: index path of the table within the doc.
|
||||
*/
|
||||
export function readTable(
|
||||
doc: any,
|
||||
tableRef: string,
|
||||
): {
|
||||
rows: number;
|
||||
cols: number;
|
||||
cells: string[][];
|
||||
cellIds: (string | null)[][];
|
||||
path: number[];
|
||||
} | null {
|
||||
const root = clone(doc);
|
||||
const located = locateTable(root, tableRef);
|
||||
if (located == null) return null;
|
||||
const { table, path } = located;
|
||||
|
||||
const rowNodes = Array.isArray(table.content) ? table.content : [];
|
||||
const rows = rowNodes.length;
|
||||
const cols = rowNodes[0]?.content?.length ?? 0;
|
||||
|
||||
const cells: string[][] = [];
|
||||
const cellIds: (string | null)[][] = [];
|
||||
for (const rowNode of rowNodes) {
|
||||
const cellNodes = Array.isArray(rowNode?.content) ? rowNode.content : [];
|
||||
const rowText: string[] = [];
|
||||
const rowIds: (string | null)[] = [];
|
||||
for (const cellNode of cellNodes) {
|
||||
rowText.push(blockPlainText(cellNode));
|
||||
// The cell's first paragraph carries the id used for patch_node.
|
||||
const firstPara = Array.isArray(cellNode?.content)
|
||||
? cellNode.content[0]
|
||||
: undefined;
|
||||
const id =
|
||||
isObject(firstPara) && isObject(firstPara.attrs)
|
||||
? firstPara.attrs.id ?? null
|
||||
: null;
|
||||
rowIds.push(id);
|
||||
}
|
||||
cells.push(rowText);
|
||||
cellIds.push(rowIds);
|
||||
}
|
||||
|
||||
return { rows, cols, cells, cellIds, path };
|
||||
}
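The ragged-table caveat can be demonstrated with a reduced matrix reader. `tableMatrix`, `plainText`, `cell` and `PMNode` are names assumed for this example, and `cellIds`/`path` are left out.

```typescript
// Simplified node shape, assumed for illustration only.
interface PMNode {
  type?: string;
  text?: string;
  content?: PMNode[];
}

function plainText(node: PMNode): string {
  let out = typeof node.text === "string" ? node.text : "";
  for (const child of node.content ?? []) out += plainText(child);
  return out;
}

// `cols` mirrors row 0 only; per-row widths come from `cells[r].length`.
function tableMatrix(table: PMNode): { rows: number; cols: number; cells: string[][] } {
  const rowNodes = Array.isArray(table.content) ? table.content : [];
  const cells = rowNodes.map((row) => (row.content ?? []).map((c) => plainText(c)));
  return { rows: rowNodes.length, cols: rowNodes[0]?.content?.length ?? 0, cells };
}

// Helper that builds a cell holding one text paragraph.
const cell = (type: string, text: string): PMNode => ({
  type,
  content: [{ type: "paragraph", content: [{ type: "text", text }] }],
});

const ragged: PMNode = {
  type: "table",
  content: [
    { type: "tableRow", content: [cell("tableHeader", "Name"), cell("tableHeader", "Role")] },
    {
      type: "tableRow",
      content: [cell("tableCell", "Ada"), cell("tableCell", "Engineer"), cell("tableCell", "extra")],
    },
  ],
};

const matrix = tableMatrix(ragged);
```

In this example `matrix.cols` reports the header row's width while the second row is one cell wider, which is why callers should rely on each row's own length.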
|
||||
|
||||
/**
|
||||
* Insert a row of plain-text cells into a table. Returns `{ doc, inserted }`.
|
||||
*
|
||||
* The row is padded to the width of the table's WIDEST row (`cells[i] ?? ""`); supplying
|
||||
* MORE cells than columns throws. Each new cell copies `colwidth` for its
|
||||
* column from the header row when present, gets a fresh-id paragraph, and a
|
||||
* `colspan: 1, rowspan: 1` attrs object. `index` (when an integer in `[0, rows]`) splices
|
||||
* the row there; otherwise the row is appended at the end.
|
||||
*/
|
||||
export function insertTableRow(
|
||||
doc: any,
|
||||
tableRef: string,
|
||||
cells: string[],
|
||||
index?: number,
|
||||
): { doc: any; inserted: boolean } {
|
||||
const out = clone(doc);
|
||||
const located = locateTable(out, tableRef);
|
||||
if (located == null) return { doc: out, inserted: false };
|
||||
const { table } = located;
|
||||
|
||||
if (!Array.isArray(table.content)) table.content = [];
|
||||
const rows = table.content.length;
|
||||
const headerRow = table.content[0];
|
||||
const headerCells = Array.isArray(headerRow?.content) ? headerRow.content : [];
|
||||
|
||||
// Column count is the WIDEST existing row, so the guard below stays
|
||||
// meaningful for ragged tables and the new row matches the table's width.
|
||||
// Fall back to the supplied cell count only when the table has no rows.
|
||||
let colCount = 0;
|
||||
for (const r of table.content) {
|
||||
if (isObject(r) && Array.isArray(r.content)) colCount = Math.max(colCount, r.content.length);
|
||||
}
|
||||
if (colCount === 0) colCount = Array.isArray(cells) ? cells.length : 0;
|
||||
|
||||
if (Array.isArray(cells) && cells.length > colCount) {
|
||||
throw new Error(
|
||||
`table_insert_row: got ${cells.length} cell(s) but the table has ${colCount} column(s)`,
|
||||
);
|
||||
}
|
||||
|
||||
// Resolve the landing index up front so the cell-type decision and the splice
|
||||
// below agree: a valid integer in [0, rows] splices there, else we append.
|
||||
const landingIndex =
|
||||
typeof index === "number" && Number.isInteger(index) && index >= 0 && index <= rows
|
||||
? index
|
||||
: rows;
|
||||
|
||||
// Seed the id generator with every id already in the doc so the new cell
|
||||
// paragraph ids are unique within the whole document.
|
||||
const used = new Set<string>();
|
||||
collectIds(out, used);
|
||||
|
||||
const newCells: any[] = [];
|
||||
for (let i = 0; i < colCount; i++) {
|
||||
const text = (Array.isArray(cells) ? cells[i] : undefined) ?? "";
|
||||
const attrs: Record<string, any> = { colspan: 1, rowspan: 1 };
|
||||
// Copy this column's colwidth from the header row's cell when present.
|
||||
const colwidth = headerCells[i]?.attrs?.colwidth;
|
||||
if (colwidth !== undefined) attrs.colwidth = colwidth;
|
||||
// A row landing at index 0 becomes the new header row, so inherit the
|
||||
// current header cell's type per column (Docmost uses "tableHeader" there);
|
||||
// every other position is a plain data cell.
|
||||
const cellType = landingIndex === 0 ? headerCells[i]?.type ?? "tableCell" : "tableCell";
|
||||
newCells.push({
|
||||
type: cellType,
|
||||
attrs,
|
||||
content: [makeCellParagraph(makeFreshId(used), text)],
|
||||
});
|
||||
}
|
||||
|
||||
const newRow = { type: "tableRow", content: newCells };
|
||||
|
||||
// Splice at the resolved landing index (append when index was omitted/invalid).
|
||||
table.content.splice(landingIndex, 0, newRow);
|
||||
|
||||
return { doc: out, inserted: true };
|
||||
}
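The padding and landing-index rules can be isolated from the tree manipulation. The sketch below takes the existing row widths instead of a table node; `planRow` is a hypothetical helper written for this example, and it uses `Math.floor` as a stand-in for the `Number.isInteger` check in the function above.

```typescript
// Decide the new row's cell texts and where it lands, given existing row widths.
function planRow(
  rowWidths: number[],
  cells: string[],
  index?: number,
): { texts: string[]; landingIndex: number } {
  const rows = rowWidths.length;
  // Column count is the widest existing row; an empty table takes the input's width.
  let colCount = 0;
  for (const w of rowWidths) colCount = Math.max(colCount, w);
  if (colCount === 0) colCount = cells.length;
  if (cells.length > colCount) {
    throw new Error(`got ${cells.length} cell(s) but the table has ${colCount} column(s)`);
  }
  // Pad missing cells with empty strings.
  const texts: string[] = [];
  for (let i = 0; i < colCount; i++) texts.push(cells[i] ?? "");
  // A valid integer in [0, rows] is honoured; anything else appends.
  const valid =
    typeof index === "number" && Math.floor(index) === index && index >= 0 && index <= rows;
  return { texts, landingIndex: valid ? (index as number) : rows };
}

const padded = planRow([3, 2], ["a"]); // ragged table, one cell supplied
const atTop = planRow([3, 2], ["a", "b"], 0); // would become the new first row
const clamped = planRow([2], ["a"], 5); // out-of-range index falls back to append
```

Falling back to append on an invalid index (rather than throwing) matches the doc comment above, whereas supplying too many cells is treated as an error.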
|
||||
|
||||
/**
|
||||
* Delete the row at 0-based `index` from a table. Returns `{ doc, deleted }`.
|
||||
* `deleted` is false only when the table cannot be located. Throws on an
|
||||
* out-of-range index, and refuses to delete the table's only row.
|
||||
*/
|
||||
export function deleteTableRow(
|
||||
doc: any,
|
||||
tableRef: string,
|
||||
index: number,
|
||||
): { doc: any; deleted: boolean } {
|
||||
const out = clone(doc);
|
||||
const located = locateTable(out, tableRef);
|
||||
if (located == null) return { doc: out, deleted: false };
|
||||
const { table } = located;
|
||||
|
||||
if (!Array.isArray(table.content)) table.content = [];
|
||||
const rows = table.content.length;
|
||||
|
||||
if (!Number.isInteger(index) || index < 0 || index >= rows) {
|
||||
throw new Error(
|
||||
`table_delete_row: row index ${index} out of range (table has ${rows} row(s))`,
|
||||
);
|
||||
}
|
||||
if (rows <= 1) {
|
||||
throw new Error(
|
||||
"table_delete_row: refusing to delete the only row of the table",
|
||||
);
|
||||
}
|
||||
|
||||
table.content.splice(index, 1);
|
||||
return { doc: out, deleted: true };
|
||||
}
|
||||
|
||||
/**
 * Set the plain-text content of cell `[row, col]` (0-based) to `text`. Returns
 * `{ doc, updated }`; `updated` is false only when the table cannot be located.
 * Throws when `row`/`col` is out of range. The cell's own attrs (colspan/
 * rowspan/colwidth) are preserved; its content becomes a single text paragraph
 * that reuses the cell's existing first-paragraph id when present, else a fresh
 * one.
 */
export function updateTableCell(
  doc: any,
  tableRef: string,
  row: number,
  col: number,
  text: string,
): { doc: any; updated: boolean } {
  const out = clone(doc);
  const located = locateTable(out, tableRef);
  if (located == null) return { doc: out, updated: false };
  const { table } = located;

  const rowNodes = Array.isArray(table.content) ? table.content : [];
  const rows = rowNodes.length;
  const rowNode = rowNodes[row];
  const cols = isObject(rowNode) && Array.isArray(rowNode.content)
    ? rowNode.content.length
    : 0;

  if (
    !Number.isInteger(row) ||
    row < 0 ||
    row >= rows ||
    !Number.isInteger(col) ||
    col < 0 ||
    col >= cols
  ) {
    throw new Error(`table_update_cell: cell [${row},${col}] out of range`);
  }

  const cellNode = rowNode.content[col];
  // Reuse the cell's existing first-paragraph id, or mint a fresh unique one.
  const existingPara = Array.isArray(cellNode?.content)
    ? cellNode.content[0]
    : undefined;
  let id =
    isObject(existingPara) && isObject(existingPara.attrs)
      ? existingPara.attrs.id
      : undefined;
  if (typeof id !== "string" || id.length === 0) {
    const used = new Set<string>();
    collectIds(out, used);
    id = makeFreshId(used);
  }

  cellNode.content = [makeCellParagraph(id, text)];
  return { doc: out, updated: true };
}
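The table helpers above share one contract: clone first, validate the index, refuse to empty the table, and only then mutate the clone. The following is a minimal standalone sketch of that contract, not the package's implementation. `PMNode`, `deleteRowSketch`, and `cell` are hypothetical names introduced for illustration, and the sketch operates on a table node directly instead of going through `locateTable`, whose implementation is not shown here.

```typescript
// Hypothetical, simplified node shape; the real code types nodes as `any`.
interface PMNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
  text?: string;
}

// Sketch: delete row `index` from a table node without touching the input.
function deleteRowSketch(table: PMNode, index: number): PMNode {
  const out: PMNode = JSON.parse(JSON.stringify(table)); // deep clone
  const rows = out.content ?? [];
  if (!Number.isInteger(index) || index < 0 || index >= rows.length) {
    throw new Error(
      `row index ${index} out of range (table has ${rows.length} row(s))`,
    );
  }
  if (rows.length <= 1) {
    throw new Error("refusing to delete the only row of the table");
  }
  rows.splice(index, 1);
  out.content = rows;
  return out;
}

// Helper that builds a one-paragraph table cell (illustrative only).
const cell = (text: string): PMNode => ({
  type: "tableCell",
  content: [{ type: "paragraph", content: [{ type: "text", text }] }],
});

const table: PMNode = {
  type: "table",
  content: [
    { type: "tableRow", content: [cell("a"), cell("b")] },
    { type: "tableRow", content: [cell("c"), cell("d")] },
  ],
};

const result = deleteRowSketch(table, 0);
console.log(result.content?.length, table.content?.length);
```

Because the clone happens before any validation, a thrown error leaves the caller's document untouched, which is the property the real helpers rely on.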
39
packages/mcp/src/lib/page-lock.ts
Normal file
@@ -0,0 +1,39 @@
/**
 * Per-page async mutex.
 *
 * Content writes over the collaboration websocket must never overlap for the
 * same page: two concurrent full-document replaces would race on the live Yjs
 * fragment. We serialize them with a per-pageId promise chain: each new
 * operation waits for the previous one on that page to settle (success or
 * failure) before it runs. Different pages never block each other.
 */

const chains = new Map<string, Promise<unknown>>();

// The returned promise carries the real result/rejection of `fn` and MUST be
// awaited/handled by the caller; only the internal chaining tail swallows
// errors (purely to gate ordering).
export function withPageLock<T>(
  pageId: string,
  fn: () => Promise<T>,
): Promise<T> {
  // Wait for the previous op on this page; swallow its error so a failure does
  // not poison the queue for the next caller.
  const prev = (chains.get(pageId) ?? Promise.resolve()).catch(() => {});
  const run = prev.then(fn);

  // The tail used for chaining must also swallow errors (it only gates order).
  const tail = run.catch(() => {});
  chains.set(pageId, tail);

  // Drop the map entry once this op is the tail and has settled, to avoid an
  // unbounded map of resolved promises.
  tail.then(() => {
    if (chains.get(pageId) === tail) {
      chains.delete(pageId);
    }
  });

  // Callers get the real result/rejection of fn.
  return run;
}
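The ordering guarantees described in the header comment can be observed with a small standalone experiment. The sketch below restates the promise chain compactly so it runs on its own; `sleep`, `demo`, `op`, and the page ids are hypothetical names made up for the illustration, and the timings are arbitrary.

```typescript
// Compact restatement of the per-page promise chain, so the sketch is
// self-contained. The logic mirrors withPageLock in page-lock.ts.
const chains = new Map<string, Promise<unknown>>();

function withPageLock<T>(pageId: string, fn: () => Promise<T>): Promise<T> {
  const prev = (chains.get(pageId) ?? Promise.resolve()).catch(() => {});
  const run = prev.then(fn);
  const tail = run.catch(() => {});
  chains.set(pageId, tail);
  tail.then(() => {
    if (chains.get(pageId) === tail) chains.delete(pageId);
  });
  return run;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical workload: records start/end events so ordering is observable.
async function demo(): Promise<string[]> {
  const events: string[] = [];
  const op = (name: string, ms: number, fail = false) => async () => {
    events.push(`start ${name}`);
    await sleep(ms);
    events.push(`end ${name}`);
    if (fail) throw new Error(`${name} failed`);
    return name;
  };

  const a = withPageLock("page-1", op("A", 30, true)); // slow, then rejects
  const b = withPageLock("page-1", op("B", 5)); // same page: waits for A
  const c = withPageLock("page-2", op("C", 1)); // other page: not blocked

  // The caller still sees A's rejection; B runs regardless of it.
  const aOutcome = await a.then(
    () => "fulfilled",
    () => "rejected",
  );
  const bResult = await b;
  await c;
  events.push(`A ${aOutcome}`, `B returned ${bResult}`);
  return events;
}

demo().then((events) => console.log(events.join("\n")));
```

In this sketch B starts only after A has ended even though A rejects, while C starts without waiting for A. That matches the two properties the comment states: a failure does not poison the queue, and different pages never block each other.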
477
packages/mcp/src/lib/transforms.ts
Normal file
@@ -0,0 +1,477 @@
/**
 * Pure, network-free transform primitives for a ProseMirror/TipTap document
 * tree, plus one higher-level orchestration (commentsToFootnotes).
 *
 * A ProseMirror node here is a plain JSON object of the shape produced by
 * Docmost: `{ type, attrs?, content?, text?, marks? }`. Children live in the
 * `content` array; callouts, tables, lists all hold their children in
 * `content`, so a single recursive walk reaches them all.
 *
 * Conventions (matching node-ops.ts):
 *  - functions that produce a new document deep-clone their input and return a
 *    `{ doc, ... }` object; the caller's objects are never mutated.
 *  - functions are defensively null-safe.
 *  - `marks` arrays are preserved verbatim when fragments are split/reordered.
 */

import { blockPlainText } from "./node-ops.js";

/** Deep-clone a JSON-serializable value without mutating the original. */
function clone<T>(value: T): T {
  if (typeof structuredClone === "function") {
    return structuredClone(value);
  }
  // Fallback for environments without structuredClone.
  return JSON.parse(JSON.stringify(value)) as T;
}

/** True if `value` is a non-null object (and not an array). */
function isObject(value: any): value is Record<string, any> {
  return value != null && typeof value === "object" && !Array.isArray(value);
}

/**
 * Plain text of a node (re-export of node-ops' blockPlainText so transform
 * authors have a single import surface). Recurses through nested content.
 */
export function blockText(node: any): string {
  return blockPlainText(node);
}

/**
 * Depth-first visit of every node in the tree, including the root and the
 * nested content of callouts, tables, lists, etc. `fn` is called once per node.
 * Null-safe: a nullish or non-object node is ignored.
 */
export function walk(node: any, fn: (node: any) => void): void {
  if (!isObject(node)) return;
  fn(node);
  if (Array.isArray(node.content)) {
    for (const child of node.content) {
      walk(child, fn);
    }
  }
}

/**
 * Find the FIRST node (depth-first) matching `predicate`, anywhere in the tree.
 * Works even when the node carries no `attrs.id` (it searches the raw tree, not
 * an id index). Returns the live node reference inside `doc` (NOT a clone), or
 * null when nothing matches. Typical use: `getList(doc, n => n.type ===
 * "orderedList")`.
 */
export function getList(
  doc: any,
  predicate: (node: any) => boolean,
): any | null {
  let found: any | null = null;
  walk(doc, (node) => {
    if (found == null && predicate(node)) {
      found = node;
    }
  });
  return found;
}

/** Options for insertMarkerAfter. */
export interface InsertMarkerOptions {
  /**
   * Limit the search to TOP-LEVEL blocks with index < beforeBlock. Used to keep
   * footnote markers in the body and out of the notes section.
   */
  beforeBlock?: number;
}

/**
 * Insert `marker` as a PLAIN (unmarked) text run right after the first
 * occurrence of `anchor`.
 *
 * The text run that contains the END of the anchor is SPLIT at the anchor end,
 * so all existing marks (links, bold, ...) on the surrounding text are
 * preserved, while the inserted marker run carries NO marks. The marker is
 * inserted as a leading-space-padded run (`" " + marker`) so it visually
 * separates from the preceding word.
 *
 * The anchor is matched against the concatenated plain text of each top-level
 * block (so an anchor that spans several text/mark runs still matches). The
 * insertion happens inside the inline content array that holds the anchor's
 * final character.
 *
 * Operates on a clone of `doc`; returns `{ doc, inserted }`. `inserted` is
 * false when the anchor text was not found in any in-scope block.
 */
export function insertMarkerAfter(
  doc: any,
  anchor: string,
  marker: string,
  opts: InsertMarkerOptions = {},
): { doc: any; inserted: boolean } {
  const out = clone(doc);
  if (!isObject(out) || !Array.isArray(out.content) || !anchor) {
    return { doc: out, inserted: false };
  }

  const limit =
    typeof opts.beforeBlock === "number"
      ? Math.min(opts.beforeBlock, out.content.length)
      : out.content.length;

  for (let b = 0; b < limit; b++) {
    const block = out.content[b];
    if (!isObject(block)) continue;
    // Quick reject: skip blocks whose plain text does not contain the anchor.
    const anchorStart = blockPlainText(block).indexOf(anchor);
    if (anchorStart < 0) continue;

    // Walk the inline content arrays inside this block, tracking a running
    // character offset so we can locate the inline array + text run that holds
    // the END of the anchor's first occurrence.
    let inserted = false;
    let offset = 0; // characters of plain text seen so far in this block
    const anchorEnd = anchorStart + anchor.length;

    // Recurse into inline-bearing containers (paragraph, heading, table cell,
    // callout child paragraphs, ...). We only split inside an array of inline
    // nodes (text/inline atoms); the FIRST array whose cumulative range covers
    // anchorEnd receives the split + marker.
    const visit = (container: any): void => {
      if (inserted || !isObject(container) || !Array.isArray(container.content)) {
        return;
      }
      const inline = container.content;
      // Detect whether this array is an inline array (contains text nodes).
      const hasText = inline.some(
        (n: any) => isObject(n) && n.type === "text",
      );
      if (hasText) {
        for (let i = 0; i < inline.length; i++) {
          const n = inline[i];
          const len = isObject(n) ? blockPlainText(n).length : 0;
          const runStart = offset;
          const runEnd = offset + len;
          // The run that contains the anchor end (anchorEnd lands inside this
          // run, i.e. runStart < anchorEnd <= runEnd) is the split point.
          if (
            !inserted &&
            isObject(n) &&
            n.type === "text" &&
            typeof n.text === "string" &&
            anchorEnd > runStart &&
            anchorEnd <= runEnd
          ) {
            const cut = anchorEnd - runStart; // split index within this text run
            const before = n.text.slice(0, cut);
            const after = n.text.slice(cut);
            const marks = Array.isArray(n.marks) ? n.marks : [];
            const parts: any[] = [];
            if (before.length > 0) {
              parts.push({ ...n, text: before, marks: [...marks] });
            }
            // Marker is a PLAIN run: no marks copied. Leading space separates it.
            parts.push({ type: "text", text: " " + marker });
            if (after.length > 0) {
              parts.push({ ...n, text: after, marks: [...marks] });
            }
            inline.splice(i, 1, ...parts);
            inserted = true;
            return;
          }
          offset = runEnd;
        }
      } else {
        // Not an inline array: recurse into children (e.g. callout -> paragraph).
        for (const child of inline) {
          visit(child);
          if (inserted) return;
        }
      }
    };

    visit(block);
    if (inserted) {
      return { doc: out, inserted: true };
    }
    // If the block matched in plain text but we could not split (e.g. anchor
    // lands inside an atom), fall through to the next block rather than failing.
  }

  return { doc: out, inserted: false };
}

/**
 * In the disclaimer callout, replace a `[1]…[K]` range marker with `[1]…[n]`.
 *
 * Docmost translations use a callout that states the footnote range, e.g.
 * "[1]…[5]". When the number of notes changes, this rewrites the trailing
 * number of any `[1]…[K]` (or `[1]...[K]`, ASCII ellipsis) occurrence found in a
 * callout's text nodes to `[1]…[n]`. Operates on a clone; returns
 * `{ doc, changed }` where `changed` is the number of text nodes rewritten.
 */
export function setCalloutRange(
  doc: any,
  n: number,
): { doc: any; changed: number } {
  const out = clone(doc);
  let changed = 0;
  // Match "[1]" + (… or ...) + "[<digits>]"; rewrite the last number to n.
  const rangeRe = /(\[1\]\s*(?:…|\.\.\.)\s*\[)\d+(\])/g;
  walk(out, (node) => {
    if (node.type === "callout") {
      walk(node, (inner) => {
        if (
          inner.type === "text" &&
          typeof inner.text === "string" &&
          rangeRe.test(inner.text)
        ) {
          rangeRe.lastIndex = 0;
          inner.text = inner.text.replace(rangeRe, `$1${n}$2`);
          changed++;
        }
        rangeRe.lastIndex = 0;
      });
    }
  });
  return { doc: out, changed };
}

/**
 * Generate a short random id for a new block's `attrs.id`. Docmost uses nanoid;
 * a base36 random string is sufficient here (uniqueness within one document).
 */
function freshId(): string {
  return (
    Math.random().toString(36).slice(2, 12) +
    Math.random().toString(36).slice(2, 6)
  );
}

/**
 * Wrap inline ProseMirror nodes in a list item:
 *   { type:"listItem", content:[{ type:"paragraph", attrs:{id}, content: inlineNodes }] }
 * with a fresh random block id on the paragraph. The inline nodes are cloned so
 * the result shares no references with the caller's input.
 */
export function noteItem(inlineNodes: any[]): any {
  const content = Array.isArray(inlineNodes) ? clone(inlineNodes) : [];
  return {
    type: "listItem",
    content: [
      {
        type: "paragraph",
        attrs: { id: freshId() },
        content,
      },
    ],
  };
}

/**
 * Convert a comment's markdown (e.g. `**Lead.** body...`) into inline
 * ProseMirror nodes.
 *
 * A leading `комментарий: ` (case-insensitive) or `N. ` numeric prefix is
 * stripped first. Then a minimal bold-split is applied: a leading
 * `**bold lead**` run becomes a text node with a bold mark, and the remainder
 * becomes a plain text node. This keeps the conversion synchronous (the
 * transform sandbox runs synchronously) and dependency-free; the existing
 * async markdownToProseMirror is intentionally NOT used here.
 */
export function mdToInlineNodes(markdown: string): any[] {
  let md = typeof markdown === "string" ? markdown : "";
  // Strip a leading "комментарий: " prefix (case-insensitive) or a "N. " prefix.
  md = md.replace(/^\s*комментарий\s*:\s*/i, "");
  md = md.replace(/^\s*\d+\.\s+/, "");
  md = md.trim();

  if (md === "") return [];

  const nodes: any[] = [];
  // Leading bold lead: **...** at the very start. Group 1 is the bold text,
  // group 2 is the whitespace that follows it.
  const leadMatch = /^\*\*([^*]+)\*\*(\s*)/.exec(md);
  if (leadMatch) {
    nodes.push({
      type: "text",
      text: leadMatch[1],
      marks: [{ type: "bold" }],
    });
    const rest = md.slice(leadMatch[0].length);
    if (rest.length > 0) {
      // Preserve the separating whitespace that followed the bold lead.
      nodes.push({ type: "text", text: leadMatch[2] + rest });
    }
    return nodes;
  }

  // No bold lead: split any inline **bold** spans out of the text. A string
  // without bold spans comes back as a single plain text node.
  return splitInlineBold(md);
}

/**
 * Split a string with inline `**bold**` spans into text nodes, bolding the
 * spans. Used as the no-lead fallback in mdToInlineNodes.
 */
function splitInlineBold(text: string): any[] {
  const nodes: any[] = [];
  const re = /\*\*([^*]+)\*\*/g;
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    if (m.index > last) {
      nodes.push({ type: "text", text: text.slice(last, m.index) });
    }
    nodes.push({ type: "text", text: m[1], marks: [{ type: "bold" }] });
    last = m.index + m[0].length;
  }
  if (last < text.length) {
    nodes.push({ type: "text", text: text.slice(last) });
  }
  return nodes.length > 0 ? nodes : [{ type: "text", text }];
}

/** Options for commentsToFootnotes. */
export interface CommentsToFootnotesOptions {
  /** Heading text under which the notes orderedList lives. */
  notesHeading?: string;
}

/** A comment shape as returned by DocmostClient.listComments. */
export interface FootnoteComment {
  id: string;
  content: string;
  selection?: string | null;
  [k: string]: any;
}

/**
 * Turn inline comments into numbered footnotes.
 *
 * For each inline comment that carries a `selection`:
 *   1. insert a placeholder marker (a NUL-delimited "\u0000FN<i>\u0000"
 *      sentinel) right after the selection text in the BODY (before the
 *      notes heading);
 *   2. build a note list item from the comment's markdown content.
 *
 * Then RENUMBER every footnote marker in the body by reading order: existing
 * `[N]` markers and the new "\u0000FN<i>\u0000" placeholders are both replaced by a
 * sequential `[seq]`, and the notes orderedList is reordered so each note lines
 * up with its marker's reading-order position. Finally the disclaimer callout
 * range is synced to the new note count.
 *
 * Returns `{ doc, consumed }` where `consumed` lists the ids of comments that
 * were successfully anchored (their selection was found and a placeholder
 * inserted). Operates on a clone of `doc`.
 */
export function commentsToFootnotes(
  doc: any,
  comments: FootnoteComment[],
  opts: CommentsToFootnotesOptions = {},
): { doc: any; consumed: string[] } {
  let working = clone(doc);
  const notesHeading = opts.notesHeading ?? "Примечания переводчика";

  const top: any[] = Array.isArray(working.content) ? working.content : [];
  const notesIdx = top.findIndex(
    (n) => isObject(n) && n.type === "heading" && blockText(n).trim() === notesHeading,
  );
  if (notesIdx < 0) {
    throw new Error(`heading "${notesHeading}" not found`);
  }
  // The notes orderedList lives at or after the heading.
  const notesList = top
    .slice(notesIdx)
    .find((n) => isObject(n) && n.type === "orderedList");
  if (!notesList) {
    throw new Error("notes orderedList not found");
  }

  const consumed: string[] = [];
  const noteByPh = new Map<string, any>();

  (Array.isArray(comments) ? comments : []).forEach((c, i) => {
    if (!c || !c.selection) return;
    // Collision-proof sentinel delimited by NUL control chars, which never occur
    // in real Docmost prose, so the renumber regex below cannot mistake any body
    // text (e.g. "Press F1 for help", model "FN2") for a placeholder. The NUL is
    // transient: the placeholder round-trips within this function (insertMarkerAfter
    // inserts it, the renumber pass replaces it with "[N]"), so it never persists
    // in a returned/pushed document.
    const ph = `\u0000FN${i}\u0000`;
    // insertMarkerAfter returns a NEW cloned doc; reassign `working` and refresh
    // the `top` / `notesList` references that point into it.
    const r = insertMarkerAfter(working, c.selection.trimEnd(), ph, {
      beforeBlock: notesIdx,
    });
    if (!r.inserted) return;
    working = r.doc;
    noteByPh.set(ph, noteItem(mdToInlineNodes(c.content)));
    consumed.push(c.id);
  });

  // Re-resolve references into the (possibly re-cloned) working doc.
  const top2: any[] = Array.isArray(working.content) ? working.content : [];
  const notesList2 = top2
    .slice(notesIdx)
    .find((n) => isObject(n) && n.type === "orderedList");
  if (!notesList2) {
    throw new Error("notes orderedList not found");
  }

  const oldNotes: any[] = Array.isArray(notesList2.content)
    ? notesList2.content
    : [];
  const newNotes: any[] = [];
  let seq = 0;
  // Match either an existing "[N]" marker or a NUL-delimited "\u0000FN<i>\u0000"
  // placeholder, in reading order across the body (blocks before the notes heading).
  const re = /\[(\d+)\]|\u0000FN(\d+)\u0000/g;
  // Same range regex setCalloutRange uses to detect the disclaimer callout's
  // "[1]…[K]" range; used here to decide whether a top-level callout is the
  // disclaimer (skip) or an ordinary callout (renumber normally).
  const disclaimerRangeRe = /(\[1\]\s*(?:…|\.\.\.)\s*\[)\d+(\])/;
  for (let i = 0; i < notesIdx; i++) {
    // Skip ONLY the disclaimer callout: its "[1]…[K]" range is NOT a footnote
    // marker and is synced separately by setCalloutRange. Renumbering it here
    // would consume note slots and corrupt the sequence. Other top-level
    // callouts may carry legitimate "[N]" body markers and are renumbered.
    if (
      isObject(top2[i]) &&
      top2[i].type === "callout" &&
      disclaimerRangeRe.test(blockText(top2[i]))
    ) {
      continue;
    }
    walk(top2[i], (node) => {
      if (node.type !== "text" || typeof node.text !== "string") return;
      node.text = node.text.replace(re, (_m: string, oldNum: string, phIdx: string) => {
        if (oldNum != null) {
          const note = oldNotes[Number(oldNum) - 1];
          // Every existing body marker MUST map to a real note. An out-of-range
          // marker means the document is internally inconsistent; fail loudly
          // rather than silently dropping the note and desyncing the callout.
          if (note === undefined) {
            throw new Error(
              `footnote [${oldNum}] has no matching note (notes list has ${oldNotes.length} items); document is inconsistent`,
            );
          }
          newNotes.push(note);
        } else {
          newNotes.push(noteByPh.get(`\u0000FN${phIdx}\u0000`));
        }
        return `[${++seq}]`;
      });
    });
  }

  // Reorder the notes list IN PLACE on `working` first, THEN sync the callout
  // range. setCalloutRange clones `working`, so the reordered notes (mutated
  // before the clone) are carried into its result automatically. No null-filter
  // here: marker count and note count must stay exactly equal (the out-of-range
  // guard above guarantees no undefined entry is ever pushed).
  notesList2.content = newNotes;
  const synced = setCalloutRange(working, notesList2.content.length);

  return { doc: synced.doc, consumed };
}
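The renumbering pass at the heart of `commentsToFootnotes` can be easier to follow on a flattened model. The sketch below is illustrative only and assumes a simplified shape that the package does not use: body paragraphs and notes are plain strings instead of ProseMirror nodes, and `renumberSketch` and `pending` are hypothetical names. It keeps the same regex and the same rule that every existing `[N]` marker must map to a real note.

```typescript
// Hypothetical flat model: body paragraphs and notes are plain strings.
function renumberSketch(
  body: string[],
  oldNotes: string[],
  pending: Map<string, string>, // placeholder -> text of the new note
): { body: string[]; notes: string[] } {
  // Existing "[N]" marker, or a NUL-delimited "\u0000FN<i>\u0000" placeholder.
  const re = /\[(\d+)\]|\u0000FN(\d+)\u0000/g;
  const notes: string[] = [];
  let seq = 0;
  const outBody = body.map((text) =>
    text.replace(re, (match: string, oldNum?: string) => {
      if (oldNum !== undefined) {
        // Existing marker: carry its note over to the new position.
        const note = oldNotes[Number(oldNum) - 1];
        if (note === undefined) {
          throw new Error(`footnote [${oldNum}] has no matching note`);
        }
        notes.push(note);
      } else {
        // Placeholder: look up the note built for the new comment.
        const note = pending.get(match);
        if (note === undefined) throw new Error("unknown placeholder");
        notes.push(note);
      }
      return `[${++seq}]`; // sequential number in reading order
    }),
  );
  return { body: outBody, notes };
}

const out = renumberSketch(
  ["Alpha \u0000FN0\u0000 beta [1].", "Gamma [2]."],
  ["old one", "old two"],
  new Map([["\u0000FN0\u0000", "new note"]]),
);
console.log(out.body, out.notes);
```

Here the new note lands first because its placeholder is read first, so the two existing markers shift to `[2]` and `[3]` and the notes list is rebuilt in the same order. The marker count and the note count stay equal by construction, which is what lets the callout range be synced from the notes length afterwards.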
48
packages/mcp/src/stdio.ts
Normal file
@@ -0,0 +1,48 @@
#!/usr/bin/env node
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { createDocmostMcpServer } from "./index.js";

// Standalone stdio entrypoint. This restores the original behavior of the
// package when run as a CLI (`docmost-mcp`): it reads credentials from the
// environment and serves the MCP protocol over stdin/stdout. The factory in
// index.ts stays side-effect-free; all the process/transport lifecycle lives
// here.

const API_URL = process.env.DOCMOST_API_URL;
const EMAIL = process.env.DOCMOST_EMAIL;
const PASSWORD = process.env.DOCMOST_PASSWORD;

if (!API_URL || !EMAIL || !PASSWORD) {
  console.error(
    "Error: DOCMOST_API_URL, DOCMOST_EMAIL, and DOCMOST_PASSWORD environment variables are required.",
  );
  process.exit(1);
}

async function run() {
  // Global safety nets so a stray rejection/exception cannot silently kill
  // the stdio server. Per-tool errors still flow through the SDK and are not
  // affected by these handlers; these only catch errors raised OUTSIDE a tool
  // call (e.g. a transient ws/collab socket "error" event). Such errors must
  // NOT tear down the whole stdio server, so we log only and keep running.
  // Genuine startup failures are still fatal via run().catch(...) below.
  process.on("unhandledRejection", (reason) => {
    console.error("Unhandled promise rejection:", reason);
  });
  process.on("uncaughtException", (error) => {
    console.error("Uncaught exception:", error);
  });

  const server = createDocmostMcpServer({
    apiUrl: API_URL!,
    email: EMAIL!,
    password: PASSWORD!,
  });
  const transport = new StdioServerTransport();
  await server.connect(transport);
}

run().catch((error) => {
  console.error("Fatal error running server:", error);
  process.exit(1);
});
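The entrypoint validates its three environment variables inline and exits on failure. One possible variation, shown here only as a sketch, moves that check into a pure function so it can be exercised without spawning a process. `StdioConfig` and `readStdioConfig` are hypothetical names that do not exist in the package, and the sample values are made up.

```typescript
// Hypothetical config shape mirroring the three variables stdio.ts reads.
interface StdioConfig {
  apiUrl: string;
  email: string;
  password: string;
}

type ConfigResult =
  | { ok: true; config: StdioConfig }
  | { ok: false; missing: string[] };

// Returns the parsed config, or the names of the variables that are unset.
// Empty strings count as missing, matching the `!API_URL` style of check.
function readStdioConfig(env: Record<string, string | undefined>): ConfigResult {
  const required = ["DOCMOST_API_URL", "DOCMOST_EMAIL", "DOCMOST_PASSWORD"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) return { ok: false, missing };
  return {
    ok: true,
    config: {
      apiUrl: env.DOCMOST_API_URL as string,
      email: env.DOCMOST_EMAIL as string,
      password: env.DOCMOST_PASSWORD as string,
    },
  };
}

// Sample values are placeholders, not real endpoints or credentials.
const good = readStdioConfig({
  DOCMOST_API_URL: "http://localhost:3000/api",
  DOCMOST_EMAIL: "bot@example.com",
  DOCMOST_PASSWORD: "secret",
});
const bad = readStdioConfig({ DOCMOST_EMAIL: "bot@example.com" });
console.log(good.ok, bad.ok);
```

With this shape the CLI would keep the process-level side effects (logging and `process.exit`) in one place and pass `process.env` in, while the error message could name exactly which variables are absent.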
474
packages/mcp/test-e2e.mjs
Normal file
@@ -0,0 +1,474 @@
// End-to-end test of the docmost-mcp client against a live Docmost server.
// Creates a throwaway page, exercises every code path, cleans up after itself.
// Usage: DOCMOST_API_URL=... DOCMOST_EMAIL=... DOCMOST_PASSWORD=... node test-e2e.mjs
import { DocmostClient } from "./build/client.js";
import axios from "axios";
import { writeFileSync, unlinkSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { deflateSync } from "node:zlib";

const API = process.env.DOCMOST_API_URL;
if (!API || !process.env.DOCMOST_EMAIL || !process.env.DOCMOST_PASSWORD) {
  console.error("Set DOCMOST_API_URL, DOCMOST_EMAIL and DOCMOST_PASSWORD env variables.");
  process.exit(2);
}
const APP = API.replace(/\/api\/?$/, "");
const client = new DocmostClient(API, process.env.DOCMOST_EMAIL, process.env.DOCMOST_PASSWORD);

let failed = 0;
const check = (name, cond, extra = "") => {
  console.log(`${cond ? "OK " : "FAIL"} ${name}${extra ? " \u2014 " + extra : ""}`);
  if (!cond) failed++;
};

// Minimal solid-color PNG encoder using Node built-ins only (no dependencies).
// Returns a valid PNG buffer for a 1x1 image of the given RGB color.
const crc32 = (buf) => {
  let crc = 0xffffffff;
  for (let i = 0; i < buf.length; i++) {
    crc ^= buf[i];
    for (let k = 0; k < 8; k++) crc = crc & 1 ? (crc >>> 1) ^ 0xedb88320 : crc >>> 1;
  }
  return (crc ^ 0xffffffff) >>> 0;
};
const pngChunk = (type, data) => {
  const len = Buffer.alloc(4);
  len.writeUInt32BE(data.length, 0);
  const typeBuf = Buffer.from(type, "ascii");
  const crc = Buffer.alloc(4);
  crc.writeUInt32BE(crc32(Buffer.concat([typeBuf, data])), 0);
  return Buffer.concat([len, typeBuf, data, crc]);
};
const makePng = (r, g, b) => {
  const sig = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
  const ihdr = Buffer.alloc(13);
  ihdr.writeUInt32BE(1, 0); // width
  ihdr.writeUInt32BE(1, 4); // height
  ihdr[8] = 8; // bit depth
  ihdr[9] = 2; // color type: truecolor RGB
  ihdr[10] = 0; // compression
  ihdr[11] = 0; // filter
  ihdr[12] = 0; // interlace
  // One scanline: filter byte 0 followed by one RGB pixel.
  const raw = Buffer.from([0, r, g, b]);
  const idat = deflateSync(raw);
  return Buffer.concat([
    sig,
    pngChunk("IHDR", ihdr),
    pngChunk("IDAT", idat),
    pngChunk("IEND", Buffer.alloc(0)),
  ]);
};

const MD = `:::info
**Тестовый callout.** Он должен стать узлом callout, а не blockquote.
:::

Первый абзац с **жирным** и [ссылкой](https://example.com). Маркер тут [1] стоит.

## Раздел два

| Колонка А | Колонка Б |
| --- | --- |
| раз | два |
| три | четыре |

Последний абзац со словом БУКВОЕД для замены.
`;

async function main() {
  const spaces = await client.getSpaces();
  const spaceId = spaces[0].id;
  let pageId = null;

  try {
    // 1. create_page: title with spaces must survive (was: underscores bug)
    const created = await client.createPage("Тест апгрейда MCP сервера", MD, spaceId);
    pageId = created.data.id;
    check("create_page: title keeps spaces", created.data.title === "Тест апгрейда MCP сервера", created.data.title);
    check("create_page: slugId exposed", typeof created.data.slugId === "string" && created.data.slugId.length > 0, created.data.slugId);

    // 2. get_page_json: raw ProseMirror with callout + table
    const pj = await client.getPageJson(pageId);
    const types = pj.content.content.map((n) => n.type);
    check("get_page_json: callout node present", types.includes("callout"), types.join(","));
    check("get_page_json: table node present", types.includes("table"));
    check("get_page_json: slugId present", !!pj.slugId);

    // 3. edit_page_text: surgical replace, ids preserved
    const idsBefore = JSON.stringify(
      pj.content.content.filter((n) => n.attrs?.id).map((n) => n.attrs.id),
    );
    const editRes = await client.editPageText(pageId, [
      { find: "БУКВОЕД", replace: "КНИГОЛЮБ" },
      { find: "[1]", replace: "[42]" },
    ]);
    check("edit_page_text: both edits applied", editRes.edits.every((e) => e.replacements === 1));
    await new Promise((r) => setTimeout(r, 16000)); // wait for server persistence
    const pj2 = await client.getPageJson(pageId);
    const text2 = JSON.stringify(pj2.content);
    check("edit_page_text: replacement visible", text2.includes("КНИГОЛЮБ") && text2.includes("[42]"));
    check("edit_page_text: old text gone", !text2.includes("БУКВОЕД"));
    const idsAfter = JSON.stringify(
      pj2.content.content.filter((n) => n.attrs?.id).map((n) => n.attrs.id),
    );
    check("edit_page_text: block ids preserved", idsBefore === idsAfter);
    check("edit_page_text: callout survived", JSON.stringify(pj2.content).includes('"callout"'));
    check("edit_page_text: table survived", pj2.content.content.some((n) => n.type === "table"));

    // 4. error reporting: ambiguous and missing finds
    let err1 = "";
    try { await client.editPageText(pageId, [{ find: "Колонка", replace: "X" }]); } catch (e) { err1 = e.message; }
    check("edit_page_text: ambiguous match rejected", err1.includes("matches"), err1);
    let err2 = "";
    try { await client.editPageText(pageId, [{ find: "НЕСУЩЕСТВУЮЩЕЕ", replace: "X" }]); } catch (e) { err2 = e.message; }
    check("edit_page_text: missing text reported", err2.includes("not found"), err2);

// 5. update_page (markdown): table + callout must survive the re-import
|
||||
await client.updatePage(pageId, MD + "\nДобавленный абзац.\n");
|
||||
await new Promise((r) => setTimeout(r, 16000));
|
||||
const pj3 = await client.getPageJson(pageId);
|
||||
const types3 = pj3.content.content.map((n) => n.type);
|
||||
check("update_page md: callout survives re-import", types3.includes("callout"), types3.join(","));
|
||||
check("update_page md: table survives re-import", types3.includes("table"));
|
||||
const tableNode = pj3.content.content.find((n) => n.type === "table");
|
||||
const cellText = JSON.stringify(tableNode);
|
||||
check("update_page md: table cells intact", cellText.includes("четыре") && cellText.includes("Колонка А"));
|
||||
|
||||
// 6. update_page_json: lossless write round-trip
|
||||
pj3.content.content.push({
|
||||
type: "paragraph",
|
||||
attrs: { id: "testidjsonpush", indent: 0, textAlign: null },
|
||||
content: [{ type: "text", text: "Абзац, добавленный через update_page_json." }],
|
||||
});
|
||||
await client.updatePageJson(pageId, pj3.content);
|
||||
await new Promise((r) => setTimeout(r, 16000));
|
||||
const pj4 = await client.getPageJson(pageId);
|
||||
const lastNode = pj4.content.content[pj4.content.content.length - 1];
|
||||
check("update_page_json: paragraph appended", JSON.stringify(pj4.content).includes("добавленный через update_page_json"));
|
||||
check("update_page_json: custom node id preserved", lastNode.attrs?.id === "testidjsonpush", lastNode.attrs?.id);

    // 6b. images: upload / insert / replace (clean src, fresh attachment on replace)
    const pngA = join(tmpdir(), `mcp-e2e-img-a-${Date.now()}.png`);
    const pngB = join(tmpdir(), `mcp-e2e-img-b-${Date.now()}.png`);
    writeFileSync(pngA, makePng(255, 0, 0)); // red
    writeFileSync(pngB, makePng(0, 0, 255)); // blue (a DIFFERENT valid PNG)
    try {
      // Independent login to fetch file bytes with the same cookie the editor uses.
      const login = await axios.post(
        `${API}/auth/login`,
        { email: process.env.DOCMOST_EMAIL, password: process.env.DOCMOST_PASSWORD },
        { validateStatus: () => true },
      );
      // Keep everything after "authToken=" so a value containing "=" is not truncated.
      const token = (login.headers["set-cookie"] || [])
        .find((c) => c.startsWith("authToken="))
        ?.split(";")[0]
        .slice("authToken=".length);
      const fetchFile = (src) =>
        axios.get(`${APP}${src}`, {
          headers: { Cookie: `authToken=${token}` },
          responseType: "arraybuffer",
          validateStatus: () => true,
        });

      // insert_image: append the first PNG, src must be clean (no ?v=) and fetchable.
      const ins = await client.insertImage(pageId, pngA);
      check("insert_image: src has no ?v= cache-buster", !ins.src.includes("?v="), ins.src);
      const fileA = await fetchFile(ins.src);
      check("insert_image: file fetch returns 200", fileA.status === 200, `status=${fileA.status}`);
      check(
        "insert_image: content-type is image/*",
        String(fileA.headers["content-type"] || "").startsWith("image/"),
        String(fileA.headers["content-type"]),
      );

      await new Promise((r) => setTimeout(r, 16000));
      const pjImg = await client.getPageJson(pageId);
      const findImage = (nodes, id) => {
        for (const n of nodes || []) {
          if (n.type === "image" && (!id || n.attrs?.attachmentId === id)) return n;
          const found = findImage(n.content, id);
          if (found) return found;
        }
        return null;
      };
      const imgNode = findImage(pjImg.content.content);
      const oldAttachmentId = imgNode?.attrs?.attachmentId;
      check("insert_image: image node present after persist", !!oldAttachmentId, oldAttachmentId);

      // replace_image: must create a NEW attachment with a clean, fetchable URL.
      // The 200 fetch is the assertion that catches the in-place-overwrite HTTP 500 regression.
      const rep = await client.replaceImage(pageId, oldAttachmentId, pngB);
      check("replace_image: new attachment id differs from old", rep.newAttachmentId !== oldAttachmentId, `${oldAttachmentId} -> ${rep.newAttachmentId}`);
      check("replace_image: src has no ?v= cache-buster", !rep.src.includes("?v="), rep.src);
      const fileB = await fetchFile(rep.src);
      check("replace_image: new file fetch returns 200", fileB.status === 200, `status=${fileB.status}`);
      check(
        "replace_image: new content-type is image/*",
        String(fileB.headers["content-type"] || "").startsWith("image/"),
        String(fileB.headers["content-type"]),
      );

      await new Promise((r) => setTimeout(r, 16000));
      const pjImg2 = await client.getPageJson(pageId);
      check("replace_image: page has new attachment id", !!findImage(pjImg2.content.content, rep.newAttachmentId), rep.newAttachmentId);
      check("replace_image: old attachment id repointed away", !findImage(pjImg2.content.content, oldAttachmentId), oldAttachmentId);
    } finally {
      try { unlinkSync(pngA); } catch {}
      try { unlinkSync(pngB); } catch {}
    }

    // 6c. rich formatting: callout type, task list, inline marks, table alignment,
    // and literal $-pattern edits. Runs on its own throwaway page so it does not
    // disturb the markdown-export assumptions of later sections.
    {
      const findNodes = (n, t, acc = []) => {
        if (!n) return acc;
        if (n.type === t) acc.push(n);
        for (const ch of n.content || []) findNodes(ch, t, acc);
        return acc;
      };
      const marksOf = (n, acc = new Set()) => {
        if (!n) return acc;
        for (const m of n.marks || []) acc.add(m.type);
        for (const ch of n.content || []) marksOf(ch, acc);
        return acc;
      };
      const FMD = [
        ":::warning", "Warning callout with СЛОВО.", ":::", "",
        "- [x] done", "- [ ] todo", "",
        "Marks: <mark>hl</mark> <sub>lo</sub> <sup>hi</sup>.", "",
        "| L | C | R |", "|:--|:-:|--:|", "| a | b | c |", "",
        "Edit anchor PRICEMARK.",
      ].join("\n");
      const featPng = join(tmpdir(), `mcp-e2e-feat-${Date.now()}.png`);
      writeFileSync(featPng, makePng(0, 255, 0));
      const fp = await client.createPage("E2E features " + Date.now(), "init", spaceId);
      const fid = fp.data.id;
      try {
        await client.updatePage(fid, FMD);
        await new Promise((r) => setTimeout(r, 16000));
        const fj = (await client.getPageJson(fid)).content;
        check("feature: callout type 'warning' preserved (was coerced to info)", findNodes(fj, "callout").some((n) => n.attrs?.type === "warning"), JSON.stringify(findNodes(fj, "callout").map((n) => n.attrs?.type)));
        check("feature: task list imported (taskList + 2 taskItems)", findNodes(fj, "taskList").length >= 1 && findNodes(fj, "taskItem").length === 2, `tl=${findNodes(fj, "taskList").length} ti=${findNodes(fj, "taskItem").length}`);
        check("feature: task checked states preserved", findNodes(fj, "taskItem").some((n) => n.attrs?.checked === true) && findNodes(fj, "taskItem").some((n) => n.attrs?.checked === false));
        const mk = [...marksOf(fj)];
        check("feature: highlight/subscript/superscript marks imported", ["highlight", "subscript", "superscript"].every((m) => mk.includes(m)), mk.join(","));
        check("feature: table cell alignment imported", JSON.stringify(findNodes(fj, "tableHeader").map((n) => n.attrs?.align)) === '["left","center","right"]', JSON.stringify(findNodes(fj, "tableHeader").map((n) => n.attrs?.align)));
        const fmd = (await client.getPage(fid)).data.content;
        check("feature: md export emits task checkboxes", fmd.includes("- [x]") && fmd.includes("- [ ]"));
        check("feature: md export emits table alignment markers", /:--|:-:|--:/.test(fmd));
        await client.editPageText(fid, [{ find: "PRICEMARK", replace: "$& costs $100" }]);
        await new Promise((r) => setTimeout(r, 16000));
        const ftext = JSON.stringify((await client.getPageJson(fid)).content);
        check("feature: edit_page_text inserts $-pattern literally (no $& expansion)", ftext.includes("$& costs $100") && !ftext.includes("PRICEMARK costs"));
        let badThrew = false;
        try { await client.replaceImage(fid, "00000000-0000-0000-0000-000000000000", featPng); } catch (e) { badThrew = /no image with attachmentId/.test(e.message); }
        check("feature: replace_image with unknown id throws (no orphan upload)", badThrew);
      } finally {
        try { await client.deletePage(fid); } catch {}
        try { unlinkSync(featPng); } catch {}
      }
    }
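
The literal `$`-pattern check in section 6c guards against standard JavaScript behavior: when `String.prototype.replace` is given a string replacement, sequences such as `$&` are expanded to the matched text. A minimal sketch of one way to avoid that, assuming a hypothetical `replaceLiteral` helper (this is not taken from the package's `editPageText` implementation, which may work differently), is to pass a replacer function instead:

```javascript
// Hypothetical helper for illustration only; the real editPageText may differ.
function replaceLiteral(text, find, replacement) {
  // The return value of a replacer function is inserted verbatim, so "$&",
  // "$1" and "$$" are not interpreted as special replacement patterns.
  return text.replace(find, () => replacement);
}

const source = "Edit anchor PRICEMARK.";
// String replacement: "$&" is expanded to the matched text ("PRICEMARK").
const expanded = source.replace("PRICEMARK", "$& costs $100");
// Function replacer: the replacement text is inserted exactly as written.
const literal = replaceLiteral(source, "PRICEMARK", "$& costs $100");

console.log(expanded);
console.log(literal);
```

The two `console.log` lines make the difference visible: only the second result still contains the characters `$&`.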

    // 6d. node ops: patch / insert / delete a block by id on a throwaway page.
    // Three paragraphs are written with KNOWN ids via update_page_json so the
    // ids can be targeted directly; each op is verified via getPageJson after
    // the standard 16s persistence wait.
    {
      const np = await client.createPage("E2E node-ops " + Date.now(), "init", spaceId);
      const nid = np.data.id;
      try {
        const mkPara = (id, text) => ({
          type: "paragraph",
          attrs: { id, indent: 0, textAlign: null },
          content: [{ type: "text", text }],
        });
        // Seed three paragraphs with known ids.
        await client.updatePageJson(nid, {
          type: "doc",
          content: [
            mkPara("nodeops-a", "Alpha paragraph."),
            mkPara("nodeops-b", "Bravo paragraph."),
            mkPara("nodeops-c", "Charlie paragraph."),
          ],
        });
        await new Promise((r) => setTimeout(r, 16000));

        // Read back the ids the server actually assigned.
        const seed = (await client.getPageJson(nid)).content;
        const seedIds = seed.content.map((n) => n.attrs?.id);
        check("node_ops: three seed paragraphs present", seed.content.length === 3, seedIds.join(","));
        const [idA, idB, idC] = seedIds;

        // patchNode: replace the middle paragraph; siblings' ids must be unchanged.
        await client.patchNode(nid, idB, mkPara(idB, "Bravo PATCHED."));
        await new Promise((r) => setTimeout(r, 16000));
        const afterPatch = (await client.getPageJson(nid)).content;
        const patchText = JSON.stringify(afterPatch);
        check("node_ops: patchNode applied new text", patchText.includes("Bravo PATCHED.") && !patchText.includes("Bravo paragraph."));
        const patchIds = afterPatch.content.map((n) => n.attrs?.id);
        check("node_ops: patchNode kept sibling ids", patchIds[0] === idA && patchIds[2] === idC, patchIds.join(","));

        // insertNode: place a new block after the first paragraph.
        await client.insertNode(
          nid,
          mkPara("nodeops-ins", "Inserted paragraph."),
          { position: "after", anchorNodeId: idA },
        );
        await new Promise((r) => setTimeout(r, 16000));
        const afterIns = (await client.getPageJson(nid)).content;
        const insIds = afterIns.content.map((n) => n.attrs?.id);
        const insText = afterIns.content.map((n) => JSON.stringify(n.content)).join("|");
        check("node_ops: insertNode added a block", afterIns.content.length === 4 && insText.includes("Inserted paragraph."));
        check("node_ops: insertNode placed block right after anchor", insIds[0] === idA && insIds[1] !== idB && insIds[2] === idB, insIds.join(","));

        // deleteNode: remove the last (Charlie) paragraph.
        await client.deleteNode(nid, idC);
        await new Promise((r) => setTimeout(r, 16000));
        const afterDel = (await client.getPageJson(nid)).content;
        const delText = JSON.stringify(afterDel);
        check("node_ops: deleteNode removed the block", !delText.includes("Charlie paragraph.") && !afterDel.content.some((n) => n.attrs?.id === idC));
      } finally {
        try { await client.deletePage(nid); } catch {}
      }
    }

    // 6e. rename_page: title-only update must leave the content untouched.
    {
      const rp = await client.createPage("E2E rename before " + Date.now(), "Rename body marker RENAMEBODY.", spaceId);
      const rid = rp.data.id;
      try {
        const beforeJson = (await client.getPageJson(rid)).content;
        const beforeContent = JSON.stringify(beforeJson);
        const newTitle = "E2E rename AFTER " + Date.now();
        const rr = await client.renamePage(rid, newTitle);
        check("rename_page: returns success+title", rr.success === true && rr.title === newTitle, JSON.stringify(rr));
        await new Promise((r) => setTimeout(r, 16000));
        const afterJson = await client.getPageJson(rid);
        check("rename_page: title changed", afterJson.title === newTitle, afterJson.title);
        check("rename_page: content unchanged", JSON.stringify(afterJson.content) === beforeContent && beforeContent.includes("RENAMEBODY"));
        const afterMd = (await client.getPage(rid)).data;
        check("rename_page: get_page reflects new title", afterMd.title === newTitle, afterMd.title);
      } finally {
        try { await client.deletePage(rid); } catch {}
      }
    }

    // 6f. update_page_json title-only: omitting content updates the title and
    // leaves the body intact; supplying neither content nor title throws.
    {
      const up = await client.createPage("E2E upj-title before " + Date.now(), "Title-only body marker UPJTITLEBODY.", spaceId);
      const uid = up.data.id;
      try {
        const beforeContent = JSON.stringify((await client.getPageJson(uid)).content);
        const newTitle = "E2E upj-title AFTER " + Date.now();
        const ur = await client.updatePageJson(uid, undefined, newTitle);
        check("update_page_json title-only: succeeds", ur.success === true, JSON.stringify(ur));
        await new Promise((r) => setTimeout(r, 16000));
        const afterJson = await client.getPageJson(uid);
        check("update_page_json title-only: title updated", afterJson.title === newTitle, afterJson.title);
        check("update_page_json title-only: content intact", JSON.stringify(afterJson.content) === beforeContent && beforeContent.includes("UPJTITLEBODY"));
        let upjErr = "";
        try { await client.updatePageJson(uid); } catch (e) { upjErr = e.message; }
        check("update_page_json: neither content nor title throws", upjErr.includes("nothing to update"), upjErr);
      } finally {
        try { await client.deletePage(uid); } catch {}
      }
    }

    // 6g. copy_page_content: B's body becomes a copy of A's body, server-side,
    // while B's title/slugId stay put. Both pages are throwaways.
    {
      let aid = null;
      let bid = null;
      try {
        const aPage = await client.createPage("E2E copy SOURCE " + Date.now(), "Source marker COPYSOURCE only here.\n\nSecond source paragraph.", spaceId);
        aid = aPage.data.id;
        const bPage = await client.createPage("E2E copy TARGET " + Date.now(), "Target marker COPYTARGET only here.", spaceId);
        bid = bPage.data.id;

        const aJson = await client.getPageJson(aid);
        const bBefore = await client.getPageJson(bid);
        const bTitleBefore = bBefore.title;
        const bSlugBefore = bBefore.slugId;
        const aNodeCount = aJson.content.content.length;

        const cr = await client.copyPageContent(aid, bid);
        check("copy_page_content: returns success + node count", cr.success === true && cr.copiedNodes === aNodeCount, JSON.stringify(cr));
        await new Promise((r) => setTimeout(r, 16000));

        const bAfter = await client.getPageJson(bid);
        const bText = JSON.stringify(bAfter.content);
        check("copy_page_content: B now has A's marker", bText.includes("COPYSOURCE"));
        check("copy_page_content: B's old marker gone", !bText.includes("COPYTARGET"));
        check("copy_page_content: B node count equals A's", bAfter.content.content.length === aNodeCount, `${bAfter.content.content.length} vs ${aNodeCount}`);
        check("copy_page_content: B title unchanged", bAfter.title === bTitleBefore, bAfter.title);
        check("copy_page_content: B slugId unchanged", bAfter.slugId === bSlugBefore, bAfter.slugId);

        // Source must be left untouched by the copy.
        const aAfter = JSON.stringify((await client.getPageJson(aid)).content);
        check("copy_page_content: source page unchanged", aAfter === JSON.stringify(aJson.content) && aAfter.includes("COPYSOURCE"));

        let copyErr = "";
        try { await client.copyPageContent(aid, aid); } catch (e) { copyErr = e.message; }
        check("copy_page_content: self-copy rejected", copyErr.includes("same page"), copyErr);
      } finally {
        try { if (bid) await client.deletePage(bid); } catch {}
        try { if (aid) await client.deletePage(aid); } catch {}
      }
    }

    // 7. shares: create (idempotent), public access, list, unshare
    const share = await client.sharePage(pageId);
    check("share_page: returns public URL", share.publicUrl?.startsWith(`${APP}/share/`), share.publicUrl);
    const share2 = await client.sharePage(pageId);
    check("share_page: idempotent", share2.key === share.key);
    const anon = await axios.post(`${API}/shares/page-info`, { pageId: pj4.slugId, shareId: share.key }, { validateStatus: () => true });
    check("share_page: anonymous access works", anon.status === 200);
    const shares = await client.listShares();
    check("list_shares: contains our page", shares.some((s) => s.pageId === pageId && s.publicUrl === share.publicUrl));
    const un = await client.unsharePage(pageId);
    check("unshare_page: success", un.success === true);
    const anon2 = await axios.post(`${API}/shares/page-info`, { pageId: pj4.slugId, shareId: share.key }, { validateStatus: () => true });
    check("unshare_page: public access revoked", anon2.status !== 200, `status=${anon2.status}`);

    // 8. get_page markdown round-trip sanity (table separator present)
    const md = await client.getPage(pageId);
    check("get_page md: table separator emitted", md.data.content.includes("| --- |"), "");
    check("get_page md: callout exported as :::", md.data.content.includes(":::info"));

    // 9. comments: create / list / reply / update / check_new / delete
    const beforeComments = new Date(Date.now() - 1000).toISOString();
    const c1 = await client.createComment(pageId, "Первый **комментарий** с [ссылкой](https://example.com).");
    check("create_comment: created", !!c1.data.id, c1.data.id);
    check("create_comment: markdown round-trip", c1.data.content.includes("**комментарий**"), c1.data.content);
    const reply = await client.createComment(pageId, "Ответ на комментарий.", "page", undefined, c1.data.id);
    check("create_comment: reply has parent", reply.data.parentCommentId === c1.data.id);
    const list = await client.listComments(pageId);
    check("list_comments: both visible", list.length === 2, `count=${list.length}`);
    await client.updateComment(c1.data.id, "Обновлённый текст комментария.");
    const got = await client.getComment(c1.data.id);
    check("update_comment + get_comment: content updated", got.data.content.includes("Обновлённый"), got.data.content);
    const news = await client.checkNewComments(spaceId, beforeComments, pageId);
    check("check_new_comments: finds new comments in subtree", news.totalNewComments >= 2, `total=${news.totalNewComments}`);
    await client.deleteComment(reply.data.id);
    await client.deleteComment(c1.data.id);
    const listAfter = await client.listComments(pageId);
    check("delete_comment: comments removed", listAfter.length === 0, `count=${listAfter.length}`);
  } finally {
    if (pageId) {
      await client.deletePage(pageId);
      console.log("cleanup: test page deleted");
    }
  }

  console.log(failed === 0 ? "\nALL TESTS PASSED" : `\n${failed} TESTS FAILED`);
  process.exit(failed === 0 ? 0 : 1);
}

main().catch((e) => {
  console.error("FATAL:", e.message);
  process.exit(2);
});
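
The e2e script above relies on a fixed 16 s sleep after every write before reading the page back. As a possible alternative, and only as a sketch, the read-back could be polled until a condition holds. The `waitFor` helper below is hypothetical and is not part of this package; whether polling is safe depends on how Docmost persists collaborative edits, which this sketch does not verify.

```javascript
// Hypothetical helper, not part of the e2e script: call `probe` repeatedly
// until it returns a truthy value, or throw once the deadline has passed.
async function waitFor(probe, { timeoutMs = 16000, intervalMs = 500 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = await probe();
    if (value) return value;
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    // Sleep between attempts so the server is not hammered.
    await new Promise((r) => setTimeout(r, intervalMs));
  }
}
```

A call site might look like `await waitFor(async () => JSON.stringify((await client.getPageJson(id)).content).includes("marker"))`, where `client`, `id`, and the marker string stand for whatever the surrounding test already has in scope.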
440
packages/mcp/test/mock/reauth.test.mjs
Normal file
@@ -0,0 +1,440 @@
// Mock-HTTP tests for the re-auth / multipart / pagination paths in
// DocmostClient that the live e2e (which always starts with a FRESH token)
// can never reach: expired-token replay, concurrent-login dedup, the
// no-infinite-loop guard, exact cookie parsing, and the paginateAll loop
// guards. A local http.createServer stands in for Docmost so everything
// stays deterministic and offline.
import { test, after } from "node:test";
import assert from "node:assert/strict";
import http from "node:http";
import { DocmostClient } from "../../build/client.js";

// Read a request body to completion (used to assert /auth/login receives the
// email/password JSON, and just to drain the stream before responding).
function readBody(req) {
  return new Promise((resolve) => {
    let raw = "";
    req.on("data", (chunk) => {
      raw += chunk;
    });
    req.on("end", () => resolve(raw));
  });
}

// Start an http server bound to an ephemeral port and resolve once it is
// listening, returning the server plus the api base URL the client should use.
function startServer(handler) {
  return new Promise((resolve) => {
    const server = http.createServer(handler);
    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address();
      resolve({ server, baseURL: `http://127.0.0.1:${port}/api` });
    });
  });
}

function closeServer(server) {
  return new Promise((resolve) => server.close(resolve));
}

// JSON helper.
function sendJson(res, status, obj, extraHeaders = {}) {
  res.writeHead(status, { "Content-Type": "application/json", ...extraHeaders });
  res.end(JSON.stringify(obj));
}

// Track every server so the after() hook can guarantee nothing is left open.
const openServers = [];
async function spawn(handler) {
  const { server, baseURL } = await startServer(handler);
  openServers.push(server);
  return { server, baseURL };
}

after(async () => {
  await Promise.all(openServers.map((s) => closeServer(s)));
});

// -----------------------------------------------------------------------------
// 1) 401-then-200: the interceptor re-logs-in and replays the request once.
// -----------------------------------------------------------------------------
test("401 on a JSON endpoint triggers re-login and a successful replay", async () => {
  let loginCalls = 0;
  let infoCalls = 0;
  let replayedAuthHeader = null;

  const { baseURL } = await spawn(async (req, res) => {
    await readBody(req);
    if (req.url === "/api/auth/login") {
      loginCalls++;
      // Hand back a fresh token via Set-Cookie (HttpOnly, like Docmost).
      sendJson(res, 200, { success: true }, {
        "Set-Cookie": "authToken=fresh-token-123; Path=/; HttpOnly",
      });
      return;
    }
    if (req.url === "/api/workspace/info") {
      infoCalls++;
      // First hit: token is stale -> 401. Second hit (the replay): 200, and
      // record the Authorization header so we can confirm the new Bearer.
      if (infoCalls === 1) {
        sendJson(res, 401, { message: "Unauthorized" });
      } else {
        replayedAuthHeader = req.headers["authorization"];
        sendJson(res, 200, { success: true, data: { id: "ws-1", name: "WS" } });
      }
      return;
    }
    sendJson(res, 404, { message: "not found" });
  });

  const client = new DocmostClient(baseURL, "user@example.com", "pw");
  // Pre-seed a stale token so the FIRST /workspace/info uses it and 401s,
  // exercising the interceptor replay rather than the initial-login path.
  client.token = "stale-token";
  client.client.defaults.headers.common["Authorization"] = "Bearer stale-token";

  const result = await client.getWorkspace();

  assert.equal(result.success, true);
  assert.equal(loginCalls, 1, "/auth/login should be called exactly once");
  assert.equal(infoCalls, 2, "the endpoint should be hit twice (401 then replay)");
  assert.equal(
    replayedAuthHeader,
    "Bearer fresh-token-123",
    "the replay must carry the freshly minted Bearer token",
  );
});

// -----------------------------------------------------------------------------
// 2) Login dedup: concurrent 401s collapse into a single /auth/login.
// -----------------------------------------------------------------------------
test("concurrent 401s deduplicate into a single /auth/login call", async () => {
  let loginCalls = 0;
  const infoState = new Map(); // per-endpoint hit counter

  const { baseURL } = await spawn(async (req, res) => {
    await readBody(req);
    if (req.url === "/api/auth/login") {
      loginCalls++;
      // Delay the login response a touch so all concurrent requests are still
      // in flight and genuinely share the one in-flight loginPromise.
      setTimeout(() => {
        sendJson(res, 200, { success: true }, {
          "Set-Cookie": "authToken=shared-token; Path=/; HttpOnly",
        });
      }, 40);
      return;
    }
    // Several distinct JSON endpoints, each 401 on the first hit then 200.
    const n = (infoState.get(req.url) || 0) + 1;
    infoState.set(req.url, n);
    if (n === 1) {
      sendJson(res, 401, { message: "Unauthorized" });
    } else {
      sendJson(res, 200, { success: true, data: { items: [], meta: {} } });
    }
  });

  const client = new DocmostClient(baseURL, "user@example.com", "pw");
  client.token = "stale-token";
  client.client.defaults.headers.common["Authorization"] = "Bearer stale-token";

  // Fire several different requests concurrently; each one's first attempt 401s
  // and triggers a re-login, but the in-flight loginPromise must coalesce them.
  await Promise.all([
    client.getWorkspace(),
    client.getSpaces(),
    client.search("anything"),
    client.listShares(),
  ]);

  assert.equal(
    loginCalls,
    1,
    "all concurrent 401s must share ONE in-flight /auth/login",
  );
});
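
The dedup behavior exercised by the test above (several concurrent 401s sharing one in-flight `loginPromise`) is an instance of the single-flight pattern. The sketch below shows the general idea under stated assumptions: `createSingleFlight` and `task` are hypothetical names, and this is not the actual `DocmostClient` code, which is not shown in this diff.

```javascript
// Hypothetical single-flight wrapper; names are illustrative only.
// While one run of `task` is pending, every caller receives the same promise.
function createSingleFlight(task) {
  let inFlight = null;
  return function run() {
    if (!inFlight) {
      inFlight = Promise.resolve()
        .then(task)
        // Clear the slot once settled so a later call starts a fresh run.
        .finally(() => {
          inFlight = null;
        });
    }
    return inFlight;
  };
}
```

Clearing the slot in `finally` covers both outcomes: after a failed login the next caller retries instead of receiving the cached rejection, and after a successful one a later token expiry can trigger a new login.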

// -----------------------------------------------------------------------------
// 3) Persistent 401: exactly one retry, no infinite loop; a 401 on the login
// endpoint itself is NOT retried.
// -----------------------------------------------------------------------------
test("a persistently-401 endpoint fails after exactly one retry", async () => {
  let loginCalls = 0;
  let infoCalls = 0;

  const { baseURL } = await spawn(async (req, res) => {
    await readBody(req);
    if (req.url === "/api/auth/login") {
      loginCalls++;
      sendJson(res, 200, { success: true }, {
        "Set-Cookie": "authToken=t; Path=/; HttpOnly",
      });
      return;
    }
    if (req.url === "/api/workspace/info") {
      infoCalls++;
      // ALWAYS 401, even after a fresh login: the retry guard must stop here.
      sendJson(res, 401, { message: "Unauthorized" });
      return;
    }
    sendJson(res, 404, {});
  });

  const client = new DocmostClient(baseURL, "user@example.com", "pw");
  client.token = "stale-token";
  client.client.defaults.headers.common["Authorization"] = "Bearer stale-token";

  await assert.rejects(() => client.getWorkspace());

  // Original request + exactly ONE replay = 2 hits, never more (no loop).
  assert.equal(infoCalls, 2, "endpoint hit at most twice (one retry only)");
  assert.equal(loginCalls, 1, "re-login attempted exactly once");
});

test("a 401 on /auth/login itself is not retried", async () => {
  let loginCalls = 0;

  const { baseURL } = await spawn(async (req, res) => {
    await readBody(req);
    if (req.url === "/api/auth/login") {
      loginCalls++;
      // The login endpoint rejects credentials. The interceptor must NOT try
      // to "re-login to fix a failed login": that would loop forever.
      sendJson(res, 401, { message: "Invalid credentials" });
      return;
    }
    sendJson(res, 404, {});
  });

  const client = new DocmostClient(baseURL, "user@example.com", "wrong-pw");

  // login() -> performLogin POSTs /auth/login, gets 401; the interceptor sees
  // isLoginRequest and rejects without retrying. So /auth/login is hit once.
  await assert.rejects(() => client.login());
  assert.equal(loginCalls, 1, "/auth/login must be attempted exactly once");
});
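
The retry rules these tests pin down (replay once after a re-login, never replay a replay, never retry the login request itself) can be sketched independently of any HTTP library. In the sketch below, `sendWithReauth`, `send`, `relogin`, and the `isLogin`/`retried` flags are all hypothetical stand-ins, not the interceptor shipped in this package. Unlike the real client, which rejects on a persistent 401, this sketch simply returns the final response.

```javascript
// Hypothetical sketch of a replay-once guard. `send` and `relogin` are
// caller-supplied stand-ins, not DocmostClient methods.
async function sendWithReauth(send, relogin, request) {
  const first = await send(request);
  // Replay only a 401 on a non-login request that has not been retried yet.
  if (first.status !== 401 || request.isLogin || request.retried) return first;
  const token = await relogin();
  // Mark the replay so a second 401 goes back to the caller instead of looping.
  return send({ ...request, token, retried: true });
}
```

With an endpoint that always answers 401, this produces exactly two `send` calls and one `relogin` call, matching the counts asserted above.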

// -----------------------------------------------------------------------------
// 4) performLogin cookie parsing: base64 "=" padding survives intact, and a
// cookie literally named authTokenRefresh is not mistaken for authToken.
// -----------------------------------------------------------------------------
test("a token with base64 '=' padding round-trips intact to the server", async () => {
  // A realistic JWT-ish value whose final segment ends in base64 "=" padding.
  const paddedToken = "header.payload.c2lnbmF0dXJl==";
  let sentBearer = null;

  const { baseURL } = await spawn(async (req, res) => {
    await readBody(req);
    if (req.url === "/api/auth/login") {
      sendJson(res, 200, { success: true }, {
        // Include attributes AND a base64 value containing "=" so we verify the
        // parser keeps everything after the FIRST "=" up to the first ";".
        "Set-Cookie": `authToken=${paddedToken}; Path=/; HttpOnly; SameSite=Lax`,
      });
      return;
    }
    if (req.url === "/api/workspace/info") {
      sentBearer = req.headers["authorization"];
      sendJson(res, 200, { success: true, data: { id: "ws" } });
      return;
    }
    sendJson(res, 404, {});
  });

  const client = new DocmostClient(baseURL, "user@example.com", "pw");
  await client.login();
  // The parsed token equals exactly what the server set (padding preserved).
  assert.equal(client.token, paddedToken);

  // And the client sends that exact token back on a subsequent request.
  await client.getWorkspace();
  assert.equal(sentBearer, `Bearer ${paddedToken}`);
});

test("an authTokenRefresh cookie is not mistaken for authToken", async () => {
  const { baseURL } = await spawn(async (req, res) => {
    await readBody(req);
    if (req.url === "/api/auth/login") {
      // Set BOTH cookies. The exact-name match must pick authToken=real and
      // ignore authTokenRefresh=should-not-match (a prefix match would grab it).
      res.writeHead(200, {
        "Content-Type": "application/json",
        "Set-Cookie": [
          "authTokenRefresh=should-not-match; Path=/; HttpOnly",
          "authToken=real-token; Path=/; HttpOnly",
        ],
      });
      res.end(JSON.stringify({ success: true }));
      return;
    }
    sendJson(res, 404, {});
  });

  const client = new DocmostClient(baseURL, "user@example.com", "pw");
  await client.login();
  assert.equal(client.token, "real-token");
});

test("a response with ONLY authTokenRefresh (no authToken) rejects login", async () => {
  const { baseURL } = await spawn(async (req, res) => {
    await readBody(req);
    if (req.url === "/api/auth/login") {
      sendJson(res, 200, { success: true }, {
        "Set-Cookie": "authTokenRefresh=nope; Path=/; HttpOnly",
      });
      return;
    }
    sendJson(res, 404, {});
  });

  const client = new DocmostClient(baseURL, "user@example.com", "pw");
  // No authToken cookie present -> performLogin throws.
  await assert.rejects(() => client.login(), /No authToken cookie/);
});
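
The cookie-parsing rules tested in this section (exact cookie name, value is everything after the first `=` up to the first `;`) can be illustrated with a small standalone parser. `parseCookieValue` is a hypothetical function written for this illustration; it is not the `performLogin` implementation, which this diff excerpt does not show.

```javascript
// Hypothetical helper: NOT the actual DocmostClient.performLogin code.
// Returns the value of the cookie named exactly `name`, or null if absent.
function parseCookieValue(setCookie, name = "authToken") {
  const headers = Array.isArray(setCookie) ? setCookie : setCookie ? [setCookie] : [];
  for (const header of headers) {
    // The name=value pair is everything before the first ";".
    const pair = header.split(";")[0];
    const eq = pair.indexOf("=");
    if (eq === -1) continue;
    // Exact-name comparison, so "authTokenRefresh" never matches "authToken".
    if (pair.slice(0, eq).trim() !== name) continue;
    // Slice after the FIRST "=" so base64 "=" padding in the value survives.
    return pair.slice(eq + 1);
  }
  return null;
}
```

Splitting the pair on every `=` and taking the second element would truncate a padded value, and a `startsWith("authToken")` check would accept `authTokenRefresh`; slicing at the first `=` and comparing the full name avoids both.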

// -----------------------------------------------------------------------------
// 5) paginateAll loop guards.
// -----------------------------------------------------------------------------
test("paginateAll stops at the MAX_PAGES cap when hasNextPage is always true", async () => {
  let pageRequests = 0;
  const LIMIT = 100;

  const { baseURL } = await spawn(async (req, res) => {
    await readBody(req);
    if (req.url === "/api/auth/login") {
      sendJson(res, 200, { success: true }, {
        "Set-Cookie": "authToken=t; Path=/; HttpOnly",
      });
      return;
    }
    if (req.url === "/api/spaces") {
      pageRequests++;
      // Always return a FULL page (== requested limit) AND hasNextPage:true.
      // Both the page-length check and the hasNextPage flag say "keep going",
      // so only the MAX_PAGES ceiling can stop the loop.
      const items = Array.from({ length: LIMIT }, (_, i) => ({
        id: `s-${pageRequests}-${i}`,
      }));
      sendJson(res, 200, {
        success: true,
        data: { items, meta: { hasNextPage: true } },
      });
      return;
    }
    sendJson(res, 404, {});
  });

  const client = new DocmostClient(baseURL, "user@example.com", "pw");
  const all = await client.paginateAll("/spaces", {}, LIMIT);

  // MAX_PAGES is 50; the loop must terminate there, not run unbounded.
  assert.ok(
    pageRequests <= 50,
    `expected <= 50 page requests, got ${pageRequests}`,
  );
  assert.equal(pageRequests, 50, "should fetch exactly the MAX_PAGES cap");
  assert.equal(all.length, 50 * LIMIT, "accumulates one full page per request");
});
|
||||
|
||||
test("paginateAll stops early on a short page even if hasNextPage is true", async () => {
|
||||
let pageRequests = 0;
|
||||
const LIMIT = 100;
|
||||
|
||||
const { baseURL } = await spawn(async (req, res) => {
|
||||
await readBody(req);
|
||||
if (req.url === "/api/auth/login") {
|
||||
sendJson(res, 200, { success: true }, {
|
||||
"Set-Cookie": "authToken=t; Path=/; HttpOnly",
|
||||
});
|
||||
return;
|
||||
}
|
||||
if (req.url === "/api/spaces") {
|
||||
pageRequests++;
|
||||
// First page is full; second page is SHORT (fewer than limit). The short
|
||||
// page must stop the loop immediately even though hasNextPage stays true.
|
||||
const count = pageRequests === 1 ? LIMIT : 3;
|
||||
const items = Array.from({ length: count }, (_, i) => ({
|
||||
id: `s-${pageRequests}-${i}`,
|
||||
}));
|
||||
sendJson(res, 200, {
|
||||
success: true,
|
||||
data: { items, meta: { hasNextPage: true } },
|
||||
});
|
||||
return;
|
||||
}
|
||||
sendJson(res, 404, {});
|
||||
});
|
||||
|
||||
const client = new DocmostClient(baseURL, "user@example.com", "pw");
|
||||
const all = await client.paginateAll("/spaces", {}, LIMIT);
|
||||
|
||||
assert.equal(pageRequests, 2, "stops right after the first short page");
|
||||
assert.equal(all.length, LIMIT + 3, "full page + short page accumulated");
|
||||
});
|
||||
|
||||
test("paginateAll handles both {data:{items,meta}} and {items,meta} envelopes", async () => {
|
||||
// Bare envelope: { items, meta } with no { data } wrapper.
|
||||
const bareRequests = [];
|
||||
const { baseURL: bareURL } = await spawn(async (req, res) => {
|
||||
await readBody(req);
|
||||
if (req.url === "/api/auth/login") {
|
||||
sendJson(res, 200, { success: true }, {
|
||||
"Set-Cookie": "authToken=t; Path=/; HttpOnly",
|
||||
});
|
||||
return;
|
||||
}
|
||||
if (req.url === "/api/groups") {
|
||||
bareRequests.push(1);
|
||||
// Page 1: full page, hasNextPage true. Page 2: short page -> stop.
|
||||
if (bareRequests.length === 1) {
|
||||
sendJson(res, 200, {
|
||||
items: Array.from({ length: 100 }, (_, i) => ({ id: `g${i}` })),
|
||||
meta: { hasNextPage: true },
|
||||
});
|
||||
} else {
|
||||
sendJson(res, 200, {
|
||||
items: [{ id: "tail" }],
|
||||
meta: { hasNextPage: false },
|
||||
});
|
||||
}
|
||||
return;
|
||||
}
|
||||
sendJson(res, 404, {});
|
||||
});
|
||||
|
||||
const bareClient = new DocmostClient(bareURL, "user@example.com", "pw");
|
||||
const bare = await bareClient.paginateAll("/groups", {}, 100);
|
||||
assert.equal(bare.length, 101, "bare {items,meta} envelope handled");
|
||||
assert.equal(bare[bare.length - 1].id, "tail");
|
||||
|
||||
// Wrapped envelope: { data: { items, meta } }.
|
||||
const wrappedRequests = [];
|
||||
const { baseURL: wrappedURL } = await spawn(async (req, res) => {
|
||||
await readBody(req);
|
||||
if (req.url === "/api/auth/login") {
|
||||
sendJson(res, 200, { success: true }, {
|
||||
"Set-Cookie": "authToken=t; Path=/; HttpOnly",
|
||||
});
|
||||
return;
|
||||
}
|
||||
if (req.url === "/api/groups") {
|
||||
wrappedRequests.push(1);
|
||||
// Single short page -> stops after one request.
|
||||
sendJson(res, 200, {
|
||||
data: {
|
||||
items: [{ id: "w1" }, { id: "w2" }],
|
||||
meta: { hasNextPage: false },
|
||||
},
|
||||
});
|
||||
return;
|
||||
}
|
||||
sendJson(res, 404, {});
|
||||
});
|
||||
|
||||
const wrappedClient = new DocmostClient(wrappedURL, "user@example.com", "pw");
|
||||
const wrapped = await wrappedClient.paginateAll("/groups", {}, 100);
|
||||
assert.equal(wrapped.length, 2, "wrapped {data:{items,meta}} envelope handled");
|
||||
assert.equal(wrappedRequests.length, 1, "single short page -> one request");
|
||||
});
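The three paginateAll tests above pin down two loop guards and two response envelopes. A minimal sketch of a loop that would satisfy them follows, under stated assumptions: `collectAllPages` and its `fetchPage(page, limit)` callback are hypothetical names used only for illustration, the walk is synchronous to keep the example short, and stopping on `hasNextPage: false` is an assumption, since the tests only exercise the short-page and MAX_PAGES stops. It is not the client's actual implementation.

```javascript
// Hypothetical sketch: fetchPage(page, limit) is an assumed callback that
// returns an already-parsed response body. It is not the real client API.
const MAX_PAGES = 50; // the cap the tests assert on

function collectAllPages(fetchPage, limit = 100) {
  const all = [];
  for (let page = 1; page <= MAX_PAGES; page++) {
    const body = fetchPage(page, limit);
    // Accept both { data: { items, meta } } and the bare { items, meta } form.
    const payload = body && body.data ? body.data : body;
    const items = Array.isArray(payload && payload.items) ? payload.items : [];
    all.push(...items);
    // A short page ends the walk even if hasNextPage still claims more.
    if (items.length < limit) break;
    // Assumption: a full page with hasNextPage false also ends the walk.
    if (!(payload.meta && payload.meta.hasNextPage)) break;
  }
  return all;
}
```

With a server that always returns a full page and `hasNextPage: true`, only the `page <= MAX_PAGES` bound ends the loop, which is the case the first test isolates.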
|
||||
126
packages/mcp/test/unit/collaboration.test.mjs
Normal file
@@ -0,0 +1,126 @@
|
||||
import { test } from "node:test";
|
||||
import assert from "node:assert/strict";
|
||||
|
||||
import {
|
||||
buildCollabWsUrl,
|
||||
markdownToProseMirror,
|
||||
} from "../../build/lib/collaboration.js";
|
||||
|
||||
/** Recursively find the first descendant node (or self) of the given type. */
|
||||
function find(node, type) {
|
||||
if (!node || typeof node !== "object") return null;
|
||||
if (node.type === type) return node;
|
||||
const kids = Array.isArray(node.content) ? node.content : [];
|
||||
for (const k of kids) {
|
||||
const r = find(k, type);
|
||||
if (r) return r;
|
||||
}
|
||||
return null;
|
||||
}
|
||||
|
||||
/** Recursively collect every descendant node (and self) of the given type. */
|
||||
function findAll(node, type, acc = []) {
|
||||
if (!node || typeof node !== "object") return acc;
|
||||
if (node.type === type) acc.push(node);
|
||||
const kids = Array.isArray(node.content) ? node.content : [];
|
||||
for (const k of kids) findAll(k, type, acc);
|
||||
return acc;
|
||||
}
|
||||
|
||||
/** Collect the set of mark types present anywhere in the document tree. */
|
||||
function collectMarkTypes(node, set = new Set()) {
|
||||
if (!node || typeof node !== "object") return set;
|
||||
if (Array.isArray(node.marks)) {
|
||||
for (const m of node.marks) set.add(m.type);
|
||||
}
|
||||
const kids = Array.isArray(node.content) ? node.content : [];
|
||||
for (const k of kids) collectMarkTypes(k, set);
|
||||
return set;
|
||||
}
|
||||
|
||||
test("buildCollabWsUrl: https + /api -> wss + /collab", () => {
|
||||
assert.equal(buildCollabWsUrl("https://h/api"), "wss://h/collab");
|
||||
});
|
||||
|
||||
test("buildCollabWsUrl: http (no /api) -> ws + /collab", () => {
|
||||
assert.equal(buildCollabWsUrl("http://h"), "ws://h/collab");
|
||||
});
|
||||
|
||||
test("buildCollabWsUrl: trailing slash on /api/ is handled", () => {
|
||||
assert.equal(buildCollabWsUrl("https://h/api/"), "wss://h/collab");
|
||||
});
|
||||
|
||||
test("buildCollabWsUrl: a base with trailing slash maps to /collab", () => {
|
||||
assert.equal(buildCollabWsUrl("https://h/"), "wss://h/collab");
|
||||
});
|
||||
|
||||
test("buildCollabWsUrl: query and hash on the base are dropped", () => {
|
||||
assert.equal(buildCollabWsUrl("https://h/api?foo=1#bar"), "wss://h/collab");
|
||||
});
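The five buildCollabWsUrl cases above describe a small URL mapping: swap the scheme, drop a trailing `/api`, discard query and hash, and append `/collab`. One way to get that behavior is sketched below using the standard `URL` class. The function name `buildCollabWsUrlSketch` is hypothetical, and keeping any other path prefix is an assumption the tests do not cover; the real `buildCollabWsUrl` may differ.

```javascript
// Hypothetical sketch of the mapping the tests describe; not the real helper.
function buildCollabWsUrlSketch(baseUrl) {
  const url = new URL(baseUrl);
  // https -> wss, anything else (http) -> ws.
  const protocol = url.protocol === "https:" ? "wss:" : "ws:";
  // Strip trailing slashes first, then a trailing /api segment if present.
  const path = url.pathname.replace(/\/+$/, "").replace(/\/api$/, "");
  // Query string and hash are dropped simply by not copying them over.
  return `${protocol}//${url.host}${path}/collab`;
}
```

Parsing with `URL` rather than string slicing means the query and fragment never need explicit handling, since only `host` and `pathname` are read.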
|
||||
|
||||
test("markdownToProseMirror: a :::warning block becomes a callout node typed warning", async () => {
|
||||
const doc = await markdownToProseMirror(":::warning\nhello\n:::");
|
||||
const callout = find(doc, "callout");
|
||||
assert.ok(callout, "expected a callout node");
|
||||
assert.equal(callout.attrs.type, "warning");
|
||||
});
|
||||
|
||||
test("markdownToProseMirror: a ::: line inside a fenced code block is not a callout delimiter", async () => {
|
||||
const doc = await markdownToProseMirror("```\n:::warning\nx\n:::\n```");
|
||||
assert.equal(find(doc, "callout"), null, "code-fenced ::: must not open a callout");
|
||||
assert.ok(find(doc, "codeBlock"), "the fenced block should stay a codeBlock");
|
||||
});
|
||||
|
||||
test("markdownToProseMirror: GFM checkbox list -> one taskList, two taskItems, no bulletList", async () => {
|
||||
const doc = await markdownToProseMirror("- [x] a\n- [ ] b");
|
||||
const taskLists = findAll(doc, "taskList");
|
||||
assert.equal(taskLists.length, 1, "expected exactly one taskList");
|
||||
const items = findAll(doc, "taskItem");
|
||||
assert.equal(items.length, 2, "expected two taskItems");
|
||||
assert.deepEqual(
|
||||
items.map((i) => i.attrs.checked),
|
||||
[true, false],
|
||||
);
|
||||
assert.equal(find(doc, "bulletList"), null, "no bulletList should remain");
|
||||
});
|
||||
|
||||
test("markdownToProseMirror: numbered checklist -> one taskList, no orderedList (ol phantom regression)", async () => {
|
||||
const doc = await markdownToProseMirror("1. [x] a\n2. [ ] b");
|
||||
const taskLists = findAll(doc, "taskList");
|
||||
assert.equal(taskLists.length, 1, "expected exactly one taskList");
|
||||
assert.equal(
|
||||
find(doc, "orderedList"),
|
||||
null,
|
||||
"a numbered checklist must not leave a phantom orderedList",
|
||||
);
|
||||
assert.deepEqual(
|
||||
findAll(doc, "taskItem").map((i) => i.attrs.checked),
|
||||
[true, false],
|
||||
);
|
||||
});
|
||||
|
||||
test("markdownToProseMirror: a plain numbered list stays an orderedList", async () => {
|
||||
const doc = await markdownToProseMirror("1. a\n2. b");
|
||||
assert.ok(find(doc, "orderedList"), "plain numbered list should be an orderedList");
|
||||
assert.equal(find(doc, "taskList"), null, "plain numbered list must not become a taskList");
|
||||
});
|
||||
|
||||
test("markdownToProseMirror: mark/sub/sup produce highlight, subscript, superscript marks", async () => {
|
||||
const doc = await markdownToProseMirror("<mark>h</mark> <sub>x</sub> <sup>y</sup>");
|
||||
const marks = collectMarkTypes(doc);
|
||||
assert.ok(marks.has("highlight"), "expected a highlight mark");
|
||||
assert.ok(marks.has("subscript"), "expected a subscript mark");
|
||||
assert.ok(marks.has("superscript"), "expected a superscript mark");
|
||||
});
|
||||
|
||||
test("markdownToProseMirror: an aligned GFM table maps header alignment", async () => {
|
||||
const doc = await markdownToProseMirror(
|
||||
"| a | b | c |\n|:--|:-:|--:|\n| 1 | 2 | 3 |",
|
||||
);
|
||||
const headers = findAll(doc, "tableHeader");
|
||||
assert.equal(headers.length, 3, "expected three header cells");
|
||||
assert.deepEqual(
|
||||
headers.map((h) => h.attrs.align),
|
||||
["left", "center", "right"],
|
||||
);
|
||||
});
|
||||
136
packages/mcp/test/unit/diff.test.mjs
Normal file
@@ -0,0 +1,136 @@
|
||||
import { test } from "node:test";
|
||||
import assert from "node:assert/strict";
|
||||
|
||||
import { diffDocs } from "../../build/lib/diff.js";
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Builders
|
||||
// ---------------------------------------------------------------------------
|
||||
const t = (text, marks) => (marks ? { type: "text", text, marks } : { type: "text", text });
|
||||
const para = (...children) => ({ type: "paragraph", content: children });
|
||||
const doc = (...children) => ({ type: "doc", content: children });
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Core diff: one inserted word
|
||||
// ---------------------------------------------------------------------------
|
||||
test("diffDocs detects a single inserted word", () => {
|
||||
const oldDoc = doc(para(t("Hello world")));
|
||||
const newDoc = doc(para(t("Hello brave world")));
|
||||
const r = diffDocs(oldDoc, newDoc);
|
||||
|
||||
assert.ok(r.summary.inserted > 0, "reports insertion length");
|
||||
assert.equal(r.summary.deleted, 0, "no deletions");
|
||||
const ins = r.changes.find((c) => c.op === "insert");
|
||||
assert.ok(ins, "has an insert change");
|
||||
assert.match(ins.text, /brave/);
|
||||
assert.match(r.markdown, /inserted/);
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Core diff: one deleted block
|
||||
// ---------------------------------------------------------------------------
|
||||
test("diffDocs detects a deleted block", () => {
|
||||
const oldDoc = doc(para(t("keep this")), para(t("remove this block")));
|
||||
const newDoc = doc(para(t("keep this")));
|
||||
const r = diffDocs(oldDoc, newDoc);
|
||||
|
||||
assert.ok(r.summary.deleted > 0, "reports deletion length");
|
||||
const del = r.changes.find((c) => c.op === "delete");
|
||||
assert.ok(del, "has a delete change");
|
||||
assert.match(del.text, /remove this block/);
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Integrity counts
|
||||
// ---------------------------------------------------------------------------
|
||||
test("diffDocs reports integrity counts as [old,new] tuples", () => {
|
||||
const link = [{ type: "link", attrs: { href: "http://x" } }];
|
||||
const image = { type: "image", attrs: { src: "/api/files/a.png" } };
|
||||
const callout = {
|
||||
type: "callout",
|
||||
attrs: { type: "info" },
|
||||
content: [para(t("note"))],
|
||||
};
|
||||
|
||||
const oldDoc = doc(
|
||||
para(t("a link", link)),
|
||||
image,
|
||||
callout,
|
||||
para(t("body with [1] and [2]")),
|
||||
);
|
||||
// new doc: drop the image, drop one footnote marker, keep link + callout.
|
||||
const newDoc = doc(
|
||||
para(t("a link", link)),
|
||||
callout,
|
||||
para(t("body with [1]")),
|
||||
);
|
||||
|
||||
const r = diffDocs(oldDoc, newDoc);
|
||||
assert.deepEqual(r.integrity.images, [1, 0]);
|
||||
assert.deepEqual(r.integrity.links, [1, 1]);
|
||||
assert.deepEqual(r.integrity.callouts, [1, 1]);
|
||||
assert.deepEqual(r.integrity.tables, [0, 0]);
|
||||
// footnote markers parsed in reading order from the body.
|
||||
assert.deepEqual(r.integrity.footnoteMarkers, [[1, 2], [1]]);
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Footnote markers stop at the notes heading
|
||||
// ---------------------------------------------------------------------------
|
||||
test("diffDocs footnote markers ignore the notes section", () => {
|
||||
const oldDoc = doc(
|
||||
para(t("body [1]")),
|
||||
{ type: "heading", attrs: { level: 2 }, content: [t("Примечания переводчика")] },
|
||||
{
|
||||
type: "orderedList",
|
||||
content: [
|
||||
{ type: "listItem", content: [para(t("note [1] inside list"))] },
|
||||
],
|
||||
},
|
||||
);
|
||||
const r = diffDocs(oldDoc, oldDoc);
|
||||
// Only the body [1] is counted, not the [1] inside the notes list.
|
||||
assert.deepEqual(r.integrity.footnoteMarkers, [[1], [1]]);
|
||||
assert.equal(r.summary.inserted, 0);
|
||||
assert.equal(r.summary.deleted, 0);
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Bug 3: links integrity counts UNIQUE links by href, not link-bearing runs.
|
||||
// A single link split across two runs (link+bold, then link) is one link.
|
||||
// ---------------------------------------------------------------------------
|
||||
test("diffDocs counts a link split across two runs as one link", () => {
|
||||
const link = [{ type: "link", attrs: { href: "http://x" } }];
|
||||
const linkBold = [
|
||||
{ type: "link", attrs: { href: "http://x" } },
|
||||
{ type: "bold" },
|
||||
];
|
||||
// One logical link to http://x rendered as two adjacent runs.
|
||||
const splitDoc = doc(para(t("see ", linkBold), t("the link", link), t(" here")));
|
||||
// Same single href represented as a single run.
|
||||
const wholeDoc = doc(para(t("see the link", link), t(" here")));
|
||||
|
||||
const r = diffDocs(splitDoc, wholeDoc);
|
||||
// Unique-by-href: both sides have exactly one distinct link.
|
||||
assert.deepEqual(r.integrity.links, [1, 1]);
|
||||
});
|
||||
|
||||
test("diffDocs counts two distinct hrefs as two links", () => {
|
||||
const a = [{ type: "link", attrs: { href: "http://a" } }];
|
||||
const b = [{ type: "link", attrs: { href: "http://b" } }];
|
||||
const oldDoc = doc(para(t("one", a), t(" two", b)));
|
||||
// new doc drops the second link.
|
||||
const newDoc = doc(para(t("one", a), t(" two")));
|
||||
const r = diffDocs(oldDoc, newDoc);
|
||||
assert.deepEqual(r.integrity.links, [2, 1]);
|
||||
});
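The two link tests above distinguish counting link-bearing text runs from counting distinct links. A minimal sketch of the unique-by-href count is shown here; `countUniqueLinks` is a hypothetical helper name, and the real `diffDocs` integrity code may collect links differently.

```javascript
// Hypothetical sketch: count distinct link hrefs in a ProseMirror-style tree.
function countUniqueLinks(node, hrefs = new Set()) {
  if (!node || typeof node !== "object") return hrefs.size;
  const marks = Array.isArray(node.marks) ? node.marks : [];
  for (const mark of marks) {
    // Two runs that share an href collapse into one Set entry.
    if (mark.type === "link" && mark.attrs && mark.attrs.href) {
      hrefs.add(mark.attrs.href);
    }
  }
  const kids = Array.isArray(node.content) ? node.content : [];
  for (const kid of kids) countUniqueLinks(kid, hrefs);
  return hrefs.size;
}
```

Because the hrefs go into a `Set`, a link split into a bold run and a plain run contributes one entry, while two different hrefs contribute two.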
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Identical docs produce no changes
|
||||
// ---------------------------------------------------------------------------
|
||||
test("diffDocs on identical docs reports no changes", () => {
|
||||
const d = doc(para(t("unchanged")));
|
||||
const r = diffDocs(d, d);
|
||||
assert.equal(r.changes.length, 0);
|
||||
assert.equal(r.summary.blocksChanged, 0);
|
||||
});
|
||||
190
packages/mcp/test/unit/docmost-md-roundtrip.test.mjs
Normal file
@@ -0,0 +1,190 @@
|
||||
import { test } from "node:test";
|
||||
import assert from "node:assert/strict";
|
||||
|
||||
import {
|
||||
serializeDocmostMarkdown,
|
||||
parseDocmostMarkdown,
|
||||
} from "../../build/lib/markdown-document.js";
|
||||
import { convertProseMirrorToMarkdown } from "../../build/lib/markdown-converter.js";
|
||||
import { markdownToProseMirror } from "../../build/lib/collaboration.js";
|
||||
|
||||
/** Recursively find the first descendant node (or self) of the given type. */
|
||||
function find(node, type) {
|
||||
if (!node || typeof node !== "object") return null;
|
||||
if (node.type === type) return node;
|
||||
const kids = Array.isArray(node.content) ? node.content : [];
|
||||
for (const k of kids) {
|
||||
const r = find(k, type);
|
||||
if (r) return r;
|
||||
}
|
||||
return null;
|
||||
}
|
||||
|
||||
/** Recursively collect every descendant node (and self) of the given type. */
|
||||
function findAll(node, type, acc = []) {
|
||||
if (!node || typeof node !== "object") return acc;
|
||||
if (node.type === type) acc.push(node);
|
||||
const kids = Array.isArray(node.content) ? node.content : [];
|
||||
for (const k of kids) findAll(k, type, acc);
|
||||
return acc;
|
||||
}
|
||||
|
||||
/** Find the first text node carrying a mark of the given type. */
|
||||
function findTextWithMark(node, markType) {
|
||||
for (const t of findAll(node, "text")) {
|
||||
if (Array.isArray(t.marks) && t.marks.some((m) => m.type === markType)) {
|
||||
return t;
|
||||
}
|
||||
}
|
||||
return null;
|
||||
}
|
||||
|
||||
test("serialize/parse: meta and comments survive a round-trip; body recovered", () => {
|
||||
const meta = {
|
||||
version: 1,
|
||||
pageId: "p1",
|
||||
slugId: "s1",
|
||||
title: "Hello",
|
||||
spaceId: "sp1",
|
||||
parentPageId: null,
|
||||
};
|
||||
const body = "# Title\n\nSome **bold** body text.";
|
||||
const comments = [
|
||||
{ id: "c1", content: "a note", resolved: false },
|
||||
{ id: "c2", content: "another", resolved: true },
|
||||
];
|
||||
|
||||
const full = serializeDocmostMarkdown(meta, body, comments);
|
||||
const parsed = parseDocmostMarkdown(full);
|
||||
|
||||
assert.deepEqual(parsed.meta, meta);
|
||||
assert.deepEqual(parsed.comments, comments);
|
||||
assert.equal(parsed.body, body);
|
||||
});
|
||||
|
||||
test("serialize: a page with no comments still emits an empty comments block", () => {
|
||||
const full = serializeDocmostMarkdown({ version: 1 }, "body", []);
|
||||
assert.match(full, /<!--\s*docmost:comments\s*\n\[\]\n-->/);
|
||||
const parsed = parseDocmostMarkdown(full);
|
||||
assert.deepEqual(parsed.comments, []);
|
||||
});
|
||||
|
||||
test("parse: plain markdown with no blocks -> meta=null, comments=null, body=input", () => {
|
||||
const input = " # Just a heading\n\nplain body ";
|
||||
const parsed = parseDocmostMarkdown(input);
|
||||
assert.equal(parsed.meta, null);
|
||||
assert.equal(parsed.comments, null);
|
||||
assert.equal(parsed.body, input.trim());
|
||||
});
|
||||
|
||||
test("parse: tolerant to CRLF line endings", () => {
|
||||
const meta = { version: 1, pageId: "p9" };
|
||||
const body = "line one\n\nline two";
|
||||
const full = serializeDocmostMarkdown(meta, body, []).replace(/\n/g, "\r\n");
|
||||
const parsed = parseDocmostMarkdown(full);
|
||||
assert.deepEqual(parsed.meta, meta);
|
||||
assert.deepEqual(parsed.comments, []);
|
||||
assert.equal(parsed.body, body);
|
||||
});
|
||||
|
||||
test("parse: a malformed present meta block throws a clear error", () => {
|
||||
const bad = "<!-- docmost:meta\n{not valid json}\n-->\n\nbody\n";
|
||||
assert.throws(() => parseDocmostMarkdown(bad), /docmost:meta JSON/);
|
||||
});
|
||||
|
||||
test("parse: a literal comments-block in the body is left in the body when a real trailing block follows", () => {
|
||||
// The body documents the format (e.g. inside a fenced code block) AND there is
|
||||
// a real trailing comments block. Only the final, document-ending block is
|
||||
// metadata; the literal stays in the body verbatim.
|
||||
const meta = { version: 1, pageId: "p-literal" };
|
||||
const literal = "```\n<!-- docmost:comments\n[1]\n-->\n```";
|
||||
const body = `# Doc\n\nExample of the format:\n\n${literal}`;
|
||||
const realComments = [{ id: "c1", content: "real" }];
|
||||
|
||||
const full = serializeDocmostMarkdown(meta, body, realComments);
|
||||
const parsed = parseDocmostMarkdown(full);
|
||||
|
||||
// The REAL trailing comments are parsed.
|
||||
assert.deepEqual(parsed.comments, realComments);
|
||||
// The literal block text is still present in the recovered body.
|
||||
assert.ok(
|
||||
parsed.body.includes("<!-- docmost:comments\n[1]\n-->"),
|
||||
"expected the literal comments block to remain in the body",
|
||||
);
|
||||
assert.equal(parsed.body, body.trim());
|
||||
});
|
||||
|
||||
test("parse: a body-ending literal comments block (no real trailing block) is treated as the final block", () => {
|
||||
// Hand-written file whose ONLY `docmost:comments` opener is a literal that
|
||||
// also ends the document. Per the implementation, the final document-ending
|
||||
// block IS treated as metadata, so it is parsed and stripped from the body.
|
||||
const input = "# Doc\n\nsome text\n\n<!-- docmost:comments\n[1]\n-->\n";
|
||||
const parsed = parseDocmostMarkdown(input);
|
||||
assert.equal(parsed.meta, null);
|
||||
assert.deepEqual(parsed.comments, [1]);
|
||||
assert.equal(parsed.body, "# Doc\n\nsome text");
|
||||
});
|
||||
|
||||
test("parse: a literal comments block NOT ending the document stays entirely in the body", () => {
|
||||
// The literal opener/closer is followed by more body content, so it does not
|
||||
// end the document and is therefore left untouched in the body.
|
||||
const input =
|
||||
"# Doc\n\n<!-- docmost:comments\n[1]\n-->\n\nmore body after it\n";
|
||||
const parsed = parseDocmostMarkdown(input);
|
||||
assert.equal(parsed.meta, null);
|
||||
assert.equal(parsed.comments, null);
|
||||
assert.equal(parsed.body, input.trim());
|
||||
});
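The three literal-block tests above all follow from one rule: a `docmost:comments` block counts as metadata only when it ends the document. A rough sketch of that rule is given below. `splitTrailingCommentsSketch` is a hypothetical name, the sketch ignores the meta block entirely, and the exact opener and closer strings are inferred from the test fixtures, so treat it as an illustration rather than the parser's real logic.

```javascript
// Hypothetical sketch: peel off a comments block only if it ends the document.
function splitTrailingCommentsSketch(text) {
  const normalized = text.replace(/\r\n/g, "\n").trim();
  const opener = "<!-- docmost:comments\n";
  const closer = "\n-->";
  const none = { body: normalized, comments: null };
  // If the document does not end with the closer, nothing is metadata.
  if (!normalized.endsWith(closer)) return none;
  // The LAST opener wins, so earlier literal blocks stay in the body.
  const start = normalized.lastIndexOf(opener);
  if (start === -1) return none;
  const json = normalized.slice(start + opener.length, normalized.length - closer.length);
  // Simplification: another "-->" inside means the final closer belongs to
  // a different comment, so leave the text untouched.
  if (json.includes("-->")) return none;
  return { body: normalized.slice(0, start).trim(), comments: JSON.parse(json) };
}
```

Using `lastIndexOf` for the opener is what keeps a fenced example of the format in the body when a real trailing block follows it.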
|
||||
|
||||
test("export emits comment anchors and they round-trip back to a comment mark", () => {
|
||||
// A small ProseMirror doc with a text run carrying a `comment` mark.
|
||||
const doc = {
|
||||
type: "doc",
|
||||
content: [
|
||||
{
|
||||
type: "paragraph",
|
||||
content: [
|
||||
{ type: "text", text: "before " },
|
||||
{
|
||||
type: "text",
|
||||
text: "anchored",
|
||||
marks: [{ type: "comment", attrs: { commentId: "cm-123" } }],
|
||||
},
|
||||
{ type: "text", text: " after" },
|
||||
],
|
||||
},
|
||||
],
|
||||
};
|
||||
|
||||
const body = convertProseMirrorToMarkdown(doc);
|
||||
assert.match(body, /data-comment-id="cm-123"/);
|
||||
|
||||
return markdownToProseMirror(body).then((rebuilt) => {
|
||||
const commented = findTextWithMark(rebuilt, "comment");
|
||||
assert.ok(commented, "expected a text node with a comment mark");
|
||||
const mark = commented.marks.find((m) => m.type === "comment");
|
||||
assert.equal(mark.attrs.commentId, "cm-123");
|
||||
});
|
||||
});
|
||||
|
||||
test("drawio round-trips through export and import", () => {
|
||||
const doc = {
|
||||
type: "doc",
|
||||
content: [
|
||||
{
|
||||
type: "drawio",
|
||||
attrs: { src: "https://example/diagram.xml", attachmentId: "att-7" },
|
||||
},
|
||||
],
|
||||
};
|
||||
|
||||
const body = convertProseMirrorToMarkdown(doc);
|
||||
assert.match(body, /data-type="drawio"/);
|
||||
assert.match(body, /data-src="https:\/\/example\/diagram\.xml"/);
|
||||
|
||||
return markdownToProseMirror(body).then((rebuilt) => {
|
||||
const diagram = find(rebuilt, "drawio");
|
||||
assert.ok(diagram, "expected a drawio node after import");
|
||||
assert.equal(diagram.attrs.src, "https://example/diagram.xml");
|
||||
});
|
||||
});
|
||||
173
packages/mcp/test/unit/filters.test.mjs
Normal file
@@ -0,0 +1,173 @@
|
||||
import { test } from "node:test";
|
||||
import assert from "node:assert/strict";
|
||||
|
||||
import { filterComment, filterPage } from "../../build/lib/filters.js";
|
||||
|
||||
test("filterComment includes resolvedAt/resolvedById as null when absent", () => {
|
||||
const result = filterComment({
|
||||
id: "c1",
|
||||
pageId: "p1",
|
||||
content: "hello",
|
||||
createdAt: "2026-01-01T00:00:00.000Z",
|
||||
});
|
||||
|
||||
assert.equal(result.resolvedAt, null);
|
||||
assert.equal(result.resolvedById, null);
|
||||
});
|
||||
|
||||
test("filterComment passes through resolvedAt/resolvedById when present", () => {
|
||||
const result = filterComment({
|
||||
id: "c1",
|
||||
pageId: "p1",
|
||||
content: "hello",
|
||||
createdAt: "2026-01-01T00:00:00.000Z",
|
||||
resolvedAt: "2026-02-02T10:00:00.000Z",
|
||||
resolvedById: "user-42",
|
||||
});
|
||||
|
||||
assert.equal(result.resolvedAt, "2026-02-02T10:00:00.000Z");
|
||||
assert.equal(result.resolvedById, "user-42");
|
||||
});
|
||||
|
||||
test("filterComment still includes id/content/createdAt", () => {
|
||||
const result = filterComment({
|
||||
id: "c-id",
|
||||
pageId: "p1",
|
||||
content: "the body",
|
||||
createdAt: "2026-03-03T03:03:03.000Z",
|
||||
});
|
||||
|
||||
assert.equal(result.id, "c-id");
|
||||
assert.equal(result.content, "the body");
|
||||
assert.equal(result.createdAt, "2026-03-03T03:03:03.000Z");
|
||||
});
|
||||
|
||||
test("filterComment uses markdownContent override when provided", () => {
|
||||
const result = filterComment(
|
||||
{
|
||||
id: "c1",
|
||||
pageId: "p1",
|
||||
content: "raw json content",
|
||||
createdAt: "2026-01-01T00:00:00.000Z",
|
||||
},
|
||||
"**markdown** content",
|
||||
);
|
||||
|
||||
assert.equal(result.content, "**markdown** content");
|
||||
});
|
||||
|
||||
test("filterComment is null-safe on missing creator", () => {
|
||||
const result = filterComment({
|
||||
id: "c1",
|
||||
pageId: "p1",
|
||||
content: "hello",
|
||||
createdAt: "2026-01-01T00:00:00.000Z",
|
||||
creatorId: "u1",
|
||||
// no `creator` object present
|
||||
});
|
||||
|
||||
assert.equal(result.creatorName, null);
|
||||
assert.equal(result.creatorId, "u1");
|
||||
});
|
||||
|
||||
test("filterComment reads creator.name when creator present", () => {
|
||||
const result = filterComment({
|
||||
id: "c1",
|
||||
pageId: "p1",
|
||||
content: "hello",
|
||||
createdAt: "2026-01-01T00:00:00.000Z",
|
||||
creator: { name: "Alice" },
|
||||
});
|
||||
|
||||
assert.equal(result.creatorName, "Alice");
|
||||
});
|
||||
|
||||
test("filterComment defaults selection/type/parentCommentId/editedAt", () => {
|
||||
const result = filterComment({
|
||||
id: "c1",
|
||||
pageId: "p1",
|
||||
content: "hello",
|
||||
createdAt: "2026-01-01T00:00:00.000Z",
|
||||
});
|
||||
|
||||
assert.equal(result.selection, null);
|
||||
assert.equal(result.type, "page");
|
||||
assert.equal(result.parentCommentId, null);
|
||||
assert.equal(result.editedAt, null);
|
||||
});
|
||||
|
||||
test("filterPage selects expected fields", () => {
|
||||
const result = filterPage({
|
||||
id: "page-1",
|
||||
slugId: "slug-1",
|
||||
title: "My Page",
|
||||
parentPageId: "parent-1",
|
||||
spaceId: "space-1",
|
||||
isLocked: false,
|
||||
createdAt: "2026-01-01T00:00:00.000Z",
|
||||
updatedAt: "2026-01-02T00:00:00.000Z",
|
||||
deletedAt: null,
|
||||
// extra fields that must be dropped
|
||||
extraneous: "should not appear",
|
||||
content: "should be ignored when not passed as arg",
|
||||
});
|
||||
|
||||
assert.deepEqual(result, {
|
||||
id: "page-1",
|
||||
slugId: "slug-1",
|
||||
title: "My Page",
|
||||
parentPageId: "parent-1",
|
||||
spaceId: "space-1",
|
||||
isLocked: false,
|
||||
createdAt: "2026-01-01T00:00:00.000Z",
|
||||
updatedAt: "2026-01-02T00:00:00.000Z",
|
||||
deletedAt: null,
|
||||
});
|
||||
});
|
||||
|
||||
test("filterPage omits content key when content arg is not a string", () => {
|
||||
const result = filterPage({ id: "p1", title: "t" });
|
||||
assert.equal("content" in result, false);
|
||||
});
|
||||
|
||||
test("filterPage includes content when arg is a string", () => {
|
||||
const result = filterPage({ id: "p1", title: "t" }, "# Heading");
|
||||
assert.equal(result.content, "# Heading");
|
||||
});
|
||||
|
||||
test("filterPage includes content when arg is an empty string", () => {
|
||||
const result = filterPage({ id: "p1", title: "t" }, "");
|
||||
assert.equal("content" in result, true);
|
||||
assert.equal(result.content, "");
|
||||
});
|
||||
|
||||
test("filterPage omits subpages when none provided", () => {
|
||||
const result = filterPage({ id: "p1", title: "t" });
|
||||
assert.equal("subpages" in result, false);
|
||||
});
|
||||
|
||||
test("filterPage omits subpages when an empty array is provided", () => {
|
||||
const result = filterPage({ id: "p1", title: "t" }, undefined, []);
|
||||
assert.equal("subpages" in result, false);
|
||||
});
|
||||
|
||||
test("filterPage maps subpages to id/title only", () => {
|
||||
const result = filterPage({ id: "p1", title: "t" }, undefined, [
|
||||
{ id: "s1", title: "Sub One", extra: "drop" },
|
||||
{ id: "s2", title: "Sub Two" },
|
||||
]);
|
||||
|
||||
assert.deepEqual(result.subpages, [
|
||||
{ id: "s1", title: "Sub One" },
|
||||
{ id: "s2", title: "Sub Two" },
|
||||
]);
|
||||
});
|
||||
|
||||
test("filterPage includes both content and subpages together", () => {
|
||||
const result = filterPage({ id: "p1", title: "t" }, "body", [
|
||||
{ id: "s1", title: "Sub" },
|
||||
]);
|
||||
|
||||
assert.equal(result.content, "body");
|
||||
assert.deepEqual(result.subpages, [{ id: "s1", title: "Sub" }]);
|
||||
});
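Taken together, the filterPage tests above describe a whitelist projection with two optional keys. A minimal sketch consistent with those assertions is shown here. `filterPageSketch` and `PAGE_FIELDS` are hypothetical names, and copying absent fields through as `undefined` is an assumption the tests do not check.

```javascript
// Hypothetical sketch of the projection the filterPage tests describe.
const PAGE_FIELDS = [
  "id", "slugId", "title", "parentPageId", "spaceId",
  "isLocked", "createdAt", "updatedAt", "deletedAt",
];

function filterPageSketch(page, content, subpages) {
  const out = {};
  // Copy only whitelisted fields; anything else on the page is dropped.
  for (const key of PAGE_FIELDS) out[key] = page[key];
  // An empty string is still a string, so "" keeps the content key.
  if (typeof content === "string") out.content = content;
  // Subpages appear only when the array is non-empty, reduced to id/title.
  if (Array.isArray(subpages) && subpages.length > 0) {
    out.subpages = subpages.map(({ id, title }) => ({ id, title }));
  }
  return out;
}
```

The `typeof content === "string"` check, rather than a truthiness check, is what separates the empty-string case from the omitted-argument case.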
|
||||
173
packages/mcp/test/unit/json-edit.test.mjs
Normal file
@@ -0,0 +1,173 @@
|
||||
import { test } from "node:test";
|
||||
import assert from "node:assert/strict";
|
||||
|
||||
import { applyTextEdits } from "../../build/lib/json-edit.js";
|
||||
|
||||
// Helpers to build small ProseMirror docs.
|
||||
const textNode = (text, extra = {}) => ({ type: "text", text, ...extra });
|
||||
const paragraph = (...children) => ({ type: "paragraph", content: children });
|
||||
const doc = (...children) => ({ type: "doc", content: children });
|
||||
|
||||
test("single-match replace preserves ids/marks and reports replacements===1", () => {
|
||||
const input = doc({
|
||||
type: "paragraph",
|
||||
attrs: { id: "para-1" },
|
||||
content: [
|
||||
textNode("Hello world", { marks: [{ type: "bold" }] }),
|
||||
],
|
||||
});
|
||||
|
||||
const { doc: out, results } = applyTextEdits(input, [
|
||||
{ find: "world", replace: "there" },
|
||||
]);
|
||||
|
||||
assert.deepEqual(results, [{ find: "world", replacements: 1 }]);
|
||||
|
||||
const para = out.content[0];
|
||||
// Paragraph id attribute is preserved.
|
||||
assert.equal(para.attrs.id, "para-1");
|
||||
const tnode = para.content[0];
|
||||
// Text node marks are preserved.
|
||||
assert.deepEqual(tnode.marks, [{ type: "bold" }]);
|
||||
assert.equal(tnode.text, "Hello there");
|
||||
});
|
||||
|
||||
test("zero match throws not found", () => {
|
||||
const input = doc(paragraph(textNode("Hello world")));
|
||||
|
||||
assert.throws(
|
||||
() => applyTextEdits(input, [{ find: "absent", replace: "x" }]),
|
||||
/not found/,
|
||||
);
|
||||
});
|
||||
|
||||
test("text split across two text nodes (one bold) throws spans-multiple-runs", () => {
|
||||
// "Hello world" is split: "Hello " (plain) + "world" (bold). No single text
|
||||
// node contains "Hello world", but the collected document text does.
|
||||
const input = doc(
|
||||
paragraph(
|
||||
textNode("Hello "),
|
||||
textNode("world", { marks: [{ type: "bold" }] }),
|
||||
),
|
||||
);
|
||||
|
||||
assert.throws(
|
||||
() => applyTextEdits(input, [{ find: "Hello world", replace: "x" }]),
|
||||
/spans/,
|
||||
);
|
||||
});
|
||||
|
||||
test("multi-match without replaceAll throws matches", () => {
|
||||
// "ab" appears twice inside a single text node.
|
||||
const input = doc(paragraph(textNode("ab cd ab")));
|
||||
|
||||
assert.throws(
|
||||
() => applyTextEdits(input, [{ find: "ab", replace: "x" }]),
|
||||
/matches/,
|
||||
);
|
||||
});
|
||||
|
||||
test("replaceAll replaces all occurrences", () => {
|
||||
const input = doc(
|
||||
paragraph(textNode("foo and foo")),
|
||||
paragraph(textNode("more foo")),
|
||||
);
|
||||
|
||||
const { doc: out, results } = applyTextEdits(input, [
|
||||
{ find: "foo", replace: "bar", replaceAll: true },
|
||||
]);
|
||||
|
||||
// 2 in the first paragraph, 1 in the second = 3 total.
|
||||
assert.deepEqual(results, [{ find: "foo", replacements: 3 }]);
|
||||
assert.equal(out.content[0].content[0].text, "bar and bar");
|
||||
assert.equal(out.content[1].content[0].text, "more bar");
|
||||
});
|
||||
|
||||
test("replacement containing $&, $1, $$ is inserted LITERALLY (regression)", () => {
|
||||
const input = doc(paragraph(textNode("token here")));
|
||||
|
||||
const literal = "price $& cost $1 dollars $$ end";
|
||||
const { doc: out } = applyTextEdits(input, [
|
||||
{ find: "token", replace: literal },
|
||||
]);
|
||||
|
||||
// The replacement must appear verbatim, NOT regex-expanded.
|
||||
assert.equal(out.content[0].content[0].text, `${literal} here`);
|
||||
// Be explicit that the find text was not re-injected via $&.
|
||||
assert.ok(out.content[0].content[0].text.includes("$&"));
|
||||
assert.ok(!out.content[0].content[0].text.includes("token"));
|
||||
});
|
||||
|
||||
test("$ patterns are inserted literally under replaceAll too", () => {
|
||||
const input = doc(paragraph(textNode("x and x")));
|
||||
|
||||
const { doc: out } = applyTextEdits(input, [
|
||||
{ find: "x", replace: "$&$1$$", replaceAll: true },
|
||||
]);
|
||||
|
||||
assert.equal(out.content[0].content[0].text, "$&$1$$ and $&$1$$");
|
||||
});
|
||||
|
||||
test("empty replacement prunes the emptied text node", () => {
|
||||
// A paragraph whose only text node becomes empty: the node must be pruned.
|
||||
const input = doc(
|
||||
paragraph(
|
||||
textNode("DELETE", { marks: [{ type: "italic" }] }),
|
||||
textNode(" kept"),
|
||||
),
|
||||
);
|
||||
|
||||
const { doc: out, results } = applyTextEdits(input, [
|
||||
{ find: "DELETE", replace: "" },
|
||||
]);
|
||||
|
||||
assert.deepEqual(results, [{ find: "DELETE", replacements: 1 }]);
|
||||
const para = out.content[0];
|
||||
// The emptied first text node is gone; only the " kept" node remains.
|
||||
assert.equal(para.content.length, 1);
|
||||
assert.equal(para.content[0].text, " kept");
|
||||
});
|
||||
|
||||
test("multi-edit array applied in order", () => {
|
||||
const input = doc(paragraph(textNode("alpha beta")));
|
||||
|
||||
const { doc: out, results } = applyTextEdits(input, [
|
||||
{ find: "alpha", replace: "ALPHA" },
|
||||
{ find: "beta", replace: "BETA" },
|
||||
]);
|
||||
|
||||
assert.deepEqual(results, [
|
||||
{ find: "alpha", replacements: 1 },
|
||||
{ find: "beta", replacements: 1 },
|
||||
]);
|
||||
assert.equal(out.content[0].content[0].text, "ALPHA BETA");
|
||||
});
|
||||
|
||||
test("second edit can target text produced by the first (ordered application)", () => {
|
||||
const input = doc(paragraph(textNode("one")));
|
||||
|
||||
const { doc: out, results } = applyTextEdits(input, [
|
||||
{ find: "one", replace: "two" },
|
||||
{ find: "two", replace: "three" },
|
||||
]);
|
||||
|
||||
assert.deepEqual(results, [
|
||||
{ find: "one", replacements: 1 },
|
||||
{ find: "two", replacements: 1 },
|
||||
]);
|
||||
assert.equal(out.content[0].content[0].text, "three");
|
||||
});
|
||||
|
||||
test("input doc is not mutated", () => {
|
||||
const input = doc(paragraph(textNode("immutable source")));
|
||||
const snapshot = JSON.parse(JSON.stringify(input));
|
||||
|
||||
const { doc: out } = applyTextEdits(input, [
|
||||
{ find: "immutable", replace: "changed" },
|
||||
]);
|
||||
|
||||
// Original is untouched; the returned doc is a distinct object.
|
||||
assert.deepEqual(input, snapshot);
|
||||
assert.notEqual(out, input);
|
||||
assert.equal(out.content[0].content[0].text, "changed source");
|
||||
});
|
||||
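The two `$`-pattern regression tests above guard against a familiar pitfall: `String.prototype.replace` expands `$&`, `$1`, and `$$` when the replacement is a string. The sketch below is not the package's `applyTextEdits`; `replaceLiteral` is a hypothetical helper that shows one way to keep a replacement verbatim, by passing a function instead of a string.

```javascript
// Hypothetical helper, not part of @docmost/mcp: replace the first occurrence
// of `find` in `text` with `replacement`, inserted verbatim.
function replaceLiteral(text, find, replacement) {
  // A function replacer's return value is used as-is, so "$&", "$1" and "$$"
  // are never interpreted as special replacement patterns.
  return text.replace(find, () => replacement);
}

// With a plain string replacement, "$&" expands to the matched text.
const expanded = "token here".replace("token", "[$&]");
// With the function replacer, the same replacement stays literal.
const literal = replaceLiteral("token here", "token", "[$&]");

console.log(expanded); // [token] here
console.log(literal); // [$&] here
```

Whether the real implementation uses a function replacer, string slicing, or something else is not visible in this diff; the tests only pin down the observable behaviour.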
151
packages/mcp/test/unit/markdown-converter.test.mjs
Normal file
@@ -0,0 +1,151 @@
|
||||
import { test } from "node:test";
|
||||
import assert from "node:assert/strict";
|
||||
|
||||
import { convertProseMirrorToMarkdown } from "../../build/lib/markdown-converter.js";
|
||||
|
||||
// ProseMirror builders.
|
||||
const text = (t, marks) => (marks ? { type: "text", text: t, marks } : { type: "text", text: t });
|
||||
const paragraph = (...content) => ({ type: "paragraph", content });
|
||||
const doc = (...content) => ({ type: "doc", content });
|
||||
const listItem = (...content) => ({ type: "listItem", content });
|
||||
const bulletList = (...items) => ({ type: "bulletList", content: items });
|
||||
const orderedList = (...items) => ({ type: "orderedList", content: items });
|
||||
|
||||
test("nested bulletList with 3 children keeps all children indented under the parent", () => {
|
||||
const input = doc(
|
||||
bulletList(
|
||||
listItem(
|
||||
paragraph(text("Parent")),
|
||||
bulletList(
|
||||
listItem(paragraph(text("A"))),
|
||||
listItem(paragraph(text("B"))),
|
||||
listItem(paragraph(text("C"))),
|
||||
),
|
||||
),
|
||||
),
|
||||
);
|
||||
|
||||
assert.equal(
|
||||
convertProseMirrorToMarkdown(input),
"- Parent\n  - A\n  - B\n  - C",
|
||||
);
|
||||
});
|
||||
|
||||
test("nested list under an ordered item indents 3 spaces", () => {
|
||||
const input = doc(
|
||||
orderedList(
|
||||
listItem(
|
||||
paragraph(text("Parent")),
|
||||
bulletList(listItem(paragraph(text("Child")))),
|
||||
),
|
||||
),
|
||||
);
|
||||
|
||||
assert.equal(
|
||||
convertProseMirrorToMarkdown(input),
"1. Parent\n   - Child",
|
||||
);
|
||||
});
|
||||
|
||||
test("link with title -> [t](url \"title\")", () => {
|
||||
const input = doc(
|
||||
paragraph(
|
||||
text("click", [
|
||||
{ type: "link", attrs: { href: "https://example.com", title: "the title" } },
|
||||
]),
|
||||
),
|
||||
);
|
||||
|
||||
assert.equal(
|
||||
convertProseMirrorToMarkdown(input),
|
||||
'[click](https://example.com "the title")',
|
||||
);
|
||||
});
|
||||
|
||||
test("hardBreak -> trailing two-spaces+newline", () => {
|
||||
const input = doc(
|
||||
paragraph(text("line1"), { type: "hardBreak" }, text("line2")),
|
||||
);
|
||||
|
||||
assert.equal(convertProseMirrorToMarkdown(input), "line1  \nline2");
|
||||
});
|
||||
|
||||
test("table cell with two block children joined by a space (and a pipe escaped)", () => {
|
||||
const input = doc({
|
||||
type: "table",
|
||||
content: [
|
||||
{
|
||||
type: "tableRow",
|
||||
content: [
|
||||
{
|
||||
type: "tableCell",
|
||||
content: [paragraph(text("a|b")), paragraph(text("c"))],
|
||||
},
|
||||
],
|
||||
},
|
||||
],
|
||||
});
|
||||
|
||||
// Single-column header row + separator. The cell joins its two paragraphs
|
||||
// with a space ("a|b c") then escapes the pipe -> "a\|b c".
|
||||
assert.equal(
|
||||
convertProseMirrorToMarkdown(input),
|
||||
"| a\\|b c |\n| --- |",
|
||||
);
|
||||
});
|
||||
|
||||
test("code block trailing newline trimmed", () => {
|
||||
const input = doc({
|
||||
type: "codeBlock",
|
||||
attrs: { language: "js" },
|
||||
content: [text("const a = 1;\n")],
|
||||
});
|
||||
|
||||
// The single trailing newline inside the code is trimmed; fences add one.
|
||||
assert.equal(
|
||||
convertProseMirrorToMarkdown(input),
|
||||
"```js\nconst a = 1;\n```",
|
||||
);
|
||||
});
|
||||
|
||||
test("textAlign value: delimiting double-quote escaped (attribute-safe, idempotent; < > left literal/inert)", () => {
|
||||
const input = doc({
|
||||
type: "paragraph",
|
||||
attrs: { textAlign: 'right"><b' },
|
||||
content: [text("body")],
|
||||
});
|
||||
|
||||
// Attribute values escape only & and " so the value cannot break out of the
|
||||
// quoted attribute. < and > are left literal: parse5/jsdom does NOT decode
|
||||
// </> inside attribute values, so escaping them would corrupt the value
|
||||
// and accumulate on every round-trip. The literal < > are inert inside quotes.
|
||||
assert.equal(
|
||||
convertProseMirrorToMarkdown(input),
'<div align="right&quot;><b">body</div>',
|
||||
);
|
||||
});
|
||||
|
||||
test("highlight color: delimiting double-quote escaped (attribute-safe; < > inert, and import sanitizes the color)", () => {
|
||||
const input = doc(
|
||||
paragraph(
|
||||
text("hi", [{ type: "highlight", attrs: { color: 'red"><script' } }]),
|
||||
),
|
||||
);
|
||||
|
||||
assert.equal(
|
||||
convertProseMirrorToMarkdown(input),
'<mark style="background-color: red&quot;><script">hi</mark>',
|
||||
);
|
||||
});
|
||||
|
||||
test("empty task item still emits its marker", () => {
|
||||
const input = doc({
|
||||
type: "taskList",
|
||||
content: [
|
||||
{ type: "taskItem", attrs: { checked: false }, content: [] },
|
||||
{ type: "taskItem", attrs: { checked: true }, content: [] },
|
||||
],
|
||||
});
|
||||
|
||||
assert.equal(convertProseMirrorToMarkdown(input), "- [ ]\n- [x]");
|
||||
});
|
||||
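The table-cell test above expects a cell's block children to be joined with a space and any pipe character to be escaped. A minimal sketch of that rule, assuming each block has already been rendered to a string; `renderCell` and `renderSingleColumnTable` are illustrative names, not the converter's API.

```javascript
// Illustrative only: join a cell's already-rendered blocks and escape pipes
// so the cell text cannot end the Markdown table column early.
function renderCell(blockTexts) {
  return blockTexts.join(" ").replace(/\|/g, "\\|");
}

// Render a one-row, one-column table: header row plus separator row.
function renderSingleColumnTable(blockTexts) {
  return `| ${renderCell(blockTexts)} |\n| --- |`;
}

const tableMarkdown = renderSingleColumnTable(["a|b", "c"]);
console.log(tableMarkdown);
```

The sketch joins first and escapes second, the order described in the test's comment; since the joining space contains no pipe, escaping each block before joining would give the same result.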
301
packages/mcp/test/unit/node-ops-table.test.mjs
Normal file
@@ -0,0 +1,301 @@
|
||||
import { test } from "node:test";
|
||||
import assert from "node:assert/strict";
|
||||
|
||||
import {
|
||||
insertNodeRelative,
|
||||
sanitizeForYjs,
|
||||
findUnstorableAttr,
|
||||
} from "../../build/lib/node-ops.js";
|
||||
|
||||
// ProseMirror builders. Blocks carry a stable id in attrs.id.
|
||||
const textNode = (text) => ({ type: "text", text });
|
||||
const para = (id, ...children) => ({
|
||||
type: "paragraph",
|
||||
attrs: { id },
|
||||
content: children,
|
||||
});
|
||||
const doc = (...children) => ({ type: "doc", content: children });
|
||||
const snapshot = (v) => JSON.parse(JSON.stringify(v));
|
||||
|
||||
// A table cell holding a single paragraph.
|
||||
const cell = (id, innerPara) => ({
|
||||
type: "tableCell",
|
||||
attrs: { id },
|
||||
content: [innerPara],
|
||||
});
|
||||
const row = (id, ...cells) => ({
|
||||
type: "tableRow",
|
||||
attrs: { id },
|
||||
content: cells,
|
||||
});
|
||||
const table = (id, ...rows) => ({
|
||||
type: "table",
|
||||
attrs: { id },
|
||||
content: rows,
|
||||
});
|
||||
|
||||
// A 2x2 table: rows r1/r2, cells c1..c4, each cell holds a paragraph p1..p4.
|
||||
const make2x2Table = () =>
|
||||
doc(
|
||||
table(
|
||||
"t1",
|
||||
row("r1", cell("c1", para("p1", textNode("A1"))), cell("c2", para("p2", textNode("A2")))),
|
||||
row("r2", cell("c3", para("p3", textNode("B1"))), cell("c4", para("p4", textNode("B2")))),
|
||||
),
|
||||
);
|
||||
|
||||
const freshRow = () => row("rNEW", cell("cNEW", para("pNEW", textNode("NEW"))));
|
||||
const freshCell = () => cell("cNEW", para("pNEW", textNode("NEW")));
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// sanitizeForYjs
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
test("sanitizeForYjs strips undefined node-attr keys, preserves null/false/0/''", () => {
|
||||
const input = doc({
|
||||
type: "paragraph",
|
||||
attrs: {
|
||||
id: "p-1",
|
||||
gone: undefined,
|
||||
keptNull: null,
|
||||
keptFalse: false,
|
||||
keptZero: 0,
|
||||
keptEmpty: "",
|
||||
},
|
||||
content: [textNode("x")],
|
||||
});
|
||||
const out = sanitizeForYjs(input);
|
||||
const attrs = out.content[0].attrs;
|
||||
assert.equal("gone" in attrs, false);
|
||||
assert.equal("keptNull" in attrs, true);
|
||||
assert.equal(attrs.keptNull, null);
|
||||
assert.equal(attrs.keptFalse, false);
|
||||
assert.equal(attrs.keptZero, 0);
|
||||
assert.equal(attrs.keptEmpty, "");
|
||||
// Input must not be mutated.
|
||||
assert.equal("gone" in input.content[0].attrs, true);
|
||||
});
|
||||
|
||||
test("sanitizeForYjs strips undefined mark-attr keys, preserves falsy values", () => {
|
||||
const input = doc({
|
||||
type: "paragraph",
|
||||
attrs: { id: "p-1" },
|
||||
content: [
|
||||
{
|
||||
type: "text",
|
||||
text: "x",
|
||||
marks: [
|
||||
{
|
||||
type: "link",
|
||||
attrs: { href: "", target: undefined, rel: null },
|
||||
},
|
||||
],
|
||||
},
|
||||
],
|
||||
});
|
||||
const out = sanitizeForYjs(input);
|
||||
const markAttrs = out.content[0].content[0].marks[0].attrs;
|
||||
assert.equal("target" in markAttrs, false);
|
||||
assert.equal(markAttrs.href, "");
|
||||
assert.equal(markAttrs.rel, null);
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// findUnstorableAttr
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
test("findUnstorableAttr returns a path for an undefined node attr", () => {
|
||||
const input = doc(
|
||||
para("p-0", textNode("ok")),
|
||||
{
|
||||
type: "paragraph",
|
||||
attrs: { id: "p-1", indent: undefined },
|
||||
content: [textNode("y")],
|
||||
},
|
||||
);
|
||||
const hit = findUnstorableAttr(input);
|
||||
assert.equal(hit, "content[1].attrs.indent (undefined)");
|
||||
});
|
||||
|
||||
test("findUnstorableAttr finds an unstorable mark attr", () => {
|
||||
const input = doc({
|
||||
type: "paragraph",
|
||||
attrs: { id: "p-1" },
|
||||
content: [
|
||||
{
|
||||
type: "text",
|
||||
text: "x",
|
||||
marks: [{ type: "link", attrs: { href: () => {} } }],
|
||||
},
|
||||
],
|
||||
});
|
||||
const hit = findUnstorableAttr(input);
|
||||
assert.equal(hit, "content[0].content[0].marks[0].attrs.href (function)");
|
||||
});
|
||||
|
||||
test("findUnstorableAttr returns null for a clean doc", () => {
|
||||
const input = doc(para("p-1", textNode("clean")));
|
||||
assert.equal(findUnstorableAttr(input), null);
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// insertNodeRelative — table-structure-aware
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
test("insertNodeRelative inserting a tableRow anchored on a paragraph INSIDE a cell appends a sibling row to the table", () => {
|
||||
const input = make2x2Table();
|
||||
const { doc: out, inserted } = insertNodeRelative(input, freshRow(), {
|
||||
position: "after",
|
||||
anchorNodeId: "p4", // paragraph inside last cell of the last row
|
||||
});
|
||||
assert.equal(inserted, true);
|
||||
const tbl = out.content[0];
|
||||
// table.content length +1 (the row is a direct child of the table).
|
||||
assert.equal(tbl.content.length, 3);
|
||||
// The new row is a direct child of the table, NOT nested inside a cell.
|
||||
const newRow = tbl.content[2];
|
||||
assert.equal(newRow.type, "tableRow");
|
||||
assert.equal(newRow.attrs.id, "rNEW");
|
||||
// Existing rows' cells are intact.
|
||||
assert.deepEqual(
|
||||
tbl.content[0].content.map((c) => c.attrs.id),
|
||||
["c1", "c2"],
|
||||
);
|
||||
assert.deepEqual(
|
||||
tbl.content[1].content.map((c) => c.attrs.id),
|
||||
["c3", "c4"],
|
||||
);
|
||||
// Assert the new row is NOT nested inside any existing cell.
|
||||
for (const r of [tbl.content[0], tbl.content[1]]) {
|
||||
for (const c of r.content) {
|
||||
const ids = (c.content || []).map((n) => n.attrs?.id);
|
||||
assert.equal(ids.includes("rNEW"), false);
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
test("insertNodeRelative before/after place the new row at the correct index relative to the enclosing row", () => {
|
||||
// "before" the first row.
|
||||
{
|
||||
const input = make2x2Table();
|
||||
const { doc: out } = insertNodeRelative(input, freshRow(), {
|
||||
position: "before",
|
||||
anchorNodeId: "p1", // paragraph in first row
|
||||
});
|
||||
assert.deepEqual(
|
||||
out.content[0].content.map((r) => r.attrs.id),
|
||||
["rNEW", "r1", "r2"],
|
||||
);
|
||||
}
|
||||
// "after" the first row.
|
||||
{
|
||||
const input = make2x2Table();
|
||||
const { doc: out } = insertNodeRelative(input, freshRow(), {
|
||||
position: "after",
|
||||
anchorNodeId: "p1", // paragraph in first row
|
||||
});
|
||||
assert.deepEqual(
|
||||
out.content[0].content.map((r) => r.attrs.id),
|
||||
["r1", "rNEW", "r2"],
|
||||
);
|
||||
}
|
||||
});
|
||||
|
||||
test("insertNodeRelative inserting a tableCell anchored inside a cell adds it to the enclosing row", () => {
|
||||
const input = make2x2Table();
|
||||
const { doc: out, inserted } = insertNodeRelative(input, freshCell(), {
|
||||
position: "after",
|
||||
anchorNodeId: "p1", // paragraph inside first cell of first row
|
||||
});
|
||||
assert.equal(inserted, true);
|
||||
// The cell is spliced into the enclosing row (r1) after c1.
|
||||
assert.deepEqual(
|
||||
out.content[0].content[0].content.map((c) => c.attrs.id),
|
||||
["c1", "cNEW", "c2"],
|
||||
);
|
||||
// The other row is untouched.
|
||||
assert.deepEqual(
|
||||
out.content[0].content[1].content.map((c) => c.attrs.id),
|
||||
["c3", "c4"],
|
||||
);
|
||||
});
|
||||
|
||||
test("insertNodeRelative inserting a tableRow with an anchor NOT inside a table throws", () => {
|
||||
const input = doc(para("p-1", textNode("plain")));
|
||||
assert.throws(
|
||||
() =>
|
||||
insertNodeRelative(input, freshRow(), {
|
||||
position: "after",
|
||||
anchorNodeId: "p-1",
|
||||
}),
|
||||
/not inside a table/,
|
||||
);
|
||||
});
|
||||
|
||||
test("insertNodeRelative append + tableRow throws", () => {
|
||||
const input = make2x2Table();
|
||||
assert.throws(
|
||||
() => insertNodeRelative(input, freshRow(), { position: "append" }),
|
||||
/cannot append a tableRow at the top level/,
|
||||
);
|
||||
});
|
||||
|
||||
test("insertNodeRelative structural insert with unresolved anchor returns inserted:false (no throw)", () => {
|
||||
const input = make2x2Table();
|
||||
const { doc: out, inserted } = insertNodeRelative(input, freshRow(), {
|
||||
position: "after",
|
||||
anchorNodeId: "does-not-exist",
|
||||
});
|
||||
assert.equal(inserted, false);
|
||||
assert.deepEqual(out, input);
|
||||
});
|
||||
|
||||
test("insertNodeRelative tableRow by anchorText resolving to the table block appends within the table", () => {
|
||||
const input = make2x2Table();
|
||||
// anchorText "A1" lives in the first cell; the matched top-level block is the
|
||||
// table itself, so the row appends at the end of the table.
|
||||
const { doc: out, inserted } = insertNodeRelative(input, freshRow(), {
|
||||
position: "after",
|
||||
anchorText: "A1",
|
||||
});
|
||||
assert.equal(inserted, true);
|
||||
assert.deepEqual(
|
||||
out.content[0].content.map((r) => r.attrs.id),
|
||||
["r1", "r2", "rNEW"],
|
||||
);
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Regression: a normal (non-structural) paragraph insert is unchanged.
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
test("insertNodeRelative regression: normal paragraph before/after a top-level block behaves as before", () => {
|
||||
const before = doc(para("p-1", textNode("one")), para("p-2", textNode("two")));
|
||||
{
|
||||
const { doc: out, inserted } = insertNodeRelative(
|
||||
before,
|
||||
para("new", textNode("NEW")),
|
||||
{ position: "before", anchorNodeId: "p-2" },
|
||||
);
|
||||
assert.equal(inserted, true);
|
||||
assert.deepEqual(
|
||||
out.content.map((n) => n.attrs.id),
|
||||
["p-1", "new", "p-2"],
|
||||
);
|
||||
}
|
||||
{
|
||||
const snap = snapshot(before);
|
||||
const { doc: out, inserted } = insertNodeRelative(
|
||||
before,
|
||||
para("new", textNode("NEW")),
|
||||
{ position: "after", anchorNodeId: "p-1" },
|
||||
);
|
||||
assert.equal(inserted, true);
|
||||
assert.deepEqual(
|
||||
out.content.map((n) => n.attrs.id),
|
||||
["p-1", "new", "p-2"],
|
||||
);
|
||||
// Input not mutated.
|
||||
assert.deepEqual(before, snap);
|
||||
}
|
||||
});
|
||||
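The `sanitizeForYjs` tests above describe a narrow contract: drop attribute keys whose value is `undefined`, keep `null`, `false`, `0`, and `""`, and leave the input untouched. The sketch below is a hypothetical stand-in written from those assertions only; the real function in `build/lib/node-ops.js` may be implemented differently.

```javascript
// Hypothetical stand-in for the behaviour the tests describe; not the
// package's sanitizeForYjs.
function stripUndefinedAttrs(node) {
  if (node === null || typeof node !== "object") return node;
  const out = { ...node };
  if (node.attrs && typeof node.attrs === "object") {
    // Keep null, false, 0 and ""; only `undefined` values are dropped.
    out.attrs = Object.fromEntries(
      Object.entries(node.attrs).filter(([, value]) => value !== undefined),
    );
  }
  // Marks carry their own attrs; child nodes are handled recursively.
  if (Array.isArray(node.marks)) out.marks = node.marks.map(stripUndefinedAttrs);
  if (Array.isArray(node.content)) out.content = node.content.map(stripUndefinedAttrs);
  return out;
}

const source = {
  type: "paragraph",
  attrs: { id: "p-1", gone: undefined, keptNull: null, keptZero: 0 },
  content: [{ type: "text", text: "x" }],
};
const cleaned = stripUndefinedAttrs(source);
console.log(Object.keys(cleaned.attrs)); // [ 'id', 'keptNull', 'keptZero' ]
```

Building new objects at every level, rather than deleting keys in place, is what keeps the original document unchanged, which the tests check explicitly.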
402
packages/mcp/test/unit/node-ops.test.mjs
Normal file
@@ -0,0 +1,402 @@
|
||||
import { test } from "node:test";
|
||||
import assert from "node:assert/strict";
|
||||
|
||||
import {
|
||||
blockPlainText,
|
||||
replaceNodeById,
|
||||
deleteNodeById,
|
||||
insertNodeRelative,
|
||||
} from "../../build/lib/node-ops.js";
|
||||
|
||||
// ProseMirror builders. Blocks carry a stable id in attrs.id.
|
||||
const textNode = (text) => ({ type: "text", text });
|
||||
const para = (id, ...children) => ({
|
||||
type: "paragraph",
|
||||
attrs: { id },
|
||||
content: children,
|
||||
});
|
||||
const doc = (...children) => ({ type: "doc", content: children });
|
||||
const snapshot = (v) => JSON.parse(JSON.stringify(v));
|
||||
|
||||
// A callout / table-cell wraps its children in `content`, just like any other
|
||||
// block, so recursion reaches a paragraph nested inside it.
|
||||
const callout = (id, ...children) => ({
|
||||
type: "callout",
|
||||
attrs: { id, type: "info" },
|
||||
content: children,
|
||||
});
|
||||
const tableDoc = (innerPara) =>
|
||||
doc({
|
||||
type: "table",
|
||||
attrs: { id: "table-1" },
|
||||
content: [
|
||||
{
|
||||
type: "tableRow",
|
||||
attrs: { id: "row-1" },
|
||||
content: [
|
||||
{
|
||||
type: "tableCell",
|
||||
attrs: { id: "cell-1" },
|
||||
content: [innerPara],
|
||||
},
|
||||
],
|
||||
},
|
||||
],
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// blockPlainText
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
test("blockPlainText concatenates nested text", () => {
|
||||
const node = {
|
||||
type: "callout",
|
||||
content: [
|
||||
para("p-1", textNode("Hello "), textNode("world")),
|
||||
para("p-2", textNode("!")),
|
||||
],
|
||||
};
|
||||
assert.equal(blockPlainText(node), "Hello world!");
|
||||
});
|
||||
|
||||
test("blockPlainText returns '' for nullish / non-object", () => {
|
||||
assert.equal(blockPlainText(null), "");
|
||||
assert.equal(blockPlainText(undefined), "");
|
||||
assert.equal(blockPlainText("just a string"), "");
|
||||
});
|
||||
|
||||
test("blockPlainText reads a bare text node", () => {
|
||||
assert.equal(blockPlainText(textNode("solo")), "solo");
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// replaceNodeById
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
test("replaceNodeById replaces the matching block and leaves others, count===1", () => {
|
||||
const input = doc(
|
||||
para("p-1", textNode("one")),
|
||||
para("p-2", textNode("two")),
|
||||
para("p-3", textNode("three")),
|
||||
);
|
||||
const newNode = para("p-2", textNode("REPLACED"));
|
||||
|
||||
const { doc: out, replaced } = replaceNodeById(input, "p-2", newNode);
|
||||
|
||||
assert.equal(replaced, 1);
|
||||
// Target replaced.
|
||||
assert.equal(out.content[1].content[0].text, "REPLACED");
|
||||
// Siblings untouched (text and ids).
|
||||
assert.equal(out.content[0].content[0].text, "one");
|
||||
assert.equal(out.content[2].content[0].text, "three");
|
||||
assert.deepEqual(
|
||||
out.content.map((n) => n.attrs.id),
|
||||
["p-1", "p-2", "p-3"],
|
||||
);
|
||||
});
|
||||
|
||||
test("replaceNodeById on no-match returns replaced===0 and does not throw", () => {
|
||||
const input = doc(para("p-1", textNode("one")));
|
||||
const { doc: out, replaced } = replaceNodeById(
|
||||
input,
|
||||
"missing",
|
||||
para("x", textNode("x")),
|
||||
);
|
||||
assert.equal(replaced, 0);
|
||||
// Document content is preserved.
|
||||
assert.equal(out.content[0].content[0].text, "one");
|
||||
});
|
||||
|
||||
test("replaceNodeById replaces EVERY node sharing the id (count reflects all)", () => {
|
||||
const input = doc(
|
||||
para("dup", textNode("a")),
|
||||
para("dup", textNode("b")),
|
||||
para("keep", textNode("c")),
|
||||
);
|
||||
const { doc: out, replaced } = replaceNodeById(
|
||||
input,
|
||||
"dup",
|
||||
para("dup", textNode("NEW")),
|
||||
);
|
||||
assert.equal(replaced, 2);
|
||||
assert.equal(out.content[0].content[0].text, "NEW");
|
||||
assert.equal(out.content[1].content[0].text, "NEW");
|
||||
assert.equal(out.content[2].content[0].text, "c");
|
||||
// The two replacements must not share a reference (deep clone per match).
|
||||
assert.notEqual(out.content[0], out.content[1]);
|
||||
});
|
||||
|
||||
test("replaceNodeById reaches a node nested inside a callout", () => {
|
||||
const input = doc(callout("c-1", para("inner", textNode("old"))));
|
||||
const { doc: out, replaced } = replaceNodeById(
|
||||
input,
|
||||
"inner",
|
||||
para("inner", textNode("new")),
|
||||
);
|
||||
assert.equal(replaced, 1);
|
||||
assert.equal(out.content[0].content[0].content[0].text, "new");
|
||||
});
|
||||
|
||||
test("replaceNodeById reaches a node nested inside a table cell", () => {
|
||||
const input = tableDoc(para("deep", textNode("before")));
|
||||
const { doc: out, replaced } = replaceNodeById(
|
||||
input,
|
||||
"deep",
|
||||
para("deep", textNode("after")),
|
||||
);
|
||||
assert.equal(replaced, 1);
|
||||
const cellPara = out.content[0].content[0].content[0].content[0];
|
||||
assert.equal(cellPara.content[0].text, "after");
|
||||
});
|
||||
|
||||
test("replaceNodeById does NOT mutate input (deep-equal snapshot)", () => {
|
||||
const input = doc(
|
||||
para("p-1", textNode("one")),
|
||||
callout("c-1", para("inner", textNode("old"))),
|
||||
);
|
||||
const snap = snapshot(input);
|
||||
const { doc: out } = replaceNodeById(
|
||||
input,
|
||||
"inner",
|
||||
para("inner", textNode("changed")),
|
||||
);
|
||||
assert.deepEqual(input, snap);
|
||||
assert.notEqual(out, input);
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// deleteNodeById
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
test("deleteNodeById removes the block and reports deleted===1", () => {
|
||||
const input = doc(
|
||||
para("p-1", textNode("one")),
|
||||
para("p-2", textNode("two")),
|
||||
para("p-3", textNode("three")),
|
||||
);
|
||||
const { doc: out, deleted } = deleteNodeById(input, "p-2");
|
||||
assert.equal(deleted, 1);
|
||||
assert.deepEqual(
|
||||
out.content.map((n) => n.attrs.id),
|
||||
["p-1", "p-3"],
|
||||
);
|
||||
});
|
||||
|
||||
test("deleteNodeById on no-match returns deleted===0 and leaves content", () => {
|
||||
const input = doc(para("p-1", textNode("one")));
|
||||
const { doc: out, deleted } = deleteNodeById(input, "missing");
|
||||
assert.equal(deleted, 0);
|
||||
assert.equal(out.content.length, 1);
|
||||
});
|
||||
|
||||
test("deleteNodeById removes a node nested inside a callout", () => {
|
||||
const input = doc(
|
||||
callout("c-1", para("inner", textNode("x")), para("keep", textNode("y"))),
|
||||
);
|
||||
const { doc: out, deleted } = deleteNodeById(input, "inner");
|
||||
assert.equal(deleted, 1);
|
||||
assert.deepEqual(
|
||||
out.content[0].content.map((n) => n.attrs.id),
|
||||
["keep"],
|
||||
);
|
||||
});
|
||||
|
||||
test("deleteNodeById removes EVERY node sharing the id", () => {
|
||||
const input = doc(
|
||||
para("dup", textNode("a")),
|
||||
para("keep", textNode("b")),
|
||||
para("dup", textNode("c")),
|
||||
);
|
||||
const { doc: out, deleted } = deleteNodeById(input, "dup");
|
||||
assert.equal(deleted, 2);
|
||||
assert.deepEqual(
|
||||
out.content.map((n) => n.attrs.id),
|
||||
["keep"],
|
||||
);
|
||||
});
|
||||
|
||||
test("deleteNodeById does NOT mutate input (deep-equal snapshot)", () => {
|
||||
const input = doc(
|
||||
para("p-1", textNode("one")),
|
||||
para("p-2", textNode("two")),
|
||||
);
|
||||
const snap = snapshot(input);
|
||||
const { doc: out } = deleteNodeById(input, "p-2");
|
||||
assert.deepEqual(input, snap);
|
||||
assert.notEqual(out, input);
|
||||
});
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// insertNodeRelative
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
test("insertNodeRelative before by anchorNodeId", () => {
|
||||
const input = doc(para("p-1", textNode("one")), para("p-2", textNode("two")));
|
||||
const node = para("new", textNode("NEW"));
|
||||
const { doc: out, inserted } = insertNodeRelative(input, node, {
|
||||
position: "before",
|
||||
anchorNodeId: "p-2",
|
||||
});
|
||||
assert.equal(inserted, true);
|
||||
assert.deepEqual(
|
||||
out.content.map((n) => n.attrs.id),
|
||||
["p-1", "new", "p-2"],
|
||||
);
|
||||
});
|
||||
|
||||
test("insertNodeRelative after by anchorNodeId", () => {
|
||||
const input = doc(para("p-1", textNode("one")), para("p-2", textNode("two")));
|
||||
const node = para("new", textNode("NEW"));
|
||||
const { doc: out, inserted } = insertNodeRelative(input, node, {
|
||||
position: "after",
|
||||
anchorNodeId: "p-1",
|
||||
});
|
||||
assert.equal(inserted, true);
|
||||
assert.deepEqual(
|
||||
out.content.map((n) => n.attrs.id),
|
||||
["p-1", "new", "p-2"],
|
||||
);
|
||||
});
|
||||
|
||||
test("insertNodeRelative before/after by anchorNodeId reaches a nested sibling", () => {
|
||||
const input = doc(
|
||||
callout("c-1", para("a", textNode("a")), para("b", textNode("b"))),
|
||||
);
|
||||
const node = para("new", textNode("NEW"));
|
||||
const { doc: out, inserted } = insertNodeRelative(input, node, {
|
||||
position: "after",
|
||||
anchorNodeId: "a",
|
||||
});
|
||||
assert.equal(inserted, true);
|
||||
// Inserted as a sibling inside the callout's content array.
|
||||
assert.deepEqual(
|
||||
out.content[0].content.map((n) => n.attrs.id),
|
||||
["a", "new", "b"],
|
||||
);
|
||||
});
|
||||
|
||||
test("insertNodeRelative before by anchorText (top-level)", () => {
|
||||
const input = doc(
|
||||
para("p-1", textNode("alpha")),
|
||||
para("p-2", textNode("beta")),
|
||||
);
|
||||
const node = para("new", textNode("NEW"));
|
||||
const { doc: out, inserted } = insertNodeRelative(input, node, {
|
||||
position: "before",
|
||||
anchorText: "beta",
|
||||
});
|
||||
assert.equal(inserted, true);
|
||||
assert.deepEqual(
|
||||
out.content.map((n) => n.attrs.id),
|
||||
["p-1", "new", "p-2"],
|
||||
);
|
||||
});
|
||||
|
||||
test("insertNodeRelative after by anchorText (top-level)", () => {
|
||||
const input = doc(
|
||||
para("p-1", textNode("alpha")),
|
||||
para("p-2", textNode("beta")),
|
||||
);
|
||||
const node = para("new", textNode("NEW"));
|
||||
const { doc: out, inserted } = insertNodeRelative(input, node, {
|
||||
position: "after",
|
||||
anchorText: "alpha",
|
||||
});
|
||||
assert.equal(inserted, true);
|
||||
assert.deepEqual(
|
||||
out.content.map((n) => n.attrs.id),
|
||||
["p-1", "new", "p-2"],
|
||||
);
|
||||
});
|
||||
|
||||
test("insertNodeRelative anchorText scans TOP-LEVEL blocks via recursive plain text", () => {
  // anchorText matches the FIRST top-level block whose (recursive) blockPlainText
  // includes the string. "deeptext" lives nested in a top-level callout, so the
  // callout itself is the matched top-level block and the node lands as its
  // sibling at the top level (not inside the callout).
  const input = doc(
    callout("c-1", para("inner", textNode("deeptext"))),
    para("p-2", textNode("tail")),
  );
  const node = para("new", textNode("NEW"));
  const { doc: out, inserted } = insertNodeRelative(input, node, {
    position: "after",
    anchorText: "deeptext",
  });
  assert.equal(inserted, true);
  assert.deepEqual(
    out.content.map((n) => n.attrs.id),
    ["c-1", "new", "p-2"],
  );
});

test("insertNodeRelative anchorText matched only via nested text still inserts at the top level, never inside the container", () => {
  // "lonely" appears only in a paragraph nested inside the callout, yet the
  // top-level scan still finds it through the callout's recursive plain text.
  // To prove the scan walks TOP-LEVEL blocks (the doc's content array) only,
  // assert the insertion happens at the top level beside the callout, never
  // inside it.
  const input = doc(callout("c-1", para("inner", textNode("lonely word"))));
  const node = para("new", textNode("NEW"));
  const { doc: out, inserted } = insertNodeRelative(input, node, {
    position: "before",
    anchorText: "lonely",
  });
  assert.equal(inserted, true);
  // Inserted at the top level (siblings of the callout), not into the callout.
  assert.deepEqual(
    out.content.map((n) => n.attrs.id),
    ["new", "c-1"],
  );
  // The callout's own children are untouched.
  assert.deepEqual(
    out.content[1].content.map((n) => n.attrs.id),
    ["inner"],
  );
});

test("insertNodeRelative append pushes the node at the end of top-level content", () => {
  const input = doc(para("p-1", textNode("one")), para("p-2", textNode("two")));
  const node = para("new", textNode("NEW"));
  const { doc: out, inserted } = insertNodeRelative(input, node, {
    position: "append",
  });
  assert.equal(inserted, true);
  assert.deepEqual(
    out.content.map((n) => n.attrs.id),
    ["p-1", "p-2", "new"],
  );
});

test("insertNodeRelative inserted===false when anchorNodeId missing", () => {
  const input = doc(para("p-1", textNode("one")));
  const node = para("new", textNode("NEW"));
  const { doc: out, inserted } = insertNodeRelative(input, node, {
    position: "after",
    anchorNodeId: "nope",
  });
  assert.equal(inserted, false);
  assert.deepEqual(out, input);
});

test("insertNodeRelative inserted===false when anchorText missing", () => {
  const input = doc(para("p-1", textNode("one")));
  const node = para("new", textNode("NEW"));
  const { inserted } = insertNodeRelative(input, node, {
    position: "before",
    anchorText: "nomatch",
  });
  assert.equal(inserted, false);
});

test("insertNodeRelative does NOT mutate input (deep-equal snapshot)", () => {
  const input = doc(para("p-1", textNode("one")), para("p-2", textNode("two")));
  const snap = snapshot(input);
  const node = para("new", textNode("NEW"));
  const { doc: out } = insertNodeRelative(input, node, {
    position: "after",
    anchorNodeId: "p-1",
  });
  assert.deepEqual(input, snap);
  assert.notEqual(out, input);
});
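The anchorText tests above pin down a two-part contract: the scan walks only the document's top-level blocks, but each block is matched against its recursive plain text. The following is a minimal sketch of that contract, not the actual `insertNodeRelative` from `node-ops.js`; `plainTextSketch` and `insertByAnchorTextSketch` are hypothetical names introduced only for illustration.

```javascript
// Hypothetical helper: concatenate all text found anywhere below a node.
const plainTextSketch = (node) =>
  node.type === "text"
    ? (node.text ?? "")
    : (node.content ?? []).map(plainTextSketch).join("");

// Hypothetical sketch of the anchorText contract exercised by the tests:
// scan TOP-LEVEL blocks only, match on recursive plain text, never mutate.
function insertByAnchorTextSketch(doc, node, { position, anchorText }) {
  const blocks = doc.content ?? [];
  const index = blocks.findIndex((b) => plainTextSketch(b).includes(anchorText));
  if (index === -1) return { doc, inserted: false };
  const at = position === "before" ? index : index + 1;
  // Build a new top-level array so the input document stays untouched.
  const content = [...blocks.slice(0, at), node, ...blocks.slice(at)];
  return { doc: { ...doc, content }, inserted: true };
}

const input = {
  type: "doc",
  content: [
    {
      type: "callout",
      attrs: { id: "c-1" },
      content: [
        { type: "paragraph", attrs: { id: "inner" }, content: [{ type: "text", text: "deeptext" }] },
      ],
    },
    { type: "paragraph", attrs: { id: "p-2" }, content: [{ type: "text", text: "tail" }] },
  ],
};
const added = { type: "paragraph", attrs: { id: "new" }, content: [] };
const result = insertByAnchorTextSketch(input, added, {
  position: "after",
  anchorText: "deeptext",
});
const ids = result.doc.content.map((n) => n.attrs.id);
```

Because the match is made on the callout as a whole, the new node becomes the callout's sibling; reaching a nested sibling is what the `anchorNodeId` form is for.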
109
packages/mcp/test/unit/outline.test.mjs
Normal file
@@ -0,0 +1,109 @@
import { test } from "node:test";
import assert from "node:assert/strict";

import { buildOutline, getNodeByRef } from "../../build/lib/node-ops.js";

// Helpers to build the small fixture doc.
const textNode = (text) => ({ type: "text", text });
const paragraph = (id, text) => ({
  type: "paragraph",
  attrs: { id },
  content: [textNode(text)],
});
// A table cell holds a paragraph; cells/rows/table carry NO attrs.id.
const cell = (text) => ({
  type: "tableCell",
  content: [{ type: "paragraph", content: [textNode(text)] }],
});
const row = (...texts) => ({
  type: "tableRow",
  content: texts.map(cell),
});
const listItem = (text) => ({
  type: "listItem",
  content: [{ type: "paragraph", content: [textNode(text)] }],
});

// A long paragraph to exercise truncation (>100 chars).
const longText = "x".repeat(150);

const buildDoc = () => ({
  type: "doc",
  content: [
    { type: "heading", attrs: { id: "h1", level: 2 }, content: [textNode("Title")] },
    paragraph("p1", longText),
    {
      type: "table",
      content: [row("A", "B", "C"), row("1", "2", "3")],
    },
    {
      type: "bulletList",
      attrs: { id: "list1" },
      content: [listItem("one"), listItem("two")],
    },
  ],
});

test("buildOutline returns one compact entry per top-level block", () => {
  const outline = buildOutline(buildDoc());
  assert.equal(outline.length, 4);

  // Heading: level + id + firstText.
  assert.equal(outline[0].type, "heading");
  assert.equal(outline[0].level, 2);
  assert.equal(outline[0].id, "h1");
  assert.equal(outline[0].firstText, "Title");

  // Long paragraph text is truncated to 100 chars + ellipsis.
  assert.equal(outline[1].id, "p1");
  assert.equal(outline[1].firstText, "x".repeat(100) + "…");
  assert.equal(outline[1].firstText.length, 101);

  // Table: rows/cols/header from the first row; no id on the table itself.
  assert.equal(outline[2].type, "table");
  assert.equal(outline[2].rows, 2);
  assert.equal(outline[2].cols, 3);
  assert.deepEqual(outline[2].header, ["A", "B", "C"]);
  assert.equal(outline[2].id, null);

  // List: item count.
  assert.equal(outline[3].type, "bulletList");
  assert.equal(outline[3].items, 2);
});

test("buildOutline is null-safe", () => {
  assert.deepEqual(buildOutline(undefined), []);
  assert.deepEqual(buildOutline({ type: "doc" }), []);
  assert.deepEqual(buildOutline(42), []);
});

test("getNodeByRef resolves a block id to its node and path", () => {
  const doc = buildDoc();
  const hit = getNodeByRef(doc, "h1");
  assert.ok(hit);
  assert.equal(hit.type, "heading");
  assert.deepEqual(hit.path, [0]);
  assert.equal(hit.node.attrs.id, "h1");
});

test("getNodeByRef resolves #<index> to a top-level block (table)", () => {
  const doc = buildDoc();
  const hit = getNodeByRef(doc, "#2");
  assert.ok(hit);
  assert.equal(hit.type, "table");
  assert.deepEqual(hit.path, [2]);
});

test("getNodeByRef returns null for an unknown ref", () => {
  assert.equal(getNodeByRef(buildDoc(), "nope"), null);
});

test("getNodeByRef returns a clone (mutating it does not change the input)", () => {
  const doc = buildDoc();
  const hit = getNodeByRef(doc, "h1");
  hit.node.attrs.id = "MUTATED";
  hit.node.content[0].text = "changed";
  // Original doc is untouched.
  assert.equal(doc.content[0].attrs.id, "h1");
  assert.equal(doc.content[0].content[0].text, "Title");
});
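The outline tests rely on two conventions: `firstText` is cut to 100 characters plus a single ellipsis, and a ref is either `#<index>` (a top-level block) or a block id. A rough sketch of both follows, under the assumption that ids are searched depth-first; `truncateSketch` and `resolveRefSketch` are hypothetical names, and unlike the real `getNodeByRef` this sketch returns the original node rather than a clone.

```javascript
// Hypothetical helper mirroring the truncation the test expects:
// at most `limit` characters, followed by a single "…" when text was cut.
const truncateSketch = (text, limit = 100) =>
  text.length > limit ? text.slice(0, limit) + "…" : text;

// Hypothetical resolver for the two ref forms used in the tests.
// "#<index>" addresses a top-level block; anything else is treated as a
// block id and searched depth-first. Returns { node, path } or null.
function resolveRefSketch(doc, ref) {
  const blocks = Array.isArray(doc?.content) ? doc.content : [];
  const indexMatch = /^#(\d+)$/.exec(ref);
  if (indexMatch) {
    const i = Number(indexMatch[1]);
    return i < blocks.length ? { node: blocks[i], path: [i] } : null;
  }
  const walk = (nodes, path) => {
    for (let i = 0; i < nodes.length; i++) {
      const here = [...path, i];
      if (nodes[i].attrs?.id === ref) return { node: nodes[i], path: here };
      const hit = walk(nodes[i].content ?? [], here);
      if (hit) return hit;
    }
    return null;
  };
  return walk(blocks, []);
}

const sample = {
  type: "doc",
  content: [
    { type: "heading", attrs: { id: "h1" }, content: [{ type: "text", text: "Title" }] },
    { type: "table", content: [] },
  ],
};
const byId = resolveRefSketch(sample, "h1");
const byIndex = resolveRefSketch(sample, "#1");
const cut = truncateSketch("x".repeat(150));
```

The `#<index>` form matters for tables, which carry no `attrs.id` of their own in these fixtures.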
153
packages/mcp/test/unit/page-lock.test.mjs
Normal file
@@ -0,0 +1,153 @@
import { test } from "node:test";
import assert from "node:assert/strict";

import { withPageLock } from "../../build/lib/page-lock.js";

const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

test("two ops on the same pageId run strictly sequentially (no overlap)", async () => {
  const events = [];
  const pageId = "same-page";

  const p1 = withPageLock(pageId, async () => {
    events.push("start-1");
    await delay(40);
    events.push("end-1");
    return "r1";
  });

  // Queue the second op while the first is still running.
  const p2 = withPageLock(pageId, async () => {
    events.push("start-2");
    await delay(10);
    events.push("end-2");
    return "r2";
  });

  const [r1, r2] = await Promise.all([p1, p2]);

  assert.equal(r1, "r1");
  assert.equal(r2, "r2");
  // First op must fully finish before the second one begins.
  assert.deepEqual(events, ["start-1", "end-1", "start-2", "end-2"]);
});

test("same pageId ordering holds for many queued ops", async () => {
  const pageId = "ordered-page";
  const order = [];
  const active = { count: 0, maxConcurrent: 0 };

  const ops = [];
  for (let i = 0; i < 6; i++) {
    ops.push(
      withPageLock(pageId, async () => {
        active.count += 1;
        active.maxConcurrent = Math.max(active.maxConcurrent, active.count);
        order.push(i);
        await delay(5);
        active.count -= 1;
        return i;
      }),
    );
  }

  const results = await Promise.all(ops);

  assert.deepEqual(results, [0, 1, 2, 3, 4, 5]);
  assert.deepEqual(order, [0, 1, 2, 3, 4, 5]);
  // Strictly sequential: never more than one op running at a time.
  assert.equal(active.maxConcurrent, 1);
});

test("a rejecting op does not poison the chain for the same page", async () => {
  const pageId = "poison-page";
  const events = [];

  const failing = withPageLock(pageId, async () => {
    events.push("fail-start");
    await delay(20);
    events.push("fail-throw");
    throw new Error("boom");
  });

  // The caller of the failing op must still see the rejection.
  await assert.rejects(failing, /boom/);

  const following = withPageLock(pageId, async () => {
    events.push("next-run");
    await delay(5);
    return "ok";
  });

  const result = await following;

  assert.equal(result, "ok");
  // The next op ran after the failing one settled and was not blocked by it.
  assert.deepEqual(events, ["fail-start", "fail-throw", "next-run"]);
});

test("failing op queued before a success both resolve/reject correctly", async () => {
  const pageId = "poison-page-2";
  const order = [];

  const failing = withPageLock(pageId, async () => {
    order.push("fail");
    await delay(20);
    throw new Error("nope");
  });

  const ok = withPageLock(pageId, async () => {
    order.push("ok");
    await delay(5);
    return 123;
  });

  await assert.rejects(failing, /nope/);
  assert.equal(await ok, 123);
  // The failing op still ran first (it was queued first), then the success.
  assert.deepEqual(order, ["fail", "ok"]);
});

test("ops on different pageIds run concurrently (overlap)", async () => {
  const events = [];

  const pA = withPageLock("page-A", async () => {
    events.push("A-start");
    await delay(40);
    events.push("A-end");
    return "A";
  });

  const pB = withPageLock("page-B", async () => {
    events.push("B-start");
    await delay(10);
    events.push("B-end");
    return "B";
  });

  const [rA, rB] = await Promise.all([pA, pB]);

  assert.equal(rA, "A");
  assert.equal(rB, "B");
  // B starts before A finishes (concurrent), and B finishes before A.
  assert.deepEqual(events, ["A-start", "B-start", "B-end", "A-end"]);
});

test("no functional leak: many sequential ops on same page keep working", async () => {
  const pageId = "leak-page";

  // Run a long series of fully sequential ops (each awaited before the next is
  // queued) so the internal map entry is created and dropped repeatedly.
  for (let i = 0; i < 50; i++) {
    const value = await withPageLock(pageId, async () => {
      await delay(1);
      return i;
    });
    assert.equal(value, i);
  }

  // After the chain has drained, a brand new op on the same page still works,
  // confirming the entry was not left in a broken state.
  const final = await withPageLock(pageId, async () => "still-works");
  assert.equal(final, "still-works");
});
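The behaviour these tests describe (strict ordering per page, independence across pages, no poisoning after a rejection, map entry dropped once drained) is what a per-key promise chain typically provides. The sketch below is one way to get it, assuming a `Map` from key to the tail of the chain; it is not the code in `page-lock.js`, and `chainsSketch` and `withKeyLockSketch` are hypothetical names.

```javascript
// Hypothetical per-key lock: each key maps to the tail of a promise chain.
const chainsSketch = new Map();

function withKeyLockSketch(key, fn) {
  // Tails stored in the map never reject, so chaining with then() is enough
  // to run fn only after every earlier op on this key has settled.
  const previous = chainsSketch.get(key) ?? Promise.resolve();
  const run = previous.then(() => fn());
  // Swallow the rejection on the stored tail only; the caller still sees it
  // through `run`. Drop the entry once this op was the last one queued.
  const tail = run
    .catch(() => {})
    .finally(() => {
      if (chainsSketch.get(key) === tail) chainsSketch.delete(key);
    });
  chainsSketch.set(key, tail);
  return run;
}
```

Keeping the swallowed tail separate from the returned promise is the design choice that lets a failed op reject for its own caller without blocking whatever is queued behind it.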
149
packages/mcp/test/unit/roundtrip.test.mjs
Normal file
@@ -0,0 +1,149 @@
// Round-trip regression tests: PM -> markdown -> PM must preserve rich nodes.
// These lock in the converter/schema fixes (math, mention, attachment, columns,
// nested blocks, text color) and the attribute-escaping idempotency fix.
import { test } from "node:test";
import assert from "node:assert/strict";
import { convertProseMirrorToMarkdown } from "../../build/lib/markdown-converter.js";
import { markdownToProseMirror } from "../../build/lib/collaboration.js";

const doc = (...content) => ({ type: "doc", content });
const para = (...content) => ({ type: "paragraph", content });
const text = (t, marks) => (marks ? { type: "text", text: t, marks } : { type: "text", text: t });

// Recursively collect nodes of a given type.
const findNodes = (node, type, acc = []) => {
  if (!node) return acc;
  if (node.type === type) acc.push(node);
  for (const c of node.content || []) findNodes(c, type, acc);
  return acc;
};
// Recursively collect the set of mark types present.
const markTypes = (node, acc = new Set()) => {
  if (!node) return acc;
  for (const m of node.marks || []) acc.add(m.type);
  for (const c of node.content || []) markTypes(c, acc);
  return acc;
};
const roundtrip = async (pmDoc) => markdownToProseMirror(convertProseMirrorToMarkdown(pmDoc));

test("round-trip: text color (textStyle mark) survives", async () => {
  const input = doc(para(text("colored", [{ type: "textStyle", attrs: { color: "red" } }])));
  const out = await roundtrip(input);
  const ts = findNodes(out, "text").flatMap((n) => n.marks || []).filter((m) => m.type === "textStyle");
  assert.ok(ts.length >= 1, "textStyle mark should survive");
  assert.equal(ts[0].attrs?.color, "red");
});

test("round-trip: mathInline with '<' survives and is idempotent", async () => {
  const input = doc(para(text("x"), { type: "mathInline", attrs: { text: "a < b \\leq c" } }));
  const md1 = convertProseMirrorToMarkdown(input);
  const md2 = convertProseMirrorToMarkdown(await markdownToProseMirror(md1));
  assert.equal(md1, md2, "markdown must be idempotent across a round-trip (no escape accumulation)");
  const out = await markdownToProseMirror(md1);
  const math = findNodes(out, "mathInline");
  assert.equal(math.length, 1, "mathInline node should survive");
  assert.equal(math[0].attrs?.text, "a < b \\leq c", "LaTeX (incl. '<') preserved exactly");
});

test("round-trip: mathBlock survives", async () => {
  const input = doc({ type: "mathBlock", attrs: { text: "E = mc^2" } });
  const out = await roundtrip(input);
  const math = findNodes(out, "mathBlock");
  assert.equal(math.length, 1);
  assert.equal(math[0].attrs?.text, "E = mc^2");
});

test("round-trip: mention node survives (not flattened to @text)", async () => {
  const input = doc(para(text("hi "), { type: "mention", attrs: { id: "u1", label: "Alice", entityType: "user", entityId: "u1" } }));
  const out = await roundtrip(input);
  assert.equal(findNodes(out, "mention").length, 1, "mention node should survive");
});

test("round-trip: attachment node survives with url + name", async () => {
  const input = doc({ type: "attachment", attrs: { url: "/api/files/x/report.pdf", name: "report.pdf", mime: "application/pdf" } });
  const out = await roundtrip(input);
  const att = findNodes(out, "attachment");
  assert.equal(att.length, 1, "attachment node should survive");
  assert.equal(att[0].attrs?.url, "/api/files/x/report.pdf");
  assert.equal(att[0].attrs?.name, "report.pdf");
});

test("round-trip: image inside a column survives as an image node (not literal markdown)", async () => {
  const input = doc({
    type: "columns",
    content: [
      { type: "column", content: [para(text("left")), { type: "image", attrs: { src: "/api/files/a/p.png", alt: "pic" } }] },
      { type: "column", content: [para(text("right"))] },
    ],
  });
  const out = await roundtrip(input);
  assert.equal(findNodes(out, "image").length, 1, "image inside a column must survive");
  // and it must NOT leak as literal markdown text
  assert.ok(!JSON.stringify(out).includes("![pic]"), "image must not become literal markdown text");
});

test("round-trip: blockquote inside a column survives as a blockquote node", async () => {
  const input = doc({
    type: "columns",
    content: [
      { type: "column", content: [{ type: "blockquote", content: [para(text("quoted"))] }] },
      { type: "column", content: [para(text("r"))] },
    ],
  });
  const out = await roundtrip(input);
  assert.equal(findNodes(out, "blockquote").length, 1, "blockquote inside a column must survive");
});

test("round-trip: table cell with colspan>1 keeps the grid (HTML fallback)", async () => {
  const cell = (t, attrs = {}) => ({ type: "tableCell", attrs, content: [para(text(t))] });
  const header = (t) => ({ type: "tableHeader", attrs: {}, content: [para(text(t))] });
  const input = doc({
    type: "table",
    content: [
      { type: "tableRow", content: [header("A"), header("B")] },
      { type: "tableRow", content: [cell("wide", { colspan: 2 })] },
    ],
  });
  const out = await roundtrip(input);
  const tables = findNodes(out, "table");
  assert.equal(tables.length, 1, "table should survive");
  const spanned = findNodes(out, "tableCell").find((c) => (c.attrs?.colspan ?? 1) > 1);
  assert.ok(spanned, "colspan>1 cell should be preserved via the HTML fallback");
});

test("import: an unsafe highlight color (raw data-color) is sanitized to null (no style breakout)", async () => {
  // data-color is read verbatim (no CSSOM isolation), so it is the real
  // injection surface; a value with quotes/semicolons must be clamped to null.
  const out = await markdownToProseMirror('<mark data-color="red"; background:url(x)">hi</mark>');
  const hl = findNodes(out, "text").flatMap((n) => n.marks || []).filter((m) => m.type === "highlight");
  assert.ok(hl.length >= 1, "highlight mark present");
  assert.equal(hl[0].attrs?.color ?? null, null, "unsafe color must be clamped to null");
});

test("import: a safe highlight color is preserved", async () => {
  const out = await markdownToProseMirror('<mark style="background-color: #ff0000">hi</mark>');
  const hl = findNodes(out, "text").flatMap((n) => n.marks || []).filter((m) => m.type === "highlight");
  assert.ok(hl.length >= 1);
  assert.equal(hl[0].attrs?.color, "#ff0000");
});

test("round-trip: attribute value with an apostrophe is idempotent (no & accumulation)", async () => {
  const input = doc({ type: "attachment", attrs: { url: "/api/files/x/o'brien's file.pdf", name: "o'brien's file.pdf" } });
  const md1 = convertProseMirrorToMarkdown(input);
  const md2 = convertProseMirrorToMarkdown(await markdownToProseMirror(md1));
  assert.equal(md1, md2, "apostrophe in an attribute value must not accumulate escapes across round-trips");
  const att = findNodes(await markdownToProseMirror(md1), "attachment");
  assert.equal(att.length, 1);
  assert.equal(att[0].attrs?.name, "o'brien's file.pdf", "apostrophe preserved verbatim");
});

test("import: a colored span that is also a comment keeps the comment mark", async () => {
  const out = await markdownToProseMirror('<span data-comment-id="c1" style="color: red">x</span>');
  const marks = findNodes(out, "text").flatMap((n) => n.marks || []).map((m) => m.type);
  assert.ok(marks.includes("comment"), "comment mark must survive (textStyle must not steal the span)");
});

test("import: a colored mention span keeps the mention node", async () => {
  const out = await markdownToProseMirror('<span data-type="mention" data-id="u1" data-label="Alice" style="color: blue">@Alice</span>');
  assert.equal(findNodes(out, "mention").length, 1, "mention node must survive a colored span");
});
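The idempotency tests above (`md1 === md2`) guard against escapes piling up on each round-trip. One common way to get that property is an escape/unescape pair that are exact inverses, which depends on the order of the replacements. The sketch below illustrates the idea only; `escapeAttrSketch` and `unescapeAttrSketch` are hypothetical names, and the converter in `markdown-converter.js` may handle this differently.

```javascript
// Hypothetical escaping pair for HTML attribute values.
// Escape "&" FIRST so entities produced by the later replacements
// are not escaped a second time.
const escapeAttrSketch = (value) =>
  value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");

// Decode "&amp;" LAST so "&amp;lt;" decodes to the literal text "&lt;"
// instead of collapsing all the way down to "<".
const unescapeAttrSketch = (value) =>
  value
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&amp;/g, "&");

const fileName = "o'brien's file.pdf";
const escapedOnce = escapeAttrSketch(fileName);
const escapedAgain = escapeAttrSketch(unescapeAttrSketch(escapedOnce));
```

If the decode step is skipped or reordered, each export re-escapes the previous output and the value grows on every cycle, which is the failure mode the apostrophe test is named after.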
77
packages/mcp/test/unit/schema.test.mjs
Normal file
@@ -0,0 +1,77 @@
import { test } from "node:test";
import assert from "node:assert/strict";

import {
  docmostExtensions,
  clampCalloutType,
} from "../../build/lib/docmost-schema.js";
import { TiptapTransformer } from "@hocuspocus/transformer";

test("clampCalloutType: a known type passes through", () => {
  assert.equal(clampCalloutType("warning"), "warning");
});

test("clampCalloutType: an uppercase known type folds to lower case", () => {
  assert.equal(clampCalloutType("WARNING"), "warning");
  assert.equal(clampCalloutType("Info"), "info");
});

test("clampCalloutType: an unknown type falls back to info", () => {
  assert.equal(clampCalloutType("bogus"), "info");
});

test("clampCalloutType: null and undefined fall back to info", () => {
  assert.equal(clampCalloutType(null), "info");
  assert.equal(clampCalloutType(undefined), "info");
});

// Minimal-doc builders for the toYdoc acceptance loop.
const text = (t) => ({ type: "text", text: t });
const paragraph = (inline) => ({ type: "paragraph", content: inline });
const docOf = (...content) => ({ type: "doc", content });

// Each entry is a minimal valid doc for one Docmost node type. Inline atoms
// (mention, mathInline) and inline-capable nodes go inside a paragraph; block
// atoms and block containers go at the top level.
const cases = {
  mention: docOf(
    paragraph([{ type: "mention", attrs: { id: "u1", label: "Bob" } }]),
  ),
  mathInline: docOf(paragraph([{ type: "mathInline", attrs: { text: "x^2" } }])),
  mathBlock: docOf({ type: "mathBlock", attrs: { text: "x^2" } }),
  details: docOf({
    type: "details",
    content: [
      { type: "detailsSummary", content: [text("Summary")] },
      { type: "detailsContent", content: [paragraph([text("body")])] },
    ],
  }),
  attachment: docOf({
    type: "attachment",
    attrs: { url: "http://x/f.zip", name: "f.zip" },
  }),
  video: docOf({ type: "video", attrs: { src: "http://x/v.mp4" } }),
  youtube: docOf({ type: "youtube", attrs: { src: "http://y/watch" } }),
  embed: docOf({ type: "embed", attrs: { src: "http://e", provider: "iframe" } }),
  drawio: docOf({ type: "drawio", attrs: { src: "http://d" } }),
  excalidraw: docOf({ type: "excalidraw", attrs: { src: "http://e" } }),
  columns: docOf({
    type: "columns",
    content: [
      { type: "column", content: [paragraph([text("c1")])] },
      { type: "column", content: [paragraph([text("c2")])] },
    ],
  }),
  subpages: docOf({ type: "subpages" }),
  audio: docOf({ type: "audio", attrs: { src: "http://a.mp3" } }),
  pdf: docOf({ type: "pdf", attrs: { src: "http://p.pdf" } }),
  pageBreak: docOf({ type: "pageBreak" }),
};

for (const [name, doc] of Object.entries(cases)) {
  test(`toYdoc accepts a ${name} node without throwing`, () => {
    assert.doesNotThrow(() => {
      TiptapTransformer.toYdoc(doc, "default", docmostExtensions);
    });
  });
}
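The four `clampCalloutType` tests earlier in this file describe an allow-list clamp: fold case, accept known values, and fall back to `"info"` for anything else, including `null` and `undefined`. A minimal sketch of that shape follows. The set contents are an assumption (only `"info"` and `"warning"` are confirmed by the tests), and `CALLOUT_TYPES_SKETCH` and `clampCalloutTypeSketch` are hypothetical names, not exports of `docmost-schema.js`.

```javascript
// Hypothetical allow-list. The real set of callout types lives in
// docmost-schema.js and may contain more entries; only "info" and
// "warning" are confirmed by the tests.
const CALLOUT_TYPES_SKETCH = new Set(["info", "warning"]);

// Fold case, then accept only known values; everything else becomes "info".
function clampCalloutTypeSketch(value) {
  const normalized = typeof value === "string" ? value.toLowerCase() : "";
  return CALLOUT_TYPES_SKETCH.has(normalized) ? normalized : "info";
}

const clamped = ["warning", "WARNING", "Info", "bogus", null, undefined].map(
  clampCalloutTypeSketch,
);
```

Clamping to a fixed set rather than passing values through keeps arbitrary strings out of the node's attributes when markdown is imported.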
338
packages/mcp/test/unit/table-ops.test.mjs
Normal file
@@ -0,0 +1,338 @@
import { test } from "node:test";
import assert from "node:assert/strict";

import {
  readTable,
  insertTableRow,
  deleteTableRow,
  updateTableCell,
} from "../../build/lib/node-ops.js";

// ---------------------------------------------------------------------------
// Builders. Tables/rows/cells carry NO attrs.id; only the paragraph inside a
// cell does. A cell holds a single plain-text paragraph.
// ---------------------------------------------------------------------------
const textNode = (text) => ({ type: "text", text });
const para = (id, text) => ({
  type: "paragraph",
  attrs: { id, indent: 0 },
  content: text ? [textNode(text)] : [],
});
const cell = (paraId, text, colwidth) => ({
  type: "tableCell",
  attrs: { colspan: 1, rowspan: 1, ...(colwidth ? { colwidth } : {}) },
  content: [para(paraId, text)],
});
const row = (...cells) => ({ type: "tableRow", content: cells });
const doc = (...children) => ({ type: "doc", content: children });
const snapshot = (v) => JSON.parse(JSON.stringify(v));

// Heading at index 0, a 3x3 table at index 1.
// Header row "A"/"B"/"C" with colwidths [120]/[200]/[150]; two data rows.
const makeDoc = () =>
  doc(
    { type: "heading", attrs: { id: "h1", level: 1 }, content: [textNode("Title")] },
    {
      type: "table",
      content: [
        row(
          cell("hpA", "A", [120]),
          cell("hpB", "B", [200]),
          cell("hpC", "C", [150]),
        ),
        row(cell("p10", "r1c0"), cell("p11", "r1c1"), cell("p12", "r1c2")),
        row(cell("p20", "r2c0"), cell("p21", "r2c1"), cell("p22", "r2c2")),
      ],
    },
  );

// Gather every attrs.id present anywhere in a doc.
const allIds = (node, acc = new Set()) => {
  if (node && typeof node === "object" && !Array.isArray(node)) {
    if (node.attrs && typeof node.attrs.id === "string") acc.add(node.attrs.id);
    if (Array.isArray(node.content)) node.content.forEach((c) => allIds(c, acc));
  }
  return acc;
};

// ---------------------------------------------------------------------------
// readTable
// ---------------------------------------------------------------------------

test("readTable('#1') returns the 3x3 matrix, cell ids, and path", () => {
  const t = readTable(makeDoc(), "#1");
  assert.ok(t);
  assert.equal(t.rows, 3);
  assert.equal(t.cols, 3);
  assert.deepEqual(t.cells, [
    ["A", "B", "C"],
    ["r1c0", "r1c1", "r1c2"],
    ["r2c0", "r2c1", "r2c2"],
  ]);
  assert.deepEqual(t.cellIds, [
    ["hpA", "hpB", "hpC"],
    ["p10", "p11", "p12"],
    ["p20", "p21", "p22"],
  ]);
  assert.deepEqual(t.path, [1]);
});

test("readTable(<cell paragraph id>) resolves the enclosing table", () => {
  const t = readTable(makeDoc(), "p21"); // a paragraph inside a data cell
  assert.ok(t);
  assert.equal(t.rows, 3);
  assert.equal(t.cols, 3);
  assert.deepEqual(t.path, [1]);
});

test("readTable on a non-table block / unknown ref returns null", () => {
  assert.equal(readTable(makeDoc(), "#0"), null); // heading, not a table
  assert.equal(readTable(makeDoc(), "nope"), null); // no such id
});

// ---------------------------------------------------------------------------
// insertTableRow
// ---------------------------------------------------------------------------

test("insertTableRow appends a 4th row, copies header colwidths, fresh unique ids", () => {
  const input = makeDoc();
  const snap = snapshot(input);
  const existingIds = allIds(input);

  const { doc: out, inserted } = insertTableRow(input, "#1", ["x", "y", "z"]);
  assert.equal(inserted, true);

  // Input not mutated.
  assert.deepEqual(input, snap);

  const tbl = out.content[1];
  assert.equal(tbl.content.length, 4);
  const newRow = tbl.content[3];
  assert.equal(newRow.type, "tableRow");
  assert.equal(newRow.content.length, 3);

  // Cell texts.
  assert.deepEqual(
    newRow.content.map((c) => c.content[0].content[0]?.text),
    ["x", "y", "z"],
  );
  // Colwidths copied from the header row.
  assert.deepEqual(
    newRow.content.map((c) => c.attrs.colwidth),
    [[120], [200], [150]],
  );
  // colspan/rowspan present.
  for (const c of newRow.content) {
    assert.equal(c.attrs.colspan, 1);
    assert.equal(c.attrs.rowspan, 1);
  }

  // New paragraph ids are unique and not equal to any existing id.
  const newIds = newRow.content.map((c) => c.content[0].attrs.id);
  assert.equal(new Set(newIds).size, 3);
  for (const id of newIds) {
    assert.ok(typeof id === "string" && id.length > 0);
    assert.equal(existingIds.has(id), false);
  }
});

test("insertTableRow at index 0 inserts before the header and pads to 3 cells", () => {
  const { doc: out, inserted } = insertTableRow(makeDoc(), "#1", ["x"], 0);
  assert.equal(inserted, true);

  const tbl = out.content[1];
  assert.equal(tbl.content.length, 4);
  const newRow = tbl.content[0]; // inserted at the front
  assert.equal(newRow.content.length, 3);
  // First cell "x", remaining two empty.
  assert.deepEqual(
    newRow.content.map((c) => c.content[0].content.length),
    [1, 0, 0],
  );
  assert.equal(newRow.content[0].content[0].content[0].text, "x");
});

test("insertTableRow throws when given more cells than columns", () => {
  assert.throws(
    () => insertTableRow(makeDoc(), "#1", ["a", "b", "c", "d"]),
    /table_insert_row: got 4 cell\(s\) but the table has 3 column\(s\)/,
  );
});

test("insertTableRow on a missing table returns inserted:false", () => {
  const { inserted } = insertTableRow(makeDoc(), "#0", ["x"]);
  assert.equal(inserted, false);
});

// A header cell uses type "tableHeader" (vs. "tableCell" for data cells).
|
||||
const headerCell = (paraId, text, colwidth) => ({
|
||||
type: "tableHeader",
|
||||
attrs: { colspan: 1, rowspan: 1, ...(colwidth ? { colwidth } : {}) },
|
||||
content: [para(paraId, text)],
|
||||
});
|
||||
|
||||
// Table whose first row uses tableHeader cells.
|
||||
const makeHeaderDoc = () =>
|
||||
doc({
|
||||
type: "table",
|
||||
content: [
|
||||
row(headerCell("hA", "A"), headerCell("hB", "B")),
|
||||
row(cell("p10", "r1c0"), cell("p11", "r1c1")),
|
||||
],
|
||||
});
|
||||
|
||||
test("insertTableRow at index 0 inherits the header cell type (tableHeader)", () => {
|
||||
const { doc: out, inserted } = insertTableRow(makeHeaderDoc(), "#0", ["x", "y"], 0);
|
||||
assert.equal(inserted, true);
|
||||
|
||||
const tbl = out.content[0];
|
||||
const newRow = tbl.content[0]; // landed at index 0
|
||||
// The new row's cells inherit the header type.
|
||||
assert.deepEqual(
|
||||
newRow.content.map((c) => c.type),
|
||||
["tableHeader", "tableHeader"],
|
||||
);
|
||||
assert.equal(newRow.content[0].content[0].content[0].text, "x");
|
||||
});
|
||||
|
||||
test("insertTableRow append produces data cells (tableCell), not header cells", () => {
|
||||
const { doc: out, inserted } = insertTableRow(makeHeaderDoc(), "#0", ["x", "y"]);
|
||||
assert.equal(inserted, true);
|
||||
|
||||
const tbl = out.content[0];
|
||||
const newRow = tbl.content[tbl.content.length - 1]; // appended last
|
||||
assert.deepEqual(
|
||||
newRow.content.map((c) => c.type),
|
||||
["tableCell", "tableCell"],
|
||||
);
|
||||
});
|
||||
|
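Review note: one rule that would satisfy both header tests above is to copy the cell type from the row currently sitting at the insertion index, and to fall back to `tableCell` when the row is appended. The sketch below illustrates only that rule. `cellTypeForInsert` is a hypothetical name, and the real `insertTableRow` may choose the type differently.

```javascript
// Hypothetical helper (not part of the package): pick the node type for the
// cells of a row about to be inserted at `index` (undefined = append).
function cellTypeForInsert(table, index) {
  const rows = table.content ?? [];
  // The row currently at the target index is the one the new row displaces.
  const reference =
    index !== undefined && index < rows.length ? rows[index] : undefined;
  // Appending (or an empty reference row) falls back to a plain data cell.
  return reference?.content?.[0]?.type ?? "tableCell";
}
```

Under this assumption, inserting at index 0 of a table with a header row yields `tableHeader` cells, while an append yields `tableCell` cells.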
// Ragged table: row 0 has 2 cols, a later row has 3.
const makeRaggedDoc = () =>
  doc({
    type: "table",
    content: [
      row(cell("a0", "a0"), cell("a1", "a1")),
      row(cell("b0", "b0"), cell("b1", "b1"), cell("b2", "b2")),
    ],
  });

test("insertTableRow uses the max column count across all rows (ragged table)", () => {
  // colCount is 3 (the widest row), so 3 cells are accepted...
  const { doc: out, inserted } = insertTableRow(makeRaggedDoc(), "#0", ["x", "y", "z"]);
  assert.equal(inserted, true);
  const tbl = out.content[0];
  const newRow = tbl.content[tbl.content.length - 1];
  assert.equal(newRow.content.length, 3);
  assert.deepEqual(
    newRow.content.map((c) => c.content[0].content[0]?.text),
    ["x", "y", "z"],
  );

  // ...but 4 cells exceed the widest row and throw.
  assert.throws(
    () => insertTableRow(makeRaggedDoc(), "#0", ["a", "b", "c", "d"]),
    /table_insert_row: got 4 cell\(s\) but the table has 3 column\(s\)/,
  );
});

test("insertTableRow into an empty table uses colCount = supplied cells", () => {
  const empty = doc({ type: "table", content: [] });
  const { doc: out, inserted } = insertTableRow(empty, "#0", ["x", "y", "z"]);
  assert.equal(inserted, true);
  const tbl = out.content[0];
  assert.equal(tbl.content.length, 1);
  assert.equal(tbl.content[0].content.length, 3);
  assert.deepEqual(
    tbl.content[0].content.map((c) => c.content[0].content[0]?.text),
    ["x", "y", "z"],
  );
});

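Review note: the column-count behaviour these tests pin down (widest row wins, an empty table takes its width from the supplied cells, short input is padded, long input throws) can be sketched as below. `padRowTexts` is a hypothetical helper written for illustration, it ignores `colspan`, and it is not taken from the package source.

```javascript
// Hypothetical helper: normalise the caller's cell texts to the table width.
function padRowTexts(table, cells, toolName = "table_insert_row") {
  const rows = table.content ?? [];
  // Widest existing row wins; an empty table takes its width from the input.
  const colCount = rows.length
    ? Math.max(...rows.map((r) => (r.content ?? []).length))
    : cells.length;
  if (cells.length > colCount) {
    throw new Error(
      `${toolName}: got ${cells.length} cell(s) but the table has ${colCount} column(s)`,
    );
  }
  // Missing trailing cells become empty strings (rendered as empty paragraphs).
  return Array.from({ length: colCount }, (_, i) => cells[i] ?? "");
}
```

Taking the maximum rather than the first row's length is what lets a ragged table accept a full-width row.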
test("insertTableRow mints 12-char [a-z0-9] ids that are unique and non-colliding", () => {
  const input = makeDoc();
  const existingIds = allIds(input);
  const { doc: out } = insertTableRow(input, "#1", ["x", "y", "z"]);

  const tbl = out.content[1];
  const newRow = tbl.content[tbl.content.length - 1];
  const newIds = newRow.content.map((c) => c.content[0].attrs.id);

  // Docmost-style: exactly 12 chars from lowercase a-z0-9.
  for (const id of newIds) {
    assert.match(id, /^[a-z0-9]{12}$/);
    assert.equal(existingIds.has(id), false); // no collision with the doc
  }
  // All distinct within the new row.
  assert.equal(new Set(newIds).size, newIds.length);
});

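Review note: an id generator with the properties asserted above (12 characters from `[a-z0-9]`, no collision with ids already in the document, distinct within one call batch) could look like the sketch below. `mintId` and `ID_ALPHABET` are hypothetical names, and `Math.random` is an assumption chosen for brevity; the package may use a different random source.

```javascript
const ID_ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

// Hypothetical helper: mint an id that is not in `taken`, then reserve it.
function mintId(taken, length = 12) {
  for (;;) {
    let id = "";
    for (let i = 0; i < length; i++) {
      id += ID_ALPHABET[Math.floor(Math.random() * ID_ALPHABET.length)];
    }
    if (!taken.has(id)) {
      taken.add(id); // reserve it so later calls in the same batch cannot repeat it
      return id;
    }
  }
}
```

Passing the set of existing document ids as `taken` covers both the collision check and uniqueness within the new row.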
// ---------------------------------------------------------------------------
// deleteTableRow
// ---------------------------------------------------------------------------

test("deleteTableRow removes the 3rd row -> rows:2", () => {
  const { doc: out, deleted } = deleteTableRow(makeDoc(), "#1", 2);
  assert.equal(deleted, true);
  const tbl = out.content[1];
  assert.equal(tbl.content.length, 2);
  // The removed row was the second data row (r2*).
  assert.deepEqual(
    tbl.content.map((r) => r.content[0].content[0].content[0]?.text ?? ""),
    ["A", "r1c0"],
  );
});

test("deleteTableRow out-of-range index throws", () => {
  assert.throws(
    () => deleteTableRow(makeDoc(), "#1", 9),
    /table_delete_row: row index 9 out of range \(table has 3 row\(s\)\)/,
  );
});

test("deleteTableRow refuses to delete the only row", () => {
  const single = doc({
    type: "table",
    content: [row(cell("only", "x"))],
  });
  assert.throws(
    () => deleteTableRow(single, "#0", 0),
    /refusing to delete the only row of the table/,
  );
});

// ---------------------------------------------------------------------------
// updateTableCell
// ---------------------------------------------------------------------------

test("updateTableCell sets cell [1,1] to 'Z' and preserves the paragraph id", () => {
  const input = makeDoc();
  const snap = snapshot(input);
  const { doc: out, updated } = updateTableCell(input, "#1", 1, 1, "Z");
  assert.equal(updated, true);

  // Input not mutated.
  assert.deepEqual(input, snap);

  const targetCell = out.content[1].content[1].content[1];
  assert.equal(targetCell.content.length, 1);
  const p = targetCell.content[0];
  assert.equal(p.type, "paragraph");
  assert.equal(p.attrs.id, "p11"); // preserved
  assert.equal(p.content[0].text, "Z");

  // Cell attrs preserved.
  assert.equal(targetCell.attrs.colspan, 1);
  assert.equal(targetCell.attrs.rowspan, 1);
});

test("updateTableCell out-of-range row/col throws", () => {
  assert.throws(
    () => updateTableCell(makeDoc(), "#1", 9, 0, "x"),
    /table_update_cell: cell \[9,0\] out of range/,
  );
  assert.throws(
    () => updateTableCell(makeDoc(), "#1", 0, 9, "x"),
    /table_update_cell: cell \[0,9\] out of range/,
  );
});
303
packages/mcp/test/unit/transforms.test.mjs
Normal file
@@ -0,0 +1,303 @@
import { test } from "node:test";
import assert from "node:assert/strict";

import {
  blockText,
  walk,
  getList,
  insertMarkerAfter,
  setCalloutRange,
  noteItem,
  mdToInlineNodes,
  commentsToFootnotes,
} from "../../build/lib/transforms.js";

// ---------------------------------------------------------------------------
// Builders
// ---------------------------------------------------------------------------
const t = (text, marks) => (marks ? { type: "text", text, marks } : { type: "text", text });
const para = (id, ...children) => ({
  type: "paragraph",
  attrs: { id },
  content: children,
});
const heading = (id, text) => ({
  type: "heading",
  attrs: { id, level: 2 },
  content: [t(text)],
});
const olist = (...items) => ({ type: "orderedList", content: items });
const li = (text) => ({
  type: "listItem",
  content: [{ type: "paragraph", content: [t(text)] }],
});
const doc = (...children) => ({ type: "doc", content: children });
const snapshot = (v) => JSON.parse(JSON.stringify(v));

// ---------------------------------------------------------------------------
// blockText / walk / getList
// ---------------------------------------------------------------------------
test("blockText concatenates nested inline text", () => {
  assert.equal(blockText(para("p", t("a"), t("b"), t("c"))), "abc");
});

test("walk visits every node depth-first", () => {
  const d = doc(para("p1", t("x")), olist(li("y")));
  const types = [];
  walk(d, (n) => types.push(n.type));
  assert.deepEqual(types, [
    "doc",
    "paragraph",
    "text",
    "orderedList",
    "listItem",
    "paragraph",
    "text",
  ]);
});

test("getList finds an orderedList without an id", () => {
  const d = doc(para("p", t("x")), olist(li("one")));
  const found = getList(d, (n) => n.type === "orderedList");
  assert.ok(found);
  assert.equal(found.type, "orderedList");
});

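Review note: the visiting order asserted for `walk` is a pre-order depth-first traversal (parent first, then children left to right), and `blockText` can be expressed on top of it. The sketch below uses the hypothetical names `walkSketch` and `blockTextSketch` to avoid confusion with the imported functions; it is an illustration of the traversal, not the package's code.

```javascript
// Hypothetical stand-in for walk: visit the node, then recurse into children.
function walkSketch(node, visit) {
  visit(node);
  for (const child of node.content ?? []) walkSketch(child, visit);
}

// Hypothetical stand-in for blockText: concatenate every text leaf in order.
function blockTextSketch(node) {
  let out = "";
  walkSketch(node, (n) => {
    if (n.type === "text") out += n.text ?? "";
  });
  return out;
}
```

Because the callback fires before recursion, a container such as `orderedList` is always reported ahead of its `listItem` children.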
// ---------------------------------------------------------------------------
// insertMarkerAfter: mark-safe split
// ---------------------------------------------------------------------------
test("insertMarkerAfter splits a marked run and inserts an UNMARKED marker", () => {
  // A paragraph: "see " (plain) + "the link" (link mark) + " here" (plain).
  const link = [{ type: "link", attrs: { href: "http://x" } }];
  const original = doc(
    para("p1", t("see "), t("the link", link), t(" here")),
  );
  const before = snapshot(original);

  const { doc: out, inserted } = insertMarkerAfter(
    original,
    "the link",
    "[1]",
  );
  assert.equal(inserted, true);
  // The caller's object is untouched (deep clone).
  assert.deepEqual(original, before);

  const inline = out.content[0].content;
  // Expect: "see "(plain), "the link"(link), " [1]"(NO marks), " here"(plain).
  const marker = inline.find((n) => n.text === " [1]");
  assert.ok(marker, "marker run present");
  assert.equal(marker.marks, undefined, "marker carries no marks");

  // The link run kept its mark verbatim.
  const linkRun = inline.find((n) => n.text === "the link");
  assert.deepEqual(linkRun.marks, link);

  // Plain text reads correctly with the marker placed right after the anchor.
  assert.equal(blockText(out.content[0]), "see the link [1] here");
});

test("insertMarkerAfter respects beforeBlock and reports not-found", () => {
  const d = doc(para("p1", t("alpha")), para("p2", t("beta")));
  // anchor only in block index 1, but search limited to blocks < 1
  const r = insertMarkerAfter(d, "beta", "[1]", { beforeBlock: 1 });
  assert.equal(r.inserted, false);
});

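Review note: the "mark-safe split" checked above means the text run holding the anchor is cut at the end of the anchor, both halves keep the run's marks, and the marker goes in between as a run with no marks. A minimal sketch on a single paragraph's inline array follows. `insertMarkerInline` is a hypothetical name, and the sketch assumes the anchor sits inside one text node; anchors spanning several runs are not handled.

```javascript
// Hypothetical helper: returns a new inline array, leaving the input untouched.
function insertMarkerInline(inline, anchor, marker) {
  for (let i = 0; i < inline.length; i++) {
    const node = inline[i];
    if (node.type !== "text") continue;
    const at = node.text.indexOf(anchor);
    if (at === -1) continue;
    const end = at + anchor.length;
    // Both halves are copies of the original run, so they keep its marks.
    const head = { ...node, text: node.text.slice(0, end) };
    const tail = { ...node, text: node.text.slice(end) };
    const markerNode = { type: "text", text: ` ${marker}` }; // deliberately no marks
    const pieces = [head, markerNode];
    if (tail.text.length > 0) pieces.push(tail);
    return {
      inline: [...inline.slice(0, i), ...pieces, ...inline.slice(i + 1)],
      inserted: true,
    };
  }
  return { inline, inserted: false };
}
```

Keeping the marker unmarked is what prevents a footnote number from becoming part of a link or bold run.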
// ---------------------------------------------------------------------------
// setCalloutRange
// ---------------------------------------------------------------------------
test("setCalloutRange rewrites [1]…[K] to [1]…[n]", () => {
  const d = doc({
    type: "callout",
    attrs: { type: "info" },
    content: [para("c", t("Footnotes [1]…[3] are translator notes."))],
  });
  const { doc: out, changed } = setCalloutRange(d, 7);
  assert.equal(changed, 1);
  assert.equal(blockText(out), "Footnotes [1]…[7] are translator notes.");
});

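Review note: on plain text, the range rewrite tested above amounts to a single regex replacement of `[1]…[K]` with `[1]…[n]`. The sketch below works on a string rather than a document tree; `rewriteRange` is a hypothetical name, and counting every match as a change (even when `K` already equals `n`) is an assumption of this sketch.

```javascript
// Hypothetical helper: rewrite every "[1]…[K]" range so it ends at n.
function rewriteRange(text, n) {
  let changed = 0;
  const out = text.replace(/\[1\]…\[\d+\]/g, () => {
    changed += 1;
    return `[1]…[${n}]`;
  });
  return { text: out, changed };
}
```

The pattern requires the literal ellipsis between the two brackets, so an ordinary body marker such as `[3]` on its own is left alone.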
// ---------------------------------------------------------------------------
// noteItem / mdToInlineNodes
// ---------------------------------------------------------------------------
test("noteItem wraps inline nodes in a listItem with a fresh paragraph id", () => {
  const item = noteItem([t("hello")]);
  assert.equal(item.type, "listItem");
  assert.equal(item.content[0].type, "paragraph");
  assert.ok(item.content[0].attrs.id, "has a fresh id");
  assert.deepEqual(item.content[0].content, [t("hello")]);
});

test("mdToInlineNodes splits a bold lead and strips a prefix", () => {
  const nodes = mdToInlineNodes("комментарий: **Lead.** body text");
  // bold lead node + plain remainder
  assert.equal(nodes[0].text, "Lead.");
  assert.deepEqual(nodes[0].marks, [{ type: "bold" }]);
  assert.ok(nodes[1].text.includes("body text"));
  assert.equal(nodes[1].marks, undefined);
});

test("mdToInlineNodes strips a 'N. ' numeric prefix", () => {
  const nodes = mdToInlineNodes("3. plain note");
  assert.equal(nodes.map((n) => n.text).join(""), "plain note");
});

// ---------------------------------------------------------------------------
// commentsToFootnotes: renumber by reading position on a small fixture
// ---------------------------------------------------------------------------
test("commentsToFootnotes anchors comments and renumbers by position", () => {
  // Body has an EXISTING footnote [1] in the second paragraph; we add two
  // inline comments anchored to text in the first and third paragraphs. After
  // running, markers must be renumbered 1,2,3 in reading order and the notes
  // list reordered to match.
  const callout = {
    type: "callout",
    attrs: { type: "info" },
    content: [para("c", t("Notes [1]…[1] follow."))],
  };
  const d = doc(
    callout,
    para("p1", t("First mentions apple.")),
    para("p2", t("Second already has a note [1] here.")),
    para("p3", t("Third mentions banana.")),
    heading("h", "Примечания переводчика"),
    olist(li("existing note one")), // matches the existing [1]
  );

  const comments = [
    { id: "cA", content: "apple note", selection: "apple" },
    { id: "cB", content: "banana note", selection: "banana" },
  ];

  const { doc: out, consumed } = commentsToFootnotes(d, comments);
  assert.deepEqual(consumed.sort(), ["cA", "cB"]);

  // Markers in reading order: p1 "apple"->[1], p2 existing->[2], p3 "banana"->[3]
  assert.match(blockText(out.content[1]), /\[1\]/);
  assert.match(blockText(out.content[2]), /\[2\]/);
  assert.match(blockText(out.content[3]), /\[3\]/);

  // No stray placeholders remain.
  const allText = blockText(out);
  assert.doesNotMatch(allText, / F\d+ /);

  // Notes list reordered to [apple, existing, banana] (reading order).
  const list = out.content.find((n) => n.type === "orderedList");
  assert.equal(list.content.length, 3);
  assert.equal(blockText(list.content[0]), "apple note");
  assert.equal(blockText(list.content[1]), "existing note one");
  assert.equal(blockText(list.content[2]), "banana note");

  // Callout range synced to 3 notes.
  assert.match(blockText(out.content[0]), /\[1\]…\[3\]/);
});

test("commentsToFootnotes throws when the notes heading is missing", () => {
  const d = doc(para("p", t("no notes section")));
  assert.throws(
    () => commentsToFootnotes(d, [{ id: "x", content: "y", selection: "no" }]),
    /heading .* not found/,
  );
});

// ---------------------------------------------------------------------------
// Bug 1: the placeholder sentinel must not collide with real "F<digits>" /
// "FN<digits>" text. Body text "F1"/"FN2"/"F12" near a real comment anchor must
// be left untouched; only the real comment becomes a footnote. "FN2" is the key
// case: the old printable " FN<i> " sentinel could collide with prose like "FN2",
// which the NUL-delimited "\u0000FN<i>\u0000" sentinel makes impossible.
// ---------------------------------------------------------------------------
test("commentsToFootnotes leaves literal 'F1'/'FN2'/'F12' body text untouched", () => {
  const d = doc(
    para("p1", t("Press F1 for help, model FN2 and F12 for tools near apple here.")),
    heading("h", "Примечания переводчика"),
    olist(), // empty notes list; the single comment supplies the only note
  );

  const comments = [{ id: "cA", content: "apple note", selection: "apple" }];

  const { doc: out, consumed } = commentsToFootnotes(d, comments);
  assert.deepEqual(consumed, ["cA"]);

  const bodyText = blockText(out.content[0]);
  // The literal "F1"/"FN2"/"F12" prose is preserved verbatim (no bogus
  // footnotes, no eaten spaces around them).
  assert.match(bodyText, /Press F1 for help, model FN2 and F12 for tools/);
  // Exactly one real footnote marker was produced, at the anchored word.
  const markerCount = (bodyText.match(/\[\d+\]/g) || []).length;
  assert.equal(markerCount, 1);
  assert.match(bodyText, /apple \[1\]/);

  // Exactly one note in the list: "F1"/"FN2"/"F12" did not spawn extra notes.
  const list = out.content.find((n) => n.type === "orderedList");
  assert.equal(list.content.length, 1);
  assert.equal(blockText(list.content[0]), "apple note");

  // No stray placeholder sentinel remains anywhere: the NUL-delimited sentinel
  // is fully consumed by the renumber pass, so no raw NUL control char persists
  // in the returned doc. We deliberately do NOT assert absence of the printable
  // " FN<i> " shape: the body intentionally contains real prose "model FN2 and",
  // which must survive verbatim (see the match assertion above) - that is exactly
  // why the old printable sentinel was unsafe and the NUL sentinel is not.
  assert.doesNotMatch(blockText(out), /\u0000/);
});

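Review note: the reason a NUL-delimited sentinel cannot collide with prose is that U+0000 does not occur in ordinary document text, so a pattern anchored on it only ever matches placeholders the transform wrote itself. The sketch below shows the idea on a plain string. `sentinel`, `SENTINEL_RE` and `resolveSentinels` are hypothetical names; only the `"\u0000FN<i>\u0000"` shape comes from the comment above.

```javascript
// Hypothetical helpers illustrating the NUL-delimited placeholder.
const sentinel = (i) => `\u0000FN${i}\u0000`;
const SENTINEL_RE = /\u0000FN(\d+)\u0000/g;

// Replace each placeholder with its final "[n]" marker; `numberFor` maps the
// placeholder index to the footnote number chosen by the renumber pass.
function resolveSentinels(text, numberFor) {
  return text.replace(SENTINEL_RE, (_, i) => `[${numberFor(Number(i))}]`);
}
```

With a printable `" FN<i> "` placeholder the same replacement would also have rewritten the literal "FN2" in the fixture paragraph.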
// ---------------------------------------------------------------------------
// Bug 2: an out-of-range body marker must throw, not silently drop the note.
// ---------------------------------------------------------------------------
test("commentsToFootnotes throws on an out-of-range body marker", () => {
  // Body marker [9] but the notes list has only 1 item -> inconsistent doc.
  const d = doc(
    para("p1", t("Some text with a dangling marker [9] here.")),
    heading("h", "Примечания переводчика"),
    olist(li("the only note")),
  );

  assert.throws(
    () => commentsToFootnotes(d, []),
    /footnote \[9\] has no matching note \(notes list has 1 items\); document is inconsistent/,
  );
});

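Review note: renumbering by reading position, together with the out-of-range check from Bug 2, can be sketched on plain strings as follows. `renumberByReadingOrder` is a hypothetical helper: it takes body paragraphs as strings and notes as an array, whereas the real `commentsToFootnotes` works on the document tree. Notes that no marker references are dropped by this sketch, which is an assumption and may not match the package.

```javascript
// Hypothetical helper: renumber "[N]" markers in reading order and reorder
// the notes to match. Throws when a marker points past the notes list.
function renumberByReadingOrder(paragraphs, notes) {
  const ordered = []; // notes in the order their markers are first read
  const seen = new Map(); // old number -> new number
  const out = paragraphs.map((text) =>
    text.replace(/\[(\d+)\]/g, (_, raw) => {
      const old = Number(raw);
      if (old < 1 || old > notes.length) {
        throw new Error(
          `footnote [${old}] has no matching note (notes list has ${notes.length} items); document is inconsistent`,
        );
      }
      if (!seen.has(old)) {
        ordered.push(notes[old - 1]);
        seen.set(old, ordered.length);
      }
      return `[${seen.get(old)}]`;
    }),
  );
  return { paragraphs: out, notes: ordered };
}
```

Throwing on a dangling marker surfaces an inconsistent document to the caller instead of silently losing a note.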
// ---------------------------------------------------------------------------
// Bug 4: a non-disclaimer callout in the body gets its [N] markers renumbered;
// a disclaimer callout carrying a "[1]…[K]" range is left out of renumbering.
// ---------------------------------------------------------------------------
test("commentsToFootnotes renumbers body callouts but skips the disclaimer range", () => {
  const disclaimer = {
    type: "callout",
    attrs: { type: "info" },
    content: [para("d", t("Notes [1]…[2] follow."))],
  };
  const bodyCallout = {
    type: "callout",
    attrs: { type: "warning" },
    content: [para("bc", t("Important point already noted [1] above."))],
  };
  const d = doc(
    disclaimer,
    bodyCallout,
    para("p2", t("Then a second mention with [2] too.")),
    heading("h", "Примечания переводчика"),
    olist(li("first note"), li("second note")),
  );

  const { doc: out, consumed } = commentsToFootnotes(d, []);
  assert.deepEqual(consumed, []);

  // The disclaimer's "[1]…[K]" range is NOT treated as body markers: it stays
  // a range and is synced to the note count (2), not renumbered into [1],[2].
  assert.match(blockText(out.content[0]), /\[1\]…\[2\]/);

  // The body callout's [1] is renumbered as a real reading-order marker.
  assert.match(blockText(out.content[1]), /noted \[1\] above/);
  // The following paragraph's [2] keeps reading order.
  assert.match(blockText(out.content[2]), /with \[2\] too/);

  // Notes list still has the two original notes in order.
  const list = out.content.find((n) => n.type === "orderedList");
  assert.equal(list.content.length, 2);
  assert.equal(blockText(list.content[0]), "first note");
  assert.equal(blockText(list.content[1]), "second note");
});
14
packages/mcp/tsconfig.json
Normal file
@@ -0,0 +1,14 @@
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "Node16",
    "moduleResolution": "Node16",
    "outDir": "./build",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true
  },
  "include": ["src/**/*"]
}