test: cover features since 053a9c0d + repair test tooling

Add ~330 tests across server (Jest), client (Vitest), editor-ext (Vitest)
and packages/mcp (node:test) for the gitmost features added since
053a9c0d: AI chat, AI agent roles, public-share assistant, MCP per-user
auth, HTML embed, page templates/embed, realtime tree, tree
expand/collapse, and the AI-settings UI.

Test-tooling fixes (prerequisite, were silently hiding coverage):
- Repair 3 page-template specs broken by the 11-arg TransclusionService
  constructor; they never compiled, so template access-control /
  content-leak / unsync-strip coverage was fictitious.
- Build @docmost/editor-ext before server tests via a `pretest` hook; the
  stale dist omitted the new HtmlEmbed/PageEmbed exports (TS2305).
- Let jest resolve the .tsx email templates: add `tsx` to
  moduleFileExtensions and widen the ts-jest transform to (t|j)sx?.

Behaviour-preserving "extract pure core" refactors that the tests drive:
- server: resolveShareAssistantRequest + uiMessageTextLength (public-share
  controller), decideBasicGate + mapAuthResultToResponse (mcp),
  buildErrorAssistantRecord (ai-chat), jsonbObject export (roles).
- client: render-raw-html + shouldExecute/canEdit, decide-embed-state,
  page-embed picker utils, tree-socket reducers, open/close branch maps,
  isEndpointConfigured/resolveKeyField; buildTreeWithChildren now treats a
  permission-trimmed orphan as a root instead of crashing.

Deferred (need a test DB or HTTP harness, documented in the specs):
repo-level Postgres integration tests and the public-share XFF E2E.
Pre-existing DI/lib0-ESM suite failures are untouched and out of scope.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 23:40:40 +03:00
parent 692c0abe13
commit 90d3fab483
56 changed files with 5668 additions and 447 deletions
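
The Jest resolution fix named in the commit message can be sketched roughly as follows. This is only an illustration of the shape of the change: the repository's actual Jest config is not part of this excerpt, `jestConfigSketch` and `TRANSFORM_PATTERN` are hypothetical names, and the extension list other than `tsx` is an assumption. `moduleFileExtensions` and `transform` themselves are standard Jest options.

```typescript
// Sketch only: not the repository's actual Jest config.
// TRANSFORM_PATTERN and jestConfigSketch are hypothetical names.
const TRANSFORM_PATTERN = "^.+\\.(t|j)sx?$";

const jestConfigSketch = {
  // "tsx" is added so Jest can resolve the .tsx email templates.
  // The other entries are assumed defaults, not taken from the commit.
  moduleFileExtensions: ["js", "json", "ts", "tsx"],
  transform: {
    // Widened to (t|j)sx? so .tsx (and .jsx) files also go through ts-jest.
    [TRANSFORM_PATTERN]: "ts-jest",
  },
};

// Jest treats each transform key as a regex matched against file paths.
const transformRegex = new RegExp(TRANSFORM_PATTERN);
for (const file of ["a.ts", "a.tsx", "a.js", "a.jsx", "a.json"]) {
  console.log(file, transformRegex.test(file));
}
```

Both halves are needed: without `tsx` in `moduleFileExtensions` an extensionless import of a template would not resolve, and without the widened pattern the resolved file would not be compiled.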
--- a/packages/editor-ext/src/lib/html-embed/html-embed-codec.spec.ts
+++ b/packages/editor-ext/src/lib/html-embed/html-embed-codec.spec.ts
@@ -0,0 +1,116 @@
+import { afterEach, describe, expect, it } from "vitest";
+import {
+  encodeHtmlEmbedSource,
+  decodeHtmlEmbedSource,
+} from "./html-embed";
+
+// Unit coverage for the base64 codec used by the htmlEmbed node's
+// data-source attribute (html-embed.ts). The codec has two branches:
+//   - the BROWSER branch: btoa(encodeURIComponent(s)) / decodeURIComponent(atob(s));
+//   - the NODE fallback: Buffer.from(..).toString("base64") / Buffer.from(s,"base64").
+// Server-side schema parsing (htmlToJson with no global btoa/atob) hits the
+// fallback, so both branches must round-trip identically; otherwise an embed
+// encoded in the browser would decode wrong on the server (or vice versa).
+//
+// We force the fallback by temporarily DELETING globalThis.btoa/atob (jsdom
+// provides them in this env), restoring them after each test so the suite stays
+// hermetic.
+
+const realBtoa = globalThis.btoa;
+const realAtob = globalThis.atob;
+
+function deleteBase64Globals(): void {
+  // @ts-expect-error — intentionally removing the globals to exercise the
+  // `typeof btoa !== "function"` Node fallback branch in the codec.
+  delete globalThis.btoa;
+  // @ts-expect-error — see above.
+  delete globalThis.atob;
+}
+
+afterEach(() => {
+  // Always restore so one test's stubbing never leaks into another.
+  globalThis.btoa = realBtoa;
+  globalThis.atob = realAtob;
+});
+
+describe("html-embed codec — browser btoa/atob branch", () => {
+  it("round-trips ASCII source", () => {
+    const src = "<script>alert(1)</script>";
+    const enc = encodeHtmlEmbedSource(src);
+    expect(enc).not.toBe("");
+    // base64 of the encodeURIComponent form never contains a raw '<'.
+    expect(enc).not.toContain("<");
+    expect(decodeHtmlEmbedSource(enc)).toBe(src);
+  });
+
+  it("round-trips UTF-8 / non-Latin1 source (the reason for encodeURIComponent)", () => {
+    const src = '<p>héllo → 世界 𝕏</p>';
+    const enc = encodeHtmlEmbedSource(src);
+    expect(decodeHtmlEmbedSource(enc)).toBe(src);
+  });
+});
+
+describe("html-embed codec — Node Buffer fallback branch", () => {
+  it("encode uses the Buffer fallback when btoa is unavailable and still round-trips (UTF-8)", () => {
+    const src = '<div>héllo → 世界 𝕏</div>';
+
+    deleteBase64Globals();
+    // With the globals gone, encode must take the Buffer path...
+    const encFallback = encodeHtmlEmbedSource(src);
+    expect(encFallback).not.toBe("");
+    // ...and decode (also via Buffer) must recover the exact source.
+    expect(decodeHtmlEmbedSource(encFallback)).toBe(src);
+  });
+
+  it("the Buffer fallback produces the SAME bytes the browser branch does (cross-env parity)", () => {
+    const src = '<span>café — 日本語</span>';
+
+    // Browser branch (globals intact).
+    const encBrowser = encodeHtmlEmbedSource(src);
+
+    // Fallback branch.
+    deleteBase64Globals();
+    const encFallback = encodeHtmlEmbedSource(src);
+
+    // Identical base64 => an embed encoded in either environment decodes
+    // identically in the other (server <-> client losslessness).
+    expect(encFallback).toBe(encBrowser);
+
+    // And the fallback can decode what the browser produced.
+    expect(decodeHtmlEmbedSource(encBrowser)).toBe(src);
+  });
+
+  it("empty string -> '' on both encode and decode in the fallback (early return, branch never reached)", () => {
+    deleteBase64Globals();
+    expect(encodeHtmlEmbedSource("")).toBe("");
+    expect(decodeHtmlEmbedSource("")).toBe("");
+  });
+
+  it("decode of malformed base64 -> '' via the catch branch (fallback)", () => {
+    // In the Buffer fallback, Buffer.from(..,'base64') is lenient and never
+    // throws, so to hit the catch we need a payload whose DECODED bytes are an
+    // invalid percent-escape, which makes decodeURIComponent throw. base64 of a
+    // lone '%' decodes back to '%', and decodeURIComponent('%') is a URIError.
+    const badBase64 = Buffer.from("%", "utf-8").toString("base64"); // "JQ=="
+
+    deleteBase64Globals();
+    // Sanity: the raw decode really does throw, so we're exercising the catch.
+    expect(() =>
+      decodeURIComponent(Buffer.from(badBase64, "base64").toString("utf-8")),
+    ).toThrow();
+    // The codec swallows it and returns "" rather than propagating.
+    expect(decodeHtmlEmbedSource(badBase64)).toBe("");
+  });
+});
+
+describe("html-embed codec — decode of malformed input (browser branch)", () => {
+  it("returns '' for input atob rejects (catch branch)", () => {
+    // atob throws on characters outside the base64 alphabet; the codec catches
+    // it and returns "" instead of throwing.
+    expect(decodeHtmlEmbedSource("@@not-base64@@")).toBe("");
+  });
+
+  it("empty string short-circuits to '' (never calls atob)", () => {
+    expect(decodeHtmlEmbedSource("")).toBe("");
+  });
+});
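
For readers without html-embed.ts at hand, the two codec branches exercised by the spec above might look roughly like this. It is a sketch reconstructed from the spec's comments, assuming a Node runtime with `Buffer` available; `encodeSourceSketch` and `decodeSourceSketch` are hypothetical names, not the real exports.

```typescript
// Sketch only: reconstructed from the spec's comments, not the real html-embed.ts.
// encodeSourceSketch / decodeSourceSketch are hypothetical names.

function encodeSourceSketch(source: string): string {
  if (!source) return ""; // early return: neither branch is reached
  // Percent-encode first so the payload is pure ASCII, which btoa requires.
  const escaped = encodeURIComponent(source);
  if (typeof btoa === "function") {
    return btoa(escaped); // browser branch
  }
  return Buffer.from(escaped, "utf-8").toString("base64"); // Node fallback
}

function decodeSourceSketch(encoded: string): string {
  if (!encoded) return "";
  try {
    const escaped =
      typeof atob === "function"
        ? atob(encoded) // throws on characters outside the base64 alphabet
        : Buffer.from(encoded, "base64").toString("utf-8"); // lenient, never throws
    // Throws a URIError on an invalid percent-escape such as a lone "%".
    return decodeURIComponent(escaped);
  } catch {
    // Malformed input yields "" instead of propagating the error.
    return "";
  }
}

const sample = "<p>héllo → 世界 𝕏</p>";
console.log(decodeSourceSketch(encodeSourceSketch(sample)) === sample);
```

Because the percent-encoded string is ASCII-only, `btoa` and `Buffer` produce the same base64 for it, which is the cross-environment parity the spec asserts.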
--- a/packages/editor-ext/src/lib/markdown/html-embed-marked.spec.ts
+++ b/packages/editor-ext/src/lib/markdown/html-embed-marked.spec.ts
@@ -0,0 +1,105 @@
+import { describe, expect, it } from "vitest";
+import { htmlEmbedExtension } from "./utils/html-embed.marked";
+import { markdownToHtml } from "./index";
+import { encodeHtmlEmbedSource } from "../html-embed/html-embed";
+
+// CONTRACT tests for the marked block tokenizer that rebuilds an htmlEmbed node
+// from the `<!--html-embed:BASE64-->` marker (html-embed.marked.ts), plus the
+// observable round-trip through markdownToHtml.
+//
+// These pin the REAL tokenizer behaviour the import path depends on:
+//   - the tokenizer rule is anchored (^) and only accepts the base64 alphabet
+//     [A-Za-z0-9+/=], so a marker with non-base64 chars is NOT tokenized and
+//     survives as a literal HTML comment (not silently turned into something the
+//     server's strip no longer recognizes);
+//   - start() reports the correct index of the next marker so marked invokes the
+//     tokenizer at the right offset when a marker sits mid-document / after text;
+//   - a marker with surrounding text on the SAME line is split out into its own
+//     embed div while the surrounding text becomes ordinary paragraphs.
+//
+// The contract is asserted against the actual exported extension and pipeline —
+// no behaviour is invented; the expectations were read off the real tokenizer.
+
+const SAMPLE = "<b>x</b>";
+const ENC = encodeHtmlEmbedSource(SAMPLE);
+
+describe("htmlEmbed marked tokenizer — start()", () => {
+  it("returns the index of a marker that sits mid-document", () => {
+    const src = `hello world <!--html-embed:${ENC}-->`;
+    expect(htmlEmbedExtension.start(src)).toBe(src.indexOf("<!--html-embed:"));
+  });
+
+  it("returns 0 when the marker is at the very start", () => {
+    expect(htmlEmbedExtension.start(`<!--html-embed:${ENC}-->`)).toBe(0);
+  });
+
+  it("returns -1 when there is no marker", () => {
+    expect(htmlEmbedExtension.start("no marker here")).toBe(-1);
+  });
+});
+
+describe("htmlEmbed marked tokenizer — tokenizer()", () => {
+  it("tokenizes a marker at the start of the input, capturing the base64 payload", () => {
+    const token = htmlEmbedExtension.tokenizer(`<!--html-embed:${ENC}-->`);
+    expect(token).toBeTruthy();
+    expect(token!.type).toBe("htmlEmbed");
+    expect(token!.raw).toBe(`<!--html-embed:${ENC}-->`);
+    expect(token!.encoded).toBe(ENC);
+  });
+
+  it("tokenizes an EMPTY marker (the [A-Za-z0-9+/=]* class allows zero chars)", () => {
+    const token = htmlEmbedExtension.tokenizer("<!--html-embed:-->");
+    expect(token).toBeTruthy();
+    expect(token!.encoded).toBe("");
+    expect(token!.raw).toBe("<!--html-embed:-->");
+  });
+
+  it("does NOT tokenize when text precedes the marker (rule is anchored ^)", () => {
+    // marked relies on start() to advance to the marker; the tokenizer itself
+    // only matches at offset 0, so a non-anchored call returns undefined.
+    expect(
+      htmlEmbedExtension.tokenizer(`hello <!--html-embed:${ENC}-->`),
+    ).toBeUndefined();
+  });
+
+  it("does NOT tokenize a marker containing a non-base64 char ('$')", () => {
+    expect(
+      htmlEmbedExtension.tokenizer("<!--html-embed:ab$cd-->"),
+    ).toBeUndefined();
+  });
+
+  it("does NOT tokenize a marker containing a space", () => {
+    expect(
+      htmlEmbedExtension.tokenizer("<!--html-embed:ab cd-->"),
+    ).toBeUndefined();
+  });
+
+  it("renderer emits the embed div the node's parseHTML recognizes", () => {
+    const token = htmlEmbedExtension.tokenizer(`<!--html-embed:${ENC}-->`)!;
+    const html = htmlEmbedExtension.renderer(token as any);
+    expect(html).toBe(
+      `<div data-type="htmlEmbed" data-source="${ENC}"></div>`,
+    );
+  });
+});
+
+describe("htmlEmbed marked tokenizer — markdownToHtml round-trip", () => {
+  it("splits a marker out of surrounding same-line text into its own embed div", async () => {
+    const html = await markdownToHtml(`before <!--html-embed:${ENC}--> after`);
+    // The marker became the embed div...
+    expect(html).toContain(
+      `<div data-type="htmlEmbed" data-source="${ENC}"></div>`,
+    );
+    // ...and the surrounding text survived as ordinary paragraph content.
+    expect(html).toContain("before");
+    expect(html).toContain("after");
+  });
+
+  it("leaves a marker with non-base64 chars as a literal comment (NOT an embed div)", async () => {
+    const html = await markdownToHtml("<!--html-embed:ab$cd-->");
+    // It is NOT tokenized into an embed div the server would strip...
+    expect(html).not.toContain('data-type="htmlEmbed"');
+    // ...it passes through unchanged as a literal HTML comment.
+    expect(html).toContain("<!--html-embed:ab$cd-->");
+  });
+});
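
The tokenizer contract pinned by the spec above can be illustrated with a dependency-free sketch. The regex and the three functions are assumptions inferred from the test expectations, with hypothetical names (`MARKER_RULE`, `startSketch`, `tokenizerSketch`, `rendererSketch`); the real extension in html-embed.marked.ts may be structured differently.

```typescript
// Sketch only: inferred from the spec's expectations, not the real extension.
interface HtmlEmbedTokenSketch {
  type: "htmlEmbed";
  raw: string;
  encoded: string;
}

// Anchored at ^ and limited to the base64 alphabet; * allows an empty payload.
const MARKER_RULE = /^<!--html-embed:([A-Za-z0-9+/=]*)-->/;

// Reports where the next marker begins (-1 if none) so the caller can
// invoke the tokenizer at that offset.
function startSketch(src: string): number {
  return src.indexOf("<!--html-embed:");
}

// Matches only at offset 0; anything else returns undefined.
function tokenizerSketch(src: string): HtmlEmbedTokenSketch | undefined {
  const match = MARKER_RULE.exec(src);
  if (!match) return undefined;
  return { type: "htmlEmbed", raw: match[0], encoded: match[1] };
}

// Emits the div shape the spec expects the node's parseHTML to recognize.
function rendererSketch(token: HtmlEmbedTokenSketch): string {
  return `<div data-type="htmlEmbed" data-source="${token.encoded}"></div>`;
}

const doc = "hello <!--html-embed:PGI+-->";
const offset = startSketch(doc);
const token = tokenizerSketch(doc.slice(offset));
console.log(offset, token ? rendererSketch(token) : "no token");
```

Restricting the capture group to the base64 alphabet is what keeps a marker such as `<!--html-embed:ab$cd-->` from being tokenized at all.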
--- a/packages/editor-ext/src/lib/page-embed/page-embed.spec.ts
+++ b/packages/editor-ext/src/lib/page-embed/page-embed.spec.ts
@@ -0,0 +1,88 @@
+import { describe, expect, it } from "vitest";
+import { getSchema } from "@tiptap/core";
+import { generateHTML, generateJSON } from "@tiptap/html";
+import { Document } from "@tiptap/extension-document";
+import { Paragraph } from "@tiptap/extension-paragraph";
+import { Text } from "@tiptap/extension-text";
+import { PageEmbed } from "./page-embed";
+
+// CONTRACT tests for the PageEmbed node's parse/render round-trip
+// (page-embed.ts). The whole-page live embed stores ONLY a `sourcePageId`
+// reference; renderHTML must serialize it as `data-source-page-id` and parseHTML
+// must recover it. If this attribute mapping drifts, an embed saved to HTML loses
+// its target page on reload (the node view would have nothing to fetch).
+//
+// We assert at the editor-ext schema level using the same Tiptap utilities the
+// other editor-ext tests use (getSchema + @tiptap/html generateHTML/generateJSON
+// over a jsdom DOM), driving a real HTML -> node JSON -> HTML round-trip through
+// the node's actual addAttributes()/parseHTML()/renderHTML().
+
+// Minimal schema: a doc of blocks, plus the PageEmbed block node under test.
+const extensions = [Document, Paragraph, Text, PageEmbed];
+
+describe("PageEmbed schema", () => {
+  it("registers the pageEmbed node in the schema", () => {
+    const schema = getSchema(extensions);
+    expect(schema.nodes.pageEmbed).toBeTruthy();
+  });
+});
+
+describe("PageEmbed parse/render round-trip", () => {
+  it("recovers sourcePageId from data-source-page-id on parse (HTML -> JSON)", () => {
+    const html = `<div data-type="pageEmbed" data-source-page-id="pg-123"></div>`;
+    const json = generateJSON(html, extensions);
+
+    const node = json.content?.[0];
+    expect(node?.type).toBe("pageEmbed");
+    expect(node?.attrs?.sourcePageId).toBe("pg-123");
+  });
+
+  it("emits data-source-page-id on render (JSON -> HTML)", () => {
+    const json = {
+      type: "doc",
+      content: [{ type: "pageEmbed", attrs: { sourcePageId: "pg-456" } }],
+    };
+    const html = generateHTML(json, extensions);
+
+    expect(html).toContain('data-type="pageEmbed"');
+    expect(html).toContain('data-source-page-id="pg-456"');
+  });
+
+  it("survives a full HTML -> node -> HTML round-trip (attribute preserved)", () => {
+    const start = `<div data-type="pageEmbed" data-source-page-id="pg-789"></div>`;
+
+    // HTML -> node JSON -> HTML.
+    const json = generateJSON(start, extensions);
+    const html = generateHTML(json, extensions);
+
+    // The id survived the round-trip in the serialized HTML...
+    expect(html).toContain('data-source-page-id="pg-789"');
+
+    // ...and re-parsing the round-tripped HTML yields the same id (stable across
+    // an extra pass — no loss, no duplication).
+    const json2 = generateJSON(html, extensions);
+    expect(json2.content?.[0]?.attrs?.sourcePageId).toBe("pg-789");
+  });
+
+  it("omits data-source-page-id entirely when sourcePageId is null (renderHTML guard)", () => {
+    // renderHTML maps a null/empty id to {} (no attribute), so an embed
+    // without a target page does not emit a stray empty attribute.
+    const json = {
+      type: "doc",
+      content: [{ type: "pageEmbed", attrs: { sourcePageId: null } }],
+    };
+    const html = generateHTML(json, extensions);
+
+    expect(html).toContain('data-type="pageEmbed"');
+    expect(html).not.toContain("data-source-page-id");
+  });
+
+  it("parses a div without the attribute to a null sourcePageId (default)", () => {
+    const html = `<div data-type="pageEmbed"></div>`;
+    const json = generateJSON(html, extensions);
+
+    expect(json.content?.[0]?.type).toBe("pageEmbed");
+    // getAttribute returns null when absent; parseHTML returns it verbatim.
+    expect(json.content?.[0]?.attrs?.sourcePageId).toBeNull();
+  });
+});
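
The attribute mapping that the PageEmbed spec above pins down can be sketched without Tiptap as a pair of pure functions. This is an assumption about the shape of the node's attribute `parseHTML`/`renderHTML` callbacks, inferred from the test comments; `parseSourcePageId`, `renderSourcePageId` and `AttributeSource` are hypothetical names.

```typescript
// Sketch only: mirrors the behaviour the spec asserts, not the real page-embed.ts.

// Minimal stand-in for the one DOM Element method used while parsing.
interface AttributeSource {
  getAttribute(name: string): string | null;
}

const SOURCE_ATTR = "data-source-page-id";

// Parse side: return the attribute verbatim (null when it is absent).
function parseSourcePageId(element: AttributeSource): string | null {
  return element.getAttribute(SOURCE_ATTR);
}

// Render side: emit the attribute only when an id is set, so an embed
// without a target page carries no stray empty attribute.
function renderSourcePageId(attrs: { sourcePageId: string | null }): Record<string, string> {
  if (!attrs.sourcePageId) return {};
  return { [SOURCE_ATTR]: attrs.sourcePageId };
}

// Usage with a fake element backed by a plain object.
const fakeElement: AttributeSource = {
  getAttribute: (name) => (name === SOURCE_ATTR ? "pg-123" : null),
};
const parsedId = parseSourcePageId(fakeElement);
console.log(renderSourcePageId({ sourcePageId: parsedId }));
```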