gitmost

Author	SHA1	Message	Date
a	c4ed4a4855	fix(footnotes): strip bare definitions on rebuild; MCP full-doc + zip-import canonicalize tests (#228) Review #6 (approve-with-comments) follow-ups: 1. canonicalize step 7 now strips bare footnoteDefinitions at ANY depth (stripFootnoteDefinitionsDeep), not just footnotesList, in BOTH copies. A definition hand-authored outside a list (e.g. nested in a callout via a raw-JSON write path) was left in place while a copy was also added to the rebuilt list -> duplicate, idempotent, self-perpetuating. Runs only in the rebuild path (after the lists are stripped); the fast-path / placement-keep branch is untouched. Added a shared-corpus case (bare def nested in a callout) to pin it in both mirrors. 2. markdown-clipboard: removed the dead top-level footnoteReference check in canonicalizePastedFootnotes (an inline atom is never a top-level slice child; only the descendants scan can find it). Test coverage: 4. New MCP binding tests (full-doc-write-canonicalize.test.mjs): update_page_json and copy_page_content canonicalize the persisted full doc, asserted via a new `replacePage` seam (symmetric to the existing `mutatePage` seam) so no live collab socket is needed. Routed both writers through the seam. 5. New server spec (file-import-task.service.footnote-canonicalize.spec.ts): the zip-import path (processGenericImport) canonicalizes footnotes — real markdown->HTML->JSON via a real ImportService over a temp-dir .md file, DB trx stubbed to capture the persisted page content. FileImportTaskService had no spec before. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-28 01:39:25 +03:00
a	07ebd8c63e	fix(footnotes): address PR #232 review — fragment-safe canonicalization, plugin placement parity, dead-code removal (#228) Must-fix: - Move canonicalizeFootnotes OUT of parseProsemirrorContent. It now runs only on FULL writes (createPage, updatePageContent operation==='replace'), never on an append/prepend fragment (a fragment would lose definition-only footnotes or synthesize a bogus empty list). Add a server binding spec. - Match the live plugin's list PLACEMENT: a single already-canonical footnotesList is left exactly where it sits (the plugin never repositions a sole correct list), so the first write no longer reorders content that follows the list. Applied to BOTH the editor-ext copy and the MCP mirror; pinned by a shared golden corpus case with content after the list. - Fix MCP tool count 38 -> 39 (README x3, AGENTS.md) and the transformJs param help (add canonicalizeFootnotes/insertInlineFootnote). Simplifications: - Remove the dead duplicate re-id mechanism (deriveFootnoteId/suffix/occurrence) from the PURE canonicalizer in both copies — references are never renamed, so the derived ids were never requested; first-wins-drop is the real behaviour. This also makes the editor-ext footnote-util note about "no cross-package copy" true again. - Remove the sentinel round-trip in insertInlineFootnote: a generalized insertNodesAfterAnchor core inserts the footnoteReference node directly. - Drop the redundant per-definition deep clone in step 4 (shallow id-normalizing copy; out is already deep-cloned). Docs / architecture: - Correct the editor-ext copy's "It exists because…" header to its real consumers (server import, page.service create/update, client paste). - Note markdownToProseMirror reuse for create/update comment in collaboration.ts. - A: shared golden JSON corpus exercised by BOTH the editor-ext copy and the MCP mirror (footnote-corpus.ts / .mjs) so "the two copies behave identically" is checkable. - C: split the MCP canonicalizer into a pure mirror + footnote-authoring.ts. - B: import services persist via a different path, so left one-line consolidation comments at the call sites rather than folding (does not fall out cleanly). Tests: insertFootnote wrapper guards + docmost_transform dryRun auto-canonicalize (MCP mock), page.service create/update + append/prepend binding (server jest), shared corpus incl. nested-container reference. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 20:23:16 +03:00
a	fa929c9e86	fix(footnotes): canonicalize footnotes on server import + markdown paste (#228) The footnote canonicalizer was wired into the MCP and editor-ext write paths but NOT into the server's user-facing markdown/HTML import paths, so importing or pasting markdown with out-of-order, reused, or orphan footnotes did not canonicalize -- the exact trigger bug #228 fixes was still reproduced on import. markdownToHtml -> htmlToJson builds ProseMirror JSON directly and never runs the editor's footnoteSyncPlugin, and that plugin does not reorder an existing list, so the stored footnotes kept the source's physical definition order, retained orphans, and did not collapse reused references. Wire canonicalizeFootnotes (already exported from @docmost/editor-ext) into every server markdown/HTML -> page-JSON seam, before persisting: - ImportService.importPage (REST single-file .md/.html import) - FileImportTaskService (zip import worker) - PageService.parseProsemirrorContent (API createPage / updatePageContent) Also hook the client markdown paste: handlePaste applies a manual transaction (returns true), bypassing transformPasted/footnoteSyncPlugin, so a pasted out-of-order markdown footnote block would persist out of order. canonicalizePastedFootnotes reorders a self-contained pasted block (one that carries its own footnotesList) to reference order, deduped and orphan-free; it is deliberately scoped to whole-block pastes so a reference-only paste that reuses a footnote already defined in the target doc is left untouched. canonicalizeFootnotes is pure, idempotent and shape-safe (a doc with no footnotes is unchanged), so it is safe on every write path. Residual: when a pasted block merges into a doc that already has footnotes, ordering relative to the pre-existing footnotes is still governed by the live sync plugin (which does not reorder across the boundary). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 17:10:41 +03:00
claude_code	0b2af34029	test(integrations/client/packages): batch 2-4 unit coverage + zip-slip guard extraction Batch 2-4 of the test-strategy rollout. Test-only except one minimal, behaviour-preserving extraction in file.utils.ts. All suites green: server 82 suites/836+1todo, editor-ext 86, mcp 270, client (new files) 86. integrations (server): - file.utils.ts: extract pure `isEntryPathSafe(entryName, targetDir)` from extractZipInternal so the zip-slip/path-traversal guard is unit-testable; call site rerouted, behaviour identical (only a warn-message string merged). - file.utils.zip-safety.spec.ts: traversal/strip/__MACOSX/prefix-confusion cases (mutation-resistant: fails if containment loses the path.sep). - import-formatter / import.utils / table-utils / export utils / import.service extractTitleAndRemoveHeading: pure import/export transforms, Notion/XWiki formatting, table colspan widths (idempotent), slug/link rewriting. client: - safeRedirectPath: open-redirect guard, every reject branch independently. - buildChatMarkdown (fence anti-breakout), label-colors, normalize-label, share tree build, page URL builders, notification time-grouping (fake clock). packages: - editor-ext: deriveFootnoteId golden table, parseHtmlEmbedHeight crafted values, orphan footnote extraction. - mcp: deriveFootnoteId parity (drift guard vs editor-ext), applyTextEdits idempotency + cross-block replaceAll, diffDocs/summarizeChange on reorder. Reviewed (APPROVE): extraction behaviour-preserving, assertions mutation-resistant. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-21 18:22:15 +03:00
claude_code	81823fce1e	feat(html-embed): sandbox the embed block; split trusted trackers into an admin field Convert the htmlEmbed node from same-origin raw-HTML execution to a sandboxed iframe (sandbox="allow-scripts allow-popups allow-forms", no allow-same-origin, srcdoc) with postMessage auto-resize (validated by event.source) and an optional manual height attr. The block now runs in an opaque origin and cannot reach the viewer's cookies/session/API, so it is safe for any member. Because the block is now harmless, remove the entire admin/role gating apparatus: drop htmlEmbedAllowed/canAuthorHtmlEmbed/stripDisallowedHtmlEmbedNodes/ collectHtmlEmbedSources and every role-based strip on the write paths (collab REST/MCP + socket, page create/duplicate, import x2, transclusion unsync), along with the now-unused WorkspaceRepo/UserRepo injections and the PageService.create callerRole param. Keep one strip: prepareContentForShare still removes htmlEmbed on the anonymous public-share read path when the workspace master toggle is OFF. The workspace settings.htmlEmbed toggle is now a plain feature switch (gates the slash-menu and share rendering); when ON the block is available to all members. Add settings.trackerHead: an admin-only raw HTML/JS analytics snippet injected verbatim into the <head> of public share pages only (ShareSeoController), for trackers that genuinely need same-origin. Admin-gated via the existing CASL Manage/Settings ability; never injected into the authenticated app shell. Closes security-review findings #1, #2, #4, #5, #10 (and #3 as a security issue). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-21 02:48:41 +03:00
claude_code	2b3fc926cc	Merge remote-tracking branch 'gitea/develop' into feat/html-embed-admin # Conflicts: # apps/server/src/core/workspace/services/workspace.service.ts	2026-06-20 20:18:44 +03:00
vvzvlad	f650d2591b	fix(tree): address realtime-tree-server review findings - make addTreeNode receivers idempotent (invalidateOnCreatePage guard + buildTree dedup) so the author's self-echo no longer duplicates the node - broadcast realtime tree updates for bulk copy/duplicate and import via a root refetch: PAGE_CREATED now carries spaceId and the WS listener falls back to refetchRootTreeNodeEvent when no per-node snapshot is present - remove the now-dead client-relay inbound path (isTreeEvent/handleTreeEvent) that remained a stale-restriction-cache attack surface - honest string|null cast for a root move's parent id - add tests: buildTree dedup; onPageCreated per-node vs refetch branching Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-20 19:48:06 +03:00
claude code agent 227	8fcce6a674	feat(html-embed): per-workspace feature toggle, default OFF The admin-only raw HTML/JS embed is a deliberate stored-XSS surface, so gate the whole feature behind a workspace toggle that is OFF by default; it only works when a workspace admin explicitly enables it. - settings.htmlEmbed (boolean, default false) + workspace-update field htmlEmbed, persisted via WorkspaceRepo.updateSetting with an audit diff. Flipping it is admin-only (same Manage Settings CASL as other workspace toggles). - New gate htmlEmbedAllowed(featureEnabled, role) = featureEnabled && admin/owner. All 7 server write paths (create, duplicate, collab onStoreDocument, REST/MCP/AI updatePageContent, single + zip import, transclusion unsync) now read the workspace's settings.htmlEmbed and strip unless (toggle ON AND admin). OFF (default, or a failed/empty workspace lookup) strips htmlEmbed for EVERYONE including admins -> existing embeds are cleaned up on next save, none persist. - Client (defense-in-depth): the /html slash item is hidden unless toggle ON + admin; the NodeView executes nothing and shows a 'disabled in this workspace' placeholder when OFF; an admin Switch in Workspace Settings -> General with a description of the behavior. - docs/html-embed-admin.md documents the toggle + admin-only + fail-closed coedit (a non-admin save strips an admin's embed) + execution semantics. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-20 19:28:39 +03:00
claude code agent 227	caac5c7f36	test(html-embed): exercise the REAL admin-gate write paths + import round-trip Release-cycle test audit: the strip boundary was tested only via a stand-in helper re-implemented in the spec, so a deleted/misplaced guard kept CI green (the missing create() guard was proof). Replace it with tests against real code: - persistence.extension.onStoreDocument: real ydoc from a rich doc (columns/ table/mention/htmlEmbed) -> non-admin strip removes only htmlEmbed, every other node preserved (data-loss guard); admin keeps; empty fragment no-throw. - collaboration.handler.updatePageContent: real path, user?.role gate, decoded ydoc embed-free for non-admin, kept for admin. - transclusion unsync: member stripped, admin preserved. - editor-ext gains a vitest setup (was zero tests) + a markdown round-trip: the <!--html-embed:BASE64--> marker -> htmlEmbed node with decoded source, and hasHtmlEmbedNode matches it — pinning the marked/turndown shape the import strip relies on. tsconfig now excludes specs from the shipped dist. - Fail-closed identity: source-pinned contracts that the gate keys on fileTask.creatorId (zip) / request userId (single) / callerRole (create) / authUser.role (duplicate), and missing-user -> strip (services can't load under jest's ESM graph; helpers replay the exact predicate). Adds the verified-safe ^src/ jest moduleNameMapper (identical fail set). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-20 14:52:29 +03:00
claude code agent 227	bd28dbfe2b	feat(editor): admin-only raw HTML/CSS/JS embed node Adds an htmlEmbed block node that renders and executes raw HTML/CSS/JS in the wiki origin (e.g. an analytics tracker) — the owner-chosen variant C. Because this is stored-XSS by design, only workspace admins/owners may get such a node persisted; everyone executes it when reading. - Node (editor-ext): htmlEmbed atom/isolating block; source stored base64 in data-source for lossless HTML<->JSON round-trip. renderHTML emits only the encoded marker (never inlines raw markup), so generateHTML/export/search are not themselves injection vectors. Registered in BOTH client extensions and server tiptapExtensions. Markdown round-trip via an <!--html-embed:b64--> comment (turndown) + a marked rule. - Client NodeView: injects source and re-creates <script> elements so they actually run; edit modal; renders in read-only/share too. Slash item is admin-gated (adminOnly filtered by the user's workspace role). - SERVER ENFORCEMENT (the real control — UI gating alone is insufficient): stripHtmlEmbedNodes() removes htmlEmbed from any document persisted by a non-admin, applied at every write path that introduces content from an untrusted author: collab onStoreDocument, REST/MCP/AI updatePageContent, single-file import, zip/multi-file import, page duplication, and transclusion unsync. Page restore introduces no new content. Public share/readonly viewers render fetched (already-stripped) content and do NOT open a collab socket, so the only residual is a transient broadcast window to concurrent authenticated editors (documented). Implements docs/arbitrary-html-embed-plan.md (variant C). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-20 08:54:54 +03:00
claude_code	732aaf54f8	refactor(import): remove non-functional DOCX/PDF/Confluence import stubs These import paths relied on the private EE module that was deleted from the repo. In the community build they either threw 'enterprise license' (DOCX/PDF) or silently no-op'd (Confluence). The frontend buttons were already removed in 38064064; this cleans up the dead backend stubs. - import.service.ts: drop processDocx/processPdf methods, their dispatcher branches, the pageId computation + insertPage spread, and the now-unused moduleRef param/ModuleRef import - file-import-task.service.ts: drop the Confluence branch and the now-unused moduleRef param/ModuleRef import - import.controller.ts: restrict file extensions to .md/.html and zip sources to generic/notion; update the error message accordingly - file.utils.ts: remove Confluence from the FileImportSource enum - features.ts: remove the unused CONFLUENCE_IMPORT/DOCX_IMPORT/PDF_IMPORT feature keys The isConfluenceImport logic in import-attachment.service.ts is intentionally left in place (real shared attachment-parsing code, not a stub); its removal is a separate, riskier refactor.	2026-06-20 04:05:29 +03:00
vvzvlad	3d03417c73	fix(import): surface real error cause in /pages/import instead of generic 400 The two catch blocks in importPage() threw an opaque "Error processing file content" / "Failed to create imported page" BadRequest, hiding the real cause from the HTTP response. This made a production 400 regression impossible to diagnose without server log access, and violated the project convention that errors must never be swallowed. Extract `${err.name}: ${err.message}` into both the log (full err object kept for the stack) and the thrown BadRequestException. Inner processMarkdown/ processHTML rethrowing catches and the EE processDocx/processPdf license catches are left unchanged. Local reproduction of the happy-dom 14->20 theory failed (full import chain + 22 edge cases pass on happy-dom@20.8.9), so the root cause is still pending the now-visible reason from a recurring 400. Diagnostic script test-import.tsx added; backlog doc updated with findings.	2026-06-19 16:25:12 +03:00
Philipinho	ed0501a864	fix passing wrong object	2026-05-20 19:09:22 +01:00
Philip Okugbe	c247d4c1e3	feat(ee): PDF import (#2142) * feat: replace pdfjs-dist with firecrawl-pdf-inspector * use modified firecrawl-pdf-inspector * feat: pdf import * increase single file upload size limit * use npm package * sync * update package	2026-05-01 14:56:39 +01:00
Philip Okugbe	09c69d7a0f	feat: properly preserve table width (#2143)	2026-05-01 00:49:31 +01:00
Philipinho	ec83fc82d5	fix: refactor sanitize	2026-04-27 15:16:26 +01:00
Philip Okugbe	a6a7e4370a	feat(ee): PDF export api (#2112) * feat(ee): server side PDF export * feat: pdf export queue * sync * sync	2026-04-14 16:26:54 +01:00
Philip Okugbe	a062f7a165	fix: enhance confluence importer (#2072) * fix placeholder * min resize dimensions * fix media links * fix	2026-03-30 13:16:40 +01:00
Philip Okugbe	7981ef462e	feat(editor): audio and PDF nodes (#2064) * use local resizable * feat: aduio * support audio imports * feat: use confluence real file names * cleanup * error handling * hide notice * add audio * fix pulse * Fix import and export * unify pulse * hide in readonly mode * keywords * keyword * translations * better sort * feat: PDF embed * cleanup * remove audio menu * open active * hide focus on readonly mode * increase iframe default dimension	2026-03-28 17:33:29 +00:00
Philip Okugbe	fa4872e89e	fix(deps): package updates (#2041) * update * overrides * override * fix page update mutation * fix * cleanup * loader * fix excalidraw package * override * fix regex	2026-03-25 10:07:01 +00:00
Philip Okugbe	7520c329d0	fix notion importer (#2027) * fix notion importer * fix link selector on mobile	2026-03-15 22:06:40 +00:00
Philip Okugbe	89b94e5d19	feat: refactor link menu (#2025) * link markview - WIP * WIP * feat: refactor links * cleanup	2026-03-15 17:08:59 +00:00
Philipinho	057360c6be	fix: validate import size	2026-03-03 20:00:05 +00:00
Philipinho	b6478fee84	fix imports	2026-03-03 13:57:10 +00:00
Philipinho	9f4728e279	fix	2026-03-03 00:08:20 +00:00
Philip Okugbe	69d7532c6c	feat(ee): audit logs (#1977) feat: clickhouse driver * sync * updates	2026-03-01 01:29:03 +00:00
Philip Okugbe	b5803f42da	xwiki html import cleanup (#1969)	2026-02-24 15:53:38 +00:00
Philip Okugbe	2f97a3debc	feat: DOCX import (#1913)	2026-02-06 10:34:51 -08:00
Philip Okugbe	78b1c1a453	feat: switch to cursor pagination (#1884) * add cursor pagination function * support custom order modifier * refactor returned object * feat(db): migrate paginated endpoints to cursor-based pagination * sync * support hasPrevPage boolean * feat(client): migrate pagination from offset to cursor-based * support beforeCursor/prevCursor * wrap search results in items array for API consistency	2026-01-30 19:28:54 +00:00
Philip Okugbe	6ccb2bb872	feat(export): add metadata file to preserve page icons and ordering on import (#1877) * feat(export): add metadata file to preserve page icons and ordering on import - Export includes `docmost-metadata.json` - Import reads metadata to restore icons and sort siblings by original position * cleanup * bonus fixes * handle unknown prosemirror nodes * add docmost app version	2026-01-27 16:39:39 +00:00
Philip Okugbe	54775f537d	fix: handle malformed URLs gracefully during import/export (#1868) * Handling malformed URLs gracefully * Allow import of invalid URLs, but adding logging. --------- Co-authored-by: gpapp <gergely.papp@itworks.hu>	2026-01-25 00:48:43 +00:00
Philip Okugbe	efb0a9317b	feat: allow upload of large files (#1862) * Allow upload of large files * feat: createByteCountingStream utility function. --------- Co-authored-by: gpapp <gergely.papp@itworks.hu>	2026-01-22 20:00:58 +00:00
Philip Okugbe	47097969a0	fix: use subquery (#1833) - enhance file tasks list endpoint	2026-01-13 15:58:26 +00:00
Philipinho	ab96672ecd	fix	2025-12-02 13:14:03 +00:00
Philip Okugbe	9fb16bc842	feat(EE): AI vector search (#1691) * WIP * AI module - init * WIP * sync * WIP * refactor naming * new columns * sync * sync * fix search bug * stream response * WIP * feat embeddings sync * refine * Add workspaceId to page events * refine * WIP * add translation string * sync * reset ai answer on query change * hide AI search in cloud * capture streaming error * sync	2025-12-01 11:50:25 +00:00
Philip Okugbe	c3b350d943	fix: zip extraction validation (#1753) * fix: zip extraction validation * fix	2025-12-01 11:37:59 +00:00
Philip Okugbe	520c07a0bc	fix: generic page import hierarchy (#1747) * fix page hierarchy * fix	2025-11-29 11:50:02 +00:00
Philip Okugbe	bf8cf6254f	feat: Typesense search driver (EE) (#1664) * feat: typesense driver (EE) - WIP * feat: typesense driver (EE) - WIP * feat: typesense * sync * fix	2025-10-07 17:34:32 +01:00
Philipinho	cf5bbb10df	fix import html processing	2025-09-18 15:34:13 +01:00
Philip Okugbe	9ac180f719	fix: enhance page import (#1570) * change import process * fix processor * fix page name in notion import * preserve confluence table bg color * sync	2025-09-17 23:50:27 +01:00
Philipinho	f413720e15	- sync - reinstantiate S3 client to fix file upload errors during import - delete import zip file after use	2025-09-14 03:00:23 +01:00
Philip Okugbe	7ada3cb1f9	fix: page import task (#1551) * fix import * - fix notion importer - support notion page icon import - fix horizontal rule css - rename service file * sync * 3 mins delay	2025-09-13 03:14:59 +01:00
Philipinho	dc0650289d	sync	2025-09-04 15:07:01 -07:00
Philipinho	d43ee77617	remove debug log	2025-09-04 09:40:17 -07:00
Philipinho	5d91eb4f5f	feat: queue imported attachments for indexing	2025-09-04 09:38:30 -07:00
Philip Okugbe	1f797c3d27	fix: confluence drawio import (#1518) * POC * WIP - working * WIP * WIP * sync * fix drawio preview image	2025-09-03 05:19:09 +01:00
Philip Okugbe	3b85f4b616	fix: enforce C collation for page position ordering to ensure consistent behavior in Postgres 17+ (#1446) - Add explicit C collation to position ordering queries to fix incorrect page placement in PostgreSQL 17+ - Ensures consistent ASCII-based ordering regardless of database locale settings - Fixes issue where new pages were incorrectly placed at random positions instead of bottom	2025-08-04 09:49:29 +01:00
Philip Okugbe	4dfed2b2af	queue import attachments upload (#1353)	2025-07-19 18:00:06 +01:00
Philip Okugbe	6d024fc3de	feat: bulk page imports (#1219) * refactor imports - WIP * Add readstream * WIP * fix attachmentId render * fix attachmentId render * turndown video tag * feat: add stream upload support and improve file handling - Add stream upload functionality to storage drivers\n- Improve ZIP file extraction with better encoding handling\n- Fix attachment ID rendering issues\n- Add AWS S3 upload stream support\n- Update dependencies for better compatibility * WIP * notion formatter * move embed parser to editor-ext package * import embeds * utility files * cleanup * Switch from happy-dom to cheerio * Refine code * WIP * bug fixes and UI * sync * WIP * sync * keep import modal mounted * Show modal during upload * WIP * WIP	2025-06-09 04:29:27 +01:00
Philip Okugbe	287b833838	feat: support pasting markdown (#606)	2025-01-04 16:57:36 +00:00
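
Commits fa929c9e86 and 07ebd8c63e describe footnote canonicalization as putting definitions in reference order, keeping the first definition when an id is duplicated, and dropping orphans. A minimal sketch of that ordering rule follows. It assumes a simplified flat model, not the ProseMirror JSON the real canonicalizeFootnotes operates on; the FootnoteDef type and the canonicalizeDefinitions name are hypothetical and do not come from the repository.

```typescript
// Hypothetical flat model: a footnote definition is just an id plus its text.
interface FootnoteDef {
  id: string;
  text: string;
}

// Returns definitions in order of first reference, first-wins on duplicate ids,
// with definitions that are never referenced (orphans) dropped.
function canonicalizeDefinitions(
  referenceIds: string[],
  definitions: FootnoteDef[],
): FootnoteDef[] {
  // First definition wins when an id is defined more than once.
  const firstById = new Map<string, FootnoteDef>();
  for (const def of definitions) {
    if (!firstById.has(def.id)) {
      firstById.set(def.id, def);
    }
  }

  const ordered: FootnoteDef[] = [];
  const seen = new Set<string>();
  for (const id of referenceIds) {
    if (seen.has(id)) continue; // a reused reference maps to one definition
    seen.add(id);
    const def = firstById.get(id);
    if (def !== undefined) ordered.push(def);
  }
  return ordered;
}
```

Because the output depends only on the reference order and the first definition per id, feeding the result back in with the same references returns the same list, and empty input returns an empty list. That matches the "pure, idempotent and shape-safe" properties the commit message claims, at least for this reduced model.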

59 Commits
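
Commit 0b2af34029 extracts a pure `isEntryPathSafe(entryName, targetDir)` zip-slip guard and notes that its tests fail "if containment loses the path.sep". A minimal sketch of such a containment check follows. It is an assumption about the general technique, not the repository's implementation: only the function name and parameters come from the commit message, and the `__MACOSX` and strip handling mentioned there are not modelled.

```typescript
import * as path from "node:path";

// Sketch of a path-traversal (zip-slip) guard: an archive entry is accepted
// only if it resolves to a location strictly inside the extraction directory.
function isEntryPathSafe(entryName: string, targetDir: string): boolean {
  const root = path.resolve(targetDir);
  const resolved = path.resolve(root, entryName);
  // The trailing path.sep matters: without it, a sibling directory such as
  // "/tmp/out-evil" would pass a plain prefix check against "/tmp/out".
  return resolved.startsWith(root + path.sep);
}
```

For example, `isEntryPathSafe("docs/page.md", "/tmp/out")` is accepted, while `"../etc/passwd"`, `"../out-evil/x.md"` and an absolute entry such as `"/etc/passwd"` are rejected.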