Commit Graph

753 Commits

Author SHA1 Message Date
vvzvlad
1f2d20244e feat(ai-chat): show RAG indexing coverage in AI settings
Display "Indexed N of M pages" on the AI provider settings page so admins
can see how much of the wiki is covered by vector-RAG semantic search.

- page-embedding.repo: add countIndexedPages() — distinct non-deleted pages
  that have stored embeddings in the workspace
- page.repo: add countByWorkspace() — total non-deleted pages
- ai-settings.service: compute both counts in getMasked() (Promise.all) and
  return them with the masked settings; inject PageEmbeddingRepo + PageRepo
- MaskedAiSettings / IAiSettings: add indexedPages + totalPages
- ai-provider-settings: render a dimmed coverage line under "Embedding model"
- i18n: add the "Indexed {{indexed}} of {{total}} pages" key (en-US, ru-RU)
2026-06-17 23:18:51 +03:00
vvzvlad
cb2b7a9851 fix(ai-chat): rebrand default agent persona to Gitmost
The default system prompt introduced the assistant as embedded in
"Docmost". Update DEFAULT_PROMPT to say "Gitmost" so the agent
presents itself with the correct product name.
2026-06-17 23:12:34 +03:00
vvzvlad
551f975886 fix(collab): use '-' instead of ':' in agent page-history jobId
BullMQ rejects custom job IDs containing ':' (Redis key separator),
throwing "Custom Id cannot contain :" inside the onStoreDocument hook
for every agent edit. This broke agent-driven page saves (MCP
create_page runs as actor='agent') with HTTP 400.

Switch the agent dedup suffix from `${page.id}:agent` to
`${page.id}-agent`. The jobId is only used as a BullMQ dedup key and is
never parsed by the history processor; page.id is a UUID, so the
hyphenated id cannot collide with a human job whose id is a bare page.id.
2026-06-17 17:38:32 +03:00
vvzvlad
afd2248a75 feat(ai-chat): tolerate markdown in edit_page_text/insert_node locators
Locators (edit_page_text `find`, insert_node `anchorText`) are matched
against the document's plain text, so a model-supplied locator carrying
markdown wrappers (**bold**, *italic*, `code`, [t](url)) or trailing emoji
never matched and the edit/insert failed. Add stripInlineMarkdown() and a
fallback: try the locator verbatim first (exact match wins, so literal
asterisks/underscores still work), and only on zero matches retry with a
markdown-stripped form. The ambiguity guard runs on the post-fallback count,
and `replace` / inserted node content are never stripped, so no formatting is
lost. Failed edits gain an atom-aware reason plus a bounded "closest block
text" hint; the insert_node "anchor not found" error now points at plain-text
anchors / anchorNodeId.

New packages/mcp/src/lib/text-normalize.ts (+ unit tests); wired into
json-edit.ts and node-ops.ts; tool descriptions updated. Tests: 212 pass.
2026-06-17 15:44:19 +03:00
vvzvlad
fc9088b74d fix(ai-chat): cross-mark text edits, partial batches, JSON-string node parity
edit_page_text (applyTextEdits) now matches at the inline-block level instead of
per text node, so a find/replace may cross bold/italic/link boundaries; the
replacement inherits marks from the unchanged common prefix/suffix via a diff
splice. Atom (non-text inline) slots can never be part of a match, making the
U+FFFC placeholder collision-safe, and inserted text never inherits an atom's
marks.

The edit batch is no longer all-or-nothing: applyTextEdits returns
{ doc, results, failed } and applies what it can; editPageText writes only on a
real change (no spurious history version for a no-op) and throws an aggregated,
actionable error only when nothing applied.

The AI-chat insert_node / patch_node / update_page_json tools now JSON.parse a
node/content argument that arrives as a string, matching the standalone MCP
server (this is what made insert_node fail under OpenAI tool calls).

Tool descriptions gain concrete ProseMirror examples and reflect the new
edit_page_text behavior. Adds/updates json-edit unit tests (183 pass).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 06:57:58 +03:00
vvzvlad
0a9788e89a feat(collab): separate agent edits from human edits in page history
Page-history snapshots are debounced/coalesced (one per 1–5 min window,
jobId=page.id). A human edit followed by an agent edit in the same window
collapsed into a single snapshot, losing both the pre-agent human state and
a deterministic record of the agent's result.

Two provenance-aware boundaries now bracket an agent intervention:
- Before: on a user->agent transition, onStoreDocument synchronously pins the
  current (pre-agent) human content as its own history version tagged 'user',
  inside the page-write transaction, before the agent overwrites it.
- After: agent stores enqueue an immediate (delay 0), source-keyed history job
  (jobId=`${pageId}:agent`) so the agent's result snapshots deterministically
  as 'agent' and a later human edit (jobId=page.id) cannot coalesce/retag it.

Also add an `id desc` tie-break to findPageLastHistory so "last history" stays
deterministic when two snapshots share a created_at, consistent with
findPageHistoryByPageId.

Known trade-offs (Variant 1): the delay-0 worker re-reads the row, leaving a
millisecond mis-tag window; multiple agent edits in one turn may yield multiple
versions. The reverse agent->human boundary is intentionally out of scope.
2026-06-17 06:40:28 +03:00
vvzvlad
b0997cb749 feat(ai-chat)!: drop updateComment from the agent toolset
Editing an existing comment's text is irreversible (not version-tracked),
which breaks the agent's "only reversible operations" invariant. Remove the
updateComment tool that was added in the toolset-expansion change, leaving the
agent at 40 tools (comments: create/resolve only).

- Remove the updateComment tool from forUser().
- Remove updateComment from the DocmostClientLike interface.
- Reword SAFETY_FRAMEWORK: comments are create/resolve only; drop the
  comment-text-edit exception (keep the public-sharing one); keep the
  no-permanent-deletion guarantee and anti-prompt-injection rules.
- Tests: assert updateComment is NOT exposed (mirrors the deleteComment guard).
- docs(ai-agent-chat-plan): move updateComment to the "not exposed" list.
2026-06-17 06:03:19 +03:00
vvzvlad
6ec91c8a2c feat(ai-chat): expose full Docmost toolset to the in-app agent
Grow the agent tool registry in forUser() from 10 to 41 tools, wiring all
remaining @docmost/mcp client capabilities: reads (workspace/spaces/pages/
sidebar/outline/json/node/table/comments/shares/history/diff/export) and
reversible writes (editPageText, patch/insert/delete node, updatePageJson,
table ops, copy/import content, share/unshare, restorePageVersion,
updateComment, transformPage).

Deliberately NOT exposed: deleteComment (irreversible hard delete) and the
filePath-based image tools (uploadImage/insertImage/replaceImage — useless
and unsafe for a server-side agent). transformPage omits the deleteComments
option from its schema and never passes it, so the comment-deletion path is
unreachable from the agent.

- Extend DocmostClientLike with the new method signatures.
- Update SAFETY_FRAMEWORK to describe the broader toolset while keeping the
  no-permanent-deletion guarantee and anti-prompt-injection rules; flag that
  comment-text edits are not version-tracked and sharing is public.
- Add guardrail tests: no deleteComment tool; transformPage schema rejects
  deleteComments.
- docs(ai-agent-chat-plan): record the toolset expansion and a backlog item
  to support image insertion by URL via the existing SSRF guard.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 05:14:45 +03:00
vvzvlad
65f0713a70 fix(ai-chat): live streaming, open-page context, any-dimension embeddings" -m "- streaming: give useChat a STABLE store id (chatId ?? per-mount generated)
so the v6 hook stops re-creating its store every render on a new chat
  (which wiped the optimistic user message + streamed deltas, so nothing
  showed until the turn finished). Also send X-Accel-Buffering:no + flushHeaders.
- context: client sends the currently-open page {id,title}; the system prompt
  tells the agent which page 'this page' refers to (it reads it via its
  CASL-scoped getPage tool; id is prompt-context only, no server-side fetch).
- embeddings: make page_embeddings.embedding dimension-agnostic (drop the
  HNSW index + ALTER to vector), remove the hard 1536 guard, filter search by
  model_dimensions — so 3072-dim (and any) models index instead of being
  skipped. Seq-scan <=> search (wiki scale); existing pages reindex on next edit.
2026-06-17 04:58:06 +03:00
vvzvlad
a4b7919753 fix(ai-chat): OpenAI Chat Completions for multi-turn + provider settings, stream UX & errors" -m "Live-stand fixes (OpenRouter / OpenAI-compatible):
- openai provider: use .chat() (Chat Completions) instead of the default callable
  (Responses API), which gateways reject on multi-turn -> 400.
- updateAiProviderSettings: assemble settings.ai.provider via jsonb_build_object
  with ::text-cast bound params + jsonb_typeof self-heal (postgres.js was
  double-encoding it into an array; the ::text cast avoids 'could not determine
  data type of parameter').
- chat agent: drop the hard maxOutputTokens cap (truncated complex tool calls);
  keep a tiny cap only on the test-connection ping.
- testConnection + chat stream: surface the real provider error (statusCode+message)
  to logs and the UI instead of generic masks; never log the API key.
- chat UI: typing indicator, incremental streaming render, tool 'running' status, Stop.

Also bundled (prior uncommitted ai-chat work):
- history 'AI agent' provenance badge; vector RAG (pgvector image + page_embeddings
  + AI_QUEUE indexer + space-scoped semanticSearch); external MCP servers backend
  (@ai-sdk/mcp client, SSRF IP-pinning, encrypted headers, admin CRUD/Test);
  yjs duplicate-instance fix via pnpm patch (single CJS instance server-side).
2026-06-17 04:28:29 +03:00
vvzvlad
44b340dc1a feat(ai-chat): agent write tools, provenance wiring, chat panel + provider settings UI" -m "Backend:
- Add reversible write tools to the per-user agent toolset (page create/update/
  move/soft-delete; comment reply + resolve), exposed under the user's JWT and
  enforced by Docmost CASL; no permanent/force delete (D3).
- Non-spoofable agent provenance: sign actor/aiChatId into the access and collab
  tokens (TokenService), propagate via jwt.strategy onto the request, and set
  pages.last_updated_source/last_updated_ai_chat_id on REST create/update/move and
  comments.created_source/resolved_source/ai_chat_id.
- packages/mcp: add an optional getCollabToken provider (content-edit provenance)
  and guard against empty tokens; service-account /mcp path unchanged.

Frontend:
- Admin 'AI / Models' settings section: provider/model/embedding/base URL, a
  write-only API key field, system prompt, and Test connection.
- AI chat panel (useChat + DefaultChatTransport): conversation list, streamed
  messages, tool-call action log and page citations; header entry point gated on
  settings.ai.chat.

Compile-verified (server nest build + client tsc/vite); not yet live-tested.
Known gaps: history 'AI agent' badge (C3), vector RAG (D), external MCP (E);
chat tool-card citation links pending a fix.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 02:39:26 +03:00
vvzvlad
683da7a4c5 feat(ai-chat): per-user AI agent backend — LLM config, read-only agent, provenance schema
WIP checkpoint of the gitmost AI-chat backend (plan stages A + B1 + B3a).
The agent acts under the requesting user's JWT (Docmost CASL enforces page
access); the external service-account /mcp endpoint is untouched.

LLM provider config (A2-A4):
- integrations/crypto: AES-256-GCM SecretBoxService (key derived from APP_SECRET,
  per-record salt/iv; clear error on rotation instead of crashing).
- ai_provider_credentials table/repo/types: encrypted API key stored outside
  workspace settings/baseFields, write-only (never returned by any endpoint).
- integrations/ai: per-workspace AI SDK v6 provider driver (openai/gemini/ollama),
  admin-gated GET(masked)/PATCH(write-only key)/Test endpoints; settings.ai.provider
  holds non-secret config incl. systemPrompt. Removed unused AI_* env getters (DB is
  the single source of truth).

Chat module (A1, A5-A8):
- ai_chats/ai_chat_messages repos (workspace-scoped, soft-delete, tsv never selected).
- core/ai-chat: CRUD + POST /ai-chat/stream (Fastify hijack + AI SDK v6
  pipeUIMessageStreamToResponse, abort on disconnect, persist user/assistant msgs).
- Agent loop: streamText + stepCountIs(8); read tools searchPages/getPage via a
  per-request DocmostClient over loopback REST under the user's minted access token.
- Gate settings.ai.chat (+ 503 when provider unconfigured); buildSystemPrompt with a
  non-removable safety/anti-prompt-injection framework. Per-user rate limit.

Per-user auth (B1):
- @docmost/mcp DocmostClient gains an additive getToken variant (carry a user JWT,
  re-fetch on 401) and exports DocmostClient; the email/password service-account path
  (external /mcp, stdio) is unchanged.

Agent-edit provenance backbone (B3a):
- Migration: pages/page_history (last_updated_source, last_updated_ai_chat_id) and
  comments (created_source, ai_chat_id, resolved_source).
- Signed actor/aiChatId claim in the collab token; onAuthenticate propagates it,
  onStoreDocument writes it with a sticky agent marker, saveHistory copies it.

Migrations auto-run on boot (additive). Write tools, frontend, RAG and external MCP
servers are not in this checkpoint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 01:36:41 +03:00
vvzvlad
1f5987d6b0 feat(mcp): serve embedded community MCP server at /mcp
Replace the removed enterprise EE MCP (private apps/server/src/ee submodule,
license-gated /mcp route) with our docmost-mcp, vendored as an isolated ESM
workspace package and served by the server over HTTP — no enterprise license.

Backend:
- Add packages/mcp (@docmost/mcp): vendored docmost-mcp refactored into a
  side-effect-free createDocmostMcpServer() factory (38 tools preserved),
  stdio entry kept in stdio.ts, Streamable-HTTP session manager in http.ts.
- Add apps/server McpModule: @Post/@Get/@Delete('mcp') (served at /mcp via the
  existing global-prefix exclude), @SkipTransform + reply.hijack to bridge raw
  Fastify req/res into the SDK transport. The module dynamically imports the
  ESM-only package from CommonJS via a Function-indirected import resolved with
  require.resolve + file:// URL. Gated by the workspace ai.mcp toggle, a
  service-account (MCP_DOCMOST_EMAIL/PASSWORD/API_URL) and optional MCP_TOKEN;
  per-session idle eviction (MCP_SESSION_IDLE_MS).
- Drop the enterprise license check on mcpEnabled in workspace.service.
- Dockerfile: copy packages/mcp into the production image.
- .env.example: document MCP_DOCMOST_*, MCP_TOKEN, MCP_SESSION_IDLE_MS.

Frontend:
- Recreate the community "AI & MCP" workspace-settings panel (mcp-settings.tsx):
  admin-only toggle on settings.ai.mcp with optimistic update, copyable
  ${APP_URL}/mcp URL; wired into workspace-settings page. Reuses existing i18n.

Fixes:
- Pin packages/mcp tiptap deps to 3.20.4 (matching the client) and inline
  getStyleProperty, preventing a duplicate @tiptap/core@3.26.1 from leaking into
  the client editor via pnpm shamefully-hoist (was breaking apps/client tsc).
2026-06-16 23:54:53 +03:00
vvzvlad
c758a36dd2 feat(comments): implement comment resolution for the community build
Add comment resolve/re-open as a community feature, written from scratch on top
of the infrastructure already present in the community codebase: the
resolved_at/resolved_by_id columns, the COMMENT_RESOLVED notification job, the
resolveCommentMark collaboration handler, the commentResolved websocket event,
the comment service/types and the Open/Resolved tabs. No Enterprise-Edition code
is reused and there is no EE feature gating — resolving is available to anyone
who can comment.

Backend:
- add POST /comments/resolve (ResolveCommentDto) guarded by validateCanComment;
  reject resolving replies
- add CommentService.resolveComment: set/clear resolvedAt/resolvedById, sync the
  inline comment mark via collaboration handleYjsEvent, queue
  COMMENT_RESOLVED_NOTIFICATION (only when another user resolves), emit the
  commentResolved websocket event and write a resolve/reopen audit log

Frontend:
- add useResolveCommentMutation with optimistic update + rollback
- add ResolveComment toggle button
- wire the resolve button and menu item into comment-list-item / comment-menu,
  gated on canComment for parent comments

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 23:38:15 +03:00
vvzvlad
4f05fb5d2e chore(fork): drop private ee submodule and retarget CI to GHCR
Remove the private apps/server/src/ee git submodule (github.com/docmost/ee)
and the now-empty .gitmodules so that `git clone --recurse-submodules` and CI
checkout no longer fail with 404. The server loads EE only via guarded runtime
require(), so the build succeeds without it (community edition).

Rewrite .github/workflows/release.yml for the fork:
- drop the GitHub App token step and `submodules: recursive` checkout
- publish to GHCR (ghcr.io/vvzvlad/gitmost) via the built-in GITHUB_TOKEN
  instead of Docker Hub (docmost/docmost) — no extra secrets required
- add `packages: write` permission and an IMAGE env var
- log in as github.repository_owner; rename release tarballs to gitmost-*

Repoint the Dockerfile image source label to the fork.
2026-06-16 21:15:47 +03:00
Philipinho
ef04c22aea sync 2026-05-28 16:57:59 +01:00
Philipinho
2b68879e72 0.90.1 2026-05-28 16:36:18 +01:00
Philip Okugbe
33895b0607 bug fixes (#2250)
* util

* fix page position collation

* support fixed toolbar in templates editor

* date localization

* fix clipped emoji in templates editor

* fix page updated time object

* fix flickers

* fix: remove redundant breadcrumb from destination modal
2026-05-28 16:20:37 +01:00
Philipinho
830b5b4d45 fix synced block 2026-05-25 19:17:14 +01:00
Philipinho
13a7f1372f fix: update pdf-inspector package 2026-05-21 13:44:11 +01:00
Philip Okugbe
4295ea09f6 feat(storage): add Azure Blob Storage driver (#2222) 2026-05-21 12:18:58 +01:00
Philipinho
ed0501a864 fix passing wrong object 2026-05-20 19:09:22 +01:00
Philipinho
aa0c37bd68 sync 2026-05-20 18:41:23 +01:00
Philip Okugbe
a5858bc470 fix: update packages (#2221) 2026-05-20 18:30:15 +01:00
Philipinho
adb1f27767 v0.90.0 2026-05-20 16:55:23 +01:00
Philip Okugbe
6cf8101ab3 feat(ee): templates (#2215)
* feat(ee): templates
* fix tree
* fix
2026-05-19 02:41:52 +01:00
Philipinho
0d6538ab1a feat: iframe configuration 2026-05-18 22:02:31 +01:00
Philipinho
03c1e8c4ed fix collab module 2026-05-14 15:06:51 +01:00
Peter Tripp
932c1ad5b7 Better trash (#2190)
* Better trash

I recently lost a bunch of time editing and searching for pages that were actually in the Trash. Docmost intentionally tries to not link to Trashed pages, but the url of that Trashed page and any inbound links still work.  This makes it clearer when a page you are interacting with is in the Trash.

- /trash
  - Refactored banner into `trash-banner.tsx`
  - Refactored "Restore" modal into `use-restore-page-modal.tsx`
- Page (when isDeleted)
  - Add: `trash-banner.tsx`
  - Add breadcrumbs: `Parent / Child / Page (Deleted)`
  - Change: Deleted Pages are read-only
  - Replace "Move to Trash" with "Restore" in page menu (invokes `use-restore-page-modal`)

I tried very hard to keep this simple and re-use existing translation strings wherever possible.

* cleanup

---------

Co-authored-by: Philipinho <16838612+Philipinho@users.noreply.github.com>
2026-05-14 14:41:10 +01:00
Philip Okugbe
f758091b2a perf(permissions): cache space role and page edit lookups (#2208) 2026-05-14 13:11:28 +01:00
Philip Okugbe
f4af4c3fc0 feat(editor): add page break node (#2202) 2026-05-14 03:48:13 +01:00
Philipinho
3b983a27f6 sync 2026-05-14 03:01:55 +01:00
Philip Okugbe
299a9ca3c8 fix: bug fixes (#2201)
* fix(editor): hide transclusion borders and reset spacing in read-only mode

* feat(share): add full width toggle for shared pages

* feat(share): support resizing sidebar on shared pages

* fix: auto redirect if there is only one SSO provider.
- fix tighten sso redirect
- fix share tree margin

* sync

* package overrides
2026-05-14 02:54:00 +01:00
Philip Okugbe
31ed0df3f7 feat(tree): replace sidebar tree (react-aborist) with custom tree implementation (#2199)
* feat(tree): replace react-arborist with custom tree implementation

* feat(tree): keyboard arrow navigation between rows

* feat(emoji-picker): focus search input on open

* refactor(emoji): switch to @slidoapp/emoji-mart fork for accessibility

* feat(tree): Home/End and typeahead keyboard navigation

* feat(tree): roving tabindex and * to expand sibling subtrees

* feat(tree): Space activation and ARIA refinements

* fix(tree): move treeitem role to focusable row + aria-current
2026-05-13 23:01:04 +01:00
Philip Okugbe
a689cca7a0 feat: page labels/tags (#2188)
* feat: labels (WIP)
* full implementation
2026-05-10 18:14:15 +01:00
Philip Okugbe
537e45bc11 feat: page details section and backlinks (#2186)
* feat: page details section and backlinks
2026-05-09 17:03:08 +01:00
Philip Okugbe
bdc369fce0 feat(editor): fixed toolbar preference (#2185)
* feat(editor): fixed toolbar preference

* remove key

* cleanup translation strings

* update axios
2026-05-09 13:27:03 +01:00
Philip Okugbe
2d8b470495 feat(editor): indentation (#2174)
* switch to default codeblock tab handling

* feat(editor): indentation
2026-05-08 21:40:37 +01:00
Philip Okugbe
de60aa7e61 feat: synced blocks (transclusion) (#2163)
* feat: synced blocks (transclusion)

* fix:remove name

* make placeholders smaller

* feat: enforce strict transclusion schema

* fix: scope synced blocks to workspace, gate unsync on edit permission

* fix collab module error
2026-05-08 13:23:16 +01:00
Philipinho
ec51ca7815 fix request ip 2026-05-07 22:09:32 +01:00
Philipinho
2b63137217 mail 2026-05-07 18:13:24 +01:00
Philip Okugbe
73dc62bca3 update react-email (#2149) 2026-05-04 22:26:53 +01:00
Philipinho
3c74bb3dee update package 2026-05-04 22:09:19 +01:00
Sarthak Chaturvedi
fe18f22dc6 fix: prevent code block deletion when adding inline comments in read mode (#2146) 2026-05-04 21:14:21 +01:00
Philipinho
fcef0c6b96 fix: S3 2026-05-04 20:57:35 +01:00
Philipinho
17f3158a3b update aws packages 2026-05-01 20:00:20 +01:00
Philipinho
b74ca00bfd sync 2026-05-01 14:57:32 +01:00
Philip Okugbe
c247d4c1e3 feat(ee): PDF import (#2142)
* feat: replace pdfjs-dist with firecrawl-pdf-inspector

* use modified firecrawl-pdf-inspector

* feat: pdf import

* increase single file upload size limit

* use npm package

* sync

* update package
2026-05-01 14:56:39 +01:00
Philip Okugbe
641ce142df feat(ee): SCIM (#1347)
* SCIM - init (EE)

* accept db transaction

* sync

* Content parser support for scim+json

* patch scimmy

* sync

* return early if userIds is empty

* sync

* SCIM db table

* fixes

* scim tokens

* backfill

* feat(audit): add scim token events

* rename scim migration

* fix

* fix translation

* cleanup
2026-05-01 14:53:30 +01:00
Philipinho
a0aea43e25 feat(saml): allow disabling RequestedAuthnContext via env var
Adds SAML_DISABLE_REQUESTED_AUTHN_CONTEXT env var, passed through
    to the SAML strategy's disableRequestedAuthnContext option.
    Defaults to existing behavior (element sent). Set to true to omit
    the element when the IdP authenticates the user with a method that
    does not match (e.g. MFA, FIDO, passwordless), which would
    otherwise cause AADSTS75011 with Microsoft Entra ID.
2026-05-01 11:47:03 +01:00