The in-field Clear for the API key fields is implemented and merged via
this branch, so the backlog plan is no longer needed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Delete the backlog documentation that described the removal of non‑functional DOCX, PDF, and Confluence import features now that the code changes have been merged.
The two catch blocks in importPage() threw an opaque "Error processing file
content" / "Failed to create imported page" BadRequest, hiding the real cause
from the HTTP response. This made a production 400 regression impossible to
diagnose without server log access, and violated the project convention that
errors must never be swallowed.
Extract `${err.name}: ${err.message}` into both the log (full err object kept
for the stack) and the thrown BadRequestException. Inner processMarkdown/
processHTML rethrowing catches and the EE processDocx/processPdf license
catches are left unchanged.
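
A minimal sketch of the catch-block pattern described above, not the actual importPage() code. The local BadRequestException class stands in for the NestJS one so the snippet runs on its own, and describeError/processContent are hypothetical names.

```typescript
// Stand-in for NestJS's BadRequestException so the sketch is self-contained.
class BadRequestException extends Error {
  readonly status = 400;
  constructor(message: string) {
    super(message);
    this.name = "BadRequestException";
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical helper: turn an unknown thrown value into "Name: message".
function describeError(err: unknown): string {
  if (err instanceof Error) {
    return `${err.name}: ${err.message}`;
  }
  return `NonError: ${String(err)}`;
}

// Hypothetical wrapper mirroring the shape of one importPage() catch block.
function processContent(
  run: () => string,
  log: (message: string, err: unknown) => void,
): string {
  try {
    return run();
  } catch (err) {
    const reason = describeError(err);
    // The full error object goes to the log so the stack trace survives.
    log(`Error processing file content: ${reason}`, err);
    // The same reason is surfaced in the HTTP 400 body instead of being hidden.
    throw new BadRequestException(`Error processing file content: ${reason}`);
  }
}
```

With this shape the client sees, for example, "Error processing file content: TypeError: x is not iterable" rather than the bare prefix.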
Local reproduction of the happy-dom 14->20 theory failed (full import chain
+ 22 edge cases pass on happy-dom@20.8.9), so the root cause stays open until
the 400 recurs and the response exposes the real error. Diagnostic script
test-import.tsx added; backlog doc updated with findings.
Add a dedicated section describing the licensing conflict between the AGPL‑3.0‑licensed web client and App Store DRM/usage rules. Explain why this is a non‑technical blocker, outline possible distribution approaches (server‑loaded client, OTA updates, PWA, sideload), and recommend confirming the chosen path before implementing any iOS wrapper code.
Add markdown files describing the per‑user authentication mechanism and the ability to expand or collapse all nodes in the page tree, improving guidance for developers working with the MCP backlog feature.
Add two new backlog documentation files:
- ai-chat-collapse-on-page-focus.md describing auto‑collapse behavior for the AI chat window.
- comments-panel-density.md outlining UI density improvements for the comments panel.
Add a backlog design note for making page-tree realtime updates
server-authoritative instead of client-relayed.
Problem: page content syncs via Yjs/Hocuspocus (server-authoritative),
but tree create/move/delete is broadcast by the originating browser only,
so non-UI creation paths (AI agent, MCP, REST API, import) and lost-event
races leave other clients' sidebars stale.
The note specifies a WsService.emitTreeEvent broadcaster, WsTreeService
broadcast helpers, a PageWsListener on PAGE_CREATED/SOFT_DELETED/DELETED/
MOVED/RESTORED, event-payload enrichment to avoid the in-transaction
re-fetch race, a dedicated PAGE_MOVED event, removal of the client relay,
plus edge cases, work breakdown, tests, alternatives and open questions.
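
A rough sketch of the server-authoritative flow the note describes, under stated assumptions: the class names, the payload fields, the operation names and the "PAGE_" prefix on the event names are all illustrative, and an in-memory socket list replaces the real WebSocket layer.

```typescript
// Event names are assumed from the note's PAGE_CREATED/SOFT_DELETED/MOVED list.
type PageEventName = "PAGE_CREATED" | "PAGE_MOVED" | "PAGE_SOFT_DELETED";

// Enriched payload: carries everything the broadcast needs, so the listener
// never re-reads the page while the originating transaction may still be open.
interface PageEventPayload {
  pageId: string;
  spaceId: string;
  parentPageId: string | null;
  title: string;
}

type TreeOperation = "addTreeNode" | "moveTreeNode" | "deleteTreeNode";

interface TreeEvent {
  operation: TreeOperation;
  spaceId: string;
  payload: PageEventPayload;
}

// In-memory stand-in for a connected client subscribed to one space.
interface FakeSocket {
  spaceId: string;
  received: TreeEvent[];
}

class TreeBroadcaster {
  private readonly sockets: FakeSocket[] = [];

  connect(spaceId: string): FakeSocket {
    const socket: FakeSocket = { spaceId, received: [] };
    this.sockets.push(socket);
    return socket;
  }

  // Delivers the event to every client watching the affected space.
  emitTreeEvent(event: TreeEvent): void {
    this.sockets
      .filter((socket) => socket.spaceId === event.spaceId)
      .forEach((socket) => socket.received.push(event));
  }
}

const OPERATION_BY_EVENT: Record<PageEventName, TreeOperation> = {
  PAGE_CREATED: "addTreeNode",
  PAGE_MOVED: "moveTreeNode",
  PAGE_SOFT_DELETED: "deleteTreeNode",
};

// Reacts to domain events regardless of who caused them (UI, agent, MCP, import).
class PageTreeListener {
  private readonly broadcaster: TreeBroadcaster;

  constructor(broadcaster: TreeBroadcaster) {
    this.broadcaster = broadcaster;
  }

  handle(name: PageEventName, payload: PageEventPayload): void {
    this.broadcaster.emitTreeEvent({
      operation: OPERATION_BY_EVENT[name],
      spaceId: payload.spaceId,
      payload,
    });
  }
}
```

Because the broadcast hangs off the domain event rather than the originating browser, a page created through a non-UI path reaches every sidebar in the same space.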
Remove outdated process sections from several backlog markdown files and add new backlog items for AI chat step limits, endpoint status config, and API key field UI improvements.
Add docs/backlog/stt-providers-and-async.md: how to add new synchronous STT
request formats (Deepgram, native Gemini, ElevenLabs) via the explicit
sttApiStyle axis, which providers are inherently async and don't fit the
current sync model, and a target job-based async architecture (BullMQ job
table, sync+async unification, polling -> push -> live streaming) with the
migration path and security/cleanup considerations.
Add docs/streaming-dictation-plan.md — a design document for true
"text appears as you speak" dictation via the OpenAI Realtime API.
- Maps the current batch dictation flow (client MediaRecorder -> single
blob -> POST /ai-chat/transcribe) and why streaming is impossible there.
- Documents the Realtime API contract (transcription session, ephemeral
token, pcm16 audio, input_audio_buffer.append, input_audio_transcription
delta/completed events, server_vad).
- Recommends a server-side WS proxy transport (key stays server-side,
SSRF-guarded, provider-agnostic via sttBaseUrl) over direct browser
WebRTC, and a ProseMirror decoration for interim text with final-only
commit to avoid polluting Yjs collab/history.
- Covers config additions, AudioWorklet PCM16 capture, security per repo
conventions, edge cases, phased rollout, risks, and impacted files.
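
For illustration, a small sketch of the Float32-to-PCM16 conversion that AudioWorklet capture implies, since Web Audio delivers samples as floats in [-1, 1]. The function name floatToPcm16 is hypothetical, and framing the result into input_audio_buffer.append messages is left out.

```typescript
// Convert Web Audio float samples in [-1, 1] to signed 16-bit PCM.
function floatToPcm16(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  samples.forEach((sample, i) => {
    // Clamp first so out-of-range input cannot wrap around in Int16.
    const clamped = Math.max(-1, Math.min(1, sample));
    // Negative and positive halves use different scales: -32768..32767.
    out[i] = clamped < 0 ? Math.round(clamped * 0x8000) : Math.round(clamped * 0x7fff);
  });
  return out;
}
```

Clamping before scaling matters because a slightly overdriven microphone signal would otherwise wrap to the opposite sign and produce audible clicks.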
The README files now list Voice dictation as a completed feature (✅) instead of an upcoming one (🔭). Because the feature has shipped, the detailed `voice-dictation-plan.md` plan is removed.
Add a detailed design and implementation plan for an AI assistant that
operates on publicly shared document trees. The document outlines the
feature scope, architecture, security considerations, and remaining work,
providing context for future development.
Add docs/page-templates-plan.md describing a whole-page live
transclusion feature: pages flagged is_template, a new pageEmbed
node referencing a source page, a whole-page lookup endpoint reusing
the existing transclusion access-control and share paths, reference
sync, duplicate remap, and cycle/deletion/access/export edge cases.
Decision: separate pageEmbed node over extending transclusionReference.
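
One way the cycle edge case could be checked, as a sketch only: the EmbedGraph shape and the wouldCreateCycle name are assumptions, not taken from the plan. The idea is that embedding a source page into a host is unsafe when the host is already reachable from the source.

```typescript
// Hypothetical shape: pageId -> ids of the pages it embeds via pageEmbed nodes.
type EmbedGraph = Map<string, string[]>;

// Returns true if embedding `sourceId` into `hostId` would create a cycle,
// i.e. the host is the source itself or is already reachable from it.
function wouldCreateCycle(graph: EmbedGraph, hostId: string, sourceId: string): boolean {
  if (hostId === sourceId) return true;
  const seen = new Set<string>();
  const stack: string[] = [sourceId];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    if (current === hostId) return true;
    if (seen.has(current)) continue; // already explored, avoids infinite loops
    seen.add(current);
    for (const next of graph.get(current) ?? []) {
      stack.push(next);
    }
  }
  return false;
}
```

The same traversal could run at render time as a defensive depth guard, since a cycle may also appear after the fact when an already embedded page is edited.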
This document outlines the removal of non‑functional DOCX, PDF, and Confluence
import options that relied on a private EE module. It records the completed
frontend changes and lists the remaining backend cleanup tasks.
Improve agent RAG quality with three changes, plus a roadmap doc for the rest.
- Indexer: prefix each chunk with its heading path ("Page > H1 > H2"), built by
walking the ProseMirror JSON (heading nodes) so a `#` inside a fenced code block
is never mistaken for a heading. Falls back to plain-text chunking on any error.
buildChunkRows: replace indexOf-against-source offsets (breadcrumb prefixes break
verbatim matching) with a cumulative cursor; offsets are provenance-only.
- Hybrid search: new migration adds a generated `fts` tsvector column + GIN index
to page_embeddings (same english+f_unaccent config as pages.tsv). New
PageEmbeddingRepo.hybridSearch fuses cosine + full-text rankings via Reciprocal
Rank Fusion (k=60, equal weights) in one SQL query at chunk granularity.
- Tools: collapse semanticSearch + searchPages into one hybrid `searchPages` tool
with a query-rewrite-oriented description; gracefully falls back to the REST
full-text path when embeddings are unconfigured. Access control (space scope +
page-permission post-filter) preserved. Add a query-rewrite hint to the default
system prompt.
- docs/rag-improvements-plan.md: record what shipped and the deferred backlog
(reranker, attachment indexing, eval harness, tuning).
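
To make the fusion step in the hybrid-search bullet concrete, here is an in-memory sketch of Reciprocal Rank Fusion with k=60 and equal weights. The commit does this in a single SQL query; the function name and the chunk ids below are illustrative only.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Each ranking is a list of chunk ids, best first. A chunk's fused score is
// the sum over rankings of 1 / (k + rank), with rank starting at 1.
function reciprocalRankFusion(rankings: string[][], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  const fused: FusedResult[] = [];
  scores.forEach((score, id) => fused.push({ id, score }));
  // Highest score first; ties broken by id so the order is deterministic.
  return fused.sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Example: cosine ranking vs full-text ranking over the same chunk table.
const cosineRanking = ["c1", "c2", "c3"];
const fullTextRanking = ["c3", "c1", "c4"];
const fusedExample = reciprocalRankFusion([cosineRanking, fullTextRanking]);
```

Here c1 scores 1/61 + 1/62 and c3 scores 1/63 + 1/61, so chunks found by both rankings rise above c2 and c4, which appear in only one.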
Note: requires a corpus reindex to populate breadcrumbs on existing pages.
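
A simplified sketch of how a heading path could be derived by walking ProseMirror JSON instead of scanning text. It only looks at top-level blocks, and the PMNode shape, the "codeBlock" node name and the function names are assumptions rather than the indexer's actual code.

```typescript
// Minimal ProseMirror-like JSON node (assumed shape).
interface PMNode {
  type: string;
  attrs?: { level?: number };
  text?: string;
  content?: PMNode[];
}

interface Section {
  breadcrumb: string;
  text: string;
}

// Concatenate the text of a node and all of its descendants.
function nodeText(node: PMNode): string {
  if (node.text !== undefined) return node.text;
  return (node.content ?? []).map(nodeText).join("");
}

// Walk top-level blocks, tracking the current heading at each level.
function sectionsWithBreadcrumbs(pageTitle: string, doc: PMNode): Section[] {
  const path: string[] = []; // path[i] holds the active heading at level i + 1
  const sections: Section[] = [];
  for (const block of doc.content ?? []) {
    if (block.type === "heading") {
      const level = block.attrs?.level ?? 1;
      path.length = level - 1; // drop headings at this level and deeper
      path[level - 1] = nodeText(block);
      continue;
    }
    const text = nodeText(block);
    if (text === "") continue;
    // Only real heading nodes feed the path, so "#" in a code block is body text.
    const breadcrumb = [pageTitle, ...path.filter((h) => Boolean(h))].join(" > ");
    sections.push({ breadcrumb, text });
  }
  return sections;
}
```

A code block whose text starts with "#" lands in a section as ordinary content, which is the property the node-based walk is meant to guarantee.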
Add docs/mobile-app-plan.md capturing the mobile-app research: current
stack, existing responsive web inventory, why the editor must stay in a
WebView, comparison of native/Capacitor/hybrid paths, the recommended
Capacitor route, required backend changes (token-in-body login, CORS
whitelist, APNs/FCM push, optional Swagger), Android/iOS specifics, and
the offline outlook (referencing docs/offline-sync-plan.md).
Enrich the "Mobile app" roadmap entry in README.md and README.ru.md to
point to the new plan and correct the inaccurate "native" wording
(Capacitor wrapper of the existing web UI, iOS first, Android to follow).
Add docs/voice-dictation-plan.md — a ready-to-implement design covering
server-side Whisper transcription via the existing per-workspace AI provider,
with the mic button in both the AI agent chat and the page editor. The doc
consolidates four parts: STT provider credentials (full parity with the LLM
and embedding creds, incl. the encrypted stt_api_key_enc column and both
provider-field whitelists), the getTranscriptionModel builder + /transcribe
endpoint, the ai.dictation visibility toggle, and the client capture
(useDictation + MicButton). Includes edge cases, security notes, an
implementation order, and the full list of affected files.
Replace the bare brand text on pages with the Gitmost logo lockup
(mark + "gitmost" wordmark) and use the mark as the favicon.
- add generated logo lockups (text outlined from Space Grotesk SemiBold)
in dark/light ink variants; add reusable theme-aware <BrandLogo> component
- use BrandLogo in the global header (mark-only on mobile, full lockup on
desktop) and on auth pages, dropping the old Docmost icon + plain text
- point favicon to /brand/gitmost-favicon.svg (SVG primary + PNG fallbacks);
regenerate favicon/app-icon PNGs from the brand SVGs
- rename app name Docmost -> Gitmost (getAppName, index.html title/apple
title, manifest name); use getAppName() in the 404 title
- align theme/background colors to the brand tile (#0E1117)
- move brand guide and logos into docs/brand/ (canonical home) with a README,
and serve runtime copies from apps/client/public/brand/
Add a draft design document outlining the challenges, security considerations, and possible implementation approaches for inserting arbitrary HTML, CSS, and JavaScript into Docmost pages. Includes analysis of ProseMirror schema limitations, node creation steps, and isolation model options.
The built-in AI agent chat over wiki content is fully implemented
(server module, ~40 tools, RAG search, provider settings, external MCP),
so showcase it in both READMEs:
- list AI agent chat among the from-scratch community replacements
- add a "What's different" table row and a dedicated feature section
- move AI chat from the "In progress" roadmap bucket to "Done"
- add it to the Features list
- add docs/screenshots/ai-chat.png and show it in the Screenshots section
README.md and README.ru.md are updated in sync.
Editing an existing comment's text is irreversible (not version-tracked),
which breaks the agent's "only reversible operations" invariant. Remove the
updateComment tool that was added in the toolset-expansion change, leaving the
agent at 40 tools (comments: create/resolve only).
- Remove the updateComment tool from forUser().
- Remove updateComment from the DocmostClientLike interface.
- Reword SAFETY_FRAMEWORK: comments are create/resolve only; drop the
comment-text-edit exception (keep the public-sharing one); keep the
no-permanent-deletion guarantee and anti-prompt-injection rules.
- Tests: assert updateComment is NOT exposed (mirrors the deleteComment guard).
- docs(ai-agent-chat-plan): move updateComment to the "not exposed" list.
Grow the agent tool registry in forUser() from 10 to 41 tools, wiring all
remaining @docmost/mcp client capabilities: reads (workspace/spaces/pages/
sidebar/outline/json/node/table/comments/shares/history/diff/export) and
reversible writes (editPageText, patch/insert/delete node, updatePageJson,
table ops, copy/import content, share/unshare, restorePageVersion,
updateComment, transformPage).
Deliberately NOT exposed: deleteComment (irreversible hard delete) and the
filePath-based image tools (uploadImage/insertImage/replaceImage — useless
and unsafe for a server-side agent). transformPage omits the deleteComments
option from its schema and never passes it, so the comment-deletion path is
unreachable from the agent.
- Extend DocmostClientLike with the new method signatures.
- Update SAFETY_FRAMEWORK to describe the broader toolset while keeping the
no-permanent-deletion guarantee and anti-prompt-injection rules; flag that
comment-text edits are not version-tracked and sharing is public.
- Add guardrail tests: no deleteComment tool; transformPage schema rejects
deleteComments.
- docs(ai-agent-chat-plan): record the toolset expansion and a backlog item
to support image insertion by URL via the existing SSRF guard.
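
The two guardrails can be pictured with a simplified registry. Everything here is hypothetical (the Tool shape, buildRegistry, validateArgs, and the transformPage option names); the real forUser() wiring and schemas are not shown in this log.

```typescript
// Hypothetical tool description: a name plus the only option keys it accepts.
interface Tool {
  name: string;
  allowedKeys: readonly string[];
}

// Build the agent-facing registry, leaving out any tool on the deny list.
function buildRegistry(candidates: Tool[], denied: readonly string[]): Map<string, Tool> {
  const registry = new Map<string, Tool>();
  for (const tool of candidates) {
    if (denied.indexOf(tool.name) === -1) {
      registry.set(tool.name, tool);
    }
  }
  return registry;
}

// Strict validation: any option outside the schema is rejected, so an option
// that was deliberately omitted cannot be smuggled in by the model.
function validateArgs(tool: Tool, args: Record<string, unknown>): void {
  for (const key of Object.keys(args)) {
    if (tool.allowedKeys.indexOf(key) === -1) {
      throw new Error(`${tool.name}: unknown option "${key}"`);
    }
  }
}
```

Rejecting unknown keys, instead of silently dropping them, makes a guardrail test fail loudly if the option is ever added back to the schema.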
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add sections 14 and 15 to the AI‑agent chat plan documenting review
findings, identified blockers (C1‑C3) and their resolutions, high/medium
issues, and verification steps. This provides clear guidance before
starting implementation.
Add documentation for external MCP server support, covering architecture,
configuration, security (SSRF protection, secret handling), system prompt
management, UI updates, and the new @ai-sdk/mcp dependency. This clarifies the
expanded three‑axis authorization model and migration steps.
Replace the removed enterprise EE MCP (private apps/server/src/ee submodule,
license-gated /mcp route) with our docmost-mcp, vendored as an isolated ESM
workspace package and served by the server over HTTP — no enterprise license.
Backend:
- Add packages/mcp (@docmost/mcp): vendored docmost-mcp refactored into a
side-effect-free createDocmostMcpServer() factory (38 tools preserved),
stdio entry kept in stdio.ts, Streamable-HTTP session manager in http.ts.
- Add apps/server McpModule: @Post/@Get/@Delete('mcp') (served at /mcp via the
existing global-prefix exclude), @SkipTransform + reply.hijack to bridge raw
Fastify req/res into the SDK transport. The module dynamically imports the
ESM-only package from CommonJS via a Function-indirected import resolved with
require.resolve + file:// URL. Gated by the workspace ai.mcp toggle, a
service-account (MCP_DOCMOST_EMAIL/PASSWORD/API_URL) and optional MCP_TOKEN;
per-session idle eviction (MCP_SESSION_IDLE_MS).
- Drop the enterprise license check on mcpEnabled in workspace.service.
- Dockerfile: copy packages/mcp into the production image.
- .env.example: document MCP_DOCMOST_*, MCP_TOKEN, MCP_SESSION_IDLE_MS.
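
Per-session idle eviction could look roughly like the sketch below. SessionManager, touch and evictIdle are hypothetical names, the idleMs argument plays the role of MCP_SESSION_IDLE_MS, and the clock is injectable so the behaviour can be checked without timers.

```typescript
interface Session {
  id: string;
  lastSeenMs: number;
  close: () => void;
}

class SessionManager {
  private readonly sessions = new Map<string, Session>();
  private readonly idleMs: number;
  private readonly now: () => number;

  constructor(idleMs: number, now: () => number = () => Date.now()) {
    this.idleMs = idleMs;
    this.now = now;
  }

  // Register a new session or refresh the activity timestamp of an existing one.
  touch(id: string, close: () => void): void {
    const existing = this.sessions.get(id);
    if (existing) {
      existing.lastSeenMs = this.now();
      return;
    }
    this.sessions.set(id, { id, lastSeenMs: this.now(), close });
  }

  has(id: string): boolean {
    return this.sessions.has(id);
  }

  // Close and drop every session idle for longer than idleMs; returns their ids.
  evictIdle(): string[] {
    const cutoff = this.now() - this.idleMs;
    const evicted: string[] = [];
    this.sessions.forEach((session) => {
      if (session.lastSeenMs < cutoff) evicted.push(session.id);
    });
    for (const id of evicted) {
      const session = this.sessions.get(id);
      if (session) session.close();
      this.sessions.delete(id);
    }
    return evicted;
  }
}
```

In a server, evictIdle() would typically run on an interval, and each handled request would call touch() for its session id.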
Frontend:
- Recreate the community "AI & MCP" workspace-settings panel (mcp-settings.tsx):
admin-only toggle on settings.ai.mcp with optimistic update, copyable
${APP_URL}/mcp URL; wired into workspace-settings page. Reuses existing i18n.
Fixes:
- Pin packages/mcp tiptap deps to 3.20.4 (matching the client) and inline
getStyleProperty, preventing a duplicate @tiptap/core@3.26.1 from leaking into
the client editor via pnpm shamefully-hoist (was breaking apps/client tsc).
Add docs/offline-sync-plan.md — a ready-to-implement design document for
offline editing and synchronization in gitmost.
- Describes current state: Yjs/Hocuspocus + y-indexeddb for document body
(CRDT, offline-ready) vs REST-based structural data (online-only).
- Clarifies that PWA installability already exists (inherited from Docmost);
the missing piece is a service worker for offline app-shell.
- Defines two sync contours (CRDT body / outbox+LWW for REST) and a staged
plan M0..M4 with per-step files, acceptance criteria and risks.
- Includes conflict-resolution rules, pitfalls, npm deps, open questions
and an implementation checklist.
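
As an illustration of the outbox + LWW contour for REST data, here is a synchronous sketch. The types, the tie-breaking rule in favour of the server copy, and the stop-at-first-failure flush policy are assumptions, not rules quoted from the plan.

```typescript
interface Versioned<T> {
  value: T;
  updatedAt: number; // e.g. milliseconds since epoch
}

// Last-writer-wins: keep the copy with the newer timestamp.
// Assumption: on a tie the remote (server) copy wins.
function mergeLww<T>(local: Versioned<T>, remote: Versioned<T>): Versioned<T> {
  return local.updatedAt > remote.updatedAt ? local : remote;
}

interface Mutation {
  id: number;
  path: string;
  body: unknown;
}

// Queue of REST mutations recorded while offline and replayed in order.
class Outbox {
  private readonly queue: Mutation[] = [];

  enqueue(mutation: Mutation): void {
    this.queue.push(mutation);
  }

  pending(): number {
    return this.queue.length;
  }

  // Replays mutations in order; stops at the first failure so ordering is
  // preserved and the remainder is retried on the next flush.
  flush(send: (mutation: Mutation) => boolean): number {
    let sent = 0;
    while (this.queue.length > 0) {
      if (!send(this.queue[0])) break;
      this.queue.shift();
      sent++;
    }
    return sent;
  }
}
```

A real implementation would persist the queue (for example in IndexedDB) and send asynchronously; the sketch keeps both in memory and synchronous for brevity.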