Design entry: hide the silence-cut streaming dictation path behind a per-
workspace settings.ai.dictationStreaming flag, default false, with batch
dictation as the default and fallback. Reuses the existing STT model and
/ai-chat/transcribe — no new provider/model/endpoint fields. Lists the server
+ client touch points, acceptance criteria, and edge cases.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Delete the backlog markdown file that outlined additional STT providers and the future async transcription architecture, as the content is now superseded by newer implementation plans.
Document Variant B for showing MCP-created comments (and pages) as AI
rather than as the service-account user, reusing the existing agent
provenance infrastructure (§15 C3).
- Root cause: MCP logs in via a plain service-account token, so
provenance.actor stays 'user' and created_source defaults to 'user';
the comment sidebar also renders no AI badge.
- B1 (backend): mark the MCP identity as agent via a new users.is_agent
flag; jwt.strategy derives req.raw.actor from it (non-spoofable).
Relax the provenance aiChatId type to string | null for external MCP.
- B2 (frontend): extend IComment with createdSource/aiChatId, extract a
shared AiAgentBadge, render it in comment-list-item.
- Includes edge cases, tests, scope decisions, and acceptance criteria.
Follow-ups from the multi-aspect review of the e5bc82c7..d4658d4c range.
- CHANGELOG: document under [Unreleased] that the default per-workspace
hourly public-share assistant cap was lowered 300 -> 100 after the
v0.93.0 tag (#62). v0.93.0 shipped 300, so existing deployments that
never set SHARE_AI_WORKSPACE_MAX_PER_HOUR drop to 100 on upgrade.
- Recreate the still-open Section 3 (AiChatService.stream integration
coverage) of the deleted feature-test-coverage-deferred.md as a focused
backlog doc so the test debt stays tracked; Sections 1-2 are already
closed by the integration harness (PR #115).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pre-merge review follow-up for the parseNodeArg dedupe (PR #114):
- Restore docs/backlog/ai-chat-tool-definitions-duplicated.md instead of
deleting it: it still tracks open debt (unified spec registry + ProseMirror
<-> Markdown converter unification) that this branch defers, and
docs/git-sync-plan.md links to its converter section. Mark the node-arg
quirk as done and add a Progress section.
- Reword the in-app helper header from "byte-for-byte" to "behaviorally
identical": the two copies differ in comments/quote style; only the logic,
throw messages and branch order match.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Builds the deferred integration tests from docs/backlog/feature-test-coverage-
deferred.md that needed real infra (a test Postgres + real Redis) which the repo
lacked. Runs against an isolated, auto-created docmost_test database and Redis
logical DB 15 — never the dev data.
Harness (apps/server/test/integration/, run via new `pnpm --filter server test:int`
=> jest --config test/jest-integration.json; default unit `jest` is untouched and
excludes these via the *.int-spec.ts name + rootDir):
- db.ts: buildTestDb() mirrors database.module.ts exactly (PostgresJSDialect,
CamelCasePlugin, bigint to:20/from:[20,1700] parsing) + minimal seed helpers.
- global-setup.ts: DROP/CREATE docmost_test, CREATE EXTENSION vector, migrate to
latest via Kysely Migrator (fails loud on any errored migration).
- global-teardown.ts: closes the pool.
Coverage (5 suites, 16 tests, all green against live PG+Redis):
- WorkspaceRepo.updateSetting: jsonb-merge persists htmlEmbed without clobbering
sibling ai/sharing namespaces (the kill-switch write half).
- AiAgentRoleRepo: soft-delete exclusion, cross-workspace tenant isolation,
duplicate (name,workspace) -> 23505, name reusable after softDelete (partial
unique index WHERE deleted_at IS NULL), same name across workspaces allowed.
- page_template_references: deleting either source or referenced page cascades
the link row (onDelete cascade) — real FK, not mocked.
- PublicShareWorkspaceLimiter vs REAL Redis: real ioredis EVAL of the sliding-
window Lua — max boundary (3 admit / 4th deny), re-admit after the window
slides, same-ms distinct members. Catches Lua bugs a FakeRedis cannot.
- AiChatRepo.findByCreator: role-badge join (enabled->badge; soft-deleted or
disabled role -> null).
Review: APPROVE; applied its two hardening suggestions (fail loud on errored
migration result even without a top-level error; TEST_REDIS_URL override + ping
preflight). tsc clean; unit run excludes int-spec (verified).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
First, safe step of docs/backlog/ai-chat-tool-definitions-duplicated.md: the
"node may be a JSON object OR a JSON string" quirk was hand-copied at 6 tool
sites. Extract it into a single parseNodeArg() helper per package and call it at
every site. Behavior-preserving — each site's throw message is byte-identical
(patch/insert: 'node was a string but not valid JSON'; update_page_json: 'content
was a string but not valid JSON'); no tool name/description/schema changed.
Two helper copies (packages/mcp/src/lib/parse-node-arg.ts and
apps/server/src/core/ai-chat/tools/parse-node-arg.ts) are intentional: the
ESM-only @docmost/mcp cannot be imported by the CommonJS server (it is loaded at
runtime via the Function('import()') trick), so runtime code cannot cross that
boundary by a normal import. Each copy is now the single source within its
package (6 inline copies -> 2 helpers). packages/mcp/build rebuilt in sync.
Tests: parse-node-arg.spec.ts (server, Jest) + parse-node-arg.test.mjs (mcp,
node:test) — object passthrough, valid-string parse, invalid-string throw with
the right message. Server tsc clean; mcp suite 254 pass; agent structural-edit
path verified live in-browser (agent inserted a node, persisted to the doc).
Deferred (documented for the record, since the backlog doc is removed with this
commit): the FULL transport-agnostic tool-spec registry (one name+schema+
description per tool shared by both transports) and deriving DocmostClientLike
from the real client type. Both are blocked by the current architecture, not by
effort: (1) @docmost/mcp ships no type declarations and is ESM-only, so a
type-only derivation needs declaration emission + tsconfig path wiring, and the
real client's precise return types break the in-app tool test stubs (attempted,
reverted to keep tsc green); (2) the two transports intentionally DIVERGE in tool
NAMES (snake_case x38 vs camelCase x41), membership (in-app adds getCurrentPage/
listSidebarPages, omits delete_comment/image tools) and model-facing
DESCRIPTIONS, so a unified registry would change behavior on BOTH the agent and
external MCP clients and needs its own verification pass. This is forward-looking
debt (the code is correct today), to be done incrementally.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replace the new-chat <Select label="Agent role"> picker with colored role
cards rendered as the empty-state of a brand-new chat (centered in the window),
per docs/backlog/ai-chat-role-cards-empty-state.md. Clicking a card selects that
identity; sending without a pick falls back to the Universal assistant; the
cards disappear once the chat is non-empty. Purely client-side — the existing
selectedAiRoleIdAtom + roleId request wiring (server role fixation on chat
creation) is unchanged.
- new RoleCards rendered through the existing emptyState prop chain
(AiChatWindow -> ChatThread -> MessageList); MessageList already supported it.
- Universal assistant card (gray, value null, default-selected) + one card per
enabled role, color cycled from a 10-name Mantine palette via the pure
roleCardColor() helper; theme-aware CSS vars (light/-light-color/-filled).
- each card is an UnstyledButton with aria-pressed for a11y + testability.
- tests: role-card-color (palette cycling, negative-safe) + role-cards.test.tsx
(render, emoji/name, selection highlight, click -> onSelect). 9 tests green,
client tsc clean.
Verified live in-browser: cards (not a Select) show for a new chat; selecting
Пират binds the chat to that role end-to-end (badge + pirate reply); no pick =>
Universal; cards vanish after the first message.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The 'current page' feature (client useMatch openPage + server getCurrentPage
tool + system-prompt injection) was already implemented & merged; this backfills
its missing test coverage and removes the completed backlog doc.
- extract pure resolveCurrentPageResult(openedPage) into current-page.util.ts
(byte-identical to the prior inline getCurrentPage tool body) so it is
unit-testable without the dynamically-imported ESM Docmost client; the tool
now delegates to it.
- current-page.util.spec.ts: 7 cases (null/undefined/no-id/empty-id/full/no-title).
- ai-chat.prompt.spec.ts: +8 cases for the openedPage context line (title+pageId
present, Untitled fallback for blank/whitespace title, no line when absent/blank
id, and sandwich ordering before the trailing safety block).
Verified live in-browser: client sends openPage{id,title} on a page and null
off-page; the agent invokes getCurrentPage and answers with the real title+id.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The feature is already implemented and merged into develop (f6e216cb):
auto-collapse the AI chat window into its header on outside-page pointer,
expand on header click, with keyboard a11y. Verified live in-browser and
covered by collapse-helpers.test.ts (9 tests). Removing the now-completed
planning doc.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The PM<->Markdown converter and its lib are duplicated the same way as the
AI-chat tool definitions: a copy lives in packages/mcp/src/lib (without
canonicalize.ts), another in docmost-sync's docmost-client lib (with
canonicalize + the no-comment-threads markdown-document mode), and the
git-sync integration plan vendors a third copy into packages/git-sync.
Record the already-observed drift (collaboration.ts ~329 changed lines,
etc.) and the docmost-schema vs @docmost/editor-ext schema-divergence risk,
and tie it to the existing single-source-of-truth fix direction.
Keep the backlog focused on deferred TESTS; the related non-test gaps
(model-allow-list, restriction-cache invalidation, server embed-recursion
guard, collectPageEmbeds cycle guard, jest DI/lib0-ESM debt) are now
tracked as issues #52-#56 and only linked from the backlog.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Captures what PR #49 intentionally left out: DB-integration tests (need a
test Postgres), the public-share XFF e2e + real-Redis Lua check (need an
HTTP/Redis harness), the full AiChatService.stream integration (R1-stream
seam), and the related non-test findings (no server-side model allow-list,
unreferenced restriction-cache invalidation, client-only embed recursion
cap, missing cycle guard, and the pre-existing jest DI/lib0-ESM debt).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Integrate the already-merged step-limit work from develop. Only conflict was
ai-chat.service.spec.ts: both sides appended a describe block and edited the
import line. Resolved as a union — keep compactToolOutput + the assistantParts/
serializeSteps/rowToUiMessage suites (this branch) AND the prepareAgentStep
suite (develop), importing all symbols from ai-chat.service.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Unit tests for the safety-critical paths: crypto secret-box (round-trip,
tamper detection, wrong key), the SSRF guard (blocked ranges + DNS-rebinding),
the ai-chat tools service, the page-embedding repo, and the
assistant-parts/serialization helpers. Those server helpers (assistantParts,
rowToUiMessage, serializeSteps) are exported ONLY for the tests — no runtime
change.
Also: keyboard a11y on the chat history header and conversation rows
(role/tabIndex/Enter+Space), and DRY refactors that move shared logic into one
place (isToolPart -> tool-parts util; buildInitialValues in the MCP form).
The behaviour-changing edits that previously rode along in this commit are
split out into the following two commits, per review.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The configured x enabled status dot is implemented and merged via this
branch, so the backlog plan is no longer needed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The in-field Clear for the API key fields is implemented and merged via
this branch, so the backlog plan is no longer needed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Delete the backlog documentation that described the removal of non‑functional DOCX, PDF, and Confluence import features now that the code changes have been merged.
The two catch blocks in importPage() threw an opaque "Error processing file
content" / "Failed to create imported page" BadRequest, hiding the real cause
from the HTTP response. This made a production 400 regression impossible to
diagnose without server log access, and violated the project convention that
errors must never be swallowed.
Extract `${err.name}: ${err.message}` into both the log (full err object kept
for the stack) and the thrown BadRequestException. Inner processMarkdown/
processHTML rethrowing catches and the EE processDocx/processPdf license
catches are left unchanged.
Local reproduction of the happy-dom 14->20 theory failed (full import chain
+ 22 edge cases pass on happy-dom@20.8.9), so the root cause is still pending
the now-visible reason from a recurring 400. Diagnostic script test-import.tsx
added; backlog doc updated with findings.
Add markdown files describing the per‑user authentication mechanism and the ability to expand or collapse all nodes in the page tree, improving guidance for developers working with the MCP backlog feature.
Add two new backlog documentation files:
- ai-chat-collapse-on-page-focus.md describing auto‑collapse behavior for the AI chat window.
- comments-panel-density.md outlining UI density improvements for the comments panel.
Add a backlog design note for making page-tree realtime updates
server-authoritative instead of client-relayed.
Problem: page content syncs via Yjs/Hocuspocus (server-authoritative),
but tree create/move/delete is broadcast by the originating browser only,
so non-UI creation paths (AI agent, MCP, REST API, import) and lost-event
races leave other clients' sidebars stale.
The note specifies a WsService.emitTreeEvent broadcaster, WsTreeService
broadcast helpers, a PageWsListener on PAGE_CREATED/SOFT_DELETED/DELETED/
MOVED/RESTORED, event-payload enrichment to avoid the in-transaction
re-fetch race, a dedicated PAGE_MOVED event, removal of the client relay,
plus edge cases, work breakdown, tests, alternatives and open questions.
Remove outdated process sections from several backlog markdown files and add new backlog items for AI chat step limits, endpoint status config, and API key field UI improvements.
Add docs/backlog/stt-providers-and-async.md: how to add new synchronous STT
request formats (Deepgram, native Gemini, ElevenLabs) via the explicit
sttApiStyle axis, which providers are inherently async and don't fit the
current sync model, and a target job-based async architecture (BullMQ job
table, sync+async unification, polling -> push -> live streaming) with the
migration path and security/cleanup considerations.
Add docs/streaming-dictation-plan.md — a design document for true
"text appears as you speak" dictation via the OpenAI Realtime API.
- Maps the current batch dictation flow (client MediaRecorder -> single
blob -> POST /ai-chat/transcribe) and why streaming is impossible there.
- Documents the Realtime API contract (transcription session, ephemeral
token, pcm16 audio, input_audio_buffer.append, input_audio_transcription
delta/completed events, server_vad).
- Recommends a server-side WS proxy transport (key stays server-side,
SSRF-guarded, provider-agnostic via sttBaseUrl) over direct browser
WebRTC, and a ProseMirror decoration for interim text with final-only
commit to avoid polluting Yjs collab/history.
- Covers config additions, AudioWorklet PCM16 capture, security per repo
conventions, edge cases, phased rollout, risks, and impacted files.
This document outlines the removal of non‑functional DOCX, PDF, and Confluence
import options that relied on a private EE module. It records the completed
frontend changes and lists the remaining backend cleanup tasks.