The streaming mic button only began recording on the SECOND click. The VAD
library creates its AudioContext inside vad.start() and never resumes it; on the
first click the lazy model load (import + MicVAD.new) ran first, so the context
was created after the user-gesture window expired and started suspended — the
audio worklet never ran, so nothing happened. The second click was fast (model
cached) so the context landed inside the gesture and worked.
Create and resume our own AudioContext synchronously at the top of start()
(inside the click gesture, before the model load) and inject it into MicVAD,
which then does not take ownership of it; it is reused across start/stop and
closed only on unmount. Add a "loading" status so the first click is shown as a
spinner (disabled) while the model loads, which also blocks a confusing second
click.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The "copy chat" button serialized `messageRows` (persisted rows loaded via
`useAiChatMessagesQuery`), which were incomplete in two ways, so the exported
Markdown dropped messages (e.g. "Messages: 2" for a multi-turn chat).
- Exhaust pagination: `useAiChatMessagesQuery` is a useInfiniteQuery that only
ever loaded the first page (server page size 50, oldest-first), silently
truncating longer chats. Add an effect that calls `fetchNextPage()` until
`hasNextPage` is false. Guard on `isFetchNextPageError` so a failed page fetch
does not loop on the app's global `retry: false`.
- Re-sync after each turn: `onTurnFinished` invalidated only the chat-list
query, never the per-chat messages query, so `messageRows` went stale during
a live session. Also invalidate `AI_CHAT_MESSAGES_RQ_KEY(activeChatId)` so the
export and token counters reflect the just-finished turn.
- Avoid tearing down the live thread: a render-phase latch (`historyLoadedKeyRef`)
keeps the history loader gating the FIRST mount only, so the post-turn
background refetch (which can transiently flip `hasNextPage` for a chat whose
message count is an exact multiple of the page size) no longer unmounts the
open thread and loses its in-progress useChat state.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
createPage always failed with "generateJSON can only be used in a Node
environment". Root cause: the MCP module (packages/mcp/.../collaboration.ts)
sets `global.window = dom.window` (jsdom) at load time and is imported
in-process by the server's AI-chat tools, leaking a global `window` into the
Node process. The server's self-contained ProseMirror helpers guarded with
`if (typeof window !== 'undefined') throw`, which then became a false positive
and broke POST /pages/import (the endpoint createPage calls).
- server: drop the vestigial `typeof window` guard in generateJSON.ts and
generateHTML.ts; both helpers create their own happy-dom Window and never
read the global one. Replace it with an explanatory comment.
- mcp: in DocmostClient.getPage, pass the resolved UUID (resultData.id) to
listSidebarPages instead of the original pageId, which may be a slugId and
triggered a Postgres "invalid input syntax for type uuid" (and a silent
empty subpages list).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Streaming dictation sends one transcription request per ended speech segment.
With redemptionMs=640 the VAD cut on every ~0.64s gap, so normal halting speech
fragmented into many segments and flooded /ai-chat/transcribe — tripping the
per-user rate limit even at modest real usage.
Raise redemptionMs to 1500 so a cut only happens on a real sentence/thought
pause (~the "couple seconds" the feature was meant to use). Request count now
tracks actual pauses rather than inter-word gaps; the server throttle is left
unchanged (the earlier limit bump was treating the symptom).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a description to the "Authorization header" input in the external
MCP server form explaining that the entered value is sent verbatim as
the Authorization HTTP header value (e.g. "Bearer <token>" or
"Basic <base64>"), since there is no implicit Bearer prefix.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Drop the "Use Tavily preset" quick-fill button from the Add server form
and its now-dead supporting code:
- remove the preset JSX block, applyTavilyPreset handler and TAVILY_PRESET
constant in ai-mcp-server-form.tsx
- drop the now-unused McpTransport import
- remove the unused "Use Tavily preset" i18n key from en-US translations
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Ship the plain onnxruntime-web wasm variant alongside JSEP so production builds
can load the VAD backend.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The production (rolldown) build resolves onnxruntime-web to the plain wasm
backend and fetched /vad/ort-wasm-simd-threaded.mjs, which we did not ship —
we only copied the JSEP variant that the Vite dev build uses. That 404'd into
the SPA fallback, reproducing "no available backend found / Failed to fetch
dynamically imported module" in production.
Copy BOTH the JSEP and the plain ort-wasm-simd-threaded.{mjs,wasm} into
public/vad/, so the runtime fetch finds a real file regardless of which build
the bundler picked.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Self-host Silero VAD / onnxruntime-web assets under /vad/ so streaming
dictation loads its wasm backend (fixes the 'text/html' MIME runtime error).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Streaming dictation failed at runtime with "no available backend found /
'text/html' is not a valid JavaScript MIME type": @ricky0123/vad-web 0.0.30
defaults baseAssetPath/onnxWASMBasePath to "./" (relative to the page URL),
so the worklet, Silero model and ORT wasm/mjs were requested against the SPA
catch-all and came back as index.html.
Serve them from a fixed /vad/ instead:
- scripts/copy-vad-assets.mjs copies the 4 runtime assets (vad worklet,
silero_vad_v5.onnx, ort-wasm-simd-threaded.jsep.{mjs,wasm}) from node_modules
into apps/client/public/vad/ (gitignored — the ORT wasm is ~26 MB)
- client dev/build scripts run the copy first so the assets are always present
- useStreamingDictation points both path constants at "/vad/"
Verified: dev server serves all four under /vad/ with HTTP 200 and correct
Content-Type (js/wasm, never text/html); tsc clean. Prod (Docker) build runs
the copy step, so dist/vad/* ships in the image.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Soft-deleting a page no longer opens a "Move this page to trash?"
confirmation modal. The page is moved to trash immediately and the
"Page moved to trash" toast now exposes an inline Undo action that
restores the page via the existing restore flow.
- Add move-to-trash-notification.tsx helper that builds the toast body
(status text + Undo button) as a ReactNode, so it can be used from the
non-TSX page-query module.
- useRemovePageMutation: show the toast with a stable id, 8s autoClose,
and an Undo handler that hides the toast and triggers restore.
- space-tree-node-menu / page-header-menu: call handleDelete directly and
drop the now-unused useDeletePageModal usage. Permanent delete keeps its
confirmation modal.
- useRestorePageMutation: read the live tree from the jotai store at
execution time and insert via a functional updater, so Undo restores
child pages correctly even after the originating component unmounted.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Streaming dictation via silence cut (Silero VAD), reusing the existing
/ai-chat/transcribe endpoint. Batch dictation kept as fallback.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a lightweight "streaming" dictation mode as a simpler alternative to the
realtime-websocket path: detect speech with Silero VAD (@ricky0123/vad-web),
cut each segment on a pause and POST it to the existing /ai-chat/transcribe
endpoint, so text appears progressively. No server changes.
- new useStreamingDictation hook (same API as useDictation), lazy-loads VAD,
in-order seq emission, session-epoch guard against stop->start races
- new encodeWavPcm16 util (Float32 -> mono PCM16 WAV, accepted by the server)
- MicButton gains a `streaming` prop; enabled in the editor toolbar and chat
- VAD tuning: redemptionMs 640 / preSpeechPadMs 320 / minSpeechMs 96
- batch dictation kept as the fallback (streaming=false)
- deps: @ricky0123/vad-web@0.0.30, onnxruntime-web@1.27.0
Note: VAD assets load from the library CDN by default; for self-hosted/offline
set VAD_BASE_ASSET_PATH/VAD_ONNX_WASM_BASE_PATH and copy assets to public/vad/.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Each chat row in the AI-chat history now shows a dimmed second line with
how long ago the chat was created and the document it was created in
("N ago / <document>", or "No document" when started outside a page).
Server:
- New migration: nullable ai_chats.page_id (FK pages.id, ON DELETE SET NULL).
- Capture the origin page at chat creation from the client-supplied openPage,
but validate it first: it must be a real page in the same workspace that the
user may read (PageAccessService.validateCanView), else null. This keeps the
"openPage.id is attacker-controllable but harmless" invariant - preventing a
cross-workspace/cross-space page-title leak and a post-hijack FK crash.
- findByCreator left-joins pages (scoped by workspace, defense-in-depth) and
returns pageTitle.
Client:
- IAiChat gains pageId/pageTitle; ConversationList renders a ChatMetaLine
(useTimeAgo + origin document) as a dimmed second line.
- Add i18n key "No document" (en-US, ru-RU).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Update the halo's border-radius from a fixed 50% circle to the theme's default radius variable. This ensures the red pulse follows the button's rounded‑square outline instead of appearing circular.
Add detection for browser fetch‑failure messages (e.g., “Failed to fetch”, “Load failed”, “NetworkError”) and return a clear error indicating the streaming connection to the server was lost. Refine the connection‑error regex to avoid overlapping patterns while preserving provider‑side error handling.
reindexWorkspace isolated every per-page failure, so an invalid/missing
API key (401 "User not found") made all pages fail identically while the
batch kept issuing hundreds of doomed requests against the provider.
Add isFatalProviderError() (401/403 auth, 402 billing) and abort the
whole batch on such errors; 429 rate-limit and embedding timeouts stay
per-page isolated. Adds unit tests for the predicate and a regression
test for the abort/iterate control flow.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A mid-stream connection drop showed a generic "Something went wrong / Load
failed" banner and left no server-side trace.
- error-message: classify the browsers' own fetch-failure strings ("Load
failed" on WebKit, "Failed to fetch" on Chrome, "NetworkError" on Firefox)
as a lost connection, so the banner names the cause instead of the generic
heading.
- ai-chat.controller: log a warning in the request close handler when the
client disconnects before completion, so a drop that reaches the app (e.g. a
reverse proxy cutting the SSE) is visible in the server logs before the abort.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adjust AppShell padding to responsive values and add a CSS module that
handles container top and side padding for different breakpoints,
replacing the previous fixed `pt="xl"` usage.
The button rendered with Mantine's default blue primary, which clashes
with the app's neutral/dark accent design. Switch both the single-space
and the space-picker variants to variant="light" color="gray".
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The button never rendered: it filtered spaces to writable ones via a
CASL ability built from membership.permissions, but the /spaces list
endpoint returns membership.role only (permissions come from
/spaces/info). The empty ability hid the button for everyone.
Resolve writability from membership.role instead, mirroring the server
space-ability mapping (ADMIN and WRITER can manage pages, READER is
read-only). Drop the now-unused CASL imports.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The byline mic rendered blue and with a smaller (16px) glyph next to the
gray 20px info icon, so it looked misaligned with an uneven gap. Add
optional color/iconSize props to MicButton (forwarded through
DictationGroup) and render the byline mic gray at 20px, wrapping it and
the info icon in a tight nowrap group so they read as a snug, aligned
pair. The AI chat mic is unchanged (passes neither prop).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The dictation button now lives in the always-rendered page byline, so the
copy in the fixed toolbar was redundant: with the toolbar enabled the mic
showed up twice. Drop the DictationGroup render, its isDictationEnabled
guard, and the unused import from the toolbar.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a per-workspace `sttLanguage` setting (ISO-639-1 hint; empty =
auto-detect) and a searchable language picker in the Voice / STT settings
card. The hint is forwarded to the transcription endpoint:
- multipart path via the AI SDK `providerOptions.openai.language`
- JSON (OpenRouter) path via a top-level `language` body field
only when non-empty, so auto-detect behaves exactly as before.
Threaded through the whole stack: ai.types, update DTO, AiSettingsService
(resolve/getMasked/update), the workspace.repo SQL allowlist, the client
ai-settings service types, and the provider-settings form. Adds en-US
source keys and ru-RU translations.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a big "New note" action to the Home screen that creates a new page
and opens it. Since the home screen has no active space, the target
space is resolved from the user's writable spaces (CASL Manage/Page
gate, mirroring the space sidebar): created directly when there is one
writable space, picked from a dropdown when there are several, hidden
when there are none. Menu items are disabled while a create is in
flight to avoid duplicate pages.
- New component features/home/components/new-note-button.tsx
- Render it at the top of pages/dashboard/home.tsx (above the carousel)
- Add i18n keys "New note" / "Create in space" to en-US and ru-RU
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The editor dictation button previously lived only in the fixed toolbar,
which is hidden by default (gated by the per-user editorToolbar
preference), so dictation was effectively unavailable in the editor. Add
the same dictation control to the always-rendered page byline row, right
next to the Details "i" icon, so voice input stays reachable.
It is shown only when workspace dictation is enabled, the page is
editable, and the editor is in edit mode. Reuses the existing
DictationGroup/MicButton and its caret-insertion logic.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Merge the comments side-panel header into the Open/Resolved tab row,
then drop the now-redundant "Comments" title; the panel keeps its
accessible name via the AppShell.Aside aria-label.
- Overlay the close (X) button on the right of the tab row and nudge it
up 4px to align with the tab labels; the tab list stays full-width so
its bottom border line is preserved. The toc/details tabs keep their
existing shared header and scroll area unchanged.
- Quote block (.textSelection): increase top margin (2px -> 8px) so it
no longer sticks to the timestamp when it is the first block, and add
margin-left: 6px so the quote's left bar lines up with the comment
body text left edge.
Merge the comments side-panel header into the Open/Resolved tab row to
save vertical space: title on the left, tabs centered, close button on
the right.
- comment-list-with-tabs: add optional `title`/`onClose` props; render
the title and close button as absolutely-positioned overlays around a
full-width centered Tabs.List. Keeping them outside Tabs.List preserves
the tablist ARIA contract (only role="tab" children) while the tab
list's full-width bottom border line is retained.
- aside: pass `title`/`onClose` to CommentListWithTabs for the comments
tab and drop the shared header for that tab; the toc/details tabs keep
their existing shared header and scroll area unchanged.
- Remove the large active-space name header in the space sidebar;
the active space stays highlighted in the spaces grid below.
- Move "Space settings" into the user avatar (top) menu next to
"Workspace settings"; it shows only while viewing a space and is
detected via useMatch("/s/:spaceSlug/*").
- Make the brand logo non-selectable/non-draggable (user-select:none
on .brand, draggable=false on the img).
- Remove the redundant "Home" button next to the logo (the logo
already links to /home).
- Remove the version label under the Settings sidebar menu.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The live mic-level halo around the stop button was frozen at a constant
scale (1.15) whenever the OS "Reduce motion" setting was on, so it never
reacted to the voice while dictating. Make haloScale unconditional so it
always follows audioLevel (amplitude 0.9), and drop the now-unused
useReducedMotion import and reduceMotion local.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Widen the comments/aside panel from 350 to 420 (~20% wider)
- Remove double padding around the panel: AppShell.Aside p="md"->"sm"
and inner Box p="md"->p={0}; reduce header-to-tabs gap mb="md"->"sm"
- Reduce empty space below the add-comment input (paddingBottom 25->10),
align the avatar with the input box (marginTop 10->2) and re-anchor the
send button (bottom 30->15)
- Pull the timestamp closer to the nickname via tighter line-height
(lh 1.2 on the name, 1.1 on the "… ago" text)
Make AI-created comments inline-only and reliably anchored: forbid
page-type comments for the agent, throw + roll back when a selection
cannot be anchored, and add robust text matching (normalization +
cross-text-node anchoring within a block).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The "Thinking…" indicator's bounce was fully disabled by the
prefers-reduced-motion rule (animation: none), leaving the dots
frozen for users with "Reduce motion" enabled. Drive the bounce
height with a --bounce custom property: -6px by default and a
smaller -3px under reduced-motion, so the indicator stays visibly
active everywhere instead of freezing.
The in-app AI chat hardcoded type='page' and the shared createComment
swallowed anchoring failures silently, so agent comments never got a
text anchor/highlight.
- Forbid page-type comments for the agent: top-level comments are always
inline and require an exact `selection`; replies inherit the parent
anchor (stored as the historical `page` type).
- Throw and roll back the just-created comment when the selection cannot
be anchored, instead of leaving an orphan unanchored comment.
- Add comment-anchor module: text normalization (smart quotes, dashes,
nbsp, collapsed whitespace) and matching across adjacent text nodes
within a block, so selections crossing inline-code/bold/link anchor.
- Update create_comment (MCP) and createComment (ai-chat) tool schemas
and descriptions; add unit + mock-HTTP orchestration tests.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a rule to the "Реализация" section of AGENTS.md stating that git
worktrees may only be created inside the .claude directory
(e.g. .claude/worktrees/<name>); creating them anywhere else is forbidden.
A brand-new chat's first turn streamed and finished successfully, but the
whole assistant response vanished from the UI. On finish the window adopts
the server-created chat id, which changed the <ChatThread> key and remounted
it — discarding the live useChat store (the full answer) and re-seeding from
not-yet-persisted history, so only the user message remained.
- chat-thread: pin the useChat store id to a per-mount value so adopting the
chatId prop no longer recreates the store and wipes the live turn.
- ai-chat-window: derive the thread mount key via setState-during-render and
move the live-thread marker in lockstep with the adopted id, so in-place
adoption keeps the same mounted thread while real chat switches still
remount and re-seed; gate the history loader to a freshly opened chat.
- cancel a pending adoption on New chat / explicit chat selection.
- log the raw stream error to the browser console for debugging.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The back-merge alone does not fix the develop version: git describe names
a tag ref, and the :develop image is built on GitHub Actions, so the tag
must exist on the `github` remote. git push of a branch does not push
tags. Document the multi-remote (gitea + github) tag-push requirement and
a recovery checklist when develop still shows the previous version.
Replace the AI chat typing indicator text "AI is typing…" with
"Thinking…".
- typing-indicator.tsx: use t("Thinking…") instead of t("AI is typing…")
- en-US: drop the now-redundant "AI is typing…" key (the "Thinking…"
key already existed and was unused)
- ru-RU: rename the key to "Thinking…" with value "Думаю…"
- update related comments in message-list.tsx and the test file
The UI version comes from `git describe --tags`, which resolves the nearest
tag in the current commit's ancestry. Release tags are created on main's
merge commit, which is not in develop's history, so develop builds keep
reporting the previous tag (e.g. v0.91.0-NNN) until main is merged back.
Add step 7 (back-merge main -> develop) to the "Cutting a release"
checklist and a subsection explaining why develop lags and how to fix it.
The onAbort terminal path persisted the partial turn but wrote nothing
to the log, so a turn killed by a client disconnect / proxy drop / stop()
was invisible in the logs (unlike onError and the controller catch, which
both log). Add a logger.warn with the chat id, completed step count and
partial-text length so an aborted turn is traceable.
showTypingIndicator treated any tool part in the latest assistant message
as visible content, so the "AI is typing…" dots were suppressed for the
rest of the turn once the first tool call appeared. During the model's
"thinking" pauses after a completed tool call, the chat showed only static
tool cards and no activity.
Inspect the last part of the assistant message instead of any part: hide
the dots only while output is actively rendering (a non-empty streaming
text part, or a tool still in the "running" state — which shows its own
Loader). Finished/errored tools and empty trailing text now keep the dots
visible, so the indicator reappears while the model thinks between steps.
Add tests covering the post-tool thinking gap and the running-tool case.
Add a pulsing halo behind the stop button that scales with the
microphone input level, giving real-time feedback that recording is
active and the mic is picking up sound.
- use-dictation: meter the captured MediaStream via AudioContext +
AnalyserNode (analyser only, never connected to destination), compute
a smoothed RMS audioLevel (0..1) in a requestAnimationFrame loop, and
tear the meter down on every recording-end path (stop/cancel/auto-stop/
unmount); meter failure is non-fatal to recording
- mic-button: render a translucent red halo whose scale follows
audioLevel; honor prefers-reduced-motion with a static halo
- stop(): recover and release resources when no live recorder remains
- fix unhandled rejection from AudioContext.resume()
Delete the backlog markdown file that outlined additional STT providers and the future async transcription architecture, as the content is now superseded by newer implementation plans.