Commit Graph

7 Commits

Author SHA1 Message Date
claude_code
44a1b5b003 feat(dictation): gate streaming dictation behind a workspace toggle
Streaming (silence-cut) dictation was hardcoded on. Put it behind a per-workspace
flag settings.ai.dictationStreaming, default off, with batch dictation as the
default and fallback. Mirrors the existing settings.ai.dictation flag end to end:

- server: aiDictationStreaming on UpdateWorkspaceDto + workspace.service writes
  settings.ai.dictationStreaming via updateAiSettings (jsonb merge keeps siblings)
- client: IWorkspaceAiSettings.dictationStreaming, an optimistic "Streaming
  dictation" sub-toggle under "Voice dictation" (disabled when dictation is off)
- gate the MicButton streaming prop in the editor toolbar and chat composer on
  the flag instead of a literal true

When the flag is absent/false both call sites pass streaming=false, so the VAD
model/wasm are never fetched and behavior is unchanged. Reuses the existing STT
model and /ai-chat/transcribe — no new provider/model/endpoint fields.

Removes the backlog entry now that it is implemented.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 23:59:35 +03:00
claude_code
e423c35676 feat(ai-chat): queue messages typed while the agent is streaming
Previously a message composed while the AI agent was streaming a reply was
silently dropped (the composer early-returned on isStreaming). Now such
messages are queued FIFO and sent automatically once the current turn
finishes cleanly.

- chat-input: submit() enqueues while streaming (via new onQueue prop) and
  sends otherwise; during streaming show a queue Send button (when text is
  present) alongside the Stop button; the textarea stays usable.
- chat-thread: per-conversation queue in local state (mirrored in a ref);
  flush the next message in onFinish ONLY on a clean finish - ai@6 useChat
  fires onFinish from a finally on Stop/disconnect/error too, where the queue
  must be preserved. Pending messages render as removable chips above the
  composer. Queue is cleared on chat switch (parent remount) and survives
  in-place new-chat id adoption.
- queue-helpers: pure FIFO helpers (enqueue/dequeue/removeQueuedById) + tests.
- i18n: add en-US/ru-RU keys (Queue message, Remove queued message,
  Send when the agent finishes).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 18:53:31 +03:00
claude_code
4f0da42d88 feat(dictation): streaming STT via silence cut (Silero VAD)
Add a lightweight "streaming" dictation mode as a simpler alternative to the
realtime-websocket path: detect speech with Silero VAD (@ricky0123/vad-web),
cut each segment on a pause and POST it to the existing /ai-chat/transcribe
endpoint, so text appears progressively. No server changes.

- new useStreamingDictation hook (same API as useDictation), lazy-loads VAD,
  in-order seq emission, session-epoch guard against stop->start races
- new encodeWavPcm16 util (Float32 -> mono PCM16 WAV, accepted by the server)
- MicButton gains a `streaming` prop; enabled in the editor toolbar and chat
- VAD tuning: redemptionMs 640 / preSpeechPadMs 320 / minSpeechMs 96
- batch dictation kept as the fallback (streaming=false)
- deps: @ricky0123/vad-web@0.0.30, onnxruntime-web@1.27.0

Note: VAD assets load from the library CDN by default; for self-hosted/offline
set VAD_BASE_ASSET_PATH/VAD_ONNX_WASM_BASE_PATH and copy assets to public/vad/.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 16:52:05 +03:00
vvzvlad
874bdd021c feat(ai): server-side voice dictation (STT) with mic in chat and editor
Add push-to-talk voice dictation that transcribes recorded audio on the
server via the workspace's OpenAI-compatible AI provider (Whisper /
gpt-4o-transcribe / self-hosted whisper), then inserts the text.

Backend:
- New `stt_api_key_enc` column + migration; STT creds parity with chat/
  embeddings (sttModel/sttBaseUrl/sttApiKey, write-only key, fallbacks to
  chat baseUrl/key). Both provider whitelists updated (service + repo).
- AiService.getTranscriptionModel + AiTranscriptionService.
- Gated POST /ai-chat/transcribe (dictation flag → 403, JWT + workspace
  scope + throttle, 25MB cap, MIME whitelist, never logs audio/key).
- New `settings.ai.dictation` workspace flag (DTO + service + audit).

Frontend:
- Wire up the Voice/STT settings card (model/base URL/key) and the
  Voice-dictation toggle.
- New `features/dictation`: useDictation (MediaRecorder state machine),
  MicButton, transcribe service; integrated into the chat composer and a
  new editor-toolbar dictation group, both gated by ai.dictation.
2026-06-18 18:45:33 +03:00
vvzvlad
3d9c9daf98 feat(ai-chat): focus composer on chat creation
Add autoFocus to the chat composer Textarea so a freshly created chat
(window open, "New chat", chat switch — all remount ChatThread via key)
lands with the cursor ready in the input field, letting the user type
immediately without clicking into it.
2026-06-18 05:51:24 +03:00
vvzvlad
4379163c21 fix(ai-chat): keep composer draft across new-chat id adoption remount
Typing into the composer while the agent was streaming lost the draft once
the turn finished: on a brand-new chat, adopting the freshly created chat id
changes ChatThread's key and remounts it, wiping ChatInput's local state.

Lift the composer draft into a module-level jotai atom (aiChatDraftAtom) so it
survives the remount. Reset it only on deliberate chat switches — startNewChat,
selectChat, and the page-history "AI agent" badge deep-link — so a draft never
leaks between conversations, while adoption (which goes through a useEffect)
preserves it.
2026-06-17 23:44:20 +03:00
vvzvlad
44b340dc1a feat(ai-chat): agent write tools, provenance wiring, chat panel + provider settings UI" -m "Backend:
- Add reversible write tools to the per-user agent toolset (page create/update/
  move/soft-delete; comment reply + resolve), exposed under the user's JWT and
  enforced by Docmost CASL; no permanent/force delete (D3).
- Non-spoofable agent provenance: sign actor/aiChatId into the access and collab
  tokens (TokenService), propagate via jwt.strategy onto the request, and set
  pages.last_updated_source/last_updated_ai_chat_id on REST create/update/move and
  comments.created_source/resolved_source/ai_chat_id.
- packages/mcp: add an optional getCollabToken provider (content-edit provenance)
  and guard against empty tokens; service-account /mcp path unchanged.

Frontend:
- Admin 'AI / Models' settings section: provider/model/embedding/base URL, a
  write-only API key field, system prompt, and Test connection.
- AI chat panel (useChat + DefaultChatTransport): conversation list, streamed
  messages, tool-call action log and page citations; header entry point gated on
  settings.ai.chat.

Compile-verified (server nest build + client tsc/vite); not yet live-tested.
Known gaps: history 'AI agent' badge (C3), vector RAG (D), external MCP (E);
chat tool-card citation links pending a fix.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 02:39:26 +03:00