Commit Graph

10 Commits

Author SHA1 Message Date
vvzvlad
874bdd021c feat(ai): server-side voice dictation (STT) with mic in chat and editor
Add push-to-talk voice dictation that transcribes recorded audio on the
server via the workspace's OpenAI-compatible AI provider (Whisper /
gpt-4o-transcribe / self-hosted whisper), then inserts the text.

Backend:
- New `stt_api_key_enc` column + migration; STT creds parity with chat/
  embeddings (sttModel/sttBaseUrl/sttApiKey, write-only key, fallbacks to
  chat baseUrl/key). Both provider whitelists updated (service + repo).
- AiService.getTranscriptionModel + AiTranscriptionService.
- Gated POST /ai-chat/transcribe (dictation flag → 403, JWT + workspace
  scope + throttle, 25MB cap, MIME whitelist, never logs audio/key).
- New `settings.ai.dictation` workspace flag (DTO + service + audit).

Frontend:
- Wire up the Voice/STT settings card (model/base URL/key) and the
  Voice-dictation toggle.
- New `features/dictation`: useDictation (MediaRecorder state machine),
  MicButton, transcribe service; integrated into the chat composer and a
  new editor-toolbar dictation group, both gated by ai.dictation.
2026-06-18 18:45:33 +03:00
vvzvlad
87d6bdfbd9 feat(ai): redesign AI settings page with per-endpoint test buttons
Rebuild the workspace AI settings page into card-based "Endpoints"
(Chat / Embeddings / Voice) matching the new design, and split the
single connection test into independent per-endpoint Test buttons.

- server: testConnection(workspaceId, capability) probes only the
  requested capability ('chat' | 'embeddings'); add TestAiConnectionDto
  and wire it through the /workspace/ai-settings/test controller
- client: testAiConnection(capability) + capability-typed mutation; two
  independent test mutation instances so Chat/Embeddings results are isolated
- client: full rewrite of ai-provider-settings into Endpoints section —
  drop the provider dropdown (driver is always openai, base URL + key
  always shown), move the "AI chat" and surface the "Semantic search"
  feature toggles into card headers, system message behind an Edit modal,
  pgvector/reindex footer, and a disabled Voice/STT stub
- client: restyle external MCP tools and the MCP server section; collapse
  the AI sections in workspace-settings; remove the standalone
  ai-chat-settings component
- toggles now surface the server error message (e.g. missing pgvector)
- i18n: add new English strings

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 04:20:33 +03:00
vvzvlad
91a63f0b2c fix(ai): stop RAG coverage bar sticking below 100% on empty pages
"Indexed N of M pages" stayed at e.g. "27 of 34" forever even after a
successful full reindex. The numerator counted pages that have embeddings
while the denominator counted ALL non-deleted pages, so empty / text-less
pages (which legitimately store zero embeddings) could never be reached.

Add PageRepo.countEmbeddablePages: counts non-deleted pages that have
non-empty textContent OR already have a stored embedding row, and use it as
the totalPages denominator in AiSettingsService.getMasked. The "has
embeddings" clause covers pages indexed from the content JSON (null
textContent) and guarantees indexedPages <= totalPages. No DB migration.
2026-06-18 03:33:38 +03:00
vvzvlad
80c900eb54 fix(ai): make RAG indexer observable and bound hung embedding calls
The bulk embedding reindex could hang on a single page forever
("Indexed 27 of 34 pages") with zero log output:
- all progress logs were debug-level, suppressed in production (pino info);
- embedMany() had no timeout, so a slow/hung embeddings endpoint blocked
  the sequential per-page loop indefinitely.

Changes:
- ai.service.embedTexts: bound embedMany with AbortSignal.timeout
  (configurable via AI_EMBEDDING_TIMEOUT_MS, default 120000ms); on timeout
  throw a clear, greppable message, classified by both signal.aborted and
  the error name (TimeoutError/AbortError/ResponseAborted) so a real
  provider error racing the timer keeps its diagnostics.
- embedding-indexer.reindexWorkspace: promote lifecycle/progress logs to
  info; log "[i/N] indexing page <id>" BEFORE the await so a hang names the
  stuck page; warn on slow pages (>30s); add timing + final summary.
- .env.example: document AI_EMBEDDING_TIMEOUT_MS.
2026-06-18 03:07:02 +03:00
vvzvlad
b46aed53e3 feat(ai): surface provider error bodies + probe embeddings in test connection
A misconfigured embeddings endpoint failed the RAG indexer with an opaque
"Invalid JSON response" and was not caught by "Test connection" (which only
probed the chat model), so it only surfaced silently during background
indexing.

- add describeProviderError(): formats AI SDK errors as
  "<statusCode>: <message> | response body: <truncated one-line snippet>"
  (statusCode/message/responseBody never carry the API key)
- use it in the bulk-reindex catch and the embedding processor's formatter so
  the real cause (e.g. an HTML 404 from a wrong base URL) is visible in logs
- testConnection now probes chat AND embeddings independently: skips a probe
  when that capability is unconfigured, returns ok:false with a Chat:/Embeddings:
  prefix on real failure, "not configured" when neither is set
2026-06-18 02:35:01 +03:00
vvzvlad
52e19fe678 feat(ai): wire up workspace RAG bulk reindex + manual "Reindex now"
The WORKSPACE_CREATE_EMBEDDINGS / WORKSPACE_DELETE_EMBEDDINGS jobs were
enqueued (on AI Search enable/disable) but had no AI_QUEUE handler, so
existing pages were never indexed ("Indexed 0 of N pages") and disabling
never purged embeddings.

- EmbeddingProcessor: handle WORKSPACE_CREATE_EMBEDDINGS (bulk reindex all
  live pages) and WORKSPACE_DELETE_EMBEDDINGS (purge workspace embeddings)
- EmbeddingIndexerService: add reindexWorkspace() (skips when embeddings
  unconfigured; per-page error isolation) and removeWorkspace()
- PageRepo.getIdsByWorkspace(), PageEmbeddingRepo.deleteByWorkspace()
- AiSettingsService.reindex() + admin-only POST /workspace/ai-settings/reindex
- Frontend: "Reindex now" button, service call and mutation
- Stable per-workspace jobId with remove-before-add so a stale job can't
  block future reindexes; cancel the delayed purge on enable/reindex so it
  can't wipe freshly-built embeddings
2026-06-18 02:15:18 +03:00
vvzvlad
a7f244053b feat(ai): separate base URL and API key for chat vs embedding model
Per-workspace AI provider config previously shared a single base URL and
a single API key between the chat model and the embedding model. Add
dedicated, optional embedding endpoint/token that fall back to the chat
values when empty, preserving backward compatibility.

- db: new migration adds nullable `embedding_api_key_enc` to
  `ai_provider_credentials`; chat key stays in `api_key_enc`
- repo: add `upsertEmbeddingKey` / `clearEmbeddingKey` (on-conflict
  touches only its own column, so chat/embedding keys never overwrite)
- ai-settings.service: store non-secret `embeddingBaseUrl`; resolve()
  applies fallback (embeddingBaseUrl || baseUrl; embedding key || chat
  key); getMasked() exposes raw `embeddingBaseUrl` + `hasEmbeddingApiKey`,
  never the key; update() handles the embedding key write-only
- ai.service: getEmbeddingModel() builds openai/gemini/ollama with the
  embedding-specific URL/key; chat path unchanged
- client: new "Embedding base URL" and "Embedding API key" fields with
  fallback hints and a clear-key action

Requires running the DB migration on deploy.
2026-06-18 01:33:45 +03:00
vvzvlad
1f2d20244e feat(ai-chat): show RAG indexing coverage in AI settings
Display "Indexed N of M pages" on the AI provider settings page so admins
can see how much of the wiki is covered by vector-RAG semantic search.

- page-embedding.repo: add countIndexedPages() — distinct non-deleted pages
  that have stored embeddings in the workspace
- page.repo: add countByWorkspace() — total non-deleted pages
- ai-settings.service: compute both counts in getMasked() (Promise.all) and
  return them with the masked settings; inject PageEmbeddingRepo + PageRepo
- MaskedAiSettings / IAiSettings: add indexedPages + totalPages
- ai-provider-settings: render a dimmed coverage line under "Embedding model"
- i18n: add the "Indexed {{indexed}} of {{total}} pages" key (en-US, ru-RU)
2026-06-17 23:18:51 +03:00
vvzvlad
a4b7919753 fix(ai-chat): OpenAI Chat Completions for multi-turn + provider settings, stream UX & errors" -m "Live-stand fixes (OpenRouter / OpenAI-compatible):
- openai provider: use .chat() (Chat Completions) instead of the default callable
  (Responses API), which gateways reject on multi-turn -> 400.
- updateAiProviderSettings: assemble settings.ai.provider via jsonb_build_object
  with ::text-cast bound params + jsonb_typeof self-heal (postgres.js was
  double-encoding it into an array; the ::text cast avoids 'could not determine
  data type of parameter').
- chat agent: drop the hard maxOutputTokens cap (truncated complex tool calls);
  keep a tiny cap only on the test-connection ping.
- testConnection + chat stream: surface the real provider error (statusCode+message)
  to logs and the UI instead of generic masks; never log the API key.
- chat UI: typing indicator, incremental streaming render, tool 'running' status, Stop.

Also bundled (prior uncommitted ai-chat work):
- history 'AI agent' provenance badge; vector RAG (pgvector image + page_embeddings
  + AI_QUEUE indexer + space-scoped semanticSearch); external MCP servers backend
  (@ai-sdk/mcp client, SSRF IP-pinning, encrypted headers, admin CRUD/Test);
  yjs duplicate-instance fix via pnpm patch (single CJS instance server-side).
2026-06-17 04:28:29 +03:00
vvzvlad
683da7a4c5 feat(ai-chat): per-user AI agent backend — LLM config, read-only agent, provenance schema
WIP checkpoint of the gitmost AI-chat backend (plan stages A + B1 + B3a).
The agent acts under the requesting user's JWT (Docmost CASL enforces page
access); the external service-account /mcp endpoint is untouched.

LLM provider config (A2-A4):
- integrations/crypto: AES-256-GCM SecretBoxService (key derived from APP_SECRET,
  per-record salt/iv; clear error on rotation instead of crashing).
- ai_provider_credentials table/repo/types: encrypted API key stored outside
  workspace settings/baseFields, write-only (never returned by any endpoint).
- integrations/ai: per-workspace AI SDK v6 provider driver (openai/gemini/ollama),
  admin-gated GET(masked)/PATCH(write-only key)/Test endpoints; settings.ai.provider
  holds non-secret config incl. systemPrompt. Removed unused AI_* env getters (DB is
  the single source of truth).

Chat module (A1, A5-A8):
- ai_chats/ai_chat_messages repos (workspace-scoped, soft-delete, tsv never selected).
- core/ai-chat: CRUD + POST /ai-chat/stream (Fastify hijack + AI SDK v6
  pipeUIMessageStreamToResponse, abort on disconnect, persist user/assistant msgs).
- Agent loop: streamText + stepCountIs(8); read tools searchPages/getPage via a
  per-request DocmostClient over loopback REST under the user's minted access token.
- Gate settings.ai.chat (+ 503 when provider unconfigured); buildSystemPrompt with a
  non-removable safety/anti-prompt-injection framework. Per-user rate limit.

Per-user auth (B1):
- @docmost/mcp DocmostClient gains an additive getToken variant (carry a user JWT,
  re-fetch on 401) and exports DocmostClient; the email/password service-account path
  (external /mcp, stdio) is unchanged.

Agent-edit provenance backbone (B3a):
- Migration: pages/page_history (last_updated_source, last_updated_ai_chat_id) and
  comments (created_source, ai_chat_id, resolved_source).
- Signed actor/aiChatId claim in the collab token; onAuthenticate propagates it,
  onStoreDocument writes it with a sticky agent marker, saveHistory copies it.

Migrations auto-run on boot (additive). Write tools, frontend, RAG and external MCP
servers are not in this checkpoint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 01:36:41 +03:00