Commit Graph

10 Commits

Author SHA1 Message Date
claude code agent 227
acf3df9e9d feat(ai): anonymous AI assistant on public shares
Lets an unauthenticated viewer of a published share ask an AI scoped strictly
to that share's page tree. The authenticated agent is untouched; the security
boundary is the tool scope (no identity), and nothing is persisted.

Server:
- workspace toggle settings.ai.publicShareAssistant (default off) +
  optional settings.ai.provider.publicShareChatModel (cheap model id; reuses
  the chat driver/baseUrl/key). getChatModel(workspaceId, override) substitutes
  only the model id, falling back to chatModel.
- POST /api/shares/ai/stream (@Public, SSE). Guardrail funnel, each failing
  before streaming: toggle off -> 404; share missing/wrong-workspace/sharing
  off -> 404; pageId not in share tree -> 404; provider unconfigured -> 503;
  per-IP (5/min) and per-workspace (300/h, IP-independent) rate limits -> 429.
  Uniform 404s never confirm a private page's existence.
- forShare read-only in-process toolset: searchSharePages (existing shareId
  FTS branch, no spaceId/userId), getSharePage (getShareForPage gate +
  share.id check, content via the public sanitizer), listSharePages. No write/
  comment/history/cross-space/external-MCP tools.
- Locked share system prompt + immutable safety block; stepCountIs(5).
- /shares/page-info exposes an aiAssistant flag (gated behind isSharingAllowed).

Client: an ephemeral, text-only Ask-AI widget on the public shared page,
shown only when the flag is set; useChat -> /api/shares/ai/stream,
credentials omit. Admin toggle + model field in Settings -> AI.

Also adds a jest moduleNameMapper for src/-rooted imports (fixes pre-existing
unresolvable specs; additive).

Implements docs/public-share-assistant-plan.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 07:59:56 +03:00
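The guardrail funnel described in acf3df9e9d can be read as a pure function that returns the first failing status, in the order the commit lists the checks. The sketch below is illustrative only: `ShareRequestFacts`, `firstGuardrailFailure`, and the idea of passing in pre-counted request totals are assumptions, not the repository's code.

```typescript
// Hypothetical facts gathered before streaming starts; field names are
// illustrative and not taken from the repository.
interface ShareRequestFacts {
  assistantEnabled: boolean;   // settings.ai.publicShareAssistant
  shareAccessible: boolean;    // share exists, same workspace, sharing on
  pageInShareTree: boolean;    // requested pageId belongs to the share tree
  providerConfigured: boolean; // chat provider is set up
  ipRequestsLastMinute: number;      // requests already counted for this IP
  workspaceRequestsLastHour: number; // requests already counted for the workspace
}

const PER_IP_LIMIT_PER_MINUTE = 5;
const PER_WORKSPACE_LIMIT_PER_HOUR = 300;

// Returns the HTTP status of the first failing check, or null when the
// request may proceed to streaming. The three 404 cases are deliberately
// indistinguishable to the caller.
function firstGuardrailFailure(f: ShareRequestFacts): 404 | 503 | 429 | null {
  if (!f.assistantEnabled) return 404;
  if (!f.shareAccessible) return 404;
  if (!f.pageInShareTree) return 404;
  if (!f.providerConfigured) return 503;
  if (f.ipRequestsLastMinute >= PER_IP_LIMIT_PER_MINUTE) return 429;
  if (f.workspaceRequestsLastHour >= PER_WORKSPACE_LIMIT_PER_HOUR) return 429;
  return null;
}
```

Because every check runs before the first byte is streamed, a rejected request never opens an SSE response.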
vvzvlad
01a5a4b5d2 refactor(ai): explicit STT request format instead of OpenRouter host-sniffing
Replace the implicit `hostname endsWith openrouter.ai` detection with an
explicit, admin-chosen provider field `sttApiStyle` ('multipart' = OpenAI-
compatible multipart /audio/transcriptions; 'json' = OpenRouter-style JSON +
base64 input_audio). The transcription path now branches on the stored field,
not on the URL — nothing hidden from the admin.

- ai.types: add SttApiStyle + STT_API_STYLES; field on AiProviderSettings and
  MaskedAiSettings (resolved via ResolvedAiConfig).
- update-ai-settings.dto: validate sttApiStyle with @IsIn(STT_API_STYLES).
- ai-settings.service: plumb sttApiStyle through resolve()/getMasked() and the
  non-secret update whitelist; workspace.repo: add it to the ALLOWED array so it
  persists.
- ai.service: drop isOpenRouter(); transcribe() branches on cfg.sttApiStyle;
  rename helper to transcribeJsonBase64 with provider-neutral error text and a
  BadRequestException (400) when the base URL is missing for the JSON style.
- client: SttApiStyle type on IAiSettings/IAiSettingsUpdate; "Request format"
  Select on the Voice/STT settings card; i18n.
2026-06-18 19:40:05 +03:00
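The explicit branch introduced in 01a5a4b5d2 might look roughly like the sketch below. The `SttApiStyle` values come from the commit message; `SttRequestPlan` and `planSttRequest` are hypothetical names used only to show that the decision depends on the stored field and never on the host name.

```typescript
// Values mirror the commit message; everything else here is illustrative.
const STT_API_STYLES = ["multipart", "json"] as const;
type SttApiStyle = (typeof STT_API_STYLES)[number];

interface SttRequestPlan {
  url: string | null; // null: no base URL stored, endpoint left to the caller
  contentType: "multipart/form-data" | "application/json";
  audioEncoding: "binary" | "base64";
}

function planSttRequest(style: SttApiStyle, baseUrl?: string): SttRequestPlan {
  // Strip trailing slashes so the path can be appended safely.
  const root = baseUrl ? baseUrl.replace(/\/+$/, "") : "";
  const url = root ? `${root}/audio/transcriptions` : null;

  if (style === "json") {
    // The commit reports a 400 in this case; a plain Error stands in here.
    if (!url) {
      throw new Error("STT base URL is required for the JSON request format");
    }
    return { url, contentType: "application/json", audioEncoding: "base64" };
  }
  return { url, contentType: "multipart/form-data", audioEncoding: "binary" };
}
```

An admin pointing a self-hosted whisper server at an unusual host gets the multipart path simply by leaving the field at `multipart`, which host-sniffing could not guarantee.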
vvzvlad
77249d59c6 feat(ai): OpenRouter STT support + real error surfacing + STT endpoint test
- ai.service: route *.openrouter.ai STT to its JSON+base64
  /audio/transcriptions API; keep the OpenAI multipart path (AI SDK) for
  OpenAI/self-hosted whisper. Unify transcription behind transcribe().
- /transcribe controller: surface the real provider/transport reason
  (describeProviderError) instead of an opaque 500; preserve HttpException.
- testConnection: add an 'stt' capability (silent-WAV probe) + DTO; client
  gets a Test endpoint button and status dot on the Voice/STT card.
- useDictation: log full errors to the console and show the real reason
  (mic start + transcription paths); handle NotReadable/Abort and missing
  mediaDevices.
- docs(CLAUDE.md): require full error logging + specific user-facing messages.
2026-06-18 19:26:35 +03:00
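The silent-WAV probe mentioned in 77249d59c6 needs a small, valid audio payload. One way to build it is shown below as a sketch; the duration, sample rate, and the `buildSilentWav` name are assumptions, and the repository may construct its probe differently.

```typescript
// Builds a mono 16-bit PCM WAV file that contains only silence.
function buildSilentWav(durationMs = 250, sampleRate = 16000): Uint8Array {
  const bytesPerSample = 2;
  const sampleCount = Math.floor((sampleRate * durationMs) / 1000);
  const dataSize = sampleCount * bytesPerSample;

  // The array is zero-filled, so the sample region is already silent.
  const bytes = new Uint8Array(44 + dataSize);
  const view = new DataView(bytes.buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) bytes[offset + i] = text.charCodeAt(i);
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // remaining file size
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size
  view.setUint16(20, 1, true); // audio format: PCM
  view.setUint16(22, 1, true); // channels: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);
  return bytes;
}
```

A probe like this exercises authentication, routing, and request format without sending any user audio.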
vvzvlad
874bdd021c feat(ai): server-side voice dictation (STT) with mic in chat and editor
Add push-to-talk voice dictation that transcribes recorded audio on the
server via the workspace's OpenAI-compatible AI provider (Whisper /
gpt-4o-transcribe / self-hosted whisper), then inserts the text.

Backend:
- New `stt_api_key_enc` column + migration; STT creds parity with chat/
  embeddings (sttModel/sttBaseUrl/sttApiKey, write-only key, fallbacks to
  chat baseUrl/key). Both provider whitelists updated (service + repo).
- AiService.getTranscriptionModel + AiTranscriptionService.
- Gated POST /ai-chat/transcribe (dictation flag → 403, JWT + workspace
  scope + throttle, 25MB cap, MIME whitelist, never logs audio/key).
- New `settings.ai.dictation` workspace flag (DTO + service + audit).

Frontend:
- Wire up the Voice/STT settings card (model/base URL/key) and the
  Voice-dictation toggle.
- New `features/dictation`: useDictation (MediaRecorder state machine),
  MicButton, transcribe service; integrated into the chat composer and a
  new editor-toolbar dictation group, both gated by ai.dictation.
2026-06-18 18:45:33 +03:00
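The upload gates listed for POST /ai-chat/transcribe in 874bdd021c (dictation flag, 25MB cap, MIME whitelist) can be sketched as one validation step. The whitelist contents, the reading of 25MB as 25 * 1024 * 1024 bytes, and all names below are assumptions; the commit only states the 403 for a disabled flag, so the other rejections are returned as reasons rather than status codes.

```typescript
// Hypothetical whitelist; the commit does not list the accepted types.
const ALLOWED_AUDIO_TYPES = new Set([
  "audio/webm",
  "audio/ogg",
  "audio/wav",
  "audio/mpeg",
  "audio/mp4",
]);
const MAX_AUDIO_BYTES = 25 * 1024 * 1024; // assumed interpretation of "25MB"

type UploadCheck =
  | { ok: true }
  | { ok: false; reason: "dictation-disabled" | "too-large" | "unsupported-type" };

function checkDictationUpload(
  dictationEnabled: boolean,
  mimeType: string,
  sizeBytes: number,
): UploadCheck {
  if (!dictationEnabled) return { ok: false, reason: "dictation-disabled" }; // 403 per the commit
  if (sizeBytes > MAX_AUDIO_BYTES) return { ok: false, reason: "too-large" };

  // MediaRecorder often reports parameters, e.g. "audio/webm;codecs=opus".
  const baseType = (mimeType.split(";")[0] ?? "").trim().toLowerCase();
  if (!ALLOWED_AUDIO_TYPES.has(baseType)) return { ok: false, reason: "unsupported-type" };
  return { ok: true };
}
```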
vvzvlad
87d6bdfbd9 feat(ai): redesign AI settings page with per-endpoint test buttons
Rebuild the workspace AI settings page into card-based "Endpoints"
(Chat / Embeddings / Voice) matching the new design, and split the
single connection test into independent per-endpoint Test buttons.

- server: testConnection(workspaceId, capability) probes only the
  requested capability ('chat' | 'embeddings'); add TestAiConnectionDto
  and wire it through the /workspace/ai-settings/test controller
- client: testAiConnection(capability) + capability-typed mutation; two
  independent test mutation instances so Chat/Embeddings results are isolated
- client: full rewrite of ai-provider-settings into Endpoints section —
  drop the provider dropdown (driver is always openai, base URL + key
  always shown), move the "AI chat" toggle and surface the "Semantic search"
  toggle in the card headers, put the system message behind an Edit modal,
  pgvector/reindex footer, and a disabled Voice/STT stub
- client: restyle external MCP tools and the MCP server section; collapse
  the AI sections in workspace-settings; remove the standalone
  ai-chat-settings component
- toggles now surface the server error message (e.g. missing pgvector)
- i18n: add new English strings

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 04:20:33 +03:00
vvzvlad
80c900eb54 fix(ai): make RAG indexer observable and bound hung embedding calls
The bulk embedding reindex could hang on a single page forever
("Indexed 27 of 34 pages") with zero log output:
- all progress logs were debug-level, suppressed in production (pino info);
- embedMany() had no timeout, so a slow/hung embeddings endpoint blocked
  the sequential per-page loop indefinitely.

Changes:
- ai.service.embedTexts: bound embedMany with AbortSignal.timeout
  (configurable via AI_EMBEDDING_TIMEOUT_MS, default 120000ms); on timeout
  throw a clear, greppable message, classified by both signal.aborted and
  the error name (TimeoutError/AbortError/ResponseAborted) so a real
  provider error racing the timer keeps its diagnostics.
- embedding-indexer.reindexWorkspace: promote lifecycle/progress logs to
  info; log "[i/N] indexing page <id>" BEFORE the await so a hang names the
  stuck page; warn on slow pages (>30s); add timing + final summary.
- .env.example: document AI_EMBEDDING_TIMEOUT_MS.
2026-06-18 03:07:02 +03:00
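The timeout handling in 80c900eb54 combines a bounded call with a two-part classification. The sketch below assumes a generic `call(signal)` in place of `embedMany`, and the helper names are hypothetical; only `AbortSignal.timeout`, the env variable name, the default, and the three error names come from the commit message.

```typescript
const DEFAULT_EMBEDDING_TIMEOUT_MS = 120_000;
const TIMEOUT_ERROR_NAMES = new Set(["TimeoutError", "AbortError", "ResponseAborted"]);

// Parses AI_EMBEDDING_TIMEOUT_MS, falling back to the default on bad input.
function resolveEmbeddingTimeoutMs(raw: string | undefined): number {
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_EMBEDDING_TIMEOUT_MS;
}

// A failure counts as a timeout only when the signal fired AND the error
// looks like an abort, so a provider error that races the timer is kept.
function isEmbeddingTimeout(err: unknown, signal: AbortSignal): boolean {
  const name =
    typeof err === "object" && err !== null && "name" in err
      ? String((err as { name: unknown }).name)
      : "";
  return signal.aborted && TIMEOUT_ERROR_NAMES.has(name);
}

// Hypothetical wrapper: `call` stands in for the real embedding request.
async function withEmbeddingTimeout<T>(
  call: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const signal = AbortSignal.timeout(timeoutMs);
  try {
    return await call(signal);
  } catch (err) {
    if (isEmbeddingTimeout(err, signal)) {
      throw new Error(`Embedding request timed out after ${timeoutMs}ms`);
    }
    throw err;
  }
}
```

Checking the error name alone would mislabel a provider-side abort, and checking `signal.aborted` alone would swallow a real HTTP error that arrived just as the timer fired.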
vvzvlad
b46aed53e3 feat(ai): surface provider error bodies + probe embeddings in test connection
A misconfigured embeddings endpoint failed the RAG indexer with an opaque
"Invalid JSON response" and was not caught by "Test connection" (which only
probed the chat model), so it only surfaced silently during background
indexing.

- add describeProviderError(): formats AI SDK errors as
  "<statusCode>: <message> | response body: <truncated one-line snippet>"
  (statusCode/message/responseBody never carry the API key)
- use it in the bulk-reindex catch and the embedding processor's formatter so
  the real cause (e.g. an HTML 404 from a wrong base URL) is visible in logs
- testConnection now probes chat AND embeddings independently: skips a probe
  when that capability is unconfigured, returns ok:false with a Chat:/Embeddings:
  prefix on real failure, "not configured" when neither is set
2026-06-18 02:35:01 +03:00
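The message format quoted in b46aed53e3 can be reproduced with a small formatter. This is a sketch of the format only, not the repository's `describeProviderError`: the `ProviderErrorLike` shape, the `formatProviderError` name, and the 200-character truncation length are assumptions.

```typescript
// Fields an AI SDK error may carry, per the commit message.
interface ProviderErrorLike {
  statusCode?: number;
  message?: string;
  responseBody?: string;
}

const MAX_BODY_SNIPPET = 200; // hypothetical truncation length

function formatProviderError(err: unknown): string {
  const e = (typeof err === "object" && err !== null ? err : {}) as ProviderErrorLike;
  const message = e.message ?? String(err);
  const head = e.statusCode !== undefined ? `${e.statusCode}: ${message}` : message;
  if (!e.responseBody) return head;

  // Collapse newlines and indentation so the log entry stays on one line.
  const oneLine = e.responseBody.replace(/\s+/g, " ").trim();
  const snippet =
    oneLine.length > MAX_BODY_SNIPPET ? `${oneLine.slice(0, MAX_BODY_SNIPPET)}...` : oneLine;
  return `${head} | response body: ${snippet}`;
}
```

For an HTML 404 returned by a wrong base URL, the formatter keeps the markup visible in the log line, which is the case the commit calls out.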
vvzvlad
a7f244053b feat(ai): separate base URL and API key for chat vs embedding model
Per-workspace AI provider config previously shared a single base URL and
a single API key between the chat model and the embedding model. Add
dedicated, optional embedding endpoint/token that fall back to the chat
values when empty, preserving backward compatibility.

- db: new migration adds nullable `embedding_api_key_enc` to
  `ai_provider_credentials`; chat key stays in `api_key_enc`
- repo: add `upsertEmbeddingKey` / `clearEmbeddingKey` (on-conflict
  touches only its own column, so chat/embedding keys never overwrite)
- ai-settings.service: store non-secret `embeddingBaseUrl`; resolve()
  applies fallback (embeddingBaseUrl || baseUrl; embedding key || chat
  key); getMasked() exposes raw `embeddingBaseUrl` + `hasEmbeddingApiKey`,
  never the key; update() handles the embedding key write-only
- ai.service: getEmbeddingModel() builds openai/gemini/ollama with the
  embedding-specific URL/key; chat path unchanged
- client: new "Embedding base URL" and "Embedding API key" fields with
  fallback hints and a clear-key action

Requires running the DB migration on deploy.
2026-06-18 01:33:45 +03:00
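The fallback rule stated in a7f244053b (embeddingBaseUrl || baseUrl; embedding key || chat key) is small enough to show directly. The `StoredAiConfig` shape and `resolveEmbeddingEndpoint` name are hypothetical; the actual `resolve()` in ai-settings.service handles more fields.

```typescript
// Illustrative shape; keys are shown decrypted for simplicity.
interface StoredAiConfig {
  baseUrl?: string;
  apiKey?: string;
  embeddingBaseUrl?: string;
  embeddingApiKey?: string;
}

interface EmbeddingEndpoint {
  baseUrl: string | undefined;
  apiKey: string | undefined;
}

// Empty strings count as "not set", so `||` is used rather than `??`.
function resolveEmbeddingEndpoint(cfg: StoredAiConfig): EmbeddingEndpoint {
  return {
    baseUrl: cfg.embeddingBaseUrl || cfg.baseUrl,
    apiKey: cfg.embeddingApiKey || cfg.apiKey,
  };
}
```

Each field falls back on its own, so a workspace can override only the embedding URL while still reusing the chat key.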
vvzvlad
a4b7919753 fix(ai-chat): OpenAI Chat Completions for multi-turn + provider settings, stream UX & errors
Live-stand fixes (OpenRouter / OpenAI-compatible):
- openai provider: use .chat() (Chat Completions) instead of the default callable
  (Responses API), which gateways reject on multi-turn -> 400.
- updateAiProviderSettings: assemble settings.ai.provider via jsonb_build_object
  with ::text-cast bound params + jsonb_typeof self-heal (postgres.js was
  double-encoding it into an array; the ::text cast avoids 'could not determine
  data type of parameter').
- chat agent: drop the hard maxOutputTokens cap (truncated complex tool calls);
  keep a tiny cap only on the test-connection ping.
- testConnection + chat stream: surface the real provider error (statusCode+message)
  to logs and the UI instead of generic masks; never log the API key.
- chat UI: typing indicator, incremental streaming render, tool 'running' status, Stop.

Also bundled (prior uncommitted ai-chat work):
- history 'AI agent' provenance badge; vector RAG (pgvector image + page_embeddings
  + AI_QUEUE indexer + space-scoped semanticSearch); external MCP servers backend
  (@ai-sdk/mcp client, SSRF IP-pinning, encrypted headers, admin CRUD/Test);
  yjs duplicate-instance fix via pnpm patch (single CJS instance server-side).
2026-06-17 04:28:29 +03:00
vvzvlad
683da7a4c5 feat(ai-chat): per-user AI agent backend — LLM config, read-only agent, provenance schema
WIP checkpoint of the gitmost AI-chat backend (plan stages A + B1 + B3a).
The agent acts under the requesting user's JWT (Docmost CASL enforces page
access); the external service-account /mcp endpoint is untouched.

LLM provider config (A2-A4):
- integrations/crypto: AES-256-GCM SecretBoxService (key derived from APP_SECRET,
  per-record salt/iv; clear error on rotation instead of crashing).
- ai_provider_credentials table/repo/types: encrypted API key stored outside
  workspace settings/baseFields, write-only (never returned by any endpoint).
- integrations/ai: per-workspace AI SDK v6 provider driver (openai/gemini/ollama),
  admin-gated GET(masked)/PATCH(write-only key)/Test endpoints; settings.ai.provider
  holds non-secret config incl. systemPrompt. Removed unused AI_* env getters (DB is
  the single source of truth).

Chat module (A1, A5-A8):
- ai_chats/ai_chat_messages repos (workspace-scoped, soft-delete, tsv never selected).
- core/ai-chat: CRUD + POST /ai-chat/stream (Fastify hijack + AI SDK v6
  pipeUIMessageStreamToResponse, abort on disconnect, persist user/assistant msgs).
- Agent loop: streamText + stepCountIs(8); read tools searchPages/getPage via a
  per-request DocmostClient over loopback REST under the user's minted access token.
- Gate settings.ai.chat (+ 503 when provider unconfigured); buildSystemPrompt with a
  non-removable safety/anti-prompt-injection framework. Per-user rate limit.

Per-user auth (B1):
- @docmost/mcp DocmostClient gains an additive getToken variant (carry a user JWT,
  re-fetch on 401) and exports DocmostClient; the email/password service-account path
  (external /mcp, stdio) is unchanged.

Agent-edit provenance backbone (B3a):
- Migration: pages/page_history (last_updated_source, last_updated_ai_chat_id) and
  comments (created_source, ai_chat_id, resolved_source).
- Signed actor/aiChatId claim in the collab token; onAuthenticate propagates it,
  onStoreDocument writes it with a sticky agent marker, saveHistory copies it.

Migrations auto-run on boot (additive). Write tools, frontend, RAG and external MCP
servers are not in this checkpoint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 01:36:41 +03:00
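The SecretBoxService idea from 683da7a4c5 (AES-256-GCM, key derived from APP_SECRET, per-record salt/iv, a clear error after rotation) can be sketched with Node's built-in crypto module. The key derivation function is not named in the commit, so scrypt is an assumption here, as are the byte layout and the `sealSecret` / `openSecret` names.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

const SALT_LEN = 16;
const IV_LEN = 12; // 96-bit nonce, the usual size for GCM
const TAG_LEN = 16;

// Encrypts with a fresh salt and IV per record; layout is salt|iv|tag|body.
function sealSecret(plaintext: string, appSecret: string): string {
  const salt = randomBytes(SALT_LEN);
  const iv = randomBytes(IV_LEN);
  const key = scryptSync(appSecret, salt, 32); // assumed KDF
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([salt, iv, cipher.getAuthTag(), body]).toString("base64");
}

// Decrypts a sealed record; a wrong APP_SECRET fails the GCM tag check and
// is reported with an explicit message instead of a raw crypto error.
function openSecret(sealed: string, appSecret: string): string {
  const raw = Buffer.from(sealed, "base64");
  const salt = raw.subarray(0, SALT_LEN);
  const iv = raw.subarray(SALT_LEN, SALT_LEN + IV_LEN);
  const tag = raw.subarray(SALT_LEN + IV_LEN, SALT_LEN + IV_LEN + TAG_LEN);
  const body = raw.subarray(SALT_LEN + IV_LEN + TAG_LEN);
  const key = scryptSync(appSecret, salt, 32);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  try {
    return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
  } catch {
    throw new Error("Cannot decrypt stored secret: APP_SECRET may have been rotated");
  }
}
```

Since the salt and IV are random per record, sealing the same API key twice yields different ciphertexts, and the authentication tag is what turns a rotated secret into a detectable failure.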