The custom undici RetryAgent + aiFetch transport added for issue #140
did not actually heal mid-stream provider drops: undici's retry path is
a Range-based download-resume that SSE/chat-completions endpoints cannot
satisfy, so a reset after the first byte only swapped ECONNRESET for a
"server does not support the range header" error. Its only real effect
was reconnecting a poisoned keep-alive socket before the first byte, and
PR #141 on top of it turned the 60s headers timeout into deterministic
~61s failures (plus CONTENT_LENGTH_MISMATCH from retrying a POST body
after a timeout abort). The root cause is the z.ai coding endpoint, not
our transport.
Remove the whole layer and return all AI provider calls to Node's
default global fetch.
- delete integrations/ai/ai-http.ts and its spec
- ai.service.ts: drop the aiFetch import, the AI_BYPASS_RESILIENT_FETCH
diagnostic toggle, and fetch:aiFetch from every chat/embedding/STT
factory; raw STT call back to global fetch
- ai-chat.controller.ts: drop the stream-timing START log + startedAt
- ai-chat.service.ts: drop the first-chunk/FINISHED/ERROR timing logs
- .env.example: drop AI_BYPASS_RESILIENT_FETCH
Reverts: 1af5d34a, 7c308728, b7abb7ea, 35fc58ea, d6cd2754, 6efb8656.
Preserved (not part of the rollback): client-disconnect abort, title
generation in onFinish, partial-answer persistence, Safari SSE heartbeat.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The streaming chat turn hangs in all browsers while the non-streaming test
endpoint works — both use the same model/transport (createOpenAI + aiFetch),
so the suspect is the streaming path / custom undici RetryAgent transport.
- ai-http.ts: wrap aiFetch with per-request timing logs (start, ms-to-headers
on success, elapsed ms + cause on failure). Chat at info, embeddings at
debug. Only host+path logged.
- ai-chat.controller.ts / ai-chat.service.ts: log turn START, first-chunk
latency, FINISHED duration, and elapsed ms on disconnect/error/abort.
- ai.service.ts: AI_BYPASS_RESILIENT_FETCH=true makes the CHAT model omit
fetch:aiFetch and use the default global fetch — isolates transport vs
request-shape. Chat-only; embeddings/STT untouched; reversible via env.
- .env.example: document the flag.
No timeout/retry change. tsc clean; ai-chat + ai suites pass (292).
A mid-stream connection drop showed a generic "Something went wrong / Load
failed" banner and left no server-side trace.
- error-message: classify the browsers' own fetch-failure strings ("Load
failed" on WebKit, "Failed to fetch" on Chrome, "NetworkError" on Firefox)
as a lost connection, so the banner names the cause instead of the generic
heading.
- ai-chat.controller: log a warning in the request close handler when the
client disconnects before completion, so a drop that reaches the app (e.g. a
reverse proxy cutting the SSE) is visible in the server logs before the abort.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Batches 6-9: behaviour-preserving extractions of testable pure cores plus the
tests they unblock, and a fix for the broken client test environment.
Full suites green: server 113 suites / 1117 + 1 todo, client 30 files / 338.
client (R0 infra):
- vitest.setup.ts: in-memory localStorage/sessionStorage Storage stub wired via
setupFiles. Unblocks menu-items.gating.test.ts (was 9 failing) -> client suite
fully green. + menu-items.suggestions.test.ts (getSuggestionItems filter/sort).
share:
- extract buildShareMetaHtml (share-seo.util.ts) from the SEO controller; tests
for reflected-XSS escaping in <title>/og/twitter meta, noindex, truncation;
extractPageSlugId; updateAttachmentAttr; prepareContentForShare comment-strip
(anonymous-viewer metadata-leak guard).
ai-chat (security extractions):
- selectAccessibleHits: CASL post-filter for semantic search (restricted page in
an accessible space must NOT leak to the agent).
- validateResolvedAddresses: SSRF connect-time guard (block if ANY resolved
address is private).
- resolveAudioFormat: mime whitelist (dead `?? 'webm'` fallback dropped, set
unchanged). + mcp-servers toView header-leak guard, MCP tool namespacing.
collaboration (data-loss area):
- extract computeHistoryJob (pins the "agent delay MUST stay 0" invariant) and
resolveSource. Integration: onAuthenticate read-only matrix (collab auth
bypass), HistoryProcessor (contributor restore on save failure), onStoreDocument
Approach-A boundary snapshot (human revision pinned before agent overwrite).
Reviewed (APPROVE WITH SUGGESTIONS): extractions behaviour-preserving, security
tests mutation-resistant.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Reusable, workspace-shared agent roles for the built-in AI chat. A role is
a named persona (system-prompt instructions) + optional model override; a
chat is bound to a role at creation and applies it every turn.
Backend:
- migration 20260620T120000: ai_agent_roles table + ai_chats.role_id
(FK ON DELETE SET NULL); hand-merged types into db.d.ts/entity.types.ts
(db.d.ts is hand-curated here, full codegen would clobber it).
- core/ai-chat/roles: CRUD module. list = any workspace member; create/
update/delete = admin (Manage Settings ability, like ai-settings/mcp).
All repo queries scoped by workspace_id; soft-delete (deleted_at).
- buildSystemPrompt gains roleInstructions: role REPLACES the persona base
(admin prompt / DEFAULT_PROMPT) but SAFETY_FRAMEWORK + context are always
still appended.
- stream(): role resolved from ai_chats.role_id for existing chats (never
the request body -> no per-turn role swap); body.roleId only on creation.
Disabled (enabled=false) and soft-deleted roles fall back to universal.
- getChatModel(workspaceId, override): role model_config can swap model id /
driver; a driver without configured creds throws 503 with a clear message
naming the driver+role, resolved BEFORE response hijack.
Client:
- new-chat role picker (enabled roles only, default Universal assistant),
roleId sent only on the first message; role badge (emoji+name) in the chat
header and conversation list; admin Agent-roles management section in
Settings -> AI (add/edit/delete, MCP-form pattern).
Tests: ai-chat.prompt.spec (role layering + safety always present, incl.
jailbreak); ai.service.spec (override on unconfigured driver -> 503).
Implements docs/ai-agent-roles-plan.md.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- ai.service: route *.openrouter.ai STT to its JSON+base64
/audio/transcriptions API; keep the OpenAI multipart path (AI SDK) for
OpenAI/self-hosted whisper. Unify transcription behind transcribe().
- /transcribe controller: surface the real provider/transport reason
(describeProviderError) instead of an opaque 500; preserve HttpException.
- testConnection: add an 'stt' capability (silent-WAV probe) + DTO; client
gets a Test endpoint button and status dot on the Voice/STT card.
- useDictation: log full errors to the console and show the real reason
(mic start + transcription paths); handle NotReadable/Abort and missing
mediaDevices.
- docs(CLAUDE.md): require full error logging + specific user-facing messages.
Add push-to-talk voice dictation that transcribes recorded audio on the
server via the workspace's OpenAI-compatible AI provider (Whisper /
gpt-4o-transcribe / self-hosted whisper), then inserts the text.
Backend:
- New `stt_api_key_enc` column + migration; STT creds parity with chat/
embeddings (sttModel/sttBaseUrl/sttApiKey, write-only key, fallbacks to
chat baseUrl/key). Both provider whitelists updated (service + repo).
- AiService.getTranscriptionModel + AiTranscriptionService.
- Gated POST /ai-chat/transcribe (dictation flag → 403, JWT + workspace
scope + throttle, 25MB cap, MIME whitelist, never logs audio/key).
- New `settings.ai.dictation` workspace flag (DTO + service + audit).
Frontend:
- Wire up the Voice/STT settings card (model/base URL/key) and the
Voice-dictation toggle.
- New `features/dictation`: useDictation (MediaRecorder state machine),
MicButton, transcribe service; integrated into the chat composer and a
new editor-toolbar dictation group, both gated by ai.dictation.
WIP checkpoint of the gitmost AI-chat backend (plan stages A + B1 + B3a).
The agent acts under the requesting user's JWT (Docmost CASL enforces page
access); the external service-account /mcp endpoint is untouched.
LLM provider config (A2-A4):
- integrations/crypto: AES-256-GCM SecretBoxService (key derived from APP_SECRET,
per-record salt/iv; clear error on rotation instead of crashing).
- ai_provider_credentials table/repo/types: encrypted API key stored outside
workspace settings/baseFields, write-only (never returned by any endpoint).
- integrations/ai: per-workspace AI SDK v6 provider driver (openai/gemini/ollama),
admin-gated GET(masked)/PATCH(write-only key)/Test endpoints; settings.ai.provider
holds non-secret config incl. systemPrompt. Removed unused AI_* env getters (DB is
the single source of truth).
Chat module (A1, A5-A8):
- ai_chats/ai_chat_messages repos (workspace-scoped, soft-delete, tsv never selected).
- core/ai-chat: CRUD + POST /ai-chat/stream (Fastify hijack + AI SDK v6
pipeUIMessageStreamToResponse, abort on disconnect, persist user/assistant msgs).
- Agent loop: streamText + stepCountIs(8); read tools searchPages/getPage via a
per-request DocmostClient over loopback REST under the user's minted access token.
- Gate settings.ai.chat (+ 503 when provider unconfigured); buildSystemPrompt with a
non-removable safety/anti-prompt-injection framework. Per-user rate limit.
Per-user auth (B1):
- @docmost/mcp DocmostClient gains an additive getToken variant (carry a user JWT,
re-fetch on 401) and exports DocmostClient; the email/password service-account path
(external /mcp, stdio) is unchanged.
Agent-edit provenance backbone (B3a):
- Migration: pages/page_history (last_updated_source, last_updated_ai_chat_id) and
comments (created_source, ai_chat_id, resolved_source).
- Signed actor/aiChatId claim in the collab token; onAuthenticate propagates it,
onStoreDocument writes it with a sticky agent marker, saveHistory copies it.
Migrations auto-run on boot (additive). Write tools, frontend, RAG and external MCP
servers are not in this checkpoint.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>