The "one active run per chat" guard was bypassable under a race. Two
simultaneous POST /ai-chat/stream on the same chat both passed the
controller's pre-hijack 409 check (a check-then-act TOCTOU), then the
loser's INSERT into ai_chat_runs hit the partial unique index
(ai_chat_runs_one_active_per_chat, 23505). That error was SWALLOWED, so
the second turn streamed UNTRACKED: no runId, not targetable by /stop,
and (autonomousRuns on) onClose won't abort it -> an orphan unstoppable
run that also spends provider tokens.
Make the unique-index INSERT the authoritative gate:
- AiChatRunService.beginRun: when the run-row INSERT fails with a 23505 on
ONE_ACTIVE_RUN_PER_CHAT_INDEX (via isUniqueViolation/violatedConstraint),
no longer swallow it -> throw a distinct RunAlreadyActiveError. Any other
error (incl. a 23505 on a different constraint) propagates unchanged.
- AiChatService.stream: when begin throws RunAlreadyActiveError, reject the
turn with a 409 ConflictException (code A_RUN_ALREADY_ACTIVE) BEFORE any
AI/provider call -> no tokens spent, no untracked turn. Other begin
failures keep the legacy best-effort fallback (stream socket-bound).
- ai-chat.controller: post-hijack catch honors an HttpException's real
status/body (clean 409) instead of a blanket 500, since the race 409 is
raised before a byte is written. Pre-check 409 now carries the same code.
The controller's cheap pre-check stays as a fast-path for the common
sequential double-submit; the INSERT violation is the race-safe backstop.
Tests: ai-chat-run.service.spec proves beginRun throws RunAlreadyActiveError
on the active-index 23505 (and only that constraint), leaks no controller,
and an integration-style two-concurrent-begins test where exactly one wins;
new ai-chat.service.run-race.spec proves stream rejects with a 409
ConflictException BEFORE any streamText/generateText and never persists an
untracked turn. The latter fails without the fix.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make an agent turn a first-class, server-side RUN that keeps executing and
persisting its steps after the browser window closes, and that a later client
can reconnect to — the core invariant of #184. Phase 1 only; the full proposal
(cross-process BullMQ runner, resumable live-tail transport, autonomy triggers,
budgets, history compaction) is explicitly deferred.
What lands:
- `ai_chat_runs` lifecycle table + repo: the run as a persistent object
(status pending->running->succeeded|failed|aborted, trigger, createdBy,
assistantMessageId projection link, error, step_count, timings). A partial
unique index enforces ONE ACTIVE run per chat; a startup sweep recovers
dangling runs (mirrors #183's sweepStreaming).
- AiChatRunService: owns the run lifecycle + an in-memory abort registry. The
abort is governed by the RUN (an explicit user stop), NOT the HTTP socket —
so a browser disconnect no longer ends the turn. Reuses #183's socket-
independent durable write path (consumeStream + flushAssistant) unchanged.
- Controller, behind `settings.ai.autonomousRuns`: /stream wraps the turn in a
run and does NOT abort on disconnect (logs only); a clean 409 rejects a
concurrent run on the same chat; new POST /ai-chat/stop (explicit stop) and
POST /ai-chat/run (reconnect -> latest persisted run + its projection). The
runId is surfaced on the streamed start metadata. Flag OFF = byte-for-byte
legacy behavior.
Tests: AiChatRunService unit spec (lifecycle, disconnect != stop, explicit
stop aborts the signal, best-effort sweeps); ai_chat_runs integration spec
(one-active-run index, detached persist+reconnect with no subscriber, explicit
stop, stale-run sweep). Server tsc + build clean; touched jest green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Remove the separate, un-toggleable `settings.ai.generative` workspace flag
(and its write-side alias `generativeAi`) along with the dead "Ask AI"
generative editor menu, and re-gate the AI page-title generation on the
general AI chat flag (`settings.ai.chat`) — the same toggle that enables
the chat agent and the chat stream endpoint.
Why: the `generative` flag had no UI toggle (its switch was already removed,
leaving orphaned i18n strings), so the title-generation button was
unreachable on self-hosted. The "Ask AI" menu was dead — its atom was never
rendered. Consolidating onto the AI chat flag makes the title button follow
the one AI switch users actually have.
Changes:
- server: title-gen endpoint gate generative -> chat (ai-chat.controller.ts);
remove generativeAi from update DTO and workspace service (update block,
delete line, cloud default now { ai: { chat: true } }); fix repo comment;
migrate generate-page-title spec assertions generative -> chat.
- client: title-gen gate -> settings.ai.chat (full-editor.tsx); remove the
dead Ask AI button + showAiMenu wiring from bubble-menu; remove AskAiGroup
usage/import and commented block from fixed-toolbar; delete ask-ai-group.tsx;
remove showAiMenuAtom; drop generative/generativeAi from workspace types.
- i18n: remove 3 orphaned generative-AI keys from all 12 locales.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
On opening the floating AI-chat window from the header on a document page,
auto-open the LAST chat bound to that document. Binding reuses the existing
ai_chats.page_id (no migration): the bound chat is the requesting user's
most-recent non-deleted chat created on that page, so a new chat on the page
becomes the bound one for free. Resolution happens only on a genuine
closed -> open transition; the provenance badge deep-link is untouched.
Server: AiChatRepo.findLatestByPage + POST /ai-chat/bound-chat (BoundChatDto),
both read-only and owner/workspace-scoped.
Client: getBoundChat service + useOpenAiChatForCurrentPage hook wired into the
app-header entry point (fail-soft to a fresh chat; draft/role cleared only on a
real switch).
Tests: repo scoping/ordering, controller wiring, and hook behavior.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add an AI button in the page byline that generates a note's title from the
live editor content (including unsaved edits) and applies it immediately.
Server: one-shot, non-streaming POST /ai-chat/generate-page-title mirroring
the chat generateTitle path — gated by settings.ai.generative, throttled via
AI_CHAT_THROTTLER, resolves the workspace chat model and returns { title }.
The endpoint never touches the page; the client applies the title through the
existing /pages/update route (which enforces edit permission).
Client: ai-chat-service.generatePageTitle, a useGeneratePageTitle hook that
converts the editor HTML to markdown, calls the endpoint, applies the title
via updateTitle + updatePageData, reflects it in the unfocused title editor,
and broadcasts the UpdateEvent (mirroring TitleEditor.saveTitle). A sparkles
button (GenerateTitleGroup) renders next to dictation, edit-mode + flag gated.
Tests: pure cleanGeneratedTitle helper + controller gate/delegation/error-map.
i18n: en-US + ru-RU strings.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
15-point review of the persistent-history PR. Architecture decisions: crash
recovery = recency threshold; tool-label duplication = leave as-is.
Must-fix:
1. Boot-sweep bounded by recency. sweepStreaming now also requires
`updatedAt < now() - SWEEP_STREAMING_STALE_MS` (10 min), so a fresh replica's
startup sweep can't abort a turn another replica is actively streaming
(multi-instance deploy). Int-spec: a FRESH 'streaming' row is NOT swept, a
STALE one IS.
2. Restore export during the FIRST streaming turn of a new chat (#174). The
server chatId is now adopted EARLY (in-place, on the start-chunk metadata) via
a new `onServerChatId` callback wired through use-chat-session → chat-thread,
so `activeChatId` is set at turn start and the Copy button is live mid-first-
turn (canExport = !!activeChatId). Hook tests for early/in-place/no-op adopt.
3. Cover finalizeAssistant's fallback-insert branch: extracted pure
`planFinalizeAssistant(assistantId)` (update when id present, insert when the
upfront insert failed) + a dispatch harness test for both arms.
Tests: onModuleInit lifecycle spec (sweep called; throw → resolves + warns);
int-spec updatedAt assertion → toBeGreaterThan.
Cleanups: cap findAllByChat at 5000 rows; upfront-insert-failure log carries
chatId+workspaceId; removed the now-dead buildPartialAssistantRecord (only the
spec consumed it; shapes still pinned by the flushAssistant suite); controller
passes `lang: dto.lang` (normalizeLang handles undefined); dropped a no-op
`?? undefined` in errorOf; documented the content-column semantics change
(concatenated step text, UI renders from metadata.parts); CHANGELOG [Unreleased]
entry (#183, #174); reworded the stale LABELS parity comment.
Verified: server build + 323 ai-chat unit + 5 integration; client tsc + 160
ai-chat unit; prettier clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The chat lived in inconsistent paradigms (in-memory stream + client export vs.
DB-as-context), which made export flaky and lost the assistant answer if the
process died mid-turn. Make the DB the single source of truth.
A. STEP-GRANULAR DURABILITY (server)
- ai_chat_messages gains a nullable `status` column (migration; NULL = legacy =
completed). The assistant row is now INSERTED UPFRONT as `status:'streaming'`
and UPDATEd on every onStepFinish with all finished steps (text + tool calls +
tool RESULTS), then finalized once to completed/error/aborted on the terminal
callback. So a process death mid-turn keeps every finished step; a startup
sweep (OnModuleInit → sweepStreaming) flips any dangling 'streaming' row to
'aborted'. The write path no longer depends on a live socket.
- Pure exported `flushAssistant(steps, inProgressText, status, extra?)` builds
the persist payload (metadata.parts byte-identical to the old builder), so a
future background worker can call the same path. AiChatMessageRepo gains
`update`, `sweepStreaming`, and `findAllByChat`.
- consumeStream drain, external-MCP client close-once, SSE heartbeat preserved.
B. SERVER-SIDE EXPORT
- New pure `chat-markdown.util.ts` renders Markdown from DB rows ONLY (server
port of the client builder). Because A persists the in-progress row, the
export now includes an interrupted turn up to its last finished step (flagged
"still generating"). `POST /ai-chat/export` (owner-gated via assertOwnedChat,
workspace-scoped) returns it; `lang` accepts a full client locale tag
('en-US'/'ru-RU') and is normalized server-side (normalizeLang) — a strict
@IsIn(['en','ru']) DTO rejected the real client's i18n.language with a 400,
caught in real-browser testing.
- Client: handleCopy calls the endpoint; `canExport = !!activeChatId`. The whole
liveThreadRef/liveStateRef/onLiveContentChange/hasLiveContent hybrid (and the
client chat-markdown util + test) is removed — the server is now authoritative.
Tests: flushAssistant unit (status shapes + parts parity), chat-markdown.util
unit (incl. legacy NULL-status + interrupted note + ru + normalizeLang locale
tags), controller export wiring + owner-gate, integration update/sweepStreaming.
Verified: server build + 318 ai-chat unit + 3 integration; client tsc + 157
ai-chat unit; and END-TO-END in a real browser — a chat turn persists mid-stream
and the Copy button exports the DB-sourced markdown (showing the in-progress
row), HTTP 200 after the locale fix.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Investigate the Safari-only "Lost connection to the AI provider" mid-stream
disconnect (Chrome unaffected). Pure instrumentation, no behavior change:
the 15s heartbeat interval and all stream callbacks are unchanged.
- sse-resilience.ts: startSseHeartbeat() gains an optional onBeat hook fired
after each successfully written ping (beat counter).
- ai-chat.service.ts: track stream start, first-chunk latency, model-silent
gap and heartbeat count; log them on finish/error/abort to classify the
drop (idle-gap vs hard wall-clock cap vs slow first chunk).
- ai-chat.controller.ts: append elapsed-since-request to the disconnect warn.
All blocks tagged "DIAGNOSTIC ... temporary" for easy removal once the Safari
failure mode is identified.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The custom undici RetryAgent + aiFetch transport added for issue #140
did not actually heal mid-stream provider drops: undici's retry path is
a Range-based download-resume that SSE/chat-completions endpoints cannot
satisfy, so a reset after the first byte only swapped ECONNRESET for a
"server does not support the range header" error. Its only real effect
was reconnecting a poisoned keep-alive socket before the first byte, and
PR #141 on top of it turned the 60s headers timeout into deterministic
~61s failures (plus CONTENT_LENGTH_MISMATCH from retrying a POST body
after a timeout abort). The root cause is the z.ai coding endpoint, not
our transport.
Remove the whole layer and return all AI provider calls to Node's
default global fetch.
- delete integrations/ai/ai-http.ts and its spec
- ai.service.ts: drop the aiFetch import, the AI_BYPASS_RESILIENT_FETCH
diagnostic toggle, and fetch:aiFetch from every chat/embedding/STT
factory; raw STT call back to global fetch
- ai-chat.controller.ts: drop the stream-timing START log + startedAt
- ai-chat.service.ts: drop the first-chunk/FINISHED/ERROR timing logs
- .env.example: drop AI_BYPASS_RESILIENT_FETCH
Reverts: 1af5d34a, 7c308728, b7abb7ea, 35fc58ea, d6cd2754, 6efb8656.
Preserved (not part of the rollback): client-disconnect abort, title
generation in onFinish, partial-answer persistence, Safari SSE heartbeat.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The streaming chat turn hangs in all browsers while the non-streaming test
endpoint works — both use the same model/transport (createOpenAI + aiFetch),
so the suspect is the streaming path / custom undici RetryAgent transport.
- ai-http.ts: wrap aiFetch with per-request timing logs (start, ms-to-headers
on success, elapsed ms + cause on failure). Chat at info, embeddings at
debug. Only host+path logged.
- ai-chat.controller.ts / ai-chat.service.ts: log turn START, first-chunk
latency, FINISHED duration, and elapsed ms on disconnect/error/abort.
- ai.service.ts: AI_BYPASS_RESILIENT_FETCH=true makes the CHAT model omit
fetch:aiFetch and use the default global fetch — isolates transport vs
request-shape. Chat-only; embeddings/STT untouched; reversible via env.
- .env.example: document the flag.
No timeout/retry change. tsc clean; ai-chat + ai suites pass (292).
A mid-stream connection drop showed a generic "Something went wrong / Load
failed" banner and left no server-side trace.
- error-message: classify the browsers' own fetch-failure strings ("Load
failed" on WebKit, "Failed to fetch" on Chrome, "NetworkError" on Firefox)
as a lost connection, so the banner names the cause instead of the generic
heading.
- ai-chat.controller: log a warning in the request close handler when the
client disconnects before completion, so a drop that reaches the app (e.g. a
reverse proxy cutting the SSE) is visible in the server logs before the abort.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Batches 6-9: behaviour-preserving extractions of testable pure cores plus the
tests they unblock, and a fix for the broken client test environment.
Full suites green: server 113 suites / 1117 + 1 todo, client 30 files / 338.
client (R0 infra):
- vitest.setup.ts: in-memory localStorage/sessionStorage Storage stub wired via
setupFiles. Unblocks menu-items.gating.test.ts (was 9 failing) -> client suite
fully green. + menu-items.suggestions.test.ts (getSuggestionItems filter/sort).
share:
- extract buildShareMetaHtml (share-seo.util.ts) from the SEO controller; tests
for reflected-XSS escaping in <title>/og/twitter meta, noindex, truncation;
extractPageSlugId; updateAttachmentAttr; prepareContentForShare comment-strip
(anonymous-viewer metadata-leak guard).
ai-chat (security extractions):
- selectAccessibleHits: CASL post-filter for semantic search (restricted page in
an accessible space must NOT leak to the agent).
- validateResolvedAddresses: SSRF connect-time guard (block if ANY resolved
address is private).
- resolveAudioFormat: mime whitelist (dead `?? 'webm'` fallback dropped, set
unchanged). + mcp-servers toView header-leak guard, MCP tool namespacing.
collaboration (data-loss area):
- extract computeHistoryJob (pins the "agent delay MUST stay 0" invariant) and
resolveSource. Integration: onAuthenticate read-only matrix (collab auth
bypass), HistoryProcessor (contributor restore on save failure), onStoreDocument
Approach-A boundary snapshot (human revision pinned before agent overwrite).
Reviewed (APPROVE WITH SUGGESTIONS): extractions behaviour-preserving, security
tests mutation-resistant.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Reusable, workspace-shared agent roles for the built-in AI chat. A role is
a named persona (system-prompt instructions) + optional model override; a
chat is bound to a role at creation and applies it every turn.
Backend:
- migration 20260620T120000: ai_agent_roles table + ai_chats.role_id
(FK ON DELETE SET NULL); hand-merged types into db.d.ts/entity.types.ts
(db.d.ts is hand-curated here, full codegen would clobber it).
- core/ai-chat/roles: CRUD module. list = any workspace member; create/
update/delete = admin (Manage Settings ability, like ai-settings/mcp).
All repo queries scoped by workspace_id; soft-delete (deleted_at).
- buildSystemPrompt gains roleInstructions: role REPLACES the persona base
(admin prompt / DEFAULT_PROMPT) but SAFETY_FRAMEWORK + context are always
still appended.
- stream(): role resolved from ai_chats.role_id for existing chats (never
the request body -> no per-turn role swap); body.roleId only on creation.
Disabled (enabled=false) and soft-deleted roles fall back to universal.
- getChatModel(workspaceId, override): role model_config can swap model id /
driver; a driver without configured creds throws 503 with a clear message
naming the driver+role, resolved BEFORE response hijack.
Client:
- new-chat role picker (enabled roles only, default Universal assistant),
roleId sent only on the first message; role badge (emoji+name) in the chat
header and conversation list; admin Agent-roles management section in
Settings -> AI (add/edit/delete, MCP-form pattern).
Tests: ai-chat.prompt.spec (role layering + safety always present, incl.
jailbreak); ai.service.spec (override on unconfigured driver -> 503).
Implements docs/ai-agent-roles-plan.md.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- ai.service: route *.openrouter.ai STT to its JSON+base64
/audio/transcriptions API; keep the OpenAI multipart path (AI SDK) for
OpenAI/self-hosted whisper. Unify transcription behind transcribe().
- /transcribe controller: surface the real provider/transport reason
(describeProviderError) instead of an opaque 500; preserve HttpException.
- testConnection: add an 'stt' capability (silent-WAV probe) + DTO; client
gets a Test endpoint button and status dot on the Voice/STT card.
- useDictation: log full errors to the console and show the real reason
(mic start + transcription paths); handle NotReadable/Abort and missing
mediaDevices.
- docs(CLAUDE.md): require full error logging + specific user-facing messages.
Add push-to-talk voice dictation that transcribes recorded audio on the
server via the workspace's OpenAI-compatible AI provider (Whisper /
gpt-4o-transcribe / self-hosted whisper), then inserts the text.
Backend:
- New `stt_api_key_enc` column + migration; STT creds parity with chat/
embeddings (sttModel/sttBaseUrl/sttApiKey, write-only key, fallbacks to
chat baseUrl/key). Both provider whitelists updated (service + repo).
- AiService.getTranscriptionModel + AiTranscriptionService.
- Gated POST /ai-chat/transcribe (dictation flag → 403, JWT + workspace
scope + throttle, 25MB cap, MIME whitelist, never logs audio/key).
- New `settings.ai.dictation` workspace flag (DTO + service + audit).
Frontend:
- Wire up the Voice/STT settings card (model/base URL/key) and the
Voice-dictation toggle.
- New `features/dictation`: useDictation (MediaRecorder state machine),
MicButton, transcribe service; integrated into the chat composer and a
new editor-toolbar dictation group, both gated by ai.dictation.
WIP checkpoint of the gitmost AI-chat backend (plan stages A + B1 + B3a).
The agent acts under the requesting user's JWT (Docmost CASL enforces page
access); the external service-account /mcp endpoint is untouched.
LLM provider config (A2-A4):
- integrations/crypto: AES-256-GCM SecretBoxService (key derived from APP_SECRET,
per-record salt/iv; clear error on rotation instead of crashing).
- ai_provider_credentials table/repo/types: encrypted API key stored outside
workspace settings/baseFields, write-only (never returned by any endpoint).
- integrations/ai: per-workspace AI SDK v6 provider driver (openai/gemini/ollama),
admin-gated GET(masked)/PATCH(write-only key)/Test endpoints; settings.ai.provider
holds non-secret config incl. systemPrompt. Removed unused AI_* env getters (DB is
the single source of truth).
Chat module (A1, A5-A8):
- ai_chats/ai_chat_messages repos (workspace-scoped, soft-delete, tsv never selected).
- core/ai-chat: CRUD + POST /ai-chat/stream (Fastify hijack + AI SDK v6
pipeUIMessageStreamToResponse, abort on disconnect, persist user/assistant msgs).
- Agent loop: streamText + stepCountIs(8); read tools searchPages/getPage via a
per-request DocmostClient over loopback REST under the user's minted access token.
- Gate settings.ai.chat (+ 503 when provider unconfigured); buildSystemPrompt with a
non-removable safety/anti-prompt-injection framework. Per-user rate limit.
Per-user auth (B1):
- @docmost/mcp DocmostClient gains an additive getToken variant (carry a user JWT,
re-fetch on 401) and exports DocmostClient; the email/password service-account path
(external /mcp, stdio) is unchanged.
Agent-edit provenance backbone (B3a):
- Migration: pages/page_history (last_updated_source, last_updated_ai_chat_id) and
comments (created_source, ai_chat_id, resolved_source).
- Signed actor/aiChatId claim in the collab token; onAuthenticate propagates it,
onStoreDocument writes it with a sticky agent marker, saveHistory copies it.
Migrations auto-run on boot (additive). Write tools, frontend, RAG and external MCP
servers are not in this checkpoint.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>