CRITICAL: any failure between a successful beginRun and streamText's terminal
callbacks taking ownership (the bare awaits: user-message insert, history load,
convertToModelMessages, settings resolve; the buildSystemPrompt/forUser block;
and synchronous streamText wiring) left ai_chat_runs stuck 'running' forever
(sweepRunning only runs at startup), which then 409'd every future turn in the
chat and made the observer tab poll forever. Wrap the body of stream() after
beginRun in a safety-net try/catch that settles the run to 'error' (via
onSettled) before rethrowing, and make finalizeRun idempotent (active.delete is
the once-guard) so a settle here and a settle from a streamText callback collapse
to a single terminal write.
Also from review comment 2519:
- correct three client comments that falsely claimed /ai-chat/run is "flag-gated
server-side and would 403" — it is owner-gated only; with the feature off the
chat simply has no runs so the endpoint returns { run: null }
(ai-chat-window.tsx, ai-chat-service.ts, ai-chat-query.ts).
- remove the dead UpdatableAiChatRun type (zero usages; the repo update uses an
inline Partial<...>).
- add controller specs for POST /ai-chat/run and /ai-chat/stop (owner-gating,
run:null when no run, run+message, stop by runId and by chatId).
- add tests: an exception after beginRun settles the run to 'error' and drops the
in-memory entry (next turn is not 409'd); finalizeRun is idempotent.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reopening a chat whose agent run is still going showed a frozen snapshot
from the moment it was opened. Add a passive-observer reconnect-poll path:
when this tab did NOT start the run locally, poll POST /ai-chat/run every
2s while the run is pending/running and merge its incrementally-persisted
assistant message into the thread, so new steps/tool-calls and the growing
text appear live. Polling stops on terminal status (refetchInterval keyed
on run.status, mirroring the reindex polling); a final messages invalidate
shows the persisted end state.
Observer-vs-streamer detection: ChatThread reports its local useChat
streaming status up; the window only polls/merges while NOT locally
streaming (the streamer's SSE owns the view — no double-render). Gated by
settings.ai.autonomousRuns; the query is disabled when the feature is off
so the flag-gated endpoint is never hit, and a failed fetch can't loop
(retry:false -> refetchInterval(undefined)=false).
Pure decisions (poll interval, observe gate, message merge) extracted to
run-polling.ts and unit-tested; added query enable-gating and ChatThread
observer-merge tests. Client-only change — the reconnect endpoint already
returns the run plus the assistant message with its metadata.parts.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MUST-FIX
- isSourceUniqueViolation read the wrong error field: kysely-postgres-js
(postgres@3.4.8) puts the violated constraint on `constraint_name`, not
node-postgres' `.constraint`, so a concurrent same-slug+language import's
23505 was never recognized as a source-collision and surfaced a false
"name already exists" error. Now read `constraint_name` (with `.constraint`
as a fallback for other drivers). Fix the faked test fixture (it built the
error with the same wrong `.constraint` field, masking the bug): it now
uses `constraint_name`, so the test genuinely exercises the skip path and
FAILS against the unfixed code.
- Extract the catalog modal's role-state computation into a pure
`catalogRoleInstallState(role, workspaceRoles, language)` helper (mirrors
role-launch.ts) and cover it with vitest: import / installed / update /
same-slug-different-language.
SUGGESTIONS
- Restore IAiRoleUpdateFromCatalogResult as a discriminated union mirroring
the server; narrow the consumer via `"reason" in result` (the boolean
discriminant does not narrow under strictNullChecks:false).
- README: add a "How it's served" section documenting AI_AGENT_ROLES_CATALOG_URL
(remote http(s) base / local path / empty => in-repo folder).
- check.mjs: drop the redundant `const key = slug` alias.
- Cover the reason->message mapping in useUpdateAiRoleFromCatalogMutation
(4 branches) via renderHook with a mocked service.
- Cover importFromCatalog "bundle not in index" => BadGateway.
- Cover updateFromCatalog "slug in index but missing in bundle file" =>
not-in-catalog.
ARCHITECTURE
- Extract the shared catalog read prefix: a private `loadBundleById`
(fetchIndex -> meta -> fetchBundle -> versionMap) reused by getCatalogBundle
and importFromCatalog, and a `catalogRoleContentFields` mapper shared by the
import insert and update patch. The three orchestrations and their distinct
write paths stay separate.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Admins can browse a curated catalog of agent roles, import roles/bundles
into a workspace, and update an imported role when the catalog ships a
newer version.
Catalog: a set of JSON files (index.json manifest + bundles/<id>/<lang>.json)
served from a local folder (dev) or a remote http(s) base URL via
AI_AGENT_ROLES_CATALOG_URL. Seeded with the existing 7 RU roles (editorial +
research bundles) plus EN translations.
Server:
- migration: nullable jsonb `source` column on ai_agent_roles
({ slug, language, version }; null => manually created)
- catalog provider: remote fetch with timeout + streaming size cap, or local
read; ^[a-z0-9-]+$ segment guard against path-traversal/SSRF
- admin endpoints: catalog, catalog/bundle, import, update-from-catalog
- import/update match by slug+language; update preserves `enabled`
Client:
- catalog modal with language selector and Import/Installed/Update states
- "Import from catalog" button + empty-state CTA in the roles settings panel
- en-US/ru-RU strings
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The "copy chat" button serialized `messageRows` (persisted rows loaded via
`useAiChatMessagesQuery`), which were incomplete in two ways, so the exported
Markdown dropped messages (e.g. "Messages: 2" for a multi-turn chat).
- Exhaust pagination: `useAiChatMessagesQuery` is a useInfiniteQuery that only
ever loaded the first page (server page size 50, oldest-first), silently
truncating longer chats. Add an effect that calls `fetchNextPage()` until
`hasNextPage` is false. Guard on `isFetchNextPageError` so a failed page fetch
does not loop on the app's global `retry: false`.
- Re-sync after each turn: `onTurnFinished` invalidated only the chat-list
query, never the per-chat messages query, so `messageRows` went stale during
a live session. Also invalidate `AI_CHAT_MESSAGES_RQ_KEY(activeChatId)` so the
export and token counters reflect the just-finished turn.
- Avoid tearing down the live thread: a render-phase latch (`historyLoadedKeyRef`)
keeps the history loader gating the FIRST mount only, so the post-turn
background refetch (which can transiently flip `hasNextPage` for a chat whose
message count is an exact multiple of the page size) no longer unmounts the
open thread and loses its in-progress useChat state.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Reusable, workspace-shared agent roles for the built-in AI chat. A role is
a named persona (system-prompt instructions) + optional model override; a
chat is bound to a role at creation and applies it every turn.
Backend:
- migration 20260620T120000: ai_agent_roles table + ai_chats.role_id
(FK ON DELETE SET NULL); hand-merged types into db.d.ts/entity.types.ts
(db.d.ts is hand-curated here, full codegen would clobber it).
- core/ai-chat/roles: CRUD module. list = any workspace member; create/
update/delete = admin (Manage Settings ability, like ai-settings/mcp).
All repo queries scoped by workspace_id; soft-delete (deleted_at).
- buildSystemPrompt gains roleInstructions: role REPLACES the persona base
(admin prompt / DEFAULT_PROMPT) but SAFETY_FRAMEWORK + context are always
still appended.
- stream(): role resolved from ai_chats.role_id for existing chats (never
the request body -> no per-turn role swap); body.roleId only on creation.
Disabled (enabled=false) and soft-deleted roles fall back to universal.
- getChatModel(workspaceId, override): role model_config can swap model id /
driver; a driver without configured creds throws 503 with a clear message
naming the driver+role, resolved BEFORE response hijack.
Client:
- new-chat role picker (enabled roles only, default Universal assistant),
roleId sent only on the first message; role badge (emoji+name) in the chat
header and conversation list; admin Agent-roles management section in
Settings -> AI (add/edit/delete, MCP-form pattern).
Tests: ai-chat.prompt.spec (role layering + safety always present, incl.
jailbreak); ai.service.spec (override on unconfigured driver -> 503).
Implements docs/ai-agent-roles-plan.md.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Add reversible write tools to the per-user agent toolset (page create/update/
move/soft-delete; comment reply + resolve), exposed under the user's JWT and
enforced by Docmost CASL; no permanent/force delete (D3).
- Non-spoofable agent provenance: sign actor/aiChatId into the access and collab
tokens (TokenService), propagate via jwt.strategy onto the request, and set
pages.last_updated_source/last_updated_ai_chat_id on REST create/update/move and
comments.created_source/resolved_source/ai_chat_id.
- packages/mcp: add an optional getCollabToken provider (content-edit provenance)
and guard against empty tokens; service-account /mcp path unchanged.
Frontend:
- Admin 'AI / Models' settings section: provider/model/embedding/base URL, a
write-only API key field, system prompt, and Test connection.
- AI chat panel (useChat + DefaultChatTransport): conversation list, streamed
messages, tool-call action log and page citations; header entry point gated on
settings.ai.chat.
Compile-verified (server nest build + client tsc/vite); not yet live-tested.
Known gaps: history 'AI agent' badge (C3), vector RAG (D), external MCP (E);
chat tool-card citation links pending a fix.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>