Commit Graph

39 Commits

Author SHA1 Message Date
claude code agent 227
ed3b65c36b Merge remote-tracking branch 'gitea/develop' into batch/issues-2026-06-25
# Conflicts:
#	apps/server/src/core/ai-chat/ai-chat.service.spec.ts
#	apps/server/src/core/ai-chat/ai-chat.service.ts
2026-06-25 12:48:47 +03:00
claude code agent 227
aa7a115f66 refactor(review): address PR #186 re-review (approve-with-comments)
Approve-with-comments re-review; no blockers. All 7 actionable points (8 is a
forward-looking architecture note — recommendation A, keep as-is):

1. chat-markdown.util spec: restore parity coverage of the removed client spec —
   tool error state (+ errorText), unknown-tool fallback (`Ran tool <name>` en /
   `Выполнил инструмент <name>` ru), and the circular-output stringify catch.
2. findAllByChat row cap is now testable (injectable limit) + an int-spec proves
   truncation on a modest volume.
3. Stability: the per-step durability updates are SERIALIZED via a promise chain
   (stepUpdateChain) so they commit in step order — onlyIfStreaming already
   closed the finalize race, this closes inter-step ordering.
4. findAllByChat keeps the NEWEST messages on truncation (order DESC + reverse,
   like findRecent) and logs a warning with chatId, instead of silently dropping
   the newest tail.
5. The LABELS parity comment already references the real path (tool-parts.tsx /
   toolLabelKey) — confirmed accurate.
6. Removed the redundant 'off-by-one boundary' test (strict subset of the two
   adjacent prepareAgentStep cases).
7. Extracted the terminal-finalize dispatch into a shared `applyFinalize`, used
   by BOTH the service's finalizeAssistant and its test — the test now exercises
   the real path, not a copy, so a production drift fails it.

Verified: server build + 325 ai-chat unit + 6 integration; prettier clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 12:28:35 +03:00
claude code agent 227
ea61c96a7c refactor(review): address PR #186 review (#183 — recency sweep, #174 export, tests, cleanups)
15-point review of the persistent-history PR. Architecture decisions: crash
recovery = recency threshold; tool-label duplication = leave as-is.

Must-fix:
1. Boot-sweep bounded by recency. sweepStreaming now also requires
   `updatedAt < now() - SWEEP_STREAMING_STALE_MS` (10 min), so a fresh replica's
   startup sweep can't abort a turn another replica is actively streaming
   (multi-instance deploy). Int-spec: a FRESH 'streaming' row is NOT swept, a
   STALE one IS.
2. Restore export during the FIRST streaming turn of a new chat (#174). The
   server chatId is now adopted EARLY (in-place, on the start-chunk metadata) via
   a new `onServerChatId` callback wired through use-chat-session → chat-thread,
   so `activeChatId` is set at turn start and the Copy button is live mid-first-
   turn (canExport = !!activeChatId). Hook tests for early/in-place/no-op adopt.
3. Cover finalizeAssistant's fallback-insert branch: extracted pure
   `planFinalizeAssistant(assistantId)` (update when id present, insert when the
   upfront insert failed) + a dispatch harness test for both arms.

Tests: onModuleInit lifecycle spec (sweep called; throw → resolves + warns);
int-spec updatedAt assertion → toBeGreaterThan.

Cleanups: cap findAllByChat at 5000 rows; upfront-insert-failure log carries
chatId+workspaceId; removed the now-dead buildPartialAssistantRecord (only the
spec consumed it; shapes still pinned by the flushAssistant suite); controller
passes `lang: dto.lang` (normalizeLang handles undefined); dropped a no-op
`?? undefined` in errorOf; documented the content-column semantics change
(concatenated step text, UI renders from metadata.parts); CHANGELOG [Unreleased]
entry (#183, #174); reworded the stale LABELS parity comment.

Verified: server build + 323 ai-chat unit + 5 integration; client tsc + 160
ai-chat unit; prettier clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 11:53:25 +03:00
claude code agent 227
f80276d41a refactor(review): address PR #185 review (lease leak, tests, changelog, jsonb seam)
8-point multi-aspect review of the batch PR; security/regressions were clean.

1. Lease leak: the #180 reorder moved `toolsFor` (which leases external MCP
   clients, refCount+1) ahead of buildSystemPrompt + forUser, but the only
   release (closeExternalClients) was bound to the streamText callbacks. A throw
   in between leaked the lease (refCount stuck, undici sockets held until
   restart). Define closeExternalClients right after the lease and wrap
   buildSystemPrompt+forUser in try/catch that closes-then-rethrows.
2. Cover the patch_node/delete_node dup-id refusal (#159 #6): extract the guard
   into a pure `assertUnambiguousMatch` (node-ops) and unit-test 0/1/>1.
3. Regress the body-before-title order (#159 #10): mock-HTTP test (collab fails
   fast against a server with no WS upgrade) asserts /pages/update (title) is
   NEVER posted when the body write fails — for updatePage AND updatePageJson.
4. CHANGELOG [Unreleased]: #180, #168 (Added); #163 (Fixed).
5. Add the missing en-US i18n keys (Back to references / {{label}}).
6. Drop the duplicate content/empty/blank cases in ai-chat.prompt.spec.ts (they
   repeat the buildMcpToolingBlock unit tests); keep only sandwich placement +
   both-safety-copies.
7. CI Postgres pg16 -> pg18 (match docker-compose).
8. jsonb decode seam: shared `parseJsonbValue(value, guard)` in database/utils.ts
   holds the legacy double-encoding self-heal in one place; parseToolAllowlist /
   parseModelConfig keep only a type-guard.

Verified: server build + 124 unit + 15 integration; mcp 311; prettier clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 11:36:01 +03:00
claude code agent 227
59f0c8b22d fix(ai-chat): validate the open page server-side so the agent edits the right one (#159)
The client sends the "current page" as { id, title } in the request body and the
server echoed BOTH verbatim into the system prompt context and the
getCurrentPage tool. id and title are independently attacker/desync-controllable
(two tabs, stale navigation), so openPage.id could point at page B while
openPage.title said "Page A" — the model then reported "updated Page A" while it
actually edited page B (CASL still allowed it; the user has access). Red-team
finding #4.

Resolve the open page ONCE against the DB via a new `resolveOpenPageContext`:
workspace-scoped lookup + access check, returning the AUTHORITATIVE { id, title }
(title from the DB row, never the client) or null (fail-closed) for a missing /
foreign / inaccessible page. That validated value now feeds the system prompt,
the getCurrentPage tool, AND the new-chat history origin (which previously did
this validation inline, for the id only — now shared, and the title is fixed
too).

Tests: resolveOpenPageContext covers no-id, not-found, foreign-workspace,
Forbidden, non-Forbidden-fault (fail-closed), the DB-title-wins-over-client case,
and null-title coercion.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 11:36:01 +03:00
claude code agent 227
77ccc596ea feat(ai-chat): per-MCP-server instructions in the agent system prompt (#180)
Admins can now give each EXTERNAL MCP server a free-text instruction ("how/
when to use this server's tools") that the agent receives in its SYSTEM
PROMPT next to the tool descriptions — porting the built-in SERVER_INSTRUCTIONS
idea to admin-configured servers. Trusted, admin-authored text (like a system
prompt); NON-secret, so unlike headersEnc it IS returned in views/forms.

- Migration: nullable `instructions text` on ai_mcp_servers (old rows = null =
  no guidance). Table type + repo insert/update (blank/whitespace -> null via
  blankToNull). DTO `@MaxLength(4000)`. Service threads it through
  McpServerView/toView.
- mcp-clients: `McpServerInstruction { serverName, toolPrefix, instructions }`
  threaded through the toolset/cache/lease. Guidance is built ONLY for a server
  that actually connected AND contributed >=1 callable tool (the allowlist may
  filter all of them out) AND has non-blank text — so a guide never appears for
  tools the agent cannot call. Cached with the toolset, so an edit is picked up
  next turn via the existing CRUD cache invalidation.
- System prompt: `buildMcpToolingBlock` renders an <mcp_tooling> block INSIDE
  the safety sandwich (after context, before the trailing SAFETY_FRAMEWORK) so
  it informs tool choice but cannot override the rules; each section is headed
  by the server's `prefix_*` namespace. Empty/blank -> block omitted. The
  caller (ai-chat.service) now builds the external toolset BEFORE the prompt and
  passes external.instructions; client-handle lifecycle (close-once) unchanged.
- Client: instructions field in types + a Textarea (autosize, maxLength 4000)
  in the MCP-server form with a namespace-prefix hint; i18n (en/ru).

Tests across every layer (prompt block placement + both SAFETY copies; view
blank->null; buildEntry includes guidance only for connected+>=1-tool+non-blank;
DTO MaxLength; repo + integration round-trip; service wiring). Delegated impl
reviewed (APPROVE); applied the import-type follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 11:36:01 +03:00
claude code agent 227
ae6faf3abc fix(ai-chat): guard step-update vs finalize race with WHERE status='streaming' (#183 review)
Review caught a real race: onStepFinish fires `updateStreaming()` fire-and-
forget (not awaited), so the FINAL step's streaming UPDATE and the terminal
`finalizeAssistant` UPDATE run as two concurrent statements on different pool
connections — commit order is not guaranteed. If the late streaming update
lands AFTER finalize, the completed row is clobbered back to status='streaming'
with no usage/finishReason, and the next startup sweep then mis-marks the
finished turn 'aborted'. Green unit/integration tests don't reproduce a
cross-connection race.

Fix: scope the per-step update with `onlyIfStreaming` → SQL `WHERE
status='streaming'`. Once finalize has set a terminal status the late update
matches zero rows and no-ops, regardless of commit order; finalize runs
unguarded so it always wins. A cheap `if (finalized) return` short-circuit
avoids most wasted queries, but the SQL guard is the authoritative fix (the
flag can be set after a query is already in flight).

Integration test: finalize to 'completed', then a late onlyIfStreaming update
is a no-op — status/content/usage preserved.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 06:14:02 +03:00
claude code agent 227
e7b719bbb8 feat(ai-chat): persistent history as source of truth — step durability + server export (#183)
The chat lived in inconsistent paradigms (in-memory stream + client export vs.
DB-as-context), which made export flaky and lost the assistant answer if the
process died mid-turn. Make the DB the single source of truth.

A. STEP-GRANULAR DURABILITY (server)
- ai_chat_messages gains a nullable `status` column (migration; NULL = legacy =
  completed). The assistant row is now INSERTED UPFRONT as `status:'streaming'`
  and UPDATEd on every onStepFinish with all finished steps (text + tool calls +
  tool RESULTS), then finalized once to completed/error/aborted on the terminal
  callback. So a process death mid-turn keeps every finished step; a startup
  sweep (OnModuleInit → sweepStreaming) flips any dangling 'streaming' row to
  'aborted'. The write path no longer depends on a live socket.
- Pure exported `flushAssistant(steps, inProgressText, status, extra?)` builds
  the persist payload (metadata.parts byte-identical to the old builder), so a
  future background worker can call the same path. AiChatMessageRepo gains
  `update`, `sweepStreaming`, and `findAllByChat`.
- consumeStream drain, external-MCP client close-once, SSE heartbeat preserved.

B. SERVER-SIDE EXPORT
- New pure `chat-markdown.util.ts` renders Markdown from DB rows ONLY (server
  port of the client builder). Because A persists the in-progress row, the
  export now includes an interrupted turn up to its last finished step (flagged
  "still generating"). `POST /ai-chat/export` (owner-gated via assertOwnedChat,
  workspace-scoped) returns it; `lang` accepts a full client locale tag
  ('en-US'/'ru-RU') and is normalized server-side (normalizeLang) — a strict
  @IsIn(['en','ru']) DTO rejected the real client's i18n.language with a 400,
  caught in real-browser testing.
- Client: handleCopy calls the endpoint; `canExport = !!activeChatId`. The whole
  liveThreadRef/liveStateRef/onLiveContentChange/hasLiveContent hybrid (and the
  client chat-markdown util + test) is removed — the server is now authoritative.

Tests: flushAssistant unit (status shapes + parts parity), chat-markdown.util
unit (incl. legacy NULL-status + interrupted note + ru + normalizeLang locale
tags), controller export wiring + owner-gate, integration update/sweepStreaming.
Verified: server build + 318 ai-chat unit + 3 integration; client tsc + 157
ai-chat unit; and END-TO-END in a real browser — a chat turn persists mid-stream
and the Copy button exports the DB-sourced markdown (showing the in-progress
row), HTTP 200 after the locale fix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 06:05:26 +03:00
claude_code
b6787cc542 fix(ai-chat): drain stream on client disconnect to stop heap-OOM leak
The /api/ai-chat/stream and public-share streaming paths piped streamText
output to the client socket via pipeUIMessageStreamToResponse, whose only
reader is that socket. On a client disconnect (pervasive Safari/proxy
ECONNRESET), backpressure stalled the stream: the controller aborted the
turn but nothing drained it, so streamText's onFinish/onError/onAbort never
fired. Cleanup (close leased MCP clients, persist partial) never ran and the
whole per-turn object graph (history, per-request toolset closures, captured
steps, SDK buffers) stayed rooted — accumulating across turns until the
default ~2GB heap saturated and the process crashed with
"Ineffective mark-compacts near heap limit - JavaScript heap out of memory".

Add the AI SDK v6 documented remedy: fire-and-forget
`result.consumeStream({ onError })` right after streamText(), which removes
backpressure and drains the stream independently of the client socket so the
terminal callbacks always fire and the turn's memory is released even when the
client has gone away. Applied to both the authenticated and public-share
stream services.

Also add `--heapsnapshot-near-heap-limit=2` to the prod start script so any
residual leak dumps a heap snapshot near OOM for diagnosis (no effect on
normal operation). Heap size stays ops-tunable via NODE_OPTIONS.

- apps/server/src/core/ai-chat/ai-chat.service.ts
- apps/server/src/core/ai-chat/public-share-chat.service.ts
- apps/server/package.json

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 03:59:32 +03:00
claude_code
13cac155c1 chore(ai-chat): add temporary Safari stream-drop diagnostics
Investigate the Safari-only "Lost connection to the AI provider" mid-stream
disconnect (Chrome unaffected). Pure instrumentation, no behavior change:
the 15s heartbeat interval and all stream callbacks are unchanged.

- sse-resilience.ts: startSseHeartbeat() gains an optional onBeat hook fired
  after each successfully written ping (beat counter).
- ai-chat.service.ts: track stream start, first-chunk latency, model-silent
  gap and heartbeat count; log them on finish/error/abort to classify the
  drop (idle-gap vs hard wall-clock cap vs slow first chunk).
- ai-chat.controller.ts: append elapsed-since-request to the disconnect warn.

All blocks tagged "DIAGNOSTIC ... temporary" for easy removal once the Safari
failure mode is identified.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 15:14:29 +03:00
claude code agent 227
0ebb1adce8 feat(ai-chat): realtime token counter + reasoning tokens, Claude-Code style (#151)
Tokens were only counted post-hoc (onFinish) and the header badge updated only on
chat open/switch; reasoning wasn't requested or shown. Now a counter ticks LIVE
during generation and surfaces reasoning ("thinking") tokens separately, like
Claude Code's `Thinking… · N tokens`.

Architecture (AI SDK v6): no provider gives exact per-token usage mid-stream, so
the live number is a cheap client estimate (chars/≈4) reconciled to AUTHORITATIVE
provider usage at step boundaries and turn end. The useChat per-delta re-render is
the existing realtime engine.

- server: `chatStreamMetadata` now also forwards usage on `finish-step` + `finish`;
  `sendReasoning: true`; persisted `metadata.usage` carries `reasoningTokens`
  (normalized from `outputTokenDetails` or the deprecated field).
- client: pure `count-stream-tokens` (estimateTokens / liveTurnTokens, prefers
  authoritative usage else estimate); `Thinking… · N tokens` in the typing
  indicator; collapsible "Thinking" reasoning block; throttled (~8 Hz) live
  turn-token header badge; `reasoningTokens` in types + Markdown export.

Review fixes folded in:
- v6 `finish-step.usage` is PER-STEP, not cumulative — the server now ACCUMULATES
  a running sum (new pure `accumulateStepUsage`) and sends the cumulative, which
  converges to `finish.totalUsage`, so the live counter never jumps DOWN on a
  multi-step agent turn.
- reasoning double-count: the authoritative turn-total is attributed to a block
  ONLY for a single-reasoning-part (one-step) turn; multi-step blocks each show
  their own estimate (the authoritative total stays in the header).
- no "0" badge flash at turn start (require live > 0, else show context size).
- comment refreshed (finish-step trigger).

Tests: server `accumulateStepUsage` + updated `chatStreamMetadata` (34 in the
suite); client pure-fn tests. Both tsc clean; 162 client ai-chat + the ai-chat
server suite pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 06:56:14 +03:00
claude_code
5161de8ba9 revert(ai-http): drop resilient fetch/RetryAgent layer (#140)
The custom undici RetryAgent + aiFetch transport added for issue #140
did not actually heal mid-stream provider drops: undici's retry path is
a Range-based download-resume that SSE/chat-completions endpoints cannot
satisfy, so a reset after the first byte only swapped ECONNRESET for a
"server does not support the range header" error. Its only real effect
was reconnecting a poisoned keep-alive socket before the first byte, and
PR #141 on top of it turned the 60s headers timeout into deterministic
~61s failures (plus CONTENT_LENGTH_MISMATCH from retrying a POST body
after a timeout abort). The root cause is the z.ai coding endpoint, not
our transport.

Remove the whole layer and return all AI provider calls to Node's
default global fetch.

- delete integrations/ai/ai-http.ts and its spec
- ai.service.ts: drop the aiFetch import, the AI_BYPASS_RESILIENT_FETCH
  diagnostic toggle, and fetch:aiFetch from every chat/embedding/STT
  factory; raw STT call back to global fetch
- ai-chat.controller.ts: drop the stream-timing START log + startedAt
- ai-chat.service.ts: drop the first-chunk/FINISHED/ERROR timing logs
- .env.example: drop AI_BYPASS_RESILIENT_FETCH

Reverts: 1af5d34a, 7c308728, b7abb7ea, 35fc58ea, d6cd2754, 6efb8656.
Preserved (not part of the rollback): client-disconnect abort, title
generation in onFinish, partial-answer persistence, Safari SSE heartbeat.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 18:48:33 +03:00
97002f318a Merge pull request 'fix(ai-chat): adopt the server-returned chat id (two-tab adoption race #137)' (#138) from fix/ai-chat-chatid-adoption into develop
Reviewed-on: #138
2026-06-23 03:35:03 +03:00
claude_code
fd66ee6cce fix(ai-chat): stop title generation racing the chat stream (provider stall)
A new-chat turn fired the chat stream (streamText) and title generation
(generateText) concurrently to the same z.ai coding endpoint. That plan
stalls one of two concurrent requests, so the chat stream black-holed for
~300s (undici headers timeout) and the turn hung forever in every browser;
the AI SDK then retried 3x. Server logs showed two concurrent POSTs to
/chat/completions per turn — one 200 in ~8s, the other "fetch failed after
301209ms". Bypassing the custom undici transport did not help, confirming
the cause is the concurrency, not the transport.

Move generateTitle from before the response pipe into onFinish, so it runs
solo AFTER the stream's provider call completes. A first turn that errors or
aborts no longer auto-titles (fallback "Untitled chat" already handles a
null title) — acceptable, and it removes the request that was stalling.
2026-06-23 02:41:14 +03:00
claude code agent 227
f59ca3cb0d refactor(ai-chat): extract useChatSession hook + lock the id lifecycle with tests
Addresses the 2nd PR #138 review (test debt + the Variant-B architecture ask).

The new→persisted chat id lifecycle (mount key, both adoption paths, the
history-load latch, the render-phase reconciler, onTurnFinished) is moved out of
the 768-line window into a new useChatSession hook driven by a pure
threadSessionReducer (reconcile/adopt), so adopt-vs-switch is one explicit
dispatch point and the scattering the review flagged is gone (window: 768→~620).

Tests (the blockers):
- use-chat-session.test.tsx — hook-level locks incl. the #137 regression
  (adopts the authoritative streamed id 'A', NOT chats.items[0]='B' — fails on
  the old heuristic), the error-path fallback (arm/adopt/ambiguous/add+delete),
  the disarm-on-reconcile lock (a fallback armed then switched away must not be
  adopted by a late refetch), in-place-adopt-keeps-key vs external-switch-remount,
  and the waitingForHistory latch.
- extractServerChatId (reading message.metadata.chatId) and newlyAddedChatIds
  extracted as pure helpers with unit tests; threadSessionReducer tested.

Cleanups: single canonical #137 explanation in adopt-chat-id.ts (other sites
reference it); fallback effect computes the set diff once; invalidate callbacks
memoized; redundant invariant tests folded.

Behavior preserved — re-verified live (z.ai glm-5.2): new-chat adopt + 2nd turn
in the same row, no mid-conversation remount, two-tab race leak-free, switch to
an existing chat reseeds full history, reload restores history.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 02:25:52 +03:00
claude_code
7c308728de chore(ai-chat): add stream timing logs + env-gated aiFetch bypass (diagnostics)
The streaming chat turn hangs in all browsers while the non-streaming test
endpoint works — both use the same model/transport (createOpenAI + aiFetch),
so the suspect is the streaming path / custom undici RetryAgent transport.

- ai-http.ts: wrap aiFetch with per-request timing logs (start, ms-to-headers
  on success, elapsed ms + cause on failure). Chat at info, embeddings at
  debug. Only host+path logged.
- ai-chat.controller.ts / ai-chat.service.ts: log turn START, first-chunk
  latency, FINISHED duration, and elapsed ms on disconnect/error/abort.
- ai.service.ts: AI_BYPASS_RESILIENT_FETCH=true makes the CHAT model omit
  fetch:aiFetch and use the default global fetch — isolates transport vs
  request-shape. Chat-only; embeddings/STT untouched; reversible via env.
- .env.example: document the flag.

No timeout/retry change. tsc clean; ai-chat + ai suites pass (292).
2026-06-23 02:13:54 +03:00
claude code agent 227
580f3442b8 fix(ai-chat): prevent duplicate chat row on first-turn error; add adoption tests
Addresses the PR #138 review.

Blocker 1 — duplicate chat row: a brand-new chat whose first turn errors BEFORE
the SSE 'start' chunk never receives the authoritative chatId, so metadata
adoption can't run; a retry then sent chatId:null and the server inserted a
SECOND chat row, orphaning the first turn. Keep metadata adoption as the primary
path (resolveAdoptedChatId) and add a bounded, unambiguous fallback: on a
new-chat finish with no server id, snapshot the known chat ids and, once the
list refetch lands, adopt the SINGLE newly-appeared id (pickNewlyCreatedChatId).
Zero or >1 new ids (e.g. two tabs racing) → no adoption — no items[0] guessing,
so #137 stays fixed. The wait-for-refetch guard compares set membership (robust
to a concurrent delete), and the diff dedupes so a repeated id from a paginated
list never reads as ambiguous.

Blocker 2 — tests: new adopt-chat-id.test.ts covers both pure helpers (adopt
decision + newly-created-id diff incl. dedupe/reorder); the server
messageMetadata callback is extracted to chatStreamStartMetadata and unit-tested
(start -> {chatId}, otherwise undefined).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 01:17:30 +03:00
claude_code
1b4de2b420 fix(ai-chat): keep SSE stream alive in Safari (heartbeat + strip hop-by-hop headers)
Safari/WebKit dropped the AI chat answer stream mid-turn ("Load failed",
shown as "Lost connection to the server") while Chrome/Firefox were fine.
Two Safari-specific causes: (1) during model think/tool gaps the UI-message
SSE stream emits no bytes and WebKit aborts a non-progressing fetch far more
aggressively than Chrome; (2) the AI SDK sets a hop-by-hop `Connection:
keep-alive` header which is illegal on HTTP/2 — Chrome/Firefox ignore it,
Safari rejects the whole response. Earlier commits only improved the error
text, never the drop itself.

Add apps/server/src/core/ai-chat/sse-resilience.ts with two helpers wired into
both stream paths (authenticated + public share):
- startSseHeartbeat: writes a `: ping` SSE comment every 15s (ignored by the
  client's EventSourceParserStream) so bytes keep flowing; unref'd timer,
  guarded writes, auto-clear on finish/close.
- stripStreamingHopByHopHeaders: wraps writeHead once to drop Connection/
  Keep-Alive before the head is sent, so they can never leak into an HTTP/2
  response.
Add sse-resilience.spec.ts (7 tests). tsc + eslint clean.
2026-06-23 01:02:55 +03:00
claude code agent 227
1858a5800d fix(ai-chat): adopt the server-returned chat id, not the newest in the list
A brand-new chat (activeChatId === null) had no way to learn the id of the row
the server created: the SSE stream never returned it, so the client adopted the
NEWEST chat in the per-user list (chats.items[0]). With two tabs open, a second
tab creating a chat at ~the same time made its row the newest, so the first tab
adopted the wrong id — its later turns persisted into the other chat and the
agent rebuilt history from it (commands leaked between chats), while the live UI
still showed the original conversation. (#137)

The server now attaches the authoritative chatId to the streamed assistant
message via the AI SDK messageMetadata on the 'start' part, so it reaches the
client on the first chunk. The client reads message.metadata.chatId in useChat's
onFinish and adopts that id in place (no remount, so the live turn and the
thread's chatIdRef follow the real id and the next turn targets the right chat).
The chats.items[0] guess and the adoptNewChat ref are removed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 23:46:50 +03:00
claude_code
fc262636ab fix(ai-chat): persist partial answer when a turn errors mid-stream
A provider error (e.g. read ECONNRESET) routed the turn through the
streamText onError callback, which persisted an EMPTY assistant record
(buildErrorAssistantRecord -> text:'', parts:[]). The answer text already
streamed to and shown by the client was therefore lost from the persisted
row, the chat export, and reopened history — leaving only the error line.

The AI SDK v6 onError callback receives only { error } (no steps/text),
and the visible final answer streams in the last, not-yet-finished step,
so it is absent from every finished step.text. Accumulate it ourselves:
onChunk folds each 'text-delta' into inProgressText; onStepFinish moves a
finished step into capturedSteps and resets inProgressText. onError and
onAbort now persist the partial answer (finished steps' text + tool parts
via assistantParts, then the in-progress text appended last) through a new
shared pure helper buildPartialAssistantRecord, recording the cause in
metadata.error on the error path. Replaces buildErrorAssistantRecord; its
empty-turn shape is preserved when nothing streamed.

Complementary to the resilient-fetch reconnect: that reduces how often a
turn dies; this preserves what was produced when it dies anyway.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 20:30:59 +03:00
claude_code
7ce1a24f82 feat(ai-chat): show creation time and origin document in chat list
Each chat row in the AI-chat history now shows a dimmed second line with
how long ago the chat was created and the document it was created in
("N ago / <document>", or "No document" when started outside a page).

Server:
- New migration: nullable ai_chats.page_id (FK pages.id, ON DELETE SET NULL).
- Capture the origin page at chat creation from the client-supplied openPage,
  but validate it first: it must be a real page in the same workspace that the
  user may read (PageAccessService.validateCanView), else null. This keeps the
  "openPage.id is attacker-controllable but harmless" invariant - preventing a
  cross-workspace/cross-space page-title leak and a post-hijack FK crash.
- findByCreator left-joins pages (scoped by workspace, defense-in-depth) and
  returns pageTitle.

Client:
- IAiChat gains pageId/pageTitle; ConversationList renders a ChatMetaLine
  (useTimeAgo + origin document) as a dimmed second line.
- Add i18n key "No document" (en-US, ru-RU).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 16:16:26 +03:00
claude_code
9a9b61b9a3 feat(ai-chat): log aborted stream turns in onAbort
The onAbort terminal path persisted the partial turn but wrote nothing
to the log, so a turn killed by a client disconnect / proxy drop / stop()
was invisible in the logs (unlike onError and the controller catch, which
both log). Add a logger.warn with the chat id, completed step count and
partial-text length so an aborted turn is traceable.
2026-06-21 21:21:48 +03:00
claude code agent 227
3953ecdb17 refactor(ai-chat): single live+enabled role resolve in the repo (#95)
resolveRoleForRequest and resolveShareRole duplicated the security invariant
'role exists, not soft-deleted, enabled, workspace-scoped, else null'. Move it to
AiAgentRoleRepo.findLiveEnabled(id, workspaceId) (deletedAt IS NULL + enabled +
workspace scope) and have both services call it, preserving each one's roleId
derivation + null handling. (describeProviderError half of #95 was done earlier.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 03:49:52 +03:00
claude_code
3695dbdf7f Merge remote-tracking branch 'gitea/develop' into fix/ai-chat-current-page 2026-06-21 01:29:37 +03:00
claude_code
90d3fab483 test: cover features since 053a9c0d + repair test tooling
Add ~330 tests across server (Jest), client (Vitest), editor-ext (Vitest)
and packages/mcp (node:test) for the gitmost features added since
053a9c0d: AI chat, AI agent roles, public-share assistant, MCP per-user
auth, HTML embed, page templates/embed, realtime tree, tree
expand/collapse, and the AI-settings UI.

Test-tooling fixes (prerequisite, were silently hiding coverage):
- Repair 3 page-template specs broken by the 11-arg TransclusionService
  constructor; they never compiled, so template access-control / content
  -leak / unsync-strip coverage was fictitious.
- Build @docmost/editor-ext before server tests via a `pretest` hook;
  the stale dist omitted the new HtmlEmbed/PageEmbed exports (TS2305).
- Let jest resolve the .tsx email templates: add `tsx` to
  moduleFileExtensions and widen the ts-jest transform to (t|j)sx?.

Behaviour-preserving "extract pure core" refactors that the tests drive:
- server: resolveShareAssistantRequest + uiMessageTextLength
  (public-share controller), decideBasicGate + mapAuthResultToResponse
  (mcp), buildErrorAssistantRecord (ai-chat), jsonbObject export (roles).
- client: render-raw-html + shouldExecute/canEdit, decide-embed-state,
  page-embed picker utils, tree-socket reducers, open/close branch maps,
  isEndpointConfigured/resolveKeyField; buildTreeWithChildren now treats
  a permission-trimmed orphan as a root instead of crashing.

Deferred (need a test DB or HTTP harness, documented in the specs):
repo-level Postgres integration tests and the public-share XFF E2E.
Pre-existing DI/lib0-ESM suite failures are untouched and out of scope.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 23:40:40 +03:00
claude code agent 227
a6ba19f0dc feat(ai-chat): add get_current_page tool for proxy-robust page context (#43, hardness #2)
The current page id was only injected as text in the system prompt, which a
proxy (CLIProxyAPI) can rewrite/truncate, so the agent could lose track of 'this
page'. Add a getCurrentPage tool the model can call to read the open page (id +
title) from the server-side request context (forUser now takes openedPage,
threaded from body.openPage — the same value used for the system prompt). The
inline system-prompt line is kept as belt-and-suspenders. Reads/writes still go
through the CASL-enforced page tools by id, so this is strictly not worse than
the existing prompt hint — just delivered over a channel the proxy can't mangle.

User-approved on the issue. Completes #43 together with the hardness-1 fix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 22:19:40 +03:00
claude_code
4c1d1aa2ee Merge pull request 'feat(ai-chat): agent roles (admin persona + optional model)' (#11) from feat/ai-agent-roles into develop 2026-06-20 18:31:10 +03:00
vvzvlad
45cf4140eb Merge branch 'develop' into feat/ai-chat-review-followups
Integrate the already-merged step-limit work from develop. Only conflict was
ai-chat.service.spec.ts: both sides appended a describe block and edited the
import line. Resolved as a union — keep compactToolOutput + the assistantParts/
serializeSteps/rowToUiMessage suites (this branch) AND the prepareAgentStep
suite (develop), importing all symbols from ai-chat.service.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 18:09:17 +03:00
claude code agent 227
cedea4072b refactor(ai-chat)!: unify provider error formatting via describeProviderError
Behaviour change (split out of the test commit per review, and now covered).

Both the stream onError log line and the error text streamed to the client were
formatted by separate inline blocks that only emitted "<status>: <message>".
Route both through the shared describeProviderError() so formatting stays in one
place.

BEHAVIOUR CHANGE: describeProviderError additionally appends a single-line,
300-char-truncated snippet of the provider responseBody/text. So the log line
AND the user-facing stream error now include that snippet (e.g. the HTML error
page from a misconfigured endpoint), which previously neither did. This is
intentional — it makes a misconfigured external endpoint diagnosable — and is
safe: the API key travels in the Authorization header and is never echoed in
the response body (see the util's docstring). A `fallback` param is added so
each call site keeps its own default ('AI stream error' for the stream).

Adds ai-error.util.spec.ts covering the formatter, including the appended /
truncated body snippet, so this behaviour is no longer untested.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 17:59:55 +03:00
claude code agent 227
f1980cf425 test(ai-chat): safety-critical coverage + a11y + pure refactors
Unit tests for the safety-critical paths: crypto secret-box (round-trip,
tamper detection, wrong key), the SSRF guard (blocked ranges + DNS-rebinding),
the ai-chat tools service, the page-embedding repo, and the
assistant-parts/serialization helpers. Those server helpers (assistantParts,
rowToUiMessage, serializeSteps) are exported ONLY for the tests — no runtime
change.

Also: keyboard a11y on the chat history header and conversation rows
(role/tabIndex/Enter+Space), and DRY refactors that move shared logic into one
place (isToolPart -> tool-parts util; buildInitialValues in the MCP form).

The behaviour-changing edits that previously rode along in this commit are
split out into the following two commits, per review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 17:58:44 +03:00
vvzvlad
0b969c8675 test(ai-chat): pin step-limit boundary + note AI SDK v7 system->instructions
Port two refinements from the GLM variant onto the Claude base:
- prepareAgentStep: add a comment note that AI SDK v7 renames the per-step
  `system` field to `instructions` (v6 ^6.0.134 still uses `system`), so it
  gets updated correctly on the next SDK bump.
- ai-chat.service.spec: add an explicit off-by-one boundary test for
  prepareAgentStep, expressed via MAX_AGENT_STEPS instead of a hardcoded 18/19
  so it tracks the constant if the cap changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 17:47:16 +03:00
claude code agent 227
30c3189220 feat(ai-chat): agent roles (admin-defined persona + optional model)
Reusable, workspace-shared agent roles for the built-in AI chat. A role is
a named persona (system-prompt instructions) + optional model override; a
chat is bound to a role at creation and applies it every turn.

Backend:
- migration 20260620T120000: ai_agent_roles table + ai_chats.role_id
  (FK ON DELETE SET NULL); hand-merged types into db.d.ts/entity.types.ts
  (db.d.ts is hand-curated here, full codegen would clobber it).
- core/ai-chat/roles: CRUD module. list = any workspace member; create/
  update/delete = admin (Manage Settings ability, like ai-settings/mcp).
  All repo queries scoped by workspace_id; soft-delete (deleted_at).
- buildSystemPrompt gains roleInstructions: role REPLACES the persona base
  (admin prompt / DEFAULT_PROMPT) but SAFETY_FRAMEWORK + context are always
  still appended.
- stream(): role resolved from ai_chats.role_id for existing chats (never
  the request body -> no per-turn role swap); body.roleId only on creation.
  Disabled (enabled=false) and soft-deleted roles fall back to universal.
- getChatModel(workspaceId, override): role model_config can swap model id /
  driver; a driver without configured creds throws 503 with a clear message
  naming the driver+role, resolved BEFORE response hijack.

Client:
- new-chat role picker (enabled roles only, default Universal assistant),
  roleId sent only on the first message; role badge (emoji+name) in the chat
  header and conversation list; admin Agent-roles management section in
  Settings -> AI (add/edit/delete, MCP-form pattern).

Tests: ai-chat.prompt.spec (role layering + safety always present, incl.
jailbreak); ai.service.spec (override on unconfigured driver -> 503).

Implements docs/ai-agent-roles-plan.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 06:30:06 +03:00
claude code agent 227
b197cbedef feat(ai-chat): raise agent step cap 8->20, force a final text answer
A narrow research question could burn all 8 steps on tool calls and end the
turn with no assistant text (empty turn). Two changes:
- MAX_AGENT_STEPS = 20 (was a magic stepCountIs(8)) so multi-search turns
  aren't cut off mid-investigation.
- prepareStep reserves the LAST allowed step for a text-only synthesis:
  toolChoice 'none' + a FINAL_STEP_INSTRUCTION appended to (not replacing)
  the system prompt, so a tool-heavy turn always ends with a real answer.
Logic extracted into the pure, exported prepareAgentStep(stepNumber, system)
for unit testing; earlier steps return undefined (default behavior).

Implements docs/backlog/ai-chat-step-limit-and-forced-final-answer.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 05:38:13 +03:00
vvzvlad
f96df1c540 feat(ai-chat): show current context size instead of total tokens spent
The floating AI-chat header badge summed metadata.usage (AI SDK
totalUsage, all steps) across every assistant row, showing the
cumulative tokens SPENT — which grows each turn as history is re-sent.
Replace it with the conversation's CURRENT context size.

- server: persist metadata.contextTokens in streamText onFinish from the
  final-step `usage` (inputTokens + outputTokens ≈ current context
  window occupancy); keep usage: totalUsage for back-compat/fallback
- client: derive the badge from the most recent assistant row's
  contextTokens (fallback to that row's usage total for older chats)
  instead of summing all rows
- types: add metadata.contextTokens to IAiChatMessageRow
- i18n: rename badge label "Tokens used in this chat" -> "Current
  context size" (en-US)

No DB migration needed (metadata is a JSON column).
2026-06-18 19:54:34 +03:00
vvzvlad
41dfeeb77a perf(ai-chat): compact large tool outputs before persisting them
Read tools (getPage, getPageJson, getNode, diffPageVersions,
exportPageMarkdown) return whole pages with no size cap. Their outputs
were stored verbatim in metadata.parts and the tool_calls column, and
metadata.parts is replayed to the provider on every later turn via
convertToModelMessages. After reading a couple of large pages the prompt
grew by full page bodies each turn — rising token cost, latency and DB
row size.

Add compactToolOutput(): a pure, recursive, size-bounded compactor used
in assistantParts() and serializeSteps(). It preserves the value's kind
and small scalar fields (id/title/pageId, which the client reads to build
citations on reload) while truncating long strings, capping long arrays
with a marker, and collapsing subtrees past a depth limit. Small outputs
are returned unchanged by identity. Tool inputs are left intact so
replayed tool_use arguments keep their object shape.

Compaction runs only at persistence time (onFinish/onAbort), so the live
stream and the current turn's multi-step reasoning still see full bodies.

Add unit tests for compactToolOutput.
2026-06-17 23:44:51 +03:00
vvzvlad
65f0713a70 fix(ai-chat): live streaming, open-page context, any-dimension embeddings" -m "- streaming: give useChat a STABLE store id (chatId ?? per-mount generated)
so the v6 hook stops re-creating its store every render on a new chat
  (which wiped the optimistic user message + streamed deltas, so nothing
  showed until the turn finished). Also send X-Accel-Buffering:no + flushHeaders.
- context: client sends the currently-open page {id,title}; the system prompt
  tells the agent which page 'this page' refers to (it reads it via its
  CASL-scoped getPage tool; id is prompt-context only, no server-side fetch).
- embeddings: make page_embeddings.embedding dimension-agnostic (drop the
  HNSW index + ALTER to vector), remove the hard 1536 guard, filter search by
  model_dimensions — so 3072-dim (and any) models index instead of being
  skipped. Seq-scan <=> search (wiki scale); existing pages reindex on next edit.
2026-06-17 04:58:06 +03:00
vvzvlad
a4b7919753 fix(ai-chat): OpenAI Chat Completions for multi-turn + provider settings, stream UX & errors" -m "Live-stand fixes (OpenRouter / OpenAI-compatible):
- openai provider: use .chat() (Chat Completions) instead of the default callable
  (Responses API), which gateways reject on multi-turn -> 400.
- updateAiProviderSettings: assemble settings.ai.provider via jsonb_build_object
  with ::text-cast bound params + jsonb_typeof self-heal (postgres.js was
  double-encoding it into an array; the ::text cast avoids 'could not determine
  data type of parameter').
- chat agent: drop the hard maxOutputTokens cap (truncated complex tool calls);
  keep a tiny cap only on the test-connection ping.
- testConnection + chat stream: surface the real provider error (statusCode+message)
  to logs and the UI instead of generic masks; never log the API key.
- chat UI: typing indicator, incremental streaming render, tool 'running' status, Stop.

Also bundled (prior uncommitted ai-chat work):
- history 'AI agent' provenance badge; vector RAG (pgvector image + page_embeddings
  + AI_QUEUE indexer + space-scoped semanticSearch); external MCP servers backend
  (@ai-sdk/mcp client, SSRF IP-pinning, encrypted headers, admin CRUD/Test);
  yjs duplicate-instance fix via pnpm patch (single CJS instance server-side).
2026-06-17 04:28:29 +03:00
vvzvlad
44b340dc1a feat(ai-chat): agent write tools, provenance wiring, chat panel + provider settings UI" -m "Backend:
- Add reversible write tools to the per-user agent toolset (page create/update/
  move/soft-delete; comment reply + resolve), exposed under the user's JWT and
  enforced by Docmost CASL; no permanent/force delete (D3).
- Non-spoofable agent provenance: sign actor/aiChatId into the access and collab
  tokens (TokenService), propagate via jwt.strategy onto the request, and set
  pages.last_updated_source/last_updated_ai_chat_id on REST create/update/move and
  comments.created_source/resolved_source/ai_chat_id.
- packages/mcp: add an optional getCollabToken provider (content-edit provenance)
  and guard against empty tokens; service-account /mcp path unchanged.

Frontend:
- Admin 'AI / Models' settings section: provider/model/embedding/base URL, a
  write-only API key field, system prompt, and Test connection.
- AI chat panel (useChat + DefaultChatTransport): conversation list, streamed
  messages, tool-call action log and page citations; header entry point gated on
  settings.ai.chat.

Compile-verified (server nest build + client tsc/vite); not yet live-tested.
Known gaps: history 'AI agent' badge (C3), vector RAG (D), external MCP (E);
chat tool-card citation links pending a fix.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 02:39:26 +03:00
vvzvlad
683da7a4c5 feat(ai-chat): per-user AI agent backend — LLM config, read-only agent, provenance schema
WIP checkpoint of the gitmost AI-chat backend (plan stages A + B1 + B3a).
The agent acts under the requesting user's JWT (Docmost CASL enforces page
access); the external service-account /mcp endpoint is untouched.

LLM provider config (A2-A4):
- integrations/crypto: AES-256-GCM SecretBoxService (key derived from APP_SECRET,
  per-record salt/iv; clear error on rotation instead of crashing).
- ai_provider_credentials table/repo/types: encrypted API key stored outside
  workspace settings/baseFields, write-only (never returned by any endpoint).
- integrations/ai: per-workspace AI SDK v6 provider driver (openai/gemini/ollama),
  admin-gated GET(masked)/PATCH(write-only key)/Test endpoints; settings.ai.provider
  holds non-secret config incl. systemPrompt. Removed unused AI_* env getters (DB is
  the single source of truth).

Chat module (A1, A5-A8):
- ai_chats/ai_chat_messages repos (workspace-scoped, soft-delete, tsv never selected).
- core/ai-chat: CRUD + POST /ai-chat/stream (Fastify hijack + AI SDK v6
  pipeUIMessageStreamToResponse, abort on disconnect, persist user/assistant msgs).
- Agent loop: streamText + stepCountIs(8); read tools searchPages/getPage via a
  per-request DocmostClient over loopback REST under the user's minted access token.
- Gate settings.ai.chat (+ 503 when provider unconfigured); buildSystemPrompt with a
  non-removable safety/anti-prompt-injection framework. Per-user rate limit.

Per-user auth (B1):
- @docmost/mcp DocmostClient gains an additive getToken variant (carry a user JWT,
  re-fetch on 401) and exports DocmostClient; the email/password service-account path
  (external /mcp, stdio) is unchanged.

Agent-edit provenance backbone (B3a):
- Migration: pages/page_history (last_updated_source, last_updated_ai_chat_id) and
  comments (created_source, ai_chat_id, resolved_source).
- Signed actor/aiChatId claim in the collab token; onAuthenticate propagates it,
  onStoreDocument writes it with a sticky agent marker, saveHistory copies it.

Migrations auto-run on boot (additive). Write tools, frontend, RAG and external MCP
servers are not in this checkpoint.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 01:36:41 +03:00