showTypingIndicator hid the standalone thinking dots for any non-empty
trailing text part, so during the pause after the model finished an
intermediate narration and before its next step (e.g. a tool call) the
UI looked frozen. Suppress the dots only while the text part is still
streaming: a finalized ("done") trailing text part on an in-flight turn
now shows the dots again, matching the function's documented intent.
- message-list: guard the text branch with state !== "done" (AI SDK v6
TextUIPart.state); stateless parts keep their previous behavior
- show-typing-indicator.test: add done -> shown and streaming -> hidden cases
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The name label was crowding the bouncing dots when displayed. Adding extra bottom margin (mb={8}) gives the dots room and improves readability. The change only applies when the name is shown.
- Invert the transport layers so the pre-response retry is OUTERMOST and the
provider-HTTP instrumentation is INNER. Before, the retry lived inside
createStreamingFetch (under the instrumentation), so a reset the retry
recovered from logged only a clean "OK status=200" — the
"PRE-RESPONSE FAILED ... ECONNRESET ... idleSincePrevCall" signal went blind
exactly when the fix works, and AI_STREAM_KEEPALIVE_MS couldn't be tuned from
prod data. Now createStreamingFetch is the dispatcher-bound BASE (no retry) and
a new withPreResponseRetry() wraps it; ai.service composes
withPreResponseRetry(createInstrumentedFetch('AiService:provider-http',
createStreamingFetch())), so every attempt — including recovered resets — flows
through the instrumentation. (Also expresses the keepAlive-config vs retry-
behavior boundary structurally, per review #3.)
- Add the retry-exhaustion test: a server that resets EVERY connection, asserting
the call rejects with a retryable connection error AND exactly
PRE_RESPONSE_CONNECT_RETRIES + 1 (= 3) requests reached the server — pinning the
bound and that the final error propagates (guards an off-by-one / infinite loop
/ swallowed error). Existing happy-retry + abort tests moved onto
withPreResponseRetry.
Verified on the stand: a normal turn still streams (reasoning + finish) and the
provider-HTTP telemetry still logs. server tsc + ai/mcp specs green (30).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The live typing placeholder now shows only the bouncing dots; the
"Thinking… · N tokens" line is removed. Clean up the dead plumbing:
- typing-indicator: remove thinkingTokens prop, thinkingLine and the
<Text> line; keep the animated dots and the dimmed name label
- message-list: remove tailThinkingTokens helper, the thinkingTokens
prop pass-through, and the now-unused liveTurnTokens import
- delete tail-thinking-tokens.test.ts (tested the removed helper)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The real cause of the long-task "Lost connection to the AI provider" — the
earlier 300s-timeout fix (#176) was the wrong layer. The provider-HTTP telemetry
on the user's deploy shows the failures are PRE-RESPONSE `read ECONNRESET` ~500ms
in (not a 300s/15min timeout), correlated with idleSincePrevCall ~42s and large
bodies; and crucially a retry of the SAME request often succeeds. A direct probe
to the real z.ai endpoint does NOT reset (113KB bodies and a 45s-idle keep-alive
reuse both succeed), and another agent (opencode) runs fine from the same infra —
so the provider is healthy and the egress network is usable. The difference is
the transport: undici's keep-alive pool REUSES a socket that the deployment's
egress (NAT / firewall / conntrack) silently dropped during a long idle gap, so
the next request resets pre-response.
Fix (brings gitmost in line with clients that don't reuse stale sockets):
- Keep-alive recycling: the streaming dispatcher (chat fetch AND the external-MCP
dispatcher, via the shared streamingDispatcherOptions) now sets
keepAliveTimeout + keepAliveMaxTimeout to a 10s recycle window
(AI_STREAM_KEEPALIVE_MS), so a connection idle longer than that is closed
instead of reused — a long-gap step opens a fresh connection. keepAliveMaxTimeout
also caps a server-advertised keep-alive so the provider can't widen the window.
- Pre-response connection retry: createStreamingFetch retries a connection-level
reset (ECONNRESET / UND_ERR_SOCKET / ECONNREFUSED / EPIPE / *_TIMEOUT) on a
fresh connection up to 2 times. This is SAFE because fetch() only rejects before
the Response resolves — a started stream is never replayed; an abort (client
disconnect) is never retried.
Tests: ai-streaming-fetch.spec — keep-alive options, streamKeepAliveMs env,
isRetryableConnectError, and a server that resets the first connection so the
retry must land on a fresh one (+ aborted requests are not retried). Verified on
the stand that a normal turn still streams (reasoning + text + finish) through the
new transport. server tsc + ai/mcp specs green.
Note: root cause is the deployment's egress dropping idle connections (Traefik is
inbound-only); this makes the app resilient to it. AI_STREAM_KEEPALIVE_MS can be
lowered if the egress drops faster than ~10s.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addresses the second #177 review:
- Architecture (the silent allowlist drift): the writable provider-setting keys
were maintained by hand in two TS-uncheckable places — the key-loop in
ai-settings.service and the SQL ALLOWED list in the generic workspace repo (a
miss there silently dropped a field on persist, exactly what bit chatApiStyle).
Introduce one typed source of truth PROVIDER_SETTINGS_KEYS in ai.types
(`satisfies readonly (keyof AiProviderSettings)[]`), have the service consume
it, and keep the repo's own copy (it can't import AI types) guarded by a parity
test so any future drift fails in CI.
- Tests:
- ai.service.include-usage.spec: mocks @ai-sdk/openai-compatible and asserts the
factory is called with { includeUsage: true, baseURL, apiKey, fetch, name } —
`.provider` alone could not catch a dropped includeUsage (the token-usage
zeroing regression); also asserts the 'openai' style does NOT use it.
- ai-provider-settings-keys.spec: the allowlist parity check + DTO validation
for chatApiStyle (@IsIn accepts both values, rejects garbage, optional).
- CHANGELOG: [Unreleased] entries for the new "Protocol" / chatApiStyle setting
and the default provider change (openai -> openai-compatible). (#175, #177)
server + client tsc clean; 42 ai/settings specs green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rebuilt on develop (after #176) and reworked per review: instead of inferring the
provider from baseUrl (`if (baseUrl)`), the admin picks the chat provider
EXPLICITLY via a new `chatApiStyle` ('openai-compatible' | 'openai'), mirroring
the existing sttApiStyle. A custom baseURL can front real OpenAI too, so the
heuristic was fragile.
Why reasoning was missing: glm-5.2 (and DeepSeek etc.) stream their thinking as
`reasoning_content`, but the official @ai-sdk/openai provider does not map that
field. 'openai-compatible' uses @ai-sdk/openai-compatible, which does — so
reasoning parts now stream (verified live: reasoning-start/delta/end appear, and
disappear when set to 'openai').
- Default (unset) = 'openai-compatible', so existing openai+baseUrl workspaces
surface reasoning with no admin action. No DB migration (field lives in the
settings.ai.provider JSON blob).
- includeUsage: true on the openai-compatible model — without it the provider
omits streamed usage, zeroing the live token counter / reasoning-token
metadata. The official provider always sent it; this keeps parity. (Confirmed
live: usage.totalTokens present.)
- openai-compatible has no default endpoint, so with no baseURL (real OpenAI, or
a role's cross-driver override that cleared it) it falls back to the official
provider.
Plumbing: ai.types (ChatApiStyle / CHAT_API_STYLES + AiProviderSettings /
MaskedAiSettings), update DTO (@IsIn), ai-settings.service (resolve / getMasked /
update allowlist), workspace.repo updateAiProviderSettings ALLOWED (the second,
SQL-level allowlist the review missed — without it the field never persisted),
ai.service selector. Client: ai-settings-service types + a Protocol <Select> in
the chat section + i18n (en/ru). Scope is chat-only (embeddings don't stream
reasoning; STT already has sttApiStyle).
Tests: ai.service.spec — 4 cases (openai-compatible+baseURL, openai+baseURL,
default-unset, openai-compatible-without-baseURL fallback). Verified on the stand:
default streams reasoning + usage; 'openai' drops reasoning; the setting
round-trips. server + client tsc clean; 36 ai/settings specs green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Wording: every comment now says the stream timeouts are RAISED to a
generous-but-finite ~15-min silence timeout, not "disabled (0)" (the stale
comments contradicted the code, which uses AI_STREAM_TIMEOUT_MS, default
900000ms).
- Architecture (the load-bearing-temporary trap): the streaming fetch reached
the chat provider only by riding the "temporary DIAGNOSTIC" telemetry, so
deleting the telemetry by its own label would silently revert the timeout fix.
Legitimize it: rename ai-http-diagnostics.ts -> ai-provider-http.ts,
createDiagnosticFetch -> createInstrumentedFetch, field aiDiagnosticFetch ->
aiProviderFetch, drop the "temporary" labels, and document the chat transport
(streaming fetch + instrumentation) as one intentional construct.
- Docs: AI_STREAM_TIMEOUT_MS added to .env.example next to AI_EMBEDDING_TIMEOUT_MS.
- Tests:
- ai-provider-http.spec: createInstrumentedFetch delegates to the injected
baseFetch with the same input/init, returns the Response untouched, rethrows
the error, and defaults to global fetch — covering the baseFetch seam.
- ai-streaming-fetch.spec: the delayed-server test is now LOAD-BEARING — with
AI_STREAM_TIMEOUT_MS set below the 1.5s server delay the call actually rejects
(a lost dispatcher -> global 300s default would NOT), proving the configured
dispatcher is wired; plus the default-timeout happy path.
server tsc clean; ai-streaming-fetch / ai-provider-http / ai.service / mcp-servers
/ ai-error specs green (41).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Long research turns failed mid-task with "Lost connection to the AI provider".
Node's global fetch (undici) defaults BOTH headersTimeout and bodyTimeout to
300_000ms, and the chat provider + the external-MCP dispatcher both ran on it
with no override, so:
- the z.ai chat stream dropped when a late step's huge accumulated context
pushed the model's time-to-first-token past 5 min (the model reasons
server-side with NO streamed reasoning, so the connection is silent until the
first answer token — reproduced: even a trivial glm-5.2 query has a ~4-8s
first-chunk gap; a long run reaches 400k+-token steps), or a reasoning model
paused >5 min between chunks (bodyTimeout);
- the crawl4ai SSE transport, held open across the whole turn, dropped when it
idled >5 min between tool calls.
Fix: a dedicated undici dispatcher whose stream timeouts are raised to a
generous-but-FINITE silence timeout (default 15 min, AI_STREAM_TIMEOUT_MS) on
each path. NOT disabled (0): that would let a genuinely hung provider — with the
client still connected — hang forever, since the turn's abortSignal only fires on
client disconnect. The timeout bounds SILENCE (time-to-first-byte and the gap
BETWEEN chunks), NOT total turn duration, so an arbitrarily long turn that keeps
streaming is never cut; only a stream quiet for >15 min is treated as a hang.
- ai-streaming-fetch.ts: createStreamingFetch() + streamTimeoutMs() /
streamingDispatcherOptions() (the shared, configurable timeout).
- ai.service: the chat provider fetch is createStreamingFetch(), wrapped by the
existing passive ECONNRESET telemetry (createDiagnosticFetch gained an
optional baseFetch) so the telemetry observes the SAME transport.
- mcp-clients: the SSRF-pinned Agent uses streamingDispatcherOptions().
Investigation: reproduced the transport mechanism against the real z.ai endpoint
(a 1ms headersTimeout throws UND_ERR_HEADERS_TIMEOUT — the exact drop) and ran
the actual research agent to a ~428k-token context. Verified the fixed path
streams cleanly live (glm-5.2 turns finish; telemetry confirms the streaming
fetch is in use).
Tests: ai-streaming-fetch.spec (default 15m + env override + invalid fallback +
both-timeouts + streams a delayed response); ai-http-diagnostics + ai/mcp specs
green. server tsc clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Investigate the intermittent (~20-30%) long-turn failure
"Lost connection to the AI provider" = AI_RetryError / read ECONNRESET
on the gitmost->z.ai link (browser-agnostic, mid-turn). Pure
instrumentation, no behavior change:
- ai-http-diagnostics.ts: a passive fetch wrapper injected into the
OpenAI-compatible (z.ai) client. Per provider HTTP call it logs
time-to-headers/status on success, and on a pre-response rejection the
latency, error code/cause, request-body size and idle-gap since the
previous call. The Response is returned untouched (streaming intact),
errors rethrown unchanged; no retry/timeout/dispatcher.
- ai.service.ts: wire the instrumented fetch into the openai case only.
Lets us classify the reset as connection-phase vs mid-stream before
choosing a fix, without repeating the reverted RetryAgent (#140).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Opening the edit form for an MCP server that has a saved tool allowlist crashed
the whole settings page (`TypeError: Ke.map is not a function` in Mantine) — and,
worse, the allowlist was silently NOT enforced. Both stem from one root cause:
the `tool_allowlist` jsonb column round-trips as a JSON STRING, not an array.
Root cause: `jsonbArray` bound `JSON.stringify(value)` (already a JSON string)
straight to a `::jsonb` cast. node-postgres infers the param type as jsonb and
JSON-stringifies it a SECOND time, so the column stored a jsonb STRING SCALAR
(`"[\"a\"]"`, jsonb_typeof = string) instead of an array. On read the driver
hands back the JS string `'["a"]'`. Then:
- the edit form's TagsInput called `.map` on a string -> page crash;
- mcp-clients did `Array.isArray(allow)` -> false for a string -> fell through
to "no restriction" and exposed ALL of the server's tools.
Fix (both verified on the stand):
- Write: `jsonbArray` casts `::text::jsonb` so the param is bound as text (sent
verbatim) and parsed into a real jsonb array. New rows now store
jsonb_typeof=array.
- Read: `normalizeRow` runs every fetched row through `parseToolAllowlist`, which
returns `string[] | null` for both shapes (already-array passes through; a JSON
string is parsed; null/invalid -> null). This REPAIRS existing double-encoded
rows on read, so the UI and the allowlist enforcement work without a data
migration. Applied in findById / listByWorkspace / listEnabled.
- Client: defensive `Array.isArray(...) ? ... : []` guard in the form so a bad
shape can never take the settings page down again.
Tests: ai-mcp-server.repo.spec (8 cases for parseToolAllowlist — array, the
JSON-string read, null, empty, non-array json, unparseable, non-string elements,
non-string primitive). mcp-servers-to-view + mcp-namespacing still green.
Verified live: an old double-encoded row now reads as an array; a newly created
server stores jsonb_typeof=array.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Investigate the Safari-only "Lost connection to the AI provider" mid-stream
disconnect (Chrome unaffected). Pure instrumentation, no behavior change:
the 15s heartbeat interval and all stream callbacks are unchanged.
- sse-resilience.ts: startSseHeartbeat() gains an optional onBeat hook fired
after each successfully written ping (beat counter).
- ai-chat.service.ts: track stream start, first-chunk latency, model-silent
gap and heartbeat count; log them on finish/error/abort to classify the
drop (idle-gap vs hard wall-clock cap vs slow first chunk).
- ai-chat.controller.ts: append elapsed-since-request to the disconnect warn.
All blocks tagged "DIAGNOSTIC ... temporary" for easy removal once the Safari
failure mode is identified.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Extract buildSubtree/mapSharedNodes/countNodes/SubpageNode into
subpages-view.utils.ts with a unit test (subpages-view.utils.test.ts)
covering nesting, position order, missing/unreachable parent, self-parent
guard, empty input, countNodes and mapSharedNodes remap.
- Replace the manual useState + editor.on("transaction") subscription in
subpages-menu.tsx with useEditorState (the idiom the sibling bubble menus
use), so the mode icon/tooltip track the live recursive attribute without
re-rendering on every keystroke.
- i18n: add the 6 menu/tree strings and a pluralized
"Showing {{count}} subpages" key to en-US and ru-RU.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The `subpages` node showed only one level of direct children. Add a `recursive`
attribute that renders the FULL descendant tree of the current page — fully
expanded, unlimited depth. Default `false`, so every previously-inserted node
stays flat (backward compatible). No backend changes: `POST /pages/tree` (via the
`getSpaceTree` wrapper) already returns the whole subtree as a flat `IPage[]`
(recursive CTE, permission-filtered); the nested tree is built on the client by
`parentPageId`.
- editor-ext `subpages.ts`: `recursive` attribute (parse/render `data-recursive`),
shared by client + server so the collab ProseMirror schema keeps the attribute.
- `getSpaceTree`: arg loosened to `{ spaceId?; pageId? }` (the endpoint accepts
either); new `useGetPageTreeQuery(pageId)` react-query hook.
- `subpages-view.tsx`: split into `FlatSubpages` (unchanged) and
`RecursiveSubpages`; `buildSubtree` assembles the nested tree (cycle/self-parent
guard, `sortPositionKeys` per level, root excluded) and a recursive `TreeNode`
renders it (16px indent per depth, soft "showing N" note past 300 — data never
capped). Shared/public context reads the already-nested shared tree, no
`/pages/tree` request.
- toggles: bubble-menu flat⇄tree button + a second slash-menu item "Page tree".
Review follow-ups folded in: invalidate `["page-tree"]` from the create / update /
move / delete cache helpers so an open recursive tree refreshes (no stale data);
mode icon made reactive on editor transactions; `t` threaded into `TreeNode`
(no per-node useTranslation); shared-subtree hook deduped to a thin alias.
editor-ext build + client `tsc --noEmit` both clean. Backend untouched.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per review: the file's other :global is locally scoped (.container:global(...)),
but the new float-reset media rule was fully global in a *.module.css. Scope it to
.container — the image node-view container carries BOTH the .container class and
the data-image-align attribute (same element), so behavior is unchanged while the
selector no longer leaks globally.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Review of #158 (Request changes) — core logic verified correct; addressed the
test-coverage + localization items:
1. i18n pluralization: the token-count keys were called with {count} but had one
form, so ru-RU always rendered the genitive ("1 токенов"). Added _one/_other
(en) and _one/_few/_many (ru: токен/токена/токенов) for both "Thinking… ·
{{count}} tokens" and "Thinking · {{count}} tokens"; de-duped the PR-added
duplicate "Thinking" key. Call sites unchanged.
2. ReasoningBlock: new reasoning-block.test.tsx (4 branches: authoritative count
wins / estimate fallback / header-only when count-but-no-text / body render).
3. Reasoning-token attribution: extracted the #151 anti-double-count rule into a
pure `reasoningTokensForPart(message)` (single reasoning part -> authoritative
turn total; multiple/none -> undefined so each estimates). message-item uses
it; removed the now-dead lastReasoningIndex reduce (review #5). Unit-tested.
6. adopt-chat-id.ts: refreshed 3 stale `chatStreamStartMetadata` ->
`chatStreamMetadata` comment references.
7. chat-markdown.test.ts: assert the export footer's `reasoning: N` line appears
when reasoningTokens>0 and is absent at 0/undefined.
Skipped optional #4 (mantine useThrottledCallback): the manual throttle has two
distinct exit paths (turn-end revert-to-null + the captured-total trailing emit)
with no guarding test; remapping risks the streaming behavior — non-blocking.
Client tsc clean; ai-chat suite green (171 tests).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Review of #157 (Request changes) caught two blockers:
1. DEAD responsive CSS: the `@media (max-width:600px)` float-reset was added to
`image-resize.module.css`, which is imported NOWHERE — the image container's
classes come from `common/node-resize.module.css` (via buildResizeClasses).
So on mobile a floated image kept its px width + float and crushed the text,
exactly the failure the rule promised to prevent. Moved the rule to
`common/node-resize.module.css` (the module actually imported by the resize
node views); its `:global([data-image-align=...])` selectors are data-attr
based, so they work unchanged. Reverted the dead addition from the (pre-existing,
orphaned) image-resize.module.css.
2. `applyAlignment` was untested. Exported it and added `image.spec.ts` (vitest/
jsdom) covering all five align values, the data-image-align mirror, and the
floatLeft -> left reset-then-apply (the guard against a leaked float).
Switched the float writes to the canonical CSSOM `cssFloat` property (portable:
browsers + jsdom; behavior identical to the `.float` alias).
editor-ext build + client tsc clean; 6 image.spec tests green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Review of #156 (Request changes) flagged the new CLIENT logic as untested. Extract
the decision logic from chat-thread.tsx into pure, unit-testable helpers and cover
both branches the reviewer called out:
- `roleLaunchMessage(role, default)` — the three-way handleRolePick behavior:
autoStart=false -> null (send nothing); autoStart=true + custom -> trimmed
message; autoStart=true + empty/null/whitespace -> default fallback.
- `shouldResetRolePicked(chatId, roleId, flag)` — the #149 render-phase reset; the
regression test asserts the stuck-flag case (New chat after an autoStart=false
pick -> cards return) that the pre-fix code never handled, and that a still-bound
role keeps the cards hidden.
chat-thread.tsx now calls these helpers (behavior unchanged). 9 new pure tests.
Also folded the review's cosmetic suggestion: `x ? x : null` -> `x || null` in
ai-agent-roles.repo.ts (identical for string|null|undefined).
Client tsc clean; role-launch + role-cards green; repo spec green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Tokens were only counted post-hoc (onFinish) and the header badge updated only on
chat open/switch; reasoning wasn't requested or shown. Now a counter ticks LIVE
during generation and surfaces reasoning ("thinking") tokens separately, like
Claude Code's `Thinking… · N tokens`.
Architecture (AI SDK v6): no provider gives exact per-token usage mid-stream, so
the live number is a cheap client estimate (chars/≈4) reconciled to AUTHORITATIVE
provider usage at step boundaries and turn end. The useChat per-delta re-render is
the existing realtime engine.
- server: `chatStreamMetadata` now also forwards usage on `finish-step` + `finish`;
`sendReasoning: true`; persisted `metadata.usage` carries `reasoningTokens`
(normalized from `outputTokenDetails` or the deprecated field).
- client: pure `count-stream-tokens` (estimateTokens / liveTurnTokens, prefers
authoritative usage else estimate); `Thinking… · N tokens` in the typing
indicator; collapsible "Thinking" reasoning block; throttled (~8 Hz) live
turn-token header badge; `reasoningTokens` in types + Markdown export.
Review fixes folded in:
- v6 `finish-step.usage` is PER-STEP, not cumulative — the server now ACCUMULATES
a running sum (new pure `accumulateStepUsage`) and sends the cumulative, which
converges to `finish.totalUsage`, so the live counter never jumps DOWN on a
multi-step agent turn.
- reasoning double-count: the authoritative turn-total is attributed to a block
ONLY for a single-reasoning-part (one-step) turn; multi-step blocks each show
their own estimate (the authoritative total stays in the header).
- no "0" badge flash at turn start (require live > 0, else show context size).
- comment refreshed (finish-step trigger).
Tests: server `accumulateStepUsage` + updated `chatStreamMetadata` (34 in the
suite); client pure-fn tests. Both tsc clean; 162 client ai-chat + the ai-chat
server suite pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds floatLeft / floatRight image alignment so text wraps beside the image,
beyond the existing block left/center/right. Ported from Forkmost PR #7 /
upstream Docmost PR #1132 (fuscodev), adapted to gitmost's imperative image
node-view (the upstream uses a React styled component; ours styles the node-view
container directly via applyAlignment).
- editor-ext image.ts: `setImageAlign` accepts `floatLeft`/`floatRight`;
`applyAlignment` resets float/padding then, for a float mode, sets
`float:left|right` + side padding on the (shrink-to-fit) container so text
flows beside it (the inner <img> already has max-width:100%). The resolved
align is mirrored onto the container as `data-image-align` for the responsive
rule. `data-align` already round-trips the value through parse/renderHTML, so
float survives serialization / collab / history with no schema change.
- image-menu.tsx: Float-left / Float-right bubble-menu buttons (IconFloatLeft/
Right) with active state.
- image-resize.module.css: on narrow screens (<=600px) a floated image collapses
to full width and drops the float (`!important`, keyed on data-image-align) —
the upstream "100% width on small screen" follow-up.
- i18n: en-US + ru-RU strings.
editor-ext build + client tsc --noEmit clean. Visual wrap behavior is best
confirmed in-browser (logic/serialization verified by build + types).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Agent role cards always auto-sent a hardcoded "Take a look at the current
document" on pick. Make it configurable per role:
- autoStart (bool, default true): whether picking the role auto-sends a message.
- launchMessage (nullable text): the text sent on auto-start; empty -> the
built-in default. autoStart=false -> bind the role and send nothing (the user
types the first message, which still carries the roleId).
Existing roles default to autoStart=true / launchMessage=null => identical old
behavior.
Full-stack:
- migration 20260624T120000 adds `auto_start boolean NOT NULL DEFAULT true` +
`launch_message text` (additive; down drops both); db.d.ts updated by hand.
- DTO: autoStart (@IsBoolean) + launchMessage (trim @Transform, @MaxLength 2000).
- repo/service: thread + normalize (undefined=unchanged, ""=>null, autoStart??true).
Both fields exposed in the picker-view for ordinary members (they decide
whether/what to auto-send); instructions/modelConfig stay ADMIN-ONLY.
- client: IAiRole types, role form (Switch + Textarea, re-hydrated on edit),
handleRolePick branches on autoStart; i18n en-US + ru-RU.
Review follow-ups folded in: reset the `rolePickedNoSend` flag when the thread
returns to an empty role-less state (the "New chat after autoStart=false pick"
stuck-UI bug — render-phase one-shot reset); made create/update launchMessage
normalization symmetric (raw value, server normalizes ""→null).
Server: 68 role tests pass, tsc clean. Client: tsc clean, role tests pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The standalone "Thinking…" indicator's agent-name label switched owners a couple seconds into a turn: at first TypingIndicator rendered name+dots (4px gap), then once useChat materialized the empty/reasoning-only assistant message the name moved to the empty MessageItem row (16px messageRow margin), pushing the dots ~20px away — a visible layout jump.
Introduce a shared pure helper assistantMessageHasVisibleContent() as the single source of truth mirroring MessageItem's render decisions. MessageItem now returns null for an assistant message with no visible content, and typingIndicatorShowsName keeps the name on the indicator until the assistant row has visible content. Exactly one element owns the name throughout the pre-content gap, so the layout no longer reflows.
- new utils/message-content.ts + unit tests (12 cases)
- message-list.tsx: typingIndicatorShowsName uses the helper
- message-item.tsx: early return null when no visible content
- typing-indicator-shows-name.test.ts: updated expectations
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Resolve the pre-merge review items for the #146 NodeView content-first fix:
- Export collectScrollAncestors/reflowAfterPaste and add editor-paste-handler
unit tests covering ancestor selection (overlay included, non-overflowing
auto excluded, X axis), the scrollHeight>clientHeight gate, scrollingElement
dedup, the docEl==null branch, and the double-rAF nudge.
- Extend the structural guard with CodeBlockView and merge the two it.each
blocks into one document-order assertion (handles the <pre> nesting where the
contentDOM is not the literal first child).
- Simplify the post-paste nudge to a single scrollTo(scrollLeft, scrollTop).
- Document that the post-paste reflow runs on every paste path intentionally,
and cross-reference the two #146 mitigations in both fixes.
- a11y: aria-hidden the decorative footnotes heading and number marker, and
label the footnotes list via role="group" + aria-label so the visual reorder
does not break screen-reader reading order (WCAG 1.3.2).
- CHANGELOG: add a Fixed entry noting the caret fix is macOS-verified manually.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Resolves the open items from the latest PR #143 code review:
- test(page): cover the four agentSourceFields stamp sites (create, update,
movePage, movePageToSpace) with agent + normal-user payload assertions;
add findById({ includeIsAgent: true }) wiring guards to the JWT and collab
auth-seam specs so a future drop of the option is caught.
- fix(privacy): drop `isAgent` from UserRepo.baseFields and gate it behind a
new opt-in `findById({ includeIsAgent })`, requested only by the two auth
seams that derive provenance — stops the flag leaking via the workspace
member list and generic user payloads.
- docs: correct the agentSourceFields JSDoc and the two UPDATE-site comments
to distinguish INSERT (omitted column → DB default 'user') from UPDATE
(omitted column → existing value kept, Kysely writes only present keys).
- style(page): collapse three stray double blank lines left by an earlier edit.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Addresses the #147 review (Approve with comments):
- Add footnote-views.structure.test.tsx: a structural regression guard asserting
the editable NodeViewContent is the FIRST child of FootnotesListView and
FootnoteDefinitionView, with no contenteditable=false chrome before it. The
whole #146 fix rests on this DOM-order invariant; the macOS caret symptom needs
a real browser, but the order proxy is testable in jsdom. Stubs @tiptap/react
so the views render as plain DOM — the test passes on the fixed order and fails
on the pre-fix chrome-first order.
- Reword the code-block-view comment: it claimed a "top-right overlay (the
transclusion pattern)", but the menu stays fully in flow as a full-width row
lifted via flex `order: -1` (the .codeBlock wrapper is a flex column). No
overlay/absolute positioning.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pasting markdown/code inserts React NodeViews that mount asynchronously; until
the next reflow the browser's hit-test geometry is stale, so ProseMirror's
posAtCoords/caretRangeFromPoint maps a click to the wrong (offset) line — which
users reported clears itself on any scroll. Reproduce that scroll's side effect
with a ZERO-delta nudge (re-assign scrollTop/scrollLeft to their current value)
on every scrollable ancestor + the document scrolling element, run across two
animation frames so it lands after the pasted content + NodeViews commit. The
nudge does not move the viewport.
Wired into editor-paste-handler's handlePaste, which ProseMirror's someProp runs
(as an editorProps handler) before the MarkdownClipboard plugin that performs the
markdown/code insert — so the nudge is scheduled on exactly the paste path that
triggers the bug. Complements the structural NodeViewContent-order fix in this
branch.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Architecture & design:
- Arch A: introduce resolveProvenance() as the single source of truth for
deriving a write's actor/aiChatId from the SIGNED identity, and wire it into
BOTH transport seams — the REST jwt.strategy and the collab
authentication.extension. Previously the collab seam derived actor from the
token claim alone and ignored user.isAgent, so a flagged service account's
page-content edits over the websocket persisted as lastUpdatedSource='user',
drifting from REST. The seams now share one resolver and can't diverge.
- Arch B: drop AiAgentBadge's page-history coupling. The generic ui/ badge no
longer imports historyAtoms; it exposes an onActivate callback fired after the
deep-link, and the history row passes onActivate to close its own modal.
Suggestions/warnings:
- S1: soften the jwt.strategy provenance comment (applies to every REST write).
- S2/suggestion-3: drop the redundant comment-list-item null-aiChatId test
(covered by ai-agent-badge.test.tsx).
- S3: de-duplicate jwt.strategy.spec test #3 (the no-claim→'user' half
duplicated test #2); keep only the signed actor='agent' claim assertion.
- W2: add keyboard-activation tests for the badge (Enter/Space, unrelated key).
- W3: flip the design doc status to "реализовано (#143)".
Tests:
- new auth-provenance.decorator.spec.ts unit-tests resolveProvenance +
agentSourceFields.
- new collab-seam test: is_agent user with no claim → actor='agent'
(Arch A regression guard).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The agent write-stamp idiom — `...(isAgent ? { <source>: 'agent', <chat>: aiChatId } : {})`
— was hand-reimplemented at every REST write site, so each new path risked a
wrong literal or a forgotten aiChatId. Extract a single
`agentSourceFields(provenance, sourceKey, chatKey)` next to AuthProvenanceData and
call it at the 5 uniform spread sites:
- comment.service create -> createdSource / aiChatId
- page.service create/update/orphan-move/move -> lastUpdatedSource / lastUpdatedAiChatId
Sites that must CLEAR the source on a non-agent action keep their own conditional
(comment un-resolve writes an explicit null), and the collab persistence path keeps
its sticky-window logic — both noted in the helper's doc.
Behavior-preserving (the helper returns the identical object/`{}`). Typecheck
clean; server comment/page/auth/collab suites 246 pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- [warn 1] Document the is_agent operator setup so it survives plan deletion:
added an AI-agent block to .env.example (use a DEDICATED account, set is_agent
via SQL, never flag a human/shared account) + a CHANGELOG "Added" entry.
- [warn 2] Test the badge deep-link side effects: ai-agent-badge.test.tsx now
renders inside an explicit jotai store, clicks the badge, and asserts the
active chat id, window-open, cleared draft, closed history modal, AND that
stopPropagation keeps a parent onClick from firing.
- [suggestion 3] Hoist the window.matchMedia stub into vitest.setup.ts and drop
the duplicated beforeAll block from the three test files (ai-agent-badge,
comment-list-item, role-cards).
- [suggestion 4] Merge the two near-duplicate "non-clickable" cases via it.each.
- [follow-up 6] Introduce a single ProvenanceSource = 'user' | 'agent' type in
jwt-payload.ts and reference it from AuthProvenanceData, JwtPayload/
JwtCollabPayload, and resolveSource() — so a typo can't slip through as a bare
string. (Server auth chain; client IComment mirroring left as a follow-up.)
Follow-up 5 (shared agentSourceFields write-stamp helper) is deferred as the
review marked it — the 6 REST sites use varied shapes (create-spread vs
resolve-conditional-null vs page move), so it's a separate focused refactor.
Tests: client badge/comment/role-cards suites 11/11 pass; server auth+comment
suites 62 pass; typecheck clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three editable NodeViews rendered a contentEditable=false "chrome" element IN
FLOW BEFORE NodeViewContent. On macOS the browser's click hit-testing
(posAtCoords → caretRangeFromPoint) then misses the contentDOM and snaps the
caret to the previous node — the caret/selection lands a line (code block) or
several lines (footnotes, into the body) above where the user clicked.
Fix (the transclusion pattern / issue #146 plan): make the editable
NodeViewContent the FIRST child in the DOM and move the non-editable chrome
AFTER it, restoring its visual position with CSS:
- code-block-view: <pre><NodeViewContent/></pre> first; the language/copy menu
follows and is lifted above via flex `order` (.codeBlock is now a flex column).
- footnotes-list-view: NodeViewContent first; the "Footnotes" heading follows and
is lifted above via flex `order` (.list is a flex column; the separator border
stays on the container).
- footnote-definition-view: NodeViewContent first; the "N." marker follows with
`order:-1` to stay on the left; the ↩ back-link stays on the right.
Layout is visually unchanged. Verified in a real browser (Chromium): the
contentDOM is now the first child of every editable NodeView wrapper (no
contentEditable=false element precedes it), and the menu/heading/marker still
render in their original positions.
NOTE: the caret-offset itself is macOS-specific text hit-testing and does not
reproduce in headless Chromium/WebKit on Linux (verified extensively), so the
visible fix is best confirmed on macOS.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
While a multi-step agent turn is "thinking between tool steps", the
assistant identity name (e.g. the role name) was rendered twice, stacked:
once by the assistant message row (MessageItem) and once by the standalone
TypingIndicator below it. The indicator's name label only makes sense when
it stands in for a not-yet-started assistant row; between steps the row
above already shows the same name.
Render the indicator's dimmed name label only when it is standalone (no
assistant row at the tail yet); otherwise show just the "Thinking…" dots.
- typing-indicator.tsx: optional showName prop (default true); the name
label renders only when showName !== false
- message-list.tsx: exported typingIndicatorShowsName(messages) helper;
pass showName to the indicator at the render site
- typing-indicator-shows-name.test.ts: unit-cover the four cases
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The custom undici RetryAgent + aiFetch transport added for issue #140
did not actually heal mid-stream provider drops: undici's retry path is
a Range-based download-resume that SSE/chat-completions endpoints cannot
satisfy, so a reset after the first byte only swapped ECONNRESET for a
"server does not support the range header" error. Its only real effect
was reconnecting a poisoned keep-alive socket before the first byte, and
PR #141 on top of it turned the 60s headers timeout into deterministic
~61s failures (plus CONTENT_LENGTH_MISMATCH from retrying a POST body
after a timeout abort). The root cause is the z.ai coding endpoint, not
our transport.
Remove the whole layer and return all AI provider calls to Node's
default global fetch.
- delete integrations/ai/ai-http.ts and its spec
- ai.service.ts: drop the aiFetch import, the AI_BYPASS_RESILIENT_FETCH
diagnostic toggle, and fetch:aiFetch from every chat/embedding/STT
factory; raw STT call back to global fetch
- ai-chat.controller.ts: drop the stream-timing START log + startedAt
- ai-chat.service.ts: drop the first-chunk/FINISHED/ERROR timing logs
- .env.example: drop AI_BYPASS_RESILIENT_FETCH
Reverts: 1af5d34a, 7c308728, b7abb7ea, 35fc58ea, d6cd2754, 6efb8656.
Preserved (not part of the rollback): client-disconnect abort, title
generation in onFinish, partial-answer persistence, Safari SSE heartbeat.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Mark comments (and, via existing page provenance, pages) created under an
is_agent service account as authored by AI, derived from the SIGNED server
identity rather than any client field, and render the existing AI badge in
the comments sidebar.
Backend (B1):
- Add additive users.is_agent boolean (default false) migration; reflect in
the Users Kysely type, the user repo baseFields, and (via Selectable) the
User entity.
- jwt.strategy: derive req.raw.actor from user.isAgent (an is_agent account
stamps every write 'agent'); external MCP has no internal ai_chats row so
aiChatId stays null. Non-spoofable: a plain user cannot obtain
created_source='agent'.
- Loosen the provenance aiChatId type to string|null across token.service and
the JwtPayload/JwtCollabPayload claims (type-level only; the internal AI-chat
path still passes a real aiChatId).
Frontend (B2):
- Extend IComment with createdSource/aiChatId/resolvedSource (backend already
returns them via selectAll).
- Extract the local AiAgentBadge from history-item into a shared
components/ui/ai-agent-badge.tsx (clickable deep-link when aiChatId present,
plain label when null/absent); reuse it in history-item and render it in
comment-list-item next to the author name when createdSource==='agent'.
Tests: comment.service agent/null-aiChatId provenance, jwt.strategy provenance
derivation + anti-spoof, AiAgentBadge clickable/non-clickable branches, and
comment-list-item badge render/no-render.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds co-located unit tests for ten targets (client → vitest *.test.ts(x),
server → jest *.spec.ts), plus minimal behavior-preserving extractions/exports
where the issue required a pure function to test:
- encode-wav: WAV header + PCM16 clamping
- editor-ext embed-provider / utils (sanitizeUrl, isInternalFileUrl) / indent
(export clampIndent)
- label.dto @Matches regex
- move-page.dto vs generateJitteredKeyBetween parity (bug locked via test.failing)
- new-note-button canCreatePage (extracted to can-create-page.ts)
- history-editor diff (extracted pure computeHistoryDiff into history-diff.ts)
- notification getTypesForTab + repo contract (direct-tab divergence locked via
test.failing)
- search buildTsQuery (extracted + sanitizes operator inputs so adversarial
queries no longer risk a to_tsquery 500)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extend ai-http.spec with two loopback-server tests: a provider that stalls
without sending headers triggers the (lowered) headersTimeout and is retried on a
fresh connection, recovering; a healthy fast response passes through in one
attempt. No external network calls.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The z.ai GLM coding endpoint intermittently accepts the chat request but never
sends response headers; undici's default 300s headersTimeout then hung the user
for five minutes before failing, and UND_ERR_HEADERS_TIMEOUT was not in the
RetryAgent's retried error set, so there was no recovery.
headersTimeout only bounds time-to-FIRST-headers (before any body) — it is NOT
the streaming budget, so lowering it does not truncate live SSE streams. Cap it
(env AI_HTTP_HEADERS_TIMEOUT_MS, default 60s) so a header stall fails fast, and
add UND_ERR_HEADERS_TIMEOUT to the retried error codes so the stalled request is
retried on a fresh connection (which usually responds in seconds). bodyTimeout
kept generous (env AI_HTTP_BODY_TIMEOUT_MS, default 300s) so slow streams with
sparse chunks survive. UND_ERR_BODY_TIMEOUT is deliberately NOT retried (mid-body,
partial SSE already delivered).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extend the window.gitmost native-host bridge with three methods that work
when no page is open, registered globally at the app-shell level (not in
page-editor.tsx) so the react-router navigate fn and the api-client are
available:
- listSpaces(): reuse getSpaces() -> [{id, name}], flags truncation.
- listPages({spaceId, parentPageId?}): reuse getSidebarPages()
-> [{id, title, hasChildren}], first page only (truncated flag).
- createPageWithRecording({spaceId, parentPageId?, title?, base64,
filename, mimeType}): validate/decode the audio first (so a bad payload
leaves no junk page), resolve the space slug via getSpaceById (no-space
probe), createPage(), navigate via the router (no reload), wait for the
new page's editor to be mounted+editable+Yjs-connected, then run the same
uploadAudioAction path as insertRecording. Resolve-only error contract:
no-space | create-failed | editor-timeout | insert-failed.
DRY: extract the base64 decode/validate + audio-insert pipeline from
page-editor.tsx into features/editor/gitmost/gitmost-recording.ts; the
existing insertRecording now delegates to it (behavior unchanged).
Mount GitmostGlobalBridge once in GlobalAppShell. Before navigating, reset
the shared yjsConnectionStatusAtom so the readiness gate waits for the NEW
page's provider to connect instead of a stale "connected" from a previously
open page.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Closes the 4th PR #138 review (3 suggestions, no blockers).
- Double-call safety: a failed turn fires both useChat onFinish AND onError, so
onTurnFinished can run twice in one turn (streamed id, then no id) before a
re-render. onTurnFinished now reads the live id from a ref (set imperatively on
primary adoption) instead of the stale closure, so the 2nd no-id call cannot
re-arm the error-path fallback at the source; the render-phase reconciler is the
second layer. Added a hook test for the sequence — verified it fails only if
BOTH layers are removed (non-tautological).
- Conventions: extracted named UseChatSessionOptions / UseChatSessionResult
interfaces (was an inline param literal + ChatSession); the test derives its
driver props from them.
- Simplification: extracted the chatIdSnapshot(chats) projection used at both the
fallback arm site and the resolver effect.
Architecture notes from the review (caller-driven disarm contract; cross-process
{chatId} type) intentionally left as Variant A per the reviewer's recommendation.
tsc clean; 128 ai-chat tests green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Closes the 3rd PR #138 review.
Warning fix: the render-phase reconciler only disarms the error-path adoption
fallback when activeChatId actually changes. Pressing 'New chat' while ALREADY
in a new chat keeps activeChatId === null (a no-op atom write), so the reconciler
never fired and a stale armed fallback could adopt the just-failed chat from a
late refetch, yanking the user out of their fresh chat. useChatSession now
returns cancelPendingAdoption(); the window calls it from startNewChat AND
selectChat. (The hook call moved above those callbacks so they can reference it.)
Added a hook test that fails without the explicit disarm, plus a test for the
existing-chat onTurnFinished branch (no adoption + per-chat invalidation).
Cleanups: removed the dead pickNewlyCreatedChatId (the fallback effect uses
newlyAddedChatIds directly with the 0/1/>1 decision inline) and its tests/doc
mention; inlined the two invalidation closures (onTurnFinished is read live by
useChat's onFinish, never in an effect dep array, so memoizing them was needless
ceremony).
Verified: tsc clean, 127 ai-chat tests green; live (z.ai glm-5.2) new chat + 2nd
turn recalled the number in the SAME row (1 chat / 4 messages), no page errors.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Node's fetch returns a generic "fetch failed" error, hiding the actual
reason (e.g., ECONNRESET, timeout) in the error's cause chain. This
change extracts up to three levels of the cause, formats each with its
code and message, and includes the chain in the warning log, making
failures more actionable.
A new-chat turn fired the chat stream (streamText) and title generation
(generateText) concurrently to the same z.ai coding endpoint. That plan
stalls one of two concurrent requests, so the chat stream black-holed for
~300s (undici headers timeout) and the turn hung forever in every browser;
the AI SDK then retried 3x. Server logs showed two concurrent POSTs to
/chat/completions per turn — one 200 in ~8s, the other "fetch failed after
301209ms". Bypassing the custom undici transport did not help, confirming
the cause is the concurrency, not the transport.
Move generateTitle from before the response pipe into onFinish, so it runs
solo AFTER the stream's provider call completes. A first turn that errors or
aborts no longer auto-titles (fallback "Untitled chat" already handles a
null title) — acceptable, and it removes the request that was stalling.
Add a per-card "Save and test" button to the AI provider settings (save the
whole form, then probe the endpoint on success).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>