Compare commits

..

5 Commits

Author SHA1 Message Date
claude code agent 227
cb61274187 test(ai-chat): simplify msg factory and lock signature↔render coupling
Address non-blocking review items on the AI-chat stream-perf PR:

- Drop the unused `metadata` param from the `msg` test factory in
  message-item.test.ts; no caller passed it.
- Add a per-part-kind coupling guard to message-signature.test.ts that, for
  each part kind rendered today (text, reasoning, tool-*) plus the metadata
  banners, asserts that mutating a field the MessageItem render body DRAWS
  flips messageSignature — an executable lock for the load-bearing memo
  invariant documented in message-signature.ts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 16:57:31 +03:00
claude_code
eafd15f0ef docs(ai-chat): document load-bearing invariant of messageSignature memo
PR #182 review (post-fix pass) surfaced two latent correctness risks in the
new MessageItem memo: the per-message signature tracks only [type, text length,
state, error/output presence] + metadata, so a part kind whose VISIBLE content
can change WITHOUT changing those fields would silently freeze a stale row.
Neither is reachable with the current toolset (tool output is set once;
streaming is append-only with a fixed id), so the correct fix is to harden the
documented invariant rather than hash output content on every delta (getPage
returns full page content — hashing it per-delta would tax the hot path this
PR optimizes).

Add a WARNING in messageSignature naming the two future triggers (a tool that
streams `preliminary` output; a client-side regenerate/edit that mutates a
finalized row in place) and the required action (extend the signature).

No behavior change (comment only). vitest src/features/ai-chat 189/189 pass,
tsc clean for the touched files.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 23:44:49 +03:00
claude_code
63c26042ba test(review): address PR #182 review — tests + extract messageSignature, CHANGELOG
Resolve the PR #182 code-review (Request changes) on top of the already-merged
develop (the merge commit preserves both the markdown useMemo and the
collapseBlankLines fix in reasoning-block.tsx).

- Extract messageSignature from message-item.tsx into utils/message-signature.ts
  (matches the feature's "pure UIMessage helper + colocated test" convention) and
  export arePropsEqual so the memo seam is unit-testable. No logic change.
- Add utils/message-signature.test.ts covering every change signal (text grows,
  part appended, state flip, output appears, errorText appears, usage.reasoningTokens
  arriving on finish-step, metadata error/finishReason) plus the negative
  content-identical-clone case.
- Add components/message-item.test.ts for arePropsEqual (each prop diff -> false,
  identity fast-path -> true, same-content-different-object -> true, changed -> false).
- Add components/message-item-memo.test.tsx: render-level proof that finalized text
  parts are not re-parsed when only a tail part grows (MarkdownPart memo).
- CHANGELOG: add the user-facing 100% CPU freeze fix under [Unreleased] / Fixed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 22:33:14 +03:00
claude_code
2f058a6e40 Merge remote-tracking branch 'gitea/develop' into fix/ai-chat-stream-perf
# Conflicts:
#	apps/client/src/features/ai-chat/components/reasoning-block.tsx
2026-06-25 22:21:41 +03:00
claude_code
99d0cb8773 perf(ai-chat): throttle stream + memoize markdown to stop CPU spikes on long runs
On long agent runs (dozens of tool calls) the desktop app froze at 100% CPU with
no user interaction: useChat updated state on every streamed token, and
MessageItem/ReasoningBlock re-parsed the whole transcript's markdown (the marked
pipeline + DOMPurify) on every delta. Per-turn work grew quadratically and
saturated the main thread; the SSE stream drove it, so it hung "on its own".

- chat-thread: pass experimental_throttle (50ms) to useChat so the streamed
  messages state re-renders at most ~20 Hz instead of once per token.
- message-item: memoize MessageItem on a cheap per-message content signature
  (the streaming tail still re-renders as it grows; finalized rows are skipped),
  and render each text part via a memoized MarkdownPart so finalized parts are
  not re-parsed. The signature includes usage.reasoningTokens so the
  authoritative "Thinking - N tokens" count still snaps in at finish-step.
- reasoning-block: memoize the markdown render (useMemo on the text) and wrap the
  component in React.memo.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 03:26:44 +03:00
26 changed files with 895 additions and 458 deletions

View File

@@ -43,13 +43,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
OpenRouter, etc.; `openai` uses the official provider (real-OpenAI
reasoning-model request shaping). Chosen explicitly rather than inferred from
the base URL, since a custom URL can front real OpenAI too. (#175, #177)
- **AI chat "Context window (tokens)" setting (`chatContextWindow`).** A new
admin field in AI settings that records the chat model's context-window size.
When set (> 0) it becomes the denominator of the header context-badge, which
now reads "used / max"; `0`/empty clears the limit and the badge shows only
the current context as before. There is no provider-independent way to read a
model's window automatically, so it is an explicit workspace-level value.
(#189)
- **Per-MCP-server instructions in the agent prompt.** Each external MCP server
now has an admin-authored `instructions` field ("how/when to use this server's
tools") that is injected into the agent's system prompt next to that server's
@@ -68,12 +61,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
model's reasoning out of the box. An endpoint that is real OpenAI behind a
custom base URL should set the new `chatApiStyle` "Protocol" to `openai`. (#177)
- **AI chat header context-badge now shows "used / max".** When an admin sets
the new `chatContextWindow`, the badge displays the current context size over
the configured window (e.g. `120k / 200k`) instead of switching to a live
per-turn token counter during streaming. With no window configured the badge
keeps showing just the current context. (#189)
- **Footnotes now reuse (Pandoc semantics).** Multiple `[^a]` references to the
same id are ONE footnote — one number, one definition, several back-references
— instead of being renamed to `a__2`, `a__3`. Duplicate `[^a]:` definitions are
@@ -91,6 +78,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Fixed
- **AI chat: the desktop app no longer freezes at 100% CPU on long agent runs.**
`useChat` re-rendered on every streamed token and `MessageItem`/`ReasoningBlock`
re-parsed the whole transcript markdown (marked + DOMPurify) on every delta, so
per-turn work grew quadratically and saturated the main thread. The stream is now
throttled (`experimental_throttle`) to ~20 Hz and each finalized message row /
markdown part / reasoning block is memoized, so a long turn no longer re-parses
already-finished content. (#182)
- **Editor: caret/selection landed on the wrong line when clicking inside code
blocks and footnotes.** The affected NodeViews rendered their non-editable
chrome (language menu, footnotes heading, footnote number marker) before the

View File

@@ -1168,10 +1168,7 @@
"Built-in assistant persona": "Built-in assistant persona",
"Minimize": "Minimize",
"Current context size": "Current context size",
"Context size / model limit": "Context size / model limit",
"Context window (tokens)": "Context window (tokens)",
"Shows used / total in the chat header badge; empty hides the total.": "Shows used / total in the chat header badge; empty hides the total.",
"e.g. 200000": "e.g. 200000",
"Tokens generated this turn": "Tokens generated this turn",
"AI agent": "AI agent",
"Take a look at the current document": "Take a look at the current document",
"AI agent is typing…": "AI agent is typing…",

View File

@@ -705,10 +705,7 @@
"Copy chat": "Копировать чат",
"Created successfully": "Успешно создано",
"Current context size": "Текущий размер контекста",
"Context size / model limit": "Размер контекста / лимит модели",
"Context window (tokens)": "Размер окна контекста (токены)",
"Shows used / total in the chat header badge; empty hides the total.": "Показывает использовано/всего в шапке чата; пусто — скрыть лимит.",
"e.g. 200000": "напр. 200000",
"Tokens generated this turn": "Токенов сгенерировано за ход",
"Delete this chat?": "Удалить этот чат?",
"Deleted successfully": "Успешно удалено",
"Edited by AI agent on behalf of {{name}}": "Отредактировано AI-агентом от имени {{name}}",

View File

@@ -6,7 +6,7 @@ import {
useRef,
useState,
} from "react";
import { Group, Loader } from "@mantine/core";
import { Group, Loader, Tooltip } from "@mantine/core";
import {
IconArrowsDiagonal,
IconCheck,
@@ -39,7 +39,6 @@ import {
} from "@/features/ai-chat/queries/ai-chat-query.ts";
import ConversationList from "@/features/ai-chat/components/conversation-list.tsx";
import ChatThread from "@/features/ai-chat/components/chat-thread.tsx";
import { ContextBadge } from "@/features/ai-chat/components/context-badge.tsx";
import { exportAiChat } from "@/features/ai-chat/services/ai-chat-service.ts";
import { useChatSession } from "@/features/ai-chat/hooks/use-chat-session.ts";
import {
@@ -61,6 +60,13 @@ const MIN_HEIGHT = 400;
// Margin kept between the window and the viewport edges while dragging.
const EDGE_MARGIN = 8;
/** Compact token formatter: 1.2M / 3.4k / 950. */
function formatTokens(n: number): string {
if (n >= 1_000_000) return `${(n / 1_000_000).toFixed(1)}M`;
if (n >= 1_000) return `${(n / 1_000).toFixed(1)}k`;
return String(n);
}
// Compute the initial top-right placement at the default size, fitted to the
// current viewport. Reads `window` only when called (inside an effect).
function computeInitialGeom() {
@@ -155,6 +161,12 @@ export default function AiChatWindow() {
const { data: messageRows, isLoading: messagesLoading } =
useAiChatMessagesQuery(activeChatId ?? undefined);
// Live turn-token total (reasoning + output) for the in-flight turn, pushed up
// (THROTTLED to ~8 Hz inside ChatThread) so the header badge ticks mid-stream.
// `null` means no turn is in flight -> the badge falls back to the persisted
// context size below.
const [liveTurnTokens, setLiveTurnTokens] = useState<number | null>(null);
// The page the user is currently viewing. AiChatWindow lives in a pathless
// parent layout route, so useParams() can't see :pageSlug. Match the full
// pathname against the authenticated page route instead so "the current page"
@@ -294,21 +306,6 @@ export default function AiChatWindow() {
return 0;
}, [activeChatId, messageRows]);
// The model's context-window size (badge denominator), read from the most
// recent assistant row that carries it. Admin-configured in AI settings and
// stamped onto the turn server-side, so it travels with the message metadata —
// no client-side model resolution, and it survives public shares / per-role
// models automatically. 0 (no limit configured, or older rows) → the badge
// hides the denominator and shows only the current context size.
const maxContextTokens = useMemo(() => {
if (!activeChatId || !messageRows) return 0;
for (let i = messageRows.length - 1; i >= 0; i--) {
const max = messageRows[i].metadata?.maxContextTokens;
if (typeof max === "number" && max > 0) return max;
}
return 0;
}, [activeChatId, messageRows]);
// On (re)open, settle the geometry before paint (useLayoutEffect → no
// first-frame jump): compute an initial top-right placement the first time,
// and re-clamp an existing geometry to the current viewport on later opens
@@ -498,14 +495,23 @@ export default function AiChatWindow() {
)}
<div style={{ flex: 1, display: "flex", justifyContent: "center" }}>
{/* Context badge: always "current / max" context size (or just current
when no model limit is configured). It no longer flips to a live
per-turn generation counter mid-stream — that live feedback lives in
the chat body's "Thinking · N tokens" block. */}
<ContextBadge
contextTokens={contextTokens}
maxContextTokens={maxContextTokens}
/>
{/* While a turn streams, show the LIVE turn-token count (ticks ~8 Hz);
once it finishes, fall back to the persisted context size. Require
> 0 so the very first emit (an empty tail message, count 0) does not
flash a "0" badge before any token streams in (#151 review). */}
{liveTurnTokens !== null && liveTurnTokens > 0 ? (
<Tooltip label={t("Tokens generated this turn")} withArrow>
<span className={classes.badge}>
{formatTokens(liveTurnTokens)}
</span>
</Tooltip>
) : contextTokens > 0 ? (
<Tooltip label={t("Current context size")} withArrow>
<span className={classes.badge}>
{formatTokens(contextTokens)}
</span>
</Tooltip>
) : null}
</div>
<div style={{ display: "flex", alignItems: "center", gap: 1 }}>
@@ -628,6 +634,7 @@ export default function AiChatWindow() {
assistantName={currentRole?.name}
onTurnFinished={onTurnFinished}
onServerChatId={onServerChatId}
onLiveTurnTokens={setLiveTurnTokens}
/>
)}
</div>

View File

@@ -20,6 +20,7 @@ import {
} from "@/features/ai-chat/utils/role-launch.ts";
import { describeChatError } from "@/features/ai-chat/utils/error-message.ts";
import { extractServerChatId } from "@/features/ai-chat/utils/adopt-chat-id.ts";
import { liveTurnTokens } from "@/features/ai-chat/utils/count-stream-tokens.ts";
import {
dequeue,
enqueueMessage,
@@ -28,6 +29,14 @@ import {
} from "@/features/ai-chat/utils/queue-helpers.ts";
import classes from "@/features/ai-chat/components/ai-chat.module.css";
// Throttle how often the streamed `messages` state triggers a re-render. Without
// it, useChat updates state on EVERY token, so the whole transcript's markdown
// (marked + DOMPurify) is re-parsed per token — on a long agent run that grows
// into a quadratic CPU storm that pins the main thread and freezes the UI.
// ~50ms (20 Hz) keeps streaming visually smooth while decoupling re-render cost
// from the token rate.
const STREAM_THROTTLE_MS = 50;
/** The page the user is currently viewing, sent as chat context. */
export interface OpenPageContext {
id: string;
@@ -66,6 +75,12 @@ interface ChatThreadProps {
* Copy/export button available mid-stream). Distinct from onTurnFinished,
* which fires only at the terminal outcome. */
onServerChatId?: (serverChatId?: string) => void;
/** Reports the live turn-token total (reasoning + output) for the in-flight
* turn so the parent can show a header badge that ticks mid-stream. THROTTLED
* here (~8 Hz) so the parent re-renders a handful of times a second, not on
* every streamed delta. Called with `null` when no turn is in flight (the
* parent then reverts the badge to the persisted context size). */
onLiveTurnTokens?: (tokens: number | null) => void;
}
/**
@@ -110,6 +125,7 @@ export default function ChatThread({
assistantName,
onTurnFinished,
onServerChatId,
onLiveTurnTokens,
}: ChatThreadProps) {
const { t } = useTranslation();
@@ -238,6 +254,8 @@ export default function ChatThread({
id: chatStoreId,
messages: initialMessages,
transport,
// See STREAM_THROTTLE_MS — bounds re-render/markdown-reparse frequency.
experimental_throttle: STREAM_THROTTLE_MS,
// `onFinish` (ai@6 useChat) fires from a `finally` on EVERY terminal outcome
// — success, user Stop/abort (`isAbort`), network drop (`isDisconnect`), and
// stream error (`isError`). Keep calling `onTurnFinished()` on all of them
@@ -320,6 +338,53 @@ export default function ChatThread({
// the SAME on-screen banner text can be mirrored into the export (issue #160).
const errorView = error ? describeChatError(error.message ?? "", t) : null;
// Report the live turn-token total to the parent header badge, THROTTLED to
// ~8 Hz so the parent re-renders a few times a second instead of on every
// streamed delta. The tail assistant message's reasoning+output (estimate while
// streaming, authoritative once a step reports usage) is the live figure. When
// the turn ends we emit a final exact value, then `null` so the parent reverts
// the badge to the persisted context size.
const lastEmitRef = useRef(0);
const emitTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null);
useEffect(() => {
if (!onLiveTurnTokens) return;
if (!isStreaming) {
// Turn ended (or never started): clear any pending throttle and revert.
if (emitTimerRef.current) {
clearTimeout(emitTimerRef.current);
emitTimerRef.current = null;
}
lastEmitRef.current = 0;
onLiveTurnTokens(null);
return;
}
const tail = messages[messages.length - 1];
const live = tail?.role === "assistant" ? liveTurnTokens(tail) : null;
const total = live ? live.reasoning + live.output : 0;
const now = Date.now();
const MIN_INTERVAL = 120; // ms (~8 Hz)
const elapsed = now - lastEmitRef.current;
if (elapsed >= MIN_INTERVAL) {
lastEmitRef.current = now;
onLiveTurnTokens(total);
} else if (!emitTimerRef.current) {
// Schedule a trailing emit so the FINAL value of a burst is not dropped.
emitTimerRef.current = setTimeout(() => {
emitTimerRef.current = null;
lastEmitRef.current = Date.now();
onLiveTurnTokens(total);
}, MIN_INTERVAL - elapsed);
}
}, [messages, isStreaming, onLiveTurnTokens]);
// Clear any pending throttle timer on unmount (chat switch via `key`) so a
// trailing emit can't fire into a torn-down thread's parent.
useEffect(() => {
return () => {
if (emitTimerRef.current) clearTimeout(emitTimerRef.current);
};
}, []);
// A role was picked with autoStart=false: the role is bound but NOTHING was
// sent, so chatId stays null and the empty state would keep showing the cards.
// This flag hides the cards and reveals the composer (with the role indicated)

View File

@@ -1,69 +0,0 @@
import { describe, it, expect } from "vitest";
import { render, screen, fireEvent } from "@testing-library/react";
import { MantineProvider } from "@mantine/core";
import { ContextBadge, formatTokens } from "./context-badge";
// matchMedia (read by MantineProvider) is stubbed globally in vitest.setup.ts.
// Without an I18nextProvider, `t(key)` returns the key verbatim, so tooltip
// labels assert against their English source strings.
function renderBadge(props: {
contextTokens: number;
maxContextTokens?: number;
}) {
return render(
<MantineProvider>
<ContextBadge {...props} />
</MantineProvider>,
);
}
describe("formatTokens", () => {
it("formats with k / M suffixes", () => {
expect(formatTokens(572)).toBe("572");
expect(formatTokens(200_000)).toBe("200.0k");
expect(formatTokens(1_500_000)).toBe("1.5M");
});
});
describe("ContextBadge", () => {
it("shows `current / max` when a limit is configured", () => {
renderBadge({ contextTokens: 572, maxContextTokens: 200_000 });
expect(screen.getByText("572 / 200.0k")).toBeDefined();
});
it("shows only the current size when no limit is configured", () => {
renderBadge({ contextTokens: 572, maxContextTokens: 0 });
expect(screen.getByText("572")).toBeDefined();
// No denominator rendered.
expect(screen.queryByText(/\//)).toBeNull();
});
it("treats an undefined limit as no limit", () => {
renderBadge({ contextTokens: 1234 });
expect(screen.getByText("1.2k")).toBeDefined();
expect(screen.queryByText(/\//)).toBeNull();
});
it("renders nothing until there is a current context size", () => {
const { container } = renderBadge({
contextTokens: 0,
maxContextTokens: 200_000,
});
expect(container.querySelector("span")).toBeNull();
});
it("never flips to a live per-turn counter (no live mode); shows context as-is even above max", () => {
// `current > max` (estimate drift / smaller-model role) is shown unclamped.
renderBadge({ contextTokens: 210_000, maxContextTokens: 200_000 });
expect(screen.getByText("210.0k / 200.0k")).toBeDefined();
});
it("exposes the limit tooltip label on hover", async () => {
renderBadge({ contextTokens: 572, maxContextTokens: 200_000 });
fireEvent.mouseEnter(screen.getByText("572 / 200.0k"));
expect(
await screen.findByText("Context size / model limit"),
).toBeDefined();
});
});

View File

@@ -1,61 +0,0 @@
import { Tooltip } from "@mantine/core";
import { useTranslation } from "react-i18next";
import classes from "@/features/ai-chat/components/ai-chat-window.module.css";
/** Compact token formatter: 1.2M / 3.4k / 950. */
export function formatTokens(n: number): string {
if (n >= 1_000_000) return `${(n / 1_000_000).toFixed(1)}M`;
if (n >= 1_000) return `${(n / 1_000).toFixed(1)}k`;
return String(n);
}
interface ContextBadgeProps {
// Current context size for the active chat (tokens occupied in the model's
// window). 0 = unknown → nothing is rendered.
contextTokens: number;
// The model's context-window size (tokens), from AI settings. 0/undefined =
// no limit known → only the current size is shown (no denominator).
maxContextTokens?: number;
}
/**
* Header badge that ALWAYS shows the current context size, and — when the model's
* context-window size is configured — appends "/ max" so the badge reads
* "current / max" (e.g. `572 / 200k`). This is a single, stable meaning: unlike
* the previous design it never flips to a live per-turn generation counter while
* streaming (that live feedback lives in the chat body's "Thinking · N tokens").
*
* No limit configured (or older history rows without it) → the denominator is
* hidden and the badge shows the current size only, matching the prior at-rest
* behaviour. `context > max` (estimate drift, or a role on a smaller model) is
* shown as-is, without clamping.
*/
export function ContextBadge({
contextTokens,
maxContextTokens,
}: ContextBadgeProps) {
const { t } = useTranslation();
// Nothing to show until the first persisted context figure exists.
if (!(contextTokens > 0)) return null;
const hasMax = typeof maxContextTokens === "number" && maxContextTokens > 0;
const label = hasMax
? `${formatTokens(contextTokens)} / ${formatTokens(maxContextTokens)}`
: formatTokens(contextTokens);
return (
<Tooltip
label={
hasMax
? t("Context size / model limit")
: t("Current context size")
}
withArrow
>
<span className={classes.badge}>{label}</span>
</Tooltip>
);
}
export default ContextBadge;

View File

@@ -0,0 +1,81 @@
import { describe, expect, it, vi } from "vitest";
import { render } from "@testing-library/react";
import { MantineProvider } from "@mantine/core";
import type { UIMessage } from "@ai-sdk/react";
// Stub react-i18next (the component reads `useTranslation`). Mirrors the stub in
// reasoning-block.test.tsx.
vi.mock("react-i18next", () => ({
useTranslation: () => ({ t: (key: string) => key }),
}));
// Spy on `renderChatMarkdown` so we can count parse calls per text. We keep every
// OTHER named export of markdown.ts intact via `importActual`, and override only
// `renderChatMarkdown` with a `vi.fn()` that returns simple HTML so the component
// still renders. This is the seam that proves the MarkdownPart memo works: a
// finalized text part must NOT be re-parsed on a later streamed delta.
// `vi.hoisted` so the spy exists when the hoisted `vi.mock` factory runs.
const { renderChatMarkdownSpy } = vi.hoisted(() => ({
renderChatMarkdownSpy: vi.fn((text: string) => `<p>${text}</p>`),
}));
vi.mock("@/features/ai-chat/utils/markdown.ts", async () => {
const actual = await vi.importActual<
typeof import("@/features/ai-chat/utils/markdown.ts")
>("@/features/ai-chat/utils/markdown.ts");
return { ...actual, renderChatMarkdown: renderChatMarkdownSpy };
});
import MessageItem from "./message-item";
// matchMedia (read by MantineProvider) is stubbed globally in vitest.setup.ts.
const msg = (parts: UIMessage["parts"]): UIMessage =>
({ id: "m1", role: "assistant", parts }) as UIMessage;
const renderRow = (message: UIMessage) =>
render(
<MantineProvider>
<MessageItem message={message} />
</MantineProvider>,
);
/** Count how many spy calls parsed exactly `text` (filtering by the first arg). */
const callsFor = (text: string) =>
renderChatMarkdownSpy.mock.calls.filter((c) => c[0] === text).length;
describe("MessageItem markdown memoization", () => {
it("does not re-parse finalized text parts when only a tail part grows", () => {
renderChatMarkdownSpy.mockClear();
// Two finalized text parts.
const first = msg([
{ type: "text", text: "alpha" },
{ type: "text", text: "beta" },
]);
const { rerender } = renderRow(first);
// Both finalized parts parsed exactly once on the initial render.
expect(callsFor("alpha")).toBe(1);
expect(callsFor("beta")).toBe(1);
// A streamed delta: a NEW message object where only a third tail part grows;
// the first two parts' text is byte-identical.
const next = msg([
{ type: "text", text: "alpha" },
{ type: "text", text: "beta" },
{ type: "text", text: "gamm" },
]);
rerender(
<MantineProvider>
<MessageItem message={next} />
</MantineProvider>,
);
// The finalized parts hit the MarkdownPart memo: still parsed at most once
// each across BOTH renders (the resilient invariant). The only new parse is
// for the changed/added tail part.
expect(callsFor("alpha")).toBe(1);
expect(callsFor("beta")).toBe(1);
expect(callsFor("gamm")).toBe(1);
});
});

View File

@@ -0,0 +1,73 @@
import { describe, expect, it, vi } from "vitest";
import type { UIMessage } from "@ai-sdk/react";
// Stub react-i18next: importing the component module pulls in `useTranslation`,
// and we only exercise the pure `arePropsEqual` comparator (no rendering), so a
// minimal `t` that echoes the key is enough. Mirrors the stub in
// reasoning-block.test.tsx.
vi.mock("react-i18next", () => ({
useTranslation: () => ({ t: (key: string) => key }),
}));
import { arePropsEqual } from "./message-item";
/**
* Tests for `arePropsEqual`, the `React.memo` comparator for MessageItem. It must
* return false on any visible prop/content change (so the row re-renders) and
* true when nothing visible changed (so a finalized row is skipped). A FIXED
* message id is used so a content-identical clone yields an equal signature.
*/
const msg = (parts: UIMessage["parts"]): UIMessage =>
({ id: "m1", role: "assistant", parts }) as UIMessage;
const props = (
message: UIMessage,
over: Record<string, unknown> = {},
) => ({
message,
showCitations: true,
neutralizeInternalLinks: false,
assistantName: "AI",
...over,
});
describe("arePropsEqual", () => {
it("returns false when showCitations differs", () => {
const m = msg([{ type: "text", text: "answer" }]);
expect(
arePropsEqual(props(m), props(m, { showCitations: false })),
).toBe(false);
});
it("returns false when neutralizeInternalLinks differs", () => {
const m = msg([{ type: "text", text: "answer" }]);
expect(
arePropsEqual(props(m), props(m, { neutralizeInternalLinks: true })),
).toBe(false);
});
it("returns false when assistantName differs", () => {
const m = msg([{ type: "text", text: "answer" }]);
expect(
arePropsEqual(props(m), props(m, { assistantName: "Other" })),
).toBe(false);
});
it("returns true on the identity fast path (same message object, equal props)", () => {
const m = msg([{ type: "text", text: "answer" }]);
expect(arePropsEqual(props(m), props(m))).toBe(true);
});
it("returns true for the same content in a different message object", () => {
const a = msg([{ type: "text", text: "answer" }]);
const b = msg([{ type: "text", text: "answer" }]);
expect(a).not.toBe(b);
expect(arePropsEqual(props(a), props(b))).toBe(true);
});
it("returns false when content changed in a different message object", () => {
const a = msg([{ type: "text", text: "answer" }]);
const b = msg([{ type: "text", text: "answer grown" }]);
expect(arePropsEqual(props(a), props(b))).toBe(false);
});
});

View File

@@ -1,3 +1,4 @@
import { memo } from "react";
import { Box, Text } from "@mantine/core";
import { useTranslation } from "react-i18next";
import type { UIMessage } from "@ai-sdk/react";
@@ -10,6 +11,7 @@ import { assistantMessageHasVisibleContent } from "@/features/ai-chat/utils/mess
import { renderChatMarkdown } from "@/features/ai-chat/utils/markdown.ts";
import { resolveAssistantName } from "@/features/ai-chat/utils/assistant-name.ts";
import { reasoningTokensForPart } from "@/features/ai-chat/utils/reasoning-tokens.ts";
import { messageSignature } from "@/features/ai-chat/utils/message-signature.ts";
import { describeChatError } from "@/features/ai-chat/utils/error-message.ts";
import classes from "@/features/ai-chat/components/ai-chat.module.css";
@@ -34,6 +36,39 @@ interface MessageItemProps {
assistantName?: string;
}
/**
* One assistant text part rendered as sanitized markdown. Memoized on its inputs
* so a finalized text part is NOT re-parsed on every streamed delta: during a
* turn only the actively-growing tail part changes its `text`, so every earlier
* part hits the memo and skips the expensive marked + DOMPurify pass. Props are
* primitives, so React.memo's default shallow compare is exactly right (the
* `text` string is compared by value).
*/
const MarkdownPart = memo(function MarkdownPart({
text,
neutralizeInternalLinks,
}: {
text: string;
neutralizeInternalLinks: boolean;
}) {
const html = renderChatMarkdown(text, { neutralizeInternalLinks });
if (html) {
return (
<div
className={classes.markdown}
// Sanitized by renderChatMarkdown (DOMPurify) before insertion.
dangerouslySetInnerHTML={{ __html: html }}
/>
);
}
// Fallback when markdown could not render synchronously: raw text.
return (
<Text className={classes.markdown} style={{ whiteSpace: "pre-wrap" }}>
{text}
</Text>
);
});
/**
* Render a single UIMessage by iterating its `parts`:
* - `text` parts -> sanitized markdown.
@@ -41,12 +76,13 @@ interface MessageItemProps {
* Other part kinds (reasoning, sources, files, step-start) are ignored for v1.
* User messages render their text as a right-aligned plain bubble.
*
* This component is intentionally NOT memoized: `useChat` replaces the streaming
* assistant message with a freshly cloned object on every streamed delta, so the
* `message` prop identity (and its `parts`) changes each tick. Re-rendering the
* text parts on each delta is what makes the answer stream in progressively.
* This component is memoized (see `arePropsEqual` at the bottom) on a cheap
* per-message content signature: the streaming TAIL message's signature changes
* on each delta so it still re-renders and streams in, while finalized rows are
* skipped. Each text part's markdown is itself memoized via `MarkdownPart`, so a
* long turn no longer re-parses the whole transcript on every token.
*/
export default function MessageItem({
function MessageItem({
message,
showCitations = true,
neutralizeInternalLinks = false,
@@ -109,24 +145,12 @@ export default function MessageItem({
// starts with an empty text part before the first token arrives); the
// typing indicator covers that gap until real content streams in.
if (!part.text.trim()) return null;
const html = renderChatMarkdown(part.text, {
neutralizeInternalLinks,
});
if (html) {
return (
<div
key={index}
className={classes.markdown}
// Sanitized by renderChatMarkdown (DOMPurify) before insertion.
dangerouslySetInnerHTML={{ __html: html }}
/>
);
}
// Fallback when markdown could not render synchronously: raw text.
return (
<Text key={index} className={classes.markdown} style={{ whiteSpace: "pre-wrap" }}>
{part.text}
</Text>
<MarkdownPart
key={index}
text={part.text}
neutralizeInternalLinks={neutralizeInternalLinks}
/>
);
}
@@ -177,3 +201,26 @@ export default function MessageItem({
</Box>
);
}
/** Skip re-rendering a message whose visible content is unchanged. The streaming
* TAIL message gets a fresh object whose signature changes each delta, so it
* still re-renders and streams in; every FINALIZED message is skipped, turning a
* per-token whole-transcript re-render into a tail-only one. */
export function arePropsEqual(
prev: MessageItemProps,
next: MessageItemProps,
): boolean {
if (
prev.showCitations !== next.showCitations ||
prev.neutralizeInternalLinks !== next.neutralizeInternalLinks ||
prev.assistantName !== next.assistantName
) {
return false;
}
// Fast path: identical message object (finalized rows keep their identity
// across deltas) — skip without building signatures.
if (prev.message === next.message) return true;
return messageSignature(prev.message) === messageSignature(next.message);
}
export default memo(MessageItem, arePropsEqual);

View File

@@ -1,4 +1,4 @@
import { useState } from "react";
import { memo, useMemo, useState } from "react";
import { Box, Collapse, Group, Text, UnstyledButton } from "@mantine/core";
import { IconChevronDown } from "@tabler/icons-react";
import { useTranslation } from "react-i18next";
@@ -27,19 +27,23 @@ interface ReasoningBlockProps {
* Providers that don't stream reasoning TEXT still render this block from the
* authoritative count alone (header only, empty body) so the cost is visible.
*/
export default function ReasoningBlock({ text, tokens }: ReasoningBlockProps) {
function ReasoningBlock({ text, tokens }: ReasoningBlockProps) {
const { t } = useTranslation();
const [open, setOpen] = useState(false);
// Authoritative count wins; otherwise estimate live from the streamed text.
const count = tokens && tokens > 0 ? tokens : estimateTokens(text);
const trimmed = text.trim();
// Collapse the blank-line gaps the model emits between every list item /
// paragraph so the reasoning renders compactly (tight lists, joined
// paragraphs) — see collapseBlankLines. ONLY here, not in the normal answer.
const html = trimmed
? renderChatMarkdown(collapseBlankLines(trimmed), {})
: "";
// Memoize the markdown render so toggling `open` (or a parent re-render caused
// by an unrelated streamed delta) does not re-parse the reasoning text; it
// recomputes only when the reasoning text itself changes (while it streams in).
// collapseBlankLines collapses the blank-line gaps the model emits between every
// list item / paragraph so the reasoning renders compactly (tight lists, joined
// paragraphs) — ONLY here, not in the normal answer.
const html = useMemo(
() => (trimmed ? renderChatMarkdown(collapseBlankLines(trimmed), {}) : ""),
[trimmed],
);
return (
<Box className={classes.reasoningBlock} mb={6}>
@@ -87,3 +91,8 @@ export default function ReasoningBlock({ text, tokens }: ReasoningBlockProps) {
</Box>
);
}
// Memoized: re-renders only when `text`/`tokens` change (primitive props, default
// shallow compare), so a parent re-render during streaming of OTHER content does
// not re-run the markdown parse for an already-finalized reasoning block.
export default memo(ReasoningBlock);

View File

@@ -113,14 +113,9 @@ export interface IAiChatMessageRow {
};
// Current context size for the turn = final-step (input+output) tokens, i.e.
// how much the conversation occupies in the model's context window after this
// turn. Distinct from `usage` (legacy cumulative totalUsage). Shown as the
// numerator of the floating window's "current / max" header badge.
// turn. Distinct from `usage` (legacy cumulative totalUsage). Shown in the
// floating window's header badge.
contextTokens?: number;
// The model's context-window size (tokens), admin-configured in AI settings
// and stamped onto the turn server-side. The denominator of the header badge.
// Absent/0 (older rows, or no limit configured) → the badge hides the
// denominator and shows only the current context size (`contextTokens`).
maxContextTokens?: number;
// Set on an assistant row whose turn ended in a provider/stream error; the
// raw provider error text (e.g. "402: ...") for inline display in the thread.
error?: string;

View File

@@ -1,5 +1,17 @@
import { describe, expect, it } from "vitest";
import { estimateTokens } from "@/features/ai-chat/utils/count-stream-tokens.ts";
import type { UIMessage } from "@ai-sdk/react";
import {
estimateTokens,
liveTurnTokens,
} from "@/features/ai-chat/utils/count-stream-tokens.ts";
const msg = (parts: unknown[], metadata?: unknown): UIMessage =>
({
id: Math.random().toString(),
role: "assistant",
parts,
metadata,
}) as UIMessage;
describe("estimateTokens", () => {
it("returns 0 for the empty string", () => {
@@ -13,3 +25,147 @@ describe("estimateTokens", () => {
expect(estimateTokens("12345678")).toBe(2);
});
});
describe("liveTurnTokens — estimate path", () => {
it("is all zeros for an undefined message", () => {
expect(liveTurnTokens(undefined)).toEqual({
reasoning: 0,
output: 0,
authoritative: false,
});
});
it("is all zeros for a parts-less message", () => {
expect(liveTurnTokens({ id: "x", role: "assistant" } as UIMessage)).toEqual({
reasoning: 0,
output: 0,
authoritative: false,
});
});
it("estimates output from text parts", () => {
// 8 chars -> 2 tokens.
const r = liveTurnTokens(msg([{ type: "text", text: "12345678" }]));
expect(r).toEqual({ reasoning: 0, output: 2, authoritative: false });
});
it("estimates reasoning from reasoning parts (kept separate from output)", () => {
const r = liveTurnTokens(
msg([
{ type: "reasoning", text: "12345678" },
{ type: "text", text: "abcd" },
]),
);
expect(r).toEqual({ reasoning: 2, output: 1, authoritative: false });
});
it("accumulates across multiple text + reasoning parts (multi-step)", () => {
const r = liveTurnTokens(
msg([
{ type: "reasoning", text: "abcd" }, // 1
{ type: "text", text: "abcd" }, // 1
{ type: "tool-getPage", state: "output-available" }, // ignored
{ type: "reasoning", text: "abcd" }, // 1
{ type: "text", text: "abcdefgh" }, // 2
]),
);
expect(r).toEqual({ reasoning: 2, output: 3, authoritative: false });
});
it("ignores non text/reasoning parts (tools, step-start)", () => {
const r = liveTurnTokens(
msg([
{ type: "step-start" },
{ type: "tool-getPage", state: "input-available" },
]),
);
expect(r).toEqual({ reasoning: 0, output: 0, authoritative: false });
});
});
describe("liveTurnTokens — authoritative path", () => {
it("returns authoritative usage verbatim, splitting reasoning out of output", () => {
// outputTokens INCLUDES reasoning in the AI SDK shape -> answer = 100 - 30.
const r = liveTurnTokens(
msg([{ type: "text", text: "estimate would be tiny" }], {
usage: { inputTokens: 500, outputTokens: 100, reasoningTokens: 30 },
}),
);
expect(r).toEqual({ reasoning: 30, output: 70, authoritative: true });
});
it("treats missing reasoningTokens as 0 and keeps full output", () => {
const r = liveTurnTokens(
msg([{ type: "text", text: "x" }], {
usage: { inputTokens: 10, outputTokens: 42 },
}),
);
expect(r).toEqual({ reasoning: 0, output: 42, authoritative: true });
});
it("never returns a negative output when reasoning exceeds reported output", () => {
const r = liveTurnTokens(
msg([], { usage: { outputTokens: 10, reasoningTokens: 40 } }),
);
expect(r).toEqual({ reasoning: 40, output: 0, authoritative: true });
});
it("falls back to the estimate when metadata has no usage object", () => {
const r = liveTurnTokens(
msg([{ type: "text", text: "abcd" }], { chatId: "c1" }),
);
expect(r).toEqual({ reasoning: 0, output: 1, authoritative: false });
});
});
describe("liveTurnTokens — combined authoritative + estimate (#163)", () => {
it("ticks the in-flight step above the completed-steps authoritative base", () => {
// The authoritative usage is the sum over COMPLETED steps (step 1). The
// CURRENT step is streaming and its text is NOT in `usage` yet, but it IS in
// the parts -> the running estimate must push the live figure above the base
// so the badge keeps growing between step boundaries.
const longText = "x".repeat(800); // 800 chars -> 200 est output tokens
const r = liveTurnTokens(
msg([{ type: "text", text: longText }], {
usage: { inputTokens: 500, outputTokens: 40 }, // step-1 base: 40 output
}),
);
// max(authOutput=40, estOutput=200) = 200 -> the counter ticks, not frozen.
expect(r.output).toBe(200);
expect(r.authoritative).toBe(true);
});
it("ticks reasoning of the in-flight step above the authoritative reasoning base", () => {
const longReasoning = "r".repeat(400); // 400 chars -> 100 est reasoning
const r = liveTurnTokens(
msg([{ type: "reasoning", text: longReasoning }], {
usage: { inputTokens: 100, outputTokens: 20, reasoningTokens: 20 },
}),
);
// reasoning: max(20, 100) = 100 ; output: max(max(0,20-20)=0, 0) = 0.
expect(r.reasoning).toBe(100);
expect(r.output).toBe(0);
expect(r.authoritative).toBe(true);
});
it("snaps to the authoritative figure once it exceeds the rough estimate", () => {
// Short on-screen text (estimate tiny) but a large authoritative output:
// the exact figure wins at the boundary (the counter never under-reports).
const r = liveTurnTokens(
msg([{ type: "text", text: "abcd" }], {
usage: { inputTokens: 10, outputTokens: 5000 },
}),
);
expect(r.output).toBe(5000);
});
it("is monotonic: max never drops below the authoritative base when the estimate is smaller", () => {
// Mirrors the legacy 'verbatim' tests: estimate < authoritative -> unchanged.
const r = liveTurnTokens(
msg([{ type: "text", text: "tiny" }], {
usage: { inputTokens: 500, outputTokens: 100, reasoningTokens: 30 },
}),
);
expect(r).toEqual({ reasoning: 30, output: 70, authoritative: true });
});
});

View File

@@ -1,16 +1,18 @@
import type { UIMessage } from "@ai-sdk/react";
/**
* Live token ESTIMATION for a streaming AI-chat turn.
* Live token counting for a streaming AI-chat turn — split into REASONING
* (thinking) and OUTPUT (answer) tokens, mirroring how Claude Code shows
* `Thinking… · 60 tokens` next to its thinking indicator.
*
* No provider streams exact per-token usage mid-stream, so the live number is a
* CLIENT ESTIMATE (chars/≈4 heuristic). It powers the chat body's
* `Thinking… · N tokens` indicator (see `ReasoningBlock`), which reconciles to
* the authoritative server usage once it lands. Pure + unit-testable: it never
* runs a real BPE tokenizer (that would be O(n²) on the hot path, bloat the
* CLIENT ESTIMATE (chars/≈4 heuristic) that is reconciled to AUTHORITATIVE usage
* once the server attaches it on a step/turn boundary (see the server's
* `chatStreamMetadata` + the client's read of `message.metadata.usage`). When
* authoritative usage is present we return it verbatim (the number "jumps to
* exact"); otherwise we return the running estimate. Pure + unit-testable: it
* never runs a real BPE tokenizer (that would be O(n²) on the hot path, bloat the
* bundle, and be wrong for Gemini/Ollama anyway).
*
* The former header-badge `liveTurnTokens()` split was removed with #189 (the
* header badge now shows the stable "current / max" context size, not a live
* per-turn counter); the live feedback remains in `ReasoningBlock`.
*/
/**
@@ -22,3 +24,90 @@ export function estimateTokens(text: string): number {
if (!text) return 0;
return Math.ceil(text.length / 4);
}
/** Authoritative per-step/turn usage the server attaches to message metadata. */
export interface AuthoritativeUsage {
inputTokens?: number;
outputTokens?: number;
totalTokens?: number;
reasoningTokens?: number;
}
/** Live token split for a turn's tail (streaming) assistant message. */
export interface LiveTurnTokens {
/** Thinking/reasoning tokens (estimate, or authoritative when available). */
reasoning: number;
/** Answer/output tokens (estimate, or authoritative when available). */
output: number;
/** True when the numbers come from authoritative server usage, not estimate. */
authoritative: boolean;
}
/** Read the authoritative usage off a UIMessage's metadata, if the server set it. */
function metadataUsage(message: UIMessage): AuthoritativeUsage | undefined {
const meta = message?.metadata as
| { usage?: AuthoritativeUsage }
| undefined;
const usage = meta?.usage;
if (!usage || typeof usage !== "object") return undefined;
return usage;
}
/**
* Token split for the given (streaming) assistant message.
*
* COMBINES the authoritative server usage with the running text estimate so the
* counter ticks in real time AND lands exact. The server only attaches
* `metadata.usage` at a step/turn boundary (`finish-step`/`finish`) and it is
* CUMULATIVE over COMPLETED steps — it does NOT yet include the in-flight step.
* So a multi-step turn that returned the authoritative figure verbatim would
* FREEZE between boundaries and jump in steps (issue #163).
*
* Instead we always compute the running ESTIMATE (chars/≈4 over the message's
* `reasoning`/`text` parts, which grows on every streamed delta) and take the
* per-component MAX of the authoritative base and the estimate:
* - between boundaries the estimate of the in-flight step ticks the number up;
* - at a boundary the authoritative figure snaps it to exact;
* - because the server's usage is cumulative and we only ever take the max, the
* number is MONOTONIC — it never drops.
*
* Providers that don't stream reasoning text still surface a reasoning count once
* the authoritative usage arrives (`max(reasoningTokens, 0)`); on the pure
* estimate path (no usage yet) such a turn shows `reasoning: 0` until then.
*/
export function liveTurnTokens(message: UIMessage | undefined): LiveTurnTokens {
if (!message) return { reasoning: 0, output: 0, authoritative: false };
// Running ESTIMATE over every reasoning/text part — grows on each delta. This
// includes the IN-FLIGHT step, which the authoritative usage does not cover yet.
let estReasoning = 0;
let estOutput = 0;
for (const part of message.parts ?? []) {
if (part.type === "reasoning") {
estReasoning += estimateTokens((part as { text?: string }).text ?? "");
} else if (part.type === "text") {
estOutput += estimateTokens((part as { text?: string }).text ?? "");
}
}
const usage = metadataUsage(message);
if (!usage) {
// No authoritative usage streamed yet: the estimate IS the live figure.
return { reasoning: estReasoning, output: estOutput, authoritative: false };
}
// Authoritative sum over COMPLETED steps. `outputTokens` already INCLUDES
// reasoning in the AI SDK usage shape, so subtract it out for the "answer"
// figure (never go negative if a provider reports them inconsistently).
const authReasoning = usage.reasoningTokens ?? 0;
const authOutput = Math.max(0, (usage.outputTokens ?? 0) - authReasoning);
// Per-component max: the in-flight step's estimate ticks above the completed-
// steps base between boundaries, and the authoritative figure wins once it
// exceeds the (rough) estimate at the next boundary. Monotonic by construction.
return {
reasoning: Math.max(authReasoning, estReasoning),
output: Math.max(authOutput, estOutput),
authoritative: true,
};
}

View File

@@ -0,0 +1,241 @@
import { describe, expect, it } from "vitest";
import type { UIMessage } from "@ai-sdk/react";
import { messageSignature } from "@/features/ai-chat/utils/message-signature.ts";
/**
* Pure-helper tests for `messageSignature`, the cheap per-message content
* signature that drives MessageItem's memo (a streaming row's signature must
* change on every delta so it re-renders, while a finalized row's stays stable
* so it is skipped). Each test exercises ONE change signal and asserts it flips
* the signature; a content-identical clone must keep an EQUAL signature.
*
* The signature embeds `message.id` and `message.role`, so the `msg` factory
* uses a FIXED id/role here (not `Math.random()`): otherwise two messages with
* identical content would get different signatures and the negative case would
* be impossible to express.
*/
const msg = (
parts: UIMessage["parts"],
metadata?: unknown,
): UIMessage =>
({
id: "m1",
role: "assistant",
parts,
metadata,
}) as UIMessage;
describe("messageSignature", () => {
it("changes when a text part grows", () => {
const before = msg([{ type: "text", text: "alpha" }]);
const after = msg([{ type: "text", text: "alpha beta" }]);
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
it("changes when a new part is appended", () => {
const before = msg([{ type: "text", text: "alpha" }]);
const after = msg([
{ type: "text", text: "alpha" },
{ type: "text", text: "beta" },
]);
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
it("changes when a part's state flips", () => {
const before = msg([
{ type: "tool-getPage", state: "input-streaming" } as never,
]);
const after = msg([
{ type: "tool-getPage", state: "output-available" } as never,
]);
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
it("changes when a tool part gains an output", () => {
const before = msg([
{ type: "tool-getPage", state: "output-available" } as never,
]);
const after = msg([
{
type: "tool-getPage",
state: "output-available",
output: { ok: true },
} as never,
]);
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
it("changes when a part gains an errorText", () => {
const before = msg([
{ type: "tool-getPage", state: "output-error" } as never,
]);
const after = msg([
{
type: "tool-getPage",
state: "output-error",
errorText: "boom",
} as never,
]);
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
it("changes when usage.reasoningTokens arrives on finish-step (text/state already frozen)", () => {
// The specifically-commented edge case: the authoritative turn total lands on
// the final finish-step AFTER the reasoning text length and state are frozen.
// Only the token count appears between these two snapshots, so the signature
// MUST still flip — otherwise the "Thinking · N tokens" header would never
// snap from the live estimate to the exact figure.
const before = msg([
{ type: "reasoning", text: "thinking", state: "done" } as never,
]);
const after = msg(
[{ type: "reasoning", text: "thinking", state: "done" } as never],
{ usage: { reasoningTokens: 42 } },
);
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
it("changes when metadata.error appears", () => {
const before = msg([{ type: "text", text: "answer" }]);
const after = msg([{ type: "text", text: "answer" }], { error: "boom" });
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
it("changes when metadata.finishReason changes (e.g. to 'aborted')", () => {
const before = msg([{ type: "text", text: "answer" }], {
finishReason: "stop",
});
const after = msg([{ type: "text", text: "answer" }], {
finishReason: "aborted",
});
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
it("is UNCHANGED for a content-identical clone (different object, same values)", () => {
// A finalized row that is re-created as a fresh object (different parts array
// by reference, same parts by value) must keep an EQUAL signature, so the
// memo skips re-rendering it.
const a = msg([
{ type: "text", text: "alpha" },
{ type: "tool-getPage", state: "output-available", output: { ok: true } } as never,
]);
const b = msg([
{ type: "text", text: "alpha" },
{ type: "tool-getPage", state: "output-available", output: { ok: true } } as never,
]);
expect(a).not.toBe(b);
expect(messageSignature(a)).toBe(messageSignature(b));
});
});
/**
* Per-part-kind coupling guard for the load-bearing invariant documented at the
* top of message-signature.ts: the signature MUST sample every VISIBLE field the
* MessageItem render body draws, or the memo freezes a stale row. This is an
* executable lock for the part kinds rendered TODAY — read alongside
* `MessageItem` (message-item.tsx) and the `assistantMessageHasVisibleContent`
* helper (message-content.ts), which "mirrors MessageItem's render decisions
* EXACTLY". For each kind, mutating a field the render body DRAWS must flip the
* signature. If a new visible field is rendered without being added here AND to
* the signature, the corresponding assertion below should fail — that is the
* guard. (This intentionally stops short of the render-descriptor refactor:
* adding a part kind or a visible field still requires a human to extend both
* the signature and this block.)
*/
describe("messageSignature ↔ render coupling (per visible part kind)", () => {
describe("text part — render draws part.text (MarkdownPart text={part.text})", () => {
it("flips when the visible text changes", () => {
// Streaming is append-only, so the visible text only grows; the signature
// samples its length, so the growth is the change signal.
const before = msg([{ type: "text", text: "answer" }]);
const after = msg([{ type: "text", text: "answer extended" }]);
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
});
describe("reasoning part — render draws text + tokens (ReasoningBlock)", () => {
it("flips when the visible reasoning text changes", () => {
const before = msg([
{ type: "reasoning", text: "think", state: "streaming" } as never,
]);
const after = msg([
{ type: "reasoning", text: "think harder", state: "streaming" } as never,
]);
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
it("flips when the visible token count (metadata.usage.reasoningTokens) lands", () => {
// The header's "Thinking · N tokens" reads reasoningTokensForPart, fed by
// metadata.usage.reasoningTokens — a VISIBLE field that arrives on the final
// finish-step after text length and state are frozen.
const before = msg([
{ type: "reasoning", text: "think", state: "done" } as never,
]);
const after = msg(
[{ type: "reasoning", text: "think", state: "done" } as never],
{ usage: { reasoningTokens: 99 } },
);
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
});
describe("tool-* part — render draws state/errorText/citations (ToolCallCard)", () => {
it("flips when the run state changes (running ↔ done icon + label)", () => {
// toolRunState(part.state) selects the spinner/check/error icon.
const before = msg([
{ type: "tool-getPage", state: "input-available" } as never,
]);
const after = msg([
{ type: "tool-getPage", state: "output-available" } as never,
]);
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
it("flips when output arrives (drives the rendered citation links)", () => {
// toolCitations reads part.output to render the "/p/{id}" anchors.
const before = msg([
{ type: "tool-getPage", state: "output-available" } as never,
]);
const after = msg([
{
type: "tool-getPage",
state: "output-available",
output: { id: "page-1", title: "Doc" },
} as never,
]);
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
it("flips when errorText appears (the visible red error detail line)", () => {
const before = msg([
{ type: "tool-getPage", state: "output-error" } as never,
]);
const after = msg([
{
type: "tool-getPage",
state: "output-error",
errorText: "permission denied",
} as never,
]);
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
});
describe("metadata banners — render draws error / aborted notices", () => {
it("flips when metadata.error appears (ChatErrorAlert banner)", () => {
const before = msg([{ type: "text", text: "answer" }]);
const after = msg([{ type: "text", text: "answer" }], { error: "boom" });
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
it("flips when metadata.finishReason becomes 'aborted' (ChatStoppedNotice)", () => {
const before = msg([{ type: "text", text: "answer" }], {
finishReason: "stop",
});
const after = msg([{ type: "text", text: "answer" }], {
finishReason: "aborted",
});
expect(messageSignature(before)).not.toBe(messageSignature(after));
});
});
});

View File

@@ -0,0 +1,44 @@
import type { UIMessage } from "@ai-sdk/react";
/** Cheap content signature for one message: changes iff something VISIBLE in the
* row changed. Streaming is APPEND-ONLY (text parts only grow, parts are only
* appended, a tool/text part flips state once), so a per-part [type, text
* length, state, error/output presence] tuple + the persisted metadata
* (error/finishReason) is a sufficient change signal without comparing full
* strings on every delta. WARNING — load-bearing for the MessageItem memo:
* if a future part kind's VISIBLE content can change WITHOUT changing [type,
* text length, state, error/output presence] (e.g. a tool that streams
* `preliminary` output, or a client-side regenerate that edits a finalized
* row in place), extend this signature or the memo will freeze a stale row. */
export function messageSignature(message: UIMessage): string {
const parts = message.parts
.map((p) => {
const any = p as {
type: string;
text?: string;
state?: string;
errorText?: string;
output?: unknown;
};
return [
any.type,
any.text?.length ?? 0,
any.state ?? "",
any.errorText ? 1 : 0,
any.output !== undefined ? 1 : 0,
].join(":");
})
.join("|");
const meta = message.metadata as
| { error?: string; finishReason?: string; usage?: { reasoningTokens?: number } }
| undefined;
// `usage.reasoningTokens` is neither append-only nor part-bound: the authoritative
// turn total arrives on the final `finish-step` AFTER the reasoning text length and
// state are already frozen. Without it in the signature the row's signature would be
// unchanged at that point and the re-render skipped, so the "Thinking · N tokens"
// header (reasoningTokensForPart) would keep the live estimate instead of snapping
// to the exact figure.
return `${message.id}#${message.role}#${parts}#${meta?.error ?? ""}#${
meta?.finishReason ?? ""
}#${meta?.usage?.reasoningTokens ?? ""}`;
}

View File

@@ -7,7 +7,6 @@ import {
Button,
Group,
Modal,
NumberInput,
Paper,
PasswordInput,
Select,
@@ -86,9 +85,6 @@ const formSchema = z.object({
chatModel: z.string(),
// Chat provider implementation (reasoning surfacing). Default openai-compatible.
chatApiStyle: z.enum(["openai-compatible", "openai"]),
// Model context-window size (tokens) shown as the chat header badge's "max".
// Empty string = no limit (NumberInput emits "" when cleared).
chatContextWindow: z.union([z.number(), z.literal("")]),
// Cheap model id for the anonymous public-share assistant; empty = use chatModel.
publicShareChatModel: z.string(),
// Agent-role id whose persona the public-share assistant adopts; empty =
@@ -316,7 +312,6 @@ export default function AiProviderSettings() {
initialValues: {
chatModel: "",
chatApiStyle: "openai-compatible" as ChatApiStyle,
chatContextWindow: "" as number | "",
publicShareChatModel: "",
publicShareAssistantRoleId: "",
embeddingModel: "",
@@ -340,10 +335,6 @@ export default function AiProviderSettings() {
form.setValues({
chatModel: settings.chatModel ?? "",
chatApiStyle: settings.chatApiStyle ?? "openai-compatible",
// 0/unset = no limit → show an empty field (not a literal "0").
chatContextWindow: settings.chatContextWindow
? settings.chatContextWindow
: "",
publicShareChatModel: settings.publicShareChatModel ?? "",
publicShareAssistantRoleId: settings.publicShareAssistantRoleId ?? "",
embeddingModel: settings.embeddingModel ?? "",
@@ -374,11 +365,6 @@ export default function AiProviderSettings() {
driver: "openai",
chatModel: values.chatModel,
chatApiStyle: values.chatApiStyle,
// Empty → 0, which clears the limit server-side (badge shows current only).
chatContextWindow:
typeof values.chatContextWindow === "number"
? values.chatContextWindow
: 0,
// Cheap model id for the anonymous public-share assistant; empty falls
// back to chatModel server-side.
publicShareChatModel: values.publicShareChatModel,
@@ -799,22 +785,6 @@ export default function AiProviderSettings() {
{...form.getInputProps("chatApiStyle")}
/>
<NumberInput
mt="sm"
label={t("Context window (tokens)")}
description={t(
"Shows used / total in the chat header badge; empty hides the total.",
)}
placeholder={t("e.g. 200000")}
min={0}
step={1000}
allowDecimal={false}
allowNegative={false}
thousandSeparator=" "
disabled={isLoading}
{...form.getInputProps("chatContextWindow")}
/>
{/* Anonymous public-share assistant: a single master toggle + an
optional cheaper model id. Reuses this card's driver/URL/key. */}
<Group justify="space-between" align="center" wrap="nowrap" mt="md">

View File

@@ -23,9 +23,6 @@ export interface IAiSettings {
driver?: AiDriver;
chatModel?: string;
chatApiStyle?: ChatApiStyle;
// Chat model context-window size (tokens); shown as the "max" in the chat
// header context badge. 0/unset = no limit (badge shows the current size only).
chatContextWindow?: number;
// Cheap model id for the anonymous public-share assistant; empty = chatModel.
publicShareChatModel?: string;
// Agent-role id whose persona the public-share assistant adopts; empty =
@@ -60,8 +57,6 @@ export interface IAiSettingsUpdate {
driver?: AiDriver;
chatModel?: string;
chatApiStyle?: ChatApiStyle;
// Chat model context-window size (tokens); 0 clears the limit.
chatContextWindow?: number;
publicShareChatModel?: string;
// Agent-role id whose persona the public-share assistant adopts; empty =
// built-in locked persona.

View File

@@ -292,26 +292,6 @@ describe('flushAssistant', () => {
expect(f.metadata.contextTokens).toBe(15);
});
it('completed: writes maxContextTokens when the model limit is > 0', () => {
const f = flushAssistant([toolStep], '', 'completed', {
contextTokens: 15,
maxContextTokens: 200_000,
});
expect(f.metadata.maxContextTokens).toBe(200_000);
});
it('omits maxContextTokens when the limit is unset or 0', () => {
const unset = flushAssistant([toolStep], '', 'completed', {
contextTokens: 15,
});
expect('maxContextTokens' in unset.metadata).toBe(false);
const zero = flushAssistant([toolStep], '', 'completed', {
contextTokens: 15,
maxContextTokens: 0,
});
expect('maxContextTokens' in zero.metadata).toBe(false);
});
it('error: records the error and a derived finishReason', () => {
const f = flushAssistant([], 'partial answer', 'error', { error: 'boom' });
expect(f.status).toBe('error');

View File

@@ -616,9 +616,6 @@ export class AiChatService implements OnModuleInit {
contextTokens:
(usage?.inputTokens ?? 0) + (usage?.outputTokens ?? 0) ||
undefined,
// Admin-configured context-window size for this model (badge max).
// Resolved once per turn above; written to metadata only when > 0.
maxContextTokens: resolved?.chatContextWindow,
}),
);
// Lifecycle: release the external MCP clients leased for this turn.
@@ -1226,10 +1223,6 @@ export function flushAssistant(
finishReason?: string;
usage?: ChatStreamUsage | StreamUsage | undefined;
contextTokens?: number;
// Admin-configured context-window size (tokens) for this turn's model; the
// denominator of the client's "current / max" header badge. Written only
// when > 0 (0/unset = no limit known → the badge shows current only).
maxContextTokens?: number;
error?: string;
},
): AssistantFlush {
@@ -1260,9 +1253,6 @@ export function flushAssistant(
normalizeStreamUsage(extra.usage as StreamUsage) ?? extra.usage;
}
if (extra?.contextTokens) metadata.contextTokens = extra.contextTokens;
if (extra?.maxContextTokens && extra.maxContextTokens > 0) {
metadata.maxContextTokens = extra.maxContextTokens;
}
if (extra?.error) metadata.error = extra.error;
return {

View File

@@ -21,7 +21,6 @@ export const AI_PROVIDER_SETTINGS_ALLOWED: readonly string[] = [
'driver',
'chatModel',
'chatApiStyle',
'chatContextWindow',
'embeddingModel',
'baseUrl',
'embeddingBaseUrl',
@@ -256,17 +255,11 @@ export class WorkspaceRepo {
): Promise<Workspace> {
const db = dbOrTx(this.db, trx);
// Assemble the provider object IN SQL. Keys are fixed provider field names
// (sql.lit -> inlined literals, no injection); values are bound params with
// an explicit cast — postgres.js sends bound params untyped, and
// jsonb_build_object's value args are polymorphic ("any"), so without the
// cast Postgres throws "could not determine data type of parameter $1". The
// cast is branched by the JS runtime type so the value lands in jsonb with
// the matching JSON type: a number stays a JSON number (e.g.
// chatContextWindow → `{"chatContextWindow":200000}`, jsonb_typeof 'number'),
// a boolean a JSON boolean, everything else a JSON string. A plain `::text`
// for all would store a numeric field as the JSON STRING `"200000"`, which
// the client's `typeof === "number"` guards reject. The result is a real
// jsonb object, never a double-encoded string. The CASE self-heals
// (sql.lit -> inlined literals, no injection); values are bound params cast
// to ::text — postgres.js sends bound params untyped, and jsonb_build_object's
// value args are polymorphic ("any"), so without the explicit ::text cast
// Postgres throws "could not determine data type of parameter $1". The result
// is a real jsonb object, never a double-encoded string. The CASE self-heals
// workspaces whose settings.ai.provider was previously corrupted into an
// array/string.
const entries = Object.entries(provider).filter(
@@ -274,14 +267,7 @@ export class WorkspaceRepo {
);
const patch = entries.length
? sql`jsonb_build_object(${sql.join(
entries.flatMap(([k, v]) => [
sql.lit(k),
typeof v === 'number'
? sql`${v}::numeric`
: typeof v === 'boolean'
? sql`${v}::boolean`
: sql`${v}::text`,
]),
entries.flatMap(([k, v]) => [sql.lit(k), sql`${v}::text`]),
)})`
: sql`'{}'::jsonb`;
return db

View File

@@ -41,35 +41,3 @@ describe('UpdateAiSettingsDto.chatApiStyle', () => {
expect(errs.find((e) => e.property === 'chatApiStyle')).toBeUndefined();
});
});
/** DTO validation for chatContextWindow (@IsOptional @IsInt @Min(0)). */
describe('UpdateAiSettingsDto.chatContextWindow', () => {
const errorsFor = async (chatContextWindow: unknown) =>
validate(plainToInstance(UpdateAiSettingsDto, { chatContextWindow }));
it('accepts a non-negative integer (incl. 0 = clear the limit)', async () => {
for (const v of [0, 200000]) {
const errs = await errorsFor(v);
expect(
errs.find((e) => e.property === 'chatContextWindow'),
).toBeUndefined();
}
});
it('rejects a negative value', async () => {
const errs = await errorsFor(-1);
expect(errs.find((e) => e.property === 'chatContextWindow')).toBeDefined();
});
it('rejects a non-integer value', async () => {
const errs = await errorsFor(1.5);
expect(errs.find((e) => e.property === 'chatContextWindow')).toBeDefined();
});
it('accepts the field being omitted (optional)', async () => {
const errs = await validate(plainToInstance(UpdateAiSettingsDto, {}));
expect(
errs.find((e) => e.property === 'chatContextWindow'),
).toBeUndefined();
});
});

View File

@@ -27,8 +27,6 @@ export interface UpdateAiSettingsInput {
driver?: AiDriver;
chatModel?: string;
chatApiStyle?: ChatApiStyle;
// Chat context-window size (tokens); 0/empty clears the limit.
chatContextWindow?: number;
embeddingModel?: string;
baseUrl?: string;
embeddingBaseUrl?: string;
@@ -164,8 +162,6 @@ export class AiSettingsService {
chatModel: provider.chatModel,
// Plain passthrough; getChatModel defaults unset to 'openai-compatible'.
chatApiStyle: provider.chatApiStyle,
// Admin-configured context-window size; 0/unset = no limit (badge denominator).
chatContextWindow: provider.chatContextWindow,
// Cheap model id for the anonymous public-share assistant; reuses the chat
// driver/baseUrl/apiKey. Empty/unset → callers fall back to chatModel.
publicShareChatModel: provider.publicShareChatModel,
@@ -248,7 +244,6 @@ export class AiSettingsService {
driver: provider.driver,
chatModel: provider.chatModel,
chatApiStyle: provider.chatApiStyle,
chatContextWindow: provider.chatContextWindow,
embeddingModel: provider.embeddingModel,
baseUrl: provider.baseUrl,
embeddingBaseUrl: provider.embeddingBaseUrl,

View File

@@ -35,13 +35,6 @@ export interface AiProviderSettings {
// Chat provider implementation for the `openai` driver. Unset → defaults to
// 'openai-compatible' (so reasoning is surfaced by default). See ChatApiStyle.
chatApiStyle?: ChatApiStyle;
// Admin-configured chat model context-window size, in tokens. There is no
// provider-independent way to discover this (OpenAI's /v1/models usually omits
// it, Gemini/Ollama/OpenRouter each expose it differently), so it is entered
// manually. Surfaced to the chat client (via assistant message metadata) as the
// denominator of the header "current / max" context badge. Empty/0 = no limit
// known → the badge shows only the current context size.
chatContextWindow?: number;
embeddingModel?: string;
baseUrl?: string;
// Embedding-specific base URL. Falls back to `baseUrl` when empty/unset.
@@ -80,7 +73,6 @@ export const PROVIDER_SETTINGS_KEYS = [
'driver',
'chatModel',
'chatApiStyle',
'chatContextWindow',
'embeddingModel',
'baseUrl',
'embeddingBaseUrl',
@@ -106,10 +98,6 @@ export const PROVIDER_SETTINGS_KEYS = [
export interface ResolvedAiConfig extends Partial<AiProviderSettings> {
driver?: AiDriver;
chatModel?: string;
// Admin-configured chat context-window size (tokens); 0/unset = no limit. Used
// as the header context-badge denominator. Re-declared for parity with the
// explicit fields above.
chatContextWindow?: number;
// Cheap model id for the public-share assistant; reuses the chat creds.
publicShareChatModel?: string;
// Agent-role id whose persona the public-share assistant adopts (empty/unset
@@ -129,8 +117,6 @@ export interface MaskedAiSettings {
driver?: AiDriver;
chatModel?: string;
chatApiStyle?: ChatApiStyle;
// Admin-configured chat context-window size (tokens); 0/unset = no limit.
chatContextWindow?: number;
embeddingModel?: string;
baseUrl?: string;
embeddingBaseUrl?: string;

View File

@@ -1,4 +1,4 @@
import { IsIn, IsInt, IsOptional, IsString, Min } from 'class-validator';
import { IsIn, IsOptional, IsString } from 'class-validator';
import {
AI_DRIVERS,
AiDriver,
@@ -29,13 +29,6 @@ export class UpdateAiSettingsDto {
@IsIn(CHAT_API_STYLES)
chatApiStyle?: ChatApiStyle;
// Chat model context-window size in tokens (header context-badge denominator).
// 0 (or empty) clears the limit so the badge shows only the current context.
@IsOptional()
@IsInt()
@Min(0)
chatContextWindow?: number;
@IsOptional()
@IsString()
embeddingModel?: string;

View File

@@ -1,91 +0,0 @@
import { Kysely, sql } from 'kysely';
import { WorkspaceRepo } from '@docmost/db/repos/workspace/workspace.repo';
import { getTestDb, destroyTestDb, createWorkspace } from './db';
/**
* WorkspaceRepo.updateAiProviderSettings numeric round-trip (#189, #213).
*
* `chatContextWindow` is the first NUMERIC provider field routed through this
* generic SQL layer. The patch builder must cast a JS number so it lands in
* jsonb as a JSON NUMBER, not the JSON STRING `"200000"` — the client guards
* (`typeof === "number"`) reject a string, silently killing the `/ max` badge
* denominator. A plain `::text` cast (the prior code) regressed exactly this.
* These specs are real SQL and assert both the JS value type and the on-disk
* `jsonb_typeof`.
*/
describe('WorkspaceRepo.updateAiProviderSettings (numeric round-trip) [integration]', () => {
let db: Kysely<any>;
let repo: WorkspaceRepo;
beforeAll(() => {
db = getTestDb();
repo = new WorkspaceRepo(db as any);
});
afterAll(async () => {
await destroyTestDb();
});
it('stores chatContextWindow as a JSON number (not a "200000" string)', async () => {
const ws = await createWorkspace(db, { settings: undefined });
const updated = await repo.updateAiProviderSettings(ws.id, {
driver: 'openai',
chatModel: 'gpt-4o',
chatContextWindow: 200000,
});
// Returned row: the number survives as a real JS number, alongside the
// string fields which stay strings.
const provider = (updated.settings as any)?.ai?.provider;
expect(provider.chatContextWindow).toBe(200000);
expect(typeof provider.chatContextWindow).toBe('number');
expect(provider.driver).toBe('openai');
expect(provider.chatModel).toBe('gpt-4o');
// On disk: the jsonb value is typed 'number' (the must-fix assertion), and
// sibling string fields are typed 'string'.
const typed = await db
.selectFrom('workspaces')
.select([
sql<string>`jsonb_typeof(settings->'ai'->'provider'->'chatContextWindow')`.as(
'windowType',
),
sql<string>`jsonb_typeof(settings->'ai'->'provider'->'chatModel')`.as(
'modelType',
),
])
.where('id', '=', ws.id)
.executeTakeFirstOrThrow();
expect(typed.windowType).toBe('number');
expect(typed.modelType).toBe('string');
});
it('re-reads chatContextWindow as a number after a partial-merge update', async () => {
const ws = await createWorkspace(db, {
settings: { ai: { provider: { driver: 'openai', chatModel: 'x' } } },
});
// Merge in only the numeric field; siblings must be preserved and the value
// must still be a JSON number, not a string.
await repo.updateAiProviderSettings(ws.id, { chatContextWindow: 128000 });
const row = await db
.selectFrom('workspaces')
.select([
'settings',
sql<string>`jsonb_typeof(settings->'ai'->'provider'->'chatContextWindow')`.as(
'windowType',
),
])
.where('id', '=', ws.id)
.executeTakeFirstOrThrow();
expect(row.windowType).toBe('number');
const provider = (row.settings as any)?.ai?.provider;
expect(provider.chatContextWindow).toBe(128000);
expect(provider.driver).toBe('openai');
expect(provider.chatModel).toBe('x');
});
});