The floating chat window's header badge flipped meaning — a live per-turn token counter while streaming, the persisted context size at rest — so it "reset to 1" on each prompt and conflated two different numbers. Replace it with a stable "current / max" context badge (e.g. `572 / 200k`). The live "Thinking · N tokens" inside the chat body stays; only the duplicate live counter is removed from the header. Max comes from a new admin setting "Context window (tokens)". The server resolves it and attaches `maxContextTokens` to the completed assistant turn's metadata (next to contextTokens), so the badge needs no client-side model resolution and this survives public shares / per-role models. Server: - ai.types: chatContextWindow on AiProviderSettings + PROVIDER_SETTINGS_KEYS + ResolvedAiConfig + MaskedAiSettings. - workspace.repo: chatContextWindow in AI_PROVIDER_SETTINGS_ALLOWED (parity). - update-ai-settings.dto: @IsInt @Min(0) chatContextWindow. - ai-settings.service: coerce the ::text-stored value to a positive int in resolve()/getMasked(). - ai-chat.service: flushAssistant writes metadata.maxContextTokens (>0); the completed turn passes resolved.chatContextWindow. Client: - ai-chat.types: maxContextTokens on the message-row metadata. - ai-chat-window: read maxContextTokens; render "current [/ max]"; drop the liveTurnTokens state/branch and the onLiveTurnTokens prop; new tooltip. - chat-thread: remove the live-turn-token throttle effect and plumbing. - count-stream-tokens: drop the now-dead liveTurnTokens()/types; keep estimateTokens. - settings: chatContextWindow on IAiSettings(+Update) + a NumberInput in the AI provider settings form. i18n: add the badge/settings keys (en, ru); remove the two now-unused keys. Tests: flushAssistant maxContextTokens, DTO validation, trim token tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
20 lines
766 B
TypeScript
20 lines
766 B
TypeScript
/**
|
|
* Rough client-side token estimation for AI-chat UI affordances.
|
|
*
|
|
* No provider streams exact per-token usage mid-stream, so any in-flight figure
|
|
* is a CLIENT ESTIMATE (chars/≈4 heuristic). Pure + unit-testable: it never runs
|
|
* a real BPE tokenizer (that would be O(n²) on the hot path, bloat the bundle,
|
|
* and be wrong for Gemini/Ollama anyway). Used by the in-body reasoning counter
|
|
* ("Thinking · N tokens").
|
|
*/
|
|
|
|
/**
|
|
* Rough token estimate for a piece of text using the standard chars/≈4 heuristic.
|
|
* Returns 0 for empty/whitespace-free-of-content input, and ceils so any
|
|
* non-empty text counts as at least one token.
|
|
*/
|
|
export function estimateTokens(text: string): number {
|
|
if (!text) return 0;
|
|
return Math.ceil(text.length / 4);
|
|
}
|