feat(ai-chat): realtime token counter + reasoning tokens, Claude-Code style (#151)

Tokens were only counted post-hoc (onFinish) and the header badge updated only on chat open/switch; reasoning wasn't requested or shown. Now a counter ticks LIVE during generation and surfaces reasoning ("thinking") tokens separately, like Claude Code's `Thinking… · N tokens`. Architecture (AI SDK v6): no provider gives exact per-token usage mid-stream, so the live number is a cheap client estimate (chars/≈4) reconciled to AUTHORITATIVE provider usage at step boundaries and turn end. The useChat per-delta re-render is the existing realtime engine. - server: `chatStreamMetadata` now also forwards usage on `finish-step` + `finish`; `sendReasoning: true`; persisted `metadata.usage` carries `reasoningTokens` (normalized from `outputTokenDetails` or the deprecated field). - client: pure `count-stream-tokens` (estimateTokens / liveTurnTokens, prefers authoritative usage else estimate); `Thinking… · N tokens` in the typing indicator; collapsible "Thinking" reasoning block; throttled (~8 Hz) live turn-token header badge; `reasoningTokens` in types + Markdown export. Review fixes folded in: - v6 `finish-step.usage` is PER-STEP, not cumulative — the server now ACCUMULATES a running sum (new pure `accumulateStepUsage`) and sends the cumulative, which converges to `finish.totalUsage`, so the live counter never jumps DOWN on a multi-step agent turn. - reasoning double-count: the authoritative turn-total is attributed to a block ONLY for a single-reasoning-part (one-step) turn; multi-step blocks each show their own estimate (the authoritative total stays in the header). - no "0" badge flash at turn start (require live > 0, else show context size). - comment refreshed (finish-step trigger). Tests: server `accumulateStepUsage` + updated `chatStreamMetadata` (34 in the suite); client pure-fn tests. Both tsc clean; 162 client ai-chat + the ai-chat server suite pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 06:56:14 +03:00
parent acf6d85b07
commit 0ebb1adce8
16 changed files with 775 additions and 28 deletions
--- a/apps/client/public/locales/en-US/translation.json
+++ b/apps/client/public/locales/en-US/translation.json
@@ -1147,6 +1147,9 @@
  "Ask a question about this documentation.": "Ask a question about this documentation.",
  "Ask a question…": "Ask a question…",
  "Thinking…": "Thinking…",
+  "Thinking… · {{count}} tokens": "Thinking… · {{count}} tokens",
+  "Thinking": "Thinking",
+  "Thinking · {{count}} tokens": "Thinking · {{count}} tokens",
  "The assistant is unavailable right now. Please try again.": "The assistant is unavailable right now. Please try again.",
  "Public share assistant": "Public share assistant",
  "Let anonymous visitors of public shares ask an AI assistant scoped to that share's pages. You pay for the tokens.": "Let anonymous visitors of public shares ask an AI assistant scoped to that share's pages. You pay for the tokens.",
@@ -1158,6 +1161,7 @@
  "Built-in assistant persona": "Built-in assistant persona",
  "Minimize": "Minimize",
  "Current context size": "Current context size",
+  "Tokens generated this turn": "Tokens generated this turn",
  "AI agent": "AI agent",
  "Take a look at the current document": "Take a look at the current document",
  "AI agent is typing…": "AI agent is typing…",