feat(ai-chat): realtime token counter + reasoning tokens, Claude-Code style (#151)
Tokens were only counted post-hoc (onFinish) and the header badge updated only on
chat open/switch; reasoning wasn't requested or shown. Now a counter ticks LIVE
during generation and surfaces reasoning ("thinking") tokens separately, like
Claude Code's `Thinking… · N tokens`.
Architecture (AI SDK v6): no provider gives exact per-token usage mid-stream, so
the live number is a cheap client estimate (chars/≈4) reconciled to AUTHORITATIVE
provider usage at step boundaries and turn end. The useChat per-delta re-render is
the existing realtime engine.
- server: `chatStreamMetadata` now also forwards usage on `finish-step` + `finish`;
`sendReasoning: true`; persisted `metadata.usage` carries `reasoningTokens`
(normalized from `outputTokenDetails` or the deprecated field).
- client: pure `count-stream-tokens` (estimateTokens / liveTurnTokens, prefers
authoritative usage else estimate); `Thinking… · N tokens` in the typing
indicator; collapsible "Thinking" reasoning block; throttled (~8 Hz) live
turn-token header badge; `reasoningTokens` in types + Markdown export.
Review fixes folded in:
- v6 `finish-step.usage` is PER-STEP, not cumulative — the server now ACCUMULATES
a running sum (new pure `accumulateStepUsage`) and sends the cumulative, which
converges to `finish.totalUsage`, so the live counter never jumps DOWN on a
multi-step agent turn.
- reasoning double-count: the authoritative turn-total is attributed to a block
ONLY for a single-reasoning-part (one-step) turn; multi-step blocks each show
their own estimate (the authoritative total stays in the header).
- no "0" badge flash at turn start (require live > 0, else show context size).
- comment refreshed (finish-step trigger).
Tests: server `accumulateStepUsage` + updated `chatStreamMetadata` (34 in the
suite); client pure-fn tests. Both tsc clean; 162 client ai-chat + the ai-chat
server suite pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
@@ -1147,6 +1147,9 @@
|
||||
"Ask a question about this documentation.": "Ask a question about this documentation.",
|
||||
"Ask a question…": "Ask a question…",
|
||||
"Thinking…": "Thinking…",
|
||||
"Thinking… · {{count}} tokens": "Thinking… · {{count}} tokens",
|
||||
"Thinking": "Thinking",
|
||||
"Thinking · {{count}} tokens": "Thinking · {{count}} tokens",
|
||||
"The assistant is unavailable right now. Please try again.": "The assistant is unavailable right now. Please try again.",
|
||||
"Public share assistant": "Public share assistant",
|
||||
"Let anonymous visitors of public shares ask an AI assistant scoped to that share's pages. You pay for the tokens.": "Let anonymous visitors of public shares ask an AI assistant scoped to that share's pages. You pay for the tokens.",
|
||||
@@ -1158,6 +1161,7 @@
|
||||
"Built-in assistant persona": "Built-in assistant persona",
|
||||
"Minimize": "Minimize",
|
||||
"Current context size": "Current context size",
|
||||
"Tokens generated this turn": "Tokens generated this turn",
|
||||
"AI agent": "AI agent",
|
||||
"Take a look at the current document": "Take a look at the current document",
|
||||
"AI agent is typing…": "AI agent is typing…",
|
||||
|
||||
Reference in New Issue
Block a user