feat(ai-chat): realtime token counter + reasoning tokens (#151) #158

Merged

vvzvlad merged 2 commits from feat/ai-chat-realtime-tokens into develop

2026-06-24 13:07:52 +03:00

Author	SHA1	Message	Date
claude code agent 227	044e3f7e6a	fix(ai-chat): plural token strings + cover reasoning UI + cleanups (#151 review) Review of #158 (Request changes) — core logic verified correct; addressed the test-coverage + localization items: 1. i18n pluralization: the token-count keys were called with {count} but had one form, so ru-RU always rendered the genitive ("1 токенов"). Added _one/_other (en) and _one/_few/_many (ru: токен/токена/токенов) for both "Thinking… · {{count}} tokens" and "Thinking · {{count}} tokens"; de-duped the PR-added duplicate "Thinking" key. Call sites unchanged. 2. ReasoningBlock: new reasoning-block.test.tsx (4 branches: authoritative count wins / estimate fallback / header-only when count-but-no-text / body render). 3. Reasoning-token attribution: extracted the #151 anti-double-count rule into a pure `reasoningTokensForPart(message)` (single reasoning part -> authoritative turn total; multiple/none -> undefined so each estimates). message-item uses it; removed the now-dead lastReasoningIndex reduce (review #5). Unit-tested. 6. adopt-chat-id.ts: refreshed 3 stale `chatStreamStartMetadata` -> `chatStreamMetadata` comment references. 7. chat-markdown.test.ts: assert the export footer's `reasoning: N` line appears when reasoningTokens>0 and is absent at 0/undefined. Skipped optional #4 (mantine useThrottledCallback): the manual throttle has two distinct exit paths (turn-end revert-to-null + the captured-total trailing emit) with no guarding test; remapping risks the streaming behavior — non-blocking. Client tsc clean; ai-chat suite green (171 tests). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 13:05:07 +03:00
claude code agent 227	0ebb1adce8	feat(ai-chat): realtime token counter + reasoning tokens, Claude-Code style (#151 ) Tokens were only counted post-hoc (onFinish) and the header badge updated only on chat open/switch; reasoning wasn't requested or shown. Now a counter ticks LIVE during generation and surfaces reasoning ("thinking") tokens separately, like Claude Code's `Thinking… · N tokens`. Architecture (AI SDK v6): no provider gives exact per-token usage mid-stream, so the live number is a cheap client estimate (chars/≈4) reconciled to AUTHORITATIVE provider usage at step boundaries and turn end. The useChat per-delta re-render is the existing realtime engine. - server: `chatStreamMetadata` now also forwards usage on `finish-step` + `finish`; `sendReasoning: true`; persisted `metadata.usage` carries `reasoningTokens` (normalized from `outputTokenDetails` or the deprecated field). - client: pure `count-stream-tokens` (estimateTokens / liveTurnTokens, prefers authoritative usage else estimate); `Thinking… · N tokens` in the typing indicator; collapsible "Thinking" reasoning block; throttled (~8 Hz) live turn-token header badge; `reasoningTokens` in types + Markdown export. Review fixes folded in: - v6 `finish-step.usage` is PER-STEP, not cumulative — the server now ACCUMULATES a running sum (new pure `accumulateStepUsage`) and sends the cumulative, which converges to `finish.totalUsage`, so the live counter never jumps DOWN on a multi-step agent turn. - reasoning double-count: the authoritative turn-total is attributed to a block ONLY for a single-reasoning-part (one-step) turn; multi-step blocks each show their own estimate (the authoritative total stays in the header). - no "0" badge flash at turn start (require live > 0, else show context size). - comment refreshed (finish-step trigger). Tests: server `accumulateStepUsage` + updated `chatStreamMetadata` (34 in the suite); client pure-fn tests. Both tsc clean; 162 client ai-chat + the ai-chat server suite pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 06:56:14 +03:00

Author

SHA1

Message

Date

claude code agent 227

044e3f7e6a

fix(ai-chat): plural token strings + cover reasoning UI + cleanups (#151 review)

Review of #158 (Request changes) — core logic verified correct; addressed the
test-coverage + localization items:

1. i18n pluralization: the token-count keys were called with {count} but had one
   form, so ru-RU always rendered the genitive ("1 токенов"). Added _one/_other
   (en) and _one/_few/_many (ru: токен/токена/токенов) for both "Thinking… ·
   {{count}} tokens" and "Thinking · {{count}} tokens"; de-duped the PR-added
   duplicate "Thinking" key. Call sites unchanged.
2. ReasoningBlock: new reasoning-block.test.tsx (4 branches: authoritative count
   wins / estimate fallback / header-only when count-but-no-text / body render).
3. Reasoning-token attribution: extracted the #151 anti-double-count rule into a
   pure `reasoningTokensForPart(message)` (single reasoning part -> authoritative
   turn total; multiple/none -> undefined so each estimates). message-item uses
   it; removed the now-dead lastReasoningIndex reduce (review #5). Unit-tested.
6. adopt-chat-id.ts: refreshed 3 stale `chatStreamStartMetadata` ->
   `chatStreamMetadata` comment references.
7. chat-markdown.test.ts: assert the export footer's `reasoning: N` line appears
   when reasoningTokens>0 and is absent at 0/undefined.

Skipped optional #4 (mantine useThrottledCallback): the manual throttle has two
distinct exit paths (turn-end revert-to-null + the captured-total trailing emit)
with no guarding test; remapping risks the streaming behavior — non-blocking.

Client tsc clean; ai-chat suite green (171 tests).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-24 13:05:07 +03:00

claude code agent 227

0ebb1adce8

feat(ai-chat): realtime token counter + reasoning tokens, Claude-Code style (#151 )

Tokens were only counted post-hoc (onFinish) and the header badge updated only on
chat open/switch; reasoning wasn't requested or shown. Now a counter ticks LIVE
during generation and surfaces reasoning ("thinking") tokens separately, like
Claude Code's `Thinking… · N tokens`.

Architecture (AI SDK v6): no provider gives exact per-token usage mid-stream, so
the live number is a cheap client estimate (chars/≈4) reconciled to AUTHORITATIVE
provider usage at step boundaries and turn end. The useChat per-delta re-render is
the existing realtime engine.

- server: `chatStreamMetadata` now also forwards usage on `finish-step` + `finish`;
  `sendReasoning: true`; persisted `metadata.usage` carries `reasoningTokens`
  (normalized from `outputTokenDetails` or the deprecated field).
- client: pure `count-stream-tokens` (estimateTokens / liveTurnTokens, prefers
  authoritative usage else estimate); `Thinking… · N tokens` in the typing
  indicator; collapsible "Thinking" reasoning block; throttled (~8 Hz) live
  turn-token header badge; `reasoningTokens` in types + Markdown export.

Review fixes folded in:
- v6 `finish-step.usage` is PER-STEP, not cumulative — the server now ACCUMULATES
  a running sum (new pure `accumulateStepUsage`) and sends the cumulative, which
  converges to `finish.totalUsage`, so the live counter never jumps DOWN on a
  multi-step agent turn.
- reasoning double-count: the authoritative turn-total is attributed to a block
  ONLY for a single-reasoning-part (one-step) turn; multi-step blocks each show
  their own estimate (the authoritative total stays in the header).
- no "0" badge flash at turn start (require live > 0, else show context size).
- comment refreshed (finish-step trigger).

Tests: server `accumulateStepUsage` + updated `chatStreamMetadata` (34 in the
suite); client pure-fn tests. Both tsc clean; 162 client ai-chat + the ai-chat
server suite pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-24 06:56:14 +03:00

feat(ai-chat): realtime token counter + reasoning tokens (#151) #158

2 Commits