Files
gitmost/apps/client/src/features/ai-chat/components/ai-chat.module.css
claude code agent 227 0ebb1adce8 feat(ai-chat): realtime token counter + reasoning tokens, Claude-Code style (#151)
Tokens were only counted post-hoc (onFinish) and the header badge updated only on
chat open/switch; reasoning wasn't requested or shown. Now a counter ticks LIVE
during generation and surfaces reasoning ("thinking") tokens separately, like
Claude Code's `Thinking… · N tokens`.

Architecture (AI SDK v6): no provider gives exact per-token usage mid-stream, so
the live number is a cheap client estimate (chars/≈4) reconciled to AUTHORITATIVE
provider usage at step boundaries and turn end. The useChat per-delta re-render is
the existing realtime engine.

- server: `chatStreamMetadata` now also forwards usage on `finish-step` + `finish`;
  `sendReasoning: true`; persisted `metadata.usage` carries `reasoningTokens`
  (normalized from `outputTokenDetails` or the deprecated field).
- client: pure `count-stream-tokens` (estimateTokens / liveTurnTokens, prefers
  authoritative usage else estimate); `Thinking… · N tokens` in the typing
  indicator; collapsible "Thinking" reasoning block; throttled (~8 Hz) live
  turn-token header badge; `reasoningTokens` in types + Markdown export.

Review fixes folded in:
- v6 `finish-step.usage` is PER-STEP, not cumulative — the server now ACCUMULATES
  a running sum (new pure `accumulateStepUsage`) and sends the cumulative, which
  converges to `finish.totalUsage`, so the live counter never jumps DOWN on a
  multi-step agent turn.
- reasoning double-count: the authoritative turn-total is attributed to a block
  ONLY for a single-reasoning-part (one-step) turn; multi-step blocks each show
  their own estimate (the authoritative total stays in the header).
- no "0" badge flash at turn start (require live > 0, else show context size).
- comment refreshed (finish-step trigger).

Tests: server `accumulateStepUsage` + updated `chatStreamMetadata` (34 in the
suite); client pure-fn tests. Both tsc clean; 162 client ai-chat + the ai-chat
server suite pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 06:56:14 +03:00

175 lines
4.0 KiB
CSS

.panel {
display: flex;
flex-direction: column;
height: 100%;
min-height: 0;
}
.messages {
flex: 1 1 auto;
min-height: 0;
/* Smaller, denser chat text (cascades into user bubble + assistant markdown
+ tool cards; the explicit-size Mantine labels keep their own size). */
font-size: var(--mantine-font-size-xs);
}
.messageRow {
margin-bottom: var(--mantine-spacing-md);
}
.userBubble {
background: var(--mantine-color-gray-light);
border-radius: var(--mantine-radius-md);
padding: 8px 12px;
white-space: pre-wrap;
overflow-wrap: break-word;
word-break: break-word;
}
/* Rendered markdown for assistant messages. Keep block margins compact. */
.markdown {
overflow-wrap: break-word;
word-break: break-word;
}
.markdown p {
margin-block-start: 0;
margin-block-end: 0.5em;
}
.markdown pre {
background: var(--mantine-color-gray-light);
border-radius: var(--mantine-radius-sm);
padding: 8px;
overflow-x: auto;
}
.markdown code {
font-size: 0.85em;
}
.markdown ul,
.markdown ol {
margin-block-start: 0;
margin-block-end: 0.5em;
padding-inline-start: 1.4em;
}
/* Animated three-dot "typing" indicator shown while the agent is thinking but
has not yet produced any visible text/tool parts. */
.typingDots {
display: inline-flex;
align-items: center;
gap: 4px;
}
.typingDots span {
width: 5px;
height: 5px;
border-radius: 50%;
background: var(--mantine-color-dimmed, var(--mantine-color-gray-5));
opacity: 0.4;
animation: aiTypingBounce 1.2s infinite ease-in-out;
}
.typingDots span:nth-child(2) {
animation-delay: 0.2s;
}
.typingDots span:nth-child(3) {
animation-delay: 0.4s;
}
@keyframes aiTypingBounce {
0%,
80%,
100% {
transform: translateY(0);
opacity: 0.4;
}
40% {
/* Bounce height is driven by --bounce so reduced-motion can dampen it
(below) without disabling the animation outright. */
transform: translateY(var(--bounce, -6px));
opacity: 1;
}
}
/* Respect reduced-motion preferences: keep a smaller bounce instead of a full
stop, so the "thinking" indicator still reads as active rather than frozen. */
@media (prefers-reduced-motion: reduce) {
.typingDots span {
--bounce: -3px;
}
}
.toolCard {
border: 1px solid light-dark(var(--mantine-color-gray-3), var(--mantine-color-dark-4));
border-radius: var(--mantine-radius-sm);
padding: 6px 10px;
margin-bottom: 6px;
background: light-dark(var(--mantine-color-gray-0), var(--mantine-color-dark-6));
}
/* Collapsible "Thinking" (reasoning) block: a subtle left rule, dimmer than the
answer so it reads as secondary thinking context above the real answer. */
.reasoningBlock {
border-left: 2px solid light-dark(var(--mantine-color-gray-3), var(--mantine-color-dark-4));
padding-left: 8px;
}
.reasoningText {
margin-top: 4px;
font-size: var(--mantine-font-size-xs);
color: light-dark(var(--mantine-color-gray-7), var(--mantine-color-dark-1));
white-space: pre-wrap;
}
.reasoningText p {
margin: 0 0 4px;
}
.inputWrapper {
flex: 0 0 auto;
padding-top: var(--mantine-spacing-xs);
}
.conversationItem {
cursor: pointer;
border-radius: var(--mantine-radius-sm);
}
.conversationItem:hover {
background: var(--mantine-color-gray-light);
}
.conversationItemActive {
background: var(--mantine-color-gray-light);
}
/* Pending messages queued by the user while a turn is still streaming. They
are sent automatically, FIFO, once the current turn finishes. */
.queuedList {
padding-bottom: var(--mantine-spacing-xs);
}
.queuedItem {
background: var(--mantine-color-gray-light);
border-radius: var(--mantine-radius-sm);
padding: 4px 8px;
}
.queuedIcon {
flex: none;
color: var(--mantine-color-dimmed);
}
.queuedText {
flex: 1;
min-width: 0;
color: var(--mantine-color-dimmed);
white-space: pre-wrap;
overflow-wrap: break-word;
word-break: break-word;
}