fix(dictation): address PR #118 review feedback (security, stability, tests)

Implements all reviewer comments (code-review, red-team, and test-strategy
audit), accepting the recommended variants.

Server: realtime service (ai-realtime.service.ts):
- SSRF: pin the validated IP via a WebSocket `lookup` hook that re-checks
  every resolved address with isIpAllowed (mirrors external-mcp
  buildPinnedDispatcher), closing the TOCTOU/DNS-rebinding window; fix the
  misleading comment.
- no-silent-loss: on Stop, drain the in-flight segment (bounded 2.5s) and
  deliver the final via onFinal before closing instead of dropping the tail.
- fail-closed deriveRealtimeUrl: a non-empty unparseable base now THROWS (no
  silent api.openai.com fallback that would leak a self-hosted key); http://
  and ws:// bases rejected (plaintext key). Path normalization preserved.
- parseUpstreamEvent keys the accumulator by item_id+content_index so GA
  segments don't concatenate.
- inject a wsFactory seam for testing; also fix a latent bug:
  `import WebSocket from 'ws'` resolved to undefined at runtime (no
  esModuleInterop) -> import=require.
- unref idle/max/drain timers.

Server: realtime gateway (ai-realtime.gateway.ts, session-limits.ts):
- reject revoked/disabled users and inactive sessions (mirror jwt.strategy:
  findById+isUserDisabled + findActiveById) with NO counter increment.
- CSWSH: Origin allowlist (matching APP_URL, or no Origin for native clients)
  before auth, no increment.
- extract SessionCounters (delete-at-zero, never negative) + pure canConnect
  (both caps >= checked before any increment); document the
  per-process/in-memory cap caveat (single-replica only).

Client:
- dictation-group: realtime final now inserts at the captured rangeRef
  SNAPSHOT (not the live caret) and guards editor.isEditable; single-space
  separator.
- use-realtime-dictation/realtime-dictation-client: stop-during-acquisition
  tears down the mic (no leak / button reset); reconnect re-emits start
  (double-start guarded); interim ghost cleared on teardown; io() options
  de-duplicated.
- pcm16-worklet: flush the partial sub-frame tail on stop; one-pole
  anti-aliasing low-pass before 48k->24k.
- extract shared mic-capture (acquireMicStream/mapGetUserMediaError, used by
  batch + realtime), pure DSP (pcm16-dsp.ts), and the session
  reducer/baseLanguageSubtag; extract
  applyInterimMeta/clampRange/resolveUrl/appendFinalToDraft.

Tests + infra:
- +~150 server tests (deriveRealtimeUrl, parseUpstreamEvent branches,
  openSession/lifecycle/timers/testConnection via fake ws, gateway
  auth/caps/no-leak, realtime-test admin contract, AiSettings update/resolve,
  DTO boolean, SSRF deny).
- +~140 client tests (DSP property/edge, resampler continuity, framing,
  reducer, mic-capture, RealtimeDictationClient/MicButton, ProseMirror interim
  regression + history guards, appendFinalToDraft, resolveKeyField, route
  contract).
- Added @vitest/coverage-v8.
- CHANGELOG [Unreleased] entry incl. the single-replica caveat.

Review: APPROVE WITH SUGGESTIONS (no critical/regression); applied the
drain-timer unref.

Server tsc clean + 358 tests; client tsc clean + 201 tests; vite build ok.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
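
Illustration of the SessionCounters / canConnect item above. This is a minimal sketch, not the code in session-limits.ts: the names come from this commit message, but every method signature, the `Caps` type, and the key strings are assumptions made for the example. It shows the two properties the message names: counters are deleted at zero and never go negative, and both caps are checked by a pure function before anything is incremented.

```typescript
// Hypothetical sketch: bodies and signatures are assumed, not taken from the repo.
class SessionCounters {
  private readonly counts = new Map<string, number>();

  get(key: string): number {
    return this.counts.get(key) ?? 0;
  }

  increment(key: string): number {
    const next = this.get(key) + 1;
    this.counts.set(key, next);
    return next;
  }

  // Delete the entry at zero so the map does not accumulate dead keys,
  // and clamp so a stray double-decrement can never produce a negative count.
  decrement(key: string): number {
    const next = Math.max(0, this.get(key) - 1);
    if (next === 0) {
      this.counts.delete(key);
    } else {
      this.counts.set(key, next);
    }
    return next;
  }

  get size(): number {
    return this.counts.size;
  }
}

// Assumed shape for the two caps named in the commit message.
interface Caps {
  perUser: number;
  perWorkspace: number;
}

// Pure check: a connection is refused when either count is already >= its cap.
// The caller increments only after this returns true.
function canConnect(userCount: number, workspaceCount: number, caps: Caps): boolean {
  return userCount < caps.perUser && workspaceCount < caps.perWorkspace;
}

// Usage with the caps stated in the CHANGELOG entry (1 per user, 5 per workspace).
const counters = new SessionCounters();
const caps: Caps = { perUser: 1, perWorkspace: 5 };

if (canConnect(counters.get("user:u1"), counters.get("ws:w1"), caps)) {
  counters.increment("user:u1");
  counters.increment("ws:w1");
}
```

Because the check is separate from the increment, a rejected connection leaves both counters untouched. As the commit message notes, counters like these live in one process's memory, so the caps only hold for a single replica.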
feat(dictation): add realtime streaming STT (live dictation)
2026-06-21 17:15:33 +03:00 · 2026-06-21 14:47:28 +03:00
503 changed files with 10956 additions and 40365 deletions
--- a/.env.example
+++ b/.env.example
@@ -123,45 +123,11 @@ MCP_DOCMOST_PASSWORD=
 # expose the port publicly).
 # MCP_TOKEN=
 # MCP_SESSION_IDLE_MS=1800000
-#
-# AI-AGENT ATTRIBUTION (comments/pages written via MCP are badged as "AI"):
-# attribution is driven by a per-user `is_agent` flag on the users row. There is
-# NO admin UI/API for it — set it out-of-band with SQL. Use a DEDICATED service
-# account for the MCP fallback above and flag ONLY that account, e.g.:
-#     UPDATE users SET is_agent = true WHERE email = 'mcp-bot@your-domain';
-# NEVER set is_agent on a human or shared account — every action by that account
-# (including normal human edits) would then be mis-attributed as AI.

 # Per-embedding-call timeout in milliseconds for the RAG indexer.
 # A slow/hung embeddings endpoint fails after this and the batch continues.
 # AI_EMBEDDING_TIMEOUT_MS=120000

-# Silence timeout (ms) for streaming chat/agent AI calls AND external-MCP traffic.
-# Bounds time-to-first-byte and the gap BETWEEN chunks (NOT the total turn length),
-# so an arbitrarily long turn that keeps streaming is never cut. Finite so a hung
-# provider is eventually broken instead of leaking forever. Default 900000 (15 min).
-# AI_STREAM_TIMEOUT_MS=900000
-
-# Keep-alive recycle window (ms) for streaming chat/agent AI + external-MCP calls.
-# A pooled connection idle longer than this is closed instead of reused, so a
-# NAT / egress firewall / reverse proxy that silently drops idle connections
-# cannot poison a reused socket into a PRE-RESPONSE `read ECONNRESET`. Lower it if
-# your egress drops idle connections faster than ~10s. Default 10000 (10 s).
-# AI_STREAM_KEEPALIVE_MS=10000
-
-# Silence timeout (ms) for EXTERNAL-MCP transport ONLY (not the chat provider).
-# Tighter than AI_STREAM_TIMEOUT_MS so a byte-silent/hung MCP server is broken in
-# ~5 min instead of 15. Note it also cuts a legitimately long but byte-silent
-# single tool call (a slow crawl that emits nothing until done) and an SSE
-# transport idling >5 min BETWEEN tool calls. Default 300000 (5 min).
-# AI_MCP_STREAM_TIMEOUT_MS=300000
-
-# Total wall-clock cap (ms) for ONE external MCP tool call (app-level, not
-# transport). Aborts a tool that keeps the socket warm (SSE heartbeats / trickle)
-# but never returns a result — which the silence timeout above never breaks.
-# Default 900000 (15 min).
-# AI_MCP_CALL_TIMEOUT_MS=900000
-
 # --- Anonymous public-share AI assistant ---
 # Opt-in per workspace (AI settings -> "public share assistant"; off by default).
 # When enabled, anonymous visitors of a published share can ask an AI about that
@@ -181,8 +147,8 @@ MCP_DOCMOST_PASSWORD=
 # per-IP limit is fully evaded. It is a COST backstop, not an access control, and
 # FAILS CLOSED if Redis is unavailable (an optional assistant briefly going
 # offline is safer than an unbounded bill). Override the hourly cap below
-# (default: 100 calls per workspace per rolling hour).
-# SHARE_AI_WORKSPACE_MAX_PER_HOUR=100
+# (default: 300 calls per workspace per rolling hour).
+# SHARE_AI_WORKSPACE_MAX_PER_HOUR=300
 #
 # Per-request output-token ceiling for the anonymous assistant (default: 512).
 # Worst-case output per accepted call = agent steps (5) × this value.
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -15,38 +15,6 @@ permissions:
 jobs:
  test:
    runs-on: ubuntu-latest
-    # Real Postgres + Redis so the server integration suite (`*.int-spec.ts`,
-    # behind `pnpm --filter server test:int`) runs in CI (red-team finding #7).
-    # Without it, cost-cap / FK-cascade / jsonb-round-trip / real-apply tests
-    # only ran locally, so regressions in those paths stayed green in CI.
-    # Postgres uses the pgvector image because migrations create vector columns
-    # and global-setup runs `CREATE EXTENSION vector`. Credentials/db match the
-    # defaults in apps/server/test/integration/db.ts + global-setup.ts
-    # (docmost / docmost_dev_pw, maintenance db `docmost`, redis on 6379), so no
-    # TEST_*_URL overrides are needed.
-    services:
-      postgres:
-        image: pgvector/pgvector:pg18
-        env:
-          POSTGRES_USER: docmost
-          POSTGRES_PASSWORD: docmost_dev_pw
-          POSTGRES_DB: docmost
-        ports:
-          - 5432:5432
-        options: >-
-          --health-cmd "pg_isready -U docmost"
-          --health-interval 10s
-          --health-timeout 5s
-          --health-retries 5
-      redis:
-        image: redis:7
-        ports:
-          - 6379:6379
-        options: >-
-          --health-cmd "redis-cli ping"
-          --health-interval 10s
-          --health-timeout 5s
-          --health-retries 5
    steps:
      - name: Checkout
        uses: actions/checkout@v4
@@ -68,12 +36,5 @@ jobs:
      - name: Build editor-ext
        run: pnpm --filter @docmost/editor-ext build

-      - name: Run unit tests
+      - name: Run tests
        run: pnpm -r test
-
-      # Integration suite against the real Postgres/Redis services above. Runs
-      # the FK-cascade, cost-cap, jsonb-round-trip and real-apply specs that the
-      # unit run (mocks only) cannot cover. global-setup drops/recreates the
-      # isolated `docmost_test` DB and migrates it to latest.
-      - name: Run server integration tests
-        run: pnpm --filter server test:int
--- a/.gitignore
+++ b/.gitignore
@@ -45,6 +45,4 @@ lerna-debug.log*

 # TypeScript incremental build artifacts
 *.tsbuildinfo
-
-# Self-hosted VAD / onnxruntime-web assets (copied from node_modules at dev/build time)
-apps/client/public/vad/
+apps/client/coverage/
--- a/.vscode/tasks.json
+++ b/.vscode/tasks.json
@@ -1,14 +0,0 @@
-{
-  // VSCode tasks for this repo.
-  "version": "2.0.0",
-  "tasks": [
-    {
-      "label": "git push (github + gitea)",
-      "type": "shell",
-      "command": "git push github develop && git push gitea develop",
-      "options": { "cwd": "${workspaceFolder}" },
-      "presentation": { "reveal": "never", "focus": false, "panel": "shared", "showReuseMessage": false, "close": true },
-      "problemMatcher": []
-    }
-  ]
-}
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -5,48 +5,45 @@ repository. It has two layers: **how to run a task end-to-end** (the
 sections below), and **how the codebase is built** (the technical sections
 further down, formerly in `CLAUDE.md`).

-## Task lifecycle
+## Жизненный цикл задачи

-### 1. Start: sync with develop
+### 1. Старт: синхронизация с develop

-Before starting **any** work, update your local `develop` and branch off it:
+Перед началом **любой** работы обнови локальный `develop` и ветвись от него:

 ```bash
 git checkout develop
 git fetch gitea
 git pull --ff-only gitea develop
-git checkout -b <short-feature-name>
+git checkout -b <короткое-имя-фичи>
 ```

-Never build a feature directly on `develop`, and never branch off a stale
-`develop` — otherwise the PR will carry extra commits or conflict.
+Никогда не пилит фичу прямо в `develop` и не ветвись от устаревшего
+`develop` — иначе PR будет содержать лишние коммиты или конфликтовать.

-### 2. Implementation
+### 2. Реализация

-Run the task through the workflow from the system prompt (Phase 1 analysis →
-Phase 3 implementation → Phase 4 review → Phase 5 verification → Phase 6
-report). Delegate large changes to a general subagent; review via the review
-subagent.
+Веди задачу по workflow из системного промпта (Phase 1 анализ → Phase 3
+реализация → Phase 4 review → Phase 5 верификация → Phase 6 отчёт). Большие
+изменения делегируй в general subagent, ревьюй через review subagent.

-**Create worktrees only inside the `.claude` folder** (e.g.
-`.claude/worktrees/<name>`). Creating a git worktree anywhere else — the repo
-root, sibling directories, or temp folders — is forbidden.
+### 3. Коммит — ТОЛЬКО в Gitea и ТОЛЬКО от `claude_code`

-### 3. Commit — ONLY to Gitea and ONLY as `claude_code`
+Это правило без исключений:

-This rule has no exceptions:
-
- **Where:** the only remote for commits/pushes is **`gitea`**
-  (`gitea.vvzvlad.xyz`). **Never** push to `origin` (the GitHub mirror), and
-  especially not to `upstream` (the original Docmost). The GitHub mirror is
-  updated by the owner's CI process, not by the agent.
- **Who:** commit **only** as the agent identity. Any commit whose author or
-  committer is `vvzvlad` is an error and must be rewritten.
+- **Куда:** единственный remote для коммитов/пушей — **`gitea`**
+  (`gitea.vvzvlad.xyz`). **Никогда** не пушь в `origin` (GitHub-зеркало) и
+  тем более в `upstream` (оригинальный Docmost). GitHub-зеркало обновляется
+  CI-процессом владельца, не агентом.
+- **От кого:** коммить **только** от агентского identity. Любой коммит,
+  у которого author или committer — `vvzvlad`, считается ошибкой и должен
+  быть переписан.
  - **name:** `claude_code`
  - **email:** `claude_code@vvzvlad.xyz`

-Use `--reset-author` when amending, otherwise git keeps the original author
-(the default config on this machine is `vvzvlad`, so check after every commit):
+Используй `--reset-author` при amend, иначе git оставит оригинального
+автора (по умолчанию config на этой машине — `vvzvlad`, поэтому проверяй
+после каждого коммита):

 ```bash
 GIT_AUTHOR_NAME="claude_code" \
@@ -56,33 +53,34 @@ GIT_COMMITTER_EMAIL="claude_code@vvzvlad.xyz" \
 git commit --amend --no-edit --reset-author
 ```

-For a regular new commit, set the branch-local config once and commit normally:
+Для обычного нового коммита достаточно один раз выставить локальный
+config ветки и коммитить штатно:

 ```bash
 git config user.name "claude_code"
 git config user.email "claude_code@vvzvlad.xyz"
 ```

-Check before push:
+Проверка перед push:

 ```bash
 git log -1 --format='Author: %an <%ae>%nCommitter: %cn <%ce>'
-# both lines must show claude_code <claude_code@vvzvlad.xyz>
+# обе строки должны показать claude_code <claude_code@vvzvlad.xyz>
 ```

-### 4. Push and PR to develop
+### 4. Push и PR в develop

-PRs always target `develop`. The `claude_code` password lives in the macOS
-keychain as a **generic password** under service `gitea-claude-code` (do not
-duplicate it as an internet-password for `gitea.vvzvlad.xyz` — that creates a
-conflict with the owner's account in the git credential helper):
+PR всегда в `develop`. Пароль `claude_code` лежит в macOS keychain как
+**generic password** под service `gitea-claude-code` (не дублируй его как
+internet-password для `gitea.vvzvlad.xyz` — это создаст конфликт с учёткой
+владельца в git credential helper):

 ```bash
 AGENT_PASS=$(security find-generic-password -s gitea-claude-code -w)
 ```

-Push by temporarily injecting the credentials into the remote URL, then always
-restore the URL to its clean form (the password must not linger in git
+Push — через временную подстановку кредов в remote URL, после чего URL
+обязательно возвращается в чистый вид (пароль не должен оседать в git
 config / reflog):

 ```bash
@@ -94,7 +92,7 @@ git remote set-url gitea "$ORIG_URL"
 unset AGENT_PASS SAFE_PASS
 ```

-The PR is created via the Gitea REST API (Basic Auth as `claude_code`):
+PR создаётся через Gitea REST API (Basic Auth от `claude_code`):

 ```bash
 curl -s -X POST \
@@ -104,75 +102,63 @@ curl -s -X POST \
  "https://gitea.vvzvlad.xyz/api/v1/repos/vvzvlad/gitmost/pulls"
 ```

-`base: develop`, `head: <branch>`. In the PR body: what was done, what is out
-of scope, verification results (tsc/lint/tests).
+`base: develop`, `head: <branch>`. В теле PR — что сделано, что вне scope,
+результаты верификации (tsc/lint/tests).

-> If push fails with `User permission denied for writing`, then `claude_code`
-> lacks collaborator rights on the repo. Ask the owner to add them (once, via
-> the Gitea UI or `PUT /api/v1/repos/vvzvlad/gitmost/collaborators/claude_code`
-> with `{"permission":"write"}` from their account).
+> Если push падает с `User permission denied for writing` — значит у
+> `claude_code` нет коллабораторских прав на репо. Попроси владельца
+> добавить (один раз, через Gitea UI или
+> `PUT /api/v1/repos/vvzvlad/gitmost/collaborators/claude_code` с
+> `{"permission":"write"}` от его учётки).

-### 5. Merge and cleanup
+### 5. Мерж и cleanup

- **The user merges the PR into develop** (not the agent). The agent does not
-  press the merge button.
- **After implementing a task, delete its plan from `docs/backlog/<task>.md`** —
-  this is part of closing the task, not the user's work. Files in
-  `docs/backlog/` are the work queue; completed items get cleaned out of it.
-  Do this in a separate commit from the same `claude_code` on the same branch
-  (or ask the user to delete it if the PR is already open and you don't want to
-  repush it).
- Any junk left uncommitted in the working tree? Check `git status` before the
-  final report.
+- **Мерж PR в develop делает пользователь** (не агент). Агент не жмёт
+  кнопку merge.
+- **После реализации задачи удали её план из `docs/backlog/<task>.md`** —
+  это часть закрытия задачи, не пользовательская работа. Файлы в
+  `docs/backlog/` — это очередь работы, выполненное из неё вычищается.
+  Сделай это в отдельном коммите от того же `claude_code` в той же ветке
+  (или попроси пользователя удалить, если PR уже открыт и ты не хочешь
+  его перепушивать).
+- Не закоммичен ли мусор в рабочем дереве? Проверь `git status` перед
+  финальным отчётом.

-## Release cycle: staging a new version
+## Релизный цикл: набор на новую версию

-When enough changes have accumulated on `develop` for a release, a **final
-review by three orchestrator skills** runs before the merge/tag:
+Когда в `develop` накопилось достаточно изменений для релиза, запускается
+**финальное ревью тремя скиллами-оркестраторами** перед мержем/тегом:

-1. **test-orchestrator** (the `code-review-orchestrator` skill focused on test
-   coverage) — verifies new code is covered by tests and there are no
-   regressions in existing ones.
-2. **review-orchestrator** (the `code-review-orchestrator` skill) —
-   multi-aspect code review: security, stability, convention conformance,
-   regressions, over-complexity.
-3. **red-team-orchestrator** (the red-team skill) — adversarial analysis of
-   attack scenarios against the affected components.
+1. **test-orchestrator** (skill `code-review-orchestrator` с фокусом на
+   тестовом покрытии) — проверяет, что новый код покрыт тестами и нет
+   регрессий в существующих.
+2. **review-orchestrator** (skill `code-review-orchestrator`) —
+   мульти-аспектный код-ревью: безопасность, стабильность, соответствие
+   конвенциям, регрессии, перегруженность.
+3. **red-team-orchestrator** (red-team скилл) — адверсариальный анализ
+   атакующих сценариев на затронутые компоненты.

-Order: the orchestrators return finding lists → the agent fixes everything they
-found (via a subagent or itself, per the delegation rules) → re-runs the review
-on the affected areas → cuts the tag per the "Cutting a release" procedure
-below.
+Порядок: оркестраторы возвращают списки находок → агент правит всё, что
+они нашли (через subagent или сам, по правилам делегирования) → повторно
+прогоняет ревью затронутых мест → режет тег по процедуре «Cutting a
+release» ниже.

-## Accounts & endpoints cheat sheet
+## Шпаргалка по учёткам и endpoint'ам

-| Item | Value |
+| Что | Значение |
 | --- | --- |
-| Only remote for commits | `gitea` → `https://vvzvlad@gitea.vvzvlad.xyz/vvzvlad/gitmost.git` |
-| Agent user (Gitea/git) | `claude_code` |
-| Agent email | `claude_code@vvzvlad.xyz` |
-| Keychain password | `security find-generic-password -s gitea-claude-code -w` |
-| PR API | `https://gitea.vvzvlad.xyz/api/v1/repos/vvzvlad/gitmost/pulls` (here `gitmost` is the repo's real slug on the server) |
-| Base branch | `develop` |
-| `origin` | GitHub mirror `vvzvlad/gitmost` — **do not push**, updated by the owner's CI |
-| `upstream` | The original Docmost — **never push** |
-
-## Creating issues (Gitea `tea` CLI)
-
-Issues are filed with the official Gitea CLI `tea`, already logged in as
-`claude_code` (`tea logins list` shows the `gitea` login as default):
-
-```bash
-tea issues create --repo vvzvlad/gitmost --labels feature \
-  --title '<title>' --description "$(cat body.md)"
-```
-
-> Gotcha (tea 0.14.1): the issue body flag is `--description`/`-d`, **not**
-> `--body` — passing `--body` fails with `flag provided but not defined: -body`.
+| Единственный remote для коммитов | `gitea` → `https://vvzvlad@gitea.vvzvlad.xyz/vvzvlad/gitmost.git` |
+| Агентский user (Gitea/git) | `claude_code` |
+| Агентский email | `claude_code@vvzvlad.xyz` |
+| Пароль в keychain | `security find-generic-password -s gitea-claude-code -w` |
+| PR API | `https://gitea.vvzvlad.xyz/api/v1/repos/vvzvlad/gitmost/pulls` (тут `gitmost` — реальный slug репо на сервере) |
+| Базовая ветка | `develop` |
+| `origin` | GitHub-зеркало `vvzvlad/gitmost` — **не пушить**, обновляется CI владельца |
+| `upstream` | Оригинальный Docmost — **не пушить никогда** |

 ---

-# Architecture and codebase
+# Архитектура и кодовая база

 ## What this is

@@ -223,7 +209,7 @@ pnpm --filter @docmost/mcp test                  # node --test (unit + mock)
 pnpm --filter @docmost/mcp test:e2e              # MCP end-to-end against a live instance
 ```

-**Database migrations** (Kysely, run from `apps/server`). **Where they auto-apply:** in **production** (the built image / `start:prod`) pending migrations run automatically on server boot. In **local dev** (the `pnpm dev` stand / `nest start --watch`) they do **NOT** auto-run — after you pull or switch branches you must apply them yourself with `pnpm --filter server migration:latest`, or any endpoint touching a new column/table 500s (e.g. a freshly-added `ai_chats.page_id` blanket-500s all of AI chat until migrated).
+**Database migrations** (Kysely, run from `apps/server`; they auto-run on server startup too):
 ```bash
 pnpm --filter server migration:create --name=my_change   # new empty migration
 pnpm --filter server migration:latest                    # apply all pending
@@ -291,29 +277,6 @@ The git tag is the source of truth for the displayed version (UI reads `git desc
 4. Update `CHANGELOG.md` (Keep a Changelog format): add a `## [X.Y.Z] - YYYY-MM-DD` section summarising `git log vPREV..HEAD --no-merges` grouped by type (Breaking / Added / Changed / Fixed / Removed), and add the `compare/vPREV...vX.Y.Z` link at the bottom. Fold the bump + changelog into the release commit.
 5. Tag the release commit with a **lightweight** tag (existing release tags are lightweight): `git tag vX.Y.Z`.
 6. Push commit and tag: `git push origin main && git push origin vX.Y.Z`. Pushing the `v*` tag triggers `release.yml` (multi-arch GHCR images + a draft GitHub Release).
-7. **Back-merge the release into `develop`** so develop builds report the new version: `git checkout develop && git merge --no-ff main && git push origin develop` (push to Gitea as well if that is the canonical remote).
-
-#### Why develop keeps showing the *previous* version (and why step 7 matters)
-
-The UI version is `git describe --tags --always` (see `vite.config.ts`), which walks **backwards from the current commit** and picks the **nearest tag reachable in that commit's ancestry**, then appends `-<commits-since-tag>-g<short-hash>`.
-
-The release tag (`vX.Y.Z`) is created on **`main`'s release merge commit**, and that commit is **not** in `develop`'s history. So until the release is back-merged, `git describe` on `develop` cannot see the new tag and falls back to the *previous* reachable tag. Result: every develop build — and the `ghcr.io/vvzvlad/gitmost:develop` image — keeps reporting e.g. `v0.91.0-NNN-g<hash>` even though `main` is already tagged `v0.93.0`. This is the classic git-flow pitfall: the version on `develop` does **not** advance just because a release was tagged on `main`.
-
-Back-merging `main → develop` (step 7) pulls the tagged release commit into `develop`'s ancestry, after which develop builds correctly show `vX.Y.Z-NNN-g<hash>`. If `develop` already drifted (release tagged but never back-merged), just run step 7 now — no new tag is needed.
-
-##### The tag must also exist on the remote that CI builds from (multi-remote gotcha)
-
-`git describe` names a tag **ref**, not just a commit — so the back-merge is *necessary but not sufficient*. The develop image is built by GitHub Actions (`develop.yml`, `actions/checkout` with `fetch-depth: 0`, then `git describe --tags --always`), so the version it prints depends on which tags exist **on the `github` remote**, not on your local clone or on `gitea`.
-
-This repo has two writable remotes — `gitea` (canonical, where commits land) and `github` (where the `:develop` and release images are built) — plus `upstream` (docmost, never push). **`git push <branch>` does NOT push tags**; tags must be pushed explicitly and *to each remote separately*. A release tag that only lives on `gitea` is invisible to the GitHub Actions build: even with the tagged commit fully in `develop`'s history (step 7 done), `git describe` on the GitHub runner falls back to the previous tag it *does* have, so the develop image keeps showing e.g. `v0.91.0-NNN` while `git describe` locally already says `v0.93.0-NN`.
-
-Fix / checklist when develop still shows the old version after a back-merge:
-
-1. Confirm the tag is missing on github: `git ls-remote --tags github` (compare with `gitea`).
-2. Push it there: `git push github vX.Y.Z` (and `git push gitea vX.Y.Z` if it is missing on gitea too). Note: pushing a `v*` tag to `github` also triggers `release.yml` (multi-arch GHCR images + draft Release) — expected, but be aware.
-3. Re-run the develop build (`gh workflow run Develop`, or push any commit to `develop`) so `git describe` re-resolves with the tag now present.
-
-(The `git push origin ...` in steps 6–7 above is shorthand — there is no `origin` remote here; substitute `gitea` **and** `github` as appropriate, and always push release tags to both.)

 ## Planning docs

--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -12,148 +12,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Added

- **Inline "Test" button per external MCP server.** Each server row in admin AI
-  settings now has its own "Test" button that runs an isolated connection check:
-  idle `Test` → green `OK · N` (with a tooltip listing the discovered tools, or
-  "No tools available") on success, or red `Failed` (tooltip with the sanitized
-  error) on a connection problem. State is per-row, so testing one server never
-  spins or recolours the others. (#170)
-
- **Persistent AI-chat history as the source of truth + server-side export.**
-  An assistant turn is now persisted to the database step by step: the row is
-  inserted upfront as `streaming` and updated as each agent step finishes, then
-  finalized once to `completed`/`error`/`aborted`. A process that dies mid-turn
-  keeps every finished step, and a startup sweep flips any dangling `streaming`
-  row (untouched for 10 minutes) to `aborted`. Chat "Copy" now exports
-  server-side from these rows (`POST /ai-chat/export`) rather than from live
-  client state, so the export is identical whether a chat is freshly streaming,
-  just switched to, or reloaded — and is available from the first turn of a new
-  chat. (#183, #174)
-
- **AI-agent attribution for MCP writes.** Comments (and pages) created through
-  the MCP endpoint by a dedicated agent account are now badged as "AI", with
-  unspoofable provenance derived from a per-user `is_agent` flag (not from the
-  request body). **Operator setup:** use a _dedicated_ service account for the
-  MCP fallback and set the flag with SQL —
-  `UPDATE users SET is_agent = true WHERE email = '<mcp-account>'`. Never flag a
-  human or shared account, or its normal edits get mis-attributed as AI. See the
-  AI-agent block in `.env.example`. (#143)
- **Footnote import diagnostics.** The MCP page-write tools (`create_page`,
-  `update_page`, `import_page_markdown`) now return a `footnoteWarnings` array
-  flagging dangling references, empty or duplicate definitions, and `[^id]`
-  markers inside table rows, so an agent can fix its own markup. The page is
-  still created; the field is omitted when there are no problems. (#166)
- **AI chat "Protocol" setting (`chatApiStyle`).** A new admin choice in AI
-  settings for the `openai` driver: `openai-compatible` (default) routes chat
-  through `@ai-sdk/openai-compatible`, which surfaces a provider's streamed
-  reasoning (`reasoning_content` → reasoning parts) for z.ai/GLM, DeepSeek,
-  OpenRouter, etc.; `openai` uses the official provider (real-OpenAI
-  reasoning-model request shaping). Chosen explicitly rather than inferred from
-  the base URL, since a custom URL can front real OpenAI too. (#175, #177)
- **Per-MCP-server instructions in the agent prompt.** Each external MCP server
-  now has an admin-authored `instructions` field ("how/when to use this server's
-  tools") that is injected into the agent's system prompt next to that server's
-  tool descriptions. Trusted text, rendered inside the prompt safety sandwich;
-  shown only for a server that actually connected and contributed ≥1 callable
-  tool. (#180)
- **Footnote multi-backlinks.** A footnote referenced more than once now shows a
-  back-link per reference (↩ a b c …), each scrolling to its own occurrence, like
-  Pandoc/Wikipedia; a single-reference footnote keeps the plain ↩. (#168)
-
-### Changed
-
- **AI chat default provider is now `openai-compatible` (reasoning surfaced).**
-  For the `openai` driver the chat provider defaults to the openai-compatible
-  implementation, so a workspace pointing at z.ai/GLM/DeepSeek now streams the
-  model's reasoning out of the box. An endpoint that is real OpenAI behind a
-  custom base URL should set the new `chatApiStyle` "Protocol" to `openai`. (#177)
-
- **Footnotes now reuse (Pandoc semantics).** Multiple `[^a]` references to the
-  same id are ONE footnote — one number, one definition, several back-references
-  — instead of being renamed to `a__2`, `a__3`. Duplicate `[^a]:` definitions are
-  first-wins on import (the rest are dropped and reported via `footnoteWarnings`),
-  and a reference with no definition yields a single empty footnote rather than
-  one per occurrence. This supersedes the 0.93.0 "survive duplicate-id
-  definitions" behavior for the import path. (#166)
-
- **Public share AI: default per-workspace hourly assistant cap lowered
-  300 → 100.** The limiter falls back to this default whenever
-  `SHARE_AI_WORKSPACE_MAX_PER_HOUR` is unset, so a `0.93.0` deployment that
-  never set the env var has its anonymous public-share assistant hourly cap
-  cut from 300 to 100 on upgrade. Set `SHARE_AI_WORKSPACE_MAX_PER_HOUR` to
-  keep the previous limit. (#62)
-
-### Fixed
-
- **Editor: caret/selection landed on the wrong line when clicking inside code
-  blocks and footnotes.** The affected NodeViews rendered their non-editable
-  chrome (language menu, footnotes heading, footnote number marker) before the
-  editable content, so the browser's click hit-testing missed the contentDOM and
-  snapped the caret to a previous node. Content now renders first in the DOM
-  (chrome is lifted back into place via CSS flex `order`), and scroll containers
-  are nudged after a paste to refresh stale hit-testing geometry. The caret
-  symptom is macOS-specific and was confirmed manually on macOS; the automated
-  guard pins the DOM-order invariant, not the caret behavior itself. (#146, #147)
- **AI chat: the live token counter now ticks between agent steps.** During a
-  multi-step turn the header token badge (and the "Thinking… · N tokens" line)
-  no longer froze on the previous step's authoritative usage; the current step's
-  estimate is combined per-component with `max`, so the count rises smoothly and
-  never jumps backwards. (#163)
-
-## [0.93.0] - 2026-06-21
-
-This release builds on the 0.91.0 AI foundation: admin-defined AI agent roles,
-an anonymous AI assistant on public shares, server-side voice dictation, an
-editor footnotes model, live page-template embeds, and sandboxed arbitrary-HTML
-embeds — plus a large batch of security hardening and test coverage.
-
-### Breaking Changes
-
- **MCP shared-token auth moved to its own header.** The `/mcp` shared guard
-  no longer reads `Authorization: Bearer <MCP_TOKEN>`; it now reads only the
-  `X-MCP-Token` header. The `Authorization` header is now reserved for per-user
-  HTTP Basic / Bearer access-JWT credentials, so each `/mcp` request
-  authenticates as a specific user (the `MCP_DOCMOST_*` service account is only
-  a fallback). Existing MCP clients (e.g. Claude Desktop) configured with
-  `Authorization: Bearer <MCP_TOKEN>` must be reconfigured to send
-  `X-MCP-Token: <MCP_TOKEN>` instead. See `MCP_TOKEN` in `.env.example`. As a
-  one-time aid, the server logs a single migration warning when it sees the
-  old-style header.
-
-### Added
-
- **AI agent roles**: admin-defined assistant personas with an optional
-  per-role model override, selectable in chat.
- **Anonymous AI assistant on public shares**: public-share visitors can chat
-  with a selectable agent-role identity that reuses the internal chat
-  presentation, with per-request output-token caps and a fail-closed Redis
-  limiter.
- **Voice dictation (STT)**: server-side speech-to-text with a mic button in
-  the chat and the editor, OpenRouter STT support, an endpoint test, and real
-  provider-error surfacing.
- **Footnotes**: an editor footnotes model (inline references + a definitions
-  list).
- **Page templates**: live whole-page embed (MVP) with a template-marker icon
-  in the page tree and a working Refresh action.
- **Arbitrary HTML/CSS/JS embeds**: a sandboxed-iframe embed block gated by a
-  per-workspace toggle (default OFF); insertable by any member when the toggle
-  is on.
- Admin-only **"Analytics / tracker"** workspace setting: a raw HTML/JS snippet
+- **Realtime streaming dictation**: a new live-dictation mic mode layered on top
+  of the existing batch STT. Audio streams over a dedicated `/ai-realtime`
+  Socket.IO namespace, and text is inserted as you speak: interim partials are
+  shown as a ghost decoration, and only finals are committed to the document.
+  Gated by a new `dictationRealtime` workspace toggle, with `sttRealtimeModel`
+  and `sttRealtimeBaseUrl` settings (an empty model falls back to `sttModel`; an
+  empty base URL falls back to the STT base URL server-side).
+  - **Ops caveat (single-process assumption):** the realtime concurrency caps
+    (1 concurrent session per user, 5 per workspace) are enforced **in-memory,
+    per API process**. They are therefore authoritative only on a **single API
+    replica** — running multiple API instances (horizontal scale / load
+    balancing) lets a user or workspace exceed these caps, since each process
+    counts only its own sessions. Treat the limits as per-process until the
+    counters are moved to a shared store.
+- Admin-only "Analytics / tracker" workspace setting: a raw HTML/JS snippet
  injected into the `<head>` of public share pages only (for analytics such as
-  Google Analytics or Yandex.Metrika), kept separate from the member-facing
-  HTML-embed feature.
- **MCP**: a hierarchical tree mode for `list_pages`, and per-user auth for the
-  embedded `/mcp` endpoint.
- **Page tree**: Expand all / Collapse all for the space tree, and
-  server-authoritative realtime tree updates.
- **AI chat UX**: a `get_current_page` tool for proxy-robust page context, a
-  current-context-size readout, an agent step cap raised 8→20 with a forced
-  final text answer, and auto-collapse of the chat window on page focus.
- **AI settings**: a Clear control inside the API-key field and an endpoint
-  status dot bound to "configured × enabled".
- **Client**: an always-visible space grid replacing the space-switcher popover,
-  removal of the sidebar Overview item, tighter comments-panel density, and no
-  auto-open of the comments panel when adding a comment.
+  Google Analytics or Yandex.Metrika).

 ### Changed

@@ -167,40 +42,16 @@ embeds — plus a large batch of security hardening and test coverage.
  server-side strip is the public-share read path, which still honors the
  workspace HTML-embed toggle.

-### Fixed
+### Breaking Changes

- AI chat: preserve scroll position during streaming, record chats that fail on
-  their first turn, and resolve the current page for agent context behind
-  proxies.
- AI roles: guard `update()` against concurrent soft-delete; harden the model
-  override, role-name uniqueness, and id validation; sandwich the safety
-  framework around the role persona.
- Auth: handle null-password (SSO/LDAP-only) accounts without a bcrypt throw.
- Footnotes: survive duplicate-id definitions without collab divergence.
- HTML embed: fix stale iframe height and damp the resize loop; strip embeds at
-  serve time on authenticated read paths and the plain page-create path.
- Page templates: import `ThrottleModule` so collab boots, never strand an
-  in-flight page-embed id, and add defense-in-depth workspace checks.
- Pages: `movePage` cycle guard with no phantom `PAGE_MOVED` event.
- Import: surface the real error cause from `/pages/import` instead of a generic 400.
-
-### Security
-
- MCP: close an SSO/MFA bypass on Basic auth and stop minting non-init sessions;
-  close a brute-force limiter check-then-act race.
- Public share: block restricted descendants in the anonymous assistant, cap
-  per-request output, fail closed when Redis is unavailable, and reject non-text
-  message parts to close a size-cap bypass.
- Make `trustProxy` env-configurable with a safe default.
-
-### Internal
-
- CI: gate the `develop` and release image builds on the test suite, run the
-  suites on push/PR, and build the `:develop` image on push to `develop`.
- Docs: replace `CLAUDE.md` with `AGENTS.md` codifying the agent workflow and
-  the release procedure, add migration-ordering guidance, and prune implemented
-  plans.
- A large batch of new server/client test coverage.
+- **MCP shared-token auth moved to its own header.** The `/mcp` shared guard
+  no longer reads `Authorization: Bearer <MCP_TOKEN>`; it now reads only the
+  `X-MCP-Token` header. Existing MCP clients (e.g. Claude Desktop) configured
+  with `Authorization: Bearer <MCP_TOKEN>` must be reconfigured to send
+  `X-MCP-Token: <MCP_TOKEN>` instead. The `Authorization` header is now
+  reserved for per-user HTTP Basic / Bearer access JWT credentials. See
+  `MCP_TOKEN` in `.env.example`. As a one-time aid, the server logs a single
+  migration warning when it sees the old-style header.

 ## [0.91.0] - 2026-06-18

@@ -284,6 +135,5 @@ knowledge layer, an embedded MCP server, and the Gitmost rebrand.
 - Build: drop the private EE submodule, retarget CI to GHCR, and update the
  Docker image to the GHCR registry.

-[Unreleased]: https://github.com/vvzvlad/gitmost/compare/v0.93.0...HEAD
-[0.93.0]: https://github.com/vvzvlad/gitmost/compare/v0.91.0...v0.93.0
+[Unreleased]: https://github.com/vvzvlad/gitmost/compare/v0.91.0...HEAD
 [0.91.0]: https://github.com/vvzvlad/gitmost/compare/v0.90.1...v0.91.0
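
Note on the realtime concurrency caps in the CHANGELOG entry above: the per-process behavior can be illustrated with a minimal sketch. It assumes the caps quoted in the entry (1 session per user, 5 per workspace) and the delete-at-zero, never-negative counters and pure `canConnect` check named in the commit message. The method names, signatures, and the `tryAcquire`/`release` helpers are hypothetical and are not taken from `session-limits.ts`.

```typescript
// Hypothetical sketch: illustrative names and signatures, not the project's code.
const MAX_PER_USER = 1; // cap quoted in the changelog entry
const MAX_PER_WORKSPACE = 5; // cap quoted in the changelog entry

class SessionCounters {
  private readonly counts = new Map<string, number>();

  get(key: string): number {
    return this.counts.get(key) ?? 0;
  }

  increment(key: string): void {
    this.counts.set(key, this.get(key) + 1);
  }

  // Delete the entry when it reaches zero; an extra decrement never goes negative.
  decrement(key: string): void {
    const next = this.get(key) - 1;
    if (next <= 0) {
      this.counts.delete(key);
    } else {
      this.counts.set(key, next);
    }
  }

  size(): number {
    return this.counts.size;
  }
}

// Pure check: both caps are evaluated before anything is incremented.
function canConnect(userCount: number, workspaceCount: number): boolean {
  return userCount < MAX_PER_USER && workspaceCount < MAX_PER_WORKSPACE;
}

// Assumed helper: admit a session only if both caps allow it.
function tryAcquire(
  users: SessionCounters,
  workspaces: SessionCounters,
  userId: string,
  workspaceId: string,
): boolean {
  if (!canConnect(users.get(userId), workspaces.get(workspaceId))) {
    return false; // rejected with no counter touched
  }
  users.increment(userId);
  workspaces.increment(workspaceId);
  return true;
}

// Assumed helper: release a session on disconnect.
function release(
  users: SessionCounters,
  workspaces: SessionCounters,
  userId: string,
  workspaceId: string,
): void {
  users.decrement(userId);
  workspaces.decrement(workspaceId);
}
```

Because each API process would hold its own `SessionCounters` instances, a second replica starts from zero and admits the same user again. That is the single-replica caveat the entry documents; moving the counts to a shared store would remove it.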
--- a/apps/client/package.json
+++ b/apps/client/package.json
@@ -3,8 +3,8 @@
  "private": true,
  "version": "0.93.0",
  "scripts": {
-    "dev": "node scripts/copy-vad-assets.mjs && vite",
-    "build": "node scripts/copy-vad-assets.mjs && tsc && vite build",
+    "dev": "vite",
+    "build": "tsc && vite build",
    "lint": "eslint .",
    "preview": "vite preview",
    "format": "prettier --write \"src/**/*.tsx\" \"src/**/*.ts\"",
@@ -28,7 +28,6 @@
    "@mantine/modals": "8.3.18",
    "@mantine/notifications": "8.3.18",
    "@mantine/spotlight": "8.3.18",
-    "@ricky0123/vad-web": "^0.0.30",
    "@slidoapp/emoji-mart": "5.8.7",
    "@slidoapp/emoji-mart-data": "1.2.4",
    "@slidoapp/emoji-mart-react": "1.1.5",
@@ -54,7 +53,6 @@
    "mantine-form-zod-resolver": "1.3.0",
    "mermaid": "11.15.0",
    "mitt": "3.0.1",
-    "onnxruntime-web": "^1.27.0",
    "posthog-js": "1.372.2",
    "react": "18.3.1",
    "react-clear-modal": "^2.0.18",
@@ -81,6 +79,7 @@
    "@types/react": "18.3.12",
    "@types/react-dom": "18.3.1",
    "@vitejs/plugin-react": "6.0.1",
+    "@vitest/coverage-v8": "4.1.6",
    "eslint": "9.28.0",
    "eslint-plugin-react": "7.37.5",
    "eslint-plugin-react-hooks": "7.0.1",
--- a/apps/client/public/locales/en-US/translation.json
+++ b/apps/client/public/locales/en-US/translation.json
@@ -119,8 +119,6 @@
  "Name": "Name",
  "New email": "New email",
  "New page": "New page",
-  "New note": "New note",
-  "Create in space": "Create in space",
  "New password": "New password",
  "No group found": "No group found",
  "No page history saved yet.": "No page history saved yet.",
@@ -258,7 +256,6 @@
  "Copy to space": "Copy to space",
  "Copy chat": "Copy chat",
  "Copied": "Copied",
-  "Failed to export chat": "Failed to export chat",
  "Duplicate": "Duplicate",
  "Select a user": "Select a user",
  "Select a group": "Select a group",
@@ -421,8 +418,6 @@
  "{{count}} command available_other": "{{count}} commands available",
  "{{count}} result available_one": "1 result available",
  "{{count}} result available_other": "{{count}} results available",
-  "{{count}} result found_one": "{{count}} result found",
-  "{{count}} result found_other": "{{count}} results found",
  "Equal columns": "Equal columns",
  "Left sidebar": "Left sidebar",
  "Right sidebar": "Right sidebar",
@@ -711,10 +706,8 @@
  "Authorization header": "Authorization header",
  "Tool allowlist": "Tool allowlist",
  "Optional. Leave empty to allow all tools the server exposes.": "Optional. Leave empty to allow all tools the server exposes.",
-  "Optional guidance for the agent on how and when to use this server's tools. Injected into the system prompt. The server's tools are namespaced as \"<server name>_*\".": "Optional guidance for the agent on how and when to use this server's tools. Injected into the system prompt. The server's tools are namespaced as \"<server name>_*\".",
+  "Use Tavily preset": "Use Tavily preset",
  "Test": "Test",
-  "Failed": "Failed",
-  "OK · {{count}}": "OK · {{count}}",
  "Available tools": "Available tools",
  "No tools available": "No tools available",
  "Created successfully": "Created successfully",
@@ -958,7 +951,6 @@
  "Try a different search term.": "Try a different search term.",
  "Try again": "Try again",
  "Untitled chat": "Untitled chat",
-  "No document": "No document",
  "You": "You",
  "What can I help you with?": "What can I help you with?",
  "Are you sure you want to revoke this {{credential}}": "Are you sure you want to revoke this {{credential}}",
@@ -1081,8 +1073,6 @@
  "Undo": "Undo",
  "Redo": "Redo",
  "Backlinks": "Backlinks",
-  "Back to references": "Back to references",
-  "Back to reference {{label}}": "Back to reference {{label}}",
  "Last updated by": "Last updated by",
  "Last updated": "Last updated",
  "Stats": "Stats",
@@ -1135,32 +1125,15 @@
  "Removed from favorites": "Removed from favorites",
  "Added {{name}} to favorites": "Added {{name}} to favorites",
  "Removed {{name}} from favorites": "Removed {{name}} from favorites",
-  "Label added": "Label added",
-  "Label removed": "Label removed",
-  "Image updated": "Image updated",
-  "Unsupported image type": "Unsupported image type",
-  "Member deactivated": "Member deactivated",
-  "Member activated": "Member activated",
-  "Name is required": "Name is required",
-  "Name must be 40 characters or fewer": "Name must be 40 characters or fewer",
-  "Group name must be at least 2 characters": "Group name must be at least 2 characters",
-  "Group name must be 100 characters or fewer": "Group name must be 100 characters or fewer",
-  "Description must be 500 characters or fewer": "Description must be 500 characters or fewer",
-  "Invalid invitation link": "Invalid invitation link",
  "Page menu for {{name}}": "Page menu for {{name}}",
  "Create subpage of {{name}}": "Create subpage of {{name}}",
  "AI chat": "AI chat",
  "Ask a question about this documentation.": "Ask a question about this documentation.",
  "Ask a question…": "Ask a question…",
  "Thinking…": "Thinking…",
-  "Thinking… · {{count}} tokens": "Thinking… · {{count}} tokens",
-  "Thinking… · {{count}} tokens_one": "Thinking… · {{count}} token",
-  "Thinking… · {{count}} tokens_other": "Thinking… · {{count}} tokens",
-  "Thinking · {{count}} tokens": "Thinking · {{count}} tokens",
-  "Thinking · {{count}} tokens_one": "Thinking · {{count}} token",
-  "Thinking · {{count}} tokens_other": "Thinking · {{count}} tokens",
  "The assistant is unavailable right now. Please try again.": "The assistant is unavailable right now. Please try again.",
  "Public share assistant": "Public share assistant",
+  "Enabled": "Enabled",
  "Let anonymous visitors of public shares ask an AI assistant scoped to that share's pages. You pay for the tokens.": "Let anonymous visitors of public shares ask an AI assistant scoped to that share's pages. You pay for the tokens.",
  "Public assistant model": "Public assistant model",
  "Defaults to the chat model": "Defaults to the chat model",
@@ -1170,19 +1143,11 @@
  "Built-in assistant persona": "Built-in assistant persona",
  "Minimize": "Minimize",
  "Current context size": "Current context size",
-  "Tokens generated this turn": "Tokens generated this turn",
  "AI agent": "AI agent",
-  "Take a look at the current document": "Take a look at the current document",
  "AI agent is typing…": "AI agent is typing…",
  "{{name}} is typing…": "{{name}} is typing…",
  "Send": "Send",
-  "Send when the agent finishes": "Send when the agent finishes",
-  "Queue message": "Queue message",
-  "Remove queued message": "Remove queued message",
  "Stop": "Stop",
-  "Response stopped.": "Response stopped.",
-  "Connection lost — the answer was interrupted.": "Connection lost — the answer was interrupted.",
-  "Response stopped (manually or the connection dropped).": "Response stopped (manually or the connection dropped).",
  "Chat menu": "Chat menu",
  "No chats yet.": "No chats yet.",
  "Delete this chat?": "Delete this chat?",
@@ -1214,11 +1179,13 @@
  "Semantic search": "Semantic search",
  "Voice / STT": "Voice / STT",
  "Voice dictation": "Voice dictation",
-  "Streaming dictation": "Streaming dictation",
-  "Transcribe as you speak, cutting on pauses": "Transcribe as you speak, cutting on pauses",
+  "Realtime dictation": "Realtime dictation",
+  "Realtime model": "Realtime model",
+  "Realtime endpoint": "Realtime endpoint",
+  "Streams audio live and inserts text as you speak (requires an OpenAI-compatible Realtime endpoint)": "Streams audio live and inserts text as you speak (requires an OpenAI-compatible Realtime endpoint)",
+  "Leave empty to use the STT base URL": "Leave empty to use the STT base URL",
  "Voice dictation is not available yet.": "Voice dictation is not available yet.",
  "Test endpoint": "Test endpoint",
-  "Save and test": "Save and test",
  "Save endpoints": "Save endpoints",
  "Configured and enabled": "Configured and enabled",
  "Configured but disabled": "Configured but disabled",
@@ -1251,8 +1218,6 @@
  "No microphone found": "No microphone found",
  "Could not start recording": "Could not start recording",
  "Transcription failed": "Transcription failed",
-  "Transcribe": "Transcribe",
-  "No speech detected": "No speech detected",
  "Voice dictation is not configured": "Voice dictation is not configured",
  "Microphone is unavailable or already in use": "Microphone is unavailable or already in use",
  "Audio recording is not available in this browser/context": "Audio recording is not available in this browser/context",
@@ -1260,9 +1225,6 @@
  "How transcription requests are sent to the endpoint": "How transcription requests are sent to the endpoint",
  "OpenAI-compatible (multipart/form-data)": "OpenAI-compatible (multipart/form-data)",
  "OpenRouter (JSON, base64 audio)": "OpenRouter (JSON, base64 audio)",
-  "Dictation language": "Dictation language",
-  "Auto-detect": "Auto-detect",
-  "Spoken language hint sent to the transcription model. Auto-detect lets the model decide.": "Spoken language hint sent to the transcription model. Auto-detect lets the model decide.",
  "Agent role": "Agent role",
  "Universal assistant": "Universal assistant",
  "Add role": "Add role",
@@ -1279,10 +1241,6 @@
  "Optional. Defaults to the workspace model.": "Optional. Defaults to the workspace model.",
  "e.g. gpt-4o-mini": "e.g. gpt-4o-mini",
  "If you choose a different provider, it must already be configured in AI settings.": "If you choose a different provider, it must already be configured in AI settings.",
-  "Start automatically": "Start automatically",
-  "When on, picking this role sends a launch message and starts the chat. When off, the role is selected and you type the first message yourself.": "When on, picking this role sends a launch message and starts the chat. When off, the role is selected and you type the first message yourself.",
-  "Launch message": "Launch message",
-  "Sent automatically when this role is picked. Leave empty to use the default text. Ignored when “Start automatically” is off.": "Sent automatically when this role is picked. Leave empty to use the default text. Ignored when “Start automatically” is off.",
  "Agent roles": "Agent roles",
  "Reusable presets that shape the agent's behavior (and optionally its model). Picked when starting a new chat.": "Reusable presets that shape the agent's behavior (and optionally its model). Picked when starting a new chat.",
  "No roles configured": "No roles configured",
@@ -1302,20 +1260,5 @@
  "Embeds run inside a sandboxed iframe with a separate origin, so they cannot read or modify the page they are embedded in.": "Embeds run inside a sandboxed iframe with a separate origin, so they cannot read or modify the page they are embedded in.",
  "Turning this off hides existing embeds (they render as a disabled placeholder) and stops serving them on public share pages.": "Turning this off hides existing embeds (they render as a disabled placeholder) and stops serving them on public share pages.",
  "Analytics / tracker": "Analytics / tracker",
-  "Injected verbatim into the <head> of PUBLIC SHARE pages only (same-origin). For analytics snippets (Google Analytics, Yandex.Metrika, etc.). Admin only.": "Injected verbatim into the <head> of PUBLIC SHARE pages only (same-origin). For analytics snippets (Google Analytics, Yandex.Metrika, etc.). Admin only.",
-  "Go to login page": "Go to login page",
-  "Move to space": "Move to space",
-  "Float left (wrap text)": "Float left (wrap text)",
-  "Float right (wrap text)": "Float right (wrap text)",
-  "Switch to tree": "Switch to tree",
-  "Switch to flat list": "Switch to flat list",
-  "Toggle subpages display mode": "Toggle subpages display mode",
-  "Page tree (child pages, recursive)": "Page tree (child pages, recursive)",
-  "Render the full nested tree of all descendant pages": "Render the full nested tree of all descendant pages",
-  "Showing {{count}} subpages_one": "Showing {{count}} subpage",
-  "Showing {{count}} subpages_other": "Showing {{count}} subpages",
-  "Protocol": "Protocol",
-  "How chat requests are sent and how reasoning is surfaced": "How chat requests are sent and how reasoning is surfaced",
-  "OpenAI-compatible (surfaces reasoning)": "OpenAI-compatible (surfaces reasoning)",
-  "OpenAI (official)": "OpenAI (official)"
+  "Injected verbatim into the <head> of PUBLIC SHARE pages only (same-origin). For analytics snippets (Google Analytics, Yandex.Metrika, etc.). Admin only.": "Injected verbatim into the <head> of PUBLIC SHARE pages only (same-origin). For analytics snippets (Google Analytics, Yandex.Metrika, etc.). Admin only."
 }
--- a/apps/client/public/locales/ru-RU/translation.json
+++ b/apps/client/public/locales/ru-RU/translation.json
@@ -119,8 +119,6 @@
  "Name": "Имя",
  "New email": "Новый электронный адрес",
  "New page": "Новая страница",
-  "New note": "Новая заметка",
-  "Create in space": "Создать в пространстве",
  "New password": "Новый пароль",
  "No group found": "Группа не найдена",
  "No page history saved yet.": "История страниц ещё не сохранена.",
@@ -257,7 +255,6 @@
  "Copy": "Копировать",
  "Copy to space": "Копировать в пространство",
  "Copied": "Скопировано",
-  "Failed to export chat": "Не удалось экспортировать чат",
  "Duplicate": "Дублировать",
  "Select a user": "Выберите пользователя",
  "Select a group": "Выберите группу",
@@ -386,11 +383,6 @@
  "Quote": "Цитата",
  "Image": "Изображение",
  "Audio": "Аудио",
-  "Transcribe": "Транскрибировать",
-  "Transcribing…": "Транскрибация…",
-  "No speech detected": "Речь не распознана",
-  "Transcription failed": "Не удалось распознать речь",
-  "Voice dictation is not configured": "Голосовой ввод не настроен",
  "Embed PDF": "Встроить PDF",
  "Upload and embed a PDF file.": "Загрузите и встроите PDF-файл.",
  "Embed as PDF": "Встроить как PDF",
@@ -406,8 +398,6 @@
  "Footnote {{number}}": "Сноска {{number}}",
  "Go to footnote": "Перейти к сноске",
  "Back to reference": "Вернуться к ссылке",
-  "Back to references": "Вернуться к ссылкам",
-  "Back to reference {{label}}": "Вернуться к ссылке {{label}}",
  "Empty footnote": "Пустая сноска",
  "Math inline": "Строчная формула",
  "Insert inline math equation.": "Вставить математическое выражение в строку.",
@@ -679,50 +669,8 @@
  "AI Answer": "Ответ ИИ",
  "Ask AI": "Спросить ИИ",
  "AI agent": "AI-агент",
-  "Take a look at the current document": "Посмотри текущий документ",
-  "Start automatically": "Запускать автоматически",
-  "When on, picking this role sends a launch message and starts the chat. When off, the role is selected and you type the first message yourself.": "Когда включено, выбор этой роли отправляет стартовое сообщение и начинает чат. Когда выключено, роль выбирается, а первое сообщение вы вводите сами.",
-  "Launch message": "Стартовое сообщение",
-  "Sent automatically when this role is picked. Leave empty to use the default text. Ignored when “Start automatically” is off.": "Отправляется автоматически при выборе этой роли. Оставьте пустым, чтобы использовать текст по умолчанию. Игнорируется, когда «Запускать автоматически» выключено.",
  "AI agent is typing…": "AI-агент печатает…",
  "{{name}} is typing…": "{{name}} печатает…",
-  "Thinking…": "Думаю…",
-  "Thinking… · {{count}} tokens": "Думаю… · {{count}} токенов",
-  "Thinking… · {{count}} tokens_one": "Думаю… · {{count}} токен",
-  "Thinking… · {{count}} tokens_few": "Думаю… · {{count}} токена",
-  "Thinking… · {{count}} tokens_many": "Думаю… · {{count}} токенов",
-  "Thinking · {{count}} tokens": "Размышления · {{count}} токенов",
-  "Thinking · {{count}} tokens_one": "Размышления · {{count}} токен",
-  "Thinking · {{count}} tokens_few": "Размышления · {{count}} токена",
-  "Thinking · {{count}} tokens_many": "Размышления · {{count}} токенов",
-  "Agent role": "Роль агента",
-  "AI chat": "AI-чат",
-  "AI chat is disabled for this workspace.": "AI-чат отключён для этого рабочего пространства.",
-  "Ask a question about this documentation.": "Задайте вопрос об этой документации.",
-  "Ask a question…": "Задайте вопрос…",
-  "Ask the AI agent anything about your workspace.": "Спросите AI-агента о чём угодно по вашему рабочему пространству.",
-  "Ask the AI agent…": "Спросите AI-агента…",
-  "Copy chat": "Копировать чат",
-  "Created successfully": "Успешно создано",
-  "Current context size": "Текущий размер контекста",
-  "Tokens generated this turn": "Токенов сгенерировано за ход",
-  "Delete this chat?": "Удалить этот чат?",
-  "Deleted successfully": "Успешно удалено",
-  "Edited by AI agent on behalf of {{name}}": "Отредактировано AI-агентом от имени {{name}}",
-  "Failed to delete chat": "Не удалось удалить чат",
-  "Failed to rename chat": "Не удалось переименовать чат",
-  "Minimize": "Свернуть",
-  "No chats yet.": "Чатов пока нет.",
-  "Send": "Отправить",
-  "Send when the agent finishes": "Отправить, когда агент закончит",
-  "Queue message": "Поставить в очередь",
-  "Remove queued message": "Убрать из очереди",
-  "Something went wrong": "Что-то пошло не так",
-  "Stop": "Стоп",
-  "The AI agent could not respond. Please try again.": "AI-агент не смог ответить. Попробуйте ещё раз.",
-  "The AI provider is not configured. Ask an administrator to set it up.": "AI-провайдер не настроен. Попросите администратора настроить его.",
-  "Universal assistant": "Универсальный ассистент",
-  "You": "Вы",
  "AI is thinking...": "ИИ обрабатывает запрос...",
  "Thinking": "Думаю",
  "Ask a question...": "Задайте вопрос...",
@@ -752,8 +700,6 @@
  "Manage API keys for all users in the workspace. View the <anchor>API documentation</anchor> for usage details.": "Управляйте API-ключами для всех пользователей в рабочем пространстве. Смотрите <anchor>документацию по API</anchor> для получения информации об использовании.",
  "View the <anchor>API documentation</anchor> for usage details.": "Смотрите <anchor>документацию по API</anchor> для получения информации об использовании.",
  "View the <anchor>MCP documentation</anchor>.": "Смотрите <anchor>документацию по MCP</anchor>.",
-  "Instructions": "Инструкции",
-  "Optional guidance for the agent on how and when to use this server's tools. Injected into the system prompt. The server's tools are namespaced as \"<server name>_*\".": "Необязательное указание агенту, как и когда использовать инструменты этого сервера. Добавляется в системный промпт. Инструменты сервера именуются с префиксом «<имя сервера>_*».",
  "Sources": "Источники",
  "AI Answers not available for attachments": "Ответы ИИ недоступны для вложений",
  "No answer available": "Ответ недоступен",
@@ -980,7 +926,6 @@
  "Try a different search term.": "Попробуйте другой поисковый запрос.",
  "Try again": "Попробовать снова",
  "Untitled chat": "Чат без названия",
-  "No document": "Без документа",
  "What can I help you with?": "Чем я могу вам помочь?",
  "Are you sure you want to revoke this {{credential}}": "Вы уверены, что хотите отозвать этот {{credential}}",
  "Automatically provision users and groups from your identity provider via SCIM.": "Автоматически предоставляйте доступ пользователям и группам из вашего провайдера удостоверений через SCIM.",
@@ -1152,26 +1097,5 @@
  "Added {{name}} to favorites": "{{name}} добавлено в избранное",
  "Removed {{name}} from favorites": "{{name}} удалено из избранного",
  "Page menu for {{name}}": "Меню страницы для {{name}}",
-  "Create subpage of {{name}}": "Создать подстраницу для {{name}}",
-  "Dictation language": "Язык диктовки",
-  "Auto-detect": "Автоопределение",
-  "Spoken language hint sent to the transcription model. Auto-detect lets the model decide.": "Подсказка языка речи для модели транскрипции. «Автоопределение» оставляет выбор за моделью.",
-  "Float left (wrap text)": "Обтекание слева",
-  "Float right (wrap text)": "Обтекание справа",
-  "Switch to tree": "Переключить на дерево",
-  "Switch to flat list": "Переключить на плоский список",
-  "Toggle subpages display mode": "Переключить режим отображения подстраниц",
-  "Page tree (child pages, recursive)": "Дерево страниц (дочерние, рекурсивно)",
-  "Render the full nested tree of all descendant pages": "Показать полное вложенное дерево всех дочерних страниц",
-  "Showing {{count}} subpages_one": "Показано {{count}} подстраница",
-  "Showing {{count}} subpages_few": "Показано {{count}} подстраницы",
-  "Showing {{count}} subpages_many": "Показано {{count}} подстраниц",
-  "Protocol": "Протокол",
-  "How chat requests are sent and how reasoning is surfaced": "Как отправляются запросы чата и как показывается reasoning",
-  "OpenAI-compatible (surfaces reasoning)": "OpenAI-совместимый (показывает reasoning)",
-  "OpenAI (official)": "OpenAI (официальный)",
-  "Test": "Тест",
-  "Failed": "Ошибка",
-  "OK · {{count}}": "OK · {{count}}",
-  "No tools available": "Нет доступных инструментов"
+  "Create subpage of {{name}}": "Создать подстраницу для {{name}}"
 }
--- a/apps/client/scripts/copy-vad-assets.mjs
+++ b/apps/client/scripts/copy-vad-assets.mjs
@@ -1,70 +0,0 @@
-// Self-host the @ricky0123/vad-web + onnxruntime-web runtime assets under
-// apps/client/public/vad/.
-//
-// WHY THIS EXISTS:
-// Both vad-web and onnxruntime-web resolve their assets by URL *at runtime* (the
-// VAD audio worklet + Silero model, and ORT's wasm/mjs backend). In vad-web
-// 0.0.30 the default baseAssetPath / onnxWASMBasePath is "./" — i.e. relative to
-// the current page URL — NOT a CDN. In this SPA that "./" request hits the
-// client-side catch-all route and gets served index.html (text/html), so the
-// onnxruntime ESM/wasm backend fails to initialize ("'text/html' is not a valid
-// JavaScript MIME type"). We fix that by copying the needed runtime files into
-// public/vad/ and pointing both path constants at the fixed absolute "/vad/".
-//
-// These copies are NOT committed (the ORT wasm is ~26 MB); this script runs
-// before `dev` and `build` (see package.json) to repopulate them from
-// node_modules. It is idempotent: it (re)creates the dir and overwrites.
-
-import { createRequire } from "node:module";
-import { fileURLToPath } from "node:url";
-import path from "node:path";
-import fs from "node:fs";
-
-const require = createRequire(import.meta.url);
-const here = path.dirname(fileURLToPath(import.meta.url));
-const outDir = path.join(here, "..", "public", "vad");
-
-// vad-web exposes ./package.json, so derive its dist dir from there.
-const vadDist = path.join(
-  path.dirname(require.resolve("@ricky0123/vad-web/package.json")),
-  "dist",
-);
-
-// onnxruntime-web's "exports" map does NOT expose ./package.json, so resolving
-// it would throw ERR_PACKAGE_PATH_NOT_EXPORTED. It DOES export the exact asset
-// subpaths we need, so resolve those files directly.
-//
-// ORT ships several wasm backends and which one the app bundle references depends
-// on the resolver: Vite dev resolves the JSEP build (ort-wasm-simd-threaded.jsep.*)
-// while the production rolldown build resolves the plain build
-// (ort-wasm-simd-threaded.*). Ship BOTH variants so the runtime fetch hits a real
-// file under /vad/ regardless of which the bundle picked (each .mjs proxy fetches
-// its matching .wasm at init).
-const ortJsepMjs = require.resolve(
-  "onnxruntime-web/ort-wasm-simd-threaded.jsep.mjs",
-);
-const ortJsepWasm = require.resolve(
-  "onnxruntime-web/ort-wasm-simd-threaded.jsep.wasm",
-);
-const ortMjs = require.resolve("onnxruntime-web/ort-wasm-simd-threaded.mjs");
-const ortWasm = require.resolve("onnxruntime-web/ort-wasm-simd-threaded.wasm");
-
-// [absolute source path, output filename]
-const files = [
-  [path.join(vadDist, "vad.worklet.bundle.min.js"), "vad.worklet.bundle.min.js"],
-  [path.join(vadDist, "silero_vad_v5.onnx"), "silero_vad_v5.onnx"],
-  [ortJsepMjs, "ort-wasm-simd-threaded.jsep.mjs"],
-  [ortJsepWasm, "ort-wasm-simd-threaded.jsep.wasm"],
-  [ortMjs, "ort-wasm-simd-threaded.mjs"],
-  [ortWasm, "ort-wasm-simd-threaded.wasm"],
-];
-
-fs.mkdirSync(outDir, { recursive: true });
-for (const [src, name] of files) {
-  if (!fs.existsSync(src)) {
-    console.error(`[copy-vad-assets] missing source: ${src}`);
-    process.exit(1);
-  }
-  fs.copyFileSync(src, path.join(outDir, name));
-  console.log(`[copy-vad-assets] ${name}`);
-}
--- a/apps/client/src/components/common/avatar-uploader.tsx
+++ b/apps/client/src/components/common/avatar-uploader.tsx
@@ -42,23 +42,6 @@ export default function AvatarUploader({
      return;
    }

-    // Validate file type. The `accept` attribute only filters the dialog;
-    // a user can still select a non-image file, which previously failed
-    // silently. Surface a visible error instead (issue #133). Accept any
-    // image/* MIME (png, jpeg, webp, gif, svg, ...) so we don't narrow below
-    // what the server accepts; only genuinely non-image files are rejected.
-    if (!file.type.startsWith("image/")) {
-      notifications.show({
-        message: t("Unsupported image type"),
-        color: "red",
-      });
-      // Reset the input
-      if (fileInputRef.current) {
-        fileInputRef.current.value = "";
-      }
-      return;
-    }
-
    // Validate file size (max 10MB)
    const maxSizeInBytes = 10 * 1024 * 1024;
    if (file.size > maxSizeInBytes) {
@@ -75,8 +58,6 @@ export default function AvatarUploader({

    try {
      await onUpload(file);
-      // Notify on success so the upload gives visible feedback (issue #128)
-      notifications.show({ message: t("Image updated") });
    } catch (error) {
      console.error(error);
      notifications.show({
@@ -136,7 +117,7 @@ export default function AvatarUploader({
        type="file"
        ref={fileInputRef}
        onChange={handleFileInputChange}
-        accept="image/*"
+        accept="image/png,image/jpeg,image/jpg"
        aria-label={ariaLabel}
        tabIndex={-1}
        style={{ display: "none" }}
--- a/apps/client/src/components/common/recent-changes.tsx
+++ b/apps/client/src/components/common/recent-changes.tsx
@@ -67,7 +67,6 @@ export default function RecentChanges({ spaceId }: Props) {
                    <Badge
                      color={getInitialsColor(page?.space.name)}
                      variant="light"
-                      tt="none"
                      component={Link}
                      to={getSpaceUrl(page?.space.slug)}
                      style={{ cursor: "pointer" }}
--- a/apps/client/src/components/icons/icon-columns-4.tsx
+++ b/apps/client/src/components/icons/icon-columns-4.tsx
@@ -9,10 +9,8 @@ export function IconColumns4({ size = 24, stroke = 2 }: Props) {
  return (
    <svg
      xmlns="http://www.w3.org/2000/svg"
-      // rem(size) returns a `calc(...)` string, which is invalid for the raw
-      // SVG width/height length attributes ("Expected length, calc(...)"). Pass
-      // it via CSS style instead (matching the other icon components).
-      style={{ width: rem(size), height: rem(size) }}
+      width={rem(size)}
+      height={rem(size)}
      viewBox="0 0 24 24"
      fill="none"
      stroke="currentColor"
--- a/apps/client/src/components/icons/icon-columns-5.tsx
+++ b/apps/client/src/components/icons/icon-columns-5.tsx
@@ -9,10 +9,8 @@ export function IconColumns5({ size = 24, stroke = 2 }: Props) {
  return (
    <svg
      xmlns="http://www.w3.org/2000/svg"
-      // rem(size) returns a `calc(...)` string, which is invalid for the raw
-      // SVG width/height length attributes ("Expected length, calc(...)"). Pass
-      // it via CSS style instead (matching the other icon components).
-      style={{ width: rem(size), height: rem(size) }}
+      width={rem(size)}
+      height={rem(size)}
      viewBox="0 0 24 24"
      fill="none"
      stroke="currentColor"
--- a/apps/client/src/components/layouts/global/app-header.module.css
+++ b/apps/client/src/components/layouts/global/app-header.module.css
@@ -13,7 +13,6 @@
    text-decoration: none;
    color: inherit;
    cursor: pointer;
-    user-select: none;
 }

 .brandIcon {
@@ -34,3 +33,21 @@
       that is ~9.3px, minus the font descent (~2px) ≈ 7px. */
    margin-bottom: rem(7px);
 }
+
+.link {
+    display: block;
+    line-height: 1;
+    padding: rem(8px) rem(12px);
+    border-radius: var(--mantine-radius-sm);
+    text-decoration: none;
+    color: light-dark(var(--mantine-color-gray-7), var(--mantine-color-dark-0));
+    font-size: var(--mantine-font-size-sm);
+    font-weight: 500;
+    user-select: none;
+    white-space: nowrap;
+    flex-shrink: 0;
+
+    @mixin hover {
+        background-color: light-dark(var(--mantine-color-gray-0), var(--mantine-color-dark-6));
+    }
+}
--- a/apps/client/src/components/layouts/global/app-header.tsx
+++ b/apps/client/src/components/layouts/global/app-header.tsx
@@ -5,11 +5,12 @@ import {
  Text,
  Tooltip,
 } from "@mantine/core";
-import { IconMessage } from "@tabler/icons-react";
+import { IconSparkles } from "@tabler/icons-react";
 import classes from "./app-header.module.css";
 import { BrandLogo } from "@/components/ui/brand-logo";
 import TopMenu from "@/components/layouts/global/top-menu.tsx";
 import { Link } from "react-router-dom";
+import APP_ROUTE from "@/lib/app-route.ts";
 import { useAtom, useSetAtom } from "jotai";
 import {
  desktopSidebarAtom,
@@ -29,6 +30,10 @@ import {
 } from "@/features/search/constants.ts";
 import { NotificationPopover } from "@/features/notification/components/notification-popover.tsx";

+const links = [
+  { link: APP_ROUTE.HOME, label: "Home" },
+];
+
 export function AppHeader() {
  const { t } = useTranslation();
  const [mobileOpened] = useAtom(mobileSidebarAtom);
@@ -42,6 +47,12 @@ export function AppHeader() {
  // AI chat entry point: only shown when the workspace enables it (A7 gate).
  const aiChatEnabled = workspace?.settings?.ai?.chat === true;

+  const items = links.map((link) => (
+    <Link key={link.label} to={link.link} className={classes.link}>
+      {t(link.label)}
+    </Link>
+  ));
+
  return (
    <>
      <Group h="100%" px="md" justify="space-between" wrap={"nowrap"}>
@@ -86,6 +97,10 @@ export function AppHeader() {
              </Text>
            </Tooltip>
          </Group>
+
+          <Group ml="xl" gap={5} className={classes.links} visibleFrom="sm">
+            {items}
+          </Group>
        </Group>

        <div>
@@ -107,7 +122,7 @@ export function AppHeader() {
                aria-label={t("AI chat")}
                onClick={() => setAiChatWindowOpen((v) => !v)}
              >
-                <IconMessage size={20} />
+                <IconSparkles size={20} />
              </ActionIcon>
            </Tooltip>
          )}
--- a/apps/client/src/components/layouts/global/aside.tsx
+++ b/apps/client/src/components/layouts/global/aside.tsx
@@ -27,7 +27,7 @@ export default function Aside() {

  switch (tab) {
    case "comments":
-      component = <CommentListWithTabs onClose={closeAside} />;
+      component = <CommentListWithTabs />;
      title = "Comments";
      break;
    case "toc":
@@ -44,27 +44,26 @@ export default function Aside() {
  }

  return (
-    <Box p={0} style={{ height: "100%", display: "flex", flexDirection: "column" }}>
-      {component &&
-        (tab === "comments" ? (
-          component
-        ) : (
-          <>
-            <Group justify="space-between" wrap="nowrap" mb="sm">
-              <Title order={2} size="h6" fw={500}>
-                {t(title)}
-              </Title>
-              <Tooltip label={t("Close")} withArrow>
-                <ActionIcon
-                  variant="subtle"
-                  color="gray"
-                  onClick={closeAside}
-                  aria-label={t("Close")}
-                >
-                  <IconX size={18} />
-                </ActionIcon>
-              </Tooltip>
-            </Group>
+    <Box p="md" style={{ height: "100%", display: "flex", flexDirection: "column" }}>
+      {component && (
+        <>
+          <Group justify="space-between" wrap="nowrap" mb="md">
+            <Title order={2} size="h6" fw={500}>{t(title)}</Title>
+            <Tooltip label={t("Close")} withArrow>
+              <ActionIcon
+                variant="subtle"
+                color="gray"
+                onClick={closeAside}
+                aria-label={t("Close")}
+              >
+                <IconX size={18} />
+              </ActionIcon>
+            </Tooltip>
+          </Group>
+
+          {tab === "comments" ? (
+            component
+          ) : (
            <ScrollArea
              style={{ height: "85vh" }}
              scrollbarSize={5}
@@ -72,8 +71,9 @@ export default function Aside() {
            >
              <div style={{ paddingBottom: "200px" }}>{component}</div>
            </ScrollArea>
-          </>
-        ))}
+          )}
+        </>
+      )}
    </Box>
  );
 }
--- a/apps/client/src/components/layouts/global/global-app-shell.tsx
+++ b/apps/client/src/components/layouts/global/global-app-shell.tsx
@@ -14,7 +14,6 @@ import { SpaceSidebar } from "@/features/space/components/sidebar/space-sidebar.
 import { AppHeader } from "@/components/layouts/global/app-header.tsx";
 import Aside from "@/components/layouts/global/aside.tsx";
 import AiChatWindow from "@/features/ai-chat/components/ai-chat-window.tsx";
-import GitmostGlobalBridge from "@/features/editor/gitmost/gitmost-global-bridge.tsx";
 import classes from "./app-shell.module.css";
 import { useToggleSidebar } from "@/components/layouts/global/hooks/hooks/use-toggle-sidebar.ts";
 import GlobalSidebar from "@/components/layouts/global/global-sidebar.tsx";
@@ -95,12 +94,12 @@ export default function GlobalAppShell({
      }}
      aside={
        isPageRoute && {
-          width: 420,
+          width: 350,
          breakpoint: "sm",
          collapsed: { mobile: !isAsideOpen, desktop: !isAsideOpen },
        }
      }
-      padding={{ base: "xs", sm: "md" }}
+      padding="md"
    >
      <AppShell.Header px="md" className={classes.header}>
        <AppHeader />
@@ -139,7 +138,7 @@ export default function GlobalAppShell({
          id={ASIDE_PANEL_ID}
          tabIndex={-1}
          className={classes.aside}
-          p="sm"
+          p="md"
          withBorder={false}
          aria-label={
            asideTab === "comments"
@@ -158,10 +157,6 @@ export default function GlobalAppShell({
    {/* Floating AI chat window. Mounted once globally; it is position: fixed
        and self-hides when closed, so its place in the tree is not critical. */}
    <AiChatWindow />
-      {/* Global gitmost native bridge: registers listSpaces / listPages /
-          createPageWithRecording on window.gitmost so the native host can
-          create a page with a recording even when no page editor is open. */}
-      <GitmostGlobalBridge />
    </>
  );
 }
--- a/apps/client/src/components/layouts/global/top-menu.tsx
+++ b/apps/client/src/components/layouts/global/top-menu.tsx
@@ -20,29 +20,18 @@ import {
 } from "@tabler/icons-react";
 import { useAtom } from "jotai";
 import { currentUserAtom } from "@/features/user/atoms/current-user-atom.ts";
-import { Link, useMatch } from "react-router-dom";
+import { Link } from "react-router-dom";
 import APP_ROUTE from "@/lib/app-route.ts";
 import useAuth from "@/features/auth/hooks/use-auth.ts";
 import { CustomAvatar } from "@/components/ui/custom-avatar.tsx";
 import { useTranslation } from "react-i18next";
 import { AvatarIconType } from "@/features/attachments/types/attachment.types.ts";
-import { useDisclosure } from "@mantine/hooks";
-import SpaceSettingsModal from "@/features/space/components/settings-modal.tsx";

 export default function TopMenu() {
  const { t } = useTranslation();
  const [currentUser] = useAtom(currentUserAtom);
  const { logout } = useAuth();
  const { colorScheme, setColorScheme } = useMantineColorScheme();
-  // Detect the currently viewed space so the "Space settings" item is only
-  // offered while the user is inside a space. The "/*" splat also matches the
-  // bare "/s/:spaceSlug" route (the splat matches an empty segment).
-  const spaceMatch = useMatch("/s/:spaceSlug/*");
-  const spaceSlug = spaceMatch?.params?.spaceSlug;
-  const [
-    spaceSettingsOpened,
-    { open: openSpaceSettings, close: closeSpaceSettings },
-  ] = useDisclosure(false);

  const user = currentUser?.user;
  const workspace = currentUser?.workspace;
@@ -52,143 +41,124 @@ export default function TopMenu() {
  }

  return (
-    <>
-      <Menu width={250} position="bottom-end" withArrow shadow={"lg"}>
-        <Menu.Target>
-          <UnstyledButton>
-            <Group gap={7} wrap={"nowrap"}>
-              <CustomAvatar
-                avatarUrl={workspace?.logo}
-                name={workspace?.name}
-                variant="filled"
-                size="sm"
-                type={AvatarIconType.WORKSPACE_ICON}
-              />
-              <Text fw={500} size="sm" lh={1} mr={3} lineClamp={1}>
-                {workspace?.name}
+    <Menu width={250} position="bottom-end" withArrow shadow={"lg"}>
+      <Menu.Target>
+        <UnstyledButton>
+          <Group gap={7} wrap={"nowrap"}>
+            <CustomAvatar
+              avatarUrl={workspace?.logo}
+              name={workspace?.name}
+              variant="filled"
+              size="sm"
+              type={AvatarIconType.WORKSPACE_ICON}
+            />
+            <Text fw={500} size="sm" lh={1} mr={3} lineClamp={1}>
+              {workspace?.name}
+            </Text>
+            <IconChevronDown size={16} />
+          </Group>
+        </UnstyledButton>
+      </Menu.Target>
+      <Menu.Dropdown>
+        <Menu.Label>{t("Workspace")}</Menu.Label>
+
+        <Menu.Item
+          component={Link}
+          to={APP_ROUTE.SETTINGS.WORKSPACE.GENERAL}
+          leftSection={<IconSettings size={16} />}
+        >
+          {t("Workspace settings")}
+        </Menu.Item>
+
+        <Menu.Item
+          component={Link}
+          to={APP_ROUTE.SETTINGS.WORKSPACE.MEMBERS}
+          leftSection={<IconUsers size={16} />}
+        >
+          {t("Manage members")}
+        </Menu.Item>
+
+        <Menu.Divider />
+
+        <Menu.Label>{t("Account")}</Menu.Label>
+        <Menu.Item component={Link} to={APP_ROUTE.SETTINGS.ACCOUNT.PROFILE}>
+          <Group wrap={"nowrap"}>
+            <CustomAvatar
+              size={"sm"}
+              avatarUrl={user.avatarUrl}
+              name={user.name}
+            />
+
+            <div style={{ width: 190 }}>
+              <Text size="sm" fw={500} lineClamp={1}>
+                {user.name}
              </Text>
-              <IconChevronDown size={16} />
-            </Group>
-          </UnstyledButton>
-        </Menu.Target>
-        <Menu.Dropdown>
-          <Menu.Label>{t("Workspace")}</Menu.Label>
+              <Text size="xs" c="dimmed" truncate="end">
+                {user.email}
+              </Text>
+            </div>
+          </Group>
+        </Menu.Item>
+        <Menu.Item
+          component={Link}
+          to={APP_ROUTE.SETTINGS.ACCOUNT.PROFILE}
+          leftSection={<IconUserCircle size={16} />}
+        >
+          {t("My profile")}
+        </Menu.Item>

-          <Menu.Item
-            component={Link}
-            to={APP_ROUTE.SETTINGS.WORKSPACE.GENERAL}
-            leftSection={<IconSettings size={16} />}
-          >
-            {t("Workspace settings")}
-          </Menu.Item>
+        <Menu.Item
+          component={Link}
+          to={APP_ROUTE.SETTINGS.ACCOUNT.PREFERENCES}
+          leftSection={<IconBrush size={16} />}
+        >
+          {t("My preferences")}
+        </Menu.Item>

-          {spaceSlug && (
+        <Menu.Sub>
+          <Menu.Sub.Target>
+            <Menu.Sub.Item leftSection={<IconBrightnessFilled size={16} />}>
+              {t("Theme")}
+            </Menu.Sub.Item>
+          </Menu.Sub.Target>
+
+          <Menu.Sub.Dropdown>
            <Menu.Item
-              onClick={openSpaceSettings}
-              leftSection={<IconSettings size={16} />}
+              onClick={() => setColorScheme("light")}
+              leftSection={<IconSun size={16} />}
+              rightSection={
+                colorScheme === "light" ? <IconCheck size={16} /> : null
+              }
            >
-              {t("Space settings")}
+              {t("Light")}
            </Menu.Item>
-          )}
+            <Menu.Item
+              onClick={() => setColorScheme("dark")}
+              leftSection={<IconMoon size={16} />}
+              rightSection={
+                colorScheme === "dark" ? <IconCheck size={16} /> : null
+              }
+            >
+              {t("Dark")}
+            </Menu.Item>
+            <Menu.Item
+              onClick={() => setColorScheme("auto")}
+              leftSection={<IconDeviceDesktop size={16} />}
+              rightSection={
+                colorScheme === "auto" ? <IconCheck size={16} /> : null
+              }
+            >
+              {t("System settings")}
+            </Menu.Item>
+          </Menu.Sub.Dropdown>
+        </Menu.Sub>

-          <Menu.Item
-            component={Link}
-            to={APP_ROUTE.SETTINGS.WORKSPACE.MEMBERS}
-            leftSection={<IconUsers size={16} />}
-          >
-            {t("Manage members")}
-          </Menu.Item>
+        <Menu.Divider />

-          <Menu.Divider />
-
-          <Menu.Label>{t("Account")}</Menu.Label>
-          <Menu.Item component={Link} to={APP_ROUTE.SETTINGS.ACCOUNT.PROFILE}>
-            <Group wrap={"nowrap"}>
-              <CustomAvatar
-                size={"sm"}
-                avatarUrl={user.avatarUrl}
-                name={user.name}
-              />
-
-              <div style={{ width: 190 }}>
-                <Text size="sm" fw={500} lineClamp={1}>
-                  {user.name}
-                </Text>
-                <Text size="xs" c="dimmed" truncate="end">
-                  {user.email}
-                </Text>
-              </div>
-            </Group>
-          </Menu.Item>
-          <Menu.Item
-            component={Link}
-            to={APP_ROUTE.SETTINGS.ACCOUNT.PROFILE}
-            leftSection={<IconUserCircle size={16} />}
-          >
-            {t("My profile")}
-          </Menu.Item>
-
-          <Menu.Item
-            component={Link}
-            to={APP_ROUTE.SETTINGS.ACCOUNT.PREFERENCES}
-            leftSection={<IconBrush size={16} />}
-          >
-            {t("My preferences")}
-          </Menu.Item>
-
-          <Menu.Sub>
-            <Menu.Sub.Target>
-              <Menu.Sub.Item leftSection={<IconBrightnessFilled size={16} />}>
-                {t("Theme")}
-              </Menu.Sub.Item>
-            </Menu.Sub.Target>
-
-            <Menu.Sub.Dropdown>
-              <Menu.Item
-                onClick={() => setColorScheme("light")}
-                leftSection={<IconSun size={16} />}
-                rightSection={
-                  colorScheme === "light" ? <IconCheck size={16} /> : null
-                }
-              >
-                {t("Light")}
-              </Menu.Item>
-              <Menu.Item
-                onClick={() => setColorScheme("dark")}
-                leftSection={<IconMoon size={16} />}
-                rightSection={
-                  colorScheme === "dark" ? <IconCheck size={16} /> : null
-                }
-              >
-                {t("Dark")}
-              </Menu.Item>
-              <Menu.Item
-                onClick={() => setColorScheme("auto")}
-                leftSection={<IconDeviceDesktop size={16} />}
-                rightSection={
-                  colorScheme === "auto" ? <IconCheck size={16} /> : null
-                }
-              >
-                {t("System settings")}
-              </Menu.Item>
-            </Menu.Sub.Dropdown>
-          </Menu.Sub>
-
-          <Menu.Divider />
-
-          <Menu.Item onClick={logout} leftSection={<IconLogout size={16} />}>
-            {t("Logout")}
-          </Menu.Item>
-        </Menu.Dropdown>
-      </Menu>
-
-      {spaceSlug && (
-        <SpaceSettingsModal
-          spaceId={spaceSlug}
-          opened={spaceSettingsOpened}
-          onClose={closeSpaceSettings}
-        />
-      )}
-    </>
+        <Menu.Item onClick={logout} leftSection={<IconLogout size={16} />}>
+          {t("Logout")}
+        </Menu.Item>
+      </Menu.Dropdown>
+    </Menu>
  );
 }
--- a/apps/client/src/components/settings/settings-sidebar.tsx
+++ b/apps/client/src/components/settings/settings-sidebar.tsx
@@ -20,6 +20,7 @@ import {
  prefetchSpaces,
  prefetchWorkspaceMembers,
 } from "@/components/settings/settings-queries.tsx";
+import AppVersion from "@/components/settings/app-version.tsx";
 import { mobileSidebarAtom } from "@/components/layouts/global/hooks/atoms/sidebar-atom.ts";
 import { useToggleSidebar } from "@/components/layouts/global/hooks/hooks/use-toggle-sidebar.ts";
 import { useSettingsNavigation } from "@/hooks/use-settings-navigation";
@@ -140,6 +141,8 @@ export default function SettingsSidebar() {
      </Group>

      <ScrollArea w="100%">{menuItems}</ScrollArea>
+
+      <AppVersion />
    </div>
  );
 }
--- a/apps/client/src/components/ui/ai-agent-badge.test.tsx
+++ b/apps/client/src/components/ui/ai-agent-badge.test.tsx
@@ -1,96 +0,0 @@
-import { describe, it, expect, vi } from "vitest";
-import { render, screen, fireEvent } from "@testing-library/react";
-import { MantineProvider } from "@mantine/core";
-import { Provider, createStore } from "jotai";
-import { AiAgentBadge } from "./ai-agent-badge";
-import {
-  activeAiChatIdAtom,
-  aiChatWindowOpenAtom,
-  aiChatDraftAtom,
-} from "@/features/ai-chat/atoms/ai-chat-atom.ts";
-
-// matchMedia (read by MantineProvider) is stubbed globally in vitest.setup.ts.
-
-function renderBadge(props: { authorName?: string; aiChatId?: string | null }) {
-  return render(
-    <MantineProvider>
-      <AiAgentBadge {...props} />
-    </MantineProvider>,
-  );
-}
-
-// Render a clickable badge inside an explicit jotai store, with a leftover draft
-// and an onActivate + parent-click spy, so the deep-link side effects are
-// assertable. Returns the store and spies.
-function setupClickable() {
-  const store = createStore();
-  store.set(aiChatDraftAtom, "leftover draft from another chat");
-  const onActivate = vi.fn();
-  const onParentClick = vi.fn();
-  render(
-    <Provider store={store}>
-      <MantineProvider>
-        <div onClick={onParentClick}>
-          <AiAgentBadge authorName="Bot" aiChatId="chat-1" onActivate={onActivate} />
-        </div>
-      </MantineProvider>
-    </Provider>,
-  );
-  return { store, onActivate, onParentClick, badge: screen.getByRole("button") };
-}
-
-function expectDeepLinked(store: ReturnType<typeof createStore>, onActivate: ReturnType<typeof vi.fn>) {
-  expect(store.get(activeAiChatIdAtom)).toBe("chat-1");
-  expect(store.get(aiChatWindowOpenAtom)).toBe(true);
-  expect(store.get(aiChatDraftAtom)).toBe(""); // draft cleared
-  expect(onActivate).toHaveBeenCalledTimes(1); // caller closes its own modal etc.
-}
-
-describe("AiAgentBadge", () => {
-  it("renders the AI-agent label", () => {
-    renderBadge({ authorName: "Bot" });
-    expect(screen.getByText("AI-agent")).toBeDefined();
-  });
-
-  it("is clickable (accessible button) when aiChatId is present", () => {
-    renderBadge({ authorName: "Bot", aiChatId: "chat-1" });
-    const badge = screen.getByRole("button");
-    expect(badge).toBeDefined();
-    expect(badge.textContent).toContain("AI-agent");
-  });
-
-  it("click deep-links: sets active chat, clears draft, opens window, fires onActivate, stops propagation", () => {
-    const { store, onActivate, onParentClick, badge } = setupClickable();
-    fireEvent.click(badge);
-    expectDeepLinked(store, onActivate);
-    expect(onParentClick).not.toHaveBeenCalled(); // stopPropagation contained the click
-  });
-
-  it.each(["Enter", " "])(
-    "keyboard %j activates the deep-link (same side effects as click)",
-    (key) => {
-      const { store, onActivate, badge } = setupClickable();
-      fireEvent.keyDown(badge, { key });
-      expectDeepLinked(store, onActivate);
-    },
-  );
-
-  it("an unrelated key does NOT activate the badge", () => {
-    const { store, onActivate, badge } = setupClickable();
-    fireEvent.keyDown(badge, { key: "Tab" });
-    expect(store.get(activeAiChatIdAtom)).toBeNull();
-    expect(store.get(aiChatWindowOpenAtom)).toBe(false);
-    expect(store.get(aiChatDraftAtom)).toBe("leftover draft from another chat");
-    expect(onActivate).not.toHaveBeenCalled();
-  });
-
-  it.each([{ aiChatId: null }, {}])(
-    "is a plain non-clickable label without a chat target (%o)",
-    (props) => {
-      renderBadge({ authorName: "Bot", ...props });
-      expect(screen.getByText("AI-agent")).toBeDefined();
-      // No interactive role is exposed when there is no chat to deep-link into.
-      expect(screen.queryByRole("button")).toBeNull();
-    },
-  );
-});
--- a/apps/client/src/components/ui/ai-agent-badge.tsx
+++ b/apps/client/src/components/ui/ai-agent-badge.tsx
@@ -1,99 +0,0 @@
-import { Badge, Tooltip } from "@mantine/core";
-import { IconSparkles } from "@tabler/icons-react";
-import { useCallback } from "react";
-import { useTranslation } from "react-i18next";
-import { useSetAtom } from "jotai";
-import {
-  activeAiChatIdAtom,
-  aiChatWindowOpenAtom,
-  aiChatDraftAtom,
-} from "@/features/ai-chat/atoms/ai-chat-atom.ts";
-
-interface AiAgentBadgeProps {
-  authorName?: string;
-  aiChatId?: string | null;
-  // Fired after the badge deep-links into its chat. The caller handles its own
-  // context (e.g. the page-history row closes the history modal) so this generic
-  // ui/ primitive stays free of cross-feature coupling (#143 review Arch B).
-  onActivate?: () => void;
-}
-
-/**
- * Badge marking content written by the AI agent (provenance C3 / §7.4). It is
- * ADDITIVE — shown next to the human author, never replacing them. Reused by the
- * page-history list and the comments sidebar.
- *
- * When the item carries an `aiChatId` (an internal AI-chat edit), clicking the
- * badge deep-links into that chat: it sets the active-chat atom and opens the
- * floating AI-chat window, then invokes `onActivate` so the caller can react
- * (e.g. the history modal closes itself). When `aiChatId` is null/absent (an
- * external MCP write with no internal ai_chats row), the badge is a plain
- * non-clickable label. The click is contained (stopPropagation) so it does not
- * also trigger an enclosing row's click handler.
- */
-export function AiAgentBadge({
-  authorName,
-  aiChatId,
-  onActivate,
-}: AiAgentBadgeProps) {
-  const { t } = useTranslation();
-  const setAiChatWindowOpen = useSetAtom(aiChatWindowOpenAtom);
-  const setActiveChatId = useSetAtom(activeAiChatIdAtom);
-  const setDraft = useSetAtom(aiChatDraftAtom);
-
-  const tooltip = t("Edited by AI agent on behalf of {{name}}", {
-    name: authorName ?? "",
-  });
-
-  const openChat = useCallback(
-    (event: React.SyntheticEvent) => {
-      event.stopPropagation();
-      if (!aiChatId) return;
-      setActiveChatId(aiChatId);
-      // Switching to another chat must start with a clean composer — clear any
-      // unsent draft so it does not leak from the previously open chat.
-      setDraft("");
-      setAiChatWindowOpen(true);
-      onActivate?.();
-    },
-    [aiChatId, setActiveChatId, setDraft, setAiChatWindowOpen, onActivate],
-  );
-
-  const badge = (
-    <Badge
-      size="sm"
-      variant="light"
-      color="violet"
-      radius="sm"
-      leftSection={<IconSparkles size={12} stroke={2} />}
-      style={aiChatId ? { cursor: "pointer" } : undefined}
-      {...(aiChatId
-        ? {
-            // Keep the default Badge root element (not a <button>) to avoid an
-            // invalid <button>-in-<button> nesting inside a row's
-            // UnstyledButton; expose it as an accessible button via
-            // role/keyboard.
-            role: "button",
-            tabIndex: 0,
-            onClick: openChat,
-            onKeyDown: (event: React.KeyboardEvent) => {
-              if (event.key === "Enter" || event.key === " ") {
-                event.preventDefault();
-                openChat(event);
-              }
-            },
-          }
-        : {})}
-    >
-      {t("AI-agent")}
-    </Badge>
-  );
-
-  return (
-    <Tooltip label={tooltip} withArrow>
-      {badge}
-    </Tooltip>
-  );
-}
-
-export default AiAgentBadge;
--- a/apps/client/src/components/ui/brand-logo.tsx
+++ b/apps/client/src/components/ui/brand-logo.tsx
@@ -27,7 +27,6 @@ export function BrandLogo({
      src={src}
      alt="Gitmost"
      className={className}
-      draggable={false}
      style={{ height, width: "auto", display: "block", userSelect: "none" }}
    />
  );
--- a/apps/client/src/features/ai-chat/atoms/ai-chat-atom.ts
+++ b/apps/client/src/features/ai-chat/atoms/ai-chat-atom.ts
@@ -1,22 +1,4 @@
 import { atom } from "jotai";
-import { atomWithStorage } from "jotai/utils";
-
-/**
- * Persisted floating AI chat window geometry (position + size). Held in
- * localStorage so a drag/resize survives a full page reload. `null` means
- * "never placed yet" — the window then computes an initial top-right placement.
- * On restore the value is clamped to the current viewport (see AiChatWindow).
- */
-export type AiChatWindowGeom = {
-  left: number;
-  top: number;
-  width: number;
-  height: number;
-};
-export const aiChatWindowGeomAtom = atomWithStorage<AiChatWindowGeom | null>(
-  "ai-chat-window-geom",
-  null,
-);

 /**
 * The currently selected chat id. `null` means a fresh (not-yet-created) chat:
--- a/apps/client/src/features/ai-chat/components/ai-chat-window.tsx
+++ b/apps/client/src/features/ai-chat/components/ai-chat-window.tsx
@@ -6,7 +6,7 @@ import {
  useRef,
  useState,
 } from "react";
-import { Group, Loader, Tooltip } from "@mantine/core";
+import { Group, Loader, Select, Tooltip } from "@mantine/core";
 import {
  IconArrowsDiagonal,
  IconCheck,
@@ -24,7 +24,6 @@ import { useQueryClient } from "@tanstack/react-query";
 import {
  activeAiChatIdAtom,
  aiChatWindowOpenAtom,
-  aiChatWindowGeomAtom,
  aiChatDraftAtom,
  selectedAiRoleIdAtom,
 } from "@/features/ai-chat/atoms/ai-chat-atom.ts";
@@ -32,15 +31,13 @@ import { usePageQuery } from "@/features/page/queries/page-query.ts";
 import { extractPageSlugId } from "@/lib";
 import {
  AI_CHATS_RQ_KEY,
-  AI_CHAT_MESSAGES_RQ_KEY,
  useAiChatMessagesQuery,
  useAiChatsQuery,
  useAiRolesQuery,
 } from "@/features/ai-chat/queries/ai-chat-query.ts";
 import ConversationList from "@/features/ai-chat/components/conversation-list.tsx";
 import ChatThread from "@/features/ai-chat/components/chat-thread.tsx";
-import { exportAiChat } from "@/features/ai-chat/services/ai-chat-service.ts";
-import { useChatSession } from "@/features/ai-chat/hooks/use-chat-session.ts";
+import { buildChatMarkdown } from "@/features/ai-chat/utils/chat-markdown.ts";
 import {
  shouldCollapseOnOutsidePointer,
  isHeaderClick,
@@ -79,31 +76,17 @@ function computeInitialGeom() {
    Math.min(DEFAULT_HEIGHT, window.innerHeight - 2 * EDGE_MARGIN),
  );
  const left = Math.max(EDGE_MARGIN, window.innerWidth - width - 24);
-  const maxTop = Math.max(
-    EDGE_MARGIN,
-    window.innerHeight - height - EDGE_MARGIN,
-  );
+  const maxTop = Math.max(EDGE_MARGIN, window.innerHeight - height - EDGE_MARGIN);
  const top = Math.min(60, maxTop);
  return { left, top, width, height };
 }

 // Clamp a geometry so the window stays within the current viewport.
-function clampGeom(g: {
-  left: number;
-  top: number;
-  width: number;
-  height: number;
-}) {
+function clampGeom(g: { left: number; top: number; width: number; height: number }) {
  const effWidth = Math.max(g.width, MIN_WIDTH);
  const effHeight = Math.max(g.height, MIN_HEIGHT);
-  const maxLeft = Math.max(
-    EDGE_MARGIN,
-    window.innerWidth - effWidth - EDGE_MARGIN,
-  );
-  const maxTop = Math.max(
-    EDGE_MARGIN,
-    window.innerHeight - effHeight - EDGE_MARGIN,
-  );
+  const maxLeft = Math.max(EDGE_MARGIN, window.innerWidth - effWidth - EDGE_MARGIN);
+  const maxTop = Math.max(EDGE_MARGIN, window.innerHeight - effHeight - EDGE_MARGIN);
  return {
    ...g,
    left: Math.min(Math.max(EDGE_MARGIN, g.left), maxLeft),
@@ -114,13 +97,12 @@ function clampGeom(g: {
 /**
 * Floating, draggable, resizable, minimizable AI chat window. Replaces the
 * former right-aside `AiChatPanel`: it owns ALL chat orchestration (active
- * chat, new chat, in-place id adoption from streamed metadata, open-page
- * context, token sum) and wraps the
+ * chat, new chat, adopt-new-chat, open-page context, token sum) and wraps the
 * reused inner components (ConversationList + ChatThread) in window chrome
 * ported from the GitmostAgent.jsx design.
 */
 export default function AiChatWindow() {
-  const { t, i18n } = useTranslation();
+  const { t } = useTranslation();
  const clipboard = useClipboard({ timeout: 500 });
  const queryClient = useQueryClient();
  const [windowOpen, setWindowOpen] = useAtom(aiChatWindowOpenAtom);
@@ -138,13 +120,19 @@ export default function AiChatWindow() {
  minimizedRef.current = minimized;

  const winRef = useRef<HTMLDivElement>(null);
-  // Live window geometry (position + size); persisted to localStorage so a
-  // drag/resize survives a full page reload (and close/reopen). `null` means
-  // "never placed yet" — the layout effect below then computes an initial
-  // top-right placement anchored to the current viewport, and on restore it is
-  // re-clamped to the viewport (so a placement saved on a larger screen is not
-  // left partly off-screen).
-  const [geom, setGeom] = useAtom(aiChatWindowGeomAtom);
+  // Live window geometry (position + size); initialized lazily on first open so
+  // it is anchored to the current viewport (top-right corner). Kept in state so
+  // a user resize survives close/reopen and can be re-clamped to the viewport.
+  const [geom, setGeom] = useState<{
+    left: number;
+    top: number;
+    width: number;
+    height: number;
+  } | null>(null);
+
+  // Track whether we are awaiting the id of a just-created (new) chat, so we
+  // can adopt it once the chat list refreshes after the first turn finishes.
+  const adoptNewChat = useRef(false);

  const { data: chats } = useAiChatsQuery();
  // Roles for the new-chat picker (any member may list them). Only fetched while
@@ -157,16 +145,9 @@ export default function AiChatWindow() {
    () => (roles ?? []).filter((r) => r.enabled === true),
    [roles],
  );
-
  const { data: messageRows, isLoading: messagesLoading } =
    useAiChatMessagesQuery(activeChatId ?? undefined);

-  // Live turn-token total (reasoning + output) for the in-flight turn, pushed up
-  // (THROTTLED to ~8 Hz inside ChatThread) so the header badge ticks mid-stream.
-  // `null` means no turn is in flight -> the badge falls back to the persisted
-  // context size below.
-  const [liveTurnTokens, setLiveTurnTokens] = useState<number | null>(null);
-
  // The page the user is currently viewing. AiChatWindow lives in a pathless
  // parent layout route, so useParams() can't see :pageSlug. Match the full
  // pathname against the authenticated page route instead so "the current page"
@@ -184,101 +165,71 @@ export default function AiChatWindow() {
    ? { id: openPageData.id, title: openPageData.title }
    : null;

-  // The AI-chat thread-identity lifecycle (mount key, both new-chat id adoption
-  // paths, the history-loaded latch, the render-phase reconciler) lives in this
-  // hook. See adopt-chat-id.ts for the canonical #137 two-tab race explanation.
-  // The invalidate closures are passed inline: `onTurnFinished` is read live by
-  // useChat's onFinish (never in an effect dep array), so their identity does not
-  // matter — no memoization ceremony needed.
-  const {
-    threadKey,
-    waitingForHistory,
-    onTurnFinished,
-    onServerChatId,
-    cancelPendingAdoption,
-  } = useChatSession({
-    activeChatId,
-    setActiveChatId,
-    chats,
-    messagesLoading,
-    onInvalidateChatList: () =>
-      queryClient.invalidateQueries({ queryKey: AI_CHATS_RQ_KEY }),
-    onInvalidateChatMessages: (id) =>
-      queryClient.invalidateQueries({ queryKey: AI_CHAT_MESSAGES_RQ_KEY(id) }),
-  });
-
-  // startNewChat/selectChat set the public atom; the hook's render-phase
-  // reconciler handles the remount when activeChatId actually CHANGES. But
-  // pressing "New chat" while already in a new chat leaves activeChatId === null
-  // (a no-op for the atom), so the reconciler never fires — explicitly disarm any
-  // armed error-path fallback here so a late refetch can't yank the user into a
-  // just-failed chat after they chose a fresh one.
  const startNewChat = useCallback((): void => {
-    cancelPendingAdoption();
    setActiveChatId(null);
    setHistoryOpen(false);
    setDraft("");
    // Default the picker back to "Universal assistant" for the fresh chat.
    setSelectedRoleId(null);
-  }, [cancelPendingAdoption, setActiveChatId, setDraft, setSelectedRoleId]);
+  }, [setActiveChatId, setDraft, setSelectedRoleId]);

  const selectChat = useCallback(
    (chatId: string): void => {
-      cancelPendingAdoption();
      setActiveChatId(chatId);
      setHistoryOpen(false);
      setDraft("");
-      // Reset the card-picked role so a stale pick can't leak into the existing
-      // chat's header/assistant-name (which prefers the chat's persisted role).
-      setSelectedRoleId(null);
    },
-    [cancelPendingAdoption, setActiveChatId, setDraft, setSelectedRoleId],
+    [setActiveChatId, setDraft],
  );

-  // The active chat object (for its title) and an export gate. The export is now
-  // SERVER-sourced (the DB is the single source of truth — #183): the assistant
-  // row is persisted upfront + per step, so even a brand-new chat whose first
-  // turn is streaming/interrupted has a server row to render. Enable the button
-  // whenever a persisted chat is active (`activeChatId` is set). For a BRAND-NEW
-  // chat that id is adopted EARLY — at the stream's `start` chunk via
-  // onServerChatId (#174) — so the Copy button is available during the first
-  // turn's stream, not only after it terminates.
+  // After a turn finishes, refresh the chat list. For a brand-new chat (no id
+  // yet), the server has just created the row; adopt the newest chat id so the
+  // thread switches from "new" to the persisted chat (and loads its history on
+  // later opens).
+  const onTurnFinished = useCallback(() => {
+    if (activeChatId === null) adoptNewChat.current = true;
+    queryClient.invalidateQueries({ queryKey: AI_CHATS_RQ_KEY });
+  }, [activeChatId, queryClient]);
+
+  // The active chat object (for its title) and an export gate: only enable the
+  // export button when an existing chat with loaded persisted rows is active.
  const activeChat = useMemo(
    () => chats?.items?.find((c) => c.id === activeChatId) ?? null,
    [chats, activeChatId],
  );
-  const canExport = !!activeChatId;
+  const canExport = !!activeChatId && !!messageRows && messageRows.length > 0;

-  // The role to display in the header and as the assistant's name. Prefer the
-  // persisted role of an existing chat (chat-list JOIN); fall back to the role
-  // picked via a card click for a brand-new or just-adopted chat. selectChat
-  // resets selectedRoleId, so this fallback never leaks into an unrelated chat.
-  const currentRole = useMemo<{
-    name: string;
-    emoji: string | null;
-  } | null>(() => {
-    if (activeChat?.roleName) {
-      return { name: activeChat.roleName, emoji: activeChat.roleEmoji ?? null };
-    }
-    const picked = enabledRoles.find((r) => r.id === selectedRoleId);
-    return picked ? { name: picked.name, emoji: picked.emoji } : null;
-  }, [activeChat, enabledRoles, selectedRoleId]);
+  // Build a Markdown export from the already-loaded persisted rows (no network
+  // call) and copy it to the clipboard. The "Copied" notification is the
+  // feedback.
+  const handleCopy = useCallback(() => {
+    if (!activeChatId || !messageRows || messageRows.length === 0) return;
+    const markdown = buildChatMarkdown({
+      title: activeChat?.title ?? null,
+      chatId: activeChatId,
+      rows: messageRows,
+      t,
+    });
+    clipboard.copy(markdown);
+    notifications.show({ message: t("Copied") });
+  }, [activeChatId, messageRows, activeChat, clipboard, t]);

-  // Fetch the server-rendered Markdown export and copy it to the clipboard. The
-  // server is the single source of truth (#183): it renders the transcript from
-  // the persisted rows — including an interrupted turn's in-progress row — so the
-  // export is identical whether the chat is freshly streaming, just switched to,
-  // or reloaded. The `lang` of the active i18n drives the few localized labels.
-  const handleCopy = useCallback(async () => {
-    if (!activeChatId) return;
-    try {
-      const markdown = await exportAiChat(activeChatId, i18n.language);
-      clipboard.copy(markdown);
-      notifications.show({ message: t("Copied") });
-    } catch {
-      notifications.show({ message: t("Failed to export chat"), color: "red" });
+  // When awaiting a new chat's id, adopt the most-recent chat (the list is
+  // ordered newest-first) once it appears.
+  useEffect(() => {
+    if (!adoptNewChat.current) return;
+    const newest = chats?.items?.[0];
+    if (newest) {
+      adoptNewChat.current = false;
+      setActiveChatId(newest.id);
    }
-  }, [activeChatId, clipboard, t, i18n.language]);
+  }, [chats, setActiveChatId]);
+
+  // The thread is remounted when the active chat changes so initial messages
+  // re-seed. For a new chat we key on "new"; adopting the id remounts the
+  // thread with the persisted history loaded.
+  const threadKey = activeChatId ?? "new";
+  const waitingForHistory = activeChatId !== null && messagesLoading;

  // Current context size for the active chat: how much the conversation now
  // occupies in the model's context window — NOT the cumulative tokens spent.
@@ -345,23 +296,18 @@ export default function AiChatWindow() {
  useEffect(() => {
    if (!windowOpen || minimized) return;
    const el = winRef.current;
-    // `geom` is in the deps so this re-runs once geometry is settled and the
-    // window is actually rendered (on the first open `geom` is still null on the
-    // render that flips windowOpen, so winRef.current is null then — without the
-    // geom dep the observer would never attach and resizes would not persist).
    if (!el) return;
    const ro = new ResizeObserver(() => {
      const width = el.offsetWidth;
      const height = el.offsetHeight;
      setGeom((prev) => {
-        if (!prev || (prev.width === width && prev.height === height))
-          return prev;
+        if (!prev || (prev.width === width && prev.height === height)) return prev;
        return { ...prev, width, height };
      });
    });
    ro.observe(el);
    return () => ro.disconnect();
-  }, [windowOpen, minimized, geom !== null]);
+  }, [windowOpen, minimized]);

  const startDrag = useCallback((e: React.MouseEvent): void => {
    // Ignore drags that originate on a button (minimize/close/new chat).
@@ -484,34 +430,21 @@ export default function AiChatWindow() {
          {t("AI chat")}
        </span>

-        {/* Role badge (emoji + name). Shows the persisted role of an existing
-            chat, or the role picked via a card for a brand-new chat. Hidden for
-            a universal (no-role) chat. */}
-        {currentRole && (
+        {/* Role badge for the active chat (emoji + name). Shown only when the
+            chat is bound to a role that still exists. */}
+        {activeChat?.roleName && (
          <span className={classes.badge} title={t("Agent role")}>
-            {currentRole.emoji ? `${currentRole.emoji} ` : ""}
-            {currentRole.name}
+            {activeChat.roleEmoji ? `${activeChat.roleEmoji} ` : ""}
+            {activeChat.roleName}
          </span>
        )}

        <div style={{ flex: 1, display: "flex", justifyContent: "center" }}>
-          {/* While a turn streams, show the LIVE turn-token count (ticks ~8 Hz);
-              once it finishes, fall back to the persisted context size. Require
-              > 0 so the very first emit (an empty tail message, count 0) does not
-              flash a "0" badge before any token streams in (#151 review). */}
-          {liveTurnTokens !== null && liveTurnTokens > 0 ? (
-            <Tooltip label={t("Tokens generated this turn")} withArrow>
-              <span className={classes.badge}>
-                {formatTokens(liveTurnTokens)}
-              </span>
-            </Tooltip>
-          ) : contextTokens > 0 ? (
+          {contextTokens > 0 && (
            <Tooltip label={t("Current context size")} withArrow>
-              <span className={classes.badge}>
-                {formatTokens(contextTokens)}
-              </span>
+              <span className={classes.badge}>{formatTokens(contextTokens)}</span>
            </Tooltip>
-          ) : null}
+          )}
        </div>

        <div style={{ display: "flex", alignItems: "center", gap: 1 }}>
@@ -523,11 +456,7 @@ export default function AiChatWindow() {
              aria-label={t("Copy chat")}
              onClick={handleCopy}
            >
-              {clipboard.copied ? (
-                <IconCheck size={14} />
-              ) : (
-                <IconCopy size={14} />
-              )}
+              {clipboard.copied ? <IconCheck size={14} /> : <IconCopy size={14} />}
            </button>
          )}
          <button
@@ -608,10 +537,28 @@ export default function AiChatWindow() {
          )}
        </div>

-        {/* The role picker for a NEW chat is rendered as the chat's empty-state
-            (colored role cards centered in the empty window) by ChatThread
-            itself — clicking a card starts the chat with that role. Once the
-            chat exists, its role is fixed and shown as a header badge instead. */}
+        {/* Role picker — only for a NEW chat (before it is created). Once the
+            chat exists, its role is fixed and shown as a header badge instead.
+            Defaults to "Universal assistant" (no role). */}
+        {activeChatId === null && (enabledRoles?.length ?? 0) > 0 && (
+          <div style={{ padding: "4px 8px 0" }}>
+            <Select
+              size="xs"
+              label={t("Agent role")}
+              value={selectedRoleId ?? ""}
+              onChange={(value) => setSelectedRoleId(value || null)}
+              allowDeselect={false}
+              comboboxProps={{ withinPortal: true }}
+              data={[
+                { value: "", label: t("Universal assistant") },
+                ...enabledRoles.map((r) => ({
+                  value: r.id,
+                  label: `${r.emoji ? `${r.emoji} ` : ""}${r.name}`,
+                })),
+              ]}
+            />
+          </div>
+        )}

        {/* body: active chat thread */}
        <div className={classes.body}>
@@ -627,14 +574,7 @@ export default function AiChatWindow() {
              openPage={openPage}
              // Honoured only for a new chat; null = universal assistant.
              roleId={activeChatId === null ? selectedRoleId : null}
-              // Role cards are the new-chat empty-state; offered only when this
-              // is a brand-new chat. Clicking a card starts the chat with it.
-              roles={activeChatId === null ? enabledRoles : undefined}
-              onRolePicked={(role) => setSelectedRoleId(role.id)}
-              assistantName={currentRole?.name}
              onTurnFinished={onTurnFinished}
-              onServerChatId={onServerChatId}
-              onLiveTurnTokens={setLiveTurnTokens}
            />
          )}
        </div>
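
Review note on the ai-chat-window.tsx hunks above: the adopt-new-chat flow (the `adoptNewChat` ref armed in `onTurnFinished`, the effect that takes `chats.items[0]`, and `threadKey`) can be read as a small state machine. The sketch below restates it as pure functions so the transitions are easy to check. The names `AdoptionState`, `markTurnFinished`, `applyChatList`, and `threadKeyFor` are hypothetical and do not appear in the diff. It also assumes, as the diff's comment does, that the list is ordered newest-first.

```typescript
// Hypothetical pure restatement of the adopt-new-chat latch from the diff.
interface ChatSummary {
  id: string;
}

interface AdoptionState {
  activeChatId: string | null;
  // Mirrors the `adoptNewChat` ref: true while a new chat's id is awaited.
  awaitingNewChat: boolean;
}

// A finished turn arms the latch only when no persisted chat is active yet.
function markTurnFinished(state: AdoptionState): AdoptionState {
  return state.activeChatId === null
    ? { ...state, awaitingNewChat: true }
    : state;
}

// On a chat-list refresh, an armed latch adopts the first (newest) item.
// An empty list leaves the latch armed for the next refresh.
function applyChatList(
  state: AdoptionState,
  items: ChatSummary[],
): AdoptionState {
  if (!state.awaitingNewChat) return state;
  const newest = items[0];
  if (!newest) return state;
  return { activeChatId: newest.id, awaitingNewChat: false };
}

// The thread remounts whenever this key changes ("new" -> adopted id).
function threadKeyFor(activeChatId: string | null): string {
  return activeChatId ?? "new";
}
```

Because adoption takes whatever chat is first in the list, this design relies on no other chat being created between the turn finishing and the refetch. The removed comments refer to a two-tab race (#137) in this area, so that assumption may be worth confirming before merge.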
--- a/apps/client/src/features/ai-chat/components/ai-chat.module.css
+++ b/apps/client/src/features/ai-chat/components/ai-chat.module.css
@@ -55,45 +55,6 @@
    padding-inline-start: 1.4em;
 }

-/* GFM tables in assistant markdown. The chat lives in a NARROW side panel, so a
-   wide LLM table must scroll horizontally instead of collapsing its columns:
-   `.markdown` sets `word-break: break-word`, which (with the default table
-   layout) shrinks columns to a single glyph and wraps headers mid-word
-   ("Секция" -> "Секци / я"). Make the table a horizontally scrollable block,
-   give cells a readable minimum width, and restore word-boundary wrapping. */
-.markdown table {
-    display: block;
-    /* lets the table scroll horizontally on its own */
-    max-width: 100%;
-    overflow-x: auto;
-    border-collapse: collapse;
-    margin-block-end: 0.5em;
-}
-
-.markdown th,
-.markdown td {
-    border: 1px solid light-dark(var(--mantine-color-gray-3), var(--mantine-color-dark-4));
-    padding: 3px 8px;
-    /* readable floor; the block scrolls when the row exceeds the panel */
-    min-width: 6em;
-    text-align: left;
-    vertical-align: top;
-    /* cancel the inherited break-word so words don't split mid-glyph */
-    word-break: normal;
-    /* still wrap genuinely long words / URLs at the cell edge */
-    overflow-wrap: break-word;
-}
-
-.markdown th {
-    background: light-dark(var(--mantine-color-gray-1), var(--mantine-color-dark-5));
-    font-weight: 600;
-}
-
-/* GFM wraps cell text in <p>; drop its default block margin inside cells. */
-.markdown table p {
-    margin: 0;
-}
-
 /* Animated three-dot "typing" indicator shown while the agent is thinking but
   has not yet produced any visible text/tool parts. */
 .typingDots {
@@ -127,18 +88,16 @@
        opacity: 0.4;
    }
    40% {
-        /* Bounce height is driven by --bounce so reduced-motion can dampen it
-           (below) without disabling the animation outright. */
-        transform: translateY(var(--bounce, -6px));
+        transform: translateY(-3px);
        opacity: 1;
    }
 }

-/* Respect reduced-motion preferences: keep a smaller bounce instead of a full
-   stop, so the "thinking" indicator still reads as active rather than frozen. */
+/* Respect reduced-motion preferences: fall back to a static dimmed state. */
 @media (prefers-reduced-motion: reduce) {
    .typingDots span {
-        --bounce: -3px;
+        animation: none;
+        opacity: 0.6;
    }
 }

@@ -150,28 +109,6 @@
    background: light-dark(var(--mantine-color-gray-0), var(--mantine-color-dark-6));
 }

-/* Collapsible "Thinking" (reasoning) block: a subtle left rule, dimmer than the
-   answer so it reads as secondary thinking context above the real answer. */
-.reasoningBlock {
-    border-left: 2px solid light-dark(var(--mantine-color-gray-3), var(--mantine-color-dark-4));
-    padding-left: 8px;
-}
-
-.reasoningText {
-    margin-top: 4px;
-    font-size: var(--mantine-font-size-xs);
-    color: light-dark(var(--mantine-color-gray-7), var(--mantine-color-dark-1));
-    /* NOTE: `white-space: pre-wrap` is intentionally NOT set here. On the
-       rendered markdown <div> it would turn the newlines between block tags
-       (</li>\n<li>, </p>\n<ol>) into visible blank lines/indents on top of the
-       margins. The plain-text fallback <Text> that needs pre-wrap sets it
-       inline itself (see reasoning-block.tsx). */
-}
-
-.reasoningText p {
-    margin: 0 0 4px;
-}
-
 .inputWrapper {
    flex: 0 0 auto;
    padding-top: var(--mantine-spacing-xs);
@@ -189,29 +126,3 @@
 .conversationItemActive {
    background: var(--mantine-color-gray-light);
 }
-
-/* Pending messages queued by the user while a turn is still streaming. They
-   are sent automatically, FIFO, once the current turn finishes. */
-.queuedList {
-    padding-bottom: var(--mantine-spacing-xs);
-}
-
-.queuedItem {
-    background: var(--mantine-color-gray-light);
-    border-radius: var(--mantine-radius-sm);
-    padding: 4px 8px;
-}
-
-.queuedIcon {
-    flex: none;
-    color: var(--mantine-color-dimmed);
-}
-
-.queuedText {
-    flex: 1;
-    min-width: 0;
-    color: var(--mantine-color-dimmed);
-    white-space: pre-wrap;
-    overflow-wrap: break-word;
-    word-break: break-word;
-}
--- a/apps/client/src/features/ai-chat/components/chat-error-alert.tsx
+++ b/apps/client/src/features/ai-chat/components/chat-error-alert.tsx
@@ -1,49 +0,0 @@
-import { Alert, Group, Text, type AlertProps } from "@mantine/core";
-import { IconAlertTriangle } from "@tabler/icons-react";
-
-/**
- * A classified AI chat error banner: a warning icon + bold heading on the first
- * row, with the detail text spanning the full width below. Rendered for BOTH the
- * live stream error (ChatThread) and a persisted assistant error (MessageItem),
- * so this markup lives in one place. The detail is full-width (no hanging indent
- * under the heading) so it wraps less and leaves no stranded icon / empty gap.
- * The heading reuses Mantine's adaptive red "light" colour so it stays correct
- * in dark mode. Layout-only props (mb/mt/...) are forwarded to the Alert root.
- */
-interface ChatErrorAlertProps extends Omit<AlertProps, "title" | "children"> {
-  title: string;
-  detail: string;
-}
-
-export default function ChatErrorAlert({
-  title,
-  detail,
-  style,
-  ...alertProps
-}: ChatErrorAlertProps) {
-  // Mantine's own "light" alert colour, adaptive across light/dark schemes.
-  const accent = "var(--mantine-color-red-light-color)";
-  return (
-    // flexShrink: 0 keeps the banner fully visible. Mantine's Alert root is
-    // `overflow: hidden`, so as a flex child of the chat panel it can otherwise
-    // be compressed below its content height and clip the detail text; the
-    // scrollable message list absorbs the height pressure instead.
-    <Alert
-      {...alertProps}
-      variant="light"
-      color="red"
-      p="xs"
-      style={[{ flexShrink: 0 }, style]}
-    >
-      <Group gap={8} wrap="nowrap" align="center" mb={4}>
-        <IconAlertTriangle size={18} style={{ flex: "none", color: accent }} />
-        <Text fw={700} size="sm" lh={1.2} style={{ color: accent }}>
-          {title}
-        </Text>
-      </Group>
-      <Text size="sm" lh={1.4}>
-        {detail}
-      </Text>
-    </Alert>
-  );
-}
--- a/apps/client/src/features/ai-chat/components/chat-input.append.test.ts
+++ b/apps/client/src/features/ai-chat/components/chat-input.append.test.ts
@@ -0,0 +1,26 @@
+import { describe, it, expect } from "vitest";
+import { appendFinalToDraft } from "./chat-input";
+
+describe("appendFinalToDraft", () => {
+  it("an empty draft becomes the final verbatim", () => {
+    expect(appendFinalToDraft("", "hello")).toBe("hello");
+  });
+
+  it("a non-empty draft gets the final appended with exactly one space", () => {
+    expect(appendFinalToDraft("draft", "final")).toBe("draft final");
+  });
+
+  it("introduces no leading or double space for already-trimmed inputs", () => {
+    const out = appendFinalToDraft("draft", "final");
+    expect(out.startsWith(" ")).toBe(false);
+    expect(out).not.toContain("  ");
+  });
+
+  it("accumulates left-to-right across repeated calls", () => {
+    let draft = "";
+    draft = appendFinalToDraft(draft, "a");
+    draft = appendFinalToDraft(draft, "b");
+    draft = appendFinalToDraft(draft, "c");
+    expect(draft).toBe("a b c");
+  });
+});
--- a/apps/client/src/features/ai-chat/components/chat-input.tsx
+++ b/apps/client/src/features/ai-chat/components/chat-input.tsx
@@ -1,32 +1,44 @@
-import { KeyboardEvent } from "react";
-import { ActionIcon, Group, Textarea, Tooltip } from "@mantine/core";
+import { KeyboardEvent, useState } from "react";
+import {
+  ActionIcon,
+  Group,
+  Stack,
+  Text,
+  Textarea,
+  Tooltip,
+} from "@mantine/core";
 import { IconPlayerStopFilled, IconSend } from "@tabler/icons-react";
 import { useTranslation } from "react-i18next";
 import { useAtom, useAtomValue } from "jotai";
 import { aiChatDraftAtom } from "@/features/ai-chat/atoms/ai-chat-atom.ts";
 import { workspaceAtom } from "@/features/user/atoms/current-user-atom";
 import { MicButton } from "@/features/dictation/components/mic-button";
+import { RealtimeMicButton } from "@/features/dictation/components/realtime-mic-button";

 interface ChatInputProps {
  onSend: (text: string) => void;
-  /** Called instead of `onSend` while a turn is streaming: the text is queued
-   *  and sent automatically once the current turn finishes. */
-  onQueue: (text: string) => void;
  onStop: () => void;
  isStreaming: boolean;
  disabled?: boolean;
 }

 /**
- * Message composer. Enter submits, Shift+Enter inserts a newline. While the
- * agent is streaming, submitting QUEUES the message (via `onQueue`) instead of
- * dropping it — it is sent automatically once the current turn finishes; the
- * Stop button (calls `stop()`) is also shown. The textarea stays usable so the
- * user can draft / queue the next turn while the agent is busy.
+ * Merge a finalized dictation segment into the existing draft. Pure and
+ * unit-testable. An empty draft becomes the final verbatim; a non-empty draft
+ * gets the final appended after a single space separator (inputs are not
+ * trimmed). Repeated calls accumulate left-to-right ("a" then "b" -> "a b").
+ */
+export function appendFinalToDraft(draft: string, final: string): string {
+  return draft ? `${draft} ${final}` : final;
+}
+
+/**
+ * Message composer. Enter sends, Shift+Enter inserts a newline. While the agent
+ * is streaming, the send button becomes a Stop button (calls `stop()`); the
+ * textarea stays usable so the user can draft the next turn.
 */
 export default function ChatInput({
  onSend,
-  onQueue,
  onStop,
  isStreaming,
  disabled,
@@ -35,28 +47,29 @@ export default function ChatInput({
  const [value, setValue] = useAtom(aiChatDraftAtom);
  const workspace = useAtomValue(workspaceAtom);
  const isDictationEnabled = workspace?.settings?.ai?.dictation === true;
-  // Streaming (silence-cut) dictation is opt-in per workspace; absent/false
-  // keeps the stable batch path.
-  const streamingDictation =
-    workspace?.settings?.ai?.dictationStreaming === true;
+  const isRealtime = workspace?.settings?.ai?.dictationRealtime === true;
+  // Live interim (partial) transcript shown as a dimmed tail under the input.
+  const [interim, setInterim] = useState("");

-  const submit = (): void => {
+  const send = (): void => {
    const text = value.trim();
-    if (!text || disabled) return;
-    if (isStreaming) onQueue(text);
-    else onSend(text);
+    if (!text || isStreaming || disabled) return;
+    onSend(text);
    setValue("");
+    // Drop any leftover partial when a message is sent.
+    setInterim("");
  };

  const handleKeyDown = (e: KeyboardEvent<HTMLTextAreaElement>): void => {
    if (e.key === "Enter" && !e.shiftKey) {
      e.preventDefault();
-      submit();
+      send();
    }
  };

  return (
-    <Group gap="xs" align="flex-end" wrap="nowrap">
+    <Stack gap="xs">
+      <Group gap="xs" align="flex-end" wrap="nowrap">
      <Textarea
        style={{ flex: 1 }}
        placeholder={t("Ask the AI agent…")}
@@ -72,46 +85,42 @@ export default function ChatInput({
        // switch), so a fresh chat lands with the cursor ready in the field.
        autoFocus
      />
-      {isDictationEnabled && (
-        <MicButton
-          size="lg"
-          streaming={streamingDictation}
-          disabled={isStreaming || disabled}
-          onText={(text) => setValue((v) => (v ? `${v} ${text}` : text))}
-        />
-      )}
+      {isDictationEnabled &&
+        (isRealtime ? (
+          <RealtimeMicButton
+            size="lg"
+            disabled={isStreaming || disabled}
+            onInterim={(text) => setInterim(text)}
+            onFinal={(text) => {
+              setValue((v) => appendFinalToDraft(v, text));
+              setInterim("");
+            }}
+          />
+        ) : (
+          <MicButton
+            size="lg"
+            disabled={isStreaming || disabled}
+            onText={(text) => setValue((v) => appendFinalToDraft(v, text))}
+          />
+        ))}
      {isStreaming ? (
-        <Group gap="xs" wrap="nowrap">
-          {value.trim().length > 0 && (
-            <Tooltip label={t("Send when the agent finishes")} withArrow>
-              <ActionIcon
-                size="lg"
-                variant="filled"
-                onClick={submit}
-                aria-label={t("Queue message")}
-              >
-                <IconSend size={18} />
-              </ActionIcon>
-            </Tooltip>
-          )}
-          <Tooltip label={t("Stop")} withArrow>
-            <ActionIcon
-              size="lg"
-              color="red"
-              variant="light"
-              onClick={onStop}
-              aria-label={t("Stop")}
-            >
-              <IconPlayerStopFilled size={18} />
-            </ActionIcon>
-          </Tooltip>
-        </Group>
+        <Tooltip label={t("Stop")} withArrow>
+          <ActionIcon
+            size="lg"
+            color="red"
+            variant="light"
+            onClick={onStop}
+            aria-label={t("Stop")}
+          >
+            <IconPlayerStopFilled size={18} />
+          </ActionIcon>
+        </Tooltip>
      ) : (
        <Tooltip label={t("Send")} withArrow>
          <ActionIcon
            size="lg"
            variant="filled"
-            onClick={submit}
+            onClick={send}
            disabled={disabled || value.trim().length === 0}
            aria-label={t("Send")}
          >
@@ -119,6 +128,12 @@ export default function ChatInput({
          </ActionIcon>
        </Tooltip>
      )}
-    </Group>
+      </Group>
+      {interim && (
+        <Text size="sm" c="dimmed">
+          {interim}
+        </Text>
+      )}
+    </Stack>
  );
 }
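
Review note on `appendFinalToDraft` in the chat-input.tsx diff above: the helper joins with one space but does not trim either side, so a draft that already ends in whitespace, or a final that starts with it, produces a doubled space. The sketch below reproduces the helper as written in the diff and adds a hypothetical `appendFinalTrimmed` variant (not part of the diff) that normalizes the seam. Whether the dictation components already deliver trimmed text is not visible here, so treat the variant as an option rather than a required fix.

```typescript
// Same behaviour as the helper added in the diff.
function appendFinalToDraft(draft: string, final: string): string {
  return draft ? `${draft} ${final}` : final;
}

// Hypothetical variant, NOT in the diff: strips whitespace at the seam so the
// join is always exactly one space, and ignores a whitespace-only final.
function appendFinalTrimmed(draft: string, final: string): string {
  const head = draft.replace(/\s+$/, ""); // drop trailing whitespace only
  const tail = final.trim();
  if (!tail) return draft; // nothing to append
  return head ? `${head} ${tail}` : tail;
}

console.log(JSON.stringify(appendFinalToDraft("draft ", "final"))); // "draft  final"
console.log(JSON.stringify(appendFinalTrimmed("draft ", " final "))); // "draft final"
```

One trade-off: the variant also removes a trailing newline from the draft, which changes behaviour for a user who pressed Shift+Enter before dictating. Stripping only spaces and tabs at the seam would avoid that.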
--- a/apps/client/src/features/ai-chat/components/chat-stopped-notice.tsx
+++ b/apps/client/src/features/ai-chat/components/chat-stopped-notice.tsx
@@ -1,41 +0,0 @@
-import { Alert, Group, Text, type AlertProps } from "@mantine/core";
-import { IconPlayerStopFilled } from "@tabler/icons-react";
-
-/**
- * A neutral "turn was interrupted" notice (NOT an error). Rendered for an
- * aborted turn — a manual Stop or a dropped connection — both live (ChatThread)
- * and in reopened history (MessageItem). Deliberately gray/subtle so it reads as
- * an informational marker, distinct from the red ChatErrorAlert. Layout-only
- * props (mt/mb/...) are forwarded to the Alert root.
- */
-interface ChatStoppedNoticeProps extends Omit<AlertProps, "title" | "children"> {
-  text: string;
-}
-
-export default function ChatStoppedNotice({
-  text,
-  style,
-  ...alertProps
-}: ChatStoppedNoticeProps) {
-  return (
-    <Alert
-      {...alertProps}
-      variant="light"
-      color="gray"
-      p="xs"
-      // flexShrink: 0 mirrors ChatErrorAlert so the notice is not compressed as a
-      // flex child of the chat panel.
-      style={[{ flexShrink: 0 }, style]}
-    >
-      <Group gap={8} wrap="nowrap" align="center">
-        <IconPlayerStopFilled
-          size={16}
-          style={{ flex: "none", color: "var(--mantine-color-dimmed)" }}
-        />
-        <Text size="sm" lh={1.3} c="dimmed">
-          {text}
-        </Text>
-      </Group>
-    </Alert>
-  );
-}
--- a/apps/client/src/features/ai-chat/components/chat-thread.tsx
+++ b/apps/client/src/features/ai-chat/components/chat-thread.tsx
@@ -1,32 +1,14 @@
-import { useCallback, useEffect, useMemo, useRef, useState } from "react";
+import { useMemo, useRef } from "react";
 import { generateId } from "ai";
-import { ActionIcon, Box, Group, Stack, Text } from "@mantine/core";
-import { IconClockHour4, IconX } from "@tabler/icons-react";
+import { Alert, Box, Stack } from "@mantine/core";
+import { IconAlertTriangle } from "@tabler/icons-react";
 import { useTranslation } from "react-i18next";
 import { useChat, type UIMessage } from "@ai-sdk/react";
 import { DefaultChatTransport } from "ai";
 import MessageList from "@/features/ai-chat/components/message-list.tsx";
 import ChatInput from "@/features/ai-chat/components/chat-input.tsx";
-import RoleCards from "@/features/ai-chat/components/role-cards.tsx";
-import ChatErrorAlert from "@/features/ai-chat/components/chat-error-alert.tsx";
-import ChatStoppedNotice from "@/features/ai-chat/components/chat-stopped-notice.tsx";
-import {
-  IAiChatMessageRow,
-  IAiRole,
-} from "@/features/ai-chat/types/ai-chat.types.ts";
-import {
-  roleLaunchMessage,
-  shouldResetRolePicked,
-} from "@/features/ai-chat/utils/role-launch.ts";
+import { IAiChatMessageRow } from "@/features/ai-chat/types/ai-chat.types.ts";
 import { describeChatError } from "@/features/ai-chat/utils/error-message.ts";
-import { extractServerChatId } from "@/features/ai-chat/utils/adopt-chat-id.ts";
-import { liveTurnTokens } from "@/features/ai-chat/utils/count-stream-tokens.ts";
-import {
-  dequeue,
-  enqueueMessage,
-  removeQueuedById,
-  type QueuedMessage,
-} from "@/features/ai-chat/utils/queue-helpers.ts";
 import classes from "@/features/ai-chat/components/ai-chat.module.css";

 /** The page the user is currently viewing, sent as chat context. */
@@ -47,32 +29,9 @@ interface ChatThreadProps {
   *  in the request body so the server persists it on chat creation; ignored by
   *  the server for existing chats (the role is read from the chat row). */
  roleId?: string | null;
-  /** Enabled roles for the new-chat empty state (only meaningful when
-   *  `chatId === null`). Rendered as the colored role cards. */
-  roles?: IAiRole[];
-  /** Notify the parent which role was picked via a card, so it can update the
-   *  header badge / assistant name for the brand-new chat. */
-  onRolePicked?: (role: IAiRole) => void;
-  /** Display name for the assistant label / typing line (the role name);
-   *  forwarded to MessageList. Absent => the generic "AI agent". */
-  assistantName?: string;
-  /** Called when a turn finishes; the parent refreshes the chat list and, for a
-   *  new chat, adopts the freshly created chat id. `serverChatId` is the
-   *  authoritative id the server streamed on the assistant message metadata, or
-   *  undefined on a failed turn — see adopt-chat-id.ts for the full #137 design. */
-  onTurnFinished: (serverChatId?: string) => void;
-  /** Called EARLY (at the stream's `start` chunk) with the authoritative server
-   *  chat id streamed on the assistant message metadata, so a brand-new chat
-   *  adopts its real id WHILE the first turn is still streaming (#174 — makes the
-   *  Copy/export button available mid-stream). Distinct from onTurnFinished,
-   *  which fires only at the terminal outcome. */
-  onServerChatId?: (serverChatId?: string) => void;
-  /** Reports the live turn-token total (reasoning + output) for the in-flight
-   *  turn so the parent can show a header badge that ticks mid-stream. THROTTLED
-   *  here (~8 Hz) so the parent re-renders a handful of times a second, not on
-   *  every streamed delta. Called with `null` when no turn is in flight (the
-   *  parent then reverts the badge to the persisted context size). */
-  onLiveTurnTokens?: (tokens: number | null) => void;
+  /** Called when a turn finishes; the parent refreshes the chat list and, for
+   *  a new chat, adopts the freshly created chat id. */
+  onTurnFinished: () => void;
 }

 /**
@@ -87,18 +46,13 @@ function rowToUiMessage(row: IAiChatMessageRow): UIMessage {
      ? row.metadata.parts
      : ([{ type: "text", text: row.content ?? "" }] as UIMessage["parts"]);
  const error = row.metadata?.error;
-  const finishReason = row.metadata?.finishReason;
-  const metadata: Record<string, unknown> = {};
-  if (error) metadata.error = error;
-  if (finishReason) metadata.finishReason = finishReason;
  return {
    id: row.id,
    role,
    parts,
-    // Carry persisted turn outcome (error text and/or finishReason) so MessageItem
-    // can render the error banner / "stopped" marker after a remount and in
-    // reopened history.
-    ...(Object.keys(metadata).length > 0 ? { metadata } : {}),
+    // Carry a persisted turn error so MessageItem can render it after a remount
+    // (e.g. when a new chat adopts its id) and in reopened chat history.
+    ...(error ? { metadata: { error } } : {}),
  } as UIMessage;
 }

@@ -112,12 +66,7 @@ export default function ChatThread({
  initialRows,
  openPage,
  roleId,
-  roles,
-  onRolePicked,
-  assistantName,
  onTurnFinished,
-  onServerChatId,
-  onLiveTurnTokens,
 }: ChatThreadProps) {
  const { t } = useTranslation();

@@ -164,55 +113,7 @@ export default function ChatThread({
  // The id only needs to be stable per mount — the parent remounts this via
  // `key` on chat switch, which re-seeds cleanly.
  const stableIdRef = useRef<string>(chatId ?? `new-${generateId()}`);
-  // Stable for the LIFETIME of this mount. When a brand-new chat adopts its
-  // server id, the parent now updates the `chatId` prop WITHOUT remounting this
-  // thread, so the store id must NOT follow `chatId`: recreating the useChat
-  // store would wipe the live (just-finished) turn. The server still resolves
-  // the real chat from `chatId` in the request body (see chatIdRef /
-  // prepareSendMessagesRequest), so this purely-client store key can stay fixed.
-  const chatStoreId = stableIdRef.current;
-
-  // Pending messages the user composed WHILE a turn was streaming. They are sent
-  // automatically, FIFO, on successful turn completion (`onFinish`). The queue is
-  // LOCAL state so it is scoped to this conversation: it is cleared when the user
-  // deliberately switches chat / starts a new chat (the parent remounts this via
-  // `key`), but it SURVIVES in-place new-chat id adoption (no remount), so a
-  // message queued during a brand-new chat's first turn is not lost. On Stop or
-  // error the queue is intentionally preserved (onFinish does not fire then) so
-  // the user decides what to do with the pending messages.
-  const [queued, setQueued] = useState<QueuedMessage[]>([]);
-  // Mirror the queue in a ref so the `onFinish` flush always reads the latest
-  // queue without a stale closure; `setQueue` updates BOTH the ref and the state.
-  const queuedRef = useRef<QueuedMessage[]>([]);
-  const setQueue = useCallback((next: QueuedMessage[]) => {
-    queuedRef.current = next;
-    setQueued(next);
-  }, []);
-
-  // Capture the latest `sendMessage` (returned by useChat below) so the flush
-  // helper can call the current instance from the stable `onFinish` callback.
-  const sendMessageRef = useRef<((m: { text: string }) => void) | null>(null);
-
-  // FIFO dequeue + send the next queued message (no-op when the queue is empty).
-  const flushNext = useCallback(() => {
-    const { head, rest } = dequeue(queuedRef.current);
-    if (!head) return;
-    setQueue(rest);
-    sendMessageRef.current?.({ text: head.text });
-  }, [setQueue]);
-
-  const enqueue = useCallback(
-    (text: string) => {
-      setQueue(enqueueMessage(queuedRef.current, { id: generateId(), text }));
-    },
-    [setQueue],
-  );
-  const removeQueued = useCallback(
-    (id: string) => {
-      setQueue(removeQueuedById(queuedRef.current, id));
-    },
-    [setQueue],
-  );
+  const chatStoreId = chatId ?? stableIdRef.current;

  const transport = useMemo(
    () =>
@@ -246,234 +147,37 @@ export default function ChatThread({
    id: chatStoreId,
    messages: initialMessages,
    transport,
-    // `onFinish` (ai@6 useChat) fires from a `finally` on EVERY terminal outcome
-    // — success, user Stop/abort (`isAbort`), network drop (`isDisconnect`), and
-    // stream error (`isError`). Keep calling `onTurnFinished()` on all of them
-    // (chat-list refresh + new-chat id adoption must happen even on a failed
-    // first turn), but flush the pending queue ONLY on a clean finish: auto-
-    // sending after the user hit Stop — or blindly retrying after a failure —
-    // would be wrong, so on Stop/disconnect/error the queue is left intact for
-    // the user to decide.
-    onFinish: ({ message, isAbort, isDisconnect, isError }) => {
-      // Forward the authoritative server chatId (streamed on the assistant
-      // message metadata) so the parent adopts the REAL created chat id for a new
-      // chat — see adopt-chat-id.ts for the full #137 design.
-      onTurnFinished(extractServerChatId(message));
-      // Show a neutral "stopped" marker for an aborted turn; the red error banner
-      // (via `error`) already covers isError, and a clean finish clears any marker.
-      if (isError) setStopNotice(null);
-      else if (isAbort) setStopNotice("manual");
-      else if (isDisconnect) setStopNotice("disconnect");
-      else setStopNotice(null);
-      if (isAbort || isDisconnect || isError) return;
-      flushNext();
-    },
-    // `onError` runs in addition to `onFinish` (which ai@6 also calls on error).
-    // Log the raw failure here for devtools; the UI shows a friendly classified
-    // banner via `error` below. We still call `onTurnFinished()` with NO server id
-    // (idempotent with the onFinish call): for a brand-new chat that ARMS the
-    // bounded list-refetch fallback (adopt the single newly-appeared chat once the
-    // refetch lands); for an existing chat it just refreshes the chat list
-    // immediately rather than after a manual refresh.
-    onError: (streamError) => {
-      // Surface the raw failure in the browser console (devtools) for debugging;
-      // the UI separately shows a friendly classified banner (see errorView).
-      console.error("AI chat stream error:", streamError);
-      onTurnFinished();
-    },
+    onFinish: () => onTurnFinished(),
+    // In AI SDK v6 `onFinish` does NOT fire when the stream errors, so a
+    // brand-new chat that fails on its first turn would never invalidate the
+    // chat list or adopt the server-created chat id (the server still creates
+    // the row and saves the error message). Run the same post-turn path on
+    // error so the failed chat appears in history immediately instead of after
+    // a manual refresh. The error itself is still surfaced via `error` below.
+    onError: () => onTurnFinished(),
  });

-  // Keep the flush helper pointed at the latest sendMessage instance.
-  sendMessageRef.current = sendMessage;
-
-  // EARLY chat-id adoption (#174): the server streams the authoritative chat id
-  // on the assistant message metadata at the `start` chunk (message.metadata.
-  // chatId — see adopt-chat-id.ts / chatStreamMetadata). Forward it to the parent
-  // AS SOON AS it appears (mid-stream), so a brand-new chat adopts its real id
-  // WHILE the first turn is still streaming and activeChatId-gated affordances
-  // (the Copy/export button) light up immediately, instead of only at onFinish.
-  // Keyed by the last-seen id so we forward each distinct id exactly once. The
-  // parent's onServerChatId is idempotent and a no-op once the chat has an id.
-  const lastForwardedChatIdRef = useRef<string | undefined>(undefined);
-  useEffect(() => {
-    if (!onServerChatId) return;
-    const tail = messages[messages.length - 1];
-    if (tail?.role !== "assistant") return;
-    const serverChatId = extractServerChatId(tail);
-    if (!serverChatId || serverChatId === lastForwardedChatIdRef.current)
-      return;
-    lastForwardedChatIdRef.current = serverChatId;
-    onServerChatId(serverChatId);
-  }, [messages, onServerChatId]);
-
-  // Live "turn was interrupted" marker for the CURRENT session. The red error
-  // banner (driven by `error`) covers the error case; this covers an aborted
-  // turn, distinguishing a manual Stop (`isAbort`) from a dropped connection
-  // (`isDisconnect`) — a distinction only available live (the server persists
-  // both as finishReason 'aborted'). Cleared when the next turn starts.
-  const [stopNotice, setStopNotice] = useState<null | "manual" | "disconnect">(
-    null,
-  );
-
  const isStreaming = status === "submitted" || status === "streaming";

-  // Clear the stopped marker as soon as a new turn begins streaming.
-  useEffect(() => {
-    if (isStreaming) setStopNotice(null);
-  }, [isStreaming]);
-
-  // Classify the turn error into a heading + detail so the banner names the cause
-  // (connection reset, timeout, rate limit, context overflow, quota, ...) instead
-  // of a generic "Something went wrong". Computed here (not only in the JSX) so
-  // the SAME on-screen banner text can be mirrored into the export (issue #160).
-  const errorView = error ? describeChatError(error.message ?? "", t) : null;
-
-  // Report the live turn-token total to the parent header badge, THROTTLED to
-  // ~8 Hz so the parent re-renders a few times a second instead of on every
-  // streamed delta. The tail assistant message's reasoning+output (estimate while
-  // streaming, authoritative once a step reports usage) is the live figure. When
-  // the turn ends we emit a final exact value, then `null` so the parent reverts
-  // the badge to the persisted context size.
-  const lastEmitRef = useRef(0);
-  const emitTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null);
-  useEffect(() => {
-    if (!onLiveTurnTokens) return;
-    if (!isStreaming) {
-      // Turn ended (or never started): clear any pending throttle and revert.
-      if (emitTimerRef.current) {
-        clearTimeout(emitTimerRef.current);
-        emitTimerRef.current = null;
-      }
-      lastEmitRef.current = 0;
-      onLiveTurnTokens(null);
-      return;
-    }
-    const tail = messages[messages.length - 1];
-    const live = tail?.role === "assistant" ? liveTurnTokens(tail) : null;
-    const total = live ? live.reasoning + live.output : 0;
-    const now = Date.now();
-    const MIN_INTERVAL = 120; // ms (~8 Hz)
-    const elapsed = now - lastEmitRef.current;
-    if (elapsed >= MIN_INTERVAL) {
-      lastEmitRef.current = now;
-      onLiveTurnTokens(total);
-    } else if (!emitTimerRef.current) {
-      // Schedule a trailing emit so the FINAL value of a burst is not dropped.
-      emitTimerRef.current = setTimeout(() => {
-        emitTimerRef.current = null;
-        lastEmitRef.current = Date.now();
-        onLiveTurnTokens(total);
-      }, MIN_INTERVAL - elapsed);
-    }
-  }, [messages, isStreaming, onLiveTurnTokens]);
-
-  // Clear any pending throttle timer on unmount (chat switch via `key`) so a
-  // trailing emit can't fire into a torn-down thread's parent.
-  useEffect(() => {
-    return () => {
-      if (emitTimerRef.current) clearTimeout(emitTimerRef.current);
-    };
-  }, []);
-
-  // A role was picked with autoStart=false: the role is bound but NOTHING was
-  // sent, so chatId stays null and the empty state would keep showing the cards.
-  // This flag hides the cards and reveals the composer (with the role indicated)
-  // so the user can type the first message themselves. roleIdRef is already set,
-  // so that first manual message carries the roleId.
-  const [rolePickedNoSend, setRolePickedNoSend] = useState(false);
-
-  // Clicking a role card always binds the role to THIS new chat. Whether it also
-  // auto-starts the conversation is per-role (autoStart). roleIdRef is set
-  // synchronously here because the parent's selectedRoleId state update would
-  // only reach roleIdRef on the next render — after this synchronous sendMessage
-  // has already read it.
-  const handleRolePick = (role: IAiRole): void => {
-    roleIdRef.current = role.id;
-    onRolePicked?.(role);
-    const launch = roleLaunchMessage(
-      role,
-      t("Take a look at the current document"),
-    );
-    if (launch !== null) {
-      sendMessage({ text: launch });
-    } else {
-      // autoStart=false -> bind only: hide the cards, show the composer.
-      setRolePickedNoSend(true);
-    }
-  };
-  // Reset the "picked, not sent" flag when the thread returns to a truly empty,
-  // role-less state — e.g. the user hit "New chat" after picking an autoStart=false
-  // role. That path clears the parent's selectedRoleId (roleId -> null) but leaves
-  // chatId null, so the thread never remounts and the flag would stay set, hiding
-  // the cards forever. A picked-and-bound role keeps roleId non-null, so the cards
-  // correctly stay hidden then. Render-phase reset (React "adjust state on prop
-  // change"): one-shot — it re-renders with the flag false and the guard no longer
-  // matches, so it cannot loop. (Review of #149.)
-  if (shouldResetRolePicked(chatId, roleId, rolePickedNoSend)) {
-    setRolePickedNoSend(false);
-  }
-  const showRoleCards =
-    chatId === null && (roles?.length ?? 0) > 0 && !rolePickedNoSend;
-  const roleCardsEmptyState = showRoleCards ? (
-    <RoleCards roles={roles ?? []} onPick={handleRolePick} />
-  ) : undefined;
-
  return (
    <Box className={classes.panel}>
-      <MessageList
-        messages={messages}
-        isStreaming={isStreaming}
-        emptyState={roleCardsEmptyState}
-        assistantName={assistantName}
-      />
+      <MessageList messages={messages} isStreaming={isStreaming} />

-      {errorView ? (
-        <ChatErrorAlert
-          title={errorView.title}
-          detail={errorView.detail}
+      {error && (
+        <Alert
+          variant="light"
+          color="red"
+          icon={<IconAlertTriangle size={16} />}
          mb="xs"
-        />
-      ) : stopNotice ? (
-        <ChatStoppedNotice
-          text={
-            stopNotice === "manual"
-              ? t("Response stopped.")
-              : t("Connection lost — the answer was interrupted.")
-          }
-          mb="xs"
-        />
-      ) : null}
+          title={t("Something went wrong")}
+        >
+          {describeChatError(error.message ?? "", t)}
+        </Alert>
+      )}

      <Stack gap={0} className={classes.inputWrapper}>
-        {queued.length > 0 && (
-          <Stack gap={4} className={classes.queuedList}>
-            {queued.map((m) => (
-              <Group
-                key={m.id}
-                gap={6}
-                wrap="nowrap"
-                className={classes.queuedItem}
-              >
-                <IconClockHour4 size={14} className={classes.queuedIcon} />
-                <Text size="xs" lineClamp={2} className={classes.queuedText}>
-                  {m.text}
-                </Text>
-                <ActionIcon
-                  size="xs"
-                  variant="subtle"
-                  color="gray"
-                  onClick={() => removeQueued(m.id)}
-                  aria-label={t("Remove queued message")}
-                >
-                  <IconX size={12} />
-                </ActionIcon>
-              </Group>
-            ))}
-          </Stack>
-        )}
        <ChatInput
          onSend={(text) => sendMessage({ text })}
-          onQueue={enqueue}
          onStop={stop}
          isStreaming={isStreaming}
        />
--- a/apps/client/src/features/ai-chat/components/conversation-list.tsx
+++ b/apps/client/src/features/ai-chat/components/conversation-list.tsx
@@ -18,31 +18,8 @@ import {
  useRenameAiChatMutation,
 } from "@/features/ai-chat/queries/ai-chat-query.ts";
 import { IAiChat } from "@/features/ai-chat/types/ai-chat.types.ts";
-import { useTimeAgo } from "@/hooks/use-time-ago.tsx";
 import classes from "@/features/ai-chat/components/ai-chat.module.css";

-/**
- * The dimmed second line of a chat row: how long ago the chat was created and
- * the document it was created in. Its own component so the self-updating
- * `useTimeAgo` hook is called per row legally (hooks cannot run inside `.map()`).
- */
-function ChatMetaLine({
-  createdAt,
-  pageTitle,
-}: {
-  createdAt: string;
-  pageTitle?: string | null;
-}) {
-  const { t } = useTranslation();
-  const ago = useTimeAgo(createdAt);
-  // e.g. "2 hours ago · Onboarding guide" / "2 hours ago · No document"
-  return (
-    <Text size="xs" c="dimmed" lineClamp={1}>
-      {ago} · {pageTitle || t("No document")}
-    </Text>
-  );
-}
-
 interface ConversationListProps {
  activeChatId: string | null;
  onSelect: (chatId: string) => void;
@@ -150,24 +127,16 @@ export default function ConversationList({
              }
            }}
          >
-            <Box style={{ flex: 1, minWidth: 0 }}>
-              <Group gap={4} wrap="nowrap" style={{ minWidth: 0 }}>
-                {chat.roleName && (
-                  <Text
-                    size="sm"
-                    span
-                    title={chat.roleName}
-                    style={{ flex: "none" }}
-                  >
-                    {chat.roleEmoji || "🤖"}
-                  </Text>
-                )}
-                <Text size="sm" lineClamp={1} style={{ flex: 1, minWidth: 0 }}>
-                  {chat.title || t("Untitled chat")}
+            <Group gap={4} wrap="nowrap" style={{ flex: 1, minWidth: 0 }}>
+              {chat.roleName && (
+                <Text size="sm" span title={chat.roleName} style={{ flex: "none" }}>
+                  {chat.roleEmoji || "🤖"}
                </Text>
-              </Group>
-              <ChatMetaLine createdAt={chat.createdAt} pageTitle={chat.pageTitle} />
-            </Box>
+              )}
+              <Text size="sm" lineClamp={1} style={{ flex: 1, minWidth: 0 }}>
+                {chat.title || t("Untitled chat")}
+              </Text>
+            </Group>
            <Menu shadow="md" width={180} position="bottom-end">
              <Menu.Target>
                <ActionIcon
--- a/apps/client/src/features/ai-chat/components/message-item.tsx
+++ b/apps/client/src/features/ai-chat/components/message-item.tsx
@@ -1,15 +1,11 @@
-import { Box, Text } from "@mantine/core";
+import { Alert, Box, Text } from "@mantine/core";
+import { IconAlertTriangle } from "@tabler/icons-react";
 import { useTranslation } from "react-i18next";
 import type { UIMessage } from "@ai-sdk/react";
 import ToolCallCard from "@/features/ai-chat/components/tool-call-card.tsx";
-import ReasoningBlock from "@/features/ai-chat/components/reasoning-block.tsx";
-import ChatErrorAlert from "@/features/ai-chat/components/chat-error-alert.tsx";
-import ChatStoppedNotice from "@/features/ai-chat/components/chat-stopped-notice.tsx";
 import { ToolUiPart, isToolPart } from "@/features/ai-chat/utils/tool-parts.tsx";
-import { assistantMessageHasVisibleContent } from "@/features/ai-chat/utils/message-content.ts";
 import { renderChatMarkdown } from "@/features/ai-chat/utils/markdown.ts";
 import { resolveAssistantName } from "@/features/ai-chat/utils/assistant-name.ts";
-import { reasoningTokensForPart } from "@/features/ai-chat/utils/reasoning-tokens.ts";
 import { describeChatError } from "@/features/ai-chat/utils/error-message.ts";
 import classes from "@/features/ai-chat/components/ai-chat.module.css";

@@ -69,41 +65,12 @@ export default function MessageItem({
    );
  }

-  // An assistant message with nothing visible to render yet (an empty streaming
-  // text part, or a reasoning/step-start part while the model is still thinking)
-  // renders nothing here. The standalone TypingIndicator stands in for the nascent
-  // bubble (name + dots) until real content arrives, so exactly one element owns
-  // the agent name during the pre-content gap and the layout never jumps. Persisted
-  // errored/aborted turns DO have visible content per the helper (metadata.error /
-  // finishReason === "aborted"), so their banners below still render — this early
-  // return won't fire for them.
-  if (!assistantMessageHasVisibleContent(message)) return null;
-
-  // Authoritative reasoning token count to attribute to a reasoning block, or
-  // undefined when the block must estimate on its own. See reasoningTokensForPart
-  // for the #151 anti-double-count rule (only a single reasoning part may carry
-  // the turn total). The authoritative turn total is still surfaced live in the
-  // header badge regardless.
-  const reasoningTokens = reasoningTokensForPart(message);
-
  return (
    <Box className={classes.messageRow}>
      <Text size="xs" c="dimmed" mb={4}>
        {resolveAssistantName(assistantName) ?? t("AI agent")}
      </Text>
      {message.parts.map((part, index) => {
-        if (part.type === "reasoning") {
-          // Reasoning ("thinking") -> a collapsible block with its own token
-          // count. Empty/whitespace reasoning with no authoritative count carries
-          // nothing to show, so skip it (avoids an empty 0-token block).
-          const text = (part as { text?: string }).text ?? "";
-          if (!text.trim() && !(reasoningTokens && reasoningTokens > 0))
-            return null;
-          return (
-            <ReasoningBlock key={index} text={text} tokens={reasoningTokens} />
-          );
-        }
-
        if (part.type === "text") {
          // Skip empty/whitespace-only text parts (a streaming message often
          // starts with an empty text part before the first token arrives); the
@@ -147,31 +114,15 @@ export default function MessageItem({
      {(() => {
        const errorText = (message.metadata as { error?: string } | undefined)?.error;
        if (!errorText) return null;
-        // Same classified-error banner as the live chat: a heading naming the
-        // cause plus a one-line detail.
-        const errorView = describeChatError(errorText, t);
        return (
-          <ChatErrorAlert
-            title={errorView.title}
-            detail={errorView.detail}
+          <Alert
+            variant="light"
+            color="red"
+            icon={<IconAlertTriangle size={16} />}
            mt={4}
-          />
-        );
-      })()}
-      {/* A persisted turn that was aborted (manual Stop or a dropped connection)
-          with no error banner. The server cannot tell a manual Stop from a
-          connection drop (both persist as finishReason 'aborted'), so reopened
-          history uses a combined wording. */}
-      {(() => {
-        const meta = message.metadata as
-          | { error?: string; finishReason?: string }
-          | undefined;
-        if (meta?.error || meta?.finishReason !== "aborted") return null;
-        return (
-          <ChatStoppedNotice
-            text={t("Response stopped (manually or the connection dropped).")}
-            mt={4}
-          />
+          >
+            {describeChatError(errorText, t)}
+          </Alert>
        );
      })()}
    </Box>
--- a/apps/client/src/features/ai-chat/components/message-list.tsx
+++ b/apps/client/src/features/ai-chat/components/message-list.tsx
@@ -4,8 +4,7 @@ import { useTranslation } from "react-i18next";
 import type { UIMessage } from "@ai-sdk/react";
 import MessageItem from "@/features/ai-chat/components/message-item.tsx";
 import TypingIndicator from "@/features/ai-chat/components/typing-indicator.tsx";
-import { isToolPart, toolRunState, ToolUiPart } from "@/features/ai-chat/utils/tool-parts.tsx";
-import { assistantMessageHasVisibleContent } from "@/features/ai-chat/utils/message-content.ts";
+import { isToolPart } from "@/features/ai-chat/utils/tool-parts.tsx";
 import classes from "@/features/ai-chat/components/ai-chat.module.css";

 interface MessageListProps {
@@ -44,68 +43,23 @@ interface MessageListProps {
 const BOTTOM_THRESHOLD = 40;

 /**
- * Whether to show the standalone "Thinking…" indicator. It bridges every
- * gap in a turn where the assistant is working but nothing visible is actively
- * being produced yet — so it shows while a turn is in flight AND the latest
- * assistant message's LAST part is not live output:
+ * Whether to show the standalone "AI agent is typing…" indicator. It bridges the
+ * gap between sending and the first streamed content, so it shows only while a
+ * turn is in flight AND the latest assistant message has nothing visible yet:
 *  - the last message is still the user's (assistant hasn't started a row), or
- *  - the assistant row has no parts yet, or
- *  - its last part is an empty/whitespace text part, or a finished ("done")
- *    text part while the turn continues (the model paused after some narration
- *    and is thinking about its next step), or
- *  - its last part is a finished/errored tool (the model is thinking about the
- *    next step between tool calls).
- * It hides only while output is actively rendering: a non-empty streaming text
- * part, or a tool that is still running (ToolCallCard shows its own Loader).
+ *  - the last (assistant) message has no non-empty text and no tool part.
+ * Once any text/tool part arrives, MessageItem renders it and this hides.
 */
 export function showTypingIndicator(messages: UIMessage[], isStreaming: boolean): boolean {
  if (!isStreaming) return false;
  const last = messages[messages.length - 1];
  if (!last) return true; // submitted with nothing rendered yet.
  if (last.role !== "assistant") return true; // assistant row not started.
-  const lastPart = last.parts[last.parts.length - 1];
-  if (!lastPart) return true; // assistant row exists but has no parts yet.
-  // The answer text is actively streaming in -> MessageItem renders it; no dots.
-  // Only while it is STILL streaming, though: once a non-empty text part is
-  // finalized ("done") but the turn is still in flight, the model has paused
-  // after some narration and is working on its next step (e.g. about to call a
-  // tool) — nothing is visibly progressing, so the dots must show. A text part
-  // without a `state` is treated as still-rendering (kept suppressed); this
-  // branch only runs while streaming, where live parts always carry a state.
-  if (
-    lastPart.type === "text" &&
-    lastPart.text.trim().length > 0 &&
-    (lastPart as { state?: "streaming" | "done" }).state !== "done"
-  ) {
-    return false;
-  }
-  // A tool still in flight shows its own Loader in ToolCallCard -> no dots.
-  if (
-    isToolPart(lastPart.type) &&
-    toolRunState((lastPart as unknown as ToolUiPart).state) === "running"
-  ) {
-    return false;
-  }
-  // Otherwise the turn is in flight but nothing is actively producing visible
-  // output yet: a finished/errored tool with no follow-up content, or an empty
-  // trailing text part. The model is thinking between steps -> show the dots.
-  return true;
-}
-
-/**
- * Whether the standalone typing indicator should render its own assistant-name
- * label. The indicator OWNS the name while the tail assistant row has no visible
- * content yet (an empty streaming text part, or reasoning/step-start while the
- * model is still thinking): in that gap the assistant MessageItem renders nothing,
- * so the indicator stands in for the nascent bubble (name + dots) at a constant
- * gap. It hides the name only once that row shows visible content, because then
- * MessageItem draws the same name — avoids a duplicate stacked label and the
- * layout jump that switching owners mid-stream used to cause.
- */
-export function typingIndicatorShowsName(messages: UIMessage[]): boolean {
-  const last = messages[messages.length - 1];
-  if (!last || last.role !== "assistant") return true;
-  return !assistantMessageHasVisibleContent(last);
+  const hasVisible = last.parts.some(
+    (p) =>
+      (p.type === "text" && p.text.trim().length > 0) || isToolPart(p.type),
+  );
+  return !hasVisible;
 }

 /**
@@ -204,12 +158,7 @@ export default function MessageList({
            assistantName={assistantName}
          />
        ))}
-        {typing && (
-          <TypingIndicator
-            assistantName={assistantName}
-            showName={typingIndicatorShowsName(messages)}
-          />
-        )}
+        {typing && <TypingIndicator assistantName={assistantName} />}
      </Stack>
    </ScrollArea>
  );
--- a/apps/client/src/features/ai-chat/components/reasoning-block.test.tsx
+++ b/apps/client/src/features/ai-chat/components/reasoning-block.test.tsx
@@ -1,65 +0,0 @@
-import { describe, it, expect, vi } from "vitest";
-import { render, screen } from "@testing-library/react";
-import { MantineProvider } from "@mantine/core";
-
-// Stub react-i18next so `t` returns the key with `{{count}}` interpolated. This
-// keeps the assertions on the component's OWN count logic (authoritative vs
-// estimate) rather than on translation, and mirrors the t-mock pattern used by
-// other component tests in the repo.
-vi.mock("react-i18next", () => ({
-  useTranslation: () => ({
-    t: (key: string, opts?: { count?: number }) =>
-      opts && typeof opts.count === "number"
-        ? key.replace("{{count}}", String(opts.count))
-        : key,
-  }),
-}));
-
-import ReasoningBlock from "./reasoning-block";
-import { estimateTokens } from "@/features/ai-chat/utils/count-stream-tokens.ts";
-
-// matchMedia (read by MantineProvider) is stubbed globally in vitest.setup.ts.
-
-function renderBlock(props: { text: string; tokens?: number }) {
-  return render(
-    <MantineProvider>
-      <ReasoningBlock {...props} />
-    </MantineProvider>,
-  );
-}
-
-describe("ReasoningBlock", () => {
-  it("shows the authoritative count in the header when tokens > 0", () => {
-    // Text "thinking…" estimates to ceil(9/4) = 3, but the authoritative 42
-    // must win, so the header shows 42 (and NOT the 3-token estimate).
-    renderBlock({ text: "thinking…", tokens: 42 });
-    expect(screen.getByText("Thinking · 42 tokens")).toBeDefined();
-    expect(screen.queryByText("Thinking · 3 tokens")).toBeNull();
-  });
-
-  it("falls back to the text-length estimate when no authoritative tokens", () => {
-    const text = "some reasoning prose that streams in";
-    const estimate = estimateTokens(text);
-    renderBlock({ text });
-    expect(estimate).toBeGreaterThan(0);
-    expect(screen.getByText(new RegExp(`${estimate} tokens`))).toBeDefined();
-  });
-
-  it("header-only when text is empty but an authoritative count is present", () => {
-    renderBlock({ text: "", tokens: 17 });
-    expect(screen.getByText(/17 tokens/)).toBeDefined();
-    // No disclosure body to expand: the toggle button is disabled.
-    const button = screen.getByRole("button");
-    expect((button as HTMLButtonElement).disabled).toBe(true);
-  });
-
-  it("renders the reasoning body (markdown or raw-text fallback)", () => {
-    renderBlock({ text: "**bold** reasoning", tokens: 5 });
-    // The toggle is enabled because there IS body text to expand.
-    const button = screen.getByRole("button");
-    expect((button as HTMLButtonElement).disabled).toBe(false);
-    // The body prose renders (markdown -> sanitized html, or raw-text fallback);
-    // either way the text is present in the document.
-    expect(screen.getByText(/reasoning/)).toBeDefined();
-  });
-});
--- a/apps/client/src/features/ai-chat/components/reasoning-block.tsx
+++ b/apps/client/src/features/ai-chat/components/reasoning-block.tsx
@@ -1,89 +0,0 @@
-import { useState } from "react";
-import { Box, Collapse, Group, Text, UnstyledButton } from "@mantine/core";
-import { IconChevronDown } from "@tabler/icons-react";
-import { useTranslation } from "react-i18next";
-import { estimateTokens } from "@/features/ai-chat/utils/count-stream-tokens.ts";
-import { collapseBlankLines } from "@/features/ai-chat/utils/collapse-blank-lines.ts";
-import { renderChatMarkdown } from "@/features/ai-chat/utils/markdown.ts";
-import classes from "@/features/ai-chat/components/ai-chat.module.css";
-
-interface ReasoningBlockProps {
-  /** The streamed/persisted reasoning (thinking) text. May be empty when the
-   *  provider reports only a reasoning token COUNT without the text. */
-  text: string;
-  /** Authoritative reasoning token count from `usage.reasoningTokens`, when the
-   *  step/turn has finished. When absent (or 0) the count is estimated from the
-   *  text length so it ticks live as the reasoning streams in. */
-  tokens?: number;
-}
-
-/**
- * Collapsible "Thinking" block for an assistant `reasoning` part. Mirrors Claude
- * Code's surfacing of the model's thinking: a header that shows the thinking
- * token count (authoritative when the step has reported usage, else a live
- * estimate from the streamed text) and an expandable body with the reasoning
- * prose. Collapsed by default so it never crowds out the answer.
- *
- * Providers that don't stream reasoning TEXT still render this block from the
- * authoritative count alone (header only, empty body) so the cost is visible.
- */
-export default function ReasoningBlock({ text, tokens }: ReasoningBlockProps) {
-  const { t } = useTranslation();
-  const [open, setOpen] = useState(false);
-
-  // Authoritative count wins; otherwise estimate live from the streamed text.
-  const count = tokens && tokens > 0 ? tokens : estimateTokens(text);
-  const trimmed = text.trim();
-  // Collapse the blank-line gaps the model emits between every list item /
-  // paragraph so the reasoning renders compactly (tight lists, joined
-  // paragraphs) — see collapseBlankLines. ONLY here, not in the normal answer.
-  const html = trimmed
-    ? renderChatMarkdown(collapseBlankLines(trimmed), {})
-    : "";
-
-  return (
-    <Box className={classes.reasoningBlock} mb={6}>
-      <UnstyledButton
-        onClick={() => setOpen((o) => !o)}
-        // No body to expand when the provider reported only a token count.
-        disabled={!trimmed}
-        aria-expanded={open}
-      >
-        <Group gap={6} wrap="nowrap" align="center">
-          <IconChevronDown
-            size={12}
-            style={{
-              transform: open ? "none" : "rotate(-90deg)",
-              transition: "transform 150ms ease",
-              opacity: trimmed ? 1 : 0.4,
-            }}
-          />
-          <Text size="xs" c="dimmed">
-            {count > 0
-              ? t("Thinking · {{count}} tokens", { count })
-              : t("Thinking")}
-          </Text>
-        </Group>
-      </UnstyledButton>
-
-      {trimmed && (
-        <Collapse in={open}>
-          {html ? (
-            <div
-              className={classes.reasoningText}
-              // Sanitized by renderChatMarkdown (DOMPurify) before insertion.
-              dangerouslySetInnerHTML={{ __html: html }}
-            />
-          ) : (
-            <Text
-              className={classes.reasoningText}
-              style={{ whiteSpace: "pre-wrap" }}
-            >
-              {trimmed}
-            </Text>
-          )}
-        </Collapse>
-      )}
-    </Box>
-  );
-}
--- a/apps/client/src/features/ai-chat/components/role-cards.module.css
+++ b/apps/client/src/features/ai-chat/components/role-cards.module.css
@@ -1,65 +0,0 @@
-/* Layout only — per-card colors are injected inline via Mantine CSS vars. */
-
-.container {
-  display: flex;
-  flex-wrap: wrap;
-  justify-content: center;
-  /* flex-start keeps the first row reachable when the wrapped cards overflow and
-     the container scrolls. With align-content: center, an overflowing top row is
-     pushed out of the scrollable area and becomes unreachable. The parent Mantine
-     Center still vertically centers the whole block when it fits. */
-  align-content: flex-start;
-  gap: 10px;
-  /* Cap the height so a large number of roles scrolls instead of blowing out
-     the empty chat area. */
-  max-height: 100%;
-  overflow-y: auto;
-  padding: 8px;
-}
-
-.card {
-  position: relative;
-  display: flex;
-  flex-direction: column;
-  align-items: center;
-  justify-content: center;
-  gap: 4px;
-  /* Grow to fill the row so cards use the available window width instead of
-     leaving large side gaps; the flex-basis sets how many fit per row before
-     wrapping (≈2 columns at the default window width, more as it widens). */
-  flex: 1 1 240px;
-  min-width: 200px;
-  max-width: 360px;
-  min-height: 90px;
-  padding: 12px 10px;
-  border-radius: var(--mantine-radius-md);
-  border: 2px solid transparent;
-  cursor: pointer;
-  text-align: center;
-  transition:
-    transform 120ms ease,
-    box-shadow 120ms ease,
-    border-color 120ms ease;
-}
-
-.card:hover {
-  transform: translateY(-2px);
-  box-shadow: var(--mantine-shadow-sm);
-}
-
-.emoji {
-  font-size: 22px;
-  line-height: 1;
-}
-
-/* The description: small and slightly muted, inheriting the card's color. We
-   reduce opacity instead of using Mantine's `c="dimmed"` so it doesn't clash
-   with the card's inline color. */
-.description {
-  opacity: 0.8;
-  line-height: 1.3;
-  /* Break long unbreakable tokens (URLs, long foreign words) in the
-     admin-configured description so they wrap instead of overflowing the card
-     width now that the line clamp no longer caps the text. */
-  overflow-wrap: anywhere;
-}
--- a/apps/client/src/features/ai-chat/components/role-cards.test.tsx
+++ b/apps/client/src/features/ai-chat/components/role-cards.test.tsx
@@ -1,59 +0,0 @@
-import { describe, it, expect, vi } from "vitest";
-import { render, screen, fireEvent } from "@testing-library/react";
-import { MantineProvider } from "@mantine/core";
-import RoleCards from "./role-cards";
-import { IAiRole } from "@/features/ai-chat/types/ai-chat.types.ts";
-
-// matchMedia (read by MantineProvider) is stubbed globally in vitest.setup.ts.
-
-const roles: IAiRole[] = [
-  {
-    id: "r1",
-    name: "Pirate",
-    emoji: "🏴‍☠️",
-    description: "Talks like a pirate",
-    enabled: true,
-    autoStart: true,
-    launchMessage: null,
-  },
-  {
-    id: "r2",
-    name: "Grandpa",
-    emoji: null,
-    description: null,
-    enabled: true,
-    autoStart: true,
-    launchMessage: null,
-  },
-];
-
-function renderCards(onPick = vi.fn()) {
-  render(
-    <MantineProvider>
-      <RoleCards roles={roles} onPick={onPick} />
-    </MantineProvider>,
-  );
-  return onPick;
-}
-
-describe("RoleCards", () => {
-  it("renders one card per role with name, emoji, and description", () => {
-    renderCards();
-    expect(screen.getByText("Pirate")).toBeDefined();
-    expect(screen.getByText("Talks like a pirate")).toBeDefined();
-    expect(screen.getByText("Grandpa")).toBeDefined();
-    // The emoji is shown for the role that has one.
-    expect(screen.getByText("🏴‍☠️")).toBeDefined();
-  });
-
-  it("does NOT render a Universal assistant card", () => {
-    renderCards();
-    expect(screen.queryByText("Universal assistant")).toBeNull();
-  });
-
-  it("calls onPick with the role object when a card is clicked", () => {
-    const onPick = renderCards();
-    fireEvent.click(screen.getByText("Pirate"));
-    expect(onPick).toHaveBeenCalledWith(roles[0]);
-  });
-});
--- a/apps/client/src/features/ai-chat/components/role-cards.tsx
+++ b/apps/client/src/features/ai-chat/components/role-cards.tsx
@@ -1,78 +0,0 @@
-import { UnstyledButton, Text } from "@mantine/core";
-import { IAiRole } from "@/features/ai-chat/types/ai-chat.types.ts";
-import { roleCardColor } from "@/features/ai-chat/utils/role-card-color.ts";
-import classes from "@/features/ai-chat/components/role-cards.module.css";
-
-interface RoleCardsProps {
-  /** The enabled roles to render (one card each). */
-  roles: IAiRole[];
-  /** Called with the picked role when a card is clicked. The parent starts the
-   *  chat with this role (binds it and sends the opening message). */
-  onPick: (role: IAiRole) => void;
-}
-
-/**
- * One role card. Colors are injected inline via theme-aware Mantine CSS vars so
- * they render correctly in both light and dark themes; the CSS module owns only
- * the layout. The card shows the emoji (if any), the role name, and a small
- * dimmed description line (if any).
- */
-function RoleCard({
-  color,
-  name,
-  emoji,
-  description,
-  onClick,
-}: {
-  color: string;
-  name: string;
-  emoji?: string | null;
-  description?: string | null;
-  onClick: () => void;
-}) {
-  return (
-    <UnstyledButton
-      className={classes.card}
-      style={{
-        backgroundColor: `var(--mantine-color-${color}-light)`,
-        color: `var(--mantine-color-${color}-light-color)`,
-      }}
-      title={description ?? name}
-      onClick={onClick}
-    >
-      {emoji && <span className={classes.emoji}>{emoji}</span>}
-      <Text size="sm" fw={600} lineClamp={2}>
-        {name}
-      </Text>
-      {description && (
-        <Text size="xs" className={classes.description}>
-          {description}
-        </Text>
-      )}
-    </UnstyledButton>
-  );
-}
-
-/**
- * Colored role cards rendered as the empty-state of a brand-new chat. There is
- * no Universal assistant card — the universal assistant is the implicit default
- * the user gets by simply typing into the composer without picking a card.
- * Clicking a card immediately STARTS the chat with that role (the parent binds
- * the role to the new chat and sends the opening message).
- */
-export default function RoleCards({ roles, onPick }: RoleCardsProps) {
-  return (
-    <div className={classes.container}>
-      {roles.map((role, index) => (
-        <RoleCard
-          key={role.id}
-          color={roleCardColor(index)}
-          name={role.name}
-          emoji={role.emoji}
-          description={role.description}
-          onClick={() => onPick(role)}
-        />
-      ))}
-    </div>
-  );
-}
--- a/apps/client/src/features/ai-chat/components/show-typing-indicator.test.ts
+++ b/apps/client/src/features/ai-chat/components/show-typing-indicator.test.ts
@@ -5,7 +5,7 @@ import { showTypingIndicator } from "@/features/ai-chat/components/message-list.
 /**
 * Pure-helper tests for the typing-indicator bridging logic that the internal
 * chat and the public share widget now share. This is the behavior that decides
- * whether the animated "Thinking…" placeholder shows in the gap
+ * whether the animated "AI agent is typing…" placeholder shows in the gap
 * between sending and the first streamed token.
 */
 const msg = (
@@ -52,44 +52,4 @@ describe("showTypingIndicator", () => {
      showTypingIndicator([msg("assistant", [toolPart])], true),
    ).toBe(false);
  });
-
-  it("shows while streaming after a tool has finished (thinking between steps)", () => {
-    const doneTool = { type: "tool-getPage", state: "output-available" } as unknown as UIMessage["parts"][number];
-    expect(
-      showTypingIndicator([msg("assistant", [doneTool])], true),
-    ).toBe(true);
-  });
-
-  it("shows while streaming when a finished tool is the last part after some text", () => {
-    const text = { type: "text", text: "Let me check" } as unknown as UIMessage["parts"][number];
-    const doneTool = { type: "tool-getPage", state: "output-available" } as unknown as UIMessage["parts"][number];
-    expect(
-      showTypingIndicator([msg("assistant", [text, doneTool])], true),
-    ).toBe(true);
-  });
-
-  it("hides while a tool is still running", () => {
-    const runningTool = { type: "tool-getPage", state: "input-available" } as unknown as UIMessage["parts"][number];
-    expect(
-      showTypingIndicator([msg("assistant", [runningTool])], true),
-    ).toBe(false);
-  });
-
-  it("hides once the assistant streams non-empty text after a finished tool", () => {
-    const doneTool = { type: "tool-getPage", state: "output-available" } as unknown as UIMessage["parts"][number];
-    const text = { type: "text", text: "The answer is 42" } as unknown as UIMessage["parts"][number];
-    expect(
-      showTypingIndicator([msg("assistant", [doneTool, text])], true),
-    ).toBe(false);
-  });
-
-  it("shows while streaming after a text part is finalized (paused before the next step)", () => {
-    const doneText = { type: "text", text: "Now creating the page in", state: "done" } as unknown as UIMessage["parts"][number];
-    expect(showTypingIndicator([msg("assistant", [doneText])], true)).toBe(true);
-  });
-
-  it("hides while a text part is actively streaming (state: streaming)", () => {
-    const streamingText = { type: "text", text: "Now writ", state: "streaming" } as unknown as UIMessage["parts"][number];
-    expect(showTypingIndicator([msg("assistant", [streamingText])], true)).toBe(false);
-  });
 });
--- a/apps/client/src/features/ai-chat/components/typing-indicator-shows-name.test.ts
+++ b/apps/client/src/features/ai-chat/components/typing-indicator-shows-name.test.ts
@@ -1,52 +0,0 @@
-import { describe, expect, it } from "vitest";
-import type { UIMessage } from "@ai-sdk/react";
-import { typingIndicatorShowsName } from "@/features/ai-chat/components/message-list.tsx";
-
-/**
- * Pure-helper tests for whether the standalone "Thinking…" indicator renders its
- * own dimmed assistant-name label. The indicator OWNS the name while the tail
- * assistant row has no visible content yet (an empty streaming text part, or
- * reasoning/step-start while the model is still thinking) — in that gap the
- * assistant MessageItem renders nothing, so the indicator stands in for the
- * nascent bubble (name + dots). It hides the name only once the tail assistant
- * row shows visible content, because then MessageItem draws the same name — this
- * avoids a duplicate stacked label and the layout jump that switching owners
- * mid-stream used to cause.
- */
-const msg = (
-  role: "user" | "assistant",
-  parts: UIMessage["parts"],
-): UIMessage => ({ id: Math.random().toString(), role, parts }) as UIMessage;
-
-describe("typingIndicatorShowsName", () => {
-  it("shows the name with no messages yet (standalone, just submitted)", () => {
-    expect(typingIndicatorShowsName([])).toBe(true);
-  });
-
-  it("shows the name when the last message is still the user's", () => {
-    expect(
-      typingIndicatorShowsName([msg("user", [{ type: "text", text: "q" }])]),
-    ).toBe(true);
-  });
-
-  it("shows the name when the tail assistant row has no visible content yet (empty text part)", () => {
-    // The empty streaming text part has no visible content, so MessageItem renders
-    // nothing and the indicator owns the name (the nascent bubble).
-    expect(
-      typingIndicatorShowsName([msg("assistant", [{ type: "text", text: "" }])]),
-    ).toBe(true);
-  });
-
-  it("hides the name once the tail assistant row shows content (a tool part)", () => {
-    const doneTool = { type: "tool-getPage", state: "output-available" } as unknown as UIMessage["parts"][number];
-    expect(
-      typingIndicatorShowsName([msg("assistant", [doneTool])]),
-    ).toBe(false);
-  });
-
-  it("hides the name once the tail assistant row shows content (non-empty text)", () => {
-    expect(
-      typingIndicatorShowsName([msg("assistant", [{ type: "text", text: "answer" }])]),
-    ).toBe(false);
-  });
-});
--- a/apps/client/src/features/ai-chat/components/typing-indicator.tsx
+++ b/apps/client/src/features/ai-chat/components/typing-indicator.tsx
@@ -10,12 +10,6 @@ interface TypingIndicatorProps {
   * (agent role) name.
   */
  assistantName?: string;
-  /**
-   * Whether to render the dimmed assistant-name label. Defaults to true
-   * (standalone behavior preserved). Set false between agent steps where the
-   * assistant row above already shows the same name, to avoid a duplicate label.
-   */
-  showName?: boolean;
 }

 /**
@@ -25,30 +19,27 @@ interface TypingIndicatorProps {
 * the real assistant message once content starts arriving.
 *
 * Mirrors the assistant row layout in MessageItem (the dimmed label), so it reads
- * as the assistant's bubble taking shape. The dimmed label uses the configured
- * identity name when provided (otherwise the generic "AI agent"); below it the
- * animated dots stand in for the nascent bubble until content arrives.
+ * as the assistant's bubble taking shape. The label and typing line use the
+ * configured identity name when provided, otherwise the generic "AI agent".
 */
-export default function TypingIndicator({ assistantName, showName = true }: TypingIndicatorProps) {
+export default function TypingIndicator({ assistantName }: TypingIndicatorProps) {
  const { t } = useTranslation();
  const name = resolveAssistantName(assistantName);

  return (
    <Box className={classes.messageRow}>
-      {showName !== false && (
-        // Extra bottom gap (vs MessageItem's mb={4}) gives the small bouncing
-        // dots room below the name label; without it they crowd the label. Only
-        // applies when the name is shown — the nameless case spaces fine on its own.
-        <Text size="xs" c="dimmed" mb={8}>
-          {name ?? t("AI agent")}
-        </Text>
-      )}
+      <Text size="xs" c="dimmed" mb={4}>
+        {name ?? t("AI agent")}
+      </Text>
      <Group gap={8} align="center">
        <span className={classes.typingDots} aria-hidden="true">
          <span />
          <span />
          <span />
        </span>
+        <Text size="sm" c="dimmed">
+          {name ? t("{{name}} is typing…", { name }) : t("AI agent is typing…")}
+        </Text>
      </Group>
    </Box>
  );
--- a/apps/client/src/features/ai-chat/hooks/use-chat-session.test.tsx
+++ b/apps/client/src/features/ai-chat/hooks/use-chat-session.test.tsx
@@ -1,246 +0,0 @@
-import { describe, it, expect, vi, beforeEach } from "vitest";
-import { renderHook } from "@testing-library/react";
-import { useChatSession } from "./use-chat-session";
-import type { UseChatSessionOptions } from "./use-chat-session";
-
-// The props the test drives: the parent-owned subset of UseChatSessionOptions
-// (the spies are injected by setup, not per-render). messagesLoading is optional
-// here (defaulted to false in setup) for terser test call sites.
-type DriverProps = Pick<UseChatSessionOptions, "activeChatId" | "chats"> & {
-  messagesLoading?: boolean;
-};
-
-// Drive the hook the way the window does: the parent owns `activeChatId` and
-// passes it back in. `setActiveChatId` is a spy so we can assert the EXACT id the
-// hook adopts (the #137 regression: it must be the authoritative streamed id, not
-// the newest chat in the list).
-function setup(initial: DriverProps) {
-  const setActiveChatId = vi.fn();
-  const onInvalidateChatList = vi.fn();
-  const onInvalidateChatMessages = vi.fn();
-  const { result, rerender } = renderHook(
-    (props: DriverProps) =>
-      useChatSession({
-        activeChatId: props.activeChatId,
-        setActiveChatId,
-        chats: props.chats,
-        messagesLoading: props.messagesLoading ?? false,
-        onInvalidateChatList,
-        onInvalidateChatMessages,
-      }),
-    { initialProps: initial },
-  );
-  return {
-    result,
-    rerender,
-    setActiveChatId,
-    onInvalidateChatList,
-    onInvalidateChatMessages,
-  };
-}
-
-describe("useChatSession", () => {
-  beforeEach(() => vi.clearAllMocks());
-
-  it("#137 REGRESSION LOCK: adopts the authoritative streamed id, NOT items[0]", () => {
-    // Brand-new chat, list already holds a SIBLING chat B as items[0] (a second
-    // tab just created it). The server streams the real id "A" for THIS chat.
-    const { result, setActiveChatId } = setup({
-      activeChatId: null,
-      chats: { items: [{ id: "B" }] },
-    });
-    result.current.onTurnFinished("A");
-    // Must adopt the authoritative id, not the newest-in-list guess.
-    expect(setActiveChatId).toHaveBeenCalledWith("A");
-    expect(setActiveChatId).not.toHaveBeenCalledWith("B");
-  });
-
-  it("fallback adopt: arms on a server-id-less finish, adopts the single new id after refetch", () => {
-    const { result, rerender, setActiveChatId } = setup({
-      activeChatId: null,
-      chats: { items: [{ id: "x" }] },
-    });
-    // No server id => arm the fallback (no adoption yet).
-    result.current.onTurnFinished(undefined);
-    expect(setActiveChatId).not.toHaveBeenCalled();
-    // The refetch lands with the new row => adopt it.
-    rerender({
-      activeChatId: null,
-      chats: { items: [{ id: "x" }, { id: "new" }] },
-    });
-    expect(setActiveChatId).toHaveBeenCalledWith("new");
-  });
-
-  it("fallback ambiguous: two new ids appear => no adoption", () => {
-    const { result, rerender, setActiveChatId } = setup({
-      activeChatId: null,
-      chats: { items: [{ id: "x" }] },
-    });
-    result.current.onTurnFinished(undefined);
-    rerender({
-      activeChatId: null,
-      chats: { items: [{ id: "x" }, { id: "n1" }, { id: "n2" }] },
-    });
-    expect(setActiveChatId).not.toHaveBeenCalled();
-  });
-
-  it("fallback add+delete in one window: adopts the new id (membership compare)", () => {
-    const { result, rerender, setActiveChatId } = setup({
-      activeChatId: null,
-      chats: { items: [{ id: "a" }, { id: "b" }] },
-    });
-    result.current.onTurnFinished(undefined);
-    // a was deleted, new was added — same length, but membership changed.
-    rerender({
-      activeChatId: null,
-      chats: { items: [{ id: "b" }, { id: "new" }] },
-    });
-    expect(setActiveChatId).toHaveBeenCalledWith("new");
-  });
-
-  it("disarm on reconcile: a fallback armed then switched away is NOT adopted by a late refetch", () => {
-    // Arm the error-path fallback on a brand-new chat (snapshot before=["x"]).
-    const { result, rerender, setActiveChatId } = setup({
-      activeChatId: null,
-      chats: { items: [{ id: "x" }] },
-    });
-    result.current.onTurnFinished(undefined);
-    // The user switches to an existing chat C BEFORE the refetch lands; the
-    // render-phase reconciler must DISARM the pending fallback.
-    rerender({ activeChatId: "C", chats: { items: [{ id: "x" }] } });
-    // ...then starts a fresh new chat again (back to null), without re-arming.
-    rerender({ activeChatId: null, chats: { items: [{ id: "x" }] } });
-    // A late refetch now brings a new row. Because the earlier fallback was
-    // disarmed on the switch (not left armed with the stale ["x"] snapshot), it
-    // must NOT be adopted. (Without the disarm this would wrongly adopt "new".)
-    rerender({
-      activeChatId: null,
-      chats: { items: [{ id: "x" }, { id: "new" }] },
-    });
-    expect(setActiveChatId).not.toHaveBeenCalledWith("new");
-  });
-
-  it("startNewChat while already in a new chat: cancelPendingAdoption stops a late refetch adopting the failed chat", () => {
-    // The Warning path the render-phase reconciler can't catch: pressing "New
-    // chat" while already in a new chat keeps activeChatId === null (a no-op for
-    // the atom), so only the explicit cancelPendingAdoption() disarms.
-    const { result, rerender, setActiveChatId } = setup({
-      activeChatId: null,
-      chats: { items: [{ id: "x" }] },
-    });
-    result.current.onTurnFinished(undefined); // first turn failed → arm (before=["x"])
-    result.current.cancelPendingAdoption(); // window calls this from startNewChat
-    // The just-failed row lands in a late refetch; it must NOT be adopted.
-    rerender({
-      activeChatId: null,
-      chats: { items: [{ id: "x" }, { id: "failed" }] },
-    });
-    expect(setActiveChatId).not.toHaveBeenCalledWith("failed");
-  });
-
-  it("onTurnFinished for an existing chat: no adoption, invalidates that chat's messages", () => {
-    const {
-      result,
-      setActiveChatId,
-      onInvalidateChatList,
-      onInvalidateChatMessages,
-    } = setup({ activeChatId: "chat-1", chats: { items: [{ id: "chat-1" }] } });
-    result.current.onTurnFinished("chat-1");
-    expect(setActiveChatId).not.toHaveBeenCalled(); // existing chat is never re-adopted
-    expect(onInvalidateChatList).toHaveBeenCalled();
-    expect(onInvalidateChatMessages).toHaveBeenCalledWith("chat-1");
-  });
-
-  it("double onTurnFinished on a failed-after-start turn: primary adopt, 2nd no-id call does NOT re-arm the fallback", () => {
-    // ai@6 fires onFinish AND onError on a failed turn. If the failure happened
-    // AFTER the `start` chunk, onFinish carries the streamed id and onError does
-    // not — so onTurnFinished runs twice in one turn (id, then no-id) before any
-    // re-render. The 2nd call must NOT re-arm the fallback off the still-null
-    // closure; otherwise a late refetch (parent hasn't reflected the adoption yet)
-    // would wrongly adopt a sibling row.
-    const { result, rerender, setActiveChatId } = setup({
-      activeChatId: null,
-      chats: { items: [{ id: "x" }] },
-    });
-    result.current.onTurnFinished("A"); // onFinish: primary adoption
-    expect(setActiveChatId).toHaveBeenCalledWith("A");
-    result.current.onTurnFinished(undefined); // onError: same turn, no id
-    // Even in the worst case (the parent has NOT yet reflected activeChatId="A"
-    // and a late refetch lands a new row), the just-failed sibling must NOT be
-    // adopted. Two layers guarantee this: the ref guard keeps the 2nd call from
-    // re-arming at the source, and the render-phase reconciler disarms anything
-    // stale once thread.chatId ("A") diverges from the still-null activeChatId.
-    rerender({
-      activeChatId: null,
-      chats: { items: [{ id: "x" }, { id: "late" }] },
-    });
-    expect(setActiveChatId).not.toHaveBeenCalledWith("late");
-  });
-
-  it("#174 early adopt: onServerChatId adopts the streamed id mid-stream (Copy button available during the first turn)", () => {
-    // Brand-new chat: no id yet. The server streams the real chat id "A" on the
-    // `start` chunk WHILE the first turn is still streaming (before onTurnFinished
-    // fires at the terminal outcome). The hook must adopt it immediately so the
-    // window's activeChatId-gated Copy/export button lights up during the stream.
-    const { result, setActiveChatId } = setup({
-      activeChatId: null,
-      chats: { items: [] },
-    });
-    result.current.onServerChatId("A");
-    expect(setActiveChatId).toHaveBeenCalledWith("A");
-  });
-
-  it("#174 early adopt is in-place: threadKey stays stable (live stream not torn down)", () => {
-    const chats = { items: [] };
-    const { result, rerender } = setup({ activeChatId: null, chats });
-    const keyBefore = result.current.threadKey;
-    result.current.onServerChatId("A");
-    // Parent reflects the adopted id back in; the SAME mount key is kept so the
-    // in-flight useChat store (the streaming turn) is preserved.
-    rerender({ activeChatId: "A", chats });
-    expect(result.current.threadKey).toBe(keyBefore);
-  });
-
-  it("#174 early adopt: no-op for an existing chat and for a missing id", () => {
-    const { result, setActiveChatId } = setup({
-      activeChatId: "chat-1",
-      chats: { items: [{ id: "chat-1" }] },
-    });
-    result.current.onServerChatId("chat-1"); // already has an id
-    result.current.onServerChatId(undefined); // no streamed id
-    expect(setActiveChatId).not.toHaveBeenCalled();
-  });
-
-  it("in-place adopt keeps threadKey stable; an external switch remounts", () => {
-    const chats = { items: [{ id: "B" }] };
-    const { result, rerender } = setup({ activeChatId: null, chats });
-    const keyBefore = result.current.threadKey;
-    // Adopt the streamed id; the PARENT then reflects activeChatId="A" back in.
-    result.current.onTurnFinished("A");
-    rerender({ activeChatId: "A", chats });
-    // In-place adoption: SAME mount key (the live useChat store is preserved).
-    expect(result.current.threadKey).toBe(keyBefore);
-
-    // An EXTERNAL switch (not via adopt) to a different chat must remount: the
-    // key becomes the chat id.
-    rerender({ activeChatId: "C", chats });
-    expect(result.current.threadKey).toBe("C");
-  });
-
-  it("waitingForHistory gates the loader only while opening an unloaded existing chat", () => {
-    // Open an existing chat whose history is still loading => loader on.
-    const { result, rerender } = setup({
-      activeChatId: "chat-1",
-      chats: { items: [{ id: "chat-1" }] },
-      messagesLoading: true,
-    });
-    expect(result.current.waitingForHistory).toBe(true);
-    // Once loading finishes, the latch flips and the loader is off.
-    rerender({
-      activeChatId: "chat-1",
-      chats: { items: [{ id: "chat-1" }] },
-      messagesLoading: false,
-    });
-    expect(result.current.waitingForHistory).toBe(false);
-  });
-});
--- a/apps/client/src/features/ai-chat/hooks/use-chat-session.ts
+++ b/apps/client/src/features/ai-chat/hooks/use-chat-session.ts
@@ -1,268 +0,0 @@
-import { useCallback, useEffect, useReducer, useRef } from "react";
-import { generateId } from "ai";
-import {
-  resolveAdoptedChatId,
-  newlyAddedChatIds,
-} from "@/features/ai-chat/utils/adopt-chat-id.ts";
-import {
-  newThread,
-  switchThread,
-  threadSessionReducer,
-} from "@/features/ai-chat/utils/thread-identity.ts";
-
-/** Inputs to {@link useChatSession}. `activeChatId`/`setActiveChatId` are the
- *  public selection atom (also written from outside the window, e.g. page
- *  history); the rest is read-only context the hook needs. */
-export interface UseChatSessionOptions {
-  activeChatId: string | null;
-  setActiveChatId: (id: string | null) => void;
-  chats: { items?: { id: string }[] } | undefined;
-  messagesLoading: boolean;
-  /** Wraps queryClient.invalidateQueries(AI_CHATS_RQ_KEY). */
-  onInvalidateChatList: () => void;
-  /** Wraps the per-chat messages invalidation. */
-  onInvalidateChatMessages: (chatId: string) => void;
-}
-
-/** What the window needs from a chat session: the ChatThread mount key, the
- *  history-loader gate, and the turn-finished callback. */
-export interface UseChatSessionResult {
-  /** ChatThread mount key (was `thread.key`). */
-  threadKey: string;
-  /** Show the history loader instead of the live thread. */
-  waitingForHistory: boolean;
-  /** Call when a turn finishes; `serverChatId` is the authoritative streamed id
-   *  (undefined on a failed turn). Handles new-chat id adoption + invalidations. */
-  onTurnFinished: (serverChatId?: string) => void;
-  /** Call EARLY (at the stream's `start` chunk) with the authoritative streamed
-   *  chat id so a brand-new chat adopts its real id WHILE its first turn is still
-   *  streaming — making `activeChatId`-gated affordances (e.g. the Copy/export
-   *  button, #174) available immediately. In-place adoption only (same mount key,
-   *  no list/messages invalidation — that is left to onTurnFinished at the end).
-   *  Idempotent and a no-op once the chat already has an id. */
-  onServerChatId: (serverChatId?: string) => void;
-  /** Disarm any pending error-path new-chat fallback. The window calls this from
-   *  startNewChat/selectChat so a late refetch can't yank the user back into a
-   *  just-failed chat after they explicitly moved on. */
-  cancelPendingAdoption: () => void;
-}
-
-/** Project a chat list to its id array (the before/after snapshot for the
- *  error-path fallback). */
-function chatIdSnapshot(
-  chats: { items?: { id: string }[] } | undefined,
-): string[] {
-  return chats?.items?.map((c) => c.id) ?? [];
-}
-
-/**
- * Owns the AI-chat thread-identity lifecycle: the single atomic thread identity,
- * both new-chat id adoption paths (primary streamed-metadata + bounded error-path
- * fallback), the history-loaded latch, and the render-phase reconciler that keeps
- * the thread's mount key in sync with the public `activeChatId` atom.
- *
- * This is the twice-bugged area for the #137 two-tab adoption race; the canonical
- * explanation of the adoption design lives in adopt-chat-id.ts.
- */
-export function useChatSession(
-  params: UseChatSessionOptions,
-): UseChatSessionResult {
-  const {
-    activeChatId,
-    setActiveChatId,
-    chats,
-    messagesLoading,
-    onInvalidateChatList,
-    onInvalidateChatMessages,
-  } = params;
-
-  // Live mirror of `activeChatId`, read by onTurnFinished. ai@6 fires both
-  // onFinish AND onError on a failed turn, so onTurnFinished can run twice in one
-  // turn (once with the streamed id, once without) BEFORE a re-render. Reading
-  // the ref — which the primary-adoption branch updates imperatively — makes that
-  // second call see the just-adopted id, so it cannot re-arm the fallback. (A
-  // plain closure over `activeChatId` would still read null on the second call.)
-  const activeChatIdRef = useRef(activeChatId);
-  activeChatIdRef.current = activeChatId;
-
-  // The mounted thread's identity: ONE atomic value tying ChatThread's mount key
-  // (`thread.key`) to the chat id that mounted thread holds (`thread.chatId`).
-  // Consolidating these makes the "key vs chat id diverged" state unrepresentable
-  // — every change goes through an explicit transition (see thread-identity.ts):
-  // `newThread`/`switchThread` to (re)mount, `adoptThread` for in-place adoption.
-  // Initial: a non-null activeChatId switches to it; a null one gets a fresh
-  // session key with no chat id yet.
-  const [thread, dispatch] = useReducer(threadSessionReducer, undefined, () =>
-    activeChatId === null
-      ? newThread(`new-${generateId()}`)
-      : switchThread(activeChatId),
-  );
-
-  // Error-path fallback for new-chat id adoption. When a brand-new chat's first
-  // turn errors BEFORE the server's `start` chunk, no authoritative chatId ever
-  // reaches the client, so the primary metadata adoption cannot run. We then ARM
-  // this ref with a snapshot of the currently-known chat ids; once the list
-  // refetch lands with the just-created row, the fallback effect below adopts the
-  // SINGLE newly-appeared id. `null` = not armed. See adopt-chat-id.ts (#137).
-  const pendingNewChatRef = useRef<string[] | null>(null);
-
-  // Latch: the chat id whose full persisted history has finished loading while
-  // its thread is mounted. Used so a later BACKGROUND refetch (the post-turn
-  // messages invalidation) never tears the live thread back down to the loader.
-  const historyLoadedKeyRef = useRef<string | null>(null);
-
-  // After a turn finishes, refresh the chat list. For a brand-new chat (no id
-  // yet) we adopt the server's AUTHORITATIVE streamed id (never the newest in the
-  // list, which races a second tab — #137; see adopt-chat-id.ts).
-  const onTurnFinished = useCallback(
-    (serverChatId?: string) => {
-      // Read the live id from the ref, not the closure: on a failed turn this can
-      // run twice in one turn (onFinish + onError) before any re-render, and the
-      // primary branch below updates the ref so the second call sees the adopted id.
-      const current = activeChatIdRef.current;
-      const adopted = resolveAdoptedChatId(current, serverChatId);
-      if (adopted) {
-        // PRIMARY path. In-place adoption: set the public selection and the
-        // thread identity to the real id together. `adopt` keeps the SAME mount
-        // key, so the render-phase reconciler sees `activeChatId === thread.chatId`
-        // and keeps the SAME mounted thread (its useChat already holds the
-        // just-finished turn) instead of remounting + re-seeding from
-        // not-yet-persisted history.
-        activeChatIdRef.current = adopted; // a same-turn 2nd call now sees the id
-        setActiveChatId(adopted);
-        dispatch({ type: "adopt", chatId: adopted });
-        // Primary adoption won — disarm any previously-armed fallback.
-        pendingNewChatRef.current = null;
-      } else if (current === null) {
-        // FALLBACK path: a brand-new chat finished with NO server id (the first
-        // turn errored before the `start` chunk). Arm the bounded list-refetch
-        // fallback by snapshotting the currently-known chat ids. `chats` is still
-        // the pre-refetch list here, so the just-created row is NOT yet in it; the
-        // effect below adopts the single id that newly appears after the refetch.
-        pendingNewChatRef.current = chatIdSnapshot(chats);
-      }
-      onInvalidateChatList();
-      // Re-sync the persisted message rows for the active chat so the Markdown
-      // export and token counters reflect the just-finished turn. The live thread
-      // renders from its own useChat store (stable thread.key), so this never
-      // re-seeds or tears down the open thread. For a brand-new chat `current` is
-      // still null here; later turns hit this with the adopted id.
-      if (current) {
-        onInvalidateChatMessages(current);
-      }
-    },
-    [chats, setActiveChatId, onInvalidateChatList, onInvalidateChatMessages],
-  );
-
-  // EARLY adoption (#174): adopt the authoritative streamed chat id the moment
-  // the server emits it on the `start` chunk, so a brand-new chat gets its real
-  // `activeChatId` WHILE its first turn streams — not only at terminal
-  // onTurnFinished. This makes the activeChatId-gated Copy/export button
-  // available during the first turn. Pure in-place adoption (same mount key, like
-  // the primary path) with NO invalidation: the list/messages refresh stays on
-  // onTurnFinished at the end of the turn. Reads the live id from the ref so a
-  // repeat call after adoption is a no-op (resolveAdoptedChatId only fires for a
-  // still-new chat).
-  const onServerChatId = useCallback(
-    (serverChatId?: string) => {
-      const adopted = resolveAdoptedChatId(
-        activeChatIdRef.current,
-        serverChatId,
-      );
-      if (!adopted) return;
-      activeChatIdRef.current = adopted;
-      setActiveChatId(adopted);
-      dispatch({ type: "adopt", chatId: adopted });
-      // Early adoption beat the error-path fallback to it — disarm.
-      pendingNewChatRef.current = null;
-    },
-    [setActiveChatId],
-  );
-
-  // FALLBACK resolver. Armed only by onTurnFinished when a brand-new chat's first
-  // turn errored before the `start` chunk (no authoritative id streamed). Once
-  // the per-user list refetch lands with the just-created row, adopt the SINGLE
-  // id that newly appeared relative to the pre-refetch snapshot. Adoption is IN
-  // PLACE (set activeChatId + `adopt` together) like the primary path, so the
-  // render-phase reconciler does not remount.
-  useEffect(() => {
-    const before = pendingNewChatRef.current;
-    if (before === null || activeChatId !== null) return; // not armed / already adopted
-    const after = chatIdSnapshot(chats);
-    const added = newlyAddedChatIds(before, after);
-    // Keep waiting until a genuinely-new id appears. Set-based, so it is robust
-    // to an add+delete in the same window (a length compare would miss it), and
-    // it deliberately keeps waiting through an unrelated deletion (no new id yet)
-    // until the just-created row actually lands, rather than giving up early.
-    if (added.size === 0) return; // list not refetched yet — keep waiting
-    pendingNewChatRef.current = null; // resolved — disarm
-    if (added.size === 1) {
-      // single unambiguous new id; >1 = ambiguous → give up
-      const adopted = [...added][0];
-      setActiveChatId(adopted);
-      dispatch({ type: "adopt", chatId: adopted });
-    }
-  }, [chats, activeChatId, setActiveChatId]);
-
-  // Reconcile the thread identity against the active-chat atom during render when
-  // they diverge — the React-sanctioned alternative to an effect (re-renders
-  // before paint, no extra commit, and converges since the next render finds them
-  // equal). This reconciliation MUST remain: `activeChatId` is the public
-  // selection and is ALSO set from OUTSIDE this component (e.g. page-history opens
-  // a referenced chat via setActiveChatId). A divergence here is a genuine SWITCH
-  // (external atom change OR user switch via selectChat/startNewChat), so
-  // `reconcile` remounts + reseeds. In-place adoption never reaches this branch:
-  // it set activeChatId and thread.chatId to the same value.
-  if (activeChatId !== thread.chatId) {
-    // A genuine switch makes any pending error-path new-chat fallback moot.
-    pendingNewChatRef.current = null;
-    dispatch({
-      type: "reconcile",
-      chatId: activeChatId,
-      newKey: `new-${generateId()}`,
-    });
-  }
-
-  // Latch the active chat once its full history has loaded and its thread is
-  // mounted, so a later background refetch (the post-turn messages invalidation,
-  // which can transiently flip hasNextPage for a chat whose message count is an
-  // exact multiple of the server page size) does not tear the live thread down to
-  // a loader and lose its in-progress useChat state.
-  if (
-    activeChatId !== null &&
-    thread.key === activeChatId &&
-    !messagesLoading &&
-    historyLoadedKeyRef.current !== activeChatId
-  ) {
-    historyLoadedKeyRef.current = activeChatId;
-  }
-
-  // Show the history loader only when freshly OPENING an existing chat (the key
-  // equals the chat id) whose history has not been fully loaded yet. For a live
-  // in-place thread that adopted its id, the key is still the "new-…" session
-  // key, so the live thread keeps rendering; and once a chat's history has loaded,
-  // a later background refetch no longer tears it down (see the latch above).
-  const waitingForHistory =
-    activeChatId !== null &&
-    messagesLoading &&
-    thread.key === activeChatId &&
-    historyLoadedKeyRef.current !== activeChatId;
-
-  // Explicit disarm for startNewChat/selectChat. The render-phase reconciler only
-  // disarms when activeChatId actually changes, but "New chat" pressed while the
-  // user is ALREADY in a new chat is a no-op for the atom (activeChatId stays
-  // null), so the reconciler never fires — without this an armed fallback could
-  // adopt the just-failed chat from a late refetch and yank the user out of their
-  // fresh chat. Stable identity (writes a ref).
-  const cancelPendingAdoption = useCallback(() => {
-    pendingNewChatRef.current = null;
-  }, []);
-
-  return {
-    threadKey: thread.key,
-    waitingForHistory,
-    onTurnFinished,
-    onServerChatId,
-    cancelPendingAdoption,
-  };
-}
--- a/apps/client/src/features/ai-chat/queries/ai-chat-query.ts
+++ b/apps/client/src/features/ai-chat/queries/ai-chat-query.ts
@@ -4,7 +4,7 @@ import {
  useQuery,
  useQueryClient,
 } from "@tanstack/react-query";
-import { useEffect, useMemo } from "react";
+import { useMemo } from "react";
 import { useTranslation } from "react-i18next";
 import { notifications } from "@mantine/notifications";
 import {
@@ -75,31 +75,6 @@ export function useAiChatMessagesQuery(chatId: string | undefined) {
    enabled: !!chatId,
  });

-  // useInfiniteQuery only fetches the first page on its own. The hook's contract
-  // (and both the Markdown export and the model-history seed) require the
-  // COMPLETE thread, so keep pulling subsequent pages until the server reports
-  // none remain. The isFetchingNextPage guard issues one request at a time;
-  // when chatId is undefined the query is disabled and hasNextPage is false, so
-  // this is a no-op. The isFetchNextPageError guard is critical: the app sets a
-  // global `retry: false`, so a rejected fetchNextPage leaves hasNextPage true
-  // and isFetchingNextPage false — without this guard the effect would re-fire
-  // immediately and hammer the endpoint in a tight loop. isFetchNextPageError
-  // latches the last next-page failure and clears once a fetch succeeds.
-  useEffect(() => {
-    if (
-      query.hasNextPage &&
-      !query.isFetchingNextPage &&
-      !query.isFetchNextPageError
-    ) {
-      void query.fetchNextPage();
-    }
-  }, [
-    query.hasNextPage,
-    query.isFetchingNextPage,
-    query.isFetchNextPageError,
-    query.fetchNextPage,
-  ]);
-
  const data = useMemo<IAiChatMessageRow[] | undefined>(() => {
    if (!query.data) return undefined;
    return query.data.pages.flatMap((p) => p.items);
--- a/apps/client/src/features/ai-chat/services/ai-chat-service.ts
+++ b/apps/client/src/features/ai-chat/services/ai-chat-service.ts
@@ -50,24 +50,6 @@ export async function deleteAiChat(chatId: string): Promise<void> {
  await api.post("/ai-chat/delete", { chatId });
 }

-/**
- * Export a chat to Markdown (#183). The server renders the transcript from the
- * persisted rows (the DB is the single source of truth — including an
- * interrupted turn's in-progress row, persisted upfront + per step), so the
- * client just copies the returned string. `lang` localizes the few fixed
- * role/tool labels; defaults to English server-side when omitted.
- */
-export async function exportAiChat(
-  chatId: string,
-  lang?: string,
-): Promise<string> {
-  const req = await api.post<{ markdown: string }>("/ai-chat/export", {
-    chatId,
-    lang,
-  });
-  return req.data.markdown;
-}
-
 /**
 * Agent roles API (`/ai-chat/roles`). `list` is available to any workspace
 * member (for the chat-creation picker); create/update/delete are admin-only
@@ -94,8 +76,6 @@ export async function updateAiRole(data: IAiRoleUpdate): Promise<IAiRole> {

 /** Soft-delete a role (admin). */
 export async function deleteAiRole(id: string): Promise<{ success: true }> {
-  const req = await api.post<{ success: true }>("/ai-chat/roles/delete", {
-    id,
-  });
+  const req = await api.post<{ success: true }>("/ai-chat/roles/delete", { id });
  return req.data;
 }
--- a/apps/client/src/features/ai-chat/types/ai-chat.types.ts
+++ b/apps/client/src/features/ai-chat/types/ai-chat.types.ts
@@ -19,12 +19,6 @@ export interface IAiChat {
  // Null when the chat has no role or the role was soft-deleted.
  roleName?: string | null;
  roleEmoji?: string | null;
-  // The document the chat was created in (ai_chats.page_id). Null when started
-  // outside any document.
-  pageId?: string | null;
-  // Denormalized via a JOIN in the chat list response: the origin page's title.
-  // Null when there is no origin page (or it was hard-deleted).
-  pageTitle?: string | null;
 }

 /** Supported model drivers (mirrors the server `AI_DRIVERS`). */
@@ -53,10 +47,6 @@ export interface IAiRole {
  instructions?: string;
  modelConfig?: IAiRoleModelConfig | null;
  enabled: boolean;
-  // Whether picking the role auto-sends a launch message and starts the chat.
-  autoStart: boolean;
-  // Custom auto-start text; null/empty => the default launch message is sent.
-  launchMessage: string | null;
  createdAt?: string;
  updatedAt?: string;
 }
@@ -69,8 +59,6 @@ export interface IAiRoleCreate {
  instructions: string;
  modelConfig?: IAiRoleModelConfig | null;
  enabled?: boolean;
-  autoStart?: boolean;
-  launchMessage?: string;
 }

 /** Admin update payload for a role (partial). */
@@ -82,8 +70,6 @@ export interface IAiRoleUpdate {
  instructions?: string;
  modelConfig?: IAiRoleModelConfig | null;
  enabled?: boolean;
-  autoStart?: boolean;
-  launchMessage?: string;
 }

 /**
@@ -106,10 +92,6 @@ export interface IAiChatMessageRow {
      inputTokens?: number;
      outputTokens?: number;
      totalTokens?: number;
-      // Reasoning (thinking) tokens, when the provider reports them. Optional so
-      // old history rows (recorded before this shipped) stay valid. Included in
-      // `outputTokens` per the AI SDK usage shape.
-      reasoningTokens?: number;
    };
    // Current context size for the turn = final-step (input+output) tokens, i.e.
    // how much the conversation occupies in the model's context window after this
@@ -119,11 +101,6 @@ export interface IAiChatMessageRow {
    // Set on an assistant row whose turn ended in a provider/stream error; the
    // raw provider error text (e.g. "402: ...") for inline display in the thread.
    error?: string;
-    // Terminal outcome of the assistant turn: 'error' (provider/stream error,
-    // paired with `error`), 'aborted' (client disconnect — a manual Stop or a
-    // dropped connection), or the SDK's finish reason on a clean turn. The UI
-    // renders a "stopped" marker on interrupted turns.
-    finishReason?: string;
  } | null;
  createdAt: string;
 }
--- a/apps/client/src/features/ai-chat/utils/adopt-chat-id.test.ts
+++ b/apps/client/src/features/ai-chat/utils/adopt-chat-id.test.ts
@@ -1,72 +0,0 @@
-import { describe, it, expect } from "vitest";
-import {
-  resolveAdoptedChatId,
-  newlyAddedChatIds,
-  extractServerChatId,
-} from "./adopt-chat-id";
-
-describe("resolveAdoptedChatId", () => {
-  it("adopts the server id for a brand-new chat (activeChatId null + id)", () => {
-    expect(resolveAdoptedChatId(null, "chat-1")).toBe("chat-1");
-  });
-
-  it("returns null for an existing chat even with a server id", () => {
-    expect(resolveAdoptedChatId("chat-existing", "chat-1")).toBeNull();
-  });
-
-  it("returns null for a new chat with no server id", () => {
-    expect(resolveAdoptedChatId(null, undefined)).toBeNull();
-    expect(resolveAdoptedChatId(null, null)).toBeNull();
-  });
-});
-
-describe("newlyAddedChatIds", () => {
-  it("returns the single new id", () => {
-    expect([...newlyAddedChatIds(["a", "b"], ["a", "b", "c"])]).toEqual(["c"]);
-  });
-
-  it("returns an empty set when nothing was added", () => {
-    expect(newlyAddedChatIds(["a", "b"], ["b", "a"]).size).toBe(0);
-  });
-
-  it("returns both new ids when two were added", () => {
-    expect(newlyAddedChatIds(["a"], ["a", "b", "c"])).toEqual(
-      new Set(["b", "c"]),
-    );
-  });
-
-  it("keeps only the new id across an add+delete in the same window", () => {
-    // before [a,b] -> after [b,new]: a was deleted, new was added.
-    expect([...newlyAddedChatIds(["a", "b"], ["b", "new"])]).toEqual(["new"]);
-  });
-
-  it("dedupes a repeated new id to a single entry", () => {
-    expect(newlyAddedChatIds(["a"], ["a", "new", "new"])).toEqual(
-      new Set(["new"]),
-    );
-  });
-});
-
-describe("extractServerChatId", () => {
-  it("returns the chatId when present on metadata", () => {
-    expect(extractServerChatId({ metadata: { chatId: "chat-1" } })).toBe(
-      "chat-1",
-    );
-  });
-
-  it("returns undefined when the message has no metadata", () => {
-    expect(extractServerChatId({})).toBeUndefined();
-  });
-
-  it("returns undefined when metadata lacks chatId", () => {
-    expect(extractServerChatId({ metadata: { other: 1 } })).toBeUndefined();
-  });
-
-  it("returns undefined for a non-string chatId", () => {
-    expect(extractServerChatId({ metadata: { chatId: 42 } })).toBeUndefined();
-  });
-
-  it("returns undefined for an undefined message", () => {
-    expect(extractServerChatId(undefined)).toBeUndefined();
-  });
-});
--- a/apps/client/src/features/ai-chat/utils/adopt-chat-id.ts
+++ b/apps/client/src/features/ai-chat/utils/adopt-chat-id.ts
@@ -1,70 +0,0 @@
-/**
- * Pure helpers for adopting a brand-new chat's authoritative server id.
- *
- * ============================ CANONICAL #137 NOTE ============================
- * This docblock is the single authoritative explanation of the new-chat id
- * adoption design and the #137 two-tab race it fixes. Other call sites
- * (use-chat-session.ts, the server's `chatStreamMetadata`) reference here
- * rather than restating it.
- *
- * When a user sends the first turn of a BRAND-NEW chat, the client has no chat
- * id yet (`activeChatId === null`). The server creates the row and the client
- * must "adopt" that row's real id so the SECOND turn targets the same chat.
- *
- * The OLD heuristic adopted `items[0]` — the newest chat in the refetched list.
- * That races a second tab: if another tab created a chat in the same moment,
- * its row could be `items[0]`, so this tab would adopt the SIBLING chat and
- * leak its later turns into it (#137). We adopt by IDENTITY instead, two ways:
- *
- * PRIMARY path: the server streams the real chat id on the assistant message
- * metadata's `start` part (see `chatStreamMetadata` server-side);
- * `extractServerChatId` reads it off the finished message and
- * `resolveAdoptedChatId` turns it into the id to adopt for a new chat. This is
- * authoritative and immune to the race.
- *
- * FALLBACK path (only when a new chat's first turn errors BEFORE the `start`
- * chunk, so no metadata id ever reached the client): adopt the single chat that
- * NEWLY appeared in the per-user list relative to a pre-refetch snapshot —
- * `newlyAddedChatIds` (the fallback effect adopts only when exactly one id is
- * new). This is unambiguous and does not race a second tab the way the old
- * "newest chat in the list" guess did.
- * ============================================================================
- */
-
-/**
- * Resolve the id to adopt from the server-streamed metadata. Returns
- * `serverChatId` only for a brand-new chat (`activeChatId === null`) that
- * received a truthy id; otherwise null (existing chat, or no id streamed).
- */
-export function resolveAdoptedChatId(
-  activeChatId: string | null,
-  serverChatId: string | null | undefined,
-): string | null {
-  return activeChatId === null && serverChatId ? serverChatId : null;
-}
-
-/**
- * Read the authoritative server chat id off a finished assistant message. The
- * server attaches it as `message.metadata.chatId` on the `start` part (see
- * `chatStreamMetadata`). Returns it only when it is a string; undefined for
- * a missing message, missing metadata, or a non-string `chatId`.
- */
-export function extractServerChatId(
-  message: { metadata?: unknown } | undefined,
-): string | undefined {
-  const m = message?.metadata as { chatId?: string } | undefined;
-  return typeof m?.chatId === "string" ? m.chatId : undefined;
-}
-
-/**
- * The deduped set of ids present in `afterIds` but not in `beforeIds`. A
- * paginated/flatMapped list can repeat the same id, so dedupe: one genuinely-new
- * chat must not read as multiple from a duplicate.
- */
-export function newlyAddedChatIds(
-  beforeIds: readonly string[],
-  afterIds: readonly string[],
-): Set<string> {
-  const before = new Set(beforeIds);
-  return new Set(afterIds.filter((id) => !before.has(id)));
-}
--- a/apps/client/src/features/ai-chat/utils/chat-markdown.ts
+++ b/apps/client/src/features/ai-chat/utils/chat-markdown.ts
@@ -0,0 +1,165 @@
+/**
+ * Client-only Markdown builder for an AI agent chat. Serializes the already
+ * persisted message rows (loaded via `useAiChatMessagesQuery`) into a single
+ * Markdown string suitable for copying to the clipboard. NO network call is
+ * made and NO server/DB code is touched — this reuses the rich "request
+ * internals" (tool calls with input/output, per-message token usage,
+ * finish/error info) that the chat already holds client-side.
+ *
+ * Only the fallback title, role labels and tool action labels are localized via
+ * the passed-in `t` translator; the structural words (Input/Output/Error/...)
+ * are plain English constants because the output is a technical artifact.
+ */
+
+import type { IAiChatMessageRow } from "@/features/ai-chat/types/ai-chat.types.ts";
+import {
+  ToolUiPart,
+  getToolName,
+  toolRunState,
+  toolLabelKey,
+} from "@/features/ai-chat/utils/tool-parts.tsx";
+
+// Minimal translator signature compatible with react-i18next's `t`.
+type Translate = (key: string, values?: Record<string, unknown>) => string;
+
+interface BuildChatMarkdownArgs {
+  title: string | null;
+  chatId: string;
+  rows: IAiChatMessageRow[];
+  t: Translate;
+}
+
+/** A single AI SDK UIMessage part (text part or other). */
+interface TextLikePart {
+  type: string;
+  text?: string;
+}
+
+/**
+ * Stringify an arbitrary tool input/output value for a fenced block. Strings
+ * pass through as-is; everything else is pretty-printed JSON, falling back to
+ * `String(value)` if serialization throws (circular) or yields `undefined`.
+ */
+function stringify(value: unknown): string {
+  if (typeof value === "string") return value;
+  try {
+    return JSON.stringify(value, null, 2) ?? String(value);
+  } catch {
+    return String(value);
+  }
+}
+
+/**
+ * Wrap `code` in a fenced code block whose backtick delimiter is LONGER than
+ * the longest backtick run inside the content, so embedded backticks (or even
+ * a literal ``` fence) never break out of the block. Minimum 3 backticks.
+ */
+function fence(code: string, lang = ""): string {
+  const runs: string[] = code.match(/`+/g) ?? [];
+  const longest = runs.reduce((m, s) => Math.max(m, s.length), 0);
+  const delim = "`".repeat(Math.max(3, longest + 1));
+  return `${delim}${lang}\n${code}\n${delim}`;
+}
+
+/** Per-row token count, mirroring the header sum in ai-chat-window.tsx. */
+function rowTokens(usage: {
+  inputTokens?: number;
+  outputTokens?: number;
+  totalTokens?: number;
+}): number {
+  return (
+    usage.totalTokens ?? (usage.inputTokens ?? 0) + (usage.outputTokens ?? 0)
+  );
+}
+
+/**
+ * Serialize a chat to a Markdown string. Pure (apart from `new Date()` for the
+ * export timestamp), so it is straightforward to unit-test.
+ */
+export function buildChatMarkdown(args: BuildChatMarkdownArgs): string {
+  const { title, chatId, rows, t } = args;
+  const blocks: string[] = [];
+
+  const heading = (title ?? "").trim() || t("Untitled chat");
+  blocks.push(`# ${heading}`);
+
+  // Metadata bullet list. Total tokens is shown only when the sum is positive.
+  const totalTokens = rows.reduce((sum, row) => {
+    const usage = row.metadata?.usage;
+    return usage ? sum + rowTokens(usage) : sum;
+  }, 0);
+  const meta = [
+    `- Chat ID: \`${chatId}\``,
+    `- Exported: ${new Date().toISOString()}`,
+    `- Messages: ${rows.length}`,
+  ];
+  if (totalTokens > 0) meta.push(`- Total tokens: ${totalTokens}`);
+  blocks.push(meta.join("\n"));
+
+  rows.forEach((row, index) => {
+    blocks.push("---");
+
+    const roleLabel = row.role === "assistant" ? t("AI agent") : t("You");
+    blocks.push(`## ${index + 1}. ${roleLabel}`);
+
+    // Created-at kept in source as an HTML comment (out of the rendered prose).
+    blocks.push(`<!-- ${row.createdAt} -->`);
+
+    // Resolve parts: prefer the rich persisted parts, else a single text part
+    // built from the plain-text content (mirrors `rowToUiMessage`).
+    const parts: TextLikePart[] =
+      Array.isArray(row.metadata?.parts) && row.metadata.parts.length > 0
+        ? (row.metadata.parts as TextLikePart[])
+        : [{ type: "text", text: row.content ?? "" }];
+
+    for (const part of parts) {
+      if (part.type === "text") {
+        const text = (part.text ?? "").trim();
+        // Skip empty/whitespace-only text parts (matches MessageItem).
+        if (text.length > 0) blocks.push(text);
+        continue;
+      }
+
+      const isToolPart =
+        part.type.startsWith("tool-") || part.type === "dynamic-tool";
+      if (!isToolPart) continue;
+
+      const tp = part as unknown as ToolUiPart;
+      const name = getToolName(tp);
+      const { key, values } = toolLabelKey(name);
+      const label = t(key, values);
+      const state = toolRunState(tp.state);
+
+      const toolLines: string[] = [
+        `**Tool: ${label}** (\`${name}\`) — ${state}`,
+      ];
+      if (tp.input !== undefined) {
+        toolLines.push("Input:");
+        toolLines.push(fence(stringify(tp.input), "json"));
+      }
+      if (tp.output !== undefined) {
+        toolLines.push("Output:");
+        toolLines.push(fence(stringify(tp.output), "json"));
+      }
+      if (tp.errorText) {
+        toolLines.push(`**Error:** ${tp.errorText}`);
+      }
+      blocks.push(toolLines.join("\n\n"));
+    }
+
+    if (row.metadata?.error) {
+      blocks.push(`**⚠️ Error:** ${row.metadata.error}`);
+    }
+
+    const usage = row.metadata?.usage;
+    if (usage) {
+      const total = rowTokens(usage);
+      blocks.push(
+        `_Tokens — in: ${usage.inputTokens ?? "?"}, out: ${usage.outputTokens ?? "?"}, total: ${total}_`,
+      );
+    }
+  });
+
+  // Blank line between blocks so the Markdown renders cleanly.
+  return blocks.join("\n\n");
+}
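Reviewer note on the `fence` helper added in chat-markdown.ts above: the delimiter-length rule can be checked in isolation. The sketch below is a standalone illustration, not the file's code. The names `longestBacktickRun` and `fenceBlock` are hypothetical, and it assumes the same rule the helper documents (a delimiter one backtick longer than the longest run in the content, with a minimum of three).

```typescript
// Hypothetical standalone copy of the fencing rule; names are illustrative only.

/** Length of the longest consecutive backtick run in `text` (0 if none). */
function longestBacktickRun(text: string): number {
  const runs: string[] = text.match(/`+/g) ?? [];
  return runs.reduce((max, run) => Math.max(max, run.length), 0);
}

/** Wrap `code` in a fence that embedded backticks cannot close early. */
function fenceBlock(code: string, lang = ""): string {
  const delim = "`".repeat(Math.max(3, longestBacktickRun(code) + 1));
  return `${delim}${lang}\n${code}\n${delim}`;
}

// Plain content gets the minimum three-backtick fence.
const plain = fenceBlock('{"a": 1}', "json");

// Content that itself contains a three-backtick fence gets a four-backtick one.
const nested = fenceBlock("```\ninner\n```");

console.log(plain.split("\n")[0]);
console.log(nested.split("\n")[0]);
```

A unit test for the real helper could assert the same two cases (minimum fence, and a fence longer than an embedded one), since tool outputs are the most likely place for embedded Markdown fences to appear.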
--- a/apps/client/src/features/ai-chat/utils/collapse-blank-lines.test.ts
+++ b/apps/client/src/features/ai-chat/utils/collapse-blank-lines.test.ts
@@ -1,61 +0,0 @@
-import { describe, it, expect } from "vitest";
-import { collapseBlankLines } from "@/features/ai-chat/utils/collapse-blank-lines.ts";
-import { renderChatMarkdown } from "@/features/ai-chat/utils/markdown.ts";
-
-describe("collapseBlankLines", () => {
-  it("collapses a run of 2+ newlines to a single newline", () => {
-    expect(collapseBlankLines("a\n\nb")).toBe("a\nb");
-    expect(collapseBlankLines("a\n\n\n\nb")).toBe("a\nb");
-  });
-
-  it("keeps single newlines untouched", () => {
-    expect(collapseBlankLines("a\nb\nc")).toBe("a\nb\nc");
-  });
-
-  it("preserves blank lines INSIDE a fenced code block", () => {
-    const src = "a\n\n\nb\n\n```\nx\n\n\ny\n```\n\nc";
-    // Prose blanks collapse; the blank lines between the ``` fences survive.
-    expect(collapseBlankLines(src)).toBe("a\nb\n```\nx\n\n\ny\n```\nc");
-  });
-
-  it("handles a tilde fence and preserves its interior blanks", () => {
-    const src = "p\n\n~~~\ncode\n\nmore\n~~~\n\nq";
-    expect(collapseBlankLines(src)).toBe("p\n~~~\ncode\n\nmore\n~~~\nq");
-  });
-
-  it("leaves an unclosed fence's remaining lines verbatim", () => {
-    const src = "intro\n\n```\nstill\n\nopen";
-    expect(collapseBlankLines(src)).toBe("intro\n```\nstill\n\nopen");
-  });
-
-  it("is a no-op for text with no blank lines", () => {
-    expect(collapseBlankLines("just one line")).toBe("just one line");
-  });
-});
-
-describe("collapseBlankLines + renderChatMarkdown (tight reasoning rendering)", () => {
-  it("renders a blank-line-separated list as a TIGHT list (no <li><p>)", () => {
-    const loose =
-      "Intro paragraph.\n\n- item one\n\n- item two\n\n- item three";
-    const html = renderChatMarkdown(collapseBlankLines(loose), {});
-    // Tight list: each <li> holds the text directly, not wrapped in a <p>.
-    expect(html).toContain("<li>item one</li>");
-    expect(html).not.toContain("<li><p>");
-    // The list still parses as a list after the paragraph (not a paragraph+<br>).
-    expect(html).toContain("<ul>");
-    expect(html).toContain("<p>Intro paragraph.</p>");
-  });
-
-  it("renders an ordered list (1. 2.) as tight after collapsing", () => {
-    const loose = "Intro.\n\n1. first\n\n2. second";
-    const html = renderChatMarkdown(collapseBlankLines(loose), {});
-    expect(html).toContain("<ol>");
-    expect(html).toContain("<li>first</li>");
-    expect(html).not.toContain("<li><p>");
-  });
-
-  it("the loose source WOULD render <li><p> without collapsing (control)", () => {
-    const loose = "- a\n\n- b";
-    expect(renderChatMarkdown(loose, {})).toContain("<li><p>");
-  });
-});
--- a/apps/client/src/features/ai-chat/utils/collapse-blank-lines.ts
+++ b/apps/client/src/features/ai-chat/utils/collapse-blank-lines.ts
@@ -1,56 +0,0 @@
-// Pure helper for compact reasoning ("Thinking") rendering. Kept free of React
-// so it can be unit-tested in isolation (see collapse-blank-lines.test.ts).
-
-/**
- * Collapse runs of 2+ newlines down to a single newline, EXCEPT inside fenced
- * code blocks (``` ... ``` or ~~~ ... ~~~), where blank lines are significant.
- *
- * Why: reasoning models emit thinking with a blank line (`\n\n`) between every
- * list item and paragraph. `marked` turns those into "loose" lists (each `<li>`
- * wrapped in a `<p>`) and separate `<p>` paragraphs, each carrying a vertical
- * margin — so the "Thinking" block renders with large, airy gaps. Removing the
- * blank-line gaps yields tight lists (no `<li><p>`) and joined paragraphs. The
- * chat markdown renderer runs with `breaks: true`, so a single `\n` still
- * becomes a `<br>` — line breaks inside the reasoning are preserved; only the
- * empty gaps between blocks disappear. Apply ONLY to reasoning text, never to a
- * normal assistant answer (where paragraph spacing is intentional).
- *
- * Fenced code is preserved verbatim: a fence opens on a line whose first
- * non-space characters are ``` or ~~~ and closes on the next line that starts
- * with the same fence character. Blank lines between fences (significant for
- * code formatting) are never collapsed.
- */
-export function collapseBlankLines(text: string): string {
-  const lines = text.split("\n");
-  const out: string[] = [];
-  let inFence = false;
-  let fenceChar = "";
-
-  for (const line of lines) {
-    const fenceMatch = line.match(/^\s*(`{3,}|~{3,})/);
-    if (fenceMatch) {
-      const ch = fenceMatch[1][0];
-      if (!inFence) {
-        inFence = true;
-        fenceChar = ch;
-      } else if (ch === fenceChar) {
-        inFence = false;
-      }
-      out.push(line);
-      continue;
-    }
-
-    // Inside a fenced block every line (including blanks) is significant.
-    if (inFence) {
-      out.push(line);
-      continue;
-    }
-
-    // Outside fences: drop blank lines so a `\n\n+` gap collapses to a single
-    // `\n` between the surrounding content lines.
-    if (line.trim() === "") continue;
-    out.push(line);
-  }
-
-  return out.join("\n");
-}
--- a/apps/client/src/features/ai-chat/utils/count-stream-tokens.test.ts
+++ b/apps/client/src/features/ai-chat/utils/count-stream-tokens.test.ts
@@ -1,171 +0,0 @@
-import { describe, expect, it } from "vitest";
-import type { UIMessage } from "@ai-sdk/react";
-import {
-  estimateTokens,
-  liveTurnTokens,
-} from "@/features/ai-chat/utils/count-stream-tokens.ts";
-
-const msg = (parts: unknown[], metadata?: unknown): UIMessage =>
-  ({
-    id: Math.random().toString(),
-    role: "assistant",
-    parts,
-    metadata,
-  }) as UIMessage;
-
-describe("estimateTokens", () => {
-  it("returns 0 for the empty string", () => {
-    expect(estimateTokens("")).toBe(0);
-  });
-
-  it("ceils chars/4 so any non-empty text is at least 1 token", () => {
-    expect(estimateTokens("a")).toBe(1);
-    expect(estimateTokens("abcd")).toBe(1);
-    expect(estimateTokens("abcde")).toBe(2);
-    expect(estimateTokens("12345678")).toBe(2);
-  });
-});
-
-describe("liveTurnTokens — estimate path", () => {
-  it("is all zeros for an undefined message", () => {
-    expect(liveTurnTokens(undefined)).toEqual({
-      reasoning: 0,
-      output: 0,
-      authoritative: false,
-    });
-  });
-
-  it("is all zeros for a parts-less message", () => {
-    expect(liveTurnTokens({ id: "x", role: "assistant" } as UIMessage)).toEqual({
-      reasoning: 0,
-      output: 0,
-      authoritative: false,
-    });
-  });
-
-  it("estimates output from text parts", () => {
-    // 8 chars -> 2 tokens.
-    const r = liveTurnTokens(msg([{ type: "text", text: "12345678" }]));
-    expect(r).toEqual({ reasoning: 0, output: 2, authoritative: false });
-  });
-
-  it("estimates reasoning from reasoning parts (kept separate from output)", () => {
-    const r = liveTurnTokens(
-      msg([
-        { type: "reasoning", text: "12345678" },
-        { type: "text", text: "abcd" },
-      ]),
-    );
-    expect(r).toEqual({ reasoning: 2, output: 1, authoritative: false });
-  });
-
-  it("accumulates across multiple text + reasoning parts (multi-step)", () => {
-    const r = liveTurnTokens(
-      msg([
-        { type: "reasoning", text: "abcd" }, // 1
-        { type: "text", text: "abcd" }, // 1
-        { type: "tool-getPage", state: "output-available" }, // ignored
-        { type: "reasoning", text: "abcd" }, // 1
-        { type: "text", text: "abcdefgh" }, // 2
-      ]),
-    );
-    expect(r).toEqual({ reasoning: 2, output: 3, authoritative: false });
-  });
-
-  it("ignores non text/reasoning parts (tools, step-start)", () => {
-    const r = liveTurnTokens(
-      msg([
-        { type: "step-start" },
-        { type: "tool-getPage", state: "input-available" },
-      ]),
-    );
-    expect(r).toEqual({ reasoning: 0, output: 0, authoritative: false });
-  });
-});
-
-describe("liveTurnTokens — authoritative path", () => {
-  it("returns authoritative usage verbatim, splitting reasoning out of output", () => {
-    // outputTokens INCLUDES reasoning in the AI SDK shape -> answer = 100 - 30.
-    const r = liveTurnTokens(
-      msg([{ type: "text", text: "estimate would be tiny" }], {
-        usage: { inputTokens: 500, outputTokens: 100, reasoningTokens: 30 },
-      }),
-    );
-    expect(r).toEqual({ reasoning: 30, output: 70, authoritative: true });
-  });
-
-  it("treats missing reasoningTokens as 0 and keeps full output", () => {
-    const r = liveTurnTokens(
-      msg([{ type: "text", text: "x" }], {
-        usage: { inputTokens: 10, outputTokens: 42 },
-      }),
-    );
-    expect(r).toEqual({ reasoning: 0, output: 42, authoritative: true });
-  });
-
-  it("never returns a negative output when reasoning exceeds reported output", () => {
-    const r = liveTurnTokens(
-      msg([], { usage: { outputTokens: 10, reasoningTokens: 40 } }),
-    );
-    expect(r).toEqual({ reasoning: 40, output: 0, authoritative: true });
-  });
-
-  it("falls back to the estimate when metadata has no usage object", () => {
-    const r = liveTurnTokens(
-      msg([{ type: "text", text: "abcd" }], { chatId: "c1" }),
-    );
-    expect(r).toEqual({ reasoning: 0, output: 1, authoritative: false });
-  });
-});
-
-describe("liveTurnTokens — combined authoritative + estimate (#163)", () => {
-  it("ticks the in-flight step above the completed-steps authoritative base", () => {
-    // The authoritative usage is the sum over COMPLETED steps (step 1). The
-    // CURRENT step is streaming and its text is NOT in `usage` yet, but it IS in
-    // the parts -> the running estimate must push the live figure above the base
-    // so the badge keeps growing between step boundaries.
-    const longText = "x".repeat(800); // 800 chars -> 200 est output tokens
-    const r = liveTurnTokens(
-      msg([{ type: "text", text: longText }], {
-        usage: { inputTokens: 500, outputTokens: 40 }, // step-1 base: 40 output
-      }),
-    );
-    // max(authOutput=40, estOutput=200) = 200 -> the counter ticks, not frozen.
-    expect(r.output).toBe(200);
-    expect(r.authoritative).toBe(true);
-  });
-
-  it("ticks reasoning of the in-flight step above the authoritative reasoning base", () => {
-    const longReasoning = "r".repeat(400); // 400 chars -> 100 est reasoning
-    const r = liveTurnTokens(
-      msg([{ type: "reasoning", text: longReasoning }], {
-        usage: { inputTokens: 100, outputTokens: 20, reasoningTokens: 20 },
-      }),
-    );
-    // reasoning: max(20, 100) = 100 ; output: max(max(0,20-20)=0, 0) = 0.
-    expect(r.reasoning).toBe(100);
-    expect(r.output).toBe(0);
-    expect(r.authoritative).toBe(true);
-  });
-
-  it("snaps to the authoritative figure once it exceeds the rough estimate", () => {
-    // Short on-screen text (estimate tiny) but a large authoritative output:
-    // the exact figure wins at the boundary (the counter never under-reports).
-    const r = liveTurnTokens(
-      msg([{ type: "text", text: "abcd" }], {
-        usage: { inputTokens: 10, outputTokens: 5000 },
-      }),
-    );
-    expect(r.output).toBe(5000);
-  });
-
-  it("is monotonic: max never drops below the authoritative base when the estimate is smaller", () => {
-    // Mirrors the legacy 'verbatim' tests: estimate < authoritative -> unchanged.
-    const r = liveTurnTokens(
-      msg([{ type: "text", text: "tiny" }], {
-        usage: { inputTokens: 500, outputTokens: 100, reasoningTokens: 30 },
-      }),
-    );
-    expect(r).toEqual({ reasoning: 30, output: 70, authoritative: true });
-  });
-});
--- a/apps/client/src/features/ai-chat/utils/count-stream-tokens.ts
+++ b/apps/client/src/features/ai-chat/utils/count-stream-tokens.ts
@@ -1,113 +0,0 @@
-import type { UIMessage } from "@ai-sdk/react";
-
-/**
- * Live token counting for a streaming AI-chat turn — split into REASONING
- * (thinking) and OUTPUT (answer) tokens, mirroring how Claude Code shows
- * `Thinking… · 60 tokens` next to its thinking indicator.
- *
- * No provider streams exact per-token usage mid-stream, so the live number is a
- * CLIENT ESTIMATE (chars/≈4 heuristic) that is reconciled to AUTHORITATIVE usage
- * once the server attaches it on a step/turn boundary (see the server's
- * `chatStreamMetadata` + the client's read of `message.metadata.usage`). When
- * authoritative usage is present we return it verbatim (the number "jumps to
- * exact"); otherwise we return the running estimate. Pure + unit-testable: it
- * never runs a real BPE tokenizer (that would be O(n²) on the hot path, bloat the
- * bundle, and be wrong for Gemini/Ollama anyway).
- */
-
-/**
- * Rough token estimate for a piece of text using the standard chars/≈4 heuristic.
- * Returns 0 for empty/whitespace-free-of-content input, and ceils so any
- * non-empty text counts as at least one token.
- */
-export function estimateTokens(text: string): number {
-  if (!text) return 0;
-  return Math.ceil(text.length / 4);
-}
-
-/** Authoritative per-step/turn usage the server attaches to message metadata. */
-export interface AuthoritativeUsage {
-  inputTokens?: number;
-  outputTokens?: number;
-  totalTokens?: number;
-  reasoningTokens?: number;
-}
-
-/** Live token split for a turn's tail (streaming) assistant message. */
-export interface LiveTurnTokens {
-  /** Thinking/reasoning tokens (estimate, or authoritative when available). */
-  reasoning: number;
-  /** Answer/output tokens (estimate, or authoritative when available). */
-  output: number;
-  /** True when the numbers come from authoritative server usage, not estimate. */
-  authoritative: boolean;
-}
-
-/** Read the authoritative usage off a UIMessage's metadata, if the server set it. */
-function metadataUsage(message: UIMessage): AuthoritativeUsage | undefined {
-  const meta = message?.metadata as
-    | { usage?: AuthoritativeUsage }
-    | undefined;
-  const usage = meta?.usage;
-  if (!usage || typeof usage !== "object") return undefined;
-  return usage;
-}
-
-/**
- * Token split for the given (streaming) assistant message.
- *
- * COMBINES the authoritative server usage with the running text estimate so the
- * counter ticks in real time AND lands exact. The server only attaches
- * `metadata.usage` at a step/turn boundary (`finish-step`/`finish`) and it is
- * CUMULATIVE over COMPLETED steps — it does NOT yet include the in-flight step.
- * So a multi-step turn that returned the authoritative figure verbatim would
- * FREEZE between boundaries and jump in steps (issue #163).
- *
- * Instead we always compute the running ESTIMATE (chars/≈4 over the message's
- * `reasoning`/`text` parts, which grows on every streamed delta) and take the
- * per-component MAX of the authoritative base and the estimate:
- *   - between boundaries the estimate of the in-flight step ticks the number up;
- *   - at a boundary the authoritative figure snaps it to exact;
- *   - because the server's usage is cumulative and we only ever take the max, the
- *     number is MONOTONIC — it never drops.
- *
- * Providers that don't stream reasoning text still surface a reasoning count once
- * the authoritative usage arrives (`max(reasoningTokens, 0)`); on the pure
- * estimate path (no usage yet) such a turn shows `reasoning: 0` until then.
- */
-export function liveTurnTokens(message: UIMessage | undefined): LiveTurnTokens {
-  if (!message) return { reasoning: 0, output: 0, authoritative: false };
-
-  // Running ESTIMATE over every reasoning/text part — grows on each delta. This
-  // includes the IN-FLIGHT step, which the authoritative usage does not cover yet.
-  let estReasoning = 0;
-  let estOutput = 0;
-  for (const part of message.parts ?? []) {
-    if (part.type === "reasoning") {
-      estReasoning += estimateTokens((part as { text?: string }).text ?? "");
-    } else if (part.type === "text") {
-      estOutput += estimateTokens((part as { text?: string }).text ?? "");
-    }
-  }
-
-  const usage = metadataUsage(message);
-  if (!usage) {
-    // No authoritative usage streamed yet: the estimate IS the live figure.
-    return { reasoning: estReasoning, output: estOutput, authoritative: false };
-  }
-
-  // Authoritative sum over COMPLETED steps. `outputTokens` already INCLUDES
-  // reasoning in the AI SDK usage shape, so subtract it out for the "answer"
-  // figure (never go negative if a provider reports them inconsistently).
-  const authReasoning = usage.reasoningTokens ?? 0;
-  const authOutput = Math.max(0, (usage.outputTokens ?? 0) - authReasoning);
-
-  // Per-component max: the in-flight step's estimate ticks above the completed-
-  // steps base between boundaries, and the authoritative figure wins once it
-  // exceeds the (rough) estimate at the next boundary. Monotonic by construction.
-  return {
-    reasoning: Math.max(authReasoning, estReasoning),
-    output: Math.max(authOutput, estOutput),
-    authoritative: true,
-  };
-}
--- a/apps/client/src/features/ai-chat/utils/error-message.test.ts
+++ b/apps/client/src/features/ai-chat/utils/error-message.test.ts
@@ -6,163 +6,48 @@ import { describeChatError } from "./error-message";
 const t = (key: string) => key;

 describe("describeChatError", () => {
-  it('maps a {"statusCode":403} body to the disabled heading', () => {
+  it('surfaces a provider "402: ..." stream error verbatim', () => {
+    expect(describeChatError("402: Insufficient credits", t)).toBe(
+      "402: Insufficient credits",
+    );
+  });
+
+  it('does NOT misclassify a body that merely contains "403" (no "statusCode":403)', () => {
+    // A provider message mentioning the number 403 must be surfaced verbatim,
+    // never folded into the "AI chat is disabled" gating message.
+    const msg = "429: rate limited after 403 attempts";
+    expect(describeChatError(msg, t)).toBe(msg);
+  });
+
+  it('maps a {"statusCode":403} body to the disabled message', () => {
    const body = '{"statusCode":403,"message":"Forbidden"}';
-    expect(describeChatError(body, t)).toEqual({
-      title: "AI chat is disabled",
-      detail: "AI chat is disabled for this workspace.",
-    });
+    expect(describeChatError(body, t)).toBe(
+      "AI chat is disabled for this workspace.",
+    );
  });

-  it('maps a {"statusCode":503} body to the not-configured heading', () => {
+  it('maps a {"statusCode":503} body to the not-configured message', () => {
    const body = '{"statusCode":503,"message":"Service Unavailable"}';
-    expect(describeChatError(body, t)).toEqual({
-      title: "AI provider not configured",
-      detail:
-        "The AI provider is not configured. Ask an administrator to set it up.",
-    });
-  });
-
-  it("classifies a dropped connection (ECONNRESET) as a lost-connection error", () => {
-    expect(
-      describeChatError("Cannot connect to API: read ECONNRESET", t).title,
-    ).toBe("Lost connection to the AI provider");
-  });
-
-  it('classifies "fetch failed" as a lost-connection error', () => {
-    expect(describeChatError("fetch failed", t).title).toBe(
-      "Lost connection to the AI provider",
+    expect(describeChatError(body, t)).toBe(
+      "The AI provider is not configured. Ask an administrator to set it up.",
    );
  });

-  it("classifies ETIMEDOUT as a timeout", () => {
-    expect(describeChatError("ETIMEDOUT", t).title).toBe(
-      "The AI provider timed out",
+  it('falls back to the generic message for "An error occurred."', () => {
+    expect(describeChatError("An error occurred.", t)).toBe(
+      "The AI agent could not respond. Please try again.",
    );
  });

-  it('classifies "504: Gateway Timeout" as a timeout', () => {
-    expect(describeChatError("504: Gateway Timeout", t).title).toBe(
-      "The AI provider timed out",
+  it('falls back to the generic message for "Internal server error"', () => {
+    expect(describeChatError("Internal server error", t)).toBe(
+      "The AI agent could not respond. Please try again.",
    );
  });

-  it('classifies "429: Too Many Requests" as rate limited', () => {
-    expect(describeChatError("429: Too Many Requests", t).title).toBe(
-      "Rate limited by the AI provider",
-    );
-  });
-
-  it('does NOT misclassify a body that merely contains "403" as disabled', () => {
-    // Regression intent: a provider message mentioning the number 403 must never
-    // be folded into the "AI chat is disabled" gating heading. Here the 429
-    // signature wins (checked before any bare-403 logic exists), so it maps to
-    // the rate-limit category instead.
-    const view = describeChatError("429: rate limited after 403 attempts", t);
-    expect(view.title).toBe("Rate limited by the AI provider");
-    expect(view.title).not.toBe("AI chat is disabled");
-  });
-
-  it("classifies a context-window overflow as too-large", () => {
-    expect(
-      describeChatError(
-        "This model's maximum context length is 128000 tokens",
-        t,
-      ).title,
-    ).toBe("The conversation is too large");
-  });
-
-  it('classifies "402: Insufficient credits" as quota exceeded', () => {
-    expect(describeChatError("402: Insufficient credits", t).title).toBe(
-      "AI provider quota exceeded",
-    );
-  });
-
-  it('classifies "401: Unauthorized" as an auth failure', () => {
-    expect(describeChatError("401: Unauthorized", t).title).toBe(
-      "AI provider authentication failed",
-    );
-  });
-
-  it("falls back to the generic heading + detail for empty input", () => {
-    expect(describeChatError("", t)).toEqual({
-      title: "Something went wrong",
-      detail: "The AI agent could not respond. Please try again.",
-    });
-  });
-
-  it('falls back to the generic heading + detail for "An error occurred."', () => {
-    expect(describeChatError("An error occurred.", t)).toEqual({
-      title: "Something went wrong",
-      detail: "The AI agent could not respond. Please try again.",
-    });
-  });
-
-  it('falls back to the generic heading + detail for "Internal server error"', () => {
-    expect(describeChatError("Internal server error", t)).toEqual({
-      title: "Something went wrong",
-      detail: "The AI agent could not respond. Please try again.",
-    });
-  });
-
-  it("surfaces an unknown-but-informative provider detail verbatim under the generic heading", () => {
-    expect(describeChatError("418: I'm a teapot", t)).toEqual({
-      title: "Something went wrong",
-      detail: "418: I'm a teapot",
-    });
-  });
-
-  it("does NOT treat a number inside the response body as a leading status code (no auth)", () => {
-    // The real status (500) leads the string; the "401" lives in the snippet and
-    // must not trigger the auth category. The verbatim provider text is surfaced.
-    const body =
-      "500: Server error | response body: model gpt-4o-401-preview not found";
-    expect(describeChatError(body, t)).toEqual({
-      title: "Something went wrong",
-      detail: body,
-    });
-  });
-
-  it("does NOT treat a passing mention of billing as a quota error", () => {
-    // "billing" is no longer a quota signature; the verbatim text is surfaced.
-    const body = "502: Bad Gateway | response body: see our billing page";
-    expect(describeChatError(body, t)).toEqual({
-      title: "Something went wrong",
-      detail: body,
-    });
-  });
-
-  it('still rate-limits "429: rate limited after 403 attempts" and never disables', () => {
-    const view = describeChatError("429: rate limited after 403 attempts", t);
-    expect(view.title).toBe("Rate limited by the AI provider");
-    expect(view.title).not.toBe("AI chat is disabled");
-  });
-
-  it('does NOT treat "rate limit" inside the response body as a rate-limit error', () => {
-    // The textual rate-limit phrase lives only in the response-body snippet, and
-    // the leading 500 is not a classified numeric code, so it must not leak into
-    // the rate-limit category. (The detail itself falls back to the generic line
-    // here because the leading message contains "Internal Server Error", which
-    // providerDetail suppresses — the title is what this case pins.)
-    const body =
-      "500: Internal Server Error | response body: rate limit info: see our docs";
-    expect(describeChatError(body, t).title).toBe("Something went wrong");
-    expect(describeChatError(body, t).title).not.toBe(
-      "Rate limited by the AI provider",
-    );
-  });
-
-  it('does NOT treat ETIMEDOUT inside the response body as a timeout', () => {
-    // The 503 leads the string but is not a classified numeric code, and the
-    // ETIMEDOUT signature appears only in the body, so it must not leak into the
-    // timeout category; the verbatim text is surfaced under the generic heading.
-    const body = "503: x | response body: ETIMEDOUT appears in this log line";
-    expect(describeChatError(body, t)).toEqual({
-      title: "Something went wrong",
-      detail: body,
-    });
-    expect(describeChatError(body, t).title).not.toBe(
-      "The AI provider timed out",
+  it("falls back to the generic message for empty input", () => {
+    expect(describeChatError("", t)).toBe(
+      "The AI agent could not respond. Please try again.",
    );
  });
 });
--- a/apps/client/src/features/ai-chat/utils/error-message.ts
+++ b/apps/client/src/features/ai-chat/utils/error-message.ts
@@ -1,174 +1,24 @@
 /**
- * A classified AI chat error: a short bold heading naming the cause category and
- * a one-line human-readable detail / next step. Both strings are already passed
- * through `t`, so callers render them directly.
- */
-export interface ChatErrorView {
-  title: string;
-  detail: string;
-}
-
-/**
- * Turn an AI chat error message into a friendly heading + detail. Used for BOTH
- * the live `useChat().error` (its `.message`) and a persisted assistant error in
- * `metadata.error`. Our own gating responses arrive as a raw NestJS JSON error
- * body carrying a numeric "statusCode" (matched precisely, not by bare substring,
- * so a provider message that merely contains "403"/"503" is never misclassified).
- * Known provider/network failures (connection reset, timeout, rate limit, context
- * overflow, quota, auth) are mapped to a clear category; anything else falls back
- * to the raw provider detail (or a generic line) under the original heading.
+ * Turn an AI chat error message into a friendly inline string. Used for BOTH the
+ * live `useChat().error` (its `.message`) and a persisted assistant error stored
+ * in `metadata.error`. Our own gating responses arrive as a raw NestJS JSON error
+ * body carrying a numeric "statusCode" field (matched precisely, not by bare
+ * substring, so a provider message that merely contains "403"/"503"/"disabled" is
+ * never misclassified). Everything else — provider stream failures forwarded as
+ * "<status>: <message>" (402 credits, 429 rate limit, ...) — is surfaced verbatim.
 */
 export function describeChatError(
  message: string,
  t: (key: string) => string,
-): ChatErrorView {
+): string {
  const msg = message ?? "";
-
  if (/"statusCode"\s*:\s*403\b/.test(msg)) {
-    return {
-      title: t("AI chat is disabled"),
-      detail: t("AI chat is disabled for this workspace."),
-    };
+    return t("AI chat is disabled for this workspace.");
  }
  if (/"statusCode"\s*:\s*503\b/.test(msg)) {
-    return {
-      title: t("AI provider not configured"),
-      detail: t(
-        "The AI provider is not configured. Ask an administrator to set it up.",
-      ),
-    };
+    return t("The AI provider is not configured. Ask an administrator to set it up.");
  }
-
-  const category = classifyProviderError(msg);
-  if (category) {
-    return { title: t(category.title), detail: t(category.detail) };
-  }
-
-  // Unknown error: surface the raw provider detail when it is informative,
-  // otherwise a generic line. The heading stays the original generic one.
-  return {
-    title: t("Something went wrong"),
-    detail:
-      providerDetail(msg) ??
-      t("The AI agent could not respond. Please try again."),
-  };
-}
-
-interface ErrorCategory {
-  /** English key for the bold heading. */
-  title: string;
-  /** English key for the one-line explanation. */
-  detail: string;
-}
-
-/**
- * Map a provider/network error string to a friendly category. Order matters: the
- * most specific signatures are tested first. Returns null when nothing matches,
- * so the caller can fall back to the raw provider text. The English keys returned
- * here are passed through `t` by the caller.
- *
- * The server formats provider errors as "<statusCode>: <message> | response body:
- * <snippet>" (see server-side describeProviderError), so the HTTP status is always
- * the LEADING token. We match a numeric code only when it leads the string, so a
- * number inside the response-body snippet never triggers a category; textual
- * signatures are matched only against the leading message (before the response
- * body), so a phrase inside the snippet never triggers a category either.
- */
-function classifyProviderError(msg: string): ErrorCategory | null {
-  const code = /^\s*(\d{3})\b/.exec(msg)?.[1] ?? "";
-  // The server appends "| response body: <snippet>" to provider errors; match
-  // textual signatures only against the leading provider message so a phrase
-  // inside the response-body snippet never triggers a wrong category. The numeric
-  // status code is read from the start of the full string above.
-  const head = msg.split(/\|\s*response body:/i)[0];
-
-  // The browser's OWN fetch-failure messages — WebKit/Safari "Load failed",
-  // Chrome "Failed to fetch", Firefox "NetworkError when attempting to fetch
-  // resource". These mean the streaming connection between the browser and THIS
-  // server (/api/ai-chat/stream) dropped mid-answer: the browser<->server link,
-  // NOT the server<->AI-provider link, so do NOT blame the provider. A failed
-  // fetch carries no status/body, so the browser has no further detail — the real
-  // cause is in the server logs (the stream controller logs the disconnect) and
-  // the reverse proxy (often buffering or timing out the long-lived SSE).
-  if (/failed to fetch|load failed|networkerror/i.test(head)) {
-    return {
-      title: "Lost connection to the server",
-      detail:
-        "The streaming connection to the server dropped before the answer finished. The browser reports no further detail — the cause is in the server logs and the reverse proxy (often buffering or timing out the stream). Reload and try again.",
-    };
-  }
-  // Connection dropped / provider unreachable. ECONNRESET is the production case:
-  // the LLM socket was reset mid-stream (surfaced by the server's error
-  // formatter). "terminated" is scoped to a connection/stream context so it does
-  // not match benign "... was terminated" messages.
-  if (
-    /ECONNRESET|ECONNREFUSED|ENOTFOUND|EAI_AGAIN|EPIPE|socket hang up|cannot connect|fetch failed|network error|connection (?:error|closed|reset|terminated)|stream terminated/i.test(
-      head,
-    )
-  ) {
-    return {
-      title: "Lost connection to the AI provider",
-      detail:
-        "The connection to the AI provider dropped before the answer finished. Please try again.",
-    };
-  }
-  // Timeout.
-  if (
-    code === "504" ||
-    code === "408" ||
-    /ETIMEDOUT|timed[\s-]?out|\btimeout\b/i.test(head)
-  ) {
-    return {
-      title: "The AI provider timed out",
-      detail: "The AI provider took too long to respond. Please try again.",
-    };
-  }
-  // Rate limited.
-  if (code === "429" || /rate[\s-]?limit|too many requests/i.test(head)) {
-    return {
-      title: "Rate limited by the AI provider",
-      detail:
-        "The AI provider is rate-limiting requests. Wait a moment and try again.",
-    };
-  }
-  // Context window / token budget exceeded.
-  if (
-    code === "413" ||
-    /context[\s_-]?(?:length|window)|maximum context|context_length_exceeded|too many tokens|maximum[^.]*tokens|reduce the length/i.test(
-      head,
-    )
-  ) {
-    return {
-      title: "The conversation is too large",
-      detail:
-        "The document and search results exceeded the model's context window. Start a new chat or narrow the request.",
-    };
-  }
-  // Out of credits / quota / payment required.
-  if (
-    code === "402" ||
-    /payment required|insufficient (?:credits|quota|funds|balance)|out of credits|quota (?:exceeded|exhausted)/i.test(
-      head,
-    )
-  ) {
-    return {
-      title: "AI provider quota exceeded",
-      detail:
-        "The AI provider rejected the request because of credits or quota. Check the provider account.",
-    };
-  }
-  // Authentication / bad API key.
-  if (
-    code === "401" ||
-    /\bunauthorized\b|invalid api key|user not found|\bauthentication\b/i.test(head)
-  ) {
-    return {
-      title: "AI provider authentication failed",
-      detail:
-        "The AI provider rejected the credentials. Ask an administrator to check the API key.",
-    };
-  }
-  return null;
+  return providerDetail(msg) ?? t("The AI agent could not respond. Please try again.");
 }

 /**
--- a/apps/client/src/features/ai-chat/utils/message-content.test.ts
+++ b/apps/client/src/features/ai-chat/utils/message-content.test.ts
@@ -1,94 +0,0 @@
-import { describe, expect, it } from "vitest";
-import type { UIMessage } from "@ai-sdk/react";
-import { assistantMessageHasVisibleContent } from "@/features/ai-chat/utils/message-content.ts";
-
-/**
- * Pure-helper tests for `assistantMessageHasVisibleContent`, the single source of
- * truth shared by MessageItem (whether to render the bubble) and
- * typingIndicatorShowsName (whether the standalone indicator owns the name). It
- * must mirror MessageItem's render decisions exactly so exactly one element owns
- * the agent name during the pre-content "thinking" gap.
- */
-const msg = (
-  parts: UIMessage["parts"],
-  metadata?: unknown,
-): UIMessage =>
-  ({
-    id: Math.random().toString(),
-    role: "assistant",
-    parts,
-    metadata,
-  }) as UIMessage;
-
-describe("assistantMessageHasVisibleContent", () => {
-  it("is false for an empty text part", () => {
-    expect(assistantMessageHasVisibleContent(msg([{ type: "text", text: "" }]))).toBe(false);
-  });
-
-  it("is false for a whitespace-only text part", () => {
-    expect(assistantMessageHasVisibleContent(msg([{ type: "text", text: "   " }]))).toBe(false);
-  });
-
-  it("is true for a non-empty text part", () => {
-    expect(assistantMessageHasVisibleContent(msg([{ type: "text", text: "answer" }]))).toBe(true);
-  });
-
-  it("is true for a tool part", () => {
-    const toolPart = { type: "tool-getPage", state: "output-available" } as unknown as UIMessage["parts"][number];
-    expect(assistantMessageHasVisibleContent(msg([toolPart]))).toBe(true);
-  });
-
-  it("is true when metadata.error is set (persisted error banner)", () => {
-    expect(
-      assistantMessageHasVisibleContent(msg([{ type: "text", text: "" }], { error: "boom" })),
-    ).toBe(true);
-  });
-
-  it("is true when metadata.finishReason is 'aborted' (persisted stopped notice)", () => {
-    expect(
-      assistantMessageHasVisibleContent(msg([], { finishReason: "aborted" })),
-    ).toBe(true);
-  });
-
-  it("is false for a message with no parts and no metadata", () => {
-    expect(assistantMessageHasVisibleContent(msg([]))).toBe(false);
-  });
-
-  it("is false for an unsupported part kind (reasoning)", () => {
-    const reasoning = { type: "reasoning", text: "let me think" } as unknown as UIMessage["parts"][number];
-    expect(assistantMessageHasVisibleContent(msg([reasoning]))).toBe(false);
-  });
-
-  it("is true for a running tool part (input-available)", () => {
-    // Tool visibility does not depend on tool state: MessageItem renders a
-    // ToolCallCard for any tool part, so a still-running tool is visible.
-    const runningTool = { type: "tool-getPage", state: "input-available" } as unknown as UIMessage["parts"][number];
-    expect(assistantMessageHasVisibleContent(msg([runningTool]))).toBe(true);
-  });
-
-  it("is true for an empty leading text part followed by a non-empty one", () => {
-    // An empty leading text part followed by a non-empty one is still visible
-    // (mirrors the real streaming sequence where text arrives incrementally).
-    expect(
-      assistantMessageHasVisibleContent(
-        msg([{ type: "text", text: "" }, { type: "text", text: "answer" }]),
-      ),
-    ).toBe(true);
-  });
-
-  it("is false for an empty completed turn (finishReason 'stop')", () => {
-    // A completed turn with no text/tools and a non-aborted finishReason renders
-    // nothing — this is intentional (hiding a dangling name-only row), distinct
-    // from the `aborted`/`error` cases which DO render.
-    expect(
-      assistantMessageHasVisibleContent(msg([{ type: "text", text: "" }], { finishReason: "stop" })),
-    ).toBe(false);
-  });
-
-  it("is false for a parts-less message (the `?? []` guard makes it safe)", () => {
-    // The `?? []` guard makes a parts-less object safe instead of throwing.
-    expect(
-      assistantMessageHasVisibleContent({ id: "x", role: "assistant" } as unknown as UIMessage),
-    ).toBe(false);
-  });
-});
--- a/apps/client/src/features/ai-chat/utils/message-content.ts
+++ b/apps/client/src/features/ai-chat/utils/message-content.ts
@@ -1,39 +0,0 @@
-import type { UIMessage } from "@ai-sdk/react";
-import { isToolPart } from "@/features/ai-chat/utils/tool-parts.tsx";
-
-/**
- * Whether an assistant `UIMessage` has anything visible to render in its bubble.
- *
- * This mirrors MessageItem's render decisions EXACTLY and is the single source of
- * truth shared by both MessageItem (to decide whether to render the bubble at all)
- * and typingIndicatorShowsName (to decide whether the standalone "Thinking…"
- * indicator owns the dimmed agent-name label). Keeping one helper guarantees the
- * two stay in lockstep, so exactly one element owns the name during the pre-content
- * "thinking" gap and the layout never reflows mid-stream.
- *
- * An assistant message has visible content iff ANY of:
- *  - a `text` part whose trimmed length > 0 (non-empty markdown), OR
- *  - ANY tool part (`isToolPart(part.type)`), OR
- *  - `metadata.error` is truthy (a persisted error banner renders), OR
- *  - `metadata.finishReason === "aborted"` (a persisted "response stopped" notice).
- * Empty/whitespace-only text parts and unsupported part kinds (reasoning, sources,
- * files, step-start) are NOT visible.
- */
-export function assistantMessageHasVisibleContent(message: UIMessage): boolean {
-  const meta = message.metadata as
-    | { error?: string; finishReason?: string }
-    | undefined;
-  // Persisted errored/aborted turns always render their banner/notice.
-  if (meta?.error) return true;
-  if (meta?.finishReason === "aborted") return true;
-
-  // `parts` may be empty (a nascent streaming message has no parts yet).
-  // `?? []` also guards a sparse/partial message object (metadata-only, no
-  // `parts`) so iterating cannot throw — it does not change behavior for any
-  // current input.
-  for (const part of message.parts ?? []) {
-    if (part.type === "text" && part.text.trim().length > 0) return true;
-    if (isToolPart(part.type)) return true;
-  }
-  return false;
-}
--- a/apps/client/src/features/ai-chat/utils/queue-helpers.test.ts
+++ b/apps/client/src/features/ai-chat/utils/queue-helpers.test.ts
@@ -1,107 +0,0 @@
-import { describe, it, expect } from "vitest";
-import {
-  enqueueMessage,
-  dequeue,
-  removeQueuedById,
-  type QueuedMessage,
-} from "./queue-helpers";
-
-describe("enqueueMessage", () => {
-  it("appends a message to the end of the queue", () => {
-    const queue: QueuedMessage[] = [{ id: "a", text: "first" }];
-    const next = enqueueMessage(queue, { id: "b", text: "second" });
-    expect(next).toEqual([
-      { id: "a", text: "first" },
-      { id: "b", text: "second" },
-    ]);
-  });
-
-  it("does not mutate the input queue", () => {
-    const queue: QueuedMessage[] = [{ id: "a", text: "first" }];
-    enqueueMessage(queue, { id: "b", text: "second" });
-    expect(queue).toEqual([{ id: "a", text: "first" }]);
-  });
-});
-
-describe("dequeue", () => {
-  it("returns {head:null, rest:[]} for an empty queue", () => {
-    expect(dequeue([])).toEqual({ head: null, rest: [] });
-  });
-
-  it("returns the first item as head and the remainder as rest", () => {
-    const queue: QueuedMessage[] = [
-      { id: "a", text: "first" },
-      { id: "b", text: "second" },
-      { id: "c", text: "third" },
-    ];
-    const { head, rest } = dequeue(queue);
-    expect(head).toEqual({ id: "a", text: "first" });
-    expect(rest).toEqual([
-      { id: "b", text: "second" },
-      { id: "c", text: "third" },
-    ]);
-  });
-
-  it("does not mutate the input queue", () => {
-    const queue: QueuedMessage[] = [
-      { id: "a", text: "first" },
-      { id: "b", text: "second" },
-    ];
-    dequeue(queue);
-    expect(queue).toEqual([
-      { id: "a", text: "first" },
-      { id: "b", text: "second" },
-    ]);
-  });
-});
-
-describe("removeQueuedById", () => {
-  it("removes the matching id and leaves the others", () => {
-    const queue: QueuedMessage[] = [
-      { id: "a", text: "first" },
-      { id: "b", text: "second" },
-      { id: "c", text: "third" },
-    ];
-    const next = removeQueuedById(queue, "b");
-    expect(next).toEqual([
-      { id: "a", text: "first" },
-      { id: "c", text: "third" },
-    ]);
-  });
-
-  it("returns an equivalent list when the id is not present", () => {
-    const queue: QueuedMessage[] = [{ id: "a", text: "first" }];
-    expect(removeQueuedById(queue, "missing")).toEqual([
-      { id: "a", text: "first" },
-    ]);
-  });
-
-  it("does not mutate the input queue", () => {
-    const queue: QueuedMessage[] = [
-      { id: "a", text: "first" },
-      { id: "b", text: "second" },
-    ];
-    removeQueuedById(queue, "a");
-    expect(queue).toEqual([
-      { id: "a", text: "first" },
-      { id: "b", text: "second" },
-    ]);
-  });
-});
-
-describe("FIFO order", () => {
-  it("preserves order across enqueue -> dequeue", () => {
-    let queue: QueuedMessage[] = [];
-    queue = enqueueMessage(queue, { id: "1", text: "one" });
-    queue = enqueueMessage(queue, { id: "2", text: "two" });
-    queue = enqueueMessage(queue, { id: "3", text: "three" });
-
-    const order: string[] = [];
-    while (queue.length > 0) {
-      const { head, rest } = dequeue(queue);
-      if (head) order.push(head.text);
-      queue = rest;
-    }
-    expect(order).toEqual(["one", "two", "three"]);
-  });
-});
--- a/apps/client/src/features/ai-chat/utils/queue-helpers.ts
+++ b/apps/client/src/features/ai-chat/utils/queue-helpers.ts
@@ -1,34 +0,0 @@
-// Pure FIFO helpers for the AI-chat "send while the agent is busy" queue.
-// Kept side-effect free so they can be unit-tested without React.
-
-export interface QueuedMessage {
-  id: string;
-  text: string;
-}
-
-/** Append a message to the end of the queue (returns a new array). */
-export function enqueueMessage(
-  queue: QueuedMessage[],
-  message: QueuedMessage,
-): QueuedMessage[] {
-  return [...queue, message];
-}
-
-/** Split the queue into its first item (`head`) and the remainder (`rest`).
- *  `head` is null when the queue is empty. Does not mutate the input. */
-export function dequeue(queue: QueuedMessage[]): {
-  head: QueuedMessage | null;
-  rest: QueuedMessage[];
-} {
-  if (queue.length === 0) return { head: null, rest: [] };
-  const [head, ...rest] = queue;
-  return { head, rest };
-}
-
-/** Remove the queued message with the given id (returns a new array). */
-export function removeQueuedById(
-  queue: QueuedMessage[],
-  id: string,
-): QueuedMessage[] {
-  return queue.filter((m) => m.id !== id);
-}
--- a/apps/client/src/features/ai-chat/utils/reasoning-tokens.test.ts
+++ b/apps/client/src/features/ai-chat/utils/reasoning-tokens.test.ts
@@ -1,56 +0,0 @@
-import { describe, expect, it } from "vitest";
-import type { UIMessage } from "@ai-sdk/react";
-import { reasoningTokensForPart } from "@/features/ai-chat/utils/reasoning-tokens.ts";
-
-/**
- * Pure-helper tests for `reasoningTokensForPart`, the #151 anti-double-count
- * rule: the authoritative `usage.reasoningTokens` is the TURN TOTAL, so it may
- * only be attributed when the turn has exactly one reasoning part. With multiple
- * reasoning parts (or no authoritative usage) every part falls back to its own
- * per-part estimate, signalled here by `undefined`.
- */
-const msg = (
-  parts: UIMessage["parts"],
-  metadata?: unknown,
-): UIMessage =>
-  ({
-    id: Math.random().toString(),
-    role: "assistant",
-    parts,
-    metadata,
-  }) as UIMessage;
-
-describe("reasoningTokensForPart", () => {
-  it("single reasoning part -> the authoritative turn total", () => {
-    const m = msg(
-      [
-        { type: "reasoning", text: "thinking…" } as never,
-        { type: "text", text: "answer" },
-      ],
-      { usage: { reasoningTokens: 42 } },
-    );
-    expect(reasoningTokensForPart(m)).toBe(42);
-  });
-
-  it("multiple reasoning parts -> undefined (each estimates on its own)", () => {
-    const m = msg(
-      [
-        { type: "reasoning", text: "step one" } as never,
-        { type: "reasoning", text: "step two" } as never,
-        { type: "text", text: "answer" },
-      ],
-      { usage: { reasoningTokens: 99 } },
-    );
-    // Even with an authoritative total, two reasoning parts must each estimate
-    // (attributing the total to one would double-count against the other).
-    expect(reasoningTokensForPart(m)).toBeUndefined();
-  });
-
-  it("no authoritative usage -> undefined even for a single reasoning part", () => {
-    const m = msg([
-      { type: "reasoning", text: "thinking…" } as never,
-      { type: "text", text: "answer" },
-    ]);
-    expect(reasoningTokensForPart(m)).toBeUndefined();
-  });
-});
--- a/apps/client/src/features/ai-chat/utils/reasoning-tokens.ts
+++ b/apps/client/src/features/ai-chat/utils/reasoning-tokens.ts
@@ -1,34 +0,0 @@
-import type { UIMessage } from "@ai-sdk/react";
-
-/**
- * Decide the authoritative reasoning token count to attribute to a single
- * `reasoning` part of an assistant message — or `undefined` when the part should
- * fall back to its own per-part estimate.
- *
- * `usage.reasoningTokens` is the TURN TOTAL, so it may only be attributed to a
- * block when the turn has exactly ONE reasoning part (the common one-step turn):
- * then that block can show the exact figure. With MULTIPLE reasoning parts (a
- * multi-step agent turn) every block must fall back to its own estimate —
- * attributing the turn total to one of them would double-count against the
- * others' estimates (#151 review anti-double-count rule). When there is no
- * authoritative usage at all, every part estimates.
- *
- * Returns the authoritative `reasoningTokens` only for the single-reasoning-part
- * case; `undefined` otherwise (the caller estimates from the part text).
- */
-export function reasoningTokensForPart(
-  message: UIMessage,
-): number | undefined {
-  const reasoningTokens = (
-    message.metadata as { usage?: { reasoningTokens?: number } } | undefined
-  )?.usage?.reasoningTokens;
-
-  const reasoningPartCount = (message.parts ?? []).reduce(
-    (acc, p) => (p.type === "reasoning" ? acc + 1 : acc),
-    0,
-  );
-
-  // Exactly one reasoning part -> attribute the authoritative turn total to it.
-  // Otherwise (zero or multiple) each part estimates on its own.
-  return reasoningPartCount === 1 ? reasoningTokens : undefined;
-}
--- a/apps/client/src/features/ai-chat/utils/role-card-color.test.ts
+++ b/apps/client/src/features/ai-chat/utils/role-card-color.test.ts
@@ -1,23 +0,0 @@
-import { describe, it, expect } from "vitest";
-import { ROLE_CARD_PALETTE, roleCardColor } from "./role-card-color";
-
-describe("roleCardColor", () => {
-  it("has a 10-color palette", () => {
-    expect(ROLE_CARD_PALETTE).toHaveLength(10);
-  });
-
-  it("maps index 0 to the first palette color (blue)", () => {
-    expect(roleCardColor(0)).toBe("blue");
-    expect(roleCardColor(1)).toBe("grape");
-  });
-
-  it("wraps around at the end of the palette", () => {
-    expect(roleCardColor(10)).toBe("blue");
-    expect(roleCardColor(11)).toBe("grape");
-  });
-
-  it("is safe for negative indices", () => {
-    expect(roleCardColor(-1)).toBe("violet");
-    expect(roleCardColor(-10)).toBe("blue");
-  });
-});
--- a/apps/client/src/features/ai-chat/utils/role-card-color.ts
+++ b/apps/client/src/features/ai-chat/utils/role-card-color.ts
@@ -1,25 +0,0 @@
-// Fixed Mantine color palette for the new-chat role cards. Cards cycle through
-// these names by index; the colors are applied via theme-aware Mantine CSS vars
-// (`--mantine-color-<name>-light` etc.) so they are correct in both themes.
-// Universal assistant uses neutral `gray` separately (not part of this palette).
-export const ROLE_CARD_PALETTE = [
-  "blue",
-  "grape",
-  "teal",
-  "orange",
-  "pink",
-  "cyan",
-  "lime",
-  "indigo",
-  "red",
-  "violet",
-] as const;
-
-/**
- * Pick a palette color name for a role card by its index. Cycles through the
- * palette and is safe for negative indices.
- */
-export function roleCardColor(index: number): string {
-  const len = ROLE_CARD_PALETTE.length;
-  return ROLE_CARD_PALETTE[((index % len) + len) % len];
-}
--- a/apps/client/src/features/ai-chat/utils/role-launch.test.ts
+++ b/apps/client/src/features/ai-chat/utils/role-launch.test.ts
@@ -1,72 +0,0 @@
-import { describe, it, expect } from "vitest";
-import { roleLaunchMessage, shouldResetRolePicked } from "./role-launch.ts";
-
-const DEFAULT = "Take a look at the current document";
-
-// Covers the three-way handleRolePick behavior (issue #149) without mounting the
-// chat-thread component — the logic lives in these pure helpers.
-describe("roleLaunchMessage", () => {
-  it("autoStart=true + custom launchMessage -> the trimmed custom text", () => {
-    expect(
-      roleLaunchMessage(
-        { autoStart: true, launchMessage: "  Draft a plan  " },
-        DEFAULT,
-      ),
-    ).toBe("Draft a plan");
-  });
-
-  it("autoStart=true + empty launchMessage -> the default fallback", () => {
-    expect(
-      roleLaunchMessage({ autoStart: true, launchMessage: "" }, DEFAULT),
-    ).toBe(DEFAULT);
-  });
-
-  it("autoStart=true + whitespace-only launchMessage -> the default fallback", () => {
-    expect(
-      roleLaunchMessage({ autoStart: true, launchMessage: "   " }, DEFAULT),
-    ).toBe(DEFAULT);
-  });
-
-  it("autoStart=true + null launchMessage -> the default fallback", () => {
-    expect(
-      roleLaunchMessage({ autoStart: true, launchMessage: null }, DEFAULT),
-    ).toBe(DEFAULT);
-  });
-
-  it("autoStart=false -> null (bind only, send nothing) regardless of message", () => {
-    expect(
-      roleLaunchMessage(
-        { autoStart: false, launchMessage: "ignored" },
-        DEFAULT,
-      ),
-    ).toBeNull();
-    expect(
-      roleLaunchMessage({ autoStart: false, launchMessage: null }, DEFAULT),
-    ).toBeNull();
-  });
-});
-
-// Regression guard for #149: the "picked, not sent" flag must reset when the
-// user starts a fresh chat after an autoStart=false pick. On pre-fix code there
-// was no reset, so the flag stayed stuck and the role cards never returned —
-// this is exactly the `true` case below (which the old code never acted on).
-describe("shouldResetRolePicked", () => {
-  it("resets when the thread is empty and the bound role was cleared (New chat)", () => {
-    // chatId still null, roleId cleared by the parent, flag stuck -> reset.
-    expect(shouldResetRolePicked(null, null, true)).toBe(true);
-    expect(shouldResetRolePicked(null, undefined, true)).toBe(true);
-  });
-
-  it("does NOT reset while a role is still bound (cards stay hidden, composer shown)", () => {
-    // Right after the autoStart=false pick, roleId is the picked role -> keep hidden.
-    expect(shouldResetRolePicked(null, "role-1", true)).toBe(false);
-  });
-
-  it("does NOT reset once the chat exists (a message was sent / chat created)", () => {
-    expect(shouldResetRolePicked("chat-1", null, true)).toBe(false);
-  });
-
-  it("is a no-op when the flag is already false", () => {
-    expect(shouldResetRolePicked(null, null, false)).toBe(false);
-  });
-});
--- a/apps/client/src/features/ai-chat/utils/role-launch.ts
+++ b/apps/client/src/features/ai-chat/utils/role-launch.ts
@@ -1,34 +0,0 @@
-import type { IAiRole } from "@/features/ai-chat/types/ai-chat.types.ts";
-
-/**
- * Decide what (if anything) to auto-send when an agent role card is picked
- * (issue #149). Extracted as a pure function so the three-way behavior is
- * unit-testable without mounting the chat-thread component:
- *   - autoStart=false              -> null  (bind the role only, send nothing)
- *   - autoStart=true + message     -> the trimmed custom launchMessage
- *   - autoStart=true + empty/null  -> the default fallback text
- */
-export function roleLaunchMessage(
-  role: Pick<IAiRole, "autoStart" | "launchMessage">,
-  defaultText: string,
-): string | null {
-  if (!role.autoStart) return null;
-  return role.launchMessage?.trim() || defaultText;
-}
-
-/**
- * Whether the "role picked but nothing sent yet" flag (`rolePickedNoSend`)
- * should reset to false. After an autoStart=false pick the thread shows the
- * composer with chatId still null; when the user then starts a fresh chat the
- * parent clears the bound role (roleId -> null) but chatId stays null, so the
- * thread never remounts and the flag would otherwise stay set — hiding the role
- * cards forever. Reset exactly in that state; a still-bound role (roleId set)
- * keeps the cards hidden. (Regression guard for #149.)
- */
-export function shouldResetRolePicked(
-  chatId: string | null,
-  roleId: string | null | undefined,
-  rolePickedNoSend: boolean,
-): boolean {
-  return chatId === null && roleId == null && rolePickedNoSend;
-}
--- a/apps/client/src/features/ai-chat/utils/thread-identity.test.ts
+++ b/apps/client/src/features/ai-chat/utils/thread-identity.test.ts
@@ -1,79 +0,0 @@
-import { describe, it, expect } from "vitest";
-import {
-  newThread,
-  switchThread,
-  adoptThread,
-  threadSessionReducer,
-} from "./thread-identity";
-
-describe("newThread", () => {
-  it("uses the supplied key and has no chat id yet", () => {
-    expect(newThread("new-abc")).toEqual({ key: "new-abc", chatId: null });
-  });
-});
-
-describe("switchThread", () => {
-  it("switches to an existing chat: key becomes the chat id", () => {
-    expect(switchThread("chat-1")).toEqual({
-      key: "chat-1",
-      chatId: "chat-1",
-    });
-  });
-});
-
-describe("adoptThread", () => {
-  // Key UNCHANGED (no remount) + chatId moved null->realId. The unchanged key is
-  // what keeps the live useChat store alive; the matching chatId is what makes the
-  // window's render-phase reconciler (activeChatId !== thread.chatId) treat the
-  // adopted thread as already-in-sync rather than a switch.
-  it("adopts in place for a new chat: keeps the key, sets the chat id", () => {
-    const prev = newThread("new-abc");
-    expect(adoptThread(prev, "chat-1")).toEqual({
-      key: "new-abc",
-      chatId: "chat-1",
-    });
-  });
-
-  it("is a no-op for an already-persisted chat", () => {
-    const prev: { key: string; chatId: string | null } = {
-      key: "chat-1",
-      chatId: "chat-1",
-    };
-    expect(adoptThread(prev, "chat-2")).toBe(prev);
-  });
-});
-
-describe("threadSessionReducer", () => {
-  it("reconcile to an existing id switches (key becomes the id)", () => {
-    const next = threadSessionReducer(newThread("new-abc"), {
-      type: "reconcile",
-      chatId: "chat-1",
-      newKey: "new-xyz",
-    });
-    expect(next).toEqual({ key: "chat-1", chatId: "chat-1" });
-  });
-
-  it("reconcile to null starts a fresh new thread with the supplied key", () => {
-    const next = threadSessionReducer(switchThread("chat-1"), {
-      type: "reconcile",
-      chatId: null,
-      newKey: "new-xyz",
-    });
-    expect(next).toEqual({ key: "new-xyz", chatId: null });
-  });
-
-  it("adopt on a new thread keeps the key and sets the id", () => {
-    const next = threadSessionReducer(newThread("new-abc"), {
-      type: "adopt",
-      chatId: "chat-1",
-    });
-    expect(next).toEqual({ key: "new-abc", chatId: "chat-1" });
-  });
-
-  it("adopt on a persisted thread is a no-op", () => {
-    const prev = switchThread("chat-1");
-    expect(threadSessionReducer(prev, { type: "adopt", chatId: "chat-2" })).toBe(
-      prev,
-    );
-  });
-});
--- a/apps/client/src/features/ai-chat/utils/thread-identity.ts
+++ b/apps/client/src/features/ai-chat/utils/thread-identity.ts
@@ -1,73 +0,0 @@
-/**
- * Pure transitions for the AI-chat thread's identity: the single source of
- * truth tying ChatThread's mount key to the chat id that mounted thread holds.
- *
- * The window keeps exactly ONE of these in state. Consolidating the mount key
- * and the live thread's chat id into one atomic value makes the "stale chat id
- * vs key" state unrepresentable: every change goes through one of the explicit
- * transitions below, so the key and chatId can never silently diverge.
- *
- * - `newThread`/`switchThread` produce a key that forces a remount (+ reseed):
- *   `newThread` for a brand-new (id-less) chat, `switchThread` for an existing
- *   one. The caller picks which based on whether there is a chat id.
- * - `adoptThread` keeps the SAME key so a brand-new chat learns its real id
- *   WITHOUT remounting (the live useChat store, holding the just-finished turn,
- *   is preserved and the next turn sends the real chatId).
- *
- * `newThread` takes the session key from the impure `generateId()` at the call
- * site so these stay pure and unit-testable.
- */
-export type ThreadIdentity = { key: string; chatId: string | null };
-
-/**
- * A brand-new chat: a fresh session key and no chat id yet. `newKey` is
- * supplied by the caller (generateId() is impure) so this stays pure/testable.
- */
-export function newThread(newKey: string): ThreadIdentity {
-  return { key: newKey, chatId: null };
-}
-
-/**
- * Switch to an EXISTING chat: the mount key becomes the chat id, forcing a
- * remount + reseed from the persisted history. (A switch to a brand-new chat
- * goes through `newThread` instead — there is no id to key on.)
- */
-export function switchThread(chatId: string): ThreadIdentity {
-  return { key: chatId, chatId };
-}
-
-/**
- * In-place adoption: a brand-new chat (`prev.chatId === null`) learns its real
- * id WITHOUT remounting — keep the SAME key, set the chat id. If `prev` already
- * has a chatId (not a new chat), this is a no-op (returns `prev`): adoption only
- * applies to an as-yet-unadopted new thread.
- */
-export function adoptThread(prev: ThreadIdentity, chatId: string): ThreadIdentity {
-  return prev.chatId === null ? { key: prev.key, chatId } : prev;
-}
-
-/**
- * Thread-identity transitions as a reducer action. See `threadSessionReducer`.
- */
-export type ThreadSessionAction =
-  | { type: "reconcile"; chatId: string | null; newKey: string }
-  | { type: "adopt"; chatId: string };
-
-/**
- * Single source of truth for thread-identity transitions. `reconcile` handles a
- * genuine switch (user OR external atom write) -> remount; `adopt` moves a brand-
- * new chat to its real id in place (no remount).
- */
-export function threadSessionReducer(
-  state: ThreadIdentity,
-  action: ThreadSessionAction,
-): ThreadIdentity {
-  switch (action.type) {
-    case "reconcile":
-      return action.chatId === null
-        ? newThread(action.newKey)
-        : switchThread(action.chatId);
-    case "adopt":
-      return adoptThread(state, action.chatId);
-  }
-}
--- a/apps/client/src/features/auth/components/invite-sign-up-form.tsx
+++ b/apps/client/src/features/auth/components/invite-sign-up-form.tsx
@@ -10,12 +10,9 @@ import {
  PasswordInput,
  Box,
  Stack,
-  Group,
-  Text,
 } from "@mantine/core";
 import { zod4Resolver } from "mantine-form-zod-resolver";
-import { Link, useParams, useSearchParams } from "react-router-dom";
-import APP_ROUTE from "@/lib/app-route";
+import { useParams, useSearchParams } from "react-router-dom";
 import useAuth from "@/features/auth/hooks/use-auth";
 import classes from "@/features/auth/components/auth.module.css";
 import { useGetInvitationQuery } from "@/features/workspace/queries/workspace-query.ts";
@@ -61,27 +58,7 @@ export function InviteSignUpForm() {
  }

  if (isError) {
-    // Styled error with a CTA to login, mirroring the password-reset
-    // error page and the 404 page (issue #133)
-    return (
-      <AuthLayout>
-        <Container my={40}>
-          <Text size="lg" ta="center">
-            {t("Invalid invitation link")}
-          </Text>
-          <Group justify="center">
-            <Button
-              component={Link}
-              to={APP_ROUTE.AUTH.LOGIN}
-              variant="subtle"
-              size="md"
-            >
-              {t("Go to login page")}
-            </Button>
-          </Group>
-        </Container>
-      </AuthLayout>
-    );
+    return <div>{t("invalid invitation link")}</div>;
  }

  if (!invitation) {
--- a/apps/client/src/features/comment/components/comment-list-item.test.tsx
+++ b/apps/client/src/features/comment/components/comment-list-item.test.tsx
@@ -1,59 +0,0 @@
-import { describe, it, expect, vi } from "vitest";
-import { render, screen } from "@testing-library/react";
-import { MantineProvider } from "@mantine/core";
-import { IComment } from "@/features/comment/types/comment.types";
-
-// matchMedia (read by MantineProvider) is stubbed globally in vitest.setup.ts.
-
-// The comment mutation hooks reach out to react-query/network — stub them so the
-// component renders in isolation. We only assert the AI-badge rendering branch.
-vi.mock("@/features/comment/queries/comment-query", () => ({
-  useDeleteCommentMutation: () => ({ mutateAsync: vi.fn() }),
-  useResolveCommentMutation: () => ({ mutateAsync: vi.fn() }),
-  useUpdateCommentMutation: () => ({ mutateAsync: vi.fn() }),
-}));
-
-// CommentEditor pulls in the full TipTap editor stack; replace it with a stub.
-vi.mock("@/features/comment/components/comment-editor", () => ({
-  default: () => <div data-testid="comment-editor" />,
-}));
-
-import CommentListItem from "./comment-list-item";
-
-const baseComment = (over?: Partial<IComment>): IComment =>
-  ({
-    id: "c-1",
-    content: JSON.stringify({ type: "doc", content: [] }),
-    creatorId: "user-1",
-    pageId: "page-1",
-    workspaceId: "ws-1",
-    createdAt: new Date(),
-    creator: { id: "user-1", name: "Service Bot", avatarUrl: null } as any,
-    ...over,
-  }) as IComment;
-
-function renderItem(comment: IComment) {
-  return render(
-    <MantineProvider>
-      <CommentListItem comment={comment} pageId="page-1" canComment={true} />
-    </MantineProvider>,
-  );
-}
-
-describe("CommentListItem — AI badge", () => {
-  it('renders the AI-agent badge when createdSource === "agent"', () => {
-    renderItem(baseComment({ createdSource: "agent", aiChatId: null }));
-    expect(screen.getByText("AI-agent")).toBeDefined();
-    expect(screen.getByText("Service Bot")).toBeDefined();
-  });
-
-  it('does NOT render the badge for a normal user comment (createdSource "user")', () => {
-    renderItem(baseComment({ createdSource: "user" }));
-    expect(screen.queryByText("AI-agent")).toBeNull();
-    expect(screen.getByText("Service Bot")).toBeDefined();
-  });
-
-  // The non-clickable (null aiChatId) branch is a property of AiAgentBadge itself
-  // and is covered in ai-agent-badge.test.tsx; this integration suite only needs
-  // the insertion gate (agent → badge, user → no badge) above (#143 review).
-});
--- a/apps/client/src/features/comment/components/comment-list-item.tsx
+++ b/apps/client/src/features/comment/components/comment-list-item.tsx
@@ -1,5 +1,4 @@
-import { Group, Text, Box } from "@mantine/core";
-import { AiAgentBadge } from "@/components/ui/ai-agent-badge.tsx";
+import { Group, Text, Box, Badge } from "@mantine/core";
 import React, { useEffect, useRef, useState } from "react";
 import classes from "./comment.module.css";
 import { useAtom, useAtomValue } from "jotai";
@@ -127,18 +126,9 @@ function CommentListItem({

        <div style={{ flex: 1 }}>
          <Group justify="space-between" wrap="nowrap">
-            <Group gap={6} wrap="nowrap" style={{ minWidth: 0 }}>
-              <Text size="xs" fw={500} lineClamp={1} lh={1.2}>
-                {comment.creator.name}
-              </Text>
-
-              {comment.createdSource === "agent" && (
-                <AiAgentBadge
-                  authorName={comment.creator?.name}
-                  aiChatId={comment.aiChatId}
-                />
-              )}
-            </Group>
+            <Text size="xs" fw={500} lineClamp={1}>
+              {comment.creator.name}
+            </Text>

            <div style={{ visibility: hovered ? "visible" : "hidden" }}>
              {!comment.parentCommentId && canComment && (
@@ -165,7 +155,7 @@ function CommentListItem({
          </Group>

          <Group gap="xs">
-            <Text size="xs" fw={500} c="dimmed" lh={1.1}>
+            <Text size="xs" fw={500} c="dimmed">
              {createdAtAgo}
            </Text>
          </Group>
--- a/apps/client/src/features/comment/components/comment-list-with-tabs.tsx
+++ b/apps/client/src/features/comment/components/comment-list-with-tabs.tsx
@@ -11,7 +11,6 @@ import {
  Badge,
  Text,
  ScrollArea,
-  Tooltip,
 } from "@mantine/core";
 import CommentListItem from "@/features/comment/components/comment-list-item";
 import {
@@ -27,16 +26,12 @@ import { IPagination } from "@/lib/types.ts";
 import { extractPageSlugId } from "@/lib";
 import { useTranslation } from "react-i18next";
 import { useGetSpaceBySlugQuery } from "@/features/space/queries/space-query.ts";
-import { IconArrowUp, IconMessageOff, IconX } from "@tabler/icons-react";
+import { IconArrowUp, IconMessageOff } from "@tabler/icons-react";
 import { useAtom } from "jotai";
 import { currentUserAtom } from "@/features/user/atoms/current-user-atom";
 import { CustomAvatar } from "@/components/ui/custom-avatar.tsx";

-interface CommentListWithTabsProps {
-  onClose?: () => void;
-}
-
-function CommentListWithTabs({ onClose }: CommentListWithTabsProps) {
+function CommentListWithTabs() {
  const { t } = useTranslation();
  const { pageSlug } = useParams();
  const { data: page } = usePageQuery({ pageId: extractPageSlugId(pageSlug) });
@@ -199,50 +194,28 @@ function CommentListWithTabs({ onClose }: CommentListWithTabsProps) {
          overflow: "hidden",
        }}
      >
-        {/* Header row: full-width centered tab list with the close button overlaid on the right. */}
-        <div style={{ position: "relative" }}>
-          <Tabs.List justify="center">
-            <Tabs.Tab
-              value="open"
-              leftSection={
-                <Badge size="sm" variant="light" color="blue">
-                  {activeComments.length}
-                </Badge>
-              }
-            >
-              {t("Open")}
-            </Tabs.Tab>
-            <Tabs.Tab
-              value="resolved"
-              leftSection={
-                <Badge size="sm" variant="light" color="green">
-                  {resolvedComments.length}
-                </Badge>
-              }
-            >
-              {t("Resolved")}
-            </Tabs.Tab>
-          </Tabs.List>
-          {onClose && (
-            <Tooltip label={t("Close")} withArrow>
-              <ActionIcon
-                variant="subtle"
-                color="gray"
-                onClick={onClose}
-                aria-label={t("Close")}
-                style={{
-                  position: "absolute",
-                  right: 0,
-                  top: "50%",
-                  // Nudge the close button slightly up to align with the tab labels.
-                  transform: "translateY(calc(-50% - 4px))",
-                }}
-              >
-                <IconX size={18} />
-              </ActionIcon>
-            </Tooltip>
-          )}
-        </div>
+        <Tabs.List justify="center">
+          <Tabs.Tab
+            value="open"
+            leftSection={
+              <Badge size="sm" variant="light" color="blue">
+                {activeComments.length}
+              </Badge>
+            }
+          >
+            {t("Open")}
+          </Tabs.Tab>
+          <Tabs.Tab
+            value="resolved"
+            leftSection={
+              <Badge size="sm" variant="light" color="green">
+                {resolvedComments.length}
+              </Badge>
+            }
+          >
+            {t("Resolved")}
+          </Tabs.Tab>
+        </Tabs.List>

        <ScrollArea
          style={{ flex: "1 1 auto" }}
@@ -392,7 +365,7 @@ const PageCommentInput = ({ onSave, isLoading }) => {
        flex: "0 0 auto",
        borderTop: "1px solid var(--mantine-color-default-border)",
        paddingTop: "var(--mantine-spacing-sm)",
-        paddingBottom: 10,
+        paddingBottom: 25,
        position: "relative",
      }}
    >
@@ -401,7 +374,7 @@ const PageCommentInput = ({ onSave, isLoading }) => {
          size="sm"
          avatarUrl={currentUser?.user?.avatarUrl}
          name={currentUser?.user?.name}
-          style={{ flexShrink: 0, marginTop: 2 }}
+          style={{ flexShrink: 0, marginTop: 10 }}
        />
        <div style={{ flex: 1, minWidth: 0 }}>
          <CommentEditor
@@ -423,7 +396,7 @@ const PageCommentInput = ({ onSave, isLoading }) => {
          onClick={handleSave}
          onMouseDown={(e) => e.preventDefault()}
          loading={isLoading}
-          style={{ position: "absolute", right: 8, bottom: 15 }}
+          style={{ position: "absolute", right: 8, bottom: 30 }}
        >
          <IconArrowUp size={16} />
        </ActionIcon>
--- a/apps/client/src/features/comment/components/comment.module.css
+++ b/apps/client/src/features/comment/components/comment.module.css
@@ -3,12 +3,7 @@
 }

 .textSelection {
-    /* Breathing room below the comment header (author + timestamp) so the
-       quote does not stick to the timestamp when it is the first block. */
-    margin-top: 8px;
-    /* Align the quote's left bar with the comment body text left edge
-       (the comment editor insets its text by 6px). */
-    margin-left: 6px;
+    margin-top: 2px;
    border-left: 2px solid var(--mantine-color-gray-6);
    padding: 6px;
    background: var(--mantine-color-gray-light);
--- a/apps/client/src/features/comment/types/comment.types.ts
+++ b/apps/client/src/features/comment/types/comment.types.ts
@@ -17,13 +17,6 @@ export interface IComment {
  deletedAt?: Date;
  creator: IUser;
  resolvedBy?: IUser;
-  // Agent-edit provenance (returned by the backend via selectAll('comments')).
-  // createdSource === "agent" marks a comment authored via an AI agent (MCP /
-  // internal AI chat); aiChatId deep-links to the internal chat when present
-  // (null for an external MCP agent); resolvedSource marks an AI-resolved thread.
-  createdSource?: string;
-  aiChatId?: string | null;
-  resolvedSource?: string | null;
  yjsSelection?: {
    anchor: any;
    head: any;
--- a/apps/client/src/features/dictation/audio/audio-worklet.d.ts
+++ b/apps/client/src/features/dictation/audio/audio-worklet.d.ts
@@ -0,0 +1,33 @@
+// Minimal ambient declarations for the AudioWorklet global scope.
+//
+// The client tsconfig only pulls in the DOM libs (no "webworker"/"audioworklet"
+// lib), so the symbols available inside an AudioWorkletProcessor module are not
+// known to `tsc`. These declarations are intentionally narrow: just enough for
+// `pcm16-worklet.ts` to typecheck, matching the Web Audio API spec shapes used
+// by that processor. They describe the worklet global scope, not the main thread.
+
+declare abstract class AudioWorkletProcessor {
+  // Message channel back to the main thread (used to transfer PCM16 buffers).
+  readonly port: MessagePort;
+
+  constructor();
+
+  // Called for each render quantum. `inputs`/`outputs` are channel arrays
+  // indexed as [input][channel][sample]; `parameters` maps AudioParam names to
+  // their per-sample (or single-value) Float32Array. Return `true` to keep the
+  // processor alive.
+  abstract process(
+    inputs: Float32Array[][],
+    outputs: Float32Array[][],
+    parameters: Record<string, Float32Array>,
+  ): boolean;
+}
+
+// Registers a processor class under a name usable from `new AudioWorkletNode`.
+declare function registerProcessor(
+  name: string,
+  processorCtor: new () => AudioWorkletProcessor,
+): void;
+
+// The render context's sample rate, in Hz, available in the worklet global scope.
+declare const sampleRate: number;
--- a/apps/client/src/features/dictation/audio/mic-capture.test.ts
+++ b/apps/client/src/features/dictation/audio/mic-capture.test.ts
@@ -0,0 +1,87 @@
+import { describe, it, expect } from "vitest";
+import {
+  mapGetUserMediaError,
+  canStartCapture,
+  MicUnavailableError,
+} from "./mic-capture";
+
+// Identity translator so assertions read the source key (the i18n layer is not
+// under test here).
+const t = (k: string) => k;
+
+describe("mapGetUserMediaError", () => {
+  it("maps NotAllowedError / SecurityError to denied", () => {
+    expect(mapGetUserMediaError({ name: "NotAllowedError" }, t)).toBe(
+      "Microphone access denied",
+    );
+    expect(mapGetUserMediaError({ name: "SecurityError" }, t)).toBe(
+      "Microphone access denied",
+    );
+  });
+
+  it("maps NotFoundError / OverconstrainedError to not found", () => {
+    expect(mapGetUserMediaError({ name: "NotFoundError" }, t)).toBe(
+      "No microphone found",
+    );
+    expect(mapGetUserMediaError({ name: "OverconstrainedError" }, t)).toBe(
+      "No microphone found",
+    );
+  });
+
+  it("maps NotReadableError / AbortError to in-use", () => {
+    expect(mapGetUserMediaError({ name: "NotReadableError" }, t)).toBe(
+      "Microphone is unavailable or already in use",
+    );
+    expect(mapGetUserMediaError({ name: "AbortError" }, t)).toBe(
+      "Microphone is unavailable or already in use",
+    );
+  });
+
+  it("falls back to a detailed message for unknown errors", () => {
+    const msg = mapGetUserMediaError(
+      { name: "WeirdError", message: "boom" },
+      t,
+    );
+    expect(msg).toContain("Could not start recording");
+    expect(msg).toContain("WeirdError");
+    expect(msg).toContain("boom");
+  });
+
+  it("falls back without a name", () => {
+    const msg = mapGetUserMediaError(new Error("nope"), t);
+    expect(msg).toContain("Could not start recording");
+    expect(msg).toContain("nope");
+  });
+});
+
+describe("canStartCapture", () => {
+  const base = {
+    starting: false,
+    hasStream: false,
+    hasLiveResource: false,
+    statusIsIdle: true,
+  };
+  it("allows when idle and nothing live", () => {
+    expect(canStartCapture(base)).toBe(true);
+  });
+  it("blocks while already starting", () => {
+    expect(canStartCapture({ ...base, starting: true })).toBe(false);
+  });
+  it("blocks when a stream is live", () => {
+    expect(canStartCapture({ ...base, hasStream: true })).toBe(false);
+  });
+  it("blocks when a downstream resource is live", () => {
+    expect(canStartCapture({ ...base, hasLiveResource: true })).toBe(false);
+  });
+  it("blocks when status is not idle", () => {
+    expect(canStartCapture({ ...base, statusIsIdle: false })).toBe(false);
+  });
+});
+
+describe("MicUnavailableError", () => {
+  it("is identifiable via instanceof", () => {
+    const e = new MicUnavailableError();
+    expect(e).toBeInstanceOf(MicUnavailableError);
+    expect(e.name).toBe("MicUnavailableError");
+  });
+});
--- a/apps/client/src/features/dictation/audio/mic-capture.ts
+++ b/apps/client/src/features/dictation/audio/mic-capture.ts
@@ -0,0 +1,68 @@
+// Shared microphone-acquisition front-end used by BOTH the batch (`use-dictation`)
+// and streaming (`use-realtime-dictation`) hooks. Only the getUserMedia handshake
+// and its error→message mapping live here — the two hooks keep their own distinct
+// downstream graphs (MediaRecorder vs AudioWorklet) and their own streamRef
+// ownership. This collapses the ~37 duplicated lines without merging the hooks.
+
+// Translate function shape (react-i18next's `t`). Kept structural so this module
+// has no i18next dependency and stays trivially testable.
+export type Translate = (key: string) => string;
+
+/** Thrown by `acquireMicStream` when the environment cannot capture audio. */
+export class MicUnavailableError extends Error {
+  constructor() {
+    super("navigator.mediaDevices.getUserMedia is unavailable in this context");
+    this.name = "MicUnavailableError";
+  }
+}
+
+/**
+ * Map a getUserMedia rejection to a user-facing, localized message. Mirrors the
+ * branching both hooks used previously so behavior is identical. Pure aside from
+ * the injected `t`; safe to unit-test with a stub translator.
+ */
+export function mapGetUserMediaError(err: unknown, t: Translate): string {
+  const name = (err as { name?: string })?.name;
+  const detail = (err as { message?: string })?.message ?? String(err);
+  if (name === "NotAllowedError" || name === "SecurityError") {
+    return t("Microphone access denied");
+  }
+  if (name === "NotFoundError" || name === "OverconstrainedError") {
+    return t("No microphone found");
+  }
+  if (name === "NotReadableError" || name === "AbortError") {
+    return t("Microphone is unavailable or already in use");
+  }
+  // Unknown failure: show the real reason instead of a generic string.
+  return `${t("Could not start recording")}: ${name ? `${name}: ` : ""}${detail}`;
+}
+
+/**
+ * Request the microphone. Throws `MicUnavailableError` when the API is missing
+ * (so callers can show the "not available in this context" notification), and
+ * otherwise rethrows the raw getUserMedia error for `mapGetUserMediaError`. The
+ * caller owns the returned stream (assigns it to its own streamRef and is
+ * responsible for stopping the tracks on every exit path).
+ */
+export async function acquireMicStream(): Promise<MediaStream> {
+  if (!navigator.mediaDevices?.getUserMedia) {
+    throw new MicUnavailableError();
+  }
+  return navigator.mediaDevices.getUserMedia({ audio: true });
+}
+
+/**
+ * Shared synchronous double-start guard. Returns true when a new capture may
+ * begin, false when one is already starting or live (so the second click is a
+ * no-op and never opens a leaking second MediaStream). `statusIsIdle` reflects
+ * the React status; the ref-backed flags cover the window before the next render.
+ */
+export function canStartCapture(args: {
+  starting: boolean;
+  hasStream: boolean;
+  hasLiveResource: boolean;
+  statusIsIdle: boolean;
+}): boolean {
+  if (args.starting || args.hasStream || args.hasLiveResource) return false;
+  return args.statusIsIdle;
+}
--- a/apps/client/src/features/dictation/audio/pcm16-dsp.test.ts
+++ b/apps/client/src/features/dictation/audio/pcm16-dsp.test.ts
@@ -0,0 +1,178 @@
+import { describe, it, expect } from "vitest";
+import {
+  floatSampleToInt16,
+  floatToPcm16LE,
+  LinearResampler,
+  OnePoleLowPass,
+  FrameAccumulator,
+  FRAME_SAMPLES,
+} from "./pcm16-dsp";
+
+// Read back the LE int16 values from a PCM16 ArrayBuffer for assertions.
+function readInt16LE(buf: ArrayBuffer): number[] {
+  const view = new DataView(buf);
+  const out: number[] = [];
+  for (let i = 0; i < buf.byteLength; i += 2) out.push(view.getInt16(i, true));
+  return out;
+}
+
+describe("floatSampleToInt16 / floatToPcm16LE", () => {
+  it("maps +1 → 32767, -1 → -32768, 0 → 0", () => {
+    expect(floatSampleToInt16(1)).toBe(32767);
+    expect(floatSampleToInt16(-1)).toBe(-32768);
+    expect(floatSampleToInt16(0)).toBe(0);
+  });
+
+  it("clamps +2 / -2 without overflow", () => {
+    expect(floatSampleToInt16(2)).toBe(32767);
+    expect(floatSampleToInt16(-2)).toBe(-32768);
+    expect(floatSampleToInt16(1000)).toBe(32767);
+    expect(floatSampleToInt16(-1000)).toBe(-32768);
+  });
+
+  it("handles NaN and Infinity", () => {
+    expect(floatSampleToInt16(NaN)).toBe(0);
+    expect(floatSampleToInt16(Infinity)).toBe(32767);
+    expect(floatSampleToInt16(-Infinity)).toBe(-32768);
+  });
+
+  it("writes little-endian byte order", () => {
+    // 1 → 32767 = 0x7FFF → LE bytes [0xFF, 0x7F].
+    const buf = floatToPcm16LE([1]);
+    const bytes = new Uint8Array(buf);
+    expect(bytes[0]).toBe(0xff);
+    expect(bytes[1]).toBe(0x7f);
+    expect(buf.byteLength).toBe(2);
+  });
+
+  it("emits exactly length*2 bytes and round-trips", () => {
+    const input = [0, 1, -1, 0.5, -0.5];
+    const buf = floatToPcm16LE(input);
+    expect(buf.byteLength).toBe(input.length * 2);
+    const back = readInt16LE(buf);
+    expect(back[0]).toBe(0);
+    expect(back[1]).toBe(32767);
+    expect(back[2]).toBe(-32768);
+  });
+
+  it("property: output is always within [-32768, 32767]", () => {
+    for (let i = 0; i < 1000; i++) {
+      const v = (Math.random() - 0.5) * 10; // span well beyond [-1,1]
+      const out = floatSampleToInt16(v);
+      expect(out).toBeGreaterThanOrEqual(-32768);
+      expect(out).toBeLessThanOrEqual(32767);
+    }
+    // Include hostile values explicitly.
+    for (const v of [NaN, Infinity, -Infinity, 1e308, -1e308]) {
+      const out = floatSampleToInt16(v);
+      expect(out).toBeGreaterThanOrEqual(-32768);
+      expect(out).toBeLessThanOrEqual(32767);
+    }
+  });
+});
+
+describe("LinearResampler", () => {
+  function ramp(n: number): Float32Array {
+    const a = new Float32Array(n);
+    for (let i = 0; i < n; i++) a[i] = i / n;
+    return a;
+  }
+
+  it("48k → 24k produces ~half the samples", () => {
+    const rs = new LinearResampler(48000, 24000);
+    const out = rs.process(ramp(1000));
+    expect(out.length).toBeGreaterThan(480);
+    expect(out.length).toBeLessThan(520);
+  });
+
+  it("ratio = 1 is approximately a passthrough (length-wise)", () => {
+    const rs = new LinearResampler(24000, 24000);
+    const input = ramp(1000);
+    const out = rs.process(input);
+    expect(Math.abs(out.length - input.length)).toBeLessThanOrEqual(1);
+  });
+
+  it("44.1k → 24k fractional ratio yields the expected count", () => {
+    const rs = new LinearResampler(44100, 24000);
+    const n = 4410;
+    const out = rs.process(ramp(n));
+    const expected = n * (24000 / 44100); // ~2400
+    expect(Math.abs(out.length - expected)).toBeLessThan(3);
+  });
+
+  it("cross-quantum continuity: split == single", () => {
+    const input = ramp(2000);
+    const single = new LinearResampler(48000, 24000).process(input);
+
+    const split = new LinearResampler(48000, 24000);
+    const a = split.process(input.subarray(0, 777));
+    const b = split.process(input.subarray(777));
+    const joined = new Float32Array(a.length + b.length);
+    joined.set(a, 0);
+    joined.set(b, a.length);
+
+    expect(joined.length).toBe(single.length);
+    for (let i = 0; i < single.length; i++) {
+      expect(joined[i]).toBeCloseTo(single[i], 6);
+    }
+  });
+
+  it("never reads out of bounds (no NaN in output)", () => {
+    const rs = new LinearResampler(48000, 24000);
+    for (let q = 0; q < 50; q++) {
+      const out = rs.process(ramp(128));
+      for (const v of out) expect(Number.isNaN(v)).toBe(false);
+    }
+  });
+});
+
+describe("OnePoleLowPass", () => {
+  it("is a passthrough when not downsampling", () => {
+    const lp = new OnePoleLowPass(24000, 24000);
+    for (const v of [0.5, -0.3, 1, -1]) expect(lp.process(v)).toBe(v);
+  });
+
+  it("attenuates a step (smooths) when downsampling", () => {
+    const lp = new OnePoleLowPass(48000, 24000);
+    const first = lp.process(1);
+    // One-pole on a step from 0 should not jump straight to 1.
+    expect(first).toBeLessThan(1);
+    expect(first).toBeGreaterThan(0);
+  });
+});
+
+describe("FrameAccumulator", () => {
+  it("emits exactly one 7200-byte frame for FRAME_SAMPLES samples", () => {
+    expect(FRAME_SAMPLES).toBe(3600);
+    const acc = new FrameAccumulator();
+    const frames = acc.push(new Float32Array(FRAME_SAMPLES));
+    expect(frames).toHaveLength(1);
+    expect(frames[0].byteLength).toBe(7200);
+    expect(acc.pending).toBe(0);
+  });
+
+  it("emits no frame for FRAME_SAMPLES-1 samples and carries the remainder", () => {
+    const acc = new FrameAccumulator();
+    const frames = acc.push(new Float32Array(FRAME_SAMPLES - 1));
+    expect(frames).toHaveLength(0);
+    expect(acc.pending).toBe(FRAME_SAMPLES - 1);
+  });
+
+  it("carries the remainder across pushes", () => {
+    const acc = new FrameAccumulator();
+    expect(acc.push(new Float32Array(2000))).toHaveLength(0);
+    const frames = acc.push(new Float32Array(2000)); // 4000 total → one frame
+    expect(frames).toHaveLength(1);
+    expect(acc.pending).toBe(400); // 4000 - 3600
+  });
+
+  it("flush emits the partial tail then clears", () => {
+    const acc = new FrameAccumulator();
+    acc.push(new Float32Array(100));
+    const tail = acc.flush();
+    expect(tail).not.toBeNull();
+    expect(tail!.byteLength).toBe(200);
+    expect(acc.pending).toBe(0);
+    expect(acc.flush()).toBeNull();
+  });
+});
--- a/apps/client/src/features/dictation/audio/pcm16-dsp.ts
+++ b/apps/client/src/features/dictation/audio/pcm16-dsp.ts
@@ -0,0 +1,187 @@
+// Pure DSP primitives for the realtime dictation capture path. These functions
+// carry NO Web Audio / worklet dependencies so they can be unit-tested directly
+// in jsdom/node. The AudioWorklet processor (`pcm16-worklet.ts`) re-implements
+// the same math inline (the worklet global scope forbids ES imports at runtime),
+// but THIS module is the single canonical reference the tests exercise and the
+// worklet is kept byte-identical in behavior to it. See the note in
+// `pcm16-worklet.ts`.
+
+// Target output rate required by the upstream transcription contract.
+export const TARGET_RATE = 24000;
+// ~150 ms of audio at the target rate: 24000 * 0.15 = 3600 samples per message.
+export const FRAME_SAMPLES = Math.round(TARGET_RATE * 0.15);
+
+/**
+ * Convert a single normalized float audio sample in [-1, 1] to a signed 16-bit
+ * integer. Out-of-range values (including ±Infinity) are clamped and NaN maps to
+ * 0, so the output is ALWAYS within [-32768, 32767]. Negative values scale by
+ * 0x8000 and non-negative by 0x7fff so that +1 → 32767 and -1 → -32768 exactly.
+ */
+export function floatSampleToInt16(sample: number): number {
+  let s = sample;
+  if (Number.isNaN(s)) return 0;
+  if (s > 1) s = 1;
+  else if (s < -1) s = -1;
+  const scaled = s < 0 ? s * 0x8000 : s * 0x7fff;
+  // Math.round to the nearest integer, then a hard clamp as a final guard.
+  let v = Math.round(scaled);
+  if (v > 32767) v = 32767;
+  else if (v < -32768) v = -32768;
+  return v;
+}
+
+/**
+ * Convert a Float32 sample buffer to little-endian PCM16 bytes. The returned
+ * ArrayBuffer is exactly `float32.length * 2` bytes; byte order is LE regardless
+ * of host endianness (DataView writes are explicit).
+ */
+export function floatToPcm16LE(float32: ArrayLike<number>): ArrayBuffer {
+  const count = float32.length;
+  const buffer = new ArrayBuffer(count * 2);
+  const view = new DataView(buffer);
+  for (let i = 0; i < count; i++) {
+    view.setInt16(i * 2, floatSampleToInt16(float32[i]), true);
+  }
+  return buffer;
+}
+
+/**
+ * A simple one-pole IIR low-pass filter used as a cheap anti-aliasing stage
+ * before downsampling (e.g. 48k → 24k). The coefficient is derived from the
+ * normalized cutoff so the filter attenuates content above the output Nyquist,
+ * reducing aliasing noise that would otherwise confuse the STT model. State is
+ * carried across quanta in the private `prev` field so there are no per-quantum
+ * seams. When `inputRate <= outputRate` (no downsampling) the filter is a
+ * passthrough.
+ */
+export class OnePoleLowPass {
+  private alpha: number;
+  private prev: number;
+  private readonly enabled: boolean;
+
+  constructor(inputRate: number, outputRate: number, primed = 0) {
+    // Cutoff a touch below the output Nyquist to leave transition room.
+    const cutoff = (outputRate / 2) * 0.9;
+    this.enabled = inputRate > outputRate && cutoff > 0 && inputRate > 0;
+    // Standard one-pole alpha: dt / (rc + dt), rc = 1 / (2π fc).
+    const dt = 1 / Math.max(inputRate, 1);
+    const rc = 1 / (2 * Math.PI * Math.max(cutoff, 1));
+    this.alpha = dt / (rc + dt);
+    this.prev = primed;
+  }
+
+  /** Filter one sample and return the filtered value; passthrough when disabled. */
+  process(sample: number): number {
+    if (!this.enabled) return sample;
+    this.prev = this.prev + this.alpha * (sample - this.prev);
+    return this.prev;
+  }
+}
+
+/**
+ * Stateful linear resampler that converts a stream of input quanta at
+ * `inputRate` to `outputRate`, carrying the fractional read position and the
+ * boundary sample across calls so splitting a signal into two `process()` calls
+ * yields the same output as one call (cross-quantum continuity). Never reads out
+ * of bounds: the right neighbor of every emitted sample is guaranteed to exist
+ * within the current quantum; any leftover position is carried.
+ */
+export class LinearResampler {
+  private readonly ratio: number;
+  private resamplePos = 0;
+  private prevSample = 0;
+  private primed = false;
+
+  constructor(inputRate: number, outputRate: number) {
+    // Input samples consumed per output sample. >1 when downsampling.
+    this.ratio = inputRate / outputRate;
+  }
+
+  /**
+   * Resample one quantum and return the produced output samples. The optional
+   * `filter` is applied to each input sample as it is consumed (anti-aliasing).
+   */
+  process(channel: ArrayLike<number>, filter?: OnePoleLowPass): Float32Array {
+    const n = channel.length;
+    if (n === 0) return new Float32Array(0);
+
+    // Apply the anti-aliasing filter once over the raw input, keeping the result
+    // in a local buffer so resampling reads filtered values. The filter state is
+    // carried inside `filter` across calls.
+    let src: ArrayLike<number> = channel;
+    if (filter) {
+      const filtered = new Float32Array(n);
+      for (let i = 0; i < n; i++) filtered[i] = filter.process(channel[i]);
+      src = filtered;
+    }
+
+    if (!this.primed) {
+      this.prevSample = src[0];
+      this.primed = true;
+      this.resamplePos = 0;
+    }
+
+    // Collect outputs in a growable array; converted to a Float32Array on return.
+    const out: number[] = [];
+    let pos = this.resamplePos;
+    while (pos < n - 1) {
+      const floor = Math.floor(pos);
+      const frac = pos - floor;
+      const s0 = floor < 0 ? this.prevSample : src[floor];
+      const s1 = src[floor + 1];
+      out.push(s0 + (s1 - s0) * frac);
+      pos += this.ratio;
+    }
+
+    this.resamplePos = pos - n;
+    this.prevSample = src[n - 1];
+    return Float32Array.from(out);
+  }
+}
+
+/**
+ * Accumulates resampled Float32 samples and emits whole PCM16 frames of exactly
+ * FRAME_SAMPLES (7200-byte ArrayBuffers). The remainder is carried until the
+ * next push completes a frame. `flush()` emits any partial remainder (used on
+ * teardown so the final ~150 ms is not lost).
+ */
+export class FrameAccumulator {
+  private acc: Float32Array;
+  private accLen = 0;
+  private readonly frameSamples: number;
+
+  constructor(frameSamples: number = FRAME_SAMPLES) {
+    this.frameSamples = frameSamples;
+    this.acc = new Float32Array(frameSamples);
+  }
+
+  /**
+   * Push samples; returns zero or more complete PCM16 frame buffers (each
+   * `frameSamples * 2` bytes). The carried remainder stays buffered.
+   */
+  push(samples: ArrayLike<number>): ArrayBuffer[] {
+    const frames: ArrayBuffer[] = [];
+    for (let i = 0; i < samples.length; i++) {
+      this.acc[this.accLen] = samples[i];
+      this.accLen += 1;
+      if (this.accLen >= this.frameSamples) {
+        frames.push(floatToPcm16LE(this.acc.subarray(0, this.accLen)));
+        this.accLen = 0;
+      }
+    }
+    return frames;
+  }
+
+  /** Emit the partial remainder (if any) as one frame and clear it. */
+  flush(): ArrayBuffer | null {
+    if (this.accLen === 0) return null;
+    const buf = floatToPcm16LE(this.acc.subarray(0, this.accLen));
+    this.accLen = 0;
+    return buf;
+  }
+
+  /** Number of buffered samples not yet flushed. */
+  get pending(): number {
+    return this.accLen;
+  }
+}
--- a/apps/client/src/features/dictation/audio/pcm16-worklet.ts
+++ b/apps/client/src/features/dictation/audio/pcm16-worklet.ts
@@ -0,0 +1,179 @@
+// Self-contained AudioWorkletProcessor that turns the live microphone stream into
+// PCM16 (signed 16-bit, little-endian), mono, 24000 Hz chunks for the realtime STT
+// upstream. It runs in the AudioWorklet global scope, so it MUST NOT import anything
+// (the worklet module has no module graph / bundler runtime around it).
+//
+// IMPORTANT — single source of truth: the DSP math below (float→PCM16 conversion,
+// the one-pole anti-aliasing low-pass, linear resampling, and frame accumulation)
+// is the SAME algorithm exported as pure, unit-tested functions from the sibling
+// `pcm16-dsp.ts`. Because the worklet scope cannot `import` at runtime, the logic
+// is mirrored here inline rather than imported. The unit tests exercise only the
+// pure module, so any change to one MUST be mirrored by hand in the other.
+//
+// Per `process()` call the host hands us a render quantum (typically 128 frames) at
+// the context sample rate. We read the first input channel (mono), apply a cheap
+// anti-aliasing low-pass, linearly resample to 24000 Hz while carrying the
+// fractional read position across calls (so we never assume a particular input
+// rate, e.g. 44.1k or 48k), accumulate the resampled samples, and once we have
+// ~150 ms worth (3600 samples) we emit them as an Int16 ArrayBuffer transferred to
+// the main thread. A 'flush' message from the main thread emits the partial tail so
+// the last ~150 ms is not lost on stop.
+
+// Target output rate required by the upstream transcription contract.
+const TARGET_RATE = 24000;
+// ~150 ms of audio at the target rate: 24000 * 0.15 = 3600 samples per message.
+const FRAME_SAMPLES = Math.round(TARGET_RATE * 0.15);
+
+class Pcm16Worklet extends AudioWorkletProcessor {
+  // Fractional read position within the CURRENT quantum, in input-sample units.
+  // Kept across `process()` calls so resampling has no per-quantum seams. After a
+  // quantum it is rebased relative to the next quantum's start, so a value in
+  // [-1, 0) means "interpolate between the previous quantum's last sample and the
+  // next quantum's first sample".
+  private resamplePos = 0;
+
+  // The previous quantum's last input sample, used to interpolate across the
+  // boundary between two render quanta (the conceptual sample at index -1).
+  private prevSample = 0;
+
+  // True once at least one sample has been seen (so `prevSample` is meaningful).
+  private primed = false;
+
+  // Accumulated resampled Float32 samples awaiting conversion + flush.
+  private acc: Float32Array = new Float32Array(FRAME_SAMPLES);
+  private accLen = 0;
+
+  // --- Anti-aliasing one-pole low-pass state (see OnePoleLowPass in pcm16-dsp) ---
+  // Configured lazily on the first quantum once `sampleRate` is known.
+  private lpAlpha = 1;
+  private lpPrev = 0;
+  private lpEnabled = false;
+  private lpConfigured = false;
+
+  constructor() {
+    super();
+    // The main thread asks for a tail flush on stop so the last partial frame
+    // (up to ~150 ms) is not dropped. Only a "flush" message flushes the remainder.
+    this.port.onmessage = (event: MessageEvent) => {
+      if (event.data === "flush") this.flush();
+    };
+  }
+
+  private configureLowPass(): void {
+    if (this.lpConfigured) return;
+    this.lpConfigured = true;
+    const inputRate = sampleRate;
+    const outputRate = TARGET_RATE;
+    const cutoff = (outputRate / 2) * 0.9;
+    this.lpEnabled = inputRate > outputRate && cutoff > 0 && inputRate > 0;
+    const dt = 1 / Math.max(inputRate, 1);
+    const rc = 1 / (2 * Math.PI * Math.max(cutoff, 1));
+    this.lpAlpha = dt / (rc + dt);
+    this.lpPrev = 0;
+  }
+
+  private lowPass(sample: number): number {
+    if (!this.lpEnabled) return sample;
+    this.lpPrev = this.lpPrev + this.lpAlpha * (sample - this.lpPrev);
+    return this.lpPrev;
+  }
+
+  process(inputs: Float32Array[][]): boolean {
+    const input = inputs[0];
+    // No connected input (or a momentarily empty quantum): keep the node alive
+    // and skip processing; the untouched output stays silent.
+    const channel = input && input.length > 0 ? input[0] : undefined;
+
+    if (channel && channel.length > 0) {
+      this.resampleAndAccumulate(channel);
+    }
+
+    // Outputs are never written, so they stay silent: connecting this node to the
+    // destination keeps the graph running without echoing the microphone back.
+    return true;
+  }
+
+  // Apply anti-aliasing, linearly resample `channel` (at the context `sampleRate`)
+  // to TARGET_RATE, and push the results into the accumulator, flushing whole
+  // frames as they fill.
+  private resampleAndAccumulate(channel: Float32Array): void {
+    this.configureLowPass();
+    const ratio = sampleRate / TARGET_RATE; // input samples consumed per output sample
+    const n = channel.length;
+
+    // Anti-alias the raw input first; carry the filter state across quanta.
+    const src = new Float32Array(n);
+    for (let i = 0; i < n; i++) src[i] = this.lowPass(channel[i]);
+
+    if (!this.primed) {
+      // First quantum: there is no real predecessor, so seed the virtual index -1
+      // with this quantum's first sample and start reading from 0.
+      this.prevSample = src[0];
+      this.primed = true;
+      this.resamplePos = 0;
+    }
+
+    let pos = this.resamplePos;
+
+    // Emit output samples whose RIGHT neighbor (floor + 1) is available within
+    // this quantum, i.e. while floor + 1 <= n - 1  ⇔  pos < n - 1. The left
+    // neighbor at floor === -1 is the carried `prevSample`; floor >= 0 reads the
+    // quantum directly. Any leftover position (whose right neighbor would be the
+    // NEXT quantum's first sample) is carried via `resamplePos` and resolved on
+    // the next call. This guarantees we never read `src[n]` (out of bounds).
+    while (pos < n - 1) {
+      const floor = Math.floor(pos);
+      const frac = pos - floor;
+
+      const s0 = floor < 0 ? this.prevSample : src[floor];
+      const s1 = src[floor + 1];
+
+      this.pushSample(s0 + (s1 - s0) * frac);
+      pos += ratio;
+    }
+
+    // Rebase the leftover position relative to the next quantum's start and carry
+    // this quantum's last sample as the predecessor for the boundary interval.
+    this.resamplePos = pos - n;
+    this.prevSample = src[n - 1];
+  }
+
+  // Append one resampled sample; flush a full PCM16 frame whenever the
+  // accumulator reaches FRAME_SAMPLES.
+  private pushSample(sample: number): void {
+    this.acc[this.accLen] = sample;
+    this.accLen += 1;
+    if (this.accLen >= FRAME_SAMPLES) {
+      this.flush();
+    }
+  }
+
+  // Convert the accumulated Float32 samples to Int16 LE and post the ArrayBuffer
+  // to the main thread, transferring ownership (zero-copy). DataView writes are
+  // little-endian to match the PCM16 contract regardless of host endianness.
+  // Also invoked on a 'flush' message to emit a partial tail frame on stop.
+  private flush(): void {
+    const count = this.accLen;
+    if (count === 0) return;
+
+    const buffer = new ArrayBuffer(count * 2);
+    const view = new DataView(buffer);
+    for (let i = 0; i < count; i++) {
+      // Map NaN to 0, clamp to [-1, 1], then scale to the signed 16-bit range.
+      // Mirrors floatSampleToInt16 in pcm16-dsp.ts.
+      let s = this.acc[i];
+      if (Number.isNaN(s)) s = 0;
+      else if (s > 1) s = 1;
+      else if (s < -1) s = -1;
+      let v = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
+      if (v > 32767) v = 32767;
+      else if (v < -32768) v = -32768;
+      view.setInt16(i * 2, v, true);
+    }
+    this.accLen = 0;
+
+    this.port.postMessage(buffer, [buffer]);
+  }
+}
+
+registerProcessor("pcm16-worklet", Pcm16Worklet);
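Review note: the boundary handling in `resampleAndAccumulate` is easier to check outside the AudioWorklet scope. The following is a minimal sketch, not the worklet itself. `StreamingLinearResampler` and its `push` method are hypothetical names introduced for illustration, and the sketch omits the low-pass filter and frame accumulator. It only reproduces the carried `prevSample` / `resamplePos` state, so a chunked input can be compared against the expected continuous output.

```typescript
// Hypothetical standalone helper (not part of the PR): linear resampling with
// state carried across chunks, mirroring the loop in resampleAndAccumulate.
class StreamingLinearResampler {
  private prevSample = 0; // last sample of the previous chunk (virtual index -1)
  private pos = 0; // fractional read position relative to the current chunk
  private primed = false;

  constructor(
    private readonly inputRate: number,
    private readonly outputRate: number,
  ) {}

  push(src: Float32Array): number[] {
    const out: number[] = [];
    const n = src.length;
    if (n === 0) return out;

    if (!this.primed) {
      // First chunk: no real predecessor, so seed index -1 with src[0].
      this.prevSample = src[0];
      this.primed = true;
      this.pos = 0;
    }

    const ratio = this.inputRate / this.outputRate; // input samples per output sample
    let pos = this.pos;

    // Emit only while the right neighbor (floor + 1) lies inside this chunk.
    while (pos < n - 1) {
      const floor = Math.floor(pos);
      const frac = pos - floor;
      const s0 = floor < 0 ? this.prevSample : src[floor];
      const s1 = src[floor + 1];
      out.push(s0 + (s1 - s0) * frac);
      pos += ratio;
    }

    // Rebase onto the next chunk and remember the boundary sample.
    this.pos = pos - n;
    this.prevSample = src[n - 1];
    return out;
  }
}
```

Feeding a linear ramp in two chunks is a convenient check: the outputs should continue the ramp across the chunk boundary with no repeated or skipped sample, including when the carried position is negative (for example at a 3:2 rate ratio).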
--- a/apps/client/src/features/dictation/components/mic-button.module.css
+++ b/apps/client/src/features/dictation/components/mic-button.module.css
@@ -1,24 +0,0 @@
-.recordingWrap {
-  position: relative;
-  display: inline-flex;
-  align-items: center;
-  justify-content: center;
-}
-
-/* Translucent red halo that sits behind the stop button and scales with the
-   live microphone level (scale set inline from audioLevel). Radius follows the
-   ActionIcon's own radius so the halo matches the button's rounded-square
-   outline instead of being a circle. */
-.pulse {
-  position: absolute;
-  inset: 0;
-  border-radius: var(--mantine-radius-default);
-  background-color: var(--mantine-color-red-5);
-  opacity: 0.35;
-  transform-origin: center;
-  transform: scale(1);
-  transition: transform 90ms linear;
-  pointer-events: none;
-  will-change: transform;
-  z-index: 0;
-}
--- a/apps/client/src/features/dictation/components/mic-button.tsx
+++ b/apps/client/src/features/dictation/components/mic-button.tsx
@@ -3,8 +3,6 @@ import { ActionIcon, Loader, Tooltip } from "@mantine/core";
 import { IconMicrophone, IconPlayerStopFilled } from "@tabler/icons-react";
 import { useTranslation } from "react-i18next";
 import { useDictation } from "@/features/dictation/hooks/use-dictation";
-import { useStreamingDictation } from "@/features/dictation/hooks/use-streaming-dictation";
-import classes from "./mic-button.module.css";

 interface MicButtonProps {
  onText: (text: string) => void;
@@ -13,14 +11,6 @@ interface MicButtonProps {
  // Mantine ActionIcon size token; "lg" matches the chat composer, "md" the
  // editor toolbar.
  size?: "md" | "lg";
-  // Optional Mantine color override for the idle/transcribing states (the
-  // recording state stays red). Defaults to the theme primary when omitted.
-  color?: string;
-  // Optional explicit glyph size override; defaults to the size-token value.
-  iconSize?: number;
-  // When true, use the streaming (Silero-VAD) dictation controller, which emits
-  // text progressively as the user pauses; otherwise use the batch controller.
-  streaming?: boolean;
 }

 /**
@@ -34,64 +24,35 @@ export const MicButton: FC<MicButtonProps> = ({
  onStart,
  disabled,
  size = "lg",
-  color,
-  iconSize,
-  streaming = false,
 }) => {
  const { t } = useTranslation();
-  // Call BOTH hooks unconditionally to respect the rules of hooks: which one is
-  // active is a render-time choice, but both must be invoked every render. This
-  // is safe because both controllers are inert until start() is called — neither
-  // opens the mic on mount — so the unused one costs nothing.
-  const batchCtl = useDictation({ onText, onStart });
-  const streamingCtl = useStreamingDictation({ onText, onStart });
-  const ctl = streaming ? streamingCtl : batchCtl;
-  const { status, start, stop, audioLevel } = ctl;
-  const resolvedIconSize = iconSize ?? (size === "lg" ? 18 : 16);
+  const { status, start, stop } = useDictation({ onText, onStart });
+  const iconSize = size === "lg" ? 18 : 16;

  if (status === "recording") {
-    // Live volume-driven halo: the scale follows the current mic level.
-    const haloScale = 1 + Math.min(1, audioLevel) * 0.9;
    return (
      <Tooltip label={t("Stop recording")} withArrow>
-        <span className={classes.recordingWrap}>
-          <span
-            className={classes.pulse}
-            style={{ transform: `scale(${haloScale})` }}
-            aria-hidden="true"
-          />
-          <ActionIcon
-            size={size}
-            color="red"
-            variant="light"
-            onClick={stop}
-            aria-label={t("Stop recording")}
-            style={{ position: "relative", zIndex: 1 }}
-          >
-            <IconPlayerStopFilled size={resolvedIconSize} />
-          </ActionIcon>
-        </span>
+        <ActionIcon
+          size={size}
+          color="red"
+          variant="light"
+          onClick={stop}
+          aria-label={t("Stop recording")}
+        >
+          <IconPlayerStopFilled size={iconSize} />
+        </ActionIcon>
      </Tooltip>
    );
  }

-  if (
-    status === "loading" ||
-    status === "transcribing" ||
-    status === "error"
-  ) {
-    // "loading" (streaming hook fetching the VAD model on first use) shows the
-    // same spinner+disabled state so the first click is visibly acknowledged and
-    // a confusing second click can't fire while the model loads.
-    const label = status === "loading" ? t("Preparing…") : t("Transcribing…");
+  if (status === "transcribing" || status === "error") {
    return (
-      <Tooltip label={label} withArrow>
+      <Tooltip label={t("Transcribing…")} withArrow>
        <ActionIcon
          size={size}
          variant="subtle"
-          color={color}
          disabled
-          aria-label={label}
+          aria-label={t("Transcribing…")}
        >
          <Loader size="xs" />
        </ActionIcon>
@@ -104,12 +65,11 @@ export const MicButton: FC<MicButtonProps> = ({
      <ActionIcon
        size={size}
        variant="subtle"
-        color={color}
        onClick={() => void start()}
        disabled={disabled}
        aria-label={t("Start dictation")}
      >
-        <IconMicrophone size={resolvedIconSize} />
+        <IconMicrophone size={iconSize} />
      </ActionIcon>
    </Tooltip>
  );
--- a/apps/client/src/features/dictation/components/realtime-mic-button.test.tsx
+++ b/apps/client/src/features/dictation/components/realtime-mic-button.test.tsx
@@ -0,0 +1,103 @@
+import { describe, it, expect, vi, beforeEach } from "vitest";
+import { render, screen, fireEvent, cleanup } from "@testing-library/react";
+import { MantineProvider } from "@mantine/core";
+
+// jsdom has no matchMedia; Mantine's color-scheme provider needs it. Stub a
+// minimal, inert implementation before any MantineProvider mounts.
+if (typeof window.matchMedia !== "function") {
+  window.matchMedia = (query: string) =>
+    ({
+      matches: false,
+      media: query,
+      onchange: null,
+      addListener: () => undefined,
+      removeListener: () => undefined,
+      addEventListener: () => undefined,
+      removeEventListener: () => undefined,
+      dispatchEvent: () => false,
+    }) as unknown as MediaQueryList;
+}
+
+// Mock i18n so labels render the raw key.
+vi.mock("react-i18next", () => ({
+  useTranslation: () => ({ t: (k: string) => k, i18n: {} }),
+}));
+
+// Controllable mock of the dictation hook. Tests set the returned status and
+// inspect the start/stop spies.
+const hookState: {
+  status: "idle" | "recording" | "error";
+  start: ReturnType<typeof vi.fn>;
+  stop: ReturnType<typeof vi.fn>;
+  cancel: ReturnType<typeof vi.fn>;
+} = {
+  status: "idle",
+  start: vi.fn(),
+  stop: vi.fn(),
+  cancel: vi.fn(),
+};
+
+vi.mock("@/features/dictation/hooks/use-realtime-dictation", () => ({
+  useRealtimeDictation: () => hookState,
+}));
+
+import { RealtimeMicButton } from "./realtime-mic-button";
+
+function renderButton(props: Partial<Parameters<typeof RealtimeMicButton>[0]> = {}) {
+  const onInterim = vi.fn();
+  const onFinal = vi.fn();
+  const utils = render(
+    <MantineProvider>
+      <RealtimeMicButton onInterim={onInterim} onFinal={onFinal} {...props} />
+    </MantineProvider>,
+  );
+  return { onInterim, onFinal, ...utils };
+}
+
+beforeEach(() => {
+  cleanup();
+  hookState.status = "idle";
+  hookState.start = vi.fn();
+  hookState.stop = vi.fn();
+  hookState.cancel = vi.fn();
+});
+
+describe("RealtimeMicButton", () => {
+  it("idle: clicking calls start", () => {
+    renderButton();
+    fireEvent.click(screen.getByLabelText("Start dictation"));
+    expect(hookState.start).toHaveBeenCalledTimes(1);
+    expect(hookState.stop).not.toHaveBeenCalled();
+  });
+
+  it("recording: clicking calls stop", () => {
+    hookState.status = "recording";
+    renderButton();
+    fireEvent.click(screen.getByLabelText("Stop recording"));
+    expect(hookState.stop).toHaveBeenCalledTimes(1);
+    expect(hookState.start).not.toHaveBeenCalled();
+  });
+
+  it("recording → idle transition fires onInterim('') exactly once", () => {
+    hookState.status = "recording";
+    const { onInterim, rerender } = renderButton();
+    expect(onInterim).not.toHaveBeenCalled();
+
+    hookState.status = "idle";
+    rerender(
+      <MantineProvider>
+        <RealtimeMicButton onInterim={onInterim} onFinal={vi.fn()} />
+      </MantineProvider>,
+    );
+    expect(onInterim).toHaveBeenCalledTimes(1);
+    expect(onInterim).toHaveBeenCalledWith("");
+
+    // A further re-render in idle does not fire it again.
+    rerender(
+      <MantineProvider>
+        <RealtimeMicButton onInterim={onInterim} onFinal={vi.fn()} />
+      </MantineProvider>,
+    );
+    expect(onInterim).toHaveBeenCalledTimes(1);
+  });
+});
--- a/apps/client/src/features/dictation/components/realtime-mic-button.tsx
+++ b/apps/client/src/features/dictation/components/realtime-mic-button.tsx
@@ -0,0 +1,84 @@
+import { FC, useEffect, useRef } from "react";
+import { ActionIcon, Tooltip } from "@mantine/core";
+import { IconMicrophone, IconPlayerStopFilled } from "@tabler/icons-react";
+import { useTranslation } from "react-i18next";
+import {
+  useRealtimeDictation,
+  type RealtimeDictationStatus,
+} from "@/features/dictation/hooks/use-realtime-dictation";
+
+interface RealtimeMicButtonProps {
+  onInterim: (text: string) => void;
+  onFinal: (text: string) => void;
+  onStart?: () => void;
+  disabled?: boolean;
+  // Mantine ActionIcon size token; "lg" matches the chat composer, "md" the
+  // editor toolbar.
+  size?: "md" | "lg";
+}
+
+/**
+ * Streaming sibling of MicButton. Drives the realtime dictation state machine:
+ * a click starts recording (mic icon), a second click stops it (stop icon).
+ * Interim/final transcripts are surfaced through the onInterim/onFinal props as
+ * they arrive; there is no "transcribing" state because final text lands
+ * incrementally while recording. Mirrors MicButton's look and tooltips.
+ */
+export const RealtimeMicButton: FC<RealtimeMicButtonProps> = ({
+  onInterim,
+  onFinal,
+  onStart,
+  disabled,
+  size = "lg",
+}) => {
+  const { t } = useTranslation();
+  const { status, start, stop } = useRealtimeDictation({
+    onInterim,
+    onFinal,
+    onStart,
+  });
+  const iconSize = size === "lg" ? 18 : 16;
+
+  // When recording ends (status leaves "recording" for idle/error), clear any
+  // leftover partial in the consumer once. Tracked via the previous status so
+  // it only fires on the transition, not on every render.
+  const prevStatusRef = useRef<RealtimeDictationStatus>(status);
+  useEffect(() => {
+    if (prevStatusRef.current === "recording" && status !== "recording") {
+      onInterim("");
+    }
+    prevStatusRef.current = status;
+  }, [status, onInterim]);
+
+  if (status === "recording") {
+    return (
+      <Tooltip label={t("Stop recording")} withArrow>
+        <ActionIcon
+          size={size}
+          color="red"
+          variant="light"
+          onClick={stop}
+          aria-label={t("Stop recording")}
+        >
+          <IconPlayerStopFilled size={iconSize} />
+        </ActionIcon>
+      </Tooltip>
+    );
+  }
+
+  // idle / error: subtle mic to (re)start. No spinner — there is no separate
+  // transcribing phase in the realtime flow.
+  return (
+    <Tooltip label={t("Start dictation")} withArrow>
+      <ActionIcon
+        size={size}
+        variant="subtle"
+        onClick={() => void start()}
+        disabled={disabled}
+        aria-label={t("Start dictation")}
+      >
+        <IconMicrophone size={iconSize} />
+      </ActionIcon>
+    </Tooltip>
+  );
+};
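Review note: the `prevStatusRef` effect in `RealtimeMicButton` is an edge detector, firing once when the status leaves "recording". A minimal sketch of that logic without React is shown here. `createRecordingEndDetector` is a hypothetical name used only for illustration and is not part of the PR.

```typescript
type Status = "idle" | "recording" | "error";

// Hypothetical helper: returns a function that reports true only on the call
// where the status moves from "recording" to anything else.
function createRecordingEndDetector(initial: Status): (next: Status) => boolean {
  let prev: Status = initial;
  return (next: Status): boolean => {
    const ended = prev === "recording" && next !== "recording";
    prev = next; // always record the latest status, as the effect does
    return ended;
  };
}
```

Because the previous status is updated on every call, repeated calls with the same non-recording status report `false`, which corresponds to the "exactly once" assertion in the component test.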
--- a/apps/client/src/features/dictation/hooks/use-dictation.ts
+++ b/apps/client/src/features/dictation/hooks/use-dictation.ts
@@ -2,16 +2,14 @@ import { useCallback, useEffect, useRef, useState } from "react";
 import { notifications } from "@mantine/notifications";
 import { useTranslation } from "react-i18next";
 import { transcribeAudio } from "@/features/dictation/services/dictation-service";
+import {
+  acquireMicStream,
+  canStartCapture,
+  mapGetUserMediaError,
+  MicUnavailableError,
+} from "@/features/dictation/audio/mic-capture";

-// "loading" is set only by the streaming hook while it lazily loads the VAD
-// model on first use; the batch hook never sets it. It exists so the streaming
-// hook and the mic button can show immediate feedback during that load.
-export type DictationStatus =
-  | "idle"
-  | "recording"
-  | "transcribing"
-  | "error"
-  | "loading";
+export type DictationStatus = "idle" | "recording" | "transcribing" | "error";

 interface UseDictationOptions {
  onText: (text: string) => void;
@@ -24,8 +22,6 @@ interface UseDictationResult {
  start: () => Promise<void>;
  stop: () => void;
  cancel: () => void;
-  // Smoothed live microphone level in the 0..1 range while recording (0 when idle).
-  audioLevel: number;
 }

 // Candidate container/codec combinations in preference order. The first one the
@@ -66,7 +62,6 @@ export function useDictation(
 ): UseDictationResult {
  const { t } = useTranslation();
  const [status, setStatus] = useState<DictationStatus>("idle");
-  const [audioLevel, setAudioLevel] = useState(0);

  // Keep the latest callbacks in a ref so the recorder's onstop closure always
  // calls the current handlers without re-creating the recorder.
@@ -81,15 +76,6 @@ export function useDictation(
  const canceledRef = useRef(false);
  const startingRef = useRef(false);

-  // Web Audio metering: derives a live input level from the captured stream.
-  const audioContextRef = useRef<AudioContext | null>(null);
-  const analyserRef = useRef<AnalyserNode | null>(null);
-  const sourceRef = useRef<MediaStreamAudioSourceNode | null>(null);
-  const rafRef = useRef<number | null>(null);
-  // Exponentially smoothed level, and the last value pushed to React state.
-  const smoothedLevelRef = useRef(0);
-  const emittedLevelRef = useRef(0);
-
  const clearTimer = useCallback(() => {
    if (timerRef.current !== null) {
      clearTimeout(timerRef.current);
@@ -102,132 +88,39 @@ export function useDictation(
    streamRef.current = null;
  }, []);

-  // Tear the audio meter down fully. Safe to call multiple times and on any exit
-  // path; defensive try/catch so cleanup never throws.
-  const stopMeter = useCallback(() => {
-    // Cancel the rAF first so getByteTimeDomainData can't run on a closed context.
-    if (rafRef.current !== null) {
-      cancelAnimationFrame(rafRef.current);
-      rafRef.current = null;
-    }
-    try {
-      sourceRef.current?.disconnect();
-      sourceRef.current = null;
-      analyserRef.current = null;
-      if (audioContextRef.current && audioContextRef.current.state !== "closed") {
-        void audioContextRef.current.close();
-      }
-      audioContextRef.current = null;
-    } catch (err) {
-      // Cleanup must never throw; just log for diagnosis.
-      console.warn("[dictation] audio meter teardown failed", err);
-    }
-    smoothedLevelRef.current = 0;
-    emittedLevelRef.current = 0;
-    setAudioLevel(0);
-  }, []);
-
-  // Set up Web Audio metering on the already-captured stream. Reuses the existing
-  // MediaStream — never requests a second mic. Failure here must not break
-  // recording: on any error we warn and return, leaving the recorder running.
-  const startMeter = useCallback((stream: MediaStream) => {
-    try {
-      const Ctor =
-        window.AudioContext ||
-        (window as unknown as { webkitAudioContext?: typeof AudioContext })
-          .webkitAudioContext;
-      if (!Ctor) return;
-
-      const audioContext = new Ctor();
-      // Some browsers start the context suspended; resume so the loop produces
-      // data. Swallow rejection (e.g. context already closed by a fast
-      // start/stop race) to avoid an unhandled promise rejection.
-      audioContext.resume().catch(() => {});
-      const source = audioContext.createMediaStreamSource(stream);
-      const analyser = audioContext.createAnalyser();
-      analyser.fftSize = 512;
-      analyser.smoothingTimeConstant = 0.5;
-      // Connect ONLY to the analyser — never to destination, which would echo the
-      // mic back to the speakers.
-      source.connect(analyser);
-
-      audioContextRef.current = audioContext;
-      sourceRef.current = source;
-      analyserRef.current = analyser;
-
-      // Allocate the time-domain buffer once and reuse it on every tick.
-      const data = new Uint8Array(analyser.fftSize);
-
-      const tick = () => {
-        const a = analyserRef.current;
-        if (!a) return;
-        a.getByteTimeDomainData(data);
-        // RMS of the centered waveform (samples are 0..255, midpoint 128).
-        let sumSquares = 0;
-        for (let i = 0; i < data.length; i++) {
-          const v = (data[i] - 128) / 128;
-          sumSquares += v * v;
-        }
-        const rms = Math.sqrt(sumSquares / data.length);
-        // Boost + clamp so normal speech maps to a visible 0..1 range.
-        const level = Math.min(1, rms * 3);
-        // Exponential smoothing to avoid jitter.
-        smoothedLevelRef.current = smoothedLevelRef.current * 0.8 + level * 0.2;
-        // Throttle React re-renders: only push when it changed meaningfully.
-        if (Math.abs(smoothedLevelRef.current - emittedLevelRef.current) > 0.01) {
-          emittedLevelRef.current = smoothedLevelRef.current;
-          setAudioLevel(smoothedLevelRef.current);
-        }
-        rafRef.current = requestAnimationFrame(tick);
-      };
-      rafRef.current = requestAnimationFrame(tick);
-    } catch (err) {
-      // Web Audio unavailable or threw: recording continues without the meter.
-      console.warn("[dictation] audio meter unavailable", err);
-    }
-  }, []);
-
  const start = useCallback(async (): Promise<void> => {
-    // Synchronous live guard: status is stale between renders, so also block on
-    // refs to prevent a double-click from opening two MediaStreams (the first
-    // would leak).
-    if (startingRef.current || recorderRef.current || streamRef.current) return;
-    if (status !== "idle") return;
-    startingRef.current = true;
-
-    if (!navigator.mediaDevices?.getUserMedia) {
-      const reason =
-        "navigator.mediaDevices.getUserMedia is unavailable in this context";
-      console.error("[dictation] " + reason);
-      notifications.show({
-        color: "red",
-        message: t("Audio recording is not available in this browser/context"),
-      });
-      setStatus("idle");
-      startingRef.current = false;
+    // Synchronous live guard (shared with the realtime hook): status is stale
+    // between renders, so also block on refs to prevent a double-click from
+    // opening two MediaStreams (the first would leak).
+    if (
+      !canStartCapture({
+        starting: startingRef.current,
+        hasStream: streamRef.current !== null,
+        hasLiveResource: recorderRef.current !== null,
+        statusIsIdle: status === "idle",
+      })
+    ) {
      return;
    }
+    startingRef.current = true;

    let stream: MediaStream;
    try {
-      stream = await navigator.mediaDevices.getUserMedia({ audio: true });
+      stream = await acquireMicStream();
    } catch (err) {
-      // Always log the full error for diagnosis (name, message, stack).
-      console.error("[dictation] getUserMedia failed", err);
-      const name = (err as { name?: string })?.name;
-      const detail = (err as { message?: string })?.message ?? String(err);
-      let message: string;
-      if (name === "NotAllowedError" || name === "SecurityError") {
-        message = t("Microphone access denied");
-      } else if (name === "NotFoundError" || name === "OverconstrainedError") {
-        message = t("No microphone found");
-      } else if (name === "NotReadableError" || name === "AbortError") {
-        message = t("Microphone is unavailable or already in use");
+      if (err instanceof MicUnavailableError) {
+        console.error("[dictation] " + err.message);
+        notifications.show({
+          color: "red",
+          message: t(
+            "Audio recording is not available in this browser/context",
+          ),
+        });
      } else {
-        // Unknown failure: show the real reason instead of a generic string.
-        message = `${t("Could not start recording")}: ${name ? `${name}: ` : ""}${detail}`;
+        // Always log the full error for diagnosis (name, message, stack).
+        console.error("[dictation] getUserMedia failed", err);
+        notifications.show({ color: "red", message: mapGetUserMediaError(err, t) });
      }
-      notifications.show({ color: "red", message });
      setStatus("idle");
      startingRef.current = false;
      return;
@@ -268,9 +161,8 @@ export function useDictation(
      const recordedMime = recorder.mimeType || mimeType || "audio/webm";
      const wasCanceled = canceledRef.current;

-      // Stop the mic tracks and the audio meter regardless of how we got here.
+      // Stop the mic tracks regardless of how we got here.
      stopTracks();
-      stopMeter();
      recorderRef.current = null;

      if (wasCanceled) {
@@ -343,49 +235,34 @@ export function useDictation(
    // Recording has truly begun; release the synchronous start guard.
    startingRef.current = false;

-    // Start the live audio meter on the stream we already acquired.
-    startMeter(stream);
-
    const maxDurationMs = optionsRef.current.maxDurationMs ?? 120000;
    timerRef.current = setTimeout(() => {
      if (recorderRef.current?.state === "recording") {
        recorderRef.current.stop();
      }
    }, maxDurationMs);
-  }, [status, t, clearTimer, stopTracks, startMeter, stopMeter]);
+  }, [status, t, clearTimer, stopTracks]);

  const stop = useCallback((): void => {
    clearTimer();
    const recorder = recorderRef.current;
    if (recorder && recorder.state === "recording") {
-      // Normal path: onstop tears down tracks + meter and runs transcription.
      recorder.stop();
-    } else {
-      // No live recorder (e.g. the track ended on its own): tear everything
-      // down directly so the meter/AudioContext and stream don't leak, and
-      // recover the UI to idle.
-      stopTracks();
-      stopMeter();
-      recorderRef.current = null;
-      chunksRef.current = [];
-      setStatus("idle");
    }
-  }, [clearTimer, stopTracks, stopMeter]);
+  }, [clearTimer]);

  const cancel = useCallback((): void => {
    clearTimer();
    canceledRef.current = true;
    const recorder = recorderRef.current;
    if (recorder && recorder.state === "recording") {
-      // onstop sees canceledRef and skips transcription; it also stops tracks
-      // and the meter.
+      // onstop sees canceledRef and skips transcription; it also stops tracks.
      recorder.stop();
    } else {
      stopTracks();
-      stopMeter();
    }
    setStatus("idle");
-  }, [clearTimer, stopTracks, stopMeter]);
+  }, [clearTimer, stopTracks]);

  // Clean up on unmount: stop any live recorder/stream and clear the timers.
  useEffect(() => {
@@ -401,9 +278,8 @@ export function useDictation(
        recorder.stop();
      }
      stopTracks();
-      stopMeter();
    };
-  }, [clearTimer, stopTracks, stopMeter]);
+  }, [clearTimer, stopTracks]);

-  return { status, start, stop, cancel, audioLevel };
+  return { status, start, stop, cancel };
 }
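Review note: `canStartCapture` lives in `mic-capture`, which this part of the diff does not show. Judging from the call site and the inline checks it replaces, it is presumably a pure predicate over four flags. The sketch below is an assumption about its shape, not the actual implementation. `CaptureGuardState` and `canStartCaptureSketch` are hypothetical names.

```typescript
// Hypothetical shape of the guard input, inferred from the call site above.
interface CaptureGuardState {
  starting: boolean; // a start() call is already in flight
  hasStream: boolean; // a MediaStream is currently held
  hasLiveResource: boolean; // a recorder (or other live resource) exists
  statusIsIdle: boolean; // the hook's React status is "idle"
}

// Assumed behavior: allow a start only when nothing is live or pending,
// matching the removed inline checks in use-dictation.
function canStartCaptureSketch(state: CaptureGuardState): boolean {
  return (
    !state.starting &&
    !state.hasStream &&
    !state.hasLiveResource &&
    state.statusIsIdle
  );
}
```

Keeping the guard pure means the double-click case can be unit tested without a browser: any single flag that indicates live or pending capture should block the start.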
--- a/apps/client/src/features/dictation/hooks/use-realtime-dictation.ts
+++ b/apps/client/src/features/dictation/hooks/use-realtime-dictation.ts
@@ -0,0 +1,492 @@
+import { useCallback, useEffect, useRef, useState } from "react";
+import { notifications } from "@mantine/notifications";
+import { useTranslation } from "react-i18next";
+import { RealtimeDictationClient } from "@/features/dictation/services/realtime-dictation-client";
+import {
+  acquireMicStream,
+  canStartCapture,
+  mapGetUserMediaError,
+  MicUnavailableError,
+} from "@/features/dictation/audio/mic-capture";
+import { baseLanguageSubtag } from "@/features/dictation/services/dictation-reducer";
+
+// The worklet module URL is produced via `new URL(..., import.meta.url)` so Vite
+// emits the processor as a separate, self-contained module chunk (it must run in
+// the AudioWorklet global scope, outside the main bundle). Built once at module
+// load — the resolved URL is stable for the app's lifetime.
+const PCM16_WORKLET_URL = new URL(
+  "../audio/pcm16-worklet.ts",
+  import.meta.url,
+);
+
+export type RealtimeDictationStatus = "idle" | "recording" | "error";
+
+export interface UseRealtimeDictationOptions {
+  onInterim: (text: string) => void; // latest partial for the live segment
+  onFinal: (text: string) => void; // a completed segment (trimmed)
+  onStart?: () => void; // fired right when capture begins (caret snapshot)
+  maxDurationMs?: number; // default 120000
+}
+
+export interface UseRealtimeDictationResult {
+  status: RealtimeDictationStatus;
+  start: () => Promise<void>;
+  stop: () => void;
+  cancel: () => void;
+}
+
+// AudioContext is webkit-prefixed on some older Safari builds; keep a typed
+// fallback so the hook never crashes when the standard name is missing.
+function getAudioContextCtor(): typeof AudioContext | undefined {
+  if (typeof AudioContext !== "undefined") return AudioContext;
+  const w = window as unknown as { webkitAudioContext?: typeof AudioContext };
+  return w.webkitAudioContext;
+}
+
+/**
+ * Streaming sibling of `use-dictation`. Captures the mic, resamples to PCM16
+ * 24 kHz in an AudioWorklet, and streams it over the normalized `/ai-realtime`
+ * Socket.IO namespace, surfacing interim/final transcripts as they arrive.
+ *
+ * Mirrors `use-dictation`'s conventions: refs hold the live graph/client/timers
+ * so re-renders never lose them, getUserMedia errors map to the same Mantine
+ * notifications, and every exit path stops the MediaStream tracks and closes the
+ * AudioContext. There is no `transcribing` state — final text arrives
+ * incrementally while `recording`.
+ */
+export function useRealtimeDictation(
+  options: UseRealtimeDictationOptions,
+): UseRealtimeDictationResult {
+  const { t, i18n } = useTranslation();
+  const [status, setStatus] = useState<RealtimeDictationStatus>("idle");
+
+  // Keep the latest callbacks in a ref so async socket handlers always call the
+  // current handlers without re-creating the capture graph.
+  const optionsRef = useRef(options);
+  optionsRef.current = options;
+
+  const streamRef = useRef<MediaStream | null>(null);
+  const audioContextRef = useRef<AudioContext | null>(null);
+  const sourceRef = useRef<MediaStreamAudioSourceNode | null>(null);
+  const workletRef = useRef<AudioWorkletNode | null>(null);
+  const clientRef = useRef<RealtimeDictationClient | null>(null);
+
+  const timerRef = useRef<ReturnType<typeof setTimeout> | null>(null);
+  const errorTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null);
+  // Defers the upstream/socket teardown a short beat after a graceful stop so the
+  // worklet's flushed tail frame can round-trip and be forwarded before we close.
+  const flushTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null);
+
+  const canceledRef = useRef(false);
+  const startingRef = useRef(false);
+  // True once the server emits `ready`; audio is buffered until then, then flushed.
+  const readyRef = useRef(false);
+  // PCM16 chunks captured before the upstream session is ready.
+  const pendingAudioRef = useRef<ArrayBuffer[]>([]);
+  // Stable ref to the latest stop() so the max-duration timer (armed inside the
+  // start closure) can invoke the current version without re-arming every render.
+  const stopRef = useRef<() => void>(() => undefined);
+
+  const clearTimer = useCallback(() => {
+    if (timerRef.current !== null) {
+      clearTimeout(timerRef.current);
+      timerRef.current = null;
+    }
+    if (flushTimerRef.current !== null) {
+      clearTimeout(flushTimerRef.current);
+      flushTimerRef.current = null;
+    }
+  }, []);
+
+  const stopTracks = useCallback(() => {
+    streamRef.current?.getTracks().forEach((track) => track.stop());
+    streamRef.current = null;
+  }, []);
+
+  // Tear down the audio graph (worklet node, source, context). Never throws on a
+  // half-built or already-closed graph.
+  const teardownAudio = useCallback(() => {
+    const worklet = workletRef.current;
+    if (worklet) {
+      worklet.port.onmessage = null;
+      try {
+        worklet.disconnect();
+      } catch {
+        // Node may already be disconnected; ignore.
+      }
+      workletRef.current = null;
+    }
+
+    const source = sourceRef.current;
+    if (source) {
+      try {
+        source.disconnect();
+      } catch {
+        // Ignore disconnect of an already-detached node.
+      }
+      sourceRef.current = null;
+    }
+
+    const ctx = audioContextRef.current;
+    if (ctx) {
+      audioContextRef.current = null;
+      if (ctx.state !== "closed") {
+        // close() returns a promise; swallow rejections so teardown never throws.
+        void ctx.close().catch(() => undefined);
+      }
+    }
+  }, []);
+
+  // Full teardown shared by stop/cancel/unmount. Order: stop streaming upstream,
+  // disconnect the socket, then dismantle the local audio graph and tracks, then
+  // clear timers and reset the ready/pending state. Also clears the interim
+  // "ghost" decoration in the consumer so it does not stick when the toolbar
+  // closes mid-recording (the unmount path runs teardown).
+  const teardown = useCallback(() => {
+    const client = clientRef.current;
+    if (client) {
+      clientRef.current = null;
+      try {
+        client.stop();
+      } catch {
+        // Socket may already be gone; ignore.
+      }
+      client.disconnect();
+    }
+
+    teardownAudio();
+    stopTracks();
+    clearTimer();
+
+    readyRef.current = false;
+    pendingAudioRef.current = [];
+    startingRef.current = false;
+
+    // Clear any leftover interim decoration. Guarded so a throwing consumer
+    // callback can never break teardown.
+    try {
+      optionsRef.current.onInterim("");
+    } catch (err) {
+      console.error("[realtime-dictation] onInterim('') during teardown threw", err);
+    }
+  }, [teardownAudio, stopTracks, clearTimer]);
+
+  // Ask the worklet to emit its partial tail frame (the last ~150 ms that has not
+  // yet filled a full frame) so it is not lost on stop. The worklet posts the
+  // remaining samples back over the existing port.onmessage handler, which
+  // forwards them upstream before the socket is closed.
+  const flushWorklet = useCallback(() => {
+    try {
+      workletRef.current?.port.postMessage("flush");
+    } catch {
+      // Port may already be closed; ignore.
+    }
+  }, []);
+
+  // Surface a concrete failure: log it, notify, flip to "error", and reset to
+  // "idle" after a short delay (mirrors use-dictation's error timer).
+  const handleError = useCallback(
+    (message: string, err?: unknown) => {
+      if (canceledRef.current) return;
+      // Never log audio — only the textual reason.
+      console.error("[realtime-dictation]", message, err ?? "");
+      notifications.show({ color: "red", message });
+      teardown();
+      setStatus("error");
+      if (errorTimerRef.current !== null) {
+        clearTimeout(errorTimerRef.current);
+      }
+      errorTimerRef.current = setTimeout(() => {
+        errorTimerRef.current = null;
+        setStatus("idle");
+      }, 1500);
+    },
+    [teardown],
+  );
+
+  const start = useCallback(async (): Promise<void> => {
+    // Synchronous live guard (shared with the batch hook): status is stale between
+    // renders, so also block on refs to prevent a double-click from opening two
+    // MediaStreams / sockets.
+    if (
+      !canStartCapture({
+        starting: startingRef.current,
+        hasStream: streamRef.current !== null,
+        hasLiveResource:
+          audioContextRef.current !== null || clientRef.current !== null,
+        statusIsIdle: status === "idle",
+      })
+    ) {
+      return;
+    }
+    startingRef.current = true;
+    canceledRef.current = false;
+    readyRef.current = false;
+    pendingAudioRef.current = [];
+
+    let stream: MediaStream;
+    try {
+      stream = await acquireMicStream();
+    } catch (err) {
+      if (err instanceof MicUnavailableError) {
+        console.error("[realtime-dictation] " + err.message);
+        notifications.show({
+          color: "red",
+          message: t(
+            "Audio recording is not available in this browser/context",
+          ),
+        });
+      } else {
+        // Always log the full error for diagnosis (name, message, stack).
+        console.error("[realtime-dictation] getUserMedia failed", err);
+        notifications.show({
+          color: "red",
+          message: mapGetUserMediaError(err, t),
+        });
+      }
+      setStatus("idle");
+      startingRef.current = false;
+      return;
+    }
+
+    // If a stop/cancel landed during the await (the button was pressed while the
+    // permission prompt was still pending), drop the just-acquired stream and bail
+    // out cleanly so the mic does not stay physically on and the button does not
+    // stick on "recording".
+    if (canceledRef.current) {
+      stream.getTracks().forEach((track) => track.stop());
+      startingRef.current = false;
+      setStatus("idle");
+      return;
+    }
+    streamRef.current = stream;
+
+    // Build the capture graph. The worklet still resamples robustly if the browser
+    // ignores the 24 kHz hint, so any actual context rate is handled correctly.
+    const AudioCtx = getAudioContextCtor();
+    if (!AudioCtx) {
+      stopTracks();
+      notifications.show({
+        color: "red",
+        message: t("Audio recording is not available in this browser/context"),
+      });
+      setStatus("idle");
+      startingRef.current = false;
+      return;
+    }
+
+    let audioContext: AudioContext;
+    try {
+      audioContext = new AudioCtx({ sampleRate: 24000 });
+      audioContextRef.current = audioContext;
+      // AudioWorklet requires a secure context (https/localhost), same constraint
+      // as getUserMedia. A failure here means the UI should fall back to batch.
+      await audioContext.audioWorklet.addModule(PCM16_WORKLET_URL);
+    } catch (err) {
+      console.error("[realtime-dictation] audio worklet setup failed", err);
+      teardownAudio();
+      stopTracks();
+      const detail = (err as { message?: string })?.message ?? String(err);
+      notifications.show({
+        color: "red",
+        message: `${t("Could not start recording")}: ${detail}`,
+      });
+      setStatus("idle");
+      startingRef.current = false;
+      return;
+    }
+
+    // A stop/cancel could also have landed during the addModule() await.
+    if (canceledRef.current) {
+      teardownAudio();
+      stopTracks();
+      startingRef.current = false;
+      setStatus("idle");
+      return;
+    }
+
+    let source: MediaStreamAudioSourceNode;
+    let worklet: AudioWorkletNode;
+    try {
+      source = audioContext.createMediaStreamSource(stream);
+      worklet = new AudioWorkletNode(audioContext, "pcm16-worklet");
+      sourceRef.current = source;
+      workletRef.current = worklet;
+      // MediaStreamSource → worklet → destination. The worklet emits silence, so
+      // connecting to destination drives the render graph without echoing the mic.
+      source.connect(worklet);
+      worklet.connect(audioContext.destination);
+    } catch (err) {
+      console.error("[realtime-dictation] audio graph wiring failed", err);
+      teardownAudio();
+      stopTracks();
+      const detail = (err as { message?: string })?.message ?? String(err);
+      notifications.show({
+        color: "red",
+        message: `${t("Could not start recording")}: ${detail}`,
+      });
+      setStatus("idle");
+      startingRef.current = false;
+      return;
+    }
+
+    // Each worklet message is a PCM16 ArrayBuffer. Forward it once the upstream
+    // session is ready; until then buffer so no leading audio is dropped.
+    worklet.port.onmessage = (event: MessageEvent) => {
+      if (canceledRef.current) return;
+      const buf: unknown = event.data;
+      if (!(buf instanceof ArrayBuffer)) return;
+      if (readyRef.current && clientRef.current) {
+        clientRef.current.sendAudio(buf);
+      } else {
+        pendingAudioRef.current.push(buf);
+      }
+    };
+
+    // Wire the realtime transport. The server replies `ready` once the upstream
+    // STT session is live; we then flush any buffered audio.
+    const client = new RealtimeDictationClient({
+      onReady: () => {
+        if (canceledRef.current) return;
+        readyRef.current = true;
+        const pending = pendingAudioRef.current;
+        pendingAudioRef.current = [];
+        for (const buf of pending) clientRef.current?.sendAudio(buf);
+      },
+      onInterim: (_itemId, text) => {
+        if (canceledRef.current) return;
+        optionsRef.current.onInterim(text);
+      },
+      onFinal: (_itemId, text) => {
+        if (canceledRef.current) return;
+        const trimmed = text.trim();
+        if (trimmed.length > 0) optionsRef.current.onFinal(trimmed);
+      },
+      onError: (message) => {
+        handleError(message);
+      },
+      onClosed: () => {
+        // The server ended the session (idle/max-duration or graceful upstream
+        // close). Skip if a cancel already tore everything down, or if an error
+        // path already owns the status (its error→idle timer is pending), or if a
+        // local stop already cleared the live refs. Otherwise tear down the capture
+        // graph + socket and return to idle so the mic/AudioContext don't leak and
+        // the button doesn't stay stuck on "recording".
+        if (canceledRef.current) return;
+        if (errorTimerRef.current !== null) return;
+        if (
+          !clientRef.current &&
+          !audioContextRef.current &&
+          !streamRef.current
+        ) {
+          return;
+        }
+        teardown();
+        setStatus("idle");
+      },
+    });
+    clientRef.current = client;
+
+    // Notify the caller right when capture begins (before opening the socket) so
+    // the editor can snapshot the caret position.
+    try {
+      optionsRef.current.onStart?.();
+    } catch (err) {
+      console.error("[realtime-dictation] onStart callback threw", err);
+    }
+
+    // Open the socket, then ask the server to open the upstream session. The
+    // language hint is the base subtag of the resolved UI language (e.g. "en-US"
+    // → "en"), since the upstream transcription model expects an ISO language
+    // code, not a region-tagged locale; the server omits it upstream when absent.
+    client.connect();
+    const locale = i18n.resolvedLanguage || i18n.language || "";
+    const language = baseLanguageSubtag(locale);
+    client.start({ language });
+
+    setStatus("recording");
+    // Capture has truly begun; release the synchronous start guard.
+    startingRef.current = false;
+
+    const maxDurationMs = optionsRef.current.maxDurationMs ?? 120000;
+    timerRef.current = setTimeout(() => {
+      // Reuse stop() so the upstream is flushed/closed gracefully.
+      stopRef.current();
+    }, maxDurationMs);
+  }, [status, t, i18n, stopTracks, teardownAudio, teardown, handleError]);
+
+  const stop = useCallback((): void => {
+    // Nothing live and not mid-acquisition → no-op (never crash on idle).
+    if (
+      !clientRef.current &&
+      !audioContextRef.current &&
+      !streamRef.current &&
+      !startingRef.current
+    ) {
+      return;
+    }
+
+    // If stop() is pressed while getUserMedia / addModule is still pending, the
+    // start() continuation has not yet stored every ref. Set the cancel flag so
+    // the awaiting start() path bails after its await and stops the stream it just
+    // acquired (otherwise the mic stays physically ON, the red indicator sticks,
+    // and the button stays on "recording"). teardown() below also tears down
+    // anything already wired (partial graph / socket), so either path leaves us
+    // fully idle. The flag is the same one cancel() uses to neutralize late
+    // socket/worklet callbacks.
+    if (startingRef.current) {
+      // Nothing to flush yet: no worklet/socket pair is live, so skip the
+      // graceful path. teardown() also resets startingRef, and the UI returns
+      // to idle immediately.
+      canceledRef.current = true;
+      teardown();
+      setStatus("idle");
+      return;
+    }
+
+    // Graceful stop of a fully-live session: ask the worklet to emit its partial
+    // tail frame, then defer the socket/graph teardown a short beat so that tail
+    // can round-trip and be forwarded upstream before the session closes. The UI
+    // returns to idle right away; the deferred teardown is idempotent and is also
+    // cancelled by clearTimer() on any subsequent start/cancel/unmount.
+    if (workletRef.current && clientRef.current) {
+      flushWorklet();
+      if (flushTimerRef.current !== null) clearTimeout(flushTimerRef.current);
+      flushTimerRef.current = setTimeout(() => {
+        flushTimerRef.current = null;
+        teardown();
+      }, 60);
+      setStatus("idle");
+      return;
+    }
+
+    // No live worklet (e.g. graph half-built): tear down immediately.
+    teardown();
+    setStatus("idle");
+  }, [teardown, flushWorklet]);
+
+  // Keep the stop ref pointed at the latest stop() for the max-duration timer.
+  stopRef.current = stop;
+
+  const cancel = useCallback((): void => {
+    // Mark canceled first so any late socket/worklet callbacks are ignored.
+    canceledRef.current = true;
+    teardown();
+    setStatus("idle");
+  }, [teardown]);
+
+  // Clean up on unmount: stop tracks, close the context/worklet, disconnect the
+  // socket, and clear timers.
+  useEffect(() => {
+    return () => {
+      canceledRef.current = true;
+      if (errorTimerRef.current !== null) {
+        clearTimeout(errorTimerRef.current);
+        errorTimerRef.current = null;
+      }
+      teardown();
+    };
+  }, [teardown]);
+
+  return { status, start, stop, cancel };
+}
--- a/apps/client/src/features/dictation/hooks/use-streaming-dictation.ts
+++ b/apps/client/src/features/dictation/hooks/use-streaming-dictation.ts
@@ -1,474 +0,0 @@
-import { useCallback, useEffect, useRef, useState } from "react";
-import { notifications } from "@mantine/notifications";
-import { useTranslation } from "react-i18next";
-import { transcribeAudio } from "@/features/dictation/services/dictation-service";
-import { encodeWavPcm16 } from "@/features/dictation/utils/encode-wav";
-import type { DictationStatus } from "@/features/dictation/hooks/use-dictation";
-
-// Lazily-imported MicVAD type. The runtime import happens inside start() so the
-// heavy onnxruntime-web / Silero model is code-split out of the main bundle and
-// only fetched when the user actually begins dictation.
-type MicVADInstance = {
-  start: () => Promise<void>;
-  pause: () => Promise<void>;
-  destroy: () => Promise<void>;
-};
-
-interface UseStreamingDictationOptions {
-  onText: (text: string) => void;
-  onStart?: () => void;
-  maxDurationMs?: number;
-}
-
-interface UseStreamingDictationResult {
-  status: DictationStatus;
-  start: () => Promise<void>;
-  stop: () => void;
-  cancel: () => void;
-  // Smoothed live speech level in the 0..1 range while recording (0 when idle).
-  audioLevel: number;
-}
-
-// Sample rate of the audio MicVAD hands to onSpeechEnd (Silero VAD runs at 16k).
-const VAD_SAMPLE_RATE = 16000;
-
-// Asset paths for the VAD worklet/Silero model and the onnxruntime-web WASM
-// binaries. vad-web 0.0.30's default asset path is "./" (relative to the current
-// page URL), NOT a CDN — in this SPA that request hits the client-side catch-all
-// route and returns index.html (text/html), so the onnxruntime ESM/wasm backend
-// fails to initialize. We instead self-host the four needed files (the vad-web
-// worklet + `silero_vad_v5.onnx` model and the onnxruntime-web `*.jsep.mjs`/
-// `*.jsep.wasm`) under `apps/client/public/vad/` — populated by
-// `scripts/copy-vad-assets.mjs`, which runs before `dev`/`build` — and point both
-// paths at the fixed absolute "/vad/".
-const VAD_BASE_ASSET_PATH: string | undefined = "/vad/";
-const VAD_ONNX_WASM_BASE_PATH: string | undefined = "/vad/";
-
-/**
- * Streaming variant of useDictation. Detects speech with a real (Silero) VAD and,
- * each time the speaker pauses, cuts that speech segment and POSTs it to the same
- * batch transcription endpoint, so text appears progressively as the user speaks.
- *
- * Returns the SAME shape as useDictation ({ status, start, stop, cancel,
- * audioLevel }) so MicButton can use either interchangeably. Refs hold the live
- * VAD instance / counters / timer so component re-renders never lose them, and
- * every exit path destroys the VAD and stops the MediaStream.
- */
-export function useStreamingDictation(
-  options: UseStreamingDictationOptions,
-): UseStreamingDictationResult {
-  const { t } = useTranslation();
-  const [status, setStatus] = useState<DictationStatus>("idle");
-  const [audioLevel, setAudioLevel] = useState(0);
-
-  // Keep the latest callbacks in a ref so async VAD/HTTP closures always call the
-  // current handlers without re-creating the VAD.
-  const optionsRef = useRef(options);
-  optionsRef.current = options;
-
-  const vadRef = useRef<MicVADInstance | null>(null);
-  // AudioContext we create+resume inside the click gesture and inject into
-  // MicVAD (see start()). We own it; MicVAD does not close an injected context.
-  const audioContextRef = useRef<AudioContext | null>(null);
-  const timerRef = useRef<ReturnType<typeof setTimeout> | null>(null);
-  const canceledRef = useRef(false);
-  const startingRef = useRef(false);
-  // True while a recording session is active (VAD listening). Used to ignore late
-  // VAD callbacks that fire after stop()/cancel().
-  const activeRef = useRef(false);
-
-  // In-order emission: each segment gets a monotonically increasing seq when its
-  // speech ends; completed transcriptions are buffered by seq and flushed in
-  // order so out-of-order HTTP responses can't scramble the text.
-  const nextSeqRef = useRef(0);
-  const nextEmitSeqRef = useRef(0);
-  const resultsRef = useRef<Map<number, string>>(new Map());
-  // Number of transcription requests still in flight.
-  const inFlightRef = useRef(0);
-  // Session epoch: bumped when a NEW session starts (start) or everything is
-  // hard-discarded (cancel). Each in-flight request captures the epoch at send
-  // time; if the epoch has since changed, the request is stale and its
-  // then/catch/finally are skipped so old text can't leak into a new session and
-  // the in-flight counter can't be driven negative across sessions.
-  const epochRef = useRef(0);
-
-  // Exponentially smoothed speech level, and the last value pushed to React state.
-  const smoothedLevelRef = useRef(0);
-  const emittedLevelRef = useRef(0);
-
-  const clearTimer = useCallback(() => {
-    if (timerRef.current !== null) {
-      clearTimeout(timerRef.current);
-      timerRef.current = null;
-    }
-  }, []);
-
-  // Reset the level meter back to zero (refs + React state).
-  const resetLevel = useCallback(() => {
-    smoothedLevelRef.current = 0;
-    emittedLevelRef.current = 0;
-    setAudioLevel(0);
-  }, []);
-
-  // Destroy the live VAD instance (which also releases the mic stream and audio
-  // context it created). Safe to call multiple times and on any exit path;
-  // defensive try/catch so teardown never throws.
-  const destroyVad = useCallback(() => {
-    const vad = vadRef.current;
-    vadRef.current = null;
-    if (vad) {
-      try {
-        // destroy() pauses + tears down the worklet/stream/context internally.
-        // It returns a promise, so attach a .catch too: the surrounding
-        // try/catch only catches synchronous throws, and a rejected destroy()
-        // would otherwise surface as an unhandled rejection.
-        void vad
-          .destroy()
-          .catch((err) =>
-            console.warn("[dictation] VAD teardown failed", err),
-          );
-      } catch (err) {
-        // Cleanup must never throw; just log for diagnosis.
-        console.warn("[dictation] VAD teardown failed", err);
-      }
-    }
-  }, []);
-
-  // Decide the status once recording has ended: stay "transcribing" while
-  // requests are in flight, otherwise return to "idle".
-  const settleAfterStop = useCallback(() => {
-    if (inFlightRef.current > 0) {
-      setStatus("transcribing");
-    } else {
-      setStatus("idle");
-    }
-  }, []);
-
-  // Drain the in-order result buffer: while the next expected seq is ready, trim
-  // it, emit it if non-empty, and advance. Called after every resolved request.
-  const drainResults = useCallback(() => {
-    const results = resultsRef.current;
-    while (results.has(nextEmitSeqRef.current)) {
-      const text = results.get(nextEmitSeqRef.current)!;
-      results.delete(nextEmitSeqRef.current);
-      nextEmitSeqRef.current += 1;
-      const trimmed = text.trim();
-      // Whisper often returns a leading space; emit the trimmed value.
-      if (trimmed.length > 0) optionsRef.current.onText(trimmed);
-    }
-  }, []);
-
-  // Map a transcription error to a user-facing message, mirroring the batch hook.
-  const transcriptionErrorMessage = useCallback(
-    (err: unknown): string => {
-      const resp = (
-        err as { response?: { status?: number; data?: { message?: string } } }
-      )?.response;
-      const serverMsg = resp?.data?.message;
-      if (serverMsg && serverMsg.trim().length > 0) {
-        // The server already explains the cause (e.g. provider 404, bad format,
-        // STT not configured) — show it verbatim.
-        return serverMsg;
-      }
-      if (resp?.status === 503 || resp?.status === 403) {
-        return t("Voice dictation is not configured");
-      }
-      return `${t("Transcription failed")}: ${(err as { message?: string })?.message ?? String(err)}`;
-    },
-    [t],
-  );
-
-  // Handle one ended speech segment: encode to WAV and transcribe. Results are
-  // buffered by seq and flushed in order. A single failed segment does NOT kill
-  // the session: log + one notification, then advance past that seq so later
-  // segments still flush.
-  const handleSegment = useCallback(
-    (audio: Float32Array) => {
-      const seq = nextSeqRef.current;
-      nextSeqRef.current += 1;
-      inFlightRef.current += 1;
-      // Capture the epoch for this request synchronously at send time.
-      const epoch = epochRef.current;
-
-      const wavBlob = encodeWavPcm16(audio, VAD_SAMPLE_RATE);
-      void transcribeAudio(wavBlob, "speech.wav")
-        .then((text) => {
-          // Stale request from a previous session: drop it without touching any
-          // current-session state.
-          if (epoch !== epochRef.current) return;
-          // Defend against a non-string server value before drainResults trims.
-          resultsRef.current.set(seq, typeof text === "string" ? text : "");
-          drainResults();
-        })
-        .catch((err: unknown) => {
-          if (epoch !== epochRef.current) return;
-          // Log the full error for diagnosis (status + body + stack).
-          console.error("[dictation] segment transcription failed", err);
-          notifications.show({
-            color: "red",
-            message: transcriptionErrorMessage(err),
-          });
-          // Skip this seq so later segments can still flush in order.
-          if (nextEmitSeqRef.current === seq) {
-            nextEmitSeqRef.current += 1;
-            drainResults();
-          } else {
-            resultsRef.current.set(seq, "");
-            drainResults();
-          }
-        })
-        .finally(() => {
-          if (epoch !== epochRef.current) return;
-          inFlightRef.current -= 1;
-          // If recording already stopped, flip to idle once everything drained.
-          if (!activeRef.current && inFlightRef.current === 0) {
-            setStatus("idle");
-          }
-        });
-    },
-    [drainResults, transcriptionErrorMessage],
-  );
-
-  const start = useCallback(async (): Promise<void> => {
-    // Synchronous live guard: status is stale between renders, so also block on
-    // refs to prevent a double-click from creating two VAD instances (the first
-    // would leak its mic stream).
-    if (startingRef.current || vadRef.current || activeRef.current) return;
-    if (status !== "idle") return;
-    startingRef.current = true;
-
-    // Notify the caller right when dictation begins (before any async work) so the
-    // editor can snapshot the caret position.
-    optionsRef.current.onStart?.();
-
-    // Reset per-session in-order emission state. Bump the epoch so any request
-    // still in flight from a previous (stopped) session becomes stale and its
-    // then/catch/finally are skipped — it can neither emit old text into this
-    // new session nor decrement this session's freshly-zeroed in-flight counter.
-    epochRef.current += 1;
-    canceledRef.current = false;
-    nextSeqRef.current = 0;
-    nextEmitSeqRef.current = 0;
-    resultsRef.current = new Map();
-    inFlightRef.current = 0;
-    resetLevel();
-
-    // Create and resume the AudioContext NOW, inside the click gesture, before
-    // the (first-time-slow) model load below. A context first touched outside a
-    // user gesture stays "suspended" and the VAD audio worklet never runs — that
-    // is exactly why the first click did nothing and only the second (model
-    // already cached, so MicVAD.new was fast enough to create the context inside
-    // the gesture) started recording. We own this context and inject it into
-    // MicVAD (which then will NOT close it); it is reused across start/stop and
-    // closed only on unmount.
-    const AudioCtor =
-      window.AudioContext ||
-      (window as unknown as { webkitAudioContext?: typeof AudioContext })
-        .webkitAudioContext;
-    if (AudioCtor && !audioContextRef.current) {
-      audioContextRef.current = new AudioCtor();
-    }
-    // Resume within the gesture; swallow rejection (e.g. already running/closed).
-    void audioContextRef.current?.resume().catch(() => {});
-    // Show immediate feedback while the model loads (see Part B).
-    setStatus("loading");
-
-    let vad: MicVADInstance;
-    try {
-      // Lazy import so the heavy onnx model/worklet are only fetched on first use
-      // and code-split out of the main bundle.
-      const { MicVAD } = await import("@ricky0123/vad-web");
-
-      vad = await MicVAD.new({
-        // Silero v5 model (smaller/faster than the legacy model).
-        model: "v5",
-        // vad-web 0.0.30 defaults startOnLoad:true, which opens the mic (calls
-        // getUserMedia) inside new() and leaves the later vad.start() a no-op —
-        // making its mic-permission error handling dead code. Force it off so the
-        // mic is opened only by the explicit vad.start() below, where the real
-        // getUserMedia errors are caught and mapped.
-        startOnLoad: false,
-        // Inject the AudioContext we created+resumed inside the click gesture so
-        // the VAD worklet runs on a "running" context. When provided, the library
-        // uses it and does NOT take ownership/close it.
-        ...(audioContextRef.current
-          ? { audioContext: audioContextRef.current }
-          : {}),
-        // Only pass asset paths when defined; otherwise the library uses its
-        // bundled CDN defaults.
-        ...(VAD_BASE_ASSET_PATH !== undefined
-          ? { baseAssetPath: VAD_BASE_ASSET_PATH }
-          : {}),
-        ...(VAD_ONNX_WASM_BASE_PATH !== undefined
-          ? { onnxWASMBasePath: VAD_ONNX_WASM_BASE_PATH }
-          : {}),
-        // --- VAD tuning (all tunable) ---
-        // Probability over which a frame counts as speech.
-        positiveSpeechThreshold: 0.5,
-        // Probability under which a frame counts as non-speech (~0.15 below the
-        // positive threshold, per Silero guidance).
-        negativeSpeechThreshold: 0.35,
-        // Silence to wait through before ending a segment (the "don't cut
-        // immediately" delay). Each ended segment is ONE transcription request, so
-        // cutting on short gaps over-fragments normal speech into a flood of tiny
-        // requests (and trips the server's per-user rate limit). Wait ~1.5s — a
-        // real sentence/thought boundary — so request count tracks actual pauses,
-        // not every inter-word gap. Higher = fewer requests but more latency
-        // before text appears. NOTE: vad-web 0.0.30 takes this in ms, not frames
-        // (one Silero frame is ~32ms at 16k).
-        redemptionMs: 1500,
-        // Audio kept before speech start (left padding so the first word isn't
-        // clipped) — ~0.3s.
-        preSpeechPadMs: 320,
-        // Ignore sub-100ms blips like clicks.
-        minSpeechMs: 96,
-        onFrameProcessed: (probabilities: { isSpeech: number }) => {
-          // Drive the level meter from the speech probability. Light exponential
-          // smoothing + a throttle so React state isn't updated every frame; this
-          // powers the existing button halo. Reuses the VAD's own frame
-          // probabilities — no second AudioContext/AnalyserNode.
-          if (!activeRef.current) return;
-          const level = Math.min(1, Math.max(0, probabilities.isSpeech));
-          smoothedLevelRef.current = smoothedLevelRef.current * 0.8 + level * 0.2;
-          if (Math.abs(smoothedLevelRef.current - emittedLevelRef.current) > 0.01) {
-            emittedLevelRef.current = smoothedLevelRef.current;
-            setAudioLevel(smoothedLevelRef.current);
-          }
-        },
-        onSpeechStart: () => {
-          // No-op: the segment is only handled once it ends.
-        },
-        onSpeechEnd: (audio: Float32Array) => {
-          // A pause was detected — cut this segment and transcribe it. Ignore late
-          // callbacks that fire after stop()/cancel().
-          if (!activeRef.current || canceledRef.current) return;
-          handleSegment(audio);
-        },
-      });
-    } catch (err) {
-      // With startOnLoad:false, new() loads the model/worklet/wasm but does NOT
-      // open the mic, so a throw here is an asset/init failure (model fetch,
-      // worklet, onnxruntime wasm), not a mic-permission error. Map it as a
-      // generic "could not start" with the underlying detail. (The mic-permission
-      // name checks are kept in the vad.start() catch below, where getUserMedia
-      // actually runs.)
-      console.error("[dictation] VAD init failed", err);
-      const detail = (err as { message?: string })?.message ?? String(err);
-      notifications.show({
-        color: "red",
-        message: `${t("Could not start recording")}: ${detail}`,
-      });
-      // Defensive: if MicVAD.new partially succeeded before throwing, make sure we
-      // don't leak it.
-      destroyVad();
-      setStatus("idle");
-      startingRef.current = false;
-      return;
-    }
-
-    vadRef.current = vad;
-    // Accept frames once start() resolves; the VAD callbacks already guard on
-    // activeRef, so setting it before start() is safe.
-    activeRef.current = true;
-
-    try {
-      // With startOnLoad:false this is where getUserMedia actually runs, so map
-      // mic-permission errors here the same way the batch hook does; otherwise
-      // fall back to a generic "could not start" message.
-      await vad.start();
-    } catch (err) {
-      // Always log the full error for diagnosis (name, message, stack).
-      console.error("[dictation] VAD.start failed", err);
-      const name = (err as { name?: string })?.name;
-      const detail = (err as { message?: string })?.message ?? String(err);
-      let message: string;
-      if (name === "NotAllowedError" || name === "SecurityError") {
-        message = t("Microphone access denied");
-      } else if (name === "NotFoundError" || name === "OverconstrainedError") {
-        message = t("No microphone found");
-      } else if (name === "NotReadableError" || name === "AbortError") {
-        message = t("Microphone is unavailable or already in use");
-      } else {
-        message = `${t("Could not start recording")}: ${detail}`;
-      }
-      notifications.show({ color: "red", message });
-      activeRef.current = false;
-      destroyVad();
-      setStatus("idle");
-      startingRef.current = false;
-      return;
-    }
-
-    setStatus("recording");
-    // Recording has truly begun; release the synchronous start guard.
-    startingRef.current = false;
-
-    // Optional overall safety cap: auto-stop after maxDurationMs like the batch
-    // hook does.
-    const maxDurationMs = optionsRef.current.maxDurationMs ?? 120000;
-    timerRef.current = setTimeout(() => {
-      if (activeRef.current) stopRef.current();
-    }, maxDurationMs);
-  }, [status, t, resetLevel, destroyVad, handleSegment]);
-
-  const stop = useCallback((): void => {
-    clearTimer();
-    if (!activeRef.current && !vadRef.current) {
-      // Nothing is running; make sure the UI is idle.
-      setStatus("idle");
-      return;
-    }
-    // Mark inactive first so late onSpeechEnd/onFrameProcessed callbacks are
-    // ignored. Any speech segment that has NOT yet ended (user clicks Stop
-    // mid-utterance) is dropped — acceptable for v1; users normally pause before
-    // stopping.
-    activeRef.current = false;
-    destroyVad();
-    resetLevel();
-    settleAfterStop();
-  }, [clearTimer, destroyVad, resetLevel, settleAfterStop]);
-
-  // Keep stop() reachable from the maxDuration timer closure (which is created
-  // before stop is defined) without re-creating the VAD.
-  const stopRef = useRef(stop);
-  stopRef.current = stop;
-
-  const cancel = useCallback((): void => {
-    clearTimer();
-    canceledRef.current = true;
-    activeRef.current = false;
-    // Hard discard: bump the epoch so any in-flight request becomes stale and is
-    // ignored the moment it resolves (no emit, no counter touch).
-    epochRef.current += 1;
-    // Drop pending results / queue; in-flight requests will resolve into a now-
-    // empty buffer and be ignored.
-    resultsRef.current = new Map();
-    nextSeqRef.current = 0;
-    nextEmitSeqRef.current = 0;
-    inFlightRef.current = 0;
-    destroyVad();
-    resetLevel();
-    setStatus("idle");
-  }, [clearTimer, destroyVad, resetLevel]);
-
-  // Clean up on unmount: destroy the VAD, stop the mic stream, clear the timer.
-  // Defensive try/catch lives inside destroyVad so teardown never throws.
-  useEffect(() => {
-    return () => {
-      clearTimer();
-      activeRef.current = false;
-      canceledRef.current = true;
-      destroyVad();
-      // Close the AudioContext we own (MicVAD never closes an injected one).
-      if (
-        audioContextRef.current &&
-        audioContextRef.current.state !== "closed"
-      ) {
-        void audioContextRef.current.close().catch(() => {});
-      }
-      audioContextRef.current = null;
-    };
-  }, [clearTimer, destroyVad]);
-
-  return { status, start, stop, cancel, audioLevel };
-}
--- a/apps/client/src/features/dictation/services/dictation-reducer.test.ts
+++ b/apps/client/src/features/dictation/services/dictation-reducer.test.ts
@@ -0,0 +1,91 @@
+import { describe, it, expect } from "vitest";
+import {
+  baseLanguageSubtag,
+  initSessionState,
+  onAudio,
+  onReady,
+  onInterim,
+  onFinal,
+  onCancel,
+  onStop,
+} from "./dictation-reducer";
+
+describe("baseLanguageSubtag", () => {
+  it("reduces a region-tagged locale to its base subtag", () => {
+    expect(baseLanguageSubtag("en-US")).toBe("en");
+  });
+  it("returns a bare subtag unchanged", () => {
+    expect(baseLanguageSubtag("en")).toBe("en");
+  });
+  it("returns undefined for empty / blank / nullish", () => {
+    expect(baseLanguageSubtag("")).toBeUndefined();
+    expect(baseLanguageSubtag("   ")).toBeUndefined();
+    expect(baseLanguageSubtag(undefined)).toBeUndefined();
+    expect(baseLanguageSubtag(null)).toBeUndefined();
+  });
+});
+
+function ab(n: number): ArrayBuffer {
+  return new ArrayBuffer(n);
+}
+
+describe("dictation session reducer", () => {
+  it("buffers audio until ready then flushes in order", () => {
+    const s = initSessionState();
+    const a = ab(1);
+    const b = ab(2);
+    expect(onAudio(s, a).send).toEqual([]);
+    expect(onAudio(s, b).send).toEqual([]);
+    expect(s.pending).toHaveLength(2);
+
+    const ready = onReady(s);
+    expect(ready.send).toEqual([a, b]); // flushed in arrival order
+    expect(s.pending).toHaveLength(0);
+    expect(s.ready).toBe(true);
+
+    // After ready, audio is sent immediately.
+    const c = ab(3);
+    expect(onAudio(s, c).send).toEqual([c]);
+  });
+
+  it("interim replaces interim", () => {
+    const s = initSessionState();
+    expect(onInterim(s, "hel").emitInterim).toBe("hel");
+    expect(onInterim(s, "hello").emitInterim).toBe("hello");
+    expect(s.interim).toBe("hello");
+  });
+
+  it("final trims and drops empty, clearing the interim", () => {
+    const s = initSessionState();
+    onInterim(s, "draft");
+    expect(onFinal(s, "  hi there  ").emitFinal).toBe("hi there");
+    expect(s.interim).toBe("");
+
+    const empty = onFinal(s, "   ");
+    expect(empty.emitFinal).toBeUndefined();
+  });
+
+  it("cancel drops pending and ignores later events", () => {
+    const s = initSessionState();
+    onAudio(s, ab(1));
+    onCancel(s);
+    expect(s.pending).toHaveLength(0);
+    expect(s.canceled).toBe(true);
+
+    // Later events are no-ops.
+    expect(onAudio(s, ab(2)).send).toEqual([]);
+    expect(onReady(s).send).toEqual([]);
+    expect(onInterim(s, "late").emitInterim).toBeUndefined();
+    expect(onFinal(s, "late").emitFinal).toBeUndefined();
+  });
+
+  it("events arriving after stop are no-ops", () => {
+    const s = initSessionState();
+    onReady(s);
+    onStop(s);
+    expect(s.canceled).toBe(true);
+    // Audio arriving after stop is ignored (server has no session).
+    expect(onAudio(s, ab(1)).send).toEqual([]);
+    expect(onInterim(s, "x").emitInterim).toBeUndefined();
+  });
+});
--- a/apps/client/src/features/dictation/services/dictation-reducer.ts
+++ b/apps/client/src/features/dictation/services/dictation-reducer.ts
@@ -0,0 +1,113 @@
+// Pure logic extracted from `use-realtime-dictation` so the transcript/session
+// state machine can be unit-tested without React or a live socket. The hook wires
+// these to refs/callbacks; nothing here touches the DOM or Web Audio.
+
+/**
+ * Reduce a BCP-47 locale to its base language subtag for the upstream STT model,
+ * which expects an ISO language code, not a region-tagged locale.
+ *   "en-US" → "en", "en" → "en", "" → undefined, "  " → undefined.
+ * Returns undefined when no usable subtag exists so the server can omit the hint.
+ */
+export function baseLanguageSubtag(locale: string | undefined | null): string | undefined {
+  if (!locale) return undefined;
+  const base = locale.trim().split("-")[0]?.trim();
+  return base && base.length > 0 ? base : undefined;
+}
+
+/**
+ * Session/transcript reducer. Models the audio-buffering + interim/final/cancel
+ * lifecycle as a pure state object so the ordering rules (buffer-until-ready,
+ * cancel ignores later events, closed-after-stop is a no-op) are testable. The
+ * hook keeps the live socket/graph; this only decides what to emit.
+ */
+export interface DictationSessionState {
+  // Server has confirmed the upstream session; audio may flow.
+  ready: boolean;
+  // Local stop/cancel happened; later interim/final/audio are ignored.
+  canceled: boolean;
+  // Audio captured before `ready`; flushed in arrival order once ready.
+  pending: ArrayBuffer[];
+  // Latest interim transcript for the live (not-yet-final) segment.
+  interim: string;
+}
+
+export function initSessionState(): DictationSessionState {
+  return { ready: false, canceled: false, pending: [], interim: "" };
+}
+
+// Effects the hook should perform after a reduction. Keeps the reducer pure: it
+// describes what to do, the hook does it (send over the socket, call callbacks).
+export interface DictationEffects {
+  // Audio chunks to send upstream now, in order.
+  send: ArrayBuffer[];
+  // Latest interim text to surface; set on every interim event.
+  emitInterim?: string;
+  // Final (trimmed, non-empty) text to surface.
+  emitFinal?: string;
+}
+
+const NONE: DictationEffects = { send: [] };
+
+/** Audio chunk captured: send immediately if ready, else buffer it. */
+export function onAudio(
+  state: DictationSessionState,
+  buf: ArrayBuffer,
+): DictationEffects {
+  if (state.canceled) return NONE;
+  if (state.ready) return { send: [buf] };
+  state.pending.push(buf);
+  return NONE;
+}
+
+/** Server ready: flush all buffered audio in order, then stream live. */
+export function onReady(state: DictationSessionState): DictationEffects {
+  if (state.canceled) return NONE;
+  state.ready = true;
+  const send = state.pending;
+  state.pending = [];
+  return { send };
+}
+
+/** Interim transcript: replaces the previous interim for the live segment. */
+export function onInterim(
+  state: DictationSessionState,
+  text: string,
+): DictationEffects {
+  if (state.canceled) return NONE;
+  state.interim = text;
+  return { send: [], emitInterim: text };
+}
+
+/**
+ * Final transcript: trim and drop if empty; the live interim segment is cleared
+ * (the final supersedes it).
+ */
+export function onFinal(
+  state: DictationSessionState,
+  text: string,
+): DictationEffects {
+  if (state.canceled) return NONE;
+  const trimmed = text.trim();
+  state.interim = "";
+  if (trimmed.length === 0) return { send: [] };
+  return { send: [], emitFinal: trimmed };
+}
+
+/** Cancel: drop pending audio and ignore all later events. */
+export function onCancel(state: DictationSessionState): DictationEffects {
+  state.canceled = true;
+  state.pending = [];
+  state.interim = "";
+  return NONE;
+}
+
+/**
+ * Stop: like cancel for the purposes of "no more events should be processed".
+ * Distinct name kept so the hook can flush the worklet tail before stopping; the
+ * reducer treats post-stop events as no-ops the same way.
+ */
+export function onStop(state: DictationSessionState): DictationEffects {
+  state.canceled = true;
+  state.pending = [];
+  return NONE;
+}
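
Reviewer note: the reducer only describes effects, so the hook still has to carry them out. A minimal sketch of that wiring follows, assuming a transport with a `sendAudio` method like the client's. `Transport`, `Callbacks` and `applyEffects` are hypothetical names used for illustration; they are not part of this change, and the real hook may structure this differently.

```typescript
// Hypothetical wiring sketch; not code from this PR.
// Mirrors the DictationEffects shape declared in dictation-reducer.ts.
interface DictationEffects {
  send: ArrayBuffer[];
  emitInterim?: string;
  emitFinal?: string;
}

// Assumed minimal transport surface (the client exposes sendAudio).
interface Transport {
  sendAudio(buf: ArrayBuffer): void;
}

// Assumed hook-level callbacks for surfacing transcripts.
interface Callbacks {
  onInterim(text: string): void;
  onFinal(text: string): void;
}

// Perform the effects a reducer call returned: send audio in order, then
// surface transcripts. Compares against undefined (not truthiness) so an
// empty-string interim is still forwarded.
function applyEffects(
  fx: DictationEffects,
  transport: Transport,
  cb: Callbacks,
): void {
  for (const buf of fx.send) transport.sendAudio(buf);
  if (fx.emitInterim !== undefined) cb.onInterim(fx.emitInterim);
  if (fx.emitFinal !== undefined) cb.onFinal(fx.emitFinal);
}
```

With this shape the hook would call, for example, `applyEffects(onReady(state), client, callbacks)` to flush buffered audio once the server reports ready.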
--- a/apps/client/src/features/dictation/services/realtime-dictation-client.test.ts
+++ b/apps/client/src/features/dictation/services/realtime-dictation-client.test.ts
@@ -0,0 +1,185 @@
+import { describe, it, expect, vi, beforeEach } from "vitest";
+
+// --- Mock socket.io-client with a controllable fake socket --------------------
+// The mock records registered listeners so tests can fire server events, and
+// records emits so the start/reconnect behavior can be asserted.
+interface FakeSocket {
+  connected: boolean;
+  listeners: Record<string, ((...args: unknown[]) => void)[]>;
+  emits: { event: string; args: unknown[] }[];
+  on: (e: string, cb: (...a: unknown[]) => void) => FakeSocket;
+  emit: (e: string, ...a: unknown[]) => void;
+  connect: () => void;
+  disconnect: () => void;
+  removeAllListeners: () => void;
+  fire: (e: string, ...a: unknown[]) => void;
+}
+
+function makeFakeSocket(): FakeSocket {
+  const socket: FakeSocket = {
+    connected: false,
+    listeners: {},
+    emits: [],
+    on(e, cb) {
+      (socket.listeners[e] ??= []).push(cb);
+      return socket;
+    },
+    emit(e, ...a) {
+      socket.emits.push({ event: e, args: a });
+    },
+    connect() {
+      socket.connected = true;
+      socket.fire("connect");
+    },
+    disconnect() {
+      socket.connected = false;
+    },
+    removeAllListeners() {
+      socket.listeners = {};
+    },
+    fire(e, ...a) {
+      (socket.listeners[e] ?? []).forEach((cb) => cb(...a));
+    },
+  };
+  return socket;
+}
+
+let lastSocket: FakeSocket;
+const ioMock = vi.fn((..._args: unknown[]) => {
+  lastSocket = makeFakeSocket();
+  return lastSocket;
+});
+
+vi.mock("socket.io-client", () => ({
+  io: (...args: unknown[]) => ioMock(...args),
+  Socket: class {},
+}));
+
+vi.mock("@/features/websocket/types", () => ({ SOCKET_URL: undefined }));
+
+import { RealtimeDictationClient } from "./realtime-dictation-client";
+
+function makeHandlers() {
+  return {
+    onReady: vi.fn(),
+    onInterim: vi.fn(),
+    onFinal: vi.fn(),
+    onError: vi.fn(),
+    onClosed: vi.fn(),
+  };
+}
+
+beforeEach(() => {
+  ioMock.mockClear();
+});
+
+describe("RealtimeDictationClient", () => {
+  it("uses a single io() call with the bare namespace URL and shared opts", () => {
+    const c = new RealtimeDictationClient(makeHandlers());
+    c.connect();
+    expect(ioMock).toHaveBeenCalledTimes(1);
+    const call = ioMock.mock.calls[0] as unknown[];
+    expect(call[0]).toBe("/ai-realtime");
+    expect(call[1]).toMatchObject({
+      transports: ["websocket"],
+      withCredentials: true,
+      autoConnect: false,
+    });
+  });
+
+  it("decodes ready/interim/final with ?? '' defaults", () => {
+    const h = makeHandlers();
+    const c = new RealtimeDictationClient(h);
+    c.connect();
+
+    lastSocket.fire("ready");
+    expect(h.onReady).toHaveBeenCalledTimes(1);
+
+    lastSocket.fire("interim", { itemId: "a", text: "hi" });
+    expect(h.onInterim).toHaveBeenCalledWith("a", "hi");
+    lastSocket.fire("interim", {});
+    expect(h.onInterim).toHaveBeenCalledWith("", "");
+
+    lastSocket.fire("final", { itemId: "b", text: "done" });
+    expect(h.onFinal).toHaveBeenCalledWith("b", "done");
+    lastSocket.fire("final", undefined);
+    expect(h.onFinal).toHaveBeenCalledWith("", "");
+  });
+
+  it("surfaces a string error payload via onError", () => {
+    const h = makeHandlers();
+    const c = new RealtimeDictationClient(h);
+    c.connect();
+    lastSocket.fire("error", "boom");
+    expect(h.onError).toHaveBeenCalledWith("boom");
+  });
+
+  it("error fires at most once per connection (error-once guard)", () => {
+    const h = makeHandlers();
+    const c = new RealtimeDictationClient(h);
+    c.connect();
+    lastSocket.fire("error", { message: "first" });
+    lastSocket.fire("connect_error", new Error("second"));
+    lastSocket.fire("error", "third");
+    expect(h.onError).toHaveBeenCalledTimes(1);
+    expect(h.onError).toHaveBeenCalledWith("first");
+  });
+
+  it("connect_error builds a concrete message", () => {
+    const h = makeHandlers();
+    const c = new RealtimeDictationClient(h);
+    c.connect();
+    lastSocket.fire("connect_error", new Error("handshake"));
+    expect(h.onError).toHaveBeenCalledWith(
+      "Realtime connection failed: handshake",
+    );
+  });
+
+  it("emits start once when start() runs on a connected socket", () => {
+    const c = new RealtimeDictationClient(makeHandlers());
+    c.connect(); // fires connect, socket.connected = true
+    c.start({ language: "en" });
+    const starts = lastSocket.emits.filter((e) => e.event === "start");
+    expect(starts).toHaveLength(1);
+    expect(starts[0].args[0]).toEqual({ language: "en" });
+  });
+
+  it("re-emits start on reconnect (does not double-start while live)", () => {
+    const c = new RealtimeDictationClient(makeHandlers());
+    c.connect();
+    c.start({ language: "en" });
+    // A second connect with no disconnect must NOT re-start (still live).
+    lastSocket.fire("connect");
+    let starts = lastSocket.emits.filter((e) => e.event === "start");
+    expect(starts).toHaveLength(1);
+
+    // Transient drop then reconnect → re-establish the session exactly once.
+    lastSocket.fire("disconnect");
+    lastSocket.fire("connect");
+    starts = lastSocket.emits.filter((e) => e.event === "start");
+    expect(starts).toHaveLength(2);
+    expect(starts[1].args[0]).toEqual({ language: "en" });
+  });
+
+  it("disconnect removes listeners and resets the error flag", () => {
+    const h = makeHandlers();
+    const c = new RealtimeDictationClient(h);
+    c.connect();
+    const removeSpy = vi.spyOn(lastSocket, "removeAllListeners");
+    c.disconnect();
+    expect(removeSpy).toHaveBeenCalled();
+    expect(lastSocket.connected).toBe(false);
+
+    // A fresh connect on the reused instance can error again.
+    c.connect();
+    lastSocket.fire("error", "again");
+    expect(h.onError).toHaveBeenCalledWith("again");
+  });
+
+  it("connect() is a no-op while a socket already exists", () => {
+    const c = new RealtimeDictationClient(makeHandlers());
+    c.connect();
+    c.connect();
+    expect(ioMock).toHaveBeenCalledTimes(1);
+  });
+});
--- a/apps/client/src/features/dictation/services/realtime-dictation-client.ts
+++ b/apps/client/src/features/dictation/services/realtime-dictation-client.ts
@@ -0,0 +1,156 @@
+import { io, Socket } from "socket.io-client";
+import { SOCKET_URL } from "@/features/websocket/types";
+
+// Handlers the hook supplies; the client translates the normalized `/ai-realtime`
+// Socket.IO events into these callbacks. The client itself owns no React state —
+// it is a thin transport wrapper so the hook can stay focused on the audio graph.
+export interface RealtimeDictationHandlers {
+  // Upstream STT session is established; safe to start sending audio.
+  onReady: () => void;
+  // Latest partial transcript for the current (not-yet-final) segment.
+  onInterim: (itemId: string, text: string) => void;
+  // A completed segment's transcript.
+  onFinal: (itemId: string, text: string) => void;
+  // Concrete failure reason (connect error or server-surfaced error).
+  onError: (message: string) => void;
+  // Session ended (graceful stop or upstream closed).
+  onClosed: () => void;
+}
+
+interface StartOptions {
+  language?: string;
+}
+
+// Wraps the dedicated `/ai-realtime` Socket.IO namespace. Cookie-based auth rides
+// the handshake via `withCredentials` (no bearer token), exactly like the main
+// app socket. `autoConnect: false` lets the hook wire listeners up before the
+// handshake fires so no early event is missed.
+export class RealtimeDictationClient {
+  private socket: Socket | null = null;
+  // onError must fire at most once per session: the server `error` and socket
+  // `connect_error` can both arrive (e.g. an error then a failed reconnect), but
+  // the hook owns the error→idle flow and a second call would double-fire it.
+  private erroredFlag = false;
+  // The last `start` params, retained so we can re-establish the upstream session
+  // after a transient socket.io reconnect (otherwise the server has no session and
+  // silently drops audio). Null until start() is first called.
+  private startOptions: StartOptions | null = null;
+  // True between a successful `start` emit and the next disconnect, so a reconnect
+  // re-emits `start` exactly once and we never double-start a live session.
+  private started = false;
+
+  constructor(private readonly handlers: RealtimeDictationHandlers) {}
+
+  // Forward the first error reason only; later error/connect_error are swallowed.
+  private emitError(message: string): void {
+    if (this.erroredFlag) return;
+    this.erroredFlag = true;
+    this.handlers.onError(message);
+  }
+
+  // Create the socket, register listeners, then open the connection. Safe to call
+  // repeatedly: a call is a no-op while a socket exists (until disconnect()).
+  connect(): void {
+    if (this.socket) return;
+    // Fresh socket → allow onError to fire again for this connection.
+    this.erroredFlag = false;
+
+    // SOCKET_URL is undefined in this app (socket.io derives the page origin), so
+    // the `/ai-realtime` namespace rides the same `/socket.io` path as the main
+    // socket — which the Vite dev server proxies as a websocket. The URL is the
+    // only thing that varies; the options are shared (single io() call).
+    const url = SOCKET_URL ? `${SOCKET_URL}/ai-realtime` : "/ai-realtime";
+    const socket: Socket = io(url, {
+      transports: ["websocket"],
+      withCredentials: true,
+      autoConnect: false,
+    });
+
+    this.socket = socket;
+
+    // On every (re)connect, re-establish the upstream session if start() has run
+    // but we are not currently in a started session. If start() ran while already
+    // connected it emitted itself (started === true), so this is skipped; this
+    // branch covers start() before the handshake completes and reconnects after a
+    // transient drop, where the server lost the session and would otherwise
+    // silently discard all subsequent audio. `started` prevents double-starting.
+    socket.on("connect", () => {
+      if (this.startOptions && !this.started) {
+        this.started = true;
+        socket.emit("start", { language: this.startOptions.language });
+      }
+    });
+
+    // A disconnect (transient drop or close) ends the server-side session; clear
+    // `started` so the next `connect` re-emits `start`.
+    socket.on("disconnect", () => {
+      this.started = false;
+    });
+
+    socket.on("ready", () => this.handlers.onReady());
+
+    socket.on("interim", (payload: { itemId: string; text: string }) => {
+      this.handlers.onInterim(payload?.itemId ?? "", payload?.text ?? "");
+    });
+
+    socket.on("final", (payload: { itemId: string; text: string }) => {
+      this.handlers.onFinal(payload?.itemId ?? "", payload?.text ?? "");
+    });
+
+    socket.on("error", (payload: { message?: string } | string) => {
+      const message =
+        typeof payload === "string"
+          ? payload
+          : payload?.message || "Realtime dictation error";
+      this.emitError(message);
+    });
+
+    socket.on("closed", () => this.handlers.onClosed());
+
+    // Low-level transport failure (handshake/auth/proxy). Surface a concrete cause.
+    socket.on("connect_error", (err: Error) => {
+      const message = err?.message
+        ? `Realtime connection failed: ${err.message}`
+        : "Realtime connection failed";
+      this.emitError(message);
+    });
+
+    socket.connect();
+  }
+
+  // Ask the server to resolve config and open the upstream STT session. The params
+  // are retained so a post-reconnect `connect` can re-establish the session.
+  start(opts: StartOptions): void {
+    this.startOptions = opts;
+    // If the socket is already connected, emit now and mark started; otherwise the
+    // `connect` handler will emit once the handshake completes.
+    if (this.socket?.connected && !this.started) {
+      this.started = true;
+      this.socket.emit("start", { language: opts.language });
+    }
+  }
+
+  // Forward a raw PCM16 chunk; socket.io serializes the ArrayBuffer as binary.
+  sendAudio(buf: ArrayBuffer): void {
+    this.socket?.emit("audio", buf);
+  }
+
+  // Request a graceful flush/close of the upstream session.
+  stop(): void {
+    this.socket?.emit("stop");
+  }
+
+  // Tear down: drop every listener and close the connection. Idempotent.
+  disconnect(): void {
+    const socket = this.socket;
+    if (!socket) return;
+    this.socket = null;
+    // Reset so a subsequent connect() on a reused instance can error again and a
+    // fresh session can be started.
+    this.erroredFlag = false;
+    this.started = false;
+    this.startOptions = null;
+    socket.removeAllListeners();
+    socket.disconnect();
+  }
+}
--- a/apps/client/src/features/dictation/utils/encode-wav.test.ts
+++ b/apps/client/src/features/dictation/utils/encode-wav.test.ts
@@ -1,87 +0,0 @@
-import { describe, it, expect } from "vitest";
-import { encodeWavPcm16 } from "./encode-wav";
-
-// Contract tests for `encodeWavPcm16` (encode-wav.ts). The dictation feature
-// streams microphone audio as mono 16-bit PCM WAV to the STT endpoint, which
-// whitelists audio/wav. A regression in the WAV header or PCM16 clamping would
-// produce audio the server cannot decode (silence / garbled transcripts), so we
-// assert the canonical 44-byte header layout and the sample quantisation rails.
-
-// Read a DataView back out of a Blob. jsdom's Blob does not implement
-// `.arrayBuffer()`, so go through FileReader.readAsArrayBuffer instead.
-function readView(blob: Blob): Promise<DataView> {
-  return new Promise((resolve, reject) => {
-    const reader = new FileReader();
-    reader.onload = () => resolve(new DataView(reader.result as ArrayBuffer));
-    reader.onerror = () => reject(reader.error);
-    reader.readAsArrayBuffer(blob);
-  });
-}
-
-function readStr(view: DataView, offset: number, length: number): string {
-  let s = "";
-  for (let i = 0; i < length; i++) s += String.fromCharCode(view.getUint8(offset + i));
-  return s;
-}
-
-describe("encodeWavPcm16", () => {
-  it("writes the canonical RIFF/WAVE/fmt /data tags", async () => {
-    const view = await readView(encodeWavPcm16(new Float32Array(4)));
-    expect(readStr(view, 0, 4)).toBe("RIFF");
-    expect(readStr(view, 8, 4)).toBe("WAVE");
-    expect(readStr(view, 12, 4)).toBe("fmt ");
-    expect(readStr(view, 36, 4)).toBe("data");
-  });
-
-  it("writes a PCM fmt chunk (size=16, format=1, mono, 16-bit)", async () => {
-    const samples = new Float32Array(10);
-    const view = await readView(encodeWavPcm16(samples));
-    expect(view.getUint32(16, true)).toBe(16); // fmt chunk size
-    expect(view.getUint16(20, true)).toBe(1); // audioFormat = PCM
-    expect(view.getUint16(22, true)).toBe(1); // channels = mono
-    expect(view.getUint16(34, true)).toBe(16); // bits per sample
-  });
-
-  it("derives byteRate, blockAlign and dataSize from the sample rate and length", async () => {
-    const sampleRate = 16000;
-    const samples = new Float32Array(10);
-    const view = await readView(encodeWavPcm16(samples, sampleRate));
-    expect(view.getUint32(28, true)).toBe(sampleRate * 2); // byteRate = sampleRate * 2
-    expect(view.getUint16(32, true)).toBe(2); // blockAlign = 2 (mono * 16-bit)
-    expect(view.getUint32(40, true)).toBe(samples.length * 2); // dataSize
-    expect(view.getUint32(4, true)).toBe(36 + samples.length * 2); // RIFF chunk size
-  });
-
-  it("defaults the sample rate to 16000 at offset 24", async () => {
-    const view = await readView(encodeWavPcm16(new Float32Array(2)));
-    expect(view.getUint32(24, true)).toBe(16000);
-  });
-
-  it("writes the overridden sample rate at offset 24 (8000 / 48000)", async () => {
-    const view8 = await readView(encodeWavPcm16(new Float32Array(2), 8000));
-    expect(view8.getUint32(24, true)).toBe(8000);
-    expect(view8.getUint32(28, true)).toBe(8000 * 2); // byteRate follows the override
-
-    const view48 = await readView(encodeWavPcm16(new Float32Array(2), 48000));
-    expect(view48.getUint32(24, true)).toBe(48000);
-    expect(view48.getUint32(28, true)).toBe(48000 * 2);
-  });
-
-  it("clamps and quantises PCM16 samples to the asymmetric rails", async () => {
-    // +1.0 -> 32767 (clamped>=0 uses *0x7fff), -1.0 -> -32768 (clamped<0 uses *0x8000),
-    // 0 -> 0, and out-of-range values are clamped to the rails first.
-    const samples = new Float32Array([1.0, -1.0, 0, 1.5, -1.5]);
-    const view = await readView(encodeWavPcm16(samples));
-    expect(view.getInt16(44 + 0 * 2, true)).toBe(32767); // +1.0
-    expect(view.getInt16(44 + 1 * 2, true)).toBe(-32768); // -1.0
-    expect(view.getInt16(44 + 2 * 2, true)).toBe(0); // 0
-    expect(view.getInt16(44 + 3 * 2, true)).toBe(32767); // +1.5 -> clamped to +1.0
-    expect(view.getInt16(44 + 4 * 2, true)).toBe(-32768); // -1.5 -> clamped to -1.0
-  });
-
-  it("produces a mono blob of length 44 + samples.length * 2", () => {
-    expect(encodeWavPcm16(new Float32Array(0)).size).toBe(44);
-    expect(encodeWavPcm16(new Float32Array(100)).size).toBe(44 + 100 * 2);
-    expect(encodeWavPcm16(new Float32Array(100)).type).toBe("audio/wav");
-  });
-});
--- a/apps/client/src/features/dictation/utils/encode-wav.ts
+++ b/apps/client/src/features/dictation/utils/encode-wav.ts
@@ -1,32 +0,0 @@
-// Encode mono Float32 PCM samples into a 16-bit PCM WAV blob (audio/wav).
-// The server STT endpoint whitelists audio/wav, so this is sent as-is.
-export function encodeWavPcm16(samples: Float32Array, sampleRate = 16000): Blob {
-  const bytesPerSample = 2;
-  const blockAlign = bytesPerSample; // mono
-  const dataSize = samples.length * bytesPerSample;
-  const buffer = new ArrayBuffer(44 + dataSize);
-  const view = new DataView(buffer);
-  const writeStr = (offset: number, s: string) => {
-    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
-  };
-  writeStr(0, "RIFF");
-  view.setUint32(4, 36 + dataSize, true);
-  writeStr(8, "WAVE");
-  writeStr(12, "fmt ");
-  view.setUint32(16, 16, true); // PCM fmt chunk size
-  view.setUint16(20, 1, true); // audio format = PCM
-  view.setUint16(22, 1, true); // channels = mono
-  view.setUint32(24, sampleRate, true);
-  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
-  view.setUint16(32, blockAlign, true);
-  view.setUint16(34, 16, true); // bits per sample
-  writeStr(36, "data");
-  view.setUint32(40, dataSize, true);
-  let offset = 44;
-  for (let i = 0; i < samples.length; i++) {
-    const clamped = Math.max(-1, Math.min(1, samples[i]));
-    view.setInt16(offset, clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff, true);
-    offset += 2;
-  }
-  return new Blob([buffer], { type: "audio/wav" });
-}
--- a/apps/client/src/features/editor/components/audio/audio-menu.tsx
+++ b/apps/client/src/features/editor/components/audio/audio-menu.tsx
@@ -1,43 +1,23 @@
 import { BubbleMenu as BaseBubbleMenu } from "@tiptap/react/menus";
 import { findParentNode, posToDOMRect, useEditorState } from "@tiptap/react";
-import { useCallback, useState } from "react";
+import { useCallback } from "react";
 import { Node as PMNode } from "@tiptap/pm/model";
 import { isEditorReady } from "@docmost/editor-ext";
 import {
  EditorMenuProps,
  ShouldShowProps,
 } from "@/features/editor/components/table/types/types.ts";
-import { ActionIcon, Loader, Tooltip } from "@mantine/core";
+import { ActionIcon, Tooltip } from "@mantine/core";
 import {
  IconDownload,
-  IconFileText,
  IconTrash,
 } from "@tabler/icons-react";
-import { notifications } from "@mantine/notifications";
-import { useAtomValue } from "jotai";
 import { useTranslation } from "react-i18next";
 import { getFileUrl } from "@/lib/config.ts";
-import { workspaceAtom } from "@/features/user/atoms/current-user-atom.ts";
-import { transcribeAudio } from "@/features/dictation/services/dictation-service";
 import classes from "../common/toolbar-menu.module.css";

-// STT-accepted audio MIME types (mirror of the server whitelist). If the
-// fetched blob's type is not one of these, we infer it from the file
-// extension so the upload's content-type is something the endpoint accepts.
-const RECOGNIZED_AUDIO_MIME = new Set([
-  "audio/webm", "audio/ogg", "audio/mp4", "audio/mpeg",
-  "audio/wav", "audio/x-wav", "audio/wave", "audio/m4a", "audio/x-m4a",
-]);
-const AUDIO_MIME_BY_EXT: Record<string, string> = {
-  mp3: "audio/mpeg", m4a: "audio/mp4", mp4: "audio/mp4",
-  wav: "audio/wav", ogg: "audio/ogg", oga: "audio/ogg", webm: "audio/webm",
-};
-
 export function AudioMenu({ editor }: EditorMenuProps) {
  const { t } = useTranslation();
-  const workspace = useAtomValue(workspaceAtom);
-  const dictationEnabled = workspace?.settings?.ai?.dictation === true;
-  const [isTranscribing, setIsTranscribing] = useState(false);

  const editorState = useEditorState({
    editor,
@@ -88,100 +68,6 @@ export function AudioMenu({ editor }: EditorMenuProps) {
    };
  }, [editor]);

-  const handleTranscribe = useCallback(async () => {
-    const src = editorState?.src;
-    if (!src || isTranscribing) return;
-
-    // The bubble menu shows for the selected audio node, so selection.from is
-    // that node's start position. Capture it now to disambiguate duplicate-src
-    // blocks after the async transcription completes.
-    const selectedPos = editor.state.selection.from;
-
-    setIsTranscribing(true);
-    try {
-      const fileUrl = getFileUrl(src);
-      // Derive a filename from the internal src for the multipart part name and
-      // for MIME inference when the fetched blob has no usable type.
-      const filename = decodeURIComponent(
-        src.split("?")[0].split("/").pop() || "audio",
-      );
-
-      const res = await fetch(fileUrl, { credentials: "include" });
-      if (!res.ok) {
-        throw new Error(`Failed to fetch audio file (HTTP ${res.status})`);
-      }
-      const blob = await res.blob();
-
-      // Ensure the upload's content-type is one the STT endpoint accepts; the
-      // server keys off the blob's MIME type.
-      let uploadBlob = blob;
-      const baseType = (blob.type || "").split(";")[0].trim().toLowerCase();
-      if (!RECOGNIZED_AUDIO_MIME.has(baseType)) {
-        const ext = filename.split(".").pop()?.toLowerCase() ?? "";
-        const inferred = AUDIO_MIME_BY_EXT[ext];
-        if (inferred) {
-          // Rebuild the blob with an accepted content-type; the server keys off it.
-          uploadBlob = new Blob([blob], { type: inferred });
-        }
-      }
-
-      const text = (await transcribeAudio(uploadBlob, filename)).trim();
-      if (text.length === 0) {
-        notifications.show({ message: t("No speech detected") });
-        return;
-      }
-
-      // Re-scan the doc at insert time so a collaborative edit during the async
-      // transcription can't misplace the text. Among audio nodes with this src
-      // (the same file may be embedded more than once), pick the occurrence
-      // closest to the originally-selected block.
-      let insertPos: number | null = null;
-      let bestDelta = Infinity;
-      editor.state.doc.descendants((node, pos) => {
-        if (node.type.name === "audio" && node.attrs.src === src) {
-          const delta = Math.abs(pos - selectedPos);
-          if (delta < bestDelta) {
-            bestDelta = delta;
-            insertPos = pos + node.nodeSize; // position just after the audio block
-          }
-        }
-        return true; // visit all nodes to find the closest match
-      });
-
-      const paragraph = { type: "paragraph", content: [{ type: "text", text }] };
-      try {
-        if (insertPos !== null) {
-          editor.chain().focus().insertContentAt(insertPos, paragraph).run();
-        } else {
-          editor.chain().focus().insertContent(paragraph).run();
-        }
-      } catch (insertErr) {
-        // A destroyed editor or out-of-bounds position must not throw; log and
-        // ignore so the transcription itself is not reported as a failure.
-        console.error("[audio-transcribe] insert failed", insertErr);
-      }
-    } catch (err) {
-      console.error("[audio-transcribe] failed", err);
-      const resp = (
-        err as { response?: { status?: number; data?: { message?: string } } }
-      )?.response;
-      const serverMsg = resp?.data?.message;
-      let message: string;
-      if (serverMsg && serverMsg.trim().length > 0) {
-        // The server already explains the cause (e.g. provider error, bad
-        // format, STT not configured) — show it verbatim.
-        message = serverMsg;
-      } else if (resp?.status === 503 || resp?.status === 403) {
-        message = t("Voice dictation is not configured");
-      } else {
-        message = `${t("Transcription failed")}: ${(err as { message?: string })?.message ?? String(err)}`;
-      }
-      notifications.show({ color: "red", message });
-    } finally {
-      setIsTranscribing(false);
-    }
-  }, [editor, editorState?.src, isTranscribing, t]);
-
  const handleDownload = useCallback(() => {
    if (!editorState?.src) return;
    const url = getFileUrl(editorState.src);
@@ -209,20 +95,6 @@ export function AudioMenu({ editor }: EditorMenuProps) {
      shouldShow={shouldShow}
    >
      <div className={classes.toolbar}>
-        {dictationEnabled && (
-          <Tooltip position="top" label={isTranscribing ? t("Transcribing…") : t("Transcribe")} withinPortal={false}>
-            <ActionIcon
-              onClick={handleTranscribe}
-              size="lg"
-              aria-label={t("Transcribe")}
-              variant="subtle"
-              disabled={isTranscribing}
-            >
-              {isTranscribing ? <Loader size={18} /> : <IconFileText size={18} />}
-            </ActionIcon>
-          </Tooltip>
-        )}
-
        <Tooltip position="top" label={t("Download")} withinPortal={false}>
          <ActionIcon
            onClick={handleDownload}
--- a/apps/client/src/features/editor/components/code-block/code-block-view.tsx
+++ b/apps/client/src/features/editor/components/code-block/code-block-view.tsx
@@ -47,26 +47,6 @@ export default function CodeBlockView(props: NodeViewProps) {

  return (
    <NodeViewWrapper className="codeBlock">
-      {/* #146: the editable <pre><code> (contentDOM) MUST come first in the DOM.
-          With the non-editable menu rendered before it, the browser's click
-          hit-testing snapped the caret up one line. Render content first; the
-          menu is rendered after it and lifted back above visually via flex
-          `order: -1` (the `.codeBlock` wrapper is a flex column — see
-          code-block.module.css). It stays fully in flow as a full-width row
-          above the code: no overlay/absolute positioning. The second #146
-          mitigation lives in editor-paste-handler.tsx (reflowAfterPaste). */}
-      <pre
-        spellCheck="false"
-        hidden={
-          ((language === "mermaid" && !editor.isEditable) ||
-            (language === "mermaid" && !isSelected)) &&
-          node.textContent.length > 0
-        }
-      >
-        {/* @ts-ignore */}
-        <NodeViewContent as="code" className={`language-${language}`} />
-      </pre>
-
      <Group
        justify="flex-end"
        contentEditable={false}
@@ -103,6 +83,18 @@ export default function CodeBlockView(props: NodeViewProps) {
        </CopyButton>
      </Group>

+      <pre
+        spellCheck="false"
+        hidden={
+          ((language === "mermaid" && !editor.isEditable) ||
+            (language === "mermaid" && !isSelected)) &&
+          node.textContent.length > 0
+        }
+      >
+        {/* @ts-ignore */}
+        <NodeViewContent as="code" className={`language-${language}`} />
+      </pre>
+
      {language === "mermaid" && (
        <Suspense fallback={null}>
          <MermaidView props={props} />
--- a/apps/client/src/features/editor/components/code-block/code-block.module.css
+++ b/apps/client/src/features/editor/components/code-block/code-block.module.css
@@ -17,14 +17,7 @@
    justify-content: center;
 }

-/* #146: the menu now follows the <pre> in the DOM (so the editable contentDOM is
-   FIRST and click hit-testing is correct). Lift it back ABOVE the code visually
-   with flex `order` — the .codeBlock wrapper is a flex column (see code.css) —
-   so the menu still reads as a row above the code, exactly as before, without
-   sitting in-flow before the contentDOM. */
 .menuGroup {
-    order: -1;
-
    @media print {
        display: none;
    }
--- a/apps/client/src/features/editor/components/common/editor-paste-handler.test.ts
+++ b/apps/client/src/features/editor/components/common/editor-paste-handler.test.ts
@@ -1,160 +0,0 @@
-import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
-import {
-  collectScrollAncestors,
-  reflowAfterPaste,
-} from "./editor-paste-handler";
-
-/**
- * Unit tests for the #146 post-paste reflow helpers. jsdom does not compute
- * styles or layout, so we stub getComputedStyle (per element via a Map) and the
- * scroll/overflow geometry properties (per element via Object.defineProperty).
- * Element trees are built DETACHED from `document`, so the ancestor walk only
- * traverses the elements we create. collectScrollAncestors always appends
- * document.scrollingElement, so we assert on specific ancestors with
- * toContain/not.toContain rather than exact-array equality.
- */
-
-type Overflow = { overflowX: string; overflowY: string };
-const styleMap = new Map<Element, Overflow>();
-
-function makeScrollable(
-  overflowY: string,
-  {
-    sh = 0,
-    ch = 0,
-    sw = 0,
-    cw = 0,
-    left = 0,
-    top = 0,
-    overflowX = "visible",
-  }: {
-    sh?: number;
-    ch?: number;
-    sw?: number;
-    cw?: number;
-    left?: number;
-    top?: number;
-    overflowX?: string;
-  } = {},
-) {
-  const el = document.createElement("div");
-  Object.defineProperty(el, "scrollHeight", { configurable: true, value: sh });
-  Object.defineProperty(el, "clientHeight", { configurable: true, value: ch });
-  Object.defineProperty(el, "scrollWidth", { configurable: true, value: sw });
-  Object.defineProperty(el, "clientWidth", { configurable: true, value: cw });
-  Object.defineProperty(el, "scrollLeft", { configurable: true, value: left });
-  Object.defineProperty(el, "scrollTop", { configurable: true, value: top });
-  styleMap.set(el, { overflowX, overflowY });
-  return el;
-}
-
-// A leaf node whose parentElement is `parent`. The walk starts from
-// node.parentElement, so the parent is the first candidate ancestor.
-function makeNodeUnder(parent: HTMLElement) {
-  const node = document.createElement("div");
-  parent.appendChild(node);
-  return node;
-}
-
-// Override `document.scrollingElement` as an instance own-property (the native
-// implementation is a getter on Document.prototype, which we never touch).
-function setScrollingElement(value: Element | null) {
-  Object.defineProperty(document, "scrollingElement", {
-    configurable: true,
-    get: () => value,
-  });
-}
-
-beforeEach(() => {
-  styleMap.clear();
-  vi.stubGlobal("getComputedStyle", (el: Element) => {
-    return styleMap.get(el) ?? { overflowX: "visible", overflowY: "visible" };
-  });
-});
-
-afterEach(() => {
-  vi.unstubAllGlobals();
-  // Drop the per-test instance override so the native prototype getter shows
-  // through again (it was never modified, so no further restore is needed).
-  delete (document as any).scrollingElement;
-});
-
-describe("collectScrollAncestors", () => {
-  it("includes an overflow:overlay ancestor that overflows (macOS case)", () => {
-    setScrollingElement(null);
-    const a = makeScrollable("overlay", { sh: 200, ch: 100 });
-    const node = makeNodeUnder(a);
-    expect(collectScrollAncestors(node)).toContain(a);
-  });
-
-  it("excludes an overflow:auto ancestor that does NOT overflow (gate fails)", () => {
-    setScrollingElement(null);
-    const a = makeScrollable("auto", { sh: 100, ch: 100 });
-    const node = makeNodeUnder(a);
-    expect(collectScrollAncestors(node)).not.toContain(a);
-  });
-
-  it("includes an overflow:auto ancestor that overflows", () => {
-    setScrollingElement(null);
-    const a = makeScrollable("auto", { sh: 200, ch: 100 });
-    const node = makeNodeUnder(a);
-    expect(collectScrollAncestors(node)).toContain(a);
-  });
-
-  it("excludes a non-scrollable overflow even when it overflows", () => {
-    setScrollingElement(null);
-    const a = makeScrollable("hidden", { sh: 200, ch: 100 });
-    const node = makeNodeUnder(a);
-    expect(collectScrollAncestors(node)).not.toContain(a);
-  });
-
-  it("includes an X-axis overflow:scroll ancestor that overflows horizontally", () => {
-    setScrollingElement(null);
-    const a = makeScrollable("visible", {
-      overflowX: "scroll",
-      sw: 200,
-      cw: 100,
-    });
-    const node = makeNodeUnder(a);
-    expect(collectScrollAncestors(node)).toContain(a);
-  });
-
-  it("dedups: scrollingElement already in the walk is added exactly once", () => {
-    const a = makeScrollable("auto", { sh: 200, ch: 100 });
-    setScrollingElement(a);
-    const node = makeNodeUnder(a);
-    const result = collectScrollAncestors(node);
-    expect(result.filter((x) => x === a).length).toBe(1);
-  });
-
-  it("does not throw and appends nothing when scrollingElement is null", () => {
-    setScrollingElement(null);
-    const a = makeScrollable("auto", { sh: 200, ch: 100 });
-    const node = makeNodeUnder(a);
-    const result = collectScrollAncestors(node);
-    // Only the qualifying ancestor we built — no trailing scrollingElement.
-    expect(result).toEqual([a]);
-  });
-});
-
-describe("reflowAfterPaste", () => {
-  it("runs the double rAF and nudges each ancestor with scrollTo(scrollLeft, scrollTop)", () => {
-    // Run the double-nested requestAnimationFrame synchronously.
-    vi.stubGlobal(
-      "requestAnimationFrame",
-      (cb: FrameRequestCallback) => {
-        cb(0);
-        return 0;
-      },
-    );
-    setScrollingElement(null);
-
-    const a = makeScrollable("auto", { sh: 200, ch: 100, left: 5, top: 10 });
-    const node = makeNodeUnder(a);
-    (a as any).scrollTo = vi.fn();
-
-    reflowAfterPaste({ view: { dom: node } } as any);
-
-    expect((a as any).scrollTo).toHaveBeenCalledWith(5, 10);
-  });
-});
--- a/apps/client/src/features/editor/components/common/editor-paste-handler.tsx
+++ b/apps/client/src/features/editor/components/common/editor-paste-handler.tsx
@@ -22,81 +22,12 @@ const ATTACHMENT_NODE_TYPES = [

 const ATTACHMENT_URL_RE = /\/api\/files\/([0-9a-f-]+)\//;

-const SCROLLABLE_OVERFLOW = new Set(["auto", "scroll", "overlay"]);
-
-/**
- * Collect every scrollable ancestor of the editor DOM whose hit-test layer
- * could be stale after a paste, plus the document scrolling element. We nudge
- * ALL of them (a zero-delta nudge is harmless) because the real scroll container
- * varies — a styled overflow ancestor on most pages, the document itself on
- * others — and `overflow: overlay` (common on macOS, where #146 reproduces)
- * must count as scrollable too. Called only AFTER the paste has committed, so
- * `scrollHeight > clientHeight` reflects the inserted content.
- */
-export function collectScrollAncestors(node: HTMLElement): HTMLElement[] {
-  const targets: HTMLElement[] = [];
-  // Walk every ancestor (incl. body/html) — on some layouts the scroll lives on
-  // body rather than the documentElement that scrollingElement points at.
-  let el: HTMLElement | null = node.parentElement;
-  while (el) {
-    const { overflowX, overflowY } = getComputedStyle(el);
-    const scrollsY =
-      SCROLLABLE_OVERFLOW.has(overflowY) && el.scrollHeight > el.clientHeight;
-    const scrollsX =
-      SCROLLABLE_OVERFLOW.has(overflowX) && el.scrollWidth > el.clientWidth;
-    if (scrollsY || scrollsX) targets.push(el);
-    el = el.parentElement;
-  }
-  const docEl = document.scrollingElement as HTMLElement | null;
-  if (docEl && !targets.includes(docEl)) targets.push(docEl);
-  return targets;
-}
-
-/**
- * Re-flow the editor's scroll containers after a paste so the browser refreshes
- * its click hit-testing geometry (#146). Pasting markdown/code inserts React
- * NodeViews that mount ASYNCHRONOUSLY; until the next reflow, ProseMirror's
- * posAtCoords/caretRangeFromPoint can map a click to a stale (offset) line —
- * which users observed clears itself on any scroll. We reproduce that scroll's
- * side effect with a ZERO-delta nudge (re-assign scrollTop/Left to their current
- * value), invalidating the hit-test layer WITHOUT moving the viewport. The
- * container lookup AND the nudge run across two animation frames so they happen
- * AFTER the pasted content + NodeViews commit (only then is the real scroll
- * container measurable).
- *
- * This is the SECOND of two #146 mitigations; the FIRST is the content-first DOM
- * order in the NodeViews (code-block-view.tsx, footnotes-list-view.tsx,
- * footnote-definition-view.tsx). Editing one, check the other.
- */
-export function reflowAfterPaste(editor: Editor) {
-  const dom = editor.view.dom as HTMLElement;
-  requestAnimationFrame(() => {
-    requestAnimationFrame(() => {
-      for (const el of collectScrollAncestors(dom)) {
-        // Zero-delta nudge: re-set the scroll position to its current value to
-        // invalidate the browser's hit-test layer WITHOUT moving the viewport.
-        // `scrollTo(x, y)` is the repo idiom and avoids a lint-flagged
-        // self-assignment.
-        el.scrollTo(el.scrollLeft, el.scrollTop);
-      }
-    });
-  });
-}
-
 export const handlePaste = (
  editor: Editor,
  event: ClipboardEvent,
  pageId: string,
  creatorId?: string,
 ) => {
-  // Schedule a post-paste reflow on EVERY paste path — intentionally. handlePaste
-  // returns BEFORE the markdown/code-insertion plugin runs, so it cannot know here
-  // whether async NodeViews will be inserted; the nudge is a cheap layout read on
-  // the next frames and a no-op for the viewport, so scheduling it unconditionally
-  // is simpler and harmless. Pairs with the content-first DOM order in the
-  // NodeViews — both address #146 from different angles.
-  reflowAfterPaste(editor);
-
  const clipboardData = event.clipboardData.getData("text/plain");

  if (INTERNAL_LINK_REGEX.test(clipboardData)) {
--- a/apps/client/src/features/editor/components/common/node-resize.module.css
+++ b/apps/client/src/features/editor/components/common/node-resize.module.css
@@ -73,18 +73,3 @@
    display: none !important;
  }
 }
-
-/* Float image (#145): on narrow screens a floated image would crowd the text to
-   an unreadable column, so collapse it to full width and drop the float.
-   `!important` is required because applyAlignment sets `float`/`padding` inline,
-   which a normal rule cannot override. Keys off the `data-image-align` attribute
-   the image node view mirrors onto its container. This module is the one actually
-   imported by the resize node views (node-resize-handles.ts), so the rule loads. */
-@media (max-width: 600px) {
-  .container:global([data-image-align="floatLeft"]),
-  .container:global([data-image-align="floatRight"]) {
-    float: none !important;
-    width: 100% !important;
-    padding: 0 !important;
-  }
-}
--- a/apps/client/src/features/editor/components/fixed-toolbar/fixed-toolbar.tsx
+++ b/apps/client/src/features/editor/components/fixed-toolbar/fixed-toolbar.tsx
@@ -13,6 +13,7 @@ import { QuickInsertsGroup } from "./groups/quick-inserts-group";
 import { MoreInsertsGroup } from "./groups/more-inserts-group";
 import { HistoryGroup } from "./groups/history-group";
 import { AskAiGroup } from "./groups/ask-ai-group";
+import { DictationGroup } from "./groups/dictation-group";
 import { workspaceAtom } from "@/features/user/atoms/current-user-atom";
 import classes from "./fixed-toolbar.module.css";

@@ -30,6 +31,7 @@ export const FixedToolbar: FC<FixedToolbarProps> = ({
  const state = useToolbarState(editor);
  const workspace = useAtomValue(workspaceAtom);
  const isGenerativeAiEnabled = workspace?.settings?.ai?.generative === true;
+  const isDictationEnabled = workspace?.settings?.ai?.dictation === true;

  if (!editor || !state) return null;

@@ -65,6 +67,12 @@ export const FixedToolbar: FC<FixedToolbarProps> = ({
          <MoreInsertsGroup editor={editor} templateMode={templateMode} />
          <div className={classes.divider} />
          <HistoryGroup editor={editor} state={state} />
+          {isDictationEnabled && (
+            <>
+              <div className={classes.divider} />
+              <DictationGroup editor={editor} />
+            </>
+          )}
        </div>
      </div>
      <div className={classes.spacer} aria-hidden />