Found while live-testing the realtime dictation:
- 'already active' lockout (real bug): the per-user slot was tied to the
lifetime of the connected socket, and a stale or racing socket could leave the
counter stuck, so a fresh mic start was rejected. Per-user single-session is
now enforced purely by LATEST-WINS EVICTION: a new connect disconnects the
user's prior socket and frees its slot synchronously, and the user counter no
longer participates in the cap decision (it could only cause false lockouts).
The slot is also freed when a start fails to open. The per-workspace cap is
unchanged.
- #737: drop the separate sttRealtimeModel / sttRealtimeBaseUrl settings — realtime
dictation now reuses the existing STT model + base URL (the realtime WS endpoint
is derived from it server-side). Removed the fields from the DTO, types, settings
service, repo allowlist, and the settings UI. The STT 'Test endpoint' button is
now a single context-aware button (probes the realtime WS endpoint when realtime
is on, the batch endpoint otherwise), and the 'Request format' selector is
disabled while realtime is on (realtime always uses the OpenAI Realtime protocol).
- no-silent-loss: parse the OpenAI
conversation.item.input_audio_transcription.failed event (e.g. insufficient_quota,
bad model) and surface its concrete reason to the client instead of dropping it
silently — previously a per-item transcription failure produced 'no words' with
no explanation.
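
For reviewers, the latest-wins eviction from the first item can be sketched
roughly as below. This is a minimal illustration, not the gateway code:
SessionSocket, LatestWinsRegistry, connect, release and workspaceCap are all
hypothetical names, and the real socket type is not shown in this commit.

```typescript
// Hypothetical minimal socket shape (assumption; stands in for the real socket type).
interface SessionSocket {
  id: string;
  disconnect(): void;
}

// Sketch: one live socket per user (latest wins), capped per workspace only.
class LatestWinsRegistry {
  private readonly workspaceCap: number;
  private readonly byUser = new Map<string, SessionSocket>();
  private readonly workspaceOf = new Map<string, string>(); // socket id -> workspace id
  private readonly perWorkspace = new Map<string, number>(); // workspace id -> live slots

  constructor(workspaceCap: number) {
    this.workspaceCap = workspaceCap;
  }

  // Returns true when the new socket holds the user's slot afterwards.
  connect(userId: string, workspaceId: string, socket: SessionSocket): boolean {
    // Evict the prior socket first and free its slot synchronously, so the
    // cap check below never sees a slot held by a stale socket.
    const prior = this.byUser.get(userId);
    if (prior) {
      this.release(userId, prior);
      if (prior !== socket) prior.disconnect();
    }
    // Only the per-workspace cap participates in the decision.
    const used = this.perWorkspace.get(workspaceId) ?? 0;
    if (used >= this.workspaceCap) return false;
    this.byUser.set(userId, socket);
    this.workspaceOf.set(socket.id, workspaceId);
    this.perWorkspace.set(workspaceId, used + 1);
    return true;
  }

  // Idempotent: a late disconnect from an evicted socket is a no-op and
  // cannot free or steal the slot of the socket that replaced it.
  release(userId: string, socket: SessionSocket): void {
    const workspaceId = this.workspaceOf.get(socket.id);
    if (workspaceId === undefined) return;
    this.workspaceOf.delete(socket.id);
    const left = (this.perWorkspace.get(workspaceId) ?? 1) - 1;
    if (left > 0) this.perWorkspace.set(workspaceId, left);
    else this.perWorkspace.delete(workspaceId);
    if (this.byUser.get(userId) === socket) this.byUser.delete(userId);
  }
}
```

In this sketch a failed start would call release() for its own socket, which
corresponds to freeing the slot when a start fails to open.
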
Tests: realtime suites green (gateway latest-wins eviction, parser .failed surfacing,
ai-settings reuse-STT-model); server + client tsc clean; workspace vitest 37 pass.
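
The .failed surfacing exercised by the parser suite can be sketched roughly as
below. This is an illustration under assumptions, not the actual parser:
parseTranscriptionFailed and TranscriptionFailure are hypothetical names, and
the event is assumed to carry item_id plus an error object with code and
message fields.

```typescript
// Hypothetical result type handed to the client-facing layer (assumption).
interface TranscriptionFailure {
  itemId: string;
  code: string;
  message: string;
}

const FAILED_EVENT = "conversation.item.input_audio_transcription.failed";

// Returns the failure details for a .failed event, or null for any other
// frame, so the caller can forward a concrete reason instead of dropping it.
function parseTranscriptionFailed(raw: string): TranscriptionFailure | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // not JSON: not a .failed event
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const evt = parsed as Record<string, unknown>;
  if (evt.type !== FAILED_EVENT) return null;

  const err = (typeof evt.error === "object" && evt.error !== null
    ? evt.error
    : {}) as Record<string, unknown>;

  // Fall back to generic values rather than losing the failure entirely.
  return {
    itemId: typeof evt.item_id === "string" ? evt.item_id : "",
    code: typeof err.code === "string" ? err.code : "unknown",
    message: typeof err.message === "string" ? err.message : "transcription failed",
  };
}
```

A caller would check for a non-null result and send the code and message to
the client in place of an empty transcript.
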
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>