feat(dictation): add realtime streaming STT (live dictation)

Layer an optional realtime speech-to-text path on top of the existing
batch dictation, so transcribed text appears as the user speaks.

Transport A2: browser <-> our server (Socket.IO `/ai-realtime`) <->
OpenAI Realtime (raw ws). The provider API key never leaves the server;
the upstream URL is SSRF-checked before connecting; the gateway enforces
the dictation+dictationRealtime gate, cookie-JWT auth and
per-user/per-workspace concurrency caps. Implemented against the GA
(2026) OpenAI Realtime transcription contract (session.update /
audio.input.format / server_vad), not the now-removed beta shape.

Editor UI B2: interim text is shown as a meta-only ProseMirror ghost
decoration (no Yjs/history noise); only completed segments are
committed. Chat shows interim as a dimmed tail. The mic button switches
realtime vs batch by the workspace flag; batch remains the default and
fallback.

Server:
- AiRealtimeService (upstream ws proxy, normalized events,
  idle/max-duration timeouts, idempotent teardown) + parseUpstreamEvent
  unit tests
- AiRealtimeGateway (Socket.IO `/ai-realtime`) wired into AiChatModule
- admin-gated POST /ai-chat/realtime/test connectivity probe
- config: settings.ai.dictationRealtime + provider
  sttRealtimeModel/sttRealtimeBaseUrl (realtime key reuses sttApiKey;
  no new secret)

Client:
- pcm16 AudioWorklet (24kHz mono PCM16), RealtimeDictationClient,
  use-realtime-dictation hook (status/start/stop/cancel +
  onInterim/onFinal)
- RealtimeMicButton + dictation-interim ProseMirror decoration
- editor/chat integration + AI settings UI (toggle, model, test
  endpoint)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
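
For readers unfamiliar with the "24kHz mono PCM16" format named in the client notes, the sample conversion an AudioWorklet typically performs can be sketched as follows. This is a minimal illustration, not the commit's worklet code: `floatToPcm16` is a hypothetical helper name, and resampling to 24 kHz is assumed to happen elsewhere (for example by creating the `AudioContext` at that sample rate).

```typescript
// Hypothetical helper, not taken from the commit: converts Web Audio
// float samples (nominally in [-1, 1]) to signed 16-bit PCM.
export function floatToPcm16(input: Float32Array): Int16Array {
  return Int16Array.from(input, (sample) => {
    // Clamp first so out-of-range samples saturate instead of wrapping.
    const s = Math.max(-1, Math.min(1, sample));
    // Two's complement has one more negative step than positive,
    // so scale the two halves separately.
    return s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  });
}
```

Clamping before scaling matters because typed-array assignment wraps on overflow, which would turn a slightly clipped peak into a loud click.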
@@ -123,6 +123,7 @@ import { countWords } from "alfaaz";
 import AutoJoiner from "@/features/editor/extensions/autojoiner.ts";
 import GlobalDragHandle from "@/features/editor/extensions/drag-handle.ts";
 import { CleanStyles } from "@/features/editor/extensions/clean-styles.ts";
+import { DictationInterim } from "@/features/editor/extensions/dictation-interim/dictation-interim.ts";
 
 const lowlight = createLowlight(common);
 lowlight.register("mermaid", plaintext);
@@ -343,6 +344,7 @@ export const mainExtensions = [
     },
   }),
   Selection,
+  DictationInterim,
   Attachment.configure({
     view: AttachmentView,
   }),
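
The commit message says that interim text is only a ghost decoration and that only completed segments are committed. That rule can be sketched as a small pure state transition. The event shape and all names below (`DictationEvent`, `DictationState`, `applyDictationEvent`) are assumptions for illustration. The commit's actual normalized events and the `DictationInterim` extension internals are not shown in this diff.

```typescript
// Hypothetical event shape; the real normalized events are not shown here.
type DictationEvent =
  | { kind: "interim"; text: string }
  | { kind: "final"; text: string };

interface DictationState {
  committed: string[]; // segments that would be written to the document
  ghost: string; // interim text that would be shown only as a decoration
}

export const emptyDictationState: DictationState = { committed: [], ghost: "" };

// Returns the next state without mutating the previous one.
export function applyDictationEvent(
  state: DictationState,
  event: DictationEvent,
): DictationState {
  if (event.kind === "interim") {
    // Interim text replaces the ghost and never touches committed content.
    return { committed: state.committed, ghost: event.text };
  }
  // A final segment is committed and the ghost is cleared.
  return { committed: [...state.committed, event.text], ghost: "" };
}
```

Keeping interim text out of `committed` mirrors the stated goal of avoiding Yjs and history noise: only the final branch would correspond to a real document transaction.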