refactor(ai): address PR #176 review — finite-timeout wording, env doc, tests, permanent provider-http module

- Wording: every comment now says the stream timeouts are RAISED to a generous-but-finite ~15-min silence timeout, not "disabled (0)" (the stale comments contradicted the code, which uses AI_STREAM_TIMEOUT_MS, default 900000ms). - Architecture (the load-bearing-temporary trap): the streaming fetch reached the chat provider only by riding the "temporary DIAGNOSTIC" telemetry, so deleting the telemetry by its own label would silently revert the timeout fix. Legitimize it: rename ai-http-diagnostics.ts -> ai-provider-http.ts, createDiagnosticFetch -> createInstrumentedFetch, field aiDiagnosticFetch -> aiProviderFetch, drop the "temporary" labels, and document the chat transport (streaming fetch + instrumentation) as one intentional construct. - Docs: AI_STREAM_TIMEOUT_MS added to .env.example next to AI_EMBEDDING_TIMEOUT_MS. - Tests: - ai-provider-http.spec: createInstrumentedFetch delegates to the injected baseFetch with the same input/init, returns the Response untouched, rethrows the error, and defaults to global fetch — covering the baseFetch seam. - ai-streaming-fetch.spec: the delayed-server test is now LOAD-BEARING — with AI_STREAM_TIMEOUT_MS set below the 1.5s server delay the call actually rejects (a lost dispatcher -> global 300s default would NOT), proving the configured dispatcher is wired; plus the default-timeout happy path. server tsc clean; ai-streaming-fetch / ai-provider-http / ai.service / mcp-servers / ai-error specs green (41). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 22:31:58 +03:00
parent a14560c7c9
commit da15b55786
5 changed files with 114 additions and 30 deletions
--- a/.env.example
+++ b/.env.example
@@ -136,6 +136,12 @@ MCP_DOCMOST_PASSWORD=
 # A slow/hung embeddings endpoint fails after this and the batch continues.
 # AI_EMBEDDING_TIMEOUT_MS=120000

+# Silence timeout (ms) for streaming chat/agent AI calls AND external-MCP traffic.
+# Bounds time-to-first-byte and the gap BETWEEN chunks (NOT the total turn length),
+# so an arbitrarily long turn that keeps streaming is never cut. Finite so a hung
+# provider is eventually broken instead of leaking forever. Default 900000 (15 min).
+# AI_STREAM_TIMEOUT_MS=900000
+
 # --- Anonymous public-share AI assistant ---
 # Opt-in per workspace (AI settings -> "public share assistant"; off by default).
 # When enabled, anonymous visitors of a published share can ask an AI about that