fix(ai-chat): don't sever long agent turns at undici's 300s stream timeout (#175) #176

Merged
vvzvlad merged 2 commits from fix/ai-stream-undici-timeout into develop 2026-06-24 22:34:19 +03:00

2 Commits

Author SHA1 Message Date
claude code agent 227
da15b55786 refactor(ai): address PR #176 review — finite-timeout wording, env doc, tests, permanent provider-http module
- Wording: every comment now says the stream timeouts are RAISED to a
  generous-but-finite ~15-min silence timeout, not "disabled (0)" (the stale
  comments contradicted the code, which uses AI_STREAM_TIMEOUT_MS, default
  900000ms).
- Architecture (the load-bearing-temporary trap): the streaming fetch reached
  the chat provider only by riding the "temporary DIAGNOSTIC" telemetry, so
  deleting the telemetry by its own label would silently revert the timeout fix.
  Legitimize it: rename ai-http-diagnostics.ts -> ai-provider-http.ts,
  createDiagnosticFetch -> createInstrumentedFetch, field aiDiagnosticFetch ->
  aiProviderFetch, drop the "temporary" labels, and document the chat transport
  (streaming fetch + instrumentation) as one intentional construct.
- Docs: AI_STREAM_TIMEOUT_MS added to .env.example next to AI_EMBEDDING_TIMEOUT_MS.
- Tests:
  - ai-provider-http.spec: createInstrumentedFetch delegates to the injected
    baseFetch with the same input/init, returns the Response untouched, rethrows
    the error, and defaults to global fetch — covering the baseFetch seam.
  - ai-streaming-fetch.spec: the delayed-server test is now LOAD-BEARING — with
    AI_STREAM_TIMEOUT_MS set below the 1.5s server delay the call actually rejects
    (a lost dispatcher -> global 300s default would NOT), proving the configured
    dispatcher is wired; plus the default-timeout happy path.

server tsc clean; ai-streaming-fetch / ai-provider-http / ai.service / mcp-servers
/ ai-error specs green (41).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 22:31:58 +03:00
claude code agent 227
a14560c7c9 fix(ai-chat): raise undici's 300s stream timeout for long agent turns (#175)
Long research turns failed mid-task with "Lost connection to the AI provider".
Node's global fetch (undici) defaults BOTH headersTimeout and bodyTimeout to
300_000ms, and the chat provider + the external-MCP dispatcher both ran on it
with no override, so:
  - the z.ai chat stream dropped when a late step's huge accumulated context
    pushed the model's time-to-first-token past 5 min (the model reasons
    server-side with NO streamed reasoning, so the connection is silent until the
    first answer token — reproduced: even a trivial glm-5.2 query has a ~4-8s
    first-chunk gap; a long run reaches 400k+-token steps), or a reasoning model
    paused >5 min between chunks (bodyTimeout);
  - the crawl4ai SSE transport, held open across the whole turn, dropped when it
    idled >5 min between tool calls.

Fix: a dedicated undici dispatcher whose stream timeouts are raised to a
generous-but-FINITE silence timeout (default 15 min, AI_STREAM_TIMEOUT_MS) on
each path. NOT disabled (0): that would let a genuinely hung provider — with the
client still connected — hang forever, since the turn's abortSignal only fires on
client disconnect. The timeout bounds SILENCE (time-to-first-byte and the gap
BETWEEN chunks), NOT total turn duration, so an arbitrarily long turn that keeps
streaming is never cut; only a stream quiet for >15 min is treated as a hang.
  - ai-streaming-fetch.ts: createStreamingFetch() + streamTimeoutMs() /
    streamingDispatcherOptions() (the shared, configurable timeout).
  - ai.service: the chat provider fetch is createStreamingFetch(), wrapped by the
    existing passive ECONNRESET telemetry (createDiagnosticFetch gained an
    optional baseFetch) so the telemetry observes the SAME transport.
  - mcp-clients: the SSRF-pinned Agent uses streamingDispatcherOptions().

Investigation: reproduced the transport mechanism against the real z.ai endpoint
(a 1ms headersTimeout throws UND_ERR_HEADERS_TIMEOUT — the exact drop) and ran
the actual research agent to a ~428k-token context. Verified the fixed path
streams cleanly live (glm-5.2 turns finish; telemetry confirms the streaming
fetch is in use).

Tests: ai-streaming-fetch.spec (default 15m + env override + invalid fallback +
both-timeouts + streams a delayed response); ai-http-diagnostics + ai/mcp specs
green. server tsc clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 22:09:10 +03:00