fix(ai-http): fail fast + retry on provider header stall (#140) #141
Fixes #140. The z.ai GLM coding endpoint intermittently accepts the chat request but never sends response headers. undici's default headersTimeout is 300s, so the user hung for five minutes before the request failed, and UND_ERR_HEADERS_TIMEOUT was not in the RetryAgent's retried error set, so there was no recovery.
Root cause (confirmed on the test bench)
headersTimeout bounds the time to the FIRST response headers (before any body). It is not the streaming budget: once headers arrive, the SSE body streams freely, unaffected by it. The previous ai-http.ts comment conflated it with bodyTimeout and left it at undici's 300s default, so a header stall meant a 5-minute hang with no retry. A tiny ping (test endpoint) responded in <2s, so it never tripped the timeout; the heavy streaming chat occasionally stalled.
Reproduced live with the provided z.ai creds: normal streams return headers in ~1–9s; the stall is the rare path where headers never come.
Fix
- Lower headersTimeout (env AI_HTTP_HEADERS_TIMEOUT_MS, default 60s) so a header stall fails fast instead of hanging for 300s. Safe for streaming: lowering it does not truncate live SSE.
- Add UND_ERR_HEADERS_TIMEOUT to the retried error codes so the stalled request is retried on a fresh connection (which usually responds in seconds). A header timeout fires before any body, so the retry is clean (no partial SSE / Range-resume problem).
- bodyTimeout is kept generous (env AI_HTTP_BODY_TIMEOUT_MS, default 300s) so slow/thinking models with sparse chunks survive. UND_ERR_BODY_TIMEOUT is deliberately not retried (mid-body, partial SSE already delivered).
Verification
- ai-http.spec.ts regression tests (loopback server, no external calls): a header stall is retried on a fresh connection and recovers; a healthy fast response passes through in one attempt. 5/5 spec tests pass, server tsc clean.
- […] anthropic/claude-sonnet-4.6) also streams cleanly, consistent with the stall being specific to z.ai's coding endpoint.
Note: this is the opposite direction from the previously reverted experiment (which raised the timeout to 600s; that can't help a stuck request). Lowering + retrying is what converts a 5-minute hang into a fast retry that usually succeeds.
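For readers who want the timeout and retry policy from the Fix section in one place, here is a minimal sketch. It is not the code in ai-http.ts: the helper names (resolveAiHttpPolicy, isRetryable), the AiHttpPolicy shape, and the two pre-existing retry codes shown are assumptions for illustration. Only the two env variable names, the 60s/300s defaults, and the treatment of UND_ERR_HEADERS_TIMEOUT and UND_ERR_BODY_TIMEOUT come from this PR description.

```typescript
// Hypothetical helper; names and shapes are illustrative, not taken from ai-http.ts.
type Env = Record<string, string | undefined>;

interface AiHttpPolicy {
  headersTimeoutMs: number;  // budget for time to first response headers
  bodyTimeoutMs: number;     // allowed idle gap between body chunks while streaming
  retryErrorCodes: string[]; // error codes that trigger a retry on a fresh connection
}

const DEFAULT_HEADERS_TIMEOUT_MS = 60000; // 60s, per the PR description
const DEFAULT_BODY_TIMEOUT_MS = 300000;   // 300s, per the PR description

// Parse a positive integer number of milliseconds; fall back when unset or invalid.
function parseMs(raw: string | undefined, fallback: number): number {
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

function resolveAiHttpPolicy(env: Env): AiHttpPolicy {
  return {
    headersTimeoutMs: parseMs(env.AI_HTTP_HEADERS_TIMEOUT_MS, DEFAULT_HEADERS_TIMEOUT_MS),
    bodyTimeoutMs: parseMs(env.AI_HTTP_BODY_TIMEOUT_MS, DEFAULT_BODY_TIMEOUT_MS),
    retryErrorCodes: [
      // Assumed stand-ins for whatever the retried set already contained.
      "ECONNRESET",
      "UND_ERR_SOCKET",
      // Added by this PR: fires before any body, so a retry is clean.
      "UND_ERR_HEADERS_TIMEOUT",
      // UND_ERR_BODY_TIMEOUT is deliberately absent: partial SSE may already be delivered.
    ],
  };
}

function isRetryable(policy: AiHttpPolicy, code: string): boolean {
  return policy.retryErrorCodes.indexOf(code) !== -1;
}
```

With undici, a policy like this would presumably be passed as headersTimeout and bodyTimeout to an Agent, and as errorCodes to the RetryAgent wrapping it. Note that undici's default retried methods do not include POST, so a chat-completion POST is only retried if the methods option is widened as well.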
🤖 Generated with Claude Code
2026-06-23T01:23:10.413Z INF | pid=45 hostname=d39ceaa4ecbe context=DatabaseModule msg=Establishing database connection
2026-06-23T01:23:10.439Z INF | pid=45 hostname=d39ceaa4ecbe context=RedisModule msg=default: the connection was successfully established
2026-06-23T01:23:10.507Z INF | pid=45 hostname=d39ceaa4ecbe context=DatabaseModule msg=Database connection successful
2026-06-23T01:23:10.984Z INF | pid=45 hostname=d39ceaa4ecbe context=DatabaseMigrationService msg=No pending database migrations
2026-06-23T01:23:10.998Z INF | pid=45 hostname=d39ceaa4ecbe context=NestApplication msg=Nest application successfully started
2026-06-23T01:23:11.012Z INF | pid=45 hostname=d39ceaa4ecbe context=NestApplication msg=Listening on http://127.0.0.1:3000 / https://docs.vvzvlad.xyz
2026-06-23T01:23:22.009Z INF | pid=45 hostname=d39ceaa4ecbe req={"method":"POST","url":"/api/ai-chat/stream","ip":"172.18.0.15","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 Safari/605.1.15"} context=AiChatController msg=AI chat stream START chat=new ua="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 Safari/605.1.15"
2026-06-23T01:23:23.724Z INF | pid=45 hostname=d39ceaa4ecbe req={"method":"POST","url":"/api/ai-chat/stream","ip":"172.18.0.15","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 Safari/605.1.15"} context=AiHttp msg=provider request #3 -> POST api.z.ai/api/coding/paas/v4/chat/completions
2026-06-23T01:24:02.863Z INF | pid=45 hostname=d39ceaa4ecbe req={"method":"POST","url":"/api/ai-chat/stream","ip":"172.18.0.15","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/148.0.0.0 Safari/537.36"} context=AiChatController msg=AI chat stream START chat=new ua="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/148.0.0.0 Safari/537.36"
2026-06-23T01:24:03.078Z INF | pid=45 hostname=d39ceaa4ecbe req={"method":"POST","url":"/api/ai-chat/stream","ip":"172.18.0.15","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/148.0.0.0 Safari/537.36"} context=AiHttp msg=provider request #4 -> POST api.z.ai/api/coding/paas/v4/chat/completions
2026-06-23T01:24:25.348Z WRN | pid=45 hostname=d39ceaa4ecbe req={"method":"POST","url":"/api/ai-chat/stream","ip":"172.18.0.15","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 Safari/605.1.15"} context=AiHttp msg=provider request #3 x after 61624ms: fetch failed <- [UND_ERR_REQ_CONTENT_LENGTH_MISMATCH] Request body length does not match content-length header
2026-06-23T01:24:27.358Z INF | pid=45 hostname=d39ceaa4ecbe req={"method":"POST","url":"/api/ai-chat/stream","ip":"172.18.0.15","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 Safari/605.1.15"} context=AiHttp msg=provider request #5 -> POST api.z.ai/api/coding/paas/v4/chat/completions
2026-06-23T01:25:05.723Z WRN | pid=45 hostname=d39ceaa4ecbe req={"method":"POST","url":"/api/ai-chat/stream","ip":"172.18.0.15","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/148.0.0.0 Safari/537.36"} context=AiHttp msg=provider request #4 x after 62645ms: fetch failed <- [ECONNRESET] read ECONNRESET
2026-06-23T01:25:07.725Z INF | pid=45 hostname=d39ceaa4ecbe req={"method":"POST","url":"/api/ai-chat/stream","ip":"172.18.0.15","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/148.0.0.0 Safari/537.36"} context=AiHttp msg=provider request #6 -> POST api.z.ai/api/coding/paas/v4/chat/completions
2026-06-23T01:25:29.207Z WRN | pid=45 hostname=d39ceaa4ecbe req={"method":"POST","url":"/api/ai-chat/stream","ip":"172.18.0.15","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 Safari/605.1.15"} context=AiHttp msg=provider request #5 x after 61849ms: fetch failed <- [ECONNRESET] read ECONNRESET
2026-06-23T01:25:33.211Z INF | pid=45 hostname=d39ceaa4ecbe req={"method":"POST","url":"/api/ai-chat/stream","ip":"172.18.0.15","userAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.6 Safari/605.1.15"} context=AiHttp msg=provider request #7 -> POST api.z.ai/api/coding/paas/v4/chat/completions
So it didn't help one damn bit.
Ghost referenced this pull request 2026-06-23 05:26:41 +03:00
Ghost referenced this pull request 2026-06-23 16:22:19 +03:00
Ghost referenced this pull request 2026-06-24 05:28:16 +03:00
Ghost referenced this pull request 2026-06-24 05:28:17 +03:00