In prod the AI provider resets the connection pre-response (ECONNRESET); the
#175 pre-response retry recovers it, but 2 of the 3 allowed attempts were burned
in a single turn — no headroom, and one more reset would surface an error to the
user. This is tuning for resilience (not a diagnosis of who resets):
- Retry budget 2 → 4 (total 5 attempts), env-configurable via
AI_STREAM_PRE_RESPONSE_RETRIES (0 = no retry; empty/invalid → default 4).
- Backoff: linear 150*(attempt+1) → capped exponential + full jitter
(preResponseBackoffMs, a pure injectable helper): base 150ms, ×2 per attempt,
capped 2000ms, delay = random in [0, capped]. Avoids a synchronized retry
storm and spreads reconnects across the reset window.
- Keep-alive default 10_000 → 4_000 ms so undici recycles idle sockets before a
~5s upstream/middlebox idle cutoff can poison them (a common pre-response
reset cause). Still env-overridable via AI_STREAM_KEEPALIVE_MS.
- .env.example documents both knobs.
Timeout (900s), RETRYABLE_CONNECT_CODES, and the instrumentation are unchanged.
refs #310
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>