Extend ai-http.spec with two loopback-server tests: a provider that stalls
without sending headers triggers the (lowered) headersTimeout and is retried on a
fresh connection, recovering; a healthy fast response passes through in one
attempt. No external network calls.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Outbound LLM calls used Node's default global undici agent (default
keep-alive pooling, no transport-level reconnect), so a TCP RST on a
reused/poisoned keep-alive socket surfaced as
"Cannot connect to API: read ECONNRESET" and failed the chat stream and
title generation after the AI SDK's own retries were exhausted.
Add a dedicated resilient outbound HTTP layer (ai-http.ts): a shared
undici RetryAgent over a tuned Agent, exposed as `aiFetch` and injected
into every AI provider factory (createOpenAI chat/embeddings/STT,
createGoogleGenerativeAI, createOllama) plus the raw JSON STT fetch. The
RetryAgent reconnects on connection-level errors (ECONNRESET, ...) on a
FRESH socket, opts POST into the retry methods (undici's default list
excludes POST), and leaves HTTP-status retries (429/5xx + Retry-After) to
the AI SDK to avoid double-retry.
- ai-http.ts: shared RetryAgent(Agent) + aiFetch (maxRetries 2,
conservative keep-alive, connect timeout, streaming-safe timeouts)
- ai.service.ts: inject fetch: aiFetch into every provider factory
- ai-http.spec.ts: regression test that aiFetch injects the RetryAgent
dispatcher into the underlying fetch
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>