fix(ai-chat): reconnect on provider ECONNRESET via a resilient fetch

Outbound LLM calls used Node's default global undici agent (default
keep-alive pooling, no transport-level reconnect), so a TCP RST on a
reused/poisoned keep-alive socket surfaced as
"Cannot connect to API: read ECONNRESET" and failed the chat stream and
title generation after the AI SDK's own retries were exhausted.

Add a dedicated resilient outbound HTTP layer (ai-http.ts): a shared
undici RetryAgent over a tuned Agent, exposed as `aiFetch` and injected
into every AI provider factory (createOpenAI chat/embeddings/STT,
createGoogleGenerativeAI, createOllama) plus the raw JSON STT fetch. The
RetryAgent reconnects on connection-level errors (ECONNRESET, ...) on a
FRESH socket, opts POST into the retry methods (undici's default list
excludes POST), and leaves HTTP-status retries (429/5xx + Retry-After) to
the AI SDK to avoid double-retry.
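
The retry split described above can be modelled as a small pure function. This
is only an illustrative sketch, not undici's implementation: the `Attempt`
type, `shouldRetryOnFreshSocket`, and the method list are hypothetical names
chosen for this example, and the error-code set contains only ECONNRESET
because the commit message elides the rest of the list.

```typescript
// Illustrative model of the retry policy; NOT undici's code or option names.
type Attempt = {
  method: string;
  errorCode?: string; // e.g. 'ECONNRESET' when the reused socket was reset
  statusCode?: number; // set when the provider answered with an HTTP status
  retriesSoFar: number;
};

// Assumption: an example method list with POST opted in explicitly.
const RETRY_METHODS = new Set(['GET', 'HEAD', 'PUT', 'DELETE', 'POST']);
// Only the code named in the commit message; the rest of the list is elided there.
const RETRY_ERROR_CODES = new Set(['ECONNRESET']);
const MAX_RETRIES = 2;

function shouldRetryOnFreshSocket(attempt: Attempt): boolean {
  // Stop once the transport-level retry budget is spent.
  if (attempt.retriesSoFar >= MAX_RETRIES) return false;
  // Only methods that were opted in are retried.
  if (!RETRY_METHODS.has(attempt.method.toUpperCase())) return false;
  // An HTTP status (429/5xx) is left to the AI SDK to avoid double-retry.
  if (attempt.statusCode !== undefined) return false;
  // Retry only on connection-level errors.
  return attempt.errorCode !== undefined && RETRY_ERROR_CODES.has(attempt.errorCode);
}
```

Under this model a POST that fails with ECONNRESET is retried at the transport
level, while a POST that receives a 503 is passed up unchanged.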

- ai-http.ts: shared RetryAgent(Agent) + aiFetch (maxRetries 2,
  conservative keep-alive, connect timeout, streaming-safe timeouts)
- ai.service.ts: inject fetch: aiFetch into every provider factory
- ai-http.spec.ts: regression test that aiFetch injects the RetryAgent
  dispatcher into the underlying fetch
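
For orientation, the wrapper shape that the regression test relies on might
look roughly like the sketch below. It is not the contents of ai-http.ts:
`makeAiFetch`, `aiFetchSketch`, and `sharedDispatcher` are hypothetical names,
and a plain object stands in for the real RetryAgent so the example runs
without undici.

```typescript
// Minimal local types so the sketch does not depend on DOM or Node typings.
type Init = Record<string, unknown>;
type FetchLike = (input: string, init?: Init) => Promise<unknown>;

// Hypothetical stand-in for the shared RetryAgent(Agent) dispatcher.
const sharedDispatcher: object = { label: 'stand-in for RetryAgent(Agent)' };

function makeAiFetch(dispatcher: object): FetchLike {
  return (input, init) => {
    // Resolve fetch at call time so a test spy on globalThis.fetch is seen.
    const host = globalThis as unknown as { fetch: FetchLike };
    // Spread the caller's options first, then attach the dispatcher.
    return host.fetch(input, { ...init, dispatcher });
  };
}

const aiFetchSketch = makeAiFetch(sharedDispatcher);
```

Because the caller's options are spread before `dispatcher` is set, fields such
as `method` are preserved, and a caller cannot accidentally replace the shared
dispatcher.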

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
claude_code
2026-06-22 20:23:35 +03:00
parent 7ddd0cba05
commit 1af5d34ae3
3 changed files with 154 additions and 7 deletions


@@ -0,0 +1,47 @@
import { RetryAgent } from 'undici';
import { aiFetch } from './ai-http';

/**
 * Light, dependency-free unit checks for the shared AI HTTP layer. The module
 * constructs its undici dispatcher eagerly at import time, so importing it here
 * already exercises that construction; we make NO real network calls.
 */
describe('ai-http', () => {
  it('exports aiFetch as a function', () => {
    expect(typeof aiFetch).toBe('function');
  });

  it('constructs the dispatcher eagerly without throwing at import time', () => {
    // Reaching this assertion means the top-level Agent/RetryAgent construction
    // in ai-http.ts did not throw when the module was imported above.
    expect(aiFetch).toBeDefined();
  });

  it('forwards the resilient RetryAgent dispatcher into the underlying fetch', async () => {
    // CRITICAL regression guard: aiFetch must inject the shared undici dispatcher
    // into the real fetch call, otherwise AI traffic silently falls back to the
    // default global agent and the ECONNRESET production bug returns. aiFetch
    // resolves `fetch` at call time, so spying on globalThis.fetch intercepts it
    // and prevents any real network call.
    const spy = jest
      .spyOn(globalThis, 'fetch')
      .mockResolvedValue(new Response(null));
    try {
      await aiFetch('https://example.invalid/', { method: 'POST' });
      expect(spy).toHaveBeenCalledTimes(1);
      const init = spy.mock.calls[0][1] as {
        dispatcher?: unknown;
        method?: string;
      };
      // The dispatcher must be the resilient RetryAgent, not the default agent.
      expect(init.dispatcher).toBeInstanceOf(RetryAgent);
      // `{ ...init }` spreading must preserve the caller's original options.
      expect(init.method).toBe('POST');
    } finally {
      // Never let the global fetch stub leak into other tests.
      spy.mockRestore();
    }
  });
});