fix(ai-chat): reconnect on provider ECONNRESET via a resilient fetch

Outbound LLM calls used Node's default global undici agent (default keep-alive pooling, no transport-level reconnect), so a TCP RST on a reused/poisoned keep-alive socket surfaced as "Cannot connect to API: read ECONNRESET" and failed the chat stream and title generation after the AI SDK's own retries were exhausted.

Add a dedicated resilient outbound HTTP layer (ai-http.ts): a shared undici RetryAgent over a tuned Agent, exposed as `aiFetch` and injected into every AI provider factory (createOpenAI chat/embeddings/STT, createGoogleGenerativeAI, createOllama) plus the raw JSON STT fetch. The RetryAgent reconnects on connection-level errors (ECONNRESET, ...) on a FRESH socket, opts POST into the retry methods (undici's default list excludes POST), and leaves HTTP-status retries (429/5xx + Retry-After) to the AI SDK to avoid double-retry.

- ai-http.ts: shared RetryAgent(Agent) + aiFetch (maxRetries 2, conservative keep-alive, connect timeout, streaming-safe timeouts)
- ai.service.ts: inject fetch: aiFetch into every provider factory
- ai-http.spec.ts: regression test that aiFetch injects the RetryAgent dispatcher into the underlying fetch

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
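
The retry split described above (retry connection-level failures, leave HTTP statuses alone) can be illustrated with a small dependency-free sketch. This is not the code in ai-http.ts, which delegates the behaviour to undici's RetryAgent; `withConnectionRetry`, `isConnectionLevelError`, and `errorCode` are hypothetical names, and the retryable set contains only ECONNRESET because the rest of the list is elided in the message above.

```typescript
// Hypothetical illustration only: the real layer relies on undici's RetryAgent.

// Only ECONNRESET is named in the commit message; the remaining codes are elided there.
const RETRYABLE_ERROR_CODES = new Set<string>(['ECONNRESET']);

// Mirrors "maxRetries 2" from the commit message.
const MAX_RETRIES = 2;

// Node's fetch reports socket failures as TypeError('fetch failed') with the
// underlying error in `cause`, so look for a code on both levels.
function errorCode(err: unknown): string | undefined {
  const e = err as { code?: unknown; cause?: { code?: unknown } } | null;
  const code = e?.code ?? e?.cause?.code;
  return typeof code === 'string' ? code : undefined;
}

function isConnectionLevelError(err: unknown): boolean {
  const code = errorCode(err);
  return code !== undefined && RETRYABLE_ERROR_CODES.has(code);
}

// Runs `attempt` once plus up to `maxRetries` more times, retrying only when the
// failure is a connection-level error. A resolved response (including 429/5xx)
// is returned untouched, so status-based retries stay with the caller.
async function withConnectionRetry<T>(
  attempt: () => Promise<T>,
  maxRetries: number = MAX_RETRIES,
): Promise<T> {
  let lastError: unknown;
  for (let tries = 0; tries <= maxRetries; tries++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
      if (!isConnectionLevelError(err)) break; // not a transport failure: give up now
    }
  }
  throw lastError;
}
```

In this sketch a 429 response never reaches the `catch` branch, which is the property that avoids double-retry: only thrown transport errors are retried here, and anything carrying an HTTP status is handed back for the AI SDK to handle.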
47
apps/server/src/integrations/ai/ai-http.spec.ts
Normal file
@@ -0,0 +1,47 @@
import { RetryAgent } from 'undici';

import { aiFetch } from './ai-http';

/**
 * Light, dependency-free unit checks for the shared AI HTTP layer. The module
 * constructs its undici dispatcher eagerly at import time, so importing it here
 * already exercises that construction; we make NO real network calls.
 */
describe('ai-http', () => {
  it('exports aiFetch as a function', () => {
    expect(typeof aiFetch).toBe('function');
  });

  it('constructs the dispatcher eagerly without throwing at import time', () => {
    // Reaching this assertion means the top-level Agent/RetryAgent construction
    // in ai-http.ts did not throw when the module was imported above.
    expect(aiFetch).toBeDefined();
  });

  it('forwards the resilient RetryAgent dispatcher into the underlying fetch', async () => {
    // CRITICAL regression guard: aiFetch must inject the shared undici dispatcher
    // into the real fetch call, otherwise AI traffic silently falls back to the
    // default global agent and the ECONNRESET production bug returns. aiFetch
    // resolves `fetch` at call time, so spying on globalThis.fetch intercepts it
    // and prevents any real network call.
    const spy = jest
      .spyOn(globalThis, 'fetch')
      .mockResolvedValue(new Response(null));
    try {
      await aiFetch('https://example.invalid/', { method: 'POST' });

      expect(spy).toHaveBeenCalledTimes(1);
      const init = spy.mock.calls[0][1] as {
        dispatcher?: unknown;
        method?: string;
      };
      // The dispatcher must be the resilient RetryAgent, not the default agent.
      expect(init.dispatcher).toBeInstanceOf(RetryAgent);
      // `{ ...init }` spreading must preserve the caller's original options.
      expect(init.method).toBe('POST');
    } finally {
      // Never let the global fetch stub leak into other tests.
      spy.mockRestore();
    }
  });
});
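
The regression test above depends on two properties of the wrapper: it reads `fetch` from `globalThis` on every call, and it spreads the caller's options before adding the dispatcher. A minimal sketch of a wrapper with that shape follows. It is an assumption about the structure, not the contents of ai-http.ts; `StandInRetryAgent`, `sharedDispatcher`, `sketchFetch`, and the `Init`/`FetchLike` types are hypothetical names, and the stand-in class takes the place of undici's RetryAgent so the sketch has no dependencies.

```typescript
// Minimal structural types so the sketch compiles without DOM or Node typings.
type Init = Record<string, unknown>;
type FetchLike = (input: string, init?: Init) => Promise<unknown>;
const globals = globalThis as unknown as { fetch: FetchLike };

// Hypothetical stand-in for the shared undici RetryAgent built at import time.
class StandInRetryAgent {}
const sharedDispatcher = new StandInRetryAgent();

// Hypothetical equivalent of aiFetch: the caller's options plus the dispatcher.
const sketchFetch: FetchLike = (input, init) =>
  // `globals.fetch` is looked up on each call, so a stub or spy installed
  // after this module was loaded is still the function that runs.
  globals.fetch(input, { ...init, dispatcher: sharedDispatcher });
```

Had the wrapper captured the function once at load time (for example `const realFetch = globalThis.fetch`), a later `jest.spyOn(globalThis, 'fetch')` would replace the property but not the captured reference, and the test would make a real network call instead of hitting the stub.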