refactor(ai): retry outside instrumentation + retry-exhaustion test (#179 review)
- Invert the transport layers so the pre-response retry is OUTERMOST and the
provider-HTTP instrumentation is INNER. Before, the retry lived inside
createStreamingFetch (under the instrumentation), so a reset the retry
recovered from logged only a clean "OK status=200" — the
"PRE-RESPONSE FAILED ... ECONNRESET ... idleSincePrevCall" signal went blind
exactly when the fix works, and AI_STREAM_KEEPALIVE_MS couldn't be tuned from
prod data. Now createStreamingFetch is the dispatcher-bound BASE (no retry) and
a new withPreResponseRetry() wraps it; ai.service composes
withPreResponseRetry(createInstrumentedFetch('AiService:provider-http',
createStreamingFetch())), so every attempt — including recovered resets — flows
through the instrumentation. (Also expresses the keepAlive-config vs retry-
behavior boundary structurally, per review #3.)
- Add the retry-exhaustion test: a server that resets EVERY connection, asserting
the call rejects with a retryable connection error AND exactly
PRE_RESPONSE_CONNECT_RETRIES + 1 (= 3) requests reached the server — pinning the
bound and that the final error propagates (guards an off-by-one / infinite loop
/ swallowed error). Existing happy-retry + abort tests moved onto
withPreResponseRetry.
Verified on the stand: a normal turn still streams (reasoning + finish) and the
provider-HTTP telemetry still logs. server tsc + ai/mcp specs green (30).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -16,7 +16,10 @@ import { AiEmbeddingNotConfiguredException } from './ai-embedding-not-configured
|
||||
import { AiSttNotConfiguredException } from './ai-stt-not-configured.exception';
|
||||
import { describeProviderError } from './ai-error.util';
|
||||
import { createInstrumentedFetch } from './ai-provider-http';
|
||||
import { createStreamingFetch } from './ai-streaming-fetch';
|
||||
import {
|
||||
createStreamingFetch,
|
||||
withPreResponseRetry,
|
||||
} from './ai-streaming-fetch';
|
||||
import { AiProviderCredentialsRepo } from '@docmost/db/repos/ai-chat/ai-provider-credentials.repo';
|
||||
import { SecretBoxService } from '../crypto/secret-box';
|
||||
import { AiDriver } from './ai.types';
|
||||
@@ -46,14 +49,15 @@ export interface ChatModelOverride {
|
||||
export class AiService {
|
||||
private readonly logger = new Logger(AiService.name);
|
||||
|
||||
// Provider HTTP fetch for the chat path: the streaming fetch — which RAISES
|
||||
// undici's 300s headers/body timeouts to a generous-but-finite silence timeout
|
||||
// so a long agent turn is not severed mid-stream (#175) — wrapped with the
|
||||
// provider-HTTP instrumentation so the logs observe that exact transport. Held
|
||||
// for the service lifetime to reuse the streaming dispatcher's connection pool.
|
||||
private readonly aiProviderFetch = createInstrumentedFetch(
|
||||
'AiService:provider-http',
|
||||
createStreamingFetch(),
|
||||
// Provider HTTP fetch for the chat path, layered so each transport concern is
|
||||
// observed (#175). Inside-out: the streaming fetch (finite silence timeouts +
|
||||
// keep-alive recycling) → provider-HTTP instrumentation (logs every attempt) →
|
||||
// pre-response connection-reset retry as the OUTERMOST layer. Retry-outer means
|
||||
// a reset the retry recovers from is still logged with its idle-gap, instead of
|
||||
// collapsing into a clean "OK". Held for the service lifetime to reuse the
|
||||
// streaming dispatcher's connection pool.
|
||||
private readonly aiProviderFetch = withPreResponseRetry(
|
||||
createInstrumentedFetch('AiService:provider-http', createStreamingFetch()),
|
||||
);
|
||||
|
||||
constructor(
|
||||
|
||||
Reference in New Issue
Block a user