refactor(ai): retry outside instrumentation + retry-exhaustion test (#179 review)

- Invert the transport layers so the pre-response retry is OUTERMOST and the
  provider-HTTP instrumentation is INNER. Before, the retry lived inside
  createStreamingFetch (under the instrumentation), so a reset the retry
  recovered from logged only a clean "OK status=200" — the
  "PRE-RESPONSE FAILED ... ECONNRESET ... idleSincePrevCall" signal went blind
  exactly when the fix works, and AI_STREAM_KEEPALIVE_MS couldn't be tuned from
  prod data. Now createStreamingFetch is the dispatcher-bound BASE (no retry) and
  a new withPreResponseRetry() wraps it; ai.service composes
  withPreResponseRetry(createInstrumentedFetch('AiService:provider-http',
  createStreamingFetch())), so every attempt — including recovered resets — flows
  through the instrumentation. (Also expresses the keepAlive-config vs retry-
  behavior boundary structurally, per review #3.)
- Add the retry-exhaustion test: a server that resets EVERY connection, asserting
  the call rejects with a retryable connection error AND exactly
  PRE_RESPONSE_CONNECT_RETRIES + 1 (= 3) requests reached the server — pinning the
  bound and that the final error propagates (guards an off-by-one / infinite loop
  / swallowed error). Existing happy-retry + abort tests moved onto
  withPreResponseRetry.

Verified on the stand: a normal turn still streams (reasoning + finish) and the
provider-HTTP telemetry still logs. server tsc + ai/mcp specs green (30).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
claude code agent 227
2026-06-25 00:10:40 +03:00
parent b0faa2fe32
commit c065e26d14
3 changed files with 80 additions and 29 deletions

View File

@@ -16,7 +16,10 @@ import { AiEmbeddingNotConfiguredException } from './ai-embedding-not-configured
import { AiSttNotConfiguredException } from './ai-stt-not-configured.exception';
import { describeProviderError } from './ai-error.util';
import { createInstrumentedFetch } from './ai-provider-http';
import { createStreamingFetch } from './ai-streaming-fetch';
import {
createStreamingFetch,
withPreResponseRetry,
} from './ai-streaming-fetch';
import { AiProviderCredentialsRepo } from '@docmost/db/repos/ai-chat/ai-provider-credentials.repo';
import { SecretBoxService } from '../crypto/secret-box';
import { AiDriver } from './ai.types';
@@ -46,14 +49,15 @@ export interface ChatModelOverride {
export class AiService {
private readonly logger = new Logger(AiService.name);
// Provider HTTP fetch for the chat path: the streaming fetch — which RAISES
// undici's 300s headers/body timeouts to a generous-but-finite silence timeout
// so a long agent turn is not severed mid-stream (#175) — wrapped with the
// provider-HTTP instrumentation so the logs observe that exact transport. Held
// for the service lifetime to reuse the streaming dispatcher's connection pool.
private readonly aiProviderFetch = createInstrumentedFetch(
'AiService:provider-http',
createStreamingFetch(),
// Provider HTTP fetch for the chat path, layered so each transport concern is
// observed (#175). Inside-out: the streaming fetch (finite silence timeouts +
// keep-alive recycling) → provider-HTTP instrumentation (logs every attempt) →
// pre-response connection-reset retry as the OUTERMOST layer. Retry-outer means
// a reset the retry recovers from is still logged with its idle-gap, instead of
// collapsing into a clean "OK". Held for the service lifetime to reuse the
// streaming dispatcher's connection pool.
private readonly aiProviderFetch = withPreResponseRetry(
createInstrumentedFetch('AiService:provider-http', createStreamingFetch()),
);
constructor(