Compare commits

..

5 Commits

Author SHA1 Message Date
claude code agent 227
a14560c7c9 fix(ai-chat): raise undici's 300s stream timeout for long agent turns (#175)
Long research turns failed mid-task with "Lost connection to the AI provider".
Node's global fetch (undici) defaults BOTH headersTimeout and bodyTimeout to
300_000ms, and the chat provider + the external-MCP dispatcher both ran on it
with no override, so:
  - the z.ai chat stream dropped when a late step's huge accumulated context
    pushed the model's time-to-first-token past 5 min (the model reasons
    server-side with NO streamed reasoning, so the connection is silent until the
    first answer token — reproduced: even a trivial glm-5.2 query has a ~4-8s
    first-chunk gap; a long run reaches 400k+-token steps), or a reasoning model
    paused >5 min between chunks (bodyTimeout);
  - the crawl4ai SSE transport, held open across the whole turn, dropped when it
    idled >5 min between tool calls.

Fix: a dedicated undici dispatcher whose stream timeouts are raised to a
generous-but-FINITE silence timeout (default 15 min, AI_STREAM_TIMEOUT_MS) on
each path. NOT disabled (0): that would let a genuinely hung provider — with the
client still connected — hang forever, since the turn's abortSignal only fires on
client disconnect. The timeout bounds SILENCE (time-to-first-byte and the gap
BETWEEN chunks), NOT total turn duration, so an arbitrarily long turn that keeps
streaming is never cut; only a stream quiet for >15 min is treated as a hang.
  - ai-streaming-fetch.ts: createStreamingFetch() + streamTimeoutMs() /
    streamingDispatcherOptions() (the shared, configurable timeout).
  - ai.service: the chat provider fetch is createStreamingFetch(), wrapped by the
    existing passive ECONNRESET telemetry (createDiagnosticFetch gained an
    optional baseFetch) so the telemetry observes the SAME transport.
  - mcp-clients: the SSRF-pinned Agent uses streamingDispatcherOptions().

Investigation: reproduced the transport mechanism against the real z.ai endpoint
(a 1ms headersTimeout throws UND_ERR_HEADERS_TIMEOUT — the exact drop) and ran
the actual research agent to a ~428k-token context. Verified the fixed path
streams cleanly live (glm-5.2 turns finish; telemetry confirms the streaming
fetch is in use).

Tests: ai-streaming-fetch.spec (default 15m + env override + invalid fallback +
both-timeouts + streams a delayed response); ai-http-diagnostics + ai/mcp specs
green. server tsc clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 22:09:10 +03:00
claude_code
4cc8df836f chore(ai): passive z.ai provider HTTP telemetry (#175)
Investigate the intermittent (~20-30%) long-turn failure
"Lost connection to the AI provider" = AI_RetryError / read ECONNRESET
on the gitmost->z.ai link (browser-agnostic, mid-turn). Pure
instrumentation, no behavior change:

- ai-http-diagnostics.ts: a passive fetch wrapper injected into the
  OpenAI-compatible (z.ai) client. Per provider HTTP call it logs
  time-to-headers/status on success, and on a pre-response rejection the
  latency, error code/cause, request-body size and idle-gap since the
  previous call. The Response is returned untouched (streaming intact),
  errors rethrown unchanged; no retry/timeout/dispatcher.
- ai.service.ts: wire the instrumented fetch into the openai case only.

Lets us classify the reset as connection-phase vs mid-stream before
choosing a fix, without repeating the reverted RetryAgent (#140).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 21:24:05 +03:00
claude_code
04a418e1a6 Merge pull request 'fix(mcp): tool allowlist stored/read as jsonb string, not array (edit-page crash + allowlist not enforced)' (#172) from fix/mcp-tool-allowlist-jsonb-shape into develop 2026-06-24 17:14:56 +03:00
claude code agent 227
255bc06883 fix(mcp): tool allowlist stored/read as jsonb string, not array
Opening the edit form for an MCP server that has a saved tool allowlist crashed
the whole settings page (`TypeError: Ke.map is not a function` in Mantine) — and,
worse, the allowlist was silently NOT enforced. Both stem from one root cause:
the `tool_allowlist` jsonb column round-trips as a JSON STRING, not an array.

Root cause: `jsonbArray` bound `JSON.stringify(value)` (already a JSON string)
straight to a `::jsonb` cast. node-postgres infers the param type as jsonb and
JSON-stringifies it a SECOND time, so the column stored a jsonb STRING SCALAR
(`"[\"a\"]"`, jsonb_typeof = string) instead of an array. On read the driver
hands back the JS string `'["a"]'`. Then:
  - the edit form's TagsInput called `.map` on a string -> page crash;
  - mcp-clients did `Array.isArray(allow)` -> false for a string -> fell through
    to "no restriction" and exposed ALL of the server's tools.

Fix (both verified on the stand):
- Write: `jsonbArray` casts `::text::jsonb` so the param is bound as text (sent
  verbatim) and parsed into a real jsonb array. New rows now store
  jsonb_typeof=array.
- Read: `normalizeRow` runs every fetched row through `parseToolAllowlist`, which
  returns `string[] | null` for both shapes (already-array passes through; a JSON
  string is parsed; null/invalid -> null). This REPAIRS existing double-encoded
  rows on read, so the UI and the allowlist enforcement work without a data
  migration. Applied in findById / listByWorkspace / listEnabled.
- Client: defensive `Array.isArray(...) ? ... : []` guard in the form so a bad
  shape can never take the settings page down again.

Tests: ai-mcp-server.repo.spec (8 cases for parseToolAllowlist — array, the
JSON-string read, null, empty, non-array json, unparseable, non-string elements,
non-string primitive). mcp-servers-to-view + mcp-namespacing still green.
Verified live: an old double-encoded row now reads as an array; a newly created
server stores jsonb_typeof=array.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 17:11:50 +03:00
8c06553b49 Merge pull request 'test(footnotes): cover footnoteWarnings import plumbing + doc fixes (#169 second review)' (#171) from fix/footnote-review-1227-followup into develop
Reviewed-on: #171
2026-06-24 16:46:23 +03:00
8 changed files with 350 additions and 6 deletions

View File

@@ -56,7 +56,13 @@ function buildInitialValues(server?: IAiMcpServer): FormValues {
transport: server?.transport ?? "http",
url: server?.url ?? "",
authHeader: "",
toolAllowlist: server?.toolAllowlist ?? [],
// Defensive: TagsInput calls `.map`, so a non-array here (e.g. an API that
// returns the jsonb column as a JSON string) would crash the whole page. The
// server normalizes this now, but guard anyway so a bad shape can never take
// the settings UI down.
toolAllowlist: Array.isArray(server?.toolAllowlist)
? server.toolAllowlist
: [],
enabled: server?.enabled ?? true,
};
}

View File

@@ -6,6 +6,7 @@ import { createMCPClient } from '@ai-sdk/mcp';
import { Agent, type Dispatcher } from 'undici';
import { AiMcpServerRepo } from '@docmost/db/repos/ai-chat/ai-mcp-server.repo';
import { AiMcpServer } from '@docmost/db/types/entity.types';
import { streamingDispatcherOptions } from '../../../integrations/ai/ai-streaming-fetch';
import { SecretBoxService } from '../../../integrations/crypto/secret-box';
import { isUrlAllowed, isIpAllowed } from './ssrf-guard';
@@ -400,6 +401,16 @@ export function validateResolvedAddresses(
*/
function buildPinnedDispatcher(): Agent {
return new Agent({
// Raise undici's default 300s headers/body timeouts on external MCP traffic
// to the same generous-but-finite silence timeout the chat fetch uses (#175).
// A long agent turn keeps an SSE transport (e.g. crawl4ai's /mcp/sse) open
// across the whole turn; that connection can idle BETWEEN tool calls longer
// than 5 min, and undici's bodyTimeout would otherwise sever it mid-task — a
// tool-call failure that aborts the streamed turn and shows the user "Lost
// connection to the AI provider". A slow single tool call (a crawl) can
// likewise exceed headersTimeout. The timeout stays FINITE so a genuinely
// hung server is still broken eventually.
...streamingDispatcherOptions(),
connect: {
lookup: (hostname, _options, callback) => {
// Always resolve ALL addresses ourselves; do not trust the caller's

View File

@@ -0,0 +1,48 @@
import { parseToolAllowlist } from './ai-mcp-server.repo';
/**
* The `tool_allowlist` jsonb column historically round-trips as a JSON STRING
* (rows written by the old double-encoding `jsonbArray`), so the driver hands
* back `'["a","b"]'` instead of an array. `parseToolAllowlist` normalizes both
* shapes to the `string[] | null` the entity type promises — fixing the settings
* UI crash (TagsInput `.map` on a string) and the tool-allowlist enforcement
* (which did `Array.isArray(allow)` and silently allowed ALL tools for a string).
*/
describe('parseToolAllowlist', () => {
it('passes a real string array through unchanged', () => {
expect(parseToolAllowlist(['search', 'crawl'])).toEqual(['search', 'crawl']);
});
it('parses a JSON-string array (the double-encoded read) into an array', () => {
// This is exactly what the DB returns for an old row: a jsonb string scalar.
expect(parseToolAllowlist('["alpha","beta"]')).toEqual(['alpha', 'beta']);
});
it('returns null for null / undefined (unrestricted)', () => {
expect(parseToolAllowlist(null)).toBeNull();
expect(parseToolAllowlist(undefined)).toBeNull();
});
it('returns [] for an empty array (no items, but a present allowlist)', () => {
expect(parseToolAllowlist([])).toEqual([]);
});
it('returns null for a JSON string that is not an array', () => {
expect(parseToolAllowlist('"justastring"')).toBeNull();
expect(parseToolAllowlist('{"a":1}')).toBeNull();
});
it('returns null for an unparseable string', () => {
expect(parseToolAllowlist('not json at all')).toBeNull();
});
it('returns null when elements are not all strings (defensive)', () => {
expect(parseToolAllowlist([1, 2, 3] as unknown)).toBeNull();
expect(parseToolAllowlist('[1,2,3]')).toBeNull();
});
it('returns null for a non-string, non-array primitive', () => {
expect(parseToolAllowlist(42 as unknown)).toBeNull();
expect(parseToolAllowlist(true as unknown)).toBeNull();
});
});

View File

@@ -21,32 +21,35 @@ export class AiMcpServerRepo {
id: string,
workspaceId: string,
): Promise<AiMcpServer | undefined> {
return this.db
const row = await this.db
.selectFrom('aiMcpServers')
.selectAll('aiMcpServers')
.where('id', '=', id)
.where('workspaceId', '=', workspaceId)
.executeTakeFirst();
return row ? normalizeRow(row) : row;
}
async listByWorkspace(workspaceId: string): Promise<AiMcpServer[]> {
return this.db
const rows = await this.db
.selectFrom('aiMcpServers')
.selectAll('aiMcpServers')
.where('workspaceId', '=', workspaceId)
.orderBy('createdAt', 'asc')
.execute();
return rows.map(normalizeRow);
}
/** Enabled servers only — used by the agent loop to build the toolset. */
async listEnabled(workspaceId: string): Promise<AiMcpServer[]> {
return this.db
const rows = await this.db
.selectFrom('aiMcpServers')
.selectAll('aiMcpServers')
.where('workspaceId', '=', workspaceId)
.where('enabled', '=', true)
.orderBy('createdAt', 'asc')
.execute();
return rows.map(normalizeRow);
}
async insert(
@@ -130,6 +133,14 @@ export class AiMcpServerRepo {
* Encode a string[] as a jsonb bind for the `tool_allowlist` column. Passing a
* plain JS array to the postgres driver would serialize it as a Postgres array
* literal (incompatible with jsonb), so we bind the JSON text and cast it.
*
* The cast is `::text::jsonb`, NOT `::jsonb`: if the parameter is bound straight
* to a jsonb cast, node-postgres infers its type as jsonb and JSON-stringifies
* the (already-JSON) string a SECOND time, so the column ends up holding a jsonb
* STRING SCALAR (`"[\"a\"]"`) instead of a jsonb ARRAY. Forcing the param through
* `::text` first binds it as text (sent verbatim), and `::jsonb` then parses it
* into a real array. (`normalizeRow` below repairs rows written the old way.)
*
* Returns null for null/empty arrays (an empty allowlist means "no restriction"
* is not intended — callers pass null to clear; an empty array is normalized to
* null here so it never round-trips as `[]`).
@@ -139,5 +150,37 @@ function jsonbArray(value: string[] | null | undefined) {
return null;
}
// Typed as string[] so it is assignable to the toolAllowlist column.
return sql<string[]>`${JSON.stringify(value)}::jsonb`;
return sql<string[]>`${JSON.stringify(value)}::text::jsonb`;
}
/**
* Parse the `toolAllowlist` value read from the DB into the `string[] | null`
* the entity type promises. The jsonb column historically round-trips as a JSON
* STRING (rows written by the old double-encoding `jsonbArray`, see above), so
* the driver hands back a string like `'["a","b"]'` rather than an array. Be
* tolerant: an already-parsed array passes through; a JSON string is parsed; null
* / a non-array / unparseable value becomes null (unrestricted).
*/
export function parseToolAllowlist(value: unknown): string[] | null {
if (value == null) return null;
if (Array.isArray(value)) {
return value.every((v) => typeof v === 'string') ? (value as string[]) : null;
}
if (typeof value === 'string') {
try {
const parsed = JSON.parse(value);
return Array.isArray(parsed) &&
parsed.every((v) => typeof v === 'string')
? (parsed as string[])
: null;
} catch {
return null;
}
}
return null;
}
/** Normalize a DB row so `toolAllowlist` is always `string[] | null`. */
function normalizeRow(row: AiMcpServer): AiMcpServer {
return { ...row, toolAllowlist: parseToolAllowlist(row.toolAllowlist) };
}

View File

@@ -0,0 +1,81 @@
import { Logger } from '@nestjs/common';
/**
* DIAGNOSTIC (provider ECONNRESET investigation) — temporary.
*
* A PASSIVE, behavior-neutral wrapper around the global `fetch`, injected into
* the OpenAI-compatible provider client (`createOpenAI({ fetch })`, the z.ai
* path). Per provider HTTP call it logs: time-to-response-headers + status +
* request-body size on success; and on a pre-response rejection the failure
* latency + error code/cause + request-body size + the idle gap since the
* previous provider call. It NEVER retries, times out, swaps the dispatcher, or
* reads/clones the response body — the Response is returned untouched (streaming
* unaffected) and any error is rethrown unchanged.
*
* How to read the result (a long agentic turn makes one provider call per step):
* - a failed turn whose last provider line is "PRE-RESPONSE FAILED ... ECONNRESET"
* => the reset is in the CONNECTION phase of a step's request (the provider
* never replied) — usually a poisoned keep-alive socket or the provider/middle
* box resetting that request (large body / idle gap are the suspects, hence
* reqBytes + idleSincePrevCall below).
* - the last line is "OK status=200" and the turn still errors with NO
* "PRE-RESPONSE FAILED" => the cut happened MID-STREAM (after headers), a
* different failure mode.
*
* The seq/last-call timestamps are module-level, so under concurrent turns the
* idle-gap figure is approximate (fine for single-user reproduction).
*/
export function createDiagnosticFetch(
context: string,
// The underlying fetch to instrument. Defaults to the global fetch; the chat
// provider passes a streaming fetch (disabled undici stream timeouts, #175) so
// the telemetry observes the SAME transport the long agent turn actually uses.
baseFetch: typeof fetch = fetch,
): typeof fetch {
const logger = new Logger(context);
let callSeq = 0;
let lastCallStartedAt: number | undefined;
return async (input: Parameters<typeof fetch>[0], init?: Parameters<typeof fetch>[1]): Promise<Response> => {
const callId = ++callSeq;
const startedAt = Date.now();
const idleSincePrev =
lastCallStartedAt === undefined ? undefined : startedAt - lastCallStartedAt;
lastCallStartedAt = startedAt;
// Request body size: the chat payload is a JSON string. Used to test whether
// failures correlate with the large accumulated context on later agent steps.
const body = init?.body as unknown;
const bodyBytes =
typeof body === 'string'
? body.length
: body instanceof Uint8Array
? body.byteLength
: undefined;
try {
// Delegate to the base fetch; return the Response UNTOUCHED (never read/
// clone the body) so the streamed SSE response is unaffected.
const res = await baseFetch(input, init);
logger.log(
`provider HTTP DIAGNOSTIC: call#${callId} OK ` +
`headersAfter=${Date.now() - startedAt}ms status=${res.status} ` +
`reqBytes=${bodyBytes ?? 'n/a'} idleSincePrevCall=${idleSincePrev ?? 'n/a'}ms`,
);
return res;
} catch (err) {
// fetch() rejected => PRE-RESPONSE failure (no headers/body received yet):
// the connection/request phase. Log it and rethrow the SAME error.
const e = err as {
name?: string;
message?: string;
cause?: { code?: string; message?: string };
};
logger.warn(
`provider HTTP DIAGNOSTIC: call#${callId} PRE-RESPONSE FAILED ` +
`after=${Date.now() - startedAt}ms code=${e?.cause?.code ?? 'none'} ` +
`name=${e?.name ?? 'Error'} cause=${e?.cause?.message ?? e?.message ?? 'unknown'} ` +
`reqBytes=${bodyBytes ?? 'n/a'} idleSincePrevCall=${idleSincePrev ?? 'n/a'}ms`,
);
throw err;
}
};
}

View File

@@ -0,0 +1,78 @@
import * as http from 'node:http';
import {
createStreamingFetch,
streamTimeoutMs,
streamingDispatcherOptions,
} from './ai-streaming-fetch';
/**
* #175: undici's default 300s headers/body timeouts severed long agent turns.
* The streaming fetch raises them to a generous-but-FINITE silence timeout (not
* 0 — a true hang must still break). We pin: the configured value + env override,
* that both dispatcher timeouts use it, and that a delayed response streams.
*/
describe('streamTimeoutMs', () => {
const ORIG = process.env.AI_STREAM_TIMEOUT_MS;
afterEach(() => {
if (ORIG === undefined) delete process.env.AI_STREAM_TIMEOUT_MS;
else process.env.AI_STREAM_TIMEOUT_MS = ORIG;
});
it('defaults to a generous-but-finite 15 minutes', () => {
delete process.env.AI_STREAM_TIMEOUT_MS;
expect(streamTimeoutMs()).toBe(900_000);
// Finite — NOT disabled (0 would let a hung provider leak forever).
expect(streamTimeoutMs()).toBeGreaterThan(0);
expect(Number.isFinite(streamTimeoutMs())).toBe(true);
});
it('honours a positive AI_STREAM_TIMEOUT_MS override', () => {
process.env.AI_STREAM_TIMEOUT_MS = '120000';
expect(streamTimeoutMs()).toBe(120000);
});
it('ignores an invalid / non-positive override (falls back to default)', () => {
for (const bad of ['0', '-5', 'abc', '']) {
process.env.AI_STREAM_TIMEOUT_MS = bad;
expect(streamTimeoutMs()).toBe(900_000);
}
});
it('applies the timeout to BOTH undici stream timeouts', () => {
delete process.env.AI_STREAM_TIMEOUT_MS;
expect(streamingDispatcherOptions()).toEqual({
headersTimeout: 900_000,
bodyTimeout: 900_000,
});
});
});
describe('createStreamingFetch — against a delayed server', () => {
let server: http.Server;
let url: string;
// The server waits before sending ANY byte (a long time-to-first-token).
const DELAY = 400;
beforeAll(async () => {
server = http.createServer((_req, res) => {
setTimeout(() => {
res.writeHead(200, { 'Content-Type': 'text/plain' });
res.end('ok');
}, DELAY);
});
await new Promise<void>((resolve) => server.listen(0, '127.0.0.1', resolve));
const addr = server.address() as import('node:net').AddressInfo;
url = `http://127.0.0.1:${addr.port}/`;
});
afterAll(async () => {
await new Promise<void>((resolve) => server.close(() => resolve()));
});
it('streams the delayed response instead of timing out', async () => {
const streamingFetch = createStreamingFetch();
const res = await streamingFetch(url);
expect(res.status).toBe(200);
expect(await res.text()).toBe('ok');
});
});

View File

@@ -0,0 +1,58 @@
import { Agent } from 'undici';
/**
* Default SILENCE timeout for streaming AI calls (15 min). Generous, but FINITE.
*
* Node's global fetch (undici) defaults headersTimeout and bodyTimeout to
* 300_000ms, which severed legitimate long agent turns mid-stream — surfacing as
* "Lost connection to the AI provider" (#175): a late step with a huge context
* pushes the model's time-to-first-token past 5 min, or a reasoning model pauses
* >5 min between chunks. We do NOT disable the timeout (0) — that would let a
* genuinely hung provider, with the client still connected, hang forever
* (abortSignal only fires on client disconnect). Instead we raise it well above
* any realistic gap while keeping it finite so a true hang is eventually broken.
*
* This bounds SILENCE (time-to-first-byte and the gap BETWEEN chunks), NOT total
* turn duration — so an arbitrarily long turn that keeps streaming bytes is never
* cut; only a stream that goes quiet for longer than this is treated as a hang.
*/
const DEFAULT_STREAM_TIMEOUT_MS = 900_000;
/**
* The configured silence timeout (ms). Override with `AI_STREAM_TIMEOUT_MS`; a
* missing/invalid/non-positive value falls back to {@link DEFAULT_STREAM_TIMEOUT_MS}.
*/
export function streamTimeoutMs(): number {
const raw = Number(process.env.AI_STREAM_TIMEOUT_MS);
return Number.isFinite(raw) && raw > 0 ? raw : DEFAULT_STREAM_TIMEOUT_MS;
}
/**
* undici `Agent` timeout options for streaming AI traffic — both stream timeouts
* set to the (generous, finite) silence timeout. Shared by the chat provider
* fetch and the external-MCP dispatcher so they behave identically (#175).
*/
export function streamingDispatcherOptions(): {
headersTimeout: number;
bodyTimeout: number;
} {
const t = streamTimeoutMs();
return { headersTimeout: t, bodyTimeout: t };
}
/**
* Build a `fetch` for long-lived streaming AI calls (the agent chat turn) backed
* by a dedicated undici dispatcher whose stream timeouts are the generous-but-
* finite silence timeout above (#175). A single shared dispatcher is returned
* (callers hold it for the service lifetime) so its connection pool is reused.
*/
export function createStreamingFetch(): typeof fetch {
const dispatcher = new Agent(streamingDispatcherOptions());
return ((input: Parameters<typeof fetch>[0], init?: RequestInit) =>
fetch(input, {
...(init ?? {}),
// `dispatcher` is an undici-specific init field (not in the DOM RequestInit
// type); Node's global fetch reads it. Cast to satisfy the type.
dispatcher,
} as RequestInit & { dispatcher: Agent })) as typeof fetch;
}

View File

@@ -14,6 +14,9 @@ import { AiNotConfiguredException } from './ai-not-configured.exception';
import { AiEmbeddingNotConfiguredException } from './ai-embedding-not-configured.exception';
import { AiSttNotConfiguredException } from './ai-stt-not-configured.exception';
import { describeProviderError } from './ai-error.util';
// DIAGNOSTIC (provider ECONNRESET investigation) — temporary.
import { createDiagnosticFetch } from './ai-http-diagnostics';
import { createStreamingFetch } from './ai-streaming-fetch';
import { AiProviderCredentialsRepo } from '@docmost/db/repos/ai-chat/ai-provider-credentials.repo';
import { SecretBoxService } from '../crypto/secret-box';
import { AiDriver } from './ai.types';
@@ -43,6 +46,16 @@ export interface ChatModelOverride {
export class AiService {
private readonly logger = new Logger(AiService.name);
// Provider HTTP fetch for the chat path: a streaming fetch that DISABLES
// undici's 300s headers/body timeouts (#175 — long agent turns were severed
// mid-stream), wrapped with passive ECONNRESET-investigation telemetry so the
// logs observe the exact transport the turn uses. Held for the service
// lifetime to reuse the streaming dispatcher's connection pool.
private readonly aiDiagnosticFetch = createDiagnosticFetch(
'AiService:provider-http',
createStreamingFetch(),
);
constructor(
private readonly aiSettings: AiSettingsService,
private readonly aiProviderCredentialsRepo: AiProviderCredentialsRepo,
@@ -140,7 +153,13 @@ export class AiService {
// Responses API (/responses), which OpenAI-compatible gateways
// (OpenRouter, etc.) reject on multi-turn requests (history with
// assistant messages) → 400.
return createOpenAI({ apiKey, baseURL: baseUrl }).chat(chatModel);
// DIAGNOSTIC (provider ECONNRESET investigation) — temporary: pass the
// passive instrumented fetch (logging only; no behavior change).
return createOpenAI({
apiKey,
baseURL: baseUrl,
fetch: this.aiDiagnosticFetch,
}).chat(chatModel);
case 'gemini':
return createGoogleGenerativeAI({ apiKey })(chatModel);
case 'ollama':