Files
gitmost/apps/server
claude code agent 227 b0faa2fe32 fix(ai-chat): recycle keep-alive sockets + retry pre-response resets (#175)
The real cause of the long-task "Lost connection to the AI provider" — the
earlier 300s-timeout fix (#176) was the wrong layer. The provider-HTTP telemetry
on the user's deploy shows the failures are PRE-RESPONSE `read ECONNRESET` ~500ms
in (not a 300s/15min timeout), correlated with idleSincePrevCall ~42s and large
bodies; and crucially a retry of the SAME request often succeeds. A direct probe
to the real z.ai endpoint does NOT reset (113KB bodies and a 45s-idle keep-alive
reuse both succeed), and another agent (opencode) runs fine from the same infra —
so the provider is healthy and the egress network is usable. The difference is
the transport: undici's keep-alive pool REUSES a socket that the deployment's
egress (NAT / firewall / conntrack) silently dropped during a long idle gap, so
the next request resets pre-response.

Fix (brings gitmost in line with clients that don't reuse stale sockets):
- Keep-alive recycling: the streaming dispatcher (chat fetch AND the external-MCP
  dispatcher, via the shared streamingDispatcherOptions) now sets
  keepAliveTimeout + keepAliveMaxTimeout to a 10s recycle window
  (AI_STREAM_KEEPALIVE_MS), so a connection idle longer than that is closed
  instead of reused — a long-gap step opens a fresh connection. keepAliveMaxTimeout
  also caps a server-advertised keep-alive so the provider can't widen the window.
- Pre-response connection retry: createStreamingFetch retries a connection-level
  reset (ECONNRESET / UND_ERR_SOCKET / ECONNREFUSED / EPIPE / *_TIMEOUT) on a
  fresh connection up to 2 times. This is SAFE because fetch() only rejects before
  the Response resolves — a started stream is never replayed; an abort (client
  disconnect) is never retried.

Tests: ai-streaming-fetch.spec — keep-alive options, streamKeepAliveMs env,
isRetryableConnectError, and a server that resets the first connection so the
retry must land on a fresh one (+ aborted requests are not retried). Verified on
the stand that a normal turn still streams (reasoning + text + finish) through the
new transport. server tsc + ai/mcp specs green.

Note: root cause is the deployment's egress dropping idle connections (Traefik is
inbound-only); this makes the app resilient to it. AI_STREAM_KEEPALIVE_MS can be
lowered if the egress drops faster than ~10s.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 23:51:17 +03:00
..
2024-06-07 17:29:34 +01:00
2024-06-07 17:29:34 +01:00
2024-01-09 18:58:26 +01:00
2024-12-09 14:51:31 +00:00
2024-01-09 18:58:26 +01:00
2024-01-09 18:58:26 +01:00
2024-01-09 18:58:26 +01:00
2025-03-06 13:38:37 +00:00

Nest Logo

A progressive Node.js framework for building efficient and scalable server-side applications.

NPM Version Package License NPM Downloads CircleCI Coverage Discord Backers on Open Collective Sponsors on Open Collective Support us

Description

Nest framework TypeScript starter repository.

Installation

$ npm install

Running the app

# development
$ npm run start

# watch mode
$ npm run start:dev

# production mode
$ npm run start:prod

Migrations

# This creates a new empty migration file named 'init'
$ npm run migration:create --name=init

# Generates 'init' migration file from existing entities to update the database schema
$ npm run migration:generate --name=init

# Runs all pending migrations to update the database schema
$ npm run migration:run

# Reverts the last executed migration
$ npm run migration:revert

# Reverts all migrations
$ npm run migration:revert

# Shows the list of executed and pending migrations
$ npm run migration:show



## Test

```bash
# unit tests
$ npm run test

# e2e tests
$ npm run test:e2e

# test coverage
$ npm run test:cov

Support

Nest is an MIT-licensed open source project. It can grow thanks to the sponsors and support by the amazing backers. If you'd like to join them, please read more here.

Stay in touch

License

Nest is MIT licensed.