The real cause of the long-task "Lost connection to the AI provider" — the earlier 300s-timeout fix (#176) was the wrong layer. The provider-HTTP telemetry on the user's deploy shows the failures are PRE-RESPONSE `read ECONNRESET` ~500ms in (not a 300s/15min timeout), correlated with idleSincePrevCall ~42s and large bodies; and crucially a retry of the SAME request often succeeds. A direct probe to the real z.ai endpoint does NOT reset (113KB bodies and a 45s-idle keep-alive reuse both succeed), and another agent (opencode) runs fine from the same infra — so the provider is healthy and the egress network is usable. The difference is the transport: undici's keep-alive pool REUSES a socket that the deployment's egress (NAT / firewall / conntrack) silently dropped during a long idle gap, so the next request resets pre-response. Fix (brings gitmost in line with clients that don't reuse stale sockets): - Keep-alive recycling: the streaming dispatcher (chat fetch AND the external-MCP dispatcher, via the shared streamingDispatcherOptions) now sets keepAliveTimeout + keepAliveMaxTimeout to a 10s recycle window (AI_STREAM_KEEPALIVE_MS), so a connection idle longer than that is closed instead of reused — a long-gap step opens a fresh connection. keepAliveMaxTimeout also caps a server-advertised keep-alive so the provider can't widen the window. - Pre-response connection retry: createStreamingFetch retries a connection-level reset (ECONNRESET / UND_ERR_SOCKET / ECONNREFUSED / EPIPE / *_TIMEOUT) on a fresh connection up to 2 times. This is SAFE because fetch() only rejects before the Response resolves — a started stream is never replayed; an abort (client disconnect) is never retried. Tests: ai-streaming-fetch.spec — keep-alive options, streamKeepAliveMs env, isRetryableConnectError, and a server that resets the first connection so the retry must land on a fresh one (+ aborted requests are not retried). Verified on the stand that a normal turn still streams (reasoning + text + finish) through the new transport. server tsc + ai/mcp specs green. Note: root cause is the deployment's egress dropping idle connections (Traefik is inbound-only); this makes the app resilient to it. AI_STREAM_KEEPALIVE_MS can be lowered if the egress drops faster than ~10s. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A progressive Node.js framework for building efficient and scalable server-side applications.
Description
Nest framework TypeScript starter repository.
Installation
$ npm install
Running the app
# development
$ npm run start
# watch mode
$ npm run start:dev
# production mode
$ npm run start:prod
Migrations
# This creates a new empty migration file named 'init'
$ npm run migration:create --name=init
# Generates 'init' migration file from existing entities to update the database schema
$ npm run migration:generate --name=init
# Runs all pending migrations to update the database schema
$ npm run migration:run
# Reverts the last executed migration
$ npm run migration:revert
# Reverts all migrations
$ npm run migration:revert
# Shows the list of executed and pending migrations
$ npm run migration:show
## Test
```bash
# unit tests
$ npm run test
# e2e tests
$ npm run test:e2e
# test coverage
$ npm run test:cov
Support
Nest is an MIT-licensed open source project. It can grow thanks to the sponsors and support by the amazing backers. If you'd like to join them, please read more here.
Stay in touch
- Author - Kamil Myśliwiec
- Website - https://nestjs.com
- Twitter - @nestframework
License
Nest is MIT licensed.