Author SHA1 Message Date
agent_vscode 9b4b38a611 fix(ai): patch ai@6.0.134 — drop O(n²) partialOutput accumulation causing heap OOM on long agent runs (#184)
Production OOM'd (JS heap 1.85 GB / 2 GB limit) during a ~20-step,
~28k-chunk autonomous agent turn. Heap snapshot analysis (memlab) showed a
single DefaultStreamTextResult retaining ~1.7 GB via the never-consumed
leftover tee() branch of its internal baseStream.

Root cause in ai@6.0.134: streamText substitutes the default text() output
strategy even when the caller passes NO `output` option. Its
createOutputTransformStream then accumulates the ENTIRE turn text and, on
EVERY text-delta, enqueues `{ part, partialOutput }` where partialOutput is
a flat snapshot of all text so far (JSON.stringify flattens the
cons-string) — O(n²) memory across the turn. Every consumer accessor tees
baseStream and keeps the second branch as the new baseStream; the final
leftover branch is never read, so its controller queue holds every chunk
(28,225 x ~164 KB in the OOM'd run) for the life of the turn.
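
To make the O(n²) claim concrete, here is a minimal sketch that models how many characters stay queued with and without cumulative snapshots. The function name `retainedChars` and the workload (1,000 deltas of 6 characters) are illustrative assumptions, not SDK code and not figures from the production run.

```typescript
// Hypothetical model, not SDK code: every text-delta carries `deltaChars`
// characters, and nothing queued on the unread tee branch is ever released.
function retainedChars(deltas: number, deltaChars: number) {
  let textSoFar = 0; // length of the accumulated turn text
  let cumulative = 0; // unpatched: one full snapshot is queued per delta
  let perDelta = 0; // patched: only the delta itself is queued
  for (let i = 0; i < deltas; i++) {
    textSoFar += deltaChars;
    cumulative += textSoFar;
    perDelta += deltaChars;
  }
  return { cumulative, perDelta };
}

// Assumed workload: 1,000 deltas of 6 characters each.
const model = retainedChars(1000, 6);
console.log(model.cumulative); // 3003000 = 6 * 1000 * 1001 / 2
console.log(model.perDelta); // 6000
```

In this model, doubling the number of deltas roughly quadruples the cumulative figure, while the per-delta figure only doubles.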

Fix (pnpm patch on both dist/index.js and dist/index.mjs):
- pass the raw, possibly-undefined `output` option into
  createOutputTransformStream instead of defaulting to text()
- when output == null, publish each text-delta immediately without
  accumulating turn text or producing partialOutput snapshots; streaming
  granularity is unchanged, and callers that DO request an output strategy
  keep the original behavior
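
The control flow of the fix can be modelled as follows. This is a simplified, synchronous sketch over arrays rather than the SDK's async transform stream; `transformParts`, `OutputStrategy`, and `textStrategy` are hypothetical stand-ins, and only the `output == null` early exit mirrors the patch.

```typescript
type TextDelta = { type: 'text-delta'; text: string };
type Emitted = { part: TextDelta; partialOutput: string | undefined };

// Hypothetical stand-in for an SDK output strategy.
interface OutputStrategy {
  parsePartialOutput(args: { text: string }): { partial: string } | undefined;
}

function transformParts(parts: TextDelta[], output?: OutputStrategy): Emitted[] {
  const emitted: Emitted[] = [];
  let text = ''; // accumulated turn text, only needed when a strategy exists
  for (const part of parts) {
    if (output == null) {
      // Patched path: forward the delta and build no snapshot.
      emitted.push({ part, partialOutput: undefined });
      continue;
    }
    text += part.text;
    const parsed = output.parsePartialOutput({ text });
    emitted.push({ part, partialOutput: parsed ? parsed.partial : undefined });
  }
  return emitted;
}

// Stand-in for the default text() strategy: the partial is all text so far.
const textStrategy: OutputStrategy = {
  parsePartialOutput: ({ text }) => ({ partial: text }),
};

const parts: TextDelta[] = ['Hello', ', ', 'world!'].map(
  (text): TextDelta => ({ type: 'text-delta', text }),
);

const withoutStrategy = transformParts(parts).map((e) => e.partialOutput);
const withStrategy = transformParts(parts, textStrategy).map((e) => e.partialOutput);
console.log(withoutStrategy); // [ undefined, undefined, undefined ]
console.log(withStrategy); // [ 'Hello', 'Hello, ', 'Hello, world!' ]
```

Both calls emit one entry per delta, so granularity is the same; only the call with a strategy carries the growing snapshots.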

Our server never uses partialOutputStream / experimental_output / the
output option, so no behavior changes for us beyond memory.

Regression spec ai-sdk-partial-output.patch.spec.ts drives the real
patched SDK with MockLanguageModelV3. It asserts per-delta textStream
granularity, an EMPTY experimental_partialOutputStream (tripwire: unpatched,
it yields one cumulative partial per delta), and the presence of the
`PATCH(docmost` marker in both installed dist bundles. The patch is also
documented in AGENTS.md (it must be re-created when bumping `ai`) and in
CHANGELOG.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-05 02:13:17 +03:00
6 changed files with 179 additions and 7 deletions
+1 -1
@@ -294,7 +294,7 @@ Vite SPA. Code is organized by feature under `apps/client/src/features/*` (mirro
- **Errors must never be swallowed or shown as generic messages.** Every caught error MUST (1) be logged in full to the console/logger — error name, message, stack, `cause`, and (for HTTP/provider failures) the status code and response body — and (2) be surfaced to the user with a *specific, human-readable explanation of what actually went wrong*, never a bare generic string like "Something went wrong" / "Could not start recording" / "Transcription failed". Include the real reason (the underlying error/provider message) in the user-facing text. On the server, wrap third-party/provider failures with `describeProviderError` (or equivalent) and rethrow as a meaningful HTTP status + message — never let them collapse into an opaque 500. On the client, `console.error(<context>, err)` the raw error AND show the extracted reason (e.g. `err.response?.data?.message`, or the error `name: message`) in the notification.
- The version string shown in the UI comes from `APP_VERSION` (CI/Docker) or `git describe --tags --always` (local), resolved in `vite.config.ts` — not from `package.json`.
- Server TS config is permissive (`noImplicitAny: false`, `strictNullChecks: false`, `no-explicit-any` lint disabled). Follow the existing relaxed style rather than tightening types broadly.
-- Dependency versions are heavily pinned via `pnpm.overrides` and `pnpm.patchedDependencies` (`scimmy`, `yjs`) in the root `package.json`. Don't bump pinned/patched deps casually; the patches and overrides exist for compatibility/security reasons.
+- Dependency versions are heavily pinned via `pnpm.overrides` and `pnpm.patchedDependencies` (`scimmy`, `yjs`, `ai`) in the root `package.json`. Don't bump pinned/patched deps casually; the patches and overrides exist for compatibility/security reasons. The `ai@6.0.134` patch disables the SDK's O(n²) cumulative `partialOutput` accumulation when no output strategy is requested (server heap OOM on long agent runs, #184; tripwire test: `apps/server/src/integrations/ai/ai-sdk-partial-output.patch.spec.ts`) — it MUST be re-created via `pnpm patch` when bumping `ai`.
- **Adding/renaming/removing an MCP tool requires updating `SERVER_INSTRUCTIONS`** in `packages/mcp/src/index.ts` — the intent-routing guide MCP clients receive on initialize. This applies both to inline `server.registerTool(...)` calls in `index.ts` and to specs in `packages/mcp/src/tool-specs.ts`. Enforced by `packages/mcp/test/unit/server-instructions.test.mjs`, which fails when a registered tool is not mentioned in the guide (deliberate opt-outs go into its `EXCEPTIONS` list). `packages/mcp/build/` is gitignored and rebuilt in CI/Docker via `pnpm build` (same convention as `git-sync`/`prosemirror-markdown`) — never commit it; rebuild locally after editing to run the tests.
## CI / release
+8
@@ -169,6 +169,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Fixed
- **The server no longer runs out of heap during long autonomous agent runs.** A
new pnpm patch on `ai@6.0.134` stops the SDK from building a cumulative
snapshot of the ENTIRE turn text on every streamed text-delta when no output
strategy was requested (our server never requests one). Unpatched, those
O(n²) `partialOutput` snapshots piled up in a never-consumed internal
`tee()` branch of the stream result — a ~20-step, ~28k-chunk agent run
retained ~1.7 GB and OOM'd the 2 GB JS heap. Streaming granularity is
unchanged; the patch must be re-created if `ai` is ever bumped. (#184)
- **Internal links in exported Markdown no longer lose their visible text.** A
link whose target page name had no file extension (e.g. a bare title) was
collapsed to empty text during export, producing an unclickable, label-less
@@ -0,0 +1,92 @@
import { readFileSync } from 'fs';
import { streamText } from 'ai';
import { MockLanguageModelV3, simulateReadableStream } from 'ai/test';

/**
 * Regression tests for patches/ai@6.0.134.patch (server heap OOM on long
 * autonomous agent runs, #184).
 *
 * Unpatched ai@6.0.134 substitutes the default text() output strategy even
 * when the caller passes NO `output` option. Its createOutputTransformStream
 * then accumulates the ENTIRE turn text and, on EVERY text-delta, enqueues a
 * flat snapshot of all text so far as `partialOutput` (O(n^2) memory). Those
 * snapshots pile up in the never-consumed leftover tee() branch of
 * DefaultStreamTextResult.baseStream, which is what OOM'd production during a
 * ~28k-chunk agent turn. The pnpm patch skips partialOutput production
 * entirely when no output strategy was requested, while keeping per-delta
 * streaming granularity.
 */
describe('ai@6.0.134 pnpm patch: no partialOutput accumulation without an output strategy', () => {
  const makeModel = () =>
    new MockLanguageModelV3({
      doStream: async () => ({
        stream: simulateReadableStream({
          chunks: [
            { type: 'stream-start' as const, warnings: [] },
            { type: 'text-start' as const, id: '1' },
            { type: 'text-delta' as const, id: '1', delta: 'Hello' },
            { type: 'text-delta' as const, id: '1', delta: ', ' },
            { type: 'text-delta' as const, id: '1', delta: 'world!' },
            { type: 'text-end' as const, id: '1' },
            {
              type: 'finish' as const,
              finishReason: { unified: 'stop' as const, raw: 'stop' },
              usage: {
                inputTokens: {
                  total: 1,
                  noCache: undefined,
                  cacheRead: undefined,
                  cacheWrite: undefined,
                },
                outputTokens: { total: 1, text: 1, reasoning: undefined },
              },
            },
          ],
        }),
      }),
    });

  it('preserves per-delta streaming granularity in textStream', async () => {
    const result = streamText({ model: makeModel(), prompt: 'hi' });
    const deltas: string[] = [];
    for await (const delta of result.textStream) {
      deltas.push(delta);
    }
    // The patch must NOT coalesce or drop deltas: three model deltas arrive
    // as three separate textStream chunks.
    expect(deltas).toEqual(['Hello', ', ', 'world!']);
  });

  it('emits NO partialOutput values when the caller did not request an output strategy', async () => {
    const result = streamText({ model: makeModel(), prompt: 'hi' });
    // Fully consume the primary stream first (mirrors production usage).
    for await (const _ of result.textStream) {
      // drain
    }
    const partials: unknown[] = [];
    for await (const partial of result.experimental_partialOutputStream) {
      partials.push(partial);
    }
    // TRIPWIRE: on unpatched ai@6.0.134 the default text() output strategy
    // yields one cumulative partial per text-delta here (['Hello', 'Hello, ',
    // 'Hello, world!']). An empty stream proves the patch is applied and no
    // cumulative snapshots are being produced (and thus none can pile up in
    // the leftover internal tee branch).
    expect(partials).toEqual([]);
  });

  it('both installed dist builds (CJS and ESM) carry the patch marker', () => {
    // Secondary guard: pins the patch to BOTH bundles the SDK ships, since
    // the NestJS server consumes CJS while other tooling may load ESM.
    const cjsPath = require.resolve('ai');
    const mjsPath = cjsPath.replace(/index\.js$/, 'index.mjs');
    expect(cjsPath).toMatch(/index\.js$/);
    expect(readFileSync(cjsPath, 'utf8')).toContain('PATCH(docmost');
    expect(readFileSync(mjsPath, 'utf8')).toContain('PATCH(docmost');
  });
});
+2 -1
@@ -96,7 +96,8 @@
"pnpm": {
"patchedDependencies": {
"scimmy@1.3.5": "patches/scimmy@1.3.5.patch",
-"yjs@13.6.30": "patches/yjs@13.6.30.patch"
+"yjs@13.6.30": "patches/yjs@13.6.30.patch",
+"ai@6.0.134": "patches/ai@6.0.134.patch"
},
"overrides": {
"prosemirror-changeset": "2.4.0",
+68
@@ -0,0 +1,68 @@
diff --git a/dist/index.js b/dist/index.js
index ae447a12f7823ec0a00837ee9f0eb809a610d5f8..a3402b2c2d021ef432cfa76e35d370073d525135 100644
--- a/dist/index.js
+++ b/dist/index.js
@@ -6578,9 +6578,19 @@ function createOutputTransformStream(output) {
controller.enqueue({ part: chunk, partialOutput: void 0 });
return;
}
- text2 += chunk.text;
textChunk += chunk.text;
textProviderMetadata = (_a21 = chunk.providerMetadata) != null ? _a21 : textProviderMetadata;
+ if (output == null) {
+ // PATCH(docmost #OOM): no output strategy requested -> publish each
+ // text-delta immediately and do NOT build cumulative partialOutput
+ // snapshots. Unpatched, the default text() output snapshots the ENTIRE
+ // accumulated turn text on every delta (O(n^2) memory) and those
+ // snapshots pile up in the never-consumed leftover tee branch of
+ // DefaultStreamTextResult.baseStream -> heap OOM on long agent turns.
+ publishTextChunk({ controller });
+ return;
+ }
+ text2 += chunk.text;
const result = await output.parsePartialOutput({ text: text2 });
if (result !== void 0) {
const currentJson = JSON.stringify(result.partial);
@@ -6959,7 +6969,7 @@ var DefaultStreamTextResult = class {
})
);
}
- this.baseStream = stream.pipeThrough(createOutputTransformStream(output != null ? output : text())).pipeThrough(eventProcessor);
+ this.baseStream = stream.pipeThrough(createOutputTransformStream(output)).pipeThrough(eventProcessor);
const { maxRetries, retry } = prepareRetries({
maxRetries: maxRetriesArg,
abortSignal
diff --git a/dist/index.mjs b/dist/index.mjs
index 663875332e3f9a9bd167c25583c515876f42951b..b840b0502c9894df983e0154805abb80e70e6331 100644
--- a/dist/index.mjs
+++ b/dist/index.mjs
@@ -6501,9 +6501,19 @@ function createOutputTransformStream(output) {
controller.enqueue({ part: chunk, partialOutput: void 0 });
return;
}
- text2 += chunk.text;
textChunk += chunk.text;
textProviderMetadata = (_a21 = chunk.providerMetadata) != null ? _a21 : textProviderMetadata;
+ if (output == null) {
+ // PATCH(docmost #OOM): no output strategy requested -> publish each
+ // text-delta immediately and do NOT build cumulative partialOutput
+ // snapshots. Unpatched, the default text() output snapshots the ENTIRE
+ // accumulated turn text on every delta (O(n^2) memory) and those
+ // snapshots pile up in the never-consumed leftover tee branch of
+ // DefaultStreamTextResult.baseStream -> heap OOM on long agent turns.
+ publishTextChunk({ controller });
+ return;
+ }
+ text2 += chunk.text;
const result = await output.parsePartialOutput({ text: text2 });
if (result !== void 0) {
const currentJson = JSON.stringify(result.partial);
@@ -6882,7 +6892,7 @@ var DefaultStreamTextResult = class {
})
);
}
- this.baseStream = stream.pipeThrough(createOutputTransformStream(output != null ? output : text())).pipeThrough(eventProcessor);
+ this.baseStream = stream.pipeThrough(createOutputTransformStream(output)).pipeThrough(eventProcessor);
const { maxRetries, retry } = prepareRetries({
maxRetries: maxRetriesArg,
abortSignal
+8 -5
@@ -44,6 +44,9 @@ overrides:
ip-address: 10.1.1
patchedDependencies:
+ai@6.0.134:
+hash: f60bfc3357e01e1f3978c6c40fdd65aeb33fefaad7179cde8676465b6c5ff4d9
+path: patches/ai@6.0.134.patch
scimmy@1.3.5:
hash: 775d80f86830b2c5dd1a250c9802c10f8fc3da3c7898373de5aa0c23993d1673
path: patches/scimmy@1.3.5.patch
@@ -623,10 +626,10 @@ importers:
version: 8.3.0(socket.io-adapter@2.5.4)
ai:
specifier: ^6.0.134
-version: 6.0.134(zod@4.3.6)
+version: 6.0.134(patch_hash=f60bfc3357e01e1f3978c6c40fdd65aeb33fefaad7179cde8676465b6c5ff4d9)(zod@4.3.6)
ai-sdk-ollama:
specifier: ^3.8.1
-version: 3.8.1(ai@6.0.134(zod@4.3.6))(zod@4.3.6)
+version: 3.8.1(ai@6.0.134(patch_hash=f60bfc3357e01e1f3978c6c40fdd65aeb33fefaad7179cde8676465b6c5ff4d9)(zod@4.3.6))(zod@4.3.6)
bcrypt:
specifier: ^6.0.0
version: 6.0.0
@@ -16355,17 +16358,17 @@ snapshots:
agent-base@7.1.4: {}
-ai-sdk-ollama@3.8.1(ai@6.0.134(zod@4.3.6))(zod@4.3.6):
+ai-sdk-ollama@3.8.1(ai@6.0.134(patch_hash=f60bfc3357e01e1f3978c6c40fdd65aeb33fefaad7179cde8676465b6c5ff4d9)(zod@4.3.6))(zod@4.3.6):
dependencies:
'@ai-sdk/provider': 3.0.8
'@ai-sdk/provider-utils': 4.0.21(zod@4.3.6)
-ai: 6.0.134(zod@4.3.6)
+ai: 6.0.134(patch_hash=f60bfc3357e01e1f3978c6c40fdd65aeb33fefaad7179cde8676465b6c5ff4d9)(zod@4.3.6)
jsonrepair: 3.13.3
ollama: 0.6.3
transitivePeerDependencies:
- zod
-ai@6.0.134(zod@4.3.6):
+ai@6.0.134(patch_hash=f60bfc3357e01e1f3978c6c40fdd65aeb33fefaad7179cde8676465b6c5ff4d9)(zod@4.3.6):
dependencies:
'@ai-sdk/gateway': 3.0.77(zod@4.3.6)
'@ai-sdk/provider': 3.0.8