Files
gitmost/patches/ai@6.0.134.patch
T
agent_vscode 9b4b38a611 fix(ai): patch ai@6.0.134 — drop O(n²) partialOutput accumulation causing heap OOM on long agent runs (#184)
Production OOM'd (JS heap 1.85 GB / 2 GB limit) during a ~20-step,
~28k-chunk autonomous agent turn. Heap snapshot analysis (memlab) showed a
single DefaultStreamTextResult retaining ~1.7 GB via the never-consumed
leftover tee() branch of its internal baseStream.

Root cause in ai@6.0.134: streamText substitutes the default text() output
strategy even when the caller passes NO `output` option. Its
createOutputTransformStream then accumulates the ENTIRE turn text and, on
EVERY text-delta, enqueues `{ part, partialOutput }` where partialOutput is
a flat snapshot of all text so far (JSON.stringify flattens the
cons-string) — O(n²) memory across the turn. Every consumer accessor tees
baseStream and keeps the second branch as the new baseStream; the final
leftover branch is never read, so its controller queue holds every chunk
(28,225 x ~164 KB in the OOM'd run) for the life of the turn.

Fix (pnpm patch on both dist/index.js and dist/index.mjs):
- pass the raw, possibly-undefined `output` option into
  createOutputTransformStream instead of defaulting to text()
- when output == null, publish each text-delta immediately without
  accumulating turn text or producing partialOutput snapshots; streaming
  granularity is unchanged, and callers that DO request an output strategy
  keep the original behavior

Our server never uses partialOutputStream / experimental_output / the
output option, so no behavior changes for us beyond memory.

Regression spec ai-sdk-partial-output.patch.spec.ts drives the real
patched SDK with MockLanguageModelV3: asserts per-delta textStream
granularity, an EMPTY experimental_partialOutputStream (tripwire — yields
one cumulative partial per delta when unpatched), and the PATCH(docmost
marker in both installed dist bundles. Also documents the patch in
AGENTS.md (must be re-created when bumping `ai`) and CHANGELOG.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-05 02:13:17 +03:00

69 lines
3.3 KiB
Diff

diff --git a/dist/index.js b/dist/index.js
index ae447a12f7823ec0a00837ee9f0eb809a610d5f8..a3402b2c2d021ef432cfa76e35d370073d525135 100644
--- a/dist/index.js
+++ b/dist/index.js
@@ -6578,9 +6578,19 @@ function createOutputTransformStream(output) {
controller.enqueue({ part: chunk, partialOutput: void 0 });
return;
}
- text2 += chunk.text;
textChunk += chunk.text;
textProviderMetadata = (_a21 = chunk.providerMetadata) != null ? _a21 : textProviderMetadata;
+ if (output == null) {
+ // PATCH(docmost #OOM): no output strategy requested -> publish each
+ // text-delta immediately and do NOT build cumulative partialOutput
+ // snapshots. Unpatched, the default text() output snapshots the ENTIRE
+ // accumulated turn text on every delta (O(n^2) memory) and those
+ // snapshots pile up in the never-consumed leftover tee branch of
+ // DefaultStreamTextResult.baseStream -> heap OOM on long agent turns.
+ publishTextChunk({ controller });
+ return;
+ }
+ text2 += chunk.text;
const result = await output.parsePartialOutput({ text: text2 });
if (result !== void 0) {
const currentJson = JSON.stringify(result.partial);
@@ -6959,7 +6969,7 @@ var DefaultStreamTextResult = class {
})
);
}
- this.baseStream = stream.pipeThrough(createOutputTransformStream(output != null ? output : text())).pipeThrough(eventProcessor);
+ this.baseStream = stream.pipeThrough(createOutputTransformStream(output)).pipeThrough(eventProcessor);
const { maxRetries, retry } = prepareRetries({
maxRetries: maxRetriesArg,
abortSignal
diff --git a/dist/index.mjs b/dist/index.mjs
index 663875332e3f9a9bd167c25583c515876f42951b..b840b0502c9894df983e0154805abb80e70e6331 100644
--- a/dist/index.mjs
+++ b/dist/index.mjs
@@ -6501,9 +6501,19 @@ function createOutputTransformStream(output) {
controller.enqueue({ part: chunk, partialOutput: void 0 });
return;
}
- text2 += chunk.text;
textChunk += chunk.text;
textProviderMetadata = (_a21 = chunk.providerMetadata) != null ? _a21 : textProviderMetadata;
+ if (output == null) {
+ // PATCH(docmost #OOM): no output strategy requested -> publish each
+ // text-delta immediately and do NOT build cumulative partialOutput
+ // snapshots. Unpatched, the default text() output snapshots the ENTIRE
+ // accumulated turn text on every delta (O(n^2) memory) and those
+ // snapshots pile up in the never-consumed leftover tee branch of
+ // DefaultStreamTextResult.baseStream -> heap OOM on long agent turns.
+ publishTextChunk({ controller });
+ return;
+ }
+ text2 += chunk.text;
const result = await output.parsePartialOutput({ text: text2 });
if (result !== void 0) {
const currentJson = JSON.stringify(result.partial);
@@ -6882,7 +6892,7 @@ var DefaultStreamTextResult = class {
})
);
}
- this.baseStream = stream.pipeThrough(createOutputTransformStream(output != null ? output : text())).pipeThrough(eventProcessor);
+ this.baseStream = stream.pipeThrough(createOutputTransformStream(output)).pipeThrough(eventProcessor);
const { maxRetries, retry } = prepareRetries({
maxRetries: maxRetriesArg,
abortSignal