Address the PR #202 review (approve-with-comments). The only actionable
non-blocking item was the test-coverage suggestion: the source switch in
AiChatService.handle from findRecent(chatId, ws, 50) to findAllByChat(chatId,
ws) was not pinned by a test. handle() is a streaming method the project marks
as not unit-testable, so cover the behavioral guarantee it now relies on at the
repo/integration level — seed a chat of 60 messages and assert the default
findAllByChat (exactly how handle calls it) returns the FULL transcript in
chronological order, including the first turn the old 50-window would have
dropped.
Also document the behavior change under CHANGELOG [Unreleased] -> Changed.
The two stability items (token-budget trim before streamText; O(N) history
rebuild per turn) are deferred: the reviewer flagged both as non-blocking
conscious trade-offs aligned with the PR's stated goal, and the trim is a
larger architecture change out of scope for this follow-up.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Approve-with-comments re-review; no blockers. All 7 actionable points (8 is a
forward-looking architecture note — recommendation A, keep as-is):
1. chat-markdown.util spec: restore parity coverage of the removed client spec —
tool error state (+ errorText), unknown-tool fallback (`Ran tool <name>` en /
`Выполнил инструмент <name>` ru), and the circular-output stringify catch.
2. findAllByChat row cap is now testable (injectable limit) + an int-spec proves
truncation on a modest volume.
3. Stability: the per-step durability updates are SERIALIZED via a promise chain
(stepUpdateChain) so they commit in step order — onlyIfStreaming already
closed the finalize race, this closes inter-step ordering.
4. findAllByChat keeps the NEWEST messages on truncation (order DESC + reverse,
like findRecent) and logs a warning with chatId, instead of silently dropping
the newest tail.
5. The LABELS parity comment already references the real path (tool-parts.tsx /
toolLabelKey) — confirmed accurate.
6. Removed the redundant 'off-by-one boundary' test (strict subset of the two
adjacent prepareAgentStep cases).
7. Extracted the terminal-finalize dispatch into a shared `applyFinalize`, used
by BOTH the service's finalizeAssistant and its test — the test now exercises
the real path, not a copy, so a production drift fails it.
Verified: server build + 325 ai-chat unit + 6 integration; prettier clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
15-point review of the persistent-history PR. Architecture decisions: crash
recovery = recency threshold; tool-label duplication = leave as-is.
Must-fix:
1. Boot-sweep bounded by recency. sweepStreaming now also requires
`updatedAt < now() - SWEEP_STREAMING_STALE_MS` (10 min), so a fresh replica's
startup sweep can't abort a turn another replica is actively streaming
(multi-instance deploy). Int-spec: a FRESH 'streaming' row is NOT swept, a
STALE one IS.
2. Restore export during the FIRST streaming turn of a new chat (#174). The
server chatId is now adopted EARLY (in-place, on the start-chunk metadata) via
a new `onServerChatId` callback wired through use-chat-session → chat-thread,
so `activeChatId` is set at turn start and the Copy button is live mid-first-
turn (canExport = !!activeChatId). Hook tests for early/in-place/no-op adopt.
3. Cover finalizeAssistant's fallback-insert branch: extracted pure
`planFinalizeAssistant(assistantId)` (update when id present, insert when the
upfront insert failed) + a dispatch harness test for both arms.
Tests: onModuleInit lifecycle spec (sweep called; throw → resolves + warns);
int-spec updatedAt assertion → toBeGreaterThan.
Cleanups: cap findAllByChat at 5000 rows; upfront-insert-failure log carries
chatId+workspaceId; removed the now-dead buildPartialAssistantRecord (only the
spec consumed it; shapes still pinned by the flushAssistant suite); controller
passes `lang: dto.lang` (normalizeLang handles undefined); dropped a no-op
`?? undefined` in errorOf; documented the content-column semantics change
(concatenated step text, UI renders from metadata.parts); CHANGELOG [Unreleased]
entry (#183, #174); reworded the stale LABELS parity comment.
Verified: server build + 323 ai-chat unit + 5 integration; client tsc + 160
ai-chat unit; prettier clean.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review caught a real race: onStepFinish fires `updateStreaming()` fire-and-
forget (not awaited), so the FINAL step's streaming UPDATE and the terminal
`finalizeAssistant` UPDATE run as two concurrent statements on different pool
connections — commit order is not guaranteed. If the late streaming update
lands AFTER finalize, the completed row is clobbered back to status='streaming'
with no usage/finishReason, and the next startup sweep then mis-marks the
finished turn 'aborted'. Green unit/integration tests don't reproduce a
cross-connection race.
Fix: scope the per-step update with `onlyIfStreaming` → SQL `WHERE
status='streaming'`. Once finalize has set a terminal status the late update
matches zero rows and no-ops, regardless of commit order; finalize runs
unguarded so it always wins. A cheap `if (finalized) return` short-circuit
avoids most wasted queries, but the SQL guard is the authoritative fix (the
flag can be set after a query is already in flight).
Integration test: finalize to 'completed', then a late onlyIfStreaming update
is a no-op — status/content/usage preserved.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The chat lived in inconsistent paradigms (in-memory stream + client export vs.
DB-as-context), which made export flaky and lost the assistant answer if the
process died mid-turn. Make the DB the single source of truth.
A. STEP-GRANULAR DURABILITY (server)
- ai_chat_messages gains a nullable `status` column (migration; NULL = legacy =
completed). The assistant row is now INSERTED UPFRONT as `status:'streaming'`
and UPDATEd on every onStepFinish with all finished steps (text + tool calls +
tool RESULTS), then finalized once to completed/error/aborted on the terminal
callback. So a process death mid-turn keeps every finished step; a startup
sweep (OnModuleInit → sweepStreaming) flips any dangling 'streaming' row to
'aborted'. The write path no longer depends on a live socket.
- Pure exported `flushAssistant(steps, inProgressText, status, extra?)` builds
the persist payload (metadata.parts byte-identical to the old builder), so a
future background worker can call the same path. AiChatMessageRepo gains
`update`, `sweepStreaming`, and `findAllByChat`.
- consumeStream drain, external-MCP client close-once, SSE heartbeat preserved.
B. SERVER-SIDE EXPORT
- New pure `chat-markdown.util.ts` renders Markdown from DB rows ONLY (server
port of the client builder). Because A persists the in-progress row, the
export now includes an interrupted turn up to its last finished step (flagged
"still generating"). `POST /ai-chat/export` (owner-gated via assertOwnedChat,
workspace-scoped) returns it; `lang` accepts a full client locale tag
('en-US'/'ru-RU') and is normalized server-side (normalizeLang) — a strict
@IsIn(['en','ru']) DTO rejected the real client's i18n.language with a 400,
caught in real-browser testing.
- Client: handleCopy calls the endpoint; `canExport = !!activeChatId`. The whole
liveThreadRef/liveStateRef/onLiveContentChange/hasLiveContent hybrid (and the
client chat-markdown util + test) is removed — the server is now authoritative.
Tests: flushAssistant unit (status shapes + parts parity), chat-markdown.util
unit (incl. legacy NULL-status + interrupted note + ru + normalizeLang locale
tags), controller export wiring + owner-gate, integration update/sweepStreaming.
Verified: server build + 318 ai-chat unit + 3 integration; client tsc + 157
ai-chat unit; and END-TO-END in a real browser — a chat turn persists mid-stream
and the Copy button exports the DB-sourced markdown (showing the in-progress
row), HTTP 200 after the locale fix.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>