feat(ai-chat): load full transcript for model history (drop 50-msg window)

The per-turn model conversation was rebuilt via findRecent(chatId, ws, 50), a sliding window that dropped the beginning of any chat longer than ~50 stored rows. Switch streamChat to the existing findAllByChat, which loads the full non-deleted transcript chronologically with a 5000-row memory-safety backstop (keeps the newest rows + logs a warning on overflow) — a safety net, not a conversational limit. Remove the now-unused findRecent method and update the comments/log text that referenced it (findAllByChat now feeds both the Markdown export and the model history). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-25 23:51:41 +03:00
parent fad1aa0501
commit 1f459d8d26
3 changed files with 23 additions and 42 deletions
--- a/apps/server/src/core/ai-chat/ai-chat.service.spec.ts
+++ b/apps/server/src/core/ai-chat/ai-chat.service.spec.ts
@@ -240,7 +240,7 @@ describe('prepareAgentStep', () => {
 * write path. It runs identically for the upfront insert (empty steps,
 * 'streaming'), every per-step update, and the terminal finalize — so a future
 * background worker can call the same function. These tests pin the four status
- * shapes and the `metadata.parts` shape that rowToUiMessage/findRecent depend on
+ * shapes and the `metadata.parts` shape that rowToUiMessage/findAllByChat depend on
 * (per-step text + tool parts via assistantParts, in-progress text appended).
 */
 describe('flushAssistant', () => {
--- a/apps/server/src/core/ai-chat/ai-chat.service.ts
+++ b/apps/server/src/core/ai-chat/ai-chat.service.ts
@@ -322,12 +322,14 @@ export class AiChatService implements OnModuleInit {
    // Rebuild the conversation from persisted history (not the client payload),
    // so the model always sees the authoritative server-side transcript. Load
-    // the most RECENT tail (oldest -> newest) so chats longer than one page do
+    // the FULL history in chronological order (oldest -> newest, incl. the user
-    // not drop recent turns (incl. the user message just inserted above).
+    // message just inserted above) so NO turns are dropped — there is no
-    const history = await this.aiChatMessageRepo.findRecent(
+    // recent-tail window anymore. `findAllByChat` keeps a 5000-row memory-safety
    // backstop (on overflow it keeps the NEWEST rows and logs a warning); that
    // is a safety net far above any realistic chat, not a conversational limit.
    const history = await this.aiChatMessageRepo.findAllByChat(
      chatId,
      workspace.id,
      50,
    );
    const uiMessages = history.map(rowToUiMessage);
    // convertToModelMessages is async in ai@6.0.134 (returns Promise<ModelMessage[]>).
@@ -1215,7 +1217,7 @@ export async function applyFinalize(
 *
 * `metadata.parts` is built by assistantParts over the finished steps, then the
 * in-progress text appended as a trailing text part, so rowToUiMessage /
- * findRecent keep replaying the turn unchanged. `metadata.finishReason`,
+ * findAllByChat keep replaying the turn unchanged. `metadata.finishReason`,
 * `metadata.error`, `metadata.usage`, `metadata.contextTokens` and
 * `metadata.maxContextTokens` are attached only when provided/relevant, matching
 * the pre-#183 onFinish/onError records.
--- a/apps/server/src/database/repos/ai-chat/ai-chat-message.repo.ts
+++ b/apps/server/src/database/repos/ai-chat/ai-chat-message.repo.ts
@@ -18,7 +18,8 @@ import { executeWithCursorPagination } from '@docmost/db/pagination/cursor-pagin
 // (multi-instance deploy).
 const SWEEP_STREAMING_STALE_MS = 10 * 60 * 1000; // 10 minutes
-// Hard upper bound on the rows materialized by `findAllByChat` (export path).
+// Hard upper bound on the rows materialized by `findAllByChat`, which now feeds
 // BOTH the Markdown export and the per-turn model history.
 // A generous cap so a pathologically huge chat cannot load an unbounded result
 // into memory; far above any realistic transcript length.
 const FIND_ALL_BY_CHAT_LIMIT = 5000;
@@ -78,14 +79,17 @@ export class AiChatMessageRepo {
  }
  // Load ALL (non-deleted) messages of a chat in ascending chronological order
-  // (oldest -> newest), unpaginated. Used by the server-side Markdown export
+  // (oldest -> newest), unpaginated. Two callers, both treating the DB as the
-  // (#183), where the DB is the single source of truth and the whole transcript
+  // single source of truth and needing the whole transcript in one pass
-  // must be rendered in one pass (findByChat is cursor-paginated and would only
+  // (findByChat is cursor-paginated and would only return the first page):
-  // return the first page).
+  //   - the server-side Markdown export (#183);
  //   - the per-turn model history, rebuilt fresh on every turn so the model
  //     sees the full authoritative transcript.
  //
  // Hard-capped at FIND_ALL_BY_CHAT_LIMIT rows (a generous bound, far above any
-  // realistic transcript) so exporting a pathologically huge chat cannot
+  // realistic transcript) — a shared memory-safety backstop for BOTH paths so a
-  // materialize an unbounded result set in memory.
+  // pathologically huge chat cannot materialize an unbounded result set in
  // memory. On overflow the NEWEST rows are kept and a warning is logged.
  async findAllByChat(
    chatId: string,
    workspaceId: string,
@@ -93,9 +97,9 @@ export class AiChatMessageRepo {
    limit: number = FIND_ALL_BY_CHAT_LIMIT,
  ): Promise<AiChatMessage[]> {
    // Fetch newest-first (+1 to DETECT truncation), so on overflow we keep the
-    // NEWEST `limit` messages — the recent conversation matters most for an
+    // NEWEST `limit` messages — the recent conversation matters most — rather
-    // export — rather than silently dropping the tail (#183 review). Reverse back
+    // than silently dropping the tail (#183 review). Then reverse back to
-    // to chronological for rendering, like findRecent.
+    // chronological order (oldest -> newest) for rendering / model replay.
    const rows = await this.db
      .selectFrom('aiChatMessages')
      .select(this.baseFields)
@@ -110,38 +114,13 @@ export class AiChatMessageRepo {
    if (rows.length > limit) {
      rows.length = limit; // keep the newest `limit` (rows are newest-first here)
      this.logger.warn(
-        `Chat ${chatId} export truncated to the newest ${limit} messages ` +
+        `Chat ${chatId} truncated to the newest ${limit} messages ` +
          `(older messages omitted).`,
      );
    }
    return rows.reverse();
  }
  // Load the most RECENT `limit` messages for a chat and return them in
  // ascending chronological order (oldest -> newest), as the model expects.
  // `findByChat` returns the FIRST page ASC (the OLDEST messages), which loses
  // recent turns once a chat grows beyond a page; this rebuilds the model
  // history from the tail instead. Plain query (no cursor pagination).
  async findRecent(
    chatId: string,
    workspaceId: string,
    limit: number,
  ): Promise<AiChatMessage[]> {
    const rows = await this.db
      .selectFrom('aiChatMessages')
      .select(this.baseFields)
      .where('chatId', '=', chatId)
      .where('workspaceId', '=', workspaceId)
      .where('deletedAt', 'is', null)
      .orderBy('createdAt', 'desc')
      .orderBy('id', 'desc')
      .limit(limit)
      .execute();
    // Selected newest-first for the limit; reverse to oldest-first for the model.
    return rows.reverse();
  }
  async insert(
    insertable: InsertableAiChatMessage,
    trx?: KyselyTransaction,