gitmost

Author	SHA1	Message	Date
claude code agent 227	364838d0b2	test(review): close the two test-coverage gaps from PR #185 auto-review Approve-with-comments auto-review (8 axes); no blockers. Closes the two flagged test gaps; the two forward-looking dedup suggestions (reconcileHasChildren helper; unifying reconcileChildren/mergeRootTrees) are non-blocking architecture notes and left for a follow-up (as with #186's forward-looking point). 1. Ambiguous-id refusal end-to-end (#159): the patch_node/delete_node guard `if (replaced/deleted !== 1) return null` was only covered in pieces — the replaceNodeById/deleteNodeById counts and assertUnambiguousMatch in isolation — so loosening the guard would not have failed a test. New mock test stands up a REAL Hocuspocus collab server seeded (via buildYDoc, same docmost extensions) with a two-blocks-one-id document and drives the real client methods: both must reject with /ambiguous/ AND never write to collab. Tracked via Hocuspocus onChange (fires synchronously per update, unlike the debounced onStoreDocument) so a clobbering write is actually observed — verified the test FAILS when the guard is loosened to `< 1`. 2. scrollToReference zero-match bail: the branch "non-empty id but querySelectorAll returns 0 -> matches[index] ?? matches[0] is undefined -> return false" (the real desync: definition present, inline ref removed from the DOM) was uncovered. Added a footnote.test.ts case: a definition for 'ghost' with no rendered ref -> false, no scroll. Verified: 313 mcp tests + 24 editor-ext footnote tests; prettier clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 12:39:18 +03:00
claude code agent 227	aa7a115f66	refactor(review): address PR #186 re-review (approve-with-comments) Approve-with-comments re-review; no blockers. All 7 actionable points (8 is a forward-looking architecture note — recommendation A, keep as-is): 1. chat-markdown.util spec: restore parity coverage of the removed client spec — tool error state (+ errorText), unknown-tool fallback (`Ran tool <name>` en / `Выполнил инструмент <name>` ru), and the circular-output stringify catch. 2. findAllByChat row cap is now testable (injectable limit) + an int-spec proves truncation on a modest volume. 3. Stability: the per-step durability updates are SERIALIZED via a promise chain (stepUpdateChain) so they commit in step order — onlyIfStreaming already closed the finalize race, this closes inter-step ordering. 4. findAllByChat keeps the NEWEST messages on truncation (order DESC + reverse, like findRecent) and logs a warning with chatId, instead of silently dropping the newest tail. 5. The LABELS parity comment already references the real path (tool-parts.tsx / toolLabelKey) — confirmed accurate. 6. Removed the redundant 'off-by-one boundary' test (strict subset of the two adjacent prepareAgentStep cases). 7. Extracted the terminal-finalize dispatch into a shared `applyFinalize`, used by BOTH the service's finalizeAssistant and its test — the test now exercises the real path, not a copy, so a production drift fails it. Verified: server build + 325 ai-chat unit + 6 integration; prettier clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 12:28:35 +03:00
claude code agent 227	30c358a2f8	test(review): add the 4 new test-coverage points from PR #185 re-review The re-review's blocking/structural points (lease leak, dup-id guard test, body-before-title test, CHANGELOG, pg18, shared jsonb decoder) were already addressed in commit 24264ef; this adds the 4 genuinely-new coverage requests: - pt 6: `scrollToReference(id, index?)` exercised against a live editor DOM — selects the index-th `sup[data-footnote-ref][data-id]` occurrence, falls back to the first for out-of-range, returns false for an empty id (scrollIntoView stubbed). (#168) - pt 7: export `backlinkLabel` and pin the base-26 carry boundary (25->z, 26->aa, 27->ab, 51->az, 52->ba). (#168) - pt 8: integration fail-open — a PRESENT-but-corrupt tool_allowlist (jsonb string scalar holding non-array JSON) reads back as null ("no restriction"), covering normalizeRow's degrade branch. (#159 #172/#173) - pt 9: getFootnoteRefCount cache invalidation — adding a `[^a]` reference bumps the cached count 2 -> 3. (#168) Verified: editor-ext footnote 23; client structure 7 + tsc; server int 8. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 12:08:21 +03:00
claude code agent 227	ea61c96a7c	refactor(review): address PR #186 review (#183 — recency sweep, #174 export, tests, cleanups) 15-point review of the persistent-history PR. Architecture decisions: crash recovery = recency threshold; tool-label duplication = leave as-is. Must-fix: 1. Boot-sweep bounded by recency. sweepStreaming now also requires `updatedAt < now() - SWEEP_STREAMING_STALE_MS` (10 min), so a fresh replica's startup sweep can't abort a turn another replica is actively streaming (multi-instance deploy). Int-spec: a FRESH 'streaming' row is NOT swept, a STALE one IS. 2. Restore export during the FIRST streaming turn of a new chat (#174). The server chatId is now adopted EARLY (in-place, on the start-chunk metadata) via a new `onServerChatId` callback wired through use-chat-session → chat-thread, so `activeChatId` is set at turn start and the Copy button is live mid-first-turn (canExport = !!activeChatId). Hook tests for early/in-place/no-op adopt. 3. Cover finalizeAssistant's fallback-insert branch: extracted pure `planFinalizeAssistant(assistantId)` (update when id present, insert when the upfront insert failed) + a dispatch harness test for both arms. Tests: onModuleInit lifecycle spec (sweep called; throw → resolves + warns); int-spec updatedAt assertion → toBeGreaterThan. Cleanups: cap findAllByChat at 5000 rows; upfront-insert-failure log carries chatId+workspaceId; removed the now-dead buildPartialAssistantRecord (only the spec consumed it; shapes still pinned by the flushAssistant suite); controller passes `lang: dto.lang` (normalizeLang handles undefined); dropped a no-op `?? undefined` in errorOf; documented the content-column semantics change (concatenated step text, UI renders from metadata.parts); CHANGELOG [Unreleased] entry (#183, #174); reworded the stale LABELS parity comment. Verified: server build + 323 ai-chat unit + 5 integration; client tsc + 160 ai-chat unit; prettier clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:53:25 +03:00
claude code agent 227	f80276d41a	refactor(review): address PR #185 review (lease leak, tests, changelog, jsonb seam) 8-point multi-aspect review of the batch PR; security/regressions were clean. 1. Lease leak: the #180 reorder moved `toolsFor` (which leases external MCP clients, refCount+1) ahead of buildSystemPrompt + forUser, but the only release (closeExternalClients) was bound to the streamText callbacks. A throw in between leaked the lease (refCount stuck, undici sockets held until restart). Define closeExternalClients right after the lease and wrap buildSystemPrompt+forUser in try/catch that closes-then-rethrows. 2. Cover the patch_node/delete_node dup-id refusal (#159 #6): extract the guard into a pure `assertUnambiguousMatch` (node-ops) and unit-test 0/1/>1. 3. Regress the body-before-title order (#159 #10): mock-HTTP test (collab fails fast against a server with no WS upgrade) asserts /pages/update (title) is NEVER posted when the body write fails — for updatePage AND updatePageJson. 4. CHANGELOG [Unreleased]: #180, #168 (Added); #163 (Fixed). 5. Add the missing en-US i18n keys (Back to references / {{label}}). 6. Drop the duplicate content/empty/blank cases in ai-chat.prompt.spec.ts (they repeat the buildMcpToolingBlock unit tests); keep only sandwich placement + both-safety-copies. 7. CI Postgres pg16 -> pg18 (match docker-compose). 8. jsonb decode seam: shared `parseJsonbValue(value, guard)` in database/utils.ts holds the legacy double-encoding self-heal in one place; parseToolAllowlist / parseModelConfig keep only a type-guard. Verified: server build + 124 unit + 15 integration; mcp 311; prettier clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	8218c1a8ef	fix(tree): refresh loaded branches on reconnect so they don't go stale (#159 ) Third tree-sync finding (#8). On a socket reconnect after a missed-events gap (laptop sleep / wifi blip), the resync only invalidated the ROOT sidebar query; a move/rename/delete that happened INSIDE an already-loaded, expanded branch was never reflected — the branch stayed stale until the user manually interacted. (The #2 fix reconciles the root level; this covers the deeper loaded branches.) - `treeModel.reconcileChildren(tree, parentId, fresh)`: replace a loaded branch's DIRECT children with the authoritative fresh set (drop removed, add new, reorder to server) while PRESERVING each surviving child's already-loaded grandchildren, so deeper expansion is not collapsed. An unloaded branch (children === undefined) is left untouched (lazy-load fetches it fresh). - `loadedOpenBranchIds(tree, openIds)`: the branches a reconnect should refresh (open AND loaded). `fetchAllAncestorChildren(..., { fresh: true })` bypasses the 30-min sidebar cache so the reconcile sees current data (handler-order independent). - space-tree: on socket `connect`, re-fetch + reconcile each open loaded branch of the active space (space-switch-guarded; an unloaded branch is skipped). Tests: reconcileChildren (drop/add/reorder + preserve grandchildren + unloaded no-op) and loadedOpenBranchIds (open+loaded only, skip unloaded, nested). The pure logic is unit-tested; the live socket-reconnect round-trip is not browser-automated (simulating a reconnect gap is impractical) — sidebar render + expand were smoke-tested with no regression. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	d7e7489654	fix(tree): stop silent page loss on move-to-unloaded-parent + reconnect ghost roots (#159 ) Two confirmed P1 data-loss findings in the sidebar tree sync. #1 — Move into an unloaded/collapsed parent silently dropped pages. When a moveTreeNode (or addTreeNode) broadcast targeted a parent whose children were NOT yet lazy-loaded, `insertByPosition` did `kids = parent.children ?? []` and inserted the moved node, MATERIALIZING a misleading partial child list (`[movedNode]`) out of an unloaded (`children === undefined`) parent. The lazy-load gate fetches only when children are absent/empty, so it then refused to fetch — leaving the parent showing ONLY the moved node and HIDING all its other real children (and, when the parent wasn't in the tree at all, the node was removed and never re-fetched). Fix: `insertByPosition` distinguishes `children === undefined` (not loaded) from `[]` (loaded-empty) and, for an unloaded parent, does NOT insert — it leaves children unloaded and just flags `hasChildren`, so expanding fetches the FULL set (including the moved/added node) via the existing lazy-load. #2 — After a socket reconnect, a deleted/moved-away root lingered as a 404 "ghost". `mergeRootTrees` was append-only: it kept every previously-loaded root and only added new ones, so a root removed during the missed-events gap was never dropped. It runs only once all root pages are fetched, so the incoming list is the authoritative complete root set — fix reconciles to it (drop roots absent from incoming) while PRESERVING each surviving root's lazy-loaded subtree and refreshing its own fields. Tests: insertByPosition unloaded-vs-loaded-empty parent; the move reducer keeps a collapsed destination lazy-loadable instead of partial; mergeRootTrees drops a ghost root, preserves a surviving subtree, adds new roots, refreshes fields. The existing "remove when parent not in tree" reducer test still holds. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	8f1af676ba	fix(mcp): write page body before title to avoid split-brain on failure (#159 ) updatePage (markdown) and updatePageJson wrote the title via REST FIRST, then the body via collab. If the body write failed (e.g. a collab persist timeout), the page was left with the NEW title over its OLD body — a split-brain the tool reported as an error but never repaired (red-team finding #10). Reorder both: write the body first, and only set the title after the body has persisted. Now a body-write failure leaves the title untouched (no split-brain). A title write failing after a successful body is rarer (REST is fast) and leaves correct content under a stale title — the strictly lesser inconsistency — which is the same trade-off the issue's "atomic, or roll back the title" intends, without the fragility of a rollback write that could itself fail. No unit test: both paths require a live collab provider and the suite has no provider mock; the change is a pure reordering. All 306 mcp tests still pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	34c5b557ef	fix(share): SEO route must not leak a restricted page's title (#159 ) `ShareSeoController.getShare` resolved the inherited share with the RAW `getShareForPage`, which does NOT run the restricted-ancestor gate. So for a page shared with includeSubPages whose descendant is permission-restricted, the SEO route served that descendant's real title in <title>/og:title/twitter:title to anonymous visitors and crawlers — even though the content API returns 404 for it (red-team finding #3). Funnel the SEO path through the canonical `resolveReadableSharePage` boundary (the single place that checks `hasRestrictedAncestor`): a non-readable page now serves the plain SPA index with no meta. Also honour `isSharingAllowed` — a share whose workspace/space sharing toggle was flipped off after creation no longer leaks its title via SEO. Title comes from the server-resolved page; `buildShareMetaHtml` already emits robots=noindex when the share opted out of indexing. Tests (controller routing, fs spied at call time so bcrypt's native loader is untouched): non-readable page => plain index, no title; sharing-disabled => plain index; readable+indexing => title + og:title, no noindex; readable+no-indexing => noindex. Asserts getShareForPage is never called by the SEO path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	59f0c8b22d	fix(ai-chat): validate the open page server-side so the agent edits the right one (#159 ) The client sends the "current page" as { id, title } in the request body and the server echoed BOTH verbatim into the system prompt context and the getCurrentPage tool. id and title are independently attacker/desync-controllable (two tabs, stale navigation), so openPage.id could point at page B while openPage.title said "Page A" — the model then reported "updated Page A" while it actually edited page B (CASL still allowed it; the user has access). Red-team finding #4. Resolve the open page ONCE against the DB via a new `resolveOpenPageContext`: workspace-scoped lookup + access check, returning the AUTHORITATIVE { id, title } (title from the DB row, never the client) or null (fail-closed) for a missing / foreign / inaccessible page. That validated value now feeds the system prompt, the getCurrentPage tool, AND the new-chat history origin (which previously did this validation inline, for the id only — now shared, and the title is fixed too). Tests: resolveOpenPageContext covers no-id, not-found, foreign-workspace, Forbidden, non-Forbidden-fault (fail-closed), the DB-title-wins-over-client case, and null-title coercion. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	77ccc596ea	feat(ai-chat): per-MCP-server instructions in the agent system prompt (#180 ) Admins can now give each EXTERNAL MCP server a free-text instruction ("how/when to use this server's tools") that the agent receives in its SYSTEM PROMPT next to the tool descriptions — porting the built-in SERVER_INSTRUCTIONS idea to admin-configured servers. Trusted, admin-authored text (like a system prompt); NON-secret, so unlike headersEnc it IS returned in views/forms. - Migration: nullable `instructions text` on ai_mcp_servers (old rows = null = no guidance). Table type + repo insert/update (blank/whitespace -> null via blankToNull). DTO `@MaxLength(4000)`. Service threads it through McpServerView/toView. - mcp-clients: `McpServerInstruction { serverName, toolPrefix, instructions }` threaded through the toolset/cache/lease. Guidance is built ONLY for a server that actually connected AND contributed >=1 callable tool (the allowlist may filter all of them out) AND has non-blank text — so a guide never appears for tools the agent cannot call. Cached with the toolset, so an edit is picked up next turn via the existing CRUD cache invalidation. - System prompt: `buildMcpToolingBlock` renders an <mcp_tooling> block INSIDE the safety sandwich (after context, before the trailing SAFETY_FRAMEWORK) so it informs tool choice but cannot override the rules; each section is headed by the server's `prefix_*` namespace. Empty/blank -> block omitted. The caller (ai-chat.service) now builds the external toolset BEFORE the prompt and passes external.instructions; client-handle lifecycle (close-once) unchanged. - Client: instructions field in types + a Textarea (autosize, maxLength 4000) in the MCP-server form with a namespace-prefix hint; i18n (en/ru). Tests across every layer (prompt block placement + both SAFETY copies; view blank->null; buildEntry includes guidance only for connected+>=1-tool+non-blank; DTO MaxLength; repo + integration round-trip; service wiring). Delegated impl reviewed (APPROVE); applied the import-type follow-up. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	e536c6f9a9	ci(test): run the server integration suite against real Postgres/Redis (#159 ) The only test command in CI was `pnpm -r test` (unit `.spec.ts` on mocks). `test:int` (`.int-spec.ts`, real Postgres/Redis) ran nowhere in CI — there were no DB `services:` — so the cost-cap, FK-cascade, jsonb round-trip and real AI-apply integration tests never gated a PR, and regressions in those high-severity paths stayed green (red-team finding #7). Add `services: postgres (pgvector) + redis` and a `pnpm --filter server test:int` step. The pgvector image is required because migrations create vector columns and global-setup runs `CREATE EXTENSION vector`. Service credentials/db match the defaults in apps/server/test/integration (docmost / docmost_dev_pw, maintenance db `docmost`, redis 6379), so no TEST_*_URL overrides are needed; global-setup drops/recreates the isolated docmost_test DB and migrates it. NOTE: the workflow change itself can only be validated by an actual CI run (YAML parses locally); the int-spec suite is verified passing locally on this branch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	fdaf20ca7b	fix(mcp): refuse ambiguous patch_node/delete_node on duplicated ids (#159 ) Docmost duplicates block ids on copy/paste, and copyPageContent writes the source document verbatim with the same ids. `patchNode`/`deleteNode` address a block by `attrs.id` via replaceNodeById/deleteNodeById, which act on EVERY node sharing the id — so a single patch_node/delete_node could silently replace/remove multiple unrelated blocks with no signal to the model (red-team finding #6). Guard both write paths: when more than one node matches the id, skip the write entirely (the transform returns null -> no mutation) and throw a clear "ambiguous id — N nodes share it" error so the model re-targets with a more specific anchor. Only an unambiguous single match is written; the 0-match and 1-match behavior is unchanged. The duplicate-count basis is covered by node-ops.test.mjs (replaceNodeById / deleteNodeById report count===2 for a 2-duplicate doc). The end-to-end guard is not unit-tested because patchNode/deleteNode require a live collab provider and the test suite has no provider mock. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	47a2ae420b	feat(footnotes): multi-backlinks — definition returns to ALL its references (#168 ) After #166 a repeated `[^a]` is one footnote (reuse): one number, one definition, N forward links. But the definition's ↩ only returned to the FIRST reference. Now a definition with N references shows ↩ a b c …, each backlink scrolling to its own occurrence (Pandoc/Wikipedia convention); a single-reference footnote keeps the plain ↩ unchanged. - editor-ext: `computeFootnoteRefCounts(doc)` (id -> occurrence count) cached alongside the number map in the numbering plugin state; `getFootnoteRefCount` getter (O(1), no per-render doc walk). `scrollToReference(id, index?)` picks the index-th `sup[data-footnote-ref][data-id]` occurrence (document order), falling back to the first. - client: FootnoteDefinitionView renders one lettered link (a, b, c, … aa …) per occurrence when refCount > 1; the chrome stays after the contentDOM so the #146 caret invariant holds. i18n keys (ru) added. Tests: computeFootnoteRefCounts + getFootnoteRefCount (reuse counts, unknown id => 0); structure test gains 3 cases (N lettered links render, click jumps to the n-th occurrence, single ref => one ↩). NOTE: the visual layout of the backlink row needs a real browser to verify (jsdom can't); the structural and behavioral contract is covered headless. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	1cfad1f6fb	fix(db): jsonb double-encoding follow-ups from PR #172 review (#173 ) PR #172 fixed the jsonb double-encoding for `tool_allowlist` but the same class of bug, and the same re-derived workaround, remained elsewhere. 1. model_config (agent roles): jsonbObject still used the buggy `::jsonb` bind, so `ai_agent_roles.model_config` round-tripped as a jsonb STRING SCALAR. The read-path `typeof === 'object'` check then failed and the model override was SILENTLY dropped (role fell back to the default model). Fixed to `::text::jsonb` and added `parseModelConfig` + `normalizeRow` so every read self-heals already-corrupted rows (no migration). 2. Centralized the write workaround as `jsonbBind()` in database/utils.ts — one implementation with one explanation of the quirk — replacing the per-repo `jsonbArray` (mcp) and `jsonbObject` (roles). 3. Integration coverage (the fix is a DB round-trip a unit test cannot see; the read-side parser MASKS a write regression): new ai-mcp-server-repo.int-spec asserts `jsonb_typeof(tool_allowlist)='array'` after insert + heals a seeded string-scalar row; ai-agent-roles-repo int-spec gains the same for `model_config` (`'object'` + heal). 4. Updated the stale `ai-mcp-servers.types.ts` comment (the driver returns a JSON string for legacy rows; the repo normalizes every read). 5. Fail-open logging: a corrupt tool_allowlist degrades to "no restriction" (agent gets ALL tools) — normalizeRow now warns (server id only, never contents) so the silent widening leaves a trace. 6. Simplified parseToolAllowlist (normalize the string once, then a single array-of-strings check) — identical behaviour, all 12 cases still pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	a766672574	fix(mcp): replaceImage no longer yanks the cursor (#164 ) `mutateLiveContentUnlocked` — the write path used by `replaceImage` — still did the pre-#152 destructive write (delete the whole fragment + applyUpdate a fresh Y.Doc), discarding every Yjs node id. y-prosemirror anchors the editor selection to those ids, so an open editor's cursor snapped to the document end on every image swap, exactly the #152 jump that the main write path no longer causes. Switch it to the same `applyDocToFragment(ydoc, newDoc)` structural diff (updateYFragment) as the main path, so unchanged nodes keep their ids and the live cursor stays put. It runs its own atomic transact, so the old explicit transact/delete is gone; the now-unused docmostExtensions import is dropped. Regression tests (cursor-stability suite): a sibling paragraph's RelativePosition survives a top-level image src/attachmentId swap, and an image nested in a callout, matching the shapes replaceImage produces. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	5e8cb628f0	feat(ai-chat): compact reasoning rendering — collapse blank lines (#181 ) The "Thinking" (reasoning) block rendered with large vertical gaps: models emit reasoning with a blank line (\n\n) between every list item and paragraph, which `marked` turns into loose lists (each <li> wrapped in a <p>) and separate <p> paragraphs, each carrying a margin. - Add `collapseBlankLines(text)`: collapse 2+ newlines to one, EXCEPT inside fenced code blocks (``` / ~~~) where blank lines are significant. Applied in reasoning-block.tsx before renderChatMarkdown, so loose lists become tight (no <li><p>) and paragraphs join; `breaks: true` keeps single \n as <br>, preserving line breaks. Reasoning-only — the normal answer is untouched. - Drop `white-space: pre-wrap` from `.reasoningText`: on the rendered markdown <div> it turned the newlines between block tags into visible blank lines on top of the margins. The plain-text fallback <Text> that needs pre-wrap already sets it inline. Tests: collapseBlankLines unit (collapse, fence preservation incl. tilde and unclosed fences) + rendered-HTML assertions that a blank-line-separated list becomes a tight list and still parses as a list after a paragraph. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	8413185a1d	fix(ai-chat): tick the live token counter between agent steps (#163 ) The header token badge (and the "Thinking… · N tokens" line) froze between agent steps and jumped in chunks instead of ticking smoothly. liveTurnTokens returned the authoritative server `usage` VERBATIM as soon as it appeared, but the server only attaches usage at a step boundary and it is cumulative over COMPLETED steps — so during the next (in-flight) step the figure stayed frozen at the previous boundary and the running text estimate was ignored. Combine both sources per component via max: always compute the running estimate (chars/≈4 over the message's reasoning/text parts, which includes the in-flight step) and take max(authoritativeBase, estimate). Between boundaries the estimate ticks the number up; at a boundary the authoritative figure snaps it exact; and because the server usage is cumulative and we only ever take the max, the counter is monotonic (never drops). Reasoning/output stay split; the #151 reasoning-only authoritative count is preserved. Backward compatible: in every existing test the estimate is <= the authoritative figure, so max returns the same value. +4 tests for the in-flight-step-exceeds-base case (output + reasoning), the authoritative-wins case, and monotonicity. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude_code	8fee6a86c2	fix(ai-chat): style GFM tables in assistant chat markdown Assistant answers containing GFM tables rendered badly in the narrow AI side panel: `.markdown` only styled p/pre/code/ul/ol and had no table rules, so tables used the browser default `table-layout: auto`. Combined with the inherited `word-break: break-word`, columns collapsed to a single glyph and headers wrapped mid-word ("Секция" -> "Секци / я"). Add table styling scoped to `.markdown`, in line with the editor's table.css house style: - make the table a horizontally scrollable block (display:block + overflow-x:auto) so wide tables scroll instead of squishing; - give cells a 6em min-width and restore word-boundary wrapping (word-break:normal + overflow-wrap:break-word); - add 1px borders, padding and a th background (light-dark for dark mode); zero out the default <p> margin inside cells. CSS-only; no markdown-pipeline change (marked already emits GFM tables, DOMPurify already allows table tags). Applies to the public share too. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-25 10:53:58 +03:00
claude code agent 227	ae6faf3abc	fix(ai-chat): guard step-update vs finalize race with WHERE status='streaming' (#183 review) Review caught a real race: onStepFinish fires `updateStreaming()` fire-and-forget (not awaited), so the FINAL step's streaming UPDATE and the terminal `finalizeAssistant` UPDATE run as two concurrent statements on different pool connections — commit order is not guaranteed. If the late streaming update lands AFTER finalize, the completed row is clobbered back to status='streaming' with no usage/finishReason, and the next startup sweep then mis-marks the finished turn 'aborted'. Green unit/integration tests don't reproduce a cross-connection race. Fix: scope the per-step update with `onlyIfStreaming` → SQL `WHERE status='streaming'`. Once finalize has set a terminal status the late update matches zero rows and no-ops, regardless of commit order; finalize runs unguarded so it always wins. A cheap `if (finalized) return` short-circuit avoids most wasted queries, but the SQL guard is the authoritative fix (the flag can be set after a query is already in flight). Integration test: finalize to 'completed', then a late onlyIfStreaming update is a no-op — status/content/usage preserved. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 06:14:02 +03:00
claude code agent 227	e7b719bbb8	feat(ai-chat): persistent history as source of truth — step durability + server export (#183 ) The chat lived in inconsistent paradigms (in-memory stream + client export vs. DB-as-context), which made export flaky and lost the assistant answer if the process died mid-turn. Make the DB the single source of truth. A. STEP-GRANULAR DURABILITY (server) - ai_chat_messages gains a nullable `status` column (migration; NULL = legacy = completed). The assistant row is now INSERTED UPFRONT as `status:'streaming'` and UPDATEd on every onStepFinish with all finished steps (text + tool calls + tool RESULTS), then finalized once to completed/error/aborted on the terminal callback. So a process death mid-turn keeps every finished step; a startup sweep (OnModuleInit → sweepStreaming) flips any dangling 'streaming' row to 'aborted'. The write path no longer depends on a live socket. - Pure exported `flushAssistant(steps, inProgressText, status, extra?)` builds the persist payload (metadata.parts byte-identical to the old builder), so a future background worker can call the same path. AiChatMessageRepo gains `update`, `sweepStreaming`, and `findAllByChat`. - consumeStream drain, external-MCP client close-once, SSE heartbeat preserved. B. SERVER-SIDE EXPORT - New pure `chat-markdown.util.ts` renders Markdown from DB rows ONLY (server port of the client builder). Because A persists the in-progress row, the export now includes an interrupted turn up to its last finished step (flagged "still generating"). `POST /ai-chat/export` (owner-gated via assertOwnedChat, workspace-scoped) returns it; `lang` accepts a full client locale tag ('en-US'/'ru-RU') and is normalized server-side (normalizeLang) — a strict @IsIn(['en','ru']) DTO rejected the real client's i18n.language with a 400, caught in real-browser testing. - Client: handleCopy calls the endpoint; `canExport = !!activeChatId`. The whole liveThreadRef/liveStateRef/onLiveContentChange/hasLiveContent hybrid (and the client chat-markdown util + test) is removed — the server is now authoritative. Tests: flushAssistant unit (status shapes + parts parity), chat-markdown.util unit (incl. legacy NULL-status + interrupted note + ru + normalizeLang locale tags), controller export wiring + owner-gate, integration update/sweepStreaming. Verified: server build + 318 ai-chat unit + 3 integration; client tsc + 157 ai-chat unit; and END-TO-END in a real browser — a chat turn persists mid-stream and the Copy button exports the DB-sourced markdown (showing the in-progress row), HTTP 200 after the locale fix. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 06:05:26 +03:00
claude_code	27c91e4a69	feat(ai-chat): bound external MCP tool calls with per-call timeouts External MCP tools (web search, crawl) had no per-call timeout: a hung tool call was only broken by the 15-min transport silence timeout shared with the chat provider, and a server that kept the socket warm but never returned could spin until the user cancelled. Add two independent, composing bounds for external MCP traffic (the chat provider path is unchanged): - Silence 5 min: buildPinnedDispatcher now overrides headersTimeout/ bodyTimeout with mcpStreamTimeoutMs() (AI_MCP_STREAM_TIMEOUT_MS, default 300000) on the external-MCP dispatcher only, so a byte-silent upstream is severed in ~5 min instead of 15. - Total per-call 15 min: wrapToolWithCallTimeout wraps each external tool's execute with a fresh AbortController + timer composed with the turn signal via AbortSignal.any (AI_MCP_CALL_TIMEOUT_MS, default 900000). It RACES the call against the abort signal because @ai-sdk/mcp does not settle its in-flight promise on abort, so a warm-but-stuck call would otherwise hang forever. On timeout the call surfaces as a tool-error and the agent loop recovers. Add tests (incl. a never-settling real-client-style stub) and document both env vars in .env.example.	2026-06-25 04:43:49 +03:00
claude_code	c3596dce68	Merge branch 'develop' of https://gitea.vvzvlad.xyz/vvzvlad/gitmost into develop	2026-06-25 03:59:41 +03:00
claude_code	b6787cc542	fix(ai-chat): drain stream on client disconnect to stop heap-OOM leak The /api/ai-chat/stream and public-share streaming paths piped streamText output to the client socket via pipeUIMessageStreamToResponse, whose only reader is that socket. On a client disconnect (pervasive Safari/proxy ECONNRESET), backpressure stalled the stream: the controller aborted the turn but nothing drained it, so streamText's onFinish/onError/onAbort never fired. Cleanup (close leased MCP clients, persist partial) never ran and the whole per-turn object graph (history, per-request toolset closures, captured steps, SDK buffers) stayed rooted — accumulating across turns until the default ~2GB heap saturated and the process crashed with "Ineffective mark-compacts near heap limit - JavaScript heap out of memory". Add the AI SDK v6 documented remedy: fire-and-forget `result.consumeStream({ onError })` right after streamText(), which removes backpressure and drains the stream independently of the client socket so the terminal callbacks always fire and the turn's memory is released even when the client has gone away. Applied to both the authenticated and public-share stream services. Also add `--heapsnapshot-near-heap-limit=2` to the prod start script so any residual leak dumps a heap snapshot near OOM for diagnosis (no effect on normal operation). Heap size stays ops-tunable via NODE_OPTIONS. - apps/server/src/core/ai-chat/ai-chat.service.ts - apps/server/src/core/ai-chat/public-share-chat.service.ts - apps/server/package.json Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-25 03:59:32 +03:00
vvzvlad	176b0f575f	Merge pull request 'fix(ai-chat): WYSIWYG Copy chat export + first-turn export (#160, #174)' (#165) from fix/ai-chat-copy-chat-wysiwyg into develop Reviewed-on: #165	2026-06-25 03:54:34 +03:00
claude code agent 227	df81851eb3	fix(ai-chat): export the first unsaved turn (#174) The "Copy chat" button was hidden during a brand-new chat's very first turn: both the `canExport` gate and the `handleCopy` early-return required an `activeChatId` AND persisted `messageRows`, neither of which exists yet while the first turn is streaming or after it was interrupted before any row was persisted. Decouple the export gate from persisted state. ChatThread now reports a reactive `onLiveContentChange(messages.length > 0)` signal (the live snapshot lives in a non-reactive ref, so a separate reactive flag is needed to re-render the button); the parent keeps it in `hasLiveContent` and exports whenever there is anything on screen OR persisted. `handleCopy` passes a `"unsaved"` placeholder chat id when none exists yet, and the live-first builder serializes the on-screen thread WYSIWYG. Builds on #160 (WYSIWYG export); covers the first-turn edge case that was explicitly out of scope there. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 03:52:03 +03:00
claude code agent 227	4597183a1e	fix(ai-chat): WYSIWYG Copy chat export keeps the on-screen partial reply (#160) "Copy chat" built the Markdown from persisted rows plus a live tail that was only included while isStreaming. When a turn was interrupted (dropped stream / "Lost connection" banner) isStreaming flipped false, the live tail was dropped, and the partial assistant reply visible on screen — whose row often never persisted — vanished from the export, leaving only the user messages. - buildChatMarkdown is now live-first: the on-screen `live` messages ARE the document. Each is matched to a persisted row by id to enrich it with token usage / error / timestamp; authoritative usage/error already on the live message win over the row. When `live` is empty it falls back to the persisted rows (old format preserved). Only the tail assistant is flagged "still generating", and only when it is genuinely the streaming tail — so the status==="submitted" window (tail is the user message) never mislabels the previous, completed answer. - The on-screen banner (classified error / dropped connection / manual stop) is flattened to a string in ChatThread, mirrored into liveStateRef alongside the messages/isStreaming snapshot, and appended at the end of the export. - handleCopy maps the live messages and passes live/rows/isStreaming/banner. Tests: chat-markdown rewritten for the live/enrichment/fallback/banner paths and the submitted-window regression (26); full ai-chat suite green (186). tsc clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 03:42:43 +03:00
claude_code	99d0cb8773	perf(ai-chat): throttle stream + memoize markdown to stop CPU spikes on long runs On long agent runs (dozens of tool calls) the desktop app froze at 100% CPU with no user interaction: useChat updated state on every streamed token, and MessageItem/ReasoningBlock re-parsed the whole transcript's markdown (the marked pipeline + DOMPurify) on every delta. Per-turn work grew quadratically and saturated the main thread; the SSE stream drove it, so it hung "on its own". - chat-thread: pass experimental_throttle (50ms) to useChat so the streamed messages state re-renders at most ~20 Hz instead of once per token. - message-item: memoize MessageItem on a cheap per-message content signature (the streaming tail still re-renders as it grows; finalized rows are skipped), and render each text part via a memoized MarkdownPart so finalized parts are not re-parsed. The signature includes usage.reasoningTokens so the authoritative "Thinking - N tokens" count still snaps in at finish-step. - reasoning-block: memoize the markdown render (useMemo on the text) and wrap the component in React.memo. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-25 03:26:44 +03:00
claude_code	5aa199660d	fix(ai-chat): keep thinking dots visible between streamed steps showTypingIndicator hid the standalone thinking dots for any non-empty trailing text part, so during the pause after the model finished an intermediate narration and before its next step (e.g. a tool call) the UI looked frozen. Suppress the dots only while the text part is still streaming: a finalized ("done") trailing text part on an in-flight turn now shows the dots again, matching the function's documented intent. - message-list: guard the text branch with state !== "done" (AI SDK v6 TextUIPart.state); stateless parts keep their previous behavior - show-typing-indicator.test: add done -> shown and streaming -> hidden cases Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-25 00:34:22 +03:00
claude_code	bf2ebb9d47	fix(ai-chat): increase bottom margin for typing indicator name The name label was crowding the bouncing dots when displayed. Adding extra bottom margin (mb={8}) gives the dots room and improves readability. The change only applies when the name is shown.	2026-06-25 00:21:53 +03:00
claude_code	ad90e2290e	Merge branch 'develop' of https://gitea.vvzvlad.xyz/vvzvlad/gitmost into develop	2026-06-25 00:11:52 +03:00
vvzvlad	e262f1695c	Merge pull request 'fix(ai-chat): recycle keep-alive sockets + retry pre-response resets (#175)' (#179) from fix/ai-stream-reset-resilience into develop Reviewed-on: #179	2026-06-25 00:11:50 +03:00
claude code agent 227	c065e26d14	refactor(ai): retry outside instrumentation + retry-exhaustion test (#179 review) - Invert the transport layers so the pre-response retry is OUTERMOST and the provider-HTTP instrumentation is INNER. Before, the retry lived inside createStreamingFetch (under the instrumentation), so a reset the retry recovered from logged only a clean "OK status=200" — the "PRE-RESPONSE FAILED ... ECONNRESET ... idleSincePrevCall" signal went blind exactly when the fix works, and AI_STREAM_KEEPALIVE_MS couldn't be tuned from prod data. Now createStreamingFetch is the dispatcher-bound BASE (no retry) and a new withPreResponseRetry() wraps it; ai.service composes withPreResponseRetry(createInstrumentedFetch('AiService:provider-http', createStreamingFetch())), so every attempt — including recovered resets — flows through the instrumentation. (Also expresses the keepAlive-config vs retry- behavior boundary structurally, per review #3.) - Add the retry-exhaustion test: a server that resets EVERY connection, asserting the call rejects with a retryable connection error AND exactly PRE_RESPONSE_CONNECT_RETRIES + 1 (= 3) requests reached the server — pinning the bound and that the final error propagates (guards an off-by-one / infinite loop / swallowed error). Existing happy-retry + abort tests moved onto withPreResponseRetry. Verified on the stand: a normal turn still streams (reasoning + finish) and the provider-HTTP telemetry still logs. server tsc + ai/mcp specs green (30). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 00:10:40 +03:00
claude_code	91e7335d54	refactor(ai-chat): drop thinking-token text from typing indicator The live typing placeholder now shows only the bouncing dots; the "Thinking… · N tokens" line is removed. Clean up the dead plumbing: - typing-indicator: remove thinkingTokens prop, thinkingLine and the <Text> line; keep the animated dots and the dimmed name label - message-list: remove tailThinkingTokens helper, the thinkingTokens prop pass-through, and the now-unused liveTurnTokens import - delete tail-thinking-tokens.test.ts (tested the removed helper) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-25 00:02:44 +03:00
claude code agent 227	b0faa2fe32	fix(ai-chat): recycle keep-alive sockets + retry pre-response resets (#175) The real cause of the long-task "Lost connection to the AI provider" — the earlier 300s-timeout fix (#176) was the wrong layer. The provider-HTTP telemetry on the user's deploy shows the failures are PRE-RESPONSE `read ECONNRESET` ~500ms in (not a 300s/15min timeout), correlated with idleSincePrevCall ~42s and large bodies; and crucially a retry of the SAME request often succeeds. A direct probe to the real z.ai endpoint does NOT reset (113KB bodies and a 45s-idle keep-alive reuse both succeed), and another agent (opencode) runs fine from the same infra — so the provider is healthy and the egress network is usable. The difference is the transport: undici's keep-alive pool REUSES a socket that the deployment's egress (NAT / firewall / conntrack) silently dropped during a long idle gap, so the next request resets pre-response. Fix (brings gitmost in line with clients that don't reuse stale sockets): - Keep-alive recycling: the streaming dispatcher (chat fetch AND the external-MCP dispatcher, via the shared streamingDispatcherOptions) now sets keepAliveTimeout + keepAliveMaxTimeout to a 10s recycle window (AI_STREAM_KEEPALIVE_MS), so a connection idle longer than that is closed instead of reused — a long-gap step opens a fresh connection. keepAliveMaxTimeout also caps a server-advertised keep-alive so the provider can't widen the window. - Pre-response connection retry: createStreamingFetch retries a connection-level reset (ECONNRESET / UND_ERR_SOCKET / ECONNREFUSED / EPIPE / *_TIMEOUT) on a fresh connection up to 2 times. This is SAFE because fetch() only rejects before the Response resolves — a started stream is never replayed; an abort (client disconnect) is never retried. 
Tests: ai-streaming-fetch.spec — keep-alive options, streamKeepAliveMs env, isRetryableConnectError, and a server that resets the first connection so the retry must land on a fresh one (+ aborted requests are not retried). Verified on the stand that a normal turn still streams (reasoning + text + finish) through the new transport. server tsc + ai/mcp specs green. Note: root cause is the deployment's egress dropping idle connections (Traefik is inbound-only); this makes the app resilient to it. AI_STREAM_KEEPALIVE_MS can be lowered if the egress drops faster than ~10s. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 23:51:17 +03:00
claude_code	d1fbcc1bfa	Merge pull request 'feat(ai-chat): surface reasoning from openai-compatible providers (z.ai/GLM) (#175)' (#177) from feat/reasoning-openai-compatible into develop	2026-06-24 23:19:15 +03:00
claude code agent 227	6edbbab43b	refactor(ai): unify provider-settings allowlist + stronger chatApiStyle tests (#177 review) Addresses the second #177 review: - Architecture (the silent allowlist drift): the writable provider-setting keys were maintained by hand in two TS-uncheckable places — the key-loop in ai-settings.service and the SQL ALLOWED list in the generic workspace repo (a miss there silently dropped a field on persist, exactly what bit chatApiStyle). Introduce one typed source of truth PROVIDER_SETTINGS_KEYS in ai.types (`satisfies readonly (keyof AiProviderSettings)[]`), have the service consume it, and keep the repo's own copy (it can't import AI types) guarded by a parity test so any future drift fails in CI. - Tests: - ai.service.include-usage.spec: mocks @ai-sdk/openai-compatible and asserts the factory is called with { includeUsage: true, baseURL, apiKey, fetch, name } — `.provider` alone could not catch a dropped includeUsage (the token-usage zeroing regression); also asserts the 'openai' style does NOT use it. - ai-provider-settings-keys.spec: the allowlist parity check + DTO validation for chatApiStyle (@IsIn accepts both values, rejects garbage, optional). - CHANGELOG: [Unreleased] entries for the new "Protocol" / chatApiStyle setting and the default provider change (openai -> openai-compatible). (#175, #177) server + client tsc clean; 42 ai/settings specs green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 23:18:31 +03:00
claude code agent 227	59190148db	feat(ai-chat): explicit chatApiStyle selector to surface reasoning (#175) Rebuilt on develop (after #176) and reworked per review: instead of inferring the provider from baseUrl (`if (baseUrl)`), the admin picks the chat provider EXPLICITLY via a new `chatApiStyle` ('openai-compatible' | 'openai'), mirroring the existing sttApiStyle. A custom baseURL can front real OpenAI too, so the heuristic was fragile. Why reasoning was missing: glm-5.2 (and DeepSeek etc.) stream their thinking as `reasoning_content`, but the official @ai-sdk/openai provider does not map that field. 'openai-compatible' uses @ai-sdk/openai-compatible, which does — so reasoning parts now stream (verified live: reasoning-start/delta/end appear, and disappear when set to 'openai'). - Default (unset) = 'openai-compatible', so existing openai+baseUrl workspaces surface reasoning with no admin action. No DB migration (field lives in the settings.ai.provider JSON blob). - includeUsage: true on the openai-compatible model — without it the provider omits streamed usage, zeroing the live token counter / reasoning-token metadata. The official provider always sent it; this keeps parity. (Confirmed live: usage.totalTokens present.) - openai-compatible has no default endpoint, so with no baseURL (real OpenAI, or a role's cross-driver override that cleared it) it falls back to the official provider. Plumbing: ai.types (ChatApiStyle / CHAT_API_STYLES + AiProviderSettings / MaskedAiSettings), update DTO (@IsIn), ai-settings.service (resolve / getMasked / update allowlist), workspace.repo updateAiProviderSettings ALLOWED (the second, SQL-level allowlist the review missed — without it the field never persisted), ai.service selector. Client: ai-settings-service types + a Protocol <Select> in the chat section + i18n (en/ru). Scope is chat-only (embeddings don't stream reasoning; STT already has sttApiStyle). 
Tests: ai.service.spec — 4 cases (openai-compatible+baseURL, openai+baseURL, default-unset, openai-compatible-without-baseURL fallback). Verified on the stand: default streams reasoning + usage; 'openai' drops reasoning; the setting round-trips. server + client tsc clean; 36 ai/settings specs green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 22:58:15 +03:00
vvzvlad	80a4b5a1b0	Merge pull request 'fix(ai-chat): don't sever long agent turns at undici's 300s stream timeout (#175)' (#176) from fix/ai-stream-undici-timeout into develop Reviewed-on: #176	2026-06-24 22:34:18 +03:00
claude code agent 227	da15b55786	refactor(ai): address PR #176 review — finite-timeout wording, env doc, tests, permanent provider-http module - Wording: every comment now says the stream timeouts are RAISED to a generous-but-finite ~15-min silence timeout, not "disabled (0)" (the stale comments contradicted the code, which uses AI_STREAM_TIMEOUT_MS, default 900000ms). - Architecture (the load-bearing-temporary trap): the streaming fetch reached the chat provider only by riding the "temporary DIAGNOSTIC" telemetry, so deleting the telemetry by its own label would silently revert the timeout fix. Legitimize it: rename ai-http-diagnostics.ts -> ai-provider-http.ts, createDiagnosticFetch -> createInstrumentedFetch, field aiDiagnosticFetch -> aiProviderFetch, drop the "temporary" labels, and document the chat transport (streaming fetch + instrumentation) as one intentional construct. - Docs: AI_STREAM_TIMEOUT_MS added to .env.example next to AI_EMBEDDING_TIMEOUT_MS. - Tests: - ai-provider-http.spec: createInstrumentedFetch delegates to the injected baseFetch with the same input/init, returns the Response untouched, rethrows the error, and defaults to global fetch — covering the baseFetch seam. - ai-streaming-fetch.spec: the delayed-server test is now LOAD-BEARING — with AI_STREAM_TIMEOUT_MS set below the 1.5s server delay the call actually rejects (a lost dispatcher -> global 300s default would NOT), proving the configured dispatcher is wired; plus the default-timeout happy path. server tsc clean; ai-streaming-fetch / ai-provider-http / ai.service / mcp-servers / ai-error specs green (41). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 22:31:58 +03:00
claude code agent 227	a14560c7c9	fix(ai-chat): raise undici's 300s stream timeout for long agent turns (#175) Long research turns failed mid-task with "Lost connection to the AI provider". Node's global fetch (undici) defaults BOTH headersTimeout and bodyTimeout to 300_000ms, and the chat provider + the external-MCP dispatcher both ran on it with no override, so: - the z.ai chat stream dropped when a late step's huge accumulated context pushed the model's time-to-first-token past 5 min (the model reasons server-side with NO streamed reasoning, so the connection is silent until the first answer token — reproduced: even a trivial glm-5.2 query has a ~4-8s first-chunk gap; a long run reaches 400k+-token steps), or a reasoning model paused >5 min between chunks (bodyTimeout); - the crawl4ai SSE transport, held open across the whole turn, dropped when it idled >5 min between tool calls. Fix: a dedicated undici dispatcher whose stream timeouts are raised to a generous-but-FINITE silence timeout (default 15 min, AI_STREAM_TIMEOUT_MS) on each path. NOT disabled (0): that would let a genuinely hung provider — with the client still connected — hang forever, since the turn's abortSignal only fires on client disconnect. The timeout bounds SILENCE (time-to-first-byte and the gap BETWEEN chunks), NOT total turn duration, so an arbitrarily long turn that keeps streaming is never cut; only a stream quiet for >15 min is treated as a hang. - ai-streaming-fetch.ts: createStreamingFetch() + streamTimeoutMs() / streamingDispatcherOptions() (the shared, configurable timeout). - ai.service: the chat provider fetch is createStreamingFetch(), wrapped by the existing passive ECONNRESET telemetry (createDiagnosticFetch gained an optional baseFetch) so the telemetry observes the SAME transport. - mcp-clients: the SSRF-pinned Agent uses streamingDispatcherOptions(). 
Investigation: reproduced the transport mechanism against the real z.ai endpoint (a 1ms headersTimeout throws UND_ERR_HEADERS_TIMEOUT — the exact drop) and ran the actual research agent to a ~428k-token context. Verified the fixed path streams cleanly live (glm-5.2 turns finish; telemetry confirms the streaming fetch is in use). Tests: ai-streaming-fetch.spec (default 15m + env override + invalid fallback + both-timeouts + streams a delayed response); ai-http-diagnostics + ai/mcp specs green. server tsc clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 22:09:10 +03:00
claude_code	4cc8df836f	chore(ai): passive z.ai provider HTTP telemetry (#175) Investigate the intermittent (~20-30%) long-turn failure "Lost connection to the AI provider" = AI_RetryError / read ECONNRESET on the gitmost->z.ai link (browser-agnostic, mid-turn). Pure instrumentation, no behavior change: - ai-http-diagnostics.ts: a passive fetch wrapper injected into the OpenAI-compatible (z.ai) client. Per provider HTTP call it logs time-to-headers/status on success, and on a pre-response rejection the latency, error code/cause, request-body size and idle-gap since the previous call. The Response is returned untouched (streaming intact), errors rethrown unchanged; no retry/timeout/dispatcher. - ai.service.ts: wire the instrumented fetch into the openai case only. Lets us classify the reset as connection-phase vs mid-stream before choosing a fix, without repeating the reverted RetryAgent (#140). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 21:24:05 +03:00
claude_code	04a418e1a6	Merge pull request 'fix(mcp): tool allowlist stored/read as jsonb string, not array (edit-page crash + allowlist not enforced)' (#172) from fix/mcp-tool-allowlist-jsonb-shape into develop	2026-06-24 17:14:56 +03:00
claude code agent 227	255bc06883	fix(mcp): tool allowlist stored/read as jsonb string, not array Opening the edit form for an MCP server that has a saved tool allowlist crashed the whole settings page (`TypeError: Ke.map is not a function` in Mantine) — and, worse, the allowlist was silently NOT enforced. Both stem from one root cause: the `tool_allowlist` jsonb column round-trips as a JSON STRING, not an array. Root cause: `jsonbArray` bound `JSON.stringify(value)` (already a JSON string) straight to a `::jsonb` cast. node-postgres infers the param type as jsonb and JSON-stringifies it a SECOND time, so the column stored a jsonb STRING SCALAR (`"[\"a\"]"`, jsonb_typeof = string) instead of an array. On read the driver hands back the JS string `'["a"]'`. Then: - the edit form's TagsInput called `.map` on a string -> page crash; - mcp-clients did `Array.isArray(allow)` -> false for a string -> fell through to "no restriction" and exposed ALL of the server's tools. Fix (both verified on the stand): - Write: `jsonbArray` casts `::text::jsonb` so the param is bound as text (sent verbatim) and parsed into a real jsonb array. New rows now store jsonb_typeof=array. - Read: `normalizeRow` runs every fetched row through `parseToolAllowlist`, which returns `string[] | null` for both shapes (already-array passes through; a JSON string is parsed; null/invalid -> null). This REPAIRS existing double-encoded rows on read, so the UI and the allowlist enforcement work without a data migration. Applied in findById / listByWorkspace / listEnabled. - Client: defensive `Array.isArray(...) ? ... : []` guard in the form so a bad shape can never take the settings page down again. Tests: ai-mcp-server.repo.spec (8 cases for parseToolAllowlist — array, the JSON-string read, null, empty, non-array json, unparseable, non-string elements, non-string primitive). mcp-servers-to-view + mcp-namespacing still green. 
Verified live: an old double-encoded row now reads as an array; a newly created server stores jsonb_typeof=array. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 17:11:50 +03:00
vvzvlad	8c06553b49	Merge pull request 'test(footnotes): cover footnoteWarnings import plumbing + doc fixes (#169 second review)' (#171) from fix/footnote-review-1227-followup into develop Reviewed-on: #171	2026-06-24 16:46:23 +03:00
claude code agent 227	0e8af13122	test(footnotes): cover footnoteWarnings import plumbing + doc fixes (#169 second review) Follow-up to the merged #166/#169. Addresses the second review pass (comment 1227): - footnoteWarnings plumbing: extract a single `footnoteWarningsField(markdown)` helper (footnote-analyze) and use it at all three call sites (create_page, update_page, import_page_markdown) so the field is attached identically. - New unit test footnote-warnings-import.test.mjs pins the contract that was uncovered: the field is present on problems / omitted on clean input, and the IMPORT path analyzes the BODY after the docmost:meta / docmost:comments blocks (a footnote-like token inside those JSON blocks must NOT warn; a real body marker must). Tested via the same pure composition the importer uses (footnoteWarningsField(parseDocmostMarkdown(full).body)) — no collab socket needed; a regression that analyzed fullMarkdown or skipped the body split would now go red. - footnote.marked.ts: correct the stale module header — it claimed "only definitions that have a matching reference are emitted", which was never true (orphan defs are emitted; the editor sync plugin reconciles). Now describes first-wins + reuse + sync reconciliation. - derive-id golden test: rename the describe from "(cross-package drift guard)" to "(deterministic-scheme pin)" — there is no second package to drift against. editor-ext 129, MCP 304 (+3), client+server tsc clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 16:44:53 +03:00
claude_code	b9056e2bee	Merge pull request 'feat(footnotes): reuse semantics + import diagnostics (#166)' (#169) from feat/footnote-reuse-and-warnings into develop	2026-06-24 16:38:59 +03:00
claude code agent 227	a0cc625dfe	refactor(footnotes): address PR #169 review - footnote-sync: remove the now-dead `refReids` (CollisionPlan field, local, return, the 6a consumer loop) — references are never re-id'd under reuse, so it was dead structure on the hot reconciliation path. Rewrite the stale comments (plugin header, step 0, refOccurrences field) that still described the old "duplicates re-id'd so both survive" model to the reuse model. - Shared footnote lexer: new packages/mcp/src/lib/footnote-lex.ts (lexFootnoteLines + forEachFootnoteReference). extractFootnotes (collaboration) and analyzeFootnotes now consume the SAME fence-aware lexer, so "the analyzer sees exactly what the importer keeps/strips" is structural, not comment-kept. Removed the duplicated DEF_RE/fence machine from both consumers. - Tests: new mock test for the footnoteWarnings plumbing on createPage (problems -> field present; clean -> omitted); new paste-reuse case for TWO colliding pasted definitions (reservation -> distinct ids). Updated the derive-id golden test header (no MCP copy / parity test anymore). - CHANGELOG: [Unreleased] entries for footnote reuse (Changed, supersedes 0.93.0) and footnoteWarnings (Added). editor-ext 129, MCP 301, server roundtrip 2; client+server tsc clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 16:16:30 +03:00
claude code agent 227	17e683a311	feat(footnotes): reuse semantics + import diagnostics (#166) Footnotes were strict 1:1: a repeated `[^a]` reference was treated as a collision and re-id'd to `a__2`, and a reference with no definition synthesized its own empty one — so an agent-authored article with reused labels produced dozens of empty `kowiki__N` footnotes. Move to Pandoc REUSE semantics and add non-fatal import diagnostics. Reuse (core): - resolveCollisions (footnote-sync): repeated references sharing an id are REUSE (recorded once in document order, never re-id'd) — one number, one shared definition. Only a duplicate DEFINITION is re-id'd deterministically and, with no matching reference, dropped by the existing orphan policy (first-wins). CollisionPlan.refReids is now always empty (harmless no-op downstream). - extractFootnoteDefinitions (marked) and extractFootnotes (MCP): duplicate definition ids are FIRST-WINS (keep first, drop rest); reference markers are never rewritten. Removed the marker-rewriting and the now-dead deriveFootnoteId mirror + helpers from the MCP path. Import diagnostics: - New analyzeFootnotes() (MCP): fence-aware pure scan reporting dangling references, empty/duplicate definitions and `[^id]` markers inside table rows. - createPage / updatePage / importPageMarkdown now attach `footnoteWarnings` (only when non-empty) so an agent can fix its markup; the page is still created. Paste-reuse: - footnotePastePlugin remaps only ids the pasted slice DEFINES (a colliding definition); a pasted lone reference to an existing id keeps it (reuse). Tests: reuse/first-wins rewrites of footnote.test, footnote-markdown.test, footnote.marked.orphan.test and the MCP footnotes.test; new footnote-paste.test (editor-ext) and footnote-analyze.test (MCP). Deleted derive-id-parity.test.mjs (the MCP no longer derives ids; editor-ext's deriveFootnoteId keeps its own golden test). editor-ext 128, MCP 299, server roundtrip 2, client views 3, client+server tsc clean. 
Two review suggestions applied: corrected a stale "duplicated in MCP" comment and the dangling-reference warning wording. Note: the multi-backlink editor UI (a reused definition linking back to each of its references) is deferred to a follow-up — this PR delivers the data-integrity core (reuse + warnings + paste-reuse). Forward links and numbering already reuse correctly; the backlink currently targets the first reference. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 15:34:41 +03:00
claude_code	13cac155c1	chore(ai-chat): add temporary Safari stream-drop diagnostics Investigate the Safari-only "Lost connection to the AI provider" mid-stream disconnect (Chrome unaffected). Pure instrumentation, no behavior change: the 15s heartbeat interval and all stream callbacks are unchanged. - sse-resilience.ts: startSseHeartbeat() gains an optional onBeat hook fired after each successfully written ping (beat counter). - ai-chat.service.ts: track stream start, first-chunk latency, model-silent gap and heartbeat count; log them on finish/error/abort to classify the drop (idle-gap vs hard wall-clock cap vs slow first chunk). - ai-chat.controller.ts: append elapsed-since-request to the disconnect warn. All blocks tagged "DIAGNOSTIC ... temporary" for easy removal once the Safari failure mode is identified. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-24 15:14:29 +03:00
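
Note on 255bc06883 (tool allowlist read repair): the tolerant read described in that commit can be sketched roughly as below. This is a minimal illustration, not the repository's `parseToolAllowlist`; the names `parseToolAllowlistSketch` and `simulateDoubleEncodedRead` are hypothetical, and rejecting arrays with non-string elements (rather than filtering them) is an assumption, since the log only says invalid input maps to null.

```typescript
// Hypothetical sketch of a tolerant allowlist reader; not the repo's code.
// Accepts either a real array or a JSON string holding an array.
function parseToolAllowlistSketch(raw: unknown): string[] | null {
  let value: unknown = raw;
  if (typeof value === "string") {
    // Double-encoded rows come back from the driver as a JS string.
    try {
      value = JSON.parse(value);
    } catch {
      return null; // unparseable string
    }
  }
  if (!Array.isArray(value)) return null; // null, object, number, ...
  // Assumption: any non-string element invalidates the whole list.
  return value.every((v) => typeof v === "string") ? (value as string[]) : null;
}

// Reproduces the bug shape: stringify twice on write, parse once on read.
function simulateDoubleEncodedRead(list: string[]): unknown {
  const stored = JSON.stringify(JSON.stringify(list)); // jsonb string scalar
  return JSON.parse(stored); // the JS string '["a"]', not an array
}
```

Passing the result of `simulateDoubleEncodedRead(["a"])` through the sketch yields a real array again, which is the "repair on read, no data migration" behavior the commit describes.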

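Note on b0faa2fe32 (pre-response retry): the commit lists which connection-level failures are retried and states that an abort is never retried. A classifier along those lines might look like the sketch below. It is an assumption-laden illustration, not the repository's `isRetryableConnectError`: the name `isRetryableConnectErrorSketch` is hypothetical, and looking for the code on both the error and its `cause` is a guess at how the code is carried.

```typescript
// Hypothetical sketch; codes are taken from the commit message above.
const RETRYABLE_CODES = new Set([
  "ECONNRESET",
  "UND_ERR_SOCKET",
  "ECONNREFUSED",
  "EPIPE",
]);

function isRetryableConnectErrorSketch(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { name?: unknown; code?: unknown; cause?: unknown };
  // An abort (client disconnect) must never be retried.
  if (e.name === "AbortError") return false;
  // Assumption: the code may sit on the error itself or on its cause.
  const cause = (typeof e.cause === "object" && e.cause !== null
    ? e.cause
    : {}) as { code?: unknown };
  const code =
    typeof e.code === "string"
      ? e.code
      : typeof cause.code === "string"
        ? cause.code
        : undefined;
  if (code === undefined) return false;
  // "*_TIMEOUT" in the commit is read here as a suffix match.
  return RETRYABLE_CODES.has(code) || code.endsWith("_TIMEOUT");
}
```

A retry wrapper would call this only on a rejected `fetch()` promise, which per the commit happens before the Response resolves, so a started stream is never replayed.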

1585 Commits