gitmost

Author	SHA1	Message	Date
claude code agent 227	cd539558ed	feat(agent-tools): suggestedText on create_comment with strict anchor uniqueness (#315 phase 6) Agents can attach a suggested replacement when creating an inline comment, via both the MCP create_comment tool and the AI-chat createComment tool. Because applying a suggestion edits the EXACT anchored text, an ambiguous anchor would let Apply corrupt the wrong occurrence. So when suggestedText is set the selection must occur EXACTLY ONCE: - new countAnchorMatches(doc, selection) counts occurrences across all blocks (same normalization/traversal as canAnchorInDoc), counting occurrences (2 in one block => 2) — stricter than block-count, never under-counting distinct occurrences (false-unique is the dangerous direction). - client.createComment gains suggestedText: a pre-check (getPageJson + countAnchorMatches: 0 => not-found, >=2 => ambiguity error) before create, and an AUTHORITATIVE live check inside the anchoring mutation that recomputes on the live doc and, if != 1, aborts and rolls back the just-created comment (reusing the existing safeDeleteComment "anchor not found" path). Ordinary comments keep first-occurrence behavior unchanged. - suggestedText is rejected on a reply or without selection in all three layers (MCP handler, MCP client, AI-chat tool), mirroring the server DTO/service. - filterComment surfaces suggestedText/suggestionAppliedAt/suggestionAppliedById. - DocmostClientLike.createComment signature updated. MCP build/ rebuilt. Tests: countAnchorMatches (0/1/N, within/across/nested block, span nodes, quote normalization); createComment (ambiguous refused pre-create, reply and no-selection rejected, unique succeeds and forwards suggestedText, filterComment surfaces it); ai-chat schema accepts suggestedText. MCP 443 pass; ai-chat 601 pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-03 19:35:47 +03:00
claude code agent 227	ec542a924b	feat(comment): store suggestedText + POST /comments/apply-suggestion (#315 phase 4) Server side of agent comment suggestions. - CreateCommentDto gains optional suggestedText (<=2000). CommentService.create accepts it ONLY for a top-level inline comment with a non-empty selection, requires it be non-empty and differ from selection (else BadRequest), and stores it. - POST /comments/apply-suggestion (ApplySuggestionDto { commentId }): authorizes with validateCanEdit (applying edits page text) BEFORE any structural check or mutation, then CommentService.applySuggestion: - runs the phase-3 collab event applyCommentSuggestion on `page.<pageId>` to atomically check-and-replace the marked text, returning { applied, currentText }; - applied → stamp suggestion_applied_at/by, auto-resolve the thread, ws commentUpdated, audit COMMENT_SUGGESTION_APPLIED; - already-applied (DB) → idempotent success (no re-apply), self-healing the resolve if it was missed — satisfies the issue's double-click / two-user race requirement; - collab verdict applied:false && currentText===suggestedText → idempotent success (crash between doc mutation and DB write); - text changed → 409 ConflictException carrying currentText; - gateway undefined/throw → hard error, never a silent success. - audit-events: COMMENT_SUGGESTION_APPLIED. Tests: create validation (reply/no-selection/equal-to-selection rejected; valid stored) + applySuggestion verdict branches incl. both idempotent paths. jest src/core/comment: 33 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-03 19:09:23 +03:00
claude code agent 227	a9da8f7f15	feat(collab): applyCommentSuggestion event + no-Redis local fallback (#315 phase 3) New custom collab event applyCommentSuggestion runs replaceYjsMarkedText inside the document's Yjs transaction on the owning instance and returns the { applied, currentText } verdict to the API-server caller (cross-process via the Redis bridge, whose customEventComplete/replyId already carries handler return values). - withYdocConnection is now generic and returns the callback's result (captured in a closure, since hocuspocus connection.transact does not forward it). The callback is typed synchronous-only: transact runs fn synchronously without awaiting, so an async fn would mutate outside the transaction and lose atomicity. - collaboration.gateway.handleYjsEvent: when Redis is disabled (COLLAB_DISABLE_REDIS), dispatch the handler locally against the single hocuspocus instance and return its verdict instead of silently returning undefined (which would make apply a no-op). Also fixes the pre-existing silent no-op of setCommentMark/resolveCommentMark without Redis. Tests: handler spec (applied mutates doc + returns verdict; changed-text returns {applied:false} without mutating; args forwarded; withYdocConnection returns the value) and gateway spec (no-Redis path dispatches locally, returns the verdict, not undefined). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-03 18:52:44 +03:00
claude code agent 227	7c0664d2b3	feat(collab): replaceYjsMarkedText — atomic check-and-replace of comment-marked text (#315 phase 2) The primitive behind "Apply comment suggestion": walk the XmlFragment, collect the delta segments carrying the `comment` mark for a commentId, and replace them with new text ONLY if the run is intact (single Y.XmlText, contiguous, and the joined text still equals the expected anchor). Otherwise return a verdict { applied:false, currentText } — null when the anchor is gone, else the current text — so the caller can report "someone changed it". On apply it deletes the run and re-inserts the new text re-attaching the same comment mark (thread stays anchored). Mutates in place for the caller's connection.transact(); opens no transaction of its own. Non-string inserts (embeds) advance the offset by their 1-unit index length so a marked segment after an embed gets the right position and an embed inside a run is correctly rejected as a changed anchor. Tests (yjs.util.spec.ts): happy path (mark preserved, surrounding text and no mark-bleed), resolved-mark match, changed text, deleted anchor, paragraph split, interleaved unmarked text, and embed before/inside the run. 17 passed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-03 18:41:32 +03:00
claude code agent 227	a32fba63ec	feat(comment): db columns for comment suggestions (#315 phase 1) Add suggested_text / suggestion_applied_at / suggestion_applied_by_id to the comments table (migration) and mirror them in the hand-curated db.d.ts Comments interface. suggested_text holds a proposed replacement for the comment's anchored selection; the applied_* columns record who applied it and when. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-03 18:29:03 +03:00
vvzvlad	36b3539571	Merge pull request 'refactor(ai-chat): move patch_node/insert_node into the shared tool-spec registry (#294)' (#305) from refactor/294-tool-spec-registry into develop Reviewed-on: #305	2026-07-03 18:02:40 +03:00
agent_coder	86c1307ed2	fix(#300 review): drop stray symlink, re-fetch enriched on comment update, cover history mapping (F1/F2/F3) F1: remove an accidentally-committed self-referential symlink packages/mcp/node_modules/node_modules -> an absolute build-machine path (leaked a dev home path, a pnpm artifact useless in the repo), and add a targeted ignore so it can't recommit. F2: the commentUpdated broadcast re-emitted the caller's pre-loaded comment mutated in place, so the {agent,launcher} stack survived only because the controller happened to load it with includeCreator:true — the fragile coupling that let the stack vanish on edit once already. update() now RE-FETCHES the enriched comment before broadcasting, symmetric with create()/resolveComment() (the row is already persisted), so all three broadcasts carry the stack regardless of any caller's pre-load. Adds a caller-contract test asserting all three broadcasts emit agent/launcher for an agent comment and neither for a non-agent one, spotlighting the update path (non-vacuous vs the old re-emit). F3: add a direct test of the page-history attachPageHistoryAgent mapping (its distinct lastUpdatedSource/lastUpdatedAiChatId/lastUpdatedBy column set): role / no-role / MCP / non-agent, and that the internal agentRole join column is stripped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-03 06:38:25 +03:00
agent_coder	f720151c63	refactor(ai-chat): move patch_node/insert_node metadata into the shared tool-spec registry (#294 ) The same tool metadata (zod schema + model-facing description) was hand-duplicated between the standalone MCP server and the in-app AI-chat agent, so every tweak had to land in two places and copies drifted (a materialized parity bug). The shared transport-agnostic registry (packages/mcp/src/tool-specs.ts) already de-duplicates 14 tools; this migrates two more genuinely-identical ones — patch_node/patchNode and insert_node/insertNode. The canonical description is a strict SUPERSET of both originals (keeps MCP's "without resending the whole document" + table-structure/anchor guidance AND the in-app "reversible via page history" / "exactly one of anchorNodeId or anchorText" framing — no model-facing guidance dropped); the schema is identical (the in-app side just gains MCP's .min(1) on ids, a safe tightening). Each transport keeps its own execute/auth wrapper, and the in-app parseNodeArg node-arg normalization is unchanged. The three table tools are intentionally NOT merged (a real param-name divergence: table vs tableRef) — documented on both sides. Other per-transport divergences (search/share/create_comment/transform/list_pages) are left separate with a short comment explaining why (the issue asked to flag these as intentional). DocmostClientLike stays a hand-mirror (the ESM/CJS boundary blocks a compile-time type import; a runtime drift-guard already pins it). Also fixes a latent contract-spec bug: derive `required` from `instanceof z.ZodOptional` (matches the emitted JSON schema) instead of `isOptional()`, which wrongly reported z.any() fields as optional. Partially addresses #294. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-03 05:55:11 +03:00
agent_coder	0968ea97d2	feat(ai-chat): agent avatar stack — agent in front, launcher behind (#300 ) For AI-agent-authored content (comments + page history), replace the text AI-AGENT badge with an avatar stack: the agent in front, the human who launched it smaller and behind. This fixes the inverted hierarchy (the action was the agent's; the human just launched it). closes #300. Backend: a single server-authoritative resolver resolveAgentProvenance normalizes to { agent, launcher } from server columns only (createdSource/lastUpdatedSource, aiChatId, creator, chat role) — nothing from request input, so agent identity can't be spoofed. Internal chat -> agent = chat role (name/emoji), launcher = human; external MCP (aiChatId null) -> agent = the agent account, launcher = null; non-agent -> neither. The role join (aiChatId -> ai_chats.role_id -> ai_agent_roles) deliberately does NOT filter enabled/deleted_at, so a later-disabled role still labels historical content (mirrors findById, not findLiveEnabled). Enrichment is applied on BOTH findPageComments (list) AND findById (the create/resolve/update broadcast path), so the stack shows on live comment events and doesn't vanish on resolve/edit. Frontend: new AgentAvatarStack + AgentGlyph (avatarUrl -> role emoji on violet -> IconSparkles on violet), integrated into comment-list-item and history-item where the badge was; the deep-link-to-chat click moved onto the stack. ai-agent-badge removed. Tests: AgentAvatarStack (role/no-role/MCP/click/non-clickable), the provenance resolver + recorder tests proving the role join never filters enabled/deleted, and findById enrichment (guards the live-broadcast regression). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-03 05:28:53 +03:00
agent_coder	3f7e1bdc7b	fix(export): stop comment.renderHTML returning a live jsdom node on the server (#298 ) Page/space export (Markdown & HTML, both via jsonToHtml -> generateHTML) crashed with "Export failed:undefined" on any page carrying a `comment` mark. Root cause: comment.renderHTML returned a LIVE DOM node (document.createElement + a click listener) whenever a global `document` existed — and the in-process MCP module injects a jsdom global.window+global.document into the Node server, defeating the old `typeof document === "undefined"` guard. The server export runs happy-dom's DOMSerializer, which crashes appending the foreign jsdom node (NodeUtility.isInclusiveAncestor -> "Cannot read properties of undefined (reading 'length')"). comment is the only extension returning a live node. Fix: widen the guard with an isNodeRuntime check (process.versions.node) so on any Node runtime renderHTML returns the plain, serializable spec array — even when MCP injected jsdom globals. The browser branch (createElement + click -> ACTIVE_COMMENT_EVENT) is untouched, so in-editor comment interactivity is preserved (Vite defines only process.env as a member-expression substitution, no `process` object in the browser bundle, so isNodeRuntime is false there). The mcp schema mirror already returns a spec array and is not on the export path (tiptapExtensions imports Comment from @docmost/editor-ext), so no mirror change is needed. Also: export-modal now reads the real error text from the response Blob (responseType:'blob' made err.response.data.message always undefined) so a failed export shows the server's message instead of "undefined". Adds a regression test that runs the real jsonToHtml on a comment-marked doc with jsdom globals injected (reproduces the crash on the unpatched code, passes after). closes #298 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-03 01:34:53 +03:00
agent_coder	438ef091f9	fix(#288 review): markdown-safe-escape the untrusted page title in chat export F1: pc.title (untrusted cross-user page title) was interpolated raw into the markdown export heading. Reusing escapeAttr alone (the prompt sink's XML-attribute sanitizer, strips < > ") is insufficient here because the sink is MARKDOWN: link/image syntax survives, so a title like ![x](http://evil) or [phish](http://evil) injects a remote image / clickable link into the downloaded .md disguised as a trusted system annotation. Add markdownHeadingSafe() = escapeAttr() + backslash-escape [ and ] (disables both [text](url) and ![text](url); a bare (url) is inert). F2: cover the title branch: a title that collapses to empty via escapeAttr falls to the bare heading (no ("")), and a link/image-injection title is neutralized (non-vacuous vs the escapeAttr-only version). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-02 15:46:44 +03:00
claude_code	c39fab70c1	feat(ai-chat): persist page-change diff to history and harden stale-page note The #274 page_changed marker lived only in the ephemeral system prompt, so the diff the agent saw was invisible in the chat export/history, and the note was too weak — the agent still overwrote the user's manual edits with a full-page replace. - Persist the diff the agent saw as metadata.pageChanged on the assistant row (flushAssistant), threaded into all five flush call sites in stream(). Model replay (rowToUiMessage/rowParts) reads only metadata.parts, so the sibling never re-injects the note into the model context on later turns. - Render the persisted diff as a labelled block (en/ru) before the message body in the server-side Markdown export (chat-markdown.util.ts). - Strengthen PAGE_CHANGED_NOTE: mandate a fresh getPage re-read and targeted edits (editPageText/patchNode/insertNode/deleteNode) instead of a whole-page replace, and never revert or overwrite the user's edits. Tests: prompt, export and service specs updated; 114 pass, tsc clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-07-02 14:31:41 +03:00
agent_coder	2f3d5d3783	docs: fix escapeAttr comment count (three, not four) (#274 review) The regex strips three attribute-breaking chars (" < >); the JSDoc said four. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-02 06:19:26 +03:00
agent_coder	6e681a9c66	fix(#274): escape page_changed injection surface, drop dead content_hash (review F1-F5) F1: escape the collaborative page title before interpolating into <page_changed page="..."> (and the pre-existing openedPage attr): strip <>" and collapse whitespace, so a crafted title can't break out of the attribute into the system prompt (cross-user injection). F2: neutralize <page_changed>/</page_changed> occurrences inside the diff body so a crafted line can't close the block early. F3: remove the dead content_hash column (written every turn, never read): migration, repo, service hashing + crypto import, db.d.ts, spec asserts. F4: test the best-effort catch branches (detectPageChange / snapshotOpenPage swallow errors and don't break the turn). F5: soften the overstated 'diff cannot smuggle instructions' comment to defense-in-depth framing referencing the F1/F2 mitigations + safety sandwich. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-02 05:43:46 +03:00
agent_coder	8c5b57ebfa	feat(ai-chat): notify the agent of user page edits between turns (closes #274) The agent rebuilds context from DB each turn and didn't know the user manually edited the open page since its last response, so it could overwrite those edits. Add a per-turn ephemeral <page_changed> note in the system prompt (twin of INTERRUPT_NOTE, self-clearing) carrying a unified Markdown diff of what changed since the END of the agent's previous turn. - New ai_chat_page_snapshots table (migration + hand-declared db.d.ts/entity types) storing the page Markdown per (chat,page) at each turn's end. - Pure computePageChange util (whitespace-normalized unified diff via the existing jsdiff dep, 6KB cap + getPage hint). - Turn start: if the open page's updatedAt moved past the snapshot, diff current vs snapshot; non-empty -> PAGE_CHANGED_NOTE in the safety sandwich. - Turn end: upsert the snapshot on EVERY terminal path (onFinish/onError/onAbort, once) so the agent's own edits are excluded by construction even on aborted turns. All best-effort (never breaks/latency-regresses a turn); fast path when updatedAt is unchanged. Server-only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-02 01:54:00 +03:00
claude code agent 227	3b80285d57	fix(#260 ): open MCP collab docs by canonical UUID (slugId doc-name split) Real root cause of the silent MCP edit loss: the web editor always opens the collaboration document by the page UUID (`page.${page.id}`), but the MCP opened it by the agent-supplied id — usually a slugId — so `page.${pageId}` became `page.<slugId>`. For one DB page that is TWO independent Yjs documents; both persist to the same `pages` row (findById/updatePage resolve id or slugId), so the human tab's debounced store overwrites the agent edit (last-store-wins) — gone after reload, never shown live. The slugId doc also made the server's transclusion sync + embedding reindex throw Postgres 22P02. Fix: - MCP (primary): resolvePageId(pageId) returns the canonical UUID — a UUID short-circuits with no network call, a slugId resolves once via getPageRaw and is cached both ways. Every collab-write path (mutatePageContent / updatePageContentRealtime / replacePageContent and the mutate/replace/ unlocked seams) now opens by the resolved UUID, so the MCP and the editor share ONE Yjs doc. replaceImage's whole-operation page lock also keys on the UUID so it serializes against the other (now-UUID-keyed) writes. - Server (defense + kills the 22P02 noise): onStoreDocument passes the resolved page.id — not the raw doc-name id — to syncTransclusion, the embedding queue, the mention-notification job, addContributors, and the in-tx history read. Content store and the empty-guard are untouched. Tests: a new MCP test stands up a real Hocuspocus server and asserts a slugId input opens `page.<uuid>` (never `page.<slugId>`), with UUID short-circuit and single-resolve caching; the server spec asserts the side-effects receive the UUID for a `page.<slugId>` doc. closes #260 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-30 10:04:49 +03:00
vvzvlad	22ea387495	Merge pull request 'feat(#246): inline spoiler mark (blur + click-reveal, lossless Markdown)' (#259) from feat/246-spoiler into develop Reviewed-on: #259	2026-06-30 01:47:46 +03:00
vvzvlad	7e6dd457a4	Merge pull request 'refactor(#193): tool-host drift-guard + staged plan (shared spec registry already merged)' (#249) from refactor/193-tool-spec-registry into develop Reviewed-on: #249	2026-06-30 01:47:13 +03:00
vvzvlad	42f3a328c2	Merge pull request 'feat(#251): intentional-clear signal editor→store (persist deliberate clear, keep #248 guard)' (#253) from feat/251-intentional-clear into develop Reviewed-on: #253	2026-06-30 01:36:46 +03:00
vvzvlad	a8a7fad850	Merge pull request 'test(#244): Part B backlog: editor-ext/mcp/client/server unit+contract tests + findBreadcrumbPath mutation fix' (#257) from test/244-part-b into develop Reviewed-on: #257	2026-06-30 01:36:00 +03:00
vvzvlad	d38a39e3e5	Merge pull request 'fix(ai): show live reindex progress so the embeddings counter resets to 0 and climbs' (#242) from fix/embeddings-reindex-progress into develop Reviewed-on: #242	2026-06-29 23:44:13 +03:00
vvzvlad	116a231691	Merge pull request 'fix(#255): disconnect socket.io redis-adapter pub/sub clients on shutdown' (#256) from fix/255-ws-redis-adapter-leak into develop Reviewed-on: #256	2026-06-29 23:23:47 +03:00
claude code agent 227	188c5f506c	feat(editor): inline spoiler mark (blur + click-reveal, lossless Markdown) (#246 ) Add an inline spoiler (Telegram/Discord-style hidden text): a TipTap mark `spoiler` rendered as <span data-spoiler="true" class="spoiler">, blurred via CSS and revealed on click (UI-only is-revealed class, never persisted). - packages/editor-ext: the Spoiler mark (inclusive:false, set/toggle/unset commands, \|\|text\|\| input rule), exported; a lossless turndown rule emitting raw inline HTML; round-trip test. - apps/client: SpoilerView mark-view (ReactMarkViewRenderer, Link pattern), registration in extensions, bubble-menu toggle button (editable only), CSS (blur + @media print reveal), en/ru i18n. - apps/server: register Spoiler in collaboration.util tiptapExtensions so the mark survives HTML<->JSON export/index/import/Yjs; a test proving the public share keeps the spoiler (it isn't stripped with comments). No keyboard shortcut: the proposed Mod-Shift-s collides with Strike (and Mod-Shift-h with Highlight); the \|\|text\|\| input rule + the bubble-menu button cover ergonomics. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 23:22:30 +03:00
claude code agent 227	aa14ad6698	docs(ai): quote the content predicate verbatim; drop twin tautological assert (F16,F17) F17: the header's content-clause literal omitted the [[:space:]]* tolerance; copy page.repo.ts's exact '"type"[[:space:]]:[[:space:]]"text"' (jsonb::text renders a space after the colon, which is why the tolerance exists). F16: remove expect(ttl).toBeGreaterThan(0) — the twin of the F15 removal; expect(ttl).toBe(120) strictly subsumes it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 17:30:52 +03:00
claude code agent 227	1e5994573f	docs(ai): list all three embeddable clauses in int-spec header; drop tautological assert (F14,F15) F14: the lockstep int-spec header still described the pre-F6 two-clause set with 'iff' — add the content-JSON text-node clause so it matches embeddablePredicate. F15: remove the redundant expect(ttl).toBeLessThanOrEqual(120) that followed expect(ttl).toBe(120). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 17:02:32 +03:00
claude code agent 227	d0eae69086	fix(ai): raise reindex pre-seed TTL to the client poll cap; cover predicate clause; align docs (F11-F13) F11: PRE_SEED_TTL_SECONDS 45->120 (= client REINDEX_POLL_CAP_MS). At concurrency 1 a queued reindex can wait past the old 45s; if the pre-seed expired while pending, getMasked fell back to the COUNT and reported done, so the client stopped polling and missed the climb. Tie the pre-seed TTL to the client cap. F12: extend the lockstep integration spec — insertPage takes content; a text_content=null + text-node-content page is IN and a math-only page is OUT, pinning the structural "type":"text" clause (and the jsonb space-after-colon). F13: list all three embeddable clauses in the reindex JSDoc/inline comments. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 16:12:36 +03:00
claude code agent 227	91f24fc062	fix(ai): include content-bearing pages in reindex coverage; correct progress race & hot path (F6-F10) F6: extend embeddablePredicate to pages with body content but null text_content, keyed on the text-node marker "type":"text" (not a bare "text": key, which also matched math nodes' attrs.text and would leave math-only pages stuck below 100%). Numerator and denominator share the predicate; tests assert the compiled WHERE is byte-identical and a math-only doc is excluded. F7: correct the start() JSDoc (both totals are the real page count). F8: nextReindexPollInterval reuses isReindexComplete. F9: getMasked reads progress first and skips the two COUNTs while a reindex is active. F10: pre-seed the progress entry with a short 45s TTL so a deduped enqueue's phantom "0 of N" expires quickly instead of sticking for the 1h TTL. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 14:37:26 +03:00
claude code agent 227	f9b58a0e3d	test(server): SSRF guardedFetch, decryptHeaders fail-open, yjs.util, tool-spec parity, storage delegation guardedFetch blocks loopback/private/link-local/metadata IPs and never calls fetch; decryptHeaders fails open (returns undefined, warns once, no blob leak). yjs.util setYjsMark/removeYjsMarkByAttribute/updateYjsMarkAttribute on real Y.Docs. SHARED_TOOL_SPECS<->in-app parity (name/desc/input-schema; a dropped or renamed wiring fails). Replace the tautological storage.service spec with driver-delegation checks across every public method. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 04:49:56 +03:00
claude code agent 227	82b042209e	fix(ws): make redis adapter error handlers actually log (were noop) The pub/sub error handlers were `(err) => () => {}` — a noop returning an inner arrow that never runs, so socket.io redis client errors were silently swallowed. Log them via Nest Logger. Adjacent pre-existing bug surfaced in review of #255. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 04:32:34 +03:00
claude code agent 227	a0f4c86a74	fix(ws): disconnect socket.io redis adapter pub/sub clients on shutdown The WsRedisIoAdapter creates two ioredis clients (pubClient/subClient) for @socket.io/redis-adapter but never closed them, leaking their TCP handles on application shutdown (#255). The redis-adapter does not own these clients' lifecycle, and the adapter is instantiated from main.ts (not a DI provider), so no Nest lifecycle hook applied to it. Keep references to both clients and override dispose(), which Nest's SocketModule.close() invokes exactly once during shutdown after all socket.io servers are closed. Use disconnect(false) to mirror the sibling pub/sub pair in collaboration/extensions/redis-sync (onDestroy): immediate close, no QUIT round-trip, no auto-reconnect. Refs are nulled to guard against double-close. Runtime behavior is unchanged; only the shutdown path is added. Verified with a script that boots connectToRedis() against a real Redis: 2 sockets to :6379 open after connect, 0 remain after dispose(). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 04:28:56 +03:00
claude code agent 227	cce539e8e2	fix(collab): hoist intentional-clear consume out of the store retry loop (#251 ) The store-side empty-guard consumed the per-document intentional-clear flag INSIDE the bounded retry loop. consumeIntentionalClear always deletes the in-memory Map entry, but a tx rollback cannot un-delete it: attempt 1 consumed the flag then updatePage threw a transient error and rolled back; attempt 2 re-read the page non-empty, saw the flag gone, and the empty-guard silently BLOCKED the write — dropping the user's deliberate clear and defeating the retry guarantee for clears. Hoist the decision out of the loop (like consumeContributors / consumeAgentTouched): consume once into `allowIntentionalClear` before the `for`, and only read that boolean on the empty-over-non-empty branch. The single hoisted consume still drops a pending flag for a non-empty store (the "cleared then retyped" case), since every store consumes regardless of incoming emptiness. Add a regression test: arm via the real onStateless transport, updatePage throws once then succeeds, assert it is called twice and the retry writes the empty doc (the clear survives). It fails on the old consume-in-loop ordering (updatePage called once) and passes after the hoist. Document the known fail-safe limitation near the TTL constant: if document ownership transfers / a node crashes between the stateless signal and the debounced store, the in-memory flag is lost and the clear is silently not applied (the doc reloads non-empty) — fail-safe, content is never destroyed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 04:17:41 +03:00
claude code agent 227	8274720281	fix(server): close leaked redis sockets so e2e jest exits (#252 ) The full-AppModule e2e (apps/server/test/app.e2e-spec.ts) passed but jest never exited, burning CI to its timeout. Diagnosis (process._getActiveHandles after app.close()) showed exactly two ioredis sockets to :6379 still open after shutdown; everything else (BullMQ queues/workers, @nestjs/schedule intervals, nestjs-ioredis, nestjs-kysely pg pool, @nestjs/cache-manager Keyv store, hocuspocus pub/sub) already closes on app.close(). The two leaks were owned-but-never-closed clients: 1. ThrottleModule passed a pre-built `new Redis(...)` instance to ThrottlerStorageRedisService. With an instance, the lib sets disconnectRequired=false, so its onModuleDestroy never disconnects. Pass ioredis options instead so the service owns + disconnects the client. 2. CollaborationGateway created a source `new RedisClient(...)` that RedisSyncExtension only duplicates into pub/sub; the extension's onDestroy disconnects those duplicates but not the source. Keep a reference and disconnect it after the hocuspocus onDestroy hook in destroy(). Both are real lifecycle fixes (production shutdown is now clean too), so no --forceExit is needed. Verified against real Postgres+Redis: - test:e2e (no forceExit, --runInBand) exits 0 in ~18s (was: hung forever) - --detectOpenHandles exits 0 with no open-handle report - active handles after app.close(): none CI timeout-minutes safety nets left untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 04:11:51 +03:00
claude code agent 227	3fdb1e05a4	feat(collab): persist a deliberate page clear via an intentional-clear signal (#251 ) The #248 store-side empty-guard (onStoreDocument) unconditionally refuses to overwrite non-empty persisted content with an empty document, because a momentarily-empty live Y.Doc is indistinguishable from a real clear at the store layer. That correctly blocks glitches/bad-merges, but also blocks a user who genuinely wants to empty a page. This re-introduces a WORKING, narrow, non-spoofable exception (the dead context.intentionalClear hatch #248 removed never had a real channel). Definition of an intentional clear (client, IntentionalClear editor extension): a LOCAL user transaction (docChanged, NOT a remote y-sync change — filtered via isChangeOrigin) that reduces a non-empty doc to the empty single-paragraph shape. This is exactly the select-all + Delete/Backspace keystroke path. Transport (option b — hocuspocus stateless message): on that transition the client sends a `{type:'intentional-clear'}` stateless message. The server (PersistenceExtension.onStateless) records a short-lived (TTL 60s > 45s maxDebounce), single-use "pending clear" flag keyed by the connection's document. The next debounced onStoreDocument consumes it on the empty-guard branch to let that one empty write through. Why this is the right channel and non-spoofable: - Yjs transaction origin/metadata does not survive to the server store; awareness is per-connection and racy. A stateless message ties the signal to a specific clear, survives the debounce, and rides the authenticated connection. - The document is taken from the connection, never the payload, so a client cannot target another page. - The flag is read ONLY on the empty-over-non-empty branch, so the worst a forged signal can do is clear a page the connection may already edit; it can never force or alter a non-empty write. Read-only connections cannot arm it. 
Every non-empty store drops a pending flag, so "cleared then retyped" leaves nothing usable; the flag is single-use and TTL-bounded. NOTE: #248 is not yet on develop, so the empty-guard block is included here as the foundation this exception extends. If #248 lands first this rebases cleanly (the guard logic is identical; the #251-unique additions are the exception, onStateless, the pending-flag state, and the client extension). Tests: - Server (real transport path, not a hand-poke): onStateless sets the flag with the exact client payload, then the debounced onStoreDocument persists the empty doc; plus single-use consumption, read-only rejection, non-empty-store drops the flag, and the unchanged #248 guard tests (empty-over-non-empty blocked, empty-over-empty allowed). - Client: a real Editor + the actual selectAll+deleteSelection command emits the signal; typing / non-emptying edits / already-empty docs do not. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 04:06:39 +03:00
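The pending-clear bookkeeping described in 3fdb1e05a4 can be pictured with a minimal sketch. Everything named here (`PendingClearFlags`, `shouldPersist`, the `StoreInput` shape) is hypothetical and is not the PersistenceExtension code; only the rules come from the commit message: a 60s TTL, single use, no arming from read-only connections, the flag is read only on the empty-over-non-empty branch, and any non-empty store drops it.

```typescript
// Hypothetical sketch of the "pending clear" flag; not the actual server code.
const PENDING_CLEAR_TTL_MS = 60000; // 60s, longer than the 45s maxDebounce

class PendingClearFlags {
  private readonly expiresAtByDoc = new Map<string, number>();
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  // Called when an 'intentional-clear' stateless message arrives.
  // The document name comes from the connection, never from the payload.
  arm(documentName: string, readOnly: boolean): boolean {
    if (readOnly) return false; // read-only connections cannot arm the flag
    this.expiresAtByDoc.set(documentName, this.now() + PENDING_CLEAR_TTL_MS);
    return true;
  }

  // Single-use: the flag is removed whether or not it is still valid.
  consume(documentName: string): boolean {
    const expiresAt = this.expiresAtByDoc.get(documentName);
    this.expiresAtByDoc.delete(documentName);
    return expiresAt !== undefined && this.now() <= expiresAt;
  }

  drop(documentName: string): void {
    this.expiresAtByDoc.delete(documentName);
  }
}

type StoreInput = {
  documentName: string;
  incomingEmpty: boolean; // the live doc about to be stored is empty
  persistedEmpty: boolean; // the content already in the database is empty
};

// Decides whether a debounced store may overwrite the persisted content.
function shouldPersist(flags: PendingClearFlags, input: StoreInput): boolean {
  if (!input.incomingEmpty) {
    flags.drop(input.documentName); // "cleared then retyped" leaves nothing usable
    return true;
  }
  if (input.persistedEmpty) return true; // empty over empty is always allowed
  return flags.consume(input.documentName); // empty over non-empty needs the flag
}
```

The flag lookup sits on the last line only, which mirrors the commit's claim that a forged signal can never force or alter a non-empty write.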
vvzvlad	4a72ee1681	Merge pull request 'refactor(agent-roles-catalog): YAML catalog with block-scalar instructions (#229)' (#231) from feat/229-catalog-yaml into develop Reviewed-on: #231	2026-06-29 01:20:40 +03:00
claude code agent 227	82af0c5291	test(catalog): tighten + isolate real shipped catalog-file checks Apply review suggestions to the real-files block in ai-agent-roles-catalog.provider.spec.ts (test-only): 1. Fix inaccurate comment: there are 5 content YAML files (index + four per-bundle/lang files), not 6. 2. Improve isolation: read/parse the real index lazily inside tests (via loadRealIndex) instead of in the describe body, so a broken real file fails only these catalog tests, not collection of the whole spec (incl. the unrelated mocked-remote provider tests). 3. Add the symmetric slug check: each language file's slug set must equal the declared slug set (no undeclared/extra roles), matching scripts/check.mjs's exact two-way correspondence. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-28 23:59:41 +03:00
claude_code	62eb7d082f	test(ai-chat): stub sandboxStore.asSink in AiChatToolsService spec The blob-sandbox feature (#243/#250) made AiChatToolsService.forUser() eagerly call this.sandboxStore.asSink() while wiring the stash tool, but the spec still passed an empty {} as the sandboxStore constructor arg. That object has no asSink method, so all 19 tests in the suite failed in CI with 'TypeError: this.sandboxStore.asSink is not a function'. Replace the stale {} mock at all 4 constructor sites with a no-op sink exposing asSink() -> { put, has, evict } (jest.fn()). These tests never execute the stash tool, so a no-op sink is sufficient for forUser() to wire successfully. Test-only change; production code is unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-28 23:45:06 +03:00
claude code agent 227	997e4395c6	test(agent-roles-catalog): pin the real shipped YAML files (#231 F1) Provider tests only exercised synthetic stringifyYaml fixtures, so a hand-conversion error in one of the 6 real catalog files (index.yaml, bundles/{editorial,research}/{en,ru}.yaml) — a stray quote/colon in a description, a broken emoji/arrow, a block-scalar indent slip that silently changes or drops instructions — was caught by no automated test. scripts/check.mjs is the only other guard and is wired into no CI/turbo/husky step. Add a real-files test block that reads each shipped file off disk, parses it with the SAME options the provider uses (strict: true, maxAliasCount: 100), and validates it through the provider's own exported type guards (isCatalogIndex / isCatalogBundleFile / isCatalogRole). It is driven from the real index so new bundles/langs are auto-covered, asserts the editorial bundle still ships fact-checker, and requires every declared role to be present with non-empty instructions/name in each language file. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-28 23:44:49 +03:00
claude code agent 227	85b38d6946	fix(ai): address reindex-progress review round 1 (PR #242 ) F1: clear the "Reindex now" spinner once the poll cap fires. Gate the reindexing part of the button's loading state on the active poll window (reindexDeadline !== null) so a run that outlives the 120s cap no longer leaves the button stuck-disabled with a stale `reindexing: true`; the admin can restart. F2: rewrite reindexWorkspace JSDoc to describe the EMBEDDABLE page set (text OR existing embeddings), matching getEmbeddablePageIds / countEmbeddablePages instead of the old "every non-deleted page". F3: extract the shared embeddable-content predicate into a private PageRepo.embeddablePredicate helper, called by both countEmbeddablePages and getEmbeddablePageIds, removing the verbatim duplication. Behavior is identical (lockstep int-spec stays green). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-28 23:39:20 +03:00
claude_code	204cf9dfe7	test(sandbox): address PR #250 round-4 review — SSRF accept-path tests, MCP structuredContent (#243 ) Mandatory (test-coverage): - internal-file-urls.test: pin the SSRF/traversal ACCEPT path of resolveInternalFilePath (the sole guard for content-controlled `src`): an absolute/protocol-relative URL has its foreign host dropped and only an /api/files/ pathname survives (http://evil.com/api/files/x/y.png -> /files/x/y.png), while a host-dropped path that escapes /api/files/ (https://evil.com/api/auth/whoami) or a backslash-traversal (/api/files\..\auth\whoami) is rejected. Locks the behavior so a future prefix-only refactor cannot silently open a bypass. Suggestions: - index.ts: the stash_page MCP tool now returns structuredContent { uri, sha256, size, images } alongside the resource_link, so the MCP output matches the documented shape (clients get the blob's sha256/ETag and the mirror counts, not just the link). No outputSchema registered. Rebuilt build/. - new stash-page-mcp-result.test: server round-trip via InMemoryTransport asserts both the resource_link and the structuredContent mirror. - internal-file-urls.test: cover the new URL parse-failure catch branch (http://[ -> "Invalid internal file src"). - environment.service.spec: assert getPositiveIntEnv warns once per key and independently across keys (the invalidPositiveIntWarned dedup). Tests: packages/mcp 383 pass; apps/server sandbox/environment/mcp 235 pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-28 20:58:36 +03:00
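A rough sketch of the accept/reject behavior that 204cf9dfe7 pins for resolveInternalFilePath. The function below is an illustration written from the commit message alone, not the shipped implementation: the name `resolveInternalFilePathSketch`, the placeholder base URL, and the exact order of checks are assumptions. It relies only on the standard WHATWG `URL` parser.

```typescript
// Hypothetical sketch; the real resolveInternalFilePath may differ in detail.
const INTERNAL_PREFIX = "/api/files/";

function resolveInternalFilePathSketch(src: string): string {
  // Assumption: percent-encoded and backslash srcs are refused outright,
  // before any parsing, so no decoding step can re-introduce a traversal.
  if (src.includes("%") || src.includes("\\")) {
    throw new Error("Invalid internal file src");
  }

  let url: URL;
  try {
    // The base is a placeholder; only the normalized pathname is used,
    // so any foreign host in an absolute or protocol-relative src is dropped.
    url = new URL(src, "http://localhost");
  } catch {
    throw new Error("Invalid internal file src");
  }

  // URL parsing has already collapsed "." and ".." segments, so a path that
  // escapes /api/files/ no longer carries the prefix at this point.
  if (!url.pathname.startsWith(INTERNAL_PREFIX)) {
    throw new Error("Invalid internal file src");
  }

  // Strip the leading "/api" so "/api/files/x/y.png" becomes "/files/x/y.png".
  return url.pathname.slice("/api".length);
}
```

Checking the prefix after normalization, rather than on the raw string, is what keeps a prefix-only refactor from reopening the bypass the commit warns about.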
claude_code	aff58646d1	refactor(sandbox): address PR #250 round-3 review — dead import, env validation, uuid validator, docs (#243) Must-fix: - mcp.module: drop the now-dead EnvironmentModule import (and its stale comment). McpService no longer injects EnvironmentService; EnvironmentModule is @Global and imported at the app root, so DI still resolves. Stability: - environment.service: route getSandboxTtlMs + the three SANDBOX_MAX_*_BYTES caps through a shared getPositiveIntEnv() helper that warns once per key and falls back to the default on a non-integer or <= 0 value (previously the byte caps did a bare parseInt, so SANDBOX_MAX_TOTAL_BYTES=0 made every stash_page fail against a 0-byte cap). TTL behavior is unchanged. Simplification: - sandbox.controller: replace the homemade UUID_RE with the project's shared `uuid` validator (import { validate as isValidUUID } from 'uuid'), matching the attachment routes; update the spec fixtures to valid v4 UUIDs. - mcp.service: inline the single-caller one-liner buildSandboxConfig() to this.sandboxStore.asSink() at the wiring site. Docs: - CHANGELOG: add an [Unreleased] > Added entry for #243 (stash_page tool, anonymous GET /api/sb/:id, five SANDBOX_* env vars). - AGENTS.md: note that GET /api/sb/:id is in the workspace-gate preHandler's excludedPaths and is fully tokenless, unlike /api/files/public/... which still resolves a workspace and needs an attachment JWT. Tests: cap-getter validation (0/-5/abc -> default, valid -> parsed), updated UUID fixtures. apps/server jest sandbox/environment/mcp: 233 pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-28 20:21:31 +03:00
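The validation rule that aff58646d1 describes for getPositiveIntEnv() could look roughly like the following. This is a sketch under stated assumptions, not the EnvironmentService code: the function name, its parameters, the injectable `warn` callback, and the choice to return the default silently for an unset variable are all hypothetical. Only "warn once per key, fall back to the default on a non-integer or <= 0 value" comes from the commit.

```typescript
// Hypothetical sketch of a warn-once positive-integer env reader.
const warnedKeys = new Set<string>();

function getPositiveIntEnvSketch(
  env: Record<string, string | undefined>,
  key: string,
  fallback: number,
  warn: (message: string) => void = console.warn,
): number {
  const raw = env[key];
  // Assumption: an unset or empty variable uses the default without a warning.
  if (raw === undefined || raw === "") return fallback;

  const parsed = Number(raw);
  if (Number.isInteger(parsed) && parsed > 0) return parsed;

  // Invalid value (0, negative, fractional, or not a number): warn once per key.
  if (!warnedKeys.has(key)) {
    warnedKeys.add(key);
    warn(`${key}=${raw} is not a positive integer; using default ${fallback}`);
  }
  return fallback;
}
```

With this shape, `SANDBOX_MAX_TOTAL_BYTES=0` yields the default cap instead of a 0-byte cap, which is the failure the commit set out to remove.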
claude_code	8842bc8bf3	fix(sandbox): address PR #250 follow-up review — XSS hardening, eviction reconcile, doc sync (#243) Security (must-fix): - sandbox.controller: the anonymous GET /api/sb/:id response now sets X-Content-Type-Options: nosniff, a restrictive CSP, and Content-Disposition: attachment for any mime outside a raster-image allowlist (png/jpeg/gif/webp/avif). entry.mime is attacker-controlled, so an evil.svg/evil.html could otherwise execute script inline on the Docmost origin (stored XSS). Mirrors the public attachment route's hardening. Stability: - client.stashPage: reconcile mirrors AFTER the final document put, not only before it. The doc blob is the newest entry and FIFO eviction drops the oldest = this stash's own images, so the stored doc could reference an evicted blob (consumer 404) and over-report images.mirrored. A bounded loop now reverts doc-put-evicted mirrors, drops the stale doc blob, and re-puts until stable. Regenerated packages/mcp/build/. - sandbox.controller: emit Cache-Control on the 304 branch too (ttlSeconds is computed before the conditional check). Docs: - Bump the MCP tool count 39 -> 40 across all READMEs and AGENTS.md (the registry now exposes exactly 40 tools). Refactor: - SandboxStore.asSink() centralizes the {put,has,evict} sink + uri<->id mapping; the embedded-MCP and in-app agent-tools wiring sites share it. Tests: - security headers (inline vs attachment, nosniff, CSP), 304 Cache-Control, putAndLink URL form, has()/remove(), asSink() round-trip, getSandboxPublicUrl (trailing-slash trim + APP_URL fallback), and a stash test where the doc put itself evicts a mirrored image. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-28 19:08:06 +03:00
claude_code	6eb335d5e3	fix(sandbox): address PR #250 review — SSRF guard, eviction safety, cleanup (#243 ) Security: - stash_page: reject path-traversal / percent-encoded srcs before the authed loopback fetch (resolveInternalFilePath), closing an SSRF/exfiltration hole where a crafted node.attrs.src could read an arbitrary internal GET endpoint into the anonymous sandbox. Stability: - stash_page: revert + recount mirrors FIFO-evicted by a later put in the same stash (no dangling sandbox refs, honest images.mirrored/failed); free image blobs if the final document put throws. - Reject/clamp non-positive SANDBOX_TTL_MS to the 1h default (warn once). - Log mirror failures unconditionally (console.warn, no blob bodies). Cleanup / architecture: - Remove dead expiresAt from SandboxPutResult. - Centralize the /api/sb route in SANDBOX_ROUTE_SEGMENT/SANDBOX_API_PATH and move URL composition into SandboxStore.putAndLink; drop the duplicated sink closures and the now-unused EnvironmentService injection from McpService and AiChatToolsService. - Un-export isInternalFileUrl; document the process-local (instance-bound) sandbox limitation in the tool description and .env.example. Docs/tests: - README/README.ru: 38 -> 39 tools + stash_page entry. - Add traversal/normalize/recursion unit tests, stash self-eviction + doc-put-throw + empty/octet-stream mock tests, controller If-None-Match (wildcard/weak/list) + Cache-Control tests, and SANDBOX_TTL_MS validation tests. Regenerate packages/mcp/build. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-28 18:02:46 +03:00
claude code agent 227	2fe4ca8537	feat(sandbox): in-RAM blob sandbox for out-of-band page transfer (#243) Add an ephemeral, process-local blob store so the in-app agent (and the embedded MCP) can hand a large page document and its images to an external consumer WITHOUT routing the bytes through the model context or Docmost auth. - SandboxStore (@Injectable singleton): Map<uuid,{buf,mime,sha256,expiresAt}> in RAM only. put() picks a per-blob cap by mime (image vs doc), enforces a total-bytes RAM guard with oldest-first eviction, and stamps a TTL; get() lazily expires. sha256 computed at put() doubles as the strong ETag. An unref'd sweep interval clears expired entries and is cleared on destroy. - GET /api/sb/:uuid anonymous controller: serves raw bytes with Content-Type, Content-Length and ETag=sha256; 404 on missing/expired/non-UUID (anti-traversal), 304 on a matching If-None-Match. No tokens, no 401 — the capability is the unguessable UUID + short TTL + TLS. Auth-exempt the same way as /api/files/public (no JwtAuthGuard) plus an /api/sb entry in main.ts's workspace-resolution preHandler so a remote consumer with no workspace host is not rejected. - stash_page tool in both layers (MCP resource_link + in-app {uri,size,sha256,images}). client.stashPage serializes the get_page_json shape, mirrors every INTERNAL file/image src (type-agnostic, covers drawio/excalidraw/video/file) into the sandbox under Docmost auth and rewrites src to the sandbox URL; external http(s) srcs are left untouched; dedup by src; a failed image fetch is counted, never aborts the doc. - SANDBOX_PUBLIC_URL / SANDBOX_TTL_MS / SANDBOX_MAX_BYTES / SANDBOX_MAX_IMAGE_BYTES / SANDBOX_MAX_TOTAL_BYTES wired through the environment service + validation + .env.example. - SandboxModule (@Global) provides the shared store to the controller, McpService and AiChatToolsService (same instance for put and get). 
Tests: SandboxStore (round-trip, sha256, TTL lazy + sweep, caps, eviction), SandboxController (200+ETag+CT+CL, 404 missing/expired/non-UUID, 304), and a mock-HTTP stashPage test (mirror+rewrite internal, keep external, dedup, failed image counted, returns only a link). Interoperates with the vvzvlad/habr-mcp consumer's anonymous-GET + sha256-ETag + resource_link contract. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-28 15:13:11 +03:00
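The store semantics listed in 2fe4ca8537 (per-blob cap chosen by mime, a total-bytes guard with oldest-first eviction, a TTL stamped at put, lazy expiry on get) can be sketched as follows. This is a simplified illustration, not the SandboxStore class: `SandboxStoreSketch`, `SandboxLimits`, and the injectable clock are hypothetical, the caller supplies the id here instead of the store minting a UUID, and the sha256/ETag and the sweep interval are left out to keep the block dependency-free.

```typescript
// Hypothetical, simplified sketch of an in-RAM blob store with TTL and FIFO eviction.
type SandboxEntry = { bytes: Uint8Array; mime: string; expiresAt: number };

interface SandboxLimits {
  ttlMs: number;
  maxDocBytes: number; // per-blob cap for non-image blobs
  maxImageBytes: number; // per-blob cap for image/* blobs
  maxTotalBytes: number; // RAM guard across all blobs
}

class SandboxStoreSketch {
  // A Map iterates in insertion order, so the first key is the oldest entry.
  private readonly entries = new Map<string, SandboxEntry>();
  private totalBytes = 0;
  private readonly limits: SandboxLimits;
  private readonly now: () => number;

  constructor(limits: SandboxLimits, now: () => number = Date.now) {
    this.limits = limits;
    this.now = now;
  }

  put(id: string, bytes: Uint8Array, mime: string): boolean {
    const cap = mime.startsWith("image/") ? this.limits.maxImageBytes : this.limits.maxDocBytes;
    if (bytes.byteLength > cap || bytes.byteLength > this.limits.maxTotalBytes) return false;

    this.remove(id); // replacing an id must not double-count its bytes
    for (const oldestId of Array.from(this.entries.keys())) {
      if (this.totalBytes + bytes.byteLength <= this.limits.maxTotalBytes) break;
      this.remove(oldestId); // oldest-first eviction until the new blob fits
    }

    this.entries.set(id, { bytes, mime, expiresAt: this.now() + this.limits.ttlMs });
    this.totalBytes += bytes.byteLength;
    return true;
  }

  get(id: string): SandboxEntry | undefined {
    const entry = this.entries.get(id);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.remove(id); // lazy expiry: stale entries are dropped on access
      return undefined;
    }
    return entry;
  }

  remove(id: string): boolean {
    const entry = this.entries.get(id);
    if (entry === undefined) return false;
    this.totalBytes -= entry.bytes.byteLength;
    return this.entries.delete(id);
  }
}
```

Because eviction is strictly oldest-first, a late large put can evict blobs stored earlier in the same operation, which is the self-eviction case the later review-round commits reconcile.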
claude code agent 227	d0ca127d83	refactor(ai-chat): drift-guard the DocmostClientLike hand-mirror (#193 ) Issue #193's tool-half has two open items. The shared, zod-agnostic tool-spec registry (SHARED_TOOL_SPECS) for the identical tools is already merged (`f3fa15e7`) and consumed by both layers, so that subset is done. The remaining items are: (a) deriving the layer-3 hand-mirror `DocmostClientLike` from the real client type, and (b) folding more tools into the registry. Both were deferred as risky, and that deferral still holds (verified, see below) — so this change ships the safest concrete increment instead of forcing the risk. What this adds (behaviour-neutral, test-only + a doc comment): - packages/mcp/test/unit/client-host-contract.test.mjs: pins the layer-3 contract from the ESM side, where the real DocmostClient is importable. It asserts every method the in-app `DocmostClientLike` mirror declares exists as a function on a real DocmostClient instance (constructor is side-effect-free). A rename/removal in client.ts now fails this test instead of silently shipping a runtime "x is not a function" into an agent tool call. Negative-case verified (a bogus method name is detected). - docmost-client.loader.ts: replaces the vague mirror comment with a pointer to the guard test and a concrete, empirically-grounded staged plan for the full type-derivation. Verified blockers kept it deferred: @docmost/mcp emits no .d.ts (no `declaration`, no `types` export) and the server has no path mapping for it, so there is no type to import today; and the real methods' inferred CONCRETE return types conflict with the in-app adapter's loose Record<string,unknown> + `as`-cast result handling (deriving the exact type breaks the build / forces pervasive double-casts and full-surface test stubs). Out of scope (noted in the issue): the PM<->Markdown converter unification. Verified: server tsc clean; mcp tsc clean; mcp tests 369 pass (367 + 2 new); ai-chat tools specs 51 pass. 
No behaviour change; committed mcp build untouched (no mcp src changed). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-28 15:07:43 +03:00
claude_code	bf09eec4e1	fix(ai): address reindex-progress review (PR #242) - Delete the now-orphaned PageRepo.getIdsByWorkspace (its only caller, reindexWorkspace, switched to getEmbeddablePageIds). Its docstring still claimed "Used by the RAG bulk reindex"; re-grep confirmed zero callers. - ai-settings.service.reindex(): if aiQueue.add() throws (Redis hiccup/shutdown) the worker never runs so its finally->clear() never fires, leaving the seeded progress record stuck for the full 1h TTL (button stuck "reindexing: 0 of N"). Roll back the seed THIS call wrote (seeded flag, only when get() was null) before re-throwing, so a concurrent active run's record is never wiped. Add tests for both the clear-on-throw and the don't-clear-a-concurrent-run paths. - Add an integration spec (real Postgres) proving getEmbeddablePageIds' WHERE stays in lockstep with countEmbeddablePages: seeds every boundary case and asserts the returned id set equals the count. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-28 04:39:18 +03:00
claude code agent 227	38a863e5f7	refactor(agent-roles-catalog): store catalog as YAML with block-scalar instructions (#229) The agent-roles catalog content files move from JSON to YAML so each role's long `instructions` system prompt is stored as a literal block scalar (`|-`): editing one sentence now produces a line-by-line diff and the prompt is editable as plain multi-line text instead of a single escaped JSON string. Data: - `index.json` -> `index.yaml`, `bundles/<id>/<lang>.json` -> `<lang>.yaml` (old `.json` deleted). Converted programmatically via the `yaml` library with `lineWidth: 0`; round-trip verified deepEqual against the old JSON, so the resolved role content is byte-for-byte identical (the only `version` bump is fact-checker v2->3, carried over from develop during the rebase; see below). Server (`AiAgentRolesCatalogProvider`): - parse with `yaml`'s safe default (JSON-compatible) schema instead of `JSON.parse` — `strict: true` (rejects duplicate keys) and `maxAliasCount: 100` (billion-laughs guard); no custom `!!` tags / no code execution. Fetched paths become `index.yaml` / `<lang>.yaml`. The streaming 1 MB size cap, `redirect: 'error'`, 10s timeout and `^[a-z0-9-]+$` path-traversal/SSRF guard are unchanged; the hand-written type guards are untouched (`instructions` is still a string after parsing). - add `yaml` as a direct server dependency (already in the lockfile as a transitive dep). Catalog tooling: - `scripts/check.mjs` parses the catalog as YAML (lockfile stays JSON); pin `yaml` as a devDependency of the catalog package. Tests: - provider spec fixtures serialized with `yaml`; new tests for the block-scalar `instructions` round-trip (exact multi-line string), malformed YAML and strict duplicate-key rejection -> BadGateway; size-cap and path-traversal cases retargeted to the `.yaml` paths. Docs: README, `.env.example`, `catalog-types.ts` comments and CHANGELOG updated to the YAML layout. 
`AI_AGENT_ROLES_CATALOG_URL` base-URL contract unchanged. Rebase onto develop + review (PR #231, comment 2509): - semantic conflict: develop's `89edddc5` bumped fact-checker v2->3 (flags errors instead of confirming facts) in the now-deleted `.json`. Resolved the modify/delete by taking the deletion and porting develop's v3 `description` + `instructions` (en + ru) into the YAML and setting `version: 3` in index.yaml. Verified by `node scripts/check.mjs` going green against develop's unchanged content-hash lock (the ported YAML hashes byte-identically to the v3 JSON). - doc fix: ai-agent-roles.service.ts catalog comment "untrusted JSON" -> YAML. - doc fix: parseYaml docstring no longer claims `strict: true` rejects unknown custom tags (yaml@2.8.x warns + resolves to a plain scalar, then the type guard rejects it); the duplicate-key claim is kept. - doc: note in check.mjs that `yaml` resolves from the repo-ROOT node_modules (via shamefully-hoist), not the catalog package's own pinned devDependency. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-28 04:38:50 +03:00
a	95d07d8d6f	fix(ai): align reindex live denominator with the steady-state count Review fixes for the reindex-progress counter (#242): 1. Denominator jump (478 -> 500 -> 478): reindexWorkspace iterated getIdsByWorkspace() (ALL non-deleted pages) but the seed/status use countEmbeddablePages (text OR existing-embedding), so the live total exceeded the steady-state total whenever empty/text-less pages existed. Add PageRepo.getEmbeddablePageIds() that selects the IDs of the EXACT same set countEmbeddablePages counts (deletedAt IS NULL AND (text_content matches a non-whitespace char OR an EXISTS non-deleted pageEmbeddings row)), and have reindexWorkspace iterate THAT set with total = its length. Iteration set and count source change together, so done reaches exactly total == the steady-state denominator. Dropping text-less pages is correct (reindexPage no-ops on them; a page that lost its text but still has stale embeddings is in the set via the EXISTS clause and still gets its stale rows cleared). Removed the contradictory "worker overwrites with the real page count" / "denominator matches" comment. 2. Mid-run re-trigger reset: reindex() unconditionally re-seeded done=0 before an enqueue that de-dupes a running job, so a second click/admin/tab reset the visible counter while the worker kept incrementing. Now seed only when get(workspaceId) === null; the worker's own start() remains the single authoritative reset. 3. TTL: documented that it is intentionally tied to write progress (start/increment) and never refreshed on get(), so a dead worker's record can't be kept alive forever by client polling. 
Tests: new embedding-reindex-progress.service.spec.ts (fake ioredis: hash -> ReindexProgress, malformed/missing/non-numeric -> null, non-finite startedAt -> 0, hgetall throws -> null, start/increment issue hset/hincrby+expire and swallow Redis errors); reindex() seed order + no-reseed-when-active guard; getMasked live test now uses progress.total=500 vs DB 478 to pin the progress branch; indexer specs updated to mock getEmbeddablePageIds. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-28 04:32:36 +03:00
a	72bb03918d	fix(ai): show live reindex progress in semantic-search settings The "Indexed X of Y pages" counter stayed stuck at "478 of 478" during a manual "Reindex now" run instead of resetting to 0 and climbing. The status reports indexedPages = countIndexedPages (DISTINCT pages with >=1 embedding row), but reindex hard-replaces each page in its OWN small transaction, so nearly all pages always have rows -> the count never drops. Add a per-workspace live reindex-progress record in Redis (reusing the existing global ioredis client via RedisService, no new Redis config): - EmbeddingReindexProgressService: start/increment/clear/get over a Redis hash with a 1h TTL self-clean; all best-effort/cosmetic so a Redis failure degrades to the existing DB-count behavior. - AiSettingsService.reindex seeds {total, done:0, startedAt} at enqueue time so the very first poll already reports done=0. - EmbeddingIndexerService.reindexWorkspace overwrites total with the real page count at start, increments done per processed page (success or handled failure), and clears the record in a finally (covers success, fatal abort, and the unconfigured early-return) so a failed run never sticks. - AiSettingsService.getMasked returns the live run numbers when a progress record is active (plus an optional reindexing flag), else falls back to countIndexedPages/countEmbeddablePages. Per-page edits (reindexPage) never touch the workspace progress record, and no mass up-front delete is introduced (search availability preserved). Tests: indexer sets/increments/clears progress (incl. fatal abort and unconfigured early-return); status reports run progress when active and falls back when not. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-28 04:32:36 +03:00
vvzvlad	c5109aa2a3	Merge pull request 'feat(footnotes): author-inline footnotes + deterministic server canonicalization (#228)' (#232) from feat/228-inline-footnotes into develop Reviewed-on: #232	2026-06-28 02:23:27 +03:00
a	c4ed4a4855	fix(footnotes): strip bare definitions on rebuild; MCP full-doc + zip-import canonicalize tests (#228 ) Review #6 (approve-with-comments) follow-ups: 1. canonicalize step 7 now strips bare footnoteDefinitions at ANY depth (stripFootnoteDefinitionsDeep), not just footnotesList, in BOTH copies. A definition hand-authored outside a list (e.g. nested in a callout via a raw-JSON write path) was left in place while a copy was also added to the rebuilt list -> duplicate, idempotent, self-perpetuating. Runs only in the rebuild path (after the lists are stripped); the fast-path / placement-keep branch is untouched. Added a shared-corpus case (bare def nested in a callout) to pin it in both mirrors. 2. markdown-clipboard: removed the dead top-level footnoteReference check in canonicalizePastedFootnotes (an inline atom is never a top-level slice child; only the descendants scan can find it). Test coverage: 4. New MCP binding tests (full-doc-write-canonicalize.test.mjs): update_page_json and copy_page_content canonicalize the persisted full doc, asserted via a new `replacePage` seam (symmetric to the existing `mutatePage` seam) so no live collab socket is needed. Routed both writers through the seam. 5. New server spec (file-import-task.service.footnote-canonicalize.spec.ts): the zip-import path (processGenericImport) canonicalizes footnotes — real markdown->HTML->JSON via a real ImportService over a temp-dir .md file, DB trx stubbed to capture the persisted page content. FileImportTaskService had no spec before. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-28 01:39:25 +03:00


859 Commits