Compare commits

...

32 Commits

Author SHA1 Message Date
claude code agent 227 ba94def3c8 fix(temporary-notes): keep the banner usable on mobile (#321)
On narrow screens the temporary-note banner squeezed its text into a
one-word-per-line ladder and overflowing words slid under the subtle
"Move to trash" button. Two layout causes, both fixed here (layout-only; no
handler/logic/i18n changes):

- The text Group had `flex: 1` (= basis 0), so the outer `wrap="wrap"` never
  wrapped the buttons to a second row — it crushed the text instead. Give it a
  non-zero basis (`flex: 1 1 16rem`) so the wrap engages on narrow containers.
- Mirror DeletedPageBanner's adaptive actions: labeled Buttons visibleFrom="sm",
  icon-only ActionIcon + Tooltip + aria-label hiddenFrom="sm" (same handlers,
  loading flags, and t() keys). This also fixes the ru locale, whose long labels
  no longer render on mobile.

The sibling DeletedPageBanner already uses this pattern; adding the second button
in #273/#277 didn't carry the adaptive part over.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 23:17:22 +03:00
vvzvlad e1b8f81b15 Merge pull request 'feat(dictation): reason-модель — говорящий tooltip на серой иконке + общий резолвер ошибок (#309)' (#314) from feat/309-dictation-reasons into develop
Reviewed-on: #314
2026-07-03 23:14:57 +03:00
claude code agent 227 b7c16dc634 i18n(dictation): add "Dictation" aria-label key to en-US + ru-RU (#314 F6)
The F4 fix introduced t("Dictation") as the neutral aria-label for a disabled
mic with no reason (reachable via the AI chat mic while the assistant streams),
but the key wasn't in either locale — a ru-RU screen-reader user would hear the
English "Dictation". Add it to both locales.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 22:43:06 +03:00
vvzvlad da952ca536 Merge pull request 'fix(#300 ui): различимые per-agent цвета глифа + аватар пользователя на переднем плане' (#319) from feat/300-avatar-colors into develop
Reviewed-on: #319
2026-07-03 22:28:44 +03:00
claude code agent 227 1458e3e152 fix(dictation): cover read-only precedence, suppress misleading disabled tooltip, drop dead "busy" (#314 F3/F4/F5)
F3: add computeDictationAvailability assertions for the read-only ∩ pre-sync
intersection (editable:false, inEditMode:true, showStatic:true) → read-only for
both isDisconnected states, pinning that lack of edit permission takes
precedence over the pre-sync reason (kills a mutant dropping `editable &&`).

F4: switching native disabled → data-disabled made a disabled mic hoverable — good
for the byline mic (shows the reason), but a consumer passing bare `disabled`
without a reason (AI chat's isStreaming) got a misleading, actionable
"Start dictation" tooltip on a click-rejecting control. Now: disabled + no reason
→ render the icon with NO Tooltip and a neutral aria-label; disabled + reason →
reason tooltip; enabled → "Start dictation". Click guard/data-disabled preserved.

F5: remove the dead "busy" DictationUnavailableReason (never produced) — union
member, its resolver case (folded into default), and the vacuous test assert.

vitest (dictation + editor-sync + dictation-group): 41 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 22:18:16 +03:00
claude code agent 227 13a333632a test(#319): pin per-agent glyph color reaches the DOM + fix stale z-order docs (F1/F2)
F1: the bug was the color never reaching the DOM (Mantine Avatar's --avatar-bg
overrode it); the pure agentGlyphBackground always returned distinct colors, so
the existing unit tests would pass even against the broken Avatar. Add a
data-testid on the glyph Box and two render tests: one asserts the emoji glyph's
applied inline background equals agentGlyphBackground(name); one asserts two
palette-distinct agents reach the DOM as different backgrounds. React applies
styles via the CSSOM (hsl→rgb), so the assertion normalizes both sides through
the same path and compares against the real function output (no frozen literal).
Fails against the pre-fix Avatar (no inline background / no glyph testid).

F2: the top-level AgentAvatarStack JSDoc and two test titles still described the
old z-order (agent glyph in front, human behind); the PR flipped it (human
launcher badge in front, zIndex 2 > glyph 1). Updated the JSDoc + both titles to
match.

vitest: 10 passed (+2).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 22:15:42 +03:00
claude_code 344b9723b2 fix(#300 ui): distinct per-agent glyph colors + launcher on top
The per-agent glyph color never showed: the circle was a Mantine
`Avatar variant="filled"` whose background was overridden by Mantine's
`--avatar-bg`, so every agent fell back to the theme's violet. Also raw
`hue = hash % 360` put many names in the same "purple" arc.

- Render the emoji/sparkles circle as a plain Box with an explicit
  background — the color is now guaranteed.
- Pick the color from a curated palette of categorically-distinct dark
  hues (red/orange/green/teal/blue/violet/magenta/slate) by name hash, so
  different agents read as different colors, not shades of one violet.
- Bring the launcher (human) badge ABOVE the agent glyph (zIndex) so it is
  fully visible at the top-right instead of half-hidden behind the circle.

client tsc clean, tests pass (added a color-distinctness assertion).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-03 21:46:31 +03:00
claude code agent 227 d57392b5af fix(dictation): extract tested computeDictationAvailability + gate mic from the reactive atom (#314 F1/F2)
F1: the availability publish-effect duplicated the #218 editability gate
(editable && inEditMode && !showStatic) inline — a copy that could silently
diverge from the tested isBodyEditable — and the reason computation (the core of
#309) had no tests. Extract computeDictationAvailability into editor-sync-state.ts
REUSING isBodyEditable; the effect is now a one-line call. Unit tests cover the
branches (synced→null; pre-sync disconnected→offline / else connecting;
!editable/!edit→read-only).

F2: DictationGroup gated the mic on the non-reactive editor.isEditable while the
PR already publishes the reactive dictationAvailability.isEditable (same signals)
— so gate and reason came from different sources and the mic could stick. Gate on
dictationAvailability.isEditable: one reactive source of truth for both.

vitest (editor-sync-state + dictation): 37 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 21:32:06 +03:00
claude code agent 227 a86e5f409f feat(dictation): reason model — speaking tooltip on a disabled mic + shared error resolver (#309)
The dictation mic could be grey/disabled while silently showing "Start
dictation", and Mantine's native `disabled` set pointer-events:none so the
Tooltip never fired at all — the UI knew the cause but told the user nothing.
Runtime error strings were also duplicated verbatim across the two dictation
hooks.

- New dictation-status.ts: the single source of truth. A DictationUnavailableReason
  enum (connecting/offline/read-only/unsupported/busy) + a DictationErrorCode enum,
  pure classifiers (classifyGetUserMediaError / classifyTranscriptionError) and
  resolvers (resolveUnavailableLabel / dictationErrorMessage). All user-facing
  dictation strings are formed here; the verbatim server message still wins for
  transcription errors.
- page-editor publishes dictationAvailabilityAtom { isEditable, reason } computed
  at the source (editable/edit-mode/showStatic/collab status): connecting vs
  offline (stuck) vs read-only. DictationGroup forwards the reason to MicButton.
- MicButton is reason-aware: a disabled mic shows the cause-specific tooltip. The
  disabled-hover silence is fixed by marking disabled the Mantine way
  (data-disabled/aria-disabled + click guard) instead of the native attribute, so
  the Tooltip fires — applied to both the idle (reason) and error (errorMessage)
  states.
- Both hooks route every error through the shared resolver (deleting the
  duplicated transcriptionErrorMessage), and expose errorMessage for the tooltip.
  Wording is byte-identical to each hook's original (incl. the batch hook's
  DOMException name prefix and the verbatim server message).
- i18n: 3 new reason keys in en-US + ru-RU, and the previously-missing ru-RU
  dictation error translations.

Tests: dictation-status.test.ts (all classifier/resolver branches, incl. server
message passthrough) + mic-button.test.tsx (disabled mic shows the reason text,
uses data-disabled not native disabled — fails against the pre-fix code).
vitest: 5 files / 32 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 21:29:39 +03:00
vvzvlad 33d22ff164 Merge pull request 'feat(comment): предложения правок агента + кнопка «Применить» (server-side atomic apply, #315)' (#318) from feat/315-comment-suggestions into develop
Reviewed-on: #318
2026-07-03 21:29:28 +03:00
vvzvlad b861266ff8 Merge pull request 'fix(ai-chat): резолв slugId→uuid в bound-chat — 500 (22P02) на открытии страницы (#312)' (#313) from fix/312-bound-chat-slug into develop
Reviewed-on: #313
2026-07-03 21:27:14 +03:00
vvzvlad 8b99b70d73 Merge pull request 'feat(#300 ui): launcher в правый верхний угол + тёмный per-agent цвет глифа' (#307) from feat/300-avatar-polish into develop
Reviewed-on: #307
2026-07-03 21:26:28 +03:00
vvzvlad b3d4922efa Merge pull request 'fix(editor): резервировать высоту документа на свопе static→live — скролл переживает своп (корень дрыга, #266)' (#308) from fix/scroll-restore-swap-height into develop
Reviewed-on: #308
2026-07-03 21:26:04 +03:00
vvzvlad 49c7c4bb64 Merge pull request 'fix(editor): реактивное чтение editor.isEditable в DictationGroup — иконка диктовки больше не залипает серой (#311)' (#316) from fix/311-reactive-editable into develop
Reviewed-on: #316
2026-07-03 21:25:49 +03:00
vvzvlad d9517ff3f1 Merge pull request 'fix(ai): устойчивость pre-response ECONNRESET — бюджет ретраев, jittered backoff, keep-alive (#310)' (#317) from fix/310-econnreset-tuning into develop
Reviewed-on: #317
2026-07-03 21:25:33 +03:00
claude code agent 227 48c1ec46f7 fix(comment): store the real anchored substring as expectedText + pin authz (#318 F1/F2)
F1 [blocking]: a suggestion whose anchor matched via normalization could never
be applied (spurious 409). The comment mark lands on the doc's ACTUAL text
(Docmost auto-converts to typographic quotes/dashes/nbsp), but the stored
selection — used as expectedText at apply — was the raw ASCII agent input
(+substring(0,250)). So replaceYjsMarkedText's strict joined!==expectedText
always failed and threw "text changed" though nobody edited. Fix: new pure
getAnchoredText(doc, selection) reconstructs the exact raw doc substring the mark
covers (slicing identical to spliceCommentMark); on the suggestion path
client.createComment stores THAT as selection, so expectedText equals the marked
text and apply returns applied:true. Live anchoring still uses the raw agent
selection (normalization still finds the anchor). Truncation raised 250->2000
(+ DTO @MaxLength(2000)) so the anchored substring is never cut below the mark
span. Ordinary comments unchanged. AI-chat shares client.createComment, so
covered. Regression tests: getAnchoredText raw-vs-ASCII; create payload selection
is the typographic substring; apply with typographic expectedText -> applied.

F2 [blocking]: added comment.controller.spec.ts pinning that validateCanEdit runs
before applySuggestion (Forbidden -> applySuggestion never called; happy path ->
called; missing comment -> 404 without authorizing).

MCP 448 pass; server comment+yjs 54 pass. MCP build/ rebuilt.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 20:29:42 +03:00
claude code agent 227 cd539558ed feat(agent-tools): suggestedText on create_comment with strict anchor uniqueness (#315 phase 6)
Agents can attach a suggested replacement when creating an inline comment, via
both the MCP create_comment tool and the AI-chat createComment tool.

Because applying a suggestion edits the EXACT anchored text, an ambiguous anchor
would let Apply corrupt the wrong occurrence. So when suggestedText is set the
selection must occur EXACTLY ONCE:
- new countAnchorMatches(doc, selection) counts occurrences across all blocks
  (same normalization/traversal as canAnchorInDoc), counting occurrences (2 in
  one block => 2) — stricter than block-count, never under-counting distinct
  occurrences (false-unique is the dangerous direction).
- client.createComment gains suggestedText: a pre-check (getPageJson +
  countAnchorMatches: 0 => not-found, >=2 => ambiguity error) before create, and
  an AUTHORITATIVE live check inside the anchoring mutation that recomputes on the
  live doc and, if != 1, aborts and rolls back the just-created comment (reusing
  the existing safeDeleteComment "anchor not found" path). Ordinary comments keep
  first-occurrence behavior unchanged.
- suggestedText is rejected on a reply or without selection in all three layers
  (MCP handler, MCP client, AI-chat tool), mirroring the server DTO/service.
- filterComment surfaces suggestedText/suggestionAppliedAt/suggestionAppliedById.
- DocmostClientLike.createComment signature updated. MCP build/ rebuilt.

Tests: countAnchorMatches (0/1/N, within/across/nested block, span nodes,
quote normalization); createComment (ambiguous refused pre-create, reply and
no-selection rejected, unique succeeds and forwards suggestedText, filterComment
surfaces it); ai-chat schema accepts suggestedText. MCP 443 pass; ai-chat 601 pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 19:35:47 +03:00
claude code agent 227 b62db917de feat(comment): suggestion diff block + Apply button + mutation (#315 phase 5)
Client UI for agent comment suggestions.

- IComment gains suggestedText / suggestionAppliedAt / suggestionAppliedById.
- comment-list-item shows a "было → стало" block (old selection struck/red, new
  suggestedText green) for a top-level comment with a suggestion, plus an Apply
  button — gated by canShowApply(comment, canEdit): edit permission AND a
  suggestion AND not applied AND not resolved AND top-level. Once applied, an
  "Applied" badge replaces the button.
- canEdit comes from page.permissions.canEdit (real edit permission, NOT the
  looser canComment) and is threaded through CommentListItem and nested
  ChildComments; fail-closed when undefined.
- useApplySuggestionMutation posts to /comments/apply-suggestion; on success it
  writes the applied + server auto-resolve fields into the react-query cache
  (UI flips to Applied + resolved without a refetch); on 409 it shows a specific
  message with the server's currentText, else a generic error.
- i18n keys added in en-US + ru-RU.

Tests (comment-list-item.test.tsx + canShowApply unit suite): Apply visibility
across canEdit/applied/resolved/reply, click dispatches the mutation, diff
rendering. 34 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 19:19:36 +03:00
claude code agent 227 ec542a924b feat(comment): store suggestedText + POST /comments/apply-suggestion (#315 phase 4)
Server side of agent comment suggestions.

- CreateCommentDto gains optional suggestedText (<=2000). CommentService.create
  accepts it ONLY for a top-level inline comment with a non-empty selection,
  requires it be non-empty and differ from selection (else BadRequest), and
  stores it.
- POST /comments/apply-suggestion (ApplySuggestionDto { commentId }): authorizes
  with validateCanEdit (applying edits page text) BEFORE any structural check or
  mutation, then CommentService.applySuggestion:
  - runs the phase-3 collab event applyCommentSuggestion on `page.<pageId>` to
    atomically check-and-replace the marked text, returning { applied, currentText };
  - applied → stamp suggestion_applied_at/by, auto-resolve the thread, ws
    commentUpdated, audit COMMENT_SUGGESTION_APPLIED;
  - already-applied (DB) → idempotent success (no re-apply), self-healing the
    resolve if it was missed — satisfies the issue's double-click / two-user
    race requirement;
  - collab verdict applied:false && currentText===suggestedText → idempotent
    success (crash between doc mutation and DB write);
  - text changed → 409 ConflictException carrying currentText;
  - gateway undefined/throw → hard error, never a silent success.
- audit-events: COMMENT_SUGGESTION_APPLIED.

Tests: create validation (reply/no-selection/equal-to-selection rejected;
valid stored) + applySuggestion verdict branches incl. both idempotent paths.
jest src/core/comment: 33 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 19:09:23 +03:00
claude code agent 227 a9da8f7f15 feat(collab): applyCommentSuggestion event + no-Redis local fallback (#315 phase 3)
New custom collab event applyCommentSuggestion runs replaceYjsMarkedText inside
the document's Yjs transaction on the owning instance and returns the
{ applied, currentText } verdict to the API-server caller (cross-process via the
Redis bridge, whose customEventComplete/replyId already carries handler return
values).

- withYdocConnection is now generic and returns the callback's result (captured
  in a closure, since hocuspocus connection.transact does not forward it). The
  callback is typed synchronous-only: transact runs fn synchronously without
  awaiting, so an async fn would mutate outside the transaction and lose
  atomicity.
- collaboration.gateway.handleYjsEvent: when Redis is disabled
  (COLLAB_DISABLE_REDIS), dispatch the handler locally against the single
  hocuspocus instance and return its verdict instead of silently returning
  undefined (which would make apply a no-op). Also fixes the pre-existing silent
  no-op of setCommentMark/resolveCommentMark without Redis.

Tests: handler spec (applied mutates doc + returns verdict; changed-text returns
{applied:false} without mutating; args forwarded; withYdocConnection returns the
value) and gateway spec (no-Redis path dispatches locally, returns the verdict,
not undefined).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 18:52:44 +03:00
claude code agent 227 7c0664d2b3 feat(collab): replaceYjsMarkedText — atomic check-and-replace of comment-marked text (#315 phase 2)
The primitive behind "Apply comment suggestion": walk the XmlFragment, collect
the delta segments carrying the `comment` mark for a commentId, and replace them
with new text ONLY if the run is intact (single Y.XmlText, contiguous, and the
joined text still equals the expected anchor). Otherwise return a verdict
{ applied:false, currentText } — null when the anchor is gone, else the current
text — so the caller can report "someone changed it". On apply it deletes the
run and re-inserts the new text re-attaching the same comment mark (thread stays
anchored). Mutates in place for the caller's connection.transact(); opens no
transaction of its own.

Non-string inserts (embeds) advance the offset by their 1-unit index length so a
marked segment after an embed gets the right position and an embed inside a run
is correctly rejected as a changed anchor.

Tests (yjs.util.spec.ts): happy path (mark preserved, surrounding text and no
mark-bleed), resolved-mark match, changed text, deleted anchor, paragraph split,
interleaved unmarked text, and embed before/inside the run. 17 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 18:41:32 +03:00
claude code agent 227 a32fba63ec feat(comment): db columns for comment suggestions (#315 phase 1)
Add suggested_text / suggestion_applied_at / suggestion_applied_by_id to the
comments table (migration) and mirror them in the hand-curated db.d.ts Comments
interface. suggested_text holds a proposed replacement for the comment's
anchored selection; the applied_* columns record who applied it and when.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 18:29:03 +03:00
claude code agent 227 808a5c70df fix(ai): harden pre-response ECONNRESET retry — bigger budget, jittered backoff, shorter keep-alive (#310)
In prod the AI provider resets the connection pre-response (ECONNRESET); the
#175 pre-response retry recovers it, but 2 of the 3 allowed attempts were burned
in a single turn — no headroom, and one more reset would surface an error to the
user. This is tuning for resilience (not a diagnosis of who resets):

- Retry budget 2 → 4 (total 5 attempts), env-configurable via
  AI_STREAM_PRE_RESPONSE_RETRIES (0 = no retry; empty/invalid → default 4).
- Backoff: linear 150*(attempt+1) → capped exponential + full jitter
  (preResponseBackoffMs, a pure injectable helper): base 150ms, ×2 per attempt,
  capped 2000ms, delay = random in [0, capped]. Avoids a synchronized retry
  storm and spreads reconnects across the reset window.
- Keep-alive default 10_000 → 4_000 ms so undici recycles idle sockets before a
  ~5s upstream/middlebox idle cutoff can poison them (a common pre-response
  reset cause). Still env-overridable via AI_STREAM_KEEPALIVE_MS.
- .env.example documents both knobs.

Timeout (900s), RETRYABLE_CONNECT_CODES, and the instrumentation are unchanged.

refs #310

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 18:24:09 +03:00
claude code agent 227 0210faabea fix(editor): read editor.isEditable reactively in DictationGroup so the mic un-greys after sync (#311)
On an editable page the dictation mic in the byline stayed grey/disabled after
collab finished syncing, until an unrelated re-render (view↔edit toggle,
navigation) happened. DictationGroup read the NON-reactive `editor.isEditable`
field directly in render; `editor` comes from pageEditorAtom (a stable object
reference), so `editor.setEditable(true)` after sync (#218 gate) mutates TipTap
state without changing the atom reference — the byline never re-renders and
disabled=true sticks.

Read editable via `useEditorState` (the same reactive read the editor body
already uses), so the mic re-enables when the body flips editable and disables
again when it loses editable. The #218 pre-sync intent is preserved — just made
reactive. Test flips isEditable false→true and asserts the mic goes
disabled→enabled (fails against the pre-fix raw-field read).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 18:19:52 +03:00
claude code agent 227 17003fbbc1 test(editor): extract swap height-reservation into a tested hook (#308 F1)
The +42-line height-reservation logic lived inline in PageEditor on the risky
static→live swap path with zero tests (F1). Extract it into
useSwapHeightReservation(showStatic, menuContainerRef) → { reservedHeight,
captureReservation }, mirroring the sibling useScrollRestoreOnSwap extraction,
so its release guard and cap are directly unit-testable.

Pure extraction — behavior identical. The capture stays a synchronous callback
the editor invokes in the collab-sync effect (reading swapWrapperRef.offsetHeight
while the static content is still mounted, before setShowStatic(false)); a
post-transition effect inside the hook would read the collapsed live height and
be wrong. The rAF release loop (release at scrollHeight >= reserved, or the
RELEASE_CAP_MS=4000 cap) and cancelAnimationFrame cleanup moved verbatim.

Tests (use-swap-height-reservation.test.ts) cover 4 branches, mutation-verified:
(a) capture → reservedHeight; (b) release when live content reaches reserved;
(c) release at the cap when it never does; (d) non-swap → stays null.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 18:13:30 +03:00
claude code agent 227 0df6242128 fix(ai-chat): resolve page slugId to uuid in bound-chat, fixing 22P02 500 (#312)
POST /api/ai-chat/bound-chat 500'd with Postgres 22P02 because the client
sends a page slugId (10-char nanoid) in the request `pageId` field, which the
server passed straight into the UUID `page_id` column. The chat-to-document
binding silently broke (client fail-softs to a new chat) and every slug-URL
page open logged a 500.

Fix: resolve the incoming id to a real page UUID on the server. PageRepo.findById
already accepts both a uuid and a slugId (isValidUUID→slugId fallback), so
boundChat now resolves the page first, guards it against a foreign/unknown
workspace (returns {chatId:null} before any chat lookup — no cross-workspace
probe), and looks up the latest chat by the resolved page.id (real uuid).

Client: renamed the local pageId→slugId for clarity (the value is a slugId);
the wire body key stays `pageId` so the DTO is unchanged. DTO left @IsString()
(a @IsUUID() would only turn the 500 into a 400 and still break binding).

Test: bound-chat spec asserts a slugId resolves and findLatestByPage is called
with the real uuid; a foreign-workspace page → {chatId:null} without a chat
lookup (no leak); an unknown id → {chatId:null}, no throw.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 18:07:23 +03:00
vvzvlad 36b3539571 Merge pull request 'refactor(ai-chat): move patch_node/insert_node into the shared tool-spec registry (#294)' (#305) from refactor/294-tool-spec-registry into develop
Reviewed-on: #305
2026-07-03 18:02:40 +03:00
vvzvlad a63efa6920 Merge pull request 'fix(ai-chat): stop the reasoning-stream hang — parse markdown only when expanded (#302)' (#303) from fix/302-reasoning-parse-when-open into develop
Reviewed-on: #303
2026-07-03 18:02:16 +03:00
claude code agent 227 ccd38152ab docs(ai-chat): correct AgentGlyph docstring — per-agent dark circle, not violet (#307 F1)
The launcher-polish commit replaced the fixed violet AGENT_COLOR background
with a per-agent dark hashed circle (agentGlyphBackground). Points 2 and 3 of
the AgentGlyph image-source docstring still said 'violet circle' — update both
to 'per-agent dark circle' so the doc matches the code.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 18:01:18 +03:00
claude_code 8f95c5808e fix(editor): reserve document height across the static→live swap so scroll survives (#266)
Root cause (confirmed via Chrome DevTools on the live app, and the fix validated
there too): after the reading-position restore lands correctly, the static→live
editor swap momentarily SHRINKS the document (the live editor lays out its content
over a few frames — measured height 32005 → 22050), so the browser CLAMPS window
scroll to the top. That is what produced all of:
- "lands correct → jumps to top → back down" (restore#2 recovering from the clamp),
- the final position overshooting (~6000px) via scroll-anchoring during recovery,
- "scroll a little → jumps to 0" (the clamp catching the reader mid-scroll).

Fixing the restore logic was chasing symptoms. This reserves the pre-swap content
height (a min-height on a wrapper around the static/live editor) until the live
editor has laid out (or a short safety cap), so the document never collapses and
window scroll simply survives the swap. Validated live: with the height pinned the
restore fires ONCE and the position stays put (no reset, no jitter, no overshoot);
the existing post-swap re-assert becomes a silent no-op.

No change to the restore hook or its tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-03 17:36:10 +03:00
agent_coder f720151c63 refactor(ai-chat): move patch_node/insert_node metadata into the shared tool-spec registry (#294)
The same tool metadata (zod schema + model-facing description) was hand-duplicated
between the standalone MCP server and the in-app AI-chat agent, so every tweak had to
land in two places and copies drifted (a materialized parity bug). The shared
transport-agnostic registry (packages/mcp/src/tool-specs.ts) already de-duplicates 14
tools; this migrates two more genuinely-identical ones — patch_node/patchNode and
insert_node/insertNode. The canonical description is a strict SUPERSET of both originals
(keeps MCP's "without resending the whole document" + table-structure/anchor guidance
AND the in-app "reversible via page history" / "exactly one of anchorNodeId or
anchorText" framing — no model-facing guidance dropped); the schema is identical (the
in-app side just gains MCP's .min(1) on ids, a safe tightening). Each transport keeps its
own execute/auth wrapper, and the in-app parseNodeArg node-arg normalization is unchanged.

The three table tools are intentionally NOT merged (a real param-name divergence:
table vs tableRef) — documented on both sides. Other per-transport divergences
(search/share/create_comment/transform/list_pages) are left separate with a short comment
explaining why (the issue asked to flag these as intentional). DocmostClientLike stays a
hand-mirror (the ESM/CJS boundary blocks a compile-time type import; a runtime drift-guard
already pins it). Also fixes a latent contract-spec bug: derive `required` from
`instanceof z.ZodOptional` (matches the emitted JSON schema) instead of `isOptional()`,
which wrongly reported z.any() fields as optional.

Partially addresses #294.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 05:55:11 +03:00
agent_coder 2d30ad1fa2 fix(ai-chat): parse reasoning markdown only while expanded to stop the thinking-stream hang (#302)
The reasoning block memoized its markdown render on [trimmed] alone, so as the
reasoning text streamed in it re-parsed the whole, ever-growing text (marked +
DOMPurify) on every throttled delta (~20Hz) — an O(n^2) CPU storm that pinned the
main thread and froze the chat during a long "thinking" phase. Worse, the block is
collapsed by default, so all that parsing was for a hidden body the user never sees
(html is only shown inside <Collapse in={open}>).

Gate the parse on `open`: collapsed shows the cheap raw-text fallback and does no
markdown parsing; expanding parses the current text once (an instant user click), and
further streaming while open is the normal per-delta append render, like the answer.

Test: assert renderChatMarkdown is not called while collapsed and is called once on
expand.

closes #302

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-03 05:01:37 +03:00
73 changed files with 4587 additions and 468 deletions
+15 -3
View File
@@ -173,9 +173,21 @@ MCP_DOCMOST_PASSWORD=
# Keep-alive recycle window (ms) for streaming chat/agent AI + external-MCP calls.
# A pooled connection idle longer than this is closed instead of reused, so a
# NAT / egress firewall / reverse proxy that silently drops idle connections
# cannot poison a reused socket into a PRE-RESPONSE `read ECONNRESET`. Lower it if
# your egress drops idle connections faster than ~10s. Default 10000 (10 s).
# AI_STREAM_KEEPALIVE_MS=10000
# cannot poison a reused socket into a PRE-RESPONSE `read ECONNRESET`. Kept under
# common ~5s upstream/middlebox idle cutoffs so undici recycles the socket before
# the network kills it (fewer resets), while still reusing within a burst of
# back-to-back calls. Lower it further if your egress drops idle connections even
# faster. Default 4000 (4 s).
# AI_STREAM_KEEPALIVE_MS=4000
# Number of PRE-RESPONSE connection retries for streaming chat/agent AI calls: a
# reset/timeout BEFORE any response byte (e.g. `read ECONNRESET` on a stale pooled
# socket) is retried on a fresh connection with jittered exponential backoff.
# Total attempts = value + 1, so the default 4 gives 5 attempts — headroom to
# absorb a short BURST of upstream resets without exhausting the budget. Safe to
# retry: a started stream is never replayed, only a connect that never responded.
# 0 disables the retry. Default 4.
# AI_STREAM_PRE_RESPONSE_RETRIES=4
# Silence timeout (ms) for EXTERNAL-MCP transport ONLY (not the chat provider).
# Tighter than AI_STREAM_TIMEOUT_MS so a byte-silent/hung MCP server is broken in
@@ -1274,6 +1274,10 @@
"Voice dictation is not configured": "Voice dictation is not configured",
"Microphone is unavailable or already in use": "Microphone is unavailable or already in use",
"Audio recording is not available in this browser/context": "Audio recording is not available in this browser/context",
"Dictation": "Dictation",
"Dictation becomes available once the page finishes connecting": "Dictation becomes available once the page finishes connecting",
"No connection to the collaboration server — dictation unavailable": "No connection to the collaboration server — dictation unavailable",
"This page is read-only": "This page is read-only",
"Request format": "Request format",
"How transcription requests are sent to the endpoint": "How transcription requests are sent to the endpoint",
"OpenAI-compatible (multipart/form-data)": "OpenAI-compatible (multipart/form-data)",
@@ -1373,5 +1377,10 @@
"Updated to the latest version": "Updated to the latest version",
"This role is no longer in the catalog": "This role is no longer in the catalog",
"This language is no longer available in the catalog": "This language is no longer available in the catalog",
"Connecting… (read-only)": "Connecting… (read-only)"
"Connecting… (read-only)": "Connecting… (read-only)",
"Apply": "Apply",
"Applied": "Applied",
"Suggestion applied": "Suggestion applied",
"Failed to apply suggestion": "Failed to apply suggestion",
"The commented text changed since this suggestion was made; it was not applied.": "The commented text changed since this suggestion was made; it was not applied."
}
@@ -393,6 +393,17 @@
"No speech detected": "Речь не распознана",
"Transcription failed": "Не удалось распознать речь",
"Voice dictation is not configured": "Голосовой ввод не настроен",
"Start dictation": "Начать диктовку",
"Stop recording": "Остановить запись",
"Microphone access denied": "Доступ к микрофону запрещён",
"No microphone found": "Микрофон не найден",
"Microphone is unavailable or already in use": "Микрофон недоступен или уже используется",
"Could not start recording": "Не удалось начать запись",
"Audio recording is not available in this browser/context": "Запись аудио недоступна в этом браузере/контексте",
"Dictation": "Диктовка",
"Dictation becomes available once the page finishes connecting": "Диктовка станет доступна после подключения к документу",
"No connection to the collaboration server — dictation unavailable": "Нет связи с сервером совместного редактирования — диктовка недоступна",
"This page is read-only": "Страница открыта только для чтения",
"Embed PDF": "Встроить PDF",
"Upload and embed a PDF file.": "Загрузите и встроите PDF-файл.",
"Embed as PDF": "Встроить как PDF",
@@ -1229,5 +1240,10 @@
"Updated to the latest version": "Обновлено до последней версии",
"This role is no longer in the catalog": "Эта роль больше не представлена в каталоге",
"This language is no longer available in the catalog": "Этот язык больше не доступен в каталоге",
"Connecting… (read-only)": "Подключение… (только чтение)"
"Connecting… (read-only)": "Подключение… (только чтение)",
"Apply": "Применить",
"Applied": "Применено",
"Suggestion applied": "Предложение применено",
"Failed to apply suggestion": "Не удалось применить предложение",
"The commented text changed since this suggestion was made; it was not applied.": "Прокомментированный текст изменился после создания предложения; оно не было применено."
}
@@ -13,6 +13,16 @@ import {
type Props = React.ComponentProps<typeof AgentAvatarStack>;
// The DOM normalizes an inline `background: hsl(...)` to `rgb(...)`. Push the
// expected color through the same CSSOM path so the comparison stays exact and
// non-vacuous (an empty string — i.e. no inline background, as in the pre-fix
// Avatar approach — can never match a real color).
function normalizeColor(value: string): string {
const probe = document.createElement("div");
probe.style.background = value;
return probe.style.background;
}
function renderStack(props: Props) {
const store = createStore();
store.set(aiChatDraftAtom, "leftover draft from another chat");
@@ -33,18 +43,21 @@ describe("agentGlyphBackground", () => {
);
});
it("differs by name and stays a fixed dark shade (readable emoji)", () => {
it("gives categorically different colors to different agents", () => {
// The two agents that looked identically violet in the report must differ.
expect(agentGlyphBackground("Структурный редактор")).not.toBe(
agentGlyphBackground("Фактчекер"),
);
expect(agentGlyphBackground("Researcher")).not.toBe(
agentGlyphBackground("Нарратор"),
);
// Only the hue varies; saturation/lightness are pinned low so the glyph is
// always a dark circle.
expect(agentGlyphBackground("Нарратор")).toMatch(/^hsl\(\d+, 45%, 24%\)$/);
// Every color is a dark hsl circle drawn from the palette.
expect(agentGlyphBackground("Нарратор")).toMatch(/^hsl\(\d+, \d+%, \d+%\)$/);
});
});
describe("AgentAvatarStack", () => {
it("internal chat WITH role: emoji glyph in front + human launcher behind", () => {
it("internal chat WITH role: emoji glyph + human launcher badge in front", () => {
const { container } = renderStack({
agent: { name: "Researcher", emoji: "🔬", avatarUrl: null },
launcher: { name: "Alice", avatarUrl: null },
@@ -60,6 +73,63 @@ describe("AgentAvatarStack", () => {
expect(screen.getByText("Alice")).toBeDefined();
});
it("emoji glyph applies its per-agent color as an inline DOM background", () => {
// Pins the actual fix: the hashed color must reach the DOM as an inline
// `background` on the glyph Box. The pre-fix `Avatar variant="filled"` set no
// inline background (Mantine's --avatar-bg overrode it), so this fails there.
const agent = { name: "Researcher", emoji: "🔬", avatarUrl: null };
const { container } = renderStack({
agent,
launcher: { name: "Alice", avatarUrl: null },
aiChatId: "chat-1",
});
const glyph = container.querySelector<HTMLElement>(
'[data-testid="agent-glyph"]',
);
expect(glyph).not.toBeNull();
// Non-vacuous: compare against the function output (normalized the same way),
// not a frozen literal. Empty against the pre-fix Avatar (no inline bg).
expect(glyph!.style.background).not.toBe("");
expect(glyph!.style.background).toBe(
normalizeColor(agentGlyphBackground(agent.name)),
);
});
it("agents with distinct hashed colors reach the DOM as distinct backgrounds", () => {
// "Researcher" and "Нарратор" hash to different palette entries, so their
// applied DOM backgrounds must differ — pins "distinct colors reach the DOM".
expect(agentGlyphBackground("Researcher")).not.toBe(
agentGlyphBackground("Нарратор"),
);
const a = renderStack({
agent: { name: "Researcher", emoji: "🔬", avatarUrl: null },
launcher: null,
aiChatId: null,
});
const b = renderStack({
agent: { name: "Нарратор", emoji: "📖", avatarUrl: null },
launcher: null,
aiChatId: null,
});
const glyphA = a.container.querySelector<HTMLElement>(
'[data-testid="agent-glyph"]',
);
const glyphB = b.container.querySelector<HTMLElement>(
'[data-testid="agent-glyph"]',
);
expect(glyphA!.style.background).toBe(
normalizeColor(agentGlyphBackground("Researcher")),
);
expect(glyphB!.style.background).toBe(
normalizeColor(agentGlyphBackground("Нарратор")),
);
// Different colors reach the DOM (the normalized rgb values also differ).
expect(glyphA!.style.background).not.toBe(glyphB!.style.background);
});
it("showName=false: renders only the avatars, no inline name label", () => {
renderStack({
agent: { name: "Researcher", emoji: "🔬", avatarUrl: null },
@@ -91,7 +161,7 @@ describe("AgentAvatarStack", () => {
expect(screen.getByText("Bob")).toBeDefined();
});
it("external MCP: agent avatar in front, NO launcher behind", () => {
it("external MCP: agent avatar only, NO human launcher badge", () => {
const { container } = renderStack({
agent: { name: "MCP Bot", avatarUrl: "http://example.test/a.png" },
launcher: null,
@@ -1,4 +1,4 @@
import { Avatar, Box, Group, Text, Tooltip } from "@mantine/core";
import { Box, Group, Text, Tooltip } from "@mantine/core";
import { IconSparkles } from "@tabler/icons-react";
import { useCallback } from "react";
import { useTranslation } from "react-i18next";
@@ -25,8 +25,8 @@ export interface LauncherInfo {
const GLYPH_SIZE = 38;
const LAUNCHER_SIZE = 22;
// How far the launcher avatar sticks out past the agent's top-right corner, so
// the "human behind" reads as behind (lower z-index) yet stays clearly visible.
// How far the launcher avatar sticks out past the agent's top-right corner — it
// sits as a small badge over that corner (above the glyph) and stays fully visible.
const LAUNCHER_OVERHANG = 8;
// Small deterministic string hash (same algorithm as custom-avatar's initials
@@ -40,22 +40,43 @@ function hashName(input: string): number {
return Math.abs(hash);
}
// A palette of categorically-DISTINCT dark circle colors for emoji/sparkles agent
// glyphs. Every entry is intentionally dark (low lightness) so a bright emoji or
// the white sparkles icon stays readable on top; the hues are spread across the
// wheel (red → orange → amber → green → teal → cyan → blue → indigo → violet →
// magenta + a neutral slate) so two different agents read as DIFFERENT colors,
// not merely different shades of the same violet.
const GLYPH_COLORS = [
"hsl(355, 60%, 34%)", // red
"hsl(18, 62%, 32%)", // vermilion
"hsl(32, 60%, 30%)", // orange
"hsl(45, 55%, 28%)", // amber
"hsl(75, 45%, 26%)", // olive-green
"hsl(140, 48%, 26%)", // green
"hsl(165, 52%, 26%)", // teal
"hsl(188, 58%, 28%)", // cyan
"hsl(205, 58%, 32%)", // sky blue
"hsl(225, 52%, 36%)", // blue
"hsl(250, 48%, 38%)", // indigo
"hsl(280, 46%, 36%)", // violet
"hsl(312, 48%, 34%)", // magenta
"hsl(210, 12%, 36%)", // slate / neutral
];
/**
* Deterministic DARK background for an emoji/sparkles agent glyph. The hue is
* derived from the agent-name hash so distinct agents get distinct circles;
* saturation and lightness are pinned low ("shifted into darkness") so a bright
* emoji or the white sparkles icon stays legible on top (#300).
* Deterministic dark circle color for an emoji/sparkles agent glyph, picked from
* GLYPH_COLORS by a hash of the agent name so distinct agents get categorically
* distinct colors while every color stays dark enough to keep the glyph readable.
*/
export function agentGlyphBackground(name: string): string {
const hue = hashName(name) % 360;
return `hsl(${hue}, 45%, 24%)`;
return GLYPH_COLORS[hashName(name) % GLYPH_COLORS.length];
}
/**
* The front avatar. Image-source priority (#300):
* 1. agent.avatarUrl -> a real avatar image (external MCP agent account).
* 2. agent.emoji -> the role emoji on a violet circle.
* 3. otherwise -> the IconSparkles glyph on a violet circle (fallback).
* 2. agent.emoji -> the role emoji on a per-agent dark circle.
* 3. otherwise -> the IconSparkles glyph on a per-agent dark circle (fallback).
*/
function AgentGlyph({ agent }: { agent: AgentInfo }) {
if (agent.avatarUrl) {
@@ -68,29 +89,33 @@ function AgentGlyph({ agent }: { agent: AgentInfo }) {
);
}
// Emoji/sparkles glyphs sit on a per-agent dark circle (hashed from the agent
// name) so different agents are visually distinct, while the dark background
// keeps the emoji / white sparkles icon readable.
const bg = agentGlyphBackground(agent.name);
const glyphStyles = {
root: { background: bg },
placeholder: { background: bg, color: "var(--mantine-color-white)" },
};
if (agent.emoji) {
return (
<Avatar size={GLYPH_SIZE} radius="xl" variant="filled" styles={glyphStyles}>
// Emoji/sparkles glyph on a per-agent dark circle (color hashed from the agent
// name). Rendered as a plain Box, NOT a Mantine `Avatar variant="filled"`, so
// the background is guaranteed instead of being overridden by Mantine's
// `--avatar-bg` (which was falling back to the theme's violet for every agent).
return (
<Box
data-testid="agent-glyph"
style={{
width: GLYPH_SIZE,
height: GLYPH_SIZE,
borderRadius: "50%",
background: agentGlyphBackground(agent.name),
color: "var(--mantine-color-white)",
display: "flex",
alignItems: "center",
justifyContent: "center",
lineHeight: 1,
}}
>
{agent.emoji ? (
<span style={{ fontSize: Math.round(GLYPH_SIZE * 0.5) }} aria-hidden>
{agent.emoji}
</span>
</Avatar>
);
}
return (
<Avatar size={GLYPH_SIZE} radius="xl" variant="filled" styles={glyphStyles}>
<IconSparkles size={Math.round(GLYPH_SIZE * 0.55)} stroke={2} />
</Avatar>
) : (
<IconSparkles size={Math.round(GLYPH_SIZE * 0.55)} stroke={2} />
)}
</Box>
);
}
@@ -110,8 +135,10 @@ export interface AgentAvatarStackProps {
}
/**
* The "agent avatar stack" (#300): the AGENT glyph in front, and — for an
* internal AI chat — the HUMAN who launched it as a smaller avatar offset behind.
* The "agent avatar stack" (#300): the AGENT glyph, and — for an internal AI
* chat — the HUMAN who launched it as a smaller avatar badge on top, overhanging
* the glyph's top-right corner in FRONT (zIndex 2 > the glyph's zIndex 1) so the
* launcher stays fully visible rather than being half-hidden behind the glyph.
* Replaces the old text `AI-agent` badge. When the item carries an `aiChatId` the
* whole stack is a deep-link into that chat (the click the old badge owned moved
* here); the click is contained (stopPropagation) so it does not also trigger an
@@ -185,7 +212,9 @@ export function AgentAvatarStack({
: {})}
>
{launcher && (
<Box pos="absolute" top={0} right={0} style={{ zIndex: 0 }}>
// Launcher badge sits ABOVE the agent glyph (zIndex) at the top-right so
// it is fully visible, not half-hidden behind the agent circle.
<Box pos="absolute" top={0} right={0} style={{ zIndex: 2 }}>
<CustomAvatar
size={LAUNCHER_SIZE}
avatarUrl={launcher.avatarUrl}
@@ -1,7 +1,14 @@
import { describe, it, expect, vi } from "vitest";
import { render, screen } from "@testing-library/react";
import { render, screen, fireEvent } from "@testing-library/react";
import { MantineProvider } from "@mantine/core";
// Spy on the markdown renderer so we can assert it is NOT called while the block
// is collapsed (the #302 fix) and IS called once on expand. The count/fallback
// tests don't depend on real markdown, so a light stub is safe.
vi.mock("@/features/ai-chat/utils/markdown.ts", () => ({
renderChatMarkdown: vi.fn((md: string) => `<p>${md}</p>`),
}));
// Stub react-i18next so `t` returns the key with `{{count}}` interpolated. This
// keeps the assertions on the component's OWN count logic (authoritative vs
// estimate) rather than on translation, and mirrors the t-mock pattern used by
@@ -17,6 +24,7 @@ vi.mock("react-i18next", () => ({
import ReasoningBlock from "./reasoning-block";
import { estimateTokens } from "@/features/ai-chat/utils/count-stream-tokens.ts";
import { renderChatMarkdown } from "@/features/ai-chat/utils/markdown.ts";
// matchMedia (read by MantineProvider) is stubbed globally in vitest.setup.ts.
@@ -62,4 +70,18 @@ describe("ReasoningBlock", () => {
// either way the text is present in the document.
expect(screen.getByText(/reasoning/)).toBeDefined();
});
it("does not parse the reasoning markdown while collapsed; parses on expand (#302)", () => {
const renderSpy = vi.mocked(renderChatMarkdown);
renderSpy.mockClear();
renderBlock({ text: "**bold** reasoning", tokens: 5 });
// Collapsed is the default. The expensive markdown parse (marked + DOMPurify)
// must NOT run for the hidden body — that O(n^2) re-parse on every streamed
// delta is exactly what froze the chat (#302). The collapsed body shows the
// cheap raw-text fallback instead.
expect(renderSpy).not.toHaveBeenCalled();
// Expanding parses the current text exactly once (a user-initiated click).
fireEvent.click(screen.getByRole("button"));
expect(renderSpy).toHaveBeenCalledTimes(1);
});
});
@@ -34,15 +34,19 @@ function ReasoningBlock({ text, tokens }: ReasoningBlockProps) {
// Authoritative count wins; otherwise estimate live from the streamed text.
const count = tokens && tokens > 0 ? tokens : estimateTokens(text);
const trimmed = text.trim();
// Memoize the markdown render so toggling `open` (or a parent re-render caused
// by an unrelated streamed delta) does not re-parse the reasoning text; it
// recomputes only when the reasoning text itself changes (while it streams in).
// collapseBlankLines collapses the blank-line gaps the model emits between every
// list item / paragraph so the reasoning renders compactly (tight lists, joined
// paragraphs) — ONLY here, not in the normal answer.
// Parse the reasoning markdown ONLY while the block is expanded. Collapsed is the
// default and the common case during a long "thinking" stream: reasoning text
// streams in and grows with every throttled delta (~20Hz), so a `[trimmed]`-only
// memo re-parses the whole, ever-growing text (marked + DOMPurify) on every delta
// — an O(n²) storm that pins the main thread and freezes the chat, all for a block
// the user isn't even looking at (the html is only shown inside <Collapse in={open}>
// below). Gating on `open` skips that hidden parsing entirely; expanding parses the
// current text once (an instant, user-initiated click), and further streaming while
// open is the normal per-delta append render, like the answer.
const html = useMemo(
() => (trimmed ? renderChatMarkdown(collapseBlankLines(trimmed), {}) : ""),
[trimmed],
() =>
open && trimmed ? renderChatMarkdown(collapseBlankLines(trimmed), {}) : "",
[open, trimmed],
);
return (
@@ -27,7 +27,9 @@ export function useOpenAiChatForCurrentPage() {
// AiChatWindow lives in a pathless parent layout route, so useParams() can't
// see :pageSlug — match the full path against the authenticated page route.
const match = useMatch("/s/:spaceSlug/p/:pageSlug");
const pageId = extractPageSlugId(match?.params?.pageSlug);
// A page slugId (10-char nanoid), NOT a uuid; the server resolves it to the
// real page uuid (PageRepo.findById accepts slugId or uuid).
const slugId = extractPageSlugId(match?.params?.pageSlug);
return useCallback(async () => {
// Re-clicks while the window is already open (incl. minimized) must NOT
@@ -40,9 +42,9 @@ export function useOpenAiChatForCurrentPage() {
// connection the first click reads as a hung control until the POST returns.
setWindowOpen(true);
let resolved: string | null = activeChatId; // off-a-page: keep current
if (pageId) {
if (slugId) {
try {
resolved = await getBoundChat(pageId); // null => fresh chat
resolved = await getBoundChat(slugId); // null => fresh chat
} catch {
resolved = null; // fail-soft: a fresh chat is always a safe fallback
}
@@ -58,7 +60,7 @@ export function useOpenAiChatForCurrentPage() {
}, [
windowOpen,
activeChatId,
pageId,
slugId,
setWindowOpen,
setActiveChatId,
setDraft,
@@ -46,9 +46,11 @@ export async function getAiChatMessages(
* Resolve the chat bound to a document (the current user's most-recent chat
* created on that page), or null when there is none. Drives auto-open-on-page.
*/
export async function getBoundChat(pageId: string): Promise<string | null> {
export async function getBoundChat(slugId: string): Promise<string | null> {
// The `pageId` body field accepts a page slugId or a uuid; the server resolves
// it to the real page uuid (the wire key stays `pageId` for the DTO).
const req = await api.post<{ chatId: string | null }>("/ai-chat/bound-chat", {
pageId,
pageId: slugId,
});
return req.data.chatId;
}
@@ -1,5 +1,5 @@
import { describe, it, expect, vi } from "vitest";
import { render, screen } from "@testing-library/react";
import { render, screen, fireEvent } from "@testing-library/react";
import { MantineProvider } from "@mantine/core";
import { IComment } from "@/features/comment/types/comment.types";
@@ -7,10 +7,15 @@ import { IComment } from "@/features/comment/types/comment.types";
// The comment mutation hooks reach out to react-query/network — stub them so the
// component renders in isolation. We only assert the AI-badge rendering branch.
const applyMutateAsync = vi.fn();
vi.mock("@/features/comment/queries/comment-query", () => ({
useDeleteCommentMutation: () => ({ mutateAsync: vi.fn() }),
useResolveCommentMutation: () => ({ mutateAsync: vi.fn() }),
useUpdateCommentMutation: () => ({ mutateAsync: vi.fn() }),
useApplySuggestionMutation: () => ({
mutateAsync: applyMutateAsync,
isPending: false,
}),
}));
// CommentEditor pulls in the full TipTap editor stack; replace it with a stub.
@@ -19,6 +24,7 @@ vi.mock("@/features/comment/components/comment-editor", () => ({
}));
import CommentListItem from "./comment-list-item";
import { canShowApply } from "@/features/comment/utils/suggestion";
const baseComment = (over?: Partial<IComment>): IComment =>
({
@@ -32,10 +38,15 @@ const baseComment = (over?: Partial<IComment>): IComment =>
...over,
}) as IComment;
function renderItem(comment: IComment) {
function renderItem(comment: IComment, canEdit = true) {
return render(
<MantineProvider>
<CommentListItem comment={comment} pageId="page-1" canComment={true} />
<CommentListItem
comment={comment}
pageId="page-1"
canComment={true}
canEdit={canEdit}
/>
</MantineProvider>,
);
}
@@ -87,3 +98,87 @@ describe("CommentListItem — agent avatar stack", () => {
// are covered directly in agent-avatar-stack.test.tsx; this integration suite
// only guards the insertion gate (agent → stack, user → no stack).
});
describe("CommentListItem — suggested edit (#315)", () => {
const suggestion = (over?: Partial<IComment>): IComment =>
baseComment({
selection: "old wording here",
suggestedText: "new wording here",
...over,
});
it("renders the было→стало diff and an Apply button when canEdit and not applied/resolved", () => {
renderItem(suggestion(), true);
// Old text appears both as the selection quote and as the struck diff row.
expect(screen.getAllByText("old wording here").length).toBeGreaterThan(0);
expect(screen.getByText("new wording here")).toBeDefined();
// Apply button is present.
expect(screen.getByRole("button", { name: "Apply" })).toBeDefined();
// No Applied badge yet.
expect(screen.queryByText("Applied")).toBeNull();
});
it("hides the Apply button when canEdit is false", () => {
renderItem(suggestion(), false);
// Diff still renders...
expect(screen.getByText("new wording here")).toBeDefined();
// ...but no Apply button.
expect(screen.queryByRole("button", { name: "Apply" })).toBeNull();
});
it("shows an Applied badge (no Apply button) once suggestionAppliedAt is set", () => {
renderItem(suggestion({ suggestionAppliedAt: new Date() }), true);
expect(screen.getByText("Applied")).toBeDefined();
expect(screen.queryByRole("button", { name: "Apply" })).toBeNull();
});
it("hides the Apply button once the thread is resolved", () => {
renderItem(suggestion({ resolvedAt: new Date() }), true);
expect(screen.queryByRole("button", { name: "Apply" })).toBeNull();
});
it("calls the apply mutation when the Apply button is clicked", () => {
applyMutateAsync.mockClear();
renderItem(suggestion(), true);
fireEvent.click(screen.getByRole("button", { name: "Apply" }));
expect(applyMutateAsync).toHaveBeenCalledWith({
commentId: "c-1",
pageId: "page-1",
});
});
it("does not render the diff block for a reply (child) comment", () => {
renderItem(
suggestion({ parentCommentId: "c-0" }),
true,
);
expect(screen.queryByText("new wording here")).toBeNull();
expect(screen.queryByRole("button", { name: "Apply" })).toBeNull();
});
});
describe("canShowApply predicate", () => {
const c = (over?: Partial<IComment>): IComment =>
({ suggestedText: "x", ...over }) as IComment;
it("true when suggestion present, editable, not applied/resolved, top-level", () => {
expect(canShowApply(c(), true)).toBe(true);
});
it("false without edit permission", () => {
expect(canShowApply(c(), false)).toBe(false);
});
it("false when no suggestion", () => {
expect(canShowApply(c({ suggestedText: null }), true)).toBe(false);
});
it("false when already applied", () => {
expect(canShowApply(c({ suggestionAppliedAt: new Date() }), true)).toBe(
false,
);
});
it("false when resolved", () => {
expect(canShowApply(c({ resolvedAt: new Date() }), true)).toBe(false);
});
it("false for a reply comment", () => {
expect(canShowApply(c({ parentCommentId: "p" }), true)).toBe(false);
});
});
@@ -1,4 +1,4 @@
import { Group, Text, Box } from "@mantine/core";
import { Group, Text, Box, Badge, Button } from "@mantine/core";
import { AgentAvatarStack } from "@/components/ui/agent-avatar-stack.tsx";
import React, { useEffect, useRef, useState } from "react";
import classes from "./comment.module.css";
@@ -11,11 +11,13 @@ import CommentMenu from "@/features/comment/components/comment-menu";
import ResolveComment from "@/features/comment/components/resolve-comment";
import { useHover } from "@mantine/hooks";
import {
useApplySuggestionMutation,
useDeleteCommentMutation,
useResolveCommentMutation,
useUpdateCommentMutation,
} from "@/features/comment/queries/comment-query";
import { IComment } from "@/features/comment/types/comment.types";
import { canShowApply } from "@/features/comment/utils/suggestion";
import { CustomAvatar } from "@/components/ui/custom-avatar.tsx";
import { currentUserAtom } from "@/features/user/atoms/current-user-atom.ts";
import { useTranslation } from "react-i18next";
@@ -24,6 +26,10 @@ interface CommentListItemProps {
comment: IComment;
pageId: string;
canComment: boolean;
// Real page-edit permission (page.permissions.canEdit) — gates the suggestion
// "Apply" button. Distinct from `canComment`, which may be looser (viewers
// allowed to comment cannot apply edits).
canEdit?: boolean;
userSpaceRole?: string;
}
@@ -31,6 +37,7 @@ function CommentListItem({
comment,
pageId,
canComment,
canEdit,
userSpaceRole,
}: CommentListItemProps) {
const { t } = useTranslation();
@@ -43,6 +50,7 @@ function CommentListItem({
const updateCommentMutation = useUpdateCommentMutation();
const deleteCommentMutation = useDeleteCommentMutation(comment.pageId);
const resolveCommentMutation = useResolveCommentMutation();
const applySuggestionMutation = useApplySuggestionMutation();
const [currentUser] = useAtom(currentUserAtom);
const createdAtAgo = useTimeAgo(comment.createdAt);
@@ -95,6 +103,18 @@ function CommentListItem({
}
}
async function handleApplySuggestion() {
try {
await applySuggestionMutation.mutateAsync({
commentId: comment.id,
pageId: comment.pageId,
});
} catch (error) {
// Errors surface via the mutation's onError notification (incl. 409).
console.error("Failed to apply suggestion:", error);
}
}
function handleCommentClick(comment: IComment) {
const el = document.querySelector(
`.comment-mark[data-comment-id="${comment.id}"]`,
@@ -211,6 +231,47 @@ function CommentListItem({
</Box>
)}
{/* Suggested-edit (#315): "было → стало" diff for a top-level comment
carrying a suggestion. Old text struck-through/red, new text green. */}
{!comment.parentCommentId && comment.suggestedText && (
<Box className={classes.suggestionBlock}>
{comment.selection && (
<Text size="xs" className={classes.suggestionOld}>
{comment.selection}
</Text>
)}
<Text size="xs" className={classes.suggestionNew}>
{comment.suggestedText}
</Text>
{comment.suggestionAppliedAt ? (
<Badge
size="sm"
color="green"
variant="light"
mt={6}
aria-label={t("Applied")}
>
{t("Applied")}
</Badge>
) : (
canShowApply(comment, canEdit) && (
<Button
size="compact-xs"
variant="light"
color="green"
mt={6}
onClick={handleApplySuggestion}
loading={applySuggestionMutation.isPending}
disabled={applySuggestionMutation.isPending}
>
{t("Apply")}
</Button>
)
)}
</Box>
)}
{!isEditing ? (
<CommentEditor defaultContent={content} editable={false} />
) : (
@@ -49,8 +49,10 @@ function CommentListWithTabs({ onClose }: CommentListWithTabsProps) {
const [isLoading, setIsLoading] = useState(false);
const { data: space } = useGetSpaceBySlugQuery(page?.space?.slug);
const canEdit = page?.permissions?.canEdit ?? false;
const canComment =
(page?.permissions?.canEdit ?? false) ||
canEdit ||
(space?.settings?.comments?.allowViewerComments === true);
// Separate active and resolved comments
@@ -137,6 +139,7 @@ function CommentListWithTabs({ onClose }: CommentListWithTabsProps) {
comment={comment}
pageId={page?.id}
canComment={canComment}
canEdit={canEdit}
userSpaceRole={space?.membership?.role}
/>
<MemoizedChildComments
@@ -144,6 +147,7 @@ function CommentListWithTabs({ onClose }: CommentListWithTabsProps) {
parentId={comment.id}
pageId={page?.id}
canComment={canComment}
canEdit={canEdit}
userSpaceRole={space?.membership?.role}
/>
</div>
@@ -160,7 +164,14 @@ function CommentListWithTabs({ onClose }: CommentListWithTabsProps) {
)}
</Paper>
),
[comments, handleAddReply, isLoading, space?.membership?.role, canComment],
[
comments,
handleAddReply,
isLoading,
space?.membership?.role,
canComment,
canEdit,
],
);
if (isCommentsLoading) {
@@ -300,6 +311,7 @@ interface ChildCommentsProps {
parentId: string;
pageId: string;
canComment: boolean;
canEdit?: boolean;
userSpaceRole?: string;
}
const ChildComments = ({
@@ -307,6 +319,7 @@ const ChildComments = ({
parentId,
pageId,
canComment,
canEdit,
userSpaceRole,
}: ChildCommentsProps) => {
const getChildComments = useCallback(
@@ -325,6 +338,7 @@ const ChildComments = ({
comment={childComment}
pageId={pageId}
canComment={canComment}
canEdit={canEdit}
userSpaceRole={userSpaceRole}
/>
<MemoizedChildComments
@@ -332,6 +346,7 @@ const ChildComments = ({
parentId={childComment.id}
pageId={pageId}
canComment={canComment}
canEdit={canEdit}
userSpaceRole={userSpaceRole}
/>
</div>
@@ -21,6 +21,38 @@
box-sizing: border-box;
}
/* Suggested-edit (#315) "было → стало" diff block. */
.suggestionBlock {
margin-top: 8px;
margin-left: 6px;
padding: 6px;
border-radius: var(--mantine-radius-sm);
border: 1px solid var(--mantine-color-default-border);
overflow-wrap: break-word;
word-break: break-word;
max-width: 100%;
box-sizing: border-box;
display: flex;
flex-direction: column;
align-items: flex-start;
}
.suggestionOld {
text-decoration: line-through;
color: var(--mantine-color-red-7);
background: var(--mantine-color-red-light);
border-radius: 2px;
padding: 1px 3px;
}
.suggestionNew {
color: var(--mantine-color-green-9);
background: var(--mantine-color-green-light);
border-radius: 2px;
padding: 1px 3px;
margin-top: 4px;
}
.commentEditor {
&[data-editable][data-surface="muted"] .ProseMirror:not(.focused) {
@@ -5,6 +5,7 @@ import {
InfiniteData,
} from "@tanstack/react-query";
import {
applySuggestion,
createComment,
deleteComment,
getPageComments,
@@ -176,6 +177,63 @@ function updateCommentInCache(
};
}
export function useApplySuggestionMutation() {
const queryClient = useQueryClient();
const { t } = useTranslation();
return useMutation<IComment, any, { commentId: string; pageId: string }>({
// No optimistic update: apply can fail with 409 (the commented text drifted),
// so we only mutate the cache once the server confirms.
mutationFn: ({ commentId }) => applySuggestion(commentId),
onSuccess: (data, variables) => {
const cache = queryClient.getQueryData(
RQ_KEY(variables.pageId),
) as InfiniteData<IPagination<IComment>> | undefined;
if (cache) {
queryClient.setQueryData(
RQ_KEY(variables.pageId),
updateCommentInCache(cache, variables.commentId, (comment) => ({
...comment,
suggestionAppliedAt: data.suggestionAppliedAt,
suggestionAppliedById: data.suggestionAppliedById,
// The server auto-resolves the thread on apply — carry that through.
resolvedAt: data.resolvedAt,
resolvedById: data.resolvedById,
resolvedBy: data.resolvedBy,
})),
);
}
notifications.show({ message: t("Suggestion applied") });
},
onError: (err: any) => {
// 409 => the commented text changed since the suggestion was made. Surface
// a specific message (with the current text) rather than a generic error.
const status = err?.response?.status;
const currentText = err?.response?.data?.currentText;
if (status === 409 && typeof currentText === "string") {
const shortText =
currentText.length > 80
? `${currentText.slice(0, 80)}`
: currentText;
notifications.show({
title: t(
"The commented text changed since this suggestion was made; it was not applied.",
),
message: shortText,
color: "red",
});
return;
}
notifications.show({
message: t("Failed to apply suggestion"),
color: "red",
});
},
});
}
export function useResolveCommentMutation() {
const queryClient = useQueryClient();
const { t } = useTranslation();
@@ -18,6 +18,13 @@ export async function resolveComment(data: IResolveComment): Promise<IComment> {
return req.data;
}
export async function applySuggestion(commentId: string): Promise<IComment> {
// Mirrors resolveComment: let axios reject on non-2xx so the mutation can read
// the 409 body (`{ message, currentText }`) off err.response.data.
const req = await api.post("/comments/apply-suggestion", { commentId });
return req.data.data ?? req.data;
}
export async function updateComment(
data: Partial<IComment>,
): Promise<IComment> {
@@ -28,6 +28,13 @@ export interface IComment {
createdSource?: string;
aiChatId?: string | null;
resolvedSource?: string | null;
// Suggested-edit (#315): when an agent proposes a replacement for the
// commented `selection`, `suggestedText` holds the "стало" text. Once a user
// applies it server-side the backend stamps `suggestionAppliedAt` /
// `suggestionAppliedById` and auto-resolves the thread.
suggestedText?: string | null;
suggestionAppliedAt?: Date | string | null;
suggestionAppliedById?: string | null;
// Server-normalized "agent avatar stack" provenance (#300), present only when
// createdSource === "agent": `agent` is the front identity, `launcher` the
// human behind it (null for an external MCP agent).
@@ -0,0 +1,14 @@
import { IComment } from "@/features/comment/types/comment.types";
// Whether the suggested-edit (#315) "Apply" button should be shown for a
// comment: it must carry a suggestion, not already be applied or resolved, be a
// top-level comment, and the viewer must be able to edit the page.
export function canShowApply(comment: IComment, canEdit?: boolean): boolean {
return Boolean(
canEdit &&
comment.suggestedText &&
!comment.suggestionAppliedAt &&
!comment.resolvedAt &&
!comment.parentCommentId,
);
}
@@ -0,0 +1,92 @@
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen } from "@testing-library/react";
import { MantineProvider } from "@mantine/core";
// A disabled mic must explain WHY it is unavailable rather than silently saying
// "Start dictation". This renders MicButton in its idle+disabled state with a
// forwarded reason and asserts the accessible label resolves to that reason's
// text via the shared resolver (dictation-status.resolveUnavailableLabel).
// matchMedia (read by MantineProvider) is stubbed globally in vitest.setup.ts.
// Pass i18n keys through verbatim so we assert the exact resolved string.
vi.mock("react-i18next", () => ({
useTranslation: () => ({ t: (s: string) => s }),
}));
// Keep both controllers inert and idle so MicButton renders the idle branch.
const idleCtl = {
status: "idle" as const,
start: vi.fn(async () => {}),
stop: vi.fn(),
cancel: vi.fn(),
audioLevel: 0,
errorMessage: null,
};
vi.mock("@/features/dictation/hooks/use-dictation", () => ({
useDictation: () => idleCtl,
}));
vi.mock("@/features/dictation/hooks/use-streaming-dictation", () => ({
useStreamingDictation: () => idleCtl,
}));
import { MicButton } from "./mic-button";
function renderButton(props: React.ComponentProps<typeof MicButton>) {
render(
<MantineProvider>
<MicButton {...props} />
</MantineProvider>,
);
}
describe("MicButton — disabled reason label", () => {
// jsdom has no MediaRecorder / mediaDevices, so isDictationSupported() would
// report "unsupported" and mask the forwarded reason. Stub both so the button
// is considered supported and the availability reason is what surfaces.
beforeEach(() => {
(globalThis as unknown as { MediaRecorder: unknown }).MediaRecorder =
class {};
Object.defineProperty(navigator, "mediaDevices", {
configurable: true,
value: { getUserMedia: vi.fn() },
});
});
afterEach(() => {
delete (globalThis as unknown as { MediaRecorder?: unknown }).MediaRecorder;
});
it("shows the cause-specific reason instead of 'Start dictation' when disabled with a reason", () => {
renderButton({ onText: () => {}, disabled: true, unavailableReason: "offline" });
const expected =
"No connection to the collaboration server — dictation unavailable";
// The reason surfaces as the accessible label (and the tooltip text).
const button = screen.getByRole("button", { name: expected });
expect(button).toBeDefined();
// It is marked disabled the Mantine way (data-disabled), NOT the native
// `disabled` attribute — otherwise pointer-events:none would kill the tooltip.
expect(button.getAttribute("data-disabled")).toBe("true");
expect(button.hasAttribute("disabled")).toBe(false);
// And it no longer silently reads "Start dictation".
expect(screen.queryByRole("button", { name: "Start dictation" })).toBeNull();
});
it("reads 'Start dictation' when enabled with no reason", () => {
renderButton({ onText: () => {} });
expect(
screen.getByRole("button", { name: "Start dictation" }),
).toBeDefined();
});
it("does not advertise 'Start dictation' when disabled with no reason", () => {
// A consumer passing bare `disabled` (e.g. the AI chat's isStreaming) with no
// unavailableReason must not get a hoverable mic whose tooltip invites
// "Start dictation" on a click that is rejected.
renderButton({ onText: () => {}, disabled: true });
expect(
screen.queryByRole("button", { name: "Start dictation" }),
).toBeNull();
const button = screen.getByRole("button");
expect(button.getAttribute("data-disabled")).toBe("true");
});
});
@@ -4,6 +4,11 @@ import { IconMicrophone, IconPlayerStopFilled } from "@tabler/icons-react";
import { useTranslation } from "react-i18next";
import { useDictation } from "@/features/dictation/hooks/use-dictation";
import { useStreamingDictation } from "@/features/dictation/hooks/use-streaming-dictation";
import {
isDictationSupported,
resolveUnavailableLabel,
type DictationUnavailableReason,
} from "@/features/dictation/dictation-status";
import classes from "./mic-button.module.css";
interface MicButtonProps {
@@ -21,6 +26,9 @@ interface MicButtonProps {
// When true, use the streaming (Silero-VAD) dictation controller, which emits
// text progressively as the user pauses; otherwise use the batch controller.
streaming?: boolean;
// When the mic is disabled for an availability reason, this is the cause the
// idle tooltip explains (e.g. pre-sync "connecting", "offline", "read-only").
unavailableReason?: DictationUnavailableReason;
}
/**
@@ -37,6 +45,7 @@ export const MicButton: FC<MicButtonProps> = ({
color,
iconSize,
streaming = false,
unavailableReason,
}) => {
const { t } = useTranslation();
// Call BOTH hooks unconditionally to respect the rules of hooks: which one is
@@ -46,7 +55,7 @@ export const MicButton: FC<MicButtonProps> = ({
const batchCtl = useDictation({ onText, onStart });
const streamingCtl = useStreamingDictation({ onText, onStart });
const ctl = streaming ? streamingCtl : batchCtl;
const { status, start, stop, audioLevel } = ctl;
const { status, start, stop, audioLevel, errorMessage } = ctl;
const resolvedIconSize = iconSize ?? (size === "lg" ? 18 : 16);
if (status === "recording") {
@@ -82,15 +91,28 @@ export const MicButton: FC<MicButtonProps> = ({
) {
// "loading" (streaming hook fetching the VAD model on first use) shows the
// same spinner+disabled state so the first click is visibly acknowledged and
// a confusing second click can't fire while the model loads.
const label = status === "loading" ? t("Preparing…") : t("Transcribing…");
// a confusing second click can't fire while the model loads. The error case
// explains the failure via the hook's resolved errorMessage instead of the
// transient "Transcribing…" label.
const label =
status === "error"
? (errorMessage ?? t("Transcription failed"))
: status === "loading"
? t("Preparing…")
: t("Transcribing…");
return (
<Tooltip label={label} withArrow>
<ActionIcon
size={size}
variant="subtle"
color={color}
disabled
// Mark disabled the Mantine way (data-disabled/aria-disabled) rather
// than the native `disabled` attribute: native `disabled` sets
// `pointer-events:none`, which suppresses hover so the Tooltip never
// fires. This is a status display with no click action to guard, so
// keeping it hoverable simply lets the error reason be read on hover.
data-disabled
aria-disabled
aria-label={label}
>
<Loader size="xs" />
@@ -99,18 +121,56 @@ export const MicButton: FC<MicButtonProps> = ({
);
}
// Idle branch. A grey/disabled mic must explain WHY it can't record. An
// unsupported browser/context is detected here; otherwise the parent forwards
// a cause-specific reason. We must NOT pass the native `disabled` prop: Mantine
// renders `<button disabled>` with `pointer-events:none`, which suppresses
// hover so the Tooltip never fires. Instead mark it disabled the Mantine way
// (data-disabled/aria-disabled) — keeping it hoverable and in the a11y tree —
// and guard the click ourselves.
const unsupported = !isDictationSupported();
const isDisabled = disabled || unsupported;
const reason: DictationUnavailableReason | undefined = unsupported
? "unsupported"
: unavailableReason;
const reasonLabel = reason ? resolveUnavailableLabel(reason, t) : undefined;
// A disabled mic with a known reason surfaces it on hover; an enabled mic
// invites "Start dictation". But a mic disabled with NO reason (e.g. a
// consumer that passes bare `disabled` — the AI chat's isStreaming, with no
// unavailableReason) must NOT hover a misleading, actionable "Start dictation"
// tooltip on a control that rejects the click. In that case we render the icon
// without a Tooltip and give it a neutral accessible label instead.
const ariaLabel = reasonLabel ?? (isDisabled ? t("Dictation") : t("Start dictation"));
const icon = (
<ActionIcon
size={size}
variant="subtle"
color={color}
onClick={(e) => {
if (isDisabled) {
e.preventDefault();
return;
}
void start();
}}
data-disabled={isDisabled || undefined}
aria-disabled={isDisabled}
aria-label={ariaLabel}
>
<IconMicrophone size={resolvedIconSize} />
</ActionIcon>
);
// Suppress the tooltip on a disabled mic that has nothing to explain — hovering
// a grey, unclickable mic should not advertise "Start dictation".
if (isDisabled && !reasonLabel) {
return icon;
}
return (
<Tooltip label={t("Start dictation")} withArrow>
<ActionIcon
size={size}
variant="subtle"
color={color}
onClick={() => void start()}
disabled={disabled}
aria-label={t("Start dictation")}
>
<IconMicrophone size={resolvedIconSize} />
</ActionIcon>
<Tooltip
label={reasonLabel ?? t("Start dictation")}
withArrow
>
{icon}
</Tooltip>
);
};
@@ -0,0 +1,156 @@
import { describe, it, expect } from "vitest";
import {
classifyGetUserMediaError,
classifyTranscriptionError,
dictationErrorMessage,
resolveUnavailableLabel,
isDictationSupported,
} from "./dictation-status";
// Unit tests for the shared dictation-status resolvers (dictation-status.ts).
// Both dictation hooks and the mic button form their user-facing strings here,
// so a regression in the classification or message mapping would silently swap
// what a user reads when the mic is grey or a recording fails. A fake `t`
// returns its key verbatim so we assert the exact i18n key each branch selects.
const t = (k: string) => k;
describe("classifyGetUserMediaError", () => {
it("maps NotAllowedError / SecurityError to mic-denied", () => {
expect(classifyGetUserMediaError({ name: "NotAllowedError" })).toBe(
"mic-denied",
);
expect(classifyGetUserMediaError({ name: "SecurityError" })).toBe(
"mic-denied",
);
});
it("maps NotFoundError / OverconstrainedError to no-mic", () => {
expect(classifyGetUserMediaError({ name: "NotFoundError" })).toBe("no-mic");
expect(classifyGetUserMediaError({ name: "OverconstrainedError" })).toBe(
"no-mic",
);
});
it("maps NotReadableError / AbortError to mic-in-use", () => {
expect(classifyGetUserMediaError({ name: "NotReadableError" })).toBe(
"mic-in-use",
);
expect(classifyGetUserMediaError({ name: "AbortError" })).toBe(
"mic-in-use",
);
});
it("maps anything else / undefined to unknown", () => {
expect(classifyGetUserMediaError({ name: "WeirdError" })).toBe("unknown");
expect(classifyGetUserMediaError(undefined)).toBe("unknown");
expect(classifyGetUserMediaError({})).toBe("unknown");
});
});
describe("classifyTranscriptionError", () => {
it("returns the verbatim server message when present", () => {
const err = { response: { status: 500, data: { message: "provider 404" } } };
expect(classifyTranscriptionError(err)).toEqual({
code: "transcription-failed",
serverMessage: "provider 404",
});
});
it("maps 503 / 403 (no server message) to stt-not-configured", () => {
expect(classifyTranscriptionError({ response: { status: 503 } })).toEqual({
code: "stt-not-configured",
});
expect(classifyTranscriptionError({ response: { status: 403 } })).toEqual({
code: "stt-not-configured",
});
});
it("falls back to transcription-failed with no server message otherwise", () => {
expect(classifyTranscriptionError({ response: { status: 500 } })).toEqual({
code: "transcription-failed",
});
expect(classifyTranscriptionError(new Error("network"))).toEqual({
code: "transcription-failed",
});
// Blank server message is ignored (does not win as verbatim text).
expect(
classifyTranscriptionError({ response: { data: { message: " " } } }),
).toEqual({ code: "transcription-failed" });
});
});
describe("dictationErrorMessage", () => {
it("maps each code to the expected i18n key", () => {
expect(dictationErrorMessage("mic-denied", t)).toBe(
"Microphone access denied",
);
expect(dictationErrorMessage("no-mic", t)).toBe("No microphone found");
expect(dictationErrorMessage("mic-in-use", t)).toBe(
"Microphone is unavailable or already in use",
);
expect(dictationErrorMessage("no-media-devices", t)).toBe(
"Audio recording is not available in this browser/context",
);
expect(dictationErrorMessage("stt-not-configured", t)).toBe(
"Voice dictation is not configured",
);
expect(dictationErrorMessage("transcription-failed", t)).toBe(
"Transcription failed",
);
expect(dictationErrorMessage("recorder-failed", t)).toBe(
"Could not start recording",
);
expect(dictationErrorMessage("vad-init-failed", t)).toBe(
"Could not start recording",
);
expect(dictationErrorMessage("unknown", t)).toBe(
"Could not start recording",
);
});
it("returns the server message verbatim for transcription-failed (not the t key)", () => {
expect(
dictationErrorMessage("transcription-failed", t, {
serverMessage: "quota exceeded",
}),
).toBe("quota exceeded");
});
it("appends the detail to recorder-failed / unknown", () => {
expect(
dictationErrorMessage("recorder-failed", t, { detail: "boom" }),
).toBe("Could not start recording: boom");
expect(dictationErrorMessage("unknown", t, { detail: "nope" })).toBe(
"Could not start recording: nope",
);
});
it("appends the detail to transcription-failed when there is no server message", () => {
expect(
dictationErrorMessage("transcription-failed", t, { detail: "timeout" }),
).toBe("Transcription failed: timeout");
});
});
describe("resolveUnavailableLabel", () => {
it("maps each reason to its expected i18n key", () => {
expect(resolveUnavailableLabel("connecting", t)).toBe(
"Dictation becomes available once the page finishes connecting",
);
expect(resolveUnavailableLabel("offline", t)).toBe(
"No connection to the collaboration server — dictation unavailable",
);
expect(resolveUnavailableLabel("read-only", t)).toBe(
"This page is read-only",
);
expect(resolveUnavailableLabel("unsupported", t)).toBe(
"Audio recording is not available in this browser/context",
);
});
});
describe("isDictationSupported", () => {
it("returns a boolean", () => {
expect(typeof isDictationSupported()).toBe("boolean");
});
});
@@ -0,0 +1,110 @@
// Single source of truth for "why dictation is unavailable" and "why it failed".
// Both dictation hooks and the mic button pull their user-facing strings from
// the resolvers here so the wording lives in exactly one place.
export type DictationUnavailableReason =
| "connecting"
| "offline"
| "read-only"
| "unsupported";
export type DictationErrorCode =
| "no-media-devices"
| "mic-denied"
| "no-mic"
| "mic-in-use"
| "recorder-failed"
| "vad-init-failed"
| "stt-not-configured"
| "transcription-failed"
| "unknown";
// True if this browser/context can record audio.
export function isDictationSupported(): boolean {
return (
typeof MediaRecorder !== "undefined" &&
typeof navigator !== "undefined" &&
!!navigator.mediaDevices?.getUserMedia
);
}
// getUserMedia / VAD.start rejection -> code, by DOMException .name.
export function classifyGetUserMediaError(err: unknown): DictationErrorCode {
const name = (err as { name?: string })?.name;
if (name === "NotAllowedError" || name === "SecurityError")
return "mic-denied";
if (name === "NotFoundError" || name === "OverconstrainedError")
return "no-mic";
if (name === "NotReadableError" || name === "AbortError") return "mic-in-use";
return "unknown";
}
// Transcription HTTP failure -> code (+ verbatim server message when present).
export function classifyTranscriptionError(err: unknown): {
code: DictationErrorCode;
serverMessage?: string;
} {
const resp = (
err as { response?: { status?: number; data?: { message?: string } } }
)?.response;
const serverMessage = resp?.data?.message;
if (serverMessage && serverMessage.trim().length > 0)
return { code: "transcription-failed", serverMessage };
if (resp?.status === 503 || resp?.status === 403)
return { code: "stt-not-configured" };
return { code: "transcription-failed" };
}
type TFn = (key: string) => string;
// Code -> user text. The ONE place runtime error strings are formed.
// serverMessage (verbatim) wins for transcription-failed; detail is appended
// to the generic "could not start"/"transcription failed" strings.
export function dictationErrorMessage(
code: DictationErrorCode,
t: TFn,
extra?: { serverMessage?: string; detail?: string },
): string {
const detail = extra?.detail;
switch (code) {
case "mic-denied":
return t("Microphone access denied");
case "no-mic":
return t("No microphone found");
case "mic-in-use":
return t("Microphone is unavailable or already in use");
case "no-media-devices":
return t("Audio recording is not available in this browser/context");
case "stt-not-configured":
return t("Voice dictation is not configured");
case "transcription-failed":
if (extra?.serverMessage && extra.serverMessage.trim().length > 0)
return extra.serverMessage;
return `${t("Transcription failed")}${detail ? `: ${detail}` : ""}`;
case "recorder-failed":
case "vad-init-failed":
case "unknown":
default:
return `${t("Could not start recording")}${detail ? `: ${detail}` : ""}`;
}
}
// Unavailable reason -> tooltip text (the ONE place these strings are formed).
export function resolveUnavailableLabel(
r: DictationUnavailableReason,
t: TFn,
): string {
switch (r) {
case "connecting":
return t("Dictation becomes available once the page finishes connecting");
case "offline":
return t(
"No connection to the collaboration server — dictation unavailable",
);
case "read-only":
return t("This page is read-only");
case "unsupported":
default:
return t("Audio recording is not available in this browser/context");
}
}
@@ -2,6 +2,11 @@ import { useCallback, useEffect, useRef, useState } from "react";
import { notifications } from "@mantine/notifications";
import { useTranslation } from "react-i18next";
import { transcribeAudio } from "@/features/dictation/services/dictation-service";
import {
classifyGetUserMediaError,
classifyTranscriptionError,
dictationErrorMessage,
} from "@/features/dictation/dictation-status";
// "loading" is set only by the streaming hook while it lazily loads the VAD
// model on first use; the batch hook never sets it. It exists so the streaming
@@ -26,6 +31,8 @@ interface UseDictationResult {
cancel: () => void;
// Smoothed live microphone level in the 0..1 range while recording (0 when idle).
audioLevel: number;
// The last error shown to the user (null until one occurs / on a new start).
errorMessage: string | null;
}
// Candidate container/codec combinations in preference order. The first one the
@@ -67,6 +74,8 @@ export function useDictation(
const { t } = useTranslation();
const [status, setStatus] = useState<DictationStatus>("idle");
const [audioLevel, setAudioLevel] = useState(0);
// Last error message shown to the user; the mic button reads it for its tooltip.
const [errorMessage, setErrorMessage] = useState<string | null>(null);
// Keep the latest callbacks in a ref so the recorder's onstop closure always
// calls the current handlers without re-creating the recorder.
@@ -194,15 +203,16 @@ export function useDictation(
if (startingRef.current || recorderRef.current || streamRef.current) return;
if (status !== "idle") return;
startingRef.current = true;
// Clear any stale error from a previous attempt.
setErrorMessage(null);
if (!navigator.mediaDevices?.getUserMedia) {
const reason =
"navigator.mediaDevices.getUserMedia is unavailable in this context";
console.error("[dictation] " + reason);
notifications.show({
color: "red",
message: t("Audio recording is not available in this browser/context"),
});
const message = dictationErrorMessage("no-media-devices", t);
notifications.show({ color: "red", message });
setErrorMessage(message);
setStatus("idle");
startingRef.current = false;
return;
@@ -215,19 +225,16 @@ export function useDictation(
// Always log the full error for diagnosis (name, message, stack).
console.error("[dictation] getUserMedia failed", err);
const name = (err as { name?: string })?.name;
const detail = (err as { message?: string })?.message ?? String(err);
let message: string;
if (name === "NotAllowedError" || name === "SecurityError") {
message = t("Microphone access denied");
} else if (name === "NotFoundError" || name === "OverconstrainedError") {
message = t("No microphone found");
} else if (name === "NotReadableError" || name === "AbortError") {
message = t("Microphone is unavailable or already in use");
} else {
// Unknown failure: show the real reason instead of a generic string.
message = `${t("Could not start recording")}: ${name ? `${name}: ` : ""}${detail}`;
}
const rawDetail = (err as { message?: string })?.message ?? String(err);
// Prefix the DOMException name (e.g. "TypeError: …") so the generic
// resolver branch reproduces this hook's original "Could not start
// recording: <name>: <detail>" text. Each caller owns its own detail; the
// streaming hook intentionally does not add the name.
const detail = `${name ? `${name}: ` : ""}${rawDetail}`;
const code = classifyGetUserMediaError(err);
const message = dictationErrorMessage(code, t, { detail });
notifications.show({ color: "red", message });
setErrorMessage(message);
setStatus("idle");
startingRef.current = false;
return;
@@ -249,10 +256,10 @@ export function useDictation(
// The stream was acquired but the recorder failed to construct; stop the
// tracks so the MediaStream does not leak before bailing out.
stopTracks();
notifications.show({
color: "red",
message: `${t("Could not start recording")}: ${(err as { message?: string })?.message ?? String(err)}`,
});
const detail = (err as { message?: string })?.message ?? String(err);
const message = dictationErrorMessage("recorder-failed", t, { detail });
notifications.show({ color: "red", message });
setErrorMessage(message);
setStatus("idle");
startingRef.current = false;
return;
@@ -293,21 +300,14 @@ export function useDictation(
.catch((err: unknown) => {
// Log the full error for diagnosis (status + body + stack).
console.error("[dictation] transcription failed", err);
const resp = (
err as { response?: { status?: number; data?: { message?: string } } }
)?.response;
const serverMsg = resp?.data?.message;
let message: string;
if (serverMsg && serverMsg.trim().length > 0) {
// The server already explains the cause (e.g. provider 404, bad
// format, STT not configured) — show it verbatim.
message = serverMsg;
} else if (resp?.status === 503 || resp?.status === 403) {
message = t("Voice dictation is not configured");
} else {
message = `${t("Transcription failed")}: ${(err as { message?: string })?.message ?? String(err)}`;
}
const { code, serverMessage } = classifyTranscriptionError(err);
const detail = (err as { message?: string })?.message ?? String(err);
const message = dictationErrorMessage(code, t, {
serverMessage,
detail,
});
notifications.show({ color: "red", message });
setErrorMessage(message);
setStatus("error");
if (errorTimerRef.current !== null) {
clearTimeout(errorTimerRef.current);
@@ -332,10 +332,10 @@ export function useDictation(
stopTracks();
recorderRef.current = null;
startingRef.current = false;
notifications.show({
color: "red",
message: `${t("Could not start recording")}: ${(err as { message?: string })?.message ?? String(err)}`,
});
const detail = (err as { message?: string })?.message ?? String(err);
const message = dictationErrorMessage("recorder-failed", t, { detail });
notifications.show({ color: "red", message });
setErrorMessage(message);
setStatus("idle");
return;
}
@@ -405,5 +405,5 @@ export function useDictation(
};
}, [clearTimer, stopTracks, stopMeter]);
return { status, start, stop, cancel, audioLevel };
return { status, start, stop, cancel, audioLevel, errorMessage };
}
@@ -4,6 +4,11 @@ import { useTranslation } from "react-i18next";
import { transcribeAudio } from "@/features/dictation/services/dictation-service";
import { encodeWavPcm16 } from "@/features/dictation/utils/encode-wav";
import type { DictationStatus } from "@/features/dictation/hooks/use-dictation";
import {
classifyGetUserMediaError,
classifyTranscriptionError,
dictationErrorMessage,
} from "@/features/dictation/dictation-status";
// Lazily-imported MicVAD type. The runtime import happens inside start() so the
// heavy onnxruntime-web / Silero model is code-split out of the main bundle and
@@ -27,6 +32,8 @@ interface UseStreamingDictationResult {
cancel: () => void;
// Smoothed live speech level in the 0..1 range while recording (0 when idle).
audioLevel: number;
// The last error shown to the user (null until one occurs / on a new start).
errorMessage: string | null;
}
// Sample rate of the audio MicVAD hands to onSpeechEnd (Silero VAD runs at 16k).
@@ -60,6 +67,8 @@ export function useStreamingDictation(
const { t } = useTranslation();
const [status, setStatus] = useState<DictationStatus>("idle");
const [audioLevel, setAudioLevel] = useState(0);
// Last error message shown to the user; the mic button reads it for its tooltip.
const [errorMessage, setErrorMessage] = useState<string | null>(null);
// Keep the latest callbacks in a ref so async VAD/HTTP closures always call the
// current handlers without re-creating the VAD.
@@ -158,26 +167,6 @@ export function useStreamingDictation(
}
}, []);
// Map a transcription error to a user-facing message, mirroring the batch hook.
const transcriptionErrorMessage = useCallback(
(err: unknown): string => {
const resp = (
err as { response?: { status?: number; data?: { message?: string } } }
)?.response;
const serverMsg = resp?.data?.message;
if (serverMsg && serverMsg.trim().length > 0) {
// The server already explains the cause (e.g. provider 404, bad format,
// STT not configured) — show it verbatim.
return serverMsg;
}
if (resp?.status === 503 || resp?.status === 403) {
return t("Voice dictation is not configured");
}
return `${t("Transcription failed")}: ${(err as { message?: string })?.message ?? String(err)}`;
},
[t],
);
// Handle one ended speech segment: encode to WAV and transcribe. Results are
// buffered by seq and flushed in order. A single failed segment does NOT kill
// the session: log + one notification, then advance past that seq so later
@@ -204,10 +193,14 @@ export function useStreamingDictation(
if (epoch !== epochRef.current) return;
// Log the full error for diagnosis (status + body + stack).
console.error("[dictation] segment transcription failed", err);
notifications.show({
color: "red",
message: transcriptionErrorMessage(err),
const { code, serverMessage } = classifyTranscriptionError(err);
const detail = (err as { message?: string })?.message ?? String(err);
const message = dictationErrorMessage(code, t, {
serverMessage,
detail,
});
notifications.show({ color: "red", message });
setErrorMessage(message);
// Skip this seq so later segments can still flush in order.
if (nextEmitSeqRef.current === seq) {
nextEmitSeqRef.current += 1;
@@ -226,7 +219,7 @@ export function useStreamingDictation(
}
});
},
[drainResults, transcriptionErrorMessage],
[drainResults, t],
);
const start = useCallback(async (): Promise<void> => {
@@ -236,6 +229,8 @@ export function useStreamingDictation(
if (startingRef.current || vadRef.current || activeRef.current) return;
if (status !== "idle") return;
startingRef.current = true;
// Clear any stale error from a previous attempt.
setErrorMessage(null);
// Notify the caller right when dictation begins (before any async work) so the
// editor can snapshot the caret position.
@@ -354,10 +349,9 @@ export function useStreamingDictation(
// actually runs.)
console.error("[dictation] VAD init failed", err);
const detail = (err as { message?: string })?.message ?? String(err);
notifications.show({
color: "red",
message: `${t("Could not start recording")}: ${detail}`,
});
const message = dictationErrorMessage("vad-init-failed", t, { detail });
notifications.show({ color: "red", message });
setErrorMessage(message);
// Defensive: if MicVAD.new partially succeeded before throwing, make sure we
// don't leak it.
destroyVad();
@@ -379,19 +373,11 @@ export function useStreamingDictation(
} catch (err) {
// Always log the full error for diagnosis (name, message, stack).
console.error("[dictation] VAD.start failed", err);
const name = (err as { name?: string })?.name;
const detail = (err as { message?: string })?.message ?? String(err);
let message: string;
if (name === "NotAllowedError" || name === "SecurityError") {
message = t("Microphone access denied");
} else if (name === "NotFoundError" || name === "OverconstrainedError") {
message = t("No microphone found");
} else if (name === "NotReadableError" || name === "AbortError") {
message = t("Microphone is unavailable or already in use");
} else {
message = `${t("Could not start recording")}: ${detail}`;
}
const code = classifyGetUserMediaError(err);
const message = dictationErrorMessage(code, t, { detail });
notifications.show({ color: "red", message });
setErrorMessage(message);
activeRef.current = false;
destroyVad();
setStatus("idle");
@@ -470,5 +456,5 @@ export function useStreamingDictation(
};
}, [clearTimer, destroyVad]);
return { status, start, stop, cancel, audioLevel };
return { status, start, stop, cancel, audioLevel, errorMessage };
}
@@ -1,6 +1,7 @@
import { atom } from "jotai";
import { Editor } from "@tiptap/core";
import { PageEditMode } from "@/features/user/types/user.types.ts";
import type { DictationUnavailableReason } from "@/features/dictation/dictation-status";
export const pageEditorAtom = atom<Editor | null>(null);
@@ -15,3 +16,15 @@ export const showLinkMenuAtom = atom(false);
// Current page's edit mode — initialized from the user's saved preference on
// first load, can be toggled locally without persisting to the server.
export const currentPageEditModeAtom = atom<PageEditMode>(PageEditMode.Edit);
// Whether the dictation mic can start, and (when it can't) the cause-specific
// reason the mic button surfaces as a tooltip. Published by the page editor,
// consumed by DictationGroup -> MicButton.
export type DictationAvailability = {
isEditable: boolean;
reason: DictationUnavailableReason | null;
};
export const dictationAvailabilityAtom = atom<DictationAvailability>({
isEditable: false,
reason: null,
});
@@ -0,0 +1,60 @@
import { describe, it, expect, vi } from "vitest";
import { render, act } from "@testing-library/react";
import { Provider, createStore } from "jotai";
import { dictationAvailabilityAtom } from "@/features/editor/atoms/editor-atoms.ts";
// Regression test for the byline mic staying stuck disabled (#311 / #309): on a
// page the user can edit, the mic must un-grey once the body becomes editable.
// #311 first fixed this by reading `editor.isEditable` via `useEditorState`; #309
// superseded that with a reactive `dictationAvailabilityAtom` that page-editor
// publishes (carrying both the editable gate AND the unavailable reason). The mic
// now gates on `dictationAvailability.isEditable`, so a change to that atom must
// re-render the group and flip the disabled state (jotai drives the subscription).
// Detectable stand-in that surfaces the `disabled` prop the component computes.
vi.mock("@/features/dictation/components/mic-button", () => ({
MicButton: ({ disabled }: any) => (
<button data-testid="mic" disabled={disabled} />
),
}));
import { DictationGroup } from "./dictation-group";
// Minimal editor stand-in matching the surface DictationGroup uses (handleStart /
// handleText). The disabled gate no longer reads this — it reads the atom.
function makeFakeEditor() {
return {
isEditable: false,
isDestroyed: false,
state: { selection: { from: 0, to: 0 }, doc: { content: { size: 0 } } },
} as any;
}
describe("DictationGroup editable reactivity (#309 atom / #311)", () => {
it("re-enables the mic when dictationAvailability flips isEditable false -> true", () => {
const editor = makeFakeEditor();
const store = createStore();
// Pre-sync: page editor publishes not-editable (with a reason).
store.set(dictationAvailabilityAtom, {
isEditable: false,
reason: "connecting",
});
const { getByTestId } = render(
<Provider store={store}>
<DictationGroup editor={editor} />
</Provider>,
);
// Not editable yet -> disabled (preserves the #218 pre-sync intent).
expect(getByTestId("mic").hasAttribute("disabled")).toBe(true);
// Collab sync -> page editor republishes editable; the atom change must
// re-render the group and enable the mic.
act(() => {
store.set(dictationAvailabilityAtom, { isEditable: true, reason: null });
});
expect(getByTestId("mic").hasAttribute("disabled")).toBe(false);
});
});
@@ -1,7 +1,8 @@
import { FC, useRef } from "react";
import type { Editor } from "@tiptap/react";
import { Editor } from "@tiptap/react";
import { useAtomValue } from "jotai";
import { workspaceAtom } from "@/features/user/atoms/current-user-atom.ts";
import { dictationAvailabilityAtom } from "@/features/editor/atoms/editor-atoms.ts";
import { MicButton } from "@/features/dictation/components/mic-button";
interface Props {
@@ -16,6 +17,8 @@ export const DictationGroup: FC<Props> = ({ editor, color, iconSize }) => {
const workspace = useAtomValue(workspaceAtom);
const streamingDictation =
workspace?.settings?.ai?.dictationStreaming === true;
// Cause-specific reason the mic is unavailable (published by the page editor).
const dictationAvailability = useAtomValue(dictationAvailabilityAtom);
// Caret snapshot taken when dictation starts (where the first segment lands).
const rangeRef = useRef<{ from: number; to: number } | null>(null);
// Running insertion point: after each inserted segment we remember the caret
@@ -80,7 +83,8 @@ export const DictationGroup: FC<Props> = ({ editor, color, iconSize }) => {
streaming={streamingDictation}
onStart={handleStart}
onText={handleText}
disabled={!editor.isEditable}
disabled={!dictationAvailability.isEditable}
unavailableReason={dictationAvailability.reason ?? undefined}
color={color}
iconSize={iconSize}
/>
@@ -1,6 +1,10 @@
import { describe, it, expect } from "vitest";
import { WebSocketStatus } from "@hocuspocus/provider";
import { isCollabSynced, isBodyEditable } from "./editor-sync-state";
import {
isCollabSynced,
isBodyEditable,
computeDictationAvailability,
} from "./editor-sync-state";
describe("isCollabSynced", () => {
it("is true only when Connected and synced", () => {
@@ -30,3 +34,77 @@ describe("isBodyEditable (pre-sync data-loss gate, #218)", () => {
expect(isBodyEditable({ ...base, inEditMode: false })).toBe(false);
});
});
describe("computeDictationAvailability (mic reason precedence, #309)", () => {
const base = {
editable: true,
inEditMode: true,
showStatic: false,
isDisconnected: false,
};
it("is available with no reason once synced (showStatic false)", () => {
expect(computeDictationAvailability(base)).toEqual({
isEditable: true,
reason: null,
});
});
it("reports 'offline' during pre-sync while disconnected", () => {
expect(
computeDictationAvailability({
...base,
showStatic: true,
isDisconnected: true,
}),
).toEqual({ isEditable: false, reason: "offline" });
});
it("reports 'connecting' during pre-sync while still connecting", () => {
expect(
computeDictationAvailability({
...base,
showStatic: true,
isDisconnected: false,
}),
).toEqual({ isEditable: false, reason: "connecting" });
});
it("reports 'read-only' without edit permission", () => {
expect(
computeDictationAvailability({ ...base, editable: false }),
).toEqual({ isEditable: false, reason: "read-only" });
});
it("reports 'read-only' when not in edit mode", () => {
expect(
computeDictationAvailability({ ...base, inEditMode: false }),
).toEqual({ isEditable: false, reason: "read-only" });
});
// Lack of edit permission takes precedence over the pre-sync reason: a
// read-only viewer who is ALSO inside the pre-sync window (showStatic) must
// still read "read-only", never "offline"/"connecting". This pins the
// `opts.editable &&` guard on the pre-sync branch.
it("prefers 'read-only' over pre-sync when a read-only viewer is disconnected", () => {
expect(
computeDictationAvailability({
editable: false,
inEditMode: true,
showStatic: true,
isDisconnected: true,
}),
).toEqual({ isEditable: false, reason: "read-only" });
});
it("prefers 'read-only' over pre-sync when a read-only viewer is still connecting", () => {
expect(
computeDictationAvailability({
editable: false,
inEditMode: true,
showStatic: true,
isDisconnected: false,
}),
).toEqual({ isEditable: false, reason: "read-only" });
});
});
@@ -1,4 +1,5 @@
import { WebSocketStatus } from "@hocuspocus/provider";
import type { DictationUnavailableReason } from "@/features/dictation/dictation-status";
/**
* The collab document is usable only once the provider is Connected AND has
@@ -30,3 +31,32 @@ export function isBodyEditable(opts: {
}): boolean {
return opts.editable && opts.inEditMode && !opts.showStatic;
}
/**
* Whether dictation can start and, when it can't, the cause-specific reason the
* mic button surfaces. Derives editability from `isBodyEditable` (the single,
* tested gate) so the published `isEditable` can never diverge from the actual
* body-editable state and make the tooltip lie (#309).
*
* `isDisconnected` is the caller's own boolean (collab connection is in the
* Disconnected state), passed in so this module stays free of the collab enum.
*/
export function computeDictationAvailability(opts: {
editable: boolean;
inEditMode: boolean;
showStatic: boolean;
isDisconnected: boolean;
}): { isEditable: boolean; reason: DictationUnavailableReason | null } {
const isEditable = isBodyEditable({
editable: opts.editable,
inEditMode: opts.inEditMode,
showStatic: opts.showStatic,
});
if (isEditable) return { isEditable, reason: null };
// Permitted to edit and in edit mode but not yet synced (showStatic) → pre-sync.
if (opts.editable && opts.inEditMode && opts.showStatic) {
return { isEditable, reason: opts.isDisconnected ? "offline" : "connecting" };
}
// No edit permission or not in edit mode.
return { isEditable, reason: "read-only" };
}
@@ -0,0 +1,164 @@
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { renderHook, act } from "@testing-library/react";
import type { RefObject } from "react";
import { useSwapHeightReservation } from "./use-swap-height-reservation";
// Controllable fake requestAnimationFrame. jsdom's rAF is timer-driven and hard
// to step deterministically, so we install a manual queue: `tickRaf()` drains the
// callbacks scheduled so far (a callback that reschedules enqueues a new one for
// the NEXT tick), letting each test advance the release loop frame by frame.
let rafQueue: Array<{ id: number; cb: FrameRequestCallback }> = [];
let nextRafId = 1;
let realRaf: typeof globalThis.requestAnimationFrame;
let realCancel: typeof globalThis.cancelAnimationFrame;
function tickRaf(): void {
const current = rafQueue;
rafQueue = [];
for (const { cb } of current) cb(0);
}
// A mutable stand-in for the live-content container. The hook only reads
// `scrollHeight`, so tests drive the release condition by mutating this.
function makeMenuRef(): {
ref: RefObject<HTMLElement | null>;
setScrollHeight: (h: number) => void;
} {
const el = { scrollHeight: 0 };
return {
ref: { current: el } as unknown as RefObject<HTMLElement | null>,
setScrollHeight: (h: number) => {
el.scrollHeight = h;
},
};
}
const H = 1000;
describe("useSwapHeightReservation", () => {
beforeEach(() => {
rafQueue = [];
nextRafId = 1;
realRaf = globalThis.requestAnimationFrame;
realCancel = globalThis.cancelAnimationFrame;
globalThis.requestAnimationFrame = ((cb: FrameRequestCallback) => {
const id = nextRafId++;
rafQueue.push({ id, cb });
return id;
}) as typeof globalThis.requestAnimationFrame;
globalThis.cancelAnimationFrame = ((id: number) => {
rafQueue = rafQueue.filter((e) => e.id !== id);
}) as typeof globalThis.cancelAnimationFrame;
});
afterEach(() => {
globalThis.requestAnimationFrame = realRaf;
globalThis.cancelAnimationFrame = realCancel;
vi.useRealTimers();
vi.restoreAllMocks();
});
// (a) reserve-on-swap: the captured height becomes `reservedHeight`, the value
// that drives the swap wrapper's minHeight. Captured while static is still up,
// then the swap flips showStatic; before any release frame runs the reservation
// is held at exactly H.
it("(a) holds the captured height as reservedHeight after the swap (drives minHeight)", () => {
const { ref, setScrollHeight } = makeMenuRef();
setScrollHeight(0); // live content not laid out yet -> release cannot fire.
const { result, rerender } = renderHook(
({ showStatic }) => useSwapHeightReservation(showStatic, ref),
{ initialProps: { showStatic: true } },
);
// Capture happens synchronously at the swap point (static still shown).
act(() => {
result.current.captureReservation(H);
});
// The swap flips to the live branch.
rerender({ showStatic: false });
expect(result.current.reservedHeight).toBe(H);
});
// (b) release when the live content is tall enough. Guard is `>=`: with
// liveHeight === H the reservation releases. This FAILS if the guard direction
// were `<` (liveHeight === H is not `< H`, so it would never release).
it("(b) releases once live content reaches the reserved height", () => {
const { ref, setScrollHeight } = makeMenuRef();
setScrollHeight(0);
const { result, rerender } = renderHook(
({ showStatic }) => useSwapHeightReservation(showStatic, ref),
{ initialProps: { showStatic: true } },
);
act(() => {
result.current.captureReservation(H);
});
rerender({ showStatic: false });
expect(result.current.reservedHeight).toBe(H); // still reserved (short live doc)
// Live editor finishes laying out to the reserved height.
setScrollHeight(H);
act(() => {
tickRaf();
});
expect(result.current.reservedHeight).toBeNull();
});
// (c) cap escape: the live content never reaches the reserved height, so the
// height match never fires; the reservation must still release at the 4000ms
// cap (no stuck reservation / dead space). This FAILS if there were no cap: the
// loop would poll forever while scrollHeight stays below H.
it("(c) releases at the 4000ms cap when live content stays too short", () => {
// Only fake Date so `Date.now()` (the cap clock) is controllable; leave our
// manual rAF queue in place (default fake timers would replace it).
vi.useFakeTimers({ toFake: ["Date"] });
vi.setSystemTime(0);
const { ref, setScrollHeight } = makeMenuRef();
setScrollHeight(H - 100); // always shorter than reserved -> height match never fires.
const { result, rerender } = renderHook(
({ showStatic }) => useSwapHeightReservation(showStatic, ref),
{ initialProps: { showStatic: true } },
);
act(() => {
result.current.captureReservation(H);
});
rerender({ showStatic: false });
// A few frames pass but time has not reached the cap: still reserved.
act(() => {
tickRaf();
});
act(() => {
tickRaf();
});
expect(result.current.reservedHeight).toBe(H);
// Advance past the cap; the next frame releases even though the live content
// is still shorter than the reservation.
vi.setSystemTime(4001);
act(() => {
tickRaf();
});
expect(result.current.reservedHeight).toBeNull();
});
// (d) non-swap: without a capture (and while static is shown) there is no
// reservation and the release loop never arms, so no rAF is scheduled.
it("(d) reserves nothing and arms no loop when the swap never happens", () => {
const { ref } = makeMenuRef();
const { result } = renderHook(() =>
useSwapHeightReservation(true, ref),
);
expect(result.current.reservedHeight).toBeNull();
expect(rafQueue.length).toBe(0); // release loop never armed
act(() => {
tickRaf();
});
expect(result.current.reservedHeight).toBeNull();
});
});
@@ -0,0 +1,79 @@
import { RefObject, useCallback, useEffect, useState } from "react";
// Last-resort release deadline. The primary release is the live-content height
// match below; this cap only exists so a slow/short live doc can never pin the
// reservation forever. It is generous (well past when the live content normally
// reaches the reserved height — it renders the SAME content as the static copy)
// so a slow load doesn't release mid-render and reintroduce the collapse.
const RELEASE_CAP_MS = 4000;
/**
* Reserves the document height across the static -> live editor swap.
*
* The live editor lays out its content over a few frames, so replacing the
* (full-height) static copy with it momentarily shrinks the document; the
* browser then clamps window scroll to the top, which yanked the reader off
* their restored reading position (and threw their scroll to 0 if they were
* scrolling at that moment). Pinning a min-height on the swap wrapper keeps the
* document tall through the swap so the scroll position simply survives (#266).
* `reservedHeight === null` means no reservation is active.
*
* The capture is intentionally a CALLBACK the page editor invokes, NOT something
* this hook derives by watching `showStatic`. The height MUST be read
* synchronously while the static content is still mounted (full natural height),
* right before the flip to the live branch. By the time any post-transition
* effect here could run, `showStatic` is already false and the wrapper shows the
* live/collapsed content, so `offsetHeight` would be wrong. So page-editor calls
* `captureReservation(wrapper.offsetHeight)` inside its collab-sync effect,
* before `setShowStatic(false)`, preserving that exact timing.
*
* @param showStatic whether the static (cached) content is still shown.
* @param menuContainerRef the live-branch content container. It is a descendant
* of the swap wrapper inside the live branch, so its `scrollHeight` is the live
* content height (not inflated by the ancestor min-height reservation).
*/
export function useSwapHeightReservation(
showStatic: boolean,
menuContainerRef: RefObject<HTMLElement | null>,
): {
reservedHeight: number | null;
captureReservation: (height: number | null) => void;
} {
const [reservedHeight, setReservedHeight] = useState<number | null>(null);
// Capture the current (static, full-height) content height BEFORE the swap so
// the wrapper can reserve it while the live editor lays out — otherwise the
// transient shrink clamps window scroll to the top. The caller reads
// `offsetHeight` synchronously at the swap point and hands it here.
const captureReservation = useCallback(
(height: number | null) => setReservedHeight(height),
[],
);
// Release the reserved height once the live editor's content has laid out to
// at least the reserved height (so removing the reservation cannot collapse
// the document). The primary release is that height match; the cap is only a
// last-resort so we never pin forever. A shorter-than-reserved live doc (rare:
// stale/longer cache) releases at the cap, leaving only harmless bottom dead
// space until then.
useEffect(() => {
if (showStatic || reservedHeight == null) return;
let raf = 0;
const startedAt = Date.now();
const check = () => {
const liveHeight = menuContainerRef.current?.scrollHeight ?? 0;
if (
liveHeight >= reservedHeight ||
Date.now() - startedAt > RELEASE_CAP_MS
) {
setReservedHeight(null);
return;
}
raf = requestAnimationFrame(check);
};
raf = requestAnimationFrame(check);
return () => cancelAnimationFrame(raf);
}, [showStatic, reservedHeight, menuContainerRef]);
return { reservedHeight, captureReservation };
}
@@ -27,11 +27,12 @@ import {
collabExtensions,
mainExtensions,
} from "@/features/editor/extensions/extensions";
import { useAtom, useAtomValue } from "jotai";
import { useAtom, useAtomValue, useSetAtom } from "jotai";
import useCollaborationUrl from "@/features/editor/hooks/use-collaboration-url";
import { currentUserAtom } from "@/features/user/atoms/current-user-atom";
import {
currentPageEditModeAtom,
dictationAvailabilityAtom,
pageEditorAtom,
yjsConnectionStatusAtom,
} from "@/features/editor/atoms/editor-atoms";
@@ -79,6 +80,7 @@ import { jwtDecode } from "jwt-decode";
import { searchSpotlight } from "@/features/search/constants.ts";
import { useEditorScroll } from "./hooks/use-editor-scroll";
import { useScrollRestoreOnSwap } from "./hooks/use-scroll-position";
import { useSwapHeightReservation } from "./hooks/use-swap-height-reservation";
import { EditorLinkMenu } from "@/features/editor/components/link/link-menu";
import ColumnsMenu from "@/features/editor/components/columns/columns-menu.tsx";
import { TransclusionLookupProvider } from "@/features/editor/components/transclusion/transclusion-lookup-context";
@@ -87,6 +89,7 @@ import { PageEmbedAncestryProvider } from "@/features/editor/components/page-emb
import PageEmbedPicker from "@/features/editor/components/page-embed/page-embed-picker";
import { useTranslation } from "react-i18next";
import {
computeDictationAvailability,
isBodyEditable,
isCollabSynced,
} from "@/features/editor/editor-sync-state";
@@ -138,6 +141,7 @@ export default function PageEditor({
const { pageSlug } = useParams();
const slugId = extractPageSlugId(pageSlug);
const currentPageEditMode = useAtomValue(currentPageEditModeAtom);
const setDictationAvailability = useSetAtom(dictationAvailabilityAtom);
const canScroll = useCallback(
() => Boolean(isComponentMounted.current && editorRef.current),
[isComponentMounted],
@@ -449,6 +453,22 @@ export default function PageEditor({
const hasConnectedOnceRef = useRef(false);
const [showStatic, setShowStatic] = useState(true);
// Reserved height held across the static -> live editor swap. The live editor
// lays out its content over a few frames, so replacing the (full-height) static
// copy with it momentarily shrinks the document; the browser then clamps window
// scroll to the top, which yanked the reader off their restored reading position
// (and threw their scroll to 0 if they were scrolling at that moment). Pinning a
// min-height on the swap wrapper keeps the document tall through the swap so the
// scroll position simply survives. `null` = no reservation active.
const swapWrapperRef = useRef<HTMLDivElement | null>(null);
// Reserve/release wiring lives in the hook so its capture trigger and release
// guard/cap are directly unit-testable. Capture stays synchronous at the swap
// point (see the collab-sync effect below); the hook only owns the release.
const { reservedHeight, captureReservation } = useSwapHeightReservation(
showStatic,
menuContainerRef,
);
useEffect(() => {
const timeout = setTimeout(() => {
if (yjsConnectionStatus === WebSocketStatus.Connecting || !isSynced) {
@@ -471,12 +491,36 @@ export default function PageEditor({
);
}, [currentPageEditMode, editor, editable, showStatic]);
// Publish whether dictation can start and, if not, the cause-specific reason
// the mic button surfaces. Recomputed on the same signals that drive body
// editability so the tooltip never lies about the current state.
useEffect(() => {
setDictationAvailability(
computeDictationAvailability({
editable,
inEditMode: currentPageEditMode === PageEditMode.Edit,
showStatic,
isDisconnected: yjsConnectionStatus === WebSocketStatus.Disconnected,
}),
);
}, [
editable,
currentPageEditMode,
showStatic,
yjsConnectionStatus,
setDictationAvailability,
]);
useEffect(() => {
if (
!hasConnectedOnceRef.current &&
isCollabSynced(yjsConnectionStatus, isSynced)
) {
hasConnectedOnceRef.current = true;
// Capture the current (static, full-height) content height BEFORE the swap
// so the wrapper can reserve it while the live editor lays out — otherwise
// the transient shrink clamps window scroll to the top.
captureReservation(swapWrapperRef.current?.offsetHeight ?? null);
setShowStatic(false);
}
}, [yjsConnectionStatus, isSynced]);
@@ -490,6 +534,12 @@ export default function PageEditor({
<TransclusionLookupProvider>
<PageEmbedLookupProvider>
<PageEmbedAncestryProvider hostPageId={pageId}>
<div
ref={swapWrapperRef}
style={
reservedHeight != null ? { minHeight: reservedHeight } : undefined
}
>
{showStatic ? (
<div style={{ position: "relative" }}>
{/* Surface the pre-sync read-only window so edits typed before the
@@ -577,6 +627,7 @@ export default function PageEditor({
></div>
</div>
)}
</div>
</PageEmbedAncestryProvider>
</PageEmbedLookupProvider>
</TransclusionLookupProvider>
@@ -1,4 +1,11 @@
import { Button, Group, Paper, Text } from "@mantine/core";
import {
ActionIcon,
Button,
Group,
Paper,
Text,
Tooltip,
} from "@mantine/core";
import { IconClockHour4, IconTrash } from "@tabler/icons-react";
import { useState } from "react";
import { Trans, useTranslation } from "react-i18next";
@@ -70,7 +77,14 @@ export function TemporaryNoteBanner({ slugId }: TemporaryNoteBannerProps) {
return (
<Paper radius="sm" mb="md" px="md" py="xs" bg="orange.0">
<Group justify="space-between" wrap="wrap" gap="sm">
<Group gap="xs" wrap="nowrap" style={{ flex: 1, minWidth: 0 }}>
{/* A non-zero flex-basis lets the outer wrap="wrap" drop the buttons to
their own row on narrow screens; flex:1 (basis 0) never wraps and
instead crushes the text into a one-word-per-line ladder. */}
<Group
gap="xs"
wrap="nowrap"
style={{ flex: "1 1 16rem", minWidth: 0 }}
>
<IconClockHour4
size={18}
stroke={1.5}
@@ -87,28 +101,58 @@ export function TemporaryNoteBanner({ slugId }: TemporaryNoteBannerProps) {
</Text>
</Group>
{canEdit && (
<Group gap="xs" wrap="nowrap">
<Button
size="xs"
variant="subtle"
color="red"
leftSection={<IconTrash size={16} />}
onClick={handleTrashNow}
loading={isDeleting}
>
{t("Move to trash")}
</Button>
<Button
size="xs"
variant="light"
color="orange"
leftSection={<IconClockHour4 size={16} />}
onClick={handleMakePermanent}
loading={toggleTemporary.isPending}
>
{t("Make permanent")}
</Button>
</Group>
<>
{/* Desktop: full labeled buttons. */}
<Group gap="xs" wrap="nowrap" visibleFrom="sm">
<Button
size="xs"
variant="subtle"
color="red"
leftSection={<IconTrash size={16} />}
onClick={handleTrashNow}
loading={isDeleting}
>
{t("Move to trash")}
</Button>
<Button
size="xs"
variant="light"
color="orange"
leftSection={<IconClockHour4 size={16} />}
onClick={handleMakePermanent}
loading={toggleTemporary.isPending}
>
{t("Make permanent")}
</Button>
</Group>
{/* Mobile: icon-only actions so they never overflow the narrow row. */}
<Group gap="xs" wrap="nowrap" hiddenFrom="sm">
<Tooltip label={t("Move to trash")} withArrow>
<ActionIcon
size="lg"
variant="subtle"
color="red"
onClick={handleTrashNow}
loading={isDeleting}
aria-label={t("Move to trash")}
>
<IconTrash size={18} />
</ActionIcon>
</Tooltip>
<Tooltip label={t("Make permanent")} withArrow>
<ActionIcon
size="lg"
variant="light"
color="orange"
onClick={handleMakePermanent}
loading={toggleTemporary.isPending}
aria-label={t("Make permanent")}
>
<IconClockHour4 size={18} />
</ActionIcon>
</Tooltip>
</Group>
</>
)}
</Group>
</Paper>
@@ -0,0 +1,60 @@
import { CollaborationGateway } from './collaboration.gateway';
import { CollaborationHandler } from './collaboration.handler';
/**
* Focused test for the COLLAB_DISABLE_REDIS fallback in handleYjsEvent.
*
* With Redis disabled the gateway builds no RedisSyncExtension, so the old code
* (`return this.redisSync?.handleEvent(...)`) returned undefined and every
* doc-mutation event silently no-opped. The fallback must instead invoke the
* handler locally against the single hocuspocus instance and return its verdict.
*
* We construct the gateway with stub extensions and an EnvironmentService whose
* isCollabDisableRedis() returns true (redisSync stays null, real hocuspocus is
* still built), then spy getHandlers so no real direct connection is opened.
*/
const stubExtension = {} as any;
function makeEnv() {
return {
getRedisUrl: () => 'redis://localhost:6379',
isCollabDisableRedis: () => true,
} as any;
}
describe('CollaborationGateway.handleYjsEvent (no-Redis fallback)', () => {
it('invokes the handler locally and returns its verdict instead of undefined', async () => {
const collabHandler = new CollaborationHandler();
const verdict = { applied: true, currentText: 'new' };
const fakeHandler = jest.fn().mockResolvedValue(verdict);
// Bypass the real direct-connection code path — assert dispatch only.
jest
.spyOn(collabHandler, 'getHandlers')
.mockReturnValue({ applyCommentSuggestion: fakeHandler } as any);
const gateway = new CollaborationGateway(
stubExtension,
stubExtension,
stubExtension,
makeEnv(),
collabHandler,
);
const payload = {
commentId: 'c1',
expectedText: 'old',
newText: 'new',
user: { id: 'u1' } as any,
};
const result = await gateway.handleYjsEvent(
'applyCommentSuggestion' as any,
'doc-1',
payload as any,
);
expect(fakeHandler).toHaveBeenCalledWith('doc-1', payload);
expect(result).toEqual(verdict);
expect(result).not.toBeUndefined();
});
});
@@ -147,8 +147,41 @@ export class CollaborationGateway {
eventName: TName,
documentName: string,
payload: Parameters<CollabEventHandlers[TName]>[1],
) {
return this.redisSync?.handleEvent(eventName, documentName, payload);
): ReturnType<CollabEventHandlers[TName]> {
if (this.redisSync) {
// Normal path: the Redis bridge routes the event to the instance that owns
// the document (local or another worker) and carries the handler's return
// value back to us (customEventComplete + replyId).
return this.redisSync.handleEvent(
eventName,
documentName,
payload,
) as ReturnType<CollabEventHandlers[TName]>;
}
// COLLAB_DISABLE_REDIS: there is no cross-process bridge, so a single local
// hocuspocus instance owns every document. Invoke the handler directly
// against it instead of returning undefined — otherwise doc-mutation events
// (setCommentMark / resolveCommentMark / applyCommentSuggestion) would
// silently no-op and, for suggestions, the caller could never learn the
// verdict. openDirectConnection loads the doc via the persistence extension
// if it is not already in memory.
if (this.hocuspocus) {
const handlers = this.collabEventsService.getHandlers(this.hocuspocus);
const handler = handlers[eventName] as (
documentName: string,
payload: unknown,
) => ReturnType<CollabEventHandlers[TName]>;
return handler(documentName, payload);
}
// Collaboration was never initialized (no live instance). Fail loudly rather
// than silently dropping a mutation; phase 4's caller maps this to a 5xx.
throw new Error(
`Cannot handle collaboration event "${String(
eventName,
)}": requires a live collaboration instance`,
);
}
openDirectConnection(documentName: string, context?: any) {
@@ -0,0 +1,132 @@
import * as Y from 'yjs';
import { CollaborationHandler } from './collaboration.handler';
import * as yjsUtil from './yjs.util';
import { User } from '@docmost/db/types/entity.types';
/**
* Unit tests for the `applyCommentSuggestion` collab handler (phase 3 of #315).
*
* The handler runs `replaceYjsMarkedText` inside the owning instance's Y
* transaction and returns the verdict to the caller. We exercise it against a
* REAL in-memory Y.Doc carrying a marked comment run, driven through a FAKE
* hocuspocus whose openDirectConnection().transact(fn) simply runs fn(doc) —
* mirroring how the real hocuspocus DirectConnection invokes the callback with
* the shared document (it does not forward the callback's return value, which is
* exactly why withYdocConnection captures it via a closure).
*/
// Build a Y.Doc with a single paragraph whose text carries a `comment` mark for
// the given commentId — the shape `replaceYjsMarkedText` walks in production.
function buildDocWithComment(text: string, commentId: string) {
const doc = new Y.Doc();
const fragment = doc.getXmlFragment('default');
const paragraph = new Y.XmlElement('paragraph');
const xmlText = new Y.XmlText();
xmlText.insert(0, text);
xmlText.format(0, text.length, { comment: { commentId, resolved: false } });
paragraph.insert(0, [xmlText]);
fragment.insert(0, [paragraph]);
return doc;
}
// Fake hocuspocus exposing only what withYdocConnection needs: a direct
// connection whose transact() runs the callback against `doc`.
function fakeHocuspocus(doc: Y.Doc) {
const connection = {
transact: jest.fn(async (fn: (d: Y.Doc) => void) => {
fn(doc);
}),
disconnect: jest.fn(async () => {}),
};
const hocuspocus = {
openDirectConnection: jest.fn(async () => connection),
} as any;
return { hocuspocus, connection };
}
const user = { id: 'u1' } as unknown as User;
describe('CollaborationHandler.applyCommentSuggestion', () => {
it('applies the replacement and returns the verdict when the marked text matches', async () => {
const doc = buildDocWithComment('Hello world', 'c1');
const { hocuspocus, connection } = fakeHocuspocus(doc);
const handler = new CollaborationHandler();
const handlers = handler.getHandlers(hocuspocus);
const result = await handlers.applyCommentSuggestion('doc-1', {
commentId: 'c1',
expectedText: 'Hello world',
newText: 'Goodbye world',
user,
});
expect(result).toEqual({ applied: true, currentText: 'Goodbye world' });
// The mutation ran inside the transaction and hit the real doc.
expect(connection.transact).toHaveBeenCalledTimes(1);
expect(connection.disconnect).toHaveBeenCalledTimes(1);
expect(doc.getXmlFragment('default').toString()).toContain(
'Goodbye world',
);
});
it('rejects (applied=false) and returns the current text when it changed', async () => {
const doc = buildDocWithComment('Hello world', 'c1');
const { hocuspocus } = fakeHocuspocus(doc);
const handler = new CollaborationHandler();
const handlers = handler.getHandlers(hocuspocus);
const result = await handlers.applyCommentSuggestion('doc-1', {
commentId: 'c1',
expectedText: 'Stale expected text',
newText: 'Goodbye world',
user,
});
expect(result).toEqual({ applied: false, currentText: 'Hello world' });
// Nothing was replaced.
expect(doc.getXmlFragment('default').toString()).toContain(
'Hello world',
);
});
it('forwards the exact args to replaceYjsMarkedText and returns its result', async () => {
const doc = buildDocWithComment('abc', 'c9');
const { hocuspocus } = fakeHocuspocus(doc);
const spy = jest
.spyOn(yjsUtil, 'replaceYjsMarkedText')
.mockReturnValue({ applied: true, currentText: 'xyz' });
const handler = new CollaborationHandler();
const handlers = handler.getHandlers(hocuspocus);
const result = await handlers.applyCommentSuggestion('doc-1', {
commentId: 'c9',
expectedText: 'abc',
newText: 'xyz',
user,
});
expect(spy).toHaveBeenCalledWith(
doc.getXmlFragment('default'),
'c9',
'abc',
'xyz',
);
expect(result).toEqual({ applied: true, currentText: 'xyz' });
spy.mockRestore();
});
it('withYdocConnection returns the callback result (transact does not forward it)', async () => {
const doc = new Y.Doc();
const { hocuspocus } = fakeHocuspocus(doc);
const handler = new CollaborationHandler();
const value = await handler.withYdocConnection(
hocuspocus,
'doc-1',
{},
() => 42,
);
expect(value).toBe(42);
});
});
@@ -5,7 +5,12 @@ import {
prosemirrorNodeToYElement,
tiptapExtensions,
} from './collaboration.util';
import { setYjsMark, updateYjsMarkAttribute, YjsSelection } from './yjs.util';
import {
replaceYjsMarkedText,
setYjsMark,
updateYjsMarkAttribute,
YjsSelection,
} from './yjs.util';
import * as Y from 'yjs';
import { User } from '@docmost/db/types/entity.types';
@@ -73,6 +78,35 @@ export class CollaborationHandler {
},
);
},
applyCommentSuggestion: async (
documentName: string,
payload: {
commentId: string;
expectedText: string;
newText: string;
user: User;
},
): Promise<{ applied: boolean; currentText: string | null }> => {
const { commentId, expectedText, newText, user } = payload;
// Run the check-and-replace inside the owning instance's Y transaction so
// the delete+insert are atomic. The verdict from replaceYjsMarkedText is
// returned to the API-server caller (cross-process via the Redis bridge,
// or locally when Redis is disabled — see collaboration.gateway.ts).
return this.withYdocConnection(
hocuspocus,
documentName,
{ user },
(doc) => {
const fragment = doc.getXmlFragment('default');
return replaceYjsMarkedText(
fragment,
commentId,
expectedText,
newText,
);
},
);
},
updatePageContent: async (
documentName: string,
payload: {
@@ -115,18 +149,28 @@ export class CollaborationHandler {
};
}
async withYdocConnection(
async withYdocConnection<T>(
hocuspocus: Hocuspocus,
documentName: string,
context: any = {},
fn: (doc: Document) => void,
): Promise<void> {
// `fn` MUST be synchronous: hocuspocus `connection.transact(fn)` runs fn
// synchronously and does NOT await it, so any mutations after an `await`
// inside fn would execute OUTSIDE the Yjs transaction and lose atomicity.
fn: (doc: Document) => T,
): Promise<T> {
const connection = await hocuspocus.openDirectConnection(
documentName,
context,
);
try {
await connection.transact(fn);
// hocuspocus `connection.transact(fn)` invokes fn(document) but does NOT
// forward fn's return value, so we capture it in a closure and return it
// after the transaction (and its storeDocument hooks) resolve.
let result: T;
await connection.transact((doc) => {
result = fn(doc);
});
return result!;
} finally {
await connection.disconnect();
}
@@ -10,6 +10,7 @@ import {
setYjsMark,
removeYjsMarkByAttribute,
updateYjsMarkAttribute,
replaceYjsMarkedText,
type YjsSelection,
} from './yjs.util';
@@ -276,3 +277,256 @@ describe('updateYjsMarkAttribute', () => {
expect(text.toDelta()).toEqual(before);
});
});
describe('replaceYjsMarkedText', () => {
// Build a single-paragraph XmlText from runs. Insert the whole string as
// plain text FIRST, then format only the marked ranges — otherwise text
// inserted right after a marked run inherits its comment mark (Yjs carries
// formatting from the left insertion boundary).
function buildRuns(
runs: Array<{
text: string;
comment?: { commentId: string; resolved: boolean };
}>,
): { fragment: Y.XmlFragment; text: Y.XmlText } {
const ydoc = new Y.Doc();
const fragment = ydoc.getXmlFragment('default');
const para = new Y.XmlElement('paragraph');
fragment.insert(0, [para]);
const text = new Y.XmlText();
para.insert(0, [text]);
text.insert(0, runs.map((r) => r.text).join(''));
let offset = 0;
for (const run of runs) {
if (run.comment) {
text.format(offset, run.text.length, { comment: run.comment });
}
offset += run.text.length;
}
return { fragment, text };
}
// Two paragraphs, each with its own XmlText, both marked with the same
// commentId — mirrors a suggestion anchor that got split across blocks.
function buildTwoParagraphs(
a: { text: string; comment?: { commentId: string; resolved: boolean } },
b: { text: string; comment?: { commentId: string; resolved: boolean } },
): { fragment: Y.XmlFragment; textA: Y.XmlText; textB: Y.XmlText } {
const ydoc = new Y.Doc();
const fragment = ydoc.getXmlFragment('default');
const build = (seg: typeof a) => {
const para = new Y.XmlElement('paragraph');
const text = new Y.XmlText();
para.insert(0, [text]);
text.insert(0, seg.text);
if (seg.comment) {
text.format(0, seg.text.length, { comment: seg.comment });
}
return { para, text };
};
const pa = build(a);
const pb = build(b);
fragment.insert(0, [pa.para, pb.para]);
return { fragment, textA: pa.text, textB: pb.text };
}
it('happy path: replaces marked text with newText and keeps the comment mark', () => {
const { fragment, text } = buildRuns([
{ text: 'Hello ' },
{ text: 'world', comment: { commentId: 'c1', resolved: false } },
{ text: '!' },
]);
const result = replaceYjsMarkedText(fragment, 'c1', 'world', 'planet');
expect(result).toEqual({ applied: true, currentText: 'planet' });
// New text carries the SAME comment mark; surrounding text is untouched.
expect(text.toDelta()).toEqual([
{ insert: 'Hello ' },
{
insert: 'planet',
attributes: { comment: { commentId: 'c1', resolved: false } },
},
{ insert: '!' },
]);
});
it('matches by commentId even when the mark is resolved', () => {
const { fragment, text } = buildWithComments([
{ text: 'foo', comment: { commentId: 'c9', resolved: true } },
]);
const result = replaceYjsMarkedText(fragment, 'c9', 'foo', 'bar');
expect(result).toEqual({ applied: true, currentText: 'bar' });
expect(text.toDelta()).toEqual([
{
insert: 'bar',
attributes: { comment: { commentId: 'c9', resolved: true } },
},
]);
});
it('changed text: marked text differs from expected → no-op, doc unchanged', () => {
const { fragment, text } = buildWithComments([
{ text: 'abc', comment: { commentId: 'c1', resolved: false } },
]);
const before = text.toDelta();
const result = replaceYjsMarkedText(fragment, 'c1', 'expected', 'new');
expect(result).toEqual({ applied: false, currentText: 'abc' });
expect(text.toDelta()).toEqual(before);
});
// F1 regression: the marked doc text is TYPOGRAPHIC (smart quotes / em-dash)
// and expectedText equals that raw typographic text — as it now does, because
// the MCP client stores the RAW anchored substring (getAnchoredText) rather
// than the agent's ASCII input. The strict `joinedText !== expectedText`
// compare must therefore MATCH and the suggestion apply (not a spurious 409).
it('typographic marked text applies when expectedText is the raw typographic text', () => {
const marked = '“hello”—world';
const { fragment, text } = buildRuns([
{ text: 'say ' },
{ text: marked, comment: { commentId: 'c1', resolved: false } },
{ text: '!' },
]);
const result = replaceYjsMarkedText(fragment, 'c1', marked, 'bye');
expect(result).toEqual({ applied: true, currentText: 'bye' });
expect(text.toDelta()).toEqual([
{ insert: 'say ' },
{
insert: 'bye',
attributes: { comment: { commentId: 'c1', resolved: false } },
},
{ insert: '!' },
]);
});
it('anchor deleted: no mark with that commentId → { applied: false, currentText: null }', () => {
const { fragment, text } = buildWithComments([
{ text: 'abc', comment: { commentId: 'c1', resolved: false } },
]);
const before = text.toDelta();
const result = replaceYjsMarkedText(fragment, 'missing', 'abc', 'new');
expect(result).toEqual({ applied: false, currentText: null });
expect(text.toDelta()).toEqual(before);
});
it('paragraph split: same commentId in two XmlText nodes → no-op, doc unchanged', () => {
const { fragment, textA, textB } = buildTwoParagraphs(
{ text: 'Hello ', comment: { commentId: 'c1', resolved: false } },
{ text: 'world', comment: { commentId: 'c1', resolved: false } },
);
const beforeA = textA.toDelta();
const beforeB = textB.toDelta();
const result = replaceYjsMarkedText(fragment, 'c1', 'Hello world', 'new');
expect(result).toEqual({ applied: false, currentText: 'Hello world' });
expect(textA.toDelta()).toEqual(beforeA);
expect(textB.toDelta()).toEqual(beforeB);
});
it('interleaved unmarked text: marked run not contiguous → no-op, doc unchanged', () => {
const { fragment, text } = buildRuns([
{ text: 'abc', comment: { commentId: 'c1', resolved: false } },
{ text: 'X' },
{ text: 'def', comment: { commentId: 'c1', resolved: false } },
]);
const before = text.toDelta();
const result = replaceYjsMarkedText(fragment, 'c1', 'abcdef', 'new');
// Joined marked text ("abcdef") is returned, but the run is not contiguous.
expect(result).toEqual({ applied: false, currentText: 'abcdef' });
expect(text.toDelta()).toEqual(before);
});
it('preserves surrounding text and merges adjacent marked segments on apply', () => {
// The marked run itself is split into two adjacent delta segments; they must
// be treated as one contiguous run and replaced as a whole.
const { fragment, text } = buildRuns([
{ text: 'pre ' },
{ text: 'ab', comment: { commentId: 'c1', resolved: false } },
{ text: 'cd', comment: { commentId: 'c1', resolved: false } },
{ text: ' post' },
]);
const result = replaceYjsMarkedText(fragment, 'c1', 'abcd', 'Z');
expect(result).toEqual({ applied: true, currentText: 'Z' });
expect(text.toDelta()).toEqual([
{ insert: 'pre ' },
{
insert: 'Z',
attributes: { comment: { commentId: 'c1', resolved: false } },
},
{ insert: ' post' },
]);
});
it('embed before the marked run: offset accounts for the embed unit → replaces the right text, embed intact', () => {
// "AB", then a Yjs embed (1 index unit), then marked "world". Before the
// fix the embed was skipped WITHOUT advancing offset, so the computed start
// for "world" was too low by 1 → delete/insert would have hit the embed/text
// instead of "world", mangling the embed. With the fix offset is correct.
const ydoc = new Y.Doc();
const fragment = ydoc.getXmlFragment('default');
const para = new Y.XmlElement('paragraph');
fragment.insert(0, [para]);
const text = new Y.XmlText();
para.insert(0, [text]);
text.insert(0, 'AB');
text.insertEmbed(2, { image: { src: 'x' } });
text.insert(3, 'world');
text.format(3, 'world'.length, {
comment: { commentId: 'c1', resolved: false },
});
const result = replaceYjsMarkedText(fragment, 'c1', 'world', 'planet');
expect(result).toEqual({ applied: true, currentText: 'planet' });
// "AB" untouched, embed still present and intact, "world" → "planet"
// carrying the SAME comment mark.
expect(text.toDelta()).toEqual([
{ insert: 'AB' },
{ insert: { image: { src: 'x' } } },
{
insert: 'planet',
attributes: { comment: { commentId: 'c1', resolved: false } },
},
]);
});
it('embed inside the marked run: embed splits the run → non-contiguous → no-op, doc unchanged', () => {
// marked "abc", an embed, marked "def" — same commentId. The embed occupies
// one index unit between the two marked segments, so they are not contiguous
// → the guard rejects it and nothing is mutated (embed intact).
const ydoc = new Y.Doc();
const fragment = ydoc.getXmlFragment('default');
const para = new Y.XmlElement('paragraph');
fragment.insert(0, [para]);
const text = new Y.XmlText();
para.insert(0, [text]);
text.insert(0, 'abc');
text.insertEmbed(3, { image: { src: 'y' } });
text.insert(4, 'def');
text.format(0, 'abc'.length, {
comment: { commentId: 'c1', resolved: false },
});
text.format(4, 'def'.length, {
comment: { commentId: 'c1', resolved: false },
});
const before = text.toDelta();
const result = replaceYjsMarkedText(fragment, 'c1', 'abcdef', 'new');
expect(result).toEqual({ applied: false, currentText: 'abcdef' });
expect(text.toDelta()).toEqual(before);
});
});
+131
View File
@@ -133,6 +133,137 @@ export function removeYjsMarkByAttribute(
}
}
/**
* A single marked delta segment collected during the walk, together with the
* Y.XmlText node that owns it, the segment's start offset within that node,
* and the full `comment` mark attributes object (needed to re-attach the mark
* to the replacement text).
*/
type MarkedSegment = {
node: Y.XmlText;
offset: number;
length: number;
text: string;
markAttrs: Record<string, any>;
};
/**
* Atomically check-and-replace the text currently under a comment mark.
*
* Walks the fragment collecting every delta segment whose `comment` mark has the
* given commentId. The replacement is applied ONLY if the marked run is intact:
* it lives in a single Y.XmlText node, is contiguous (no unmarked text spliced
* into the middle), and its joined text still equals `expectedText`. On success
* the run is deleted and `newText` is inserted at the same offset carrying the
* SAME comment attributes, so the comment thread stays anchored to the new text.
*
* This mutates the passed fragment/text directly and does NOT open its own Y
* transaction — the caller is expected to wrap the call in connection.transact()
* so the delete+insert are atomic (mirrors updateYjsMarkAttribute's direct
* mutation style).
*
* @returns `{ applied: true, currentText: newText }` on replacement, otherwise
* `{ applied: false, currentText }` where currentText is the text currently
* under the mark (or null when the mark/anchor no longer exists).
*/
export function replaceYjsMarkedText(
fragment: Y.XmlFragment,
commentId: string,
expectedText: string,
newText: string,
): { applied: boolean; currentText: string | null } {
// 1. Collect every marked segment in document order.
const segments: MarkedSegment[] = [];
const processItem = (item: any) => {
if (item instanceof Y.XmlText) {
const deltas = item.toDelta();
let offset = 0;
for (const delta of deltas) {
const insert = delta.insert;
// Non-string inserts (embeds) carry no text length we can splice on.
if (typeof insert !== 'string') {
// A Yjs embed occupies one unit in the index space used by delete/
// insert/format — advance offset so a marked segment after an embed
// gets the right position (and an embed inside a marked run creates a
// gap → the contiguity guard rejects it as a changed anchor).
offset += 1;
continue;
}
const length = insert.length;
const attributes = delta.attributes ?? {};
const markAttr = attributes['comment'];
if (markAttr && markAttr.commentId === commentId) {
segments.push({
node: item,
offset,
length,
text: insert,
markAttrs: markAttr,
});
}
offset += length;
}
} else if (item instanceof Y.XmlElement) {
for (let i = 0; i < item.length; i++) {
processItem(item.get(i));
}
}
};
for (let i = 0; i < fragment.length; i++) {
processItem(fragment.get(i));
}
const joinedText = segments.map((s) => s.text).join('');
// 2a. No segments — the mark/anchor was deleted.
if (segments.length === 0) {
return { applied: false, currentText: null };
}
// 2b. Segments span more than one Y.XmlText node (paragraph split by Enter,
// or the mark bled across blocks) — treat as changed.
const node = segments[0].node;
const sameNode = segments.every((s) => s.node === node);
if (!sameNode) {
return { applied: false, currentText: joinedText };
}
// 2c. Non-contiguous within the single node: unmarked text is spliced between
// the first and last marked segment. Since collected segments are in document
// order, contiguity holds iff each segment starts where the previous ended.
let contiguous = true;
for (let i = 1; i < segments.length; i++) {
if (segments[i].offset !== segments[i - 1].offset + segments[i - 1].length) {
contiguous = false;
break;
}
}
if (!contiguous) {
return { applied: false, currentText: joinedText };
}
// 2d. The text under the mark changed.
if (joinedText !== expectedText) {
return { applied: false, currentText: joinedText };
}
// 3. All guards passed: delete the marked run and re-insert newText with the
// same comment attributes at the same offset. Atomic within the caller's
// transaction.
const start = segments[0].offset;
const len = segments.reduce((sum, s) => sum + s.length, 0);
const markAttrs = segments[0].markAttrs;
node.delete(start, len);
node.insert(start, newText, { comment: markAttrs });
return { applied: true, currentText: newText };
}
/**
* Updates a mark's attributes for all text that has the specified attribute value.
* Useful for resolving/unresolving comments by commentId.
@@ -51,6 +51,7 @@ export const AuditEvent = {
COMMENT_UPDATED: 'comment.updated',
COMMENT_RESOLVED: 'comment.resolved',
COMMENT_REOPENED: 'comment.reopened',
COMMENT_SUGGESTION_APPLIED: 'comment.suggestion_applied',
// Page
PAGE_CREATED: 'page.created',
@@ -2,43 +2,91 @@ import { AiChatController } from './ai-chat.controller';
import type { User, Workspace } from '@docmost/db/types/entity.types';
/**
* Wiring spec for the #191 `POST /ai-chat/bound-chat` endpoint. It must forward
* the requesting user + workspace + pageId to findLatestByPage and return the
* matched chat's id, or `{ chatId: null }` when there is none. The repo already
* scopes to the caller's OWN chats, so a foreign pageId simply yields no match
* (null) — no extra page-access check is needed. Exercised with hand-rolled
* mocks, no Nest graph and no DB.
* Wiring spec for the #191 `POST /ai-chat/bound-chat` endpoint, hardened for
* #312. `dto.pageId` carries either a page slugId (10-char nanoid, off a slug
* URL) or a page uuid, so the controller must FIRST resolve it to a real page
* uuid via PageRepo.findById (which accepts both) — passing the raw slugId into
* the uuid ai_chats.page_id column caused a Postgres 22P02 500. Only then is the
* caller's most-recent OWN chat for that page looked up (by the resolved uuid),
* and a page in a different workspace (or an unknown id) yields { chatId: null }
* without ever touching the chat lookup. Exercised with hand-rolled mocks, no
* Nest graph and no DB.
*/
describe('AiChatController.boundChat', () => {
const user = { id: 'u1' } as User;
const workspace = { id: 'ws1' } as Workspace;
function makeController(chat: unknown) {
function makeController(opts: { page: unknown; chat?: unknown }) {
const aiChatRepo = {
findLatestByPage: jest.fn().mockResolvedValue(chat),
findLatestByPage: jest.fn().mockResolvedValue(opts.chat),
};
const pageRepo = {
findById: jest.fn().mockResolvedValue(opts.page),
};
const controller = new AiChatController(
{} as never,
aiChatRepo as never,
{} as never,
{} as never,
pageRepo as never,
);
return { controller, aiChatRepo };
return { controller, aiChatRepo, pageRepo };
}
it('returns the owned chat id and scopes the lookup to user + workspace + page', async () => {
const { controller, aiChatRepo } = makeController({
id: 'c1',
creatorId: 'u1',
it('resolves a slugId to the page uuid and returns the owned chat id', async () => {
const { controller, aiChatRepo, pageRepo } = makeController({
// findById accepts a slugId and returns the page with its real uuid.
page: { id: 'page-uuid-1', workspaceId: 'ws1' },
chat: { id: 'c1', creatorId: 'u1' },
});
const res = await controller.boundChat({ pageId: 'p1' }, user, workspace);
expect(aiChatRepo.findLatestByPage).toHaveBeenCalledWith('u1', 'ws1', 'p1');
// The client sends a 10-char nanoid slugId, NOT a uuid.
const res = await controller.boundChat(
{ pageId: 'i82qXsivsx' },
user,
workspace,
);
expect(pageRepo.findById).toHaveBeenCalledWith('i82qXsivsx');
// findLatestByPage must receive the RESOLVED uuid, never the raw slugId.
expect(aiChatRepo.findLatestByPage).toHaveBeenCalledWith(
'u1',
'ws1',
'page-uuid-1',
);
expect(res).toEqual({ chatId: 'c1' });
});
it('returns { chatId: null } for a page with no owned chat (incl. foreign pageId)', async () => {
const { controller } = makeController(undefined);
const res = await controller.boundChat({ pageId: 'foreign' }, user, workspace);
it('returns { chatId: null } for a page in a DIFFERENT workspace without a chat lookup', async () => {
const { controller, aiChatRepo, pageRepo } = makeController({
page: { id: 'page-uuid-2', workspaceId: 'other-ws' },
});
const res = await controller.boundChat(
{ pageId: 'foreignSlug' },
user,
workspace,
);
expect(pageRepo.findById).toHaveBeenCalledWith('foreignSlug');
// No cross-workspace leak: the chat lookup must never run.
expect(aiChatRepo.findLatestByPage).not.toHaveBeenCalled();
expect(res).toEqual({ chatId: null });
});
it('returns { chatId: null } for an unknown id without throwing or looking up a chat', async () => {
const { controller, aiChatRepo } = makeController({ page: undefined });
const res = await controller.boundChat(
{ pageId: 'does-not-exist' },
user,
workspace,
);
expect(aiChatRepo.findLatestByPage).not.toHaveBeenCalled();
expect(res).toEqual({ chatId: null });
});
it('returns { chatId: null } when the resolved page has no owned chat', async () => {
const { controller } = makeController({
page: { id: 'page-uuid-3', workspaceId: 'ws1' },
chat: undefined,
});
const res = await controller.boundChat({ pageId: 'p3' }, user, workspace);
expect(res).toEqual({ chatId: null });
});
});
@@ -56,6 +56,7 @@ describe('AiChatController.export', () => {
aiChatRepo as never,
aiChatMessageRepo as never,
{} as never,
{} as never,
);
return { controller, aiChatRepo, aiChatMessageRepo };
}
@@ -24,6 +24,7 @@ import { AiChat, User, Workspace } from '@docmost/db/types/entity.types';
import { PaginationOptions } from '@docmost/db/pagination/pagination-options';
import { AiChatRepo } from '@docmost/db/repos/ai-chat/ai-chat.repo';
import { AiChatMessageRepo } from '@docmost/db/repos/ai-chat/ai-chat-message.repo';
import { PageRepo } from '@docmost/db/repos/page/page.repo';
import { UserThrottlerGuard } from '../../integrations/throttle/user-throttler.guard';
import { AI_CHAT_THROTTLER } from '../../integrations/throttle/throttler-names';
import { FileInterceptor } from '../../common/interceptors/file.interceptor';
@@ -55,6 +56,7 @@ export class AiChatController {
private readonly aiChatRepo: AiChatRepo,
private readonly aiChatMessageRepo: AiChatMessageRepo,
private readonly aiTranscription: AiTranscriptionService,
private readonly pageRepo: PageRepo,
) {}
/** List the requesting user's chats in this workspace (paginated). */
@@ -71,9 +73,15 @@ export class AiChatController {
/**
* Resolve the chat bound to a document for the requesting user: the most-recent
* non-deleted chat created on that page (ai_chats.page_id). Returns
* { chatId: null } when the page has no owned chat (-> a fresh chat). No page
* access check needed: only the caller's OWN chats are matched, so a foreign
* pageId reveals nothing.
* { chatId: null } when the page has no owned chat (-> a fresh chat).
*
* `dto.pageId` carries EITHER a page slugId (10-char nanoid, sent by the client
* off a slug URL) OR a page uuid, so it must be resolved to a real page uuid
* before it touches the uuid ai_chats.page_id column — passing a slugId straight
* through triggered a Postgres 22P02 "invalid input syntax for type uuid" 500
* (#312). PageRepo.findById accepts both forms. The workspace guard rejects an
* unknown or cross-workspace page (-> { chatId: null }) so a foreign id cannot
* probe another workspace's chats. Only the caller's OWN chats are then matched.
*/
@HttpCode(HttpStatus.OK)
@Post('bound-chat')
@@ -82,10 +90,14 @@ export class AiChatController {
@AuthUser() user: User,
@AuthWorkspace() workspace: Workspace,
): Promise<{ chatId: string | null }> {
const page = await this.pageRepo.findById(dto.pageId); // accepts slugId OR uuid
if (!page || page.workspaceId !== workspace.id) {
return { chatId: null }; // unknown or foreign-workspace page — no binding, no leak
}
const chat = await this.aiChatRepo.findLatestByPage(
user.id,
workspace.id,
dto.pageId,
page.id, // the real uuid, never the incoming slugId
);
return { chatId: chat?.id ?? null };
}
@@ -60,6 +60,7 @@ describe('AiChatController.generatePageTitle', () => {
{} as never,
{} as never,
{} as never,
{} as never,
);
return { controller, aiChatService };
}
@@ -518,6 +518,20 @@ describe('AiChatToolsService model-friendly input validation (#190)', () => {
});
});
it('createComment: accepts an optional suggestedText alongside a selection', async () => {
const tools = await buildTools();
const result = await inputSchemaOf(tools.createComment).validate({
pageId: '019efe44-0000-0000-0000-000000000000',
content: 'A remark',
selection: 'титановый проводник',
suggestedText: 'медный проводник',
});
expect(result.success).toBe(true);
expect(result.value).toMatchObject({
suggestedText: 'медный проводник',
});
});
it('sharedTool-built tools (getOutline) also get the friendly message on a dropped pageId', async () => {
const tools = await buildTools();
const result = await inputSchemaOf(tools.getOutline).validate({});
@@ -173,6 +173,11 @@ export class AiChatToolsService {
});
return {
// INTENTIONAL per-transport divergence (not in the shared registry): this
// in-app search runs a semantic + keyword hybrid (RRF) with in-process
// access control and a tuned schema (limit 1-20); the standalone MCP
// `search` is a plain REST full-text search (limit up to 100). Different
// behaviour AND schema, so kept per-layer.
searchPages: tool({
description:
'Search the wiki for pages relevant to a query. Combines exact ' +
@@ -432,6 +437,10 @@ export class AiChatToolsService {
},
}),
// INTENTIONAL per-transport divergence (not shared): the description is
// tuned for the in-app agent (e.g. "retry with a corrected EXACT selection"
// and "Reversible via the comment UI"); the standalone MCP `create_comment`
// keeps its own wording. Kept per-layer.
createComment: tool({
description:
'Add an INLINE comment to a page, or reply to an existing top-level ' +
@@ -441,8 +450,10 @@ export class AiChatToolsService {
"new top-level comment REQUIRES a `selection`. Replies inherit the " +
"parent's anchor and take no selection. If the call fails with a " +
'"selection not found" error, retry with a corrected EXACT selection ' +
'copied verbatim from a single paragraph/block. Reversible via the ' +
'comment UI.',
'copied verbatim from a single paragraph/block. You may also attach a ' +
'`suggestedText` proposing a replacement for the `selection` (a human ' +
'applies it from the UI); when set, the `selection` must occur exactly ' +
'once in the page. Reversible via the comment UI.',
inputSchema: modelFriendlyInput({
pageId: z.string().describe('The id of the page to comment on.'),
content: z.string().describe('The comment body as Markdown.'),
@@ -464,24 +475,57 @@ export class AiChatToolsService {
'Optional id of a TOP-LEVEL comment to reply to (one level ' +
'of replies only).',
),
suggestedText: z
.string()
.min(1)
.max(2000)
.optional()
.describe(
'Optional proposed replacement (PLAIN TEXT) for the `selection`, ' +
'applied by a human via the UI (never auto-applied). REQUIRES a ' +
'`selection`; NOT allowed on a reply. When set, the `selection` ' +
'must be UNIQUE in the page — expand it with surrounding context ' +
'(still <=250 chars) if it occurs more than once, or the call is ' +
'refused.',
),
}),
execute: async ({ pageId, content, selection, parentCommentId }) => {
// createComment(pageId, content, type, selection?, parentCommentId?).
// Top-level comments are inline and must carry a selection to anchor
// on; replies inherit the parent's anchor (no selection). Throwing
// here surfaces a tool error to the model (Vercel `ai` SDK) so the
// agent retries with a better selection — do not catch/suppress it.
execute: async ({
pageId,
content,
selection,
parentCommentId,
suggestedText,
}) => {
// createComment(pageId, content, type, selection?, parentCommentId?,
// suggestedText?). Top-level comments are inline and must carry a
// selection to anchor on; replies inherit the parent's anchor (no
// selection). Throwing here surfaces a tool error to the model (Vercel
// `ai` SDK) so the agent retries with a better selection — do not
// catch/suppress it.
if (!parentCommentId && (!selection || !selection.trim())) {
throw new Error(
"createComment requires a 'selection' (exact text to anchor on) for a new top-level comment.",
);
}
if (suggestedText !== undefined) {
if (parentCommentId) {
throw new Error(
"createComment: 'suggestedText' cannot be attached to a reply; it applies only to a top-level inline comment.",
);
}
if (!selection || !selection.trim()) {
throw new Error(
"createComment: 'suggestedText' requires a 'selection' to anchor and rewrite.",
);
}
}
const result = await client.createComment(
pageId,
content,
'inline',
selection,
parentCommentId,
suggestedText,
);
const data = (result?.data ?? {}) as { id?: string };
return { commentId: data.id, pageId };
@@ -519,6 +563,10 @@ export class AiChatToolsService {
async () => await client.getSpaces(),
),
// INTENTIONAL per-transport divergence (not shared): keeps the `tree:true`
// hierarchy mode but is worded for the in-app agent; the standalone MCP
// `list_pages` carries its own wording. Kept per-layer so each side tunes
// its own guidance.
listPages: tool({
description:
'List the most recent pages, optionally scoped to a single space. ' +
@@ -692,85 +740,25 @@ export class AiChatToolsService {
async ({ pageId }) => await client.stashPage(pageId),
),
patchNode: tool({
description:
'Replace a single content block (by id) with a new ProseMirror ' +
'node; the replacement keeps the same nodeId. Example node: a ' +
'paragraph {"type":"paragraph","content":[{"type":"text","text":"Hello"}]} ' +
'or a heading {"type":"heading","attrs":{"level":2},"content":' +
'[{"type":"text","text":"Title"}]}. Bold is a mark: ' +
'{"type":"text","text":"x","marks":[{"type":"bold"}]}. The node arg ' +
'may be a JSON object or a JSON string (both accepted). Reversible: ' +
'the previous version is kept in page history.',
inputSchema: modelFriendlyInput({
pageId: z.string().describe('The id of the page.'),
nodeId: z
.string()
.describe('The block id to replace (from getOutline/getPageJson).'),
node: z
.any()
.describe(
'The replacement ProseMirror node, e.g. ' +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]}. ' +
'JSON object or JSON string both accepted.',
),
}),
execute: async ({ pageId, nodeId, node }) => {
// Parity with the standalone MCP server (index.ts patch_node): the
// model sometimes serializes the node as a JSON string. Parse it
// before the client's typeof-object guard rejects it.
// Schema + description from the shared registry (identical across both
// transports). The execute body keeps its OWN parseNodeArg normalization:
// the model sometimes serializes the node as a JSON string, and we parse it
// before the client's typeof-object guard rejects it (parity with the
// standalone MCP server, index.ts patch_node).
patchNode: sharedTool(
sharedToolSpecs.patchNode,
async ({ pageId, nodeId, node }) => {
const parsedNode = parseNodeArg(node);
return await client.patchNode(pageId, nodeId, parsedNode);
},
}),
),
insertNode: tool({
description:
'Insert a ProseMirror node relative to an anchor, or append it at ' +
'the top level. For before/after you MUST provide EXACTLY ONE of ' +
'anchorNodeId or anchorText. Example node: a paragraph ' +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]} or a ' +
'heading {"type":"heading","attrs":{"level":2},"content":' +
'[{"type":"text","text":"Title"}]}. Bold is a mark: ' +
'{"type":"text","text":"x","marks":[{"type":"bold"}]}. The node arg ' +
'may be a JSON object or a JSON string (both accepted). Reversible ' +
'via page history.',
inputSchema: modelFriendlyInput({
pageId: z.string().describe('The id of the page.'),
node: z
.any()
.describe(
'The ProseMirror node to insert, e.g. ' +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]}. ' +
'JSON object or JSON string both accepted.',
),
position: z
.enum(['before', 'after', 'append'])
.describe('Where to insert relative to the anchor.'),
anchorNodeId: z
.string()
.optional()
.describe('Anchor block id (for before/after).'),
anchorText: z
.string()
.optional()
.describe(
'Anchor text fragment (for before/after), matched against the ' +
"block's literal rendered plain text (no markdown). " +
'Markdown/emoji are tolerated as a fallback; prefer plain text ' +
'or anchorNodeId.',
),
}),
execute: async ({
pageId,
node,
position,
anchorNodeId,
anchorText,
}) => {
// Parity with the standalone MCP server (index.ts insert_node): the
// model sometimes serializes the node as a JSON string. Parse it
// before the client's typeof-object guard rejects it.
// Shared registry schema + description; execute retains parseNodeArg on the
// incoming node (parity with the standalone MCP server, index.ts
// insert_node).
insertNode: sharedTool(
sharedToolSpecs.insertNode,
async ({ pageId, node, position, anchorNodeId, anchorText }) => {
const parsedNode = parseNodeArg(node);
return await client.insertNode(pageId, parsedNode, {
position,
@@ -778,7 +766,7 @@ export class AiChatToolsService {
anchorText,
});
},
}),
),
deleteNode: sharedTool(
sharedToolSpecs.deleteNode,
@@ -821,6 +809,10 @@ export class AiChatToolsService {
},
}),
// NOT in the shared registry: this layer names the table argument
// `tableRef`, while the standalone MCP tool names it `table` (index.ts).
// Sharing one buildShape would rename a model-facing parameter on one
// transport, so the table row/cell tools stay per-layer by design.
tableInsertRow: tool({
description:
'Insert a row of plain-text cells into a table. Reversible via ' +
@@ -841,6 +833,8 @@ export class AiChatToolsService {
await client.tableInsertRow(pageId, tableRef, cells, index),
}),
// NOT shared — same `tableRef` (here) vs `table` (MCP) parameter-name
// divergence as tableInsertRow.
tableDeleteRow: tool({
description:
'Delete a table row at a 0-based index. Reversible via page history.',
@@ -855,6 +849,8 @@ export class AiChatToolsService {
await client.tableDeleteRow(pageId, tableRef, index),
}),
// NOT shared — same `tableRef` (here) vs `table` (MCP) parameter-name
// divergence as tableInsertRow.
tableUpdateCell: tool({
description:
'Set the plain-text content of a table cell at [row, col] (0-based). ' +
@@ -884,6 +880,10 @@ export class AiChatToolsService {
await client.importPageMarkdown(pageId, markdown),
),
// INTENTIONAL per-transport divergence (not shared): adds a security
// confirmation framing ("Only share when the user explicitly asked, since
// this exposes the page to anyone with the link") for the in-app agent; the
// standalone MCP `share_page` keeps the plain public-URL wording.
sharePage: tool({
description:
'Make a page PUBLICLY accessible and return its public URL. ' +
@@ -910,6 +910,10 @@ export class AiChatToolsService {
async ({ historyId }) => await client.restorePageVersion(historyId),
),
// INTENTIONAL per-transport divergence (not shared): deliberately omits the
// `deleteComments` schema field (comment-deletion guardrail) and carries a
// much shorter description; the standalone MCP `docmost_transform` exposes
// the full helper catalogue. Different schema, so kept per-layer.
transformPage: tool({
description:
'Run a sandboxed JS transform of the form `(doc, ctx) => doc` over a ' +
@@ -177,6 +177,7 @@ export interface DocmostClientLike {
type?: 'page' | 'inline',
selection?: string,
parentCommentId?: string,
suggestedText?: string,
): Promise<{ data: Record<string, unknown>; success: boolean }>;
resolveComment(
commentId: string,
@@ -113,9 +113,15 @@ describe('SHARED_TOOL_SPECS contract parity', () => {
const expectedKeys = Object.keys(shape).sort();
expect(actualKeys).toEqual(expectedKeys);
// A non-.optional() field must surface as required in the advertised schema.
// A field that was NOT wrapped in `.optional()` must surface as required in
// the advertised schema. We test for the ZodOptional wrapper rather than
// `isOptional()`: `z.any()`/`z.unknown()` accept `undefined` and so report
// `isOptional() === true`, yet z.toJSONSchema still lists them under
// `required` (they carry no `.optional()`). Matching on the wrapper is what
// the emitted JSON schema actually does, so it stays correct for the
// registry's `node: z.any()` fields (patchNode/insertNode).
const expectedRequired = Object.entries(shape)
.filter(([, field]) => !(field as z.ZodTypeAny).isOptional?.())
.filter(([, field]) => !(field instanceof z.ZodOptional))
.map(([k]) => k)
.sort();
expect((json.required ?? []).slice().sort()).toEqual(expectedRequired);
@@ -0,0 +1,119 @@
import {
ForbiddenException,
NotFoundException,
} from '@nestjs/common';
import { CommentController } from './comment.controller';
/**
* Authz-gate tests for the apply-suggestion route. Applying a suggestion
* rewrites the page text, so the route MUST call
* pageAccessService.validateCanEdit BEFORE handing off to
* commentService.applySuggestion (which performs the document mutation + stamp).
* That ordering is a security boundary: an unauthorized user must never reach
* the mutation. These tests pin it against a fully mocked controller so any
* regression that drops the gate (or reorders it after the mutation) fails here.
*/
describe('CommentController apply-suggestion authz', () => {
function makeController() {
const commentService = {
applySuggestion: jest.fn(async () => ({ id: 'c-1', applied: true })),
};
const commentRepo = { findById: jest.fn() };
const pageRepo = { findById: jest.fn() };
const spaceAbility = {} as any;
const pageAccessService = {
validateCanEdit: jest.fn(async () => undefined),
};
const wsService = {} as any;
const auditService = { log: jest.fn() };
const controller = new CommentController(
commentService as any,
commentRepo as any,
pageRepo as any,
spaceAbility,
pageAccessService as any,
wsService,
auditService as any,
);
return {
controller,
commentService,
commentRepo,
pageRepo,
pageAccessService,
};
}
const user: any = { id: 'u-1' };
const workspace: any = { id: 'ws-1' };
const provenance: any = undefined;
const dto: any = { commentId: 'c-1' };
const comment = {
id: 'c-1',
pageId: 'p-1',
spaceId: 'sp-1',
suggestedText: 'new text',
selection: 'old text',
};
const page = { id: 'p-1', spaceId: 'sp-1', deletedAt: null };
it('validateCanEdit throwing Forbidden rejects AND applySuggestion is never called', async () => {
const { controller, commentRepo, pageRepo, pageAccessService, commentService } =
makeController();
commentRepo.findById.mockResolvedValue(comment);
pageRepo.findById.mockResolvedValue(page);
pageAccessService.validateCanEdit.mockRejectedValue(
new ForbiddenException('no edit access'),
);
await expect(
controller.applySuggestion(dto, user, workspace, provenance),
).rejects.toBeInstanceOf(ForbiddenException);
// The security boundary: the mutation/stamp must NOT run for an
// unauthorized user.
expect(pageAccessService.validateCanEdit).toHaveBeenCalledWith(page, user);
expect(commentService.applySuggestion).not.toHaveBeenCalled();
});
it('happy path: validateCanEdit resolves → applySuggestion is called and its result returned', async () => {
const { controller, commentRepo, pageRepo, pageAccessService, commentService } =
makeController();
commentRepo.findById.mockResolvedValue(comment);
pageRepo.findById.mockResolvedValue(page);
const applied = { id: 'c-1', applied: true };
commentService.applySuggestion.mockResolvedValue(applied);
const result = await controller.applySuggestion(
dto,
user,
workspace,
provenance,
);
// Authorization ran before the mutation, then the service was invoked.
expect(pageAccessService.validateCanEdit).toHaveBeenCalledWith(page, user);
expect(commentService.applySuggestion).toHaveBeenCalledWith(
comment,
user,
provenance,
);
expect(result).toBe(applied);
});
it('missing comment: NotFound is thrown without authorizing or applying', async () => {
const { controller, commentRepo, pageRepo, pageAccessService, commentService } =
makeController();
commentRepo.findById.mockResolvedValue(null);
await expect(
controller.applySuggestion(dto, user, workspace, provenance),
).rejects.toBeInstanceOf(NotFoundException);
expect(pageRepo.findById).not.toHaveBeenCalled();
expect(pageAccessService.validateCanEdit).not.toHaveBeenCalled();
expect(commentService.applySuggestion).not.toHaveBeenCalled();
});
});
@@ -14,6 +14,7 @@ import { CommentService } from './comment.service';
import { CreateCommentDto } from './dto/create-comment.dto';
import { UpdateCommentDto } from './dto/update-comment.dto';
import { ResolveCommentDto } from './dto/resolve-comment.dto';
import { ApplySuggestionDto } from './dto/apply-suggestion.dto';
import { PageIdDto, CommentIdDto } from './dto/comments.input';
import { AuthUser } from '../../common/decorators/auth-user.decorator';
import { AuthWorkspace } from '../../common/decorators/auth-workspace.decorator';
@@ -197,6 +198,42 @@ export class CommentController {
return updated;
}
@HttpCode(HttpStatus.OK)
@Post('apply-suggestion')
async applySuggestion(
@Body() dto: ApplySuggestionDto,
@AuthUser() user: User,
@AuthWorkspace() workspace: Workspace,
@AuthProvenance() provenance: AuthProvenanceData,
) {
const comment = await this.commentRepo.findById(dto.commentId, {
includeCreator: true,
includeResolvedBy: true,
});
if (!comment) {
throw new NotFoundException('Comment not found');
}
const page = await this.pageRepo.findById(comment.pageId);
if (!page || page.deletedAt) {
throw new NotFoundException('Page not found');
}
// Authorize BEFORE revealing any structural detail about the comment
// (metadata-disclosure hygiene). Applying a suggestion rewrites the page
// text, so require edit access (NOT just comment access). Running this
// first means a cross-workspace user with a guessed comment UUID gets a
// uniform 403 regardless of the comment's type or suggestion state — it can
// never distinguish those before the access check. The structural 400s
// (top-level / has-a-suggested-edit) are re-checked by the service below.
await this.pageAccessService.validateCanEdit(page, user);
// The service re-validates the comment's state, returns idempotent success
// for an already-applied suggestion, and lets ConflictException (409, with
// currentText in the payload) propagate untouched.
return this.commentService.applySuggestion(comment, user, provenance);
}
@HttpCode(HttpStatus.OK)
@Post('delete')
async delete(@Body() input: CommentIdDto, @AuthUser() user: User, @AuthWorkspace() workspace: Workspace) {
@@ -0,0 +1,245 @@
import {
BadRequestException,
ConflictException,
InternalServerErrorException,
} from '@nestjs/common';
import { CommentService } from './comment.service';
import { AuditEvent, AuditResource } from '../../common/events/audit-events';
/**
* Focused coverage for CommentService.applySuggestion (comment.service.ts).
* The service is constructed directly with jest-mocked deps (the @InjectQueue
* tokens can't be resolved by Test.createTestingModule — see the sibling specs).
*
* The collaboration gateway verdict is the pivot of the whole flow, so each test
* pins a specific { applied, currentText } and asserts the DB persistence,
* auto-resolve, audit, ws broadcast, and error mapping that follow from it.
*/
describe('CommentService — applySuggestion', () => {
const UPDATED = { id: 'c-1', __updated: true } as any;
function makeService(verdict: unknown) {
const commentRepo: any = {
// Both the applied-stamp re-read and resolveComment's re-read go through
// findById; return a recognizable enriched row.
findById: jest.fn(async () => UPDATED),
updateComment: jest.fn(async () => undefined),
};
const pageRepo: any = {};
const wsService: any = { emitCommentEvent: jest.fn() };
const collaborationGateway: any = {
handleYjsEvent: jest.fn(async () => verdict),
};
const generalQueue: any = { add: jest.fn(() => Promise.resolve()) };
const notificationQueue: any = { add: jest.fn(async () => undefined) };
const auditService: any = { log: jest.fn() };
const service = new CommentService(
commentRepo,
pageRepo,
wsService,
collaborationGateway,
generalQueue,
notificationQueue,
auditService,
);
return {
service,
commentRepo,
wsService,
collaborationGateway,
auditService,
};
}
const suggestionComment = (over?: Partial<any>): any => ({
id: 'c-1',
pageId: 'page-1',
spaceId: 'space-1',
workspaceId: 'ws-1',
creatorId: 'user-1',
parentCommentId: null,
selection: 'old text',
suggestedText: 'new text',
suggestionAppliedAt: null,
resolvedAt: null,
...over,
});
const user = (over?: Partial<any>): any => ({ id: 'user-1', ...over });
// Pull the updateComment patch that carries the applied stamps.
const appliedPatch = (commentRepo: any) =>
commentRepo.updateComment.mock.calls
.map((c: any[]) => c[0])
.find((patch: any) => 'suggestionAppliedAt' in patch);
it('applied=true → replaces text, persists applied stamps, auto-resolves, audits, returns updated', async () => {
const { service, commentRepo, wsService, collaborationGateway, auditService } =
makeService({ applied: true, currentText: 'new text' });
const result = await service.applySuggestion(suggestionComment(), user());
// The atomic replace was requested against the exact marked text.
expect(collaborationGateway.handleYjsEvent).toHaveBeenCalledWith(
'applyCommentSuggestion',
'page.page-1',
expect.objectContaining({
commentId: 'c-1',
expectedText: 'old text',
newText: 'new text',
user: expect.objectContaining({ id: 'user-1' }),
}),
);
// Applied stamps persisted.
const patch = appliedPatch(commentRepo);
expect(patch.suggestionAppliedAt).toBeInstanceOf(Date);
expect(patch.suggestionAppliedById).toBe('user-1');
// Auto-resolved: resolveComment writes a resolvedAt/resolvedById patch too.
const resolvePatch = commentRepo.updateComment.mock.calls
.map((c: any[]) => c[0])
.find((p: any) => 'resolvedAt' in p);
expect(resolvePatch.resolvedAt).toBeInstanceOf(Date);
expect(resolvePatch.resolvedById).toBe('user-1');
// Audit + broadcast + return.
expect(auditService.log).toHaveBeenCalledWith(
expect.objectContaining({
event: AuditEvent.COMMENT_SUGGESTION_APPLIED,
resourceType: AuditResource.COMMENT,
resourceId: 'c-1',
spaceId: 'space-1',
metadata: { pageId: 'page-1' },
}),
);
expect(wsService.emitCommentEvent).toHaveBeenCalledWith(
'space-1',
'page-1',
expect.objectContaining({ operation: 'commentUpdated', comment: UPDATED }),
);
expect(result).toBe(UPDATED);
});
it('applied=false but currentText === suggestedText → idempotent success (no 409)', async () => {
const { service, commentRepo, auditService } = makeService({
applied: false,
currentText: 'new text',
});
const result = await service.applySuggestion(suggestionComment(), user());
// The stamps are still persisted (reconciling a crash between the doc
// mutation and the DB write) and the call succeeds.
const patch = appliedPatch(commentRepo);
expect(patch.suggestionAppliedAt).toBeInstanceOf(Date);
expect(patch.suggestionAppliedById).toBe('user-1');
expect(auditService.log).toHaveBeenCalledTimes(1);
expect(result).toBe(UPDATED);
});
it('applied=false and currentText differs → ConflictException with currentText in payload', async () => {
const { service, commentRepo, auditService } = makeService({
applied: false,
currentText: 'someone else edited this',
});
const err = await service
.applySuggestion(suggestionComment(), user())
.catch((e) => e);
expect(err).toBeInstanceOf(ConflictException);
expect(err.getResponse()).toMatchObject({
currentText: 'someone else edited this',
});
// No persistence and no audit on a conflict.
expect(appliedPatch(commentRepo)).toBeUndefined();
expect(auditService.log).not.toHaveBeenCalled();
});
it('already-applied AND already-resolved → idempotent success, no collab call, no re-resolve (#315 double-click)', async () => {
const { service, collaborationGateway, commentRepo, auditService } =
makeService({ applied: true, currentText: 'new text' });
const result = await service.applySuggestion(
suggestionComment({
suggestionAppliedAt: new Date(),
resolvedAt: new Date(),
resolvedById: 'user-1',
}),
user(),
);
// Idempotent SUCCESS, not a 409. The suggestion is already applied, so the
// collaborative document is never touched again and nothing is re-stamped
// or re-resolved.
expect(result).toBe(UPDATED);
expect(collaborationGateway.handleYjsEvent).not.toHaveBeenCalled();
expect(commentRepo.updateComment).not.toHaveBeenCalled();
// Same success shape as the applied path (broadcast + audit).
expect(auditService.log).toHaveBeenCalledTimes(1);
});
it('already-applied but NOT resolved (crash window) → idempotent success, self-heals resolve, no re-apply', async () => {
const { service, collaborationGateway, commentRepo } = makeService({
applied: true,
currentText: 'new text',
});
const result = await service.applySuggestion(
suggestionComment({ suggestionAppliedAt: new Date(), resolvedAt: null }),
user(),
);
expect(result).toBe(UPDATED);
// The suggestion is NOT re-applied to the document…
expect(collaborationGateway.handleYjsEvent).not.toHaveBeenCalledWith(
'applyCommentSuggestion',
expect.anything(),
expect.anything(),
);
// …but the open thread is self-healed to resolved via resolveComment, which
// writes the resolve patch and updates the resolve mark.
const resolvePatch = commentRepo.updateComment.mock.calls
.map((c: any[]) => c[0])
.find((p: any) => 'resolvedAt' in p);
expect(resolvePatch.resolvedAt).toBeInstanceOf(Date);
expect(resolvePatch.resolvedById).toBe('user-1');
expect(collaborationGateway.handleYjsEvent).toHaveBeenCalledWith(
'resolveCommentMark',
'page.page-1',
expect.objectContaining({ commentId: 'c-1', resolved: true }),
);
// The applied stamps are NOT re-written (already stamped).
expect(appliedPatch(commentRepo)).toBeUndefined();
});
it('rejects a comment with no suggestedText', async () => {
const { service, collaborationGateway } = makeService({
applied: true,
currentText: 'x',
});
await expect(
service.applySuggestion(
suggestionComment({ suggestedText: null }),
user(),
),
).rejects.toThrow(BadRequestException);
expect(collaborationGateway.handleYjsEvent).not.toHaveBeenCalled();
});
it('gateway returning undefined → hard error, not a silent success', async () => {
const { service, commentRepo, auditService } = makeService(undefined);
await expect(
service.applySuggestion(suggestionComment(), user()),
).rejects.toThrow(InternalServerErrorException);
// Nothing persisted, nothing audited.
expect(appliedPatch(commentRepo)).toBeUndefined();
expect(auditService.log).not.toHaveBeenCalled();
});
});
@@ -60,6 +60,7 @@ describe('CommentService — behavior', () => {
};
const generalQueue: any = { add: jest.fn(() => Promise.resolve()) };
const notificationQueue: any = { add: jest.fn(async () => undefined) };
const auditService: any = { log: jest.fn() };
const service = new CommentService(
commentRepo,
@@ -68,14 +69,17 @@ describe('CommentService — behavior', () => {
collaborationGateway,
generalQueue,
notificationQueue,
auditService,
);
return {
service,
commentRepo,
wsService,
collaborationGateway,
generalQueue,
notificationQueue,
auditService,
};
}
@@ -181,6 +185,95 @@ describe('CommentService — behavior', () => {
});
});
describe('create — suggested edit validation & storage', () => {
it('rejects a suggestedText on a reply (not a top-level comment)', async () => {
const parentComment = {
id: 'parent-1',
pageId: 'page-1',
parentCommentId: null,
};
const { service, commentRepo } = makeService({ parentComment });
await expect(
service.create(
{ page: page(), workspaceId: 'ws-1', user: user() },
{
content: JSON.stringify(docMentioning()),
parentCommentId: 'parent-1',
selection: 'hello world',
suggestedText: 'goodbye world',
} as any,
),
).rejects.toThrow(BadRequestException);
expect(commentRepo.insertComment).not.toHaveBeenCalled();
});
it('rejects a suggestedText without a selection', async () => {
const { service, commentRepo } = makeService();
await expect(
service.create(
{ page: page(), workspaceId: 'ws-1', user: user() },
{
content: JSON.stringify(docMentioning()),
suggestedText: 'new text',
} as any,
),
).rejects.toThrow(BadRequestException);
expect(commentRepo.insertComment).not.toHaveBeenCalled();
});
it('rejects a suggestedText identical to the selection (no-op)', async () => {
const { service, commentRepo } = makeService();
await expect(
service.create(
{ page: page(), workspaceId: 'ws-1', user: user() },
{
content: JSON.stringify(docMentioning()),
selection: 'same text',
// Only differs by surrounding whitespace → still a no-op after trim.
suggestedText: ' same text ',
} as any,
),
).rejects.toThrow(BadRequestException);
expect(commentRepo.insertComment).not.toHaveBeenCalled();
});
it('stores a valid suggestedText (trimmed) on the inserted row', async () => {
const { service, commentRepo } = makeService();
await service.create(
{ page: page(), workspaceId: 'ws-1', user: user() },
{
content: JSON.stringify(docMentioning()),
selection: 'old text',
type: 'inline',
suggestedText: ' new text ',
} as any,
);
const insertArg = commentRepo.insertComment.mock.calls[0][0];
expect(insertArg.suggestedText).toBe('new text');
expect(insertArg.selection).toBe('old text');
});
it('leaves suggestedText null for an ordinary comment', async () => {
const { service, commentRepo } = makeService();
await service.create(
{ page: page(), workspaceId: 'ws-1', user: user() },
{ content: JSON.stringify(docMentioning()) } as any,
);
const insertArg = commentRepo.insertComment.mock.calls[0][0];
expect(insertArg.suggestedText).toBeNull();
});
});
describe('resolveComment — provenance & resolve notifications', () => {
it('stamps resolvedSource:"agent" when an agent resolves', async () => {
const { service, commentRepo } = makeService();
@@ -61,6 +61,8 @@ describe('CommentService — broadcast carries the agent avatar stack', () => {
const generalQueue: any = { add: jest.fn(() => Promise.resolve()) };
const notificationQueue: any = { add: jest.fn(async () => undefined) };
const auditService: any = { log: jest.fn() };
const service = new CommentService(
commentRepo,
pageRepo,
@@ -68,6 +70,7 @@ describe('CommentService — broadcast carries the agent avatar stack', () => {
collaborationGateway,
generalQueue,
notificationQueue,
auditService,
);
return { service, commentRepo, wsService };
@@ -14,6 +14,7 @@ describe('CommentService', () => {
{} as any, // collaborationGateway
{} as any, // generalQueue
{} as any, // notificationQueue
{} as any, // auditService
);
});
+199 -1
View File
@@ -1,7 +1,10 @@
import {
BadRequestException,
ConflictException,
ForbiddenException,
Inject,
Injectable,
InternalServerErrorException,
Logger,
NotFoundException,
} from '@nestjs/common';
@@ -26,6 +29,11 @@ import {
AuthProvenanceData,
agentSourceFields,
} from '../../common/decorators/auth-provenance.decorator';
import { AuditEvent, AuditResource } from '../../common/events/audit-events';
import {
AUDIT_SERVICE,
IAuditService,
} from '../../integrations/audit/audit.service';
@Injectable()
export class CommentService {
@@ -40,6 +48,7 @@ export class CommentService {
private generalQueue: Queue,
@InjectQueue(QueueName.NOTIFICATION_QUEUE)
private notificationQueue: Queue,
@Inject(AUDIT_SERVICE) private auditService: IAuditService,
) {}
async findById(commentId: string) {
@@ -78,15 +87,58 @@ export class CommentService {
}
}
// Do NOT lossily truncate at 250: for a suggestion the client sends the RAW
// anchored document substring (the exact text under the comment mark) as the
// selection, which can be LONGER than the agent's <=250-char typed input
// (normalization collapses whitespace/typographic runs, so the raw span can
// exceed the normalized selection). Truncating it shorter than the mark span
// would break the apply-time equality check and make the suggestion
// un-appliable. Keep a generous 2000-char safety bound (matching
// suggestedText) so a legitimate anchored substring is never cut.
const selection = createCommentDto?.selection?.substring(0, 2000) ?? null;
// A suggested edit rewrites the exact text under an inline comment mark, so
// it is only meaningful on a top-level inline comment that carries a
// selection, and only if the suggestion actually changes that text.
let suggestedText: string | null = null;
if (
createCommentDto.suggestedText !== undefined &&
createCommentDto.suggestedText !== null
) {
if (createCommentDto.parentCommentId) {
throw new BadRequestException(
'A suggested edit can only be attached to a top-level comment, not a reply',
);
}
if (!selection || selection.trim().length === 0) {
throw new BadRequestException(
'A suggested edit requires an inline comment with a non-empty text selection',
);
}
const trimmed = createCommentDto.suggestedText.trim();
if (trimmed.length === 0) {
throw new BadRequestException('A suggested edit cannot be empty');
}
// A no-op suggestion (identical to the selection) is meaningless and would
// make "apply" indistinguishable from "already applied".
if (trimmed === selection.trim()) {
throw new BadRequestException(
'A suggested edit must differ from the selected text',
);
}
suggestedText = trimmed;
}
const inserted = await this.commentRepo.insertComment({
pageId: page.id,
content: commentContent,
selection: createCommentDto?.selection?.substring(0, 250) ?? null,
selection,
type: createCommentDto.type ?? 'page',
parentCommentId: createCommentDto?.parentCommentId,
creatorId: user.id,
workspaceId: workspaceId,
spaceId: page.spaceId,
suggestedText,
// Agent-edit provenance: the user stays creatorId; this only annotates the
// source. Normal user requests leave the column default ('user').
...agentSourceFields(provenance, 'createdSource', 'aiChatId'),
@@ -299,6 +351,152 @@ export class CommentService {
return updatedComment;
}
/**
* Apply the suggested edit carried by a top-level inline comment: atomically
* replace the text under the comment mark in the collaborative document with
* the comment's suggestedText, then stamp the applied fields and auto-resolve
* the thread. The controller authorizes (validateCanEdit); this re-checks the
* comment's own state so the invariant holds regardless of caller.
*/
async applySuggestion(
comment: Comment,
user: User,
provenance?: AuthProvenanceData,
): Promise<Comment> {
// Structural guards.
if (comment.parentCommentId) {
throw new BadRequestException(
'Only a top-level comment can carry a suggested edit',
);
}
if (!comment.suggestedText) {
throw new BadRequestException('This comment has no suggested edit to apply');
}
// State guards. Order matters — the already-applied check precedes the
// resolved check because an applied comment is normally also resolved.
//
// Already applied → IDEMPOTENT SUCCESS (issue #315 DoD: double-click /
// two-user race → idempotent "already applied", NOT a 409). The suggestion
// is already in the document, so do NOT call the collab gateway again.
// finalizeAppliedSuggestion re-fetches/broadcasts the same success shape as
// the applied branch and, when the thread is still open (the rare "applied
// but not resolved" crash window), self-heals it via resolveComment.
if (comment.suggestionAppliedAt) {
return this.finalizeAppliedSuggestion(comment, user, provenance);
}
// Not-yet-applied on a resolved thread → reject. The client hides the apply
// button once a thread is resolved; this is the defensive server check.
if (comment.resolvedAt) {
throw new BadRequestException(
'Cannot apply a suggested edit on a resolved comment thread',
);
}
// Derive the document name the same way create()/resolveComment() do for
// the comment marks: `page.${pageId}`.
const documentName = `page.${comment.pageId}`;
let verdict: { applied: boolean; currentText: string | null } | undefined;
try {
verdict = await this.collaborationGateway.handleYjsEvent(
'applyCommentSuggestion',
documentName,
{
commentId: comment.id,
expectedText: comment.selection,
newText: comment.suggestedText,
user,
},
);
} catch (error) {
// A throwing gateway (or the phase-3 fallback failing) is a hard error —
// never silently succeed, the document may or may not have changed.
this.logger.error(
`Failed to apply suggested edit for comment ${comment.id}`,
error,
);
throw new InternalServerErrorException('Failed to apply the suggested edit');
}
if (!verdict) {
// Should not happen given the phase-3 fallback; treat as a hard error
// rather than assuming success.
throw new InternalServerErrorException('Failed to apply the suggested edit');
}
if (verdict.applied === true) {
return this.finalizeAppliedSuggestion(comment, user, provenance);
}
// Idempotent branch: the mutation didn't run now, but the text under the
// mark is ALREADY the suggested text (double-click, two-user race, or a
// crash between the doc mutation and the DB write). Reconcile the DB /
// resolved state and report success — do NOT 409.
if (
verdict.applied === false &&
verdict.currentText === comment.suggestedText
) {
return this.finalizeAppliedSuggestion(comment, user, provenance);
}
// The commented text changed since the suggestion was made. Surface the
// current text so the client can tell the user what it is now.
throw new ConflictException({
message:
'The commented text changed since this suggestion was made; it was not applied.',
currentText: verdict.currentText,
});
}
/**
* Persist the applied stamps (idempotently), auto-resolve the thread and
* broadcast + audit the applied suggestion. Shared by the applied and the
* idempotent "already-applied" branches of applySuggestion.
*/
private async finalizeAppliedSuggestion(
comment: Comment,
user: User,
provenance?: AuthProvenanceData,
): Promise<Comment> {
if (!comment.suggestionAppliedAt) {
await this.commentRepo.updateComment(
{
suggestionAppliedAt: new Date(),
suggestionAppliedById: user.id,
},
comment.id,
);
}
// Auto-resolve the thread. resolveComment handles the resolve mark, its ws
// broadcast and the resolve notification. The guard above guarantees the
// thread was open when we entered, but stay defensive on re-entry.
if (!comment.resolvedAt) {
await this.resolveComment(comment, true, user, provenance);
}
const updatedComment = await this.commentRepo.findById(comment.id, {
includeCreator: true,
includeResolvedBy: true,
});
this.wsService.emitCommentEvent(comment.spaceId, comment.pageId, {
operation: 'commentUpdated',
pageId: comment.pageId,
comment: updatedComment,
});
this.auditService.log({
event: AuditEvent.COMMENT_SUGGESTION_APPLIED,
resourceType: AuditResource.COMMENT,
resourceId: comment.id,
spaceId: comment.spaceId,
metadata: { pageId: comment.pageId },
});
return updatedComment;
}
private async queueCommentNotification(
content: any,
oldMentionIds: string[],
@@ -0,0 +1,6 @@
import { IsUUID } from 'class-validator';
export class ApplySuggestionDto {
@IsUUID()
commentId: string;
}
@@ -1,4 +1,12 @@
import { IsIn, IsJSON, IsObject, IsOptional, IsString, IsUUID } from 'class-validator';
import {
IsIn,
IsJSON,
IsObject,
IsOptional,
IsString,
IsUUID,
MaxLength,
} from 'class-validator';
import { z } from 'zod';
const yjsIdSchema = z.object({
@@ -25,8 +33,15 @@ export class CreateCommentDto {
@IsJSON()
content: any;
// The agent tool caps what it TYPES at 250 chars, but for a suggestion the
// client resolves and sends the RAW anchored document substring (the exact
// text under the mark), which can be longer once normalization is undone. Bound
// the stored value at 2000 (matching suggestedText) so a legitimate anchored
// substring is never rejected — the service used to lossily truncate at 250,
// which broke the apply-time equality check.
@IsOptional()
@IsString()
@MaxLength(2000)
selection: string;
@IsOptional()
@@ -43,4 +58,12 @@ export class CreateCommentDto {
anchor: any;
head: any;
};
// Optional suggested replacement for the selected text (a "suggested edit").
// Only valid on a top-level inline comment that carries a non-empty selection;
// enforced in CommentService.create.
@IsOptional()
@IsString()
@MaxLength(2000)
suggestedText?: string;
}
@@ -0,0 +1,26 @@
import { type Kysely } from 'kysely';
export async function up(db: Kysely<any>): Promise<void> {
// Agent comment suggestions (#315): a comment may carry a proposed replacement
// for its anchored `selection`, which a human applies via the comment UI.
await db.schema
.alterTable('comments')
// The proposed replacement text (plain text). NULL for ordinary comments.
.addColumn('suggested_text', 'text')
// When the suggestion was applied (NULL until applied).
.addColumn('suggestion_applied_at', 'timestamptz')
// Who applied it (NULL until applied).
.addColumn('suggestion_applied_by_id', 'uuid', (col) =>
col.references('users.id').onDelete('set null'),
)
.execute();
}
export async function down(db: Kysely<any>): Promise<void> {
await db.schema
.alterTable('comments')
.dropColumn('suggested_text')
.dropColumn('suggestion_applied_at')
.dropColumn('suggestion_applied_by_id')
.execute();
}
+3
View File
@@ -173,6 +173,9 @@ export interface Comments {
resolvedSource: string | null;
selection: string | null;
spaceId: string;
suggestedText: string | null;
suggestionAppliedAt: Timestamp | null;
suggestionAppliedById: string | null;
type: string | null;
updatedAt: Generated<Timestamp>;
workspaceId: string;
@@ -6,6 +6,8 @@ import {
streamKeepAliveMs,
streamingDispatcherOptions,
isRetryableConnectError,
preResponseConnectRetries,
preResponseBackoffMs,
} from './ai-streaming-fetch';
/**
@@ -47,8 +49,8 @@ describe('streamTimeoutMs', () => {
expect(streamingDispatcherOptions()).toEqual({
headersTimeout: 900_000,
bodyTimeout: 900_000,
keepAliveTimeout: 10_000,
keepAliveMaxTimeout: 10_000,
keepAliveTimeout: 4_000,
keepAliveMaxTimeout: 4_000,
});
});
});
@@ -60,21 +62,91 @@ describe('streamKeepAliveMs', () => {
else process.env.AI_STREAM_KEEPALIVE_MS = ORIG;
});
it('defaults to 10s (recycle idle sockets so a NAT/proxy drop cannot poison reuse)', () => {
it('defaults to 4s (recycle idle sockets under common ~5s upstream idle cutoffs)', () => {
delete process.env.AI_STREAM_KEEPALIVE_MS;
expect(streamKeepAliveMs()).toBe(10_000);
expect(streamKeepAliveMs()).toBe(4_000);
});
it('honours a positive override and ignores invalid/non-positive', () => {
process.env.AI_STREAM_KEEPALIVE_MS = '4000';
expect(streamKeepAliveMs()).toBe(4000);
process.env.AI_STREAM_KEEPALIVE_MS = '7000';
expect(streamKeepAliveMs()).toBe(7000);
for (const bad of ['0', '-1', 'x', '']) {
process.env.AI_STREAM_KEEPALIVE_MS = bad;
expect(streamKeepAliveMs()).toBe(10_000);
expect(streamKeepAliveMs()).toBe(4_000);
}
});
});
/**
* #310: the PRE-RESPONSE retry budget was raised 2 -> 4 (5 total attempts) and
* made env-configurable so a BURST of upstream resets doesn't exhaust it.
*/
describe('preResponseConnectRetries', () => {
const ORIG = process.env.AI_STREAM_PRE_RESPONSE_RETRIES;
afterEach(() => {
if (ORIG === undefined) delete process.env.AI_STREAM_PRE_RESPONSE_RETRIES;
else process.env.AI_STREAM_PRE_RESPONSE_RETRIES = ORIG;
});
it('defaults to 4 retries (5 total attempts)', () => {
delete process.env.AI_STREAM_PRE_RESPONSE_RETRIES;
expect(preResponseConnectRetries()).toBe(4);
});
it('honours a non-negative override (incl. 0 = single attempt)', () => {
process.env.AI_STREAM_PRE_RESPONSE_RETRIES = '6';
expect(preResponseConnectRetries()).toBe(6);
process.env.AI_STREAM_PRE_RESPONSE_RETRIES = '0';
expect(preResponseConnectRetries()).toBe(0);
});
it('ignores an invalid / negative override (falls back to default 4)', () => {
for (const bad of ['-1', 'abc', '']) {
process.env.AI_STREAM_PRE_RESPONSE_RETRIES = bad;
expect(preResponseConnectRetries()).toBe(4);
}
});
});
/**
* #310: linear `150 * (attempt + 1)` backoff replaced with capped exponential +
* FULL jitter to avoid a thundering herd of lock-step reconnects. Bound-check the
* jitter by pinning the randomness source to its extremes.
*/
describe('preResponseBackoffMs', () => {
it('with rand=0 waits 0 (bottom of the full-jitter window)', () => {
for (let attempt = 0; attempt < 6; attempt++) {
expect(preResponseBackoffMs(attempt, () => 0)).toBe(0);
}
});
it('with rand=1 returns the capped exponential top of the window', () => {
// base 150ms, exp = 150 * 2**attempt, capped at 2000ms.
expect(preResponseBackoffMs(0, () => 1)).toBe(150);
expect(preResponseBackoffMs(1, () => 1)).toBe(300);
expect(preResponseBackoffMs(2, () => 1)).toBe(600);
expect(preResponseBackoffMs(3, () => 1)).toBe(1200);
// 150 * 2**4 = 2400 -> capped to 2000.
expect(preResponseBackoffMs(4, () => 1)).toBe(2000);
expect(preResponseBackoffMs(10, () => 1)).toBe(2000);
});
it('stays within [0, cap] and is NOT the old fixed linear value', () => {
const cap = 2000;
for (let attempt = 0; attempt < 8; attempt++) {
for (const r of [0, 0.5, 0.999, 1]) {
const d = preResponseBackoffMs(attempt, () => r);
expect(d).toBeGreaterThanOrEqual(0);
expect(d).toBeLessThanOrEqual(cap);
}
}
// The old formula gave a fixed 150*(attempt+1); the jittered one with a
// mid-range rand does not reproduce it (e.g. attempt 0 -> 75, not 150).
expect(preResponseBackoffMs(0, () => 0.5)).toBe(75);
expect(preResponseBackoffMs(0, () => 0.5)).not.toBe(150);
});
});
describe('isRetryableConnectError', () => {
it('matches connection-level codes on the error or its cause', () => {
expect(isRetryableConnectError({ cause: { code: 'ECONNRESET' } })).toBe(true);
@@ -156,8 +228,12 @@ describe('createStreamingFetch — against a delayed server', () => {
describe('withPreResponseRetry', () => {
// The retry is the OUTERMOST layer (over the dispatcher-bound streaming fetch),
// matching ai.service's withPreResponseRetry(instrument(createStreamingFetch())).
// PRE_RESPONSE_CONNECT_RETRIES is 2 -> at most 3 total attempts.
const MAX_ATTEMPTS = 3;
// The budget is env-driven (AI_STREAM_PRE_RESPONSE_RETRIES, default 4 -> 5
// total attempts). We PIN it to 2 here so the exhaustion test is fast and
// deterministic regardless of the default; total attempts = retries + 1 = 3.
const RETRIES = 2;
const MAX_ATTEMPTS = RETRIES + 1;
const ORIG_RETRIES = process.env.AI_STREAM_PRE_RESPONSE_RETRIES;
let server: http.Server;
let url: string;
let requests = 0;
@@ -194,6 +270,13 @@ describe('withPreResponseRetry', () => {
beforeEach(() => {
requests = 0;
resetMode = 'first';
process.env.AI_STREAM_PRE_RESPONSE_RETRIES = String(RETRIES);
});
afterEach(() => {
if (ORIG_RETRIES === undefined)
delete process.env.AI_STREAM_PRE_RESPONSE_RETRIES;
else process.env.AI_STREAM_PRE_RESPONSE_RETRIES = ORIG_RETRIES;
});
it('retries a pre-response reset on a fresh connection and succeeds', async () => {
@@ -216,12 +299,28 @@ describe('withPreResponseRetry', () => {
expect(caught).toBeDefined();
// A retryable connection error reached the caller (not swallowed).
expect(isRetryableConnectError(caught)).toBe(true);
// Bounded: exactly PRE_RESPONSE_CONNECT_RETRIES + 1 attempts hit the server
// Bounded: exactly AI_STREAM_PRE_RESPONSE_RETRIES + 1 attempts hit the server
// (pins both the limit and that the final error propagates — guards an
// off-by-one or an infinite loop).
expect(requests).toBe(MAX_ATTEMPTS);
});
it('honours a raised AI_STREAM_PRE_RESPONSE_RETRIES (more attempts before giving up)', async () => {
// Env-driven budget: 4 retries -> 5 total attempts against a persistently
// resetting connect.
process.env.AI_STREAM_PRE_RESPONSE_RETRIES = '4';
resetMode = 'all';
let caught: unknown;
try {
await retryingFetch()(url);
} catch (e) {
caught = e;
}
expect(caught).toBeDefined();
expect(isRetryableConnectError(caught)).toBe(true);
expect(requests).toBe(5);
});
it('does NOT retry an aborted request (no retry storm)', async () => {
resetMode = 'all';
const ctrl = new AbortController();
@@ -19,7 +19,7 @@ import { Agent } from 'undici';
const DEFAULT_STREAM_TIMEOUT_MS = 900_000;
/**
* Default keep-alive recycle window (10s). A pooled connection idle longer than
* Default keep-alive recycle window (4s). A pooled connection idle longer than
* this is CLOSED rather than reused.
*
* Long agent turns leave gaps of tens of seconds between provider calls (one
@@ -30,17 +30,70 @@ const DEFAULT_STREAM_TIMEOUT_MS = 900_000;
* the resets correlate with idleSincePrevCall ~42s, while a direct path to the
* provider does NOT reset). Recycling idle sockets well below such a drop window
* means a long-gap call opens a fresh connection instead of reusing a stale one.
* Kept comfortably under common ~5s upstream/middlebox idle cutoffs so undici
* recycles the socket before the network kills it, while still long enough to
* reuse a connection within a single burst of back-to-back calls (#310).
* `keepAliveMaxTimeout` also caps a server-advertised keep-alive so the provider
* cannot push the reuse window back up.
*/
const DEFAULT_STREAM_KEEPALIVE_MS = 10_000;
const DEFAULT_STREAM_KEEPALIVE_MS = 4_000;
/**
* How many times to retry a PRE-RESPONSE connection failure (a reset/timeout
* before ANY response byte) on a fresh connection. Safe because `fetch()` only
* rejects before the Response resolves — a started stream is never replayed.
* Default number of times to retry a PRE-RESPONSE connection failure (a
* reset/timeout before ANY response byte) on a fresh connection. Safe because
* `fetch()` only rejects before the Response resolves — a started stream is
* never replayed.
*
* Raised from 2 to 4 (total 5 attempts) so a short BURST of upstream/middlebox
* resets is absorbed without exhausting the budget: prod saw 2 of 3 attempts
* burned on a single turn, leaving no headroom (#310). Override with
* `AI_STREAM_PRE_RESPONSE_RETRIES`.
*/
const PRE_RESPONSE_CONNECT_RETRIES = 2;
const DEFAULT_PRE_RESPONSE_CONNECT_RETRIES = 4;
/**
* Configured PRE-RESPONSE retry budget. Override with
* `AI_STREAM_PRE_RESPONSE_RETRIES`; a missing/invalid/negative value falls back
* to {@link DEFAULT_PRE_RESPONSE_CONNECT_RETRIES}. Total attempts = value + 1.
* 0 disables the retry (a single attempt).
*/
export function preResponseConnectRetries(): number {
// Read the raw string first: an empty/whitespace value coerces to 0 via
// Number(), which is a VALID setting here (0 = single attempt), so it must be
// treated as "unset" rather than "disable the retry".
const rawStr = process.env.AI_STREAM_PRE_RESPONSE_RETRIES;
if (rawStr === undefined || rawStr.trim() === '') {
return DEFAULT_PRE_RESPONSE_CONNECT_RETRIES;
}
const raw = Number(rawStr);
return Number.isFinite(raw) && raw >= 0
? Math.floor(raw)
: DEFAULT_PRE_RESPONSE_CONNECT_RETRIES;
}
/** Base backoff before the first PRE-RESPONSE retry (ms). */
const PRE_RESPONSE_BACKOFF_BASE_MS = 150;
/** Cap on the exponential backoff window before jitter (ms). */
const PRE_RESPONSE_BACKOFF_CAP_MS = 2_000;
/**
* Backoff (ms) to wait before PRE-RESPONSE retry number `attempt` (0-based).
*
* Capped exponential with FULL jitter: `delay = random in [0, min(base*2^attempt,
* cap)]`. Full jitter spreads concurrent retries across the whole window so a
* burst of turns that all reset at once do not reconnect in lock-step and
* hammer the upstream in a thundering herd (#310); the exponential growth backs
* off harder as resets persist, and the cap keeps the wait bounded.
*/
export function preResponseBackoffMs(
attempt: number,
rand: () => number = Math.random,
): number {
const exp = PRE_RESPONSE_BACKOFF_BASE_MS * 2 ** attempt;
const capped = Math.min(exp, PRE_RESPONSE_BACKOFF_CAP_MS);
return rand() * capped;
}
/** undici cause codes for a connection-level failure that occurred PRE-RESPONSE. */
const RETRYABLE_CONNECT_CODES = new Set([
@@ -177,20 +230,19 @@ export function createStreamingFetch(): typeof fetch {
*/
export function withPreResponseRetry(baseFetch: typeof fetch): typeof fetch {
return (async (input: Parameters<typeof fetch>[0], init?: RequestInit) => {
const maxRetries = preResponseConnectRetries();
for (let attempt = 0; ; attempt++) {
try {
return await baseFetch(input, init);
} catch (err) {
const aborted = init?.signal?.aborted === true;
if (
aborted ||
attempt >= PRE_RESPONSE_CONNECT_RETRIES ||
!isRetryableConnectError(err)
) {
if (aborted || attempt >= maxRetries || !isRetryableConnectError(err)) {
throw err;
}
// Brief backoff before the fresh-connection retry.
await new Promise((resolve) => setTimeout(resolve, 150 * (attempt + 1)));
// Jittered backoff before the fresh-connection retry (anti-thundering-herd).
await new Promise((resolve) =>
setTimeout(resolve, preResponseBackoffMs(attempt)),
);
}
}
}) as typeof fetch;
+86 -11
View File
@@ -17,7 +17,7 @@ import { withPageLock } from "./lib/page-lock.js";
import { applyTextEdits, } from "./lib/json-edit.js";
import { getCollabToken, performLogin } from "./lib/auth-utils.js";
import { diffDocs, summarizeChange } from "./lib/diff.js";
import { applyAnchorInDoc, canAnchorInDoc } from "./lib/comment-anchor.js";
import { applyAnchorInDoc, canAnchorInDoc, countAnchorMatches, getAnchoredText, } from "./lib/comment-anchor.js";
import { blockText, walk, getList, insertMarkerAfter, setCalloutRange, noteItem, mdToInlineNodes, commentsToFootnotes, canonicalizeFootnotes, insertInlineFootnote, } from "./lib/transforms.js";
import vm from "node:vm";
// Supported image types, kept as two lookup tables so both a local file
@@ -1912,9 +1912,21 @@ export class DocmostClient {
* an orphan, unanchored comment behind. Replies (parentCommentId set) inherit
* their parent's anchor: they take NO selection and are not anchored.
*/
async createComment(pageId, content, type = "page", selection, parentCommentId) {
async createComment(pageId, content, type = "page", selection, parentCommentId, suggestedText) {
await this.ensureAuthenticated();
const isReply = !!parentCommentId;
const hasSuggestion = suggestedText !== undefined && suggestedText !== null;
// Defense in depth mirroring the server DTO/service: a suggested edit rewrites
// the exact anchored text, so it is only meaningful on a top-level inline
// comment that carries a selection.
if (hasSuggestion) {
if (isReply) {
throw new Error("create_comment: a suggested edit (suggestedText) cannot be attached to a reply; it applies only to a top-level inline comment.");
}
if (!selection || !selection.trim()) {
throw new Error("create_comment: a suggested edit (suggestedText) requires a 'selection' to anchor and rewrite.");
}
}
// Only top-level comments are inline-anchored, so they are stored as
// "inline". Replies carry no inline selection, so they keep the historical
// general ("page") type — both backward-compatible and semantically correct.
@@ -1924,6 +1936,16 @@ export class DocmostClient {
if (!isReply && (!selection || !selection.trim())) {
throw new Error("create_comment: an inline 'selection' (exact text to anchor on) is required for a top-level comment");
}
// For a SUGGESTION, the value we store as the comment's `selection` must be
// the RAW document substring the mark lands on (typographic quotes/dashes,
// nbsp, collapsed whitespace), NOT the agent's ASCII input. The anchor is
// placed via normalization, so when the doc was auto-converted to
// typographic the raw substring differs from the agent input; apply-time
// compares the stored selection to the marked doc text STRICTLY, so storing
// the raw substring is what makes "Apply" succeed instead of a spurious 409.
// Captured in the pre-check below (which already reads the page) and used as
// payload.selection. Ordinary comments keep sending the raw agent selection.
let anchoredSelection = null;
// For a top-level comment, fail BEFORE creating anything when the selection
// is not present in the persisted document — this avoids leaving an orphan
// comment + notification behind. A read failure (network) is non-fatal: the
@@ -1931,16 +1953,38 @@ export class DocmostClient {
if (!isReply && selection) {
try {
const page = await this.getPageJson(pageId);
if (!canAnchorInDoc(page.content, selection)) {
if (hasSuggestion) {
// A suggestion's anchor MUST be unambiguous: applying it rewrites the
// exact anchored text, and ordinary anchoring silently takes the first
// occurrence, so 0 matches -> not found and >=2 -> ambiguous, both
// rejected BEFORE creating the comment.
const matches = countAnchorMatches(page.content, selection);
if (matches === 0) {
throw new Error("create_comment: could not find the selection text in the page to anchor the comment. " +
"Provide the EXACT contiguous text from a single paragraph/block (<=250 chars).");
}
if (matches >= 2) {
throw new Error(`create_comment: the suggestion's selection is ambiguous — it occurs ${matches} times in the page. ` +
"A suggested edit must anchor to a UNIQUE location; expand the selection with surrounding context " +
"(still <=250 chars) so it appears exactly once.");
}
// Exactly one match: capture the RAW anchored substring to store as the
// comment selection (so apply-time equality holds). If this returns
// null despite countAnchorMatches===1 (shouldn't happen), fall back to
// the raw agent selection below rather than crash.
anchoredSelection = getAnchoredText(page.content, selection);
}
else if (!canAnchorInDoc(page.content, selection)) {
throw new Error("create_comment: could not find the selection text in the page to anchor the comment. " +
"Provide the EXACT contiguous text from a single paragraph/block (<=250 chars).");
}
}
catch (e) {
// Rethrow our own "not found" error; swallow read/network errors so the
// live anchor step can still try (and enforce) the anchoring.
// Rethrow our own "not found"/"ambiguous" errors; swallow read/network
// errors so the live anchor step can still try (and enforce) anchoring.
if (e instanceof Error &&
e.message.startsWith("create_comment: could not find the selection")) {
(e.message.startsWith("create_comment: could not find the selection") ||
e.message.startsWith("create_comment: the suggestion's selection is ambiguous"))) {
throw e;
}
if (process.env.DEBUG) {
@@ -1958,10 +2002,19 @@ export class DocmostClient {
content: JSON.stringify(jsonContent),
type: effectiveType,
};
// For a suggestion, store the RAW anchored substring (anchoredSelection) so
// the stored selection === the text under the mark === apply-time
// expectedText. Ordinary comments (and the null fallback) keep the raw
// agent selection — their selection is only display/anchor and never used
// by apply, so their behavior is unchanged.
if (!isReply && selection)
payload.selection = selection;
payload.selection = anchoredSelection ?? selection;
if (parentCommentId)
payload.parentCommentId = parentCommentId;
// Only a top-level inline comment (with a selection) may carry a suggestion.
if (!isReply && selection && hasSuggestion) {
payload.suggestedText = suggestedText;
}
const response = await this.client.post("/comments/create", payload);
const comment = response.data.data || response.data;
const markdown = comment.content
@@ -1988,15 +2041,34 @@ export class DocmostClient {
throw new Error("create_comment: the server returned no comment id, so the comment could not be anchored");
}
let anchored = false;
// Set inside the transform when a suggestion's live anchor is ambiguous
// (>=2 occurrences), so the rollback path can surface the right error.
let ambiguousInLiveDoc = false;
try {
const collabToken = await this.getCollabTokenWithReauth();
// Open the collab doc by the canonical UUID, never the slugId (#260). The
// /comments/create REST call above keeps the agent-supplied id.
const pageUuid = await this.resolvePageId(pageId);
const mutation = await mutatePageContent(pageUuid, collabToken, this.apiUrl, (liveDoc) => {
// Route through the mutatePage seam (not the free function) so this
// wrapper's uniqueness gate + rollback can be unit-tested without a live
// Hocuspocus collab socket.
const mutation = await this.mutatePage(pageUuid, collabToken, this.apiUrl, (liveDoc) => {
const doc = liveDoc && liveDoc.type === "doc"
? liveDoc
: { type: "doc", content: [] };
if (hasSuggestion) {
// Authoritative uniqueness check against the LIVE document: a
// suggestion must anchor to EXACTLY ONE occurrence, otherwise
// "Apply" would rewrite the wrong/ambiguous text. If the live doc
// no longer has exactly one occurrence (it changed since the
// pre-check), abort so the just-created comment is rolled back
// rather than mis-anchored to the first occurrence.
const liveCount = countAnchorMatches(doc, selection);
if (liveCount !== 1) {
ambiguousInLiveDoc = liveCount >= 2;
return null;
}
}
if (applyAnchorInDoc(doc, selection, newCommentId)) {
anchored = true;
return doc;
@@ -2014,10 +2086,13 @@ export class DocmostClient {
throw e;
}
if (!anchored) {
// Mutation aborted because the selection was not found in the live
// document. Roll back the comment and surface a hard error.
// Mutation aborted because the selection was not found (or, for a
// suggestion, was ambiguous) in the live document. Roll back the comment
// and surface a hard error.
await this.safeDeleteComment(newCommentId);
throw new Error("create_comment: failed to anchor the comment (selection not found in the live document); the comment was rolled back");
throw new Error(ambiguousInLiveDoc
? "create_comment: the suggestion's selection is ambiguous in the live document (multiple occurrences); the comment was rolled back. Expand the selection with surrounding context so it is unique."
: "create_comment: failed to anchor the comment (selection not found in the live document); the comment was rolled back");
}
result.anchored = true;
return result;
+58 -55
View File
@@ -76,6 +76,10 @@ export function createDocmostMcpServer(config) {
return jsonContent(spaces);
});
// Tool: list_pages
// INTENTIONAL per-transport divergence (not in the shared registry): this
// transport exposes a `tree:true` mode that returns the full nested hierarchy;
// the in-app copy keeps the same tree option but is worded for the in-app agent.
// Kept per-layer so each side can tune its own guidance.
server.registerTool("list_pages", {
description: "List most recent pages in a space ordered by updatedAt (descending). " +
"Returns a bounded list (default 50, max 100) — use search for lookups " +
@@ -143,6 +147,10 @@ export function createDocmostMcpServer(config) {
return jsonContent(result);
});
// Tool: table_insert_row
// NOT in the shared registry: this transport names the table argument `table`,
// while the in-app tool names it `tableRef` (ai-chat-tools.service.ts). Sharing
// one buildShape would rename a public MCP parameter, so the table row/cell
// tools stay per-transport by design.
server.registerTool("table_insert_row", {
description: "Insert a row of plain-text cells into a table. `table` = `#<index>` or " +
"a block id inside it. `cells` = text per column (padded to the table's " +
@@ -159,6 +167,8 @@ export function createDocmostMcpServer(config) {
return jsonContent(result);
});
// Tool: table_delete_row
// NOT shared — same `table` (here) vs `tableRef` (in-app) parameter-name
// divergence as table_insert_row.
server.registerTool("table_delete_row", {
description: "Delete the row at 0-based `index` from a table (`table` = `#<index>` or " +
"a block id inside it). Refuses to delete the table's only row. An " +
@@ -174,6 +184,8 @@ export function createDocmostMcpServer(config) {
return jsonContent(result);
});
// Tool: table_update_cell
// NOT shared — same `table` (here) vs `tableRef` (in-app) parameter-name
// divergence as table_insert_row.
server.registerTool("table_update_cell", {
description: "Set the plain-text content of cell [row,col] (0-based) in a table " +
"(`table` = `#<index>` or a block id inside it). Replaces the cell's " +
@@ -317,62 +329,17 @@ export function createDocmostMcpServer(config) {
},
};
});
// Tool: patch_node
server.registerTool("patch_node", {
description: "Replaces a single block identified by its attrs.id WITHOUT resending the " +
"whole document. Get the block id from get_page_json, then pass a " +
"ProseMirror node to put in its place. Example node: a paragraph " +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]} or a ' +
'heading {"type":"heading","attrs":{"level":2},"content":' +
'[{"type":"text","text":"Title"}]}. Bold is a mark: ' +
'{"type":"text","text":"x","marks":[{"type":"bold"}]}. The node may be a ' +
"JSON object or a JSON string (both accepted). Cheaper and safer than " +
"update_page_json for one-block structural edits.",
inputSchema: {
pageId: z.string().min(1),
nodeId: z.string().min(1),
node: z
.any()
.describe("ProseMirror node to put in place of the node with this id, e.g. " +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]}. ' +
"JSON object or JSON string both accepted."),
},
}, async ({ pageId, nodeId, node }) => {
// Tool: patch_node — schema + description from the shared registry (identical
// across both transports). The execute body keeps its own parseNodeArg
// normalization (the model sometimes serializes `node` as a JSON string).
registerShared(SHARED_TOOL_SPECS.patchNode, async ({ pageId, nodeId, node }) => {
const parsedNode = parseNodeArg(node);
const result = await docmostClient.patchNode(pageId, nodeId, parsedNode);
return jsonContent(result);
});
// Tool: insert_node
server.registerTool("insert_node", {
description: "Insert a block before/after another block (by attrs.id or anchor text) " +
"or append at the end. Get anchor block ids from get_page_json. Avoids " +
"resending the whole document. Can also insert table structure: to add a " +
"tableRow, pass a tableRow node with position before/after and anchor " +
"INSIDE the target table — anchorNodeId of any block/cell in it, or " +
"anchorText matching the table; to add a tableCell/tableHeader, use " +
"anchorNodeId of a block inside the target row (anchorText only resolves " +
"top-level blocks, so it cannot target a row). `anchorText` is matched " +
"against the block's literal rendered plain text (no markdown); " +
"markdown/emoji are tolerated as a fallback; prefer plain text or " +
"anchorNodeId. Note: append is top-level " +
"only and rejects structural table nodes. Example node: a paragraph " +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]} or a ' +
'heading {"type":"heading","attrs":{"level":2},"content":' +
'[{"type":"text","text":"Title"}]}. Bold is a mark: ' +
'{"type":"text","text":"x","marks":[{"type":"bold"}]}. The node may be a ' +
"JSON object or a JSON string (both accepted).",
inputSchema: {
pageId: z.string().min(1),
node: z
.any()
.describe("ProseMirror node to insert, e.g. " +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]}. ' +
"JSON object or JSON string both accepted."),
position: z.enum(["before", "after", "append"]),
anchorNodeId: z.string().optional(),
anchorText: z.string().optional(),
},
}, async ({ pageId, node, position, anchorNodeId, anchorText }) => {
// Tool: insert_node — schema + description from the shared registry. As with
// patch_node, the execute body retains parseNodeArg on the incoming node.
registerShared(SHARED_TOOL_SPECS.insertNode, async ({ pageId, node, position, anchorNodeId, anchorText }) => {
const parsedNode = parseNodeArg(node);
const result = await docmostClient.insertNode(pageId, parsedNode, {
position,
@@ -453,6 +420,10 @@ export function createDocmostMcpServer(config) {
return jsonContent(result);
});
// Tool: share_page
// INTENTIONAL per-transport divergence (not shared): the in-app copy adds a
// security-confirmation framing ("only share when the user explicitly asked,
// since this exposes the page to anyone with the link") tuned for the in-app
// agent; this transport keeps the plain public-URL wording.
server.registerTool("share_page", {
description: "Make a page publicly accessible (idempotent) and return its public " +
"URL. The URL format is <app>/share/<key>/p/<slugId>.",
@@ -539,6 +510,9 @@ export function createDocmostMcpServer(config) {
return jsonContent(comments);
});
// Tool: create_comment
// INTENTIONAL per-transport divergence (not shared): the in-app copy tunes the
// guidance for the in-app agent (e.g. "retry with a corrected EXACT selection"
// and "Reversible via the comment UI"); this transport keeps its own wording.
server.registerTool("create_comment", {
description: "Create a new comment on a page. The comment is ALWAYS inline and is " +
"anchored to (highlights) its `selection` text — there are no page-level " +
@@ -546,7 +520,10 @@ export function createDocmostMcpServer(config) {
"A top-level comment REQUIRES an exact `selection`; if the selection " +
"cannot be found in the page the call fails (no orphan comment is left). " +
"Replies (parentCommentId set) inherit the parent's anchor and take no " +
"selection.",
"selection. You may also attach a `suggestedText` proposing a replacement " +
"for the `selection`; a human applies (or rejects) it from the UI. When " +
"`suggestedText` is set the `selection` MUST occur exactly once in the " +
"page — expand it with surrounding context if it is ambiguous.",
inputSchema: {
pageId: z.string().describe("ID of the page to comment on"),
content: z.string().min(1).describe("Comment content in Markdown format"),
@@ -563,12 +540,30 @@ export function createDocmostMcpServer(config) {
.string()
.optional()
.describe("Parent comment ID to create a reply (max 2 nesting levels)"),
suggestedText: z
.string()
.min(1)
.max(2000)
.optional()
.describe("Optional proposed replacement (PLAIN TEXT) for the `selection`, " +
"applied by a human via the UI (never auto-applied). REQUIRES a " +
"`selection`; NOT allowed on a reply. When set, the `selection` must " +
"be UNIQUE in the page — expand it with surrounding context (still " +
"<=250 chars) if it occurs more than once, or the call is refused."),
},
}, async ({ pageId, content, selection, parentCommentId }) => {
}, async ({ pageId, content, selection, parentCommentId, suggestedText }) => {
if (!parentCommentId && (!selection || !selection.trim())) {
throw new Error("create_comment: a 'selection' (exact text to anchor on) is required for a top-level comment; omit it only when replying via parentCommentId.");
}
const result = await docmostClient.createComment(pageId, content, "inline", selection, parentCommentId);
if (suggestedText !== undefined) {
if (parentCommentId) {
throw new Error("create_comment: 'suggestedText' cannot be attached to a reply; it applies only to a top-level inline comment.");
}
if (!selection || !selection.trim()) {
throw new Error("create_comment: 'suggestedText' requires a 'selection' to anchor and rewrite.");
}
}
const result = await docmostClient.createComment(pageId, content, "inline", selection, parentCommentId, suggestedText);
return jsonContent(result);
});
// Tool: update_comment
@@ -652,6 +647,10 @@ export function createDocmostMcpServer(config) {
return jsonContent(result);
});
// Tool: search
// INTENTIONAL per-transport divergence (not shared): the in-app `searchPages`
// runs a semantic + keyword hybrid (RRF) with in-process access control and a
// different schema (limit 1-20); this transport is a plain REST full-text search
// (limit up to 100). Different behaviour AND schema, so kept per-layer.
server.registerTool("search", {
description: "Search for pages and content. Results are bounded by `limit` " +
"(default applied by the client, max 100).",
@@ -672,6 +671,10 @@ export function createDocmostMcpServer(config) {
return jsonContent(result);
});
// Tool: docmost_transform
// INTENTIONAL per-transport divergence (not shared): the in-app `transformPage`
// deliberately omits the `deleteComments` schema field (comment-deletion
// guardrail) and carries a much shorter description; this transport exposes the
// full helper catalogue. Different schema, so kept per-layer.
server.registerTool("docmost_transform", {
description: "Edit a page by running an arbitrary JS transform `(doc, ctx) => doc` " +
"against its LIVE ProseMirror document, with a diff preview and page " +
+132
View File
@@ -148,6 +148,67 @@ export function findAnchorInBlock(blockContent, selection) {
}
return null;
}
/**
* Reconstruct the RAW text spanned by an AnchorMatch inside one block's
* `content` array. `startChild..endChild` are all text nodes (guaranteed by
* findAnchorInBlock, which only builds runs of `text` nodes), so concatenate
* each node's text slice: from `startOffset` on the first node, up to
* `endOffset` on the last, and the whole `.text` for any node fully inside the
* range. Mirrors spliceCommentMark's per-node slicing so the string returned
* here is EXACTLY the characters the comment mark will cover.
*/
function reconstructRawText(blockContent, match) {
const { startChild, startOffset, endChild, endOffset } = match;
let out = "";
for (let k = startChild; k <= endChild; k++) {
const n = blockContent[k];
const text = typeof n.text === "string" ? n.text : "";
const sliceStart = k === startChild ? startOffset : 0;
const sliceEnd = k === endChild ? endOffset : text.length;
out += text.slice(sliceStart, sliceEnd);
}
return out;
}
/**
* Return the RAW document substring that `selection` would anchor to the exact
* characters the comment mark will cover or `null` when the selection cannot
* be anchored anywhere in `doc`.
*
* This mirrors canAnchorInDoc / applyAnchorInDoc EXACTLY (same depth-first,
* document-order traversal and the same findAnchorInBlock match on the FIRST
* matching block), but instead of a boolean / an in-place mutation it
* reconstructs the raw text spanned by the matched range. Because
* findAnchorInBlock maps the normalized selection back to raw text-node
* positions, the returned string is the document's ORIGINAL characters (smart
* quotes, em-dashes, nbsp, collapsed whitespace) NOT the normalized ASCII
* agent input.
*
* Callers store THIS as the comment's `selection` so the stored value equals the
* text actually under the mark, which is what the apply-suggestion equality
* check (replaceYjsMarkedText's `joinedText !== expectedText`) compares against.
* Without it a suggestion whose anchor only matched via normalization would be
* un-appliable (spurious 409).
*/
export function getAnchoredText(doc, selection) {
const visit = (node, depth) => {
if (depth > MAX_DEPTH || !node || typeof node !== "object")
return null;
if (!Array.isArray(node.content))
return null;
const match = findAnchorInBlock(node.content, selection);
if (match)
return reconstructRawText(node.content, match);
for (const child of node.content) {
if (child && typeof child === "object" && Array.isArray(child.content)) {
const found = visit(child, depth + 1);
if (found !== null)
return found;
}
}
return null;
};
return visit(doc, 0);
}
/**
* Depth-first, document-order check for whether `selection` can be anchored
* anywhere in `doc`. At each node with an array `content`, first try to match
@@ -210,6 +271,77 @@ function spliceCommentMark(blockContent, match, commentId) {
}
blockContent.splice(startChild, endChild - startChild + 1, ...fragments);
}
/**
* Count how many times `selection` occurs across the whole document, using the
* same normalization and run-matching as findAnchorInBlock but WITHOUT stopping
* at the first hit: every non-overlapping occurrence within each block's text
* runs is counted and summed across all blocks (depth-first, the same traversal
* as canAnchorInDoc).
*
* This is the uniqueness gate for SUGGESTIONS: because applying a suggestion
* rewrites the exact anchored text, an ambiguous anchor (>1 occurrence) would
* silently edit the wrong place, so a suggestion is only allowed when this
* returns exactly 1. Ordinary comments keep first-occurrence anchoring and do
* not use this. (Note: counts OCCURRENCES, not just matching blocks, so two
* occurrences inside one block are correctly reported as 2.)
*/
export function countAnchorMatches(doc, selection) {
const normSel = normalizeForMatch(selection).norm.trim();
if (normSel.length === 0)
return 0;
// Count non-overlapping occurrences of the normalized selection within a
// single block's direct content, matching findAnchorInBlock's run building.
const countInBlock = (blockContent) => {
if (!Array.isArray(blockContent))
return 0;
let count = 0;
let i = 0;
while (i < blockContent.length) {
const node = blockContent[i];
if (!node || typeof node !== "object" || node.type !== "text") {
i++;
continue;
}
// Accumulate a maximal run of consecutive text nodes.
let rawRun = "";
let j = i;
while (j < blockContent.length) {
const n = blockContent[j];
if (!n || typeof n !== "object" || n.type !== "text")
break;
rawRun += typeof n.text === "string" ? n.text : "";
j++;
}
const norm = normalizeForMatch(rawRun).norm;
// Count every non-overlapping occurrence in this run.
let from = 0;
for (;;) {
const idx = norm.indexOf(normSel, from);
if (idx === -1)
break;
count++;
from = idx + normSel.length;
}
i = j > i ? j : i + 1;
}
return count;
};
let total = 0;
const visit = (node, depth) => {
if (depth > MAX_DEPTH || !node || typeof node !== "object")
return;
if (!Array.isArray(node.content))
return;
total += countInBlock(node.content);
for (const child of node.content) {
if (child && typeof child === "object" && Array.isArray(child.content)) {
visit(child, depth + 1);
}
}
};
visit(doc, 0);
return total;
}
/**
* Depth-first (same order as canAnchorInDoc) over `doc`; on the FIRST block
* whose content matches `selection`, splice the comment mark across the matched
+5
View File
@@ -70,6 +70,11 @@ export function filterComment(comment, markdownContent) {
editedAt: comment.editedAt || null,
resolvedAt: comment.resolvedAt || null,
resolvedById: comment.resolvedById || null,
// Suggestion state: the proposed replacement text (if any) and, once a human
// applies it via the UI, when and by whom.
suggestedText: comment.suggestedText || null,
suggestionAppliedAt: comment.suggestionAppliedAt || null,
suggestionAppliedById: comment.suggestionAppliedById || null,
};
}
export function filterSearchResult(result) {
+80
View File
@@ -80,6 +80,86 @@ export const SHARED_TOOL_SPECS = {
nodeId: z.string().min(1),
}),
},
// --- single-block structural write (patch / insert) ---
//
// CANONICAL description merges both layers: the MCP copy's "WITHOUT resending
// the whole document" + "cheaper/safer than a full-document replace" guidance
// AND the in-app copy's "keeps the same node id" + "Reversible via page
// history" framing — nothing either side conveyed is dropped. Sibling tools are
// named in transport-neutral prose ("the page-JSON view", "a full-document
// replace") to match the rest of the registry, since the two layers expose
// those siblings under different (snake_case vs camelCase) identifiers.
patchNode: {
mcpName: 'patch_node',
inAppKey: 'patchNode',
description: 'Replace a single content block identified by its attrs.id with a new ' +
'ProseMirror node, WITHOUT resending the whole document; the replacement ' +
'keeps the same node id. Get the block id from the page-JSON view, then ' +
'pass a ProseMirror node to put in its place. Example node: a paragraph ' +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]} or a ' +
'heading {"type":"heading","attrs":{"level":2},"content":' +
'[{"type":"text","text":"Title"}]}. Bold is a mark: ' +
'{"type":"text","text":"x","marks":[{"type":"bold"}]}. The node may be a ' +
'JSON object or a JSON string (both accepted). Cheaper and safer than ' +
'replacing the whole document for one-block structural edits. Reversible: ' +
'the previous version is kept in page history.',
buildShape: (z) => ({
pageId: z.string().min(1).describe('ID of the page containing the block'),
nodeId: z
.string()
.min(1)
.describe('attrs.id of the block to replace (from the page outline or ' +
'page-JSON view)'),
node: z
.any()
.describe('ProseMirror node to put in place of the node with this id, e.g. ' +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]}. ' +
'JSON object or JSON string both accepted.'),
}),
},
insertNode: {
mcpName: 'insert_node',
inAppKey: 'insertNode',
description: 'Insert a block before/after another block (by attrs.id or anchor text) ' +
'or append it at the end (top level). For before/after you MUST provide ' +
'EXACTLY ONE of anchorNodeId or anchorText. Get anchor block ids from the ' +
'page-JSON view. Avoids resending the whole document. Can also insert ' +
'table structure: to add a tableRow, pass a tableRow node with position ' +
'before/after and anchor INSIDE the target table — anchorNodeId of any ' +
'block/cell in it, or anchorText matching the table; to add a ' +
'tableCell/tableHeader, use anchorNodeId of a block inside the target row ' +
'(anchorText only resolves top-level blocks, so it cannot target a row). ' +
"`anchorText` is matched against the block's literal rendered plain text " +
'(no markdown); markdown/emoji are tolerated as a fallback; prefer plain ' +
'text or anchorNodeId. Note: append is top-level only and rejects ' +
'structural table nodes. Example node: a paragraph ' +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]} or a ' +
'heading {"type":"heading","attrs":{"level":2},"content":' +
'[{"type":"text","text":"Title"}]}. Bold is a mark: ' +
'{"type":"text","text":"x","marks":[{"type":"bold"}]}. The node may be a ' +
'JSON object or a JSON string (both accepted). Reversible via page history.',
buildShape: (z) => ({
pageId: z.string().min(1),
node: z
.any()
.describe('ProseMirror node to insert, e.g. ' +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]}. ' +
'JSON object or JSON string both accepted.'),
position: z
.enum(['before', 'after', 'append'])
.describe('Where to insert relative to the anchor.'),
anchorNodeId: z
.string()
.optional()
.describe('Anchor block id (for before/after).'),
anchorText: z
.string()
.optional()
.describe("Anchor text fragment (for before/after), matched against the " +
"block's literal rendered plain text (no markdown). Markdown/emoji " +
'are tolerated as a fallback; prefer plain text or anchorNodeId.'),
}),
},
// --- share management ---
unsharePage: {
mcpName: 'unshare_page',
+103 -10
View File
@@ -56,7 +56,12 @@ import {
} from "./lib/json-edit.js";
import { getCollabToken, performLogin } from "./lib/auth-utils.js";
import { diffDocs, summarizeChange } from "./lib/diff.js";
import { applyAnchorInDoc, canAnchorInDoc } from "./lib/comment-anchor.js";
import {
applyAnchorInDoc,
canAnchorInDoc,
countAnchorMatches,
getAnchoredText,
} from "./lib/comment-anchor.js";
import {
blockText,
walk,
@@ -2395,10 +2400,28 @@ export class DocmostClient {
type: "page" | "inline" = "page",
selection?: string,
parentCommentId?: string,
suggestedText?: string,
) {
await this.ensureAuthenticated();
const isReply = !!parentCommentId;
const hasSuggestion =
suggestedText !== undefined && suggestedText !== null;
// Defense in depth mirroring the server DTO/service: a suggested edit rewrites
// the exact anchored text, so it is only meaningful on a top-level inline
// comment that carries a selection.
if (hasSuggestion) {
if (isReply) {
throw new Error(
"create_comment: a suggested edit (suggestedText) cannot be attached to a reply; it applies only to a top-level inline comment.",
);
}
if (!selection || !selection.trim()) {
throw new Error(
"create_comment: a suggested edit (suggestedText) requires a 'selection' to anchor and rewrite.",
);
}
}
// Only top-level comments are inline-anchored, so they are stored as
// "inline". Replies carry no inline selection, so they keep the historical
// general ("page") type — both backward-compatible and semantically correct.
@@ -2411,6 +2434,17 @@ export class DocmostClient {
);
}
// For a SUGGESTION, the value we store as the comment's `selection` must be
// the RAW document substring the mark lands on (typographic quotes/dashes,
// nbsp, collapsed whitespace), NOT the agent's ASCII input. The anchor is
// placed via normalization, so when the doc was auto-converted to
// typographic the raw substring differs from the agent input; apply-time
// compares the stored selection to the marked doc text STRICTLY, so storing
// the raw substring is what makes "Apply" succeed instead of a spurious 409.
// Captured in the pre-check below (which already reads the page) and used as
// payload.selection. Ordinary comments keep sending the raw agent selection.
let anchoredSelection: string | null = null;
// For a top-level comment, fail BEFORE creating anything when the selection
// is not present in the persisted document — this avoids leaving an orphan
// comment + notification behind. A read failure (network) is non-fatal: the
@@ -2418,18 +2452,45 @@ export class DocmostClient {
if (!isReply && selection) {
try {
const page = await this.getPageJson(pageId);
if (!canAnchorInDoc(page.content, selection)) {
if (hasSuggestion) {
// A suggestion's anchor MUST be unambiguous: applying it rewrites the
// exact anchored text, and ordinary anchoring silently takes the first
// occurrence, so 0 matches -> not found and >=2 -> ambiguous, both
// rejected BEFORE creating the comment.
const matches = countAnchorMatches(page.content, selection);
if (matches === 0) {
throw new Error(
"create_comment: could not find the selection text in the page to anchor the comment. " +
"Provide the EXACT contiguous text from a single paragraph/block (<=250 chars).",
);
}
if (matches >= 2) {
throw new Error(
`create_comment: the suggestion's selection is ambiguous — it occurs ${matches} times in the page. ` +
"A suggested edit must anchor to a UNIQUE location; expand the selection with surrounding context " +
"(still <=250 chars) so it appears exactly once.",
);
}
// Exactly one match: capture the RAW anchored substring to store as the
// comment selection (so apply-time equality holds). If this returns
// null despite countAnchorMatches===1 (shouldn't happen), fall back to
// the raw agent selection below rather than crash.
anchoredSelection = getAnchoredText(page.content, selection);
} else if (!canAnchorInDoc(page.content, selection)) {
throw new Error(
"create_comment: could not find the selection text in the page to anchor the comment. " +
"Provide the EXACT contiguous text from a single paragraph/block (<=250 chars).",
);
}
} catch (e) {
// Rethrow our own "not found" error; swallow read/network errors so the
// live anchor step can still try (and enforce) the anchoring.
// Rethrow our own "not found"/"ambiguous" errors; swallow read/network
// errors so the live anchor step can still try (and enforce) anchoring.
if (
e instanceof Error &&
e.message.startsWith("create_comment: could not find the selection")
(e.message.startsWith("create_comment: could not find the selection") ||
e.message.startsWith(
"create_comment: the suggestion's selection is ambiguous",
))
) {
throw e;
}
@@ -2452,8 +2513,18 @@ export class DocmostClient {
content: JSON.stringify(jsonContent),
type: effectiveType,
};
if (!isReply && selection) payload.selection = selection;
// For a suggestion, store the RAW anchored substring (anchoredSelection) so
// the stored selection === the text under the mark === apply-time
// expectedText. Ordinary comments (and the null fallback) keep the raw
// agent selection — their selection is only display/anchor and never used
// by apply, so their behavior is unchanged.
if (!isReply && selection)
payload.selection = anchoredSelection ?? selection;
if (parentCommentId) payload.parentCommentId = parentCommentId;
// Only a top-level inline comment (with a selection) may carry a suggestion.
if (!isReply && selection && hasSuggestion) {
payload.suggestedText = suggestedText;
}
const response = await this.client.post("/comments/create", payload);
const comment = response.data.data || response.data;
@@ -2485,12 +2556,18 @@ export class DocmostClient {
);
}
let anchored = false;
// Set inside the transform when a suggestion's live anchor is ambiguous
// (>=2 occurrences), so the rollback path can surface the right error.
let ambiguousInLiveDoc = false;
try {
const collabToken = await this.getCollabTokenWithReauth();
// Open the collab doc by the canonical UUID, never the slugId (#260). The
// /comments/create REST call above keeps the agent-supplied id.
const pageUuid = await this.resolvePageId(pageId);
const mutation = await mutatePageContent(
// Route through the mutatePage seam (not the free function) so this
// wrapper's uniqueness gate + rollback can be unit-tested without a live
// Hocuspocus collab socket.
const mutation = await this.mutatePage(
pageUuid,
collabToken,
this.apiUrl,
@@ -2499,6 +2576,19 @@ export class DocmostClient {
liveDoc && liveDoc.type === "doc"
? liveDoc
: { type: "doc", content: [] };
if (hasSuggestion) {
// Authoritative uniqueness check against the LIVE document: a
// suggestion must anchor to EXACTLY ONE occurrence, otherwise
// "Apply" would rewrite the wrong/ambiguous text. If the live doc
// no longer has exactly one occurrence (it changed since the
// pre-check), abort so the just-created comment is rolled back
// rather than mis-anchored to the first occurrence.
const liveCount = countAnchorMatches(doc, selection as string);
if (liveCount !== 1) {
ambiguousInLiveDoc = liveCount >= 2;
return null;
}
}
if (applyAnchorInDoc(doc, selection as string, newCommentId)) {
anchored = true;
return doc;
@@ -2517,11 +2607,14 @@ export class DocmostClient {
}
if (!anchored) {
// Mutation aborted because the selection was not found in the live
// document. Roll back the comment and surface a hard error.
// Mutation aborted because the selection was not found (or, for a
// suggestion, was ambiguous) in the live document. Roll back the comment
// and surface a hard error.
await this.safeDeleteComment(newCommentId);
throw new Error(
"create_comment: failed to anchor the comment (selection not found in the live document); the comment was rolled back",
ambiguousInLiveDoc
? "create_comment: the suggestion's selection is ambiguous in the live document (multiple occurrences); the comment was rolled back. Expand the selection with surrounding context so it is unique."
: "create_comment: failed to anchor the comment (selection not found in the live document); the comment was rolled back",
);
}
+66 -64
View File
@@ -105,6 +105,10 @@ export function createDocmostMcpServer(config: DocmostMcpConfig): McpServer {
});
// Tool: list_pages
// INTENTIONAL per-transport divergence (not in the shared registry): this
// transport exposes a `tree:true` mode that returns the full nested hierarchy;
// the in-app copy keeps the same tree option but is worded for the in-app agent.
// Kept per-layer so each side can tune its own guidance.
server.registerTool(
"list_pages",
{
@@ -195,6 +199,10 @@ server.registerTool(
);
// Tool: table_insert_row
// NOT in the shared registry: this transport names the table argument `table`,
// while the in-app tool names it `tableRef` (ai-chat-tools.service.ts). Sharing
// one buildShape would rename a public MCP parameter, so the table row/cell
// tools stay per-transport by design.
server.registerTool(
"table_insert_row",
{
@@ -222,6 +230,8 @@ server.registerTool(
);
// Tool: table_delete_row
// NOT shared — same `table` (here) vs `tableRef` (in-app) parameter-name
// divergence as table_insert_row.
server.registerTool(
"table_delete_row",
{
@@ -243,6 +253,8 @@ server.registerTool(
);
// Tool: table_update_cell
// NOT shared — same `table` (here) vs `tableRef` (in-app) parameter-name
// divergence as table_insert_row.
server.registerTool(
"table_update_cell",
{
@@ -445,32 +457,11 @@ server.registerTool(
},
);
// Tool: patch_node
server.registerTool(
"patch_node",
{
description:
"Replaces a single block identified by its attrs.id WITHOUT resending the " +
"whole document. Get the block id from get_page_json, then pass a " +
"ProseMirror node to put in its place. Example node: a paragraph " +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]} or a ' +
'heading {"type":"heading","attrs":{"level":2},"content":' +
'[{"type":"text","text":"Title"}]}. Bold is a mark: ' +
'{"type":"text","text":"x","marks":[{"type":"bold"}]}. The node may be a ' +
"JSON object or a JSON string (both accepted). Cheaper and safer than " +
"update_page_json for one-block structural edits.",
inputSchema: {
pageId: z.string().min(1),
nodeId: z.string().min(1),
node: z
.any()
.describe(
"ProseMirror node to put in place of the node with this id, e.g. " +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]}. ' +
"JSON object or JSON string both accepted.",
),
},
},
// Tool: patch_node — schema + description from the shared registry (identical
// across both transports). The execute body keeps its own parseNodeArg
// normalization (the model sometimes serializes `node` as a JSON string).
registerShared(
SHARED_TOOL_SPECS.patchNode,
async ({ pageId, nodeId, node }) => {
const parsedNode = parseNodeArg(node);
const result = await docmostClient.patchNode(pageId, nodeId, parsedNode);
@@ -478,42 +469,10 @@ server.registerTool(
},
);
// Tool: insert_node
server.registerTool(
"insert_node",
{
description:
"Insert a block before/after another block (by attrs.id or anchor text) " +
"or append at the end. Get anchor block ids from get_page_json. Avoids " +
"resending the whole document. Can also insert table structure: to add a " +
"tableRow, pass a tableRow node with position before/after and anchor " +
"INSIDE the target table — anchorNodeId of any block/cell in it, or " +
"anchorText matching the table; to add a tableCell/tableHeader, use " +
"anchorNodeId of a block inside the target row (anchorText only resolves " +
"top-level blocks, so it cannot target a row). `anchorText` is matched " +
"against the block's literal rendered plain text (no markdown); " +
"markdown/emoji are tolerated as a fallback; prefer plain text or " +
"anchorNodeId. Note: append is top-level " +
"only and rejects structural table nodes. Example node: a paragraph " +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]} or a ' +
'heading {"type":"heading","attrs":{"level":2},"content":' +
'[{"type":"text","text":"Title"}]}. Bold is a mark: ' +
'{"type":"text","text":"x","marks":[{"type":"bold"}]}. The node may be a ' +
"JSON object or a JSON string (both accepted).",
inputSchema: {
pageId: z.string().min(1),
node: z
.any()
.describe(
"ProseMirror node to insert, e.g. " +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]}. ' +
"JSON object or JSON string both accepted.",
),
position: z.enum(["before", "after", "append"]),
anchorNodeId: z.string().optional(),
anchorText: z.string().optional(),
},
},
// Tool: insert_node — schema + description from the shared registry. As with
// patch_node, the execute body retains parseNodeArg on the incoming node.
registerShared(
SHARED_TOOL_SPECS.insertNode,
async ({ pageId, node, position, anchorNodeId, anchorText }) => {
const parsedNode = parseNodeArg(node);
const result = await docmostClient.insertNode(pageId, parsedNode, {
@@ -619,6 +578,10 @@ server.registerTool(
);
// Tool: share_page
// INTENTIONAL per-transport divergence (not shared): the in-app copy adds a
// security-confirmation framing ("only share when the user explicitly asked,
// since this exposes the page to anyone with the link") tuned for the in-app
// agent; this transport keeps the plain public-URL wording.
server.registerTool(
"share_page",
{
@@ -746,6 +709,9 @@ server.registerTool(
);
// Tool: create_comment
// INTENTIONAL per-transport divergence (not shared): the in-app copy tunes the
// guidance for the in-app agent (e.g. "retry with a corrected EXACT selection"
// and "Reversible via the comment UI"); this transport keeps its own wording.
server.registerTool(
"create_comment",
{
@@ -756,7 +722,10 @@ server.registerTool(
"A top-level comment REQUIRES an exact `selection`; if the selection " +
"cannot be found in the page the call fails (no orphan comment is left). " +
"Replies (parentCommentId set) inherit the parent's anchor and take no " +
"selection.",
"selection. You may also attach a `suggestedText` proposing a replacement " +
"for the `selection`; a human applies (or rejects) it from the UI. When " +
"`suggestedText` is set the `selection` MUST occur exactly once in the " +
"page — expand it with surrounding context if it is ambiguous.",
inputSchema: {
pageId: z.string().describe("ID of the page to comment on"),
content: z.string().min(1).describe("Comment content in Markdown format"),
@@ -775,20 +744,45 @@ server.registerTool(
.string()
.optional()
.describe("Parent comment ID to create a reply (max 2 nesting levels)"),
suggestedText: z
.string()
.min(1)
.max(2000)
.optional()
.describe(
"Optional proposed replacement (PLAIN TEXT) for the `selection`, " +
"applied by a human via the UI (never auto-applied). REQUIRES a " +
"`selection`; NOT allowed on a reply. When set, the `selection` must " +
"be UNIQUE in the page — expand it with surrounding context (still " +
"<=250 chars) if it occurs more than once, or the call is refused.",
),
},
},
async ({ pageId, content, selection, parentCommentId }) => {
async ({ pageId, content, selection, parentCommentId, suggestedText }) => {
if (!parentCommentId && (!selection || !selection.trim())) {
throw new Error(
"create_comment: a 'selection' (exact text to anchor on) is required for a top-level comment; omit it only when replying via parentCommentId.",
);
}
if (suggestedText !== undefined) {
if (parentCommentId) {
throw new Error(
"create_comment: 'suggestedText' cannot be attached to a reply; it applies only to a top-level inline comment.",
);
}
if (!selection || !selection.trim()) {
throw new Error(
"create_comment: 'suggestedText' requires a 'selection' to anchor and rewrite.",
);
}
}
const result = await docmostClient.createComment(
pageId,
content,
"inline",
selection,
parentCommentId,
suggestedText,
);
return jsonContent(result);
},
@@ -911,6 +905,10 @@ server.registerTool(
);
// Tool: search
// INTENTIONAL per-transport divergence (not shared): the in-app `searchPages`
// runs a semantic + keyword hybrid (RRF) with in-process access control and a
// different schema (limit 1-20); this transport is a plain REST full-text search
// (limit up to 100). Different behaviour AND schema, so kept per-layer.
server.registerTool(
"search",
{
@@ -937,6 +935,10 @@ server.registerTool(
);
// Tool: docmost_transform
// INTENTIONAL per-transport divergence (not shared): the in-app `transformPage`
// deliberately omits the `deleteComments` schema field (comment-deletion
// guardrail) and carries a much shorter description; this transport exposes the
// full helper catalogue. Different schema, so kept per-layer.
server.registerTool(
"docmost_transform",
{
+127
View File
@@ -171,6 +171,65 @@ export function findAnchorInBlock(
return null;
}
/**
* Reconstruct the RAW text spanned by an AnchorMatch inside one block's
* `content` array. `startChild..endChild` are all text nodes (guaranteed by
* findAnchorInBlock, which only builds runs of `text` nodes), so concatenate
* each node's text slice: from `startOffset` on the first node, up to
* `endOffset` on the last, and the whole `.text` for any node fully inside the
* range. Mirrors spliceCommentMark's per-node slicing so the string returned
* here is EXACTLY the characters the comment mark will cover.
*/
function reconstructRawText(blockContent: any[], match: AnchorMatch): string {
const { startChild, startOffset, endChild, endOffset } = match;
let out = "";
for (let k = startChild; k <= endChild; k++) {
const n = blockContent[k];
const text: string = typeof n.text === "string" ? n.text : "";
const sliceStart = k === startChild ? startOffset : 0;
const sliceEnd = k === endChild ? endOffset : text.length;
out += text.slice(sliceStart, sliceEnd);
}
return out;
}
/**
* Return the RAW document substring that `selection` would anchor to the exact
* characters the comment mark will cover or `null` when the selection cannot
* be anchored anywhere in `doc`.
*
* This mirrors canAnchorInDoc / applyAnchorInDoc EXACTLY (same depth-first,
* document-order traversal and the same findAnchorInBlock match on the FIRST
* matching block), but instead of a boolean / an in-place mutation it
* reconstructs the raw text spanned by the matched range. Because
* findAnchorInBlock maps the normalized selection back to raw text-node
* positions, the returned string is the document's ORIGINAL characters (smart
* quotes, em-dashes, nbsp, collapsed whitespace) NOT the normalized ASCII
* agent input.
*
* Callers store THIS as the comment's `selection` so the stored value equals the
* text actually under the mark, which is what the apply-suggestion equality
* check (replaceYjsMarkedText's `joinedText !== expectedText`) compares against.
* Without it a suggestion whose anchor only matched via normalization would be
* un-appliable (spurious 409).
*/
export function getAnchoredText(doc: any, selection: string): string | null {
const visit = (node: any, depth: number): string | null => {
if (depth > MAX_DEPTH || !node || typeof node !== "object") return null;
if (!Array.isArray(node.content)) return null;
const match = findAnchorInBlock(node.content, selection);
if (match) return reconstructRawText(node.content, match);
for (const child of node.content) {
if (child && typeof child === "object" && Array.isArray(child.content)) {
const found = visit(child, depth + 1);
if (found !== null) return found;
}
}
return null;
};
return visit(doc, 0);
}
/**
* Depth-first, document-order check for whether `selection` can be anchored
* anywhere in `doc`. At each node with an array `content`, first try to match
@@ -242,6 +301,74 @@ function spliceCommentMark(
blockContent.splice(startChild, endChild - startChild + 1, ...fragments);
}
/**
* Count how many times `selection` occurs across the whole document, using the
* same normalization and run-matching as findAnchorInBlock but WITHOUT stopping
* at the first hit: every non-overlapping occurrence within each block's text
* runs is counted and summed across all blocks (depth-first, the same traversal
* as canAnchorInDoc).
*
* This is the uniqueness gate for SUGGESTIONS: because applying a suggestion
* rewrites the exact anchored text, an ambiguous anchor (>1 occurrence) would
* silently edit the wrong place, so a suggestion is only allowed when this
* returns exactly 1. Ordinary comments keep first-occurrence anchoring and do
* not use this. (Note: counts OCCURRENCES, not just matching blocks, so two
* occurrences inside one block are correctly reported as 2.)
*/
export function countAnchorMatches(doc: any, selection: string): number {
const normSel = normalizeForMatch(selection).norm.trim();
if (normSel.length === 0) return 0;
// Count non-overlapping occurrences of the normalized selection within a
// single block's direct content, matching findAnchorInBlock's run building.
const countInBlock = (blockContent: any[]): number => {
if (!Array.isArray(blockContent)) return 0;
let count = 0;
let i = 0;
while (i < blockContent.length) {
const node = blockContent[i];
if (!node || typeof node !== "object" || node.type !== "text") {
i++;
continue;
}
// Accumulate a maximal run of consecutive text nodes.
let rawRun = "";
let j = i;
while (j < blockContent.length) {
const n = blockContent[j];
if (!n || typeof n !== "object" || n.type !== "text") break;
rawRun += typeof n.text === "string" ? n.text : "";
j++;
}
const norm = normalizeForMatch(rawRun).norm;
// Count every non-overlapping occurrence in this run.
let from = 0;
for (;;) {
const idx = norm.indexOf(normSel, from);
if (idx === -1) break;
count++;
from = idx + normSel.length;
}
i = j > i ? j : i + 1;
}
return count;
};
let total = 0;
const visit = (node: any, depth: number): void => {
if (depth > MAX_DEPTH || !node || typeof node !== "object") return;
if (!Array.isArray(node.content)) return;
total += countInBlock(node.content);
for (const child of node.content) {
if (child && typeof child === "object" && Array.isArray(child.content)) {
visit(child, depth + 1);
}
}
};
visit(doc, 0);
return total;
}
/**
* Depth-first (same order as canAnchorInDoc) over `doc`; on the FIRST block
* whose content matches `selection`, splice the comment mark across the matched
+5
View File
@@ -75,6 +75,11 @@ export function filterComment(comment: any, markdownContent?: string) {
editedAt: comment.editedAt || null,
resolvedAt: comment.resolvedAt || null,
resolvedById: comment.resolvedById || null,
// Suggestion state: the proposed replacement text (if any) and, once a human
// applies it via the UI, when and by whom.
suggestedText: comment.suggestedText || null,
suggestionAppliedAt: comment.suggestionAppliedAt || null,
suggestionAppliedById: comment.suggestionAppliedById || null,
};
}
+92
View File
@@ -119,6 +119,98 @@ export const SHARED_TOOL_SPECS = {
}),
},
// --- single-block structural write (patch / insert) ---
//
// CANONICAL description merges both layers: the MCP copy's "WITHOUT resending
// the whole document" + "cheaper/safer than a full-document replace" guidance
// AND the in-app copy's "keeps the same node id" + "Reversible via page
// history" framing — nothing either side conveyed is dropped. Sibling tools are
// named in transport-neutral prose ("the page-JSON view", "a full-document
// replace") to match the rest of the registry, since the two layers expose
// those siblings under different (snake_case vs camelCase) identifiers.
patchNode: {
mcpName: 'patch_node',
inAppKey: 'patchNode',
description:
'Replace a single content block identified by its attrs.id with a new ' +
'ProseMirror node, WITHOUT resending the whole document; the replacement ' +
'keeps the same node id. Get the block id from the page-JSON view, then ' +
'pass a ProseMirror node to put in its place. Example node: a paragraph ' +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]} or a ' +
'heading {"type":"heading","attrs":{"level":2},"content":' +
'[{"type":"text","text":"Title"}]}. Bold is a mark: ' +
'{"type":"text","text":"x","marks":[{"type":"bold"}]}. The node may be a ' +
'JSON object or a JSON string (both accepted). Cheaper and safer than ' +
'replacing the whole document for one-block structural edits. Reversible: ' +
'the previous version is kept in page history.',
buildShape: (z) => ({
pageId: z.string().min(1).describe('ID of the page containing the block'),
nodeId: z
.string()
.min(1)
.describe(
'attrs.id of the block to replace (from the page outline or ' +
'page-JSON view)',
),
node: z
.any()
.describe(
'ProseMirror node to put in place of the node with this id, e.g. ' +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]}. ' +
'JSON object or JSON string both accepted.',
),
}),
},
insertNode: {
mcpName: 'insert_node',
inAppKey: 'insertNode',
description:
'Insert a block before/after another block (by attrs.id or anchor text) ' +
'or append it at the end (top level). For before/after you MUST provide ' +
'EXACTLY ONE of anchorNodeId or anchorText. Get anchor block ids from the ' +
'page-JSON view. Avoids resending the whole document. Can also insert ' +
'table structure: to add a tableRow, pass a tableRow node with position ' +
'before/after and anchor INSIDE the target table — anchorNodeId of any ' +
'block/cell in it, or anchorText matching the table; to add a ' +
'tableCell/tableHeader, use anchorNodeId of a block inside the target row ' +
'(anchorText only resolves top-level blocks, so it cannot target a row). ' +
"`anchorText` is matched against the block's literal rendered plain text " +
'(no markdown); markdown/emoji are tolerated as a fallback; prefer plain ' +
'text or anchorNodeId. Note: append is top-level only and rejects ' +
'structural table nodes. Example node: a paragraph ' +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]} or a ' +
'heading {"type":"heading","attrs":{"level":2},"content":' +
'[{"type":"text","text":"Title"}]}. Bold is a mark: ' +
'{"type":"text","text":"x","marks":[{"type":"bold"}]}. The node may be a ' +
'JSON object or a JSON string (both accepted). Reversible via page history.',
buildShape: (z) => ({
pageId: z.string().min(1),
node: z
.any()
.describe(
'ProseMirror node to insert, e.g. ' +
'{"type":"paragraph","content":[{"type":"text","text":"Hello"}]}. ' +
'JSON object or JSON string both accepted.',
),
position: z
.enum(['before', 'after', 'append'])
.describe('Where to insert relative to the anchor.'),
anchorNodeId: z
.string()
.optional()
.describe('Anchor block id (for before/after).'),
anchorText: z
.string()
.optional()
.describe(
"Anchor text fragment (for before/after), matched against the " +
"block's literal rendered plain text (no markdown). Markdown/emoji " +
'are tolerated as a fallback; prefer plain text or anchorNodeId.',
),
}),
},
// --- share management ---
unsharePage: {
@@ -229,3 +229,322 @@ test("a reply creates without selection or anchoring and is stored as type 'page
"a reply must skip the pre-check/anchoring (no /pages/info read)",
);
});
// -----------------------------------------------------------------------------
// 4) suggestedText + a DUPLICATE selection is refused BEFORE creating anything:
// a suggestion must anchor to a unique location, so >=2 occurrences throws the
// ambiguity error (the /pages/info pre-check short-circuits before create).
// -----------------------------------------------------------------------------
test("suggestedText with an ambiguous selection is refused before creating", async () => {
let createCalls = 0;
let infoCalls = 0;
const { baseURL } = await spawn(async (req, res) => {
await readBody(req);
if (req.url === "/api/auth/login") {
sendJson(res, 200, { success: true }, {
"Set-Cookie": "authToken=t; Path=/; HttpOnly",
});
return;
}
if (req.url === "/api/pages/info") {
infoCalls++;
// "target" appears in two blocks -> ambiguous for a suggestion.
sendJson(res, 200, {
data: {
id: "page-1",
content: {
type: "doc",
content: [
{ type: "paragraph", content: [{ type: "text", text: "first target here" }] },
{ type: "paragraph", content: [{ type: "text", text: "second target here" }] },
],
},
},
});
return;
}
if (req.url === "/api/comments/create") {
createCalls++;
sendJson(res, 200, { data: { id: "should-not-happen" } });
return;
}
sendJson(res, 404, { message: "not found" });
});
const client = new DocmostClient(baseURL, "user@example.com", "pw");
await assert.rejects(
() =>
client.createComment(
"page-1",
"body",
"inline",
"target",
undefined,
"TARGET",
),
/ambiguous/i,
"an ambiguous suggestion selection must reject with the ambiguity error",
);
assert.ok(infoCalls >= 1, "the pre-check must read the page via /pages/info");
assert.equal(
createCalls,
0,
"/comments/create must NEVER be called for an ambiguous suggestion",
);
});
// -----------------------------------------------------------------------------
// 5) suggestedText on a reply is refused immediately (before any HTTP).
// -----------------------------------------------------------------------------
test("suggestedText on a reply is rejected", async () => {
let anyCall = 0;
const { baseURL } = await spawn(async (req, res) => {
await readBody(req);
if (req.url === "/api/auth/login") {
sendJson(res, 200, { success: true }, {
"Set-Cookie": "authToken=t; Path=/; HttpOnly",
});
return;
}
anyCall++;
sendJson(res, 200, { data: { id: "x" } });
});
const client = new DocmostClient(baseURL, "user@example.com", "pw");
await assert.rejects(
() =>
client.createComment(
"page-1",
"body",
"inline",
undefined,
"parent-1",
"replacement",
),
/reply/i,
"suggestedText on a reply must be rejected",
);
assert.equal(anyCall, 0, "no create/info call for a rejected reply suggestion");
});
// -----------------------------------------------------------------------------
// 6) suggestedText without a selection is refused immediately.
// -----------------------------------------------------------------------------
test("suggestedText without a selection is rejected", async () => {
const { baseURL } = await spawn(async (req, res) => {
await readBody(req);
if (req.url === "/api/auth/login") {
sendJson(res, 200, { success: true }, {
"Set-Cookie": "authToken=t; Path=/; HttpOnly",
});
return;
}
sendJson(res, 200, { data: { id: "x" } });
});
const client = new DocmostClient(baseURL, "user@example.com", "pw");
await assert.rejects(
() =>
client.createComment(
"page-1",
"body",
"inline",
undefined,
undefined,
"replacement",
),
/selection/i,
"suggestedText without a selection must be rejected",
);
});
// -----------------------------------------------------------------------------
// 7) suggestedText + a UNIQUE selection succeeds: the pre-check passes, the
// create payload carries suggestedText, and the live anchoring step (stubbed
// via the mutatePage seam) writes the comment mark exactly once.
// -----------------------------------------------------------------------------
test("suggestedText with a unique selection succeeds and forwards the payload", async () => {
let createPayload = null;
const { baseURL } = await spawn(async (req, res) => {
const raw = await readBody(req);
if (req.url === "/api/auth/login") {
sendJson(res, 200, { success: true }, {
"Set-Cookie": "authToken=t; Path=/; HttpOnly",
});
return;
}
if (req.url === "/api/pages/info") {
// "brave" is unique in the page.
sendJson(res, 200, {
data: {
id: "11111111-1111-1111-1111-111111111111",
content: {
type: "doc",
content: [
{ type: "paragraph", content: [{ type: "text", text: "Hello brave world" }] },
],
},
},
});
return;
}
if (req.url === "/api/comments/create") {
createPayload = JSON.parse(raw);
sendJson(res, 200, {
data: {
id: "cmt-ok-1",
content: createPayload.content,
selection: createPayload.selection,
suggestedText: createPayload.suggestedText,
type: createPayload.type,
},
});
return;
}
sendJson(res, 404, { message: "not found" });
});
// Subclass to stub the collab write seam: no live Hocuspocus socket, but the
// wrapper's uniqueness gate + applyAnchorInDoc still run against `doc`.
class TestClient extends DocmostClient {
async getCollabTokenWithReauth() {
return "collab-token";
}
async resolvePageId(pageId) {
return "11111111-1111-1111-1111-111111111111";
}
async mutatePage(pageId, collabToken, apiUrl, transform) {
const doc = {
type: "doc",
content: [
{ type: "paragraph", content: [{ type: "text", text: "Hello brave world" }] },
],
};
const out = transform(doc);
return { doc: out, verify: { ok: true } };
}
}
const client = new TestClient(baseURL, "user@example.com", "pw");
const result = await client.createComment(
"11111111-1111-1111-1111-111111111111",
"please rename",
"inline",
"brave",
undefined,
"bold",
);
assert.equal(result.success, true, "a unique suggestion must resolve");
assert.equal(result.anchored, true, "the comment must be anchored");
assert.ok(createPayload, "/comments/create must have been called");
assert.equal(
createPayload.suggestedText,
"bold",
"the create payload must carry suggestedText for a top-level inline comment",
);
assert.equal(createPayload.selection, "brave");
assert.equal(result.data.suggestedText, "bold", "filterComment surfaces suggestedText");
});
// -----------------------------------------------------------------------------
// 8) suggestedText where the DOC has TYPOGRAPHIC text and the agent selection is
// ASCII: the stored selection sent to /comments/create MUST be the doc's RAW
// typographic substring (what the mark covers), NOT the agent's ASCII input.
// This is the F1 contract that makes "Apply" succeed instead of a spurious
// 409 (apply compares the stored selection to the marked doc text strictly).
// -----------------------------------------------------------------------------
test("suggestedText: the stored selection is the doc's RAW typographic substring, not the ASCII input", async () => {
let createPayload = null;
const { baseURL } = await spawn(async (req, res) => {
const raw = await readBody(req);
if (req.url === "/api/auth/login") {
sendJson(res, 200, { success: true }, {
"Set-Cookie": "authToken=t; Path=/; HttpOnly",
});
return;
}
if (req.url === "/api/pages/info") {
// The doc holds SMART quotes; the agent will select the ASCII form.
sendJson(res, 200, {
data: {
id: "22222222-2222-2222-2222-222222222222",
content: {
type: "doc",
content: [
{
type: "paragraph",
content: [{ type: "text", text: "he said “hello” loudly" }],
},
],
},
},
});
return;
}
if (req.url === "/api/comments/create") {
createPayload = JSON.parse(raw);
sendJson(res, 200, {
data: {
id: "cmt-typo-1",
content: createPayload.content,
selection: createPayload.selection,
suggestedText: createPayload.suggestedText,
type: createPayload.type,
},
});
return;
}
sendJson(res, 404, { message: "not found" });
});
class TestClient extends DocmostClient {
async getCollabTokenWithReauth() {
return "collab-token";
}
async resolvePageId() {
return "22222222-2222-2222-2222-222222222222";
}
async mutatePage(pageId, collabToken, apiUrl, transform) {
const doc = {
type: "doc",
content: [
{
type: "paragraph",
content: [{ type: "text", text: "he said “hello” loudly" }],
},
],
};
const out = transform(doc);
return { doc: out, verify: { ok: true } };
}
}
const client = new TestClient(baseURL, "user@example.com", "pw");
const result = await client.createComment(
"22222222-2222-2222-2222-222222222222",
"please change",
"inline",
'"hello"', // ASCII quotes — the doc has smart quotes
undefined,
"goodbye",
);
assert.equal(result.success, true);
assert.equal(result.anchored, true);
assert.ok(createPayload, "/comments/create must have been called");
assert.equal(
createPayload.selection,
"“hello”",
"the stored selection must be the doc's RAW typographic substring, not the ASCII input",
);
assert.equal(createPayload.suggestedText, "goodbye");
});
@@ -6,6 +6,8 @@ import {
findAnchorInBlock,
canAnchorInDoc,
applyAnchorInDoc,
countAnchorMatches,
getAnchoredText,
} from "../../build/lib/comment-anchor.js";
const COMMENT_ID = "cmt-123";
@@ -208,3 +210,101 @@ test("anchoring works inside a nested block (e.g. list item) via DFS recursion",
assert.equal(marked.length, 1);
assert.equal(marked[0].text, "target");
});
// ---------------------------------------------------------------------------
// countAnchorMatches — the uniqueness gate for suggestions. Counts every
// non-overlapping occurrence across the whole document (0 / 1 / N).
// ---------------------------------------------------------------------------
test("countAnchorMatches returns 0 when the selection is absent", () => {
const doc = paragraphDoc([{ type: "text", text: "hello world" }]);
assert.equal(countAnchorMatches(doc, "missing"), 0);
});
test("countAnchorMatches returns 1 for a unique selection", () => {
const doc = paragraphDoc([{ type: "text", text: "Hello brave world" }]);
assert.equal(countAnchorMatches(doc, "brave"), 1);
});
test("countAnchorMatches counts multiple occurrences within one block", () => {
const doc = paragraphDoc([{ type: "text", text: "ab ab ab" }]);
assert.equal(countAnchorMatches(doc, "ab"), 3);
});
test("countAnchorMatches sums occurrences across separate blocks", () => {
const doc = {
type: "doc",
content: [
{ type: "paragraph", content: [{ type: "text", text: "first target here" }] },
{ type: "paragraph", content: [{ type: "text", text: "second target here" }] },
],
};
assert.equal(countAnchorMatches(doc, "target"), 2);
});
test("countAnchorMatches counts a match spanning adjacent text nodes as one", () => {
const doc = paragraphDoc([
{ type: "text", text: "запуска ", marks: [{ type: "italic" }] },
{ type: "text", text: "перед блоком", marks: [{ type: "italic" }] },
]);
assert.equal(countAnchorMatches(doc, "запуска перед"), 1);
});
test("countAnchorMatches counts matches inside nested (recursed) blocks", () => {
const doc = {
type: "doc",
content: [
{ type: "paragraph", content: [{ type: "text", text: "outer target" }] },
{
type: "bulletList",
content: [
{
type: "listItem",
content: [
{ type: "paragraph", content: [{ type: "text", text: "nested target" }] },
],
},
],
},
],
};
assert.equal(countAnchorMatches(doc, "target"), 2);
});
test("countAnchorMatches applies the same normalization as anchoring", () => {
// Smart quotes in the doc match ASCII quotes in the selection.
const doc = paragraphDoc([{ type: "text", text: "say “hi” now" }]);
assert.equal(countAnchorMatches(doc, '"hi"'), 1);
});
// -----------------------------------------------------------------------------
// getAnchoredText: returns the RAW document substring the mark would cover (the
// doc's original typographic characters), not the normalized ASCII selection.
// This is what makes a suggestion's stored selection equal the apply-time
// expectedText, so the strict equality in replaceYjsMarkedText holds.
// -----------------------------------------------------------------------------
test("getAnchoredText returns the RAW (typographic) doc substring for an ASCII selection", () => {
// Doc holds smart quotes; agent selection is the ASCII form.
const doc = paragraphDoc([{ type: "text", text: "he said “hello” loudly" }]);
assert.equal(getAnchoredText(doc, '"hello"'), "“hello”");
});
test("getAnchoredText undoes whitespace/dash normalization to the raw span", () => {
// Em-dash + nbsp in the doc; ASCII hyphen + single space in the selection.
const doc = paragraphDoc([{ type: "text", text: "a—b c" }]);
// selection "a-b c" (ascii dash) matches, raw substring keeps the em-dash+nbsp.
assert.equal(getAnchoredText(doc, "a-b c"), "a—b c");
});
test("getAnchoredText spans consecutive text nodes and returns their raw slices", () => {
const doc = paragraphDoc([
{ type: "text", text: "Hello " },
{ type: "text", text: "“brave”", marks: [{ type: "bold" }] },
{ type: "text", text: " world" },
]);
assert.equal(getAnchoredText(doc, '"brave" wor'), "“brave” wor");
});
test("getAnchoredText returns null when the selection does not anchor", () => {
const doc = paragraphDoc([{ type: "text", text: "hello world" }]);
assert.equal(getAnchoredText(doc, "not present"), null);
});
@@ -83,6 +83,63 @@ test("getNode builder produces exactly { pageId, nodeId }", () => {
assert.deepEqual(Object.keys(shape).sort(), ["nodeId", "pageId"]);
});
test("patchNode spec exists, merges BOTH descriptions, builds { pageId, nodeId, node }", () => {
const spec = SHARED_TOOL_SPECS.patchNode;
assert.ok(spec, "patchNode spec missing");
assert.equal(spec.mcpName, "patch_node");
assert.equal(spec.inAppKey, "patchNode");
// The canonical description must carry the key guidance from BOTH originals:
// - MCP-only: "WITHOUT resending the whole document" + the cheaper/safer note.
// - in-app-only: "keeps the same node id" + the "Reversible ... page history"
// framing the MCP copy lacked.
assert.match(spec.description, /WITHOUT resending the whole document/);
assert.match(spec.description, /Cheaper and safer/);
assert.match(spec.description, /keeps the same node id/i);
assert.match(spec.description, /Reversible/i);
assert.match(spec.description, /page history/i);
const shape = spec.buildShape(z);
assert.deepEqual(Object.keys(shape).sort(), ["node", "nodeId", "pageId"]);
// A minimal valid input parses (node accepts an arbitrary object via z.any()).
const parsed = z.object(shape).parse({
pageId: "p1",
nodeId: "n1",
node: { type: "paragraph" },
});
assert.equal(parsed.pageId, "p1");
assert.equal(parsed.nodeId, "n1");
});
test("insertNode spec exists, merges BOTH descriptions, builds the full anchor shape", () => {
const spec = SHARED_TOOL_SPECS.insertNode;
assert.ok(spec, "insertNode spec missing");
assert.equal(spec.mcpName, "insert_node");
assert.equal(spec.inAppKey, "insertNode");
// Canonical description must keep BOTH sides' nuance:
// - in-app-only: "EXACTLY ONE of anchorNodeId or anchorText" + "Reversible".
// - MCP-only: the table-structure (tableRow/tableCell) insertion guidance.
assert.match(spec.description, /EXACTLY ONE of anchorNodeId or anchorText/);
assert.match(spec.description, /tableRow/);
assert.match(spec.description, /append is top-level only/);
assert.match(spec.description, /Reversible via page history/);
const shape = spec.buildShape(z);
assert.deepEqual(
Object.keys(shape).sort(),
["anchorNodeId", "anchorText", "node", "pageId", "position"],
);
// before/after/append are the only accepted positions; anchors are optional.
const schema = z.object(shape);
assert.doesNotThrow(() =>
schema.parse({ pageId: "p1", node: { type: "paragraph" }, position: "append" }),
);
assert.throws(() =>
schema.parse({ pageId: "p1", node: {}, position: "sideways" }),
);
});
test("no-arg specs (getWorkspace/listSpaces/listShares) omit buildShape", () => {
for (const key of ["getWorkspace", "listSpaces", "listShares"]) {
assert.equal(SHARED_TOOL_SPECS[key].buildShape, undefined, `${key} should be no-arg`);