The custom undici RetryAgent + aiFetch transport added for issue #140
did not actually heal mid-stream provider drops: undici's retry path is
a Range-based download-resume that SSE/chat-completions endpoints cannot
satisfy, so a reset after the first byte only swapped ECONNRESET for a
"server does not support the range header" error. Its only real effect
was reconnecting a poisoned keep-alive socket before the first byte, and
PR #141 on top of it turned the 60s headers timeout into deterministic
~61s failures (plus CONTENT_LENGTH_MISMATCH from retrying a POST body
after a timeout abort). The root cause is the z.ai coding endpoint, not
our transport.
Remove the whole layer and return all AI provider calls to Node's
default global fetch.
- delete integrations/ai/ai-http.ts and its spec
- ai.service.ts: drop the aiFetch import, the AI_BYPASS_RESILIENT_FETCH
diagnostic toggle, and fetch:aiFetch from every chat/embedding/STT
factory; raw STT call back to global fetch
- ai-chat.controller.ts: drop the stream-timing START log + startedAt
- ai-chat.service.ts: drop the first-chunk/FINISHED/ERROR timing logs
- .env.example: drop AI_BYPASS_RESILIENT_FETCH
Reverts: 1af5d34a, 7c308728, b7abb7ea, 35fc58ea, d6cd2754, 6efb8656.
Preserved (not part of the rollback): client-disconnect abort, title
generation in onFinish, partial-answer persistence, Safari SSE heartbeat.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extend ai-http.spec with two loopback-server tests: a provider that stalls
without sending headers triggers the (lowered) headersTimeout and is retried on a
fresh connection, recovering; a healthy fast response passes through in one
attempt. No external network calls.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The z.ai GLM coding endpoint intermittently accepts the chat request but never
sends response headers; undici's default 300s headersTimeout then hung the user
for five minutes before failing, and UND_ERR_HEADERS_TIMEOUT was not in the
RetryAgent's retried error set, so there was no recovery.
headersTimeout only bounds time-to-FIRST-headers (before any body) — it is NOT
the streaming budget, so lowering it does not truncate live SSE streams. Cap it
(env AI_HTTP_HEADERS_TIMEOUT_MS, default 60s) so a header stall fails fast, and
add UND_ERR_HEADERS_TIMEOUT to the retried error codes so the stalled request is
retried on a fresh connection (which usually responds in seconds). bodyTimeout
kept generous (env AI_HTTP_BODY_TIMEOUT_MS, default 300s) so slow streams with
sparse chunks survive. UND_ERR_BODY_TIMEOUT is deliberately NOT retried (mid-body,
partial SSE already delivered).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Node's fetch returns a generic "fetch failed" error, hiding the actual
reason (e.g., ECONNRESET, timeout) in the error's cause chain. This
change extracts up to three levels of the cause, formats each with its
code and message, and includes the chain in the warning log, making
failures more actionable.
The streaming chat turn hangs in all browsers while the non-streaming test
endpoint works — both use the same model/transport (createOpenAI + aiFetch),
so the suspect is the streaming path / custom undici RetryAgent transport.
- ai-http.ts: wrap aiFetch with per-request timing logs (start, ms-to-headers
on success, elapsed ms + cause on failure). Chat at info, embeddings at
debug. Only host+path logged.
- ai-chat.controller.ts / ai-chat.service.ts: log turn START, first-chunk
latency, FINISHED duration, and elapsed ms on disconnect/error/abort.
- ai.service.ts: AI_BYPASS_RESILIENT_FETCH=true makes the CHAT model omit
fetch:aiFetch and use the default global fetch — isolates transport vs
request-shape. Chat-only; embeddings/STT untouched; reversible via env.
- .env.example: document the flag.
No timeout/retry change. tsc clean; ai-chat + ai suites pass (292).
Batch of fixes from the automated QA pass on develop. Each was reproduced and
then verified fixed live (browser/curl); logic-bearing fixes have unit tests.
Functional bugs:
- #122 collab-token was capped by the anonymous public-share-AI throttler (5/min);
skip all non-AUTH named throttlers on this auth-guarded, client-cached route.
- #123 editor onAuthenticationFailed threw `jwtDecode(undefined)` and never
reconnected; read the token via a ref, guard the decode (incl. missing exp),
and refetch+reconnect on any auth failure.
- #124 a slash command containing a space ("/Heading 1") inserted literal text;
enable allowSpaces and close the menu when the query matches no items.
- #125 space slug auto-gen produced uppercase initials for multi-word names;
computeSpaceSlug now yields a lowercase alphanumeric slug.
- #126 AI chat window position/size now persisted (atomWithStorage) across reload;
also fixes a latent ResizeObserver-attach bug on first open.
- #127 workspace name update accepted URLs; add @NoUrls (parity with setup).
- #132 icon-columns 4/5 passed calc() into SVG width/height attrs (console spam);
size via style. share-for-page query returns null instead of undefined.
- #134 "Reindex now" counter looked stuck: reindex runs async; the client now
polls coverage (bounded) so the counter climbs live; misleading server comment
reworded.
UX / consistency:
- #128 add success toasts to favorite/label/avatar/member-(de)activate.
- #129 "1 result found" pluralization; hide the single-option Type filter.
- #130 replace raw Zod strings with friendly messages (name/password/group).
- #131 unify "Untitled" casing in tree/breadcrumb/tab; stop force-uppercasing
space-name chips; fix confirm-dialog labels (Cancel / Remove), invite
placeholder typo, Export/Move-to-space labels.
- #133 disable profile Save when clean; toast on unsupported avatar image;
style the invalid-invitation page with a CTA; hide Share for read-only users;
align the dictation "not configured" message; "Go to login page" typo.
Tests: computeSpaceSlug, workspace-name NoUrls DTO, share-query null
normalization, slash getSuggestionItems empty-close.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Outbound LLM calls used Node's default global undici agent (default
keep-alive pooling, no transport-level reconnect), so a TCP RST on a
reused/poisoned keep-alive socket surfaced as
"Cannot connect to API: read ECONNRESET" and failed the chat stream and
title generation after the AI SDK's own retries were exhausted.
Add a dedicated resilient outbound HTTP layer (ai-http.ts): a shared
undici RetryAgent over a tuned Agent, exposed as `aiFetch` and injected
into every AI provider factory (createOpenAI chat/embeddings/STT,
createGoogleGenerativeAI, createOllama) plus the raw JSON STT fetch. The
RetryAgent reconnects on connection-level errors (ECONNRESET, ...) on a
FRESH socket, opts POST into the retry methods (undici's default list
excludes POST), and leaves HTTP-status retries (429/5xx + Retry-After) to
the AI SDK to avoid double-retry.
- ai-http.ts: shared RetryAgent(Agent) + aiFetch (maxRetries 2,
conservative keep-alive, connect timeout, streaming-safe timeouts)
- ai.service.ts: inject fetch: aiFetch into every provider factory
- ai-http.spec.ts: regression test that aiFetch injects the RetryAgent
dispatcher into the underlying fetch
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
reindexWorkspace isolated every per-page failure, so an invalid/missing
API key (401 "User not found") made all pages fail identically while the
batch kept issuing hundreds of doomed requests against the provider.
Add isFatalProviderError() (401/403 auth, 402 billing) and abort the
whole batch on such errors; 429 rate-limit and embedding timeouts stay
per-page isolated. Adds unit tests for the predicate and a regression
test for the abort/iterate control flow.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a per-workspace `sttLanguage` setting (ISO-639-1 hint; empty =
auto-detect) and a searchable language picker in the Voice / STT settings
card. The hint is forwarded to the transcription endpoint:
- multipart path via the AI SDK `providerOptions.openai.language`
- JSON (OpenRouter) path via a top-level `language` body field
only when non-empty, so auto-detect behaves exactly as before.
Threaded through the whole stack: ai.types, update DTO, AiSettingsService
(resolve/getMasked/update), the workspace.repo SQL allowlist, the client
ai-settings service types, and the provider-settings form. Adds en-US
source keys and ru-RU translations.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Provider auth failures were logged with the provider's opaque message only
(e.g. OpenRouter returns "401: User not found." for a bad/missing API key),
which reads like a missing wiki user rather than a credentials problem.
describeProviderError now prepends a clear, human-readable English label for
a small set of well-known HTTP statuses while keeping the original detail
(status + provider message + truncated response-body snippet):
- 401/403 -> authentication failed (invalid or missing API key)
- 402 -> insufficient credits or quota
- 429 -> rate limit exceeded
Other statuses and status-less errors are formatted exactly as before. The
label is a static string and never contains the API key. Benefits every
caller (embedding processor, indexer, AI "Test endpoint" UI) at once.
Tests: switch the plain status+message case to a non-classified status (500);
add 401/403/402/429 cases; keep 502/503 as regression guards for the
unchanged path.
Relocate resolveTrustProxy from main.ts (untestable — bootstraps on import) to
integrations/environment/trust-proxy.util.ts and import it back. Unit-test every
branch (empty/undefined -> safe loopback/private default; true/false; hop count;
trim; CIDR/negative passthrough) so a regression can't silently re-open the XFF
spoofing hole (#61).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Deduplicate the "save a workspace setting" plumbing shared by HtmlEmbedSettings
and TrackerSettings (workspace atom read, isLoading state, updateWorkspace + atom
merge forcing settings[key], success/error notifications) into a new
feature-scoped hook useWorkspaceSetting(key).
- Each component keeps its own interaction model: html-embed is an optimistic
toggle with revert-on-failure; tracker is edit-then-save on an explicit button.
- Unify error handling on the better pattern: surface err.response?.data?.message
and use console.error (html-embed previously used console.log + a generic message).
No user-facing behavior change; client typecheck clean.
Test-coverage follow-ups (untested trackerHead injection in ShareSeoController and
the no-op audit branch) tracked in #100.
isBlocked was checked synchronously but recordFailure ran only AFTER the bcrypt
awaits, so N concurrent /mcp Basic requests for one email all slipped past the
threshold. Add FailedLoginLimiter.tryReserve (atomic synchronous check+increment)
+ release (undo), and reserve all 3 keys BEFORE any await so the (threshold+1)-th
concurrent attempt is rejected before its bcrypt runs. The reservation IS the
failure record (post-await recordFailure removed -> counted exactly once). Non-
credential early throws (missing workspace, SSO/MFA gate) and business errors
release the reservation so they don't burn a victim's budget; success clears.
Tests prove login() runs exactly threshold times under concurrency and that
gate/config rejects don't consume budget.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
McpService.enforceBasicLoginGate re-implements AuthController.login's pre-token
SSO/MFA gate; silent drift would re-open the bypass. Add an AST contract test
(comments stripped) asserting BOTH method bodies contain validateSsoEnforcement,
the EE-MFA require, and checkMfaRequirements — so dropping the gate from either
side fails CI. Test-only (no core/auth refactor).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The 4-step html-embed gate (feature-enabled AND role-allowed -> stripHtmlEmbedNodes)
was replicated across call-sites, pinned only by brittle source-regex tests. Add
stripHtmlEmbedIfNotAllowed(json, {featureEnabled, role, onStrip}) and migrate the
5 plain strip-all sites (collab handler, page create+duplicate, both import paths,
transclusion) to it, each keeping its own feature/role resolve + log via onStrip.
Left the 2 sites with different semantics: persistence.extension (#29 preserve-
admin) and share.service (feature-only kill-switch, no role gate). Real unit tests
replace the regex pins; behavior identical.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
16 suites were disabled via testPathIgnorePatterns due to two root causes: lib0
ESM not transformed (the @hocuspocus/server -> lib0/decoding.js chain) and stock
'should be defined' specs built via Test.createTestingModule without providers.
Add lib0 to transformIgnorePatterns; convert the 14 DI placeholders to direct
new X(...) instantiation with stub deps (keeping a real construct smoke test);
re-enable the suites. Also updates the public-share limiter test to assert the
fail-closed behavior from #62 (surfaced now that the suite runs). Full server
suite: 67 passed, 689 tests.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The shared MCP_TOKEN guard moved from 'Authorization: Bearer <MCP_TOKEN>' to the
X-MCP-Token header (Authorization is now per-user Basic/Bearer), silently breaking
existing /mcp clients. Document it as a Breaking Change in CHANGELOG (reconfigure
to X-MCP-Token). Add a once-per-process migration warning: when MCP_TOKEN is set,
no x-mcp-token is present, and Authorization carries the old 'Bearer <MCP_TOKEN>',
log a hint to migrate — without changing the auth decision (still rejected) or
logging the token value.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Convert the htmlEmbed node from same-origin raw-HTML execution to a sandboxed
iframe (sandbox="allow-scripts allow-popups allow-forms", no allow-same-origin,
srcdoc) with postMessage auto-resize (validated by event.source) and an optional
manual height attr. The block now runs in an opaque origin and cannot reach the
viewer's cookies/session/API, so it is safe for any member.
Because the block is now harmless, remove the entire admin/role gating apparatus:
drop htmlEmbedAllowed/canAuthorHtmlEmbed/stripDisallowedHtmlEmbedNodes/
collectHtmlEmbedSources and every role-based strip on the write paths (collab
REST/MCP + socket, page create/duplicate, import x2, transclusion unsync), along
with the now-unused WorkspaceRepo/UserRepo injections and the PageService.create
callerRole param. Keep one strip: prepareContentForShare still removes htmlEmbed
on the anonymous public-share read path when the workspace master toggle is OFF.
The workspace settings.htmlEmbed toggle is now a plain feature switch (gates the
slash-menu and share rendering); when ON the block is available to all members.
Add settings.trackerHead: an admin-only raw HTML/JS analytics snippet injected
verbatim into the <head> of public share pages only (ShareSeoController), for
trackers that genuinely need same-origin. Admin-gated via the existing CASL
Manage/Settings ability; never injected into the authenticated app shell.
Closes security-review findings #1, #2, #4, #5, #10 (and #3 as a security issue).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Resolve the html-embed.spec.ts conflict as a union: both #46 and #49 (already in
develop) added different test cases to the same file. Keep all of them —
stripHtmlEmbedNodes gets #46's root-node case plus develop's deeply-nested,
non-object and empty-content cases; #46's collectHtmlEmbedSources and
stripDisallowedHtmlEmbedNodes suites and develop's hasHtmlEmbedNode suite all
kept; imports unioned. No production code conflicted.
Full suite green: server 651, client (16 files), editor-ext 56, mcp 247.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
After develop merged, mcp.service.ts calls decideBasicGate from mcp-auth.helpers.
The gate spec mocked the whole module returning only FailedLoginLimiter, so the
merged code crashed with 'decideBasicGate is not a function' (7/7 failing).
Spread jest.requireActual('./mcp-auth.helpers') so the real helpers are kept and
the gate exercises real logic; keep only FailedLoginLimiter stubbed so its
constructor runs without a real sweep timer.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add ~330 tests across server (Jest), client (Vitest), editor-ext (Vitest)
and packages/mcp (node:test) for the gitmost features added since
053a9c0d: AI chat, AI agent roles, public-share assistant, MCP per-user
auth, HTML embed, page templates/embed, realtime tree, tree
expand/collapse, and the AI-settings UI.
Test-tooling fixes (prerequisite, were silently hiding coverage):
- Repair 3 page-template specs broken by the 11-arg TransclusionService
constructor; they never compiled, so template access-control / content
-leak / unsync-strip coverage was fictitious.
- Build @docmost/editor-ext before server tests via a `pretest` hook;
the stale dist omitted the new HtmlEmbed/PageEmbed exports (TS2305).
- Let jest resolve the .tsx email templates: add `tsx` to
moduleFileExtensions and widen the ts-jest transform to (t|j)sx?.
Behaviour-preserving "extract pure core" refactors that the tests drive:
- server: resolveShareAssistantRequest + uiMessageTextLength
(public-share controller), decideBasicGate + mapAuthResultToResponse
(mcp), buildErrorAssistantRecord (ai-chat), jsonbObject export (roles).
- client: render-raw-html + shouldExecute/canEdit, decide-embed-state,
page-embed picker utils, tree-socket reducers, open/close branch maps,
isEndpointConfigured/resolveKeyField; buildTreeWithChildren now treats
a permission-trimmed orphan as a root instead of crashing.
Deferred (need a test DB or HTTP harness, documented in the specs):
repo-level Postgres integration tests and the public-share XFF E2E.
Pre-existing DI/lib0-ESM suite failures are untouched and out of scope.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Post-merge hardening from the #13 security review:
- isInitializeRequestBody now delegates to the SDK isInitializeRequest (same
predicate as packages/mcp/http.ts), so a bare {method:'initialize'} with no
id/params no longer triggers the side-effecting login() (audit-spam /
user_sessions growth) before http.ts 400s it.
- Bind the Bearer path to the instance workspace: verifyBearerAccess rejects a
token whose payload.workspaceId != the instance workspace (resolved via
workspaceRepo.findFirst, consistent with the Basic path); optional param so
it's a no-op when unset.
- Close the user-enumeration timing oracle in verifyUserCredentials: the
missing/disabled branch now runs a bcrypt compare against a module-level dummy
hash whose cost (12) matches production saltRounds, so both paths take one
equal-cost bcrypt compare; the exact CREDENTIALS_MISMATCH_MESSAGE is preserved.
- Document the trusted-proxy requirement for the spoofable per-IP brute-force
limiter in .env.example (trustProxy is on; deploy behind a trusted proxy).
- Add real-execution coverage for enforceBasicLoginGate (SSO enforced / EE-MFA
bundled vs not / user-MFA / workspace-enforced-MFA) instead of stubbing the gate.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The create/duplicate/import gate tests asserted gate presence via brittle
expect(SRC).toMatch(/regex/) over the source text plus a reimplemented
applyGate() stand-in, so a refactor could break the real gate while they still
passed. Rewrite both specs to execute the REAL methods (PageService.create /
duplicatePage; ImportService.importPage; FileImportTaskService.processGenericImport)
with each caller role and assert on the PERSISTED content via hasHtmlEmbedNode:
member -> stripped, admin/owner+toggle ON -> preserved, toggle OFF -> stripped
for everyone, unknown/missing role -> fail-closed. No source-regex assertions
remain.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Anonymous public-share AI assistant:
- Add a workspace setting `publicShareAssistantRoleId` so an admin can pick which
agent role (identity/persona) the anonymous assistant adopts. The role's
instructions REPLACE the built-in persona while the immutable safety framework
is still always appended; the role's optional model override takes precedence
over the cheap publicShareChatModel. Resolved server-authoritatively
(workspace-scoped, soft-delete aware; disabled/missing roles fall back to the
built-in persona, so the tool scope remains the real security boundary).
- Plumb the field through the update DTO, ai-settings service, the workspace.repo
ALLOWED whitelist, resolve()/getMasked(), stream-time role resolution and the
prompt/model, plus the settings UI: a new "Assistant identity" Select listing
enabled roles (and surfacing a saved-but-disabled role explicitly).
Public-share branding / floating icon:
- Fix the AI assistant FAB overlapping the "Powered by ..." button (both were
Affixed bottom-right): stack the FAB above the bottom-right branding.
- Rename "Powered by Docmost" -> "Powered by Gitmost" and point the link at the
gitmost repo.
Tests: extend public-share-chat.spec (role persona replacement still appends the
safety framework, resolveShareRole edge cases, model-override precedence).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- make addTreeNode receivers idempotent (invalidateOnCreatePage guard +
buildTree dedup) so the author's self-echo no longer duplicates the node
- broadcast realtime tree updates for bulk copy/duplicate and import via a
root refetch: PAGE_CREATED now carries spaceId and the WS listener falls
back to refetchRootTreeNodeEvent when no per-node snapshot is present
- remove the now-dead client-relay inbound path (isTreeEvent/handleTreeEvent)
that remained a stale-restriction-cache attack surface
- honest string|null cast for a root move's parent id
- add tests: buildTree dedup; onPageCreated per-node vs refetch branching
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The admin-only raw HTML/JS embed is a deliberate stored-XSS surface, so gate the
whole feature behind a workspace toggle that is OFF by default; it only works
when a workspace admin explicitly enables it.
- settings.htmlEmbed (boolean, default false) + workspace-update field htmlEmbed,
persisted via WorkspaceRepo.updateSetting with an audit diff. Flipping it is
admin-only (same Manage Settings CASL as other workspace toggles).
- New gate htmlEmbedAllowed(featureEnabled, role) = featureEnabled && admin/owner.
All 7 server write paths (create, duplicate, collab onStoreDocument, REST/MCP/AI
updatePageContent, single + zip import, transclusion unsync) now read the
workspace's settings.htmlEmbed and strip unless (toggle ON AND admin). OFF
(default, or a failed/empty workspace lookup) strips htmlEmbed for EVERYONE
including admins -> existing embeds are cleaned up on next save, none persist.
- Client (defense-in-depth): the /html slash item is hidden unless toggle ON +
admin; the NodeView executes nothing and shows a 'disabled in this workspace'
placeholder when OFF; an admin Switch in Workspace Settings -> General with a
description of the behavior.
- docs/html-embed-admin.md documents the toggle + admin-only + fail-closed
coedit (a non-admin save strips an admin's embed) + execution semantics.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Resolve conflicts with the independently-merged ai-agent-roles feature:
- ai-chat.module.ts: keep BOTH AiAgentRolesModule and the public-share
wiring (Share/Search modules, PublicShareChatController, services).
- ai.service.ts: take develop's getChatModel ChatModelOverride superset,
which already covers the public-share model-id-only override.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up fixes on the agent-roles feature:
- ai.service: a cross-driver override to the ollama driver (when the
workspace driver is not ollama) now fails with an explicit 503 instead
of silently reusing the workspace base URL, which belongs to a different
provider. Same-driver ollama and openai/gemini overrides are unchanged.
- migration: add a partial unique index on (workspace_id, name) WHERE
deleted_at IS NULL so role names are unique per workspace without
soft-deleted rows blocking re-creation; map Postgres 23505 to a 409
ConflictException on create/update.
- dto: validate the role id as @IsUUID instead of @IsString.
- roles list: do not expose instructions/modelConfig to non-admin members.
The list endpoint now returns a picker view (id/name/emoji/description/
enabled) to members and the full view only to admins (same gate as the
CRUD endpoints). Client IAiRole fields made optional accordingly.
Adds tests for the cross-driver-ollama throw, the 23505->409 mapping, and
the non-admin picker-view security invariant.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Integrate the already-merged step-limit work from develop. Only conflict was
ai-chat.service.spec.ts: both sides appended a describe block and edited the
import line. Resolved as a union — keep compactToolOutput + the assistantParts/
serializeSteps/rowToUiMessage suites (this branch) AND the prepareAgentStep
suite (develop), importing all symbols from ai-chat.service.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Behaviour change (split out of the test commit per review, and now covered).
Both the stream onError log line and the error text streamed to the client were
formatted by separate inline blocks that only emitted "<status>: <message>".
Route both through the shared describeProviderError() so formatting stays in one
place.
BEHAVIOUR CHANGE: describeProviderError additionally appends a single-line,
300-char-truncated snippet of the provider responseBody/text. So the log line
AND the user-facing stream error now include that snippet (e.g. the HTML error
page from a misconfigured endpoint), which previously neither did. This is
intentional — it makes a misconfigured external endpoint diagnosable — and is
safe: the API key travels in the Authorization header and is never echoed in
the response body (see the util's docstring). A `fallback` param is added so
each call site keeps its own default ('AI stream error' for the stream).
Adds ai-error.util.spec.ts covering the formatter, including the appended /
truncated body snippet, so this behaviour is no longer untested.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Unit tests for the safety-critical paths: crypto secret-box (round-trip,
tamper detection, wrong key), the SSRF guard (blocked ranges + DNS-rebinding),
the ai-chat tools service, the page-embedding repo, and the
assistant-parts/serialization helpers. Those server helpers (assistantParts,
rowToUiMessage, serializeSteps) are exported ONLY for the tests — no runtime
change.
Also: keyboard a11y on the chat history header and conversation rows
(role/tabIndex/Enter+Space), and DRY refactors that move shared logic into one
place (isToolPart -> tool-parts util; buildInitialValues in the MCP form).
The behaviour-changing edits that previously rode along in this commit are
split out into the following two commits, per review.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Make the denser page-tree layout opt-in instead of hardcoded, so row
density can be toggled per deployment via the COMPACT_PAGE_TREE runtime
config flag.
- doc-tree: extract ROW_HEIGHT_STANDARD (32) / ROW_HEIGHT_COMPACT (26);
default the virtualizer row stride to STANDARD density.
- client: isCompactPageTreeEnabled() in lib/config (reads
COMPACT_PAGE_TREE, default true); used by space-tree and shared-tree
to choose the row height.
- server: EnvironmentService.isCompactPageTreeEnabled() and expose
COMPACT_PAGE_TREE through the window runtime config (static.module).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Release-cycle review: POST /pages/template/lookup had only JwtAuthGuard and the
embed depth cap was client-only, so a scripted client could drive heavy
full-content fan-out (access control holds per-id, but a cost/DoS gap). And
page_template_references rows were written for any sourcePageId with no
workspace check at sync time (no leak today since lookup re-checks access, but
the graph could accumulate cross-space rows).
- Apply the standard per-user throttler (PAGE_TEMPLATE_THROTTLER, 30/min) to
/pages/template/lookup and /pages/toggle-template (mirrors ai-chat); auth +
the toggle's validateCanEdit CASL are unchanged.
- syncPageTemplateReferences / insertTemplateReferencesForPages now restrict
inserts to in-workspace source ids (filterInWorkspaceSourceIds, workspace +
not-deleted scoped, trx-aware) and still delete stale out-of-workspace rows
(self-heal). SECURITY comment: the ref table is NOT access-filtered; every
consumer must permission-filter at read time (as lookupTemplate does).
- Tests: lookup access exercises the REAL filterViewerAccessiblePageIds
(no_access / cross-workspace excluded / accessible+comment-stripped / <=50);
toggle controller CASL (cannot-edit -> Forbidden, flag not flipped); ref-sync
excludes cross-workspace and keeps in-workspace.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Release-cycle test audit: the strip boundary was tested only via a stand-in
helper re-implemented in the spec, so a deleted/misplaced guard kept CI green
(the missing create() guard was proof). Replace it with tests against real code:
- persistence.extension.onStoreDocument: real ydoc from a rich doc (columns/
table/mention/htmlEmbed) -> non-admin strip removes only htmlEmbed, every other
node preserved (data-loss guard); admin keeps; empty fragment no-throw.
- collaboration.handler.updatePageContent: real path, user?.role gate, decoded
ydoc embed-free for non-admin, kept for admin.
- transclusion unsync: member stripped, admin preserved.
- editor-ext gains a vitest setup (was zero tests) + a markdown round-trip:
the <!--html-embed:BASE64--> marker -> htmlEmbed node with decoded source, and
hasHtmlEmbedNode matches it — pinning the marked/turndown shape the import
strip relies on. tsconfig now excludes specs from the shipped dist.
- Fail-closed identity: source-pinned contracts that the gate keys on
fileTask.creatorId (zip) / request userId (single) / callerRole (create) /
authUser.role (duplicate), and missing-user -> strip (services can't load under
jest's ESM graph; helpers replay the exact predicate).
Adds the verified-safe ^src/ jest moduleNameMapper (identical fail set).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Release-cycle test audit: the /mcp auth's constant-time token guard, IP keying,
ACCESS-type pinning, and brute-force message coupling were untested. Extract
behavior-preserving pure helpers so they're testable and cover them:
- sharedTokenMatches: length-mismatch early-returns before timingSafeEqual
(which throws on unequal lengths); equal-length uses timingSafeEqual; array
header -> first element; non-string -> false.
- clientIp: req.ip > socket > first XFF hop > 'unknown' (limiter keying).
- bindAccessJwtVerifier: verifyJwt pinned to JwtType.ACCESS (rejects REFRESH).
- CREDENTIALS_MISMATCH_MESSAGE single source of truth shared by
verifyUserCredentials and isCredentialsFailure, so a reworded auth error can't
silently disable the /mcp brute-force counter.
- verifyUserCredentials no-side-effect contract asserted via a TS-AST spec
(AuthService can't load under jest): its body has no createSessionAndToken/
audit/updateLastLogin while login() has all three.
Extractions are behavior-preserving (reviewed); class delegates to the helpers,
dead code + unused imports removed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Release-cycle test audit found the role feature's security-critical paths
untested. Adds real unit tests (against the actual functions):
- resolveRoleForRequest invariants: role comes from chat.roleId not body.roleId
(no per-turn swap), lookup scoped to workspace.id, disabled/soft-deleted role
-> null, new-chat uses body.roleId, stale chatId falls back.
- CASL admin gate: non-admin create/update/delete -> Forbidden and service not
called; admin delegates with workspace.id; list() is member-reachable.
- roleModelOverride: unknown driver dropped (never reaches getChatModel's
throwing default), valid override passes through, blanks ignored.
- getChatModel override success path (cross-driver fetch + decrypt; chatModel-
only reuse), and service update/remove cross-workspace 'not found' guards +
modelConfig tri-state.
Tiny fix: findByCreator badge left-join now also requires enabled=true, so a
disabled role (downgraded to universal by resolveRoleForRequest) no longer shows
a misleading chat-list badge.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Release-cycle review found the /mcp Basic path skipped the controller's
pre-token gates and over-eagerly minted sessions:
- SSO/MFA bypass (blocker): the Basic path called AuthService.login/
verifyUserCredentials directly, but validateSsoEnforcement + the lazy EE MFA
gate live in AuthController.login. Now enforceBasicLoginGate runs in the Basic
branch BEFORE any token is minted: validateSsoEnforcement(workspace) (reject
on enforced SSO) and the same lazy-require MFA check the controller uses
(reject MFA users -> 'use a Bearer access token'). No EE module bundled (this
fork) -> no MFA gate, identical to the controller; a throw from the check
fails closed (no token). Bearer/service-account paths are not gated (those
JWTs are minted post-gate).
- Non-init session mint: isSessionInit is now (no mcp-session-id) AND the body
is a real JSON-RPC initialize (isInitializeRequestBody). A header-less
non-initialize request takes the side-effect-free verifyCredentials path -> no
user_sessions row, no USER_LOGIN audit, no lastLoginAt bump.
- FailedLoginLimiter.sweep() now runs on an unref'd 60s interval, cleared on
module destroy (was never scheduled -> unbounded Map growth under XFF rotation).
- Subsequent (non-init) valid login no longer resets the global per-email brute
bucket (only per-IP / per-IP+email); the email backstop is reset only on a
deliberate init login.
Note: in a hypothetical EE build, checkMfaRequirements is called with no
FastifyReply (we only read requirement flags); a res-dereferencing EE impl would
surface as a clean rejection (fail-closed), not a bypass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds an htmlEmbed block node that renders and executes raw HTML/CSS/JS in the
wiki origin (e.g. an analytics tracker) — the owner-chosen variant C. Because
this is stored-XSS by design, only workspace admins/owners may get such a node
persisted; everyone executes it when reading.
- Node (editor-ext): htmlEmbed atom/isolating block; source stored base64 in
data-source for lossless HTML<->JSON round-trip. renderHTML emits only the
encoded marker (never inlines raw markup), so generateHTML/export/search are
not themselves injection vectors. Registered in BOTH client extensions and
server tiptapExtensions. Markdown round-trip via an <!--html-embed:b64-->
comment (turndown) + a marked rule.
- Client NodeView: injects source and re-creates <script> elements so they
actually run; edit modal; renders in read-only/share too. Slash item is
admin-gated (adminOnly filtered by the user's workspace role).
- SERVER ENFORCEMENT (the real control — UI gating alone is insufficient):
stripHtmlEmbedNodes() removes htmlEmbed from any document persisted by a
non-admin, applied at every write path that introduces content from an
untrusted author: collab onStoreDocument, REST/MCP/AI updatePageContent,
single-file import, zip/multi-file import, page duplication, and transclusion
unsync. Page restore introduces no new content. Public share/readonly viewers
render fetched (already-stripped) content and do NOT open a collab socket, so
the only residual is a transient broadcast window to concurrent authenticated
editors (documented).
Implements docs/arbitrary-html-embed-plan.md (variant C).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Lets an unauthenticated viewer of a published share ask an AI scoped strictly
to that share's page tree. The authenticated agent is untouched; the security
boundary is the tool scope (no identity), and nothing is persisted.
Server:
- workspace toggle settings.ai.publicShareAssistant (default off) +
optional settings.ai.provider.publicShareChatModel (cheap model id; reuses
the chat driver/baseUrl/key). getChatModel(workspaceId, override) substitutes
only the model id, falling back to chatModel.
- POST /api/shares/ai/stream (@Public, SSE). Guardrail funnel, each failing
before streaming: toggle off -> 404; share missing/wrong-workspace/sharing
off -> 404; pageId not in share tree -> 404; provider unconfigured -> 503;
per-IP (5/min) and per-workspace (300/h, IP-independent) rate limits -> 429.
Uniform 404s never confirm a private page's existence.
- forShare read-only in-process toolset: searchSharePages (existing shareId
FTS branch, no spaceId/userId), getSharePage (getShareForPage gate +
share.id check, content via the public sanitizer), listSharePages. No write/
comment/history/cross-space/external-MCP tools.
- Locked share system prompt + immutable safety block; stepCountIs(5).
- /shares/page-info exposes an aiAssistant flag (gated behind isSharingAllowed).
Client: an ephemeral, text-only Ask-AI widget on the public shared page,
shown only when the flag is set; useChat -> /api/shares/ai/stream,
credentials omit. Admin toggle + model field in Settings -> AI.
Also adds a jest moduleNameMapper for src/-rooted imports (fixes pre-existing
unresolvable specs; additive).
Implements docs/public-share-assistant-plan.md.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The embedded MCP server acted as a single service account; now each /mcp
session authenticates as the current user, so tools run under that user's
CASL and edits attribute to them.
- HTTP Basic (chosen path): Authorization: Basic email:password, validated
server-side via AuthService; the session carries the issued user JWT (not
the raw password). Password may contain ':' (split on first only).
- Bearer fallback: Authorization: Bearer <access JWT>, verified as ACCESS and
additionally checked for an active session + non-disabled user (matching
JwtStrategy), so revoked/disabled users are rejected.
- Service account stays as an optional fallback (no creds + env configured).
- packages/mcp createMcpHttpHandler accepts a per-request config resolver
(back-compat: static config / stdio unchanged); identity is bound to the
mcp-session-id at init and re-validated from the caller's own credentials on
every request (anti session-fixation: a guessed session id can't be reused
without matching creds).
- A full login (session + audit) happens only once at session init; later
requests re-verify credentials via a new non-side-effecting
AuthService.verifyUserCredentials (no session/audit spam).
- Failed-login limiter (5/60s, keyed per-IP, per-IP+email, and per-email so IP
rotation can't brute one account) since direct login bypasses the controller
throttler. Only real credential failures count.
- MCP_TOKEN shared guard moved off Authorization to an X-MCP-Token header
(timing-safe compare); credsConfigured 503 gate replaced by a clear 401.
- No secrets logged; all auth resolved before res.hijack() so failures return
clean 401 JSON. .env.example marks the service account optional.
Implements docs/backlog/mcp-per-user-auth.md (variant L).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>