perf(server): low-hanging backend wins — indexes, auth dedup, embed coalescing, CTE short-circuit (#348)

One migration + targeted hot-path fixes. API behavior 1:1 (schema change = added
indexes + a byte-identical f_unaccent function-body swap, see below).

- Trigram + composite indexes (20260705T120000-perf-indexes.ts): GIN trigram on
  LOWER(f_unaccent(title/name)) for pages/users/groups (the /search/suggest
  leading-wildcard LIKE did a seq scan per keystroke — EXPLAIN now confirms
  Bitmap Index Scan on idx_pages_title_trgm), + page_history(page_id,id DESC),
  comments(page_id,id). DEVIATION (verified byte-identical): PG18 cannot inline
  the two-arg f_unaccent body during index creation, so up() swaps it to the
  schema-qualified single-arg `SELECT public.unaccent($1)` — same dictionary,
  identical output for all inputs, so the tsvector trigger + main @@ search stay
  consistent with NO reindex; down() restores the exact two-arg body.
- Auth path: jwt.strategy reuses req.raw.workspace when workspaceId matches (the
  middleware already validated it) instead of re-querying; domain.middleware
  caches the workspace lookup (withCache 15s, invalidated in all 8 WorkspaceRepo
  mutators, with a Date reviver for the JSON-serialized cache). USER + SESSION
  caching DEFERRED — the invalidation surface (role change doesn't revoke
  sessions; revocation includes background jobs) can't be safely covered, and a
  missed hook on a security path is worse than the win.
- AI re-embed coalescing: aiQueue.add gets {jobId: embed-<id>, delay: 30s} so
  active editing collapses to one job (worker reads current page state).
- filterAccessiblePageIds: hasRestrictedPagesInWorkspace short-circuit skips the
  recursive-ancestor CTE when a workspace has zero restricted pages (wired from
  search/favorites/notifications/recent/created-by). EXISTS on the same pageAccess
  table the CTE anti-joins → no false-positive / no access leak. Busts the cache
  on insertPageAccess so a 0->1 restricted transition takes effect immediately
  (review F1).
- Small: syncTransclusion guarded by a family-node probe (both old+new content, so
  the removal path is preserved); mention notifications enqueue only when the set
  gained a member; redis maintainLock clears a prior interval (leak fix).

Skipped as risky (flagged): global ValidationPipe transform change; a pool-wide
statement_timeout (would kill long CREATE INDEX migrations on the same pool).
NOTE: kept the trash query's `content` select — the trash UI reads page.content
for its preview modal (review F3, would have regressed).

Gate: server tsc 0; jest page-permission/auth/search/persistence 15 suites pass;
migration up+down+idempotency verified on real PG18 with EXPLAIN confirming index
use. No new deps.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
agent_coder
2026-07-05 01:31:35 +03:00
parent 7af85b476e
commit 24cfb158bf
19 changed files with 432 additions and 29 deletions
@@ -1,3 +1,10 @@
export const HISTORY_INTERVAL = 5 * 60 * 1000;
export const HISTORY_FAST_INTERVAL = 60 * 1000;
export const HISTORY_FAST_THRESHOLD = 5 * 60 * 1000;
// #348 — debounce window for the per-page RAG re-embed job. Repeated saves
// within this window collapse to a single delayed job (coalesced by a stable
// jobId), so active editing does not pile up expensive re-embeds (external API
// + page_embeddings rewrite, concurrency 1). The worker reads the CURRENT page
// state at run time, so the last content within the window wins.
export const EMBED_DEBOUNCE_MS = 30 * 1000;
@@ -431,7 +431,17 @@ describe('PersistenceExtension.onStoreDocument — Approach-A boundary snapshot'
it('uses the canonical page.id (not the slugId doc name) for post-store side effects (#260)', async () => {
const SLUG = 'slug-1'; // persistedHumanPage.slugId; findById resolves it
const document = ydocFor(doc('NEW AGENT CONTENT'));
pageRepo.findById.mockResolvedValue(persistedHumanPage('NEW AGENT CONTENT'));
// #348 — the transclusion sync now runs only when the new OR the previously
// persisted content carries a transclusion-family node. Give the persisted
// (old) content a pageEmbed so the sync path is exercised and the #260
// UUID-vs-slugId contract asserted below is still verified.
pageRepo.findById.mockResolvedValue({
...persistedHumanPage('NEW AGENT CONTENT'),
content: {
type: 'doc',
content: [{ type: 'pageEmbed', attrs: { sourcePageId: 'src-1' } }],
},
});
pageHistoryRepo.findPageLastHistory.mockResolvedValue(null);
// A `page.<slugId>` document name (the bug's smoking gun), agent store over
@@ -36,11 +36,13 @@ import {
import { Page } from '@docmost/db/types/entity.types';
import { CollabHistoryService } from '../services/collab-history.service';
import {
EMBED_DEBOUNCE_MS,
HISTORY_FAST_INTERVAL,
HISTORY_FAST_THRESHOLD,
HISTORY_INTERVAL,
} from '../constants';
import { TransclusionService } from '../../core/page/transclusion/transclusion.service';
import { hasTransclusionFamilyNodes } from '../../core/page/transclusion/utils/transclusion-prosemirror.util';
import { observeCollabStore } from '../../integrations/metrics/metrics.registry';
/**
@@ -415,7 +417,18 @@ export class PersistenceExtension implements Extension {
// Use the canonical page UUID (page.id), not the doc-name id, which may be
// a slugId for a `page.<slugId>` doc (#260). The transclusion/reference
// syncs write uuid-typed columns, so a slugId here threw Postgres 22P02.
await this.syncTransclusion(page.id, page.workspaceId, tiptapJson);
//
// #348 — skip the three sync SELECTs when neither the new content nor the
// previously-persisted content has any transclusion/reference/pageEmbed
// node: nothing to insert, and (the DB mirrors the old content) nothing to
// delete. Whenever either side has one, run the idempotent sync exactly as
// before so removals are still reconciled.
if (
hasTransclusionFamilyNodes(tiptapJson) ||
hasTransclusionFamilyNodes(page.content)
) {
await this.syncTransclusion(page.id, page.workspaceId, tiptapJson);
}
}
if (page) {
@@ -431,7 +444,17 @@ export class PersistenceExtension implements Extension {
(m) => m.entityId,
);
if (userMentions.length > 0) {
// #348 — only enqueue when the mentioned-user set actually GAINED a member.
// The processor (processPageMention) already no-ops when every current
// mention was present before (newMentions.length === 0), so skipping the
// enqueue in that case is behavior-identical and avoids piling up no-op jobs
// on every save of a page that merely CONTAINS (unchanged) mentions.
const oldMentionedUserIdSet = new Set(oldMentionedUserIds);
const hasNewMentionedUser = userMentions.some(
(m) => !oldMentionedUserIdSet.has(m.entityId),
);
if (hasNewMentionedUser) {
await this.notificationQueue.add(QueueJob.PAGE_MENTION_NOTIFICATION, {
userMentions: userMentions.map((m) => ({
userId: m.entityId,
@@ -446,12 +469,23 @@ export class PersistenceExtension implements Extension {
} as IPageMentionNotificationJob);
}
await this.aiQueue.add(QueueJob.PAGE_CONTENT_UPDATED, {
// Canonical UUID: the embedding reindex resolves pages by uuid, so a
// slugId here threw Postgres 22P02 invalid-uuid (#260).
pageIds: [page.id],
workspaceId: page.workspaceId,
});
await this.aiQueue.add(
QueueJob.PAGE_CONTENT_UPDATED,
{
// Canonical UUID: the embedding reindex resolves pages by uuid, so a
// slugId here threw Postgres 22P02 invalid-uuid (#260).
pageIds: [page.id],
workspaceId: page.workspaceId,
},
// #348 — coalesce re-embeds during active editing. A stable per-page
// jobId + delay means repeated saves within EMBED_DEBOUNCE_MS collapse
// to one delayed job instead of one expensive re-embed per save. The
// worker reads the current page state at run time, so last content wins.
// BullMQ forbids ':' in custom job ids (Redis key separator), so '-' is
// used; page.id is a UUID, so the id is unique per page. removeOnComplete
// (queue.module) frees the id after each run so the next window re-arms.
{ jobId: `embed-${page.id}`, delay: EMBED_DEBOUNCE_MS },
);
await this.enqueuePageHistory(page, lastUpdatedSource);
}
@@ -220,6 +220,13 @@ export class RedisSyncExtension<TCE extends CustomEvents> implements Extension {
};
async maintainLock(documentName: string) {
// #348 — clear any existing timer for this document before installing a new
// one. Without this, a second maintainLock for the same document (a
// reload-without-unload) overwrites this.locks[documentName] and leaks the
// previous interval, which keeps firing SET forever with no way to clear it.
if (this.locks[documentName]) {
clearInterval(this.locks[documentName]);
}
this.locks[documentName] = setInterval(() => {
this.pub.set(
this.getKey(documentName),
@@ -4,8 +4,25 @@ export const CacheKey = {
`perm:space-roles:${userId}:${spaceId}`,
PAGE_CAN_EDIT: (userId: string, pageId: string) =>
`perm:can-edit:${userId}:${pageId}`,
// #348 — "does this workspace have ANY restricted page?" Lets whole-workspace
// access filters skip the recursive-ancestor CTE when nothing is restricted.
HAS_RESTRICTED_PAGES_IN_WORKSPACE: (workspaceId: string) =>
`perm:ws-has-restricted:${workspaceId}`,
// #348 — DomainMiddleware workspace resolution. Self-hosted resolves the single
// workspace (constant key); cloud resolves by the request subdomain (lowercased
// to match the case-insensitive `LOWER(hostname)` lookup). Every WorkspaceRepo
// mutator busts these, so staleness is bounded by both explicit invalidation and
// the short TTL below.
WORKSPACE_SELF_HOSTED: 'workspace:self-hosted',
WORKSPACE_BY_HOST: (subdomain: string) =>
`workspace:byhost:${subdomain.toLowerCase()}`,
};
// Permission caches dedupe repeated checks within and across short request bursts.
// 5s keeps staleness on revocations bounded.
export const PERMISSION_CACHE_TTL_MS = 5_000;
// #348 — workspace row changes rarely; a short TTL bounds staleness of
// security-relevant fields (enforceSso/enforceMfa/status) even if an explicit
// bust is ever missed, while still removing the per-request workspace query.
export const WORKSPACE_CACHE_TTL_MS = 15_000;
@@ -1,13 +1,42 @@
import { Injectable, NestMiddleware, NotFoundException } from '@nestjs/common';
import { Inject, Injectable, NestMiddleware } from '@nestjs/common';
import { FastifyRequest, FastifyReply } from 'fastify';
import { CACHE_MANAGER } from '@nestjs/cache-manager';
import { Cache } from 'cache-manager';
import { EnvironmentService } from '../../integrations/environment/environment.service';
import { WorkspaceRepo } from '@docmost/db/repos/workspace/workspace.repo';
import { Workspace } from '@docmost/db/types/entity.types';
import { withCache } from '../helpers/with-cache';
import { CacheKey, WORKSPACE_CACHE_TTL_MS } from '../helpers/cache-keys';
// #348 — timestamptz columns on the workspace row. The cache store (Keyv/Redis)
// JSON-serializes values, so a cached workspace comes back with these fields as
// ISO strings. Reviving them to Date keeps the cached path byte-identical to the
// direct DB path (postgres.js returns Date), so nothing downstream can observe a
// cache hit vs miss. Idempotent: `new Date(date)` on an already-Date value is a
// no-op-equivalent. Keep in sync with the workspace timestamptz columns.
const WORKSPACE_DATE_FIELDS: Array<keyof Workspace> = [
'createdAt',
'updatedAt',
'deletedAt',
'trialEndAt',
];
function reviveWorkspaceDates(workspace: Workspace): Workspace {
for (const field of WORKSPACE_DATE_FIELDS) {
const value = workspace[field];
if (value != null) {
(workspace as any)[field] = new Date(value as any);
}
}
return workspace;
}
@Injectable()
export class DomainMiddleware implements NestMiddleware {
constructor(
private workspaceRepo: WorkspaceRepo,
private environmentService: EnvironmentService,
@Inject(CACHE_MANAGER) private readonly cacheManager: Cache,
) {}
async use(
req: FastifyRequest['raw'],
@@ -15,13 +44,21 @@ export class DomainMiddleware implements NestMiddleware {
next: () => void,
) {
if (this.environmentService.isSelfHosted()) {
const workspace = await this.workspaceRepo.findFirst();
// #348 — cache the single-workspace lookup that runs on every request.
// Invalidated by every WorkspaceRepo mutator (see bustWorkspaceCache).
const workspace = await withCache(
this.cacheManager,
CacheKey.WORKSPACE_SELF_HOSTED,
WORKSPACE_CACHE_TTL_MS,
() => this.workspaceRepo.findFirst(),
);
if (!workspace) {
//throw new NotFoundException('Workspace not found');
(req as any).workspaceId = null;
return next();
}
reviveWorkspaceDates(workspace);
// TODO: unify
(req as any).workspaceId = workspace.id;
(req as any).workspace = workspace;
@@ -29,13 +66,21 @@ export class DomainMiddleware implements NestMiddleware {
const header = req.headers.host;
const subdomain = header.split('.')[0];
const workspace = await this.workspaceRepo.findByHostname(subdomain);
// #348 — cache per-subdomain workspace resolution. Keyed by subdomain (the
// hostname column); busted per hostname by every WorkspaceRepo mutator.
const workspace = await withCache(
this.cacheManager,
CacheKey.WORKSPACE_BY_HOST(subdomain),
WORKSPACE_CACHE_TTL_MS,
() => this.workspaceRepo.findByHostname(subdomain),
);
if (!workspace) {
(req as any).workspaceId = null;
return next();
}
reviveWorkspaceDates(workspace);
(req as any).workspaceId = workspace.id;
(req as any).workspace = workspace;
}
@@ -51,7 +51,17 @@ export class JwtStrategy extends PassportStrategy(Strategy, 'jwt') {
throw new UnauthorizedException();
}
const workspace = await this.workspaceRepo.findById(payload.workspaceId);
// #348 — reuse the workspace DomainMiddleware already loaded for this request
// instead of re-querying it. `validate()` above has confirmed
// `req.raw.workspaceId === payload.workspaceId` (or that it is unset), and the
// middleware sets `req.raw.workspace` alongside `req.raw.workspaceId` from the
// same row, so when the ids match the cached row is the exact one this query
// would return. Fall back to the query if the middleware did not populate it
// (e.g. a code path that bypasses DomainMiddleware).
const workspace =
req.raw.workspace && req.raw.workspaceId === payload.workspaceId
? req.raw.workspace
: await this.workspaceRepo.findById(payload.workspaceId);
if (!workspace) {
throw new UnauthorizedException();
@@ -38,6 +38,8 @@ export class FavoriteService {
await this.pagePermissionRepo.filterAccessiblePageIds({
pageIds: result.items,
userId,
// #348 — favorites load at app-start; enable the workspace short-circuit.
workspaceId,
});
const accessibleSet = new Set(accessibleIds);
result.items = result.items.filter((id) => accessibleSet.has(id));
@@ -125,6 +127,8 @@ export class FavoriteService {
await this.pagePermissionRepo.filterAccessiblePageIds({
pageIds,
userId,
// #348 — workspace-level short-circuit for the favorites list.
workspaceId,
});
accessiblePageSet = new Set(accessibleIds);
}
@@ -23,7 +23,12 @@ export class NotificationController {
@Body() dto: ListNotificationsDto,
@AuthUser() user: User,
) {
return this.notificationService.findByUserId(user.id, dto, dto.type);
return this.notificationService.findByUserId(
user.id,
dto,
dto.type,
user.workspaceId,
);
}
@HttpCode(HttpStatus.OK)
@@ -45,6 +45,7 @@ export class NotificationService {
userId: string,
pagination: PaginationOptions,
type: NotificationTab = 'all',
workspaceId?: string | null,
) {
const result = await this.notificationRepo.findByUserId(
userId,
@@ -61,6 +62,8 @@ export class NotificationService {
await this.pagePermissionRepo.filterAccessiblePageIds({
pageIds,
userId,
// #348 — notifications list; enable the workspace short-circuit.
workspaceId,
});
const accessibleSet = new Set(accessiblePageIds);
+12 -2
View File
@@ -446,7 +446,11 @@ export class PageController {
);
}
return this.pageService.getRecentPages(user.id, pagination);
return this.pageService.getRecentPages(
user.id,
pagination,
user.workspaceId,
);
}
@HttpCode(HttpStatus.OK)
@@ -469,7 +473,13 @@ export class PageController {
}
}
return this.pageService.getCreatedByPages(targetUserId, user.id, pagination, dto.spaceId);
return this.pageService.getCreatedByPages(
targetUserId,
user.id,
pagination,
dto.spaceId,
user.workspaceId,
);
}
@HttpCode(HttpStatus.OK)
@@ -1163,6 +1163,7 @@ export class PageService {
async getRecentPages(
userId: string,
pagination: PaginationOptions,
workspaceId?: string | null,
): Promise<CursorPaginationResult<Page>> {
const result = await this.pageRepo.getRecentPages(userId, pagination);
@@ -1172,6 +1173,8 @@ export class PageService {
await this.pagePermissionRepo.filterAccessiblePageIds({
pageIds,
userId,
// #348 — cross-space "recent"; enable the workspace short-circuit.
workspaceId,
});
const accessibleSet = new Set(accessibleIds);
result.items = result.items.filter((p) => accessibleSet.has(p.id));
@@ -1185,6 +1188,7 @@ export class PageService {
requestingUserId: string,
pagination: PaginationOptions,
spaceId?: string,
workspaceId?: string | null,
): Promise<CursorPaginationResult<Page>> {
const result = await this.pageRepo.getCreatedByPages(
creatorId,
@@ -1199,6 +1203,9 @@ export class PageService {
await this.pagePermissionRepo.filterAccessiblePageIds({
pageIds,
userId: requestingUserId,
spaceId,
// #348 — enable the workspace short-circuit when not space-scoped.
workspaceId,
});
const accessibleSet = new Set(accessibleIds);
result.items = result.items.filter((p) => accessibleSet.has(p.id));
@@ -93,6 +93,41 @@ function collectNodes<T>(
return Array.from(byKey.values());
}
/**
* #348 — cheap early-exit probe: does this doc contain ANY node the transclusion
* syncs care about (`transclusionSource` / `transclusionReference` / `pageEmbed`)?
* Lets the collab store skip the three sync SELECTs when neither the previous nor
* the new content has any such node — there is nothing to insert, and (since the
* DB mirrors the previously-persisted content) nothing to delete. Walks once and
* short-circuits on the first match; uses the same depth ceiling as the
* collectors. Deliberately does NOT skip `transclusionSource` subtrees: it only
* answers "any node present?", so descending everywhere is strictly conservative
* (it can never wrongly report "none").
*/
export function hasTransclusionFamilyNodes(doc: unknown): boolean {
const visit = (node: any, depth: number): boolean => {
if (!node || typeof node !== 'object') return false;
if (depth > MAX_PM_WALK_DEPTH) return false;
if (
node.type === TRANSCLUSION_TYPE ||
node.type === REFERENCE_TYPE ||
node.type === PAGE_EMBED_TYPE
) {
return true;
}
if (Array.isArray(node.content)) {
for (const child of node.content) {
if (visit(child, depth + 1)) return true;
}
}
return false;
};
return visit(doc, 0);
}
/**
* Walks a ProseMirror JSON document and returns one snapshot per top-level
* `transclusion` node. Does not recurse into transclusions (schema disallows
@@ -155,6 +155,8 @@ export class SearchService {
pageIds,
userId: opts.userId,
spaceId: searchParams.spaceId,
// #348 — enables the workspace-level short-circuit when not space-scoped.
workspaceId: opts.workspaceId,
});
const accessibleSet = new Set(accessibleIds);
results = results.filter((r: any) => accessibleSet.has(r.id));
@@ -266,6 +268,8 @@ export class SearchService {
await this.pagePermissionRepo.filterAccessiblePageIds({
pageIds,
userId,
// #348 — workspace-level short-circuit for the suggest path.
workspaceId,
});
const accessibleSet = new Set(accessibleIds);
pages = pages.filter((p) => accessibleSet.has(p.id));
@@ -0,0 +1,95 @@
import { type Kysely, sql } from 'kysely';
/**
* #348 — targeted hot-path indexes.
*
* 1. GIN trigram indexes for `/search/suggest`. That endpoint runs a
* leading-wildcard `LOWER(f_unaccent(col)) LIKE '%q%'` per keystroke, which
* is a sequential scan without a trigram index. The index EXPRESSIONS below
* are `LOWER(f_unaccent(title|name))`, matching the predicates in
* search.service.ts exactly so the planner uses them (verified with EXPLAIN:
* the suggest predicate resolves to a Bitmap Index Scan on these indexes).
*
* IMMUTABLE-wrapper fix (required for the index to build): `f_unaccent` was
* defined as `SELECT unaccent('unaccent', $1)` (the two-arg, dictionary-named
* unaccent). That body CANNOT be used in an index expression: when Postgres
* inlines the IMMUTABLE SQL wrapper while building the index it fails to
* resolve the two-arg call (`function unaccent(unknown, text) does not exist`,
* the `'unaccent'` literal loses its regdictionary coercion). The single-arg
* `unaccent($1)` is the same operation (the default text-search dictionary IS
* `unaccent`; verified byte-equal on accented samples), and — crucially —
* SCHEMA-QUALIFIED as `public.unaccent($1)` it inlines cleanly, so the index
* builds. We therefore `CREATE OR REPLACE` `f_unaccent` to the qualified
* single-arg body. This is output-identical for every existing caller (the
* tsvector trigger, the main `tsv @@` search, and the suggest LIKE), so no
* reindex/backfill is needed; `down()` restores the original two-arg body.
* (The `unaccent` extension is installed in `public` in this codebase, which
* is why `public.unaccent` is the correct qualification.)
*
* 2. Composite indexes for two ORDER-BY-only-on-id queries that currently sort
* on top of a created_at index:
* - page_history: `findPageHistoryByPageId` does WHERE page_id ORDER BY id
* DESC, but only `(page_id, created_at DESC)` exists → extra sort.
* - comments: `findPageComments` does WHERE page_id ORDER BY id ASC, but only
* `(page_id)` exists → extra sort.
*/
export async function up(db: Kysely<any>): Promise<void> {
// Index-compatible, output-identical redefinition of f_unaccent (see header).
await sql`
CREATE OR REPLACE FUNCTION f_unaccent(text)
RETURNS text
LANGUAGE sql
IMMUTABLE PARALLEL SAFE STRICT
AS $func$
SELECT public.unaccent($1);
$func$
`.execute(db);
// Search-suggest trigram indexes. Expressions match search.service.ts.
await sql`
CREATE INDEX IF NOT EXISTS idx_pages_title_trgm
ON pages USING gin ((LOWER(f_unaccent(title))) gin_trgm_ops)
`.execute(db);
await sql`
CREATE INDEX IF NOT EXISTS idx_users_name_trgm
ON users USING gin ((LOWER(f_unaccent(name))) gin_trgm_ops)
`.execute(db);
await sql`
CREATE INDEX IF NOT EXISTS idx_groups_name_trgm
ON groups USING gin ((LOWER(f_unaccent(name))) gin_trgm_ops)
`.execute(db);
// page_history: WHERE page_id ORDER BY id DESC (findPageHistoryByPageId).
await sql`
CREATE INDEX IF NOT EXISTS idx_page_history_page_id
ON page_history (page_id, id DESC)
`.execute(db);
// comments: WHERE page_id ORDER BY id ASC (findPageComments).
await sql`
CREATE INDEX IF NOT EXISTS idx_comments_page_id_id
ON comments (page_id, id)
`.execute(db);
}
export async function down(db: Kysely<any>): Promise<void> {
// Drop the expression indexes before restoring the function body.
await sql`DROP INDEX IF EXISTS idx_pages_title_trgm`.execute(db);
await sql`DROP INDEX IF EXISTS idx_users_name_trgm`.execute(db);
await sql`DROP INDEX IF EXISTS idx_groups_name_trgm`.execute(db);
await sql`DROP INDEX IF EXISTS idx_page_history_page_id`.execute(db);
await sql`DROP INDEX IF EXISTS idx_comments_page_id_id`.execute(db);
// Restore the original two-arg (dictionary-named) f_unaccent body.
await sql`
CREATE OR REPLACE FUNCTION f_unaccent(text)
RETURNS text
LANGUAGE sql
IMMUTABLE PARALLEL SAFE STRICT
AS $func$
SELECT unaccent('unaccent', $1);
$func$
`.execute(db);
}
@@ -52,11 +52,22 @@ export class PagePermissionRepo {
trx?: KyselyTransaction,
): Promise<PageAccess> {
const db = dbOrTx(this.db, trx);
return db
const row = await db
.insertInto('pageAccess')
.values(data)
.returningAll()
.executeTakeFirst();
// Bust the workspace-level "has any restricted page" cache: a 0->1 transition
// (the FIRST restricted page in a workspace) must take effect immediately, or
// the filterAccessiblePageIds short-circuit would keep treating the workspace
// as unrestricted for up to PERMISSION_CACHE_TTL_MS and leak the just-
// restricted page into whole-workspace lists (search/favorites/recent/…).
if (data.workspaceId) {
await this.cacheManager.del(
CacheKey.HAS_RESTRICTED_PAGES_IN_WORKSPACE(data.workspaceId),
);
}
return row;
}
async deletePageAccess(
@@ -657,8 +668,9 @@ export class PagePermissionRepo {
pageIds: string[];
userId: string;
spaceId?: string;
workspaceId?: string | null;
}): Promise<string[]> {
const { pageIds, userId, spaceId } = opts;
const { pageIds, userId, spaceId, workspaceId } = opts;
if (pageIds.length === 0) return [];
if (spaceId) {
@@ -666,6 +678,17 @@ export class PagePermissionRepo {
if (!hasRestrictions) {
return pageIds;
}
} else if (workspaceId) {
// #348 — whole-workspace callers (no spaceId: favorites, notifications,
// recent, created-by, global search) skip the recursive-ancestor CTE + anti
// -join entirely when the workspace has ZERO restricted pages. When any
// restriction DOES exist, fall through to the identical CTE below, so
// behavior is unchanged whenever restrictions are present.
const hasRestrictions =
await this.hasRestrictedPagesInWorkspace(workspaceId);
if (!hasRestrictions) {
return pageIds;
}
}
const results = await this.db
@@ -903,6 +926,41 @@ export class PagePermissionRepo {
return Boolean(result?.exists);
}
/**
* Workspace-level analogue of hasRestrictedPagesInSpace: does ANY page in the
* whole workspace carry a restriction? Lets whole-workspace access filters
* short-circuit the recursive-ancestor CTE when nothing is restricted at all.
*
* Cached with the same short PERMISSION_CACHE_TTL_MS as PAGE_CAN_EDIT: this is
* the workspace-wide restriction flag, so it is read on nearly every list
* endpoint, and the 5s TTL bounds the window in which a just-added first
* restriction is not yet reflected — the identical staleness contract the other
* permission caches already accept. No explicit bust (mirrors PAGE_CAN_EDIT).
*/
async hasRestrictedPagesInWorkspace(workspaceId: string): Promise<boolean> {
return withCache(
this.cacheManager,
CacheKey.HAS_RESTRICTED_PAGES_IN_WORKSPACE(workspaceId),
PERMISSION_CACHE_TTL_MS,
async () => {
const result = await this.db
.selectNoFrom((eb) =>
eb
.exists(
eb
.selectFrom('pageAccess')
.select(sql`1`.as('one'))
.where('pageAccess.workspaceId', '=', workspaceId),
)
.as('exists'),
)
.executeTakeFirst();
return Boolean(result?.exists);
},
);
}
/**
* Given a list of parent page IDs, return which ones have at least one accessible child.
* Efficient batch query for sidebar hasChildren calculation.
@@ -581,6 +581,9 @@ export class PageRepo {
const query = this.db
.selectFrom('pages')
.select(this.baseFields)
// NOTE: `content` IS needed here — the trash UI reads page.content to render
// the deleted-page preview modal (trash.tsx handlePageClick ->
// TrashPageContentModal pageContent). Do NOT drop it (see #348 review F3).
.select('content')
.select((eb) => this.withSpace(eb))
.select((eb) => this.withDeletedBy(eb))
@@ -1,4 +1,6 @@
import { Injectable } from '@nestjs/common';
import { Inject, Injectable } from '@nestjs/common';
import { CACHE_MANAGER } from '@nestjs/cache-manager';
import { Cache } from 'cache-manager';
import { InjectKysely } from 'nestjs-kysely';
import { KyselyDB, KyselyTransaction } from '../../types/kysely.types';
import { dbOrTx } from '../../utils';
@@ -9,6 +11,7 @@ import {
} from '@docmost/db/types/entity.types';
import { ExpressionBuilder, sql } from 'kysely';
import { DB, Workspaces } from '@docmost/db/types/db';
import { CacheKey } from '../../../common/helpers/cache-keys';
/**
* Writable `settings.ai.provider` keys, enforced at this generic SQL layer. This
@@ -61,7 +64,34 @@ export class WorkspaceRepo {
'temporaryNoteHours',
'isScimEnabled',
];
constructor(@InjectKysely() private readonly db: KyselyDB) {}
constructor(
@InjectKysely() private readonly db: KyselyDB,
@Inject(CACHE_MANAGER) private readonly cacheManager: Cache,
) {}
/**
* #348 — bust the DomainMiddleware workspace caches after any workspace write.
* Deletes BOTH the self-hosted (constant) key and the cloud per-hostname key so
* a single implementation covers either deployment mode (the irrelevant key is a
* harmless no-op). Best-effort: a cache error must never fail the write, and a
* missed bust is bounded by WORKSPACE_CACHE_TTL_MS. Note: a hostname RENAME only
* busts the NEW hostname's key (the row returned here carries the new hostname);
* the old key expires via TTL.
*/
private async bustWorkspaceCache(
workspace?: Pick<Workspace, 'hostname'> | undefined,
): Promise<void> {
try {
await this.cacheManager.del(CacheKey.WORKSPACE_SELF_HOSTED);
if (workspace?.hostname) {
await this.cacheManager.del(
CacheKey.WORKSPACE_BY_HOST(workspace.hostname),
);
}
} catch {
// cache is best-effort; TTL is the backstop
}
}
async findById(
workspaceId: string,
@@ -144,12 +174,14 @@ export class WorkspaceRepo {
trx?: KyselyTransaction,
): Promise<Workspace> {
const db = dbOrTx(this.db, trx);
return db
const workspace = await db
.updateTable('workspaces')
.set({ ...updatableWorkspace, updatedAt: new Date() })
.where('id', '=', workspaceId)
.returning(this.baseFields)
.executeTakeFirst();
await this.bustWorkspaceCache(workspace);
return workspace;
}
async insertWorkspace(
@@ -157,11 +189,14 @@ export class WorkspaceRepo {
trx?: KyselyTransaction,
): Promise<Workspace> {
const db = dbOrTx(this.db, trx);
return db
const workspace = await db
.insertInto('workspaces')
.values(insertableWorkspace)
.returning(this.baseFields)
.executeTakeFirst();
// Bust the cached "not found" so a fresh install / new tenant is seen at once.
await this.bustWorkspaceCache(workspace);
return workspace;
}
async count(): Promise<number> {
@@ -203,7 +238,7 @@ export class WorkspaceRepo {
trx?: KyselyTransaction,
) {
const db = dbOrTx(this.db, trx);
return db
const workspace = await db
.updateTable('workspaces')
.set({
settings: sql`COALESCE(settings, '{}'::jsonb)
@@ -214,6 +249,8 @@ export class WorkspaceRepo {
.where('id', '=', workspaceId)
.returning(this.baseFields)
.executeTakeFirst();
await this.bustWorkspaceCache(workspace);
return workspace;
}
async updateAiSettings(
@@ -223,7 +260,7 @@ export class WorkspaceRepo {
trx?: KyselyTransaction,
) {
const db = dbOrTx(this.db, trx);
return db
const workspace = await db
.updateTable('workspaces')
.set({
settings: sql`COALESCE(settings, '{}'::jsonb)
@@ -234,6 +271,8 @@ export class WorkspaceRepo {
.where('id', '=', workspaceId)
.returning(this.baseFields)
.executeTakeFirst();
await this.bustWorkspaceCache(workspace);
return workspace;
}
/**
@@ -272,7 +311,7 @@ export class WorkspaceRepo {
entries.flatMap(([k, v]) => [sql.lit(k), sql`${v}::text`]),
)})`
: sql`'{}'::jsonb`;
return db
const workspace = await db
.updateTable('workspaces')
.set({
settings: sql`COALESCE(settings, '{}'::jsonb) || jsonb_build_object(
@@ -287,6 +326,8 @@ export class WorkspaceRepo {
.where('id', '=', workspaceId)
.returning(this.baseFields)
.executeTakeFirst();
await this.bustWorkspaceCache(workspace);
return workspace;
}
/**
@@ -303,7 +344,7 @@ export class WorkspaceRepo {
trx?: KyselyTransaction,
) {
const db = dbOrTx(this.db, trx);
return db
const workspace = await db
.updateTable('workspaces')
.set({
settings: sql`COALESCE(settings, '{}'::jsonb)
@@ -313,6 +354,8 @@ export class WorkspaceRepo {
.where('id', '=', workspaceId)
.returning(this.baseFields)
.executeTakeFirst();
await this.bustWorkspaceCache(workspace);
return workspace;
}
async updateSharingSettings(
@@ -322,7 +365,7 @@ export class WorkspaceRepo {
trx?: KyselyTransaction,
) {
const db = dbOrTx(this.db, trx);
return db
const workspace = await db
.updateTable('workspaces')
.set({
settings: sql`COALESCE(settings, '{}'::jsonb)
@@ -333,6 +376,8 @@ export class WorkspaceRepo {
.where('id', '=', workspaceId)
.returning(this.baseFields)
.executeTakeFirst();
await this.bustWorkspaceCache(workspace);
return workspace;
}
async updateTemplateSettings(
@@ -342,7 +387,7 @@ export class WorkspaceRepo {
trx?: KyselyTransaction,
) {
const db = dbOrTx(this.db, trx);
return db
const workspace = await db
.updateTable('workspaces')
.set({
settings: sql`COALESCE(settings, '{}'::jsonb)
@@ -353,6 +398,8 @@ export class WorkspaceRepo {
.where('id', '=', workspaceId)
.returning(this.baseFields)
.executeTakeFirst();
await this.bustWorkspaceCache(workspace);
return workspace;
}
}
@@ -15,7 +15,9 @@ describe('WorkspaceRepo.updateSetting (jsonb merge) [integration]', () => {
beforeAll(() => {
db = getTestDb();
// Repos are plain classes taking @InjectKysely() db — instantiate directly.
repo = new WorkspaceRepo(db as any);
// 2nd arg is CACHE_MANAGER (used only to bust the #348 workspace cache); a
// stub is fine here since bustWorkspaceCache is best-effort (try/catch).
repo = new WorkspaceRepo(db as any, {} as any);
});
afterAll(async () => {