fix(ai): address reindex-progress review (PR #242)

- Delete the now-orphaned PageRepo.getIdsByWorkspace (its only caller,
  reindexWorkspace, switched to getEmbeddablePageIds). Its docstring still
  claimed "Used by the RAG bulk reindex"; re-grep confirmed zero callers.
- ai-settings.service.reindex(): if aiQueue.add() throws (Redis hiccup/
  shutdown) the worker never runs so its finally->clear() never fires,
  leaving the seeded progress record stuck for the full 1h TTL (button
  stuck "reindexing: 0 of N"). Roll back the seed THIS call wrote
  (seeded flag, only when get() was null) before re-throwing, so a
  concurrent active run's record is never wiped. Add tests for both the
  clear-on-throw and the don't-clear-a-concurrent-run paths.
- Add an integration spec (real Postgres) proving getEmbeddablePageIds'
  WHERE stays in lockstep with countEmbeddablePages: seeds every boundary
  case and asserts the returned id set equals the count.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
claude_code
2026-06-28 04:39:18 +03:00
parent 95d07d8d6f
commit bf09eec4e1
4 changed files with 180 additions and 23 deletions
@@ -264,20 +264,6 @@ export class PageRepo {
return Number(row?.count ?? 0);
}
/**
* IDs of all non-deleted pages in a workspace. Used by the RAG bulk reindex to
* (re)build embeddings for every existing page.
*/
async getIdsByWorkspace(workspaceId: string): Promise<string[]> {
const rows = await this.db
.selectFrom('pages')
.select('id')
.where('workspaceId', '=', workspaceId)
.where('deletedAt', 'is', null)
.execute();
return rows.map((r) => r.id);
}
/**
* IDs of the EMBEDDABLE page set for a workspace — the exact same set that
* `countEmbeddablePages` counts (a page qualifies if it has non-empty