fix(ai-chat): live streaming, open-page context, any-dimension embeddings

- streaming: give useChat a STABLE store id (chatId ?? per-mount generated)
so the v6 hook stops re-creating its store every render on a new chat
(which wiped the optimistic user message + streamed deltas, so nothing
showed until the turn finished). Also send X-Accel-Buffering:no + flushHeaders.
- context: client sends the currently-open page {id,title}; the system prompt
tells the agent which page 'this page' refers to (it reads it via its
CASL-scoped getPage tool; id is prompt-context only, no server-side fetch).
- embeddings: make page_embeddings.embedding dimension-agnostic (drop the
HNSW index + ALTER to vector), remove the hard 1536 guard, filter search by
model_dimensions — so 3072-dim (and any) models index instead of being
skipped. Seq-scan <=> search (wiki scale); existing pages reindex on next edit.
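
The streaming fix in the first bullet comes down to resolving the store id once per mount instead of on every render. A minimal sketch of that idea outside React follows. `createStoreIdResolver` and its `generate` parameter are hypothetical names used for illustration, not the commit's actual code; in a component the remembered value would more likely live in a ref or a lazy state initializer.

```typescript
// Hypothetical helper (not from the commit): remembers ONE generated id for
// the lifetime of the resolver, mimicking a per-mount value in a component.
function createStoreIdResolver(generate: () => string) {
  let generated: string | undefined;

  return (chatId?: string | null): string => {
    // A persisted chat already has a stable id; use it as-is.
    if (chatId != null) return chatId;
    // New chat: generate lazily, then reuse the same id on every later call.
    if (generated === undefined) generated = generate();
    return generated;
  };
}

// A counter-based generator makes the stability easy to observe.
let counter = 0;
const resolveStoreId = createStoreIdResolver(() => `new-chat-${++counter}`);

const firstRender = resolveStoreId(undefined);
const secondRender = resolveStoreId(undefined);
const savedChat = resolveStoreId("chat-42");
```

Both calls without a `chatId` yield the same id, so a store keyed on it would survive re-renders; generating a fresh id inline on each render is the failure mode the commit message describes.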
@@ -0,0 +1,67 @@
+import { type Kysely, sql } from 'kysely';
+
+/**
+ * Make `page_embeddings.embedding` dimension-agnostic.
+ *
+ * The original column was `vector(1536)` — a FIXED dimension. On deployments
+ * whose embedding model emits a different dimension (e.g. OpenAI
+ * `text-embedding-3-large` = 3072, Gemini `text-embedding-004` = 768) every
+ * vector failed the indexer's dimension guard and every page was SKIPPED, so
+ * RAG / semanticSearch was never populated.
+ *
+ * pgvector's bare `vector` type (no `(N)`) accepts vectors of ANY dimension,
+ * so this migration drops the fixed dimension. The dimension is still recorded
+ * PER ROW in `model_dimensions`, and search filters on it so the `<=>` cosine
+ * operator only ever compares same-dimension vectors (pgvector errors on a
+ * dimension mismatch — possible when rows from a previous model linger).
+ *
+ * TRADE-OFF: an HNSW / ivfflat ANN index REQUIRES a fixed dimension, so a
+ * dimension-agnostic column cannot carry one. We therefore DROP the HNSW index
+ * and rely on a sequential scan with `<=>`. That is fine at wiki scale; if a
+ * single embedding dimension is ever pinned per deployment, an HNSW index can
+ * be re-added in a follow-up migration.
+ */
+export async function up(db: Kysely<any>): Promise<void> {
+  // The HNSW ANN index requires a fixed dimension; drop it before relaxing the
+  // column type. Index name mirrors 20260617T120000-page-embeddings.ts.
+  await sql`DROP INDEX IF EXISTS idx_page_embeddings_embedding_hnsw`.execute(db);
+
+  // Drop the (1536) dimension constraint so the column accepts any dimension.
+  // The identity cast `embedding::vector` is safe for existing 1536-dim rows;
+  // on the affected live stand the table is empty (everything was skipped), so
+  // there is no data risk.
+  await sql`
+    ALTER TABLE page_embeddings
+    ALTER COLUMN embedding TYPE vector USING embedding::vector
+  `.execute(db);
+
+  // Btree index supporting the scoped + dimension-filtered seq-scan search
+  // (workspace_id + space_id IN (...) + model_dimensions = queryDim).
+  await db.schema
+    .createIndex('idx_page_embeddings_ws_space_dim')
+    .ifNotExists()
+    .on('page_embeddings')
+    .columns(['workspace_id', 'space_id', 'model_dimensions'])
+    .execute();
+}
+
+export async function down(db: Kysely<any>): Promise<void> {
+  // Best-effort rollback. The `::vector(1536)` cast only succeeds if EVERY row
+  // is already 1536-dim — acceptable for a dev rollback (the up migration is
+  // the intended steady state). On non-1536 data this will (correctly) error.
+  await db.schema
+    .dropIndex('idx_page_embeddings_ws_space_dim')
+    .ifExists()
+    .execute();
+
+  await sql`
+    ALTER TABLE page_embeddings
+    ALTER COLUMN embedding TYPE vector(1536) USING embedding::vector(1536)
+  `.execute(db);
+
+  await sql`
+    CREATE INDEX IF NOT EXISTS idx_page_embeddings_embedding_hnsw
+    ON page_embeddings
+    USING hnsw (embedding vector_cosine_ops)
+  `.execute(db);
+}
@@ -9,11 +9,17 @@ import { dbOrTx } from '../../utils';
  * Repository for `page_embeddings` — the pgvector store backing the AI agent's
  * semantic search (§5.5 / §6.7 stage D).
  *
- * The `embedding` column is `vector(1536)`, which is NOT a native Kysely column
+ * The `embedding` column is a dimension-agnostic pgvector `vector` (no fixed
+ * `(N)`, see migration 20260617T140000), which is NOT a native Kysely column
  * type, so every read/write of a vector is serialized with the `pgvector` npm
  * helper (`pgvector.toSql(number[])` → a `'[1,2,3]'` text literal) and cast back
  * to `vector` via a raw `::vector` SQL cast. Reindex is a HARD delete + insert
- * (see `deleteByPage`) so the HNSW ANN index never returns stale vectors.
+ * (see `deleteByPage`) so search never returns stale vectors.
+ *
+ * TRADE-OFF: a dimension-agnostic column cannot carry an HNSW/ivfflat ANN index
+ * (those require a fixed dimension), so `searchByEmbedding` is a sequential scan
+ * with the `<=>` cosine operator. Fine at wiki scale; re-add an HNSW index if a
+ * single embedding dimension is ever pinned per deployment.
  */
 
 /** A single chunk row to persist for a page (page-body embeddings). */
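
The repository comment in the hunk above describes vectors crossing the driver boundary as a `'[1,2,3]'` text literal that SQL then casts with `::vector`. A rough sketch of that text format follows, assuming nothing beyond what the comment states. `toVectorLiteral` and `parseVectorLiteral` are hypothetical stand-ins written for illustration, not the `pgvector` package's implementation.

```typescript
// Hypothetical stand-in for the serialization step described in the comment.
function toVectorLiteral(values: number[]): string {
  for (const v of values) {
    // Guard against NaN/Infinity, which have no meaningful element form.
    if (!Number.isFinite(v)) {
      throw new Error(`non-finite vector element: ${v}`);
    }
  }
  return `[${values.join(",")}]`;
}

// Inverse direction, e.g. for reading a vector returned as text.
function parseVectorLiteral(literal: string): number[] {
  const inner = literal.trim().replace(/^\[/, "").replace(/\]$/, "");
  return inner === "" ? [] : inner.split(",").map(Number);
}

const literal = toVectorLiteral([1, 2, 3]);
const roundTripped = parseVectorLiteral("[0.5,-1,2]");
```

Note that the literal carries no dimension of its own, which is why the per-row `model_dimensions` value matters once the column stops enforcing one.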
@@ -66,8 +72,8 @@ export class PageEmbeddingRepo {
 
   /**
    * Bulk-insert chunk rows for a page. The `embedding` value is serialized with
-   * `pgvector.toSql` and cast to `vector` so Postgres stores it in the fixed
-   * `vector(1536)` column. No-op on an empty array.
+   * `pgvector.toSql` and cast to `vector` so Postgres stores it in the
+   * dimension-agnostic `vector` column (any dimension). No-op on an empty array.
    */
   async insertChunks(
     rows: PageEmbeddingChunkRow[],
@@ -97,10 +103,17 @@ export class PageEmbeddingRepo {
   }
 
   /**
-   * Cosine ANN search over the embeddings, scoped to a workspace AND a set of
+   * Cosine search over the embeddings, scoped to a workspace AND a set of
    * spaces the caller may read (see semanticSearch access-scoping). Orders by
    * `embedding <=> $query` (cosine distance) and joins the page title cheaply.
    * Returns [] when `spaceIds` is empty (no accessible spaces => no results).
+   *
+   * Because the column is dimension-agnostic (no ANN index), this is a seq scan
+   * with `<=>`. The query MUST only be compared against same-dimension rows —
+   * pgvector raises on a dimension mismatch, which can happen when rows from a
+   * previously configured embedding model still linger. We therefore filter by
+   * `model_dimensions = queryEmbedding.length` so the `<=>` operands always
+   * agree on dimension.
    */
   async searchByEmbedding(
     workspaceId: string,
@@ -112,6 +125,8 @@ export class PageEmbeddingRepo {
 
     // Serialized + cast query vector reused for the distance expression.
     const queryVector = sql`${pgvector.toSql(queryEmbedding)}::vector`;
+    // Compare only against rows produced by a model of the SAME dimension.
+    const queryDim = queryEmbedding.length;
 
     const rows = await this.db
       .selectFrom('pageEmbeddings as pe')
@@ -125,6 +140,9 @@ export class PageEmbeddingRepo {
       ])
       .where('pe.workspaceId', '=', workspaceId)
       .where('pe.spaceId', 'in', spaceIds)
+      // Same-dimension only: avoids a pgvector dimension-mismatch error against
+      // rows from a previously configured embedding model.
+      .where('pe.modelDimensions', '=', queryDim)
       // Exclude chunks whose page is in the trash (defence in depth).
       .where('p.deletedAt', 'is', null)
       .orderBy('distance', 'asc')
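
The search path in this diff amounts to two steps: keep only rows whose stored dimension equals the query's, then order by cosine distance. The in-memory sketch below mirrors that logic for illustration only. `ChunkRow`, `cosineDistance`, and `searchSameDimension` are hypothetical names; the real query runs in Postgres via `<=>` together with the workspace, space, and trash filters shown in the diff.

```typescript
// Hypothetical row shape, loosely modelled on the columns named in the diff.
interface ChunkRow {
  pageId: string;
  modelDimensions: number;
  embedding: number[];
}

// Cosine distance = 1 - cosine similarity.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    // The failure mode the dimension filter is there to prevent.
    throw new Error(`different vector dimensions ${a.length} and ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return 1 - dot / Math.sqrt(normA * normB);
}

// Sequential scan: filter to same-dimension rows, then sort by distance.
function searchSameDimension(rows: ChunkRow[], query: number[], limit: number) {
  return rows
    .filter((row) => row.modelDimensions === query.length)
    .map((row) => ({
      pageId: row.pageId,
      distance: cosineDistance(row.embedding, query),
    }))
    .sort((left, right) => left.distance - right.distance)
    .slice(0, limit);
}

const rows: ChunkRow[] = [
  { pageId: "b", modelDimensions: 2, embedding: [0, 1] },
  { pageId: "c", modelDimensions: 3, embedding: [1, 0, 0] }, // older model
  { pageId: "a", modelDimensions: 2, embedding: [1, 0] },
];
const results = searchSameDimension(rows, [1, 0], 10);
```

Without the filter, the leftover 3-dimensional row would make the distance computation throw, which is the in-memory analogue of the mismatch error the repository comment warns about.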