gitmost

Author	SHA1	Message	Date
claude code agent 227	5ca4cc4657	test(git-sync): add missing DTO/User imports for the rebased git-sync provenance spec block The rebase folded develop's agent-provenance PageService spec and the git-sync provenance spec into one file; the appended git-sync block needs CreatePageDto / UpdatePageDto / User imports that develop's spec (which used inline `as any`) did not have. Server tsc + the suite (158 tests, both provenance blocks) green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	3d9c508011	fix(git-sync): git-http stream error handlers + close test gaps (#119 review) Addresses the stability + test-coverage warnings from the #119 review: - git-http-backend.service.ts: add `'error'` handlers to child.stdout/stderr. An EventEmitter 'error' with no listener (e.g. EPIPE when the client aborts mid-response) is rethrown by Node as an uncaught exception and crashes the process; now swallowed + logged (never echoed to the client). - TEST INFRA: a jest setupFile shims `navigator`/`MessageChannel` for the `node` testEnvironment. react-dom@18 reads `navigator` at module-init (pulled in via @docmost/editor-ext -> @tiptap/react), so every spec transitively importing the conversion engine — including git-http.service.spec.ts — previously FAILED TO LOAD ("navigator is not defined") and ran ZERO tests. With the shim those specs now run (git-sync integration: 11 suites / 133 tests green). - git-http.service.spec.ts: cover the 503 lock-held push path — `ingestExternalPush` rejecting `GitSyncLockHeldError` -> 503 + Retry-After + "git-sync busy, retry", no double header write (+ the already-headers-sent no-rewrite path). - git-http-backend.service.spec.ts: unit-test run() — child 'error'/'close' before headers -> 500; normal CGI parse+stream; stdout/stderr 'error' (EPIPE) swallowed; synchronous spawn throw -> 500. - page-change.listener.ts: implement OnModuleDestroy to clearTimeout all pending debounce timers on shutdown (+ test). - .env.example: vaults are non-bare working repos, not "bare repos". (Docs deleted by the stray commit were restored in 9cdbce54.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	1ee18e3ed7	test(git-sync): e2e suites provision a throwaway space — never touch real data The shell e2e suites defaulted to the General space and created/edited pages there, polluting real content (and, when several enabled spaces raised poll contention, flaking on 503s). Now each suite creates its OWN throwaway, git-sync-enabled space at setup, runs everything against it, and deletes the space (+ its vault) on exit. Set SPACE_ID explicitly to opt into an existing space. Also gives the basic suite the 503-retry push helper the advanced one already had. Verified isolated: basic 12/12, advanced 23/23, no spaces/users/pages left behind, the real space untouched. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	3e0b0aa7c0	fix(git-sync): never trash a page whose pageId still exists in the tree (cross-cycle move) + browser e2e Follow-up to `4376c5a6`, found by a real BROWSER e2e (the flow the in-diff fix missed). When the layout reshuffle's two halves land in SEPARATE sync cycles, the later cycle's diff has only the DELETE of the old path — the matching add was already pushed — so in-diff D+A coalescing can't see it, and the live page was still trashed. Robust fix on the identity invariant the reviewer (and the user) called out: a page EXISTS iff its pageId is in the vault, regardless of filename. runPush now collects the pageIds present at ANY path in the current `main` tree and passes them to computePushActions; a deleted file whose pageId is still tracked elsewhere is a MOVE, never a deletion. (Built only when the diff has deletes.) Adds apps/server/test/git-sync-browser-e2e.cjs — a Playwright test that drives the REAL Docmost web UI: log in, create several untitled pages, type a title, sync, assert NOTHING is trashed. Reproduced the data loss before this fix; 5/5 green and stable after. Engine suite 600 green (+2 computePushActions cases: pageId-still-present -> skip; pageId-gone -> real delete). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	5d0d5e7af4	test(git-sync): e2e guard for the untitled-page + retitle data-loss reshuffle Reproduces the browser bug at the API level: create several untitled pages (all collapse to the `_` fallback name), retitle one, sync — assert NO page is trashed and all survive. Caught the data-loss bug fixed in `4376c5a6`. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	57b9ced95f	test(git-sync): basic e2e operates on a dedicated page + cleans up (no real-page pollution) The push / 3-way-merge cases edited the FIRST real `.md` in the vault, leaving `E2E-PUSH-` / `E2E-MERGE-` marker headings accumulating in a real page, and the Docmost->git case left its created page in the Trash. Now the suite creates a dedicated `E2E-SyncTarget-` page and targets only that, and a teardown hard-deletes every `E2E-` fixture page and converges the vault on exit — so runs never mutate real content and leave the stand clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	a18302cdb4	test(git-sync): add advanced e2e suite — authz, protocol hardening, concurrency, data-loss guard Output of a generate→critique subagent pass on "what the feature's tests do NOT cover", implemented + verified against the live stand (20/20). Complements the basic two-way suite. Covers: - protocol shape: unknown service subpath -> 400; unknown content-type -> 415 (global allowlist); PUT/DELETE on pack endpoints -> 400; - path-traversal: `..%2f..`, `%2e%2e%2f`, bare `.git` space-id -> 400/404, no escape, never a file leak; - authz boundaries: a gitSync-DISABLED space -> 404 (existence hidden) and flips to 200 when enabled; a READER member can fetch (200) but is FORBIDDEN to push (403); a NON-member of an enabled space gets 403 (NOT 404 — the critic caught a wrong generator assumption here; pinned as a contract); - concurrency: a push while the per-space Redis lock is held -> 503 + Retry-After, and the receive-pack does NOT mutate the vault; - idempotency: repeated no-op cycles never churn `main` / `refs/docmost/last-pushed`; - data-loss guard (PR #119): deleting MORE than GIT_SYNC_MAX_DELETES_PER_CYCLE is HELD — none trashed AND last-pushed does not advance past the delete commit (retry-safe, not silently dropped). Auto-creates/tears down its fixtures (reader/non-member users, a 2nd space) and resets the vault cache on exit so re-runs and the basic suite stay green. Needs the vault dir + Redis container reachable (see header). A structural rename/move case was intentionally left to the engine unit suite (git rename-similarity on meta-only fixture pages is a fixture artifact, not a feature bug). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	61aad27fce	chore(git-sync): drop now-unused dirname import (PR #119 review) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	f24c8e20d5	test(git-sync): add a live two-way smart-HTTP e2e suite A runnable end-to-end suite that drives a LIVE git-sync stand over the real /git remote — the integration counterpart to the unit tests. 10 checks across the full feature: - the auth/authz gate: no creds -> 401, wrong password -> 401, unknown space -> 404 (existence never revealed), valid creds on a sync space -> 200; - fetch: git clone over HTTP returns the vault markdown; - push: a git-side edit propagates into the Docmost page; - Docmost -> git: a page created via the API materializes as a vault file; - delete: `git rm` + push soft-deletes the Docmost page (Trash); - 3-way merge: a new git edit is added without clobbering prior page content. Parameterized via env (SERVER/SPACE_ID/EMAIL/PASSWORD/DB_CONTAINER) and isolates its own test page. It boots nothing — see the header for the stand prerequisites (GIT_SYNC_ENABLED + a per-space gitSync flag + a service user). This is the suite that caught the smart-HTTP PATH_INFO 404 bug. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	b0fc49cf9d	refactor(git-sync): move the PULL->PUSH cycle into the engine as runCycle (PR #119 review, arch #1) The reconcile choreography (ensureRepo -> merge-check -> ensureBranch -> checkout('docmost') -> pull -> push) was hand-rolled in the app orchestrator's driveCycle, duplicating an order the vendored engine owns and could drift from on upgrade — the failure mode is data clobber. Lift it into @docmost/git-sync as a single entry point, `runCycle(deps)`. The orchestrator now calls runCycle and keeps only the lock (its caller) and the gitmost-specific delete-cap POLICY, injected as the `resolveApplyClient` hook (the engine does the dry-run, hands the hook the planned delete count — Infinity if planning failed — and uses whatever client it returns for the apply). driveCycle drops from ~150 lines to ~30. Tests: - engine test/cycle.test.ts: composition (merge-in-progress short-circuit; ensureRepo->ensureBranch->checkout staging order before the pull; the cap hook is consulted with the planned count; no dry-run when no hook). - engine test/cycle-roundtrip.test.ts: runCycle against a REAL VaultGit in a temp repo with a faked Docmost client — a git-originated CREATE flows pull->push and the assigned pageId is written back; an unresolved merge short-circuits before any client call. - orchestrator spec rewired to mock runCycle and assert the wiring + the resolveApplyClient cap policy (the engine-internal cycle-order/merge tests moved to the engine). Validated end to end on a live stand (real Postgres/Redis + server): a git clone -> edit -> push over the /git remote round-trips the change into the Docmost page through the refactored cycle. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	3b334d9624	fix(git-sync): drop the .git suffix from git http-backend PATH_INFO (smart-HTTP 404) The /git smart-HTTP host 404'd EVERY fetch and push: PATH_INFO was built as `/<spaceId>.git/<subpath>`, so `git http-backend` resolved the repo at `<GIT_PROJECT_ROOT>/<spaceId>.git` — which does not exist. The vault is a NON-bare working repo (the engine needs a working tree) at `<dataDir>/<spaceId>`, so the CGI repo path must be `<spaceId>` (git http-backend serves the `.git` inside). The URL's conventional `.git` suffix is already stripped to `spaceId` by parseGitPath; re-appending it for PATH_INFO was the bug. Found by standing up a full e2e stand (real Postgres/Redis + server + a real git clone/push over the /git remote): clone and push both 404'd until this fix, after which a clone → edit → push round-trips the change all the way into the Docmost page. Also extracts the CGI-env construction into a pure, exported `buildGitBackendCgiEnv` and adds unit tests (the env build was previously untested — the gap this bug hid in): a regression guard pinning PATH_INFO to `/<spaceId>/<subpath>` (no `.git`), plus method/query/content-type/remote-user forwarding and the conditional GIT_PROTOCOL. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	71a96581ca	test(git-sync): cover ingestExternalPush in the orchestrator spec (PR #119 review) Closes the test-coverage warning that the smart-HTTP push ingest path was unexercised. Adds 5 cases: receive-pack streams BEFORE the Docmost cycle; a held lock throws GitSyncLockHeldError and runs neither the receive-pack nor the cycle; a post-push cycle error is swallowed (the push is durable, poll retries) while the lock is still released; a missing service user runs the receive-pack but skips the immediate cycle; and a globally-disabled git-sync refuses without touching the lock. (The 503/Retry-After mapping in git-http.service is the sibling warning; its spec is in the repo's pre-existing set of jest suites that can't load locally via the react-dom/tiptap transform chain, so that case is left for CI.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	306d88c685	refactor(git-sync): extract SpaceLockService from the orchestrator (PR #119 review, arch #2) The per-space single-writer lock — Redis CAS leader lock (SET NX PX, DEL-CAS and PEXPIRE-CAS Lua), the in-process mutex, the per-process instanceId and the heartbeat — lived inline in GitSyncOrchestrator. Extract it into a dedicated @Injectable() SpaceLockService exposing one narrow surface, withSpaceLock(spaceId, fn), so the lock is the orchestrator's only Redis-lock touch-point and is testable in isolation. The orchestrator now injects SpaceLockService and both consumers (runOnce, ingestExternalPush) go through spaceLock.withSpaceLock — behavior unchanged (same sentinel returns, same 503-on-lock-held contract). Orchestrator drops 591→472 lines. Adds space-lock.service.spec.ts asserting the lock SEMANTICS against a fake Redis (the test-coverage warning from the review): the SET NX/PX args, the DEL-CAS and PEXPIRE-CAS Lua + ARGV[1]=instanceId, plus the lock-held / in-progress / throw-still-releases paths. The orchestrator spec is unchanged in count and stays green (it now builds the real SpaceLockService over its mock Redis). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	0318a148dc	docs(git-sync): remove dangling references to the deleted git-sync-plan doc (PR #119 review) The implementation spec docs/git-sync-plan.md was removed as completed, but ~44 code comments still cited it as "plan §N". Strip those citations (comments only), keeping each comment grammatical. The vendored engine's own "SPEC §N" references point at a different, still-present spec and are left untouched. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	f923accc3d	refactor(git-sync): drop dead DebounceEntry.workspaceId field (PR #119 review) The debounce map value carried `workspaceId`, but the scheduled cycle closes over the `workspaceId` argument directly — the field was written and never read. Replace the entry struct with `Map<string, NodeJS.Timeout>` (the timer handle is all the map tracks). No behavior change. (page-change.listener.spec is in the repo's pre-existing set of jest suites that can't load locally via the react-dom/tiptap transform chain — unaffected by this change; tsc clean.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	a0e1cde063	refactor(git-sync): extract shared buildLcsTable for the two block diffs (PR #119 review) The two-way block diff (yjs-body-merge.diffBlocks) and the three-way merge planner (three-way-merge.lcsPairs) built the identical backward-filled LCS DP table inline. Extract it to lcs.ts (buildLcsTable); each caller keeps its own traceback. No behavior change — merge specs unchanged and green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	259d4ca6fa	fix(git-sync): hold refs on suppressed deletes + stamp delete/restore provenance (PR #119 review) Two stability warnings from the #119 review: 1. delete-cap no longer drops deletions forever. When planned deletes exceed GIT_SYNC_MAX_DELETES_PER_CYCLE the apply client's deletePage now THROWS instead of resolving to a no-op. A throw is recorded by the engine as a per-page failure, so `refs/docmost/last-pushed` is NOT advanced past the commit that dropped the files — the next cycle re-diffs from the un-advanced ref and re-plans the same deletes (a transient over-cap is retried, not silently dropped and then recreated by the next pull). Previously a resolving no-op let the engine count `deleted++` with no failure, advance the ref, and never replay the deletions. 2. git-sync soft-delete and restore now stamp provenance. deletePage routes GIT_SYNC_PROVENANCE through pageService.removePage, and restorePage stamps lastUpdatedSource='git-sync' on the restore update — so the page-change listener's loop-guard (skip when lastUpdatedSource==='git-sync') recognizes both as its own writes instead of scheduling a wasted echo cycle. Done via a backward-compatible optional `lastUpdatedSource` param on pageRepo.removePage/restorePage (omitted for ordinary user deletes/restores). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	7ed33d8127	docs(git-sync): document GIT_SYNC_* env vars; fix stale/non-English comments (PR #119 review) Addresses the documentation/convention warnings from the #119 review: - .env.example: add the GIT-SYNC block (9 GIT_SYNC_* vars with defaults), noting GIT_SYNC_SERVICE_USER_ID is required when sync is enabled. - yjs-body-merge.ts: translate the Russian review note in the docstring to English (comments-only-in-English rule). - persistence.extension.ts: correct the stale "git-sync writes are full-body replaces" rationale — a git-sync write is now a block-level merge into the live doc, which is why it is debounced like a human edit rather than snapshotted. - history-item.tsx: the GitSyncBadge version is created on the PUSH path (writing the git body back into the doc), not by the pull — fix the comment. - edit-space-form.tsx: log the raw error in the git-sync toggle catch instead of swallowing it (AGENTS.md). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	c5b05aacaf	chore(mcp): stop committing build/ and node_modules; build in CI/Docker Same hygiene fix as git-sync (review #2), applied to packages/mcp which had the identical pre-existing problem: committed build/ (20 files) + node_modules (28, pnpm symlinks with a baked /home/claude store path). - git rm --cached packages/mcp/{build,node_modules}. - .gitignore: add packages/mcp/build/ (packages/*/node_modules/ already covers it). - Build where consumed: apps/server `pretest` and the CI Test workflow now build @docmost/mcp too. The Dockerfile builder already runs `pnpm build` (nx builds mcp) and already COPYs packages/mcp/build into the runtime image. Verified: wiped build/, rebuilt via `pnpm --filter @docmost/mcp build`; the mcp server suites (96 tests) pass against the freshly-built, non-committed output. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:20 +03:00
claude code agent 227	f90f3e272a	feat(git-sync): three-way body merge using the last-synced base (no edit loss) Upgrades the 2-way body merge to a real diff3 three-way merge (review #5), so a block ONLY the human changed is KEPT when git changed a DIFFERENT block — the 2-way merge would revert it to git's stale version. Engine: the push update loop reads the last-synced pre-image (`git.showFileAtRef(refs/docmost/last-pushed, path)`) and passes it as the optional `baseMarkdown` to `client.importPageMarkdown` (the common ancestor). Server: gitmost-datasource converts base+incoming, and writeBody runs a block-level diff3 (new three-way-merge.ts `diff3Plan`): live-only change -> keep live, git-only change -> take git, both-changed -> git wins (conflict policy), inserts/deletes from either side preserved. Without a base (createPage) it falls back to the 2-way merge. Crash-safety unchanged (docs built before the connection opens). Tests: three-way-merge.spec.ts (14 — every diff3 case incl. the cross-block preservation and conflict policy), yjs-body-merge 3-way (real Y.Docs: human's block instance preserved while git's block is applied), plus an engine test that the base is forwarded from showFileAtRef. Existing push assertions updated for the new base arg. git-sync 589 pass; server merge/datasource/gate 62 pass; typecheck clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:04 +03:00
claude code agent 227	3bba9425f4	fix(git-sync): merge git body into the live doc block-by-block (no clobber) Supersedes the active-session "defer" guard with a real merge (review #5 — "do the write via a merge", not skip-while-editing). writeBody no longer does delete-all + re-insert (which discarded a concurrent editor's in-flight changes on every sync). It now diffs the live body against the incoming git body at TOP-LEVEL BLOCK granularity (LCS over a canonical structural serialization) and applies only the minimal inserts/deletes: - a block a human is editing is left UNTOUCHED when git changed a DIFFERENT block; - an unchanged resync is a complete 0-op write; - Yjs CRDT-merges the minimal ops with concurrent edits. New yjs-body-merge.ts (mergeXmlFragments + cloneXmlNode + diffBlocks) is pure-Yjs and unit-tested with real Y.Docs (8 tests): identical->0 ops, edit-one-block keeps the other block instances, append/delete keep neighbours, marks survive the cross-doc clone. Crash-safety kept: the incoming doc is built before the connection opens, so a transform failure can't empty the body. Removed: the ActiveEditSessionError defer path and the now-unused CollaborationGateway.getActiveEditorCount. Honest limitation: this is a 2-way merge — for a block BOTH sides changed since the last sync, git wins (no common ancestor to decide). A full 3-way merge would need the last-synced base plumbed from the engine; the dominant cases (unchanged resync, edits to different blocks) are now lossless. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:03 +03:00
claude code agent 227	9c805e8069	chore(git-sync): stop committing build/ and node_modules; build in CI/Docker Review finding #2: packages/git-sync/build/ (the COMPILED engine) and the package's node_modules/ were committed. Prod executed the committed build/ while CI/tests ran src/ and never rebuilt it — so a fix in src/ could pass tests while stale compiled code shipped (a silent src/prod skew). The committed node_modules were pnpm symlinks with a baked machine-local store path (/home/claude/...), useless and misleading for everyone else. - git rm --cached packages/git-sync/{build,node_modules} (42 + 31 files). - .gitignore: ignore packages/*/node_modules/ and packages/git-sync/build/. - Build the package where it is actually consumed: apps/server `pretest` now builds @docmost/git-sync (its suite imports the built build/index.js), and the CI Test workflow gains an explicit "Build git-sync" step. The Dockerfile builder already runs `pnpm build` (nx builds the package) and now COPYs the fresh build/. Verified: wiped build/, rebuilt via `pnpm --filter @docmost/git-sync build`, then the server converter gate (26/26, imports the rebuilt package) and the git-sync suite (588 passed) both pass against the freshly-built, non-committed output. NOTE: packages/mcp/ has the same committed-build/node_modules pattern (pre-existing, out of this PR's scope) and should get the same treatment in a follow-up. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:18:03 +03:00
claude code agent 227	d716ca385a	fix(git-sync): don't clobber pages with a live editing session; crash-safe body write Review finding #5: the git -> page body write (writeBody) did a full-body replace (delete-all + re-insert) on the shared Yjs doc. Applied while a human is editing the page, it discarded their in-flight changes; and TiptapTransformer.toYdoc ran AFTER the fragment was cleared, so a conversion failure could leave the page with an empty body. Fixes: - Active-session guard: CollaborationGateway.getActiveEditorCount(documentName) reports live human (websocket) editor sessions for a doc, excluding server-side direct connections. writeBody now throws ActiveEditSessionError when an editor is connected. The engine's push loop already isolates each importPageMarkdown in try/catch and does not advance the loop-guard on failure, so the write is simply retried on the next poll once the editor disconnects — never a clobber. - Crash-safe conversion: build the replacement Yjs update BEFORE opening the connection / clearing the fragment, so a transform failure can never leave the body empty. Also updates the server-side converter gate spec to the corrected round-trip shape: the block-image hoist no longer leaves a leading empty paragraph (the git-sync converter fix in `7d39c16b`, now reaching the built package). A true merge of git content into a live Yjs session is out of scope (it needs a real 3-way text merge with no shared update lineage); deferring the write while a page is being edited is the safe, owner-approved minimum. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:17:24 +03:00
claude_code	66bd039f8f	feat(git-sync): serve spaces over smart-HTTP (gitmost as a two-way git host) Expose each git-sync-enabled space as a clonable/pushable git repo over HTTP, so `git clone https://<user>:<pass>@<host>/git/<spaceId>.git` works and external pushes flow back into Docmost pages — gitmost itself acts as the git host (no external GitHub/Gitea, no SSH). Transport: shell out to `git http-backend` (CGI; git is already in the runtime image) which implements the full smart-HTTP protocol (info/refs, upload-pack, receive-pack, protocol v2). A raw Fastify route `/git/` (mounted at the root, outside the `/api` prefix) bridges the request/response to the CGI; passthrough content-type parsers for the git media types stream the raw body to stdin. Reuse the existing engine: clients push the vault's `main` branch, whose commits beyond `refs/docmost/last-pushed` the engine already reconciles into Docmost. - http/git-http.service.ts — auth (HTTP Basic -> AuthService.verifyUserCredentials), self-resolved workspace (DomainMiddleware does not run for this raw route), per-space gating (global + per-space gitSync flags, 404 hides existence), CASL authz (Read=fetch, Manage=push), dispatch. - http/git-http-backend.service.ts — spawn `git http-backend`, binary-safe CGI response parsing (Status/headers/body), stream to the socket. - http/git-http.helpers.ts — pure path parse, service->kind mapping, gate decision (unit-tested); rejects literal and percent-encoded path traversal. - orchestrator: extract reusable withSpaceLock (CAS-guarded lock heartbeat so a long push cannot let the lock expire mid-cycle) and add ingestExternalPush (receive-pack + Docmost cycle under one lock; 503 on contention). - vault-registry: ensureServable() — ensureRepo + idempotent receive.denyCurrentBranch=updateInstead / denyNonFastForwards / http.receivepack / http.uploadpack. - env: GIT_SYNC_HTTP_ENABLED (defaults to GIT_SYNC_ENABLED) + validation. - main.ts: register the /git/ route and the git content-type parsers. Tests: pure helpers, CGI parsing, and the GitHttpService handler (auth/gate/authz + workspace resolution). Server tsc + git-sync/env suites green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:17:24 +03:00
claude_code	ba15fde809	test(git-sync): add reviewer-requested coverage across engine, server, client Implements the test cases called out in the PR #119 review threads (code-review, test-strategy report, red-team) — TESTS ONLY, no production code changes. packages/git-sync (vitest): - lib converter/markdown gaps: pageBreak data-loss (it.fails repro), subpages lossy round-trip, nested/fenced callouts, ol->taskList bridge, column.width number<->string drift, empty details. - engine units: parentFolderFile, planReconciliation swap/chained move, buildVaultLayout last-resort-by-id, firstDivergence, applyPushActions / applyPullActions failure isolation. - real temp-git integration: diffNameStatus -z rename+add/modify alignment, copy-line behavior, per-invocation committer identity (no leak into repo/global config). - ENFORCED type-level GitSyncClient contract via vitest typecheck over a *.test-d.ts file (tsconfig.vitest.json; build tsconfig untouched). apps/server (jest): - orchestrator: delete-cap neutralization + fail-safe, Redis lock / mutex skip ladder + release-on-throw, merge guard, pull/push order, remote template substitution, poll lifecycle. - page-change listener: loop-guard, debounce coalescing, id resolution, error swallowing. - vault registry, controller authz (trigger + status), env validation/getters, page.service git-sync provenance stamping, persistence precedence (agent > git-sync > user) + no boundary snapshot, space.service audit-delta, space.repo jsonb-merge, converter-gate corpus extension (mention/math/details/marks). apps/client (vitest + testing-library): - history-item git-sync badge: render gating + non-clickable. - edit-space-form toggle: initial state, optimistic payload, rollback on error, disabled states. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:17:24 +03:00
claude code agent 227	d1a8b48b96	fix(git-sync): address review — configurable poll, always-on loop-guard, cleanup Comprehensive-review follow-ups (APPROVE WITH SUGGESTIONS; no critical issues): - poll interval is now actually configurable: replaced the hardcoded @Interval('git-sync-poll', 15000) with a dynamic SchedulerRegistry interval registered in onModuleInit from getGitSyncPollIntervalMs() (cleared in onModuleDestroy); /status and the real cadence now share one config source. Boots logging 'poll interval registered (Nms)'. - loop-guard now ALWAYS applies: the lastUpdatedSource==='git-sync' skip was nested inside the !spaceId/!workspaceId branch, so structural self-writes (CREATE/MOVE/RESTORE/SOFT_DELETE, which carry spaceId+workspaceId) bypassed it and re-triggered cycles. Fetch the page row once, guard unconditionally, then resolve space/workspace. - remove the dead PAGE_CONTENT_UPDATED subscription (it's a BullMQ job, never an EventEmitter event; body edits arrive via PAGE_UPDATED). - fix the stale datasource comment (PageService DOES stamp 'git-sync' now). - env getters: parseInt radix 10 + NaN/<=0 fallback for poll/debounce (+ max deletes), with 6 new environment.service.spec tests. tsc clean; jest 723 pass; live cycle re-verified post-refactor (ran, push applied, unflagged 92-page space untouched). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:17:24 +03:00
claude code agent 227	55d610b7f8	feat(git-sync): per-space 'Enable Git sync' toggle (Phase C, §7.1) UI opt-in for git-sync, mirroring the existing sharing/comments settings pattern (no new endpoint, no new mechanism; orchestrator read query untouched): - UpdateSpaceDto.gitSyncEnabled?: boolean. - SpaceRepo.updateGitSyncSettings: jsonb-merge into settings.gitSync.<key> (COALESCE || jsonb_build_object — never clobbers sibling sharing/comments); stored as a real jsonb boolean so the orchestrator's settings->'gitSync'->>'enabled' = 'true' matches. - SpaceService.updateSpace handles the flag (audit diff) via the existing CASL-guarded space update path (Manage/Settings). - client: Switch in edit-space-form (optimistic mutate + revert-on-error, readOnly-aware) + space types + 2 i18n keys. - space.service.spec extended (calls updateGitSyncSettings; no-op when undefined). tsc clean (server+client); jest src/core/space 4 pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:17:24 +03:00
claude code agent 227	8201e76c66	fix(git-sync): branch choreography + strict scoping + delete cap (Phase B hardening) Fixes found by the live pull/push e2e: - CRITICAL: driveCycle never checked out the 'docmost' branch before applyPullActions, so Docmost content was written straight onto 'main', clobbering local file edits before push could diff them. Now checkout 'docmost' before pull (applyPullActions commits there then checks out main + merges) — mirrors the engine's pull main(). Round-trip now works both ways. - add an unresolved-merge guard (SPEC §9): skip the cycle if the vault is mid-merge instead of failing on checkout. - SAFETY: enabledSpaces() is now STRICT opt-in — only spaces with settings.gitSync.enabled===true; removed the all-spaces fallback that synced every space (incl. a 92-page one) the moment GIT_SYNC_ENABLED flipped. - SAFETY: per-cycle delete cap (GIT_SYNC_MAX_DELETES_PER_CYCLE, default 5): dry-run the push, and if planned deletes exceed the cap, run the apply with deletePage neutralized — phantom absence-deletions from a non-convergent vault can't soft-delete real pages. Fails safe if the dry-run throws. - fix manual trigger: TriggerGitSyncDto.spaceId needs @IsUUID or the global whitelist ValidationPipe strips it (arrived undefined -> vault 'undefined'). Live-verified on an isolated flagged space: push (vault file edit -> Docmost content, stamped lastUpdatedSource='git-sync') and pull (Docmost rename -> vault file + meta) both work; an unrelated 92-page space stayed untouched throughout. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:17:24 +03:00
claude code agent 227	901147a224	feat(git-sync): GitSyncModule orchestrator + config + listener (Phase A.4b/B) Control plane wiring (plan §5-§11): - PageService create/update/movePage now honor provenance actor 'git-sync' (stamp lastUpdatedSource='git-sync'), closing the A.4a gap. - EnvironmentService: GIT_SYNC_ENABLED / DATA_DIR / REMOTE_TEMPLATE / POLL_INTERVAL_MS / DEBOUNCE_MS / SERVICE_USER_ID (required-if-enabled) / SSH_KEY_PATH + validation. - VaultRegistryService: per-space vault path + cached VaultGit. - GitSyncOrchestrator: per-space Redis leader-lock (SET NX PX + CAS-Lua release, randomUUID instanceId) + in-process mutex; runOnce drives the vendored engine PULL (readExisting->computePullActions->applyPullActions) then PUSH (runPush) with the bound native GitSyncClient + VaultGit; @Interval poll-safety gated on GIT_SYNC_ENABLED; imports plain ScheduleModule (TelemetryModule owns forRoot). - PageChangeListener: @OnEvent PAGE_* -> per-space debounce -> runOnce, with a best-effort lastUpdatedSource==='git-sync' loop-guard. - GitSyncController: admin POST /api/git-sync/trigger + GET /status (ops/e2e). - GitSyncModule registered in app.module. Enabled-space enumeration uses settings.gitSync.enabled, falling back to all live spaces until Phase C writes the flag (master gate = GIT_SYNC_ENABLED). tsc clean; 713 tests/71 suites pass; dev server hot-reloaded the module (route live, DI graph boots). Live pull/push round-trip verified next. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:17:24 +03:00
claude code agent 227	afe1ba8398	feat(git-sync): native GitmostDataSource + 'git-sync' provenance (Phase A.4a) Native data plane for git-sync (plan §3, §8.1): - provenance: widen actor to 'user'|'agent'|'git-sync' (jwt-payload, auth-provenance decorator); PersistenceExtension resolves lastUpdatedSource with precedence agent > git-sync > user, debounced history (like a human edit, not the agent's immediate snapshot). - GitmostDataSourceService implements @docmost/git-sync's GitSyncClient natively: reads via PageRepo/SpaceRepo (listSpaceTree complete:true, getPageJson), writes via PageService (create/removePage soft-delete/movePage with computed fractional position/update-rename/restore) + the writeBody linchpin through collab openDirectConnection('page.'+id, {actor:'git-sync'}) mirroring collaboration.handler withYdocConnection 'replace'. bind({workspaceId,userId}) returns the context-bound client for the orchestrator. - 10 unit/contract tests (mapping + soft-delete + move-position), tsc clean. Known gap (closed in A.4b): PageService.create/update/movePage only branch on actor==='agent'; git-sync provenance is already passed through so the row source marker propagates once PageService honors 'git-sync'. Module/orchestrator/config come next. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:17:24 +03:00
claude code agent 227	5aaeaaae3c	feat(git-sync): CommonJS build + §13.1 editor-ext idempotency gate (Phase A.2) Make @docmost/git-sync natively consumable by the CommonJS server (and jest): build to CommonJS (tsconfig module CommonJS, drop type:module, strip .js from relative imports), and lazy-load the only ESM-only dep (marked) via the dynamic Function('import()') trick (mirrors docmost-client.loader.ts) with a require() fallback so vitest's evaluator works too. git-sync tests stay green (314 pass, 3 expected fail). Add the §13.1 idempotency gate (apps/server .../git-sync-converter-gate.spec.ts): 13 editor-ext docs (paragraphs/headings, marks, links, bullet/ordered/task lists, blockquote, callouts, code block, hr, table, nested mix) round-trip content(editor-ext) -> convertProseMirrorToMarkdown -> markdownToProseMirror -> TiptapTransformer.toYdoc/fromYdoc(tiptapExtensions) -> canonicalize and assert docsCanonicallyEqual. All green => the vendored converter's docmost-schema is schema-compatible with editor-ext (no node/mark/attr loss), which the plan §13.1 requires before Phase B. The one intrinsic markdown-image lossiness (width/height/align can't ride plain ![](src)) is isolated in a KNOWN DIVERGENCE block, not hidden. Server tsc clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-26 00:17:24 +03:00
claude code agent 227	ed3b65c36b	Merge remote-tracking branch 'gitea/develop' into batch/issues-2026-06-25 # Conflicts: # apps/server/src/core/ai-chat/ai-chat.service.spec.ts # apps/server/src/core/ai-chat/ai-chat.service.ts	2026-06-25 12:48:47 +03:00
claude code agent 227	aa7a115f66	refactor(review): address PR #186 re-review (approve-with-comments) Approve-with-comments re-review; no blockers. All 7 actionable points (8 is a forward-looking architecture note — recommendation A, keep as-is): 1. chat-markdown.util spec: restore parity coverage of the removed client spec — tool error state (+ errorText), unknown-tool fallback (`Ran tool <name>` en / `Выполнил инструмент <name>` ru), and the circular-output stringify catch. 2. findAllByChat row cap is now testable (injectable limit) + an int-spec proves truncation on a modest volume. 3. Stability: the per-step durability updates are SERIALIZED via a promise chain (stepUpdateChain) so they commit in step order — onlyIfStreaming already closed the finalize race, this closes inter-step ordering. 4. findAllByChat keeps the NEWEST messages on truncation (order DESC + reverse, like findRecent) and logs a warning with chatId, instead of silently dropping the newest tail. 5. The LABELS parity comment already references the real path (tool-parts.tsx / toolLabelKey) — confirmed accurate. 6. Removed the redundant 'off-by-one boundary' test (strict subset of the two adjacent prepareAgentStep cases). 7. Extracted the terminal-finalize dispatch into a shared `applyFinalize`, used by BOTH the service's finalizeAssistant and its test — the test now exercises the real path, not a copy, so a production drift fails it. Verified: server build + 325 ai-chat unit + 6 integration; prettier clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 12:28:35 +03:00
claude code agent 227	30c358a2f8	test(review): add the 4 new test-coverage points from PR #185 re-review The re-review's blocking/structural points (lease leak, dup-id guard test, body-before-title test, CHANGELOG, pg18, shared jsonb decoder) were already addressed in commit 24264ef; this adds the 4 genuinely-new coverage requests: - pt 6: `scrollToReference(id, index?)` exercised against a live editor DOM — selects the index-th `sup[data-footnote-ref][data-id]` occurrence, falls back to the first for out-of-range, returns false for an empty id (scrollIntoView stubbed). (#168) - pt 7: export `backlinkLabel` and pin the base-26 carry boundary (25->z, 26->aa, 27->ab, 51->az, 52->ba). (#168) - pt 8: integration fail-open — a PRESENT-but-corrupt tool_allowlist (jsonb string scalar holding non-array JSON) reads back as null ("no restriction"), covering normalizeRow's degrade branch. (#159 #172/#173) - pt 9: getFootnoteRefCount cache invalidation — adding a `[^a]` reference bumps the cached count 2 -> 3. (#168) Verified: editor-ext footnote 23; client structure 7 + tsc; server int 8. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 12:08:21 +03:00
claude code agent 227	ea61c96a7c	refactor(review): address PR #186 review (#183 — recency sweep, #174 export, tests, cleanups) 15-point review of the persistent-history PR. Architecture decisions: crash recovery = recency threshold; tool-label duplication = leave as-is. Must-fix: 1. Boot-sweep bounded by recency. sweepStreaming now also requires `updatedAt < now() - SWEEP_STREAMING_STALE_MS` (10 min), so a fresh replica's startup sweep can't abort a turn another replica is actively streaming (multi-instance deploy). Int-spec: a FRESH 'streaming' row is NOT swept, a STALE one IS. 2. Restore export during the FIRST streaming turn of a new chat (#174). The server chatId is now adopted EARLY (in-place, on the start-chunk metadata) via a new `onServerChatId` callback wired through use-chat-session → chat-thread, so `activeChatId` is set at turn start and the Copy button is live mid-first-turn (canExport = !!activeChatId). Hook tests for early/in-place/no-op adopt. 3. Cover finalizeAssistant's fallback-insert branch: extracted pure `planFinalizeAssistant(assistantId)` (update when id present, insert when the upfront insert failed) + a dispatch harness test for both arms. Tests: onModuleInit lifecycle spec (sweep called; throw → resolves + warns); int-spec updatedAt assertion → toBeGreaterThan. Cleanups: cap findAllByChat at 5000 rows; upfront-insert-failure log carries chatId+workspaceId; removed the now-dead buildPartialAssistantRecord (only the spec consumed it; shapes still pinned by the flushAssistant suite); controller passes `lang: dto.lang` (normalizeLang handles undefined); dropped a no-op `?? undefined` in errorOf; documented the content-column semantics change (concatenated step text, UI renders from metadata.parts); CHANGELOG [Unreleased] entry (#183, #174); reworded the stale LABELS parity comment. Verified: server build + 323 ai-chat unit + 5 integration; client tsc + 160 ai-chat unit; prettier clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:53:25 +03:00
claude code agent 227	f80276d41a	refactor(review): address PR #185 review (lease leak, tests, changelog, jsonb seam) 8-point multi-aspect review of the batch PR; security/regressions were clean. 1. Lease leak: the #180 reorder moved `toolsFor` (which leases external MCP clients, refCount+1) ahead of buildSystemPrompt + forUser, but the only release (closeExternalClients) was bound to the streamText callbacks. A throw in between leaked the lease (refCount stuck, undici sockets held until restart). Define closeExternalClients right after the lease and wrap buildSystemPrompt+forUser in try/catch that closes-then-rethrows. 2. Cover the patch_node/delete_node dup-id refusal (#159 #6): extract the guard into a pure `assertUnambiguousMatch` (node-ops) and unit-test 0/1/>1. 3. Regress the body-before-title order (#159 #10): mock-HTTP test (collab fails fast against a server with no WS upgrade) asserts /pages/update (title) is NEVER posted when the body write fails — for updatePage AND updatePageJson. 4. CHANGELOG [Unreleased]: #180, #168 (Added); #163 (Fixed). 5. Add the missing en-US i18n keys (Back to references / {{label}}). 6. Drop the duplicate content/empty/blank cases in ai-chat.prompt.spec.ts (they repeat the buildMcpToolingBlock unit tests); keep only sandwich placement + both-safety-copies. 7. CI Postgres pg16 -> pg18 (match docker-compose). 8. jsonb decode seam: shared `parseJsonbValue(value, guard)` in database/utils.ts holds the legacy double-encoding self-heal in one place; parseToolAllowlist / parseModelConfig keep only a type-guard. Verified: server build + 124 unit + 15 integration; mcp 311; prettier clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	34c5b557ef	fix(share): SEO route must not leak a restricted page's title (#159) `ShareSeoController.getShare` resolved the inherited share with the RAW `getShareForPage`, which does NOT run the restricted-ancestor gate. So for a page shared with includeSubPages whose descendant is permission-restricted, the SEO route served that descendant's real title in <title>/og:title/twitter:title to anonymous visitors and crawlers — even though the content API returns 404 for it (red-team finding #3). Funnel the SEO path through the canonical `resolveReadableSharePage` boundary (the single place that checks `hasRestrictedAncestor`): a non-readable page now serves the plain SPA index with no meta. Also honour `isSharingAllowed` — a share whose workspace/space sharing toggle was flipped off after creation no longer leaks its title via SEO. Title comes from the server-resolved page; `buildShareMetaHtml` already emits robots=noindex when the share opted out of indexing. Tests (controller routing, fs spied at call time so bcrypt's native loader is untouched): non-readable page => plain index, no title; sharing-disabled => plain index; readable+indexing => title + og:title, no noindex; readable+no-indexing => noindex. Asserts getShareForPage is never called by the SEO path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	59f0c8b22d	fix(ai-chat): validate the open page server-side so the agent edits the right one (#159) The client sends the "current page" as { id, title } in the request body and the server echoed BOTH verbatim into the system prompt context and the getCurrentPage tool. id and title are independently attacker/desync-controllable (two tabs, stale navigation), so openPage.id could point at page B while openPage.title said "Page A" — the model then reported "updated Page A" while it actually edited page B (CASL still allowed it; the user has access). Red-team finding #4. Resolve the open page ONCE against the DB via a new `resolveOpenPageContext`: workspace-scoped lookup + access check, returning the AUTHORITATIVE { id, title } (title from the DB row, never the client) or null (fail-closed) for a missing / foreign / inaccessible page. That validated value now feeds the system prompt, the getCurrentPage tool, AND the new-chat history origin (which previously did this validation inline, for the id only — now shared, and the title is fixed too). Tests: resolveOpenPageContext covers no-id, not-found, foreign-workspace, Forbidden, non-Forbidden-fault (fail-closed), the DB-title-wins-over-client case, and null-title coercion. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	77ccc596ea	feat(ai-chat): per-MCP-server instructions in the agent system prompt (#180) Admins can now give each EXTERNAL MCP server a free-text instruction ("how/when to use this server's tools") that the agent receives in its SYSTEM PROMPT next to the tool descriptions — porting the built-in SERVER_INSTRUCTIONS idea to admin-configured servers. Trusted, admin-authored text (like a system prompt); NON-secret, so unlike headersEnc it IS returned in views/forms. - Migration: nullable `instructions text` on ai_mcp_servers (old rows = null = no guidance). Table type + repo insert/update (blank/whitespace -> null via blankToNull). DTO `@MaxLength(4000)`. Service threads it through McpServerView/toView. - mcp-clients: `McpServerInstruction { serverName, toolPrefix, instructions }` threaded through the toolset/cache/lease. Guidance is built ONLY for a server that actually connected AND contributed >=1 callable tool (the allowlist may filter all of them out) AND has non-blank text — so a guide never appears for tools the agent cannot call. Cached with the toolset, so an edit is picked up next turn via the existing CRUD cache invalidation. - System prompt: `buildMcpToolingBlock` renders an <mcp_tooling> block INSIDE the safety sandwich (after context, before the trailing SAFETY_FRAMEWORK) so it informs tool choice but cannot override the rules; each section is headed by the server's `prefix_*` namespace. Empty/blank -> block omitted. The caller (ai-chat.service) now builds the external toolset BEFORE the prompt and passes external.instructions; client-handle lifecycle (close-once) unchanged. - Client: instructions field in types + a Textarea (autosize, maxLength 4000) in the MCP-server form with a namespace-prefix hint; i18n (en/ru). Tests across every layer (prompt block placement + both SAFETY copies; view blank->null; buildEntry includes guidance only for connected+>=1-tool+non-blank; DTO MaxLength; repo + integration round-trip; service wiring). Delegated impl reviewed (APPROVE); applied the import-type follow-up. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	1cfad1f6fb	fix(db): jsonb double-encoding follow-ups from PR #172 review (#173) PR #172 fixed the jsonb double-encoding for `tool_allowlist` but the same class of bug, and the same re-derived workaround, remained elsewhere. 1. model_config (agent roles): jsonbObject still used the buggy `::jsonb` bind, so `ai_agent_roles.model_config` round-tripped as a jsonb STRING SCALAR. The read-path `typeof === 'object'` check then failed and the model override was SILENTLY dropped (role fell back to the default model). Fixed to `::text::jsonb` and added `parseModelConfig` + `normalizeRow` so every read self-heals already-corrupted rows (no migration). 2. Centralized the write workaround as `jsonbBind()` in database/utils.ts — one implementation with one explanation of the quirk — replacing the per-repo `jsonbArray` (mcp) and `jsonbObject` (roles). 3. Integration coverage (the fix is a DB round-trip a unit test cannot see; the read-side parser MASKS a write regression): new ai-mcp-server-repo.int-spec asserts `jsonb_typeof(tool_allowlist)='array'` after insert + heals a seeded string-scalar row; ai-agent-roles-repo int-spec gains the same for `model_config` (`'object'` + heal). 4. Updated the stale `ai-mcp-servers.types.ts` comment (the driver returns a JSON string for legacy rows; the repo normalizes every read). 5. Fail-open logging: a corrupt tool_allowlist degrades to "no restriction" (agent gets ALL tools) — normalizeRow now warns (server id only, never contents) so the silent widening leaves a trace. 6. Simplified parseToolAllowlist (normalize the string once, then a single array-of-strings check) — identical behaviour, all 12 cases still pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 11:36:01 +03:00
claude code agent 227	ae6faf3abc	fix(ai-chat): guard step-update vs finalize race with WHERE status='streaming' (#183 review) Review caught a real race: onStepFinish fires `updateStreaming()` fire-and-forget (not awaited), so the FINAL step's streaming UPDATE and the terminal `finalizeAssistant` UPDATE run as two concurrent statements on different pool connections — commit order is not guaranteed. If the late streaming update lands AFTER finalize, the completed row is clobbered back to status='streaming' with no usage/finishReason, and the next startup sweep then mis-marks the finished turn 'aborted'. Green unit/integration tests don't reproduce a cross-connection race. Fix: scope the per-step update with `onlyIfStreaming` → SQL `WHERE status='streaming'`. Once finalize has set a terminal status the late update matches zero rows and no-ops, regardless of commit order; finalize runs unguarded so it always wins. A cheap `if (finalized) return` short-circuit avoids most wasted queries, but the SQL guard is the authoritative fix (the flag can be set after a query is already in flight). Integration test: finalize to 'completed', then a late onlyIfStreaming update is a no-op — status/content/usage preserved. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 06:14:02 +03:00
claude code agent 227	e7b719bbb8	feat(ai-chat): persistent history as source of truth — step durability + server export (#183) The chat lived in inconsistent paradigms (in-memory stream + client export vs. DB-as-context), which made export flaky and lost the assistant answer if the process died mid-turn. Make the DB the single source of truth. A. STEP-GRANULAR DURABILITY (server) - ai_chat_messages gains a nullable `status` column (migration; NULL = legacy = completed). The assistant row is now INSERTED UPFRONT as `status:'streaming'` and UPDATEd on every onStepFinish with all finished steps (text + tool calls + tool RESULTS), then finalized once to completed/error/aborted on the terminal callback. So a process death mid-turn keeps every finished step; a startup sweep (OnModuleInit → sweepStreaming) flips any dangling 'streaming' row to 'aborted'. The write path no longer depends on a live socket. - Pure exported `flushAssistant(steps, inProgressText, status, extra?)` builds the persist payload (metadata.parts byte-identical to the old builder), so a future background worker can call the same path. AiChatMessageRepo gains `update`, `sweepStreaming`, and `findAllByChat`. - consumeStream drain, external-MCP client close-once, SSE heartbeat preserved. B. SERVER-SIDE EXPORT - New pure `chat-markdown.util.ts` renders Markdown from DB rows ONLY (server port of the client builder). Because A persists the in-progress row, the export now includes an interrupted turn up to its last finished step (flagged "still generating"). `POST /ai-chat/export` (owner-gated via assertOwnedChat, workspace-scoped) returns it; `lang` accepts a full client locale tag ('en-US'/'ru-RU') and is normalized server-side (normalizeLang) — a strict @IsIn(['en','ru']) DTO rejected the real client's i18n.language with a 400, caught in real-browser testing. - Client: handleCopy calls the endpoint; `canExport = !!activeChatId`. The whole liveThreadRef/liveStateRef/onLiveContentChange/hasLiveContent hybrid (and the client chat-markdown util + test) is removed — the server is now authoritative. Tests: flushAssistant unit (status shapes + parts parity), chat-markdown.util unit (incl. legacy NULL-status + interrupted note + ru + normalizeLang locale tags), controller export wiring + owner-gate, integration update/sweepStreaming. Verified: server build + 318 ai-chat unit + 3 integration; client tsc + 157 ai-chat unit; and END-TO-END in a real browser — a chat turn persists mid-stream and the Copy button exports the DB-sourced markdown (showing the in-progress row), HTTP 200 after the locale fix. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 06:05:26 +03:00
claude_code	27c91e4a69	feat(ai-chat): bound external MCP tool calls with per-call timeouts External MCP tools (web search, crawl) had no per-call timeout: a hung tool call was only broken by the 15-min transport silence timeout shared with the chat provider, and a server that kept the socket warm but never returned could spin until the user cancelled. Add two independent, composing bounds for external MCP traffic (the chat provider path is unchanged): - Silence 5 min: buildPinnedDispatcher now overrides headersTimeout/ bodyTimeout with mcpStreamTimeoutMs() (AI_MCP_STREAM_TIMEOUT_MS, default 300000) on the external-MCP dispatcher only, so a byte-silent upstream is severed in ~5 min instead of 15. - Total per-call 15 min: wrapToolWithCallTimeout wraps each external tool's execute with a fresh AbortController + timer composed with the turn signal via AbortSignal.any (AI_MCP_CALL_TIMEOUT_MS, default 900000). It RACES the call against the abort signal because @ai-sdk/mcp does not settle its in-flight promise on abort, so a warm-but-stuck call would otherwise hang forever. On timeout the call surfaces as a tool-error and the agent loop recovers. Add tests (incl. a never-settling real-client-style stub) and document both env vars in .env.example.	2026-06-25 04:43:49 +03:00
claude_code	b6787cc542	fix(ai-chat): drain stream on client disconnect to stop heap-OOM leak The /api/ai-chat/stream and public-share streaming paths piped streamText output to the client socket via pipeUIMessageStreamToResponse, whose only reader is that socket. On a client disconnect (pervasive Safari/proxy ECONNRESET), backpressure stalled the stream: the controller aborted the turn but nothing drained it, so streamText's onFinish/onError/onAbort never fired. Cleanup (close leased MCP clients, persist partial) never ran and the whole per-turn object graph (history, per-request toolset closures, captured steps, SDK buffers) stayed rooted — accumulating across turns until the default ~2GB heap saturated and the process crashed with "Ineffective mark-compacts near heap limit - JavaScript heap out of memory". Add the AI SDK v6 documented remedy: fire-and-forget `result.consumeStream({ onError })` right after streamText(), which removes backpressure and drains the stream independently of the client socket so the terminal callbacks always fire and the turn's memory is released even when the client has gone away. Applied to both the authenticated and public-share stream services. Also add `--heapsnapshot-near-heap-limit=2` to the prod start script so any residual leak dumps a heap snapshot near OOM for diagnosis (no effect on normal operation). Heap size stays ops-tunable via NODE_OPTIONS. - apps/server/src/core/ai-chat/ai-chat.service.ts - apps/server/src/core/ai-chat/public-share-chat.service.ts - apps/server/package.json Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-25 03:59:32 +03:00
claude code agent 227	c065e26d14	refactor(ai): retry outside instrumentation + retry-exhaustion test (#179 review) - Invert the transport layers so the pre-response retry is OUTERMOST and the provider-HTTP instrumentation is INNER. Before, the retry lived inside createStreamingFetch (under the instrumentation), so a reset the retry recovered from logged only a clean "OK status=200" — the "PRE-RESPONSE FAILED ... ECONNRESET ... idleSincePrevCall" signal went blind exactly when the fix works, and AI_STREAM_KEEPALIVE_MS couldn't be tuned from prod data. Now createStreamingFetch is the dispatcher-bound BASE (no retry) and a new withPreResponseRetry() wraps it; ai.service composes withPreResponseRetry(createInstrumentedFetch('AiService:provider-http', createStreamingFetch())), so every attempt — including recovered resets — flows through the instrumentation. (Also expresses the keepAlive-config vs retry-behavior boundary structurally, per review #3.) - Add the retry-exhaustion test: a server that resets EVERY connection, asserting the call rejects with a retryable connection error AND exactly PRE_RESPONSE_CONNECT_RETRIES + 1 (= 3) requests reached the server — pinning the bound and that the final error propagates (guards an off-by-one / infinite loop / swallowed error). Existing happy-retry + abort tests moved onto withPreResponseRetry. Verified on the stand: a normal turn still streams (reasoning + finish) and the provider-HTTP telemetry still logs. server tsc + ai/mcp specs green (30). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 00:10:40 +03:00
claude code agent 227	b0faa2fe32	fix(ai-chat): recycle keep-alive sockets + retry pre-response resets (#175) The real cause of the long-task "Lost connection to the AI provider" — the earlier 300s-timeout fix (#176) was the wrong layer. The provider-HTTP telemetry on the user's deploy shows the failures are PRE-RESPONSE `read ECONNRESET` ~500ms in (not a 300s/15min timeout), correlated with idleSincePrevCall ~42s and large bodies; and crucially a retry of the SAME request often succeeds. A direct probe to the real z.ai endpoint does NOT reset (113KB bodies and a 45s-idle keep-alive reuse both succeed), and another agent (opencode) runs fine from the same infra — so the provider is healthy and the egress network is usable. The difference is the transport: undici's keep-alive pool REUSES a socket that the deployment's egress (NAT / firewall / conntrack) silently dropped during a long idle gap, so the next request resets pre-response. Fix (brings gitmost in line with clients that don't reuse stale sockets): - Keep-alive recycling: the streaming dispatcher (chat fetch AND the external-MCP dispatcher, via the shared streamingDispatcherOptions) now sets keepAliveTimeout + keepAliveMaxTimeout to a 10s recycle window (AI_STREAM_KEEPALIVE_MS), so a connection idle longer than that is closed instead of reused — a long-gap step opens a fresh connection. keepAliveMaxTimeout also caps a server-advertised keep-alive so the provider can't widen the window. - Pre-response connection retry: createStreamingFetch retries a connection-level reset (ECONNRESET / UND_ERR_SOCKET / ECONNREFUSED / EPIPE / *_TIMEOUT) on a fresh connection up to 2 times. This is SAFE because fetch() only rejects before the Response resolves — a started stream is never replayed; an abort (client disconnect) is never retried. Tests: ai-streaming-fetch.spec — keep-alive options, streamKeepAliveMs env, isRetryableConnectError, and a server that resets the first connection so the retry must land on a fresh one (+ aborted requests are not retried). Verified on the stand that a normal turn still streams (reasoning + text + finish) through the new transport. server tsc + ai/mcp specs green. Note: root cause is the deployment's egress dropping idle connections (Traefik is inbound-only); this makes the app resilient to it. AI_STREAM_KEEPALIVE_MS can be lowered if the egress drops faster than ~10s. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 23:51:17 +03:00
claude code agent 227	6edbbab43b	refactor(ai): unify provider-settings allowlist + stronger chatApiStyle tests (#177 review) Addresses the second #177 review: - Architecture (the silent allowlist drift): the writable provider-setting keys were maintained by hand in two TS-uncheckable places — the key-loop in ai-settings.service and the SQL ALLOWED list in the generic workspace repo (a miss there silently dropped a field on persist, exactly what bit chatApiStyle). Introduce one typed source of truth PROVIDER_SETTINGS_KEYS in ai.types (`satisfies readonly (keyof AiProviderSettings)[]`), have the service consume it, and keep the repo's own copy (it can't import AI types) guarded by a parity test so any future drift fails in CI. - Tests: - ai.service.include-usage.spec: mocks @ai-sdk/openai-compatible and asserts the factory is called with { includeUsage: true, baseURL, apiKey, fetch, name } — `.provider` alone could not catch a dropped includeUsage (the token-usage zeroing regression); also asserts the 'openai' style does NOT use it. - ai-provider-settings-keys.spec: the allowlist parity check + DTO validation for chatApiStyle (@IsIn accepts both values, rejects garbage, optional). - CHANGELOG: [Unreleased] entries for the new "Protocol" / chatApiStyle setting and the default provider change (openai -> openai-compatible). (#175, #177) server + client tsc clean; 42 ai/settings specs green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 23:18:31 +03:00
claude code agent 227	59190148db	feat(ai-chat): explicit chatApiStyle selector to surface reasoning (#175) Rebuilt on develop (after #176) and reworked per review: instead of inferring the provider from baseUrl (`if (baseUrl)`), the admin picks the chat provider EXPLICITLY via a new `chatApiStyle` ('openai-compatible' | 'openai'), mirroring the existing sttApiStyle. A custom baseURL can front real OpenAI too, so the heuristic was fragile. Why reasoning was missing: glm-5.2 (and DeepSeek etc.) stream their thinking as `reasoning_content`, but the official @ai-sdk/openai provider does not map that field. 'openai-compatible' uses @ai-sdk/openai-compatible, which does — so reasoning parts now stream (verified live: reasoning-start/delta/end appear, and disappear when set to 'openai'). - Default (unset) = 'openai-compatible', so existing openai+baseUrl workspaces surface reasoning with no admin action. No DB migration (field lives in the settings.ai.provider JSON blob). - includeUsage: true on the openai-compatible model — without it the provider omits streamed usage, zeroing the live token counter / reasoning-token metadata. The official provider always sent it; this keeps parity. (Confirmed live: usage.totalTokens present.) - openai-compatible has no default endpoint, so with no baseURL (real OpenAI, or a role's cross-driver override that cleared it) it falls back to the official provider. Plumbing: ai.types (ChatApiStyle / CHAT_API_STYLES + AiProviderSettings / MaskedAiSettings), update DTO (@IsIn), ai-settings.service (resolve / getMasked / update allowlist), workspace.repo updateAiProviderSettings ALLOWED (the second, SQL-level allowlist the review missed — without it the field never persisted), ai.service selector. Client: ai-settings-service types + a Protocol <Select> in the chat section + i18n (en/ru). Scope is chat-only (embeddings don't stream reasoning; STT already has sttApiStyle). Tests: ai.service.spec — 4 cases (openai-compatible+baseURL, openai+baseURL, default-unset, openai-compatible-without-baseURL fallback). Verified on the stand: default streams reasoning + usage; 'openai' drops reasoning; the setting round-trips. server + client tsc clean; 36 ai/settings specs green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 22:58:15 +03:00
claude code agent 227	da15b55786	refactor(ai): address PR #176 review — finite-timeout wording, env doc, tests, permanent provider-http module - Wording: every comment now says the stream timeouts are RAISED to a generous-but-finite ~15-min silence timeout, not "disabled (0)" (the stale comments contradicted the code, which uses AI_STREAM_TIMEOUT_MS, default 900000ms). - Architecture (the load-bearing-temporary trap): the streaming fetch reached the chat provider only by riding the "temporary DIAGNOSTIC" telemetry, so deleting the telemetry by its own label would silently revert the timeout fix. Legitimize it: rename ai-http-diagnostics.ts -> ai-provider-http.ts, createDiagnosticFetch -> createInstrumentedFetch, field aiDiagnosticFetch -> aiProviderFetch, drop the "temporary" labels, and document the chat transport (streaming fetch + instrumentation) as one intentional construct. - Docs: AI_STREAM_TIMEOUT_MS added to .env.example next to AI_EMBEDDING_TIMEOUT_MS. - Tests: - ai-provider-http.spec: createInstrumentedFetch delegates to the injected baseFetch with the same input/init, returns the Response untouched, rethrows the error, and defaults to global fetch — covering the baseFetch seam. - ai-streaming-fetch.spec: the delayed-server test is now LOAD-BEARING — with AI_STREAM_TIMEOUT_MS set below the 1.5s server delay the call actually rejects (a lost dispatcher -> global 300s default would NOT), proving the configured dispatcher is wired; plus the default-timeout happy path. server tsc clean; ai-streaming-fetch / ai-provider-http / ai.service / mcp-servers / ai-error specs green (41). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 22:31:58 +03:00
claude code agent 227	a14560c7c9	fix(ai-chat): raise undici's 300s stream timeout for long agent turns (#175) Long research turns failed mid-task with "Lost connection to the AI provider". Node's global fetch (undici) defaults BOTH headersTimeout and bodyTimeout to 300_000ms, and the chat provider + the external-MCP dispatcher both ran on it with no override, so: - the z.ai chat stream dropped when a late step's huge accumulated context pushed the model's time-to-first-token past 5 min (the model reasons server-side with NO streamed reasoning, so the connection is silent until the first answer token; reproduced: even a trivial glm-5.2 query has a ~4-8s first-chunk gap; a long run reaches 400k+-token steps), or a reasoning model paused >5 min between chunks (bodyTimeout); - the crawl4ai SSE transport, held open across the whole turn, dropped when it idled >5 min between tool calls. Fix: a dedicated undici dispatcher whose stream timeouts are raised to a generous-but-FINITE silence timeout (default 15 min, AI_STREAM_TIMEOUT_MS) on each path. NOT disabled (0): that would let a genuinely hung provider (with the client still connected) hang forever, since the turn's abortSignal only fires on client disconnect. The timeout bounds SILENCE (time-to-first-byte and the gap BETWEEN chunks), NOT total turn duration, so an arbitrarily long turn that keeps streaming is never cut; only a stream quiet for >15 min is treated as a hang. - ai-streaming-fetch.ts: createStreamingFetch() + streamTimeoutMs() / streamingDispatcherOptions() (the shared, configurable timeout). - ai.service: the chat provider fetch is createStreamingFetch(), wrapped by the existing passive ECONNRESET telemetry (createDiagnosticFetch gained an optional baseFetch) so the telemetry observes the SAME transport. - mcp-clients: the SSRF-pinned Agent uses streamingDispatcherOptions().
Investigation: reproduced the transport mechanism against the real z.ai endpoint (a 1ms headersTimeout throws UND_ERR_HEADERS_TIMEOUT — the exact drop) and ran the actual research agent to a ~428k-token context. Verified the fixed path streams cleanly live (glm-5.2 turns finish; telemetry confirms the streaming fetch is in use). Tests: ai-streaming-fetch.spec (default 15m + env override + invalid fallback + both-timeouts + streams a delayed response); ai-http-diagnostics + ai/mcp specs green. server tsc clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-24 22:09:10 +03:00
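
The timeout resolution described in a14560c7c9 (default 15 min, AI_STREAM_TIMEOUT_MS override, invalid-value fallback, the same value for both timeouts) can be sketched roughly as below. This is a hypothetical reconstruction, not the contents of ai-streaming-fetch.ts: the function names come from the commit message, but their signatures, the `env` parameter, the `StreamTimeoutOptions` type, and the exact validation rule are assumptions made for illustration.

```typescript
// Hypothetical sketch; the real ai-streaming-fetch.ts is not shown in this log.
// 15 minutes, matching the default quoted in the commit messages (900000 ms).
const DEFAULT_STREAM_TIMEOUT_MS = 900_000;

// Assumed shape: the two undici timeout options the commit says are raised.
interface StreamTimeoutOptions {
  headersTimeout: number; // bounds time-to-first-byte (silence before headers)
  bodyTimeout: number; // bounds the gap between body chunks
}

// Resolve the silence timeout from an env-like map (assumed parameter; the
// real code presumably reads process.env). Anything that is not a finite,
// positive number falls back to the default. In particular 0 is rejected,
// because undici treats 0 as "timeout disabled".
function streamTimeoutMs(env: Record<string, string | undefined> = {}): number {
  const raw = env["AI_STREAM_TIMEOUT_MS"];
  const parsed = raw === undefined ? NaN : Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_STREAM_TIMEOUT_MS;
}

// Build the options that would be handed to an undici Agent, applying the
// same finite timeout to both headers and body.
function streamingDispatcherOptions(
  env: Record<string, string | undefined> = {},
): StreamTimeoutOptions {
  const ms = streamTimeoutMs(env);
  return { headersTimeout: ms, bodyTimeout: ms };
}
```

Under these assumptions, the resulting object would be passed as `new Agent(streamingDispatcherOptions(process.env))` from the `undici` package, and that Agent supplied as the `dispatcher` of the fetch call. Because both fields bound silence rather than total duration, a turn that keeps streaming is not cut off.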

781 Commits