feat(sync): scaffold monorepo, extract docmost-client, add Phase-0 harness + read-only pull

Lock the access-layer decision (REST only) and start implementation per SPEC.

- monorepo (npm workspaces): packages/docmost-client = DocmostClient + lib/*
  copied 1:1 from docmost-mcp/src (backport target), plus bannered sync methods
  (listTrash, restorePage, listAllSpacePages, exportPageBody, listRecentSince /
  collectRecentSince cursor scan)
- engine stays the root app per AGENTS.md (src/, test/, build/, data/, settings.ts);
  add roundtrip.ts (SPEC §11 idempotency harness), pull.ts (SPEC §6 read-only
  Docmost->FS mirror), sanitize.ts (SPEC §12 filenames, path-traversal-safe)
- Dockerfile builds the workspace lib before the app; vitest gates CI
- exportPageBody never touches /comments (SPEC §3); serializeDocmostMarkdownBody
  emits meta + body only
- SPEC: resolve access-layer (REST), reflect root-engine layout + REST pagination
- tests: sanitize (incl. dot-traversal), collectRecentSince (cutoff/dedup/cap),
  stripBlockIds, markdown round-trip byte-stability

Note: the raw ProseMirror round-trip is byte-stable in Markdown but not yet
attribute-idempotent (SPEC §11 Task No. 0, before Phase 2).
vvzvlad
2026-06-16 20:20:20 +03:00
parent 2f92dc4c1f
commit 447d2508ae
33 changed files with 10502 additions and 174 deletions

110
README.md

@@ -2,52 +2,92 @@
Bidirectional sync between Docmost articles and a local Markdown git vault — the
git repository is the state store. For the full design and the phased
implementation plan, see [`SPEC.md`](./SPEC.md).
implementation plan, see [`SPEC.md`](./SPEC.md) (the authoritative spec).
> **Status: scaffold only — the sync engine is not implemented yet.**
> `src/index.ts` validates configuration and exits. The engine described in
> `SPEC.md` is out of scope for this scaffold.
> **Status: Increment 1 — monorepo scaffold + read-only `pull` + Phase-0
> round-trip harness.** Continuous two-way sync is not implemented yet; see the
> phased plan in `SPEC.md`.
It reuses the sibling project **docmost-mcp** as a library (DocmostClient,
ProseMirror ↔ Markdown converter, collab-write).
It reuses the sibling project **docmost-mcp** as a library: the `DocmostClient`
REST client and the lossless ProseMirror ↔ Markdown converter are extracted into
this monorepo (so changes can be backported file-by-file).
## Layout
This is an npm-workspaces monorepo:
- **`packages/docmost-client`** (`docmost-client`) — the Docmost REST client and
its `lib/` (converter, markdown-document, collaboration, …). Its source layout
mirrors `docmost-mcp/src/` 1:1 so diffs can be backported by copying files.
Sync-specific REST methods are added under clearly marked `docmost-sync
additions` banners.
- **the repo ROOT** — the sync engine app (`src/`, `test/`, `build/`, `data/`).
It depends on `docmost-client` and holds the config (`src/settings.ts`),
filename sanitization (`src/sanitize.ts`), the Phase-0 round-trip idempotency
harness (`src/roundtrip.ts`), and the read-only `pull` (`src/pull.ts`).
## Install & build
Requires Node >= 20.
```sh
npm install # links the workspace packages
npm run build # builds docmost-client, then compiles the app into build/
```
`docmost-client` must build before the app (the app consumes its built output);
the root `build` script builds the lib first, then runs `tsc`.
## Configuration
All config comes from ENV / `.env` (see [`.env.example`](./.env.example)), read
through the single settings layer in `src/settings.ts`. A missing required
variable fails at startup with a clear message that names it.
Copy [`.env.example`](./.env.example) to `.env` and fill in real values. The
config is read through [`src/settings.ts`](./src/settings.ts).
| Variable | Required | Default | Meaning |
| ------------------ | :------: | ------------ | -------------------------------------------------------------- |
| `DOCMOST_API_URL` | yes | — | Base URL of our Docmost instance (used for `/auth/login`). |
| `DOCMOST_EMAIL` | yes | — | Docmost login email. |
| `DOCMOST_PASSWORD` | yes | — | Docmost login password. |
| `DOCMOST_SPACE_ID` | yes | — | The Docmost space to mirror. |
| `VAULT_PATH` | no | `data/vault` | Local git vault path (kept under `data/` for the volume). |
| `GIT_REMOTE` | no | _(unset)_ | Optional git remote the vault pushes to; empty = local-only. |
| `POLL_INTERVAL_MS` | no | `15000` | How often to poll Docmost for changes (ms). |
| `DEBOUNCE_MS` | no | `2000` | Debounce window for local file changes (ms). |
| `LOG_LEVEL` | no | `info` | One of `debug`, `info`, `warn`, `error`. |
| Variable | Required | Meaning |
| ------------------- | :------: | -------------------------------------------------------- |
| `DOCMOST_API_URL` | yes | Base URL of our Docmost instance. |
| `DOCMOST_EMAIL` | yes | Docmost service-user login email. |
| `DOCMOST_PASSWORD` | yes | Docmost service-user login password. |
| `DOCMOST_SPACE_ID` | yes | Which Docmost space to mirror. |
| `VAULT_PATH` | no | Local vault directory (default `data/vault`). |
| `GIT_REMOTE` | no | Optional git remote the vault pushes to. |
| `POLL_INTERVAL_MS` | no | Poll interval in ms (default `15000`). |
| `DEBOUNCE_MS` | no | Debounce window in ms (default `2000`). |
| `LOG_LEVEL` | no | `debug` \| `info` \| `warn` \| `error` (default `info`). |
Credentials and the address of our own Docmost instance have NO default — they
go ONLY into `.env`, never into code or inline command-line env vars.
**Real secrets go in `.env`, which is git-ignored — never commit them.** The
git remote grants access to the whole vault, so protect it no less than Docmost
itself (SPEC §12).
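The fail-fast behaviour described above (required variables have no default, and a missing one is named in the error) could look roughly like the sketch below. This is not the actual `src/settings.ts`: the `Settings` shape, `loadSettings`, and the helper names are assumptions made for illustration. Only the variable names and defaults come from the table.

```typescript
// Hypothetical sketch of a single settings layer; the real src/settings.ts may differ.
type LogLevel = "debug" | "info" | "warn" | "error";
type Env = Record<string, string | undefined>;

interface Settings {
  docmostApiUrl: string;
  docmostEmail: string;
  docmostPassword: string;
  docmostSpaceId: string;
  vaultPath: string;
  gitRemote: string | undefined;
  pollIntervalMs: number;
  debounceMs: number;
  logLevel: LogLevel;
}

// Required values have no default: throw an error that names the variable.
function required(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Optional millisecond values fall back to the documented default.
function positiveInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return parsed;
}

const LOG_LEVELS: readonly LogLevel[] = ["debug", "info", "warn", "error"];

function parseLogLevel(env: Env): LogLevel {
  const raw = env["LOG_LEVEL"]?.trim() || "info";
  const match = LOG_LEVELS.find((level) => level === raw);
  if (match === undefined) {
    throw new Error(`LOG_LEVEL must be one of ${LOG_LEVELS.join(", ")}, got "${raw}"`);
  }
  return match;
}

function loadSettings(env: Env): Settings {
  return {
    docmostApiUrl: required(env, "DOCMOST_API_URL"),
    docmostEmail: required(env, "DOCMOST_EMAIL"),
    docmostPassword: required(env, "DOCMOST_PASSWORD"),
    docmostSpaceId: required(env, "DOCMOST_SPACE_ID"),
    vaultPath: env["VAULT_PATH"]?.trim() || "data/vault",
    // An empty GIT_REMOTE means the vault stays local-only.
    gitRemote: env["GIT_REMOTE"]?.trim() || undefined,
    pollIntervalMs: positiveInt(env, "POLL_INTERVAL_MS", 15000),
    debounceMs: positiveInt(env, "DEBOUNCE_MS", 2000),
    logLevel: parseLogLevel(env),
  };
}
```

In the app this would presumably be called once at startup as `loadSettings(process.env)`. Taking the environment as a parameter keeps the function testable without touching the real process environment.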
## Quick start
## Running
### Round-trip idempotency harness (Phase 0, SPEC §11)
Verifies that `export → import → export` is byte-stable. Runs offline against a
fixture (the default for CI) — **no Docmost credentials needed**:
```sh
make install # install dependencies (npm ci)
make env # create .env from .env.example, then fill it in
make test # run the test suite (vitest)
make run # build and run
make dev # run in watch mode (tsx)
npm run build
node build/roundtrip.js --fixture test/fixtures/sample-doc.json
```
`make` (or `make help`) lists all targets.
Or against a live page (needs `.env`):
## Deploy
```sh
node build/roundtrip.js --page <pageId>
```
Production runs a prebuilt image from `ghcr.io` (no build on prod):
`docker-compose.yml` pulls `ghcr.io/vvzvlad/docmost-sync:latest`, mounts a
volume at `/app/data`, and [watchtower](https://containrrr.dev/watchtower/)
auto-updates the container when a new image is published. CI (GitHub Actions)
builds and pushes the image; the `build` job runs only after `test` passes.
Exit code is 0 when the markdown is byte-stable, 1 on a markdown divergence
(CI-able). A document-level divergence after stripping block ids is a known
SPEC §11 finding and does not fail the run.
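The exit-code rule above can be summarised in a small sketch. It is not the code in `src/roundtrip.ts`: the `Converter` interface, `checkRoundTrip`, and the `normalize` callback (standing in for block-id stripping) are hypothetical names, and the real converter lives in `docmost-client`.

```typescript
// Hypothetical converter shape; the real one is provided by docmost-client.
interface Converter<Doc> {
  toMarkdown(doc: Doc): string;
  fromMarkdown(markdown: string): Doc;
}

interface RoundTripResult {
  markdownStable: boolean; // export -> import -> export yields identical Markdown
  documentStable: boolean; // documents match after normalization (e.g. ids stripped)
  exitCode: 0 | 1;
}

function checkRoundTrip<Doc>(
  doc: Doc,
  converter: Converter<Doc>,
  normalize: (doc: Doc) => unknown,
): RoundTripResult {
  const firstExport = converter.toMarkdown(doc);
  const reimported = converter.fromMarkdown(firstExport);
  const secondExport = converter.toMarkdown(reimported);

  // Equal strings encode to identical UTF-8 bytes, so string equality is
  // enough to establish byte stability here.
  const markdownStable = firstExport === secondExport;

  // Structural comparison via JSON; note that this is sensitive to key order.
  const documentStable =
    JSON.stringify(normalize(doc)) === JSON.stringify(normalize(reimported));

  // Only a Markdown divergence fails the run; a document-level divergence
  // is reported but does not change the exit code.
  return { markdownStable, documentStable, exitCode: markdownStable ? 0 : 1 };
}
```

A caller would report both flags and then exit with `exitCode`, which keeps the known document-level divergence visible in CI logs without turning the job red.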
### Pull (Docmost → filesystem mirror, SPEC §6)
Read-only mirror: walks the configured space's page tree and writes one `.md`
per page under `<VAULT_PATH>/<…ancestors>/<Title>.md`. **Requires a `.env` with
real Docmost credentials**: it makes live REST calls but never modifies Docmost
state (this increment is read-only):
```sh
npm run pull
```
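As a rough illustration of the `<…ancestors>/<Title>.md` layout, the sketch below derives a vault-relative path from a flat page list. Everything in it is an assumption: the `PageNode` shape, `pagePath`, and the replacement rules in `sanitizeSegment` are hypothetical and are not the SPEC §12 rules implemented in `src/sanitize.ts`.

```typescript
// Hypothetical page shape; the real REST payload may use different field names.
interface PageNode {
  id: string;
  title: string;
  parentPageId: string | null;
}

// Assumed rules: replace separators, reserved and control characters, and
// never return a segment made only of dots ("." or ".."), which would
// alias or escape a directory.
function sanitizeSegment(title: string): string {
  const cleaned = title.replace(/[\/\\:*?"<>|\u0000-\u001f]/g, "_").trim();
  if (cleaned === "" || /^\.+$/.test(cleaned)) return "_";
  return cleaned;
}

// Walks from the page up to the root, prepending one segment per ancestor.
function pagePath(page: PageNode, byId: Map<string, PageNode>): string {
  const segments: string[] = [];
  const seen = new Set<string>();
  let current: PageNode | undefined = page;
  while (current !== undefined) {
    if (seen.has(current.id)) {
      throw new Error(`Cycle in page tree at ${current.id}`);
    }
    seen.add(current.id);
    segments.unshift(sanitizeSegment(current.title));
    current =
      current.parentPageId === null ? undefined : byId.get(current.parentPageId);
  }
  // Relative to VAULT_PATH; "/" is used as the separator for simplicity.
  return segments.join("/") + ".md";
}
```

Because each title becomes exactly one segment with separators replaced, the joined path cannot climb out of the vault. The sketch does not handle two sibling pages whose titles sanitize to the same name.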