Split the single container-automation webhook URL into two independently
optional URLs — UpdateWebhookURL (fired on update/rollback/update-failed) and
HealWebhookURL (fired on auto-heal restart). The notifier routes each event to
its mechanism's URL by kind; an empty URL silences only that mechanism, so a
user can enable notifications for updates without heal (or vice-versa).
Settings gain both fields (each validated http/https, {{message}} allowed), the
NotificationPanel exposes two labeled inputs, and the golden migration output is
updated. Delivery path (goroutine/recover/timeout, {{message}} GET vs POST,
per-container stack message format) is unchanged.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add an opt-in webhook notification for container-automation events (image
update, rollback, update-failed, auto-heal restart), plugging into the existing
Notifier seam in notify.go.
- Settings: new ContainerAutomation.Notification.WebhookURL (shared across
update + heal), persisted and validated in the settings update handler
(optional; http/https only; accepts the {{message}} placeholder).
- webhookNotifier reads the current URL from the datastore per event (UI changes
take effect without a restart). If the URL contains {{message}} it substitutes
the URL-encoded message and issues a GET; otherwise it POSTs the message as the
body. Delivery, the env/stack name lookups, and any panic run in a goroutine
under recover() with a 10s timeout — strictly best-effort, never blocks or
crashes the automation daemon. multiNotifier fans events to logNotifier +
webhook and isolates a panic in any one notifier.
- Message format (maintainer's spec):
Environment | <env>
Stack [<name>] (Container [<name>] for non-stack events)
Update [<name>]: <old> -> <new>
Auto-heal: 'Auto-heal: restarted unhealthy container'.
- New NotificationPanel in settings to configure the URL.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
F1: cap the image-status cache TTL at 5m (was 24h) — the cache is keyed by the
LOCAL imageID, which doesn't change when upstream pushes a new image under the
same tag, so the 24h TTL hid new images from both the badge and the auto-update
daemon; a short TTL re-resolves the remote digest within the poll window.
F2: document that the update->rollback guard map is in-memory (restart implication).
F3: skip auto-update for an unnamed container when rollback is on (the endpoint+name
keyed guard can't record it, so it would loop) — pure skipUnnamedForRollback + test.
F4: wrap the pre-update ContainerInspect in context.WithTimeout(endpointTimeout).
F5: document Reload() does not interrupt an in-flight tick.
F6: floor auto-heal CheckInterval at 1s (mirrors auto-update) + test.
F7: wontfix — migration is currently correct; namespace rework is out of scope.
F8: correct the misleading SSRF/AllowList comment (no filter is applied).
F9: front auto-heal interval floor + test; dedup STALE_TIME; fix invalidation comment.
Also refresh three stale '24h/long-lived cache' comments to match the 5m TTL.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
F1: record rolled-back targets per service (endpointID/containerName + remote
digest) and skip auto-update during a 24h cooldown unless the remote digest
changes — breaks the infinite update→rollback loop on a persistently
unhealthy image, without blocking a genuinely new image.
F2: unit-test applyContainerUpdate dispatch/payload mapping.
F3: settings_update.go comments mention auto-heal AND auto-update.
F4: drop stale '(future M4)' TS docs; primitives are frontend-only.
F5: replace the anonymous ContainerAutomation settings struct with named
types (identical JSON tags).
F6: drop parseEnable (duplicate of boolLabel).
F7: remove the unused gitService dependency.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
F1: tolerate up to 3 consecutive health-gate inspect failures (reset on
success) before declaring an update failed, so a transient Docker API blip no
longer triggers a false rollback.
F2: detect baseCtx cancellation during the gate and abort without rolling back
or emitting update-failed (debug log only), instead of a misleading
"rollback failed" event on every shutdown mid-gate.
F3: derive the gate deadline as start + max(RollbackTimeout, StartPeriod+buffer)
via effectiveRollbackDeadline, reading the container's healthcheck StartPeriod
so a legitimately slow-starting container is not rolled back while starting.
F4: only enable the gate when the original reference is a proper tag (new
isTagReference helper); skip with a log line for digest-pinned / bare-image-id
containers that cannot be re-tagged.
F5: document the sequential-tick delay limitation of the gate poll.
F6: emit EventUpdated only after the gate confirms healthy (or immediately when
no gate is active); the rollback path emits only EventRollback, so the event
sequence is truthful.
F7: floor RollbackTimeout at 10s in backend and frontend validation.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
P0 Health-gated rollback (standalone auto-update path): capture the previous
image id + reference + healthcheck before the recreate, then poll the new
container's health over a configurable window. On healthy proceed (and only
then clean up the old image); on unhealthy/exit/timeout re-tag the old image
back onto the original reference and Recreate (no pull) to restore it, reusing
Recreate's config preservation. The decision is a pure decideRollback() helper.
P1 Per-endpoint enable: ContainerAutomationDisabled flag on Endpoint (zero value
participates, no migration churn), checked by both daemons; settable via the
endpoint update API. UI control deferred (see report).
P2 Notifier seam: minimal Notifier interface + logNotifier, emitting structured
updated/rollback/update-failed/heal-restarted events from the daemon.
Settings: RollbackOnFailure + RollbackTimeout (default 120s) added to
ContainerAutomation.AutoUpdate, wired through defaults/migration/golden,
settings_update validation, the AutoUpdatePanel and the TS types.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
F1: Stop routing git-backed stacks through a per-tick RedeployWhenChanged for
image-only updates. The git redeploy path short-circuits when the commit is
unchanged (so an upstream-digest update never applies) yet still git-fetches
every tick. Git stacks are now detect-only in the auto-apply path; their image
update lands on the next git change or via manual "Update now". File (non-git)
stacks still force-pull-redeploy immediately. The AutoUpdatePanel text no longer
promises daemon auto-update for git/externally-managed containers.
F2: Resolve registries for the file-stack redeploy the same way the established
userless/system path (RedeployWhenChanged) does, via the new
deployments.ResolveStackRegistries: scope to the stack author's endpoint access
and RefreshAndPersistECRTokens, instead of hand-passing Registry().ReadAll().
ECR-backed stacks now auto-update with fresh tokens.
F3: Add a 1m floor for the auto-update poll interval, enforced in the settings
Validate and mirrored in the frontend validation.
F4: Thread the application shutdownCtx into NewService and use it as the base
for the heal/update job operation contexts, so shutdown cancels in-flight work.
F5: Correct the updateEndpoint comment about monitor-only badge-cache warming
(only in-scope monitor-only containers are status-checked).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add an optional periodic auto-update daemon that detects outdated container
images and applies updates, replacing the containrrr/watchtower sidecar. It
extends M1's containerautomation service/scheduler/labels infrastructure and
reuses the existing zlib image-detection engine, the standalone Recreate path
and the stack deployer.
Backend:
- api/containerautomation/autoupdate.go: scheduler job iterating Docker
(non-edge) endpoints -> in-scope running containers -> ContainerImageStatus;
for Outdated: standalone -> ContainerService.Recreate(pull); stack-managed ->
one stack redeploy-with-pull per stack per tick (git via RedeployWhenChanged,
file via the deployer directly); external compose -> detect only. Monitor-only
containers are status-checked (warms the badge cache) but never applied.
Overlap guard (atomic), pull/registry-auth failure -> leave running container
untouched, conservative cleanup of the dangling old image on the Cleanup flag
(non-forced ImageRemove only succeeds when truly unused).
- labels.go: update enable / monitor-only labels with watchtower aliases,
InUpdateScope, IsMonitorOnly, and pure resolveContainerUpdateRouting /
groupContainersForUpdate (Go analogue of M3's TS routing + grouping).
- service.go: run both jobs, Reload restarts/stops each per settings; NewService
also takes ContainerService, StackDeployer and GitService.
- Settings.ContainerAutomation.AutoUpdate {Enabled, PollInterval, Scope,
Cleanup} with fresh-install defaults and a 2.43.0 backfill (extends M1's
migration; golden test data updated). settings handler validates + reloads.
Frontend:
- Global AutoUpdatePanel in SettingsView (enable / poll interval / scope /
cleanup) via useUpdateSettingsMutation, plus settings TS types.
- Read-only per-container Auto-update row in the container details view
(Docker labels are immutable at runtime), surfacing monitor-only.
Tests: Go unit tests for the update label aliases, scope, monitor-only, the
routing decision and the one-redeploy-per-stack grouping; vitest for the panel
and the per-container row.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a native, CE-only auto-heal daemon that restarts Docker containers whose
healthcheck reports "unhealthy", replacing the willfarrell/autoheal sidecar.
Backend:
- New package api/containerautomation (service lifecycle + scheduler job,
per-endpoint heal pass, label/scope parsing, in-memory cooldown/retry state).
- Settings.ContainerAutomation.AutoHeal {Enabled, CheckInterval, Scope} with
fresh-install defaults and a 2.43.0 migration backfilling existing installs.
- Settings update handler reloads/stops the job via a small Reloader interface
(no import cycle); service bootstrapped from main.go after stack schedules.
Frontend:
- Global AutoHealPanel in SettingsView (enable / interval / scope) via
useUpdateSettingsMutation, plus settings TS types.
- Read-only per-container Auto-heal row in the container details view (Docker
labels are immutable at runtime; opt-in is set via Create/Edit form labels).
Tests: Go unit tests for label/scope resolution and the cooldown/retry decision;
vitest for the panel and the per-container row.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(internal-auth): ability to set minimum password length [EE-3175]
* pass props to react component
* fixes + WIP slider
* fix slider updating + add styles
* remove nested ternary
* fix slider updating + add remind me later button
* add length to settings + value & onchange method
* finish my account view
* fix slider updating
* slider styles
* update style
* move slider in
* update size of slider
* allow admin to browse to authentication view
* use feather icons instead of font awesome
* feat(settings): add colors to password rules
* clean up tooltip styles
* more style changes
* styles
* fixes + use requiredLength in password field for icon logic
* simplify logic
* simplify slider logic and remove debug code
* use required length for logic to display pwd length warning
* fix slider styles
* use requiredPasswordLength to determine if password is valid
* style tooltip based on theme
* reset skips when password is changed
* misc cleanup
* reset skips when required length is changed
* fix formatting
* fix issues
* implement some suggestions
* simplify logic
* update broken test
* pick min password length from DB
* fix suggestions
* set up min password length in the DB
* fix test after migration
* fix formatting issue
* fix bug with icon
* refactored migration
* fix typo
* fixes
* fix logic
* set skips per user
* reset skips for all users on length change
Co-authored-by: Chaim Lev-Ari <chiptus@gmail.com>
Co-authored-by: Dmitry Salakhov <to@dimasalakhov.com>