portainer

Author	SHA1	Message	Date
agent_coder	492d3d01b0	feat(#19): separate webhook per automation mechanism (update vs heal) Split the single container-automation webhook URL into two independently optional URLs: UpdateWebhookURL (fired on update/rollback/update-failed) and HealWebhookURL (fired on auto-heal restart). The notifier routes each event to its mechanism's URL by kind; an empty URL silences only that mechanism, so a user can enable notifications for updates without heal (or vice-versa). Settings gain both fields (each validated http/https, {{message}} allowed), the NotificationPanel exposes two labeled inputs, and the golden migration output is updated. Delivery path (goroutine/recover/timeout, {{message}} GET vs POST, per-container stack message format) is unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-01 22:47:25 +03:00
agent_coder	eb35e9c47f	feat(automation): configurable webhook notifier for automation events Add an opt-in webhook notification for container-automation events (image update, rollback, update-failed, auto-heal restart), plugging into the existing Notifier seam in notify.go. - Settings: new ContainerAutomation.Notification.WebhookURL (shared across update + heal), persisted and validated in the settings update handler (optional; http/https only; accepts the {{message}} placeholder). - webhookNotifier reads the current URL from the datastore per event (UI changes take effect without a restart). If the URL contains {{message}} it substitutes the URL-encoded message and issues a GET; otherwise it POSTs the message as the body. Delivery, the env/stack name lookups, and any panic run in a goroutine under recover() with a 10s timeout: strictly best-effort, never blocks or crashes the automation daemon. multiNotifier fans events to logNotifier + webhook and isolates a panic in any one notifier. - Message format (maintainer's spec): Environment | <env> Stack [<name>] (Container [<name>] for non-stack events) Update [<name>]: <old> -> <new> Auto-heal: 'Auto-heal: restarted unhealthy container'. - New NotificationPanel in settings to configure the URL. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-01 19:31:18 +03:00
claude code agent	be3bfd0513	fix(automation): maintainer pre-merge review — stale detection, daemon edge cases, parity (F1-F9) F1: cap the image-status cache TTL at 5m (was 24h) — the cache is keyed by the LOCAL imageID, which doesn't change when upstream pushes a new image under the same tag, so the 24h TTL hid new images from both the badge and the auto-update daemon; a short TTL re-resolves the remote digest within the poll window. F2: document that the update->rollback guard map is in-memory (restart implication). F3: skip auto-update for an unnamed container when rollback is on (the endpoint+name keyed guard can't record it, so it would loop) — pure skipUnnamedForRollback + test. F4: wrap the pre-update ContainerInspect in context.WithTimeout(endpointTimeout). F5: document Reload() does not interrupt an in-flight tick. F6: floor auto-heal CheckInterval at 1s (mirrors auto-update) + test. F7: wontfix — migration is currently correct; namespace rework is out of scope. F8: correct the misleading SSRF/AllowList comment (no filter is applied). F9: front auto-heal interval floor + test; dedup STALE_TIME; fix invalidation comment. Also refresh three stale '24h/long-lived cache' comments to match the 5m TTL. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 19:51:15 +03:00
claude code agent	cdf17d904d	fix(automation): rollback robustness — transient inspect, start_period, digest images, shutdown, event order (#12 review) F1: tolerate up to 3 consecutive health-gate inspect failures (reset on success) before declaring an update failed, so a transient Docker API blip no longer triggers a false rollback. F2: detect baseCtx cancellation during the gate and abort without rolling back or emitting update-failed (debug log only), instead of a misleading "rollback failed" event on every shutdown mid-gate. F3: derive the gate deadline as start + max(RollbackTimeout, StartPeriod+buffer) via effectiveRollbackDeadline, reading the container's healthcheck StartPeriod so a legitimately slow-starting container is not rolled back while starting. F4: only enable the gate when the original reference is a proper tag (new isTagReference helper); skip with a log line for digest-pinned / bare-image-id containers that cannot be re-tagged. F5: document the sequential-tick delay limitation of the gate poll. F6: emit EventUpdated only after the gate confirms healthy (or immediately when no gate is active); the rollback path emits only EventRollback, so the event sequence is truthful. F7: floor RollbackTimeout at 10s in backend and frontend validation. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 10:57:54 +03:00
claude code agent	32a2b7a9ae	feat(automation): health-gated rollback + per-endpoint + notify hook (#12, epic #3 M5) P0 Health-gated rollback (standalone auto-update path): capture the previous image id + reference + healthcheck before the recreate, then poll the new container's health over a configurable window. On healthy proceed (and only then clean up the old image); on unhealthy/exit/timeout re-tag the old image back onto the original reference and Recreate (no pull) to restore it, reusing Recreate's config preservation. The decision is a pure decideRollback() helper. P1 Per-endpoint enable: ContainerAutomationDisabled flag on Endpoint (zero value participates, no migration churn), checked by both daemons; settable via the endpoint update API. UI control deferred (see report). P2 Notifier seam: minimal Notifier interface + logNotifier, emitting structured updated/rollback/update-failed/heal-restarted events from the daemon. Settings: RollbackOnFailure + RollbackTimeout (default 120s) added to ContainerAutomation.AutoUpdate, wired through defaults/migration/golden, settings_update validation, the AutoUpdatePanel and the TS types. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 10:57:54 +03:00
claude code agent	21b5ec3e05	fix(automation): git-stack honesty + ECR registry refresh + interval floor (#11 review) F1: Stop routing git-backed stacks through a per-tick RedeployWhenChanged for image-only updates. The git redeploy path short-circuits when the commit is unchanged (so an upstream-digest update never applies) yet still git-fetches every tick. Git stacks are now detect-only in the auto-apply path; their image update lands on the next git change or via manual "Update now". File (non-git) stacks still force-pull-redeploy immediately. The AutoUpdatePanel text no longer promises daemon auto-update for git/externally-managed containers. F2: Resolve registries for the file-stack redeploy the same way the established userless/system path (RedeployWhenChanged) does, via the new deployments.ResolveStackRegistries: scope to the stack author's endpoint access and RefreshAndPersistECRTokens, instead of hand-passing Registry().ReadAll(). ECR-backed stacks now auto-update with fresh tokens. F3: Add a 1m floor for the auto-update poll interval, enforced in the settings Validate and mirrored in the frontend validation. F4: Thread the application shutdownCtx into NewService and use it as the base for the heal/update job operation contexts, so shutdown cancels in-flight work. F5: Correct the updateEndpoint comment about monitor-only badge-cache warming (only in-scope monitor-only containers are status-checked). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 10:24:58 +03:00

6 Commits
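
The webhook behaviour described in eb35e9c47f and 492d3d01b0 (route each event to its mechanism's URL, then GET with a substituted {{message}} or POST the message as the body) can be sketched roughly as below. The EventKind type, its constants, and the functions webhookURLFor and buildRequest are hypothetical names chosen for this example; the sketch also assumes the placeholder sits in the query string, since url.QueryEscape encodes spaces as "+".

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// EventKind is a hypothetical type for the event kinds named in the log.
type EventKind string

const (
	EventUpdated      EventKind = "updated"
	EventRollback     EventKind = "rollback"
	EventUpdateFailed EventKind = "update-failed"
	EventHealRestart  EventKind = "heal-restarted"
)

const placeholder = "{{message}}"

// webhookURLFor picks the URL for an event kind. An empty result means
// notifications for that mechanism are silenced.
func webhookURLFor(kind EventKind, updateURL, healURL string) string {
	if kind == EventHealRestart {
		return healURL
	}
	return updateURL
}

// buildRequest issues a GET with the URL-encoded message substituted for the
// placeholder, or a POST with the message as the body when no placeholder exists.
func buildRequest(rawURL, message string) (*http.Request, error) {
	if strings.Contains(rawURL, placeholder) {
		target := strings.ReplaceAll(rawURL, placeholder, url.QueryEscape(message))
		return http.NewRequest(http.MethodGet, target, nil)
	}
	return http.NewRequest(http.MethodPost, rawURL, strings.NewReader(message))
}

func main() {
	updateURL := "https://example.com/hook?text={{message}}" // illustrative URL
	healURL := ""                                            // heal notifications disabled

	for _, kind := range []EventKind{EventUpdated, EventHealRestart} {
		target := webhookURLFor(kind, updateURL, healURL)
		if target == "" {
			fmt.Println(kind, "-> silenced")
			continue
		}
		req, err := buildRequest(target, "Update [web]: 1.0 -> 1.1")
		if err != nil {
			fmt.Println(kind, "-> error:", err)
			continue
		}
		fmt.Println(kind, "->", req.Method, req.URL.String())
	}
}
```

The sketch only builds the request; per the log, actual delivery runs in a goroutine under recover() with a 10s timeout so that a failing webhook never blocks the automation daemon.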