portainer

Author	SHA1	Message	Date
vvzvlad	594312a777	Merge pull request 'feat(automation): native container auto-update (Watchtower-style) + auto-heal (#3)' (#19) from feat/3-auto-update into develop Reviewed-on: #19	2026-07-01 23:25:24 +03:00
agent_coder	492d3d01b0	feat(#19): separate webhook per automation mechanism (update vs heal) Split the single container-automation webhook URL into two independently optional URLs: UpdateWebhookURL (fired on update/rollback/update-failed) and HealWebhookURL (fired on auto-heal restart). The notifier routes each event to its mechanism's URL by kind; an empty URL silences only that mechanism, so a user can enable notifications for updates without heal (or vice-versa). Settings gain both fields (each validated http/https, {{message}} allowed), the NotificationPanel exposes two labeled inputs, and the golden migration output is updated. Delivery path (goroutine/recover/timeout, {{message}} GET vs POST, per-container stack message format) is unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-01 22:47:25 +03:00
agent_coder	eb35e9c47f	feat(automation): configurable webhook notifier for automation events Add an opt-in webhook notification for container-automation events (image update, rollback, update-failed, auto-heal restart), plugging into the existing Notifier seam in notify.go. - Settings: new ContainerAutomation.Notification.WebhookURL (shared across update + heal), persisted and validated in the settings update handler (optional; http/https only; accepts the {{message}} placeholder). - webhookNotifier reads the current URL from the datastore per event (UI changes take effect without a restart). If the URL contains {{message}} it substitutes the URL-encoded message and issues a GET; otherwise it POSTs the message as the body. Delivery, the env/stack name lookups, and any panic run in a goroutine under recover() with a 10s timeout: strictly best-effort, never blocks or crashes the automation daemon. multiNotifier fans events to logNotifier + webhook and isolates a panic in any one notifier. - Message format (maintainer's spec): Environment | <env> Stack [<name>] (Container [<name>] for non-stack events) Update [<name>]: <old> -> <new> Auto-heal: 'Auto-heal: restarted unhealthy container'. - New NotificationPanel in settings to configure the URL. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-01 19:31:18 +03:00
agent_coder	6aecdfbe46	feat(containers): interactive image-status badge (click to update / re-check) Make the container image-status badge actionable, matching native Portainer: - Clicking "Update available" opens the update confirm dialog and runs the existing update flow (standalone recreate-with-pull / stack redeploy), gated and disabled while in flight to avoid a double submit. The confirm+apply logic is extracted from UpdateNowButton into a shared useApplyContainerImageUpdate hook so the details button and the list badge share one implementation. - Clicking "Up to date" re-queries the registry. Because the server caches image status (statusCache 5m + remoteDigestCache 5s), a plain refetch was a no-op, so the endpoint gains an optional ?force=true that bypasses BOTH caches for a manual re-check while still repopulating them; the default (auto badges + the auto-update daemon) keeps using the caches unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-07-01 19:04:49 +03:00
claude code agent	7257ae52d8	test(logs): cover the docker proxy stream/flush loop (F1) Extract the manual stream-and-flush loop from dockerLocalProxy.ServeHTTP into a behaviour-preserving package-private streamResponse(w, body) helper, and add docker_test.go regression tests for the riskiest path (it runs on every Docker API response): - DeliversFullBodyAndFlushesPerChunk: a >32KB body delivered as several chunks (boundaries not aligned to the 32KB buffer), with the final Read returning (n>0, io.EOF) simultaneously, asserts the streamed body equals the input exactly (no loss/duplication) and that Flush ran more than once (the per-chunk flush is the whole point of the change). - StopsOnWriteErrorWithoutPanic: a writer that errors on first Write (and does not implement http.Flusher, exercising the nil-flusher fallback) breaks the loop after one write without panicking. No production behaviour change — the loop body is identical, only moved. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-30 02:58:48 +03:00
claude code agent	637e96f236	fix(logs): flush docker proxy stream per chunk; trim log-viewer settings UI Backend (the "logs arrive every ~5s / pipe clogged" bug): - dockerLocalProxy.ServeHTTP streamed the docker socket response via io.Copy, which buffers ~2KB into the ResponseWriter and only flushes when full or on handler return. Low-throughput streaming endpoints (container logs follow=1, events, stats, attach) therefore arrived in multi-second batches. Stream manually and Flush() after each chunk so they are delivered live. Behaviour is otherwise identical to io.Copy (full-write contract, EOF handling, Debug error logging); hijacked attach/exec go through a separate websocket handler, unaffected. - NewSingleHostReverseProxyWithHostHeader: set FlushInterval = -1 so the remote-endpoint path streams live too. Frontend (maintainer UI asks): - Remove the line-selection mechanic entirely (Copy-selected-lines and Unselect buttons, selectLine/copySelection/clearSelection, selectedLines state, line_selected highlight): selecting/copying is mouse-native. Copy (all visible) and Download stay. - Rename the unclear "Fetch" since-selector label to "Since". - Move the settings controls into the widget header (rd-widget-header default transclude slot) so they share one row with the "Log viewer settings" title, reclaiming vertical space for the log pane. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-30 02:05:02 +03:00
claude code agent	be3bfd0513	fix(automation): maintainer pre-merge review — stale detection, daemon edge cases, parity (F1-F9) F1: cap the image-status cache TTL at 5m (was 24h) — the cache is keyed by the LOCAL imageID, which doesn't change when upstream pushes a new image under the same tag, so the 24h TTL hid new images from both the badge and the auto-update daemon; a short TTL re-resolves the remote digest within the poll window. F2: document that the update->rollback guard map is in-memory (restart implication). F3: skip auto-update for an unnamed container when rollback is on (the endpoint+name keyed guard can't record it, so it would loop) — pure skipUnnamedForRollback + test. F4: wrap the pre-update ContainerInspect in context.WithTimeout(endpointTimeout). F5: document Reload() does not interrupt an in-flight tick. F6: floor auto-heal CheckInterval at 1s (mirrors auto-update) + test. F7: wontfix — migration is currently correct; namespace rework is out of scope. F8: correct the misleading SSRF/AllowList comment (no filter is applied). F9: front auto-heal interval floor + test; dedup STALE_TIME; fix invalidation comment. Also refresh three stale '24h/long-lived cache' comments to match the 5m TTL. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 19:51:15 +03:00
claude code agent	922f506fe5	feat(automation): guard update→rollback loop; name Settings types; tests & doc fixes (F1-F7) F1: record rolled-back targets per service (endpointID/containerName + remote digest) and skip auto-update during a 24h cooldown unless the remote digest changes — breaks the infinite update→rollback loop on a persistently unhealthy image, without blocking a genuinely new image. F2: unit-test applyContainerUpdate dispatch/payload mapping. F3: settings_update.go comments mention auto-heal AND auto-update. F4: drop stale '(future M4)' TS docs; primitives are frontend-only. F5: replace the anonymous ContainerAutomation settings struct with named types (identical JSON tags). F6: drop parseEnable (duplicate of boolLabel). F7: remove the unused gitService dependency. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 14:29:57 +03:00
claude code agent	70f7fe5e84	Merge remote-tracking branch 'origin/feat/10-update-now' into feat/3-auto-update	2026-06-29 12:48:24 +03:00
claude code agent	cdf17d904d	fix(automation): rollback robustness — transient inspect, start_period, digest images, shutdown, event order (#12 review) F1: tolerate up to 3 consecutive health-gate inspect failures (reset on success) before declaring an update failed, so a transient Docker API blip no longer triggers a false rollback. F2: detect baseCtx cancellation during the gate and abort without rolling back or emitting update-failed (debug log only), instead of a misleading "rollback failed" event on every shutdown mid-gate. F3: derive the gate deadline as start + max(RollbackTimeout, StartPeriod+buffer) via effectiveRollbackDeadline, reading the container's healthcheck StartPeriod so a legitimately slow-starting container is not rolled back while starting. F4: only enable the gate when the original reference is a proper tag (new isTagReference helper); skip with a log line for digest-pinned / bare-image-id containers that cannot be re-tagged. F5: document the sequential-tick delay limitation of the gate poll. F6: emit EventUpdated only after the gate confirms healthy (or immediately when no gate is active); the rollback path emits only EventRollback, so the event sequence is truthful. F7: floor RollbackTimeout at 10s in backend and frontend validation. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 10:57:54 +03:00
claude code agent	32a2b7a9ae	feat(automation): health-gated rollback + per-endpoint + notify hook (#12, epic #3 M5) P0 Health-gated rollback (standalone auto-update path): capture the previous image id + reference + healthcheck before the recreate, then poll the new container's health over a configurable window. On healthy proceed (and only then clean up the old image); on unhealthy/exit/timeout re-tag the old image back onto the original reference and Recreate (no pull) to restore it, reusing Recreate's config preservation. The decision is a pure decideRollback() helper. P1 Per-endpoint enable: ContainerAutomationDisabled flag on Endpoint (zero value participates, no migration churn), checked by both daemons; settable via the endpoint update API. UI control deferred (see report). P2 Notifier seam: minimal Notifier interface + logNotifier, emitting structured updated/rollback/update-failed/heal-restarted events from the daemon. Settings: RollbackOnFailure + RollbackTimeout (default 120s) added to ContainerAutomation.AutoUpdate, wired through defaults/migration/golden, settings_update validation, the AutoUpdatePanel and the TS types. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 10:41:55 +03:00
claude code agent	21b5ec3e05	fix(automation): git-stack honesty + ECR registry refresh + interval floor (#11 review) F1: Stop routing git-backed stacks through a per-tick RedeployWhenChanged for image-only updates. The git redeploy path short-circuits when the commit is unchanged (so an upstream-digest update never applies) yet still git-fetches every tick. Git stacks are now detect-only in the auto-apply path; their image update lands on the next git change or via manual "Update now". File (non-git) stacks still force-pull-redeploy immediately. The AutoUpdatePanel text no longer promises daemon auto-update for git/externally-managed containers. F2: Resolve registries for the file-stack redeploy the same way the established userless/system path (RedeployWhenChanged) does, via the new deployments.ResolveStackRegistries: scope to the stack author's endpoint access and RefreshAndPersistECRTokens, instead of hand-passing Registry().ReadAll(). ECR-backed stacks now auto-update with fresh tokens. F3: Add a 1m floor for the auto-update poll interval, enforced in the settings Validate and mirrored in the frontend validation. F4: Thread the application shutdownCtx into NewService and use it as the base for the heal/update job operation contexts, so shutdown cancels in-flight work. F5: Correct the updateEndpoint comment about monitor-only badge-cache warming (only in-scope monitor-only containers are status-checked). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 10:24:58 +03:00
claude code agent	b3ae5f3659	feat(automation): native auto-update daemon (#11, epic #3 M4) Add an optional periodic auto-update daemon that detects outdated container images and applies updates, replacing the containrrr/watchtower sidecar. It extends M1's containerautomation service/scheduler/labels infrastructure and reuses the existing zlib image-detection engine, the standalone Recreate path and the stack deployer. Backend: - api/containerautomation/autoupdate.go: scheduler job iterating Docker (non-edge) endpoints -> in-scope running containers -> ContainerImageStatus; for Outdated: standalone -> ContainerService.Recreate(pull); stack-managed -> one stack redeploy-with-pull per stack per tick (git via RedeployWhenChanged, file via the deployer directly); external compose -> detect only. Monitor-only containers are status-checked (warms the badge cache) but never applied. Overlap guard (atomic), pull/registry-auth failure -> leave running container untouched, conservative cleanup of the dangling old image on the Cleanup flag (non-forced ImageRemove only succeeds when truly unused). - labels.go: update enable / monitor-only labels with watchtower aliases, InUpdateScope, IsMonitorOnly, and pure resolveContainerUpdateRouting / groupContainersForUpdate (Go analogue of M3's TS routing + grouping). - service.go: run both jobs, Reload restarts/stops each per settings; NewService also takes ContainerService, StackDeployer and GitService. - Settings.ContainerAutomation.AutoUpdate {Enabled, PollInterval, Scope, Cleanup} with fresh-install defaults and a 2.43.0 backfill (extends M1's migration; golden test data updated). settings handler validates + reloads. Frontend: - Global AutoUpdatePanel in SettingsView (enable / poll interval / scope / cleanup) via useUpdateSettingsMutation, plus settings TS types. - Read-only per-container Auto-update row in the container details view (Docker labels are immutable at runtime), surfacing monitor-only. Tests: Go unit tests for the update label aliases, scope, monitor-only, the routing decision and the one-redeploy-per-stack grouping; vitest for the panel and the per-container row. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 10:04:09 +03:00
claude code agent	7eaff4dab0	fix(automation): real status cache read + nodeName key + honest errors (#9 review) F1: ContainerImageStatus now reads the 24h statusCache (keyed by imageID) before the remote registry digest lookup, so the cache is effective on the input side for all callers instead of being write-only. This avoids the rate-limited registry HEAD on repeat loads. F2: add nodeName to the imageStatus query key so cached results cannot be reused across nodes. F3: correct the swagger annotations to reflect that engine-level issues degrade to a 200 skipped/error status rather than 400/404. F4: return a generic error message to the client instead of the raw registry/engine error; the raw error is still logged server-side. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 09:09:18 +03:00
claude code agent	f69eb3f9eb	feat(automation): CE container image update detection endpoint + badge (#9, epic #3 M2) Add native CE detection of "a newer image is available" for running containers, surfaced as a read-only HTTP endpoint and a containers-list badge/column. No applying of updates (M3/M4), no auto-heal (M1). Backend: - New CE handler GET /docker/{id}/containers/{containerId}/image_status backed by the existing zlib/CE digest engine (images.NewClientWithRegistry + ContainerImageStatus). Honors nodeName, authz, and routes registry calls through the credential store / SSRF AllowList. Engine failures degrade to a 200 {Status:"error"} so the UI stays graceful. Response shape: {Status, Message?}. Frontend (CE-only, no isBE gating; the EE ImageStatus component is left untouched): - useContainerImageStatus TanStack Query hook (5min staleTime, no refetch-on-focus; backend caches 24h) calling the non-proxied endpoint. - UpdateStatusBadge component (own assets, neutral on skipped/error). - "Update available" column in the containers datatable; one cached, non-blocking query per visible row. Tests: Go response-shape unit test; vitest for the badge (all statuses) and the hook (url + nodeName query param via msw). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 08:59:54 +03:00
claude code agent	51957d2f98	feat(automation): native auto-heal daemon (#8, epic #3 M1) Add a native, CE-only auto-heal daemon that restarts Docker containers whose healthcheck reports "unhealthy", replacing the willfarrell/autoheal sidecar. Backend: - New package api/containerautomation (service lifecycle + scheduler job, per-endpoint heal pass, label/scope parsing, in-memory cooldown/retry state). - Settings.ContainerAutomation.AutoHeal {Enabled, CheckInterval, Scope} with fresh-install defaults and a 2.43.0 migration backfilling existing installs. - Settings update handler reloads/stops the job via a small Reloader interface (no import cycle); service bootstrapped from main.go after stack schedules. Frontend: - Global AutoHealPanel in SettingsView (enable / interval / scope) via useUpdateSettingsMutation, plus settings TS types. - Read-only per-container Auto-heal row in the container details view (Docker labels are immutable at runtime; opt-in is set via Create/Edit form labels). Tests: Go unit tests for label/scope resolution and the cooldown/retry decision; vitest for the panel and the per-container row. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 08:22:46 +03:00
andres-portainer	e664bf0e19	fix(helm): add missing SSRF protections BE-13136 (#3001)	2026-06-22 20:25:10 -03:00
andres-portainer	a6370808ae	fix(ssrf): disable HTTP/2 for some specific cases BE-13121 (#2996)	2026-06-22 16:13:43 -03:00
Phil Calder	f596c862b3	fix(websocket): enforce environment authorization on kubernetes-shell [BE-13027] (#2774) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: oscarzhou <oscar.zhou@portainer.io>	2026-06-22 15:09:41 +12:00
bernard-portainer	5395dee4c6	feat(gpu-stats): add gpu stats to environments [C9S-200] (#2735)	2026-06-22 09:21:43 +12:00
andres-portainer	26334e9088	feat(ssrf): add missing transport wrappings and more checks BE-13021 (#2968)	2026-06-19 20:26:03 -03:00
RHCowan	37bd8c06b5	fix(security): gate docker dashboard and edge async command routes [R8S-1057] (#2953)	2026-06-19 11:08:01 +12:00
Chaim Lev-Ari	4d539a691d	feat(custom-templates): reuse existing git sources in create/update [BE-13053] (#2925) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-16 21:45:35 +03:00
Chaim Lev-Ari	ee8e73d7f9	feat(edge/stacks): use source ID for edge stack git creation [BE-13044] (#2926) Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-16 17:33:19 +03:00
Chaim Lev-Ari	d9673e33ec	feat(helm): reuse existing git sources in Kubernetes Helm-from-git install [BE-13046] (#2900) Co-authored-by: Claude <noreply@anthropic.com>	2026-06-15 22:01:31 +03:00
andres-portainer	16b5554f66	fix(customtemplates): add resource controls BE-13019 (#2897)	2026-06-15 14:59:07 -03:00
Chaim Lev-Ari	fcdd6b4510	feat(stacks): use source id to create git stacks [BE-13043] (#2870) Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-15 18:49:26 +03:00
Devon Steenberg	8b21dfc318	feat(ssrf): add ssrf allow list to settings [BE-13021] (#2858)	2026-06-12 15:16:06 +12:00
andres-portainer	0da42c01b6	feat(gitcredential): remove GitCredential BE-12919 (#2838)	2026-06-11 18:53:24 -03:00
Steven Kang	1cd6017df6	fix(api): add endpoint authorization check to /api/kubernetes/{id}/* route - develop [R8S-1056] (#2829)	2026-06-11 09:49:50 +12:00
andres-portainer	babb4ffb37	fix(nolint): remove unnecessary nolint directives BE-13074 (#2852)	2026-06-10 15:35:08 -03:00
LP B	0c2f07988a	feat(app/sources): source create view (#2680) Co-authored-by: Chaim Lev-Ari <chaim.lev-ari@portainer.io> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-10 21:34:46 +03:00
andres-portainer	1765e41fd4	feat(ssrf): implement an SSRF protection mechanism BE-13021 (#2818)	2026-06-09 00:41:42 -03:00
andres-portainer	df7a4b5d6f	feat(gitops): improve the data model BE-12919 (#2819)	2026-06-08 15:01:55 -03:00
Josiah Clumont	e3e2a3b782	fix(environments): Environment Groups detail view environment breakdown regression [BE-13051] (#2828)	2026-06-08 16:03:32 +12:00
andres-portainer	8daf0bb2a9	feat(customtemplates): use Sources for CustomTemplates BE-12919 (#2759)	2026-06-05 01:51:18 -03:00
Chaim Lev-Ari	d2b56efcb4	feat(security): require setup token for admin init and restore [BE-13029] (#2770)	2026-06-04 09:15:23 +03:00
Hannah Cooper	916367dccb	fix(api-docs): time.Duration bounds fix + linting fixes [C9S-223] (#2762)	2026-06-04 15:14:07 +12:00
Chaim Lev-Ari	2ba8b582e2	feat(api): use generated api client [BE-12901] (#2727)	2026-06-03 14:37:39 +03:00
Chaim Lev-Ari	bc81eb7a22	feat(sources): allow user to edit source [BE-12956] (#2748)	2026-06-03 12:52:41 +03:00
Steven Kang	b233453cf7	feat(kubernetes): display cached images per node [R8S-898] (#2068)	2026-06-03 10:40:14 +12:00
Steven Kang	eb5ee3bfdb	fix(kubernetes): improve PVC deletion UX based on workload usage [R8S-1046] (#2766)	2026-06-03 09:43:07 +12:00
Steven Kang	86a84c3c6a	fix(kubernetes): updated wrong tooltip for container restart feature-gate [R8S-1037] (#2721)	2026-06-03 09:26:04 +12:00
andres-portainer	1fa756372e	feat(gitops): general improvements BE-12919 (#2780)	2026-06-02 09:44:57 -03:00
Josiah Clumont	484af3c2c8	feat(environment group) detail view update v1 [c9s-206] (#2722) Last system-test failure is also on dev	2026-06-02 16:59:18 +12:00
Devon Steenberg	742551e592	fix(registries): make gitlab proxy endpoint admin only [BE-13018] (#2764)	2026-06-02 15:45:57 +12:00
Chaim Lev-Ari	67590aa27d	feat(api): auto generate typescript definition from api docs [BE-9222] (#2468)	2026-05-31 14:51:52 +03:00
Ali	6c059c41f9	chore: bump version to 2.43.0 (#2760)	2026-05-30 16:56:17 +12:00
andres-portainer	f1db82934d	fix(security): fix a short-circuit condition that can lead to improper access control BE-13020 (#2756)	2026-05-29 20:47:59 -03:00
Hannah Cooper	28dd6b767f	fix(api-docs): API docs fixes / improvements [C9S-208] (#2717)	2026-05-29 11:33:06 +12:00

1181 Commits
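
The two webhook commits (eb35e9c47f and 492d3d01b0) describe a delivery rule: a URL containing {{message}} gets the URL-encoded message substituted and is sent as a GET, any other URL receives the message as a POST body, and each event is routed to the update or heal URL by kind, with an empty URL silencing that mechanism. The Go sketch below illustrates that rule only. It is not the repository's code: EventKind, its constants, urlForEvent and buildWebhookRequest are hypothetical names, the use of url.QueryEscape for "URL-encoded" is an assumption, and the goroutine/recover/10s-timeout delivery wrapper is left out.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// EventKind and its values are hypothetical; the commit log only names the
// four event kinds (updated, rollback, update-failed, heal-restarted).
type EventKind string

const (
	EventUpdated      EventKind = "updated"
	EventRollback     EventKind = "rollback"
	EventUpdateFailed EventKind = "update-failed"
	EventHealRestart  EventKind = "heal-restarted"
)

const messagePlaceholder = "{{message}}"

// urlForEvent routes an event to its mechanism's webhook URL:
// heal restarts go to healURL, every update-related kind to updateURL.
func urlForEvent(kind EventKind, updateURL, healURL string) string {
	if kind == EventHealRestart {
		return healURL
	}
	return updateURL
}

// buildWebhookRequest returns the request to send, or (nil, nil) when the
// URL is empty, meaning notifications for that mechanism are switched off.
func buildWebhookRequest(webhookURL, message string) (*http.Request, error) {
	if webhookURL == "" {
		return nil, nil
	}
	if strings.Contains(webhookURL, messagePlaceholder) {
		// Assumption: query-style escaping is what "URL-encoded" means here.
		target := strings.ReplaceAll(webhookURL, messagePlaceholder, url.QueryEscape(message))
		return http.NewRequest(http.MethodGet, target, nil)
	}
	// No placeholder: the message travels as the POST body.
	return http.NewRequest(http.MethodPost, webhookURL, strings.NewReader(message))
}

func main() {
	// example.invalid is a placeholder host, not a real endpoint.
	target := urlForEvent(EventHealRestart, "", "https://example.invalid/notify?text={{message}}")
	req, err := buildWebhookRequest(target, "Auto-heal: restarted unhealthy container")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Keeping the request construction separate from delivery makes the GET-versus-POST decision easy to unit-test without a network; the actual send would still need the timeout and panic isolation the commit messages describe.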