2 Commits

Author SHA1 Message Date
claude code agent b233f75ab7 fix(automation): retry-state retention + running-only heal + per-restart ctx (#8 F1-F6)
F1: prune retry-state by elapsed window since lastRestart instead of "not
seen this tick", so a container flapping through "starting" keeps its
cooldown/max-retries accounting (storm guard no longer defeated). Recovered
containers quiet for > window are still cleaned up.
F2: list running containers only (All:false) so stopped-unhealthy containers
are never revived.
F3: each ContainerRestart gets its own context (stop-timeout + buffer),
separate from the per-endpoint list context, so a slow/hung restart cannot
starve the others or exhaust a single shared deadline.
F4: start() is idempotent (no-op when a job is already scheduled); Reload
still stops first so it always reschedules.
F5: frontend parseBool mirrors Go strconv.ParseBool (case-insensitive
1/t/true; present-but-invalid counts as present & false).
F6: tests TestPruneRetries and TestRetryStateSurvivesStartingTick lock in
the F1 behavior; added AutoHealRow parse cases.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 08:39:17 +03:00
claude code agent 51957d2f98 feat(automation): native auto-heal daemon (#8, epic #3 M1)
Add a native, CE-only auto-heal daemon that restarts Docker containers whose
healthcheck reports "unhealthy", replacing the willfarrell/autoheal sidecar.

Backend:
- New package api/containerautomation (service lifecycle + scheduler job,
  per-endpoint heal pass, label/scope parsing, in-memory cooldown/retry state).
- Settings.ContainerAutomation.AutoHeal {Enabled, CheckInterval, Scope} with
  fresh-install defaults and a 2.43.0 migration backfilling existing installs.
- Settings update handler reloads/stops the job via a small Reloader interface
  (no import cycle); service bootstrapped from main.go after stack schedules.

Frontend:
- Global AutoHealPanel in SettingsView (enable / interval / scope) via
  useUpdateSettingsMutation, plus settings TS types.
- Read-only per-container Auto-heal row in the container details view (Docker
  labels are immutable at runtime; opt-in is set via Create/Edit form labels).

Tests: Go unit tests for label/scope resolution and the cooldown/retry decision;
vitest for the panel and the per-container row.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 08:22:46 +03:00