feat: graceful SIGTERM/SIGINT shutdown for watch mode and one-shot CLI (#794)

* feat: graceful SIGTERM/SIGINT shutdown for watch mode and one-shot CLI

Previously SIGTERM (systemctl stop, docker stop, Kubernetes pod termination)
killed parsedmarc mid-batch, tearing output writes and silently dropping
buffered Kafka records. Shutdown is now cooperative:

- SIGTERM/SIGINT set a flag that is polled at safe boundaries. The one-shot
  CLI checks it between batches; watch mode passes it as `config_reloading` so
  the mailbox backend -- including the IMAP IDLE loop -- returns once the
  current batch is fully processed. Either way the in-flight batch and its
  output writes finish before the process exits 0.
- Ctrl-C is a double-tap: the first press is graceful, the second
  short-circuits to os._exit(130).
- Output clients are now closed on every exit path (atexit plus a trailing
  close in _main), fixing a long-standing leak where one-shot runs and
  graceful shutdowns never flushed Kafka / closed Elasticsearch / S3 / etc.
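
The three behaviours above can be sketched roughly as follows. This is an
illustrative outline only, not parsedmarc's actual code: `shutdown_requested`,
`make_handler`, `install_handlers`, and `process_batches` are hypothetical
names, and the sketch assumes that only a repeated SIGINT (not SIGTERM followed
by SIGINT) triggers the forced exit.

```python
import os
import signal
import threading

# Hypothetical flag, polled at safe boundaries (between batches).
shutdown_requested = threading.Event()


def make_handler(force_exit=os._exit):
    """Build a signal handler; force_exit is injectable so it can be tested."""
    sigint_seen = False

    def handler(signum, frame):
        nonlocal sigint_seen
        if signum == signal.SIGINT:
            if sigint_seen:
                # Second Ctrl-C: skip cleanup and quit immediately.
                force_exit(130)
                return
            sigint_seen = True
        # First SIGINT, or any SIGTERM: only request a cooperative stop.
        shutdown_requested.set()

    return handler


def install_handlers():
    """Register the same handler for both signals (call from the main thread)."""
    handler = make_handler()
    signal.signal(signal.SIGTERM, handler)
    signal.signal(signal.SIGINT, handler)


def process_batches(batches, process, close_outputs):
    """Process batches until done or until a shutdown has been requested."""
    done = []
    try:
        for batch in batches:
            # The flag is checked only between batches, never mid-batch.
            if shutdown_requested.is_set():
                break
            process(batch)
            done.append(batch)
    finally:
        # Output clients are closed on every exit path.
        close_outputs()
    return done
```

Because the handler only sets a flag, a batch that is already running always
finishes its writes; the cost is that shutdown latency is bounded by the
longest batch rather than being immediate.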

Docs: the example systemd unit gains KillSignal=SIGTERM and TimeoutStopSec=60
(keep it above mailbox_check_timeout). Tests cover watch shutdown, the one-shot
between-batch stop, the SIGINT double-tap, and the output-client-close leak.

* test: cover the one-shot mbox-loop shutdown break

Extend the one-shot SIGTERM test to also pass an .mbox path so a single
run exercises both shutdown checkpoints: the file-batch loop break and the
subsequent mbox loop break (which Codecov flagged as the only uncovered
lines on PR #794). is_mbox is keyed by suffix and get_dmarc_reports_from_mbox
is asserted not called, since the mbox loop breaks before reaching it.

* test: narrow signal.getsignal() return before invoking in SIGINT test

signal.getsignal() is typed Callable | int | Handlers | None; calling it
directly fails pyright's callable check. Assert callable() first.
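
A minimal sketch of that narrowing pattern, using a hypothetical helper name
(`invoke_current_handler`) rather than the actual test code:

```python
import signal


def invoke_current_handler(signum):
    """Call whatever Python-level handler is currently installed for signum."""
    handler = signal.getsignal(signum)
    # getsignal() may return SIG_DFL, SIG_IGN, or None, none of which is
    # callable. The assert narrows the type for the checker and fails
    # loudly if no Python handler is installed.
    assert callable(handler)
    handler(signum, None)
```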

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Sean Whalen <44679+seanthegeek@users.noreply.github.com>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Casper Biering
2026-06-13 02:00:32 +02:00
committed by GitHub
parent b869235224
commit d3510da3a6
5 changed files with 350 additions and 17 deletions
@@ -818,6 +818,8 @@ After=network.target network-online.target elasticsearch.service
[Service]
ExecStart=/opt/parsedmarc/venv/bin/parsedmarc -c /etc/parsedmarc.ini
ExecReload=/bin/kill -HUP $MAINPID
KillSignal=SIGTERM
TimeoutStopSec=60
User=parsedmarc
Group=parsedmarc
Restart=always
@@ -850,6 +852,17 @@ sudo service parsedmarc restart
:::
:::{note}
On `systemctl stop`/`restart` (or Ctrl-C) `parsedmarc` finishes the
current batch, flushes its outputs, and exits cleanly. Shutdown is
observed at batch boundaries, so the worst-case delay is roughly
`mailbox_check_timeout` (default 30s) plus the batch's processing and
flush time. Keep `TimeoutStopSec` comfortably above
`mailbox_check_timeout` (≈2×, and raise both together) or systemd will
`SIGKILL` mid-batch. In the foreground, a second Ctrl-C force-quits
immediately, skipping the output flush.
:::
### Reloading configuration without restarting
When running in watch mode, `parsedmarc` supports reloading its