Bump version to 9.5.4, enhance Maildir folder handling, and add config key aliases for environment variable compatibility

9.5.3
- Fixed `FileNotFoundError` when using Maildir with Docker volume mounts. Python's `mailbox.Maildir(create=True)` only creates `cur/new/tmp` subdirectories when the top-level directory doesn't exist; Docker volume mounts pre-create the directory as empty, skipping subdirectory creation. parsedmarc now explicitly creates the subdirectories when `maildir_create` is enabled. - Maildir UID mismatch no longer crashes the process. In Docker containers where volume ownership differs from the container UID, parsedmarc now logs a warning instead of raising an exception. Also handles `os.setuid` failures gracefully in containers without `CAP_SETUID`. - Token file writes (MS Graph and Gmail) now create parent directories automatically, preventing `FileNotFoundError` when the token path points to a directory that doesn't yet exist. - File paths from config (`token_file`, `credentials_file`, `cert_path`, `log_file`, `output`, `ip_db_path`, `maildir_path`, syslog cert paths, etc.) now expand `~` and `$VAR` references via `os.path.expanduser`/`os.path
2026-03-26 08:22:45 +00:00 · 2026-03-25 23:22:46 -04:00 · 2026-03-25 21:29:08 -04:00 · 2026-03-25 21:13:34 -04:00 · 2026-03-25 20:21:08 -04:00 · 2026-03-25 19:43:24 -04:00
37 changed files with 5300 additions and 1028 deletions
--- a/.claude/settings.json
+++ b/.claude/settings.json
@@ -0,0 +1,17 @@
 {
  "permissions": {
    "allow": [
      "Bash(python -c \"import py_compile; py_compile.compile\\(''parsedmarc/cli.py'', doraise=True\\)\")",
      "Bash(ruff check:*)",
      "Bash(ruff format:*)",
      "Bash(GITHUB_ACTIONS=true pytest --cov tests.py)",
      "Bash(ls tests*)",
      "Bash(GITHUB_ACTIONS=true python -m pytest --cov tests.py -x)",
      "Bash(GITHUB_ACTIONS=true python -m pytest tests.py -x -v)",
      "Bash(python -m pytest tests.py --no-header -q)"
    ],
    "additionalDirectories": [
      "/tmp"
    ]
  }
 }
--- a/.github/ISSUE_TEMPLATE/bug_report.yml
+++ b/.github/ISSUE_TEMPLATE/bug_report.yml
@@ -0,0 +1,72 @@
 name: Bug report
 description: Report a reproducible parsedmarc bug
 title: "[Bug]: "
 labels:
  - bug
 body:
  - type: input
    id: version
    attributes:
      label: parsedmarc version
      description: Include the parsedmarc version or commit if known.
      placeholder: 9.x.x
    validations:
      required: true
  - type: dropdown
    id: input_backend
    attributes:
      label: Input backend
      description: Which input path or mailbox backend is involved?
      options:
        - IMAP
        - MS Graph
        - Gmail API
        - Maildir
        - mbox
        - Local file / direct parse
        - Other
    validations:
      required: true
  - type: textarea
    id: environment
    attributes:
      label: Environment
      description: Runtime, container image, OS, Python version, or deployment details.
      placeholder: Docker on Debian, Python 3.12, parsedmarc installed from PyPI
    validations:
      required: true
  - type: textarea
    id: config
    attributes:
      label: Sanitized config
      description: Include the relevant config fragment with secrets removed.
      render: ini
  - type: textarea
    id: steps
    attributes:
      label: Steps to reproduce
      description: Describe the smallest reproducible sequence you can.
      placeholder: |
        1. Configure parsedmarc with ...
        2. Run ...
        3. Observe ...
    validations:
      required: true
  - type: textarea
    id: expected_actual
    attributes:
      label: Expected vs actual behavior
      description: What did you expect, and what happened instead?
    validations:
      required: true
  - type: textarea
    id: logs
    attributes:
      label: Logs or traceback
      description: Paste sanitized logs or a traceback if available.
      render: text
  - type: textarea
    id: samples
    attributes:
      label: Sample report availability
      description: If you can share a sanitized sample report or message, note that here.
--- a/.github/ISSUE_TEMPLATE/config.yml
+++ b/.github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,5 @@
 blank_issues_enabled: true
 contact_links:
  - name: Security issue
    url: https://github.com/domainaware/parsedmarc/security/policy
    about: Please use the security policy and avoid filing public issues for undisclosed vulnerabilities.
--- a/.github/ISSUE_TEMPLATE/feature_request.yml
+++ b/.github/ISSUE_TEMPLATE/feature_request.yml
@@ -0,0 +1,30 @@
 name: Feature request
 description: Suggest a new feature or behavior change
 title: "[Feature]: "
 labels:
  - enhancement
 body:
  - type: textarea
    id: problem
    attributes:
      label: Problem statement
      description: What workflow or limitation are you trying to solve?
    validations:
      required: true
  - type: textarea
    id: proposal
    attributes:
      label: Proposed behavior
      description: Describe the feature or behavior you want.
    validations:
      required: true
  - type: textarea
    id: alternatives
    attributes:
      label: Alternatives considered
      description: Describe workarounds or alternative approaches you considered.
  - type: textarea
    id: impact
    attributes:
      label: Compatibility or operational impact
      description: Note config, output, performance, or deployment implications if relevant.
--- a/.github/pull_request_template.md
+++ b/.github/pull_request_template.md
@@ -0,0 +1,24 @@
 ## Summary
 - 
 ## Why
 - 
 ## Testing
 - 
 ## Backward Compatibility / Risk
 - 
 ## Related Issue
 - Closes #
 ## Checklist
 - [ ] Tests added or updated if behavior changed
 - [ ] Docs updated if config or user-facing behavior changed
--- a/.github/workflows/python-tests.yml
+++ b/.github/workflows/python-tests.yml
@@ -10,7 +10,32 @@ on:
    branches: [ master ]
 jobs:
-  build:
+  lint-docs-build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v5
    - name: Set up Python
      uses: actions/setup-python@v6
      with:
        python-version: "3.13"
    - name: Install Python dependencies
      run: |
        python -m pip install --upgrade pip
        pip install .[build]
    - name: Check code style
      run: |
        ruff check .
    - name: Test building documentation
      run: |
        cd docs
        make html
    - name: Test building packages
      run: |
        hatch build
  test:
    needs: lint-docs-build
    runs-on: ubuntu-latest
    services:
@@ -46,13 +71,6 @@ jobs:
      run: |
        python -m pip install --upgrade pip
        pip install .[build]
    - name: Test building documentation
      run: |
        cd docs
        make html
    - name: Check code style
      run: |
        ruff check .
    - name: Run unit tests
      run: |
        pytest --cov --cov-report=xml tests.py
@@ -61,9 +79,6 @@ jobs:
        pip install -e .
        parsedmarc --debug -c ci.ini samples/aggregate/*
        parsedmarc --debug -c ci.ini samples/forensic/*
    - name: Test building packages
      run: |
        hatch build
    - name: Upload coverage to Codecov
      uses: codecov/codecov-action@v5
      with:
--- a/.gitignore
+++ b/.gitignore
@@ -137,7 +137,7 @@ samples/private
 *.html
 *.sqlite-journal
-parsedmarc.ini
+parsedmarc*.ini
 scratch.py
 parsedmarc/resources/maps/base_reverse_dns.csv
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@@ -52,6 +52,7 @@
      "geoipupdate",
      "Geolite",
      "geolocation",
      "getuid",
      "githubpages",
      "Grafana",
      "hostnames",
@@ -75,6 +76,7 @@
      "LISTSERV",
      "loganalytics",
      "lxml",
      "Maildir",
      "mailparser",
      "mailrelay",
      "mailsuite",
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -42,7 +42,7 @@ To skip DNS lookups during testing, set `GITHUB_ACTIONS=true`.
 ### Key modules
 - `parsedmarc/__init__.py` — Core parsing logic. Main functions: `parse_report_file()`, `parse_report_email()`, `parse_aggregate_report_xml()`, `parse_forensic_report()`, `parse_smtp_tls_report_json()`, `get_dmarc_reports_from_mailbox()`, `watch_inbox()`
- `parsedmarc/cli.py` — CLI entry point (`_main`), config file parsing, output orchestration
+- `parsedmarc/cli.py` — CLI entry point (`_main`), config file parsing (`_load_config` + `_parse_config`), output orchestration. Supports configuration via INI files, `PARSEDMARC_{SECTION}_{KEY}` environment variables, or both (env vars override file values).
 - `parsedmarc/types.py` — TypedDict definitions for all report types (`AggregateReport`, `ForensicReport`, `SMTPTLSReport`, `ParsingResults`)
 - `parsedmarc/utils.py` — IP/DNS/GeoIP enrichment, base64 decoding, compression handling
 - `parsedmarc/mail/` — Polymorphic mail connections: `IMAPConnection`, `GmailConnection`, `MSGraphConnection`, `MaildirConnection`
@@ -52,6 +52,10 @@ To skip DNS lookups during testing, set `GITHUB_ACTIONS=true`.
 `ReportType = Literal["aggregate", "forensic", "smtp_tls"]`. Exception hierarchy: `ParserError` → `InvalidDMARCReport` → `InvalidAggregateReport`/`InvalidForensicReport`, and `InvalidSMTPTLSReport`.
 ### Configuration
 Config priority: CLI args > env vars > config file > defaults. Env var naming: `PARSEDMARC_{SECTION}_{KEY}` (e.g. `PARSEDMARC_IMAP_PASSWORD`). Section names with underscores use longest-prefix matching (`PARSEDMARC_SPLUNK_HEC_TOKEN` → `[splunk_hec] token`). Some INI keys have short aliases for env var friendliness (e.g. `[maildir] create` for `maildir_create`). File path values are expanded via `os.path.expanduser`/`os.path.expandvars`. Config can be loaded purely from env vars with no file (`PARSEDMARC_CONFIG_FILE` sets the file path).
 ### Caching
 IP address info cached for 4 hours, seen aggregate report IDs cached for 1 hour (via `ExpiringDict`).
@@ -62,3 +66,6 @@ IP address info cached for 4 hours, seen aggregate report IDs cached for 1 hour
 - TypedDict for structured data, type hints throughout
 - Python ≥3.10 required
 - Tests are in a single `tests.py` file using unittest; sample reports live in `samples/`
 - File path config values must be wrapped with `_expand_path()` in `cli.py`
 - Maildir UID checks are intentionally relaxed (warn, don't crash) for Docker compatibility
 - Token file writes must create parent directories before opening for write
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,125 @@
 # Changelog
 ## 9.5.4
 ### Fixed
 - Maildir `fetch_messages` now respects the `reports_folder` argument. Previously it always read from the top-level Maildir, ignoring the configured reports folder. `fetch_message`, `delete_message`, and `move_message` now also operate on the correct active folder.
 - Config key aliases for env var compatibility: `[maildir] create` and `path` are now accepted as aliases for `maildir_create` and `maildir_path`, and `[msgraph] url` for `graph_url`. This allows natural env var names like `PARSEDMARC_MAILDIR_CREATE` to work without the redundant `PARSEDMARC_MAILDIR_MAILDIR_CREATE`.
 ## 9.5.3
 ### Fixed
 - Fixed `FileNotFoundError` when using Maildir with Docker volume mounts. Python's `mailbox.Maildir(create=True)` only creates `cur/new/tmp` subdirectories when the top-level directory doesn't exist; Docker volume mounts pre-create the directory as empty, skipping subdirectory creation. parsedmarc now explicitly creates the subdirectories when `maildir_create` is enabled.
 - Maildir UID mismatch no longer crashes the process. In Docker containers where volume ownership differs from the container UID, parsedmarc now logs a warning instead of raising an exception. Also handles `os.setuid` failures gracefully in containers without `CAP_SETUID`.
 - Token file writes (MS Graph and Gmail) now create parent directories automatically, preventing `FileNotFoundError` when the token path points to a directory that doesn't yet exist.
 - File paths from config (`token_file`, `credentials_file`, `cert_path`, `log_file`, `output`, `ip_db_path`, `maildir_path`, syslog cert paths, etc.) now expand `~` and `$VAR` references via `os.path.expanduser`/`os.path.expandvars`.
 ## 9.5.2
 ### Fixed
 - Fixed `ValueError: invalid interpolation syntax` when config values (from env vars or INI files) contain `%` characters, such as in passwords. Disabled ConfigParser's `%`-based string interpolation.
 ## 9.5.1
 ### Changes
 - Correct ISO format for MSGraphConnection timestamps (PR #706)
 ## 9.5.0
 ### Added
 - Environment variable configuration support: any config option can now be set via `PARSEDMARC_{SECTION}_{KEY}` environment variables (e.g. `PARSEDMARC_IMAP_PASSWORD`, `PARSEDMARC_SPLUNK_HEC_TOKEN`). Environment variables override config file values but are overridden by CLI arguments.
 - `PARSEDMARC_CONFIG_FILE` environment variable to specify the config file path without the `-c` flag.
 - Env-only mode: parsedmarc can now run without a config file when `PARSEDMARC_*` environment variables are set, enabling fully file-less Docker deployments.
 - Explicit read permission check on config file, giving a clear error message when the container UID cannot read the file (e.g. `chmod 600` with a UID mismatch).
 ## 9.4.0
 ### Added
 - Extracted `load_reverse_dns_map()` utility function in `utils.py` for loading the reverse DNS map independently of individual IP lookups.
 - SIGHUP reload now re-downloads/reloads the reverse DNS map, so changes take effect without restarting.
 - Add premade OpenSearch index patterns, visualizations, and dashboards
 ### Changed
 - When `index_prefix_domain_map` is configured, SMTP TLS reports for domains not in the map are now silently dropped instead of being output. Unlike DMARC, TLS-RPT has no DNS authorization records, so this filtering prevents processing reports for unrelated domains.
 - Bump OpenSearch support to `< 4`
 ### Fixed
 - Fixed `get_index_prefix` using wrong key (`domain` instead of `policy_domain`) for SMTP TLS reports, which prevented domain map matching from working for TLS reports.
 - Domain matching in `get_index_prefix` now lowercases the domain for case-insensitive comparison.
 ## 9.3.1
 ### Breaking changes
 - Elasticsearch and OpenSearch now verify SSL certificates by default when `ssl = True`, even without a `cert_path`
 - Added `skip_certificate_verification` option to the `elasticsearch` and `opensearch` configuration sections for consistency with `splunk_hec`
 ### Fixed
 - Splunk HEC `skip_certificate_verification` now works correctly
 - SMTP TLS reports no longer fail when saving to multiple output targets (e.g. Elasticsearch and OpenSearch) due to in-place mutation of the report dict
 - Output client initialization errors now identify which module failed (e.g. "OpenSearch: ConnectionError..." instead of generic "Output client error")
 ## 9.3.0
 ### Added
 - SIGHUP-based configuration reload for watch mode — update output destinations, DNS/GeoIP settings, processing flags, and log level without restarting the service or interrupting in-progress report processing.
  - Use `systemctl reload parsedmarc` when running under `systemd`.
  - On a successful reload, old output clients are closed and recreated.
  - On a failed reload, the previous configuration remains fully active.
 - `close()` methods on `GelfClient`, `KafkaClient`, `SyslogClient`, `WebhookClient`, HECClient, and `S3Client` for clean resource teardown on reload.
 - `config_reloading` parameter on all `MailboxConnection.watch()` implementations and `watch_inbox()` to ensure SIGHUP never triggers a new email batch mid-reload.
 - Elasticsearch and OpenSearch connections are now tracked and cleaned up on reload via `_close_output_clients()`.
 - Extracted `_parse_config_file()` and `_init_output_clients()` from `_main()` in `cli.py` to support config reload and reduce code duplication.
 ### Fixed
 - `get_index_prefix()` crashed on forensic reports with `TypeError` due to `report()` instead of `report[]` dict access.
 - Missing `exit(1)` after IMAP user/password validation failure allowed execution to continue with `None` credentials.
 ## 9.2.1
 ### Added
 - Better checking of `msgraph` configuration (PR #695)
 ### Changed
 - Updated `dbip-country-lite` database to version `2026-03`
 - DNS query error logging level from `warning` to `debug`
 ## 9.2.0
 ### Added
 - OpenSearch AWS SigV4 authentication support (PR #673)
 - IMAP move/delete compatibility fallbacks (PR #671)
 - `fail_on_output_error` CLI option for sink failures (PR #672)
 - Gmail service account auth mode for non-interactive runs (PR #676)
 - Microsoft Graph certificate authentication support (PRs #692 and #693)
 - Microsoft Graph well-known folder fallback for root listing failures (PR #618 and #684 close #609)
 ### Fixed
 - Pass mailbox since filter through `watch_inbox` callback (PR #670 closes issue #581)
 - `parsedmarc.mail.gmail.GmailConnection.delete_message` now properly calls the Gmail API (PR #668)
 - Avoid extra mailbox fetch in batch and test mode (PR #691 closes #533)
 ## 9.1.2
 ### Fixes
 - Fix duplicate detection for normalized aggregate reports in Elasticsearch/OpenSearch (PR #666 fixes issue #665)
 ## 9.1.1
 ### Fixes
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -0,0 +1,78 @@
 # Contributing
 Thanks for contributing to parsedmarc.
 ## Local setup
 Use a virtual environment for local development.
 ```bash
 python3 -m venv .venv
 . .venv/bin/activate
 python -m pip install --upgrade pip
 pip install .[build]
 ```
 ## Before opening a pull request
 Run the checks that match your change:
 ```bash
 ruff check .
 pytest --cov --cov-report=xml tests.py
 ```
 If you changed documentation:
 ```bash
 cd docs
 make html
 ```
 If you changed CLI behavior or parsing logic, it is also useful to exercise the
 sample reports:
 ```bash
 parsedmarc --debug -c ci.ini samples/aggregate/*
 parsedmarc --debug -c ci.ini samples/forensic/*
 ```
 To skip DNS lookups during tests, set:
 ```bash
 GITHUB_ACTIONS=true
 ```
 ## Pull request guidelines
 - Keep pull requests small and focused. Separate bug fixes, docs updates, and
  repo-maintenance changes where practical.
 - Add or update tests when behavior changes.
 - Update docs when configuration or user-facing behavior changes.
 - Include a short summary, the reason for the change, and the testing you ran.
 - Link the related issue when there is one.
 ## Branch maintenance
 Upstream `master` may move quickly. Before asking for review or after another PR
 lands, rebase your branch onto the current upstream branch and force-push with
 lease if needed:
 ```bash
 git fetch upstream
 git rebase upstream/master
 git push --force-with-lease
 ```
 ## CI and coverage
 GitHub Actions is the source of truth for linting, docs, and test status.
 Codecov patch coverage is usually the most relevant signal for small PRs. Project
 coverage can be noisier when the base comparison is stale, so interpret it in
 the context of the actual diff.
 ## Questions
 Use GitHub issues for bugs and feature requests. If you are not sure whether a
 change is wanted, opening an issue first is usually the safest path.
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -0,0 +1,29 @@
 # Security Policy
 ## Reporting a vulnerability
 Please do not open a public GitHub issue for an undisclosed security
 vulnerability. Use GitHub private vulnerability reporting in the Security tab of this project instead.
 When reporting a vulnerability, include:
 - the affected parsedmarc version or commit
 - the component or integration involved
 - clear reproduction details if available
 - potential impact
 - any suggested mitigation or workaround
 ## Supported versions
 Security fixes will be applied to the latest released version and
 the current `master` branch.
 Older versions will not receive backported fixes.
 ## Disclosure process
 After a report is received, maintainers can validate the issue, assess impact,
 and coordinate a fix before public disclosure.
 Please avoid publishing proof-of-concept details until maintainers have had a
 reasonable opportunity to investigate and release a fix or mitigation.
--- a/codecov.yml
+++ b/codecov.yml
@@ -0,0 +1,11 @@
 codecov:
  require_ci_to_pass: true
 coverage:
  status:
    project:
      default:
        informational: true
    patch:
      default:
        informational: false
--- a/docker-compose.dashboard-dev.yml
+++ b/docker-compose.dashboard-dev.yml
@@ -0,0 +1,47 @@
 name: parsedmarc-dashboards
 include:
  - docker-compose.yml
 services:
  kibana:
    image: docker.elastic.co/kibana/kibana:8.19.7
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "127.0.0.1:5601:5601"
    depends_on:
      elasticsearch:
        condition: service_healthy
  opensearch-dashboards:
    image: opensearchproject/opensearch-dashboards:3
    environment:
      - OPENSEARCH_HOSTS=["https://opensearch:9200"]
    ports:
      - "127.0.0.1:5602:5601"
    depends_on:
      opensearch:
        condition: service_healthy
  grafana:
    image: grafana/grafana:latest
    environment:
      - GRAFANA_PASSWORD=${GRAFANA_PASSWORD}
      - GF_INSTALL_PLUGINS=grafana-piechart-panel,grafana-worldmap-panel
    ports:
      - "127.0.0.1:3000:3000"
    depends_on:
      elasticsearch:
        condition: service_healthy
  splunk:
    image: splunk/splunk:latest
    environment:
      - SPLUNK_START_ARGS=--accept-license
      - "SPLUNK_GENERAL_TERMS=--accept-sgt-current-at-splunk-com"
      - SPLUNK_PASSWORD=${SPLUNK_PASSWORD}
      - SPLUNK_HEC_TOKEN=${SPLUNK_HEC_TOKEN}
    ports:
      - "127.0.0.1:8000:8000"
      - "127.0.0.1:8088:8088"
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -48,7 +48,7 @@ services:
      test:
        [
          "CMD-SHELL",
-          "curl -s -XGET http://localhost:9201/_cluster/health?pretty | grep status | grep -q '\\(green\\|yellow\\)'"
+          "curl -sk -u admin:${OPENSEARCH_INITIAL_ADMIN_PASSWORD} -XGET https://localhost:9200/_cluster/health?pretty | grep status | grep -q '\\(green\\|yellow\\)'"
        ]
      interval: 10s
      timeout: 10s
--- a/docs/source/usage.md
+++ b/docs/source/usage.md
@@ -146,6 +146,9 @@ The full set of configuration options are:
  - `dns_timeout` - float: DNS timeout period
  - `debug` - bool: Print debugging messages
  - `silent` - bool: Only print errors (Default: `True`)
  - `fail_on_output_error` - bool: Exit with a non-zero status code if
      any configured output destination fails while saving/publishing
      reports (Default: `False`)
  - `log_file` - str: Write log messages to a file at this path
  - `n_procs` - int: Number of process to run in parallel when
      parsing in CLI mode (Default: `1`)
@@ -200,7 +203,7 @@ The full set of configuration options are:
  - `password` - str: The IMAP password
 - `msgraph`
  - `auth_method` - str: Authentication method, valid types are
-      `UsernamePassword`, `DeviceCode`, or `ClientSecret`
+      `UsernamePassword`, `DeviceCode`, `ClientSecret`, or `Certificate`
      (Default: `UsernamePassword`).
  - `user` - str: The M365 user, required when the auth method is
      UsernamePassword
@@ -208,6 +211,11 @@ The full set of configuration options are:
      method is UsernamePassword
  - `client_id` - str: The app registration's client ID
  - `client_secret` - str: The app registration's secret
  - `certificate_path` - str: Path to a PEM or PKCS12 certificate
      including the private key. Required when the auth method is
      `Certificate`
  - `certificate_password` - str: Optional password for the
      certificate file when using `Certificate` auth
  - `tenant_id` - str: The Azure AD tenant ID. This is required
      for all auth methods except UsernamePassword.
  - `mailbox` - str: The mailbox name. This defaults to the
@@ -245,6 +253,9 @@ The full set of configuration options are:
    -Description "Restrict access to dmarc reports mailbox."
    ```
    The same application permission and mailbox scoping guidance
    applies to the `Certificate` auth method.
    :::
 - `elasticsearch`
  - `hosts` - str: A comma separated list of hostnames and ports
@@ -262,6 +273,8 @@ The full set of configuration options are:
    (Default: `True`)
  - `timeout` - float: Timeout in seconds (Default: 60)
  - `cert_path` - str: Path to a trusted certificates
  - `skip_certificate_verification` - bool: Skip certificate
    verification (not recommended)
  - `index_suffix` - str: A suffix to apply to the index names
  - `index_prefix` - str: A prefix to apply to the index names
  - `monthly_indexes` - bool: Use monthly indexes instead of daily indexes
@@ -281,10 +294,16 @@ The full set of configuration options are:
  - `user` - str: Basic auth username
  - `password` - str: Basic auth password
  - `api_key` - str: API key
  - `auth_type` - str: Authentication type: `basic` (default) or `awssigv4` (the key `authentication_type` is accepted as an alias for this option)
  - `aws_region` - str: AWS region for SigV4 authentication
    (required when `auth_type = awssigv4`)
  - `aws_service` - str: AWS service for SigV4 signing (Default: `es`)
  - `ssl` - bool: Use an encrypted SSL/TLS connection
    (Default: `True`)
  - `timeout` - float: Timeout in seconds (Default: 60)
  - `cert_path` - str: Path to a trusted certificates
  - `skip_certificate_verification` - bool: Skip certificate
    verification (not recommended)
  - `index_suffix` - str: A suffix to apply to the index names
  - `index_prefix` - str: A prefix to apply to the index names
  - `monthly_indexes` - bool: Use monthly indexes instead of daily indexes
@@ -389,15 +408,25 @@ The full set of configuration options are:
  retry_attempts = 3
  retry_delay = 5
  ```
 - `gmail_api`
  - `credentials_file` - str: Path to file containing the
      credentials, None to disable (Default: `None`)
  - `token_file` - str: Path to save the token file
      (Default: `.token`)
  - `auth_mode` - str: Authentication mode, `installed_app` (default)
      or `service_account`
  - `service_account_user` - str: Delegated mailbox user for Gmail
      service account auth (required for domain-wide delegation). Also
      accepted as `delegated_user` for backward compatibility.
    :::{note}
    credentials_file and token_file can be got with [quickstart](https://developers.google.com/gmail/api/quickstart/python).Please change the scope to `https://www.googleapis.com/auth/gmail.modify`.
    :::
    :::{note}
    When `auth_mode = service_account`, `credentials_file` must point to a
    Google service account key JSON file, and `token_file` is not used.
    :::
  - `include_spam_trash` - bool: Include messages in Spam and
      Trash when searching reports (Default: `False`)
  - `scopes` - str: Comma separated list of scopes to use when
@@ -418,7 +447,7 @@ The full set of configuration options are:
  - `dcr_smtp_tls_stream` - str: The stream name for the SMTP TLS reports in the DCR
  :::{note}
-    Information regarding the setup of the Data Collection Rule can be found [here](https://learn.microsoft.com/en-us/azure/azure-monitor/logs/tutorial-logs-ingestion-portal).
+    Information regarding the setup of the Data Collection Rule can be found [in the Azure documentation](https://learn.microsoft.com/en-us/azure/azure-monitor/logs/tutorial-logs-ingestion-portal).
    :::
 - `gelf`
  - `host` - str: The GELF server name or IP address
@@ -502,6 +531,123 @@ PUT _cluster/settings
 Increasing this value increases resource usage.
 :::
 ## Environment variable configuration
 Any configuration option can be set via environment variables using the
 naming convention `PARSEDMARC_{SECTION}_{KEY}` (uppercase). This is
 especially useful for Docker deployments where file permissions make it
 difficult to use config files for secrets.
 **Priority order:** CLI arguments > environment variables > config file > defaults
 ### Examples
 ```bash
 # Set IMAP credentials via env vars
 export PARSEDMARC_IMAP_HOST=imap.example.com
 export PARSEDMARC_IMAP_USER=dmarc@example.com
 export PARSEDMARC_IMAP_PASSWORD=secret
 # Elasticsearch
 export PARSEDMARC_ELASTICSEARCH_HOSTS=http://localhost:9200
 export PARSEDMARC_ELASTICSEARCH_SSL=false
 # Splunk HEC (note: section name splunk_hec becomes SPLUNK_HEC)
 export PARSEDMARC_SPLUNK_HEC_URL=https://splunk.example.com
 export PARSEDMARC_SPLUNK_HEC_TOKEN=my-hec-token
 export PARSEDMARC_SPLUNK_HEC_INDEX=email
 # General settings
 export PARSEDMARC_GENERAL_SAVE_AGGREGATE=true
 export PARSEDMARC_GENERAL_DEBUG=true
 ```
 ### Specifying the config file via environment variable
 ```bash
 export PARSEDMARC_CONFIG_FILE=/etc/parsedmarc.ini
 parsedmarc
 ```
 ### Running without a config file (env-only mode)
 When no config file is given (neither `-c` flag nor `PARSEDMARC_CONFIG_FILE`),
 parsedmarc will still pick up any `PARSEDMARC_*` environment variables. This
 enables fully file-less deployments:
 ```bash
 export PARSEDMARC_GENERAL_SAVE_AGGREGATE=true
 export PARSEDMARC_GENERAL_OFFLINE=true
 export PARSEDMARC_ELASTICSEARCH_HOSTS=http://elasticsearch:9200
 parsedmarc /path/to/reports/*
 ```
 ### Docker Compose example
 ```yaml
 services:
  parsedmarc:
    image: parsedmarc:latest
    environment:
      PARSEDMARC_IMAP_HOST: imap.example.com
      PARSEDMARC_IMAP_USER: dmarc@example.com
      PARSEDMARC_IMAP_PASSWORD: ${IMAP_PASSWORD}
      PARSEDMARC_MAILBOX_WATCH: "true"
      PARSEDMARC_ELASTICSEARCH_HOSTS: http://elasticsearch:9200
      PARSEDMARC_GENERAL_SAVE_AGGREGATE: "true"
      PARSEDMARC_GENERAL_SAVE_FORENSIC: "true"
 ```
 ### Section name mapping
 For sections with underscores in the name, the full section name is used:
 | Section          | Env var prefix                |
 |------------------|-------------------------------|
 | `general`        | `PARSEDMARC_GENERAL_`         |
 | `mailbox`        | `PARSEDMARC_MAILBOX_`         |
 | `imap`           | `PARSEDMARC_IMAP_`            |
 | `msgraph`        | `PARSEDMARC_MSGRAPH_`         |
 | `elasticsearch`  | `PARSEDMARC_ELASTICSEARCH_`   |
 | `opensearch`     | `PARSEDMARC_OPENSEARCH_`      |
 | `splunk_hec`     | `PARSEDMARC_SPLUNK_HEC_`      |
 | `kafka`          | `PARSEDMARC_KAFKA_`           |
 | `smtp`           | `PARSEDMARC_SMTP_`            |
 | `s3`             | `PARSEDMARC_S3_`              |
 | `syslog`         | `PARSEDMARC_SYSLOG_`          |
 | `gmail_api`      | `PARSEDMARC_GMAIL_API_`       |
 | `maildir`        | `PARSEDMARC_MAILDIR_`         |
 | `log_analytics`  | `PARSEDMARC_LOG_ANALYTICS_`   |
 | `gelf`           | `PARSEDMARC_GELF_`            |
 | `webhook`        | `PARSEDMARC_WEBHOOK_`         |
 ## Performance tuning
 For large mailbox imports or backfills, parsedmarc can consume a noticeable amount
 of memory, especially when it runs on the same host as Elasticsearch or
 OpenSearch. The following settings can reduce peak memory usage and make long
 imports more predictable:
 - Reduce `mailbox.batch_size` to smaller values such as `100-500` instead of
  processing a very large message set at once. Smaller batches trade throughput
  for lower peak memory use and less sink pressure.
 - Keep `n_procs` low for mailbox-heavy runs. In practice, `1-2` workers is often
  a safer starting point for large backfills than aggressive parallelism.
 - Use `mailbox.since` to process reports in smaller time windows such as `1d`,
  `7d`, or another interval that fits the backlog. This makes it easier to catch
  up incrementally instead of loading an entire mailbox history in one run.
 - Set `strip_attachment_payloads = True` when forensic reports contain large
  attachments and you do not need to retain the raw payloads in the parsed
  output.
 - Prefer running parsedmarc separately from Elasticsearch or OpenSearch, or
  reserve enough RAM for both services if they must share a host.
 - For very large imports, prefer incremental supervised runs, such as a
  scheduler or systemd service, over infrequent massive backfills.
 These are operational tuning recommendations rather than hard requirements, but
 they are often enough to avoid memory pressure and reduce failures during
 high-volume mailbox processing.
 ## Multi-tenant support
 Starting in `8.19.0`, ParseDMARC provides multi-tenant support by placing data into separate OpenSearch or Elasticsearch index prefixes. To set this up, create a YAML file that is formatted where each key is a tenant name, and the value is a list of domains related to that tenant, not including subdomains, like this:
@@ -551,6 +697,7 @@ After=network.target network-online.target elasticsearch.service
 [Service]
 ExecStart=/opt/parsedmarc/venv/bin/parsedmarc -c /etc/parsedmarc.ini
 ExecReload=/bin/kill -HUP $MAINPID
 User=parsedmarc
 Group=parsedmarc
 Restart=always
@@ -583,6 +730,51 @@ sudo service parsedmarc restart
 :::
 ### Reloading configuration without restarting
 When running in watch mode, `parsedmarc` supports reloading its
 configuration file without restarting the service or interrupting
 report processing that is already in progress. Send a `SIGHUP` signal
 to the process, or use `systemctl reload` if the unit file includes
 the `ExecReload` line shown above:
 ```bash
 sudo systemctl reload parsedmarc
 ```
 The reload takes effect after the current batch of reports finishes
 processing and all output operations (Elasticsearch, Kafka, S3, etc.)
 for that batch have completed. The following settings are reloaded:
 - All output destinations (Elasticsearch, OpenSearch, Kafka, S3,
  Splunk, syslog, GELF, webhooks, Log Analytics)
 - Multi-tenant index prefix domain map (`index_prefix_domain_map` —
  the referenced YAML file is re-read on reload)
 - DNS and GeoIP settings (`nameservers`, `dns_timeout`, `ip_db_path`,
  `offline`, etc.)
 - Processing flags (`strip_attachment_payloads`, `batch_size`,
  `check_timeout`, etc.)
 - Log level (`debug`, `verbose`, `warnings`, `silent`)
 Mailbox connection settings (IMAP host/credentials, Microsoft Graph,
 Gmail API, Maildir path) are **not** reloaded — changing those still
 requires a full restart.
 On a **successful** reload, existing output client connections are
 closed and new ones are created from the updated configuration. The
 service then resumes watching with the new settings.
 If the new configuration file contains errors (missing required
 settings, unreachable output destinations, etc.), the **entire reload
 is aborted** — no output clients are replaced and the previous
 configuration remains fully active. This means a typo in one section
 will not take down an otherwise working setup. Check the logs for
 details:
 ```bash
 journalctl -u parsedmarc.service -r
 ```
 To check the status of the service, run:
 ```bash
--- a/opensearch/opensearch_dashboards.ndjson
+++ b/opensearch/opensearch_dashboards.ndjson
--- a/parsedmarc/init.py
+++ b/parsedmarc/init.py
@@ -962,10 +962,12 @@ def extract_report(content: Union[bytes, str, BinaryIO]) -> str:
    return report
-def extract_report_from_file_path(file_path: str):
+def extract_report_from_file_path(
    file_path: Union[str, bytes, os.PathLike[str], os.PathLike[bytes]],
 ) -> str:
    """Extracts report from a file at the given file_path"""
    try:
-        with open(file_path, "rb") as report_file:
+        with open(os.fspath(file_path), "rb") as report_file:
            return extract_report(report_file.read())
    except FileNotFoundError:
        raise ParserError("File was not found")
@@ -1660,7 +1662,7 @@ def parse_report_email(
 def parse_report_file(
-    input_: Union[bytes, str, BinaryIO],
+    input_: Union[bytes, str, os.PathLike[str], os.PathLike[bytes], BinaryIO],
    *,
    nameservers: Optional[list[str]] = None,
    dns_timeout: float = 2.0,
@@ -1677,7 +1679,8 @@ def parse_report_file(
    file-like object. or bytes
    Args:
-        input_ (str | bytes | BinaryIO): A path to a file, a file like object, or bytes
+        input_ (str | os.PathLike | bytes | BinaryIO): A path to a file,
            a file-like object, or bytes
        nameservers (list): A list of one or more nameservers to use
            (Cloudflare's public DNS resolvers by default)
        dns_timeout (float): Sets the DNS timeout in seconds
@@ -1694,9 +1697,10 @@ def parse_report_file(
        dict: The parsed DMARC report
    """
    file_object: BinaryIO
-    if isinstance(input_, str):
+    if isinstance(input_, (str, os.PathLike)):
-        logger.debug("Parsing {0}".format(input_))
+        file_path = os.fspath(input_)
-        file_object = open(input_, "rb")
+        logger.debug("Parsing {0}".format(file_path))
        file_object = open(file_path, "rb")
    elif isinstance(input_, (bytes, bytearray, memoryview)):
        file_object = BytesIO(bytes(input_))
    else:
@@ -1951,9 +1955,7 @@ def get_dmarc_reports_from_mailbox(
            )
            current_time = datetime.now(timezone.utc).strftime("%d-%b-%Y")
        elif isinstance(connection, MSGraphConnection):
-            since = (
+            since = (datetime.now(timezone.utc) - timedelta(minutes=_since)).isoformat()
                datetime.now(timezone.utc) - timedelta(minutes=_since)
            ).isoformat() + "Z"
            current_time = datetime.now(timezone.utc).isoformat() + "Z"
        elif isinstance(connection, GmailConnection):
            since = (datetime.now(timezone.utc) - timedelta(minutes=_since)).strftime(
@@ -2137,14 +2139,17 @@ def get_dmarc_reports_from_mailbox(
        "smtp_tls_reports": smtp_tls_reports,
    }
-    if current_time:
+    if not test and not batch_size:
-        total_messages = len(
+        if current_time:
-            connection.fetch_messages(reports_folder, since=current_time)
+            total_messages = len(
-        )
+                connection.fetch_messages(reports_folder, since=current_time)
            )
        else:
            total_messages = len(connection.fetch_messages(reports_folder))
    else:
-        total_messages = len(connection.fetch_messages(reports_folder))
+        total_messages = 0
-    if not test and not batch_size and total_messages > 0:
+    if total_messages > 0:
        # Process emails that came in during the last run
        results = get_dmarc_reports_from_mailbox(
            connection=connection,
@@ -2186,7 +2191,9 @@ def watch_inbox(
    dns_timeout: float = 6.0,
    strip_attachment_payloads: bool = False,
    batch_size: int = 10,
    since: Optional[Union[datetime, date, str]] = None,
    normalize_timespan_threshold_hours: float = 24,
    config_reloading: Optional[Callable] = None,
 ):
    """
    Watches the mailbox for new messages and
@@ -2212,7 +2219,10 @@ def watch_inbox(
        strip_attachment_payloads (bool): Replace attachment payloads in
            forensic report samples with None
        batch_size (int): Number of messages to read and process before saving
        since: Search for messages since certain time
        normalize_timespan_threshold_hours (float): Normalize timespans beyond this
        config_reloading: Optional callable that returns True when a config
            reload has been requested (e.g. via SIGHUP)
    """
    def check_callback(connection):
@@ -2231,12 +2241,20 @@ def watch_inbox(
            dns_timeout=dns_timeout,
            strip_attachment_payloads=strip_attachment_payloads,
            batch_size=batch_size,
            since=since,
            create_folders=False,
            normalize_timespan_threshold_hours=normalize_timespan_threshold_hours,
        )
        callback(res)
-    mailbox_connection.watch(check_callback=check_callback, check_timeout=check_timeout)
+    watch_kwargs: dict = {
        "check_callback": check_callback,
        "check_timeout": check_timeout,
    }
    if config_reloading is not None:
        watch_kwargs["config_reloading"] = config_reloading
    mailbox_connection.watch(**watch_kwargs)
 def append_json(
--- a/parsedmarc/cli.py
+++ b/parsedmarc/cli.py
--- a/parsedmarc/constants.py
+++ b/parsedmarc/constants.py
@@ -1,3 +1,3 @@
-__version__ = "9.1.1"
+__version__ = "9.5.4"
 USER_AGENT = f"parsedmarc/{__version__}"
--- a/parsedmarc/elastic.py
+++ b/parsedmarc/elastic.py
@@ -268,6 +268,7 @@ def set_hosts(
    *,
    use_ssl: bool = False,
    ssl_cert_path: Optional[str] = None,
    skip_certificate_verification: bool = False,
    username: Optional[str] = None,
    password: Optional[str] = None,
    api_key: Optional[str] = None,
@@ -280,6 +281,7 @@ def set_hosts(
        hosts (str | list[str]): A single hostname or URL, or list of hostnames or URLs
        use_ssl (bool): Use an HTTPS connection to the server
        ssl_cert_path (str): Path to the certificate chain
        skip_certificate_verification (bool): Skip certificate verification
        username (str): The username to use for authentication
        password (str): The password to use for authentication
        api_key (str): The Base64 encoded API key to use for authentication
@@ -291,10 +293,11 @@ def set_hosts(
    if use_ssl:
        conn_params["use_ssl"] = True
        if ssl_cert_path:
            conn_params["verify_certs"] = True
            conn_params["ca_certs"] = ssl_cert_path
-        else:
+        if skip_certificate_verification:
            conn_params["verify_certs"] = False
        else:
            conn_params["verify_certs"] = True
    if username and password:
        conn_params["http_auth"] = username + ":" + password
    if api_key:
@@ -735,6 +738,7 @@ def save_smtp_tls_report_to_elasticsearch(
        index_date = begin_date.strftime("%Y-%m")
    else:
        index_date = begin_date.strftime("%Y-%m-%d")
    report = report.copy()
    report["begin_date"] = begin_date
    report["end_date"] = end_date
--- a/parsedmarc/gelf.py
+++ b/parsedmarc/gelf.py
@@ -3,9 +3,7 @@
 from __future__ import annotations
 import logging
 import logging.handlers
 import threading
 from typing import Any
 from pygelf import GelfTcpHandler, GelfTlsHandler, GelfUdpHandler
@@ -14,6 +12,7 @@ from parsedmarc import (
    parsed_forensic_reports_to_csv_rows,
    parsed_smtp_tls_reports_to_csv_rows,
 )
 from parsedmarc.types import AggregateReport, ForensicReport, SMTPTLSReport
 log_context_data = threading.local()
@@ -37,7 +36,7 @@ class GelfClient(object):
        """
        self.host = host
        self.port = port
-        self.logger = logging.getLogger("parsedmarc_syslog")
+        self.logger = logging.getLogger("parsedmarc_gelf")
        self.logger.setLevel(logging.INFO)
        self.logger.addFilter(ContextFilter())
        self.gelf_mode = {
@@ -50,7 +49,7 @@ class GelfClient(object):
        )
        self.logger.addHandler(self.handler)
-    def save_aggregate_report_to_gelf(self, aggregate_reports: list[dict[str, Any]]):
+    def save_aggregate_report_to_gelf(self, aggregate_reports: list[AggregateReport]):
        rows = parsed_aggregate_reports_to_csv_rows(aggregate_reports)
        for row in rows:
            log_context_data.parsedmarc = row
@@ -58,14 +57,19 @@ class GelfClient(object):
        log_context_data.parsedmarc = None
-    def save_forensic_report_to_gelf(self, forensic_reports: list[dict[str, Any]]):
+    def save_forensic_report_to_gelf(self, forensic_reports: list[ForensicReport]):
        rows = parsed_forensic_reports_to_csv_rows(forensic_reports)
        for row in rows:
            log_context_data.parsedmarc = row
            self.logger.info("parsedmarc forensic report")
-    def save_smtp_tls_report_to_gelf(self, smtp_tls_reports: dict[str, Any]):
+    def save_smtp_tls_report_to_gelf(self, smtp_tls_reports: SMTPTLSReport):
        rows = parsed_smtp_tls_reports_to_csv_rows(smtp_tls_reports)
        for row in rows:
            log_context_data.parsedmarc = row
            self.logger.info("parsedmarc smtptls report")
    def close(self):
        """Remove and close the GELF handler, releasing its connection."""
        self.logger.removeHandler(self.handler)
        self.handler.close()
--- a/parsedmarc/kafkaclient.py
+++ b/parsedmarc/kafkaclient.py
@@ -62,6 +62,10 @@ class KafkaClient(object):
        except NoBrokersAvailable:
            raise KafkaError("No Kafka brokers available")
    def close(self):
        """Close the Kafka producer, releasing background threads and sockets."""
        self.producer.close()
    @staticmethod
    def strip_metadata(report: dict[str, Any]):
        """
--- a/parsedmarc/mail/gmail.py
+++ b/parsedmarc/mail/gmail.py
@@ -10,6 +10,7 @@ from typing import List
 from google.auth.transport.requests import Request
 from google.oauth2.credentials import Credentials
 from google.oauth2 import service_account
 from google_auth_oauthlib.flow import InstalledAppFlow
 from googleapiclient.discovery import build
 from googleapiclient.errors import HttpError
@@ -18,7 +19,29 @@ from parsedmarc.log import logger
 from parsedmarc.mail.mailbox_connection import MailboxConnection
-def _get_creds(token_file, credentials_file, scopes, oauth2_port):
+def _get_creds(
    token_file,
    credentials_file,
    scopes,
    oauth2_port,
    auth_mode="installed_app",
    service_account_user=None,
 ):
    normalized_auth_mode = (auth_mode or "installed_app").strip().lower()
    if normalized_auth_mode == "service_account":
        creds = service_account.Credentials.from_service_account_file(
            credentials_file,
            scopes=scopes,
        )
        if service_account_user:
            creds = creds.with_subject(service_account_user)
        return creds
    if normalized_auth_mode != "installed_app":
        raise ValueError(
            f"Unsupported Gmail auth_mode '{auth_mode}'. "
            "Expected 'installed_app' or 'service_account'."
        )
    creds = None
    if Path(token_file).exists():
@@ -32,6 +55,7 @@ def _get_creds(token_file, credentials_file, scopes, oauth2_port):
            flow = InstalledAppFlow.from_client_secrets_file(credentials_file, scopes)
            creds = flow.run_local_server(open_browser=False, oauth2_port=oauth2_port)
        # Save the credentials for the next run
        Path(token_file).parent.mkdir(parents=True, exist_ok=True)
        with Path(token_file).open("w") as token:
            token.write(creds.to_json())
    return creds
@@ -47,8 +71,17 @@ class GmailConnection(MailboxConnection):
        reports_folder: str,
        oauth2_port: int,
        paginate_messages: bool,
        auth_mode: str = "installed_app",
        service_account_user: str | None = None,
    ):
-        creds = _get_creds(token_file, credentials_file, scopes, oauth2_port)
+        creds = _get_creds(
            token_file,
            credentials_file,
            scopes,
            oauth2_port,
            auth_mode=auth_mode,
            service_account_user=service_account_user,
        )
        self.service = build("gmail", "v1", credentials=creds)
        self.include_spam_trash = include_spam_trash
        self.reports_label_id = self._find_label_id_for_label(reports_folder)
@@ -126,7 +159,7 @@ class GmailConnection(MailboxConnection):
        return urlsafe_b64decode(msg["raw"]).decode(errors="replace")
    def delete_message(self, message_id: str):
-        self.service.users().messages().delete(userId="me", id=message_id)
+        self.service.users().messages().delete(userId="me", id=message_id).execute()
    def move_message(self, message_id: str, folder_name: str):
        label_id = self._find_label_id_for_label(folder_name)
@@ -143,10 +176,14 @@ class GmailConnection(MailboxConnection):
        # Not needed
        pass
-    def watch(self, check_callback, check_timeout):
+    def watch(self, check_callback, check_timeout, config_reloading=None):
        """Checks the mailbox for new messages every n seconds"""
        while True:
            if config_reloading and config_reloading():
                return
            sleep(check_timeout)
            if config_reloading and config_reloading():
                return
            check_callback(self)
    @lru_cache(maxsize=10)
--- a/parsedmarc/mail/graph.py
+++ b/parsedmarc/mail/graph.py
@@ -12,19 +12,25 @@ from azure.identity import (
    UsernamePasswordCredential,
    DeviceCodeCredential,
    ClientSecretCredential,
    CertificateCredential,
    TokenCachePersistenceOptions,
    AuthenticationRecord,
 )
 from msgraph.core import GraphClient
 from requests.exceptions import RequestException
 from parsedmarc.log import logger
 from parsedmarc.mail.mailbox_connection import MailboxConnection
 GRAPH_REQUEST_RETRY_ATTEMPTS = 3
 GRAPH_REQUEST_RETRY_DELAY_SECONDS = 5
 class AuthMethod(Enum):
    DeviceCode = 1
    UsernamePassword = 2
    ClientSecret = 3
    Certificate = 4
 def _get_cache_args(token_path: Path, allow_unencrypted_storage):
@@ -50,6 +56,7 @@ def _load_token(token_path: Path) -> Optional[str]:
 def _cache_auth_record(record: AuthenticationRecord, token_path: Path):
    token = record.serialize()
    token_path.parent.mkdir(parents=True, exist_ok=True)
    with token_path.open("w") as token_file:
        token_file.write(token)
@@ -83,30 +90,55 @@ def _generate_credential(auth_method: str, token_path: Path, **kwargs):
            tenant_id=kwargs["tenant_id"],
            client_secret=kwargs["client_secret"],
        )
    elif auth_method == AuthMethod.Certificate.name:
        cert_path = kwargs.get("certificate_path")
        if not cert_path:
            raise ValueError(
                "certificate_path is required when auth_method is 'Certificate'"
            )
        credential = CertificateCredential(
            client_id=kwargs["client_id"],
            tenant_id=kwargs["tenant_id"],
            certificate_path=cert_path,
            password=kwargs.get("certificate_password"),
        )
    else:
        raise RuntimeError(f"Auth method {auth_method} not found")
    return credential
 class MSGraphConnection(MailboxConnection):
    _WELL_KNOWN_FOLDERS = {
        "inbox": "inbox",
        "archive": "archive",
        "drafts": "drafts",
        "sentitems": "sentitems",
        "deleteditems": "deleteditems",
        "junkemail": "junkemail",
    }
    def __init__(
        self,
        auth_method: str,
        mailbox: str,
        graph_url: str,
        client_id: str,
-        client_secret: str,
+        client_secret: Optional[str],
-        username: str,
+        username: Optional[str],
-        password: str,
+        password: Optional[str],
        tenant_id: str,
        token_file: str,
        allow_unencrypted_storage: bool,
        certificate_path: Optional[str] = None,
        certificate_password: Optional[Union[str, bytes]] = None,
    ):
        token_path = Path(token_file)
        credential = _generate_credential(
            auth_method,
            client_id=client_id,
            client_secret=client_secret,
            certificate_path=certificate_path,
            certificate_password=certificate_password,
            username=username,
            password=password,
            tenant_id=tenant_id,
@@ -117,10 +149,10 @@ class MSGraphConnection(MailboxConnection):
            "credential": credential,
            "cloud": graph_url,
        }
-        if not isinstance(credential, ClientSecretCredential):
+        if not isinstance(credential, (ClientSecretCredential, CertificateCredential)):
            scopes = ["Mail.ReadWrite"]
            # Detect if mailbox is shared
-            if mailbox and username != mailbox:
+            if mailbox and username and username != mailbox:
                scopes = ["Mail.ReadWrite.Shared"]
            auth_record = credential.authenticate(scopes=scopes)
            _cache_auth_record(auth_record, token_path)
@@ -129,6 +161,23 @@ class MSGraphConnection(MailboxConnection):
        self._client = GraphClient(**client_params)
        self.mailbox_name = mailbox
    def _request_with_retries(self, method_name: str, *args, **kwargs):
        for attempt in range(1, GRAPH_REQUEST_RETRY_ATTEMPTS + 1):
            try:
                return getattr(self._client, method_name)(*args, **kwargs)
            except RequestException as error:
                if attempt == GRAPH_REQUEST_RETRY_ATTEMPTS:
                    raise
                logger.warning(
                    "Transient MS Graph %s error on attempt %s/%s: %s",
                    method_name.upper(),
                    attempt,
                    GRAPH_REQUEST_RETRY_ATTEMPTS,
                    error,
                )
                sleep(GRAPH_REQUEST_RETRY_DELAY_SECONDS)
        raise RuntimeError("no retry attempts configured")
    def create_folder(self, folder_name: str):
        sub_url = ""
        path_parts = folder_name.split("/")
@@ -143,7 +192,7 @@ class MSGraphConnection(MailboxConnection):
        request_body = {"displayName": folder_name}
        request_url = f"/users/{self.mailbox_name}/mailFolders{sub_url}"
-        resp = self._client.post(request_url, json=request_body)
+        resp = self._request_with_retries("post", request_url, json=request_body)
        if resp.status_code == 409:
            logger.debug(f"Folder {folder_name} already exists, skipping creation")
        elif resp.status_code == 201:
@@ -173,7 +222,7 @@ class MSGraphConnection(MailboxConnection):
            params["$top"] = batch_size
        else:
            params["$top"] = 100
-        result = self._client.get(url, params=params)
+        result = self._request_with_retries("get", url, params=params)
        if result.status_code != 200:
            raise RuntimeError(f"Failed to fetch messages {result.text}")
        messages = result.json()["value"]
@@ -181,7 +230,7 @@ class MSGraphConnection(MailboxConnection):
        while "@odata.nextLink" in result.json() and (
            since is not None or (batch_size == 0 or batch_size - len(messages) > 0)
        ):
-            result = self._client.get(result.json()["@odata.nextLink"])
+            result = self._request_with_retries("get", result.json()["@odata.nextLink"])
            if result.status_code != 200:
                raise RuntimeError(f"Failed to fetch messages {result.text}")
            messages.extend(result.json()["value"])
@@ -190,7 +239,7 @@ class MSGraphConnection(MailboxConnection):
    def mark_message_read(self, message_id: str):
        """Marks a message as read"""
        url = f"/users/{self.mailbox_name}/messages/{message_id}"
-        resp = self._client.patch(url, json={"isRead": "true"})
+        resp = self._request_with_retries("patch", url, json={"isRead": "true"})
        if resp.status_code != 200:
            raise RuntimeWarning(
                f"Failed to mark message read{resp.status_code}: {resp.json()}"
@@ -198,7 +247,7 @@ class MSGraphConnection(MailboxConnection):
    def fetch_message(self, message_id: str, **kwargs):
        url = f"/users/{self.mailbox_name}/messages/{message_id}/$value"
-        result = self._client.get(url)
+        result = self._request_with_retries("get", url)
        if result.status_code != 200:
            raise RuntimeWarning(
                f"Failed to fetch message{result.status_code}: {result.json()}"
@@ -210,7 +259,7 @@ class MSGraphConnection(MailboxConnection):
    def delete_message(self, message_id: str):
        url = f"/users/{self.mailbox_name}/messages/{message_id}"
-        resp = self._client.delete(url)
+        resp = self._request_with_retries("delete", url)
        if resp.status_code != 204:
            raise RuntimeWarning(
                f"Failed to delete message {resp.status_code}: {resp.json()}"
@@ -220,7 +269,7 @@ class MSGraphConnection(MailboxConnection):
        folder_id = self._find_folder_id_from_folder_path(folder_name)
        request_body = {"destinationId": folder_id}
        url = f"/users/{self.mailbox_name}/messages/{message_id}/move"
-        resp = self._client.post(url, json=request_body)
+        resp = self._request_with_retries("post", url, json=request_body)
        if resp.status_code != 201:
            raise RuntimeWarning(
                f"Failed to move message {resp.status_code}: {resp.json()}"
@@ -230,10 +279,14 @@ class MSGraphConnection(MailboxConnection):
        # Not needed
        pass
-    def watch(self, check_callback, check_timeout):
+    def watch(self, check_callback, check_timeout, config_reloading=None):
        """Checks the mailbox for new messages every n seconds"""
        while True:
            if config_reloading and config_reloading():
                return
            sleep(check_timeout)
            if config_reloading and config_reloading():
                return
            check_callback(self)
    @lru_cache(maxsize=10)
@@ -248,6 +301,19 @@ class MSGraphConnection(MailboxConnection):
        else:
            return self._find_folder_id_with_parent(folder_name, None)
    def _get_well_known_folder_id(self, folder_name: str) -> Optional[str]:
        folder_key = folder_name.lower().replace(" ", "").replace("-", "")
        alias = self._WELL_KNOWN_FOLDERS.get(folder_key)
        if alias is None:
            return None
        url = f"/users/{self.mailbox_name}/mailFolders/{alias}?$select=id,displayName"
        folder_resp = self._request_with_retries("get", url)
        if folder_resp.status_code != 200:
            return None
        payload = folder_resp.json()
        return payload.get("id")
    def _find_folder_id_with_parent(
        self, folder_name: str, parent_folder_id: Optional[str]
    ):
@@ -256,8 +322,12 @@ class MSGraphConnection(MailboxConnection):
            sub_url = f"/{parent_folder_id}/childFolders"
        url = f"/users/{self.mailbox_name}/mailFolders{sub_url}"
        filter = f"?$filter=displayName eq '{folder_name}'"
-        folders_resp = self._client.get(url + filter)
+        folders_resp = self._request_with_retries("get", url + filter)
        if folders_resp.status_code != 200:
            if parent_folder_id is None:
                well_known_folder_id = self._get_well_known_folder_id(folder_name)
                if well_known_folder_id:
                    return well_known_folder_id
            raise RuntimeWarning(f"Failed to list folders.{folders_resp.json()}")
        folders: list = folders_resp.json()["value"]
        matched_folders = [
--- a/parsedmarc/mail/imap.py
+++ b/parsedmarc/mail/imap.py
@@ -55,15 +55,33 @@ class IMAPConnection(MailboxConnection):
        return cast(str, self._client.fetch_message(message_id, parse=False))
    def delete_message(self, message_id: int):
-        self._client.delete_messages([message_id])
+        try:
            self._client.delete_messages([message_id])
        except IMAPClientError as error:
            logger.warning(
                "IMAP delete fallback for message %s due to server error: %s",
                message_id,
                error,
            )
            self._client.add_flags([message_id], [r"\Deleted"], silent=True)
            self._client.expunge()
    def move_message(self, message_id: int, folder_name: str):
-        self._client.move_messages([message_id], folder_name)
+        try:
            self._client.move_messages([message_id], folder_name)
        except IMAPClientError as error:
            logger.warning(
                "IMAP move fallback for message %s due to server error: %s",
                message_id,
                error,
            )
            self._client.copy([message_id], folder_name)
            self.delete_message(message_id)
    def keepalive(self):
        self._client.noop()
-    def watch(self, check_callback, check_timeout):
+    def watch(self, check_callback, check_timeout, config_reloading=None):
        """
        Use an IDLE IMAP connection to parse incoming emails,
        and pass the results to a callback function
@@ -76,6 +94,8 @@ class IMAPConnection(MailboxConnection):
            check_callback(self)
        while True:
            if config_reloading and config_reloading():
                return
            try:
                IMAPClient(
                    host=self._client.host,
@@ -93,3 +113,5 @@ class IMAPConnection(MailboxConnection):
            except Exception as e:
                logger.warning("IMAP connection error. {0}. Reconnecting...".format(e))
                sleep(check_timeout)
            if config_reloading and config_reloading():
                return
--- a/parsedmarc/mail/mailbox_connection.py
+++ b/parsedmarc/mail/mailbox_connection.py
@@ -28,5 +28,5 @@ class MailboxConnection(ABC):
    def keepalive(self):
        raise NotImplementedError
-    def watch(self, check_callback, check_timeout):
+    def watch(self, check_callback, check_timeout, config_reloading=None):
        raise NotImplementedError
--- a/parsedmarc/mail/maildir.py
+++ b/parsedmarc/mail/maildir.py
@@ -19,29 +19,54 @@ class MaildirConnection(MailboxConnection):
    ):
        self._maildir_path = maildir_path
        self._maildir_create = maildir_create
-        maildir_owner = os.stat(maildir_path).st_uid
+        try:
-        if os.getuid() != maildir_owner:
+            maildir_owner = os.stat(maildir_path).st_uid
-            if os.getuid() == 0:
+        except OSError:
-                logger.warning(
+            maildir_owner = None
-                    "Switching uid to {} to access Maildir".format(maildir_owner)
+        current_uid = os.getuid()
-                )
+        if maildir_owner is not None and current_uid != maildir_owner:
-                os.setuid(maildir_owner)
+            if current_uid == 0:
                try:
                    logger.warning(
                        "Switching uid to {} to access Maildir".format(maildir_owner)
                    )
                    os.setuid(maildir_owner)
                except OSError as e:
                    logger.warning(
                        "Failed to switch uid to {}: {}".format(maildir_owner, e)
                    )
            else:
-                ex = "runtime uid {} differ from maildir {} owner {}".format(
+                logger.warning(
-                    os.getuid(), maildir_path, maildir_owner
+                    "Runtime uid {} differs from maildir {} owner {}. "
                    "Access may fail if permissions are insufficient.".format(
                        current_uid, maildir_path, maildir_owner
                    )
                )
-                raise Exception(ex)
+        if maildir_create:
            for subdir in ("cur", "new", "tmp"):
                os.makedirs(os.path.join(maildir_path, subdir), exist_ok=True)
        self._client = mailbox.Maildir(maildir_path, create=maildir_create)
        self._active_folder: mailbox.Maildir = self._client
        self._subfolder_client: Dict[str, mailbox.Maildir] = {}
    def _get_folder(self, folder_name: str) -> mailbox.Maildir:
        """Return a cached subfolder handle, creating it if needed."""
        if folder_name not in self._subfolder_client:
            self._subfolder_client[folder_name] = self._client.add_folder(folder_name)
        return self._subfolder_client[folder_name]
    def create_folder(self, folder_name: str):
-        self._subfolder_client[folder_name] = self._client.add_folder(folder_name)
+        self._get_folder(folder_name)
    def fetch_messages(self, reports_folder: str, **kwargs):
-        return self._client.keys()
+        if reports_folder and reports_folder != "INBOX":
            self._active_folder = self._get_folder(reports_folder)
        else:
            self._active_folder = self._client
        return self._active_folder.keys()
    def fetch_message(self, message_id: str) -> str:
-        msg = self._client.get(message_id)
+        msg = self._active_folder.get(message_id)
        if msg is not None:
            msg = msg.as_string()
            if msg is not None:
@@ -49,24 +74,27 @@ class MaildirConnection(MailboxConnection):
        return ""
    def delete_message(self, message_id: str):
-        self._client.remove(message_id)
+        self._active_folder.remove(message_id)
    def move_message(self, message_id: str, folder_name: str):
-        message_data = self._client.get(message_id)
+        message_data = self._active_folder.get(message_id)
        if message_data is None:
            return
-        if folder_name not in self._subfolder_client:
+        dest = self._get_folder(folder_name)
-            self._subfolder_client[folder_name] = self._client.add_folder(folder_name)
+        dest.add(message_data)
-        self._subfolder_client[folder_name].add(message_data)
+        self._active_folder.remove(message_id)
        self._client.remove(message_id)
    def keepalive(self):
        return
-    def watch(self, check_callback, check_timeout):
+    def watch(self, check_callback, check_timeout, config_reloading=None):
        while True:
            if config_reloading and config_reloading():
                return
            try:
                check_callback(self)
            except Exception as e:
                logger.warning("Maildir init error. {0}".format(e))
            if config_reloading and config_reloading():
                return
            sleep(check_timeout)
--- a/parsedmarc/opensearch.py
+++ b/parsedmarc/opensearch.py
@@ -4,7 +4,9 @@ from __future__ import annotations
 from typing import Any, Optional, Union
 import boto3
 from opensearchpy import (
    AWSV4SignerAuth,
    Boolean,
    Date,
    Document,
@@ -15,6 +17,7 @@ from opensearchpy import (
    Nested,
    Object,
    Q,
    RequestsHttpConnection,
    Search,
    Text,
    connections,
@@ -268,10 +271,14 @@ def set_hosts(
    *,
    use_ssl: Optional[bool] = False,
    ssl_cert_path: Optional[str] = None,
    skip_certificate_verification: bool = False,
    username: Optional[str] = None,
    password: Optional[str] = None,
    api_key: Optional[str] = None,
    timeout: Optional[float] = 60.0,
    auth_type: str = "basic",
    aws_region: Optional[str] = None,
    aws_service: str = "es",
 ):
    """
    Sets the OpenSearch hosts to use
@@ -280,10 +287,14 @@ def set_hosts(
        hosts (str|list[str]): A single hostname or URL, or list of hostnames or URLs
        use_ssl (bool): Use an HTTPS connection to the server
        ssl_cert_path (str): Path to the certificate chain
        skip_certificate_verification (bool): Skip certificate verification
        username (str): The username to use for authentication
        password (str): The password to use for authentication
        api_key (str): The Base64 encoded API key to use for authentication
        timeout (float): Timeout in seconds
        auth_type (str): OpenSearch auth mode: basic (default) or awssigv4
        aws_region (str): AWS region for SigV4 auth (required for awssigv4)
        aws_service (str): AWS service for SigV4 signing (default: es)
    """
    if not isinstance(hosts, list):
        hosts = [hosts]
@@ -291,14 +302,35 @@ def set_hosts(
    if use_ssl:
        conn_params["use_ssl"] = True
        if ssl_cert_path:
            conn_params["verify_certs"] = True
            conn_params["ca_certs"] = ssl_cert_path
-        else:
+        if skip_certificate_verification:
            conn_params["verify_certs"] = False
-    if username and password:
+        else:
-        conn_params["http_auth"] = username + ":" + password
+            conn_params["verify_certs"] = True
-    if api_key:
+    normalized_auth_type = (auth_type or "basic").strip().lower()
-        conn_params["api_key"] = api_key
+    if normalized_auth_type == "awssigv4":
        if not aws_region:
            raise OpenSearchError(
                "OpenSearch AWS SigV4 auth requires 'aws_region' to be set"
            )
        session = boto3.Session()
        credentials = session.get_credentials()
        if credentials is None:
            raise OpenSearchError(
                "Unable to load AWS credentials for OpenSearch SigV4 authentication"
            )
        conn_params["http_auth"] = AWSV4SignerAuth(credentials, aws_region, aws_service)
        conn_params["connection_class"] = RequestsHttpConnection
    elif normalized_auth_type == "basic":
        if username and password:
            conn_params["http_auth"] = username + ":" + password
        if api_key:
            conn_params["api_key"] = api_key
    else:
        raise OpenSearchError(
            f"Unsupported OpenSearch auth_type '{auth_type}'. "
            "Expected 'basic' or 'awssigv4'."
        )
    connections.create_connection(**conn_params)
@@ -735,6 +767,7 @@ def save_smtp_tls_report_to_opensearch(
        index_date = begin_date.strftime("%Y-%m")
    else:
        index_date = begin_date.strftime("%Y-%m-%d")
    report = report.copy()
    report["begin_date"] = begin_date
    report["end_date"] = end_date
--- a/parsedmarc/resources/dbip/dbip-country-lite.mmdb
+++ b/parsedmarc/resources/dbip/dbip-country-lite.mmdb
--- a/parsedmarc/s3.py
+++ b/parsedmarc/s3.py
@@ -93,3 +93,11 @@ class S3Client(object):
        self.bucket.put_object(
            Body=json.dumps(report), Key=object_path, Metadata=object_metadata
        )
    def close(self):
        """Clean up the boto3 resource."""
        try:
            if self.s3.meta is not None:
                self.s3.meta.client.close()
        except Exception:
            pass
--- a/parsedmarc/splunk.py
+++ b/parsedmarc/splunk.py
@@ -58,7 +58,7 @@ class HECClient(object):
        self.source = source
        self.session = requests.Session()
        self.timeout = timeout
-        self.session.verify = verify
+        self.verify = verify
        self._common_data: dict[str, Union[str, int, float, dict]] = dict(
            host=self.host, source=self.source, index=self.index
        )
@@ -124,10 +124,12 @@ class HECClient(object):
                data["event"] = new_report.copy()
                json_str += "{0}\n".format(json.dumps(data))
-        if not self.session.verify:
+        if not self.verify:
            logger.debug("Skipping certificate verification for Splunk HEC")
        try:
-            response = self.session.post(self.url, data=json_str, timeout=self.timeout)
+            response = self.session.post(
                self.url, data=json_str, verify=self.verify, timeout=self.timeout
            )
            response = response.json()
        except Exception as e:
            raise SplunkError(e.__str__())
@@ -161,10 +163,12 @@ class HECClient(object):
            data["event"] = report.copy()
            json_str += "{0}\n".format(json.dumps(data))
-        if not self.session.verify:
+        if not self.verify:
            logger.debug("Skipping certificate verification for Splunk HEC")
        try:
-            response = self.session.post(self.url, data=json_str, timeout=self.timeout)
+            response = self.session.post(
                self.url, data=json_str, verify=self.verify, timeout=self.timeout
            )
            response = response.json()
        except Exception as e:
            raise SplunkError(e.__str__())
@@ -198,12 +202,18 @@ class HECClient(object):
            data["event"] = report.copy()
            json_str += "{0}\n".format(json.dumps(data))
-        if not self.session.verify:
+        if not self.verify:
            logger.debug("Skipping certificate verification for Splunk HEC")
        try:
-            response = self.session.post(self.url, data=json_str, timeout=self.timeout)
+            response = self.session.post(
                self.url, data=json_str, verify=self.verify, timeout=self.timeout
            )
            response = response.json()
        except Exception as e:
            raise SplunkError(e.__str__())
        if response["code"] != 0:
            raise SplunkError(response["text"])
    def close(self):
        """Close the underlying HTTP session."""
        self.session.close()
--- a/parsedmarc/syslog.py
+++ b/parsedmarc/syslog.py
@@ -57,7 +57,7 @@ class SyslogClient(object):
        self.logger.setLevel(logging.INFO)
        # Create the appropriate syslog handler based on protocol
-        log_handler = self._create_syslog_handler(
+        self.log_handler = self._create_syslog_handler(
            server_name,
            server_port,
            self.protocol,
@@ -69,7 +69,7 @@ class SyslogClient(object):
            retry_delay,
        )
-        self.logger.addHandler(log_handler)
+        self.logger.addHandler(self.log_handler)
    def _create_syslog_handler(
        self,
@@ -179,3 +179,8 @@ class SyslogClient(object):
        rows = parsed_smtp_tls_reports_to_csv_rows(smtp_tls_reports)
        for row in rows:
            self.logger.info(json.dumps(row))
    def close(self):
        """Remove and close the syslog handler, releasing its socket."""
        self.logger.removeHandler(self.log_handler)
        self.log_handler.close()
--- a/parsedmarc/utils.py
+++ b/parsedmarc/utils.py
@@ -205,8 +205,7 @@ def get_reverse_dns(
        )[0]
    except dns.exception.DNSException as e:
-        logger.warning(f"get_reverse_dns({ip_address}) exception: {e}")
+        logger.debug(f"get_reverse_dns({ip_address}) exception: {e}")
        pass
    return hostname
@@ -336,6 +335,76 @@ def get_ip_address_country(
    return country
 def load_reverse_dns_map(
    reverse_dns_map: ReverseDNSMap,
    *,
    always_use_local_file: bool = False,
    local_file_path: Optional[str] = None,
    url: Optional[str] = None,
    offline: bool = False,
 ) -> None:
    """
    Loads the reverse DNS map from a URL or local file.
    Clears and repopulates the given map dict in place. If the map is
    fetched from a URL, that is tried first; on failure (or if offline/local
    mode is selected) the bundled CSV is used as a fallback.
    Args:
        reverse_dns_map (dict): The map dict to populate (modified in place)
        always_use_local_file (bool): Always use a local map file
        local_file_path (str): Path to a local map file
        url (str): URL to a reverse DNS map
        offline (bool): Use the built-in copy of the reverse DNS map
    """
    if url is None:
        url = (
            "https://raw.githubusercontent.com/domainaware"
            "/parsedmarc/master/parsedmarc/"
            "resources/maps/base_reverse_dns_map.csv"
        )
    reverse_dns_map.clear()
    def load_csv(_csv_file):
        reader = csv.DictReader(_csv_file)
        for row in reader:
            key = row["base_reverse_dns"].lower().strip()
            reverse_dns_map[key] = {
                "name": row["name"].strip(),
                "type": row["type"].strip(),
            }
    csv_file = io.StringIO()
    if not (offline or always_use_local_file):
        try:
            logger.debug(f"Trying to fetch reverse DNS map from {url}...")
            headers = {"User-Agent": USER_AGENT}
            response = requests.get(url, headers=headers)
            response.raise_for_status()
            csv_file.write(response.text)
            csv_file.seek(0)
            load_csv(csv_file)
        except requests.exceptions.RequestException as e:
            logger.warning(f"Failed to fetch reverse DNS map: {e}")
        except Exception:
            logger.warning("Not a valid CSV file")
            csv_file.seek(0)
            logging.debug("Response body:")
            logger.debug(csv_file.read())
    if len(reverse_dns_map) == 0:
        logger.info("Loading included reverse DNS map...")
        path = str(
            files(parsedmarc.resources.maps).joinpath("base_reverse_dns_map.csv")
        )
        if local_file_path is not None:
            path = local_file_path
        with open(path) as csv_file:
            load_csv(csv_file)
 def get_service_from_reverse_dns_base_domain(
    base_domain,
    *,
@@ -362,55 +431,21 @@ def get_service_from_reverse_dns_base_domain(
    """
    base_domain = base_domain.lower().strip()
    if url is None:
        url = (
            "https://raw.githubusercontent.com/domainaware"
            "/parsedmarc/master/parsedmarc/"
            "resources/maps/base_reverse_dns_map.csv"
        )
    reverse_dns_map_value: ReverseDNSMap
    if reverse_dns_map is None:
        reverse_dns_map_value = {}
    else:
        reverse_dns_map_value = reverse_dns_map
    def load_csv(_csv_file):
        reader = csv.DictReader(_csv_file)
        for row in reader:
            key = row["base_reverse_dns"].lower().strip()
            reverse_dns_map_value[key] = {
                "name": row["name"],
                "type": row["type"],
            }
    csv_file = io.StringIO()
    if not (offline or always_use_local_file) and len(reverse_dns_map_value) == 0:
        try:
            logger.debug(f"Trying to fetch reverse DNS map from {url}...")
            headers = {"User-Agent": USER_AGENT}
            response = requests.get(url, headers=headers)
            response.raise_for_status()
            csv_file.write(response.text)
            csv_file.seek(0)
            load_csv(csv_file)
        except requests.exceptions.RequestException as e:
            logger.warning(f"Failed to fetch reverse DNS map: {e}")
        except Exception:
            logger.warning("Not a valid CSV file")
            csv_file.seek(0)
            logging.debug("Response body:")
            logger.debug(csv_file.read())
    if len(reverse_dns_map_value) == 0:
-        logger.info("Loading included reverse DNS map...")
+        load_reverse_dns_map(
-        path = str(
+            reverse_dns_map_value,
-            files(parsedmarc.resources.maps).joinpath("base_reverse_dns_map.csv")
+            always_use_local_file=always_use_local_file,
            local_file_path=local_file_path,
            url=url,
            offline=offline,
        )
-        if local_file_path is not None:
+
            path = local_file_path
        with open(path) as csv_file:
            load_csv(csv_file)
    service: ReverseDNSService
    try:
        service = reverse_dns_map_value[base_domain]
--- a/parsedmarc/webhook.py
+++ b/parsedmarc/webhook.py
@@ -63,3 +63,7 @@ class WebhookClient(object):
            self.session.post(webhook_url, data=payload, timeout=self.timeout)
        except Exception as error_:
            logger.error("Webhook Error: {0}".format(error_.__str__()))
    def close(self):
        """Close the underlying HTTP session."""
        self.session.close()
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -50,7 +50,7 @@ dependencies = [
    "lxml>=4.4.0",
    "mailsuite>=1.11.2",
    "msgraph-core==0.2.2",
-    "opensearch-py>=2.4.2,<=3.0.0",
+    "opensearch-py>=2.4.2,<=4.0.0",
    "publicsuffixlist>=0.10.0",
    "pygelf>=0.4.2",
    "requests>=2.22.0",
--- a/tests.py
+++ b/tests.py
Author	SHA1	Message	Date
Sean Whalen	1542936468	Bump version to 9.5.4, enhance Maildir folder handling, and add config key aliases for environment variable compatibility	2026-03-25 23:22:46 -04:00
Sean Whalen	fb3c38a8b8	9.5.3 - Fixed `FileNotFoundError` when using Maildir with Docker volume mounts. Python's `mailbox.Maildir(create=True)` only creates `cur/new/tmp` subdirectories when the top-level directory doesn't exist; Docker volume mounts pre-create the directory as empty, skipping subdirectory creation. parsedmarc now explicitly creates the subdirectories when `maildir_create` is enabled. - Maildir UID mismatch no longer crashes the process. In Docker containers where volume ownership differs from the container UID, parsedmarc now logs a warning instead of raising an exception. Also handles `os.setuid` failures gracefully in containers without `CAP_SETUID`. - Token file writes (MS Graph and Gmail) now create parent directories automatically, preventing `FileNotFoundError` when the token path points to a directory that doesn't yet exist. - File paths from config (`token_file`, `credentials_file`, `cert_path`, `log_file`, `output`, `ip_db_path`, `maildir_path`, syslog cert paths, etc.) now expand `~` and `$VAR` references via `os.path.expanduser`/`os.path	2026-03-25 21:29:08 -04:00
Sean Whalen	c9a6145505	9.5.3 - Fixed `FileNotFoundError` when using Maildir with Docker volume mounts. Python's `mailbox.Maildir(create=True)` only creates `cur/new/tmp` subdirectories when the top-level directory doesn't exist; Docker volume mounts pre-create the directory as empty, skipping subdirectory creation. parsedmarc now explicitly creates the subdirectories when `maildir_create` is enabled.	2026-03-25 21:13:34 -04:00
Sean Whalen	e1bdbeb257	Bump version to 9.5.2 and fix interpolation issues in config parser	2026-03-25 20:21:08 -04:00
Sean Whalen	12c4676b79	9.5.1 - Correct ISO format for MSGraphConnection timestamps (PR #706)	2026-03-25 19:43:24 -04:00
mihugo	cda039ee27	Correct ISO format for MSGraphConnection timestamps (#706 ) Fix formatting of ISO 8601 date strings for MSGraphConnection. format yyyy-dd-mmThh:MM:SS.zzzzzz+00:00 already has a timezone indicated. The extra Z is invalid in this format. specifying a "since" in config file causes msgraph to error due to invalid time stamp.	2026-03-25 19:38:23 -04:00
Sean Whalen	ff0ca6538c	9.5.0 Add environment variable configuration support and update documentation - Introduced support for configuration via environment variables using the `PARSEDMARC_{SECTION}_{KEY}` format. - Added `PARSEDMARC_CONFIG_FILE` variable to specify the config file path. - Enabled env-only mode for file-less Docker deployments. - Implemented explicit read permission checks on config files. - Updated changelog and usage documentation to reflect these changes.	2026-03-25 19:25:21 -04:00
Sean Whalen	2032438d3b	9.4.0 ### Added - Extracted `load_reverse_dns_map()` utility function in `utils.py` for loading the reverse DNS map independently of individual IP lookups. - SIGHUP reload now re-downloads/reloads the reverse DNS map, so changes take effect without restarting. - Add premade OpenSearch index patterns, visualizations, and dashboards ### Changed - When `index_prefix_domain_map` is configured, SMTP TLS reports for domains not in the map are now silently dropped instead of being output. Unlike DMARC, TLS-RPT has no DNS authorization records, so this filtering prevents processing reports for unrelated domains. - Bump OpenSearch support to `< 4` ### Fixed - Fixed `get_index_prefix` using wrong key (`domain` instead of `policy_domain`) for SMTP TLS reports, which prevented domain map matching from working for TLS reports. - Domain matching in `get_index_prefix` now lowercases the domain for case-insensitive comparison.	2026-03-23 17:08:26 -04:00
Sean Whalen	1e95c5d30b	9.3.1 Elasticsearch and OpenSearch now verify SSL certificates by default when `ssl = True`, even without a `cert_path` - Added `skip_certificate_verification` option to the `elasticsearch` and `opensearch` configuration sections for consistency with `splunk_hec` - Splunk HEC `skip_certificate_verification` now works correctly with self-signed certificates - SMTP TLS reports no longer fail when saving to multiple output targets (e.g. Elasticsearch and OpenSearch) due to in-place mutation of the report dict - Output client initialization errors now identify which module failed (e.g. "OpenSearch: ConnectionError..." instead of generic "Output client error") - Enhanced error handling for output client initialization	2026-03-22 14:38:32 -04:00
Sean Whalen	cb2384be83	Copy report before modifying begin_date and end_date in save_smtp_tls_report functions	2026-03-22 13:13:21 -04:00
Sean Whalen	9a5b5310fa	Update Grafana and Splunk environment variables in docker-compose for consistency	2026-03-22 12:40:42 -04:00
Sean Whalen	9849598100	Formatting	2026-03-21 16:17:35 -04:00
Sean Whalen	e82f3e58a1	SIGHUP-based configuration reload for watch mode (#697 ) * Enhance mailbox connection watch method to support reload functionality - Updated the `watch` method in `GmailConnection`, `MSGraphConnection`, `IMAPConnection`, `MaildirConnection`, and the abstract `MailboxConnection` class to accept an optional `should_reload` parameter. This allows the method to check if a reload is necessary and exit the loop if so. - Modified related tests to accommodate the new method signature. - Changed logger calls from `critical` to `error` for consistency in logging severity. - Added a new settings file for Claude with specific permissions for testing and code checks. * Update parsedmarc/cli.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update parsedmarc/cli.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * [WIP] SIGHUP-based configuration reload for watch mode (#698) * Initial plan * Fix reload state consistency, resource leaks, stale opts; add tests Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> Agent-Logs-Url: https://github.com/domainaware/parsedmarc/sessions/3c2e0bb9-7e2d-4efa-aef6-d2b98478b921 --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> * [WIP] SIGHUP-based configuration reload for watch mode (#699) * Initial plan * Fix review comments: ConfigurationError wrapping, duplicate parse args, bool parsing, Kafka required topics, should_reload kwarg, SIGHUP test skips Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> Agent-Logs-Url: https://github.com/domainaware/parsedmarc/sessions/0779003c-ccbe-4d76-9748-801dbc238b96 --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> * SIGHUP-based configuration reload: address review feedback (#700) * Initial plan * Address review feedback: kafka_ssl, duplicate silent, exception chain, log file reload, should_reload timing Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> Agent-Logs-Url: https://github.com/domainaware/parsedmarc/sessions/a8a43c55-23fa-4471-abe6-7ac966f381f9 --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> * Update parsedmarc/cli.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Best-effort initialization for optional output clients in watch mode (#701) * Initial plan * Wrap optional output client init in try/except for best-effort initialization Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> Agent-Logs-Url: https://github.com/domainaware/parsedmarc/sessions/59241d4e-1b05-4a92-b2d2-e6d13d10a4fd --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> * Fix SIGHUP reload tight-loop in watch mode (#702) * Initial plan * Fix _reload_requested tight-loop: reset flag before reload to capture concurrent SIGHUPs Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> Agent-Logs-Url: https://github.com/domainaware/parsedmarc/sessions/879d0bb1-9037-41f7-bc89-f59611956d2e --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> * Update parsedmarc/cli.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Fix resource leak when HEC config is invalid in `_init_output_clients()` (#703) * Initial plan * Fix resource leak: validate HEC settings before creating any output clients Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> Agent-Logs-Url: https://github.com/domainaware/parsedmarc/sessions/38c73e09-789d-4d41-b75e-bbc61418859d --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> * Ensure SIGHUP never triggers a new email batch across all watch() implementations (#704) * Initial plan * Ensure SIGHUP never starts a new email batch in any watch() implementation Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> Agent-Logs-Url: https://github.com/domainaware/parsedmarc/sessions/45d5be30-8f6b-4200-9bdd-15c655033f17 --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> * SIGHUP-based config reload for watch mode: address review feedback (#705) * Initial plan * Address review feedback: Kafka SSL context, SIGHUP handler safety, test formatting Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> Agent-Logs-Url: https://github.com/domainaware/parsedmarc/sessions/8f2fd48f-32a4-4258-9a89-06f7c7ac29bf --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> * Reverted changes by copilot that turned errors into warnings * Enhance usage documentation for config reload: clarify behavior on successful reload and error handling * Update CHANGELOG.md to reflect config reload enhancements * Add pytest command to settings for silent output during testing * Enhance resource management: add close methods for S3Client and HECClient, and improve IMAP connection handling during IDLE. Update CHANGELOG.md for config reload improvements and bug fixes. * Update changelog to not include fixes within the same unreleased version * Refactor changelog entries for clarity and consistency in configuration reload section * Fix changelog entry for msgraph configuration check * Update CHANGELOG..md * make single list items on one line in the changelog instead of doing hard wraps * Remove incorrect IMAP changes * Rename 'should_reload' parameter to 'config_reloading' in mailbox connection methods for clarity * Restore startup configuration checks * Improve error logging for Elasticsearch and OpenSearch exceptions * Bump version to 9.3.0 in constants.py * Refactor GelfClient methods to use specific report types instead of generic dicts * Refactor tests to use assertions consistently and improve type hints --------- Co-authored-by: Sean Whalen <seanthegeek@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>	2026-03-21 16:14:48 -04:00
Sean Whalen	dd1a8fd461	Create docker compose file for dashboard development	2026-03-20 14:12:26 -04:00
Sean Whalen	81656c75e9	Update OpenSearch healthcheck to use HTTPS and include authentication	2026-03-16 17:53:37 -04:00
Sean Whalen	691b0fcd41	Fix changelog headings	2026-03-10 20:34:13 -04:00
Sean Whalen	b9343a295f	9.2.1 - Better checking of `msconfig` configuration (PR #695) - Updated `dbip-country-lite` database to version `2026-03` - Changed - DNS query error logging level from `warning` to `debug`	2026-03-10 20:32:33 -04:00
Kili	b51a62463f	Fail fast on invalid MS Graph username/password config (#695 )	2026-03-10 19:34:16 -04:00
Kili	66ba5b0e5e	Add MS Graph auth matrix regression tests (#696 ) * Rebase MS Graph auth matrix tests onto current master * Expand ClientSecret auth matrix coverage	2026-03-10 19:33:37 -04:00
Sean Whalen	7929919223	9.2.0 ### Added - OpenSearch AWS SigV4 authentication support (PR #673) - IMAP move/delete compatibility fallbacks (PR #671) - `fail_on_output_error` CLI option for sink failures (PR #672) - Gmail service account auth mode for non-interactive runs (PR #676) - Microsoft Graph certificate authentication support (PRs #692 and #693) - Microsoft Graph well-known folder fallback for root listing failures (PR #618 and #684 close #609) ### Fixed - Pass mailbox since filter through `watch_inbox` callback (PR #670 closes issue #581) - `parsedmarc.mail.gmail.GmailConnection.delete_message` now properly calls the Gmail API (PR #668) - Avoid extra mailbox fetch in batch and test mode (PR #691 closes #533)	2026-03-10 11:41:37 -04:00
Kili	faa68333a9	Avoid extra mailbox fetch in batch/test mode and add regression test (#691 ) Co-authored-by: Sean Whalen <44679+seanthegeek@users.noreply.github.com>	2026-03-10 11:22:39 -04:00
Kili	d34a33e980	Validate MS Graph certificate auth inputs (#693 ) * Validate MS Graph certificate auth inputs * Fix MS Graph shared scope detection without username	2026-03-10 11:22:09 -04:00
Kili	9040a38842	Refine MS Graph well-known folder fallback (#694 ) * Refine MS Graph well-known folder fallback * Make MS Graph retry test doubles method-aware	2026-03-10 11:20:43 -04:00
Kili	ea0e3b11c1	Add MS Graph certificate authentication support (#692 ) * Add MS Graph certificate authentication support * Preserve MS Graph constructor compatibility --------- Co-authored-by: Sean Whalen <44679+seanthegeek@users.noreply.github.com>	2026-03-10 09:30:39 -04:00
Kili	199b782191	Add MS Graph well-known folder fallback for root listing failures (#689 ) * Add MS Graph well-known folder fallback for root listing failures * Resolve test merge cleanup for MS Graph folder fallback	2026-03-10 09:25:37 -04:00
Kili	25f3c3e1d0	Add security policy (#688 ) * Add security policy * Update SECURITY.md for vulnerability reporting clarity Clarified instructions for reporting vulnerabilities and updated language regarding security fixes. --------- Co-authored-by: Sean Whalen <44679+seanthegeek@users.noreply.github.com>	2026-03-09 18:24:16 -04:00
Kili	a14ff66f5a	Add GitHub issue templates (#686 )	2026-03-09 18:17:06 -04:00
Kili	fb738bf9c4	Add contributing guide (#685 )	2026-03-09 18:16:47 -04:00
Kili	0e811fe0ff	Add pull request template (#687 )	2026-03-09 18:15:40 -04:00
Kili	56eb565ad2	Accept pathlib.Path in report parsing APIs (#680 ) * Accept pathlib.Path in report parsing APIs * Polish PathLike typing and test names	2026-03-09 18:08:57 -04:00
Kili	2c3abb3e8c	Retry transient MS Graph request errors (#679 ) * Retry transient MS Graph request errors * Handle zero MS Graph retry attempts explicitly	2026-03-09 17:56:22 -04:00
Kili	326e630f50	Add performance tuning guidance for large mailbox runs (#677 )	2026-03-09 17:44:42 -04:00
Kili	cdc30e6780	Tune Codecov statuses for small PRs (#678 )	2026-03-09 17:43:34 -04:00
Kili	f2febf21d3	Add fail_on_output_error CLI option for sink failures (#672 ) * Add fail-on-output-error option and CLI regression test * Broaden fail_on_output_error coverage for disabled and multi-sink paths	2026-03-09 17:35:38 -04:00
Kili	79f47121a4	Pass mailbox since filter through watch_inbox callback (#670 ) * Pass mailbox since through watch loop and add regression test * Add CLI regression test for mailbox since in watch mode	2026-03-09 17:33:42 -04:00
Kili	6e6c90e19b	Add IMAP move/delete compatibility fallbacks (#671 ) * Add IMAP move/delete compatibility fallbacks with tests * Expand IMAP fallback tests for success and error paths	2026-03-09 17:29:01 -04:00
Kili	c4d7455839	Add OpenSearch AWS SigV4 authentication support (#673 ) * Add OpenSearch AWS SigV4 authentication support * Increase SigV4 coverage for auth validation and CLI config wiring * Update parsedmarc/opensearch.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update docs/source/usage.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Sean Whalen <44679+seanthegeek@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-03-09 17:21:22 -04:00
Kili	95e6fb85a1	Fix Gmail delete_message to execute API request (#668 ) * Fix Gmail delete to execute request and add regression test * Fix duplicate GmailConnection import in tests	2026-03-09 17:11:35 -04:00
Kili	298d5b6e6e	CI: split lint/docs/build from integration tests matrix (#669 ) * Optimize CI: split lint/docs/build from integration tests * Trim unnecessary package install from lint job	2026-03-09 17:09:02 -04:00
Kili	a3c5bb906b	Add Gmail service account auth mode with delegated user support (#676 )	2026-03-09 17:04:30 -04:00
Kili	d49ce6a13f	Increase unit test coverage for Gmail/Graph/IMAP connectors (#664 ) * Increase coverage for Gmail, Graph, and IMAP mail connectors * Make testLoadTokenMissing use guaranteed-missing temp path * Expand coverage for Gmail token refresh and Graph pagination error paths	2026-03-09 11:54:43 -04:00
Sean Whalen	adb0d31382	9.1.2 - Fix duplicate detection for normalized aggregate reports in Elasticsearch/OpenSearch (PR #666 fixes issue #665)	2026-03-06 13:41:33 -05:00
Copilot	ae5d20ecf5	Fix duplicate detection for normalized aggregate reports in Elasticsearch/OpenSearch (#666 ) Change date_begin/date_end queries from exact match to range queries (gte/lte) so that previously saved normalized time buckets are correctly detected as duplicates within the original report's date range. Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>	2026-03-06 13:21:54 -05:00
`@@ -1,3 +1,3 @@`
	`__version__ = "9.1.1"`	`__version__ = "9.5.4"`

	`USER_AGENT = f"parsedmarc/{__version__}"`	`USER_AGENT = f"parsedmarc/{__version__}"`