Sean Whalen 5bb6570f4e collect_domain_info.py: replace curl fallback with pure-requests path (#731)
* collect_domain_info.py: replace curl shell-out with requests-based fallback

The previous fallback for cert-error / UA-blocked sites was a curl
subprocess. This was correct but added an external runtime dependency
(curl is usually present but not on minimal containers) and a fork +
tempfile + parse round-trip per fallback call. Replaced with a pure
requests-based path that uses a custom HTTPAdapter to relax the SSL
context to the same effective configuration:

  ssl.CERT_NONE                 (verify=False, equivalent to curl -k)
  set_ciphers("DEFAULT@SECLEVEL=0")  (allows weak DH/RSA, recovers
                                       DH_KEY_TOO_SMALL hosts that
                                       even curl's default config
                                       rejects)
  options |= 0x4 (OP_LEGACY_SERVER_CONNECT, allows unsafe legacy
                  TLS renegotiation for older server stacks)

Plus a real-browser User-Agent (same Chrome/124 string as before),
verify=False, allow_redirects=True, and Session.max_redirects=5.
InsecureRequestWarning is suppressed at module level since the
verify-disabled path is intentional.
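
A minimal sketch of the fallback described above, assuming a custom
HTTPAdapter subclass that injects the relaxed SSL context. The names
RelaxedTLSAdapter, build_relaxed_context, build_fallback_session, and
the exact BROWSER_UA value are hypothetical; the commit only states
that a Chrome/124 User-Agent string is used.

```python
import ssl

import requests
import urllib3
from requests.adapters import HTTPAdapter

# Hypothetical placeholder: the commit names only "Chrome/124", not the full string.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

# The verify-disabled path is intentional, so silence the warning once.
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)


def build_relaxed_context() -> ssl.SSLContext:
    """Build an SSL context with the three relaxations listed in the commit."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be off before verify_mode can be CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE  # roughly curl -k
    ctx.set_ciphers("DEFAULT@SECLEVEL=0")  # permit weak DH/RSA parameters
    ctx.options |= getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)
    return ctx


class RelaxedTLSAdapter(HTTPAdapter):
    """Transport adapter that hands the relaxed context to urllib3."""

    def init_poolmanager(self, *args, **kwargs):
        kwargs["ssl_context"] = build_relaxed_context()
        return super().init_poolmanager(*args, **kwargs)


def build_fallback_session() -> requests.Session:
    """Session for the fallback path only; the primary path stays strict."""
    session = requests.Session()
    session.max_redirects = 5
    session.headers["User-Agent"] = BROWSER_UA
    session.mount("https://", RelaxedTLSAdapter())
    return session
```

A caller would then issue something like
session.get(url, verify=False, allow_redirects=True, timeout=10),
where the timeout value is an assumption and not taken from the commit.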

Smoke-tested against the same eight cert-error domains as the original
curl fallback. Same recovery rate on all eight (six recover with full
title+description, two -- twmbroadband.com and ltt.ly -- remain
genuinely unreachable with both implementations). One additional win:
vnpt.com.vn (DH_KEY_TOO_SMALL) now recovers under the SECLEVEL=0
cipher list, which curl with default options did not. Happy-path
domains (google.com) still take the primary path and produce
identical output.

Side effects:
- removes the curl runtime dependency from collect_domain_info.py
- removes ~10ms of fork-and-parse overhead per fallback call
- removes the tempfile-on-disk round-trip; body is captured in-memory
- error suffix in the TSV's error column changes from "| curl: ..." to
  "| fallback: ..."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Use getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4) instead of raw 0x4

Per PR review: prefer the constant where the interpreter exposes it
(Python 3.12+) and fall back to the raw value (0x4) only on older
interpreters that the project still supports. Self-documenting and
future-proof against any unlikely stdlib value reshuffle.
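
The review suggestion can be sketched in isolation as follows. The
helper name allow_legacy_renegotiation is hypothetical and only
illustrates the getattr pattern; it is not claimed to match the
script's actual structure.

```python
import ssl

# Use the named constant where the ssl module exposes it (Python 3.12+);
# otherwise fall back to the raw OpenSSL flag value 0x4.
OP_LEGACY_SERVER_CONNECT = getattr(ssl, "OP_LEGACY_SERVER_CONNECT", 0x4)


def allow_legacy_renegotiation(ctx: ssl.SSLContext) -> ssl.SSLContext:
    """Set the legacy-server-connect option on an existing context."""
    ctx.options |= OP_LEGACY_SERVER_CONNECT
    return ctx


ctx = allow_legacy_renegotiation(ssl.create_default_context())
```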

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Sean Whalen <seanthegeek@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 16:34:57 -04:00

parsedmarc


A screenshot of DMARC summary charts in Kibana

parsedmarc is a Python module and CLI utility for parsing DMARC reports. When used with Elasticsearch and Kibana (or Splunk), it works as a self-hosted open-source alternative to commercial DMARC report processing services such as Agari Brand Protection, Dmarcian, OnDMARC, ProofPoint Email Fraud Defense, and Valimail.

Note

Domain-based Message Authentication, Reporting, and Conformance (DMARC) is an email authentication protocol.

Sponsors

This project is maintained by one developer. Please consider sponsoring my work if you or your organization benefit from it.

Features

  • Parses draft and 1.0 standard aggregate/rua DMARC reports
  • Parses forensic/failure/ruf DMARC reports
  • Parses reports from SMTP TLS Reporting
  • Can parse reports from an inbox over IMAP, Microsoft Graph, or Gmail API
  • Transparently handles gzip or zip compressed reports
  • Consistent data structures
  • Simple JSON and/or CSV output
  • Optionally email the results
  • Optionally send the results to Elasticsearch, OpenSearch, and/or Splunk, for use with premade dashboards
  • Optionally send reports to Apache Kafka
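
As a rough illustration of how gzip or zip compressed reports can be handled transparently, the sketch below checks the payload's leading magic bytes using only the standard library. It is not parsedmarc's actual implementation; the function name extract_report_bytes and the assumption that the report is the first archive member are hypothetical.

```python
import gzip
import io
import zipfile

GZIP_MAGIC = b"\x1f\x8b"
ZIP_MAGIC = b"PK\x03\x04"


def extract_report_bytes(payload: bytes) -> bytes:
    """Return the report body, decompressing gzip or zip input if detected."""
    if payload.startswith(GZIP_MAGIC):
        return gzip.decompress(payload)
    if payload.startswith(ZIP_MAGIC):
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            # Assumption: the report is the first member of the archive.
            return archive.read(archive.namelist()[0])
    # Anything else is treated as an uncompressed report.
    return payload
```

The caller passes raw bytes from a file or attachment and receives the uncompressed report, regardless of how it was packaged.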

Python Compatibility

This project supports the following Python versions, which are either actively maintained or are the default versions for RHEL or Debian.

Version  Supported  Reason
< 3.6    No         End of Life (EOL)
3.6      No         Used in RHEL 8, but not supported by project dependencies
3.7      No         End of Life (EOL)
3.8      No         End of Life (EOL)
3.9      No         Used in Debian 11 and RHEL 9, but not supported by project dependencies
3.10     Yes        Actively maintained
3.11     Yes        Actively maintained; supported until June 2028 (Debian 12)
3.12     Yes        Actively maintained; supported until May 2035 (RHEL 10)
3.13     Yes        Actively maintained; supported until June 2030 (Debian 13)
3.14     Yes        Supported (requires imapclient>=3.1.0)
Readme Apache-2.0 119 MiB