Sean Whalen 2978436d89 Expand reverse-DNS map and PSL overrides from the live PSL (#716)
* Expand reverse-DNS map and PSL overrides from the live PSL

Parses the private-domains section of the live Public Suffix List and
adds 269 brand-owned suffixes as PSL overrides paired with map
entries, so customer subdomains on shared hosting / SaaS / PaaS
platforms fold to the operator's brand. Adds 33 ASN-domain entries
for the subset of these brands whose IP space is registered under a
different corporate domain in the MMDB, so both the PTR-derived
lookup and the ASN-fallback lookup hit the same (name, type). Also
normalizes ``a2hosting.com`` from ``A2Hosting`` to ``A2 Hosting``
for spelling consistency.
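
The pairing described above can be pictured with a minimal sketch. It is not parsedmarc's code: `PSL_OVERRIDES`, `REVERSE_DNS_MAP`, `fold_to_override`, and `lookup_brand` are hypothetical names, and the `"PaaS"` type value is only illustrative. It assumes an override is a suffix that a PTR hostname folds to, and that the map is keyed by base domain.

```python
# Hypothetical stand-ins; none of these names come from parsedmarc itself.
PSL_OVERRIDES = ["netlify.app"]  # brand-owned suffixes treated as base domains

# (name, type) keyed by base domain; both rows point at the same brand.
REVERSE_DNS_MAP = {
    "netlify.app": ("Netlify", "PaaS"),  # reached via the PTR path
    "netlify.com": ("Netlify", "PaaS"),  # reached via the ASN-fallback path
}


def fold_to_override(hostname):
    """Return the override suffix that hostname falls under, or None."""
    host = hostname.lower().rstrip(".")
    for suffix in PSL_OVERRIDES:
        if host == suffix or host.endswith("." + suffix):
            return suffix
    return None


def lookup_brand(ptr_hostname, as_domain):
    """Try the PTR-derived base domain first, then fall back to the ASN domain."""
    if ptr_hostname:
        base = fold_to_override(ptr_hostname)
        if base in REVERSE_DNS_MAP:
            return REVERSE_DNS_MAP[base]
    if as_domain:
        return REVERSE_DNS_MAP.get(as_domain.lower())
    return None
```

With both rows present, a customer hostname such as `customer-site.netlify.app` and an IP whose only signal is the `netlify.com` ASN domain resolve to the same (name, type) pair.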

PTR-path wins (overrides + map entries)
- Web hosts: A2 Hosting, alwaysdata, Antagonist, Beget, bplaced,
  Bytemark, Combell, cyber_Folks, cyon, DreamHost, EasyWP, Gehirn,
  HelioHost, home.pl, HostyHosting, Hypernode, IONOS (6 suffixes),
  Jotelulu, JouwWeb, KaasHosting, Keyweb, LCube, LiquidNet, McHost,
  Memset, Mittwald, Mythic Beasts, NearlyFreeSpeech, Nimbus Hosting,
  One.com (20 ccTLD variants), OwnProvider, Pantheon, Planet-Work,
  prgmr, Rackmaze, Rad Web Hosting, Raidboxes, Servebolt,
  SpeedPartner, Uberspace, Whatbox, WP Engine, ZAP-Hosting, Zitcom.
- Dynamic DNS: DuckDNS, DynDNS (24), No-IP (22), Now-DNS, dynv6,
  freemyip, nsupdate.info, ddnss.de, GoIP, DrayTek.
- PaaS/SaaS/IaaS: Netlify, Vercel (6), Heroku, fly.io, Render,
  Firebase/GCP (4), Azure (5), AWS (4), DigitalOcean (2), Red Hat
  OpenShift, Hasura, Supabase, Snowflake/Streamlit, Read the Docs,
  PythonAnywhere, GitHub, GitLab, Adobe Magento.
- Hosted sites/stores: Hatena (6), Notion, Figma, Webflow, Wix (4),
  Shopify, Shopware, Sellfy, Spreadshop (19 ccTLDs), Datto.
- Email/Marketing: Fastmail, ActiveTrail, Leadpages, Heyflow, Carrd,
  Typeform.
- CDN/Technology: Akamai (7), Fastly (3), Yandex Cloud.

ASN-path wins (MMDB coverage now attributes 1,184,256 more IPv4
addresses to a named brand, 85.04% -> 85.08%): yandex.com, ya.ru,
hosting.com (A2 Hosting), beget.com, cyberfolks.pl, fly.io,
bytemark.co.uk, cyberfolks.ro, keyweb.de, mittwald.de, memset.com,
zap-hosting.com, datto.com, jotelulu.com, yandex.cloud, github.com,
asavie.com (Akamai), and 16 others.

Entries are curated from the live PSL rather than any bundled copy;
brand / as_name attribution was verified against the CLAUDE.md rule
that the IP-WHOIS signal is only trusted when the domain name itself
matches the host's name (name-collisions in MMDB were skipped —
Hypernode AU, goipgroup.com, liquidnet.com, One.com substring noise,
nimbusitsolutions.com, etc.). Types follow
``base_reverse_dns_types.txt``; ``sortlists.py`` re-sorts + dedupes +
validates after the batch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Document PSL-derived override workflow and load_psl_overrides gotcha

Adds three pieces of map-maintenance context learned while building
this PR:

- New subsection "Discovering overrides from the live PSL
  private-domains section" — distinct source from live DMARC data
  (unknown_base_reverse_dns.csv) and MMDB coverage-gap analysis. The
  private section is itself a list of brand-owned suffixes; each is a
  candidate (psl_override + map entry) pair. Emphasizes ruthless
  selectivity — most of the 600+ private-section orgs are dev
  sandboxes or hobby zones that will never appear in DMARC reports.

- Two-path coverage as a single linked step, not two round-trips:
  when adding a PSL override for a hosted-content suffix
  (netlify.app), also add a map row for the brand's corporate
  as_domain (netlify.com) in the same pass. The override fixes the
  PTR path; the ASN-domain alias fixes the ASN-fallback path.

- The load_psl_overrides() fetch-first gotcha. The no-arg form pulls
  the file from master on GitHub, so end-to-end testing of local
  overrides silently uses the old remote version. offline=True is
  required to test local changes against get_base_domain().
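
The fetch-first gotcha in the last item can be modeled with a small stand-in. This is only a sketch of the behavior described above, not the body of `load_psl_overrides()`: `load_overrides_sketch`, `REMOTE_OVERRIDES`, and the `psl_overrides.txt` file name are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Stand-in for the copy on master; a real loader would download it.
REMOTE_OVERRIDES = ["netlify.app"]


def load_overrides_sketch(local_path, offline=False):
    """Hypothetical fetch-first loader: the remote copy wins unless offline=True."""
    if not offline:
        return list(REMOTE_OVERRIDES)  # local edits are never consulted
    lines = Path(local_path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    local_file = Path(tmp) / "psl_overrides.txt"
    local_file.write_text("netlify.app\nvercel.app\n")  # local, unmerged edit
    default_result = load_overrides_sketch(local_file)
    offline_result = load_overrides_sketch(local_file, offline=True)
```

In this model the locally added suffix appears only in `offline_result`, which is why a test that omits the offline flag can pass or fail against the old remote list without ever exercising the local change.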

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Sean Whalen <seanthegeek@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 09:12:32 -04:00

parsedmarc


A screenshot of DMARC summary charts in Kibana

parsedmarc is a Python module and CLI utility for parsing DMARC reports. When used with Elasticsearch and Kibana (or Splunk), it works as a self-hosted open-source alternative to commercial DMARC report processing services such as Agari Brand Protection, Dmarcian, OnDMARC, ProofPoint Email Fraud Defense, and Valimail.

Note

Domain-based Message Authentication, Reporting, and Conformance (DMARC) is an email authentication protocol.

Sponsors

This project is maintained by one developer. Please consider sponsoring my work if you or your organization benefit from it.

Features

  • Parses draft and 1.0 standard aggregate/rua DMARC reports
  • Parses forensic/failure/ruf DMARC reports
  • Parses reports from SMTP TLS Reporting
  • Can parse reports from an inbox over IMAP, Microsoft Graph, or Gmail API
  • Transparently handles gzip or zip compressed reports
  • Consistent data structures
  • Simple JSON and/or CSV output
  • Optionally email the results
  • Optionally send the results to Elasticsearch, OpenSearch, and/or Splunk, for use with premade dashboards
  • Optionally send reports to Apache Kafka
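
As an illustration of what transparent handling of compressed reports can look like, here is a minimal standard-library sketch that sniffs the leading magic bytes and decompresses accordingly. It is not parsedmarc's implementation; `extract_report_bytes` is a hypothetical name, and the sketch assumes that a zip archive carries the report as its first member.

```python
import gzip
import io
import zipfile

MAGIC_ZIP = b"PK\x03\x04"   # local file header signature of a zip archive
MAGIC_GZIP = b"\x1f\x8b"    # gzip member header


def extract_report_bytes(data):
    """Return the report payload, decompressing zip or gzip input if needed."""
    if data.startswith(MAGIC_ZIP):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            # Assumption: the first archive member is the report.
            return archive.read(archive.namelist()[0])
    if data.startswith(MAGIC_GZIP):
        return gzip.decompress(data)
    return data  # treat anything else as an uncompressed report
```

Checking the content rather than the file extension means a mislabeled attachment (for example, a gzip file named `.xml`) is still decoded.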

Python Compatibility

This project supports the following Python versions, which are either actively maintained or are the default versions for RHEL or Debian.

| Version | Supported | Reason |
|---------|-----------|--------|
| < 3.6 | No | End of Life (EOL) |
| 3.6 | No | Used in RHEL 8, but not supported by project dependencies |
| 3.7 | No | End of Life (EOL) |
| 3.8 | No | End of Life (EOL) |
| 3.9 | No | Used in Debian 11 and RHEL 9, but not supported by project dependencies |
| 3.10 | Yes | Actively maintained |
| 3.11 | Yes | Actively maintained; supported until June 2028 (Debian 12) |
| 3.12 | Yes | Actively maintained; supported until May 2035 (RHEL 10) |
| 3.13 | Yes | Actively maintained; supported until June 2030 (Debian 13) |
| 3.14 | Yes | Supported (requires imapclient>=3.1.0) |
Readme Apache-2.0 109 MiB