mirror of
https://github.com/domainaware/parsedmarc.git
synced 2026-05-20 10:55:24 +00:00
9aa930f7cc91edbce1c99376bf9179971bd00989
Re-ran the expanded-multilingual classifier (PR #762's classifier with broader language coverage on Healthcare, Travel, Government, Retail, Finance, ISP, Manufacturing, Logistics, Real Estate, Automotive, Legal, Agriculture, plus Finance-via-body-text catching insurance/investment/asset-management) against every cached TSV from prior batches (b6–b13).

3,171 domains that previously couldn't be auto-classified (and were therefore added to known_unknown_base_reverse_dns.txt) now match the new detectors. These domains are promoted out of KU and into the map under their newly classified `(name, type)` pairs.

Type distribution of promotions:
- Finance: 736
- Logistics: 179
- Real Estate: 105
- Healthcare: 68
- ISP: 323
- Retail: 159
- Education: 110
- Marketing: 66
- Manufacturing: 207
- Technology: 142
- Consulting: 99
- Nonprofit: 64
- Government: 136
- Travel: 123
- Utilities: 71
- Legal: 53
- plus smaller volumes across ~25 other industry types

ASN-domain coverage of the bundled IPinfo Lite MMDB after these promotions:
- by domain count: 32,254 / 63,993 (50.40%, up from 45.45%)
- by IPv4 weight: 98.45%

Honest scope note: the multilingual classifier achieves "concept parity" for the top-5 high-volume detectors (Healthcare, Travel, Government, Retail, Finance) across ~30 languages. Smaller detectors (Photography, Conglomerate, Sports, Defense, MSSP, IaaS/PaaS/SaaS, etc.) still have ~10-15 languages with 1-3 keywords each. Further per-detector multilingual parity is a follow-up effort; each subsequent expansion recovers fewer domains as the long tail of language-specific phrasings shrinks.

Co-authored-by: Sean Whalen <seanthegeek@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parsedmarc
parsedmarc is a Python module and CLI utility for parsing DMARC
reports. When used with Elasticsearch and Kibana (or Splunk), it works
as a self-hosted open-source alternative to commercial DMARC report
processing services such as Agari Brand Protection, Dmarcian, OnDMARC,
Proofpoint Email Fraud Defense, and Valimail.
Note
Domain-based Message Authentication, Reporting, and Conformance (DMARC) is an email authentication protocol.
Sponsors
This project is maintained by one developer. Please consider sponsoring my work if you or your organization benefit from it.
Features
- Parses draft and 1.0 standard aggregate/rua DMARC reports
- Parses forensic/failure/ruf DMARC reports
- Parses reports from SMTP TLS Reporting
- Can parse reports from an inbox over IMAP, Microsoft Graph, or Gmail API
- Transparently handles gzip or zip compressed reports
- Consistent data structures
- Simple JSON and/or CSV output
- Optionally email the results
- Optionally send the results to Elasticsearch, OpenSearch, and/or Splunk, for use with premade dashboards
- Optionally send reports to Apache Kafka
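
As an illustration of what "transparently handles gzip or zip compressed reports" can mean in practice, here is a minimal standard-library sketch that detects the container format from its leading magic bytes and returns the raw report payload. It is not parsedmarc's own implementation; the function name `extract_report_bytes` and the choice to read only the first zip member are assumptions made for this example.

```python
import gzip
import io
import zipfile

# Leading "magic" bytes that identify each container format.
GZIP_MAGIC = b"\x1f\x8b"
ZIP_MAGIC = b"PK\x03\x04"


def extract_report_bytes(data: bytes) -> bytes:
    """Return the report payload, decompressing gzip or zip input if needed.

    Hypothetical helper for illustration; not part of the parsedmarc API.
    """
    if data.startswith(GZIP_MAGIC):
        return gzip.decompress(data)
    if data.startswith(ZIP_MAGIC):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = archive.namelist()
            if not names:
                raise ValueError("zip archive contains no files")
            # Assumption: the report is the first member of the archive.
            return archive.read(names[0])
    # Neither gzip nor zip: treat the input as an uncompressed report.
    return data
```

A caller can pass the bytes of a mail attachment straight to this function and hand the result to an XML or JSON parser, without needing to know in advance how the sender packaged the report.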
Python Compatibility
This project supports the following Python versions, which are either actively maintained or are the default versions for RHEL or Debian.
| Version | Supported | Reason |
|---|---|---|
| < 3.6 | ❌ | End of Life (EOL) |
| 3.6 | ❌ | Used in RHEL 8, but not supported by project dependencies |
| 3.7 | ❌ | End of Life (EOL) |
| 3.8 | ❌ | End of Life (EOL) |
| 3.9 | ❌ | Used in Debian 11 and RHEL 9, but not supported by project dependencies |
| 3.10 | ✅ | Actively maintained |
| 3.11 | ✅ | Actively maintained; supported until June 2028 (Debian 12) |
| 3.12 | ✅ | Actively maintained; supported until May 2035 (RHEL 10) |
| 3.13 | ✅ | Actively maintained; supported until June 2030 (Debian 13) |
| 3.14 | ✅ | Supported (requires imapclient>=3.1.0) |
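
The table above amounts to a lower bound of Python 3.10. A deployment script could check that bound before installing or running the tool, as in the sketch below. The names `MIN_SUPPORTED` and `is_supported` are hypothetical, and the check deliberately enforces only the minimum version, since the table does not say anything about releases newer than 3.14.

```python
import sys

# Oldest Python version marked as supported in the compatibility table.
MIN_SUPPORTED = (3, 10)


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the given (major, minor, ...) version meets the minimum.

    Hypothetical helper for illustration; not part of parsedmarc.
    """
    return tuple(version_info[:2]) >= MIN_SUPPORTED


if __name__ == "__main__":
    if not is_supported():
        found = ".".join(str(part) for part in sys.version_info[:3])
        raise SystemExit(f"Python 3.10 or newer is required, found {found}")
```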
