Sean Whalen 9aa930f7cc Retroactive promotions: 3,171 KU rows reclassified by expanded multilingual classifier (#763)
Re-ran the expanded-multilingual classifier (PR #762's classifier with
broader language coverage on Healthcare, Travel, Government, Retail,
Finance, ISP, Manufacturing, Logistics, Real Estate, Automotive, Legal,
Agriculture, plus Finance-via-body-text catching insurance/investment/
asset-management) against every cached TSV from prior batches (b6–b13).
3,171 domains that previously couldn't be auto-classified (and were
therefore added to known_unknown_base_reverse_dns.txt) now match the new
detectors.

These domains are promoted out of KU and into the map under their newly
classified `(name, type)` pairs.

Type distribution of promotions:
  Finance         736   Logistics        179   Real Estate     105   Healthcare    68
  ISP             323   Retail           159   Education       110   Marketing     66
  Manufacturing   207   Technology       142   Consulting       99   Nonprofit     64
  Government      136   Travel           123   Utilities        71   Legal         53
  + smaller volumes across ~25 other industry types

ASN-domain coverage of the bundled IPinfo Lite MMDB after these promotions:
  - by domain count:  32,254 / 63,993  (50.40%, up from 45.45%)
  - by IPv4 weight:   98.45%

Honest scope note: the multilingual classifier achieves "concept parity"
for the top-5 high-volume detectors (Healthcare, Travel, Government,
Retail, Finance) across ~30 languages. Smaller detectors (Photography,
Conglomerate, Sports, Defense, MSSP, IaaS/PaaS/SaaS, etc.) still have
~10-15 languages with 1-3 keywords each. Further per-detector
multilingual parity is a follow-up effort; each subsequent expansion
recovers fewer domains as the long tail of language-specific phrasings
shrinks.

Co-authored-by: Sean Whalen <seanthegeek@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 17:00:25 -04:00

parsedmarc


A screenshot of DMARC summary charts in Kibana

parsedmarc is a Python module and CLI utility for parsing DMARC reports. When used with Elasticsearch and Kibana (or Splunk), it works as a self-hosted open-source alternative to commercial DMARC report processing services such as Agari Brand Protection, Dmarcian, OnDMARC, ProofPoint Email Fraud Defense, and Valimail.

Note

Domain-based Message Authentication, Reporting, and Conformance (DMARC) is an email authentication protocol.
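For orientation, a domain publishes its DMARC policy as a DNS TXT record made of semicolon-separated tag=value pairs. The following is a minimal sketch of reading such a record. It is not part of parsedmarc; the function name parse_dmarc_record and the example record are hypothetical and used only for illustration.

```python
def parse_dmarc_record(txt: str) -> dict[str, str]:
    """Split a DMARC TXT record into a tag -> value mapping (illustrative only)."""
    tags: dict[str, str] = {}
    for part in txt.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        name, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"malformed tag: {part!r}")
        tags[name.strip().lower()] = value.strip()
    # A DMARC record identifies itself with the version tag v=DMARC1.
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record: missing v=DMARC1")
    return tags


# Hypothetical record for example.com
record = "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com; pct=100"
policy = parse_dmarc_record(record)
print(policy["p"])    # policy requested for mail that fails DMARC
print(policy["rua"])  # address that receives aggregate reports
```

The rua address in such a record is where the aggregate reports that parsedmarc parses are delivered.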

Sponsors

This project is maintained by one developer. Please consider sponsoring my work if you or your organization benefit from it.

Features

  • Parses draft and 1.0 standard aggregate/rua DMARC reports
  • Parses forensic/failure/ruf DMARC reports
  • Parses reports from SMTP TLS Reporting
  • Can parse reports from an inbox over IMAP, Microsoft Graph, or Gmail API
  • Transparently handles gzip or zip compressed reports
  • Consistent data structures
  • Simple JSON and/or CSV output
  • Optionally email the results
  • Optionally send the results to Elasticsearch, OpenSearch, and/or Splunk, for use with premade dashboards
  • Optionally send reports to Apache Kafka

Python Compatibility

This project supports the following Python versions, which are either actively maintained or are the default versions for RHEL or Debian.

Version   Supported   Reason
< 3.6     No          End of Life (EOL)
3.6       No          Used in RHEL 8, but not supported by project dependencies
3.7       No          End of Life (EOL)
3.8       No          End of Life (EOL)
3.9       No          Used in Debian 11 and RHEL 9, but not supported by project dependencies
3.10      Yes         Actively maintained
3.11      Yes         Actively maintained; supported until June 2028 (Debian 12)
3.12      Yes         Actively maintained; supported until May 2035 (RHEL 10)
3.13      Yes         Actively maintained; supported until June 2030 (Debian 13)
3.14      Yes         Supported (requires imapclient>=3.1.0)
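A quick way to check an interpreter against the table above is sketched below. The SUPPORTED_VERSIONS set and the is_supported helper are hypothetical names, not part of parsedmarc. The set lists only the versions the table names as supported, so anything newer than 3.14 is reported as not listed rather than guessed at.

```python
import sys

# Versions the compatibility table marks as supported (3.10 through 3.14).
SUPPORTED_VERSIONS = {(3, 10), (3, 11), (3, 12), (3, 13), (3, 14)}


def is_supported(version=None) -> bool:
    """Return True if (major, minor) is listed as supported in the table."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    status = "listed as supported" if is_supported() else "not listed as supported"
    print(f"Python {major}.{minor} is {status}")
```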