Sean Whalen 8cc017fe84 ASN-domain coverage sweep #3: 516 new map entries (#735)
* Add Tier 0 to the verification triage: globally-known brand at primary domain

In the previous ASN-domain coverage sweep, the agent ran web searches
for entries like `bestbuy.com → Best Buy`, `ups.com → United Parcel
Service`, `usps.gov → US Postal Service`, `marriott.com → Marriott`,
`henkel.cn → Henkel`, `experian.com → Experian`, `jd.com → JD.com`,
`ing.com → ING`, `verisign.com → Verisign`. For each of these the
domain ↔ brand pairing is encyclopedic, so the search reached the same
outcome a few seconds slower.

The two-corroborating-sources rule (rule 8) was being applied
mechanically: "MMDB as_name alone is one source, must fetch a second."
But for globally-known brands at their primary domain, the brand
identity itself is the second source. Searching for confirmation that
Best Buy owns bestbuy.com is the kind of busywork the tier system
exists to avoid.

Adds Tier 0 with explicit guardrails — must be globally known
(multinational or top-tier-national, decades-old, single canonical
entity), must be the entity's primary marketing/corporate domain
(not a tracking subdomain or regional ccTLD where ownership is
non-obvious), and no recent acquisition/rebrand status in question.
Cross-references the existing parent-too-generic sub-rule and
warns against stretching to mid-size brands the agent happens to
recognize. When in doubt: drop to Tier 3 and search.

Also generalizes the section's lead from "redirect-target candidates"
to cover MMDB coverage-gap and PSL private-domain candidates — the
tier logic transfers cleanly across all three workflows. Updates the
Tier 1 description with an explicit MMDB-coverage-gap analog.

Refreshes the held-back-review split stat to 0 / 109 / 2 / 34 / 35
(Tier 0 didn't apply to that batch because every candidate was a
redirect target that needed to inherit the *source row's* existing
canonical name, not its own brand identity).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ASN-domain coverage sweep #3: 516 new map entries

Third pass against the IPinfo Lite MMDB coverage gap, processing the
top ~500 unmapped as_domain entries by IPv4 weight after the prior two
sweeps. Verifies each entry against AGENTS.md's tiered triage:

- **Tier 0** (globally-known brand at primary domain, no search
  needed): Barclays, Liberty Mutual, Zurich Insurance, ABN AMRO,
  Swedbank, CIBC, Allstate, Julius Baer, MUFG, Travelers, USPS-Bank,
  ING, Florida Blue, AgriBank, Energy Transfer, FirstEnergy, Scania,
  Evonik, Merck KGaA, Agfa, Bosch, Iveco, Applied Materials, Micron,
  Andritz, Whirlpool, Leonardo, QinetiQ, Atlas Elektronik, Draper,
  Airbus, Jacobs Engineering, Teledyne, Dropbox, Autodesk, Wind River,
  Stratus, Unisys, ByteDance, Chevron, BBC, CDC, NEC, HPE,
  Kimberly-Clark, U.S. Bank, NATO, EUROCONTROL, Federal Reserve, NIST,
  NSF, DARPA, Library of Congress, IMF, FAO, IAEA, ITU, several US
  state/county/city governments, Australian state/federal departments,
  European national agencies, United Airlines, Alaska Airlines,
  Rakuten Mobile, Coles, Woolworths.

- **Tier 1** (MMDB as_name lexically matches candidate domain, no
  search needed): ~150+ ISPs / hosters / cable TV operators where
  the as_name itself is the second corroborating source — major
  national/regional telcos (BTC Botswana, Uganda Telecom, ONE Albania,
  Tanzania Telecommunications, Kyrgyztelecom, Uzbektelekom, Telecom
  Algeria, MTN Rwanda, Vodacom Tanzania, Celcom Axiata, Triple T
  Broadcasting/Jasmine Thailand, MyRepublic Indonesia, Northwestel
  Canada, JT Jersey, Liberty Networks Colombia, ARLINK Argentina,
  Cable & Wireless Dominica, SETAR Aruba, AR Telecom Portugal),
  regional fiber providers (Trooli, Allied Telecom, OEC Fiber,
  Conexon Connect, Ben Lomand, Great Plains, BrightNet Oklahoma,
  All West, SDN, Tularosa, Blackfoot, Greeneville Energy, Avanti
  Broadband, Net at Once, Avanti, Aura Fiber, Stichting Breedband
  Delft), regional cable TV operators across Japan/Korea/Taiwan
  (Miyazaki Cable, Toyohashi Cable, Nagasaki Cable, Cable TV Toyama,
  Kurashiki Cable, Himeji Cable, Keumgang Cable Network), data center
  operators (eStruxture, PureVoltage, Hyonix, NovoServe, Voxility,
  Webzilla, Worldstream, Atman Poland, EO Data Center).

- **Education** (TLD-restricted .edu / .ac.* / .edu.* — restriction is
  itself a corroborating source): 200+ universities and research
  institutions across US, Canada, Europe, Asia, and Australia,
  including Notre Dame, Washington State, U Texas Rio Grande Valley /
  Arlington / El Paso / San Antonio / Medical Branch, McMaster, U
  Ottawa, U Calgary, U Waterloo, Memorial U Newfoundland, U Auckland,
  U Otago, TU Munich, U Cologne, Goethe Frankfurt, Ruhr-Bochum, U
  Warwick, Chalmers, Lund, Gothenburg, Luleå, Osaka, Yonsei, Kasetsart,
  Pusan, Kuwait U, Aristotle Thessaloniki, Ł Tech U, Vienna U Economics,
  several Cancer Research Centers (MSKCC, Fred Hutchinson, MD Anderson,
  Cold Spring Harbor), national research institutes (KEK, IAEA, ITRI
  Taiwan, ETRI, IPM Iran, Smithsonian, ucar, Jefferson Lab,
  CSHL, mbari, Lam Research, Andritz Hydropower, sri.com, GSI Germany,
  Max Delbrück, jhuapl).

- **Government** (.gov / .gov.* TLD-restricted, or as_name unambiguously
  names a government entity): NIST, NSF, NATO, DARPA, ITU, FAO, IAEA,
  IMF, US Centers for Disease Control, Federal Reserve, Library of
  Congress, Idaho/Chicago/King County/Pierce County/State of New York,
  Indianapolis, Tacoma, Fairfax County, Sweden's Vägverket and
  Forsakringskassan, Hessen GWDG, ANSTO Australia, South Florida
  Water Management District, Communications Research Centre Canada,
  Dataport Germany, Cenitex Victoria, EUROCONTROL.

Skipped: Cox Enterprises (multi-product parent, no clean type fit),
Tucows already added, sknt.ru already added, etc. Full triage shows
1 duplicate-skip from the apply pass.

Sortlists.py runs cleanly. All 516 type values validate against
base_reverse_dns_types.txt. No collisions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Sean Whalen <seanthegeek@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 21:01:47 -04:00

parsedmarc


A screenshot of DMARC summary charts in Kibana

parsedmarc is a Python module and CLI utility for parsing DMARC reports. When used with Elasticsearch and Kibana (or Splunk), it works as a self-hosted open-source alternative to commercial DMARC report processing services such as Agari Brand Protection, Dmarcian, OnDMARC, Proofpoint Email Fraud Defense, and Valimail.

Note

Domain-based Message Authentication, Reporting, and Conformance (DMARC) is an email authentication protocol.

Sponsors

This project is maintained by one developer. Please consider sponsoring my work if you or your organization benefit from it.

Features

  • Parses draft and 1.0 standard aggregate/rua DMARC reports
  • Parses forensic/failure/ruf DMARC reports
  • Parses reports from SMTP TLS Reporting
  • Can parse reports from an inbox over IMAP, Microsoft Graph, or Gmail API
  • Transparently handles gzip or zip compressed reports
  • Consistent data structures
  • Simple JSON and/or CSV output
  • Optionally email the results
  • Optionally send the results to Elasticsearch, OpenSearch, and/or Splunk, for use with premade dashboards
  • Optionally send reports to Apache Kafka
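
As an illustration of what handling gzip or zip compressed reports can involve, here is a minimal standard-library sketch. It is not parsedmarc's own implementation: the function name `extract_report_bytes` and the magic-byte checks are assumptions made for this example, and it assumes a zip archive holds a single report.

```python
import gzip
import io
import zipfile

GZIP_MAGIC = b"\x1f\x8b"      # first two bytes of a gzip stream
ZIP_MAGIC = b"PK\x03\x04"     # local file header of a non-empty zip archive


def extract_report_bytes(payload: bytes) -> bytes:
    """Return raw report bytes, decompressing gzip or zip payloads.

    Hypothetical helper for illustration only; not part of parsedmarc.
    """
    if payload.startswith(GZIP_MAGIC):
        return gzip.decompress(payload)
    if payload.startswith(ZIP_MAGIC):
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            # Assumption: the archive contains one report; read the first member.
            first_member = archive.namelist()[0]
            return archive.read(first_member)
    # Anything else is treated as an already uncompressed report.
    return payload
```

A caller could pass the bytes of an email attachment to this helper and hand the result to an XML or JSON parser, without needing to know in advance how the sender compressed the report.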

Python Compatibility

This project supports the following Python versions, which are either actively maintained or are the default versions for RHEL or Debian.

| Version | Supported | Reason |
|---------|-----------|--------|
| < 3.6 | No | End of Life (EOL) |
| 3.6 | No | Used in RHEL 8, but not supported by project dependencies |
| 3.7 | No | End of Life (EOL) |
| 3.8 | No | End of Life (EOL) |
| 3.9 | No | Used in Debian 11 and RHEL 9, but not supported by project dependencies |
| 3.10 | Yes | Actively maintained |
| 3.11 | Yes | Actively maintained; supported until June 2028 (Debian 12) |
| 3.12 | Yes | Actively maintained; supported until May 2035 (RHEL 10) |
| 3.13 | Yes | Actively maintained; supported until June 2030 (Debian 13) |
| 3.14 | Yes | Supported (requires imapclient>=3.1.0) |
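
A script that depends on this project could guard against unsupported interpreters before importing it. The sketch below assumes, based on the compatibility list above, that Python 3.10 is the oldest supported version; `MINIMUM_SUPPORTED` and `is_supported` are hypothetical names for this example, not part of parsedmarc.

```python
import sys

# Assumption drawn from the compatibility list: 3.10 is the oldest supported version.
MINIMUM_SUPPORTED = (3, 10)


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version meets the assumed minimum."""
    return tuple(version_info[:2]) >= MINIMUM_SUPPORTED


if not is_supported():
    print("This Python version is older than 3.10; upgrade before installing.")
```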
Readme Apache-2.0 160 MiB