mirror of
https://github.com/domainaware/parsedmarc.git
synced 2026-05-20 19:05:24 +00:00
* Add Tier 0 to the verification triage: globally-known brand at primary domain

In the previous ASN-domain coverage sweep, the agent ran web searches for entries like `bestbuy.com → Best Buy`, `ups.com → United Parcel Service`, `usps.gov → US Postal Service`, `marriott.com → Marriott`, `henkel.cn → Henkel`, `experian.com → Experian`, `jd.com → JD.com`, `ing.com → ING`, `verisign.com → Verisign`. For each of these the domain ↔ brand pairing is encyclopedic, so the search produced the same outcome a few seconds slower.

The two-corroborating-sources rule (rule 8) was being applied mechanically: "MMDB as_name alone is one source, must fetch a second." But for globally-known brands at their primary domain, the brand identity itself is the second source. Searching for confirmation that Best Buy owns bestbuy.com is the kind of busywork the tier system exists to avoid.

Adds Tier 0 with explicit guardrails: the brand must be globally known (multinational or top-tier-national, decades-old, single canonical entity), the domain must be the entity's primary marketing/corporate domain (not a tracking subdomain or regional ccTLD where ownership is non-obvious), and there must be no recent acquisition or rebrand that puts ownership in question. Cross-references the existing parent-too-generic sub-rule and warns against stretching to mid-size brands the agent happens to recognize. When in doubt: drop to Tier 3 and search.

Also generalizes the section's lead from "redirect-target candidates" to cover MMDB coverage-gap and PSL private-domain candidates, since the tier logic transfers cleanly across all three workflows. Updates the Tier 1 description with an explicit MMDB-coverage-gap analog. Refreshes the held-back-review split stat to 0 / 109 / 2 / 34 / 35 (Tier 0 didn't apply to that batch because every candidate was a redirect target that needed to inherit the *source row's* existing canonical name, not its own brand identity).
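The Tier 0 guardrails described above can be read as a simple all-or-nothing check. The following is a minimal sketch only, not the wording or logic of AGENTS.md: the `Candidate` record, its field names, and `triage_tier` are hypothetical, the three booleans stand in for judgment calls a reviewer makes, and the intermediate tiers are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one domain -> brand pairing under review."""

    domain: str
    brand: str
    # Multinational or top-tier national, decades old, single canonical entity.
    globally_known: bool
    # Primary marketing/corporate domain, not a tracking subdomain or a
    # regional ccTLD where ownership is non-obvious.
    is_primary_domain: bool
    # A recent acquisition or rebrand makes the pairing uncertain.
    ownership_in_question: bool


def triage_tier(candidate: Candidate) -> int:
    """Return 0 only when every Tier 0 guardrail holds, else fall back to 3.

    Tiers 1 and 2 are omitted from this sketch; anything that fails a
    guardrail is treated as "when in doubt, search".
    """
    if (
        candidate.globally_known
        and candidate.is_primary_domain
        and not candidate.ownership_in_question
    ):
        return 0
    return 3


# Example from the commit message: an encyclopedic pairing.
best_buy = Candidate("bestbuy.com", "Best Buy", True, True, False)
# Made-up example: a tracking subdomain fails the primary-domain guardrail.
tracker = Candidate("click.example.com", "Example Corp", True, False, False)

print(triage_tier(best_buy), triage_tier(tracker))
```

Requiring all three conditions at once mirrors the intent that Tier 0 stays narrow: a single failed guardrail sends the entry back to a normal search.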
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ASN-domain coverage sweep #3: 516 new map entries

Third pass against the IPinfo Lite MMDB coverage gap, processing the top ~500 unmapped as_domain entries by IPv4 weight after the prior two sweeps. Verifies each entry against AGENTS.md's tiered triage:

- **Tier 0** (globally-known brand at primary domain, no search needed): Barclays, Liberty Mutual, Zurich Insurance, ABN AMRO, Swedbank, CIBC, Allstate, Julius Baer, MUFG, Travelers, USPS-Bank, ING, Florida Blue, AgriBank, Energy Transfer, FirstEnergy, Scania, Evonik, Merck KGaA, Agfa, Bosch, Iveco, Applied Materials, Micron, Andritz, Whirlpool, Leonardo, QinetiQ, Atlas Elektronik, Draper, Airbus, Jacobs Engineering, Teledyne, Dropbox, Autodesk, Wind River, Stratus, Unisys, ByteDance, Chevron, BBC, CDC, NEC, HPE, Kimberly-Clark, U.S. Bank, NATO, EUROCONTROL, Federal Reserve, NIST, NSF, DARPA, Library of Congress, IMF, FAO, IAEA, ITU, several US state/county/city governments, Australian state/federal departments, European national agencies, United Airlines, Alaska Airlines, Rakuten Mobile, Coles, Woolworths.
- **Tier 1** (MMDB as_name lexically matches candidate domain, no search needed): ~150+ ISPs / hosters / cable TV operators where the as_name itself is the second corroborating source — major national/regional telcos (BTC Botswana, Uganda Telecom, ONE Albania, Tanzania Telecommunications, Kyrgyztelecom, Uzbektelekom, Telecom Algeria, MTN Rwanda, Vodacom Tanzania, Celcom Axiata, Triple T Broadcasting/Jasmine Thailand, MyRepublic Indonesia, Northwestel Canada, JT Jersey, Liberty Networks Colombia, ARLINK Argentina, Cable & Wireless Dominica, SETAR Aruba, AR Telecom Portugal), regional fiber providers (Trooli, Allied Telecom, OEC Fiber, Conexon Connect, Ben Lomand, Great Plains, BrightNet Oklahoma, All West, SDN, Tularosa, Blackfoot, Greeneville Energy, Avanti Broadband, Net at Once, Avanti, Aura Fiber, Stichting Breedband Delft), regional cable TV operators across Japan/Korea/Taiwan (Miyazaki Cable, Toyohashi Cable, Nagasaki Cable, Cable TV Toyama, Kurashiki Cable, Himeji Cable, Keumgang Cable Network), data center operators (eStruxture, PureVoltage, Hyonix, NovoServe, Voxility, Webzilla, Worldstream, Atman Poland, EO Data Center). 
- **Education** (TLD-restricted .edu / .ac.* / .edu.*; the restriction is itself a corroborating source): 200+ universities and research institutions across US, Canada, Europe, Asia, and Australia, including Notre Dame, Washington State, U Texas Rio Grande Valley / Arlington / El Paso / San Antonio / Medical Branch, McMaster, U Ottawa, U Calgary, U Waterloo, Memorial U Newfoundland, U Auckland, U Otago, TU Munich, U Cologne, Goethe Frankfurt, Ruhr-Bochum, U Warwick, Chalmers, Lund, Gothenburg, Luleå, Osaka, Yonsei, Kasetsart, Pusan, Kuwait U, Aristotle Thessaloniki, Ł Tech U, Vienna U Economics, several Cancer Research Centers (MSKCC, Fred Hutchinson, MD Anderson, Cold Spring Harbor), national research institutes (KEK, IAEA, ITRI Taiwan, ETRI, IPM Iran, Smithsonian, ucar, Jefferson Lab, CSHL, mbari, Lam Research, Andritz Hydropower, sri.com, GSI Germany, Max Delbrück, jhuapl).
- **Government** (.gov / .gov.* TLD-restricted, or as_name unambiguously names a government entity): NIST, NSF, NATO, DARPA, ITU, FAO, IAEA, IMF, US Centers for Disease Control, Federal Reserve, Library of Congress, Idaho/Chicago/King County/Pierce County/State of New York, Indianapolis, Tacoma, Fairfax County, Sweden's Vägverket and Forsakringskassan, Hessen GWDG, ANSTO Australia, South Florida Water Management District, Communications Research Centre Canada, Dataport Germany, Cenitex Victoria, EUROCONTROL.

Skipped: Cox Enterprises (multi-product parent, no clean type fit), Tucows already added, sknt.ru already added, etc. Full triage shows 1 duplicate-skip from the apply pass.

Sortlists.py runs cleanly. All 516 type values validate against base_reverse_dns_types.txt. No collisions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Sean Whalen <seanthegeek@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
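For readers who want to reproduce the two checks reported for this sweep (every type value is on the allowed list, and no domain appears twice), here is a minimal sketch. It is not the project's own tooling: `load_allowed_types`, `check_entries`, the `(domain, name, type)` row layout, and the sample data are all assumptions made for illustration, and the allowed-types file is assumed to hold one type per line.

```python
from collections import Counter


def load_allowed_types(lines):
    """Collect non-empty, stripped type names, assuming one type per line."""
    return {line.strip() for line in lines if line.strip()}


def check_entries(entries, allowed_types):
    """Return (invalid, duplicates) for a list of (domain, name, type) rows.

    invalid    -- (domain, type) pairs whose type is not in allowed_types
    duplicates -- sorted domains that occur more than once, case-insensitively
    """
    invalid = [
        (domain, type_)
        for domain, _name, type_ in entries
        if type_ not in allowed_types
    ]
    counts = Counter(domain.lower() for domain, _name, _type in entries)
    duplicates = sorted(domain for domain, n in counts.items() if n > 1)
    return invalid, duplicates


# Illustrative data only; the real type list and map rows are not shown here.
allowed = load_allowed_types(["Education\n", "Government\n", "ISP\n", "\n"])
rows = [
    ("example.edu", "Example University", "Education"),
    ("example.gov", "Example Agency", "Government"),
    ("Example.edu", "Example University", "Education"),  # duplicate domain
    ("example.net", "Example Net", "Telco"),  # type not on the allowed list
]

invalid, duplicates = check_entries(rows, allowed)
print(invalid, duplicates)
```

In practice the lines would come from opening the types file and the rows from the map file; both are passed in as plain Python data here so the sketch stays self-contained.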