Classify reverse DNS map: next 5000 unmapped MMDB ASN domains (#757)

Continuing the long-tail sweep. Auto-classification rate dropped to 36.5%
this round (1,826/5,000) from ~43% in prior batches. The further into the
tail we go, the more candidates have parked, Cloudflare-challenged, or
sparse homepages where the regex can't match.

- 1,826 added to base_reverse_dns_map.csv (ISP 1,187, Web Host 267,
  Education 112, MSP 80, Finance 65, Government 62, Healthcare 53).
- 3,174 added to known_unknown_base_reverse_dns.txt.
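
The counts above can be cross-checked with a short sketch. The per-category
numbers are copied from this message; the variable names are illustrative
and are not taken from the classifier code.

```python
# Per-category counts copied from the commit message (not read from the CSV).
added_by_category = {
    "ISP": 1187,
    "Web Host": 267,
    "Education": 112,
    "MSP": 80,
    "Finance": 65,
    "Government": 62,
    "Healthcare": 53,
}

batch_size = 5000  # domains examined in this batch

# Domains that were auto-classified and went into base_reverse_dns_map.csv.
classified = sum(added_by_category.values())

# The remainder went into known_unknown_base_reverse_dns.txt.
unknown = batch_size - classified

# Auto-classification rate for the batch, as a percentage.
rate_pct = classified / batch_size * 100

print(f"classified={classified} unknown={unknown} rate={rate_pct:.1f}%")
```

The category totals sum to the 1,826 classified entries, and the remainder
matches the 3,174 known-unknown entries.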

ASN-domain coverage of the bundled IPinfo Lite MMDB after this batch:
  - by domain count:  16,652 / 63,993  (26.02%, up from 23.17%)
  - by IPv4 weight:   98.09%           (up from 98.01%)
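
The by-domain-count figures can be reproduced with a short sketch, assuming
that only the 1,826 newly mapped domains (not the known-unknown entries)
count toward coverage. That assumption is inferred from the numbers in this
message, not from the coverage script itself. The IPv4-weighted figure
cannot be rederived from the counts given here.

```python
# Figures copied from the commit message.
total_asn_domains = 63993   # ASN domains in the bundled MMDB
mapped_after = 16652        # mapped domains after this batch
newly_mapped = 1826         # entries added to the map in this batch

# Assumption: known-unknown entries do not count as coverage.
mapped_before = mapped_after - newly_mapped

coverage_before_pct = mapped_before / total_asn_domains * 100
coverage_after_pct = mapped_after / total_asn_domains * 100

print(f"before: {coverage_before_pct:.2f}%  after: {coverage_after_pct:.2f}%")
```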

Same classifier as batches 5-7 (no new code).

Co-authored-by: Sean Whalen <seanthegeek@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sean Whalen
2026-05-07 13:21:52 -04:00
committed by GitHub
parent 4446702b84
commit c523d0da9c
2 changed files with 5000 additions and 0 deletions
File diff suppressed because it is too large