parsedmarc/samples/failure/dmarc_ruf_report_linkedin.eml
Copilot ae1e5adb66 Add RFC 9989/9990/9991 (final DMARC) report support; rename forensic→failure project-wide (#659)
* Add DMARCbis report support; rename forensic→failure project-wide

Rebased on top of master @ 2cda5bf (9.9.0), which added the ASN
source attribution work (#712, #713, #714, #715). Individual Copilot
iteration commits squashed into this single commit — the per-commit
history on the feature branch was iterative (add tests, fix lint,
move field, revert, etc.) and not worth preserving; GitHub squash-
merges PRs anyway.

New fields from the DMARCbis XSD, plumbed through types, parsing, CSV
output, and the Elasticsearch / OpenSearch mappings:

- ``np`` — non-existent subdomain policy (``none`` / ``quarantine`` /
  ``reject``)
- ``testing`` — testing mode flag (``n`` / ``y``), replaces RFC 7489
  ``pct``
- ``discovery_method`` — policy discovery method (``psl`` /
  ``treewalk``)
- ``generator`` — report generator software identifier (metadata)
- ``human_result`` — optional descriptive text on DKIM / SPF results

RFC 7489 reports parse with ``None`` for DMARCbis-only fields.

Forensic reports have been renamed to failure reports throughout the
project to match the terminology the specifications have used since
RFC 7489.

- Core: ``types.py``, ``__init__.py`` — ``ForensicReport`` →
  ``FailureReport``, ``parse_forensic_report`` →
  ``parse_failure_report``, report type ``"failure"``.
- Output modules: ``elastic.py``, ``opensearch.py``, ``splunk.py``,
  ``kafkaclient.py``, ``syslog.py``, ``gelf.py``, ``webhook.py``,
  ``loganalytics.py``, ``s3.py``.
- CLI: ``cli.py`` — args, config keys, index names
  (``dmarc_failure``).
- Docs + dashboards: all markdown, Grafana JSON, Kibana NDJSON,
  Splunk XML.

Backward compatibility preserved: old function / type names remain as
aliases (``parse_forensic_report = parse_failure_report``,
``ForensicReport = FailureReport``, etc.), CLI accepts both the old
(``save_forensic``, ``forensic_topic``) and new (``save_failure``,
``failure_topic``) config keys, and updated dashboards query both
old and new index / sourcetype names so data from before and after
the rename appears together.
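
The aliasing and config-key fallback described above can be sketched
roughly as follows. This is an illustration only, not the project's
code: the body of ``parse_failure_report`` is a stand-in and
``get_config_value`` is a hypothetical helper; only the alias and
config key names come from this commit message.

```python
from typing import Any, Optional


def parse_failure_report(raw: str) -> dict[str, Any]:
    """Stand-in for the real parser (hypothetical body)."""
    return {"report_type": "failure", "raw": raw}


# The old name stays importable by pointing it at the new function.
parse_forensic_report = parse_failure_report


def get_config_value(
    section: dict[str, str], new_key: str, old_key: str
) -> Optional[str]:
    """Prefer the new config key and fall back to the legacy one.

    Hypothetical helper; the real CLI may resolve keys differently.
    """
    if new_key in section:
        return section[new_key]
    return section.get(old_key)


legacy_config = {"save_forensic": "True"}
print(get_config_value(legacy_config, "save_failure", "save_forensic"))
```

Because the alias is a plain assignment, both names refer to the same
function object, so callers using the old name get identical behaviour.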

Merge conflicts resolved in ``parsedmarc/constants.py`` (took bis's
10.0.0 bump), ``parsedmarc/__init__.py`` (combined bis's "failure"
wording with master's IPinfo MMDB mention), ``parsedmarc/elastic.py``
and ``parsedmarc/opensearch.py`` (kept master's ``source_asn`` /
``source_asn_name`` / ``source_asn_domain`` on the failure doc path
while renaming ``forensic_report`` → ``failure_report``), and
``CHANGELOG.md`` (10.0.0 entry now sits above the 9.9.0 entry).

All 324 tests pass; ``ruff check`` / ``ruff format --check`` clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Apply post-RFC review fixes: RFC 9990 detection, langAttrString, CFWS-aware RUF parsing

Aligns the implementation with the final RFCs (9989/9990/9991) instead of
inferring DMARCbis support from the version element or the namespace alone.

Aggregate parsing (RFC 9990):
- _text() helper unwraps langAttrString values (extra_contact_info, error,
  comment, human_result, generator). When reporters include the lang
  attribute, xmltodict yields {"#text": ..., "@lang": ...} dicts instead
  of strings; the parser now stores the text payload in either case.
- New xml_namespace field on AggregateReport records the declared XML
  namespace (urn:ietf:params:xml:ns:dmarc-2.0 for RFC 9990 reports).
- RFC 9990 detection accepts namespaceless reports that follow the
  RFC 9990 shape (presence of np / testing / discovery_method / generator),
  so reporters that don't declare the namespace still receive RFC 9990-
  aware validation.
- Warnings: missing DKIM <selector> (REQUIRED in RFC 9990); legacy
  forwarded / sampled_out policy-override types (removed by RFC 9990);
  unknown policy-override types per the RFC 9990 enumeration.
- xml_namespace added to Elasticsearch and OpenSearch document mappings.

Failure parsing (RFC 9991):
- Identity-Alignment and Auth-Failure are split on commas with CFWS
  whitespace stripped per the RFC 9991 ABNF; previously "dkim, spf"
  yielded ["dkim", " spf"] with a leading space on the second token.
- Warnings logged when either REQUIRED field is missing.
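
The comma splitting described above can be sketched like this.
``split_comma_list`` is a hypothetical name, and the sketch only trims
whitespace (including folded line breaks); it does not handle the
parenthesised comments that CFWS also permits.

```python
def split_comma_list(value: str) -> list[str]:
    """Split a comma-separated header value and trim each token.

    Simplification: whitespace only; RFC 5322 comments are not removed.
    """
    return [token.strip() for token in value.split(",") if token.strip()]


print(split_comma_list("dkim, spf"))  # ['dkim', 'spf']
```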

Terminology: every reference to "DMARCbis" in code, tests, sample
filenames, AGENTS.md, and CHANGELOG.md is replaced with the appropriate
RFC number (9989 for the policy spec, 9990 for aggregate reports, 9991
for failure reports). Sample contents are unchanged.

Docs: corrects the prior claim that fo was dropped from RFC 9990 (only
pct was), reframes testing as a new field (not a pct replacement, since
RFC 9989 Appendix A.6 removed pct with no per-message substitute), and
documents the policy_override_reason enum changes (added policy_test_mode;
removed forwarded / sampled_out).

Tests: 8 new tests covering xml_namespace capture, RFC 9990 detection
from field shape, missing-DKIM-selector warning, legacy-override-type
warning, langAttrString unwrapping across all four affected elements,
and CFWS-aware Identity-Alignment / Auth-Failure parsing plus their
missing-field warnings. 276 tests total, all passing; ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Sean Whalen <44679+seanthegeek@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 18:51:08 -04:00

109 lines, 5.0 KiB, plaintext

From dmarc-noreply@linkedin.com Tue Apr 30 02:09:16 2019
Received: from mailf-cd.linkedin.com ([108.174.6.228])
by example.uriports.com with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
(MTA 2.20)
(envelope-from <dmarc-noreply@linkedin.com>)
id 1hLICq-00001z-JU
for dmarc@example.uriports.com; Tue, 30 Apr 2019 02:09:16 +0000
Received: from [127.0.0.1] ([local])
by mail516.prod.linkedin.com (envelope-from <dmarc-noreply@linkedin.com>)
(ecelerity 3.6.21.53563 r(Core:3.6.21.0)) with UNKNOWN
id AA/01-16018-D1AA1CC5; Tue, 30 Apr 2019 02:09:00 +0000
Date: Tue, 30 Apr 2019 02:09:00 +0000
Message-ID: <BB.91.26019.C3EA7CC5@mail516.prod.linkedin.com>
X-LinkedIn-Class: EMAIL_REPORTS
Subject: DMARC Failure report for example.com Mail-From: IP:10.10.10.10
To: dmarc-noreply@linkedin.com
From: dmarc-noreply@linkedin.com
Content-Type: multipart/report; report-type=feedback-report;
boundary="_----abcdefghijklmnopqrstuv===_AA/01-16018-D1AA1CC5"
Received-SPF: pass client-ip=108.174.6.228; envelope-from=dmarc-noreply@linkedin.com; helo=mailf-cd.linkedin.com
--_----abcdefghijklmnopqrstuv===_AA/01-16018-D1AA1CC5
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
This is an email abuse report for an email message received from IP 10.10.10.10 on Tue, 30 Apr 2019 02:09:00 +0000.
The message below did not meet the sending domain's dmarc policy.
The message below could have been accepted or rejected depending on policy.
For more information about this format please see http://tools.ietf.org/html/rfc6591 .
--_----abcdefghijklmnopqrstuv===_AA/01-16018-D1AA1CC5
Content-Type: message/feedback-report
Feedback-Type: auth-failure
User-Agent: Lua/1.0
Version: 1.0
Original-Mail-From:
Original-Rcpt-To: recipient@linkedin.com
Arrival-Date: Tue, 30 Apr 2019 02:09:00 +0000
Message-ID: <01010101010101010101010101010101@ABAB01MS0016.someserver.loc>
Authentication-Results: dmarc=fail (p=none; dis=none) header.from=example.com
Source-IP: 10.10.10.10
Delivery-Result: delivered
Auth-Failure: dmarc
Reported-Domain: example.com
--_----abcdefghijklmnopqrstuv===_AA/01-16018-D1AA1CC5
Content-Type: message/rfc822
Content-Disposition: inline
Return-Path: <>
Authentication-Results: mail516.prod.linkedin.com; iprev=pass policy.iprev="10.10.10.10"; spf=neutral smtp.mailfrom="" smtp.helo="mail02.someserver.com"; dkim=none (message not signed) header.d=none; tls=pass (verified) key.ciphersuite="TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384" key.length="256" tls.v="tlsv1.2" cert.client="OU=Domain Control Validated,CN=*.someserver.com" cert.clientissuer="C=GB,ST=Greater Manchester,L=Salford,O=COMODO CA Limited,CN=COMODO RSA Domain Validation Secure Server CA"; dmarc=fail (p=none; dis=none) header.from=example.com
X-OnPremExternalIP: 10.10.10.10
Received: from [10.10.10.10] ([10.10.10.10:4227] helo=mail02.someserver.com)
by mail516.prod.linkedin.com (envelope-from <>)
(ecelerity 3.6.21.53563 r(Core:3.6.21.0)) with ESMTPS (cipher=ECDHE-RSA-AES256-GCM-SHA384
subject="/OU=Domain Control Validated/CN=*.someserver.com")
id CA/91-26019-ABCDECC5; Tue, 30 Apr 2019 02:09:00 +0000
Received: from DENU02MS0016.someserver.loc (10.156.68.14) by
DENU02MS0017.someserver.loc (10.10.10.9) with Microsoft SMTP Server (TLS) id
15.0.1367.3; Tue, 30 Apr 2019 04:09:09 +0200
Received: from DENU02MS0016.someserver.loc ([127.0.0.1]) by
DENU02MS0016.someserver.loc ([10.10.10.8]) with Microsoft SMTP Server id
15.00.1367.000; Tue, 30 Apr 2019 04:09:09 +0200
From: Sender <sender@example.com>
To: LinkedIn <recipient@linkedin.com>
Subject: Subject line, could be UTF8 encoded
Thread-Topic: Thread Topic line, could be UTF8 encoded
Thread-Index: AQHU/abcdW8+abcdLkClF52hP4alIaZT9XGh
Date: Tue, 30 Apr 2019 02:09:09 +0000
Message-ID: <01010101010101010101010101010101@ABAB01MS0016.someserver.loc>
References: <1111111111.1111111.1111111111111.JavaMail.app@lor1-app3586.prod.linkedin.com>
In-Reply-To: <1111111111.1111111.1111111111111.JavaMail.app@lor1-app3586.prod.linkedin.com>
X-MS-Has-Attach:
X-Auto-Response-Suppress: All
X-MS-Exchange-Inbox-Rules-Loop: sender@example.com
X-MS-TNEF-Correlator:
x-ms-exchange-transport-fromentityheader: Hosted
x-ms-exchange-parent-message-id: <1111111111.1111111.1111111111111.JavaMail.app@lor1-app3586.prod.linkedin.com>
auto-submitted: auto-generated
x-ms-exchange-generated-message-source: Mailbox Rules Agent
x-exclaimer-md-config: 11111111-1111-1111-1111-111111111111
Content-Type: multipart/alternative;
boundary="_000_0d00000000000000000d000000000000f00000s00000someserverloc_"
MIME-Version: 1.0
X-Linkedin-fe: false
--_000_0d00000000000000000d000000000000f00000s00000someserverloc_
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Alternative
Text
--_000_0d00000000000000000d000000000000f00000s00000someserverloc_
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
<html>
<head>
</head>
<body>
HTML Text
</body>
</html>
--_000_0d00000000000000000d000000000000f00000s00000someserverloc_--
--_----abcdefghijklmnopqrstuv===_AA/01-16018-D1AA1CC5--