* Split tests.py into per-module tests/test_<module>.py
The 5174-line tests.py monolith is split into per-module files under
tests/, mirroring the checkdmarc layout:
tests/test_init.py parsedmarc/__init__.py parsing surface
tests/test_cli.py parsedmarc/cli.py + config / env-vars / SIGHUP
tests/test_utils.py parsedmarc/utils.py (DNS, IP info, PSL, etc.)
tests/test_webhook.py parsedmarc/webhook.py
tests/test_kafkaclient.py parsedmarc/kafkaclient.py
tests/test_splunk.py parsedmarc/splunk.py
tests/test_syslog.py parsedmarc/syslog.py
tests/test_loganalytics.py parsedmarc/loganalytics.py
tests/test_gelf.py parsedmarc/gelf.py
tests/test_s3.py parsedmarc/s3.py
tests/test_maps.py parsedmarc/resources/maps/ maintainer scripts
The split is purely a redistribution — no test bodies changed, no tests
added or removed. All 276 existing tests pass under the new layout.
The current tests.py contains two kitchen-sink classes (`Test` at line 54
and `TestEnvVarConfig` at line 2360) holding tests that span many
modules. Their methods are routed to the correct per-module file by name
prefix; the wholly-thematic classes (TestExtractReport, TestUtilsXxx,
TestSighupReload, etc.) move whole. Each target file gets its own
`class Test(unittest.TestCase)` for the redistributed kitchen-sink
methods, plus the thematic classes verbatim.
Wiring updates:
- `.github/workflows/python-tests.yml`: `pytest ... tests.py` →
`python -m pytest ... tests/` (also switches to `python -m pytest` per
the checkdmarc convention so cwd lands on the project root).
- `pyproject.toml`: adds `[tool.pytest.ini_options] testpaths = ["tests"]`
and `[tool.coverage.run] source = ["parsedmarc"]` with an `omit` for
`parsedmarc/resources/maps/*.py`. The maps scripts are maintainer-only
batch tooling that ships out of the wheel; excluding them from
coverage makes the headline number reflect only installed library
code. Runtime coverage on the new layout is 59% (was 45% with maps
counted), and PR-B will push it to 90%+.
- `AGENTS.md`: documents the new layout and how to run individual files
/ tests; tells future contributors not to reintroduce a monolithic
tests.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Restore 66.9% coverage baseline (count tests/ + parsedmarc)
Master's headline 66.9% number on Codecov includes the tests.py file
itself (99.35% covered) being measured alongside parsedmarc/*. The
original tests.py had no `[tool.coverage.run]` block, so coverage's
default — "measure every file imported during the run" — counted the
test code as if it were product code.
The split commit added `source = ["parsedmarc"]` which suppressed
measurement of the test files (correct in principle, since test files
aren't shipped code), and that alone made the headline number drop by
~8 percentage points without any actual loss of testing. This commit
swaps `source` for an explicit `include = ["parsedmarc/*", "tests/*"]`
so both halves are measured the way they were on master. Verified:
276 tests, 66.96% line coverage (effectively unchanged from master's
66.90%).
If you want the shipped-code-only number (was the headline that this
commit overrides), run `pytest --cov=parsedmarc tests/`. That number
is currently 59% and is the focus of the upcoming coverage-expansion PR.
Also adds junit.xml to .gitignore so the CI artefact doesn't get
accidentally committed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Restrict coverage to shipped code (`source = ["parsedmarc"]`)
Reverts the prior commit's `include = ["tests/*"]`. Counting the test
files toward coverage was wrong — it conflates "shipped code exercised
by tests" with "test code that pytest auto-runs", inflates the headline
number, and rewards writing more tests rather than tests that verify
more code. Master's apparent 66.9% was an artefact of the old
monolithic tests.py having no [tool.coverage.run] block at all; coverage's
default behaviour measured every imported file, including the test file
itself at ~99% "covered", which added ~8 percentage points to the
displayed number without any real testing signal.
Restricting to `source = ["parsedmarc"]` plus the existing maps omit
gives a meaningful baseline: 59% of shipped code is exercised by the
test suite today. That's the number the next PR is targeting to lift
to 90%+ before the 10.0.0 release; the Codecov "drop" here is a
measurement correction, not a regression.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add DMARCbis report support; rename forensic→failure project-wide
Rebased on top of master @ 2cda5bf (9.9.0), which added the ASN
source attribution work (#712, #713, #714, #715). Individual Copilot
iteration commits squashed into this single commit — the per-commit
history on the feature branch was iterative (add tests, fix lint,
move field, revert, etc.) and not worth preserving; GitHub squash-
merges PRs anyway.
New fields from the DMARCbis XSD, plumbed through types, parsing, CSV
output, and the Elasticsearch / OpenSearch mappings:
- ``np`` — non-existent subdomain policy (``none`` / ``quarantine`` /
``reject``)
- ``testing`` — testing mode flag (``n`` / ``y``), replaces RFC 7489
``pct``
- ``discovery_method`` — policy discovery method (``psl`` /
``treewalk``)
- ``generator`` — report generator software identifier (metadata)
- ``human_result`` — optional descriptive text on DKIM / SPF results
RFC 7489 reports parse with ``None`` for DMARCbis-only fields.
Forensic reports have been renamed to failure reports throughout the
project to reflect the proper naming since RFC 7489.
- Core: ``types.py``, ``__init__.py`` — ``ForensicReport`` →
``FailureReport``, ``parse_forensic_report`` →
``parse_failure_report``, report type ``"failure"``.
- Output modules: ``elastic.py``, ``opensearch.py``, ``splunk.py``,
``kafkaclient.py``, ``syslog.py``, ``gelf.py``, ``webhook.py``,
``loganalytics.py``, ``s3.py``.
- CLI: ``cli.py`` — args, config keys, index names
(``dmarc_failure``).
- Docs + dashboards: all markdown, Grafana JSON, Kibana NDJSON,
Splunk XML.
Backward compatibility preserved: old function / type names remain as
aliases (``parse_forensic_report = parse_failure_report``,
``ForensicReport = FailureReport``, etc.), CLI accepts both the old
(``save_forensic``, ``forensic_topic``) and new (``save_failure``,
``failure_topic``) config keys, and updated dashboards query both
old and new index / sourcetype names so data from before and after
the rename appears together.
Merge conflicts resolved in ``parsedmarc/constants.py`` (took bis's
10.0.0 bump), ``parsedmarc/__init__.py`` (combined bis's "failure"
wording with master's IPinfo MMDB mention), ``parsedmarc/elastic.py``
and ``parsedmarc/opensearch.py`` (kept master's ``source_asn`` /
``source_asn_name`` / ``source_asn_domain`` on the failure doc path
while renaming ``forensic_report`` → ``failure_report``), and
``CHANGELOG.md`` (10.0.0 entry now sits above the 9.9.0 entry).
All 324 tests pass; ``ruff check`` / ``ruff format --check`` clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Apply post-RFC review fixes: RFC 9990 detection, langAttrString, CFWS-aware RUF parsing
Aligns the implementation with the final RFCs (9989/9990/9991) instead of
inferring DMARCbis support from the version element or the namespace alone.
Aggregate parsing (RFC 9990):
- _text() helper unwraps langAttrString values (extra_contact_info, error,
comment, human_result, generator) — when reporters include the lang
attribute, xmltodict yields {"#text": ..., "@lang": ...} dicts instead
of strings; the parser now stores the text payload in both shapes.
- New xml_namespace field on AggregateReport records the declared XML
namespace (urn:ietf:params:xml:ns:dmarc-2.0 for RFC 9990 reports).
- RFC 9990 detection accepts namespaceless reports that follow the
RFC 9990 shape (presence of np / testing / discovery_method / generator),
so reporters that don't declare the namespace still receive RFC 9990-
aware validation.
- Warnings: missing DKIM <selector> (REQUIRED in RFC 9990); legacy
forwarded / sampled_out policy-override types (removed by RFC 9990);
unknown policy-override types per the RFC 9990 enumeration.
- xml_namespace added to Elasticsearch and OpenSearch document mappings.
Failure parsing (RFC 9991):
- Identity-Alignment and Auth-Failure are split on commas with CFWS
whitespace stripped per the RFC 9991 ABNF; previously "dkim, spf"
yielded ["dkim", " spf"] with a leading space on the second token.
- Warnings logged when either REQUIRED field is missing.
Terminology: every reference to "DMARCbis" in code, tests, sample
filenames, AGENTS.md, and CHANGELOG.md is replaced with the appropriate
RFC number (9989 for the policy spec, 9990 for aggregate reports, 9991
for failure reports). Sample contents are unchanged.
Docs: corrects the prior claim that fo was dropped from RFC 9990 (only
pct was), reframes testing as a new field (not a pct replacement, since
RFC 9989 Appendix A.6 removed pct with no per-message substitute), and
documents the policy_override_reason enum changes (added policy_test_mode;
removed forwarded / sampled_out).
Tests: 8 new tests covering xml_namespace capture, RFC 9990 detection
from field shape, missing-DKIM-selector warning, legacy-override-type
warning, langAttrString unwrapping across all four affected elements,
and CFWS-aware Identity-Alignment / Auth-Failure parsing plus their
missing-field warnings. 276 tests total, all passing; ruff clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Sean Whalen <44679+seanthegeek@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fixing ES/OS forensic report lookup and storage, extracting ES to separate CI service
* bumping CI ES version to current latest
* reshuffling CI job attributes
* removing EOL Python 3.8 from the CI pipeline