Compare commits

...

1343 Commits

Author SHA1 Message Date
copilot-swe-agent[bot]
2174f23eb5 Add comprehensive TypedDicts to minimize Any usage in public APIs
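For illustration, a TypedDict lets dict-shaped return values carry precise key types instead of `Any`; the field names below are hypothetical, not parsedmarc's actual definitions:

```python
from typing import TypedDict


class AggregateReportMetadata(TypedDict):
    # Hypothetical fields for illustration only.
    org_name: str
    report_id: str
    begin_date: str
    end_date: str


def summarize(report: AggregateReportMetadata) -> str:
    # A type checker can now verify these key accesses instead of
    # treating the report as dict[str, Any].
    return f"{report['org_name']} ({report['report_id']})"
```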
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 22:28:19 +00:00
copilot-swe-agent[bot]
febbb107c4 Fix Python 3.9 compatibility: replace pipe union syntax with Union/Optional
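For context, the `X | Y` annotation syntax only works on Python 3.10+, so Python 3.9 compatibility means falling back to `typing.Union`/`typing.Optional`. An illustrative before/after, not the exact signatures touched in this commit:

```python
from typing import Optional, Union

# Python 3.10+ only (fails at runtime on 3.9 when the annotation is evaluated):
#   def parse(data: bytes | str, timeout: int | None = None) -> dict | None: ...

# Python 3.9-compatible equivalent:
def parse(data: Union[bytes, str], timeout: Optional[int] = None) -> Optional[dict]:
    ...
```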
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 22:18:57 +00:00
copilot-swe-agent[bot]
9a64b494e7 Fix code review issues: incomplete isinstance and variable name mismatch
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 21:45:21 +00:00
copilot-swe-agent[bot]
e93209c766 Fix function signatures and improve type annotations
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 21:42:25 +00:00
copilot-swe-agent[bot]
d1c22466be Replace OrderedDict with dict and add TypedDict definitions
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 21:36:57 +00:00
copilot-swe-agent[bot]
3d1b2522d3 Initial plan 2025-12-17 21:19:30 +00:00
Sean Whalen
af9ad568ec Specify Python version requirements in pyproject.toml 2025-12-17 16:18:24 -05:00
Sean Whalen
748164d177 Fix #638 2025-12-17 16:09:26 -05:00
Sean Whalen
487e5e1149 Format on build 2025-12-12 15:56:52 -05:00
Sean Whalen
73010cf964 Use ruff for code formatting 2025-12-12 15:44:46 -05:00
Sean Whalen
a4a5475aa8 Fix another typo before releasing 9.0.5 2025-12-08 15:29:48 -05:00
Sean Whalen
dab78880df Actual 9.0.5 release
Fix typo
2025-12-08 15:26:58 -05:00
Sean Whalen
fb54e3b742 9.0.5
- Fix report type detection bug introduced in `9.0.4` (yanked).
2025-12-08 15:22:02 -05:00
Sean Whalen
6799f10364 9.0.4
Fixes

- Fix saving reports to OpenSearch ([#637](https://github.com/domainaware/parsedmarc/issues/637))
- Fix parsing certain DMARC failure/forensic reports
- Some fixes to type hints (incomplete, but published as-is due to the above bugs)
2025-12-08 13:26:59 -05:00
Sean Whalen
445c9565a4 Update bug link in docs 2025-12-06 15:05:19 -05:00
Sean Whalen
4b786846ae Remove Python 3.14 from testing
Until cpython bug https://github.com/python/cpython/issues/142307 is fixed
2025-12-05 11:05:29 -05:00
Sean Whalen
23ae563cd8 Update Python version support details in documentation 2025-12-05 10:48:04 -05:00
Sean Whalen
cdd000e675 9.0.3
- Set `requires-python` to `>=3.9, <3.14` to avoid [this bug](https://github.com/python/cpython/issues/142307)
2025-12-05 10:43:28 -05:00
Sean Whalen
7d58abc67b Add shebang and encoding declaration to tests.py 2025-12-04 10:21:53 -05:00
Sean Whalen
a18ae439de Fix typo in RHEL version support description in documentation 2025-12-04 10:18:15 -05:00
Sean Whalen
d7061330a8 Use None for blank fields in the Top 1000 Message Sources by Name DMARC Summary dashboard widget 2025-12-03 09:22:33 -05:00
Sean Whalen
9d5654b8ec Fix bugs with the Top 1000 Message Sources by Name DMARC Summary dashboard widget 2025-12-03 09:14:52 -05:00
Sean Whalen
a0e0070dd0 Bump version to 9.0.2 2025-12-02 20:12:58 -05:00
Sean Whalen
cf3b7f2c29 ## 9.0.2
## Improvements

- Type hinting is now used properly across the entire library. (#445)

## Fixes

- Decompress report files as needed when passed via the CLI.
- Fixed the incomplete removal (in `8.15.0`) of the ability for `parsedmarc.utils.extract_report` to accept a file path directly.

## Breaking changes

This version of the library requires consumers to pass certain arguments as keyword-only. Internally, the API uses a bare `*` in the function signature. This is standard behavior per [PEP 3102](https://peps.python.org/pep-3102/) and is documented in the Python Language Reference.
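For reference, a bare `*` in a signature makes everything after it keyword-only ([PEP 3102](https://peps.python.org/pep-3102/)); a positional call then raises `TypeError`. The parameter names below are illustrative only:

```python
def parse_report(source, *, offline=False, nameservers=None):
    # "offline" and "nameservers" must now be passed by keyword.
    ...


parse_report("report.xml", offline=True)   # OK
# parse_report("report.xml", True)         # TypeError: takes 1 positional argument but 2 were given
```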
2025-12-02 19:41:14 -05:00
Sean Whalen
d312522ab7 Enhance type hints and argument formatting in multiple files for improved clarity and consistency 2025-12-02 17:06:57 -05:00
Sean Whalen
888d717476 Enhance type hints and argument formatting in utils.py for improved clarity and consistency 2025-12-02 16:21:30 -05:00
Sean Whalen
1127f65fbb Enhance type hints and argument formatting in webhook.py for improved clarity and consistency 2025-12-02 15:52:31 -05:00
Sean Whalen
d017dfcddf Enhance type hints and argument formatting across multiple files for improved clarity and consistency 2025-12-02 15:17:37 -05:00
Sean Whalen
5fae99aacc Enhance type hints for improved clarity and consistency in __init__.py, elastic.py, and opensearch.py 2025-12-02 14:14:06 -05:00
Sean Whalen
ba57368ac3 Refactor argument formatting and type hints in elastic.py for consistency 2025-12-02 13:13:25 -05:00
Sean Whalen
dc6ee5de98 Add type hints to methods in opensearch.py for improved clarity and type checking 2025-12-02 13:11:59 -05:00
Sean Whalen
158d63d205 Complete annotations on elastic.py 2025-12-02 12:59:03 -05:00
Oscar Mattsson
f1933b906c Fix 404 link to maxmind docs (#635) 2025-12-02 09:26:01 -05:00
Anael Mobilia
4b98d795ff Define minimal Python version on pyproject (#634) 2025-12-01 20:22:49 -05:00
Sean Whalen
b1356f7dfc 9.0.1
- Allow multiple `records` for the same aggregate DMARC report in Elasticsearch and Opensearch (fixes issue in 9.0.0)
- Fix typos
2025-12-01 18:57:23 -05:00
Sean Whalen
1969196e1a Switch CHANGELOG headers 2025-12-01 18:01:54 -05:00
Sean Whalen
553f15f6a9 Code formatting 2025-12-01 17:24:10 -05:00
Sean Whalen
1fc9f638e2 9.0.0 (#629)
* Normalize report volumes when a report timespan exceeds 24 hours
2025-12-01 17:06:58 -05:00
Sean Whalen
48bff504b4 Fix build script to properly publish docs 2025-12-01 11:08:21 -05:00
Sean Whalen
681b7cbf85 Formatting 2025-12-01 10:56:08 -05:00
Sean Whalen
0922d6e83a Add supported Python versions to the documentation index 2025-12-01 10:24:19 -05:00
Sean Whalen
baf3f95fb1 Update README with clarification on Python 3.6 support 2025-12-01 10:20:56 -05:00
Anael Mobilia
a51f945305 Clearly define supported Python versions policy (#633)
* Clearly define supported Python versions.

Support policy based on author's comment on https://github.com/domainaware/parsedmarc/pull/458#issuecomment-2002516299 #458

* Compile Python 3.6, since ubuntu-latest now runs Ubuntu 24.04 (which does not have Python 3.6) and Ubuntu 20.04 runners are no longer available
https://raw.githubusercontent.com/actions/python-versions/main/versions-manifest.json

* Use latest versions of GH Actions

* Silence some technical GH Actions steps

* Elasticsearch / OpenSearch: use supported versions + align used versions

* Delete .github/workflows/python-tests-3.6.yml

Drop Python 3.6 test

* Update Python 3.6 support status in README

---------

Co-authored-by: Sean Whalen <44679+seanthegeek@users.noreply.github.com>
2025-12-01 10:02:47 -05:00
Sean Whalen
55dbf8e3db Add sources by name table to the Kibana DMARC Summary dashboard
This matches the table in the Splunk DMARC Aggregate reports dashboard
2025-11-30 19:44:14 -05:00
Anael Mobilia
00267c9847 Codestyle cleanup (#631)
* Fix typos

* Copyright - Update date

* Codestyle xxx is False -> not xxx

* Ensure "_find_label_id_for_label" always return str

* PEP-8 : apiKey -> api_key + backward compatibility for config files

* Duplicate variable initialization

* Fix format
2025-11-30 19:13:57 -05:00
Anael Mobilia
51356175e1 Get option on the type described on documentation (#632) 2025-11-30 19:00:04 -05:00
Anael Mobilia
3be10d30dd Fix warnings in docker-compose.yml (#630)
* Fix level=warning msg="...\parsedmarc\docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion"

* Fix "Unquoted port mapping not recommended"
2025-11-30 18:59:01 -05:00
Sean Whalen
98342ecac6 8.19.1 (#627)
- Ignore HTML content type in report email parsing (#626)
2025-11-29 11:37:31 -05:00
Sean Whalen
38a3d4eaae Code formatting 2025-11-28 12:48:55 -05:00
Sean Whalen
a05c230152 8.19.0 (#622)
8.19.0

- Add multi-tenant support via an index-prefix domain mapping file
- PSL overrides so that services like AWS are correctly identified
- Additional improvements to report type detection
- Fix webhook timeout parsing (PR #623)
- Output to STDOUT when the new general config boolean `silent` is set to `False` (Close #614)
- Additional services added to `base_reverse_dns_map.csv`

---------

Co-authored-by: Sean Whalen <seanthegeek@users.noreply.github.com>
Co-authored-by: Félix <felix.debloisbeaucage@gmail.com>
2025-11-28 12:47:00 -05:00
Sean Whalen
17bdc3a134 More tests cleanup 2025-11-21 09:10:59 -05:00
Sean Whalen
858be00f22 Fix badge links and update image source branch 2025-11-21 09:03:04 -05:00
Sean Whalen
597ca64f9f Clean up tests 2025-11-21 00:09:28 -05:00
Sean Whalen
c5dbe2c4dc 8.18.9
- Complete fix for #687 and more robust report type detection
2025-11-20 23:50:42 -05:00
Sean Whalen
082b3d355f 8.18.8
- Fix parsing emails with an uncompressed aggregate report attachment (Closes #607)
- Add `--no-prettify-json` CLI option (PR #617)
2025-11-20 20:47:57 -05:00
Sean Whalen
2a7ce47bb1 Update code coverage badge link to main branch 2025-11-20 20:28:10 -05:00
daminoux
9882405d96 Update README.md fix url screenshot (#620)
The URL of the screenshot is broken.
2025-11-20 20:27:15 -05:00
Andrew
fce84763b9 add --no-prettify-json CLI option (#617)
* updates process_reports to respect newly added prettify_json option

* removes duplicate definition

* removes redundant option

* fixes typo
2025-11-02 15:54:59 -05:00
Rowan
8a299b8600 Updated default python docker base image to 3.13-slim (#618)
* Updated default python docker base image to 3.13-slim

* Added python 3.13 to tests
2025-10-29 22:34:06 -04:00
jandr
b4c2b21547 Sorted usage of TLS on SMTP (#613)
Added a line for the `email_results` function to take into account the smtp_ssl setting.
2025-08-25 13:51:10 -04:00
Sean Whalen
865c249437 Update features list 2025-08-24 13:39:50 -04:00
Sean Whalen
013859f10e Fix find_unknown_base_reverse_dns.py 2025-08-19 21:18:14 -04:00
Sean Whalen
6d4a31a120 Fix find_unknown_base_reverse_dns.py and sortlist.py 2025-08-19 20:59:42 -04:00
Sean Whalen
45d3dc3b2e Fix sortlists.py 2025-08-19 20:23:55 -04:00
Sean Whalen
4bbd97dbaa Improve list verification 2025-08-19 20:02:55 -04:00
Sean Whalen
5df152d469 Refactor find_unknown_base_reverse_dns.py 2025-08-18 12:59:54 -04:00
Sean Whalen
d990bef342 Use \n here too 2025-08-17 21:08:28 -04:00
Sean Whalen
caf77ca6d4 Use \n when writing CSVs 2025-08-17 21:01:07 -04:00
Sean Whalen
4b3d32c5a6 Actual, actual Actual 8.18.7 release
Revert to using the Python csv module instead of pandas to avoid conflicts with numpy in Elasticsearch
2025-08-17 20:36:15 -04:00
Sean Whalen
5df5c10f80 Pin pandas and numpy versions 2025-08-17 19:59:53 -04:00
Sean Whalen
308d4657ab Make sort_csv function more flexible 2025-08-17 19:43:19 -04:00
Sean Whalen
0f74e33094 Fix typo 2025-08-17 19:35:16 -04:00
Sean Whalen
9f339e11f5 Actual 8.18.7 release 2025-08-17 19:34:14 -04:00
Sean Whalen
391e84b717 Fix map sorting 2025-08-17 18:15:20 -04:00
Sean Whalen
8bf06ce5af 8.18.7
Removed improper spaces from `base_reverse_dns_map.csv` (Closes #612)
2025-08-17 18:13:49 -04:00
Sean Whalen
2b7ae50a27 Better wording 2025-08-17 17:01:22 -04:00
Sean Whalen
3feb478793 8.18.6
- Fix since option to correctly work with weeks (PR #604)
- Add 183 entries to `base_reverse_dns_map.csv`
- Add 57 entries to `known_unknown_base_reverse_dns.txt`
- Check for invalid UTF-8 bytes in `base_reverse_dns_map.csv` at build
- Remove unneeded items from the `parsedmarc.resources` module at build
2025-08-17 17:00:11 -04:00
Sean Whalen
01630bb61c Update code formatting 2025-08-17 16:01:45 -04:00
Sean Whalen
39347cb244 Add find_bad_utf8.py 2025-08-17 15:55:47 -04:00
Sean Whalen
ed25526d59 Update maps 2025-08-17 15:17:24 -04:00
alagendijk-minddistrict
880d7110fe Fix since option to correctly work with weeks (#604) 2025-08-14 18:39:04 -04:00
Martin Kjær Jørgensen
d62001f5a4 fix wrong configuration option for maildir (#606)
Signed-off-by: Martin Kjær Jørgensen <me@lagy.org>
2025-08-14 18:36:58 -04:00
Sean Whalen
0720bffcb6 Remove extra spaces 2025-06-10 19:05:06 -04:00
Sean Whalen
fecd55a97d Add SMTP TLS Reporting dashboard for Splunk
Closes #600
2025-06-10 18:54:43 -04:00
Sean Whalen
a121306eed Fix typo in the map 2025-06-10 10:53:55 -04:00
Sean Whalen
980c9c7904 Add Hostinger to the map 2025-06-10 10:50:06 -04:00
Sean Whalen
963f5d796f Fix build script 2025-06-10 09:51:12 -04:00
Sean Whalen
6532f3571b Update lists 2025-06-09 20:05:56 -04:00
Sean Whalen
ea878443a8 Update lists 2025-06-09 17:04:16 -04:00
Sean Whalen
9f6de41958 Update lists 2025-06-09 13:41:49 -04:00
Sean Whalen
119192701c Update lists 2025-06-09 12:02:50 -04:00
Sean Whalen
1d650be48a Fix typo 2025-06-08 21:41:07 -04:00
Sean Whalen
a85553fb18 Update lists 2025-06-08 21:40:10 -04:00
Sean Whalen
5975d8eb21 Fix sorting 2025-06-08 20:17:21 -04:00
Sean Whalen
87ae6175f2 Update lists 2025-06-08 19:51:13 -04:00
Sean Whalen
68b93ed580 Update map 2025-06-03 14:54:58 -04:00
Sean Whalen
55508b513b Remove debugging code 2025-06-03 14:38:15 -04:00
Sean Whalen
71511c0cfc 8.18.5
- Fix CSV download
2025-06-03 11:44:42 -04:00
Sean Whalen
7c45812284 8.18.4
- Fix webhooks
2025-06-02 16:52:48 -04:00
Sean Whalen
607a091a5f 8.18.3
- Move `__version__` to `parsedmarc.constants`
- Create a constant `USER_AGENT`
- Use the HTTP `User-Agent` header value `parsedmarc/version` for all HTTP requests
2025-06-02 16:43:26 -04:00
Sean Whalen
c308bf938c Update the README 2025-06-02 15:43:51 -04:00
Sean Whalen
918501ccb5 Better formatting 2025-06-02 15:20:40 -04:00
Sean Whalen
036c372ea3 8.18.2
- Merged PR #603
  - Fixes issue #595 - CI test fails for Elasticsearch
    - Moved Elasticsearch to a separate Docker service container for CI testing
    - Dropped Python 3.8 from CI testing
  - Fixes lookup and saving of DMARC forensic reports in Elasticsearch and OpenSearch
- Updated fallback `base_reverse_dns_map.csv`, which now includes over 1,400 lines
- Updated included `dbip-country-lite.mmdb` to the June 2025 release
- Automatically fall back to the internal `base_reverse_dns_map.csv` if the received file is not valid (Fixes #602)
  - Print the received data to the debug log
2025-06-02 15:19:19 -04:00
Sean Whalen
a969d83137 Update included IP database 2025-06-02 11:30:26 -04:00
Szasza Palmer
e299f7d161 fixing ES/OS forensic report lookup and storage, extracting ES to separate CI service (#603)
* fixing ES/OS forensic report lookup and storage, extracting ES to separate CI service

* bumping CI ES version to current latest

* reshuffling CI job attributes

* removing EOL Python 3.8 from the CI pipeline
2025-06-02 11:10:10 -04:00
Sean Whalen
4c04418dae Fix domain lists check 2025-04-24 16:03:18 -04:00
Sean Whalen
2ca9373ed0 Match dashboard fields 2025-04-24 15:44:22 -04:00
Sean Whalen
961ef6d804 Revert adding the BOM
It broke reading the file with python
2025-04-24 14:04:36 -04:00
Sean Whalen
573ba1e3e9 Add UTF-8 BOM to the CSV so Excel will open the file as UTF-8 2025-04-24 13:58:06 -04:00
Sean Whalen
1d8af3ccff Add find_unknown_base_reverse_dns.py 2025-04-24 13:48:51 -04:00
Sean Whalen
8426daa26b Remove duplicate domains 2025-04-24 13:47:07 -04:00
Sean Whalen
d1531b86f2 Sort known_unknown_base_reverse_dns.txt 2025-04-24 11:42:09 -04:00
Sean Whalen
8bb046798c Simplify sender types 2025-04-23 16:20:05 -04:00
Sean Whalen
d64e12548a Fix errors in the reverse DNS map 2025-04-23 15:57:31 -04:00
Sean Whalen
380479cbf1 Update reverse DNS map 2025-04-23 15:43:38 -04:00
Sean Whalen
ace21c8084 Update base_reverse_dns_map.csv and add known_unknown_base_reverse_dns.txt 2025-04-23 15:36:14 -04:00
Sean Whalen
1a1aef21ad Replace deprecated path call with file call 2025-04-23 15:33:27 -04:00
Sean Whalen
532dbbdb7e Fix file formatting 2025-04-23 15:32:04 -04:00
miles
45738ae688 Fix SyntaxError in elastic forensic report (#598) 2025-04-23 14:40:03 -04:00
Sean Whalen
9d77bd64bc Fix some CSV entries 2025-04-01 09:23:44 -04:00
Sean Whalen
140290221d Update elastic.py 2025-03-22 15:09:44 -04:00
Sean Whalen
187d61b770 Update elastic.py 2025-03-22 15:03:42 -04:00
Sean Whalen
0443b7365e Update elastic.py 2025-03-22 14:47:50 -04:00
Sean Whalen
d7b887a835 Debug elasticsearch 2025-03-22 14:42:45 -04:00
Tom Henderson
a805733221 Raise for failed status (#594) 2025-03-22 11:22:49 -04:00
Sean Whalen
9552c3ac92 Update README.md 2025-03-21 09:41:14 -04:00
Sean Whalen
5273948be0 Make build.sh usable without the gh-pages branch 2025-02-18 09:17:12 -05:00
Sean Whalen
b51756b8bd 8.18.1
- Add missing `https://` to the default Microsoft Graph URL
2025-02-17 12:41:57 -05:00
Sean Whalen
7fa7c24cb8 Merge branch 'master' of https://github.com/domainaware/parsedmarc 2025-02-17 12:31:47 -05:00
Sean Whalen
972237ae7e Fix default Microsoft Graph URL 2025-02-17 12:31:39 -05:00
Sean Whalen
6e5333a342 Style fixes 2025-02-03 16:11:21 -05:00
Sean Whalen
47b074c80b Merge branch 'master' of https://github.com/domainaware/parsedmarc 2025-02-03 16:11:01 -05:00
Sean Whalen
a1cfeb3081 8.18.0
- Add support for Microsoft national clouds via Graph API base URL (PR #590)
- Avoid stopping processing when an invalid DMARC report is encountered (PR #587)
- Increase `http.client._MAXHEADERS` from `100` to `200` to avoid errors connecting to Elasticsearch/OpenSearch (PR #589)
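For reference, the `_MAXHEADERS` change above boils down to raising a private `http.client` limit before responses with many headers are parsed; a minimal sketch:

```python
import http.client

# Elasticsearch/OpenSearch responses can carry more than the default limit of
# 100 headers, which makes http.client raise "got more than 100 headers".
http.client._MAXHEADERS = 200
```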
2025-02-03 16:10:51 -05:00
Paul Hecker
c7c451b1b1 Set http.client._MAXHEADERS to 200 (#589) 2025-02-03 15:26:15 -05:00
Kevin Goad
669deb9755 Add support for Microsoft national clouds via Graph API base URL (#590)
* adding support for Microsoft National Clouds

* Update usage.md
2025-02-03 15:25:15 -05:00
bendem
446c018920 do not stop processing when we encounter an invalid dmarc report (#587) 2025-02-03 15:20:52 -05:00
Sean Whalen
38c6f86973 Update CHANGELOG.md 2025-01-10 09:09:24 -05:00
Sean Whalen
62ccc11925 Update changelog 2025-01-09 22:25:43 -05:00
Sean Whalen
c32ca3cae3 Fix sortmaps.py 2025-01-09 22:24:03 -05:00
Sean Whalen
010f1f84a7 8.17.0
- Ignore duplicate aggregate DMARC reports with the same `org_name` and `report_id` seen within the same hour ([#539](https://github.com/domainaware/parsedmarc/issues/539))
- Fix saving SMTP TLS reports to OpenSearch (PR #585 closed issue #576)
- Add 303 entries to `base_reverse_dns_map.csv`
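A minimal sketch of the hour-window deduplication described in the first bullet above (keyed on `org_name` + `report_id`); the names here are hypothetical and parsedmarc's actual implementation differs in detail:

```python
from datetime import datetime, timedelta

# Maps (org_name, report_id) -> time the pair was first seen.
seen: dict[tuple[str, str], datetime] = {}


def is_duplicate(org_name: str, report_id: str, now: datetime) -> bool:
    """Return True if the same org_name/report_id was seen within the last hour."""
    key = (org_name, report_id)
    first_seen = seen.get(key)
    if first_seen is not None and now - first_seen < timedelta(hours=1):
        return True
    seen[key] = now
    return False
```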
2025-01-09 22:22:55 -05:00
Anael Mobilia
7da57c6382 Fix colors on export.ndjson (#586)
Old elements use the compatibility color palette => update them to the status color palette
2025-01-09 22:09:44 -05:00
Sean Whalen
d08e29a306 Move sortmaps.py 2025-01-09 22:08:42 -05:00
Sean Whalen
e1e53ad4cb Use Python instead of Excel for sorting map CSVs 2025-01-09 22:03:49 -05:00
Sean Whalen
4670e9687d Update base_reverse_dns_map.csv 2025-01-09 21:18:00 -05:00
Sean Whalen
7f8a2c08cd Use a smaller key value 2025-01-09 19:34:56 -05:00
Sean Whalen
e9c05dd0bf Update base_reverse_dns_map.csv 2025-01-08 20:51:44 -05:00
Sean Whalen
9348a474dd Actually fix the CLI 2025-01-08 20:49:39 -05:00
Sean Whalen
e0decaba8c Fix CLI 2025-01-07 14:33:35 -05:00
Sean Whalen
26a651cded Use a combination of report org and report ID when checking for duplicate aggregate reports 2025-01-07 14:25:57 -05:00
Sean Whalen
bcfcd93fc6 More duplicate aggregate report checks
#535
2025-01-07 13:56:26 -05:00
Sean Whalen
54d5ed3543 Remove unused import 2025-01-07 12:57:41 -05:00
Sean Whalen
1efbc87e0e Consolidate SEEN_AGGREGATE_REPORT_IDS 2025-01-07 12:56:30 -05:00
Sean Whalen
e78e7f64af Add parsedmarc.ini to .gitignore 2025-01-07 11:59:03 -05:00
Szasza Palmer
ad9de65b99 fixing SMTP TLS report saving to OpenSearch (#585) 2025-01-07 11:57:04 -05:00
Sean Whalen
b9df12700b Check for duplicate aggregate report IDs when processing a mailbox
Fix #535
2025-01-07 11:56:51 -05:00
Sean Whalen
20843b920f Sort reverse DNS map 2025-01-06 21:26:48 -05:00
Sean Whalen
e5ae89fedf Merge branch 'master' of https://github.com/domainaware/parsedmarc 2025-01-06 21:21:57 -05:00
Sean Whalen
f148cff11c Update reverse DNS map 2025-01-06 21:19:06 -05:00
Sean Whalen
4583769e04 Update reverse DNS map 2025-01-03 09:23:06 -05:00
Sean Whalen
0ecb80b27c Update reverse DNS map 2024-12-30 11:40:29 -05:00
Sean Whalen
b8e62e6d3b Remove duplicate entry 2024-12-28 14:14:00 -05:00
Sean Whalen
c67953a2c5 Update reverse DNS map 2024-12-28 14:10:39 -05:00
Sean Whalen
27dff4298c Update reverse DNS mapping 2024-12-28 11:53:50 -05:00
Sean Whalen
f2133aacd4 Fix build dependencies 2024-12-25 18:52:42 -05:00
Sean Whalen
31917e58a9 Update build backend 2024-12-25 18:28:30 -05:00
Sean Whalen
bffb98d217 Get report ID correctly 2024-12-25 16:37:40 -05:00
Sean Whalen
1f93b3a7ea Set max_len to a value 2024-12-25 16:26:38 -05:00
Sean Whalen
88debb9729 Fix SEEN_AGGREGATE_REPORT_IDS 2024-12-25 16:21:07 -05:00
Sean Whalen
a8a5564780 Merge branch 'master' of https://github.com/domainaware/parsedmarc 2024-12-25 16:14:40 -05:00
Sean Whalen
1e26f95b7b 8.16.1
- Ignore duplicate aggregate DMARC reports seen within a period of one hour (#535)
2024-12-25 16:14:33 -05:00
ericericsw
82b48e4d01 Add files via upload (#578)
Update the dashboard for the new version

Panel model change list:
grafana-piechart-panel -> pie chart
Graph (old) -> time series
worldmap panel -> geomap

Some table panels have changed; for example, the overview now includes an ARC column.

Known issue that cannot be solved at the moment: multiple DKIM entries cause table display errors.
2024-12-25 16:09:43 -05:00
Sean Whalen
617b7c5b4a Merge PR #527 2024-11-09 18:18:31 -05:00
Sean Whalen
989bfd8f07 Code cleanup 2024-11-02 11:40:37 -04:00
Sean Whalen
908cc2918c Merge branch 'ramspoluri-master' 2024-11-02 11:39:34 -04:00
Sean Whalen
bd5774d71d Merge branch 'master' of https://github.com/ramspoluri/parsedmarc into ramspoluri-master 2024-11-02 11:38:41 -04:00
Sean Whalen
8e9112bad3 Merge branch 'master' of https://github.com/ramspoluri/parsedmarc 2024-11-02 10:48:15 -04:00
Sean Whalen
40e041a8af Merge branch 'master' of https://github.com/ramspoluri/parsedmarc 2024-11-02 10:48:10 -04:00
Sean Whalen
7ba433cddb Fix code style 2024-11-02 10:39:05 -04:00
Sean Whalen
6d467c93f9 Update __init__.py
Add reference to https://www.rfc-editor.org/rfc/rfc3501#page-52
2024-11-02 10:35:22 -04:00
Sean Whalen
be38e83761 Code cleanup 2024-11-02 10:28:11 -04:00
Sean Whalen
ef4e1ac8dc Code cleanup 2024-11-02 10:26:30 -04:00
Sean Whalen
39e4c22ecc Fix syntax 2024-11-02 10:23:23 -04:00
Sean Whalen
88ff3a2c23 Update syntax to support Python < 3.10 2024-11-02 10:04:01 -04:00
Sean Whalen
d8aee569f7 Update __init__.py 2024-11-02 09:50:55 -04:00
Sean Whalen
debc28cc6e 8.15.4
- Fix crash if aggregate report timespan is > 24 hours
2024-10-24 19:53:44 -04:00
Sean Whalen
52ccf0536c 8.15.3
- Ignore aggregate reports with a timespan of > 24 hours (Fixes #282)
2024-10-24 19:43:28 -04:00
Sean Whalen
976a3274e6 8.15.2 2024-10-24 18:04:19 -04:00
Sean Whalen
bb722e651a Fix parsing when auth record is missing 2024-10-24 17:14:02 -04:00
PhiBo
ab280d7a34 Update Dockerfile (#571)
* Use multi-stage build to reduce image size
* Add ARGS to be more flexible during image builds
* Create user and use it instead of root
* Don't update pip in container. The Python image should have a recent
  version
2024-10-24 14:18:29 -04:00
Alexej Sidorenko
92b12eaacf issue #565 - Logfile is overwritten when parsedmarc (re)starts (#569)
Do not rewrite the log file if it already exists. Add a log handler in "append" mode (that should be the implicit default, but it is defined explicitly for visibility).
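A sketch of the append-mode handler described above, using the standard `logging` module (not the project's exact setup):

```python
import logging

# mode="a" appends to an existing log file instead of truncating it on start.
handler = logging.FileHandler("parsedmarc.log", mode="a")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logging.getLogger().addHandler(handler)
```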
2024-10-10 15:27:15 -04:00
Jed Laundry
8444053476 Create optional dependency group for build, fix codecov (#567)
* Create optional dependency groups for build and cli

* revert cli optional-dependencies group
2024-10-07 13:47:35 -04:00
Sean Whalen
1ef3057110 8.15.1
- Proper IMAP namespace fix (Closes issue #557 and issue #563)
  - Require `mailsuite>=1.9.17`
  - Revert PR #552
- Add pre-flight check for nameservers (PR #562 closes issue #543)
- Reformat code with `ruff`
2024-10-02 21:19:57 -04:00
Paolo Schiro
fdb4e4cb36 Added nameservers pre-flight check, issue #543 (#562) 2024-10-02 20:48:02 -04:00
N4v41
d80ce744da add some identified missing organizations (#566) 2024-09-26 12:37:00 -04:00
Sean Whalen
f12828485b Parse aggregate reports with multiple policy_published records 2024-09-12 18:23:11 -04:00
centja1
0a6cfb602c Added output to webhook as an option (#558)
* Added output to webhook as an option

* added documentation for new webhook configuration

---------

Co-authored-by: Sean Whalen <44679+seanthegeek@users.noreply.github.com>
2024-09-12 15:47:59 -04:00
Sean Whalen
cf46558fa3 Update reverse DNS map 2024-09-12 15:01:40 -04:00
Sean Whalen
f1a526c247 Update reverse DNS map 2024-09-12 14:55:01 -04:00
Sean Whalen
7344ea9dda Update reverse DNS map 2024-09-12 14:53:59 -04:00
Sean Whalen
7633a30066 maildir fixes 2024-09-12 14:48:26 -04:00
Sean Whalen
266d57eb8c Fix maildir connection 2024-09-06 16:00:23 -04:00
Paolo Schiro
7b7d20b1a4 Added input report maildir connector, issue #82 (#555)
Co-authored-by: Paolo Schiro <paolo.schiro@staff.aruba.it>
2024-09-06 15:44:57 -04:00
Sean Whalen
b530d624e6 Add example google SMTP-TLS report email 2024-09-04 20:03:51 -04:00
Sean Whalen
5973ca8205 Code style fixes 2024-09-04 16:48:07 -04:00
Sean Whalen
e967778f25 8.15.0
- Fix processing of SMTP-TLS reports ([#549](https://github.com/domainaware/parsedmarc/issues/549)), which broke in commit [410663d](410663dbca) (PR [#530](https://github.com/domainaware/parsedmarc/pull/530))
  - This PR enforced a stricter check for base64-encoded strings, which SMTP TLS reports from Google did not pass
  - Removing the check introduced its own issue, because some file paths were treated as base64-encoded strings
- Create a separate `extract_report_from_file_path()` function for processing reports based on a file path
- Remove report extraction based on a file path from `extract_report()`
2024-09-04 16:31:41 -04:00
Sean Whalen
630863df5c 8.14.2
- Update `base_reverse_dns_map.csv` to fix over-replacement on [`f3a5f10`](f3a5f10d67) (PR #553)
2024-09-03 13:58:07 -04:00
Anael Mobilia
3a3b02687e Update base_reverse_dns_map.csv (#553)
Fix over-replacement on f3a5f10
2024-09-03 13:51:30 -04:00
Sean Whalen
79cd7d3c3d Actual 8.14.1 release
- Fix unit tests by enforcing base64 validation
2024-09-03 13:22:27 -04:00
Sean Whalen
3ca3d64775 8.14.1 2024-09-03 11:24:52 -04:00
Sean Whalen
21180f4bb8 8.14.0 release
- Fix processing of SMTP-TLS reports (#549)
- Skip invalid aggregate report rows without calling the whole report invalid
  - Some providers such as GoDaddy will send reports with some rows missing a source IP address, while other rows are fine
- Fix Dovecot support by using the separator provided by the IMAP namespace when possible (PR #552 closes #551)
- Only download `base_reverse_dns_map.csv` once (fixes #542)
- Update included `base_reverse_dns_map.csv`
  - Replace University category with Education to be more inclusive
- Update included `dbip-country-lite.mmdb`
2024-09-03 08:37:19 -04:00
Sean Whalen
fc36a78a4d 8.14.0
- Skip invalid aggregate report rows without calling the whole report invalid
  - Some providers such as GoDaddy will send reports with some rows missing a source IP address, while other rows are fine
- Fix Dovecot support by using the separator provided by the IMAP namespace when possible (PR #552 closes #551)
- Only download `base_reverse_dns_map.csv` once (fixes #542)
- Update included `base_reverse_dns_map.csv`
  - Replace University category with Education to be more inclusive
- Update included `dbip-country-lite.mmdb`
2024-09-02 22:42:37 -04:00
Sean Whalen
f3a5f10d67 Replace University category with Education to be more inclusive 2024-09-02 21:08:17 -04:00
Frederico Freire Boaventura
4787da9ea1 Fix the tilde expansion for user home folder (#550)
This will enable the tilde expansion to map the output dir to the home
user using `~/path`.
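Tilde expansion of this kind is typically done with `os.path.expanduser`; a small sketch with an illustrative option value:

```python
import os

output_dir = "~/dmarc-output"                 # value as it might appear in a config file
output_dir = os.path.expanduser(output_dir)   # -> /home/<user>/dmarc-output
os.makedirs(output_dir, exist_ok=True)
```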
2024-09-02 17:28:48 -04:00
Gaige B Paulsen
06d8578c47 fix: use namespace for separator (#552) 2024-09-02 17:27:28 -04:00
Sean Whalen
ef621d68c7 Cache the reverse DNS map 2024-09-02 17:25:55 -04:00
Sean Whalen
1c3bcc05c7 Skip invalid aggregate report rows without calling the whole report invalid 2024-09-02 17:08:51 -04:00
Sean Whalen
554888b3dd Update IP database 2024-09-02 17:07:32 -04:00
N4v41
17d9599c54 Add various mail providers to the reverse DNS map (#548)
* Update base_reverse_dns_map.csv

Add Cloudflare E-mail routing service

* Add various missing mail providers
2024-08-31 12:08:15 -04:00
N4v41
fb0adf0627 Update base_reverse_dns_map.csv (#547)
Add Cloudflare E-mail routing service
2024-08-29 15:01:45 -04:00
N4v41
ef820c5d68 Update base_reverse_dns_map.csv (#546)
add Qualtrics to base_reverse_dns_map
2024-08-29 15:01:19 -04:00
Sean Whalen
cc1b6ae389 8.13.0
- Add Elastic/OpenSearch index prefix option (PR #531 closes #159)
- Add GELF output support (PR #532)
2024-08-24 21:29:32 -04:00
Sean Whalen
0fa6bebf5a Fix error in docs 2024-08-24 20:10:57 -04:00
Sean Whalen
06ce080171 Update cli.py
Remove trailing spaces
2024-08-24 11:34:48 -04:00
Jason Lingohr
11e0461b9d Add GELF support (#532)
* Implement the ability to log to a GELF server/input, via the use of pygelf.

* Fix flake8 style checks.
2024-08-24 11:28:55 -04:00
Andreas Brett
efe4893a7f add TLSRPT dashboard (#529) 2024-08-24 11:27:42 -04:00
Emmanuel Ormancey
9b32d9459f added ES/OS prefix (#531) 2024-08-24 11:26:13 -04:00
Félix
aa357dc50f add new sources to base_reverse_dns_map.csv (#537)
* add new sources to base_reverse_dns_map.csv

* remove extra comma in base_reverse_dns_map.csv
2024-07-30 15:53:24 -04:00
Sean Whalen
e9f0cdef1f Update tests.py
Fix code style
2024-07-16 15:14:40 -04:00
pphillips99
410663dbca fixup and tests to extract_report() to handle documented inputs (#530) 2024-07-16 14:53:27 -04:00
Sean Whalen
b110d06adb Merge branch 'master' of https://github.com/domainaware/parsedmarc 2024-06-11 14:40:13 -04:00
Sean Whalen
61402d6284 Fix PowerShell typo in docs 2024-06-11 14:38:29 -04:00
yuji suzuki
e6e282a10c Fix ci and gha update (#526)
* add gmail_api example to README.md

* fix ci

fix

fix

m

fix

* codecov update
2024-05-28 16:54:17 -04:00
ramspoluri
f618f69c6c Added 'since' option to search for messages since a certain time
- Added a `since` option under the `mailbox` section to search for messages since a certain time instead of going through the complete mailbox during testing scenarios. Acceptable values: `5m|3h|2d|1w`, with units {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}. Defaults to `1d` if an incorrect value is provided.
    - Do not mark messages as read if the test option is selected (works only for MSGraphConnection)
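A sketch of how a `5m|3h|2d|1w` value could be turned into a search cutoff with the documented `1d` fallback; parsedmarc's own parsing may differ:

```python
import re
from datetime import datetime, timedelta

UNITS = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def since_to_cutoff(value: str, now: datetime) -> datetime:
    """Convert a value like '5m', '3h', '2d', or '1w' into a cutoff datetime."""
    match = re.fullmatch(r"(\d+)([mhdw])", value.strip())
    if not match:
        # Fall back to the documented default of one day.
        match = re.fullmatch(r"(\d+)([mhdw])", "1d")
    amount, unit = int(match.group(1)), match.group(2)
    return now - timedelta(**{UNITS[unit]: amount})


# since_to_cutoff("3h", datetime.now()) -> three hours before now
```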
2024-05-24 20:43:36 +05:30
Sean Whalen
13ddc26d70 8.12.0
- Fix for deadlock with large report (#508)
- Build: move to kafka-python-ng (#510)
- Fix new config variables previously not propagated in the code (#524)
- Fixes for kafka integration (#522)
- Fix if base_domain is None before get_service_from_reverse_dns_base_domain (#514)
- Update base_reverse_dns_map.csv
2024-05-22 09:35:18 -04:00
Sean Whalen
899dc060b2 pep8 fixes 2024-05-22 09:06:32 -04:00
Anael Mobilia
1145b0f63b Update base_reverse_dns_map.csv (#516)
* Update base_reverse_dns_map.csv

Add some domains + fix mailchimp

* Update base_reverse_dns_map.csv

While Mailgun can be used for marketing, its main use case is transactional email.

https://www.mailgun.com/solutions/use-cases/transactional-email/

---------

Co-authored-by: Sean Whalen <44679+seanthegeek@users.noreply.github.com>
2024-05-22 08:44:14 -04:00
Lennart Friberg
91191e30f3 Fix if base_domain is None before get_service_from_reverse_dns_base_d… (#514)
* Fix if base_domain is None before get_service_from_reverse_dns_base_domain call

- Added an if statement for base_domain before calling get_service_from_reverse_dns_base_domain(). If base_domain is None, get_service_from_reverse_dns_base_domain() will cause the report to fail.
- Added .xml test file

* Create protection.outlook.com!example.com!1711756800!1711843200.xml

added a .xml test sample report where base_domain returns None
2024-05-22 08:36:13 -04:00
ChibangLW
42be53349f Fixes for kafka integration (#522)
* fix(kafka): set correct variables

* fix(kafka): made username and password optional.

The documentation lists these as optional, but in the code they were mandatory.
2024-05-22 08:35:41 -04:00
yuji suzuki
20f451192f add gmail_api example to README.md (#518) 2024-05-22 08:35:16 -04:00
Sean Whalen
6e96b88a27 Always use local files when testing 2024-05-22 08:30:12 -04:00
yuji suzuki
201280d700 Fixed ci for Test sample DMARC reports (#520)
* Fixed ci for Test sample DMARC reports

* pip install local local folder

* pip install path fix
2024-05-22 08:14:04 -04:00
ryuksan
f8ee9bd166 Fix new config variables previously not propagated in the code (#524) 2024-05-22 08:13:45 -04:00
Andrew Baumann
ed92e0f7eb Remove unused parsedmarc.utils.tempdir (#517)
Parsedmarc.utils was creating a temp directory on every import, but this directory is no longer used anywhere.
2024-05-22 08:13:11 -04:00
yuji suzuki
c150c7671f github actions run in all python versions and ci version up (#519) 2024-05-22 08:12:55 -04:00
Christian Clauss
6bd9aab925 README.md: Expand the acronym to help readers understand (#511) 2024-05-22 08:11:58 -04:00
Patrick Linnane
f98dc6d452 build: move to kafka-python-ng (#510)
Signed-off-by: Patrick Linnane <patrick@linnane.io>
2024-05-22 08:11:29 -04:00
Russel Hunter Yukawa
7aa2e14cbb Fix for deadlock with large report (#508)
* add large xml sample

* Avoid deadlock

* Remove extra  whitespaces
2024-05-22 08:10:59 -04:00
Vermium Sifell
0b46c1807c Update base_reverse_dns_map.csv (#515) 2024-05-22 08:08:41 -04:00
Sean Whalen
11a9e959a0 Update base_reverse_dns_map.csv 2024-04-16 13:35:33 -04:00
Lennart Friberg
b6a8739b4f Update base_reverse_dns_map.csv (#513)
Added Email Security provider
2024-04-15 10:06:45 -04:00
Sean Whalen
a53d35a90c Revert "Update usage.md" PR #497 makes moot
This reverts commit 662b7b694b.
2024-04-08 10:41:16 -04:00
Sean Whalen
662b7b694b Update usage.md 2024-04-08 09:55:07 -04:00
Sean Whalen
6a06d01b44 8.11.0
- Actually save `source_type` and `source_name` to Elasticsearch and OpenSearch
- Reverse-lookup cache improvements (PR #501 closes issue #498)
- Update the included `dbip-country-lite.mmdb` to the 2024-03 version
- Update `base_reverse_dns_map.csv`
- Add new general config options (closes issue #500)
  - `always_use_local_files` - Disables the download of the reverse DNS map
  - `local_reverse_dns_map_path` - Overrides the default local file path to use for the reverse DNS map
  - `reverse_dns_map_url` - Overrides the default download URL for the reverse DNS map
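For illustration only, these general options might be read from an ini-style config with `configparser`; the exact handling in parsedmarc's CLI may differ:

```python
import configparser

config = configparser.ConfigParser()
config.read("parsedmarc.ini")

general = config["general"]
# Option names as listed above; defaults here are illustrative assumptions.
always_use_local_files = general.getboolean("always_use_local_files", fallback=False)
local_reverse_dns_map_path = general.get("local_reverse_dns_map_path", fallback=None)
reverse_dns_map_url = general.get("reverse_dns_map_url", fallback=None)
```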
2024-04-02 12:34:32 -04:00
Sean Whalen
422f7a074a Update base_reverse_dns_map.csv 2024-04-02 12:13:19 -04:00
Sean Whalen
d6128eae9f Actually save source_type and source_name to Elasticsearch and OpenSearch 2024-04-01 09:14:45 -04:00
Sean Whalen
36eedcb446 Update dbip-country-lite.mmdb 2024-03-31 20:36:22 -04:00
Sean Whalen
acef7bdd6e Code cleanup 2024-03-31 20:25:10 -04:00
Rod Payne
8936193280 Reverse-lookup cache improvements (#501) 2024-03-31 20:01:40 -04:00
Sean Whalen
fd5b792c4a Close issue #500
Add the following general configuration options:

- `always_use_local_files` - Disables the download of the reverse DNS map
- `local_reverse_dns_map_path` - Overrides the default local file path to use for the reverse DNS map
2024-03-31 20:00:47 -04:00
Sean Whalen
041296b1f1 8.10.3
- Fix flaws in `base_reverse_dns_map.csv`
2024-03-29 14:17:40 -04:00
Sean Whalen
2e49db3c13 8.10.2
- Fix flaws in `base_reverse_dns_map.csv`
2024-03-29 14:02:30 -04:00
Sean Whalen
871d678d16 Update dmarc_aggregate_dashboard.xml 2024-03-28 19:38:31 -04:00
Sean Whalen
7a8781ef5c Update dmarc_aggregate_dashboard.xml 2024-03-28 16:07:13 -04:00
Sean Whalen
9084d32040 8.10.1
- Fix flaws in `base_reverse_dns_map.csv`
2024-03-27 18:10:39 -04:00
Sean Whalen
29fe768772 Update dmarc_aggregate_dashboard.xml 2024-03-27 18:00:35 -04:00
Sean Whalen
0cad27c686 Update base_reverse_dns_map.csv 2024-03-27 18:00:27 -04:00
Sean Whalen
1f9a5ffe58 Splunk dashboard bug fixes 2024-03-27 17:19:58 -04:00
Sean Whalen
9ffc63f895 Update base_reverse_dns_map.csv 2024-03-27 17:01:25 -04:00
Sean Whalen
26f62082c3 Update Splunk dashboards 2024-03-27 15:40:19 -04:00
Sean Whalen
d7dbf68e7d Actually bump the version number to 8.10.0 2024-03-26 12:24:01 -04:00
Sean Whalen
465829c18b 8.10.0
- Fix MSGraph UsernamePassword Authentication (PR #497)
- Attempt to download an updated `base_reverse_dns_map.csv` at runtime
- Update included `base_reverse_dns_map.csv`
2024-03-26 12:22:49 -04:00
Sean Whalen
09238d5ca8 Update base_reverse_dns_map.csv 2024-03-26 12:18:00 -04:00
Tim Nowaczyk
d907dd8cff Fix MSGraph UsernamePassword Authentication (#497)
parsedmarc/pull/471 broke UsernamePassword authentication for me. This change fixes it.

Co-authored-by: Tim Nowaczyk <tnowaczyk@eqx-kibana-1.allpointsbroadband.net>
2024-03-26 12:07:47 -04:00
Sean Whalen
18f7508a1f Download reverse DNS map from GitHub 2024-03-26 12:07:10 -04:00
Sean Whalen
ed593a0b49 Update base_reverse_dns_map.csv 2024-03-26 10:32:34 -04:00
Sean Whalen
d27e195645 Update base_reverse_dns_map.csv 2024-03-26 08:16:41 -04:00
Sean Whalen
ede8c0b6a2 8.9.4
- Update `base_reverse_dns_map.csv`
2024-03-25 11:35:25 -04:00
Sean Whalen
91daa31172 8.9.3
- Revert change in 8.9.2
2024-03-25 07:04:15 -04:00
Sean Whalen
cdde9734ec 8.9.2 2024-03-25 06:26:45 -04:00
Sean Whalen
62ad46b8ba 8.9.1 release
- Fix broken CLI by removing obsolete parameter from `cli_parse` call (PR #496 closes issue #495)
2024-03-25 05:45:08 -04:00
Szasza Palmer
fc5e0fe4d5 removing obsolete parameter from cli_parse call (#496) 2024-03-25 00:03:51 -04:00
Sean Whalen
633435547a Actual 8.9.0 release
- Fix broken cache (PR #494)
- Add source name and type information based on static mapping of the reverse DNS base domain
  - See [this documentation](https://github.com/domainaware/parsedmarc/tree/master/parsedmarc/resources/maps) for more information, and to learn how to help!
- Replace `multiprocessing.Pool` with `Pipe` + `Process` (PR #491 closes issue #489)
- Remove unused parallel arguments (PR #492 closes issue #490)
2024-03-24 23:48:08 -04:00
Sean Whalen
fd0572cdd0 8.9.0
- Add source name and type information based on static mapping of the reverse DNS base domain
  - See [this documentation](https://github.com/domainaware/parsedmarc/tree/master/parsedmarc/resources/maps) for more information, and to learn how to help!
- Replace `multiprocessing.Pool` with `Pipe` + `Process` (PR #491 closes issue #489)
- Remove unused parallel arguments (PR #492 closes issue #490)
2024-03-24 23:30:40 -04:00
Sean Whalen
e550f42a22 Add support for source name and type 2024-03-24 21:31:39 -04:00
Szasza Palmer
2cde116a93 removing unused parallel arguments (#492) 2024-03-24 20:43:43 -04:00
Szasza Palmer
a915385246 replacing multiprocessing pool with pipe (#491)
* replacing multiprocessing pool with pipe

* code styling fix

* dropping obsolete chunk_size config parameter
2024-03-23 21:41:46 -04:00
Rod Payne
1e565d9eb2 Use cache in get_ip_address_info. (#494) 2024-03-23 21:40:36 -04:00
Anael Mobilia
3a1360a47a Always use systemctl command (#482)
In order to align with the rest of the documentation
2024-03-17 11:44:55 -04:00
Sean Whalen
5f3977d686 8.8.0 release
- Add support for OpenSearch (PR #481 closes #480)
- Fix SMTP TLS reporting to Elasticsearch (PR #470)
2024-03-04 10:49:44 -05:00
Sean Whalen
65d04bcb78 Code style cleanup 2024-03-04 10:37:32 -05:00
cgoIT
f3206dcdab [SMTP TLS] some minor bug fixes (#477)
* fix minor bugs during smtp-tls parsing, add docker-compose for local elasticsearch, add smtp-tls tests

* fix wrong log message parameter

* fix wrong log message

* add contact-info to smtp tls report, fix wrong fieldnames

* fix wrong fieldnames

* fix wrong index name for search

* at least for some reporting organizations the field sending-mta-ip is optional...

* add missing fields to elasticsearch for smtp tls

* failure_details is a list, add more test cases

* fix wrong name in ci.ini
2024-03-04 10:06:47 -05:00
Szasza Palmer
995bdbcd97 adding OpenSearch support, fixing minor typos, and code styling (#481)
* adding OpenSearch support, fixing minor typos and code styling

* documentation update
2024-03-04 10:06:26 -05:00
Sean Whalen
77132b3fc5 8.7.0
- Add support for SMTP TLS reports (PR #453 closes issue #71)
- Do not replace content in forensic samples (fix #403)
- Pin `msgraph-core` dependency at version `0.2.2` until Microsoft provides better documentation (PR #466 Close [#464](https://github.com/domainaware/parsedmarc/issues/464))
- Properly handle base64-encoded email attachments (PR #453)
- Do not crash when attempting to parse invalid email content (PR #453)
- Ignore errors when parsing text-based forensic reports (PR #460)
- Add email date to email processing debug logs (PR #462)
- Set default batch size to 10 to match the documentation (PR #465)
- Properly handle none values (PR #468)
- Add Gmail pagination (PR #469)
- Use the correct `msgraph` scope (PR #471)
2024-02-19 19:21:38 -05:00
Sean Whalen
a1f141d84c Code cleanup 2024-02-19 19:05:12 -05:00
mkupferman
efe74091f3 Gmail API pagination (#469)
* Use pagination to fetch more than 100 Gmail messages at once

* Provide `paginate_messages` option to allow use of previous behavior
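A hedged sketch of the pagination idea using the Gmail API's `pageToken`/`nextPageToken` fields; parsedmarc's own Gmail connector wraps this differently:

```python
def list_all_message_ids(service, user_id="me", query=""):
    """Fetch every matching Gmail message ID, following nextPageToken across pages."""
    message_ids = []
    page_token = None
    while True:
        response = (
            service.users()
            .messages()
            .list(userId=user_id, q=query, pageToken=page_token)
            .execute()
        )
        message_ids += [m["id"] for m in response.get("messages", [])]
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return message_ids
```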
2024-02-19 18:54:41 -05:00
Brassn
d2145b71ba msgraph DeviceFlow selecting wrong scope (#471)
* DeviceCode Flow ignores user and selects wrong scope

* only require client secret on ClientSecret flow
2024-02-19 18:54:16 -05:00
Yuuki Takahashi
d512b9f60e fix: handle none value (#468) 2024-02-19 18:53:52 -05:00
Anael Mobilia
93278bc023 Add email date on debug of mail processing (#462) 2024-02-19 18:53:02 -05:00
Anael Mobilia
9e9065837e Define default value for batch_size to 10 as according to the documentation (#465) 2024-02-19 18:52:30 -05:00
Anael Mobilia
1c1ce7fea3 Doc - Update elasticsearch parameters (#467) 2024-02-19 18:52:01 -05:00
bendem
fc49f7f56c Ignore errors when parsing text-based forensic reports (#460)
Starting with 8.2.0, parsedmarc crashes instead of ignoring some invalid reports.

The original change was introduced in abf9695228.
2024-02-19 18:51:28 -05:00
Sean Whalen
b8088505b1 Add support for SMTP TLS reports (#453) 2024-02-19 18:45:38 -05:00
Jason Lingohr
7d2b431e5f Fix tiny formatting issue (#451) 2024-01-02 16:38:50 -05:00
Nicholas Hairs
cb3f82e847 Add additional samples and ensure git does not touch CRLF (#456)
mimecast sample taken from:
https://github.com/domainaware/parsedmarc/issues/429#issuecomment-1873387507
2024-01-02 16:29:06 -05:00
UKnowTEd
100f12ed35 Update elastic.py (#449)
Insert new config parameter "apiKey" to authenticate via API to Elasticsearch.
2023-12-16 11:59:50 -05:00
UKnowTEd
37a6155262 Update cli.py (#450)
Insert new config parameter "apiKey" to authenticate via API to Elasticsearch.
2023-12-16 10:10:35 -05:00
Andras
25086763a9 small grammatical error in README.md (#446) 2023-12-16 10:09:42 -05:00
Anael Mobilia
b89c38c22a Region Map -> Maps (#445)
Since Kibana 8, Region Maps are no longer rendered.
See https://github.com/elastic/kibana/issues/81704
2023-12-16 10:09:11 -05:00
Jonathan Martens
c71bc19cea docs: formatting fix on delete keyword in mail section (#444) 2023-12-16 10:08:41 -05:00
Anael Mobilia
3bf0bea710 Fix typo on documentation (#443)
* Doc cleanup/improvement

* Fix typos

---------

Co-authored-by: Anael Mobilia <anael.mobilia@mydsomanager.com>
2023-12-16 10:08:06 -05:00
Sean Whalen
95954c5d87 Correct changelog 2023-10-13 10:35:58 -04:00
Sean Whalen
6120b8683d Update changelog 2023-10-13 10:29:42 -04:00
Sean Whalen
21d6f92fd4 Add PyPI download stats badge 2023-10-13 10:01:48 -04:00
jlownie
a164fb8e04 Update usage.md (#436) 2023-10-13 09:53:28 -04:00
Sean Whalen
762d92f6d2 8.6.4
- Ignore SPF results in aggregate report records if the domain is not provided
2023-10-13 09:51:59 -04:00
Sean Whalen
1655b84cc2 8.6.3
- Add an error message instead of raising an exception when an aggregate report timespan is greater than 24 hours
2023-10-11 19:41:30 -04:00
Sean Whalen
0eaba079b9 8.6.2
- Use `zlib` instead of `Gzip` to decompress more `.gz` files, including the ones supplied by Mimecast (Based on #430 closes #429)
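For reference, decompressing with `wbits=zlib.MAX_WBITS | 32` auto-detects a zlib or gzip header, and a decompress object tolerates trailing bytes after the stream ends; a sketch, not necessarily parsedmarc's exact call:

```python
import zlib


def decompress_gz(data: bytes) -> bytes:
    # MAX_WBITS | 32 accepts either a gzip or a zlib header; any bytes after the
    # end of the stream land in .unused_data instead of raising an error.
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 32)
    return decompressor.decompress(data)
```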
2023-10-11 17:49:46 -04:00
Sean Whalen
1de4a94c37 Merge branch 'master' of https://github.com/domainaware/parsedmarc 2023-10-11 17:36:39 -04:00
Sean Whalen
722ff79e23 Move senders out of the package
until the database is more complete
2023-10-11 17:36:13 -04:00
Anael Mobilia
68145b8b06 setup.py don't exist anymore (#434)
The setup.py file was deleted in de3002db8b
2023-10-11 17:25:25 -04:00
Anael Mobilia
732547e622 Doc cleanup/improvement (#432)
Co-authored-by: Anael Mobilia <anael.mobilia@mydsomanager.com>
2023-10-11 17:24:50 -04:00
Anael Mobilia
aaf269b11b Install latest elasticsearch for tests (#433) 2023-10-11 17:23:19 -04:00
Sean Whalen
2bee4fc591 Fix README typos 2023-09-05 21:17:04 -04:00
Sean Whalen
4d0974948d Update senders.sqlite 2023-09-05 21:10:12 -04:00
Sean Whalen
8b5834b00d Fix flake8 E721 2023-09-05 18:12:50 -04:00
Sean Whalen
31db7d2301 Add senders.sqlite 2023-09-05 15:15:30 -04:00
Sean Whalen
26027ef6b3 Add senders.sqlite 2023-09-05 15:14:11 -04:00
Sean Whalen
4ff44dcb0f Update dbip-country-lite.mmdb 2023-09-05 06:43:31 -04:00
Sean Whalen
557e2e0554 Code style fixes 2023-08-01 16:23:31 -04:00
Sean Whalen
6c84cfb7c4 Fix output in tests.py 2023-06-27 15:56:33 -04:00
Sean Whalen
a4b0aabdfb Update CHANGELOG.md 2023-06-27 15:54:21 -04:00
andersnauman
51760181b0 Fix: Less error-prone code to minimize the risk of program crash (#417)
- Double check if org_name exists. An empty name will crash Elasticsearch's 'exist' search
- Move try-statement to include open() to catch if files do not exist
- Enclose Elasticsearch's execute in a try-statement to catch any invalid searches when variables are empty/missing
2023-06-27 15:45:40 -04:00
Sean Whalen
89872d78ac Fix testing 2023-06-27 15:43:09 -04:00
Sean Whalen
477a45d19b Actually bump the version to 8.6.1 2023-05-14 20:43:42 -04:00
Sean Whalen
e5e7a6fe75 8.6.1
- Fix handling of non-domain organization names (PR #411 fixes issue #410)
- Ignore rua reports with a date range that is too long to be valid (PR #408 fixes issue #282)
2023-05-14 20:41:44 -04:00
Gaige B Paulsen
5a659ea578 fix: handling of text org_name without space #410 (#411) 2023-05-14 20:29:12 -04:00
Anael Mobilia
d2d62488f7 Add contributors on copyright (#414)
Co-authored-by: Anael Mobilia <anael.mobilia@mydsomanager.com>
2023-05-14 18:07:35 -04:00
Anael Mobilia
bf69ea8ccc Fix typos (#413)
Co-authored-by: Anael Mobilia <anael.mobilia@mydsomanager.com>
2023-05-14 18:07:07 -04:00
Michael Kliewe
af1e299dd4 Fix issue #282: Detect large date ranges in aggregate reports and skip processing (#408) 2023-05-14 18:06:07 -04:00
Sean Whalen
d426098b7e 8.6.0
- Replace publicsuffix2 with publicsuffixlist
2023-05-09 09:11:35 -04:00
Jed Laundry
a06fdc586f Change publicsuffix2 to publicsuffixlist (#406)
* change to publicsuffixlist

* update publicsuffixlist (now auto-updating)

* Fix unused imports
2023-05-09 08:49:41 -04:00
Anael Mobilia
062d6ea821 Use a local network connection to elasticsearch (#407) 2023-05-09 08:48:17 -04:00
Sean Whalen
b15e8d0aad Fix documentation headings 2023-05-03 18:25:27 -04:00
Sean Whalen
bf102f78aa 8.5.0
- Add support for Azure Log Analytics (PR #394)
- Fix a bug in the Microsoft Graph integration that caused a crash when an inbox has 10+ folders (PR #398)
- Documentation fixes
2023-05-03 17:10:49 -04:00
Sean Whalen
f6c0a4ecbc Update docs 2023-05-03 16:57:22 -04:00
Sean Whalen
044038a381 Fix mailing list documentation 2023-05-03 16:54:25 -04:00
Sean Whalen
cd475255c5 Documentation cleanup 2023-05-03 16:44:15 -04:00
Ben Companjen
2b35b785c6 Split and Organise documentation files (#404)
* Set global TOC collapse to false

* Split documentation

I tried to split the index.md file into logical parts, not changing the contents.
I did add a space and change one HTTP URL to HTTPS.

---------

Co-authored-by: Sean Whalen <44679+seanthegeek@users.noreply.github.com>
2023-05-03 16:11:58 -04:00
Sean Whalen
634631e326 Code style fixes 2023-05-03 16:07:54 -04:00
rubeste
a7280988eb Implemented Azure Log Analytics ingestion via Data Collection Rules (#394)
* Implemented Azure Log Analytics ingestion via Data Collection Rules

* Update loganalytics.py

* Update cli.py

* Update pyproject.toml

* Fixed config bug

Fixed a bug that causes the program to fail if you do not configure a Data stream.

* Fixed code format
2023-05-03 15:54:25 -04:00
Anael Mobilia
02e856a9bf From Elasticsearch 8.7, xpack security isn't on by default but is required (#395)
```
org.elasticsearch.ElasticsearchSecurityException: invalid configuration for xpack.security.transport.ssl - [xpack.security.transport.ssl.enabled] is not set, but the following settings have been configured in elasticsearch.yml : [xpack.security.transport.ssl.keystore.secure_password,xpack.security.transport.ssl.truststore.secure_password]
```
2023-05-03 15:39:46 -04:00
rubeste
70a9d11adb Fixed Bug in graph.py (#398)
* Fixed Bug in graph.py

Fixed a bug regarding finding a specific folder.
This bug caused parsedmarc to crash if it could not find the folder in one MS Graph request. This is only an issue if your mailbox contains 10+ folders.
It was solved by adding the `$filter=displayName eq '{folder_name}'` param so it would immediately find the folder.

* Fixed MS Graph Search bug

Fixed bug that causes only 10 messages to be read from MS Graph if batch size is defined as 0.
Fixed formatting

* prevented $top from being 0
2023-05-03 15:36:23 -04:00
Ben Companjen
9c86e2df49 Fix docstring indentations and grammar (#405)
In several docstrings incorrect indentation caused errors in the
rendered documentation.
I also changed a few words to correct the grammar.
2023-05-03 15:30:05 -04:00
Sean Whalen
b75259c58c Merge branch 'master' of github.com:domainaware/parsedmarc 2023-01-21 10:09:39 -05:00
Sean Whalen
16318b9152 8.4.2
PR #386 closes issues #289 and #380

- Only initialize the syslog, S3 and Kafka clients once
2023-01-21 10:08:47 -05:00
kcyd
22a6d21362 initialize syslog, s3 and kafka clients only once (#386) 2023-01-21 10:02:05 -05:00
Sean Whalen
b10cc06441 8.4.1
- Fix bug introduced in 8.3.1 that caused `No such file or directory` errors if output files didn't exist (PR #385 closes issues #358 and #382)
- Make the `--silent` CLI option only print errors.
  Add the `--warnings` options to also print warnings (PR#383)
2023-01-16 15:06:31 -05:00
Aleksandar Ristic
a54cc3e6dd Add --warnings cli option; --silent now prints only errors (#383)
* Adding warnings argument; silent logs only errors.

* Fixing flake warning for trailing whitespace

* Change help for --warnings to match existing style
2023-01-16 14:45:27 -05:00
Anael Mobilia
8b8c8c15fe Fix markdown (#384) 2023-01-16 14:43:36 -05:00
Clayton Dukes
bb838bb11a fixes https://github.com/domainaware/parsedmarc/issues/382 (#385) 2023-01-16 14:43:06 -05:00
Sean Whalen
375aaf025d Better build script 2022-12-23 20:20:53 -05:00
Sean Whalen
f82445fa06 8.4.0 2022-12-23 20:08:00 -05:00
Anael Mobilia
70ff13bfae In case of a timeout or exception, wait "check_timeout" seconds before trying a new connection (per the documentation: "or the number of seconds until the next mail check"). (#377)
Otherwise, in case of a mail server issue, we would try to connect again every 5 seconds.
2022-12-23 18:23:01 -05:00
aroldxd
fcc64ed85a add option to allow unencrypted fallback for token cache (#375) 2022-12-23 18:21:22 -05:00
Bjoern
ea777d9d18 Fix Bug with wrong ip_db_path (#338 #287) (#369)
Fixes the problem when only a file path was specified instead of a file
Reported with Issues:
Not a valid aggregate or forensic report after fresh Install with 8.3.x #338
Not a valid aggregate or forensic report after Update to 7.1.1 #287
2022-12-23 18:19:21 -05:00
Anael Mobilia
4217a076de Doc - Add info on how to update max shards (#368)
Add information on how to fix "Elasticsearch error: RequestError(400, 'validation_exception', 'Validation Failed: 1: this action would add [1] shards, but this cluster currently has [1000]/[1000] maximum normal shards open;"
2022-12-23 18:15:11 -05:00
nmourad
0a0e4beb27 Update documentation default value for ES replica setting (#376)
Change made in 7.1.0 "Set Elasticsearch shard replication to 0 (PR #274)"
Documentation was not updated

Co-authored-by: n.mourad <n.mourad@criteo.com>
2022-12-23 18:14:41 -05:00
ykcab
304ebaaa42 Updated Grafana query template (#366)
Co-authored-by: Alain Mbuku <git@almb.me>
2022-12-23 18:13:56 -05:00
Anael Mobilia
bcf242b0ab Fix typo (#364) 2022-12-23 18:13:10 -05:00
Anael Mobilia
1380eed2b8 Doc - Update install documentation to Elasticsearch/Kibana 8 (#363)
* Update elasticsearch/kibana instructions

[From Elasticsearch notes](https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#heap-size-settings):
```
By default, Elasticsearch automatically sets the JVM heap size based on a node’s roles and total memory. We recommend the default sizing for most production environments.
```

* Update nginx conf to TLSv1.3 and IPv6

* Replace nginx proxy with a native https server

Kibana now provides an https web server, so remove the nginx proxy part and use Kibana directly

* Fix typo

* Add info on how to log in to Kibana

* Add interface details
2022-12-23 18:12:39 -05:00
Anael Mobilia
69c2c6bdb6 Add details on virtualenv / package installation (#361) 2022-12-23 18:10:35 -05:00
Anael Mobilia
7c349fe97e Add contrib component requirement on Debian (#360) 2022-12-23 18:09:52 -05:00
Sean Whalen
49f9d75413 Add publish-docs.sh 2022-10-04 18:45:57 -04:00
Sean Whalen
b86365225e Merge remote-tracking branch 'refs/remotes/origin/master' 2022-10-04 18:43:17 -04:00
Sean Whalen
dde79c9e26 8.3.2
- Improvements to the Microsoft Graph integration (PR #352)
2022-10-04 18:42:32 -04:00
Nathan Thorpe
79d99d0b2e Fix issue with MS Graph batch size, scopes, and exception logging improvements (#352)
* log the stack trace when handling an exception

* fix issue when batch size is not supplied, raise error on fetch_messages

* fix unused var

* initialize graph client with scopes,

* add check for status code
2022-10-04 18:11:01 -04:00
Pierce
126bab1c3b Fix screenshot in README.md (#353) 2022-10-04 18:09:42 -04:00
Sean Whalen
4a607420a7 Fix list formatting in docs 2022-09-10 15:16:02 -04:00
Sean Whalen
be4c236d69 Use hatch when testing building packages 2022-09-10 14:43:15 -04:00
Sean Whalen
4376b12c93 Remove rstcheck from tests 2022-09-10 14:39:08 -04:00
Sean Whalen
12e591255c Fix tests 2022-09-10 14:32:43 -04:00
Sean Whalen
6ccc827e4c Fix Python tests in GitHub 2022-09-10 14:16:17 -04:00
Sean Whalen
132bcde655 Update vscode settings 2022-09-10 14:03:41 -04:00
Sean Whalen
6540577ad5 Convert docs to markdown 2022-09-10 12:53:47 -04:00
Sean Whalen
26f43b3315 Fix build 2022-09-09 17:50:51 -04:00
Sean Whalen
1e0fa9472c Fix build 2022-09-09 16:46:57 -04:00
Sean Whalen
475b89adb0 Fix build 2022-09-09 16:44:21 -04:00
Sean Whalen
de3002db8b Fix Docker build 2022-09-09 16:36:03 -04:00
Sean Whalen
d2da6f30af 8.3.1
- Handle unexpected xml parsing errors more gracefully
2022-09-09 16:22:43 -04:00
Sean Whalen
10e15d963b 8.3.1
- Handle unexpected xml parsing errors more gracefully
2022-09-09 16:22:28 -04:00
Dave Rawks
84a7386726 Handle unexpected xml parsing errors more gracefully (#349)
* updates `parse_aggregate_report_xml` to not raise an unhandled
  exception on parsing errors
* adds an empty xml file to the aggregate test samples
* adds test for coverage
* Resolves #348
2022-09-08 18:23:34 -04:00
Sean Whalen
9d739ccd65 PEP 8 code style fixes 2022-09-03 10:28:45 -04:00
Sean Whalen
4f53894ce1 PEP 8 code style fixes 2022-08-19 14:12:26 -04:00
Dominik Bermühler
1d1f9e84b0 Logging silenced if parsedmarc is used as a library (#344)
By default, libraries should not configure their logger, giving the developer who is using the library the freedom to decide how log messages are handled. For this reason, the library should set no handler other than the NullHandler and should not set a log level.

For more information about this topic see here: https://realpython.com/python-logging-source-code/#library-vs-application-logging-what-is-nullhandler
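The standard pattern referenced above, roughly as a library would apply it:

```python
import logging

# Library modules only attach a NullHandler; the consuming application
# decides whether and how log records are actually emitted.
logger = logging.getLogger("parsedmarc")
logger.addHandler(logging.NullHandler())

# An application using the library can then opt in explicitly, e.g.:
# logging.basicConfig(level=logging.INFO)
```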
2022-08-19 13:54:52 -04:00
Justus Piater
11c151e818 replace dateparser with dateutil (#331)
fixes #298
2022-08-19 13:53:20 -04:00
Cody Cutrer
d1da40bab7 add config option for setting check_timeout (#343) 2022-08-19 13:51:08 -04:00
Cody Cutrer
046a7885ea support custom port for gmail oauth2 local server (#341) 2022-08-19 13:50:18 -04:00
vk2muf
bddc2adb9c Update Grafana-DMARC_Reports.json (#335) 2022-08-19 13:49:40 -04:00
Cody Cutrer
4e8c74599b append to output files if they already exist (#342)
* append to output files if they already exist

instead of overwriting them. fixes #226

* save output with each successive watch result

possible now that it appends to the output
2022-08-19 13:49:16 -04:00
Nathan Thorpe
b15425f50e Add Graph credentials cache and fix issue with batch size (#334)
* implement token cache for Graph API

* fix case when username isn't defined (device code)

* add batch size when listing messages from Graph API

* fix string concat
2022-07-21 13:43:21 -04:00
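A rough sketch of a token cache for the Microsoft Graph integration, based on the `msal` library's documented `SerializableTokenCache`. The cache file path and client ID are placeholders, and parsedmarc's actual implementation may differ; the point is that tokens persisted between runs avoid repeated interactive sign-ins.

```python
import os

import msal

CACHE_PATH = "token_cache.bin"  # placeholder location

cache = msal.SerializableTokenCache()
if os.path.exists(CACHE_PATH):
    with open(CACHE_PATH) as f:
        cache.deserialize(f.read())  # reuse tokens from a previous run

app = msal.PublicClientApplication(
    client_id="00000000-0000-0000-0000-000000000000",  # placeholder client ID
    authority="https://login.microsoftonline.com/common",
    token_cache=cache,
)

# ... acquire tokens through `app` as usual ...

if cache.has_state_changed:
    with open(CACHE_PATH, "w") as f:
        f.write(cache.serialize())  # persist refreshed tokens for the next run
```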
Sean Whalen
315b99f95a Resolve Sphinx warnings 2022-06-20 10:46:54 -04:00
Sean Whalen
f070ee95c3 Update build.sh to work with newer versions of rstcheck 2022-06-20 10:40:20 -04:00
Sean Whalen
eda5726652 8.3.0 release
- Support MFA for Microsoft Graph (PR #320 closes issue #319)
- Add more options for S3 export (PR #328)
- Provide a helpful error message when the log file cannot be created (closes issue #317)
2022-06-20 10:32:50 -04:00
Sean Whalen
5ab649cf8c Code style fix 2022-06-20 10:22:25 -04:00
Sean Whalen
ae8c587aed Provide a helpful error message when the log file cannot be created
Close issue #317
2022-06-20 10:15:35 -04:00
Sean Whalen
3850c7bdc4 Code style fix 2022-06-20 10:00:53 -04:00
William Desportes
9381381f40 CI improvements (#327)
* Add permissions on GitHub workflows tokens to avoid misuse and leak

* Bump docker actions

* Bump elasticsearch version
2022-06-20 09:49:43 -04:00
William Desportes
a496b4e42a Fix documentation typos (#326)
Fixes: #325
2022-06-20 09:48:14 -04:00
William Desportes
9671a49166 Implement new settings for s3 (#328)
* Fix s3 path documented as an int

* Implement new settings for s3 storage
2022-06-20 09:47:29 -04:00
Sean Whalen
6354c9bce7 Close file object before raising exception
Hopefully resolves #321
2022-05-16 22:00:25 -04:00
Nathan Thorpe
9d05fe776a Add message about scoping shared mailboxes when using MSGraph ClientSecret auth (#322)
* Add message about scoping to shared mailboxes

* disable resolve entities in Xml Parser
2022-05-14 14:14:42 -04:00
Sean Whalen
0e6e6c31c0 Merge branch 'master' of https://github.com/domainaware/parsedmarc 2022-05-14 13:16:03 -04:00
Sean Whalen
31821bcfd9 Remove pypy guide until #321 is fixed 2022-05-14 13:16:00 -04:00
Nathan Thorpe
9bf4e75e0e Implement Device Code and Client Secret auth flow for MS Graph (#320)
* implement DeviceCode and ClientSecret auth flow for MS Graph

* update readme for MS Graph auth method additions

* add warning to msgraph config setup
2022-05-14 12:58:22 -04:00
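A minimal sketch of the two auth flows named in the commit above, using the `msal` library's documented API. The tenant ID, client ID, secret, and scopes are placeholders; parsedmarc's actual flow selection and configuration handling may differ.

```python
import msal

AUTHORITY = "https://login.microsoftonline.com/your-tenant-id"  # placeholder
APP_SCOPES = ["https://graph.microsoft.com/.default"]


def client_secret_token(client_id: str, client_secret: str) -> dict:
    """ClientSecret flow: unattended authentication for a registered app."""
    app = msal.ConfidentialClientApplication(
        client_id, client_credential=client_secret, authority=AUTHORITY
    )
    return app.acquire_token_for_client(scopes=APP_SCOPES)


def device_code_token(client_id: str) -> dict:
    """DeviceCode flow: the user signs in from another device using a short code."""
    app = msal.PublicClientApplication(client_id, authority=AUTHORITY)
    flow = app.initiate_device_flow(scopes=["Mail.ReadWrite"])
    print(flow["message"])  # e.g. "To sign in, use a web browser to open ..."
    return app.acquire_token_by_device_flow(flow)
```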
Sean Whalen
ed39c7d89e Update documentation 2022-05-13 17:11:05 -04:00
Sean Whalen
1b443b8843 Revert "Set default batch size at the config, not the function"
This reverts commit c5a624274c.
2022-05-13 10:40:32 -04:00
Sean Whalen
c5a624274c Set default batch size at the config, not the function 2022-05-13 09:50:14 -04:00
Sean Whalen
4018e828e9 8.2.0 release
- Support non-standard, text-based forensic reports sent by some mail hosts
- Set forensic report version to `None` (`null` in JSON) if the report was in a non-standard format and/or is missing a version number
- The default value of the `mailbox` `batch_size` is now `10` (use `0` for no limit)
2022-05-10 21:01:41 -04:00
Sean Whalen
2d85e095fe PEP8 code style fix 2022-05-10 20:01:32 -04:00
Sean Whalen
abf9695228 8.2.0 2022-05-10 19:55:27 -04:00
Sean Whalen
665720a017 8.1.1
- Fix marking messages as read via Microsoft Graph
2022-05-09 19:14:27 -04:00
Sean Whalen
6b2131f0e8 Update documentation 2022-05-09 15:41:35 -04:00
Sean Whalen
6702181400 Move __version__ back
Fixes building from git/docker
2022-05-09 15:34:01 -04:00
Sean Whalen
b9c3404989 Mark a message as read when fetching a message from Microsoft Graph 2022-05-09 15:19:15 -04:00
Sean Whalen
b6054aafce Prep for 8.1.0
- Restore compatibility with <8.0.0 configuration files (with deprecation warnings)
- Move `parsedmarc.__version__` to `parsedmarc.meta.__version__`
- Set default `reports_folder` to `Inbox` (rather than `INBOX`) when `msgraph` is configured
2022-05-09 10:01:04 -04:00
Nathan Thorpe
abf07125c4 Add Dockerfile & build/push task (#316)
* add dockerfile and actions task to build image

* test on branch

* change to push only on release, update readme

* remove pip install requirements

* change to on release github action
2022-05-05 21:06:38 -04:00
Sean Whalen
bfd1aa8172 8.0.3 release
- Fix IMAP callback for `IDLE` connections (PR #313 closes issue #311)
- Add warnings in documentation and log output for IMAP configuration changes introduced in 8.0.0 (Closes issue #309)
- Actually pin the `elasticsearch` Python library version at `<7.14.0` (Closes issue #315)
- Separate version numbers in `__init__.py` and `setup.py` to allow `pip` to install directly from `git`
- Update `dateparser` to 1.1.1 (closes issue #273)
2022-05-02 16:33:44 -04:00
Sean Whalen
a1c912fc7d Use dateparser 1.1.1 to address #273 2022-05-02 16:24:44 -04:00
Sean Whalen
6d0717d6c0 Fix setup.py requirements list 2022-05-02 15:56:48 -04:00
Sean Whalen
8943430ff3 Fix elasticsearch-dsl requirement 2022-05-02 15:39:40 -04:00
Sean Whalen
03811988d3 Separate version number in setup.py to allow install from git 2022-05-02 14:42:25 -04:00
Sean Whalen
996deb042c Move version to its own package 2022-05-02 14:38:17 -04:00
Sean Whalen
fe5559e44e Move version above imports 2022-05-02 14:28:44 -04:00
Sean Whalen
af2afddf96 Possible 8.0.3 release
- Fix IMAP callback for `IDLE` connections (PR #313 closes issue #311)
- Add warnings in documentation and log output for IMAP configuration changes introduced in 8.0.0 (Closes issue #309)
- Actually pin the `elasticsearch` Python library version at `<7.14.0` (Closes issue #315)
2022-05-02 12:50:37 -04:00
Sean Whalen
1b0f95a0ff Add notes and warnings to address #309 and #314 2022-04-29 09:10:26 -04:00
Sean Whalen
167de27d34 Merge pull request #313 from nathanthorpe/imap_watch_fix
fix callback on IMAPConnection
2022-04-29 08:18:46 -04:00
Nathan Thorpe
e46a768b82 fix callback object on IMAPConnection 2022-04-28 19:46:24 -07:00
Sean Whalen
75da9f6a30 8.0.2
Strip leading and trailing whitespaces from Gmail scopes (Closes #310)
2022-04-26 12:17:25 -04:00
Sean Whalen
64f8eef27b 8.0.1
Fix `ModuleNotFoundError` by adding `parsedmarc.mail` to the list of packages in `setup.py` (PR #308)
2022-04-24 20:12:01 -04:00
Sean Whalen
adee1288c7 Merge pull request #308 from nathanthorpe/fix_package
Fix ModuleNotFoundError: add parsedmarc.mail to setup packages
2022-04-24 12:44:26 -04:00
Nathan Thorpe
514ca35117 add parsedmarc.mail to setup packages 2022-04-24 08:18:37 -07:00
Sean Whalen
23df163759 Update version number to 8.8.8 2022-04-22 17:08:15 -04:00
Sean Whalen
763476cdd3 Update CHANGELOG.md 2022-04-22 17:06:35 -04:00
Sean Whalen
64b88991d1 8.0.0
- Update included copy of `dbip-country-lite.mmdb` to the 2022-04 release
- Add support for Microsoft/Office 365 via Microsoft Graph API (PR #301 closes issue #111)
- Pin `elasticsearch-dsl` version at `>=2.2.0,<7.14.0` (PR #297 closes issue #296)
- Properly initialize `ip_db_path` (PR #294 closes issue #286)
- Remove usage of `logging.basicConfig` (PR #285)
- Add support for the Gmail API (PR #284 and PR #307 close issue #96)

Special thanks to @nathanthorpe  and @roeften for their substantial contributions.
2022-04-22 16:59:47 -04:00
Sean Whalen
d2cc93f23f Merge pull request #307 from nathanthorpe/gmail_mailbox_integration
Refactor Gmail integration with MailboxConnection interface
2022-04-22 14:19:51 -04:00
Nathan Thorpe
e2ac067bf3 Gmail: remove deprecated call to run_console, create folder and move msg fixes 2022-04-21 17:42:45 -07:00
Nathan Thorpe
d03d2b5f44 pep8 and tests fix 2022-04-21 17:03:54 -07:00
Nathan Thorpe
cf682337e9 Merge branch 'master' into gmail_mailbox_integration
# Conflicts:
#	parsedmarc/cli.py
2022-04-21 16:48:14 -07:00
Nathan Thorpe
6a1a88cfdf gmail pep8 fixes 2022-04-21 16:46:01 -07:00
Nathan Thorpe
5ad7e821b6 pep8 fixes on graph/imap, readme changes 2022-04-21 16:45:28 -07:00
Nathan Thorpe
657f34dc2a initial pass at integrating gmail with MailboxConnection interface 2022-04-21 16:14:34 -07:00
Sean Whalen
b0a6a5bbff PEP8 fixes 2022-04-21 19:10:54 -04:00
Sean Whalen
03ff412c70 Merge pull request #306 from nathanthorpe/python_3_6_fix
Python 3.6 fix: add maxsize parameter to find_folder_id cache
2022-04-21 18:49:05 -04:00
Nathan Thorpe
bd35d31573 add maxsize parameter to find_folder_id cache 2022-04-21 14:58:30 -07:00
Sean Whalen
90a53f2217 PEP8 fixes 2022-04-21 17:46:27 -04:00
Sean Whalen
4faf8ad651 Update README.rst 2022-04-21 17:19:42 -04:00
Sean Whalen
8da627aa61 Merge pull request #301 from nathanthorpe/graph_addition
Add support for Microsoft Graph API (Microsoft 365 mailboxes)
2022-04-21 17:11:35 -04:00
Sean Whalen
f5dcde183a Merge branch 'master' into graph_addition 2022-04-21 17:11:22 -04:00
Sean Whalen
bc5cca69ba Merge pull request #297 from robertomoutinho/ISSUE-296
ISSUE-296 - Elastic DSL lib pinned to under 7.14
2022-04-21 16:54:04 -04:00
Sean Whalen
a37f7aef5e Merge pull request #294 from taladar/ip_db_path
add ip_db_path initialization
2022-04-21 16:53:22 -04:00
Sean Whalen
3bce7cf300 Merge pull request #285 from dbermuehler/master
Removed usage of logging.basicConfig
2022-04-21 16:53:02 -04:00
Sean Whalen
c777d6759d Merge branch 'master' into master 2022-04-21 16:52:47 -04:00
Sean Whalen
d76b0adc67 Merge pull request #284 from roeften/gmail
Add GMail API support
2022-04-21 16:47:13 -04:00
Nathan Thorpe
754e1d6bc5 remove batch_size from fetch_messages method 2022-04-06 11:46:37 -07:00
Nathan Thorpe
445e3cdf9d add try except block to get_dmarc_reports call 2022-04-03 21:49:27 -07:00
Nathan Thorpe
db1d3443fd update setup.py 2022-04-03 17:56:39 -07:00
Nathan Thorpe
59a39b1509 update readme 2022-04-03 17:54:40 -07:00
Nathan Thorpe
1f865ae566 implement mailbox watch 2022-04-03 17:42:11 -07:00
Nathan Thorpe
88c8af8334 Implement getting messages from an Office 365 mailbox using MS Graph 2022-04-03 16:57:31 -07:00
robertomoutinho
30539dc111 ISSUE-296 - Elastic DSL lib pinned to under 7.14 2022-03-15 20:52:50 -03:00
Matthias Hörmann
e79dbd702e add ip_db_path initialization 2022-03-02 11:47:39 +01:00
Panos Gkikakis
5be36e431c Added dependencies for GMail api 2022-01-24 23:45:59 +02:00
Dominik Bermühler
8441f8badd Removed usage of logging.basicConfig
logging.basicConfig will change the configuration of the root logger, not the configuration of your own library logger. Since parsedmarc is a library, it should keep its logging configuration to its own logger, so that the logging configuration of applications using this library is not affected.
2022-01-18 18:23:23 +01:00
Panos Gkikakis
e721f5701e Add GMail API support 2022-01-17 23:43:33 +02:00
Sean Whalen
315d400677 Update index.rst 2022-01-09 14:42:55 -05:00
Sean Whalen
5e0ac908c6 Better install instructions
Avoid polluting the Python system installation
2022-01-09 14:33:47 -05:00
Sean Whalen
12e1382f13 Update README.rst 2022-01-09 13:57:33 -05:00
Sean Whalen
672a0f255d Update version number to 7.1.1 2022-01-08 16:10:16 -05:00
Sean Whalen
21c7e9d4af 7.1.1
- Actually include `dbip-country-lite.mmdb` file in the `parsedmarc.resources` package (PR #281)
- Update `dbip-country-lite.mmdb` to the 2022-01 release
2022-01-08 16:07:23 -05:00
Sean Whalen
7a44164ce5 Merge pull request #281 from mcastellini/fix_package
Actually add DBIP database static copy to packaged files
2022-01-08 15:19:41 -05:00
Matteo Castellini
68bc430476 Actually add DBIP database static copy to packaged files 2022-01-07 22:58:41 +01:00
Sean Whalen
2b55c9ec13 Update python-tests.yml 2022-01-05 12:11:35 -05:00
Sean Whalen
c5248a9375 Update requirements.txt 2022-01-05 11:55:12 -05:00
Sean Whalen
3fb3d6f920 Update requirements.txt 2022-01-05 11:54:26 -05:00
Sean Whalen
632b501f76 Update requirements.txt 2022-01-05 11:47:54 -05:00
Sean Whalen
dcdc210ab6 PEP8 style fixes 2022-01-05 10:59:00 -05:00
Sean Whalen
8259f6866f Update python-tests.yml 2022-01-05 10:46:31 -05:00
Sean Whalen
4f22ab4879 Update python-tests.yml 2022-01-05 10:36:29 -05:00
Sean Whalen
ce2943e0cc Update python-tests.yml 2022-01-05 10:29:58 -05:00
Sean Whalen
c0f82fa363 Update Github Workflows 2022-01-05 10:29:19 -05:00
Sean Whalen
293dfc46b7 PEP8 code style fixes 2022-01-05 10:21:25 -05:00
Sean Whalen
fcf5a3be31 Update python-tests.yml 2022-01-05 09:54:55 -05:00
Sean Whalen
c0e975b1e0 Update python-tests.yml 2022-01-05 09:45:58 -05:00
Sean Whalen
d50259cdc3 Update python-tests.yml 2022-01-05 09:44:14 -05:00
Sean Whalen
8a5242be5a Update python-tests.yml 2022-01-05 09:43:22 -05:00
Sean Whalen
ddb02cd031 Update python-tests.yml 2022-01-05 09:31:26 -05:00
Sean Whalen
273e9906a3 Update python-tests.yml 2022-01-05 09:30:12 -05:00
Sean Whalen
a87b11f726 Fix x-axis dates in Kibana - Closes #246 2022-01-05 08:56:36 -05:00
Sean Whalen
de8f18dcd3 Fix CLI that was broken by the new ip_db_path option 2022-01-03 21:23:35 -05:00
Sean Whalen
061c5a084e Merge pull request #278 from lasat/master
Resolve issue #261 - add documentation for imap password escape
2022-01-03 21:02:36 -05:00
lasat
c921814e18 Update index.rst
add note to imap password to escape '%'
2021-12-23 07:45:39 +01:00
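A small sketch of why the note above is needed, assuming the INI file is read with Python's `configparser` (as the `.ini` format suggests): the default interpolation treats `%` as special, so a literal percent sign in a password must be written as `%%`. The host, user, and password values below are placeholders.

```python
import configparser

# A single "%" in the password would raise InterpolationSyntaxError,
# so the literal percent sign is doubled.
ini_text = """
[imap]
host = imap.example.com
user = dmarc@example.com
password = p%%ssw0rd
"""

config = configparser.ConfigParser()
config.read_string(ini_text)
print(config["imap"]["password"])  # -> p%ssw0rd
```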
Sean Whalen
ca2c18b566 Update python-tests.yml 2021-12-07 12:19:53 -05:00
Sean Whalen
25b63e484c Update python-tests.yml 2021-12-07 12:14:33 -05:00
Sean Whalen
89f43bbe5d Update requirements.txt 2021-12-07 12:12:34 -05:00
Sean Whalen
b54bcc9138 Update setup.py 2021-12-07 12:05:53 -05:00
Sean Whalen
de3ac64583 Update python-tests.yml 2021-12-07 12:02:16 -05:00
Sean Whalen
1b69aa1fef Update setup.py 2021-12-07 11:52:26 -05:00
Sean Whalen
499516585b Migrate from Travis-CI to GitHub Actions 2021-12-07 11:04:27 -05:00
Sean Whalen
76a7a47c53 Update python-tests.yml 2021-12-07 10:55:05 -05:00
Sean Whalen
2748022824 Create python-tests.yml 2021-12-07 10:51:36 -05:00
Sean Whalen
51eea6c08d 7.1.0
- A static copy of the DBIP database is now included for use when a copy of the MaxMind GeoLite2 Country database is not installed (Closes #275)
- Add `ip_db_path` as a parameter and `general` setting for a custom IP geolocation database location (Closes #184)
- Search default Homebrew path when searching for a copy of the MaxMind GeoLite2 Country database (Closes #272)
- Fix log messages written to root logger (PR #276)
- Fix `--offline` option in CLI not being passed as a boolean (PR #265)
- Set Elasticsearch shard replication to `0` (PR #274)
- Add support for syslog output (PR #263 closes #227)
- Do not print the tqdm progress bar when running in a non-interactive TTY (PR #264)
2021-12-07 10:19:41 -05:00
Sean Whalen
16fb7bca2c Merge pull request #264 from mwander/isatty
Print tqdm progress bar only when running from interactive TTY
2021-12-04 11:30:51 -05:00
Sean Whalen
2bcd601d33 Merge pull request #263 from chris-y/syslog
Add Syslog output support
2021-12-04 11:29:47 -05:00
Sean Whalen
405c842737 Merge pull request #274 from White-Knight-IT/master
Set replication to 0 so that single node setups have "healthy" green …
2021-12-04 11:28:55 -05:00
Sean Whalen
404ed401f9 Merge pull request #265 from drxmknight/master
Fix parse offline in cli.py
2021-12-04 11:28:01 -05:00
Sean Whalen
b42a850749 Merge pull request #276 from dbermuehler/master
Fixed log messages written to root logger
2021-12-04 11:19:10 -05:00
Dominik Bermühler
25b96aa4c0 Log-messages were logged with the root logger instead of the parsedmarc logger. 2021-11-26 10:29:12 +01:00
knightian
d2378d4690 I forgot the default setting too 2021-11-07 11:17:02 +00:00
knightian
bdf61f437d I forgot to set ALL of the index replication values 2021-11-07 09:48:36 +00:00
knightian
ca18a89718 Set replication to 0 so that single node setups have "healthy" green index health status 2021-11-07 09:38:21 +00:00
Eduardo Hales
36a1695281 Fix parse offline in cli.py
Fix in cli.py where "offline" option is not being parsed as boolean
2021-08-30 19:05:02 -04:00
Matthäus Wander
d335d594f5 print tqdm progress bar only in interactive tty (as opposed to cronjob) 2021-08-22 10:31:09 +02:00
Matthäus Wander
2ea89c60b9 Merge branch 'domainaware:master' into master 2021-08-22 10:27:07 +02:00
Chris Young
69e29b2dfd Add Syslog output support
This outputs a JSON string containing one record.
Output can be configured in the INI file.
Fixes #227
2021-08-18 15:25:34 +01:00
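A minimal sketch of emitting one JSON record per syslog message with the standard library's `SysLogHandler`, as the commit above describes. The host, port, and record fields are placeholders and do not reflect parsedmarc's exact output schema or configuration keys.

```python
import json
import logging
import logging.handlers

syslog_logger = logging.getLogger("dmarc_syslog")
syslog_logger.setLevel(logging.INFO)
syslog_logger.addHandler(
    logging.handlers.SysLogHandler(address=("localhost", 514))  # values come from the INI file
)

record = {
    "org_name": "example.net",
    "source_ip_address": "192.0.2.1",
    "disposition": "none",
}
# One syslog message containing one JSON-encoded record
syslog_logger.info(json.dumps(record))
```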
Sean Whalen
6d689ca8f5 7.0.1 2021-06-23 15:03:12 -04:00
Sean Whalen
377df6a719 Merge pull request #254 from casperdcl/patch-1
fix startup bug
2021-06-23 14:59:59 -04:00
Casper da Costa-Luis
a10e6592fe fix startup bug
Pretty silly typo means `parsedmarc` completely fails unless `parsedmarc.ini` has `general.aggregate_json_filename` explicitly set
2021-06-23 12:06:29 +01:00
Sean Whalen
4c5a266f19 PEP 8 fixes 2021-06-20 19:07:18 -04:00
Sean Whalen
6d5f8a9ec3 Fix documentation formatting 2021-06-20 15:45:15 -04:00
Sean Whalen
e841a49ca7 Fix documentation formatting 2021-06-20 14:24:49 -04:00
Sean Whalen
3d0f7c8c83 7.0.0
Closes issues #221 #219 #155 #103
2021-06-20 13:10:12 -04:00
Sean Whalen
b8a148f7d4 Merge pull request #252 from White-Knight-IT/master
fix what was broken in merge train
2021-06-19 16:04:19 -04:00
Ubuntu
89816bbc6e fix what was broken in merge train 2021-06-20 03:58:46 +10:00
Sean Whalen
4dfde372c1 Merge pull request #222 from tom-henderson/imap_batch_size
Add option to process messages in batches
2021-06-19 11:41:26 -04:00
Sean Whalen
242522f7ee Merge branch 'master' into imap_batch_size 2021-06-19 11:40:36 -04:00
Sean Whalen
dc7533793a Merge pull request #217 from Olen/issue-211
Handling other IMAP connection errors.
2021-06-19 11:33:01 -04:00
Sean Whalen
d722bbf8f4 Merge pull request #225 from m0rcq/feature/custom/output/files
Feature/custom/output/files
2021-06-19 11:31:45 -04:00
Sean Whalen
0a1e57fd1b Merge pull request #224 from supaeasy/patch-1
Update README.rst
2021-06-19 11:30:04 -04:00
Sean Whalen
d3f1d761f1 Merge pull request #220 from Olen/modify-loglevels
Modify loglevels
2021-06-19 11:28:53 -04:00
Sean Whalen
d2d2000040 Merge pull request #216 from maurofaccenda/master
Pass offline parameter to wait_inbox()
2021-06-19 11:27:42 -04:00
Sean Whalen
0758bc179c Merge pull request #223 from tom-henderson/s3
Allow saving reports to S3
2021-06-19 11:27:15 -04:00
Sean Whalen
f694b6c489 Merge pull request #241 from slv008/master
Fix imap broken connection on large emails
2021-06-19 11:24:44 -04:00
Sean Whalen
8337c19399 Merge pull request #245 from mwander/utctimestamp
Use UTC datetimes for Elastic output
2021-06-19 11:23:21 -04:00
Sean Whalen
5e82b29afd Merge pull request #247 from mwander/splunkdoc
Extend Splunk Setup Guide
2021-06-19 11:22:42 -04:00
Sean Whalen
72e1448f32 Merge pull request #248 from mwander/issue221
Fix: don't crash when handling invalid reports without root node
2021-06-19 11:22:24 -04:00
Matthäus Wander
ca36db5f24 Minor formatting 2021-06-09 14:33:52 +02:00
Matthäus Wander
837ba7ef4d Added splunk installation guide 2021-06-09 14:33:52 +02:00
Matthäus Wander
00cb5bc4e8 Merge branch 'issue221' 2021-06-09 14:31:34 +02:00
Matthäus Wander
ca15ff51bd handle invalid reports gracefully 2021-06-09 14:29:04 +02:00
mwander
d330b25205 Merge remote-tracking branch 'origin/utctimestamp'
fixes deduplication for Elastic
2021-06-07 00:05:32 +02:00
Matthäus Wander
4bc7b0b62c deduplicate over date_begin and date_end instead of date_range 2021-06-07 00:03:39 +02:00
mwander
7e9bbfc805 Revert "convert unixtime into UTC datetime"
This reverts commit 9dab931f44.

9dab93 fixes Elastic output, but breaks Splunk output.
See 4ad693 for a better fix.
2021-06-06 19:04:45 +02:00
wander
4ad693301e Merge remote-tracking branch 'origin/utctimestamp' 2021-06-06 19:01:57 +02:00
Matthäus Wander
0aa7d84d0d Use UTC datetimes for Elastic. Elastic by default expects UTC. 2021-06-06 18:41:23 +02:00
Matthäus Wander
595ff0abb7 Merge branch 'master' of https://github.com/mwander/parsedmarc 2021-06-06 16:58:21 +02:00
Matthäus Wander
9dab931f44 convert unixtime into UTC datetime 2021-06-06 16:57:40 +02:00
Matthäus Wander
bbc379aaca Minor formatting 2021-06-06 16:44:40 +02:00
Matthäus Wander
bd8f13796e Added splunk installation guide 2021-06-06 16:35:07 +02:00
Silvian I
df0e5467ab Merge remote-tracking branch 'origin/master'
# Conflicts:
#	parsedmarc/__init__.py
2021-05-31 20:02:51 +02:00
Silvian I
3615ad3799 Fix server connection timeout while processing large DMARC files 2021-05-31 20:00:53 +02:00
Silvian I
775a6f2181 Fix server connection timeout while processing large DMARC files 2021-05-31 15:40:57 +02:00
Sean Whalen
0d680edd31 Merge pull request #236 from drawks/issue_235
Resolves Issue #235 - Apply index suffix to pre-insert search
2021-05-12 21:35:34 -04:00
Sean Whalen
50b7b5f28e Merge pull request #237 from drawks/iss_233
Resolve issue #233 - don't create imap folders when in test mode
2021-05-12 21:33:09 -04:00
Sean Whalen
190a6a004b Merge pull request #230 from snaptowen/master
Typo correction: allignment -> alignment
2021-05-12 21:30:38 -04:00
Sean Whalen
184425f567 Merge pull request #232 from supaeasy/patch-3
Update Grafana-DMARC_Reports.json
2021-05-12 21:30:20 -04:00
Dave Rawks
1b61156d50 Resolves Issue #235 - Apply index suffix to pre-insert search
* updates `save_forensic_report_to_elasticsearch` and
  `save_aggregate_report_to_elasticsearch` to apply suffix, if
  configured, to pre-insert search
2021-04-20 09:15:54 -07:00
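A rough sketch of the fix described above: the suffix used when writing documents must also be applied to the index name that the pre-insert duplicate search runs against, otherwise duplicates are never detected. The helper function and index names are hypothetical, not parsedmarc's actual code.

```python
from typing import Optional


def indexed_name(base: str, suffix: Optional[str] = None, month: Optional[str] = None) -> str:
    """Build the full index name the same way for searches and for inserts."""
    name = base
    if suffix:
        name = f"{name}_{suffix}"
    if month:
        name = f"{name}-{month}"
    return name


# Using one helper for both the duplicate-check search and the document write
# keeps the two index names consistent.
search_index = indexed_name("dmarc_aggregate", suffix="acme", month="2021-04")
write_index = indexed_name("dmarc_aggregate", suffix="acme", month="2021-04")
assert search_index == write_index
```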
Dave Rawks
55196cb389 Resolve issue #233 - don't create imap folders when in test mode 2021-04-19 10:33:27 -07:00
supaeasy
77331b55c5 Update Grafana-DMARC_Reports.json
- Update to current version of ES, Grafana and Grafana Plugins.
- fix TimeDate Handling for Forensic Reports, was not displayed correctly
- alter handling of stacking in one place (it just looked wrong)
2021-03-21 20:41:14 +01:00
Owen Valentine
609fbdce6f Typo correction: allignment -> alignment 2021-03-03 12:13:11 +02:00
atanas argirov
e51f2b0127 * general cleanup to meet linter rules 2021-02-12 10:50:25 +00:00
atanas argirov
36c592cc5a * added defaults for arg parser 2021-02-11 18:22:29 +00:00
supaeasy
394dddd2df Update README.rst
I struggled too long with this to not let others know.
2021-02-05 15:16:51 +01:00
Tom Henderson
85e7fd4ce6 Fix flake8 errors 2021-02-05 15:58:57 +13:00
Tom Henderson
de05be90df Fix flake8 error 2021-02-05 14:53:43 +13:00
Tom Henderson
9522c9b6e4 Ensure message_limit is not greater than total_messages 2021-02-05 14:51:32 +13:00
Tom Henderson
eba722cddc Fix path example 2021-02-05 14:38:52 +13:00
Tom Henderson
5f6b945839 Save reports to s3 2021-02-05 14:30:54 +13:00
Tom Henderson
a4acd5f232 Add S3Client 2021-02-05 14:30:02 +13:00
Tom Henderson
291d389f69 Add boto3 2021-02-05 14:29:27 +13:00
Tom Henderson
755ee3ded7 Add new settings for s3 2021-02-05 14:28:46 +13:00
Tom Henderson
bafa4861b1 Update docs 2021-02-05 14:27:22 +13:00
Tom Henderson
bc684c8913 Add option to process messages in batches 2021-02-05 13:37:09 +13:00
Ola Thoresen
c853c47087 Ensuring mail from is set 2021-01-22 15:06:35 +01:00
Ola Thoresen
a00cee8ba4 Adding a log line to see the sender of a report when it is parsed 2021-01-22 10:38:04 +01:00
Ola Thoresen
76614bdc94 Fixing flake-error 2021-01-21 08:34:56 +01:00
Ola Thoresen
0e2636225e Modifying some log-levels to INFO 2021-01-21 08:24:44 +01:00
Ola Thoresen
be8395dbe3 Detecting other IMAP-errors. Adding short sleep to avoid hammering the IMAP-server on error 2021-01-20 19:56:15 +01:00
Mauro Faccenda
478452de20 pass offline parameter to wait_inbox() 2021-01-20 15:53:19 +01:00
Sean Whalen
b43a622f9e Merge pull request #214 from maurofaccenda/patch-1
Fixes issue #213
2021-01-20 08:54:32 -05:00
Sean Whalen
8feffcb1ac Merge pull request #212 from Olen/log-level
Add verbose logging level
2021-01-20 08:51:49 -05:00
Sean Whalen
0f8d5477a6 Merge pull request #210 from AnaelMobilia/patch-1
Update categorie name in Kibana
2021-01-20 08:50:48 -05:00
Sean Whalen
7d7a197ff3 Merge pull request #207 from MarcelWaldvogel/fix-map-legend
Fix map thresholds and colors
2021-01-20 08:49:45 -05:00
Mauro Faccenda
4d40f59491 Fixes issue #213 2021-01-20 12:31:09 +01:00
Ola Thoresen
72b0a1b053 Add verbose logging level 2021-01-20 07:52:28 +01:00
Anael Mobilia
08457b66fd Update categorie name in Kibana 2021-01-18 16:51:12 +01:00
atanas argirov
83e229aeb1 * added output_{json,csv}_{aggregate,forensic}_file command line args
* refactored save_output() to support output_*
2020-12-28 15:57:32 +00:00
Marcel Waldvogel
49d09a51ba Fix map thresholds and colors
Inverted ranges are not supported; fixes #206
2020-12-28 16:37:38 +01:00
Sean Whalen
3f1e25e315 Drop Python 3.5 support 2020-11-25 09:36:39 -05:00
Sean Whalen
ddb007af13 Update index.rst 2020-11-24 22:03:51 -05:00
Sean Whalen
529fe27a97 PEP 8 fixes 2020-11-24 22:02:55 -05:00
Sean Whalen
e5720422f6 Update .travis.yml 2020-11-24 21:54:42 -05:00
Sean Whalen
4c3fb65af1 6.12.0 2020-11-24 21:42:45 -05:00
Sean Whalen
dbfed2e309 Update elastic.py 2020-11-24 21:15:31 -05:00
Sean Whalen
f0612203f0 Merge pull request #191 from White-Knight-IT/master
Allow Basic Auth with Elasticsearch
2020-11-20 11:54:45 -05:00
Sean Whalen
226afee12d Merge pull request #190 from Sikorsky78/patch-1
Update utils.py
2020-11-20 11:46:47 -05:00
Sean Whalen
809d533ce0 Merge pull request #199 from arne1101/master
Limit filename length to 100 characters
2020-11-20 11:45:52 -05:00
Arne Allisat
87092cf4ca Reduce number of characters in filename from 255 to 100
Reduce number of characters in filename from 255 to 100
2020-11-06 14:16:09 +01:00
Arne Allisat
06e42791c4 Limit filename length to 255 characters
https://github.com/domainaware/parsedmarc/issues/197

Since some operating systems limit filename length, the filename length should be limited. I propose limiting it to 255 characters.
2020-11-06 09:21:17 +01:00
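A minimal sketch of the limiting approach discussed in the two commits above: truncate overly long report filenames while preserving the extension, so they stay within common OS limits. The helper name and the default of 100 characters (the value the later commit settled on) are illustrative.

```python
import os


def truncate_filename(filename: str, max_length: int = 100) -> str:
    """Shorten a report filename so it stays under common OS limits,
    preserving the file extension."""
    if len(filename) <= max_length:
        return filename
    stem, ext = os.path.splitext(filename)
    return stem[: max_length - len(ext)] + ext


long_name = "dmarc_aggregate_" + "x" * 300 + ".xml"
print(len(truncate_filename(long_name)))  # -> 100
```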
Ubuntu
f18322c16c I can spell.... sometimes 2020-10-20 01:31:30 +11:00
Ubuntu
07f8a30f08 tabs and spaces :( 2020-10-20 01:30:08 +11:00
Ubuntu
de02edc0a9 take 3 2020-10-20 01:27:45 +11:00
Ubuntu
a36dc21c07 Try 2 2020-10-20 01:23:18 +11:00
Ubuntu
3312387852 Take 1 trying to add elastic basic auth 2020-10-20 01:14:43 +11:00
Ubuntu
bd7819881d delete test file 2020-10-20 00:52:01 +11:00
Ubuntu
dedbd518e8 test 2020-10-20 00:50:06 +11:00
Sikorsky78
c1681dc48b Update utils.py
Fixed broken system_paths for Windows
2020-10-12 12:47:15 +02:00
Sean Whalen
82b1a7e292 6.11.0 2020-08-31 16:31:53 -04:00
Sean Whalen
93189945b3 6.11.0 2020-08-31 16:22:35 -04:00
Sean Whalen
1f557f9e41 Merge pull request #182 from bhozar/master
Stop double count of email messages displaying. Include images of dashboard.
2020-08-31 16:03:39 -04:00
Sean Whalen
9d920e0bd8 Merge pull request #183 from ericwbentley/master
Add begin and end date fields
2020-08-31 16:02:41 -04:00
Sean Whalen
e1059b234e Merge pull request #186 from cvandeplas/fix/imaptimeout
fix: [imap] fixes #163 crash on imap timeout
2020-08-31 16:02:27 -04:00
Christophe Vandeplas
73b8866b29 fix: [imap] fixes #163 crash on imap timeout 2020-08-26 11:15:57 +02:00
ericwbentley
20f9890008 Update elastic.py 2020-08-06 18:43:49 -07:00
Bhozar
d4905968f6 Add files via upload
Included demo images. Fix for issue https://github.com/domainaware/parsedmarc/issues/162
2020-08-04 18:17:14 +01:00
Sean Whalen
382e33f687 Use elasticsearch version provided by elasticsearch-dsl 2020-06-10 18:05:23 -04:00
Sean Whalen
fd0896ac40 Merge pull request #165 from cvandeplas/chg-dashboard-aggr
chg: [splunk] adding source_base_domain as filter
2020-06-10 17:47:44 -04:00
Sean Whalen
bd82966d1c Merge pull request #170 from danhstevens/master
Fixing forensic parse failure on valid forensic report
2020-06-10 17:47:06 -04:00
Dan Stevens
c9355d7c94 Fixing forensic parse failure on valid forensic report 2020-06-10 16:10:25 -05:00
Christophe Vandeplas
e583728d4b chg: [splunk] adding source_base_domain as filter
this is quite useful when pivoting on data
2020-05-11 16:53:54 +02:00
Sean Whalen
4fca674064 6.10.0
- Ignore unknown forensic report fields when generating CSVs (Closes issue #148)
- Fix crash on IMAP timeout (PR #164 - closes issue #163)
- Use SMTP port from the config file when sending emails (PR #151)
- Add support for Elasticsearch 7.0 (PR #161 - closes issue #149)
- Remove temporary workaround for DMARC aggregate report records missing an SPF domain field
2020-05-10 17:44:14 -04:00
Sean Whalen
8431207920 Merge pull request #164 from cvandeplas/fix-imap
fix: [imap] fixes #163 crash on imap timeout
2020-05-10 16:27:55 -04:00
Sean Whalen
8bfaa3951b Merge pull request #161 from Kuzuto/master
Support for Elasticsearch 7.x
2020-05-10 16:26:24 -04:00
Christophe Vandeplas
585f87e46e fix: [imap] fixes #163 crash on imap timeout 2020-05-08 12:57:54 +02:00
Lennart Friberg
a89d41acd0 Update setup.py
Changed elasticsearch-py to support Elasticsearch > 7.0
For Elasticsearch 7.0 and later, use the major version 7 (7.x.y) of the library.
2020-05-06 21:54:08 +02:00
Lennart Friberg
f0169a451a Update requirements.txt
Changed Elasticsearch Python to support Elasticsearch >= 7
For Elasticsearch 7.0 and later, use the major version 7 (7.x.y) of the library.
2020-05-06 21:52:39 +02:00
Sean Whalen
3a6a84dbec Merge pull request #160 from cvandeplas/fix-splunkxml
fix: [splunk] correct default index for splunk dashboard
2020-05-06 10:37:54 -04:00
Christophe Vandeplas
b01b8d9374 fix: [splunk] correct default index for splunk dashboard 2020-05-06 08:15:15 +02:00
Sean Whalen
e940268e16 Merge pull request #157 from sander85/master
Fix for copy-paste error
2020-04-08 10:09:54 -04:00
Sander Lepik
ebda496178 Fix for copy-paste error
Signed-off-by: Sander Lepik <sander.lepik@cooppank.ee>
2020-04-07 09:47:29 +03:00
Sean Whalen
74de4fecf9 Kibana dashboard fixes 2020-03-21 21:06:16 -04:00
Sean Whalen
3a11ed3ac3 Merge pull request #151 from Yabk/master
Use SMTP port from config when sending the report
2020-03-11 12:01:47 -04:00
Yabk
6a9e3f0f5d Use SMTP port from config when sending the report 2020-03-11 13:10:35 +01:00
Sean Whalen
d0bb858e74 6.9.0 2020-02-17 16:24:11 -05:00
Sean Whalen
0ae15ed90c Merge pull request #145 from bhozar/master
Updated visuals. Corrected some of the guide.
2020-02-12 13:47:10 -05:00
Bhozar
7cfa8c20bb Updated visuals. Corrected some of the guide. 2020-02-11 23:28:12 +00:00
Sean Whalen
97db183031 6.8.2 2020-01-24 12:21:34 -05:00
Sean Whalen
95477bb818 Update UIs 2020-01-23 13:58:25 -05:00
Sean Whalen
c50bdf8d7e Actually fix the build 2020-01-22 12:47:13 -05:00
Sean Whalen
4edd86ce73 Fix CI build 2020-01-22 11:59:35 -05:00
Sean Whalen
ac25262385 6.8.1 2020-01-22 11:50:49 -05:00
Sean Whalen
52eaa32c3b Fix more documentation typos 2020-01-14 17:39:31 -05:00
Sean Whalen
c1a78264d2 Fix documentation typo 2020-01-14 17:33:30 -05:00
Sean Whalen
335ee39d6b Update geoipupdate documentation 2020-01-14 17:29:58 -05:00
Sean Whalen
88304bbf67 Better geoipupdate documentation 2020-01-14 17:19:49 -05:00
Sean Whalen
e30ccf2e44 Update index.rst 2020-01-14 14:04:01 -05:00
Sean Whalen
abcb739e67 Fix documentation typos 2020-01-14 13:58:07 -05:00
Sean Whalen
3eccfb1bc1 Update .travis.yml 2020-01-14 13:39:39 -05:00
Sean Whalen
a997496e75 Actual 6.8.0 release 2020-01-14 13:10:57 -05:00
Sean Whalen
8ca62a9860 6.8.0 release 2020-01-14 12:42:45 -05:00
Sean Whalen
1271b26fd5 6.8.0
closes issues #137 and #139
2020-01-14 11:32:40 -05:00
Sean Whalen
de465aa84d Update geoipupdate documentation 2020-01-14 10:51:44 -05:00
Sean Whalen
20ac81343b 6.7.4 - Update dependencies
Fixes issue #134
2019-12-23 15:30:54 -05:00
Sean Whalen
c67c991ee2 PEP 8 fixes 2019-12-17 07:01:23 -05:00
Sean Whalen
d82f211946 6.7.3
Make `dkim_aligned` and `spf_aligned` case insensitive (PR #132)
2019-12-17 06:53:52 -05:00
Sean Whalen
097a847f49 Merge pull request #132 from aharpour/master
Making dkim_aligned and spf_aligned case insensitive.
2019-12-17 06:28:34 -05:00
Ebrahim Aharpour
4c57181e21 Making dkim_aligned and spf_aligned case insensitive. 2019-12-16 21:32:00 +01:00
Sean Whalen
4673ebb1c4 Update link for exported kibana objects 2019-12-05 19:50:44 -05:00
Sean Whalen
793cf3588d Merge branch 'master' of https://github.com/domainaware/parsedmarc 2019-11-25 11:07:38 -05:00
Sean Whalen
d3f25c9447 Fix missing data in CSV output
Closes issue #128
2019-11-25 11:07:31 -05:00
Sean Whalen
54cdd2cf51 Update Grafana-DMARC_Reports.json 2019-11-19 10:12:43 -05:00
Sean Whalen
747d22358f 6.7.1
- Parse forensic email samples with non-standard date headers
- Graceful handling of a failure to download the GeoIP database (issue #123)
2019-11-12 11:12:50 -05:00
Sean Whalen
180d18ada3 Update README.rst 2019-11-06 13:33:57 -05:00
Sean Whalen
b81aba4a58 Fix conversion of reports to CSV 2019-11-06 12:55:29 -05:00
Sean Whalen
3721b25a04 Remove python 3.4 support 2019-11-06 12:47:13 -05:00
Sean Whalen
757a28b56d Keep error message when recovering from bad XML 2019-11-06 12:45:42 -05:00
Sean Whalen
9e50a1db57 Fix CI build 2019-11-06 12:41:57 -05:00
Sean Whalen
193e0fd98c Merge pull request #122 from michaeldavie/handle_bad_data
Handle invalid aggregate reports
2019-11-06 12:40:05 -05:00
Sean Whalen
eefc74c576 Fix PEP 8 line length 2019-11-06 12:24:09 -05:00
Sean Whalen
a95cfa4efd CSV output fixes
Make CSV output match JSON output. Fixes issue #115
2019-11-06 12:12:06 -05:00
Sean Whalen
e054bc7cbe Update export.ndjson
Fix typos
2019-11-06 11:26:41 -05:00
Sean Whalen
d12c1baa75 Merge pull request #119 from ruffy91/patch-1
Fix typos
2019-11-06 11:15:35 -05:00
michaeldavie
d4ec6dee65 Handle invalid XML 2019-11-05 22:13:06 -05:00
michaeldavie
13a2624850 Handle invalid bytes 2019-11-05 21:15:06 -05:00
Sean Whalen
9bc23a20fa Merge pull request #120 from chinskiy/python38
add python 3.8 to CI
2019-11-01 09:10:29 -04:00
chinskiy
2b789c869a add python 3.8 to CI 2019-10-31 13:08:25 +02:00
Fabian
28608e1573 Update elastic.py
fix typo
2019-10-27 16:04:54 +01:00
Fabian
e9a507bf99 Update kibana_saved_objects.json
Fixed typo
2019-10-27 15:59:10 +01:00
Sean Whalen
4685d0a750 Fix CentOS documentation as described in issue #114 2019-10-25 09:28:02 -04:00
Sean Whalen
6fd80ebdee Update dmarc_forensic_dashboard.xml
Closes issue #117
2019-10-23 10:06:04 -04:00
Sean Whalen
7d7a3e0919 6.6.1 - Close files after reading them 2019-09-23 00:53:36 -04:00
Sean Whalen
ff5bb1e03e Remove unused import 2019-09-23 00:27:43 -04:00
Sean Whalen
8a45628f25 Update setup.py 2019-09-23 00:24:04 -04:00
Sean Whalen
189acd8779 Update index.rst 2019-09-23 00:19:06 -04:00
Sean Whalen
c991feb860 Auto detect mbox files and add IMAP timeouts 2019-09-23 00:12:51 -04:00
Sean Whalen
3fef3b58a8 Create export.ndjson 2019-09-22 15:04:20 -04:00
Sean Whalen
bf6bea1456 Add support for detecting mbox files 2019-09-22 12:41:07 -04:00
Sean Whalen
1f196486d9 Update CHANGELOG.md 2019-09-22 00:22:06 -04:00
Sean Whalen
a1303a2168 Start work on 6.6.0 2019-09-22 00:20:21 -04:00
Sean Whalen
8a41505c4e Merge pull request #106 from ardovm/mbox
Allow parsing reports archived in mbox files
2019-09-21 14:28:48 -04:00
Sean Whalen
61e282d6ea 6.5.5 2019-09-13 08:24:48 -04:00
Sean Whalen
6ce06f7f10 Update requirements.txt 2019-09-12 20:19:43 -04:00
Sean Whalen
6efec4e633 6.5.4
Bump required `mailsuite` version to `1.2.1`
2019-08-12 15:07:06 -04:00
Arrigo Marchiori
537651836b obey style guide and make build.sh more readable
The script build.sh should be more readable now, and a little less
error-prone as it creates the parsedmarc-docs directory if it is missing
2019-08-03 11:15:21 +02:00
Arrigo Marchiori
623eeddc8e allow indicating mailboxes with the 'mbox:' prefix 2019-08-02 18:01:19 +02:00
Sean Whalen
cb63c55b94 6.5.3 2019-07-31 11:08:01 -04:00
Sean Whalen
7f99759d30 6.5.2 2019-07-30 11:41:42 -04:00
Sean Whalen
af9da9bc6b Merge pull request #100 from michaeldavie/csv-bug-fix
Correct bug in 6.5.1
2019-07-25 07:15:13 -04:00
michaeldavie
ef0323ceb4 Copy report_dict in aggregate report CSV row parsing 2019-07-24 21:44:59 -04:00
Sean Whalen
c98416950b Update CHANGELOG.md 2019-07-18 23:03:58 -04:00
Sean Whalen
8471d20d4b Fix the changelog 2019-07-18 13:18:59 -04:00
Sean Whalen
e340232bd4 Update CHANGELOG.md 2019-07-18 13:17:40 -04:00
Sean Whalen
a58975fc1c Bump version to 6.5.1 2019-07-18 13:16:39 -04:00
Sean Whalen
2d5f613870 Merge pull request #98 from michaeldavie/list-of-dicts
Expose CSV rows as a list of dicts
2019-07-17 22:08:51 -04:00
michaeldavie
09b4607ba5 Expose CSV rows as a list of dicts 2019-07-17 21:31:07 -04:00
Sean Whalen
ff6b8b2daf Update CHANGELOG.md 2019-07-17 11:39:20 -04:00
Sean Whalen
4b56c516b1 Update CHANGELOG.md 2019-07-17 11:34:44 -04:00
Sean Whalen
2a25ce5b62 6.5 release 2019-07-17 11:00:33 -04:00
Sean Whalen
5a6b51c59f Merge pull request #97 from domainaware/6.5
6.5
2019-07-17 10:40:28 -04:00
Sean Whalen
766786344d 6.5 release 2019-07-17 10:39:39 -04:00
Sean Whalen
8b3fc00f13 Document offline option 2019-07-17 09:03:49 -04:00
Sean Whalen
0036bbf14e Add support for offline report processing
Issue #90
2019-07-17 07:53:16 -04:00
Sean Whalen
ca0e992c90 Make args match 2019-07-17 06:46:23 -04:00
Sean Whalen
615c10c0c6 Start work on offline mode 2019-07-16 23:47:41 -04:00
Sean Whalen
73675b17b9 More message parsing fixes 2019-07-16 22:36:28 -04:00
Sean Whalen
43eba0738b Fix email parsing 2019-07-16 22:26:12 -04:00
Sean Whalen
364f292a15 Prep for release 2019-07-16 21:59:44 -04:00
Sean Whalen
38c8d14c3e Fix mail moving logic 2019-07-15 19:50:02 -04:00
Sean Whalen
f5b64151eb Use UDP for DNS 2019-07-15 19:31:31 -04:00
Sean Whalen
90e92809e3 Bump version 2019-07-15 19:15:31 -04:00
Sean Whalen
c455143c21 Start refactoring with mailsuite 2019-07-11 22:16:35 -04:00
Sean Whalen
5294d7140c Update build.sh 2019-07-09 13:33:43 -04:00
Sean Whalen
2351590c4d 6.4.2
Closes issue #94
2019-07-02 10:41:40 -04:00
Sean Whalen
11cf6f8ba6 Revert "Add IMAP message download timeout"
This reverts commit 6a27f41de1.
2019-06-30 19:56:11 -04:00
Sean Whalen
6a27f41de1 Add IMAP message download timeout 2019-06-30 19:39:57 -04:00
Sean Whalen
15b444141f 6.4.1
Raise utils.DownloadError exception when a GeoIP database or Public Suffix List (PSL) download fails (closes issue #73)
2019-05-19 13:21:06 -04:00
Sean Whalen
ffdeb8cfd3 Update CHANGELOG.md 2019-05-08 15:00:48 -04:00
Sean Whalen
1be7e3ff4c Update index.rst 2019-05-08 14:51:48 -04:00
Sean Whalen
1c9a6c4e85 6.4.0
Add ``number_of_shards`` and ``number_of_replicas`` as possible options
in the ``elasticsearch`` configuration file section (see issue #78)
2019-05-08 14:46:24 -04:00
Sean Whalen
32cfede9ac 6.3.7
Work around some unexpected IMAP responses reported in issue #75
2019-05-02 22:08:16 -04:00
Sean Whalen
4722aadfba Update requirements.txt 2019-05-02 09:36:14 -04:00
Sean Whalen
34e428f1cf Workaround unexpected IMAP response reported in issue #75 2019-05-01 08:46:25 -04:00
Sean Whalen
20ff722f30 6.3.6 2019-04-30 10:09:53 -04:00
Sean Whalen
0d609c4ff2 Fix debug logging 2019-04-30 10:04:10 -04:00
Sean Whalen
1ad994c717 IMAP fixes 2019-04-30 09:43:30 -04:00
Sean Whalen
ecc9fd434c Update documentation 2019-04-29 18:01:11 -04:00
Sean Whalen
714697720b Remove duplicated line 2019-04-29 17:07:08 -04:00
Sean Whalen
cf62534c5b 6.3.5 - Normalize Delivery-Result value in forensic/failure reports (issue #76) 2019-04-29 17:02:05 -04:00
Sean Whalen
3b366a24e4 Use dark theme in Splunk dashboards 2019-04-24 15:41:40 -04:00
Sean Whalen
0638650550 6.3.4 - Fix Elasticsearch index creation (closes issue #74) 2019-04-23 12:51:35 -04:00
Sean Whalen
7f85b1b346 Fix index creation 2019-04-23 12:38:25 -04:00
Sean Whalen
a2998f3968 Fix elasticsearch index creation 2019-04-23 12:24:56 -04:00
Sean Whalen
6b3a51a3f0 Update .travis.yml 2019-04-22 20:36:31 -04:00
Sean Whalen
76749f0b5f Update .travis.yml 2019-04-22 20:34:19 -04:00
Sean Whalen
af3eb1bd40 Update .travis.yml 2019-04-22 20:31:27 -04:00
Sean Whalen
5f49998e05 Update ci.ini 2019-04-22 20:26:35 -04:00
Sean Whalen
5312f4082a Update .travis.yml 2019-04-22 20:19:30 -04:00
Sean Whalen
cced69e31d Update .travis.yml 2019-04-22 20:12:08 -04:00
Sean Whalen
873985251c Update .travis.yml 2019-04-22 20:09:20 -04:00
Sean Whalen
ea6ed8f19c Update .travis.yml 2019-04-22 20:03:22 -04:00
Sean Whalen
0feaec93dd 6.3.3 - Fix dependency conflict 2019-04-22 20:01:19 -04:00
Sean Whalen
c9bb7a7af0 Update .travis.yml 2019-04-22 19:58:04 -04:00
Sean Whalen
8612a5d1b3 6.3.3 - Fix dependency order 2019-04-22 19:54:46 -04:00
Sean Whalen
775f80c02c 6.3.3
Set `number_of_shards` and `number_of_replicas` to `1` when creating indexes
2019-04-22 19:48:53 -04:00
Sean Whalen
536b94ff90 Update .travis.yml 2019-04-11 16:33:43 -04:00
Sean Whalen
0ee60f46ac Update .travis.yml 2019-04-11 16:29:00 -04:00
Sean Whalen
2696162a49 Fix deps 2019-04-11 16:22:44 -04:00
Sean Whalen
9d680a20d6 Update .travis.yml 2019-04-11 16:07:52 -04:00
Sean Whalen
ce7655ec8f Fix deps 2019-04-11 16:01:44 -04:00
Sean Whalen
60cefa8066 Fix CI build 2019-04-11 15:54:00 -04:00
Sean Whalen
04d1f5e7c9 Use Elasticsearch 7.0 2019-04-11 15:50:38 -04:00
Sean Whalen
d4e8974853 6.3.2 - Fix the monthly_indexes option in the elasticsearch configuration section 2019-04-11 15:46:33 -04:00
Sean Whalen
06371dfe9b Expand test coverage 2019-04-04 20:15:55 -04:00
Sean Whalen
af36df2f48 Fix codecov badge 2019-04-04 11:17:01 -04:00
Sean Whalen
4fce44bfa4 Update .travis.yml 2019-04-04 11:04:59 -04:00
Sean Whalen
18e714aedf Update .travis.yml 2019-04-04 11:01:20 -04:00
Sean Whalen
38b7299db7 Fix CI 2019-04-04 10:55:41 -04:00
Sean Whalen
1389e4df8d Add codecov to CI 2019-04-04 10:49:50 -04:00
Sean Whalen
55a7e9c69b 6.3.1 2019-03-29 17:13:22 -04:00
Sean Whalen
065aba7f6f 6.3.1 - Fix strip_attachment_payloads option 2019-03-29 17:10:09 -04:00
Sean Whalen
d1c483d337 Update ci.ini 2019-03-29 17:01:10 -04:00
Sean Whalen
2119382054 6.3.0 release - Fix issue #69 2019-03-29 16:37:37 -04:00
Sean Whalen
84dbf4d475 Update CHANGELOG.md 2019-03-29 12:53:28 -04:00
Sean Whalen
086a9b1fbf 6.3.0 2019-03-29 12:46:24 -04:00
Sean Whalen
a7814d1bf7 6.3.0 2019-03-29 12:44:35 -04:00
Sean Whalen
2f44d8fe3d Attempt to fix issue #67 2019-03-28 10:12:14 -04:00
Sean Whalen
e3bbb4e008 6.2.2
Fix crash when trying to save forensic reports with missing fields to Elasticsearch
2019-03-19 11:36:06 -04:00
Sean Whalen
bd2b08027f 6.2.1 - Add missing tqdm dependency to setup.py 2019-02-25 11:00:48 -05:00
Sean Whalen
5ad5107aec Update README.rst 2019-02-25 10:45:14 -05:00
Sean Whalen
76a8f61d40 Update mailing list documentation 2019-02-25 08:15:45 -05:00
Sean Whalen
63c5656354 Update index.rst 2019-02-24 23:04:38 -05:00
Sean Whalen
be5d01ff57 Update index.rst 2019-02-24 23:03:21 -05:00
Sean Whalen
11c76a42d6 Update index.rst 2019-02-24 23:02:57 -05:00
Sean Whalen
86527a5555 Update index.rst 2019-02-24 23:01:26 -05:00
Sean Whalen
a732f7123a Update index.rst 2019-02-24 23:00:12 -05:00
Sean Whalen
0dfe978b3d Update index.rst 2019-02-24 22:58:44 -05:00
Sean Whalen
29d3714721 Update index.rst 2019-02-24 22:55:55 -05:00
Sean Whalen
042e2cfafb Update index.rst 2019-02-24 22:53:27 -05:00
Sean Whalen
342acd94b2 Update index.rst 2019-02-24 22:51:18 -05:00
Sean Whalen
70a6cdc581 Update README.rst 2019-02-24 22:46:07 -05:00
Sean Whalen
47f32fb189 6.2.0 2019-02-24 22:42:43 -05:00
Sean Whalen
cc6f4bb680 Update Mailman 3 guide 2019-02-24 22:35:08 -05:00
Sean Whalen
4813f7bc87 6.2.0 2019-02-24 14:32:41 -05:00
Sean Whalen
e3f4291ff1 Add note about URL Encoding 2019-02-24 11:52:46 -05:00
Sean Whalen
d8bf5f950a Update index.rst 2019-02-23 16:52:22 -05:00
Sean Whalen
445435cf99 Update mailing list documentation 2019-02-21 17:33:41 -05:00
Sean Whalen
2b01e97c8e Merge pull request #62 from zscholl/master
Add parallelized processing option for parsing reports with the CLI
2019-02-20 18:25:50 -05:00
zscholl
182cc251fc fix another pep8. fix default param issue 2019-02-20 15:14:50 -07:00
zscholl
8fc856d0e3 change n_cpus to n_procs. fix PEP8 issues. remove debugging statements. 2019-02-20 11:25:46 -07:00
Sean Whalen
75dadb31bf Merge pull request #63 from syska/patch-1
Fixed command name for apt-get
2019-02-20 05:26:09 -05:00
Mikael Syska
b99187b423 Fixed command name for apt-get 2019-02-20 09:38:45 +01:00
zscholl
ad6860817f Merge branch 'master' of https://github.com/domainaware/parsedmarc 2019-02-19 15:15:25 -07:00
Sean Whalen
dc5dd1dc54 Logging improvements 2019-02-16 22:04:05 -05:00
Sean Whalen
47598d9de9 6.1.8 - Actually fix GeoIP lookups 2019-02-16 15:37:14 -05:00
Sean Whalen
7681f30295 6.1.7 - Fix GeoIP lookups 2019-02-16 15:31:27 -05:00
Sean Whalen
d29ae43dd7 Fix CI builds 2019-02-16 14:08:58 -05:00
Sean Whalen
de8b4f936c 6.1.6 - Better GeoIP error handling 2019-02-16 13:50:39 -05:00
Sean Whalen
39d71968f1 Update index.rst 2019-02-16 13:36:56 -05:00
Sean Whalen
97b581f404 Update index.rst 2019-02-16 13:35:11 -05:00
Sean Whalen
89ce95e2cd Update index.rst 2019-02-16 13:33:33 -05:00
Sean Whalen
a16b5c5627 6.1.5 2019-02-16 13:29:07 -05:00
Sean Whalen
0e5247d79f 6.1.4 - Actually package requirements 2019-02-16 11:12:17 -05:00
Sean Whalen
ecdff4d339 6.1.3 Fix package requirements 2019-02-16 11:06:16 -05:00
Sean Whalen
559b5dff07 Update pypy3 download 2019-02-16 10:51:40 -05:00
Sean Whalen
6ac5305db5 6.1.2 Release 2019-02-15 20:10:52 -05:00
Sean Whalen
264ed68b14 Release 6.1.2 2019-02-15 20:09:50 -05:00
Sean Whalen
3a5d97e8f7 Fix argument name for send_email() (closes issue #60) 2019-02-15 19:35:52 -05:00
zscholl
304074ade5 fix requirements.txt 2019-02-15 17:34:26 -07:00
zscholl
6f7a333623 fix typo 2019-02-15 17:20:51 -07:00
zscholl
884b3759e7 updated requirements 2019-02-15 17:19:31 -07:00
Sean Whalen
81ad0b85c2 Merge pull request #61 from arnydo/docs/davmailservice
Docs/davmailservice
2019-02-15 19:18:30 -05:00
zscholl
41823cbb00 Merge branch 'master' of https://github.com/domainaware/parsedmarc 2019-02-15 17:17:58 -07:00
zscholl
baa544217f updated README 2019-02-15 17:17:20 -07:00
Sean Whalen
2e2e47b202 6.1.2 - Use local Public Suffix List file instead of downloading it 2019-02-15 19:14:08 -05:00
zscholl
8f702b9bc2 added parallelization option to cli parsing 2019-02-15 17:05:15 -07:00
arnydo
c2b5ec9fbd Update index.rst 2019-02-15 16:20:04 -05:00
arnydo
b3d2efe0b0 Add doc on running davmail as systemd service 2019-02-15 16:19:12 -05:00
arnydo
a27b5a4291 Merge pull request #1 from domainaware/master
pull in latest changes
2019-02-15 15:18:10 -05:00
arnydo
9bed81ad07 Correct typo in docs 2019-02-15 15:10:45 -05:00
Sean Whalen
2ae500ba9c 6.1.1 release 2019-02-15 14:19:52 -05:00
Sean Whalen
f722907a9a PEP 8 fix 2019-02-15 09:29:51 -05:00
Sean Whalen
2eceac3100 6.1.1 2019-02-15 08:57:51 -05:00
Sean Whalen
fdf8ea292f Fix link 2019-02-13 19:50:42 -05:00
Sean Whalen
22230af4d2 Add a link to the Davmail FAQ
Includes Modern Auth/multi-factor authentication instructions
2019-02-13 19:44:08 -05:00
Sean Whalen
71362b8d69 Update example davmail config
Make settings clearer

See also: Issue #57
2019-02-13 19:30:42 -05:00
Sean Whalen
53510c1f78 Update example davmail config
See also: Issue #57
2019-02-13 19:15:52 -05:00
Sean Whalen
d68294c58a Fix docs typo 2019-02-13 15:04:47 -05:00
Sean Whalen
93ecf4a262 Fix docs typo 2019-02-13 14:23:16 -05:00
Sean Whalen
e12c5637e1 Update index.rst 2019-02-13 13:53:32 -05:00
Sean Whalen
f5d645cebd Fix table formatting 2019-02-13 13:50:52 -05:00
Sean Whalen
b70d47e1b2 Update mailing list documentation 2019-02-13 13:42:32 -05:00
Sean Whalen
603dd65da5 Update CHANGELOG.md 2019-02-13 10:34:55 -05:00
Sean Whalen
e588845f23 6.1.0 - Fix aggregate report email parsing regression
Fix aggregate report email parsing regression in 6.0.3 (closes issues #57 and #58)
2019-02-13 10:30:25 -05:00
Sean Whalen
6d047befcb fix bad debug output #57 2019-02-12 15:29:46 -05:00
Sean Whalen
a06db17a52 Fix Davmail support 2019-02-12 15:10:07 -05:00
Sean Whalen
007225302e Add debugging info for #57 2019-02-12 14:54:46 -05:00
Sean Whalen
34594ca514 6.0.3 release 2019-02-12 11:59:37 -05:00
Sean Whalen
b56c958146 Update CHANGELOG.md 2019-02-12 11:53:55 -05:00
Sean Whalen
444deeab7c 6.0.3 - Fix email parser (closes issue #53)
Don't assume the report is the last part of the email message
2019-02-12 11:51:32 -05:00
Sean Whalen
bca34f3891 Remove outdated documentation 2019-02-10 13:45:16 -05:00
Sean Whalen
b61717d184 Fix docs 2019-02-10 13:12:34 -05:00
Sean Whalen
7b7ac245b0 Release 6.0.2 and update documentation 2019-02-10 12:57:50 -05:00
Sean Whalen
8ed6c7840d Use temp directory for temp files (fixes issue #54) 2019-02-10 07:29:52 -05:00
Sean Whalen
21b6ccb427 Attempt to fix issue #53 without breaking anything else 2019-02-08 09:03:16 -05:00
Sean Whalen
9329feec1f Add index patterns that were accidentally removed from kibana_saved_objects.json
Issue #52
2019-02-07 10:59:33 -05:00
Sean Whalen
2aa4b6aac4 6.0.1 - Fix Elasticsearch output
PR #50 - andrewmcgilvray
2019-02-06 06:43:55 -05:00
Sean Whalen
59acc5238d Merge pull request #50 from andrewmcgilvray/andrewmcgilvray-patch-1
fixes elasticsearch output
2019-02-06 06:37:01 -05:00
Andrew McGilvray
9b5e3db91f fixes elasticsearch output
There were many problems created by what looks like a bad search and replace.
2019-02-06 14:56:17 +10:00
Sean Whalen
af79a1f286 Update index.rst 2019-02-05 12:27:21 -05:00
Sean Whalen
74dacf48fb Add instructions for securing the config file 2019-02-05 11:51:58 -05:00
Sean Whalen
bd76344baa Small doc fixes 2019-02-05 11:31:49 -05:00
Sean Whalen
84522a2fa0 Update documentation for 6.0.0 release 2019-02-05 11:25:45 -05:00
Sean Whalen
9a3971ca50 Update README.rst 2019-02-05 00:31:53 -05:00
Sean Whalen
43791dd64a Remove outdated docs 2019-02-05 00:28:15 -05:00
Sean Whalen
d0d386e7ad Update documentation 2019-02-04 23:59:06 -05:00
Sean Whalen
975365413d Start to update docs 2019-02-04 23:03:13 -05:00
Sean Whalen
c796b0be6c Actually fix CI 2019-02-04 17:32:07 -05:00
Sean Whalen
6d399ef931 Fix CI 2019-02-04 17:20:07 -05:00
Sean Whalen
528cfb2822 6.0.0
Move CLI options to a config file
2019-02-04 17:03:33 -05:00
Sean Whalen
5c0de87d4e Update dmarc-summary-charts.png 2019-02-01 11:50:57 -05:00
Sean Whalen
61f33f0017 Update dmarc-summary-charts.png 2019-02-01 11:26:59 -05:00
Sean Whalen
eacd30688e Update dmarc-summary-charts.png 2019-02-01 11:15:35 -05:00
Sean Whalen
3f71518498 PEP8 fixes 2019-02-01 11:02:52 -05:00
Sean Whalen
80ee60f69e Merge branch 'master' of https://github.com/domainaware/parsedmarc 2019-02-01 10:42:52 -05:00
Sean Whalen
e8feca3117 Visualization improvements (Closes #49) 2019-02-01 10:42:43 -05:00
Sean Whalen
1ba1bc9543 5.3.0 - Closes issue #45 and issue #57 2019-01-28 18:19:07 -05:00
Sean Whalen
cc315d00e1 Raise proper exception on invalid forensic report emails
Issue #47
2019-01-28 10:18:19 -05:00
Sean Whalen
e85e99a416 Use the same connection for moving messages
Office 365 did not like multiple connections at all
2019-01-24 14:02:08 -05:00
Sean Whalen
f883996126 Fix move test 2019-01-24 12:04:01 -05:00
Sean Whalen
e929476f32 Remove premature logout 2019-01-24 11:30:13 -05:00
Sean Whalen
6c49d76688 PEP8 fix 2019-01-24 11:05:05 -05:00
Sean Whalen
59168ca8f7 PEP8 fix 2019-01-24 10:56:43 -05:00
Sean Whalen
05f225721d Move IMAP messages over a separate connection 2019-01-24 10:49:28 -05:00
Sean Whalen
0e39830783 Better debugging output 2019-01-17 16:19:14 -05:00
Sean Whalen
241ea4d1bd Debug output workaround 2019-01-17 15:48:19 -05:00
Sean Whalen
bb1b62b1fe More detailed debugging output 2019-01-17 15:29:54 -05:00
Sean Whalen
a3a6e97876 Bump version number to 5.2.1 2019-01-13 15:06:39 -05:00
Sean Whalen
2a507a764e 5.2.1 - Remove unnecessary debugging code 2019-01-13 15:04:48 -05:00
Sean Whalen
c3652e06e0 Update CHANGELOG.md 2019-01-13 12:35:41 -05:00
Sean Whalen
7619bde93f Better error recovery 2019-01-12 23:42:54 -05:00
Sean Whalen
92a6341e5d Debugging 2019-01-12 15:47:43 -05:00
Sean Whalen
01ee77eb6b Print tracebacks 2019-01-11 12:27:50 -05:00
Sean Whalen
9af06f5de3 Improve log formatting 2019-01-10 19:22:27 -05:00
Sean Whalen
105a956f8c Update CHANGELOG.md 2019-01-10 11:55:36 -05:00
Sean Whalen
df1511d8db Update CLI documentation 2019-01-10 11:53:22 -05:00
Sean Whalen
51a0e891e7 PEP8 fix 2019-01-10 11:48:10 -05:00
Sean Whalen
d224964449 Merge pull request #43 from mlodic/master
added option to redirect log into a specified file
2019-01-10 11:42:49 -05:00
Sean Whalen
fcf39d4810 More workarounds for IMAP errors 2019-01-10 11:25:59 -05:00
Matteo Lodi
475c5f5b3c added option to redirect log into a specified file 2019-01-10 11:29:32 +01:00
Sean Whalen
c5f2e463c1 5.2.0 2019-01-09 16:26:08 -05:00
Sean Whalen
1792e868e2 Update documentation 2019-01-09 16:13:56 -05:00
Sean Whalen
419e8a68b2 Add options for monthly Elasticsearch indexes (issue #41) 2019-01-09 14:35:12 -05:00
Sean Whalen
b64fa96d88 PEP8 fixes 2019-01-09 14:14:39 -05:00
Sean Whalen
6a6370dbda Merge pull request #42 from mlodic/master
added support for HTTPS connections to elasticsearch server
2019-01-09 14:04:55 -05:00
Matteo Lodi
2ca7bb200a added support for HTTPS connections to elasticsearch server 2019-01-09 18:16:56 +01:00
Sean Whalen
048fa28160 Fix typo 2019-01-06 23:17:45 -05:00
Sean Whalen
a38c66d0b5 5.1.3 2019-01-06 23:13:29 -05:00
Sean Whalen
884786116e Update index.rst 2018-12-31 17:30:46 -05:00
Sean Whalen
c8e1424c3f Fix docs typo 2018-12-31 12:19:50 -05:00
Sean Whalen
ead7f9ad09 5.1.2 2018-12-31 12:11:50 -05:00
Sean Whalen
3c0e550a3a Fix documentation typos 2018-12-31 11:41:04 -05:00
Sean Whalen
2345908bff Update CHANGELOG.md 2018-12-20 11:50:50 -05:00
Sean Whalen
4d94f7bba0 Add note about RAM to docs (closes issue #40) 2018-12-20 11:22:21 -05:00
Sean Whalen
fe73f21df4 Increase default Splunk HEC response timeout to 60 seconds 2018-12-19 17:15:38 -05:00
Sean Whalen
cbe554ae5f Update CHANGELOG.md 2018-12-19 16:26:22 -05:00
Sean Whalen
08df8d3344 Move import back where it was 2018-12-19 12:08:08 -05:00
Sean Whalen
a9b92e31e4 Fix crash when parsing invalid forensic report (closes #38) 2018-12-19 12:03:09 -05:00
Sean Whalen
0d103a3d54 Remove completed TODO comments 2018-11-28 23:27:49 -05:00
Sean Whalen
f0928b1063 Update CLI documentation 2018-11-28 21:49:15 -05:00
Sean Whalen
028be52653 Fix typo 2018-11-28 21:41:08 -05:00
Sean Whalen
76f3007740 Make CLI help uniform 2018-11-28 21:39:53 -05:00
Sean Whalen
f706041701 Update cli.py 2018-11-28 21:37:34 -05:00
Sean Whalen
b196b5fca0 Code formatting improvements 2018-11-28 21:35:09 -05:00
Sean Whalen
8548a3749e Make CLI help uniform 2018-11-28 21:32:07 -05:00
Sean Whalen
46f5967212 Make CLI help uniform 2018-11-28 21:30:51 -05:00
Sean Whalen
7d9cf723c1 Make CLI help uniform 2018-11-28 21:29:13 -05:00
Sean Whalen
f59af2334b 5.1.0 2018-11-28 21:23:43 -05:00
Sean Whalen
bdf9f62377 Revert "Fix Elasticsearch/Python mapping mismatch"
This reverts commit e7e1f238ab.
2018-11-28 11:08:47 -05:00
Sean Whalen
e7e1f238ab Fix Elasticsearch/Python mapping mismatch 2018-11-28 11:02:57 -05:00
Sean Whalen
78de18eb64 PEP8 fix 2018-11-27 12:39:52 -05:00
Sean Whalen
2b44bd5111 Cleanup CLI help 2018-11-27 12:19:41 -05:00
Sean Whalen
eaaebb54be Clean up CLI help 2018-11-27 12:16:48 -05:00
Sean Whalen
1343b25963 Use custom Kafka client ID 2018-11-27 12:11:30 -05:00
Sean Whalen
a227b73cfb 5.1.0 - Add support for TLS/SSL and username/password auth for Kafka 2018-11-27 12:02:30 -05:00
Sean Whalen
f19ea5b950 5.0.2 - Revert to using publicsuffix instead of publicsuffix2 2018-11-26 15:01:22 -05:00
Sean Whalen
3d59be3ec3 Fix docs syntax 2018-11-26 13:35:03 -05:00
Sean Whalen
b25cf80a75 Add CentOS instructions to docs 2018-11-26 13:30:44 -05:00
Sean Whalen
e264c5744e Set more static versions in requirements 2018-11-26 11:53:27 -05:00
Sean Whalen
7634b9c9d1 Update venv location in build script 2018-11-26 11:46:33 -05:00
Sean Whalen
5934b0abae 5.0.1 - Closes issue #35 2018-11-26 11:43:16 -05:00
Sean Whalen
f0ef25bcd7 Update .travis.yml 2018-11-26 11:10:32 -05:00
Sean Whalen
9945c3f384 Update .travis.yml 2018-11-26 09:22:19 -05:00
Sean Whalen
6367c069b1 Update .travis.yml 2018-11-26 09:16:32 -05:00
Sean Whalen
4a43243835 Update .travis.yml 2018-11-26 09:03:57 -05:00
Sean Whalen
8e81d61207 Update .travis.yml 2018-11-26 08:44:26 -05:00
Sean Whalen
8ea02668e7 Update .travis.yml 2018-11-26 08:39:53 -05:00
Sean Whalen
e805b9dbeb Update .travis.yml 2018-11-26 08:35:50 -05:00
Sean Whalen
f982d870fe Actually use publicsuffix2 2018-11-26 08:31:32 -05:00
Sean Whalen
be27080e4d Update .travis.yml 2018-11-26 08:16:47 -05:00
Sean Whalen
84f7930e39 use publicsuffix2 2018-11-26 07:58:26 -05:00
Sean Whalen
6072d9df0b Fix typo 2018-11-21 22:52:36 -05:00
Sean Whalen
89248b8124 Update CHANGELOG.md 2018-11-19 08:51:28 -05:00
Sean Whalen
6eaccdc2fc Bump version to 5.0.0 2018-11-19 08:25:18 -05:00
Sean Whalen
68e2437364 Fix formatting 2018-11-19 08:22:23 -05:00
Sean Whalen
dbb3d7de4d Fix formatting 2018-11-19 08:21:26 -05:00
Sean Whalen
6740ae1e5c Fix formatting 2018-11-19 08:19:55 -05:00
Sean Whalen
7b017612f8 Fix formatting 2018-11-19 08:17:31 -05:00
Sean Whalen
06425b6302 Update documentation 2018-11-19 08:14:21 -05:00
Sean Whalen
f5956ccd5b Fix crash when Arrival-Date header is missing in a forensic report 2018-11-18 18:55:42 -05:00
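A minimal sketch of handling the missing header described in the commit above: look up the optional `Arrival-Date` field and fall back to a sensible default instead of crashing. The fallback choice (current UTC time) is illustrative; parsedmarc's actual fallback may differ.

```python
from datetime import datetime, timezone
from email import message_from_string

sample = "Feedback-Type: auth-failure\nSource-IP: 192.0.2.1\n\n"
report = message_from_string(sample)

# Fall back to a default instead of crashing when Arrival-Date is absent.
arrival_date = report.get("Arrival-Date")
if arrival_date is None:
    arrival_date = datetime.now(timezone.utc).isoformat()
print(arrival_date)
```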
Sean Whalen
85489d9ea9 PEP8 fixes 2018-11-18 16:52:17 -05:00
Sean Whalen
92056738e3 5.0.0 2018-11-18 15:02:50 -05:00
Sean Whalen
aa88d3eeb4 Elasticsearch index migration 2018-11-17 20:43:55 -05:00
Sean Whalen
7d2301c5bd Update documentation 2018-11-16 09:51:14 -05:00
Sean Whalen
7fb81abef0 4.5.0 2018-11-16 09:46:16 -05:00
Sean Whalen
1bd6d46b61 Fix typo 2018-11-15 15:51:12 -05:00
Sean Whalen
9c4dca0545 4.5.0 2018-11-15 15:48:10 -05:00
Sean Whalen
017f7bc432 PEP8 fix 2018-11-15 14:50:38 -05:00
Sean Whalen
6a4445c799 Update requirements 2018-11-15 13:28:45 -05:00
Sean Whalen
13e2b50671 Caching improvements 2018-11-15 13:17:09 -05:00
Sean Whalen
0ddc904c9d Merge pull request #32 from vitalvas/master
Cache reverse DNS lookups
2018-11-15 11:43:11 -05:00
Sean Whalen
bbc64ca044 workaround invalid date_utc value 2018-11-15 11:39:28 -05:00
Sean Whalen
ba8c6fd30c Resolve flake8 warnings 2018-11-15 11:16:06 -05:00
Vitaliy Vasilenko
0d7e14a784 DNS performance optimization 2018-11-10 17:31:51 +02:00
Sean Whalen
c581efbae6 Bump version 2018-11-09 16:21:11 -05:00
Sean Whalen
4cf5dfc4e4 Remove unused import 2018-11-09 16:17:42 -05:00
Sean Whalen
76993d5e8b 4.4.1 - workaround for issue #31
Don't crash if Elasticsearch returns an unexpected result
2018-11-09 16:14:24 -05:00
Sean Whalen
6467ebe73d 4.4.0 - Fix packaging issues 2018-11-09 15:31:48 -05:00
Sean Whalen
2e1dcbf438 Add Davmail instructions to docs for EWS/OWA 2018-11-07 15:33:47 -05:00
Sean Whalen
d4936ea5a8 Update CHANGELOG.md 2018-11-06 17:37:36 -05:00
Sean Whalen
fe7c732084 Merge pull request #30 from mikesiegel/msiegel_kafkafix
A few Kafka client tweaks and changes
2018-11-06 17:31:27 -05:00
Mike Siegel
78e796a97c Fixing import 2018-10-29 08:08:40 -04:00
Mike Siegel
0e398f2c8d removing unused import 2018-10-29 07:51:21 -04:00
Mike Siegel
0c47ac178d Merging changes from upstream/master 2018-10-29 07:48:22 -04:00
Mike Siegel
a543cb4e44 Add T separator in datetime format 2018-10-29 07:48:01 -04:00
Sean Whalen
490b6f4700 Fix more documentation typos 2018-10-28 16:43:51 -04:00
Sean Whalen
9f7b7fcc93 Fix docs typo 2018-10-28 15:56:12 -04:00
Mike Siegel
e0c532c7eb - Moved report metadata and moved report_id, org_email, and org_name up a level in JSON object
- Send individual slices of report due to Kafka message size limit being 1MB
- Date calculations from ES client to aid in dashboard display
2018-10-25 15:38:18 -04:00
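The 1 MB limit mentioned above is Kafka's default `message.max.bytes`; slicing a report's records before producing keeps each message under it. A minimal sketch of that idea, assuming kafka-python and an already-parsed aggregate report dict (topic name, chunk size, and broker address are placeholders, not the project's actual client):

```python
# Illustrative only: send an aggregate report to Kafka in slices so each
# message stays under the broker's default ~1 MB size limit.
import json

from kafka import KafkaProducer  # kafka-python


def send_report_slices(report, topic="dmarc_aggregate", chunk_size=50,
                       bootstrap_servers="localhost:9092"):
    producer = KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=lambda value: json.dumps(value).encode("utf-8"),
    )
    records = report.get("records", [])
    metadata = {key: value for key, value in report.items() if key != "records"}
    for start in range(0, len(records), chunk_size):
        message = dict(metadata)  # repeat the report metadata in every slice
        message["records"] = records[start:start + chunk_size]
        producer.send(topic, message)
    producer.flush()
```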
Sean Whalen
01b72119fe Update documentation 2018-10-25 09:43:00 -04:00
Sean Whalen
003ac9b0f1 Debug logging improvements 2018-10-25 08:52:09 -04:00
Sean Whalen
ee00861146 4.3.8 2018-10-25 01:51:03 -04:00
Sean Whalen
5eb533e2a5 Better error handling 2018-10-24 14:48:51 -04:00
Sean Whalen
c6728186cc Fix logging 2018-10-24 13:57:02 -04:00
Sean Whalen
c75244b476 Fix forensic attachment processing 2018-10-24 13:35:57 -04:00
Sean Whalen
2e13b83945 More logging 2018-10-24 12:55:52 -04:00
Sean Whalen
d4fda8c93c Fix IMAP datatypes 2018-10-24 12:49:00 -04:00
Sean Whalen
7576508f2c Add more verbose logging 2018-10-23 14:51:14 -04:00
Sean Whalen
825fba8951 4.3.7 - When checking an inbox, always recheck for messages when processing is complete 2018-10-22 06:19:28 -04:00
Sean Whalen
6cb78f65cf Fix another syntax error in the docs 2018-10-19 18:43:33 -04:00
Sean Whalen
e87da5dd0f Fix syntax error in docs 2018-10-19 18:37:52 -04:00
Sean Whalen
28379226c1 Merge branch 'master' of https://github.com/domainaware/parsedmarc 2018-10-19 18:15:58 -04:00
Sean Whalen
90cbd95063 Add Elasticsearch JVM heap size info to docs 2018-10-19 18:15:48 -04:00
Sean Whalen
446732dad2 4.3.6 - Be more forgiving for forensic reports with missing fields 2018-10-19 11:07:35 -04:00
Sean Whalen
4b1721a96d Update CHANGELOG.md 2018-10-18 10:01:38 -04:00
Sean Whalen
90207a39a4 4.3.5 - Fix base64 attachment decoding (#26) 2018-10-18 09:51:30 -04:00
Sean Whalen
29324d4b2a Add Visual Studio Code settings to .gitignore 2018-10-18 05:52:22 -04:00
Sean Whalen
2e362d9fb9 Update documentation 2018-10-16 14:25:48 -04:00
Sean Whalen
7f6eae712e Match requests's urllib3 requirement 2018-10-16 14:18:44 -04:00
Sean Whalen
33d7c17177 4.3.4 2018-10-16 14:11:06 -04:00
Sean Whalen
eff2435989 Update help 2018-10-16 13:36:25 -04:00
Sean Whalen
f54ef80b00 4.3.4 2018-10-16 13:31:04 -04:00
Sean Whalen
1323e85530 PEP8 fix 2018-10-16 11:42:21 -04:00
Sean Whalen
6a5d2c4105 Merge pull request #25 from inoio/fix-missing-comments
fix crash on missing comments
2018-10-16 11:38:47 -04:00
Dennis Brakhane
71745f006d fix crash on missing comments
Some DMARC reports contain the "comment" field with a null value. This
would cause a crash when trying to combine the comments into one.

Therefore, handle None comments as well.

Also remove a buggy line; the reason object already exists in the list,
so no need to append it (and even if we needed to append it, we would
need to iterate over a copy, otherwise we might end up in an endless
loop)
2018-10-16 16:49:18 +02:00
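A minimal sketch of the None-safe handling described above (not the library's exact code):

```python
# Illustrative only: combine policy override comments while skipping null or
# empty values, so a <comment/> element with no content cannot raise an exception.
def combine_comments(policy_override_reasons):
    comments = []
    for reason in policy_override_reasons:
        comment = reason.get("comment")
        if comment is not None and comment.strip():
            comments.append(comment.strip())
    return "; ".join(comments)
```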
Sean Whalen
dff55f7abe Update docs 2018-10-14 20:55:29 -04:00
Sean Whalen
0e65cf7ae2 Fix existing sample lookup 2018-10-14 20:37:57 -04:00
Sean Whalen
29ef06f892 Fix duplicate forensic report Elasticsearch lookup 2018-10-14 20:12:47 -04:00
Sean Whalen
77111075b4 Fix email parsing 2018-10-14 19:05:15 -04:00
Sean Whalen
30b04645d7 4.3.3 - Fix forensic report email processing 2018-10-14 18:44:02 -04:00
Sean Whalen
1253f729b1 4.3.2 release 2018-10-14 18:06:57 -04:00
Sean Whalen
2c8c7f4659 Fix email parsing 2018-10-14 17:58:47 -04:00
Sean Whalen
48ca180db5 Fix header checks 2018-10-14 05:16:52 -04:00
Sean Whalen
a57d6836aa Fix error message formatting 2018-10-14 05:11:36 -04:00
Sean Whalen
7fdee0ab76 Fix duplicate forensic report search for Elasticsearch 2018-10-14 04:48:25 -04:00
Sean Whalen
3a575e91a1 Fix syntax error 2018-10-14 04:00:24 -04:00
Sean Whalen
0f63290d6e Fix parsing of some emails 2018-10-14 03:55:06 -04:00
Sean Whalen
826583cd37 Update kibana_saved_objects.json and install docs 2018-10-12 16:03:37 -04:00
Sean Whalen
c56c538c88 Fix doc API layout 2018-10-12 14:23:05 -04:00
Sean Whalen
a4a03bb027 Fix docs 2018-10-12 14:17:18 -04:00
Sean Whalen
41e80f3788 Fix CI 2018-10-12 14:11:21 -04:00
Sean Whalen
03f987840b Revert "Remove rstcheck from build script"
This reverts commit b40ca17263.
2018-10-12 14:09:50 -04:00
Sean Whalen
b40ca17263 Remove rstcheck from build script 2018-10-12 14:06:09 -04:00
Sean Whalen
6971bc1bda Update build.sh 2018-10-12 13:58:26 -04:00
Sean Whalen
187c788b47 Fix changelog 2018-10-12 13:54:14 -04:00
Sean Whalen
658b5466ca Fix import 2018-10-12 12:35:10 -04:00
Sean Whalen
8480bee676 Fix docs 2018-10-12 12:32:26 -04:00
Sean Whalen
a50c055579 Update docs 2018-10-12 12:30:37 -04:00
Sean Whalen
52dba5041d Update build script 2018-10-12 12:22:22 -04:00
Sean Whalen
ca151f54f2 Fix doc formatting 2018-10-12 12:16:47 -04:00
Sean Whalen
6234bd1ef6 Update docs/conf.py 2018-10-12 12:12:15 -04:00
Sean Whalen
7076bccc8d Update documentation 2018-10-12 12:07:21 -04:00
Sean Whalen
49387d9033 4.3.0 release 2018-10-12 12:02:55 -04:00
Sean Whalen
7394c40167 Remove unused link from README 2018-10-11 19:04:21 -04:00
Sean Whalen
f45ab94e06 Update test suite 2018-10-11 19:01:02 -04:00
Sean Whalen
babdc661ac 4.3.0 release 2018-10-11 17:59:09 -04:00
Sean Whalen
a1e8506d42 Process email samples with missing dates 2018-10-11 16:32:09 -04:00
Sean Whalen
0a2aea0a7a Use latest mailparser 2018-10-11 16:05:46 -04:00
Sean Whalen
b335edacaf Downgrade mailparser to 3.5.1 2018-10-11 15:59:23 -04:00
Sean Whalen
e138c5467d Yet more refactoring 2018-10-11 14:51:29 -04:00
Sean Whalen
466745e5fb sigh again 2018-10-11 14:23:03 -04:00
Sean Whalen
c47b9fed4e sigh 2018-10-11 14:04:48 -04:00
Sean Whalen
f18bc98a96 Sigh 2018-10-11 14:01:40 -04:00
Sean Whalen
d8e0b05c6a another refactoring fix 2018-10-11 13:56:01 -04:00
Sean Whalen
7867baa842 Hopefully final refactor fix 2018-10-11 13:46:38 -04:00
Sean Whalen
7728713ae8 Yet another refactoring fix 2018-10-11 13:39:03 -04:00
Sean Whalen
231a921d5c More refactoring fixes 2018-10-11 13:36:39 -04:00
Sean Whalen
e76f89a338 Fix whitespace 2018-10-11 13:28:25 -04:00
Sean Whalen
28a62cdbc6 Really fix refactoring 2018-10-11 13:24:16 -04:00
Sean Whalen
694c2afe23 Fix refactoring 2018-10-11 12:44:50 -04:00
Sean Whalen
6f5b23445e Fix import 2018-10-11 12:33:52 -04:00
Sean Whalen
b37205a98d Major refactoring 2018-10-11 08:01:42 -04:00
Sean Whalen
ab37f7ac5c PEP 8 fix 2018-10-10 21:14:54 -04:00
Sean Whalen
cff1cede46 4.2.1 - Bug fixes and Kafka support 2018-10-10 20:33:17 -04:00
Sean Whalen
524f9c0327 Merge pull request #21 from mikesiegel/mikesiegel_kafka
Add Kafka Support
2018-10-10 19:18:22 -04:00
Mike Siegel
074ce9b815 Removed logger from import 2018-10-10 13:20:28 -04:00
Mike Siegel
8d1c0cf3a0 fix merge conflict 2018-10-10 12:55:57 -04:00
Mike Siegel
fe611ac9df added k version to setup.py 2018-10-10 11:57:41 -04:00
Mike Siegel
66e707bfdf bumping version 2018-10-10 10:12:34 -04:00
Mike Siegel
966495a2a9 PEP8 changes 2018-10-10 10:04:30 -04:00
Mike Siegel
19df7f65c4 PEP8 fixes 2018-10-10 09:54:03 -04:00
Sean Whalen
88e3a5e0d6 4.2.0 Release 2018-10-10 09:40:35 -04:00
Mike Siegel
687a44ee58 split out individual records. 2018-10-10 09:11:24 -04:00
Sean Whalen
20afbba7e2 PEP 8 fix 2018-10-10 08:29:11 -04:00
Sean Whalen
35e6a72691 Remove duplicate logging message 2018-10-10 08:27:28 -04:00
Sean Whalen
ee97a76654 More logging fixes 2018-10-10 08:25:32 -04:00
Mike Siegel
a3ba85803a Modified to send entire ordered dict to Kafka. Bug: would barf on reports larger than 10 megs 2018-10-10 08:07:44 -04:00
Sean Whalen
d25d01a230 More logging fixes 2018-10-09 23:28:55 -04:00
Sean Whalen
c944264760 Fix logging 2018-10-09 23:00:28 -04:00
Sean Whalen
0d1a4786e1 Fix logging 2018-10-09 20:49:12 -04:00
Sean Whalen
d3cdb81977 Merge branch 'master' of https://github.com/domainaware/parsedmarc 2018-10-09 20:48:35 -04:00
Sean Whalen
16dcb2edc7 Fix logging 2018-10-09 20:48:31 -04:00
Sean Whalen
21af33687c Fix module logging 2018-10-09 20:35:45 -04:00
Sean Whalen
8ea0e62bdd Add more debug logging details 2018-10-09 20:21:17 -04:00
Mike Siegel
d4cf4a7e5f forgot to flush 2018-10-09 14:08:02 -04:00
Mike Siegel
5827d8b137 Initial commit 2018-10-09 13:32:41 -04:00
Sean Whalen
fd2d5093a9 More logging when --debug is used 2018-10-09 11:48:31 -04:00
Sean Whalen
7d2949d6a7 Merge branch 'master' of https://github.com/domainaware/parsedmarc 2018-10-09 11:12:04 -04:00
Sean Whalen
df1c2bac5f Fix Splunk forensic dashboard sorting 2018-10-09 11:12:00 -04:00
Sean Whalen
e2ec3bc2da Change default logging level to WARNING 2018-10-09 10:55:40 -04:00
Sean Whalen
20433cd2b6 Logging and packaging fixes 2018-10-09 10:34:42 -04:00
Sean Whalen
f235149863 4.2.0 - Save each aggregate report record as a separate Splunk event 2018-10-08 14:22:43 -04:00
Sean Whalen
66af8e6090 Workaround for forensic/ruf reports that are missing Reported-Domain 2018-10-08 09:31:53 -04:00
Sean Whalen
f1fa8709c2 4.1.9 - Workaround for forensic/ruf reports that are missing Arrival-Date 2018-10-08 09:09:11 -04:00
Sean Whalen
5b5e65d48b Update docs/index.rst 2018-10-08 07:07:43 -04:00
Sean Whalen
37d40c01ba Still parse if spf is missing from auth_results 2018-10-07 17:54:25 -04:00
Sean Whalen
16a4be2205 4.1.8 - Be more forgiving of weird XML 2018-10-07 12:50:02 -04:00
Sean Whalen
ead03b9779 4.1.7 - Remove any invalid XML schema tags before parsing the XML (#18) 2018-10-06 15:29:02 -04:00
Sean Whalen
ad922ad028 4.1.6 - Fix typo in CLI parser 2018-10-05 18:12:13 -04:00
Sean Whalen
7a5e5b6d1f Update dashboard export - see issue #17 2018-10-05 17:44:10 -04:00
Sean Whalen
eda6d0907b Fix missing Kibana visualizations - #17 2018-10-05 16:39:32 -04:00
Sean Whalen
84bba2783b PEP 8 whitespace fix 2018-10-05 16:19:50 -04:00
Sean Whalen
293d3ecf74 Update documentation 2018-10-05 16:16:24 -04:00
Sean Whalen
20282b4d30 Complete feature request in issue #16 2018-10-05 16:07:53 -04:00
Sean Whalen
1c20bfe200 Update changelog 2018-10-05 14:49:13 -04:00
Sean Whalen
bb55bb3911 Add missing error message 2018-10-05 12:35:24 -04:00
Sean Whalen
a4373c73e6 Add more debugging messages 2018-10-05 12:30:27 -04:00
Sean Whalen
1696096583 Move/delete 10 IMAP messages at a time 2018-10-05 10:11:48 -04:00
Sean Whalen
ec4793241e Fix typo 2018-10-05 08:27:06 -04:00
Sean Whalen
cd6191463e Better IMAP error handling 2018-10-05 08:23:56 -04:00
Sean Whalen
a1927be492 4.1.5 2018-10-04 22:04:19 -04:00
Sean Whalen
0709f8cc2f 4.1.4 - Change default logging level to ERROR 2018-09-30 19:00:09 -04:00
Sean Whalen
07051212c4 Fix documentation typo 2018-09-30 15:00:27 -04:00
Sean Whalen
4604ef64bc Update the readme 2018-09-30 11:49:16 -04:00
Sean Whalen
2aa1e2ef23 Update documentation 2018-09-30 11:46:31 -04:00
Sean Whalen
123ec62052 Update docs 2018-09-29 17:48:46 -04:00
Sean Whalen
5cbd685019 Fix crash introduced in 4.1.0 when creating Elasticsearch indexes (Issue #15) 2018-09-29 14:14:04 -04:00
Sean Whalen
bb98377a29 4.1.2 2018-09-29 14:03:06 -04:00
Sean Whalen
ce74617195 Add Splunk HEC link to docs 2018-09-29 13:58:03 -04:00
Sean Whalen
71e6ded025 Fix documentation typo 2018-09-29 13:53:50 -04:00
Sean Whalen
2ce57aeffc Update documentation 2018-09-29 13:49:25 -04:00
Sean Whalen
625089a12c 4.1.1 2018-09-29 13:25:27 -04:00
Sean Whalen
32c46795e8 Fix time range in Splunk aggregate dashboard 2018-09-28 08:31:22 -04:00
Sean Whalen
b22fa6fdf7 Remove DKIM specific filters from splunk dashboards
Filtering on data that does not exist led to incomplete dashboards
2018-09-28 00:30:08 -04:00
Sean Whalen
c5e44327b3 Remove implicit wildcards from Splunk dashboards 2018-09-28 00:08:49 -04:00
Sean Whalen
db2625fff9 Add Splunk dashboard source XML 2018-09-27 23:49:32 -04:00
Sean Whalen
18255103ed Update CLI documentation 2018-09-27 12:08:00 -04:00
Sean Whalen
a7fb20713b 4.1.0 2018-09-27 12:01:48 -04:00
Sean Whalen
ec5e8a4ca1 4.0.2 - Use report timestamps for Splunk timestamps 2018-09-26 16:03:20 -04:00
Sean Whalen
c4e39d61b5 4.0.1 2018-09-26 14:48:56 -04:00
Sean Whalen
fa1b2721d7 Merge pull request #14 from domainaware/4.0
4.0
2018-09-26 13:15:59 -04:00
Sean Whalen
08806f0d0c Workaround for random Exchange/Office365 Server Unavailable IMAP errors 2018-09-26 13:03:33 -04:00
Sean Whalen
4a34445b81 Update documentation 2018-09-26 12:45:50 -04:00
Sean Whalen
c102c2f21c Fix splunk HEC submission and --outgoing-ssl option
Changed --outgoing-SSL to --outgoing-ssl
2018-09-26 12:32:39 -04:00
Sean Whalen
83a76ec0cd Fix aggregate report splunk conversion 2018-09-25 17:37:43 -04:00
Sean Whalen
cdb9546bc0 Add --hec-skip-certificate-verification option 2018-09-25 16:04:05 -04:00
Sean Whalen
c9177f3342 Only save to Splunk when there are things to save 2018-09-25 15:50:53 -04:00
Sean Whalen
caf6cd1872 Fix error formatting 2018-09-25 14:47:06 -04:00
Sean Whalen
fa38bea8ea Fix error output 2018-09-25 14:44:23 -04:00
Sean Whalen
eff7c552c9 Fix CLI argument logic 2018-09-25 14:26:30 -04:00
Sean Whalen
c964241cba Splunk HEC token not HEC key 2018-09-25 14:21:03 -04:00
Sean Whalen
ba3c9de9b7 Fix HEC key check 2018-09-25 14:15:09 -04:00
Sean Whalen
253d421e29 Splunk and SMTP improvements
SMTP issue #12 fixed (based on PR #13 )
2018-09-25 13:40:55 -04:00
Sean Whalen
861ee7d247 Update Splunk support 2018-09-25 13:06:27 -04:00
Sean Whalen
a1a4cbbf28 Use correct splunk sourcetype format 2018-09-25 10:01:02 -04:00
Sean Whalen
2a4f558bbc Always send creds when reconnecting to IMAP 2018-09-24 05:17:29 -04:00
Sean Whalen
b11c6d587c Fix IMAP reconnection 2018-09-21 23:45:24 -04:00
Sean Whalen
5657a27262 Use port 587 by default when sending email
Hopefully fixes issue #12
2018-09-21 08:42:44 -04:00
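For reference, mail submission on port 587 upgrades a plain connection with STARTTLS before authenticating; a rough sketch using Python's standard library (host, credentials, and message fields are placeholders, not the project's actual mailer):

```python
# Illustrative only: send results over SMTP submission (port 587) with STARTTLS.
import smtplib
from email.message import EmailMessage


def send_results(host, username, password, sender, recipient, body):
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "DMARC report results"
    message.set_content(body)
    with smtplib.SMTP(host, 587) as server:
        server.starttls()  # upgrade to TLS before logging in
        server.login(username, password)
        server.send_message(message)
```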
Sean Whalen
0a694b0a24 - Always use \n as the newline when generating CSVs 2018-09-19 11:30:34 -04:00
Sean Whalen
0989a8bb8a Fix SMTP AUTH extension not supported by server error on some SMTP servers
Issue #12
2018-09-19 08:01:30 -04:00
Sean Whalen
c051980f26 Update output example in documentation 2018-09-19 07:35:06 -04:00
Sean Whalen
6b01fc0f3f Fix .msg parsing CLI exception when msgconvert is not found in the system path 2018-09-18 21:54:26 -04:00
Sean Whalen
db4e145b7a Add User-Agent to GeoIP DB download 2018-09-18 21:40:34 -04:00
Sean Whalen
68c54d4c5c Add missing sub dictionary 2018-09-17 12:40:48 -04:00
Sean Whalen
aead7ee754 Add alignment booleans to JSON output 2018-09-17 12:35:27 -04:00
Sean Whalen
3fdd5457b1 Reduce default DNS timeout to 0.5 seconds 2018-09-17 11:45:08 -04:00
Sean Whalen
d18d9cf5d0 Fix changelog typo 2018-09-17 08:04:51 -04:00
Sean Whalen
9cf113abdc Fix PSL download 2018-09-16 23:05:52 -04:00
Sean Whalen
2796fdd691 PEP 8 fix 2018-09-16 23:02:18 -04:00
Sean Whalen
5160d687f3 Update CLI docs 2018-09-16 22:56:51 -04:00
Sean Whalen
b46fec8983 4.0.0 prelease 2018-09-16 22:51:49 -04:00
Sean Whalen
e8dd04f952 Update systemd config example 2018-09-13 15:50:17 -04:00
Sean Whalen
4d0bf2723f Require sphinx==1.7.9 when building
Sphinx 1.8.0 breaks rstcheck
2018-09-13 14:48:04 -04:00
Sean Whalen
b4b2dc298a Merge branch 'master' of https://github.com/domainaware/parsedmarc 2018-09-13 13:30:27 -04:00
Sean Whalen
e274052133 3.9.7 - Completely reset IMAP connection when a broken pipe is encountered 2018-09-13 13:30:20 -04:00
Sean Whalen
0bb7a5108a Add After option to systemd service 2018-09-11 12:58:55 -04:00
Sean Whalen
f59c0d62fc 3.9.6 - Finish incomplete broken pipe fix 2018-09-11 11:18:56 -04:00
Sean Whalen
a6dbf807e4 3.9.5 - Refactor to use a shared IMAP connection for inbox watching and message downloads 2018-09-10 08:48:04 -04:00
Sean Whalen
b1b7f3c329 3.9.4 Fix moving/deleting emails 2018-09-06 16:26:41 -04:00
Sean Whalen
b9c4c62b00 3.9.3 - Fix crash when forensic reports are missing Arrival-Date 2018-09-06 15:09:44 -04:00
Sean Whalen
92f4085386 fix build.sh typo 2018-09-06 13:48:36 -04:00
Sean Whalen
a6094b2144 3.9.2 - Fix PEP 8 spacing and update build.sh 2018-09-06 13:43:41 -04:00
Sean Whalen
8e102b4e95 Actually bump version to 3.9.1 2018-09-06 12:47:59 -04:00
Sean Whalen
51987ba770 3.9.1 - Use COPY and delete if an IMAP server does not support MOVE (closes issue #9) 2018-09-06 12:45:56 -04:00
Sean Whalen
bcde4bebd5 3.9.0 - Multiple bug fixes 2018-09-06 11:10:18 -04:00
Sean Whalen
f19d623d7d Reduce IMAP IDLE refresh rate to 5 minutes
G-Suite is resetting connections after 10 minutes
2018-09-05 04:55:46 -04:00
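The pattern behind this change is to re-issue IDLE well before the server's cutoff; a rough sketch with IMAPClient (illustrative only, not the project's actual watcher; host, credentials, and folder are placeholders):

```python
# Illustrative only: keep an IMAP IDLE session alive by refreshing it every
# 5 minutes, since some servers drop idle connections after ~10 minutes.
from imapclient import IMAPClient


def watch_inbox(host, username, password, folder="INBOX"):
    with IMAPClient(host, ssl=True) as client:
        client.login(username, password)
        client.select_folder(folder)
        while True:
            client.idle()
            responses = client.idle_check(timeout=300)  # refresh after 5 minutes
            client.idle_done()
            if responses:
                # New activity was reported; fetch and process report messages here.
                pass
```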
Sean Whalen
7c6a0b185a 3.8.2 2018-09-03 22:07:08 -04:00
Sean Whalen
8afa271cb7 Merge pull request #8 from mikesiegel/nameservers_args
Updated to pass nameserver arguments to all occurrences of parse_repor…
2018-09-02 20:34:33 -04:00
Mike Siegel
ff8aa4fc32 Updated to pass nameserver arguments to all occurrences of parse_report_record(). This significantly speeds up processing long reports from the inbox in my testing. 2018-08-30 12:04:37 -04:00
Sean Whalen
22c2e8799a 3.8.1 - Better handling of .msg files when msgconvert is not installed 2018-08-27 09:04:08 -04:00
Sean Whalen
ca0397c331 Add DKIM selector to dashboard output 2018-08-25 23:48:40 -04:00
Sean Whalen
4853537765 PEP 8 fix and more documentation 2018-08-22 06:50:59 -04:00
Sean Whalen
3954ecc595 3.8.0 - Fix saving to Elasticsearch when the To header is missing from a forensic sample 2018-08-21 16:00:46 -04:00
Sean Whalen
245262d997 3.8.0 - Allow the forensic To header to be missing 2018-08-21 15:53:44 -04:00
Sean Whalen
8438e9bd5a 3.8.0 - Remove excess \r from forensic reports 2018-08-21 15:44:33 -04:00
Sean Whalen
7f7bde3145 3.8.0 - Parse forensic reports if they are base64-encoded 2018-08-21 15:33:07 -04:00
Sean Whalen
7106fe620e Better comments 2018-08-21 14:24:14 -04:00
Sean Whalen
1b14147d5b Only use . as an IMAP folder hierarchy separator when / does not work 2018-08-21 13:56:10 -04:00
Sean Whalen
1e130ca70a 3.7.4 - Fix dovecot IMAP server support (closes #5)
Use `.` instead of `/` as the IMAP folder hierarchy separator

https://stackoverflow.com/questions/51951887/what-is-the-proper-way-to-create-imap-subfolders-using-imapclient-for-python/51952493#51952493
2018-08-21 11:57:06 -04:00
Sean Whalen
7758411244 Workaround for dovecot mail server bug (issue #3) 2018-08-21 03:47:04 -04:00
Sean Whalen
d74ec346ce 3.7.3 - Fix saving attachment from forensic sample to Elasticsearch 2018-08-19 11:55:29 -04:00
Sean Whalen
4d1cdf9e18 Add booleans to alignment details in dashboard 2018-08-10 13:59:50 -04:00
Sean Whalen
ebc79cbe9c 3.7.2 - Fix pypy support and add pypy documentation
https://github.com/elastic/elasticsearch-dsl-py/blob/master/Changelog.rst#620-2018-07-03

https://github.com/elastic/elasticsearch-dsl-py/issues/953
2018-08-01 11:09:55 -04:00
Sean Whalen
f0040ce53e Change uses of DocType to Document to support refactoring in Elasticsearch 6.2.0
https://github.com/elastic/elasticsearch-dsl-py/blob/master/Changelog.rst#620-2018-07-03
2018-08-01 08:51:42 -04:00
Sean Whalen
46fdbc79a2 Fix HTTPS redirect in example NGINX config 2018-07-31 10:48:04 -04:00
Sean Whalen
c3a862c245 Merge branch 'master' of https://github.com/domainaware/parsedmarc 2018-07-26 00:45:28 -04:00
Sean Whalen
2d0b0098a1 Add sdist to build automation 2018-07-26 00:45:18 -04:00
Sean Whalen
89a93ed4a8 Update line colors for Message Disposition Over Time 2018-07-23 10:37:14 -04:00
Sean Whalen
aed4d60ccb 3.7.1 2018-07-18 10:55:01 -04:00
Sean Whalen
b97a6f5150 3.7.0 2018-07-18 09:46:40 -04:00
Sean Whalen
de0a3b7c56 Fix documentation formatting 2018-07-01 17:20:23 -04:00
Sean Whalen
a8471848dc Fix documentation formatting 2018-06-30 10:46:36 -04:00
Sean Whalen
9b25e294ea Update documentation - closes #3 2018-06-30 10:35:11 -04:00
Sean Whalen
105e286d79 Consolidate DMARC Summary and DMARC Alignment Failure dashboards 2018-06-30 10:07:55 -04:00
Sean Whalen
900b1707fb Update documentation to reflect dashboard changes 2018-06-30 10:05:41 -04:00
Sean Whalen
37ce15e284 Merge branch 'master' of https://github.com/domainaware/parsedmarc 2018-06-30 09:59:44 -04:00
Sean Whalen
3cec775854 Update warnings 2018-06-30 09:59:39 -04:00
Sean Whalen
e30a5bb14f 3.6.1 - Parse aggregate reports with missing spf domain 2018-06-29 11:56:47 -04:00
Sean Whalen
c269e49c2a Improve documentation 2018-06-29 09:01:37 -04:00
Sean Whalen
155351edbc Fix wording 2018-06-28 18:48:58 -04:00
Sean Whalen
60f9c06458 Fix documentation formatting 2018-06-28 18:41:29 -04:00
Sean Whalen
a049acfa5b Lots of documentation improvements 2018-06-28 18:04:32 -04:00
Sean Whalen
3f19489a9b Add dashboard switching note 2018-06-28 14:46:04 -04:00
Sean Whalen
c351d22095 Fix docs typo 2018-06-28 14:13:40 -04:00
Sean Whalen
9877abaf92 Fix documentation formatting 2018-06-28 14:09:28 -04:00
Sean Whalen
6c7965e35d Fix documentation link 2018-06-28 13:45:57 -04:00
Sean Whalen
cc3aff690e Fix documentation formatting 2018-06-28 13:09:43 -04:00
Sean Whalen
8584fd1a81 Add dashboard documentation 2018-06-28 13:05:08 -04:00
Sean Whalen
a1defd1512 Show from header in summary dashboard instead of base domain 2018-06-26 12:43:18 -04:00
Sean Whalen
45fe413b30 Make graph label names consistent 2018-06-21 14:23:19 -04:00
Sean Whalen
44f87fe924 3.6.0 - Much more robust error handling 2018-06-20 10:39:20 -04:00
Sean Whalen
1761f12604 Fix crash when parsing report with empty <auth_results></auth_results> 2018-06-20 09:47:50 -04:00
Sean Whalen
70d26506bb Fix dashboard loading times 2018-06-19 12:15:55 -04:00
Sean Whalen
9b85662988 Dashboard layout improvements 2018-06-18 17:06:37 -04:00
Sean Whalen
0f0b40238f Tons of dashboard fixes 2018-06-18 11:40:00 -04:00
Sean Whalen
0df1ebc3d7 Fix typo in docs 2018-06-10 12:32:19 -04:00
Sean Whalen
b8ff5b10d6 Bump setup.py version 2018-06-10 12:20:36 -04:00
Sean Whalen
acf912eaa4 Use Cloudflare's public DNS resolvers by default instead of Google's 2018-06-10 11:21:07 -04:00
Sean Whalen
aca42a6300 Merge branch 'master' of https://github.com/domainaware/parsedmarc 2018-06-10 09:22:59 -04:00
Sean Whalen
9390e10f54 3.5.0 2018-06-10 09:22:48 -04:00
Sean Whalen
011a0f299f Merge pull request #2 from soujak/patch-1
Fix typo
2018-06-10 09:10:52 -04:00
Sean Whalen
ee7ecc1c18 Add disposition graph to Kibana 2018-06-04 10:28:37 -04:00
Sean Whalen
519ff1ce5a Dashboard fixes 2018-06-02 12:18:54 -04:00
Sean Whalen
29d3fb63d7 Dashboard improvements 2018-06-02 12:03:30 -04:00
Sean Whalen
2deb92686a Fix typos in docs 2018-05-31 12:28:48 -04:00
Marco Solieri
5a397821d0 Fix typo 2018-05-22 09:44:40 +02:00
Sean Whalen
2c7002d3cc Fix testing automation 2018-05-14 19:39:14 -04:00
Sean Whalen
055b98c88a Update README 2018-03-31 23:43:39 -04:00
Sean Whalen
c50a907988 Fix build 2018-03-31 23:33:43 -04:00
Sean Whalen
a2bbc0effc Update build script 2018-03-31 23:05:14 -04:00
Sean Whalen
da57ccdf18 3.4.1 2018-03-31 23:02:10 -04:00
Sean Whalen
085c90a028 Update dashboard screenshot 2018-03-29 21:52:21 -04:00
Sean Whalen
5df5e362d6 Fix typos in dashboard 2018-03-29 21:00:46 -04:00
Sean Whalen
cc38f05593 Add debugging 2018-03-29 20:47:02 -04:00
Sean Whalen
c27acb7901 Fix logging 2018-03-29 18:33:05 -04:00
Sean Whalen
19157830bf Update CLI help docs 2018-03-29 18:23:39 -04:00
Sean Whalen
12288606f6 Update CLI help docs 2018-03-29 18:20:30 -04:00
Sean Whalen
ab9cce3197 Remove conflicting option string 2018-03-29 18:13:39 -04:00
Sean Whalen
845ee644fd Remove conflicting option string 2018-03-29 18:11:17 -04:00
Sean Whalen
3f00677449 It's actually 3.4.0 2018-03-29 18:05:17 -04:00
Sean Whalen
e9821c585a Update docs 2018-03-29 17:18:12 -04:00
Sean Whalen
95f58018f2 Prepare for the 2.4.0 release 2018-03-29 17:02:58 -04:00
Sean Whalen
68653c6b2c Remove unneeded sleep 2018-03-29 14:07:08 -04:00
Sean Whalen
77c0255f54 Better IMAP error handling 2018-03-29 14:05:51 -04:00
Sean Whalen
e54def617d Better IDLE refresh method 2018-03-29 13:38:16 -04:00
Sean Whalen
b8eca3a536 Refresh IDLE command every 10 minutes 2018-03-29 12:32:47 -04:00
Sean Whalen
f28bd7f059 Properly scale image in README 2018-03-28 09:50:08 -04:00
Sean Whalen
b55fbf7568 Better exception handling 2018-03-28 08:58:22 -04:00
Sean Whalen
b4eeca6155 Put brake back 2018-03-28 08:42:32 -04:00
Sean Whalen
d4af392b58 Fix IMAP IDLE processing bug 2018-03-27 23:40:19 -04:00
Sean Whalen
e3035242f9 Remove debugging code 2018-03-27 16:40:00 -04:00
Sean Whalen
486dbce7a6 Make dup search more specific 2018-03-27 16:25:32 -04:00
Sean Whalen
dc9a935fe1 Make dmarc.elastic classes protected 2018-03-27 10:50:19 -04:00
Sean Whalen
02944a8f70 Add pip to build script 2018-03-27 10:43:26 -04:00
Sean Whalen
5405675e26 Fix docs 2018-03-27 10:39:35 -04:00
Sean Whalen
df5f407c7d Add parsedmarc.elastic to docs 2018-03-27 10:37:48 -04:00
Sean Whalen
8a49aacd43 Fix error message 2018-03-27 10:34:15 -04:00
Sean Whalen
eb05aaf709 3.3.0 2018-03-27 10:22:49 -04:00
Sean Whalen
02d2c12188 Fix existing aggregate report query 2018-03-27 10:15:43 -04:00
Sean Whalen
84adf2be2e Actually fix the bug 2018-03-27 10:08:36 -04:00
Sean Whalen
5d2e766d65 Fix existing report warning 2018-03-27 09:57:05 -04:00
Sean Whalen
35adcb63ca 3.2.1 2018-03-27 06:24:26 -04:00
Sean Whalen
aeb16b5f73 3.2.0 2018-03-26 23:04:08 -04:00
Sean Whalen
7ff1cd6ae8 3.1.0 2018-03-26 22:06:59 -04:00
Sean Whalen
faeef6e43d Fix screenshot target 2018-03-26 18:05:53 -04:00
Sean Whalen
bfe6fcfb7b Fix screenshot scaling 2018-03-26 18:00:26 -04:00
Sean Whalen
513d703440 Scale down screenshots 2018-03-26 17:53:04 -04:00
Sean Whalen
6409a90a5b Fix typo 2018-03-26 17:50:21 -04:00
Sean Whalen
8c99baba30 Add screenshots to docs 2018-03-26 17:46:33 -04:00
Sean Whalen
401edcba9c Add Kibana screenshots 2018-03-26 17:36:11 -04:00
Sean Whalen
87e1b6737e Update documentation 2018-03-26 13:38:43 -04:00
Sean Whalen
880644a6ca Increase Forensic Samples dashboard timeframe 2018-03-26 12:51:27 -04:00
Sean Whalen
cc9c496be1 Bug fixes 2018-03-26 10:51:04 -04:00
Sean Whalen
eea57a5719 PEP8 fixes 2018-03-25 23:12:07 -04:00
Sean Whalen
bcf0acef34 Fix exception handling 2018-03-25 23:06:34 -04:00
Sean Whalen
f755696df0 Yet more debugging 2018-03-24 18:46:50 -04:00
Sean Whalen
0392675a07 Add missing field 2018-03-24 18:28:54 -04:00
Sean Whalen
19b1df4f44 More debugging 2018-03-24 18:25:18 -04:00
Sean Whalen
f2c0fde99d Debugging 2018-03-24 12:43:25 -04:00
Sean Whalen
3c1f664d83 Update changelog 2018-03-20 11:46:51 -04:00
Sean Whalen
fc9222322f Update tests 2018-03-20 11:39:14 -04:00
Sean Whalen
182c5870c1 Fix typo in changelog 2018-03-20 11:24:06 -04:00
Sean Whalen
d13be1aab4 Normalize aggregate report IDs 2018-03-20 11:22:28 -04:00
Sean Whalen
4090f10d6f More exception handling 2018-03-20 11:05:37 -04:00
Sean Whalen
6de5eba4a3 More refactoring 2018-03-20 10:18:42 -04:00
Sean Whalen
cd54112782 More refactoring fixes 2018-03-20 09:57:13 -04:00
Sean Whalen
248a731df8 Fix refactoring 2018-03-19 21:52:02 -04:00
Sean Whalen
b0f57d6233 Fix typo 2018-03-19 15:54:13 -04:00
Sean Whalen
d11a9f4d34 More detailed warnings 2018-03-19 15:49:58 -04:00
Sean Whalen
4a7df9804b Add forgotten code 2018-03-19 12:44:31 -04:00
Sean Whalen
a0219004aa Fix CLI args 2018-03-19 12:17:17 -04:00
Sean Whalen
268b78b10a Prepare to test 3.0.0 2018-03-19 12:09:17 -04:00
Sean Whalen
effbb0bceb Move emails in chunks 2018-03-14 22:12:00 -04:00
Sean Whalen
fc16bb8a2f Fix CLI option 2018-03-14 21:47:08 -04:00
Sean Whalen
ae8c12732e Fix setting reports folder on IDLE 2018-03-14 21:27:36 -04:00
Sean Whalen
d781001087 Try again 2018-03-14 21:08:54 -04:00
Sean Whalen
9a4a66f22b Actually fix file content detection 2018-03-14 21:02:11 -04:00
Sean Whalen
1f73dcfe8c Fix file type detection 2018-03-14 20:55:10 -04:00
Sean Whalen
1c1280a0a2 Fix email retrieval 2018-03-14 20:48:29 -04:00
Sean Whalen
2b25308402 Testing 2.2.0 2018-03-14 20:43:35 -04:00
Sean Whalen
d900ebf0eb 2.1.2 2018-03-06 10:01:27 -05:00
Sean Whalen
28a115a223 2.1.1 - Documentation fixes 2018-03-06 07:14:37 -05:00
Sean Whalen
0924b0bfba Change default attachment name. 2.1.0 For real this time. 2018-03-05 18:08:53 -05:00
Sean Whalen
fd31cf164f Better error messages 2018-03-05 18:05:33 -05:00
Sean Whalen
e6a44232aa More error handling 2018-03-05 17:49:28 -05:00
Sean Whalen
3f823b4818 Email bug fixes - 2.0.0 release 2018-03-05 17:43:54 -05:00
Sean Whalen
544c915f0b Make mail sending more flexible 2018-03-05 17:31:14 -05:00
Sean Whalen
5043d34872 Fix sending email without login 2018-03-05 17:16:06 -05:00
Sean Whalen
614c8b68fb Use STARTTLS instead of SSL 2018-03-05 17:08:34 -05:00
Sean Whalen
1b50d37e30 Better error handling 2018-03-05 17:04:31 -05:00
Sean Whalen
7f686497ec Fix zip creation 2018-03-05 16:54:44 -05:00
Sean Whalen
d56d01592d Fix CLI input 2018-03-05 16:43:03 -05:00
Sean Whalen
545fd31783 2.1.0 2018-03-05 16:19:21 -05:00
Sean Whalen
d98df5f02b 2.0.1 2018-03-04 16:03:37 -05:00
Sean Whalen
7b3eb2aa2f Drop support for Python 2 2018-03-04 12:34:21 -05:00
Sean Whalen
aa73f55681 Fix Python 2 issue 2018-03-04 12:04:23 -05:00
Sean Whalen
0135d46afb Add missing dependency mail-parser 2018-03-04 11:54:00 -05:00
Sean Whalen
36a2cef580 Merge conflicts 2018-03-04 11:46:33 -05:00
Sean Whalen
05d49222c6 2.0.0 2018-03-04 11:22:24 -05:00
Sean Whalen
ff4e32e43d 1.1.0 2018-02-08 15:13:35 -05:00
Sean Whalen
8015e6a25c Update build script 2018-02-06 10:56:06 -05:00
Sean Whalen
c8d7bc703e 1.0.5 - Properly format errors list in CSV output 2018-02-06 10:40:58 -05:00
Sean Whalen
4d0c33b59f 1.0.4 - Prefix public suffix and GeoIP2 database filenames with . 2018-02-05 23:55:17 -05:00
Sean Whalen
ff518558cc 1.0.3 version bump 2018-02-05 23:10:47 -05:00
Sean Whalen
ef24b8563c 1.0.3 - Fix doc flaws 2018-02-05 23:09:10 -05:00
Sean Whalen
1bb26a718e 1.0.1 - Fix packaging flaw 2018-02-05 22:26:58 -05:00
132 changed files with 75168 additions and 1371 deletions

5
.dockerignore Normal file

@@ -0,0 +1,5 @@
venv/
dist/
build/
test/
parsedmarc.egg-info/

1
.gitattributes vendored Normal file

@@ -0,0 +1 @@
samples/* binary

55
.github/workflows/docker.yml vendored Normal file

@@ -0,0 +1,55 @@
name: Build docker image
permissions:
contents: read
on:
release:
types:
- published
push:
branches:
- master
env:
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
jobs:
build-and-push-image:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
steps:
- name: Checkout repository
uses: actions/checkout@v5
- name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
images: |
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
# generate Docker tags based on the following events/attributes
tags: |
type=ref,event=branch
type=ref,event=pr
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
- name: Log in to the Container registry
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push Docker image
uses: docker/build-push-action@v6
with:
context: .
push: ${{ github.event_name == 'release' }}
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}

70
.github/workflows/python-tests.yml vendored Normal file

@@ -0,0 +1,70 @@
name: Python tests
permissions:
contents: read
on:
push:
branches: [ master ]
pull_request:
branches: [ master ]
jobs:
build:
runs-on: ubuntu-latest
services:
elasticsearch:
image: elasticsearch:8.19.7
env:
discovery.type: single-node
cluster.name: parsedmarc-cluster
discovery.seed_hosts: elasticsearch
bootstrap.memory_lock: true
xpack.security.enabled: false
xpack.license.self_generated.type: basic
ports:
- 9200:9200
- 9300:9300
strategy:
fail-fast: false
matrix:
python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
steps:
- uses: actions/checkout@v5
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v6
with:
python-version: ${{ matrix.python-version }}
- name: Install system dependencies
run: |
sudo apt-get -q update
sudo apt-get -qy install libemail-outlook-message-perl
- name: Install Python dependencies
run: |
python -m pip install --upgrade pip
pip install .[build]
- name: Test building documentation
run: |
cd docs
make html
- name: Check code style
run: |
ruff check .
- name: Run unit tests
run: |
pytest --cov --cov-report=xml tests.py
- name: Test sample DMARC reports
run: |
pip install -e .
parsedmarc --debug -c ci.ini samples/aggregate/*
parsedmarc --debug -c ci.ini samples/forensic/*
- name: Test building packages
run: |
hatch build
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}

45
.gitignore vendored

@@ -24,6 +24,7 @@ wheels/
*.egg-info/
.installed.cfg
*.egg
_tmp*
# PyInstaller
# Usually these files are written by a python script from a template
@@ -45,6 +46,7 @@ nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/
# Translations
*.mo
@@ -62,7 +64,7 @@ instance/
.scrapy
# Sphinx documentation
docs/_build/
docs/build/
# PyBuilder
target/
@@ -103,18 +105,43 @@ ENV/
# PyCharm Project settings
.idea/
# I/O files
# VS Code launch config
#.vscode/launch.json
*.xml
*.zip
*.gz
*.json
*.csv
# Visual Studio Code settings
#.vscode/
# I/O files
output/
*.xls*
# LibreOffice lock files
.~*
# ignore data files
# Data files
*.dat
*.mmdb
GeoIP*
GeoLite*
# Temp files
tmp/
# Config files
prod*.ini
stage*.ini
dev*.ini
# Private samples
samples/private
*.html
*.sqlite-journal
parsedmarc.ini
scratch.py
parsedmarc/resources/maps/base_reverse_dns.csv
parsedmarc/resources/maps/unknown_base_reverse_dns.csv
parsedmarc/resources/maps/sus_domains.csv
parsedmarc/resources/maps/unknown_domains.txt
*.bak


@@ -1,23 +0,0 @@
language: python
sudo: false
python:
- '2.7'
- '3.4'
- '3.5'
- '3.6'
# commands to install dependencies
install:
- "pip install flake8 pytest-cov pytest coveralls"
- "pip install -r requirements.txt"
# commands to run samples
script:
- "flake8 *.py"
- "cd docs"
- "make html"
- "cd .."
- "python tests.py"
- "python setup.py bdist_wheel"

45
.vscode/launch.json vendored Normal file

@@ -0,0 +1,45 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Python Debugger: Current File",
"type": "debugpy",
"request": "launch",
"program": "${file}",
"console": "integratedTerminal"
},
{
"name": "tests.py",
"type": "debugpy",
"request": "launch",
"program": "tests.py",
"console": "integratedTerminal"
},
{
"name": "sample",
"type": "debugpy",
"request": "launch",
"module": "parsedmarc.cli",
"args": ["samples/private/sample"]
},
{
"name": "sortlists.py",
"type": "debugpy",
"request": "launch",
"program": "sortlists.py",
"cwd": "${workspaceFolder}/parsedmarc/resources/maps",
"console": "integratedTerminal"
},
{
"name": "find_unknown_base_reverse_dns.py",
"type": "debugpy",
"request": "launch",
"program": "find_unknown_base_reverse_dns.py",
"cwd": "${workspaceFolder}/parsedmarc/resources/maps",
"console": "integratedTerminal"
}
]
}

160
.vscode/settings.json vendored Normal file

@@ -0,0 +1,160 @@
{
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnSave": true,
// Let Ruff handle lint fixes + import sorting on save
"editor.codeActionsOnSave": {
"source.fixAll.ruff": "explicit",
"source.organizeImports.ruff": "explicit"
}
},
"markdownlint.config": {
"MD024": false
},
"cSpell.words": [
"adkim",
"akamaiedge",
"amsmath",
"andrewmcgilvray",
"arcname",
"aspf",
"autoclass",
"automodule",
"backported",
"bellsouth",
"boto",
"brakhane",
"Brightmail",
"CEST",
"CHACHA",
"checkdmarc",
"Codecov",
"confnew",
"dateparser",
"dateutil",
"Davmail",
"DBIP",
"dearmor",
"deflist",
"devel",
"DMARC",
"Dmarcian",
"dnspython",
"dollarmath",
"dpkg",
"exampleuser",
"expiringdict",
"fieldlist",
"GELF",
"genindex",
"geoip",
"geoipupdate",
"Geolite",
"geolocation",
"githubpages",
"Grafana",
"hostnames",
"htpasswd",
"httpasswd",
"httplib",
"IMAP",
"imapclient",
"infile",
"Interaktive",
"IPDB",
"journalctl",
"keepalive",
"keyout",
"keyrings",
"Leeman",
"libemail",
"linkify",
"LISTSERV",
"lxml",
"mailparser",
"mailrelay",
"mailsuite",
"maxdepth",
"MAXHEADERS",
"maxmind",
"mbox",
"mfrom",
"michaeldavie",
"mikesiegel",
"Mimecast",
"mitigations",
"MMDB",
"modindex",
"msgconvert",
"msgraph",
"MSSP",
"multiprocess",
"Munge",
"ndjson",
"newkey",
"Nhcm",
"nojekyll",
"nondigest",
"nosecureimap",
"nosniff",
"nwettbewerb",
"opensearch",
"opensearchpy",
"parsedmarc",
"passsword",
"Postorius",
"premade",
"procs",
"publicsuffix",
"publicsuffixlist",
"publixsuffix",
"pygelf",
"pypy",
"pytest",
"quickstart",
"Reindex",
"replyto",
"reversename",
"Rollup",
"Rpdm",
"SAMEORIGIN",
"sdist",
"Servernameone",
"setuptools",
"smartquotes",
"SMTPTLS",
"sortlists",
"sortmaps",
"sourcetype",
"STARTTLS",
"tasklist",
"timespan",
"tlsa",
"tlsrpt",
"toctree",
"TQDDM",
"tqdm",
"truststore",
"Übersicht",
"uids",
"Uncategorized",
"unparasable",
"uper",
"urllib",
"Valimail",
"venv",
"Vhcw",
"viewcode",
"virtualenv",
"WBITS",
"webmail",
"Wettbewerber",
"Whalen",
"whitespaces",
"xennn",
"xmltodict",
"xpack",
"zscholl"
],
}

File diff suppressed because it is too large.

35
Dockerfile Normal file

@@ -0,0 +1,35 @@
ARG BASE_IMAGE=python:3.13-slim
ARG USERNAME=parsedmarc
ARG USER_UID=1000
ARG USER_GID=$USER_UID
## build
FROM $BASE_IMAGE AS build
WORKDIR /app
RUN pip install hatch
COPY parsedmarc/ parsedmarc/
COPY README.md pyproject.toml ./
RUN hatch build
## image
FROM $BASE_IMAGE
ARG USERNAME
ARG USER_UID
ARG USER_GID
COPY --from=build /app/dist/*.whl /tmp/dist/
RUN set -ex; \
groupadd --gid ${USER_GID} ${USERNAME}; \
useradd --uid ${USER_UID} --gid ${USER_GID} -m ${USERNAME}; \
pip install /tmp/dist/*.whl; \
rm -rf /tmp/dist
USER $USERNAME
ENTRYPOINT ["parsedmarc"]

64
README.md Normal file

@@ -0,0 +1,64 @@
# parsedmarc
[![Build
Status](https://github.com/domainaware/parsedmarc/actions/workflows/python-tests.yml/badge.svg)](https://github.com/domainaware/parsedmarc/actions/workflows/python-tests.yml)
[![Code
Coverage](https://codecov.io/gh/domainaware/parsedmarc/branch/master/graph/badge.svg)](https://codecov.io/gh/domainaware/parsedmarc)
[![PyPI
Package](https://img.shields.io/pypi/v/parsedmarc.svg)](https://pypi.org/project/parsedmarc/)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/parsedmarc?color=blue)](https://pypistats.org/packages/parsedmarc)
<p align="center">
<img src="https://raw.githubusercontent.com/domainaware/parsedmarc/refs/heads/master/docs/source/_static/screenshots/dmarc-summary-charts.png?raw=true" alt="A screenshot of DMARC summary charts in Kibana"/>
</p>
`parsedmarc` is a Python module and CLI utility for parsing DMARC
reports. When used with Elasticsearch and Kibana (or Splunk), it works
as a self-hosted open-source alternative to commercial DMARC report
processing services such as Agari Brand Protection, Dmarcian, OnDMARC,
ProofPoint Email Fraud Defense, and Valimail.
> [!NOTE]
> __Domain-based Message Authentication, Reporting, and Conformance__ (DMARC) is an email authentication protocol.
## Help Wanted
This project is maintained by one developer. Please consider reviewing the open
[issues](https://github.com/domainaware/parsedmarc/issues) to see how you can
contribute code, documentation, or user support. Assistance on the pinned
issues would be particularly helpful.
Thanks to all
[contributors](https://github.com/domainaware/parsedmarc/graphs/contributors)!
## Features
- Parses draft and 1.0 standard aggregate/rua DMARC reports
- Parses forensic/failure/ruf DMARC reports
- Parses reports from SMTP TLS Reporting
- Can parse reports from an inbox over IMAP, Microsoft Graph, or Gmail API
- Transparently handles gzip or zip compressed reports
- Consistent data structures
- Simple JSON and/or CSV output
- Optionally email the results
- Optionally send the results to Elasticsearch, Opensearch, and/or Splunk, for
use with premade dashboards
- Optionally send reports to Apache Kafka
## Python Compatibility
This project supports the following Python versions, which are either actively maintained or are the default versions
for RHEL or Debian.
| Version | Supported | Reason |
|---------|-----------|------------------------------------------------------------|
| < 3.6 | ❌ | End of Life (EOL) |
| 3.6 | ❌ | Used in RHEL 8, but not supported by project dependencies |
| 3.7 | ❌ | End of Life (EOL) |
| 3.8 | ❌ | End of Life (EOL) |
| 3.9 | ✅ | Supported until August 2026 (Debian 11); May 2032 (RHEL 9) |
| 3.10 | ✅ | Actively maintained |
| 3.11 | ✅ | Actively maintained; supported until June 2028 (Debian 12) |
| 3.12 | ✅ | Actively maintained; supported until May 2035 (RHEL 10) |
| 3.13 | ✅ | Actively maintained; supported until June 2030 (Debian 13) |
| 3.14 | ❌ | Not currently supported due to [this imapclient bug](https://github.com/mjs/imapclient/issues/618) |
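A quick, illustrative sketch of calling the module from Python (the `parse_report_file` helper and its `report_type`/`report` keys are assumed here, and the sample path is a placeholder):

```python
# Illustrative only: parse one report file and print the normalized result.
import json

import parsedmarc

result = parsedmarc.parse_report_file("samples/aggregate/example.xml.gz")
print(result["report_type"])  # e.g. "aggregate"
print(json.dumps(result["report"], indent=2, default=str))
```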


@@ -1,188 +0,0 @@
parsedmarc
==========
|Build Status|
``parsedmarc`` is a Python module and CLI utility for parsing aggregate DMARC reports.
Features
========
* Parses draft and 1.0 standard aggregate reports
* Transparently handles gzip or zip compressed reports
* Consistent data structures
* Simple JSON or CSV output
* Python 2 and 3 support
CLI help
========
::
usage: parsedmarc.py [-h] [-f FORMAT] [-o OUTPUT]
[-n NAMESERVER [NAMESERVER ...]] [-t TIMEOUT] [-v]
file_path [file_path ...]
Parses aggregate DMARC reports
positional arguments:
file_path one or more paths of aggregate report files
(compressed or uncompressed)
optional arguments:
-h, --help show this help message and exit
-f FORMAT, --format FORMAT
specify JSON or CSV output format
-o OUTPUT, --output OUTPUT
output to a file path rather than printing to the
screen
-n NAMESERVER [NAMESERVER ...], --nameserver NAMESERVER [NAMESERVER ...]
nameservers to query
-t TIMEOUT, --timeout TIMEOUT
number of seconds to wait for an answer from DNS
(default 6.0)
-v, --version show program's version number and exit
Sample output
=============
Here are the results from parsing the `example <https://dmarc.org/wiki/FAQ#I_need_to_implement_aggregate_reports.2C_what_do_they_look_like.3F>`_
report from the dmarc.org wiki. It's actually an older draft of the 1.0
report schema standardized in
`RFC 7489 Appendix C <https://tools.ietf.org/html/rfc7489#appendix-C>`_.
This draft schema is still in wide use.
``parsedmarc`` produces consistent, normalized output, regardless of the report schema.
JSON
----
.. code-block:: json
{
"xml_schema": "draft",
"report_metadata": {
"org_name": "acme.com",
"org_email": "noreply-dmarc-support@acme.com",
"org_extra_contact_info": "http://acme.com/dmarc/support",
"report_id": "9391651994964116463",
"begin_date": "2012-04-27 20:00:00",
"end_date": "2012-04-28 19:59:59",
"errors": []
},
"policy_published": {
"domain": "example.com",
"adkim": "r",
"aspf": "r",
"p": "none",
"sp": "none",
"pct": "100",
"fo": "0"
},
"records": [
{
"source": {
"ip_address": "72.150.241.94",
"country": "US",
"reverse_dns": "adsl-72-150-241-94.shv.bellsouth.net",
"base_domain": "bellsouth.net"
},
"count": 2,
"policy_evaluated": {
"disposition": "none",
"dkim": "fail",
"spf": "pass",
"policy_override_reasons": []
},
"identifiers": {
"header_from": "example.com",
"envelope_from": "example.com",
"envelope_to": null
},
"auth_results": {
"dkim": [
{
"domain": "example.com",
"selector": "none",
"result": "fail"
}
],
"spf": [
{
"domain": "example.com",
"scope": "mfrom",
"result": "pass"
}
]
}
}
]
}
CSV
---
::
xml_schema,org_name,org_email,org_extra_contact_info,report_id,begin_date,end_date,errors,domain,adkim,aspf,p,sp,pct,fo,source_ip_address,source_country,source_reverse_dns,source_base_domain,count,disposition,dkim_alignment,spf_alignment,policy_override_reasons,policy_override_comments,envelope_from,header_from,envelope_to,dkim_domains,dkim_selectors,dkim_results,spf_domains,spf_scopes,spf_results
draft,acme.com,noreply-dmarc-support@acme.com,http://acme.com/dmarc/support,9391651994964116463,2012-04-27 20:00:00,2012-04-28 19:59:59,[],example.com,r,r,none,none,100,0,72.150.241.94,US,adsl-72-150-241-94.shv.bellsouth.net,bellsouth.net,2,none,fail,pass,,,example.com,example.com,,example.com,none,fail,example.com,mfrom,pass
What about forensic DMARC reports?
==================================
Forensic DMARC reports are emails with an attached email sample that failed a
DMARC check. You can parse them with any email message parser, such as
`mail-parser <https://pypi.python.org/pypi/mail-parser/>`_.
Very few recipients send forensic reports, and even those who do will often
provide only the message headers, and not the message's content, for privacy
reasons.
Installation
============
``parsedmarc`` works with Python 2 or 3, but Python 3 is preferred.
On Debian or Ubuntu systems, run:
.. code-block:: bash
$ sudo apt-get install python3-pip
Python 3 installers for Windows and macOS can be found at https://www.python.org/downloads/
To install or upgrade to the latest stable release of ``parsedmarc`` on macOS or Linux, run
.. code-block:: bash
$ sudo -H pip3 install -U parsedmarc
Or, install the latest development release directly from GitHub:
.. code-block:: bash
$ sudo -H pip3 install -U git+https://github.com/domainaware/parsedmarc.git
.. note::
On Windows, ``pip3`` is ``pip``, even with Python 3. So on Windows, simply
substitute ``pip`` as an administrator in place of ``sudo pip3``, in the above commands.
Documentation
=============
https://domainaware.github.io/parsedmarc
Bug reports
===========
Please report bugs on the GitHub issue tracker
https://github.com/domainaware/parsedmarc/issues
.. |Build Status| image:: https://travis-ci.org/domainaware/parsedmarc.svg?branch=master
:target: https://travis-ci.org/domainaware/parsedmarc

27
build.sh Executable file

@@ -0,0 +1,27 @@
#!/usr/bin/env bash
set -e
if [ ! -d "venv" ]; then
virtualenv venv || exit
fi
. venv/bin/activate
pip install .[build]
ruff format .
cd docs
make clean
make html
touch build/html/.nojekyll
if [ -d "../../parsedmarc-docs" ]; then
cp -rf build/html/* ../../parsedmarc-docs/
fi
cd ..
cd parsedmarc/resources/maps
python3 sortlists.py
echo "Checking for invalid UTF-8 bytes in base_reverse_dns_map.csv"
python3 find_bad_utf8.py base_reverse_dns_map.csv
cd ../../..
python3 tests.py
rm -rf dist/ build/
hatch build

12
ci.ini Normal file

@@ -0,0 +1,12 @@
[general]
save_aggregate = True
save_forensic = True
save_smtp_tls = True
debug = True
[elasticsearch]
hosts = http://localhost:9200
ssl = False
number_of_shards=2
number_of_replicas=2

55
docker-compose.yml Normal file

@@ -0,0 +1,55 @@
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:8.19.7
environment:
- network.host=127.0.0.1
- http.host=0.0.0.0
- node.name=elasticsearch
- discovery.type=single-node
- cluster.name=parsedmarc-cluster
- discovery.seed_hosts=elasticsearch
- bootstrap.memory_lock=true
- xpack.security.enabled=false
- xpack.license.self_generated.type=basic
ports:
- "127.0.0.1:9200:9200"
ulimits:
memlock:
soft: -1
hard: -1
healthcheck:
test:
[
"CMD-SHELL",
"curl -s -XGET http://localhost:9200/_cluster/health?pretty | grep status | grep -q '\\(green\\|yellow\\)'"
]
interval: 10s
timeout: 10s
retries: 24
opensearch:
image: opensearchproject/opensearch:2
environment:
- network.host=127.0.0.1
- http.host=0.0.0.0
- node.name=opensearch
- discovery.type=single-node
- cluster.name=parsedmarc-cluster
- discovery.seed_hosts=opensearch
- bootstrap.memory_lock=true
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD}
ports:
- "127.0.0.1:9201:9200"
ulimits:
memlock:
soft: -1
hard: -1
healthcheck:
test:
[
"CMD-SHELL",
"curl -s -XGET http://localhost:9201/_cluster/health?pretty | grep status | grep -q '\\(green\\|yellow\\)'"
]
interval: 10s
timeout: 10s
retries: 24


@@ -3,10 +3,10 @@
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = python -msphinx
SPHINXBUILD = python3 -msphinx
SPHINXPROJ = parsedmarc
SOURCEDIR = .
BUILDDIR = _build
SOURCEDIR = source
BUILDDIR = build
# Put it first so that "make" without argument is like "make help".
help:


@@ -1 +0,0 @@
Make directory show up in git.


@@ -1,179 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# parsedmarc documentation build configuration file, created by
# sphinx-quickstart on Mon Feb 5 18:25:39 2018.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
sys.path.insert(0, os.path.abspath('..'))
from parsedmarc import __version__
# -- General configuration ------------------------------------------------
# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = ['sphinx.ext.autodoc',
'sphinx.ext.doctest',
'sphinx.ext.todo',
'sphinx.ext.viewcode',
'sphinx.ext.githubpages',
'sphinx.ext.napoleon']
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
# source_suffix = ['.rst', '.md']
source_suffix = '.rst'
# The master toctree document.
master_doc = 'index'
# General information about the project.
project = 'parsedmarc'
copyright = '2018, Sean Whalen'
author = 'Sean Whalen'
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = __version__
# The full version, including alpha/beta/rc tags.
release = version
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This patterns also effect to html_static_path and html_extra_path
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
todo_include_todos = False
# -- Options for HTML output ----------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'sphinx_rtd_theme'
# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
#
# This is required for the alabaster theme
# refs: http://alabaster.readthedocs.io/en/latest/installation.html#sidebars
html_sidebars = {
'**': [
'about.html',
'navigation.html',
'relations.html', # needs 'show_related': True theme option to display
'searchbox.html',
'donate.html',
]
}
# -- Options for HTMLHelp output ------------------------------------------
# Output file base name for HTML help builder.
htmlhelp_basename = 'parsedmarcdoc'
# -- Options for LaTeX output ---------------------------------------------
latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',
# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',
# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',
# Latex figure (float) alignment
#
# 'figure_align': 'htbp',
}
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(master_doc, 'parsedmarc.tex', 'parsedmarc Documentation',
'parsedmarc', 'manual'),
]
# -- Options for manual page output ---------------------------------------
# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
(master_doc, 'parsedmarc', 'parsedmarc Documentation',
[author], 1)
]
# -- Options for Texinfo output -------------------------------------------
# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(master_doc, 'parsedmarc', 'parsedmarc Documentation',
author, 'parsedmarc', 'One line description of project.',
'Miscellaneous'),
]


@@ -1,204 +0,0 @@
.. parsedmarc documentation master file, created by
sphinx-quickstart on Mon Feb 5 18:25:39 2018.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Welcome to parsedmarc's documentation!
======================================
|Build Status|
``parsedmarc`` is a Python module and CLI utility for parsing aggregate DMARC reports.
Features
========
* Parses draft and 1.0 standard aggregate reports
* Transparently handles gzip or zip compressed reports
* Consistent data structures
* Simple JSON or CSV output
* Python 2 and 3 support
CLI help
========
::
usage: parsedmarc.py [-h] [-f FORMAT] [-o OUTPUT]
[-n NAMESERVER [NAMESERVER ...]] [-t TIMEOUT] [-v]
file_path [file_path ...]
Parses aggregate DMARC reports
positional arguments:
file_path one or more paths of aggregate report files
(compressed or uncompressed)
optional arguments:
-h, --help show this help message and exit
-f FORMAT, --format FORMAT
specify JSON or CSV output format
-o OUTPUT, --output OUTPUT
output to a file path rather than printing to the
screen
-n NAMESERVER [NAMESERVER ...], --nameserver NAMESERVER [NAMESERVER ...]
nameservers to query
-t TIMEOUT, --timeout TIMEOUT
number of seconds to wait for an answer from DNS
(default 6.0)
-v, --version show program's version number and exit
Sample output
=============
Here are the results from parsing the `example <https://dmarc.org/wiki/FAQ#I_need_to_implement_aggregate_reports.2C_what_do_they_look_like.3F>`_
report from the dmarc.org wiki. It's actually an older draft of the 1.0
report schema standardized in
`RFC 7489 Appendix C <https://tools.ietf.org/html/rfc7489#appendix-C>`_.
This draft schema is still in wide use.
``parsedmarc`` produces consistent, normalized output, regardless of the report schema.
JSON
----
.. code-block:: json
{
"xml_schema": "draft",
"report_metadata": {
"org_name": "acme.com",
"org_email": "noreply-dmarc-support@acme.com",
"org_extra_contact_info": "http://acme.com/dmarc/support",
"report_id": "9391651994964116463",
"begin_date": "2012-04-27 20:00:00",
"end_date": "2012-04-28 19:59:59",
"errors": []
},
"policy_published": {
"domain": "example.com",
"adkim": "r",
"aspf": "r",
"p": "none",
"sp": "none",
"pct": "100",
"fo": "0"
},
"records": [
{
"source": {
"ip_address": "72.150.241.94",
"country": "US",
"reverse_dns": "adsl-72-150-241-94.shv.bellsouth.net",
"base_domain": "bellsouth.net"
},
"count": 2,
"policy_evaluated": {
"disposition": "none",
"dkim": "fail",
"spf": "pass",
"policy_override_reasons": []
},
"identifiers": {
"header_from": "example.com",
"envelope_from": "example.com",
"envelope_to": null
},
"auth_results": {
"dkim": [
{
"domain": "example.com",
"selector": "none",
"result": "fail"
}
],
"spf": [
{
"domain": "example.com",
"scope": "mfrom",
"result": "pass"
}
]
}
}
]
}
CSV
---
::
xml_schema,org_name,org_email,org_extra_contact_info,report_id,begin_date,end_date,errors,domain,adkim,aspf,p,sp,pct,fo,source_ip_address,source_country,source_reverse_dns,source_base_domain,count,disposition,dkim_alignment,spf_alignment,policy_override_reasons,policy_override_comments,envelope_from,header_from,envelope_to,dkim_domains,dkim_selectors,dkim_results,spf_domains,spf_scopes,spf_results
draft,acme.com,noreply-dmarc-support@acme.com,http://acme.com/dmarc/support,9391651994964116463,2012-04-27 20:00:00,2012-04-28 19:59:59,[],example.com,r,r,none,none,100,0,72.150.241.94,US,adsl-72-150-241-94.shv.bellsouth.net,bellsouth.net,2,none,fail,pass,,,example.com,example.com,,example.com,none,fail,example.com,mfrom,pass
What about forensic DMARC reports?
==================================
Forensic DMARC reports are emails with an attached email sample that failed a
DMARC check. You can parse them with any email message parser, such as
`mail-parser <https://pypi.python.org/pypi/mail-parser/>`_.
Very few recipients send forensic reports, and even those who do will often
provide only the message headers, and not the message's content, for privacy
reasons.
Bug reports
===========
Please report bugs on the GitHub issue tracker
https://github.com/domainaware/parsedmarc/issues
Installation
============
``parsedmarc`` works with Python 2 or 3, but Python 3 is preferred.
On Debian or Ubuntu systems, run:
.. code-block:: bash
$ sudo apt-get install python3-pip
Python 3 installers for Windows and macOS can be found at https://www.python.org/downloads/
To install or upgrade to the latest stable release of ``parsedmarc`` on macOS or Linux, run
.. code-block:: bash
$ sudo -H pip3 install -U parsedmarc
Or, install the latest development release directly from GitHub:
.. code-block:: bash
$ sudo -H pip3 install -U git+https://github.com/domainaware/parsedmarc.git
.. note::
On Windows, ``pip3`` is ``pip``, even with Python 3. So on Windows, simply
substitute ``pip`` as an administrator in place of ``sudo pip3``, in the above commands.
API
===
.. automodule:: parsedmarc
:members:
.. toctree::
:maxdepth: 2
:caption: Contents:
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
.. |Build Status| image:: https://travis-ci.org/domainaware/parsedmarc.svg?branch=master
:target: https://travis-ci.org/domainaware/parsedmarc

View File

@@ -7,8 +7,8 @@ REM Command file for Sphinx documentation
if "%SPHINXBUILD%" == "" (
set SPHINXBUILD=python -msphinx
)
set SOURCEDIR=.
set BUILDDIR=_build
set SOURCEDIR=source
set BUILDDIR=build
set SPHINXPROJ=parsedmarc
if "%1" == "" goto help

(10 binary image files added; contents not shown in this view)
41
docs/source/api.md Normal file
View File

@@ -0,0 +1,41 @@
# API reference
## parsedmarc
```{eval-rst}
.. automodule:: parsedmarc
:members:
```
## parsedmarc.elastic
```{eval-rst}
.. automodule:: parsedmarc.elastic
:members:
```
## parsedmarc.opensearch
```{eval-rst}
.. automodule:: parsedmarc.opensearch
:members:
```
## parsedmarc.splunk
```{eval-rst}
.. automodule:: parsedmarc.splunk
:members:
```
## parsedmarc.utils
```{eval-rst}
.. automodule:: parsedmarc.utils
:members:
```
## Indices and tables
- {ref}`genindex`
- {ref}`modindex`

94
docs/source/conf.py Normal file
View File

@@ -0,0 +1,94 @@
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
sys.path.insert(0, os.path.abspath(os.path.join("..", "..")))
from parsedmarc import __version__
# -- Project information -----------------------------------------------------
project = "parsedmarc"
copyright = "2018 - 2025, Sean Whalen and contributors"
author = "Sean Whalen and contributors"
# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = __version__
# The full version, including alpha/beta/rc tags.
release = version
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
"sphinx.ext.autodoc",
"sphinx.ext.doctest",
"sphinx.ext.todo",
"sphinx.ext.viewcode",
"sphinx.ext.githubpages",
"sphinx.ext.napoleon",
"myst_parser",
]
myst_enable_extensions = [
"amsmath",
"colon_fence",
"deflist",
"dollarmath",
"fieldlist",
"html_admonition",
"html_image",
"linkify",
"replacements",
"smartquotes",
"strikethrough",
"substitution",
"tasklist",
]
myst_heading_anchors = 3
autoclass_content = "init"
# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]
# The suffixes of source filenames.
source_suffix = [".rst", ".md"]
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = []
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "sphinx_rtd_theme"
html_theme_options = {"globaltoc_collapse": False}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]

View File

@@ -0,0 +1,7 @@
# Contributing to parsedmarc
## Bug reports
Please report bugs on the GitHub issue tracker
<https://github.com/domainaware/parsedmarc/issues>

189
docs/source/davmail.md Normal file
View File

@@ -0,0 +1,189 @@
# Accessing an inbox using OWA/EWS
:::{note}
Starting in 8.0.0, parsedmarc supports accessing Microsoft/Office 365
inboxes via the Microsoft Graph API, which is preferred over Davmail.
:::
Some organizations do not allow IMAP or the Microsoft Graph API,
and only support Exchange Web Services (EWS)/Outlook Web Access (OWA).
In that case, Davmail will need to be set up
as a local EWS/OWA IMAP gateway. It can even work where
[Modern Auth/multi-factor authentication] is required.
To do this, download the latest `davmail-version.zip` from
<https://sourceforge.net/projects/davmail/files/>
Extract the zip using the `unzip` command.
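For example, since the systemd unit later in this guide expects DavMail to live
under `/opt/davmail`, a minimal sketch of the extraction step might look like this
(the exact zip filename depends on the DavMail version you downloaded):

```bash
# Create the install directory and extract the DavMail zip into it
sudo mkdir -p /opt/davmail
sudo unzip davmail-*.zip -d /opt/davmail
```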
Install Java:
```bash
sudo apt-get install default-jre-headless
```
Configure Davmail by creating a `davmail.properties` file
```properties
# DavMail settings, see http://davmail.sourceforge.net/ for documentation
#############################################################
# Basic settings
# Server or workstation mode
davmail.server=true
# connection mode auto, EWS or WebDav
davmail.enableEws=auto
# base Exchange OWA or EWS url
davmail.url=https://outlook.office365.com/EWS/Exchange.asmx
# Listener ports
davmail.imapPort=1143
#############################################################
# Network settings
# Network proxy settings
davmail.enableProxy=false
davmail.useSystemProxies=false
davmail.proxyHost=
davmail.proxyPort=
davmail.proxyUser=
davmail.proxyPassword=
# proxy exclude list
davmail.noProxyFor=
# block remote connection to DavMail
davmail.allowRemote=false
# bind server sockets to the loopback address
davmail.bindAddress=127.0.0.1
# disable SSL for specified listeners
davmail.ssl.nosecureimap=true
# Send keepalive character during large folder and messages download
davmail.enableKeepalive=true
# Message count limit on folder retrieval
davmail.folderSizeLimit=0
#############################################################
# IMAP settings
# Delete messages immediately on IMAP STORE \Deleted flag
davmail.imapAutoExpunge=true
# Enable IDLE support, set polling delay in minutes
davmail.imapIdleDelay=1
# Always reply to IMAP RFC822.SIZE requests with Exchange approximate
# message size for performance reasons
davmail.imapAlwaysApproxMsgSize=true
# Client connection timeout in seconds - default 300, 0 to disable
davmail.clientSoTimeout=0
#############################################################
```
## Running DavMail as a systemd service
Use systemd to run `davmail` as a service.
Create a system user
```bash
sudo useradd davmail -r -s /bin/false
```
Protect the `davmail` configuration file from prying eyes
```bash
sudo chown root:davmail /opt/davmail/davmail.properties
sudo chmod u=rw,g=r,o= /opt/davmail/davmail.properties
```
Create the service configuration file
```bash
sudo nano /etc/systemd/system/davmail.service
```
```ini
[Unit]
Description=DavMail gateway service
Documentation=https://sourceforge.net/projects/davmail/
Wants=network-online.target
After=syslog.target network.target
[Service]
ExecStart=/opt/davmail/davmail /opt/davmail/davmail.properties
User=davmail
Group=davmail
Restart=always
RestartSec=5m
[Install]
WantedBy=multi-user.target
```
Then, enable the service
```bash
sudo systemctl daemon-reload
sudo systemctl enable davmail.service
sudo service davmail restart
```
:::{note}
You must also run the above commands whenever you edit
`davmail.service`.
:::
:::{warning}
Always restart the service every time you upgrade to a new version of
`davmail`:
```bash
sudo service davmail restart
```
:::
To check the status of the service, run:
```bash
service davmail status
```
:::{note}
In the event of a crash, systemd will restart the service after 5
minutes, but the `service davmail status` command will only show the
logs for the current process. To view the logs for previous runs as
well as the current process (newest to oldest), run:
```bash
journalctl -u davmail.service -r
```
:::
## Configuring parsedmarc for DavMail
Because you are interacting with the DavMail server over the loopback
interface (i.e. `127.0.0.1`), add the following options to the `parsedmarc.ini`
config file:
```ini
[imap]
host=127.0.0.1
port=1143
ssl=False
watch=True
```
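With those options in place, `parsedmarc` can be pointed at the same configuration
file it normally uses; the path below is just an example:

```bash
parsedmarc -c /etc/parsedmarc.ini
```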
[modern auth/multi-factor authentication]: https://davmail.sourceforge.net/faq.html

71
docs/source/dmarc.md Normal file
View File

@@ -0,0 +1,71 @@
# Understanding DMARC
## Resources
### DMARC guides
- [Demystifying DMARC] - A complete guide to SPF, DKIM, and DMARC
[demystifying dmarc]: https://seanthegeek.net/459/demystifying-dmarc/
### SPF and DMARC record validation
If you are looking for SPF and DMARC record validation and parsing,
check out the sister project,
[checkdmarc](https://domainaware.github.io/checkdmarc/).
### Lookalike domains
DMARC protects against domain spoofing, not lookalike domains. For open source
lookalike domain monitoring, check out [DomainAware](https://github.com/seanthegeek/domainaware).
## DMARC Alignment Guide
DMARC ensures that SPF and DKIM authentication mechanisms actually authenticate
against the same domain that the end user sees.
A message passes a DMARC check by passing DKIM or SPF, **as long as the related
indicators are also in alignment**.
```{eval-rst}
+-----------------------+-----------------------+-----------------------+
| | **DKIM** | **SPF** |
+-----------------------+-----------------------+-----------------------+
| **Passing** | The signature in the | The mail server's IP |
| | DKIM header is | address is listed in |
| | validated using a | the SPF record of the |
| | public key that is | domain in the SMTP |
| | published as a DNS | envelope's mail from |
| | record of the domain | header |
| | name specified in the | |
| | signature | |
+-----------------------+-----------------------+-----------------------+
| **Alignment** | The signing domain | The domain in the |
| | aligns with the | SMTP envelope's mail |
| | domain in the | from header aligns |
| | message's from header | with the domain in |
| | | the message's from |
| | | header |
+-----------------------+-----------------------+-----------------------+
```
## What if a sender won't support DKIM/DMARC?
1. Some vendors don't know about DMARC yet; ask about SPF and DKIM/email
authentication.
2. Check if they can send through your email relays instead of theirs.
3. Do they really need to spoof your domain? Why not use the display
name instead?
4. Worst case, have that vendor send email as a specific subdomain of
your domain (e.g. `noreply@news.example.com`), and then create
separate SPF and DMARC records on `news.example.com`, and set
`p=none` in that DMARC record.
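As a quick sanity check for that last option, the records published on the
delegated subdomain can be inspected with `dig`; the names below are hypothetical:

```bash
# Look up the SPF and DMARC TXT records on the delegated subdomain
dig +short TXT news.example.com
dig +short TXT _dmarc.news.example.com
```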
:::{warning}
Do not alter the `p` or `sp` values of the DMARC record on the
Top-Level Domain (TLD); that would leave you vulnerable to
spoofing of your TLD and/or any subdomain.
:::
```{include} mailing-lists.md
```

View File

@@ -0,0 +1,236 @@
# Elasticsearch and Kibana
To set up visual dashboards of DMARC data, install Elasticsearch and Kibana.
:::{note}
Elasticsearch and Kibana 6 or later are required
:::
## Installation
On Debian/Ubuntu based systems, run:
```bash
sudo apt-get install -y apt-transport-https
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
sudo apt-get update
sudo apt-get install -y elasticsearch kibana
```
For CentOS, RHEL, and other RPM systems, follow the Elastic RPM guides for
[Elasticsearch] and [Kibana].
:::{note}
Previously, the default JVM heap size for Elasticsearch was very small (1g),
which could cause it to crash under a heavy load. To fix this, increase the
minimum and maximum JVM heap sizes in `/etc/elasticsearch/jvm.options` to
more reasonable levels, depending on your server's resources.
Make sure the system has at least 2 GB more RAM than the assigned JVM
heap size.
Always set the minimum and maximum JVM heap sizes to the same
value.
For example, to set a 4 GB heap size, set
```bash
-Xms4g
-Xmx4g
```
See <https://www.elastic.co/guide/en/elasticsearch/reference/current/important-settings.html#heap-size-settings>
for more information.
:::
```bash
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch.service
sudo systemctl enable kibana.service
sudo systemctl start elasticsearch.service
sudo systemctl start kibana.service
```
As of Elasticsearch 8.7, activate secure mode (xpack.security.*.ssl)
```bash
sudo vim /etc/elasticsearch/elasticsearch.yml
```
Add the following configuration
```text
# Enable security features
xpack.security.enabled: true
xpack.security.enrollment.enabled: true
# Enable encryption for HTTP API client connections, such as Kibana, Logstash, and Agents
xpack.security.http.ssl:
enabled: true
keystore.path: certs/http.p12
# Enable encryption and mutual authentication between cluster nodes
xpack.security.transport.ssl:
enabled: true
verification_mode: certificate
keystore.path: certs/transport.p12
truststore.path: certs/transport.p12
```
```bash
sudo systemctl restart elasticsearch
```
To create a self-signed certificate, run:
```bash
openssl req -x509 -nodes -days 365 -newkey rsa:4096 -keyout kibana.key -out kibana.crt
```
Or, to create a Certificate Signing Request (CSR) for a CA, run:
```bash
openssl req -newkey rsa:4096 -nodes -keyout kibana.key -out kibana.csr
```
Fill in the prompts. Watch out for Common Name (e.g. server FQDN or YOUR
domain name), which is the IP address or domain name that you will use to access Kibana. It is the most important field.
If you generated a CSR, remove the CSR after you have your certs
```bash
rm -f kibana.csr
```
Move the keys into place and secure them:
```bash
sudo mv kibana.* /etc/kibana
sudo chmod 660 /etc/kibana/kibana.key
```
Activate the HTTPS server in Kibana
```bash
sudo vim /etc/kibana/kibana.yml
```
Add the following configuration
```text
server.host: "SERVER_IP"
server.publicBaseUrl: "https://SERVER_IP"
server.ssl.enabled: true
server.ssl.certificate: /etc/kibana/kibana.crt
server.ssl.key: /etc/kibana/kibana.key
```
:::{note}
For more security, you can configure Kibana to use a local network connection
to Elasticsearch:
```text
elasticsearch.hosts: ['https://SERVER_IP:9200']
```
=>
```text
elasticsearch.hosts: ['https://127.0.0.1:9200']
```
:::
```bash
sudo systemctl restart kibana
```
Enroll Kibana in Elasticsearch
```bash
sudo /usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana
```
Then access your web server at `https://SERVER_IP:5601`, accept the self-signed
certificate, and paste the token into the "Enrollment token" field.
```bash
sudo /usr/share/kibana/bin/kibana-verification-code
```
Then enter the verification code in your web browser.
End Kibana configuration
```bash
sudo /usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive
sudo /usr/share/kibana/bin/kibana-encryption-keys generate
sudo vim /etc/kibana/kibana.yml
```
Add previously generated encryption keys
```text
xpack.encryptedSavedObjects.encryptionKey: xxxx...xxxx
xpack.reporting.encryptionKey: xxxx...xxxx
xpack.security.encryptionKey: xxxx...xxxx
```
```bash
sudo systemctl restart kibana
sudo systemctl restart elasticsearch
```
Now that Elasticsearch is up and running, use `parsedmarc` to send data to
it.
Download (right-click the link and click save as) [export.ndjson].
Connect to Kibana using the "elastic" user and the password you previously provided
on the console (in the "End Kibana configuration" step above).
Import `export.ndjson` in the Saved Objects tab of the Stack Management
page of Kibana. (Hamburger menu -> "Management" -> "Stack Management" ->
"Kibana" -> "Saved Objects")
It will give you the option to overwrite existing saved dashboards or
visualizations, which could be used to restore them if you or someone else
breaks them, as there are no permissions/access controls in Kibana without
the commercial [X-Pack].
```{image} _static/screenshots/saved-objects.png
:align: center
:alt: A screenshot of setting the Saved Objects Stack management UI in Kibana
:target: _static/screenshots/saved-objects.png
```
```{image} _static/screenshots/confirm-overwrite.png
:align: center
:alt: A screenshot of the overwrite confirmation prompt
:target: _static/screenshots/confirm-overwrite.png
```
## Upgrading Kibana index patterns
`parsedmarc` 5.0.0 makes some changes to the way data is indexed in
Elasticsearch. If you are upgrading from a previous release of
`parsedmarc`, you need to complete the following steps to replace the
Kibana index patterns with versions that match the upgraded indexes:
1. Log in to Kibana, and click on Management
2. Under Kibana, click on Saved Objects
3. Check the checkboxes for the `dmarc_aggregate` and `dmarc_forensic`
index patterns
4. Click Delete
5. Click Delete on the confirmation message
6. Download (right-click the link and click save as)
the latest version of [export.ndjson]
7. Import `export.ndjson` by clicking Import from the Kibana
Saved Objects page
## Records retention
Starting in version 5.0.0, `parsedmarc` stores data in a separate
index for each day to make it easy to comply with records
retention regulations such as GDPR. For more information,
check out the Elastic guide to [managing time-based indexes efficiently](https://www.elastic.co/blog/managing-time-based-indices-efficiently).
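As a rough sketch of how to review those daily indexes, the `_cat/indices` API can
be queried; the index name pattern and credentials below are assumptions based on a
default parsedmarc and Elasticsearch setup, so adjust them for your environment:

```bash
# List the daily DMARC indexes by name (prompts for the elastic user's password)
curl -k -u elastic "https://127.0.0.1:9200/_cat/indices/dmarc_*?v&s=index"
```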
[elasticsearch]: https://www.elastic.co/guide/en/elasticsearch/reference/current/rpm.html
[export.ndjson]: https://raw.githubusercontent.com/domainaware/parsedmarc/master/kibana/export.ndjson
[kibana]: https://www.elastic.co/guide/en/kibana/current/rpm.html
[x-pack]: https://www.elastic.co/products/x-pack

31
docs/source/example.ini Normal file
View File

@@ -0,0 +1,31 @@
# This is an example comment
[general]
save_aggregate = True
save_forensic = True
[imap]
host = imap.example.com
user = dmarcresports@example.com
password = $uperSecure
watch = True
[elasticsearch]
hosts = 127.0.0.1:9200
ssl = False
[splunk_hec]
url = https://splunkhec.example.com
token = HECTokenGoesHere
index = email
[s3]
bucket = my-bucket
path = parsedmarc
[gmail_api]
credentials_file = /etc/example/credentials.json
token_file = /etc/example/token.json
include_spam_trash = True
paginate_messages = True
scopes = https://www.googleapis.com/auth/gmail.modify

84
docs/source/index.md Normal file
View File

@@ -0,0 +1,84 @@
# parsedmarc documentation - Open source DMARC report analyzer and visualizer
[![Build
Status](https://github.com/domainaware/parsedmarc/actions/workflows/python-tests.yml/badge.svg)](https://github.com/domainaware/parsedmarc/actions/workflows/python-tests.yml)
[![Code
Coverage](https://codecov.io/gh/domainaware/parsedmarc/branch/master/graph/badge.svg)](https://codecov.io/gh/domainaware/parsedmarc)
[![PyPI
Package](https://img.shields.io/pypi/v/parsedmarc.svg)](https://pypi.org/project/parsedmarc/)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/parsedmarc?color=blue)](https://pypistats.org/packages/parsedmarc)
:::{note}
**Help Wanted**
This project is maintained by one developer.
Please consider reviewing the open [issues] to see how you can contribute code, documentation, or user support.
Assistance on the pinned issues would be particularly helpful.
Thanks to all [contributors]!
:::
```{image} _static/screenshots/dmarc-summary-charts.png
:align: center
:alt: A screenshot of DMARC summary charts in Kibana
:scale: 50 %
:target: _static/screenshots/dmarc-summary-charts.png
```
`parsedmarc` is a Python module and CLI utility for parsing DMARC reports.
When used with Elasticsearch and Kibana (or Splunk), or with OpenSearch and Grafana, it works as a self-hosted
open source alternative to commercial DMARC report processing services such
as Agari Brand Protection, Dmarcian, OnDMARC, ProofPoint Email Fraud Defense,
and Valimail.
## Features
- Parses draft and 1.0 standard aggregate/rua DMARC reports
- Parses forensic/failure/ruf DMARC reports
- Parses reports from SMTP TLS Reporting
- Can parse reports from an inbox over IMAP, Microsoft Graph, or Gmail API
- Transparently handles gzip or zip compressed reports
- Consistent data structures
- Simple JSON and/or CSV output
- Optionally email the results
- Optionally send the results to Elasticsearch, Opensearch, and/or Splunk, for use
with premade dashboards
- Optionally send reports to Apache Kafka
## Python Compatibility
This project supports the following Python versions, which are either actively maintained or are the default versions
for RHEL or Debian.
| Version | Supported | Reason |
|---------|-----------|------------------------------------------------------------|
| < 3.6 | ❌ | End of Life (EOL) |
| 3.6 | ❌ | Used in RHEL 8, but not supported by project dependencies |
| 3.7 | ❌ | End of Life (EOL) |
| 3.8 | ❌ | End of Life (EOL) |
| 3.9 | ✅ | Supported until August 2026 (Debian 11); May 2032 (RHEL 9) |
| 3.10 | ✅ | Actively maintained |
| 3.11 | ✅ | Actively maintained; supported until June 2028 (Debian 12) |
| 3.12 | ✅ | Actively maintained; supported until May 2035 (RHEL 10) |
| 3.13 | ✅ | Actively maintained; supported until June 2030 (Debian 13) |
| 3.14 | ❌ | Not currently supported due to [this imapclient bug](https://github.com/mjs/imapclient/issues/618)|
```{toctree}
:caption: 'Contents'
:maxdepth: 2
installation
usage
output
elasticsearch
opensearch
kibana
splunk
davmail
dmarc
contributing
api
```
[contributors]: https://github.com/domainaware/parsedmarc/graphs/contributors
[issues]: https://github.com/domainaware/parsedmarc/issues

205
docs/source/installation.md Normal file
View File

@@ -0,0 +1,205 @@
# Installation
## Prerequisites
`parsedmarc` works with Python 3 only.
### Testing multiple report analyzers
If you would like to test parsedmarc and another report processing
solution at the same time, you can have up to two `mailto` URIs in each of the rua and ruf
tags in your DMARC record, separated by commas.
### Using a web proxy
If your system is behind a web proxy, you need to configure your system
to use that proxy. To do this, edit `/etc/environment` and add your
proxy details there, for example:
```bash
http_proxy=http://user:password@prox-server:3128
https_proxy=https://user:password@prox-server:3128
ftp_proxy=http://user:password@prox-server:3128
```
Or if no credentials are needed:
```bash
http_proxy=http://prox-server:3128
https_proxy=https://prox-server:3128
ftp_proxy=http://prox-server:3128
```
This will set the proxy up for use system-wide, including for `parsedmarc`.
### Using Microsoft Exchange
If your mail server is Microsoft Exchange, ensure that it is patched to at
least:
- Exchange Server 2010 Update Rollup 22 ([KB4295699])
- Exchange Server 2013 Cumulative Update 21 ([KB4099855])
- Exchange Server 2016 Cumulative Update 11 ([KB4134118])
### geoipupdate setup
:::{note}
Starting in `parsedmarc` 7.1.0, a static copy of the
[IP to Country Lite database] from DB-IP is distributed with
`parsedmarc`, under the terms of the
[Creative Commons Attribution 4.0 International License],
as a fallback if the [MaxMind GeoLite2 Country database] is not
installed. However, `parsedmarc` cannot install updated versions of
these databases as they are released, so MaxMind's databases and the
[geoipupdate] tool are still the preferable solution.
The location of the database file can be overridden by using the
`ip_db_path` setting.
:::
On Debian 10 (Buster) or later, run:
```bash
sudo apt-get install -y geoipupdate
```
:::{note}
[Component "contrib"] is required in your apt sources.
:::
On Ubuntu systems run:
```bash
sudo add-apt-repository ppa:maxmind/ppa
sudo apt update
sudo apt install -y geoipupdate
```
On CentOS or RHEL systems, run:
```bash
sudo dnf install -y geoipupdate
```
The latest builds for Linux, macOS, and Windows can be downloaded
from the [geoipupdate releases page on GitHub].
On December 30th, 2019, MaxMind started requiring free accounts to
access the free Geolite2 databases, in order
[to comply with various privacy regulations].
Start by [registering for a free GeoLite2 account], and signing in.
Then, navigate to the [License Keys] page under your account,
and create a new license key for the version of
`geoipupdate` that was installed.
:::{warning}
The configuration file format is different for older (i.e. \<=3.1.1) and newer (i.e. >=3.1.1) versions
of `geoipupdate`. Be sure to select the correct version for your system.
:::
:::{note}
To check the version of `geoipupdate` that is installed, run:
```bash
geoipupdate -V
```
:::
You can use `parsedmarc` as the description for the key.
Once you have generated a key, download the pre-filled
configuration file. This file should be saved at `/etc/GeoIP.conf`
on Linux or macOS systems, or at
`%SystemDrive%\ProgramData\MaxMind\GeoIPUpdate\GeoIP.conf` on
Windows systems.
Then run
```bash
sudo geoipupdate
```
to download the databases for the first time.
The GeoLite2 Country, City, and ASN databases are updated weekly,
every Tuesday. `geoipupdate` can be run weekly by adding a cron
job or scheduled task.
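For example, a minimal cron entry might look like the following; the schedule and
the `geoipupdate` binary path are assumptions, so adjust them for your system:

```bash
# Run geoipupdate every Wednesday at 04:30 via a cron.d drop-in
echo "30 4 * * 3 root /usr/bin/geoipupdate" | sudo tee /etc/cron.d/geoipupdate
```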
More information about `geoipupdate` can be found at the
[MaxMind geoipupdate page].
## Installing parsedmarc
On Debian or Ubuntu systems, run:
```bash
sudo apt-get install -y python3-pip python3-virtualenv python3-dev libxml2-dev libxslt-dev
```
On CentOS or RHEL systems, run:
```bash
sudo dnf install -y python39 python3-virtualenv python3-setuptools python3-devel libxml2-devel libxslt-devel
```
Python 3 installers for Windows and macOS can be found at
<https://www.python.org/downloads/>.
Create a system user
```bash
sudo mkdir /opt
sudo useradd parsedmarc -r -s /bin/false -m -b /opt
```
Install parsedmarc in a virtualenv
```bash
sudo -u parsedmarc virtualenv /opt/parsedmarc/venv
```
CentOS/RHEL 8 systems use Python 3.6 by default, so on those systems
explicitly tell `virtualenv` to use `python3.9` instead
```bash
sudo -u parsedmarc virtualenv -p python3.9 /opt/parsedmarc/venv
```
Activate the virtualenv
```bash
source /opt/parsedmarc/venv/bin/activate
```
To install or upgrade `parsedmarc` inside the virtualenv, run:
```bash
sudo -u parsedmarc /opt/parsedmarc/venv/bin/pip install -U parsedmarc
```
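As a quick sanity check that the install succeeded, print the version from inside
the virtualenv:

```bash
/opt/parsedmarc/venv/bin/parsedmarc --version
```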
## Optional dependencies
If you would like to be able to parse emails saved from Microsoft
Outlook (i.e. OLE .msg files), install `msgconvert`:
On Debian or Ubuntu systems, run:
```bash
sudo apt-get install libemail-outlook-message-perl
```
[KB4295699]: https://support.microsoft.com/KB/4295699
[KB4099855]: https://support.microsoft.com/KB/4099855
[KB4134118]: https://support.microsoft.com/kb/4134118
[Component "contrib"]: https://wiki.debian.org/SourcesList#Component
[geoipupdate]: https://github.com/maxmind/geoipupdate
[geoipupdate releases page on github]: https://github.com/maxmind/geoipupdate/releases
[ip to country lite database]: https://db-ip.com/db/download/ip-to-country-lite
[license keys]: https://www.maxmind.com/en/accounts/current/license-key
[maxmind geoipupdate page]: https://dev.maxmind.com/geoip/updating-databases/
[maxmind geolite2 country database]: https://dev.maxmind.com/geoip/geolite2-free-geolocation-data
[registering for a free geolite2 account]: https://www.maxmind.com/en/geolite2/signup
[to comply with various privacy regulations]: https://blog.maxmind.com/2019/12/18/significant-changes-to-accessing-and-using-geolite2-databases/

87
docs/source/kibana.md Normal file
View File

@@ -0,0 +1,87 @@
# Using the Kibana dashboards
The Kibana DMARC dashboards are a human-friendly way to understand the
results from incoming DMARC reports.
:::{note}
The default dashboard is DMARC Summary. To switch between dashboards,
click on the Dashboard link on the left side menu of Kibana.
:::
## DMARC Summary
As the name suggests, this dashboard is the best place to start
reviewing your aggregate DMARC data.
Across the top of the dashboard, three pie charts display the percentage of
alignment pass/fail for SPF, DKIM, and DMARC. Clicking on any chart segment
will filter for that value.
:::{note}
Messages should not be considered malicious just because they failed to pass
DMARC; especially if you have just started collecting data. It may be a
legitimate service that needs SPF and DKIM configured correctly.
:::
Start by filtering the results to only show failed DKIM alignment. While DMARC
passes if a message passes SPF or DKIM alignment, only DKIM alignment remains
valid when a message is forwarded without changing the from address, which is
often caused by a mailbox forwarding rule. This is because DKIM signatures are
part of the message headers, whereas SPF relies on the SMTP envelope.
Underneath the pie charts, you can see graphs of DMARC passage and message
disposition over time.
Under the graphs you will find the most useful data tables on the dashboard. On
the left, there is a list of organizations that are sending you DMARC reports.
In the center, there is a list of sending servers grouped by the base domain
in their reverse DNS. On the right, there is a list of email from domains,
sorted by message volume.
By hovering your mouse over a data table value and using the magnifying glass
icons, you can filter on or filter out different values. Start by looking at
the Message Sources by Reverse DNS table. Find a sender that you recognize,
such as an email marketing service, hover over it, and click on the plus (+)
magnifying glass icon, to add a filter that only shows results for that sender.
Now, look at the Message From Header table to the right. That shows you the
domains that a sender is sending as, which might tell you which brand/business
is using a particular service. With that information, you can contact them and
have them set up DKIM.
:::{note}
If you have a lot of B2C customers, you may see a high volume of email sent
as your domains coming from consumer email services, such as Google/Gmail and
Yahoo! This occurs when customers have mailbox rules in place that forward
emails from an old account to a new account, which is why DKIM
authentication is so important, as mentioned earlier. Similar patterns may
be observed with businesses that send from reverse DNS addresses of
parent, subsidiary, and outdated brands.
:::
Further down the dashboard, you can filter by source country or source IP
address.
Tables showing SPF and DKIM alignment details are located under the IP address
table.
:::{note}
Previously, the alignment tables were included in a separate dashboard
called DMARC Alignment Failures. That dashboard has been consolidated into
the DMARC Summary dashboard. To view failures only, use the pie chart.
:::
Any other filters work the same way. You can also add your own custom temporary
filters by clicking on Add Filter at the upper right of the page.
## DMARC Forensic Samples
The DMARC Forensic Samples dashboard contains information on DMARC forensic
reports (also known as failure reports or ruf reports). These reports contain
samples of emails that have failed to pass DMARC.
:::{note}
Most recipients do not send forensic/failure/ruf reports at all to avoid
privacy leaks. Some recipients (notably Chinese webmail services) will only
supply the headers of sample emails. Very few provide the entire email.
:::

View File

@@ -0,0 +1,206 @@
## What about mailing lists?
When you deploy DMARC on your domain, you might find that messages
relayed by mailing lists are failing DMARC, most likely because the mailing
list is spoofing your from address, and modifying the subject,
footer, or other part of the message, thereby breaking the
DKIM signature.
### Mailing list best practices
Ideally, a mailing list should forward messages without altering the
headers or body content at all. [Joe Nelson] does a fantastic job of
explaining exactly what mailing lists should and shouldn't do to be
fully DMARC compliant. Rather than repeat his fine work, here's a
summary:
#### Do
- Retain headers from the original message
- Add [RFC 2369] List-Unsubscribe headers to outgoing messages, instead of
adding unsubscribe links to the body
> List-Unsubscribe: <https://list.example.com/unsubscribe-link>
- Add [RFC 2919] List-Id headers instead of modifying the subject
> List-Id: Example Mailing List <list.example.com>
Modern mail clients and webmail services generate unsubscribe buttons based on
these headers.
#### Do not
- Remove or modify any existing headers from the original message, including
From, Date, Subject, etc.
- Add to or remove content from the message body, **including traditional
disclaimers and unsubscribe footers**
In addition to complying with DMARC, this configuration ensures that Reply
and Reply All actions work like they would with any email message. Reply
replies to the message sender, and Reply All replies to the sender and the
list.
Even without a subject prefix or body footer, mailing list users can still
tell that a message came from the mailing list, because the message was sent
to the mailing list post address, and not their email address.
Configuration steps for common mailing list platforms are listed below.
#### Mailman 2
Navigate to General Settings, and configure the settings below
```{eval-rst}
============================ ==========
**Setting** **Value**
**subject_prefix**
**from_is_list** No
**first_strip_reply_to** No
**reply_goes_to_list** Poster
**include_rfc2369_headers** Yes
**include_list_post_header** Yes
**include_sender_header** No
============================ ==========
```
Navigate to Non-digest options, and configure the settings below
```{eval-rst}
=================== ==========
**Setting** **Value**
**msg_header**
**msg_footer**
**scrub_nondigest** No
=================== ==========
```
Navigate to Privacy Options> Sending Filters, and configure the settings below
```{eval-rst}
====================================== ==========
**Setting** **Value**
**dmarc_moderation_action** Accept
**dmarc_quarantine_moderation_action** Yes
**dmarc_none_moderation_action** Yes
====================================== ==========
```
#### Mailman 3
Navigate to Settings> List Identity
Make Subject prefix blank.
Navigate to Settings> Alter Messages
Configure the settings below
```{eval-rst}
====================================== ==========
**Setting** **Value**
**Convert html to plaintext** No
**Include RFC2369 headers** Yes
**Include the list post header** Yes
**Explicit reply-to address**
**First strip replyto** No
**Reply goes to list** No munging
====================================== ==========
```
Navigate to Settings> DMARC Mitigation
Configure the settings below
```{eval-rst}
================================== ===============================
**Setting** **Value**
**DMARC mitigation action** No DMARC mitigations
**DMARC mitigate unconditionally** No
================================== ===============================
```
Create a blank footer template for your mailing list to remove the message
footer. Unfortunately, the Postorius mailing list admin UI will not allow you
to create an empty template, so you'll have to create one using the system's
command line instead, for example:
```bash
touch var/templates/lists/list.example.com/en/list:member:regular:footer
```
Where `list.example.com` is the list ID, and `en` is the language.
Then restart mailman core.
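How Mailman Core is restarted depends on how it was installed; on a systemd-based
host using distribution packages it might look like this (the unit name is an
assumption, so check your own setup):

```bash
sudo systemctl restart mailman3
```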
#### LISTSERV
[LISTSERV 16.0-2017a] and higher will rewrite the From header for domains
that enforce a DMARC quarantine or reject policy.
Some additional steps are needed for Linux hosts.
#### Workarounds
If a mailing list must go **against** best practices and
modify the message (e.g. to add a required legal footer), the mailing
list administrator must configure the list to replace the From address of the
message (also known as munging) with the address of the mailing list, so they
no longer spoof email addresses with domains protected by DMARC.
Configuration steps for common mailing list platforms are listed below.
##### Mailman 2
Navigate to Privacy Options> Sending Filters, and configure the settings below
```{eval-rst}
====================================== ==========
**Setting** **Value**
**dmarc_moderation_action** Munge From
**dmarc_quarantine_moderation_action** Yes
**dmarc_none_moderation_action** Yes
====================================== ==========
```
:::{note}
Message wrapping could be used as the DMARC mitigation action instead. In
that case, the original message is added as an attachment to the mailing
list message, but that could interfere with inbox searching, or mobile
clients.
On the other hand, replacing the From address might cause users to
accidentally reply to the entire list, when they only intended to reply to
the original sender.
Choose the option that best fits your community.
:::
##### Mailman 3
In the DMARC Mitigations tab of the Settings page, configure the settings below
```{eval-rst}
================================== ===============================
**Setting** **Value**
**DMARC mitigation action** Replace From: with list address
**DMARC mitigate unconditionally** No
================================== ===============================
```
:::{note}
Message wrapping could be used as the DMARC mitigation action instead. In
that case, the original message is added as an attachment to the mailing
list message, but that could interfere with inbox searching, or mobile
clients.
On the other hand, replacing the From address might cause users to
accidentally reply to the entire list, when they only intended to reply to
the original sender.
:::
[joe nelson]: https://begriffs.com/posts/2018-09-18-dmarc-mailing-list.html
[listserv 16.0-2017a]: https://www.lsoft.com/news/dmarc-issue1-2018.asp
[rfc 2369]: https://tools.ietf.org/html/rfc2369
[rfc 2919]: https://tools.ietf.org/html/rfc2919

14
docs/source/opensearch.md Normal file
View File

@@ -0,0 +1,14 @@
# OpenSearch and Grafana
To set up visual dashboards of DMARC data, install OpenSearch and Grafana.
## Installation
OpenSearch: https://opensearch.org/docs/latest/install-and-configure/install-opensearch/index/
Grafana: https://grafana.com/docs/grafana/latest/setup-grafana/installation/
## Records retention
Starting in version 5.0.0, `parsedmarc` stores data in a separate
index for each day to make it easy to comply with records
retention regulations such as GDPR.

241
docs/source/output.md Normal file
View File

@@ -0,0 +1,241 @@
# Sample outputs
## Sample aggregate report output
Here are the results from parsing the [example](https://dmarc.org/wiki/FAQ#I_need_to_implement_aggregate_reports.2C_what_do_they_look_like.3F)
report from the dmarc.org wiki. It's actually an older draft of
the 1.0 report schema standardized in
[RFC 7489 Appendix C](https://tools.ietf.org/html/rfc7489#appendix-C).
This draft schema is still in wide use.
`parsedmarc` produces consistent, normalized output, regardless
of the report schema.
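Output like the samples below can be produced from the CLI with the `-o` option,
which writes the JSON and CSV files to the given directory; the input path here is
hypothetical:

```bash
# Parse a single aggregate report and write JSON/CSV files to ./output
parsedmarc -o ./output /path/to/aggregate_report.xml
```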
### JSON aggregate report
```json
{
"xml_schema": "draft",
"report_metadata": {
"org_name": "acme.com",
"org_email": "noreply-dmarc-support@acme.com",
"org_extra_contact_info": "http://acme.com/dmarc/support",
"report_id": "9391651994964116463",
"begin_date": "2012-04-27 20:00:00",
"end_date": "2012-04-28 19:59:59",
"timespan_requires_normalization": false,
"original_timespan_seconds": 86399,
"errors": []
},
"policy_published": {
"domain": "example.com",
"adkim": "r",
"aspf": "r",
"p": "none",
"sp": "none",
"pct": "100",
"fo": "0"
},
"records": [
{
"source": {
"ip_address": "72.150.241.94",
"country": "US",
"reverse_dns": null,
"base_domain": null,
"name": null,
"type": null
},
"count": 2,
"alignment": {
"spf": true,
"dkim": false,
"dmarc": true
},
"policy_evaluated": {
"disposition": "none",
"dkim": "fail",
"spf": "pass",
"policy_override_reasons": []
},
"identifiers": {
"header_from": "example.com",
"envelope_from": "example.com",
"envelope_to": null
},
"auth_results": {
"dkim": [
{
"domain": "example.com",
"selector": "none",
"result": "fail"
}
],
"spf": [
{
"domain": "example.com",
"scope": "mfrom",
"result": "pass"
}
]
},
"normalized_timespan": false,
"interval_begin": "2012-04-28 00:00:00",
"interval_end": "2012-04-28 23:59:59"
}
]
}
```
### CSV aggregate report
```text
xml_schema,org_name,org_email,org_extra_contact_info,report_id,begin_date,end_date,normalized_timespan,errors,domain,adkim,aspf,p,sp,pct,fo,source_ip_address,source_country,source_reverse_dns,source_base_domain,source_name,source_type,count,spf_aligned,dkim_aligned,dmarc_aligned,disposition,policy_override_reasons,policy_override_comments,envelope_from,header_from,envelope_to,dkim_domains,dkim_selectors,dkim_results,spf_domains,spf_scopes,spf_results
draft,acme.com,noreply-dmarc-support@acme.com,http://acme.com/dmarc/support,9391651994964116463,2012-04-28 00:00:00,2012-04-28 23:59:59,False,,example.com,r,r,none,none,100,0,72.150.241.94,US,,,,,2,True,False,True,none,,,example.com,example.com,,example.com,none,fail,example.com,mfrom,pass
draft,acme.com,noreply-dmarc-support@acme.com,http://acme.com/dmarc/support,9391651994964116463,2012-04-28 00:00:00,2012-04-28 23:59:59,False,,example.com,r,r,none,none,100,0,72.150.241.94,US,,,,,2,True,False,True,none,,,example.com,example.com,,example.com,none,fail,example.com,mfrom,pass
```
## Sample forensic report output
Thanks to GitHub user [xennn](https://github.com/xennn) for the anonymized
[forensic report email sample](<https://github.com/domainaware/parsedmarc/raw/master/samples/forensic/DMARC%20Failure%20Report%20for%20domain.de%20(mail-from%3Dsharepoint%40domain.de%2C%20ip%3D10.10.10.10).eml>).
### JSON forensic report
```json
{
"feedback_type": "auth-failure",
"user_agent": "Lua/1.0",
"version": "1.0",
"original_mail_from": "sharepoint@domain.de",
"original_rcpt_to": "peter.pan@domain.de",
"arrival_date": "Mon, 01 Oct 2018 11:20:27 +0200",
"message_id": "<38.E7.30937.BD6E1BB5@ mailrelay.de>",
"authentication_results": "dmarc=fail (p=none, dis=none) header.from=domain.de",
"delivery_result": "policy",
"auth_failure": [
"dmarc"
],
"reported_domain": "domain.de",
"arrival_date_utc": "2018-10-01 09:20:27",
"source": {
"ip_address": "10.10.10.10",
"country": null,
"reverse_dns": null,
"base_domain": null
},
"authentication_mechanisms": [],
"original_envelope_id": null,
"dkim_domain": null,
"sample_headers_only": false,
"sample": "Received: from Servernameone.domain.local (Servernameone.domain.local [10.10.10.10])\n\tby mailrelay.de (mail.DOMAIN.de) with SMTP id 38.E7.30937.BD6E1BB5; Mon, 1 Oct 2018 11:20:27 +0200 (CEST)\nDate: 01 Oct 2018 11:20:27 +0200\nMessage-ID: <38.E7.30937.BD6E1BB5@ mailrelay.de>\nTo: <peter.pan@domain.de>\nfrom: \"=?utf-8?B?SW50ZXJha3RpdmUgV2V0dGJld2VyYmVyLcOcYmVyc2ljaHQ=?=\" <sharepoint@domain.de>\nSubject: Subject\nMIME-Version: 1.0\nX-Mailer: Microsoft SharePoint Foundation 2010\nContent-Type: text/html; charset=utf-8\nContent-Transfer-Encoding: quoted-printable\n\n<html><head><base href=3D'\nwettbewerb' /></head><body><!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2//EN\"=\n><HTML><HEAD><META NAME=3D\"Generator\" CONTENT=3D\"MS Exchange Server version=\n 08.01.0240.003\"></html>\n",
"parsed_sample": {
"from": {
"display_name": "Interaktive Wettbewerber-Übersicht",
"address": "sharepoint@domain.de",
"local": "sharepoint",
"domain": "domain.de"
},
"to_domains": [
"domain.de"
],
"to": [
{
"display_name": null,
"address": "peter.pan@domain.de",
"local": "peter.pan",
"domain": "domain.de"
}
],
"subject": "Subject",
"timezone": "+2",
"mime-version": "1.0",
"date": "2018-10-01 09:20:27",
"content-type": "text/html; charset=utf-8",
"x-mailer": "Microsoft SharePoint Foundation 2010",
"body": "<html><head><base href='\nwettbewerb' /></head><body><!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 3.2//EN\"><HTML><HEAD><META NAME=\"Generator\" CONTENT=\"MS Exchange Server version 08.01.0240.003\"></html>",
"received": [
{
"from": "Servernameone.domain.local Servernameone.domain.local 10.10.10.10",
"by": "mailrelay.de mail.DOMAIN.de",
"with": "SMTP id 38.E7.30937.BD6E1BB5",
"date": "Mon, 1 Oct 2018 11:20:27 +0200 CEST",
"hop": 1,
"date_utc": "2018-10-01 09:20:27",
"delay": 0
}
],
"content-transfer-encoding": "quoted-printable",
"message-id": "<38.E7.30937.BD6E1BB5@ mailrelay.de>",
"has_defects": false,
"headers": {
"Received": "from Servernameone.domain.local (Servernameone.domain.local [10.10.10.10])\n\tby mailrelay.de (mail.DOMAIN.de) with SMTP id 38.E7.30937.BD6E1BB5; Mon, 1 Oct 2018 11:20:27 +0200 (CEST)",
"Date": "01 Oct 2018 11:20:27 +0200",
"Message-ID": "<38.E7.30937.BD6E1BB5@ mailrelay.de>",
"To": "<peter.pan@domain.de>",
"from": "\"Interaktive Wettbewerber-Übersicht\" <sharepoint@domain.de>",
"Subject": "Subject",
"MIME-Version": "1.0",
"X-Mailer": "Microsoft SharePoint Foundation 2010",
"Content-Type": "text/html; charset=utf-8",
"Content-Transfer-Encoding": "quoted-printable"
},
"reply_to": [],
"cc": [],
"bcc": [],
"attachments": [],
"filename_safe_subject": "Subject"
}
}
```
### CSV forensic report
```text
feedback_type,user_agent,version,original_envelope_id,original_mail_from,original_rcpt_to,arrival_date,arrival_date_utc,subject,message_id,authentication_results,dkim_domain,source_ip_address,source_country,source_reverse_dns,source_base_domain,delivery_result,auth_failure,reported_domain,authentication_mechanisms,sample_headers_only
auth-failure,Lua/1.0,1.0,,sharepoint@domain.de,peter.pan@domain.de,"Mon, 01 Oct 2018 11:20:27 +0200",2018-10-01 09:20:27,Subject,<38.E7.30937.BD6E1BB5@ mailrelay.de>,"dmarc=fail (p=none, dis=none) header.from=domain.de",,10.10.10.10,,,,policy,dmarc,domain.de,,False
```
### JSON SMTP TLS report
```json
[
{
"organization_name": "Example Inc.",
"begin_date": "2024-01-09T00:00:00Z",
"end_date": "2024-01-09T23:59:59Z",
"report_id": "2024-01-09T00:00:00Z_example.com",
"policies": [
{
"policy_domain": "example.com",
"policy_type": "sts",
"policy_strings": [
"version: STSv1",
"mode: testing",
"mx: example.com",
"max_age: 86400"
],
"successful_session_count": 0,
"failed_session_count": 3,
"failure_details": [
{
"result_type": "validation-failure",
"failed_session_count": 2,
"sending_mta_ip": "209.85.222.201",
"receiving_ip": "173.212.201.41",
"receiving_mx_hostname": "example.com"
},
{
"result_type": "validation-failure",
"failed_session_count": 1,
"sending_mta_ip": "209.85.208.176",
"receiving_ip": "173.212.201.41",
"receiving_mx_hostname": "example.com"
}
]
}
]
}
]
```

22
docs/source/splunk.md Normal file
View File

@@ -0,0 +1,22 @@
# Splunk
Starting in version 4.3.0 `parsedmarc` supports sending aggregate and/or
forensic DMARC data to a Splunk [HTTP Event collector (HEC)].
The project repository contains [XML files] for premade Splunk
dashboards for aggregate and forensic DMARC reports.
Copy and paste the contents of each file into a separate Splunk
dashboard XML editor.
:::{warning}
Change all occurrences of `index="email"` in the XML to
match your own index name.
:::
The Splunk dashboards display the same content and layout as the
Kibana dashboards, although the Kibana dashboards have slightly
easier and more flexible filtering options.
[xml files]: https://github.com/domainaware/parsedmarc/tree/master/splunk
[http event collector (hec)]: http://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHEC

553
docs/source/usage.md Normal file
View File

@@ -0,0 +1,553 @@
# Using parsedmarc
## CLI help
```text
usage: parsedmarc [-h] [-c CONFIG_FILE] [--strip-attachment-payloads] [-o OUTPUT]
[--aggregate-json-filename AGGREGATE_JSON_FILENAME] [--forensic-json-filename FORENSIC_JSON_FILENAME]
[--smtp-tls-json-filename SMTP_TLS_JSON_FILENAME] [--aggregate-csv-filename AGGREGATE_CSV_FILENAME]
[--forensic-csv-filename FORENSIC_CSV_FILENAME] [--smtp-tls-csv-filename SMTP_TLS_CSV_FILENAME]
[-n NAMESERVERS [NAMESERVERS ...]] [-t DNS_TIMEOUT] [--offline] [-s] [-w] [--verbose] [--debug]
[--log-file LOG_FILE] [--no-prettify-json] [-v]
[file_path ...]
Parses DMARC reports
positional arguments:
file_path one or more paths to aggregate or forensic report files, emails, or mbox files'
options:
-h, --help show this help message and exit
-c CONFIG_FILE, --config-file CONFIG_FILE
a path to a configuration file (--silent implied)
--strip-attachment-payloads
remove attachment payloads from forensic report output
-o OUTPUT, --output OUTPUT
write output files to the given directory
--aggregate-json-filename AGGREGATE_JSON_FILENAME
filename for the aggregate JSON output file
--forensic-json-filename FORENSIC_JSON_FILENAME
filename for the forensic JSON output file
--smtp-tls-json-filename SMTP_TLS_JSON_FILENAME
filename for the SMTP TLS JSON output file
--aggregate-csv-filename AGGREGATE_CSV_FILENAME
filename for the aggregate CSV output file
--forensic-csv-filename FORENSIC_CSV_FILENAME
filename for the forensic CSV output file
--smtp-tls-csv-filename SMTP_TLS_CSV_FILENAME
filename for the SMTP TLS CSV output file
-n NAMESERVERS [NAMESERVERS ...], --nameservers NAMESERVERS [NAMESERVERS ...]
nameservers to query
-t DNS_TIMEOUT, --dns_timeout DNS_TIMEOUT
number of seconds to wait for an answer from DNS (default: 2.0)
--offline do not make online queries for geolocation or DNS
-s, --silent only print errors
-w, --warnings print warnings in addition to errors
--verbose more verbose output
--debug print debugging information
--log-file LOG_FILE output logging to a file
--no-prettify-json output JSON in a single line without indentation
-v, --version show program's version number and exit
```
:::{note}
Starting in `parsedmarc` 6.0.0, most CLI options were moved to a
configuration file, described below.
:::
## Configuration file
`parsedmarc` can be configured by supplying the path to an INI file
```bash
parsedmarc -c /etc/parsedmarc.ini
```
For example
```ini
# This is an example comment
[general]
save_aggregate = True
save_forensic = True
[imap]
host = imap.example.com
user = dmarcresports@example.com
password = $uperSecure
[mailbox]
watch = True
delete = False
[elasticsearch]
hosts = 127.0.0.1:9200
ssl = False
[opensearch]
hosts = https://admin:admin@127.0.0.1:9200
ssl = True
[splunk_hec]
url = https://splunkhec.example.com
token = HECTokenGoesHere
index = email
[s3]
bucket = my-bucket
path = parsedmarc
[syslog]
server = localhost
port = 514
[gelf]
host = logger
port = 12201
mode = tcp
[webhook]
aggregate_url = https://aggregate_url.example.com
forensic_url = https://forensic_url.example.com
smtp_tls_url = https://smtp_tls_url.example.com
timeout = 60
```
The full set of configuration options is:
- `general`
- `save_aggregate` - bool: Save aggregate report data to
Elasticsearch, Splunk and/or S3
- `save_forensic` - bool: Save forensic report data to
Elasticsearch, Splunk and/or S3
- `save_smtp_tls` - bool: Save SMTP TLS report data to
Elasticsearch, Splunk and/or S3
- `index_prefix_domain_map` - str: A path to a mapping of Opensearch/Elasticsearch index prefixes to domain names
- `strip_attachment_payloads` - bool: Remove attachment
payloads from results
- `silent` - bool: Set this to `False` to output results to STDOUT
- `output` - str: Directory to place JSON and CSV files in. This is required if you set either of the JSON output file options.
- `aggregate_json_filename` - str: filename for the aggregate
JSON output file
- `forensic_json_filename` - str: filename for the forensic
JSON output file
- `ip_db_path` - str: An optional custom path to a MMDB file
from MaxMind or DBIP
- `offline` - bool: Do not use online queries for geolocation
or DNS
- `always_use_local_files` - Disables the download of the reverse DNS map
- `local_reverse_dns_map_path` - Overrides the default local file path to use for the reverse DNS map
- `reverse_dns_map_url` - Overrides the default download URL for the reverse DNS map
- `nameservers` - str: A comma separated list of
DNS resolvers (Default: `[Cloudflare's public resolvers]`)
- `dns_test_address` - str: a dummy address used for DNS pre-flight checks
(Default: 1.1.1.1)
- `dns_timeout` - float: DNS timeout period
- `debug` - bool: Print debugging messages
- `silent` - bool: Only print errors (Default: `True`)
- `log_file` - str: Write log messages to a file at this path
- `n_procs` - int: Number of processes to run in parallel when
parsing in CLI mode (Default: `1`)
:::{note}
Setting this to a number larger than one can improve
performance when processing thousands of files
:::
- `mailbox`
- `reports_folder` - str: The mailbox folder (or label for
Gmail) where the incoming reports can be found
(Default: `INBOX`)
- `archive_folder` - str: The mailbox folder (or label for
Gmail) to sort processed emails into (Default: `Archive`)
- `watch` - bool: Use the IMAP `IDLE` command to process
messages as they arrive or poll MS Graph for new messages
- `delete` - bool: Delete messages after processing them,
instead of archiving them
- `test` - bool: Do not move or delete messages
- `batch_size` - int: Number of messages to read and process
before saving. Default `10`. Use `0` for no limit.
- `check_timeout` - int: Number of seconds to wait for a IMAP
IDLE response or the number of seconds until the next
mail check (Default: `30`)
- `since` - str: Search for messages since a certain time. (Examples: `5m|3h|2d|1w`)
Acceptable units - {"m":"minutes", "h":"hours", "d":"days", "w":"weeks"}.
Defaults to `1d` if an incorrect value is provided.
- `imap`
- `host` - str: The IMAP server hostname or IP address
- `port` - int: The IMAP server port (Default: `993`)
:::{note}
`%` characters must be escaped with another `%` character,
so use `%%` wherever a `%` character is used.
:::
:::{note}
Starting in version 8.0.0, most options from the `imap`
section have been moved to the `mailbox` section.
:::
:::{note}
If your host recommends another port, still try 993
:::
- `ssl` - bool: Use an encrypted SSL/TLS connection
(Default: `True`)
- `skip_certificate_verification` - bool: Skip certificate
verification (not recommended)
- `user` - str: The IMAP user
- `password` - str: The IMAP password
- `msgraph`
- `auth_method` - str: Authentication method, valid types are
`UsernamePassword`, `DeviceCode`, or `ClientSecret`
(Default: `UsernamePassword`).
- `user` - str: The M365 user, required when the auth method is
UsernamePassword
- `password` - str: The user password, required when the auth
method is UsernamePassword
- `client_id` - str: The app registration's client ID
- `client_secret` - str: The app registration's secret
- `tenant_id` - str: The Azure AD tenant ID. This is required
for all auth methods except UsernamePassword.
- `mailbox` - str: The mailbox name. This defaults to the
current user if using the UsernamePassword auth method, but
could be a shared mailbox if the user has access to the mailbox
- `graph_url` - str: Microsoft Graph URL. Allows for use of national clouds (e.g. Azure Government)
(Default: https://graph.microsoft.com)
- `token_file` - str: Path to save the token file
(Default: `.token`)
- `allow_unencrypted_storage` - bool: Allows the Azure Identity
module to fall back to unencrypted token cache (Default: `False`).
Even if enabled, the cache will always try encrypted storage first.
:::{note}
You must create an app registration in Azure AD and have an
admin grant the Microsoft Graph `Mail.ReadWrite`
(delegated) permission to the app. If you are using
`UsernamePassword` auth and the mailbox is different from the
username, you must grant the app `Mail.ReadWrite.Shared`.
:::
:::{warning}
If you are using the `ClientSecret` auth method, you need to
grant the `Mail.ReadWrite` (application) permission to the
app. You must also restrict the application's access to a
specific mailbox since it allows all mailboxes by default.
Use the `New-ApplicationAccessPolicy` command in the
Exchange PowerShell module. If you need to scope the policy to
shared mailboxes, you can add them to a mail enabled security
group and use that as the group id.
```powershell
New-ApplicationAccessPolicy -AccessRight RestrictAccess
-AppId "<CLIENT_ID>" -PolicyScopeGroupId "<MAILBOX>"
-Description "Restrict access to dmarc reports mailbox."
```
:::
- `elasticsearch`
- `hosts` - str: A comma separated list of hostnames and ports
or URLs (e.g. `127.0.0.1:9200` or
`https://user:secret@localhost`)
:::{note}
Special characters in the username or password must be
[URL encoded].
:::
- `user` - str: Basic auth username
- `password` - str: Basic auth password
- `api_key` - str: API key
- `ssl` - bool: Use an encrypted SSL/TLS connection
(Default: `True`)
- `timeout` - float: Timeout in seconds (Default: 60)
- `cert_path` - str: Path to trusted certificates
- `index_suffix` - str: A suffix to apply to the index names
- `index_prefix` - str: A prefix to apply to the index names
- `monthly_indexes` - bool: Use monthly indexes instead of daily indexes
- `number_of_shards` - int: The number of shards to use when
creating the index (Default: `1`)
- `number_of_replicas` - int: The number of replicas to use when
creating the index (Default: `0`)
- `opensearch`
- `hosts` - str: A comma separated list of hostnames and ports
or URLs (e.g. `127.0.0.1:9200` or
`https://user:secret@localhost`)
:::{note}
Special characters in the username or password must be
[URL encoded].
:::
- `user` - str: Basic auth username
- `password` - str: Basic auth password
- `api_key` - str: API key
- `ssl` - bool: Use an encrypted SSL/TLS connection
(Default: `True`)
- `timeout` - float: Timeout in seconds (Default: 60)
- `cert_path` - str: Path to trusted certificates
- `index_suffix` - str: A suffix to apply to the index names
- `index_prefix` - str: A prefix to apply to the index names
- `monthly_indexes` - bool: Use monthly indexes instead of daily indexes
- `number_of_shards` - int: The number of shards to use when
creating the index (Default: `1`)
- `number_of_replicas` - int: The number of replicas to use when
creating the index (Default: `0`)
- `splunk_hec`
- `url` - str: The URL of the Splunk HTTP Events Collector (HEC)
- `token` - str: The HEC token
- `index` - str: The Splunk index to use
- `skip_certificate_verification` - bool: Skip certificate
verification (not recommended)
- `kafka`
- `hosts` - str: A comma separated list of Kafka hosts
- `user` - str: The Kafka user
- `password` - str: The Kafka password
- `ssl` - bool: Use an encrypted SSL/TLS connection (Default: `True`)
- `skip_certificate_verification` - bool: Skip certificate
verification (not recommended)
- `aggregate_topic` - str: The Kafka topic for aggregate reports
- `forensic_topic` - str: The Kafka topic for forensic reports
- `smtp`
- `host` - str: The SMTP hostname
- `port` - int: The SMTP port (Default: `25`)
- `ssl` - bool: Require SSL/TLS instead of using STARTTLS
- `skip_certificate_verification` - bool: Skip certificate
verification (not recommended)
- `user` - str: the SMTP username
- `password` - str: the SMTP password
- `from` - str: The From header to use in the email
- `to` - list: A list of email addresses to send to
- `subject` - str: The Subject header to use in the email
(Default: `parsedmarc report`)
- `attachment` - str: The ZIP attachment filenames
- `message` - str: The email message
(Default: `Please see the attached parsedmarc report.`)
:::{note}
`%` characters must be escaped with another `%` character,
so use `%%` wherever a `%` character is used.
:::
- `s3`
- `bucket` - str: The S3 bucket name
- `path` - str: The path to upload reports to (Default: `/`)
- `region_name` - str: The region name (Optional)
- `endpoint_url` - str: The endpoint URL (Optional)
- `access_key_id` - str: The access key id (Optional)
- `secret_access_key` - str: The secret access key (Optional)
- `syslog`
- `server` - str: The Syslog server name or IP address
- `port` - int: The UDP port to use (Default: `514`)
- `gmail_api`
- `credentials_file` - str: Path to file containing the
credentials, None to disable (Default: `None`)
- `token_file` - str: Path to save the token file
(Default: `.token`)
:::{note}
The `credentials_file` and `token_file` can be obtained by following the Gmail API [quickstart](https://developers.google.com/gmail/api/quickstart/python). Be sure to change the scope to `https://www.googleapis.com/auth/gmail.modify`.
:::
- `include_spam_trash` - bool: Include messages in Spam and
Trash when searching reports (Default: `False`)
- `scopes` - str: Comma separated list of scopes to use when
acquiring credentials
(Default: `https://www.googleapis.com/auth/gmail.modify`)
- `oauth2_port` - int: The TCP port for the local server to
listen on for the OAuth2 response (Default: `8080`)
- `paginate_messages` - bool: When `True`, fetch all applicable Gmail messages.
When `False`, only fetch up to 100 new messages per run (Default: `True`)
- `log_analytics`
- `client_id` - str: The app registration's client ID
- `client_secret` - str: The app registration's client secret
- `tenant_id` - str: The tenant id where the app registration resides
- `dce` - str: The Data Collection Endpoint (DCE). Example: `https://{DCE-NAME}.{REGION}.ingest.monitor.azure.com`.
- `dcr_immutable_id` - str: The immutable ID of the Data Collection Rule (DCR)
- `dcr_aggregate_stream` - str: The stream name for aggregate reports in the DCR
- `dcr_forensic_stream` - str: The stream name for the forensic reports in the DCR
- `dcr_smtp_tls_stream` - str: The stream name for the SMTP TLS reports in the DCR
:::{note}
Information regarding the setup of the Data Collection Rule can be found [here](https://learn.microsoft.com/en-us/azure/azure-monitor/logs/tutorial-logs-ingestion-portal).
:::
- `gelf`
- `host` - str: The GELF server name or IP address
- `port` - int: The port to use
- `mode` - str: The GELF transport type to use. Valid modes: `tcp`, `udp`, `tls`
- `maildir`
- `maildir_path` - str: Full path of the mailbox maildir location (Default: `INBOX`)
- `maildir_create` - bool: Create maildir if not present (Default: False)
- `webhook` - Post the individual reports to a webhook URL with the report as the JSON body (see the example receiver sketch after this list)
- `aggregate_url` - str: URL of the webhook that should receive the aggregate reports
- `forensic_url` - str: URL of the webhook that should receive the forensic reports
- `smtp_tls_url` - str: URL of the webhook that should receive the SMTP TLS reports
- `timeout` - int: Timeout in seconds for the webhook call
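Because each report is delivered as a plain HTTP `POST` with the report as the JSON body, any small HTTP listener can be used to verify the webhook output before wiring up a real endpoint. The following is a minimal sketch of such a receiver; the port (`8000`) and the URL paths are arbitrary examples and only need to match the URLs configured above.
```python
# A minimal test receiver (a sketch, not part of parsedmarc itself).
# It assumes only what is documented above: parsedmarc POSTs each report
# as a JSON body to the configured webhook URL.
from http.server import BaseHTTPRequestHandler, HTTPServer
import json


class ReportHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            report = json.loads(body)
        except json.JSONDecodeError:
            self.send_response(400)
            self.end_headers()
            return
        # The request path shows which of the configured URLs was called
        print(f"{self.path}: received a {type(report).__name__} payload "
              f"({len(body)} bytes)")
        self.send_response(200)
        self.end_headers()


if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8000), ReportHandler).serve_forever()
```
While testing, point `aggregate_url`, `forensic_url`, and `smtp_tls_url` at this listener, for example `aggregate_url = http://127.0.0.1:8000/aggregate`.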
:::{warning}
It is **strongly recommended** to **not** use the `nameservers`
setting. By default, `parsedmarc` uses
[Cloudflare's public resolvers], which are much faster and more
reliable than Google, Cisco OpenDNS, or even most local resolvers.
The `nameservers` option should only be used if your network
blocks DNS requests to outside resolvers.
:::
:::{note}
`save_aggregate` and `save_forensic` are separate options
because you may not want to save forensic reports
(also known as failure reports) to your Elasticsearch instance,
particularly if you are in a highly-regulated industry that
handles sensitive data, such as healthcare or finance. If your
legitimate outgoing email fails DMARC, it is possible
that email may appear later in a forensic report.
Forensic reports contain the original headers of an email that
failed a DMARC check, and sometimes may also include the
full message body, depending on the policy of the reporting
organization.
Most reporting organizations do not send forensic reports of any
kind for privacy reasons. While aggregate DMARC reports are sent
at least daily, it is normal to receive very few forensic reports.
An alternative approach is to still collect forensic/failure/ruf
reports in your DMARC inbox, but run `parsedmarc` with
`save_forensic = True` manually on a separate IMAP folder (using
the `reports_folder` option), after you have manually moved
known samples you want to save to that folder
(e.g. malicious samples and non-sensitive legitimate samples).
:::
:::{warning}
Elasticsearch 8 changed the limits policy for shards, restricting
nodes to 1000 shards by default. parsedmarc uses one shard per analyzed
day, so if you have more than ~3 years of data, you will need to raise
this limit.
Check current usage (from Management -> Dev Tools -> Console):
```text
GET /_cluster/health?pretty
{
...
"active_primary_shards": 932,
"active_shards": 932,
...
}
```
For example, update the limit to 2000:
```text
PUT _cluster/settings
{
"persistent" : {
"cluster.max_shards_per_node" : 2000
}
}
```
Increasing this value increases resource usage.
:::
## Multi-tenant support
Starting in `8.19.0`, ParseDMARC provides multi-tenant support by placing data into separate OpenSearch or Elasticsearch index prefixes. To set this up, create a YAML file where each key is a tenant name and the value is a list of domains related to that tenant (not including subdomains), like this:
```yaml
example:
- example.com
- example.net
- example.org
whalensolutions:
- whalensolutions.com
```
Save it to disk where the user running ParseDMARC can read it, then set `index_prefix_domain_map` to that filepath in the `[general]` section of the ParseDMARC configuration file and do not set an `index_prefix` option in the `[elasticsearch]` or `[opensearch]` sections.
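For example, if the mapping above were saved to `/etc/parsedmarc/tenants.yml` (an example path), the relevant parts of the configuration file would look like this; the Elasticsearch host and credentials are placeholders:
```ini
[general]
save_aggregate = True
index_prefix_domain_map = /etc/parsedmarc/tenants.yml

[elasticsearch]
hosts = https://user:secret@localhost:9200
ssl = True
# Do not set index_prefix here; the tenant name supplies the prefix
```
With this configuration, an aggregate report for `example.com` would be written to an index named like `example_dmarc_aggregate-YYYY-MM-DD`.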
When configured correctly, if ParseDMARC finds that a report is related to a domain in the mapping, the report is saved to an index whose name is prefixed with the tenant name and a trailing underscore. You can then use the security features of OpenSearch or the ELK stack to grant users access only to the indexes they need.
:::{note}
A domain cannot be used in multiple tenant lists. Only the first prefix list that contains the matching domain is used.
:::
## Running parsedmarc as a systemd service
Use systemd to run `parsedmarc` as a service and process reports as
they arrive.
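The service unit below runs as a dedicated `parsedmarc` user and group. If they do not already exist on your system, create them first (a sketch; adjust the flags and shell path to your distribution):
```bash
sudo useradd --system --user-group --no-create-home \
  --shell /usr/sbin/nologin parsedmarc
```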
Protect the `parsedmarc` configuration file from prying eyes
```bash
sudo chown root:parsedmarc /etc/parsedmarc.ini
sudo chmod u=rw,g=r,o= /etc/parsedmarc.ini
```
Create the service configuration file
```bash
sudo nano /etc/systemd/system/parsedmarc.service
```
```ini
[Unit]
Description=parsedmarc mailbox watcher
Documentation=https://domainaware.github.io/parsedmarc/
Wants=network-online.target
After=network.target network-online.target elasticsearch.service
[Service]
ExecStart=/opt/parsedmarc/venv/bin/parsedmarc -c /etc/parsedmarc.ini
User=parsedmarc
Group=parsedmarc
Restart=always
RestartSec=5m
[Install]
WantedBy=multi-user.target
```
Then, enable the service
```bash
sudo systemctl daemon-reload
sudo systemctl enable parsedmarc.service
sudo service parsedmarc restart
```
:::{note}
You must also run the above commands whenever you edit
`parsedmarc.service`.
:::
:::{warning}
Always restart the service every time you upgrade to a new version of
`parsedmarc`:
```bash
sudo service parsedmarc restart
```
:::
To check the status of the service, run:
```bash
service parsedmarc status
```
:::{note}
In the event of a crash, systemd will restart the service after the
`RestartSec` delay (five minutes in the unit file above), but the
`service parsedmarc status` command will only show the logs for the
current process. To view the logs for previous runs as well as the
current process (newest to oldest), run:
```bash
journalctl -u parsedmarc.service -r
```
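To follow new log messages live as they are written, run:
```bash
journalctl -u parsedmarc.service -f
```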
:::
[cloudflare's public resolvers]: https://1.1.1.1/
[url encoded]: https://en.wikipedia.org/wiki/Percent-encoding#Percent-encoding_reserved_characters

File diff suppressed because it is too large.

File diff suppressed because it is too large.

grafana/README.rst Normal file (1 line)

@@ -0,0 +1 @@
Dashboards contributed by Github user Bhozar.

Binary file not shown (added image, 136 KiB).

Binary file not shown (added image, 116 KiB).

Binary file not shown (added image, 172 KiB).

Binary file not shown (added image, 311 KiB).

Binary file not shown (added image, 15 KiB).

Binary file not shown (added image, 15 KiB).

kibana/export.ndjson Normal file (25 lines)

File diff suppressed because one or more lines are too long


@@ -1,4 +0,0 @@
#!/usr/bin/env bash
. ~/venv/domainaware/bin/activate
cd docs && make html && cp -r build/html/* ../../parsedmarc-docs/


@@ -1,629 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""A Python module and CLI for parsing aggregate DMARC reports"""
from __future__ import unicode_literals, print_function, absolute_import
import logging
from sys import version_info
from os import path, stat
import json
from datetime import datetime
from collections import OrderedDict
from datetime import timedelta
from io import BytesIO, StringIO
from gzip import GzipFile
import tarfile
from zipfile import ZipFile
from csv import DictWriter
import shutil
from argparse import ArgumentParser
from glob import glob
import publicsuffix
import xmltodict
import dns.reversename
import dns.resolver
import dns.exception
from requests import get
import geoip2.database
import geoip2.errors
__version__ = "1.0.0"
logger = logging.getLogger(__name__)
logger.setLevel(logging.WARNING)
# Python 2 compatibility hack
if version_info[0] >= 3:
unicode = str
class InvalidAggregateReport(Exception):
"""Raised when an invalid DMARC aggregate report is encountered"""
def _get_base_domain(domain):
"""
Gets the base domain name for the given domain
.. note::
Results are based on a list of public domain suffixes at
https://publicsuffix.org/list/public_suffix_list.dat.
This file is saved to the current working directory,
where it is used as a cache file for 24 hours.
Args:
domain (str): A domain or subdomain
Returns:
str: The base domain of the given domain
"""
psl_path = "public_suffix_list.dat"
def download_psl():
fresh_psl = publicsuffix.fetch()
with open(psl_path, "w", encoding="utf-8") as fresh_psl_file:
fresh_psl_file.write(fresh_psl.read())
return publicsuffix.PublicSuffixList(fresh_psl)
if not path.exists(psl_path):
psl = download_psl()
else:
psl_age = datetime.now() - datetime.fromtimestamp(
stat(psl_path).st_mtime)
if psl_age > timedelta(hours=24):
psl = download_psl()
else:
with open(psl_path, encoding="utf-8") as psl_file:
psl = publicsuffix.PublicSuffixList(psl_file)
return psl.get_public_suffix(domain)
def _query_dns(domain, record_type, nameservers=None, timeout=6.0):
"""
Queries DNS
Args:
domain (str): The domain or subdomain to query about
record_type (str): The record type to query for
nameservers (list): A list of one or more nameservers to use
timeout (float): Sets the DNS timeout in seconds
Returns:
list: A list of answers
"""
resolver = dns.resolver.Resolver()
timeout = float(timeout)
if nameservers:
resolver.nameservers = nameservers
resolver.timeout = timeout
resolver.lifetime = timeout
return list(map(
lambda r: r.to_text().replace(' "', '').replace('"', '').rstrip("."),
resolver.query(domain, record_type, tcp=True)))
def _get_reverse_dns(ip_address, nameservers=None, timeout=6.0):
"""
Resolves an IP address to a hostname using a reverse DNS query
Args:
ip_address (str): The IP address to resolve
nameservers (list): A list of nameservers to query
timeout (float): Sets the DNS query timeout in seconds
Returns:
"""
hostname = None
try:
address = dns.reversename.from_address(ip_address)
hostname = _query_dns(address, "PTR",
nameservers=nameservers,
timeout=timeout)[0]
except dns.exception.DNSException:
pass
return hostname
def _timestamp_to_datetime(timestamp):
"""
Converts a UNIX/DMARC timestamp to a Python ``DateTime`` object
Args:
timestamp: The timestamp
Returns:
DateTime: The converted timestamp as a Python ``DateTime`` object
"""
return datetime.fromtimestamp(int(timestamp))
def _timestamp_to_human(timestamp):
"""
Converts a UNIX/DMARC timestamp to a human-readable string
Args:
timestamp: The timestamp
Returns:
str: The converted timestamp in ``YYYY-MM-DD HH:MM:SS`` format
"""
return _timestamp_to_datetime(timestamp).strftime("%Y-%m-%d %H:%M:%S")
def _human_timestamp_to_datetime(human_timestamp):
"""
Converts a human-readable timestamp into a Python ``DateTime`` object
Args:
human_timestamp (str): A timestamp in `YYYY-MM-DD HH:MM:SS`` format
Returns:
DateTime: The converted timestamp
"""
return datetime.strptime(human_timestamp, "%Y-%m-%d %H:%M:%S")
def _get_ip_address_country(ip_address):
"""
Uses the MaxMind Geolite2 Country database to return the ISO code for the
country associated with the given IPv4 or IPv6 address
Args:
ip_address (str): The IP address to query for
Returns:
str: An ISO country code associated with the given IP address
"""
db_filename = "GeoLite2-Country.mmdb"
def download_country_database():
"""Downloads the MaxMind Geolite2 Country database to the current
working directory"""
url = "https://geolite.maxmind.com/download/geoip/database/" \
"GeoLite2-Country.tar.gz"
tar_file = tarfile.open(fileobj=BytesIO(get(url).content), mode="r:gz")
tar_dir = tar_file.getnames()[0]
tar_path = "{0}/{1}".format(tar_dir, db_filename)
tar_file.extract(tar_path)
shutil.move(tar_path, ".")
shutil.rmtree(tar_dir)
system_paths = ["/usr/local/share/GeoIP/GeoLite2-Country.mmdb",
"/usr/share/GeoIP/GeoLite2-Country.mmdb"]
db_path = ""
for system_path in system_paths:
if path.exists(system_path):
db_path = system_path
break
if db_path == "":
if not path.exists(db_filename):
download_country_database()
else:
db_age = datetime.now() - datetime.fromtimestamp(
stat(db_filename).st_mtime)
if db_age > timedelta(days=60):
shutil.rmtree(db_path)
download_country_database()
db_path = db_filename
db_reader = geoip2.database.Reader(db_path)
country = None
try:
country = db_reader.country(ip_address).country.iso_code
except geoip2.errors.AddressNotFoundError:
pass
return country
def _parse_report_record(record, nameservers=None, timeout=6.0):
"""
Converts a record from a DMARC aggregate report into a more consistent
format
Args:
record (OrderedDict): The record to convert
nameservers (list): A list of one or more nameservers to use
timeout (float): Sets the DNS timeout in seconds
Returns:
OrderedDict: The converted record
"""
record = record.copy()
new_record = OrderedDict()
new_record["source"] = OrderedDict()
new_record["source"]["ip_address"] = record["row"]["source_ip"]
reverse_dns = _get_reverse_dns(new_record["source"]["ip_address"],
nameservers=nameservers,
timeout=timeout)
country = _get_ip_address_country(new_record["source"]["ip_address"])
new_record["source"]["country"] = country
new_record["source"]["reverse_dns"] = reverse_dns
new_record["source"]["base_domain"] = None
if new_record["source"]["reverse_dns"] is not None:
base_domain = _get_base_domain(new_record["source"]["reverse_dns"])
new_record["source"]["base_domain"] = base_domain
new_record["count"] = int(record["row"]["count"])
policy_evaluated = record["row"]["policy_evaluated"].copy()
new_policy_evaluated = OrderedDict([("disposition", "none"),
("dkim", "fail"),
("spf", "fail"),
("policy_override_reasons", [])
])
if "disposition" in policy_evaluated:
new_policy_evaluated["disposition"] = policy_evaluated["disposition"]
if "dkim" in policy_evaluated:
new_policy_evaluated["dkim"] = policy_evaluated["dkim"]
if "spf" in policy_evaluated:
new_policy_evaluated["spf"] = policy_evaluated["spf"]
reasons = []
if "reason" in policy_evaluated:
if type(policy_evaluated["reason"]) == list:
reasons = policy_evaluated["reason"]
else:
reasons = [policy_evaluated["reason"]]
for reason in reasons:
if "comment" not in reason:
reason["comment"] = "none"
reasons.append(reason)
new_policy_evaluated["policy_override_reasons"] = reasons
new_record["policy_evaluated"] = new_policy_evaluated
new_record["identifiers"] = record["identifiers"].copy()
new_record["auth_results"] = OrderedDict([("dkim", []), ("spf", [])])
auth_results = record["auth_results"].copy()
if "dkim" in auth_results:
if type(auth_results["dkim"]) != list:
auth_results["dkim"] = [auth_results["dkim"]]
for result in auth_results["dkim"]:
if "domain" in result and result["domain"] is not None:
new_result = OrderedDict([("domain", result["domain"])])
if "selector" in result and result["selector"] is not None:
new_result["selector"] = result["selector"]
else:
new_result["selector"] = "none"
if "result" in result and result["result"] is not None:
new_result["result"] = result["result"]
else:
new_result["result"] = "none"
new_record["auth_results"]["dkim"].append(new_result)
if type(auth_results["spf"]) != list:
auth_results["spf"] = [auth_results["spf"]]
for result in auth_results["spf"]:
new_result = OrderedDict([("domain", result["domain"])])
if "scope" in result and result["scope"] is not None:
new_result["scope"] = result["scope"]
else:
new_result["scope"] = "mfrom"
if "result" in result and result["result"] is not None:
new_result["result"] = result["result"]
else:
new_result["result"] = "none"
new_record["auth_results"]["spf"].append(new_result)
if "envelope_from" not in new_record["identifiers"]:
envelope_from = new_record["auth_results"]["spf"][-1]["domain"].lower()
new_record["identifiers"]["envelope_from"] = envelope_from
elif new_record["identifiers"]["envelope_from"] is None:
envelope_from = new_record["auth_results"]["spf"][-1]["domain"].lower()
new_record["identifiers"]["envelope_from"] = envelope_from
envelope_to = None
if "envelope_to" in new_record["identifiers"]:
envelope_to = new_record["identifiers"]["envelope_to"]
del new_record["identifiers"]["envelope_to"]
new_record["identifiers"]["envelope_to"] = envelope_to
return new_record
def parse_aggregate_report_xml(xml, nameservers=None, timeout=6.0):
"""Parses a DMARC XML report string and returns a consistent OrderedDict
Args:
xml (str): A string of DMARC aggregate report XML
nameservers (list): A list of one or more nameservers to use
timeout (float): Sets the DNS timeout in seconds
Returns:
OrderedDict: The parsed aggregate DMARC report
"""
try:
report = xmltodict.parse(xml)["feedback"]
report_metadata = report["report_metadata"]
schema = "draft"
if "version" in report:
schema = report["version"]
new_report = OrderedDict([("xml_schema", schema)])
new_report_metadata = OrderedDict()
new_report_metadata["org_name"] = report_metadata["org_name"]
new_report_metadata["org_email"] = report_metadata["email"]
extra = None
if "extra_contact_info" in report_metadata:
extra = report_metadata["extra_contact_info"]
new_report_metadata["org_extra_contact_info"] = extra
new_report_metadata["report_id"] = report_metadata["report_id"]
date_range = report["report_metadata"]["date_range"]
date_range["begin"] = _timestamp_to_human(date_range["begin"])
date_range["end"] = _timestamp_to_human(date_range["end"])
new_report_metadata["begin_date"] = date_range["begin"]
new_report_metadata["end_date"] = date_range["end"]
errors = []
if "error" in report["report_metadata"]:
if type(report["report_metadata"]["error"]) != list:
errors = [report["report_metadata"]["error"]]
else:
errors = report["report_metadata"]["error"]
new_report_metadata["errors"] = errors
new_report["report_metadata"] = new_report_metadata
records = []
policy_published = report["policy_published"]
new_policy_published = OrderedDict()
new_policy_published["domain"] = policy_published["domain"]
adkim = "r"
if "adkim" in policy_published:
if policy_published["adkim"] is not None:
adkim = policy_published["adkim"]
new_policy_published["adkim"] = adkim
aspf = "r"
if "aspf" in policy_published:
if policy_published["aspf"] is not None:
aspf = policy_published["aspf"]
new_policy_published["aspf"] = aspf
new_policy_published["p"] = policy_published["p"]
sp = new_policy_published["p"]
if "sp" in policy_published:
if policy_published["sp"] is not None:
sp = report["policy_published"]["sp"]
new_policy_published["sp"] = sp
pct = "100"
if "pct" in policy_published:
if policy_published["pct"] is not None:
pct = report["policy_published"]["pct"]
new_policy_published["pct"] = pct
fo = "0"
if "fo" in policy_published:
if policy_published["fo"] is not None:
fo = report["policy_published"]["fo"]
new_policy_published["fo"] = fo
new_report["policy_published"] = new_policy_published
if type(report["record"]) == list:
for record in report["record"]:
records.append(_parse_report_record(record,
nameservers=nameservers,
timeout=timeout))
else:
records.append(_parse_report_record(report["record"]))
new_report["records"] = records
return new_report
except KeyError as error:
raise InvalidAggregateReport("Missing field: "
"{0}".format(error.__str__()))
def parse_aggregate_report_file(_input, nameservers=None, timeout=6.0):
"""Parses a file at the given path, a file-like object. or bytes as a
aggregate DMARC report
Args:
_input: A path to a file, a file like object, or bytes
nameservers (list): A list of one or more nameservers to use
timeout (float): Sets the DNS timeout in seconds
Returns:
OrderedDict: The parsed DMARC aggregate report
"""
if type(_input) == str or type(_input) == unicode:
file_object = open(_input, "rb")
elif type(_input) == bytes:
file_object = BytesIO(_input)
else:
file_object = _input
try:
header = file_object.read(6)
file_object.seek(0)
if header.startswith(b"\x50\x4B\x03\x04"):
_zip = ZipFile(file_object)
xml = _zip.open(_zip.namelist()[0]).read().decode()
elif header.startswith(b"\x1F\x8B"):
xml = GzipFile(fileobj=file_object).read().decode()
elif header.startswith(b"\x3c\x3f\x78\x6d\x6c\x20"):
xml = file_object.read().decode()
else:
file_object.close()
raise InvalidAggregateReport("Not a valid zip, gzip, or xml file")
file_object.close()
except UnicodeDecodeError:
raise InvalidAggregateReport("File objects must be opened in binary "
"(rb) mode")
return parse_aggregate_report_xml(xml,
nameservers=nameservers,
timeout=timeout)
def parsed_aggregate_report_to_csv(_input):
"""
Converts one or more parsed aggregate reports to flat CSV format, including
headers
Args:
_input: A parsed aggregate report or list of parsed aggregate reports
Returns:
str: Parsed aggregate report data in flat CSV format, including headers
"""
fields = ["xml_schema", "org_name", "org_email",
"org_extra_contact_info", "report_id", "begin_date", "end_date",
"errors", "domain", "adkim", "aspf", "p", "sp", "pct", "fo",
"source_ip_address", "source_country", "source_reverse_dns",
"source_base_domain", "count", "disposition", "dkim_alignment",
"spf_alignment", "policy_override_reasons",
"policy_override_comments", "envelope_from", "header_from",
"envelope_to", "dkim_domains", "dkim_selectors", "dkim_results",
"spf_domains", "spf_scopes", "spf_results"]
csv_file_object = StringIO()
writer = DictWriter(csv_file_object, fields)
writer.writeheader()
if type(_input) == OrderedDict:
_input = [_input]
for report in _input:
xml_schema = report["xml_schema"]
org_name = report["report_metadata"]["org_name"]
org_email = report["report_metadata"]["org_email"]
org_extra_contact = report["report_metadata"]["org_extra_contact_info"]
report_id = report["report_metadata"]["report_id"]
begin_date = report["report_metadata"]["begin_date"]
end_date = report["report_metadata"]["end_date"]
errors = report["report_metadata"]["errors"]
domain = report["policy_published"]["domain"]
adkim = report["policy_published"]["adkim"]
aspf = report["policy_published"]["aspf"]
p = report["policy_published"]["p"]
sp = report["policy_published"]["sp"]
pct = report["policy_published"]["pct"]
fo = report["policy_published"]["fo"]
report_dict = dict(xml_schema=xml_schema, org_name=org_name,
org_email=org_email,
org_extra_contact_info=org_extra_contact,
report_id=report_id, begin_date=begin_date,
end_date=end_date, errors=errors, domain=domain,
adkim=adkim, aspf=aspf, p=p, sp=sp, pct=pct, fo=fo)
for record in report["records"]:
row = report_dict
row["source_ip_address"] = record["source"]["ip_address"]
row["source_country"] = record["source"]["country"]
row["source_reverse_dns"] = record["source"]["reverse_dns"]
row["source_base_domain"] = record["source"]["base_domain"]
row["count"] = record["count"]
row["disposition"] = record["policy_evaluated"]["disposition"]
row["spf_alignment"] = record["policy_evaluated"]["spf"]
row["dkim_alignment"] = record["policy_evaluated"]["dkim"]
policy_override_reasons = list(map(lambda r: r["type"],
record["policy_evaluated"]
["policy_override_reasons"]))
policy_override_comments = list(map(lambda r: r["comment"],
record["policy_evaluated"]
["policy_override_reasons"]))
row["policy_override_reasons"] = ",".join(
policy_override_reasons)
row["policy_override_comments"] = "|".join(
policy_override_comments)
row["envelope_from"] = record["identifiers"]["envelope_from"]
row["header_from"] = record["identifiers"]["header_from"]
envelope_to = record["identifiers"]["envelope_to"]
row["envelope_to"] = envelope_to
dkim_domains = []
dkim_selectors = []
dkim_results = []
for dkim_result in record["auth_results"]["dkim"]:
dkim_domains.append(dkim_result["domain"])
if "selector" in dkim_result:
dkim_selectors.append(dkim_result["selector"])
dkim_results.append(dkim_result["result"])
row["dkim_domains"] = ",".join(dkim_domains)
row["dkim_selectors"] = ",".join(dkim_selectors)
row["dkim_results"] = ",".join(dkim_results)
spf_domains = []
spf_scopes = []
spf_results = []
for spf_result in record["auth_results"]["spf"]:
spf_domains.append(spf_result["domain"])
spf_scopes.append(spf_result["scope"])
spf_results.append(spf_result["result"])
row["spf_domains"] = ",".join(spf_domains)
row["spf_scopes"] = ",".join(spf_scopes)
row["spf_results"] = ",".join(spf_results)
writer.writerow(row)
csv_file_object.flush()
return csv_file_object.getvalue()
def _main():
"""Called when the module in executed"""
arg_parser = ArgumentParser(description="Parses aggregate DMARC reports")
arg_parser.add_argument("file_path", nargs="+",
help="one or more paths of aggregate report "
"files (compressed or uncompressed)")
arg_parser.add_argument("-f", "--format", default="json",
help="specify JSON or CSV output format")
arg_parser.add_argument("-o", "--output",
help="output to a file path rather than "
"printing to the screen")
arg_parser.add_argument("-n", "--nameserver", nargs="+",
help="nameservers to query")
arg_parser.add_argument("-t", "--timeout",
help="number of seconds to wait for an answer "
"from DNS (default 6.0)",
type=float,
default=6.0)
arg_parser.add_argument("-v", "--version", action="version",
version=__version__)
args = arg_parser.parse_args()
file_paths = []
for file_path in args.file_path:
file_paths += glob(file_path)
file_paths = list(set(file_paths))
parsed_reports = []
for file_path in file_paths:
try:
report = parse_aggregate_report_file(file_path,
nameservers=args.nameserver,
timeout=args.timeout)
parsed_reports.append(report)
except InvalidAggregateReport as error:
logger.error("Unable to parse {0}: {1}".format(file_path,
error.__str__()))
output = ""
if args.format.lower() == "json":
if len(parsed_reports) == 1:
parsed_reports = parsed_reports[0]
output = json.dumps(parsed_reports,
ensure_ascii=False,
indent=2)
elif args.format.lower() == "csv":
output = parsed_aggregate_report_to_csv(parsed_reports)
else:
logger.error("Invalid output format: {0}".format(args.format))
exit(-1)
if args.output:
with open(args.output, "w", encoding="utf-8", newline="\n") as file:
file.write(output)
else:
print(output)
if __name__ == "__main__":
_main()

parsedmarc/__init__.py Normal file (2564 lines)

File diff suppressed because it is too large.

parsedmarc/cli.py Normal file (1696 lines)

File diff suppressed because it is too large.

parsedmarc/constants.py Normal file (3 lines)

@@ -0,0 +1,3 @@
__version__ = "9.0.5"
USER_AGENT = f"parsedmarc/{__version__}"

parsedmarc/elastic.py Normal file (855 lines)

@@ -0,0 +1,855 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Optional, Union, Any
from elasticsearch_dsl.search import Q
from elasticsearch_dsl import (
connections,
Object,
Document,
Index,
Nested,
InnerDoc,
Integer,
Text,
Boolean,
Ip,
Date,
Search,
)
from elasticsearch.helpers import reindex
from parsedmarc.log import logger
from parsedmarc.utils import human_timestamp_to_datetime
from parsedmarc import InvalidForensicReport
class ElasticsearchError(Exception):
"""Raised when an Elasticsearch error occurs"""
class _PolicyOverride(InnerDoc):
type = Text()
comment = Text()
class _PublishedPolicy(InnerDoc):
domain = Text()
adkim = Text()
aspf = Text()
p = Text()
sp = Text()
pct = Integer()
fo = Text()
class _DKIMResult(InnerDoc):
domain = Text()
selector = Text()
result = Text()
class _SPFResult(InnerDoc):
domain = Text()
scope = Text()
results = Text()
class _AggregateReportDoc(Document):
class Index:
name = "dmarc_aggregate"
xml_schema = Text()
org_name = Text()
org_email = Text()
org_extra_contact_info = Text()
report_id = Text()
date_range = Date()
date_begin = Date()
date_end = Date()
normalized_timespan = Boolean()
original_timespan_seconds = Integer()
errors = Text()
published_policy = Object(_PublishedPolicy)
source_ip_address = Ip()
source_country = Text()
source_reverse_dns = Text()
source_base_domain = Text()
source_type = Text()
source_name = Text()
message_count = Integer()
disposition = Text()
dkim_aligned = Boolean()
spf_aligned = Boolean()
passed_dmarc = Boolean()
policy_overrides = Nested(_PolicyOverride)
header_from = Text()
envelope_from = Text()
envelope_to = Text()
dkim_results = Nested(_DKIMResult)
spf_results = Nested(_SPFResult)
def add_policy_override(self, type_: str, comment: str):
self.policy_overrides.append(_PolicyOverride(type=type_, comment=comment))
def add_dkim_result(self, domain: str, selector: str, result: _DKIMResult):
self.dkim_results.append(
_DKIMResult(domain=domain, selector=selector, result=result)
)
def add_spf_result(self, domain: str, scope: str, result: _SPFResult):
self.spf_results.append(_SPFResult(domain=domain, scope=scope, result=result))
def save(self, **kwargs):
self.passed_dmarc = False
self.passed_dmarc = self.spf_aligned or self.dkim_aligned
return super().save(**kwargs)
class _EmailAddressDoc(InnerDoc):
display_name = Text()
address = Text()
class _EmailAttachmentDoc(Document):
filename = Text()
content_type = Text()
sha256 = Text()
class _ForensicSampleDoc(InnerDoc):
raw = Text()
headers = Object()
headers_only = Boolean()
to = Nested(_EmailAddressDoc)
subject = Text()
filename_safe_subject = Text()
_from = Object(_EmailAddressDoc)
date = Date()
reply_to = Nested(_EmailAddressDoc)
cc = Nested(_EmailAddressDoc)
bcc = Nested(_EmailAddressDoc)
body = Text()
attachments = Nested(_EmailAttachmentDoc)
def add_to(self, display_name: str, address: str):
self.to.append(_EmailAddressDoc(display_name=display_name, address=address))
def add_reply_to(self, display_name: str, address: str):
self.reply_to.append(
_EmailAddressDoc(display_name=display_name, address=address)
)
def add_cc(self, display_name: str, address: str):
self.cc.append(_EmailAddressDoc(display_name=display_name, address=address))
def add_bcc(self, display_name: str, address: str):
self.bcc.append(_EmailAddressDoc(display_name=display_name, address=address))
def add_attachment(self, filename: str, content_type: str, sha256: str):
self.attachments.append(
_EmailAttachmentDoc(
filename=filename, content_type=content_type, sha256=sha256
)
)
class _ForensicReportDoc(Document):
class Index:
name = "dmarc_forensic"
feedback_type = Text()
user_agent = Text()
version = Text()
original_mail_from = Text()
arrival_date = Date()
domain = Text()
original_envelope_id = Text()
authentication_results = Text()
delivery_results = Text()
source_ip_address = Ip()
source_country = Text()
source_reverse_dns = Text()
source_authentication_mechanisms = Text()
source_auth_failures = Text()
dkim_domain = Text()
original_rcpt_to = Text()
sample = Object(_ForensicSampleDoc)
class _SMTPTLSFailureDetailsDoc(InnerDoc):
result_type = Text()
sending_mta_ip = Ip()
receiving_mx_helo = Text()
receiving_ip = Ip()
failed_session_count = Integer()
additional_information_uri = Text()
failure_reason_code = Text()
class _SMTPTLSPolicyDoc(InnerDoc):
policy_domain = Text()
policy_type = Text()
policy_strings = Text()
mx_host_patterns = Text()
successful_session_count = Integer()
failed_session_count = Integer()
failure_details = Nested(_SMTPTLSFailureDetailsDoc)
def add_failure_details(
self,
result_type: Optional[str] = None,
ip_address: Optional[str] = None,
receiving_ip: Optional[str] = None,
receiving_mx_helo: Optional[str] = None,
failed_session_count: Optional[int] = None,
sending_mta_ip: Optional[str] = None,
receiving_mx_hostname: Optional[str] = None,
additional_information_uri: Optional[str] = None,
failure_reason_code: Union[str, int, None] = None,
):
_details = _SMTPTLSFailureDetailsDoc(
result_type=result_type,
ip_address=ip_address,
sending_mta_ip=sending_mta_ip,
receiving_mx_hostname=receiving_mx_hostname,
receiving_mx_helo=receiving_mx_helo,
receiving_ip=receiving_ip,
failed_session_count=failed_session_count,
additional_information_uri=additional_information_uri,
failure_reason_code=failure_reason_code,
)
self.failure_details.append(_details)
class _SMTPTLSReportDoc(Document):
class Index:
name = "smtp_tls"
organization_name = Text()
date_range = Date()
date_begin = Date()
date_end = Date()
contact_info = Text()
report_id = Text()
policies = Nested(_SMTPTLSPolicyDoc)
def add_policy(
self,
policy_type: str,
policy_domain: str,
successful_session_count: int,
failed_session_count: int,
*,
policy_string: Optional[str] = None,
mx_host_patterns: Optional[list[str]] = None,
failure_details: Optional[str] = None,
):
# list.append() takes a single object, so wrap the values in a policy doc
self.policies.append(
_SMTPTLSPolicyDoc(
policy_type=policy_type,
policy_domain=policy_domain,
successful_session_count=successful_session_count,
failed_session_count=failed_session_count,
policy_string=policy_string,
mx_host_patterns=mx_host_patterns,
failure_details=failure_details,
)
)
class AlreadySaved(ValueError):
"""Raised when a report to be saved matches an existing report"""
def set_hosts(
hosts: Union[str, list[str]],
*,
use_ssl: Optional[bool] = False,
ssl_cert_path: Optional[str] = None,
username: Optional[str] = None,
password: Optional[str] = None,
api_key: Optional[str] = None,
timeout: Optional[float] = 60.0,
):
"""
Sets the Elasticsearch hosts to use
Args:
hosts (Union[str, list[str]]): A single hostname or URL, or list of hostnames or URLs
use_ssl (bool): Use an HTTPS connection to the server
ssl_cert_path (str): Path to the certificate chain
username (str): The username to use for authentication
password (str): The password to use for authentication
api_key (str): The Base64 encoded API key to use for authentication
timeout (float): Timeout in seconds
"""
if not isinstance(hosts, list):
hosts = [hosts]
conn_params = {"hosts": hosts, "timeout": timeout}
if use_ssl:
conn_params["use_ssl"] = True
if ssl_cert_path:
conn_params["verify_certs"] = True
conn_params["ca_certs"] = ssl_cert_path
else:
conn_params["verify_certs"] = False
if username and password:
conn_params["http_auth"] = username + ":" + password
if api_key:
conn_params["api_key"] = api_key
connections.create_connection(**conn_params)
def create_indexes(names: list[str], settings: Optional[dict[str, Any]] = None):
"""
Create Elasticsearch indexes
Args:
names (list): A list of index names
settings (dict): Index settings
"""
for name in names:
index = Index(name)
try:
if not index.exists():
logger.debug("Creating Elasticsearch index: {0}".format(name))
if settings is None:
index.settings(number_of_shards=1, number_of_replicas=0)
else:
index.settings(**settings)
index.create()
except Exception as e:
raise ElasticsearchError("Elasticsearch error: {0}".format(e.__str__()))
def migrate_indexes(
aggregate_indexes: Optional[list[str]] = None,
forensic_indexes: Optional[list[str]] = None,
):
"""
Updates index mappings
Args:
aggregate_indexes (list): A list of aggregate index names
forensic_indexes (list): A list of forensic index names
"""
version = 2
if aggregate_indexes is None:
aggregate_indexes = []
if forensic_indexes is None:
forensic_indexes = []
for aggregate_index_name in aggregate_indexes:
if not Index(aggregate_index_name).exists():
continue
aggregate_index = Index(aggregate_index_name)
doc = "doc"
fo_field = "published_policy.fo"
fo = "fo"
fo_mapping = aggregate_index.get_field_mapping(fields=[fo_field])
fo_mapping = fo_mapping[list(fo_mapping.keys())[0]]["mappings"]
if doc not in fo_mapping:
continue
fo_mapping = fo_mapping[doc][fo_field]["mapping"][fo]
fo_type = fo_mapping["type"]
if fo_type == "long":
new_index_name = "{0}-v{1}".format(aggregate_index_name, version)
body = {
"properties": {
"published_policy.fo": {
"type": "text",
"fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}
}
}
Index(new_index_name).create()
Index(new_index_name).put_mapping(doc_type=doc, body=body)
reindex(connections.get_connection(), aggregate_index_name, new_index_name)
Index(aggregate_index_name).delete()
for forensic_index in forensic_indexes:
pass
def save_aggregate_report_to_elasticsearch(
aggregate_report: dict[str, Any],
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: Optional[int] = 1,
number_of_replicas: Optional[int] = 0,
):
"""
Saves a parsed DMARC aggregate report to Elasticsearch
Args:
aggregate_report (dict): A parsed forensic report
index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily indexes
number_of_shards (int): The number of shards to use in the index
number_of_replicas (int): The number of replicas to use in the index
Raises:
AlreadySaved
"""
logger.info("Saving aggregate report to Elasticsearch")
aggregate_report = aggregate_report.copy()
metadata = aggregate_report["report_metadata"]
org_name = metadata["org_name"]
report_id = metadata["report_id"]
domain = aggregate_report["policy_published"]["domain"]
begin_date = human_timestamp_to_datetime(metadata["begin_date"], to_utc=True)
end_date = human_timestamp_to_datetime(metadata["end_date"], to_utc=True)
if monthly_indexes:
index_date = begin_date.strftime("%Y-%m")
else:
index_date = begin_date.strftime("%Y-%m-%d")
org_name_query = Q(dict(match_phrase=dict(org_name=org_name)))
report_id_query = Q(dict(match_phrase=dict(report_id=report_id)))
domain_query = Q(dict(match_phrase={"published_policy.domain": domain}))
begin_date_query = Q(dict(match=dict(date_begin=begin_date)))
end_date_query = Q(dict(match=dict(date_end=end_date)))
if index_suffix is not None:
search_index = "dmarc_aggregate_{0}*".format(index_suffix)
else:
search_index = "dmarc_aggregate*"
if index_prefix is not None:
search_index = "{0}{1}".format(index_prefix, search_index)
search = Search(index=search_index)
query = org_name_query & report_id_query & domain_query
query = query & begin_date_query & end_date_query
search.query = query
# Build the human-readable dates before searching so they are available
# for the duplicate-report message below
begin_date_human = begin_date.strftime("%Y-%m-%d %H:%M:%SZ")
end_date_human = end_date.strftime("%Y-%m-%d %H:%M:%SZ")
try:
existing = search.execute()
except Exception as error_:
raise ElasticsearchError(
"Elasticsearch's search for existing report \
error: {}".format(error_.__str__())
)
if len(existing) > 0:
raise AlreadySaved(
"An aggregate report ID {0} from {1} about {2} "
"with a date range of {3} UTC to {4} UTC already "
"exists in "
"Elasticsearch".format(
report_id, org_name, domain, begin_date_human, end_date_human
)
)
published_policy = _PublishedPolicy(
domain=aggregate_report["policy_published"]["domain"],
adkim=aggregate_report["policy_published"]["adkim"],
aspf=aggregate_report["policy_published"]["aspf"],
p=aggregate_report["policy_published"]["p"],
sp=aggregate_report["policy_published"]["sp"],
pct=aggregate_report["policy_published"]["pct"],
fo=aggregate_report["policy_published"]["fo"],
)
for record in aggregate_report["records"]:
begin_date = human_timestamp_to_datetime(record["interval_begin"], to_utc=True)
end_date = human_timestamp_to_datetime(record["interval_end"], to_utc=True)
normalized_timespan = record["normalized_timespan"]
if monthly_indexes:
index_date = begin_date.strftime("%Y-%m")
else:
index_date = begin_date.strftime("%Y-%m-%d")
aggregate_report["begin_date"] = begin_date
aggregate_report["end_date"] = end_date
date_range = [aggregate_report["begin_date"], aggregate_report["end_date"]]
agg_doc = _AggregateReportDoc(
xml_schema=aggregate_report["xml_schema"],
org_name=metadata["org_name"],
org_email=metadata["org_email"],
org_extra_contact_info=metadata["org_extra_contact_info"],
report_id=metadata["report_id"],
date_range=date_range,
date_begin=begin_date,
date_end=end_date,
normalized_timespan=normalized_timespan,
errors=metadata["errors"],
published_policy=published_policy,
source_ip_address=record["source"]["ip_address"],
source_country=record["source"]["country"],
source_reverse_dns=record["source"]["reverse_dns"],
source_base_domain=record["source"]["base_domain"],
source_type=record["source"]["type"],
source_name=record["source"]["name"],
message_count=record["count"],
disposition=record["policy_evaluated"]["disposition"],
dkim_aligned=record["policy_evaluated"]["dkim"] is not None
and record["policy_evaluated"]["dkim"].lower() == "pass",
spf_aligned=record["policy_evaluated"]["spf"] is not None
and record["policy_evaluated"]["spf"].lower() == "pass",
header_from=record["identifiers"]["header_from"],
envelope_from=record["identifiers"]["envelope_from"],
envelope_to=record["identifiers"]["envelope_to"],
)
for override in record["policy_evaluated"]["policy_override_reasons"]:
agg_doc.add_policy_override(
type_=override["type"], comment=override["comment"]
)
for dkim_result in record["auth_results"]["dkim"]:
agg_doc.add_dkim_result(
domain=dkim_result["domain"],
selector=dkim_result["selector"],
result=dkim_result["result"],
)
for spf_result in record["auth_results"]["spf"]:
agg_doc.add_spf_result(
domain=spf_result["domain"],
scope=spf_result["scope"],
result=spf_result["result"],
)
index = "dmarc_aggregate"
if index_suffix:
index = "{0}_{1}".format(index, index_suffix)
if index_prefix:
index = "{0}{1}".format(index_prefix, index)
index = "{0}-{1}".format(index, index_date)
index_settings = dict(
number_of_shards=number_of_shards, number_of_replicas=number_of_replicas
)
create_indexes([index], index_settings)
agg_doc.meta.index = index
try:
agg_doc.save()
except Exception as e:
raise ElasticsearchError("Elasticsearch error: {0}".format(e.__str__()))
def save_forensic_report_to_elasticsearch(
forensic_report: dict[str, Any],
index_suffix: Optional[Any] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: int = 1,
number_of_replicas: int = 0,
):
"""
Saves a parsed DMARC forensic report to Elasticsearch
Args:
forensic_report (dict): A parsed forensic report
index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily
indexes
number_of_shards (int): The number of shards to use in the index
number_of_replicas (int): The number of replicas to use in the
index
Raises:
AlreadySaved
"""
logger.info("Saving forensic report to Elasticsearch")
forensic_report = forensic_report.copy()
sample_date = None
if forensic_report["parsed_sample"]["date"] is not None:
sample_date = forensic_report["parsed_sample"]["date"]
sample_date = human_timestamp_to_datetime(sample_date)
original_headers = forensic_report["parsed_sample"]["headers"]
headers = dict()
for original_header in original_headers:
headers[original_header.lower()] = original_headers[original_header]
arrival_date = human_timestamp_to_datetime(forensic_report["arrival_date_utc"])
arrival_date_epoch_milliseconds = int(arrival_date.timestamp() * 1000)
if index_suffix is not None:
search_index = "dmarc_forensic_{0}*".format(index_suffix)
else:
search_index = "dmarc_forensic*"
if index_prefix is not None:
search_index = "{0}{1}".format(index_prefix, search_index)
search = Search(index=search_index)
q = Q(dict(match=dict(arrival_date=arrival_date_epoch_milliseconds)))
from_ = None
to_ = None
subject = None
if "from" in headers:
# We convert the FROM header from a string list to a flat string.
headers["from"] = headers["from"][0]
if headers["from"][0] == "":
headers["from"] = headers["from"][1]
else:
headers["from"] = " <".join(headers["from"]) + ">"
from_ = dict()
from_["sample.headers.from"] = headers["from"]
from_query = Q(dict(match_phrase=from_))
q = q & from_query
if "to" in headers:
# We convert the TO header from a string list to a flat string.
headers["to"] = headers["to"][0]
if headers["to"][0] == "":
headers["to"] = headers["to"][1]
else:
headers["to"] = " <".join(headers["to"]) + ">"
to_ = dict()
to_["sample.headers.to"] = headers["to"]
to_query = Q(dict(match_phrase=to_))
q = q & to_query
if "subject" in headers:
subject = headers["subject"]
subject_query = {"match_phrase": {"sample.headers.subject": subject}}
q = q & Q(subject_query)
search.query = q
existing = search.execute()
if len(existing) > 0:
raise AlreadySaved(
"A forensic sample to {0} from {1} "
"with a subject of {2} and arrival date of {3} "
"already exists in "
"Elasticsearch".format(
to_, from_, subject, forensic_report["arrival_date_utc"]
)
)
parsed_sample = forensic_report["parsed_sample"]
sample = _ForensicSampleDoc(
raw=forensic_report["sample"],
headers=headers,
headers_only=forensic_report["sample_headers_only"],
date=sample_date,
subject=forensic_report["parsed_sample"]["subject"],
filename_safe_subject=parsed_sample["filename_safe_subject"],
body=forensic_report["parsed_sample"]["body"],
)
for address in forensic_report["parsed_sample"]["to"]:
sample.add_to(display_name=address["display_name"], address=address["address"])
for address in forensic_report["parsed_sample"]["reply_to"]:
sample.add_reply_to(
display_name=address["display_name"], address=address["address"]
)
for address in forensic_report["parsed_sample"]["cc"]:
sample.add_cc(display_name=address["display_name"], address=address["address"])
for address in forensic_report["parsed_sample"]["bcc"]:
sample.add_bcc(display_name=address["display_name"], address=address["address"])
for attachment in forensic_report["parsed_sample"]["attachments"]:
sample.add_attachment(
filename=attachment["filename"],
content_type=attachment["mail_content_type"],
sha256=attachment["sha256"],
)
try:
forensic_doc = _ForensicReportDoc(
feedback_type=forensic_report["feedback_type"],
user_agent=forensic_report["user_agent"],
version=forensic_report["version"],
original_mail_from=forensic_report["original_mail_from"],
arrival_date=arrival_date_epoch_milliseconds,
domain=forensic_report["reported_domain"],
original_envelope_id=forensic_report["original_envelope_id"],
authentication_results=forensic_report["authentication_results"],
delivery_results=forensic_report["delivery_result"],
source_ip_address=forensic_report["source"]["ip_address"],
source_country=forensic_report["source"]["country"],
source_reverse_dns=forensic_report["source"]["reverse_dns"],
source_base_domain=forensic_report["source"]["base_domain"],
authentication_mechanisms=forensic_report["authentication_mechanisms"],
auth_failure=forensic_report["auth_failure"],
dkim_domain=forensic_report["dkim_domain"],
original_rcpt_to=forensic_report["original_rcpt_to"],
sample=sample,
)
index = "dmarc_forensic"
if index_suffix:
index = "{0}_{1}".format(index, index_suffix)
if index_prefix:
index = "{0}{1}".format(index_prefix, index)
if monthly_indexes:
index_date = arrival_date.strftime("%Y-%m")
else:
index_date = arrival_date.strftime("%Y-%m-%d")
index = "{0}-{1}".format(index, index_date)
index_settings = dict(
number_of_shards=number_of_shards, number_of_replicas=number_of_replicas
)
create_indexes([index], index_settings)
forensic_doc.meta.index = index
try:
forensic_doc.save()
except Exception as e:
raise ElasticsearchError("Elasticsearch error: {0}".format(e.__str__()))
except KeyError as e:
raise InvalidForensicReport(
"Forensic report missing required field: {0}".format(e.__str__())
)
def save_smtp_tls_report_to_elasticsearch(
report: dict[str, Any],
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: Optional[int] = 1,
number_of_replicas: Optional[int] = 0,
):
"""
Saves a parsed SMTP TLS report to Elasticsearch
Args:
report (dict): A parsed SMTP TLS report
index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily indexes
number_of_shards (int): The number of shards to use in the index
number_of_replicas (int): The number of replicas to use in the index
Raises:
AlreadySaved
"""
logger.info("Saving smtp tls report to Elasticsearch")
org_name = report["organization_name"]
report_id = report["report_id"]
begin_date = human_timestamp_to_datetime(report["begin_date"], to_utc=True)
end_date = human_timestamp_to_datetime(report["end_date"], to_utc=True)
begin_date_human = begin_date.strftime("%Y-%m-%d %H:%M:%SZ")
end_date_human = end_date.strftime("%Y-%m-%d %H:%M:%SZ")
if monthly_indexes:
index_date = begin_date.strftime("%Y-%m")
else:
index_date = begin_date.strftime("%Y-%m-%d")
report["begin_date"] = begin_date
report["end_date"] = end_date
org_name_query = Q(dict(match_phrase=dict(org_name=org_name)))
report_id_query = Q(dict(match_phrase=dict(report_id=report_id)))
begin_date_query = Q(dict(match=dict(date_begin=begin_date)))
end_date_query = Q(dict(match=dict(date_end=end_date)))
if index_suffix is not None:
search_index = "smtp_tls_{0}*".format(index_suffix)
else:
search_index = "smtp_tls*"
if index_prefix is not None:
search_index = "{0}{1}".format(index_prefix, search_index)
search = Search(index=search_index)
query = org_name_query & report_id_query
query = query & begin_date_query & end_date_query
search.query = query
try:
existing = search.execute()
except Exception as error_:
raise ElasticsearchError(
"Elasticsearch's search for existing report \
error: {}".format(error_.__str__())
)
if len(existing) > 0:
raise AlreadySaved(
f"An SMTP TLS report ID {report_id} from "
f" {org_name} with a date range of "
f"{begin_date_human} UTC to "
f"{end_date_human} UTC already "
"exists in Elasticsearch"
)
index = "smtp_tls"
if index_suffix:
index = "{0}_{1}".format(index, index_suffix)
if index_prefix:
index = "{0}{1}".format(index_prefix, index)
index = "{0}-{1}".format(index, index_date)
index_settings = dict(
number_of_shards=number_of_shards, number_of_replicas=number_of_replicas
)
smtp_tls_doc = _SMTPTLSReportDoc(
org_name=report["organization_name"],
date_range=[report["begin_date"], report["end_date"]],
date_begin=report["begin_date"],
date_end=report["end_date"],
contact_info=report["contact_info"],
report_id=report["report_id"],
)
for policy in report["policies"]:
policy_strings = None
mx_host_patterns = None
if "policy_strings" in policy:
policy_strings = policy["policy_strings"]
if "mx_host_patterns" in policy:
mx_host_patterns = policy["mx_host_patterns"]
policy_doc = _SMTPTLSPolicyDoc(
policy_domain=policy["policy_domain"],
policy_type=policy["policy_type"],
successful_session_count=policy["successful_session_count"],
failed_session_count=policy["failed_session_count"],
policy_string=policy_strings,
mx_host_patterns=mx_host_patterns,
)
if "failure_details" in policy:
for failure_detail in policy["failure_details"]:
receiving_mx_hostname = None
additional_information_uri = None
failure_reason_code = None
ip_address = None
receiving_ip = None
receiving_mx_helo = None
sending_mta_ip = None
if "receiving_mx_hostname" in failure_detail:
receiving_mx_hostname = failure_detail["receiving_mx_hostname"]
if "additional_information_uri" in failure_detail:
additional_information_uri = failure_detail[
"additional_information_uri"
]
if "failure_reason_code" in failure_detail:
failure_reason_code = failure_detail["failure_reason_code"]
if "ip_address" in failure_detail:
ip_address = failure_detail["ip_address"]
if "receiving_ip" in failure_detail:
receiving_ip = failure_detail["receiving_ip"]
if "receiving_mx_helo" in failure_detail:
receiving_mx_helo = failure_detail["receiving_mx_helo"]
if "sending_mta_ip" in failure_detail:
sending_mta_ip = failure_detail["sending_mta_ip"]
policy_doc.add_failure_details(
result_type=failure_detail["result_type"],
ip_address=ip_address,
receiving_ip=receiving_ip,
receiving_mx_helo=receiving_mx_helo,
failed_session_count=failure_detail["failed_session_count"],
sending_mta_ip=sending_mta_ip,
receiving_mx_hostname=receiving_mx_hostname,
additional_information_uri=additional_information_uri,
failure_reason_code=failure_reason_code,
)
smtp_tls_doc.policies.append(policy_doc)
create_indexes([index], index_settings)
smtp_tls_doc.meta.index = index
try:
smtp_tls_doc.save()
except Exception as e:
raise ElasticsearchError("Elasticsearch error: {0}".format(e.__str__()))

75
parsedmarc/gelf.py Normal file

@@ -0,0 +1,75 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Any
import logging
import logging.handlers
import json
import threading
from parsedmarc import (
parsed_aggregate_reports_to_csv_rows,
parsed_forensic_reports_to_csv_rows,
parsed_smtp_tls_reports_to_csv_rows,
)
from pygelf import GelfTcpHandler, GelfUdpHandler, GelfTlsHandler
log_context_data = threading.local()
class ContextFilter(logging.Filter):
def filter(self, record):
record.parsedmarc = log_context_data.parsedmarc
return True
class GelfClient(object):
"""A client for the Graylog Extended Log Format"""
def __init__(self, host, port, mode):
"""
Initializes the GelfClient
Args:
host (str): The GELF host
port (int): The GELF port
mode (str): The GELF transport mode
"""
self.host = host
self.port = port
self.logger = logging.getLogger("parsedmarc_syslog")
self.logger.setLevel(logging.INFO)
self.logger.addFilter(ContextFilter())
self.gelf_mode = {
"udp": GelfUdpHandler,
"tcp": GelfTcpHandler,
"tls": GelfTlsHandler,
}
self.handler = self.gelf_mode[mode](
host=self.host, port=self.port, include_extra_fields=True
)
self.logger.addHandler(self.handler)
def save_aggregate_report_to_gelf(
self, aggregate_reports: list[dict[str, Any]]
):
rows = parsed_aggregate_reports_to_csv_rows(aggregate_reports)
for row in rows:
log_context_data.parsedmarc = row
self.logger.info("parsedmarc aggregate report")
log_context_data.parsedmarc = None
def save_forensic_report_to_gelf(
self, forensic_reports: list[dict[str, Any]]
):
rows = parsed_forensic_reports_to_csv_rows(forensic_reports)
for row in rows:
self.logger.info(json.dumps(row))
def save_smtp_tls_report_to_gelf(self, smtp_tls_reports: dict[str, Any]):
rows = parsed_smtp_tls_reports_to_csv_rows(smtp_tls_reports)
for row in rows:
self.logger.info(json.dumps(row))
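
A minimal sketch of how the GELF client above might be used; the Graylog hostname, port, and transport mode are placeholders, and the empty list stands in for parsed aggregate report dicts produced elsewhere by parsedmarc.

```python
from parsedmarc.gelf import GelfClient

# Placeholders; mode can be "udp", "tcp", or "tls" per the handler map above.
gelf_client = GelfClient(host="graylog.example.com", port=12201, mode="udp")

aggregate_reports = []  # replace with parsed aggregate report dicts
gelf_client.save_aggregate_report_to_gelf(aggregate_reports)
```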

210
parsedmarc/kafkaclient.py Normal file

@@ -0,0 +1,210 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Any, Optional, Union
from ssl import SSLContext
import json
from ssl import create_default_context
from kafka import KafkaProducer
from kafka.errors import NoBrokersAvailable, UnknownTopicOrPartitionError
from parsedmarc.utils import human_timestamp_to_datetime
from parsedmarc import __version__
from parsedmarc.log import logger
class KafkaError(RuntimeError):
"""Raised when a Kafka error occurs"""
class KafkaClient(object):
def __init__(
self,
kafka_hosts: list[str],
*,
ssl: Optional[bool] = False,
username: Optional[str] = None,
password: Optional[str] = None,
ssl_context: Optional[SSLContext] = None,
):
"""
Initializes the Kafka client
Args:
kafka_hosts (list): A list of Kafka hostnames
(with optional port numbers)
ssl (bool): Use a SSL/TLS connection
username (str): An optional username
password (str): An optional password
ssl_context (SSLContext): SSL context options
Notes:
``ssl=True`` is implied when a username or password is
supplied.
When using Azure Event Hubs, the username is literally
``$ConnectionString``, and the password is the
Azure Event Hub connection string.
"""
config = dict(
value_serializer=lambda v: json.dumps(v).encode("utf-8"),
bootstrap_servers=kafka_hosts,
client_id="parsedmarc-{0}".format(__version__),
)
if ssl or username or password:
config["security_protocol"] = "SSL"
config["ssl_context"] = ssl_context or create_default_context()
if username or password:
config["sasl_plain_username"] = username or ""
config["sasl_plain_password"] = password or ""
try:
self.producer = KafkaProducer(**config)
except NoBrokersAvailable:
raise KafkaError("No Kafka brokers available")
@staticmethod
def strip_metadata(report: dict[str, Any]):
"""
Duplicates org_name, org_email and report_id into JSON root
and removes report_metadata key to bring it more inline
with Elastic output.
"""
report["org_name"] = report["report_metadata"]["org_name"]
report["org_email"] = report["report_metadata"]["org_email"]
report["report_id"] = report["report_metadata"]["report_id"]
report.pop("report_metadata")
return report
@staticmethod
def generate_date_range(report: dict[str, Any]):
"""
Creates a date_range timestamp with the format YYYY-MM-DDTHH:MM:SS
based on begin and end dates for easier parsing in Kibana.
Move to utils to avoid duplication w/ elastic?
"""
metadata = report["report_metadata"]
begin_date = human_timestamp_to_datetime(metadata["begin_date"])
end_date = human_timestamp_to_datetime(metadata["end_date"])
begin_date_human = begin_date.strftime("%Y-%m-%dT%H:%M:%S")
end_date_human = end_date.strftime("%Y-%m-%dT%H:%M:%S")
date_range = [begin_date_human, end_date_human]
logger.debug("date_range is {}".format(date_range))
return date_range
def save_aggregate_reports_to_kafka(
self,
aggregate_reports: Union[dict[str, Any], list[dict[str, Any]]],
aggregate_topic: str,
):
"""
Saves aggregate DMARC reports to Kafka
Args:
aggregate_reports (list): A list of aggregate report dictionaries
to save to Kafka
aggregate_topic (str): The name of the Kafka topic
"""
if isinstance(aggregate_reports, dict):
aggregate_reports = [aggregate_reports]
if len(aggregate_reports) < 1:
return
for report in aggregate_reports:
report["date_range"] = self.generate_date_range(report)
report = self.strip_metadata(report)
for slice in report["records"]:
slice["date_range"] = report["date_range"]
slice["org_name"] = report["org_name"]
slice["org_email"] = report["org_email"]
slice["policy_published"] = report["policy_published"]
slice["report_id"] = report["report_id"]
logger.debug("Sending slice.")
try:
logger.debug("Saving aggregate report to Kafka")
self.producer.send(aggregate_topic, slice)
except UnknownTopicOrPartitionError:
raise KafkaError(
"Kafka error: Unknown topic or partition on broker"
)
except Exception as e:
raise KafkaError("Kafka error: {0}".format(e.__str__()))
try:
self.producer.flush()
except Exception as e:
raise KafkaError("Kafka error: {0}".format(e.__str__()))
def save_forensic_reports_to_kafka(
self,
forensic_reports: Union[dict[str, Any], list[dict[str, Any]]],
forensic_topic: str,
):
"""
Saves forensic DMARC reports to Kafka, sends individual
records (slices) since Kafka requires messages to be <= 1MB
by default.
Args:
forensic_reports (list): A list of forensic report dicts
to save to Kafka
forensic_topic (str): The name of the Kafka topic
"""
if isinstance(forensic_reports, dict):
forensic_reports = [forensic_reports]
if len(forensic_reports) < 1:
return
try:
logger.debug("Saving forensic reports to Kafka")
self.producer.send(forensic_topic, forensic_reports)
except UnknownTopicOrPartitionError:
raise KafkaError("Kafka error: Unknown topic or partition on broker")
except Exception as e:
raise KafkaError("Kafka error: {0}".format(e.__str__()))
try:
self.producer.flush()
except Exception as e:
raise KafkaError("Kafka error: {0}".format(e.__str__()))
def save_smtp_tls_reports_to_kafka(
self,
smtp_tls_reports: Union[list[dict[str, Any]], dict[str, Any]],
smtp_tls_topic: str,
):
"""
Saves SMTP TLS reports to Kafka, sends individual
records (slices) since Kafka requires messages to be <= 1MB
by default.
Args:
smtp_tls_reports (list): A list of SMTP TLS report dicts
to save to Kafka
smtp_tls_topic (str): The name of the Kafka topic
"""
if isinstance(smtp_tls_reports, dict):
smtp_tls_reports = [smtp_tls_reports]
if len(smtp_tls_reports) < 1:
return
try:
logger.debug("Saving forensic reports to Kafka")
self.producer.send(smtp_tls_topic, smtp_tls_reports)
except UnknownTopicOrPartitionError:
raise KafkaError("Kafka error: Unknown topic or partition on broker")
except Exception as e:
raise KafkaError("Kafka error: {0}".format(e.__str__()))
try:
self.producer.flush()
except Exception as e:
raise KafkaError("Kafka error: {0}".format(e.__str__()))

4
parsedmarc/log.py Normal file

@@ -0,0 +1,4 @@
import logging
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())

195
parsedmarc/loganalytics.py Normal file

@@ -0,0 +1,195 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Any
from parsedmarc.log import logger
from azure.core.exceptions import HttpResponseError
from azure.identity import ClientSecretCredential
from azure.monitor.ingestion import LogsIngestionClient
class LogAnalyticsException(Exception):
"""Raised when an Elasticsearch error occurs"""
class LogAnalyticsConfig:
"""
The LogAnalyticsConfig class is used to define the configuration
for the Log Analytics Client.
Properties:
client_id (str):
The client ID of the service principal.
client_secret (str):
The client secret of the service principal.
tenant_id (str):
The tenant ID where
the service principal resides.
dce (str):
The Data Collection Endpoint (DCE)
used by the Data Collection Rule (DCR).
dcr_immutable_id (str):
The immutable ID of
the Data Collection Rule (DCR).
dcr_aggregate_stream (str):
The Stream name where
the Aggregate DMARC reports
need to be pushed.
dcr_forensic_stream (str):
The Stream name where
the Forensic DMARC reports
need to be pushed.
dcr_smtp_tls_stream (str):
The Stream name where
the SMTP TLS Reports
need to be pushed.
"""
def __init__(
self,
client_id: str,
client_secret: str,
tenant_id: str,
dce: str,
dcr_immutable_id: str,
dcr_aggregate_stream: str,
dcr_forensic_stream: str,
dcr_smtp_tls_stream: str,
):
self.client_id = client_id
self.client_secret = client_secret
self.tenant_id = tenant_id
self.dce = dce
self.dcr_immutable_id = dcr_immutable_id
self.dcr_aggregate_stream = dcr_aggregate_stream
self.dcr_forensic_stream = dcr_forensic_stream
self.dcr_smtp_tls_stream = dcr_smtp_tls_stream
class LogAnalyticsClient(object):
"""
The LogAnalyticsClient is used to push
the generated DMARC reports to Log Analytics
via Data Collection Rules.
"""
def __init__(
self,
client_id: str,
client_secret: str,
tenant_id: str,
dce: str,
dcr_immutable_id: str,
dcr_aggregate_stream: str,
dcr_forensic_stream: str,
dcr_smtp_tls_stream: str,
):
self.conf = LogAnalyticsConfig(
client_id=client_id,
client_secret=client_secret,
tenant_id=tenant_id,
dce=dce,
dcr_immutable_id=dcr_immutable_id,
dcr_aggregate_stream=dcr_aggregate_stream,
dcr_forensic_stream=dcr_forensic_stream,
dcr_smtp_tls_stream=dcr_smtp_tls_stream,
)
if (
not self.conf.client_id
or not self.conf.client_secret
or not self.conf.tenant_id
or not self.conf.dce
or not self.conf.dcr_immutable_id
):
raise LogAnalyticsException(
"Invalid configuration. " + "One or more required settings are missing."
)
def publish_json(
self,
results,
logs_client: LogsIngestionClient,
dcr_stream: str,
):
"""
Background function to publish given
DMARC report to specific Data Collection Rule.
Args:
results (list):
The results generated by parsedmarc.
logs_client (LogsIngestionClient):
The client used to send the DMARC reports.
dcr_stream (str):
The stream name where the DMARC reports needs to be pushed.
"""
try:
logs_client.upload(self.conf.dcr_immutable_id, dcr_stream, results)
except HttpResponseError as e:
raise LogAnalyticsException("Upload failed: {error}".format(error=e))
def publish_results(
self,
results: dict[str, dict[str, Any]],
save_aggregate: bool,
save_forensic: bool,
save_smtp_tls: bool,
):
"""
Function to publish DMARC and/or SMTP TLS reports to Log Analytics
via Data Collection Rules (DCR).
Look below for docs:
https://learn.microsoft.com/en-us/azure/azure-monitor/logs/logs-ingestion-api-overview
Args:
results (dict):
The parsed DMARC and SMTP TLS reports
save_aggregate (bool):
Whether Aggregate reports can be saved into Log Analytics
save_forensic (bool):
Whether Forensic reports can be saved into Log Analytics
save_smtp_tls (bool):
Whether SMTP TLS reports can be saved into Log Analytics
"""
conf = self.conf
credential = ClientSecretCredential(
tenant_id=conf.tenant_id,
client_id=conf.client_id,
client_secret=conf.client_secret,
)
logs_client = LogsIngestionClient(conf.dce, credential=credential)
if (
results["aggregate_reports"]
and conf.dcr_aggregate_stream
and len(results["aggregate_reports"]) > 0
and save_aggregate
):
logger.info("Publishing aggregate reports.")
self.publish_json(
results["aggregate_reports"], logs_client, conf.dcr_aggregate_stream
)
logger.info("Successfully pushed aggregate reports.")
if (
results["forensic_reports"]
and conf.dcr_forensic_stream
and len(results["forensic_reports"]) > 0
and save_forensic
):
logger.info("Publishing forensic reports.")
self.publish_json(
results["forensic_reports"], logs_client, conf.dcr_forensic_stream
)
logger.info("Successfully pushed forensic reports.")
if (
results["smtp_tls_reports"]
and conf.dcr_smtp_tls_stream
and len(results["smtp_tls_reports"]) > 0
and save_smtp_tls
):
logger.info("Publishing SMTP TLS reports.")
self.publish_json(
results["smtp_tls_reports"], logs_client, conf.dcr_smtp_tls_stream
)
logger.info("Successfully pushed SMTP TLS reports.")

13
parsedmarc/mail/__init__.py Normal file

@@ -0,0 +1,13 @@
from parsedmarc.mail.mailbox_connection import MailboxConnection
from parsedmarc.mail.graph import MSGraphConnection
from parsedmarc.mail.gmail import GmailConnection
from parsedmarc.mail.imap import IMAPConnection
from parsedmarc.mail.maildir import MaildirConnection
__all__ = [
"MailboxConnection",
"MSGraphConnection",
"GmailConnection",
"IMAPConnection",
"MaildirConnection",
]

159
parsedmarc/mail/gmail.py Normal file

@@ -0,0 +1,159 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from base64 import urlsafe_b64decode
from functools import lru_cache
from pathlib import Path
from time import sleep
from typing import List
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
from parsedmarc.log import logger
from parsedmarc.mail.mailbox_connection import MailboxConnection
def _get_creds(token_file, credentials_file, scopes, oauth2_port):
creds = None
if Path(token_file).exists():
creds = Credentials.from_authorized_user_file(token_file, scopes)
# If there are no (valid) credentials available, let the user log in.
if not creds or not creds.valid:
if creds and creds.expired and creds.refresh_token:
creds.refresh(Request())
else:
flow = InstalledAppFlow.from_client_secrets_file(credentials_file, scopes)
creds = flow.run_local_server(open_browser=False, oauth2_port=oauth2_port)
# Save the credentials for the next run
with Path(token_file).open("w") as token:
token.write(creds.to_json())
return creds
class GmailConnection(MailboxConnection):
def __init__(
self,
token_file: str,
credentials_file: str,
scopes: List[str],
include_spam_trash: bool,
reports_folder: str,
oauth2_port: int,
paginate_messages: bool,
):
creds = _get_creds(token_file, credentials_file, scopes, oauth2_port)
self.service = build("gmail", "v1", credentials=creds)
self.include_spam_trash = include_spam_trash
self.reports_label_id = self._find_label_id_for_label(reports_folder)
self.paginate_messages = paginate_messages
def create_folder(self, folder_name: str):
# Gmail doesn't support the name Archive
if folder_name == "Archive":
return
logger.debug(f"Creating label {folder_name}")
request_body = {"name": folder_name, "messageListVisibility": "show"}
try:
self.service.users().labels().create(
userId="me", body=request_body
).execute()
except HttpError as e:
if e.status_code == 409:
logger.debug(f"Folder {folder_name} already exists, skipping creation")
else:
raise e
def _fetch_all_message_ids(self, reports_label_id, page_token=None, since=None):
if since:
results = (
self.service.users()
.messages()
.list(
userId="me",
includeSpamTrash=self.include_spam_trash,
labelIds=[reports_label_id],
pageToken=page_token,
q=f"after:{since}",
)
.execute()
)
else:
results = (
self.service.users()
.messages()
.list(
userId="me",
includeSpamTrash=self.include_spam_trash,
labelIds=[reports_label_id],
pageToken=page_token,
)
.execute()
)
messages = results.get("messages", [])
for message in messages:
yield message["id"]
if "nextPageToken" in results and self.paginate_messages:
yield from self._fetch_all_message_ids(
reports_label_id, results["nextPageToken"]
)
def fetch_messages(self, reports_folder: str, **kwargs) -> List[str]:
reports_label_id = self._find_label_id_for_label(reports_folder)
since = kwargs.get("since")
if since:
return [
id for id in self._fetch_all_message_ids(reports_label_id, since=since)
]
else:
return [id for id in self._fetch_all_message_ids(reports_label_id)]
def fetch_message(self, message_id):
msg = (
self.service.users()
.messages()
.get(userId="me", id=message_id, format="raw")
.execute()
)
return urlsafe_b64decode(msg["raw"])
def delete_message(self, message_id: str):
self.service.users().messages().delete(userId="me", id=message_id).execute()
def move_message(self, message_id: str, folder_name: str):
label_id = self._find_label_id_for_label(folder_name)
logger.debug(f"Moving message UID {message_id} to {folder_name}")
request_body = {
"addLabelIds": [label_id],
"removeLabelIds": [self.reports_label_id],
}
self.service.users().messages().modify(
userId="me", id=message_id, body=request_body
).execute()
def keepalive(self):
# Not needed
pass
def watch(self, check_callback, check_timeout):
"""Checks the mailbox for new messages every n seconds"""
while True:
sleep(check_timeout)
check_callback(self)
@lru_cache(maxsize=10)
def _find_label_id_for_label(self, label_name: str) -> str:
results = self.service.users().labels().list(userId="me").execute()
labels = results.get("labels", [])
for label in labels:
if label_name == label["id"] or label_name == label["name"]:
return label["id"]
return ""

269
parsedmarc/mail/graph.py Normal file

@@ -0,0 +1,269 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from enum import Enum
from functools import lru_cache
from pathlib import Path
from time import sleep
from typing import List, Optional
from azure.identity import (
UsernamePasswordCredential,
DeviceCodeCredential,
ClientSecretCredential,
TokenCachePersistenceOptions,
AuthenticationRecord,
)
from msgraph.core import GraphClient
from parsedmarc.log import logger
from parsedmarc.mail.mailbox_connection import MailboxConnection
class AuthMethod(Enum):
DeviceCode = 1
UsernamePassword = 2
ClientSecret = 3
def _get_cache_args(token_path: Path, allow_unencrypted_storage):
cache_args = {
"cache_persistence_options": TokenCachePersistenceOptions(
name="parsedmarc", allow_unencrypted_storage=allow_unencrypted_storage
)
}
auth_record = _load_token(token_path)
if auth_record:
cache_args["authentication_record"] = AuthenticationRecord.deserialize(
auth_record
)
return cache_args
def _load_token(token_path: Path) -> Optional[str]:
if not token_path.exists():
return None
with token_path.open() as token_file:
return token_file.read()
def _cache_auth_record(record: AuthenticationRecord, token_path: Path):
token = record.serialize()
with token_path.open("w") as token_file:
token_file.write(token)
def _generate_credential(auth_method: str, token_path: Path, **kwargs):
if auth_method == AuthMethod.DeviceCode.name:
credential = DeviceCodeCredential(
client_id=kwargs["client_id"],
disable_automatic_authentication=True,
tenant_id=kwargs["tenant_id"],
**_get_cache_args(
token_path,
allow_unencrypted_storage=kwargs["allow_unencrypted_storage"],
),
)
elif auth_method == AuthMethod.UsernamePassword.name:
credential = UsernamePasswordCredential(
client_id=kwargs["client_id"],
client_credential=kwargs["client_secret"],
disable_automatic_authentication=True,
username=kwargs["username"],
password=kwargs["password"],
**_get_cache_args(
token_path,
allow_unencrypted_storage=kwargs["allow_unencrypted_storage"],
),
)
elif auth_method == AuthMethod.ClientSecret.name:
credential = ClientSecretCredential(
client_id=kwargs["client_id"],
tenant_id=kwargs["tenant_id"],
client_secret=kwargs["client_secret"],
)
else:
raise RuntimeError(f"Auth method {auth_method} not found")
return credential
class MSGraphConnection(MailboxConnection):
def __init__(
self,
auth_method: str,
mailbox: str,
graph_url: str,
client_id: str,
client_secret: str,
username: str,
password: str,
tenant_id: str,
token_file: str,
allow_unencrypted_storage: bool,
):
token_path = Path(token_file)
credential = _generate_credential(
auth_method,
client_id=client_id,
client_secret=client_secret,
username=username,
password=password,
tenant_id=tenant_id,
token_path=token_path,
allow_unencrypted_storage=allow_unencrypted_storage,
)
client_params = {
"credential": credential,
"cloud": graph_url,
}
if not isinstance(credential, ClientSecretCredential):
scopes = ["Mail.ReadWrite"]
# Detect if mailbox is shared
if mailbox and username != mailbox:
scopes = ["Mail.ReadWrite.Shared"]
auth_record = credential.authenticate(scopes=scopes)
_cache_auth_record(auth_record, token_path)
client_params["scopes"] = scopes
self._client = GraphClient(**client_params)
self.mailbox_name = mailbox
def create_folder(self, folder_name: str):
sub_url = ""
path_parts = folder_name.split("/")
if len(path_parts) > 1: # Folder is a subFolder
parent_folder_id = None
for folder in path_parts[:-1]:
parent_folder_id = self._find_folder_id_with_parent(
folder, parent_folder_id
)
sub_url = f"/{parent_folder_id}/childFolders"
folder_name = path_parts[-1]
request_body = {"displayName": folder_name}
request_url = f"/users/{self.mailbox_name}/mailFolders{sub_url}"
resp = self._client.post(request_url, json=request_body)
if resp.status_code == 409:
logger.debug(f"Folder {folder_name} already exists, skipping creation")
elif resp.status_code == 201:
logger.debug(f"Created folder {folder_name}")
else:
logger.warning(f"Unknown response {resp.status_code} {resp.json()}")
def fetch_messages(self, folder_name: str, **kwargs) -> List[str]:
"""Returns a list of message UIDs in the specified folder"""
folder_id = self._find_folder_id_from_folder_path(folder_name)
url = f"/users/{self.mailbox_name}/mailFolders/{folder_id}/messages"
since = kwargs.get("since")
if not since:
since = None
batch_size = kwargs.get("batch_size")
if not batch_size:
batch_size = 0
emails = self._get_all_messages(url, batch_size, since)
return [email["id"] for email in emails]
def _get_all_messages(self, url, batch_size, since):
messages: list
params = {"$select": "id"}
if since:
params["$filter"] = f"receivedDateTime ge {since}"
if batch_size and batch_size > 0:
params["$top"] = batch_size
else:
params["$top"] = 100
result = self._client.get(url, params=params)
if result.status_code != 200:
raise RuntimeError(f"Failed to fetch messages {result.text}")
messages = result.json()["value"]
# Loop if next page is present and not obtained message limit.
while "@odata.nextLink" in result.json() and (
since is not None or (batch_size == 0 or batch_size - len(messages) > 0)
):
result = self._client.get(result.json()["@odata.nextLink"])
if result.status_code != 200:
raise RuntimeError(f"Failed to fetch messages {result.text}")
messages.extend(result.json()["value"])
return messages
def mark_message_read(self, message_id: str):
"""Marks a message as read"""
url = f"/users/{self.mailbox_name}/messages/{message_id}"
resp = self._client.patch(url, json={"isRead": "true"})
if resp.status_code != 200:
raise RuntimeWarning(
f"Failed to mark message read{resp.status_code}: {resp.json()}"
)
def fetch_message(self, message_id: str, **kwargs):
url = f"/users/{self.mailbox_name}/messages/{message_id}/$value"
result = self._client.get(url)
if result.status_code != 200:
raise RuntimeWarning(
f"Failed to fetch message{result.status_code}: {result.json()}"
)
mark_read = kwargs.get("mark_read")
if mark_read:
self.mark_message_read(message_id)
return result.text
def delete_message(self, message_id: str):
url = f"/users/{self.mailbox_name}/messages/{message_id}"
resp = self._client.delete(url)
if resp.status_code != 204:
raise RuntimeWarning(
f"Failed to delete message {resp.status_code}: {resp.json()}"
)
def move_message(self, message_id: str, folder_name: str):
folder_id = self._find_folder_id_from_folder_path(folder_name)
request_body = {"destinationId": folder_id}
url = f"/users/{self.mailbox_name}/messages/{message_id}/move"
resp = self._client.post(url, json=request_body)
if resp.status_code != 201:
raise RuntimeWarning(
f"Failed to move message {resp.status_code}: {resp.json()}"
)
def keepalive(self):
# Not needed
pass
def watch(self, check_callback, check_timeout):
"""Checks the mailbox for new messages every n seconds"""
while True:
sleep(check_timeout)
check_callback(self)
@lru_cache(maxsize=10)
def _find_folder_id_from_folder_path(self, folder_name: str) -> str:
path_parts = folder_name.split("/")
parent_folder_id = None
if len(path_parts) > 1:
for folder in path_parts[:-1]:
folder_id = self._find_folder_id_with_parent(folder, parent_folder_id)
parent_folder_id = folder_id
return self._find_folder_id_with_parent(path_parts[-1], parent_folder_id)
else:
return self._find_folder_id_with_parent(folder_name, None)
def _find_folder_id_with_parent(
self, folder_name: str, parent_folder_id: Optional[str]
):
sub_url = ""
if parent_folder_id is not None:
sub_url = f"/{parent_folder_id}/childFolders"
url = f"/users/{self.mailbox_name}/mailFolders{sub_url}"
filter = f"?$filter=displayName eq '{folder_name}'"
folders_resp = self._client.get(url + filter)
if folders_resp.status_code != 200:
raise RuntimeWarning(f"Failed to list folders.{folders_resp.json()}")
folders: list = folders_resp.json()["value"]
matched_folders = [
folder for folder in folders if folder["displayName"] == folder_name
]
if len(matched_folders) == 0:
raise RuntimeError(f"folder {folder_name} not found")
selected_folder = matched_folders[0]
return selected_folder["id"]
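
A sketch of connecting with the client-secret flow handled above; the tenant, application, and mailbox values are placeholders, and username/password are left empty because ClientSecretCredential does not use them.

```python
from parsedmarc.mail import MSGraphConnection

connection = MSGraphConnection(
    auth_method="ClientSecret",
    mailbox="dmarc@example.com",
    graph_url="https://graph.microsoft.com",
    client_id="00000000-0000-0000-0000-000000000000",
    client_secret="placeholder-secret",
    username="",
    password="",
    tenant_id="00000000-0000-0000-0000-000000000000",
    token_file=".token",
    allow_unencrypted_storage=False,
)
message_ids = connection.fetch_messages("Inbox", batch_size=10)
```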

96
parsedmarc/mail/imap.py Normal file

@@ -0,0 +1,96 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Optional
from time import sleep
from imapclient.exceptions import IMAPClientError
from mailsuite.imap import IMAPClient
from socket import timeout
from parsedmarc.log import logger
from parsedmarc.mail.mailbox_connection import MailboxConnection
class IMAPConnection(MailboxConnection):
def __init__(
self,
host: Optional[str] = None,
*,
user: Optional[str] = None,
password: Optional[str] = None,
port: Optional[int] = None,
ssl: Optional[bool] = True,
verify: Optional[bool] = True,
timeout: Optional[int] = 30,
max_retries: Optional[int] = 4,
):
self._username = user
self._password = password
self._verify = verify
self._client = IMAPClient(
host,
user,
password,
port=port,
ssl=ssl,
verify=verify,
timeout=timeout,
max_retries=max_retries,
)
def create_folder(self, folder_name: str):
self._client.create_folder(folder_name)
def fetch_messages(self, reports_folder: str, **kwargs):
self._client.select_folder(reports_folder)
since = kwargs.get("since")
if since:
return self._client.search(["SINCE", since])
else:
return self._client.search()
def fetch_message(self, message_id: int):
return self._client.fetch_message(message_id, parse=False)
def delete_message(self, message_id: int):
self._client.delete_messages([message_id])
def move_message(self, message_id: int, folder_name: str):
self._client.move_messages([message_id], folder_name)
def keepalive(self):
self._client.noop()
def watch(self, check_callback, check_timeout):
"""
Use an IDLE IMAP connection to parse incoming emails,
and pass the results to a callback function
"""
# IDLE callback sends IMAPClient object,
# send back the imap connection object instead
def idle_callback_wrapper(client: IMAPClient):
self._client = client
check_callback(self)
while True:
try:
IMAPClient(
host=self._client.host,
username=self._username,
password=self._password,
port=self._client.port,
ssl=self._client.ssl,
verify=self._verify,
idle_callback=idle_callback_wrapper,
idle_timeout=check_timeout,
)
except (timeout, IMAPClientError):
logger.warning("IMAP connection timeout. Reconnecting...")
sleep(check_timeout)
except Exception as e:
logger.warning("IMAP connection error. {0}. Reconnecting...".format(e))
sleep(check_timeout)
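
A minimal sketch for the IMAP connection above; the hostname and credentials are placeholders, and everything after the host is keyword-only per the signature.

```python
from parsedmarc.mail import IMAPConnection

connection = IMAPConnection(
    "imap.example.com",
    user="dmarc@example.com",
    password="placeholder-password",
    ssl=True,
)
for message_id in connection.fetch_messages("INBOX"):
    raw_message = connection.fetch_message(message_id)
```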

32
parsedmarc/mail/mailbox_connection.py Normal file

@@ -0,0 +1,32 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from abc import ABC
class MailboxConnection(ABC):
"""
Interface for a mailbox connection
"""
def create_folder(self, folder_name: str):
raise NotImplementedError
def fetch_messages(self, reports_folder: str, **kwargs) -> list[str]:
raise NotImplementedError
def fetch_message(self, message_id) -> str:
raise NotImplementedError
def delete_message(self, message_id: str):
raise NotImplementedError
def move_message(self, message_id: str, folder_name: str):
raise NotImplementedError
def keepalive(self):
raise NotImplementedError
def watch(self, check_callback, check_timeout):
raise NotImplementedError
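
The interface is small enough that a custom backend is easy to sketch. The class below is hypothetical and not part of parsedmarc: it treats a local directory tree of .eml files as a mailbox, with message IDs being paths relative to the root.

```python
from pathlib import Path
from time import sleep

from parsedmarc.mail.mailbox_connection import MailboxConnection


class DirectoryConnection(MailboxConnection):
    """Hypothetical backend serving .eml files from a local directory tree."""

    def __init__(self, root: str):
        self._root = Path(root)

    def create_folder(self, folder_name: str):
        (self._root / folder_name).mkdir(parents=True, exist_ok=True)

    def fetch_messages(self, reports_folder: str, **kwargs) -> list[str]:
        folder = self._root / reports_folder
        return [str(p.relative_to(self._root)) for p in folder.glob("*.eml")]

    def fetch_message(self, message_id) -> str:
        return (self._root / message_id).read_text()

    def delete_message(self, message_id: str):
        (self._root / message_id).unlink()

    def move_message(self, message_id: str, folder_name: str):
        target = self._root / folder_name / Path(message_id).name
        target.parent.mkdir(parents=True, exist_ok=True)
        (self._root / message_id).rename(target)

    def keepalive(self):
        pass  # nothing to keep alive for local files

    def watch(self, check_callback, check_timeout):
        while True:
            check_callback(self)
            sleep(check_timeout)
```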

69
parsedmarc/mail/maildir.py Normal file

@@ -0,0 +1,69 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Optional
from time import sleep
from parsedmarc.log import logger
from parsedmarc.mail.mailbox_connection import MailboxConnection
import mailbox
import os
class MaildirConnection(MailboxConnection):
def __init__(
self,
maildir_path: Optional[str] = None,
maildir_create: Optional[bool] = False,
):
self._maildir_path = maildir_path
self._maildir_create = maildir_create
maildir_owner = os.stat(maildir_path).st_uid
if os.getuid() != maildir_owner:
if os.getuid() == 0:
logger.warning(
"Switching uid to {} to access Maildir".format(maildir_owner)
)
os.setuid(maildir_owner)
else:
ex = "runtime uid {} differ from maildir {} owner {}".format(
os.getuid(), maildir_path, maildir_owner
)
raise Exception(ex)
self._client = mailbox.Maildir(maildir_path, create=maildir_create)
self._subfolder_client = {}
def create_folder(self, folder_name: str):
self._subfolder_client[folder_name] = self._client.add_folder(folder_name)
self._client.add_folder(folder_name)
def fetch_messages(self, reports_folder: str, **kwargs):
return self._client.keys()
def fetch_message(self, message_id: str):
return self._client.get(message_id).as_string()
def delete_message(self, message_id: str):
self._client.remove(message_id)
def move_message(self, message_id: str, folder_name: str):
message_data = self._client.get(message_id)
if folder_name not in self._subfolder_client.keys():
self._subfolder_client[folder_name] = mailbox.Maildir(
os.path.join(self._maildir_path, folder_name), create=self._maildir_create
)
self._subfolder_client[folder_name].add(message_data)
self._client.remove(message_id)
def keepalive(self):
return
def watch(self, check_callback, check_timeout):
while True:
try:
check_callback(self)
except Exception as e:
logger.warning("Maildir init error. {0}".format(e))
sleep(check_timeout)

855
parsedmarc/opensearch.py Normal file

@@ -0,0 +1,855 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Optional, Union, Any
from opensearchpy import (
Q,
connections,
Object,
Document,
Index,
Nested,
InnerDoc,
Integer,
Text,
Boolean,
Ip,
Date,
Search,
)
from opensearchpy.helpers import reindex
from parsedmarc.log import logger
from parsedmarc.utils import human_timestamp_to_datetime
from parsedmarc import InvalidForensicReport
class OpenSearchError(Exception):
"""Raised when an OpenSearch error occurs"""
class _PolicyOverride(InnerDoc):
type = Text()
comment = Text()
class _PublishedPolicy(InnerDoc):
domain = Text()
adkim = Text()
aspf = Text()
p = Text()
sp = Text()
pct = Integer()
fo = Text()
class _DKIMResult(InnerDoc):
domain = Text()
selector = Text()
result = Text()
class _SPFResult(InnerDoc):
domain = Text()
scope = Text()
results = Text()
class _AggregateReportDoc(Document):
class Index:
name = "dmarc_aggregate"
xml_schema = Text()
org_name = Text()
org_email = Text()
org_extra_contact_info = Text()
report_id = Text()
date_range = Date()
date_begin = Date()
date_end = Date()
normalized_timespan = Boolean()
original_timespan_seconds = Integer()
errors = Text()
published_policy = Object(_PublishedPolicy)
source_ip_address = Ip()
source_country = Text()
source_reverse_dns = Text()
source_base_domain = Text()
source_type = Text()
source_name = Text()
message_count = Integer()
disposition = Text()
dkim_aligned = Boolean()
spf_aligned = Boolean()
passed_dmarc = Boolean()
policy_overrides = Nested(_PolicyOverride)
header_from = Text()
envelope_from = Text()
envelope_to = Text()
dkim_results = Nested(_DKIMResult)
spf_results = Nested(_SPFResult)
def add_policy_override(self, type_: str, comment: str):
self.policy_overrides.append(_PolicyOverride(type=type_, comment=comment))
def add_dkim_result(self, domain: str, selector: str, result: _DKIMResult):
self.dkim_results.append(
_DKIMResult(domain=domain, selector=selector, result=result)
)
def add_spf_result(self, domain: str, scope: str, result: _SPFResult):
self.spf_results.append(_SPFResult(domain=domain, scope=scope, result=result))
def save(self, **kwargs):
self.passed_dmarc = False
self.passed_dmarc = self.spf_aligned or self.dkim_aligned
return super().save(**kwargs)
class _EmailAddressDoc(InnerDoc):
display_name = Text()
address = Text()
class _EmailAttachmentDoc(Document):
filename = Text()
content_type = Text()
sha256 = Text()
class _ForensicSampleDoc(InnerDoc):
raw = Text()
headers = Object()
headers_only = Boolean()
to = Nested(_EmailAddressDoc)
subject = Text()
filename_safe_subject = Text()
_from = Object(_EmailAddressDoc)
date = Date()
reply_to = Nested(_EmailAddressDoc)
cc = Nested(_EmailAddressDoc)
bcc = Nested(_EmailAddressDoc)
body = Text()
attachments = Nested(_EmailAttachmentDoc)
def add_to(self, display_name: str, address: str):
self.to.append(_EmailAddressDoc(display_name=display_name, address=address))
def add_reply_to(self, display_name: str, address: str):
self.reply_to.append(
_EmailAddressDoc(display_name=display_name, address=address)
)
def add_cc(self, display_name: str, address: str):
self.cc.append(_EmailAddressDoc(display_name=display_name, address=address))
def add_bcc(self, display_name: str, address: str):
self.bcc.append(_EmailAddressDoc(display_name=display_name, address=address))
def add_attachment(self, filename: str, content_type: str, sha256: str):
self.attachments.append(
_EmailAttachmentDoc(
filename=filename, content_type=content_type, sha256=sha256
)
)
class _ForensicReportDoc(Document):
class Index:
name = "dmarc_forensic"
feedback_type = Text()
user_agent = Text()
version = Text()
original_mail_from = Text()
arrival_date = Date()
domain = Text()
original_envelope_id = Text()
authentication_results = Text()
delivery_results = Text()
source_ip_address = Ip()
source_country = Text()
source_reverse_dns = Text()
source_authentication_mechanisms = Text()
source_auth_failures = Text()
dkim_domain = Text()
original_rcpt_to = Text()
sample = Object(_ForensicSampleDoc)
class _SMTPTLSFailureDetailsDoc(InnerDoc):
result_type = Text()
sending_mta_ip = Ip()
receiving_mx_helo = Text()
receiving_ip = Ip()
failed_session_count = Integer()
additional_information_uri = Text()
failure_reason_code = Text()
class _SMTPTLSPolicyDoc(InnerDoc):
policy_domain = Text()
policy_type = Text()
policy_strings = Text()
mx_host_patterns = Text()
successful_session_count = Integer()
failed_session_count = Integer()
failure_details = Nested(_SMTPTLSFailureDetailsDoc)
def add_failure_details(
self,
result_type: Optional[str] = None,
ip_address: Optional[str] = None,
receiving_ip: Optional[str] = None,
receiving_mx_helo: Optional[str] = None,
failed_session_count: Optional[int] = None,
sending_mta_ip: Optional[str] = None,
receiving_mx_hostname: Optional[str] = None,
additional_information_uri: Optional[str] = None,
failure_reason_code: Union[str, int, None] = None,
):
_details = _SMTPTLSFailureDetailsDoc(
result_type=result_type,
ip_address=ip_address,
sending_mta_ip=sending_mta_ip,
receiving_mx_hostname=receiving_mx_hostname,
receiving_mx_helo=receiving_mx_helo,
receiving_ip=receiving_ip,
failed_session_count=failed_session_count,
additional_information_uri=additional_information_uri,
failure_reason_code=failure_reason_code,
)
self.failure_details.append(_details)
class _SMTPTLSReportDoc(Document):
class Index:
name = "smtp_tls"
organization_name = Text()
date_range = Date()
date_begin = Date()
date_end = Date()
contact_info = Text()
report_id = Text()
policies = Nested(_SMTPTLSPolicyDoc)
def add_policy(
self,
policy_type: str,
policy_domain: str,
successful_session_count: int,
failed_session_count: int,
*,
policy_string: Optional[str] = None,
mx_host_patterns: Optional[list[str]] = None,
failure_details: Optional[str] = None,
):
self.policies.append(
_SMTPTLSPolicyDoc(
policy_type=policy_type,
policy_domain=policy_domain,
successful_session_count=successful_session_count,
failed_session_count=failed_session_count,
policy_string=policy_string,
mx_host_patterns=mx_host_patterns,
failure_details=failure_details,
)
)
class AlreadySaved(ValueError):
"""Raised when a report to be saved matches an existing report"""
def set_hosts(
hosts: Union[str, list[str]],
*,
use_ssl: Optional[bool] = False,
ssl_cert_path: Optional[str] = None,
username: Optional[str] = None,
password: Optional[str] = None,
api_key: Optional[str] = None,
timeout: Optional[float] = 60.0,
):
"""
Sets the OpenSearch hosts to use
Args:
hosts (str|list[str]): A single hostname or URL, or list of hostnames or URLs
use_ssl (bool): Use an HTTPS connection to the server
ssl_cert_path (str): Path to the certificate chain
username (str): The username to use for authentication
password (str): The password to use for authentication
api_key (str): The Base64 encoded API key to use for authentication
timeout (float): Timeout in seconds
"""
if not isinstance(hosts, list):
hosts = [hosts]
conn_params = {"hosts": hosts, "timeout": timeout}
if use_ssl:
conn_params["use_ssl"] = True
if ssl_cert_path:
conn_params["verify_certs"] = True
conn_params["ca_certs"] = ssl_cert_path
else:
conn_params["verify_certs"] = False
if username and password:
conn_params["http_auth"] = username + ":" + password
if api_key:
conn_params["api_key"] = api_key
connections.create_connection(**conn_params)
def create_indexes(names: list[str], settings: Optional[dict[str, Any]] = None):
"""
Create OpenSearch indexes
Args:
names (list): A list of index names
settings (dict): Index settings
"""
for name in names:
index = Index(name)
try:
if not index.exists():
logger.debug("Creating OpenSearch index: {0}".format(name))
if settings is None:
index.settings(number_of_shards=1, number_of_replicas=0)
else:
index.settings(**settings)
index.create()
except Exception as e:
raise OpenSearchError("OpenSearch error: {0}".format(e.__str__()))
def migrate_indexes(
aggregate_indexes: Optional[list[str]] = None,
forensic_indexes: Optional[list[str]] = None,
):
"""
Updates index mappings
Args:
aggregate_indexes (list): A list of aggregate index names
forensic_indexes (list): A list of forensic index names
"""
version = 2
if aggregate_indexes is None:
aggregate_indexes = []
if forensic_indexes is None:
forensic_indexes = []
for aggregate_index_name in aggregate_indexes:
if not Index(aggregate_index_name).exists():
continue
aggregate_index = Index(aggregate_index_name)
doc = "doc"
fo_field = "published_policy.fo"
fo = "fo"
fo_mapping = aggregate_index.get_field_mapping(fields=[fo_field])
fo_mapping = fo_mapping[list(fo_mapping.keys())[0]]["mappings"]
if doc not in fo_mapping:
continue
fo_mapping = fo_mapping[doc][fo_field]["mapping"][fo]
fo_type = fo_mapping["type"]
if fo_type == "long":
new_index_name = "{0}-v{1}".format(aggregate_index_name, version)
body = {
"properties": {
"published_policy.fo": {
"type": "text",
"fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
}
}
}
Index(new_index_name).create()
Index(new_index_name).put_mapping(doc_type=doc, body=body)
reindex(connections.get_connection(), aggregate_index_name, new_index_name)
Index(aggregate_index_name).delete()
for forensic_index in forensic_indexes:
pass
def save_aggregate_report_to_opensearch(
aggregate_report: dict[str, Any],
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: Optional[int] = 1,
number_of_replicas: Optional[int] = 0,
):
"""
Saves a parsed DMARC aggregate report to OpenSearch
Args:
aggregate_report (dict): A parsed aggregate report
index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily indexes
number_of_shards (int): The number of shards to use in the index
number_of_replicas (int): The number of replicas to use in the index
Raises:
AlreadySaved
"""
logger.info("Saving aggregate report to OpenSearch")
aggregate_report = aggregate_report.copy()
metadata = aggregate_report["report_metadata"]
org_name = metadata["org_name"]
report_id = metadata["report_id"]
domain = aggregate_report["policy_published"]["domain"]
begin_date = human_timestamp_to_datetime(metadata["begin_date"], to_utc=True)
end_date = human_timestamp_to_datetime(metadata["end_date"], to_utc=True)
if monthly_indexes:
index_date = begin_date.strftime("%Y-%m")
else:
index_date = begin_date.strftime("%Y-%m-%d")
org_name_query = Q(dict(match_phrase=dict(org_name=org_name)))
report_id_query = Q(dict(match_phrase=dict(report_id=report_id)))
domain_query = Q(dict(match_phrase={"published_policy.domain": domain}))
begin_date_query = Q(dict(match=dict(date_begin=begin_date)))
end_date_query = Q(dict(match=dict(date_end=end_date)))
if index_suffix is not None:
search_index = "dmarc_aggregate_{0}*".format(index_suffix)
else:
search_index = "dmarc_aggregate*"
if index_prefix is not None:
search_index = "{0}{1}".format(index_prefix, search_index)
search = Search(index=search_index)
query = org_name_query & report_id_query & domain_query
query = query & begin_date_query & end_date_query
search.query = query
begin_date_human = begin_date.strftime("%Y-%m-%d %H:%M:%SZ")
end_date_human = end_date.strftime("%Y-%m-%d %H:%M:%SZ")
try:
existing = search.execute()
except Exception as error_:
raise OpenSearchError(
"OpenSearch's search for existing report \
error: {}".format(error_.__str__())
)
if len(existing) > 0:
raise AlreadySaved(
"An aggregate report ID {0} from {1} about {2} "
"with a date range of {3} UTC to {4} UTC already "
"exists in "
"OpenSearch".format(
report_id, org_name, domain, begin_date_human, end_date_human
)
)
published_policy = _PublishedPolicy(
domain=aggregate_report["policy_published"]["domain"],
adkim=aggregate_report["policy_published"]["adkim"],
aspf=aggregate_report["policy_published"]["aspf"],
p=aggregate_report["policy_published"]["p"],
sp=aggregate_report["policy_published"]["sp"],
pct=aggregate_report["policy_published"]["pct"],
fo=aggregate_report["policy_published"]["fo"],
)
for record in aggregate_report["records"]:
begin_date = human_timestamp_to_datetime(record["interval_begin"], to_utc=True)
end_date = human_timestamp_to_datetime(record["interval_end"], to_utc=True)
normalized_timespan = record["normalized_timespan"]
if monthly_indexes:
index_date = begin_date.strftime("%Y-%m")
else:
index_date = begin_date.strftime("%Y-%m-%d")
aggregate_report["begin_date"] = begin_date
aggregate_report["end_date"] = end_date
date_range = [aggregate_report["begin_date"], aggregate_report["end_date"]]
agg_doc = _AggregateReportDoc(
xml_schema=aggregate_report["xml_schema"],
org_name=metadata["org_name"],
org_email=metadata["org_email"],
org_extra_contact_info=metadata["org_extra_contact_info"],
report_id=metadata["report_id"],
date_range=date_range,
date_begin=begin_date,
date_end=end_date,
normalized_timespan=normalized_timespan,
errors=metadata["errors"],
published_policy=published_policy,
source_ip_address=record["source"]["ip_address"],
source_country=record["source"]["country"],
source_reverse_dns=record["source"]["reverse_dns"],
source_base_domain=record["source"]["base_domain"],
source_type=record["source"]["type"],
source_name=record["source"]["name"],
message_count=record["count"],
disposition=record["policy_evaluated"]["disposition"],
dkim_aligned=record["policy_evaluated"]["dkim"] is not None
and record["policy_evaluated"]["dkim"].lower() == "pass",
spf_aligned=record["policy_evaluated"]["spf"] is not None
and record["policy_evaluated"]["spf"].lower() == "pass",
header_from=record["identifiers"]["header_from"],
envelope_from=record["identifiers"]["envelope_from"],
envelope_to=record["identifiers"]["envelope_to"],
)
for override in record["policy_evaluated"]["policy_override_reasons"]:
agg_doc.add_policy_override(
type_=override["type"], comment=override["comment"]
)
for dkim_result in record["auth_results"]["dkim"]:
agg_doc.add_dkim_result(
domain=dkim_result["domain"],
selector=dkim_result["selector"],
result=dkim_result["result"],
)
for spf_result in record["auth_results"]["spf"]:
agg_doc.add_spf_result(
domain=spf_result["domain"],
scope=spf_result["scope"],
result=spf_result["result"],
)
index = "dmarc_aggregate"
if index_suffix:
index = "{0}_{1}".format(index, index_suffix)
if index_prefix:
index = "{0}{1}".format(index_prefix, index)
index = "{0}-{1}".format(index, index_date)
index_settings = dict(
number_of_shards=number_of_shards, number_of_replicas=number_of_replicas
)
create_indexes([index], index_settings)
agg_doc.meta.index = index
try:
agg_doc.save()
except Exception as e:
raise OpenSearchError("OpenSearch error: {0}".format(e.__str__()))
def save_forensic_report_to_opensearch(
forensic_report: dict[str, Any],
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: int = 1,
number_of_replicas: int = 0,
):
"""
Saves a parsed DMARC forensic report to OpenSearch
Args:
forensic_report (dict): A parsed forensic report
index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily
indexes
number_of_shards (int): The number of shards to use in the index
number_of_replicas (int): The number of replicas to use in the
index
Raises:
AlreadySaved
"""
logger.info("Saving forensic report to OpenSearch")
forensic_report = forensic_report.copy()
sample_date = None
if forensic_report["parsed_sample"]["date"] is not None:
sample_date = forensic_report["parsed_sample"]["date"]
sample_date = human_timestamp_to_datetime(sample_date)
original_headers = forensic_report["parsed_sample"]["headers"]
headers = dict()
for original_header in original_headers:
headers[original_header.lower()] = original_headers[original_header]
arrival_date = human_timestamp_to_datetime(forensic_report["arrival_date_utc"])
arrival_date_epoch_milliseconds = int(arrival_date.timestamp() * 1000)
if index_suffix is not None:
search_index = "dmarc_forensic_{0}*".format(index_suffix)
else:
search_index = "dmarc_forensic*"
if index_prefix is not None:
search_index = "{0}{1}".format(index_prefix, search_index)
search = Search(index=search_index)
q = Q(dict(match=dict(arrival_date=arrival_date_epoch_milliseconds)))
from_ = None
to_ = None
subject = None
if "from" in headers:
# We convert the FROM header from a string list to a flat string.
headers["from"] = headers["from"][0]
if headers["from"][0] == "":
headers["from"] = headers["from"][1]
else:
headers["from"] = " <".join(headers["from"]) + ">"
from_ = dict()
from_["sample.headers.from"] = headers["from"]
from_query = Q(dict(match_phrase=from_))
q = q & from_query
if "to" in headers:
# We convert the TO header from a string list to a flat string.
headers["to"] = headers["to"][0]
if headers["to"][0] == "":
headers["to"] = headers["to"][1]
else:
headers["to"] = " <".join(headers["to"]) + ">"
to_ = dict()
to_["sample.headers.to"] = headers["to"]
to_query = Q(dict(match_phrase=to_))
q = q & to_query
if "subject" in headers:
subject = headers["subject"]
subject_query = {"match_phrase": {"sample.headers.subject": subject}}
q = q & Q(subject_query)
search.query = q
existing = search.execute()
if len(existing) > 0:
raise AlreadySaved(
"A forensic sample to {0} from {1} "
"with a subject of {2} and arrival date of {3} "
"already exists in "
"OpenSearch".format(
to_, from_, subject, forensic_report["arrival_date_utc"]
)
)
parsed_sample = forensic_report["parsed_sample"]
sample = _ForensicSampleDoc(
raw=forensic_report["sample"],
headers=headers,
headers_only=forensic_report["sample_headers_only"],
date=sample_date,
subject=forensic_report["parsed_sample"]["subject"],
filename_safe_subject=parsed_sample["filename_safe_subject"],
body=forensic_report["parsed_sample"]["body"],
)
for address in forensic_report["parsed_sample"]["to"]:
sample.add_to(display_name=address["display_name"], address=address["address"])
for address in forensic_report["parsed_sample"]["reply_to"]:
sample.add_reply_to(
display_name=address["display_name"], address=address["address"]
)
for address in forensic_report["parsed_sample"]["cc"]:
sample.add_cc(display_name=address["display_name"], address=address["address"])
for address in forensic_report["parsed_sample"]["bcc"]:
sample.add_bcc(display_name=address["display_name"], address=address["address"])
for attachment in forensic_report["parsed_sample"]["attachments"]:
sample.add_attachment(
filename=attachment["filename"],
content_type=attachment["mail_content_type"],
sha256=attachment["sha256"],
)
try:
forensic_doc = _ForensicReportDoc(
feedback_type=forensic_report["feedback_type"],
user_agent=forensic_report["user_agent"],
version=forensic_report["version"],
original_mail_from=forensic_report["original_mail_from"],
arrival_date=arrival_date_epoch_milliseconds,
domain=forensic_report["reported_domain"],
original_envelope_id=forensic_report["original_envelope_id"],
authentication_results=forensic_report["authentication_results"],
delivery_results=forensic_report["delivery_result"],
source_ip_address=forensic_report["source"]["ip_address"],
source_country=forensic_report["source"]["country"],
source_reverse_dns=forensic_report["source"]["reverse_dns"],
source_base_domain=forensic_report["source"]["base_domain"],
authentication_mechanisms=forensic_report["authentication_mechanisms"],
auth_failure=forensic_report["auth_failure"],
dkim_domain=forensic_report["dkim_domain"],
original_rcpt_to=forensic_report["original_rcpt_to"],
sample=sample,
)
index = "dmarc_forensic"
if index_suffix:
index = "{0}_{1}".format(index, index_suffix)
if index_prefix:
index = "{0}{1}".format(index_prefix, index)
if monthly_indexes:
index_date = arrival_date.strftime("%Y-%m")
else:
index_date = arrival_date.strftime("%Y-%m-%d")
index = "{0}-{1}".format(index, index_date)
index_settings = dict(
number_of_shards=number_of_shards, number_of_replicas=number_of_replicas
)
create_indexes([index], index_settings)
forensic_doc.meta.index = index
try:
forensic_doc.save()
except Exception as e:
raise OpenSearchError("OpenSearch error: {0}".format(e.__str__()))
except KeyError as e:
raise InvalidForensicReport(
"Forensic report missing required field: {0}".format(e.__str__())
)
def save_smtp_tls_report_to_opensearch(
report: dict[str, Any],
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: Optional[int] = 1,
number_of_replicas: Optional[int] = 0,
):
"""
Saves a parsed SMTP TLS report to OpenSearch
Args:
report (dict): A parsed SMTP TLS report
index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily indexes
number_of_shards (int): The number of shards to use in the index
number_of_replicas (int): The number of replicas to use in the index
Raises:
AlreadySaved
"""
logger.info("Saving SMTP TLS report to OpenSearch")
org_name = report["organization_name"]
report_id = report["report_id"]
begin_date = human_timestamp_to_datetime(report["begin_date"], to_utc=True)
end_date = human_timestamp_to_datetime(report["end_date"], to_utc=True)
begin_date_human = begin_date.strftime("%Y-%m-%d %H:%M:%SZ")
end_date_human = end_date.strftime("%Y-%m-%d %H:%M:%SZ")
if monthly_indexes:
index_date = begin_date.strftime("%Y-%m")
else:
index_date = begin_date.strftime("%Y-%m-%d")
report["begin_date"] = begin_date
report["end_date"] = end_date
org_name_query = Q(dict(match_phrase=dict(org_name=org_name)))
report_id_query = Q(dict(match_phrase=dict(report_id=report_id)))
begin_date_query = Q(dict(match=dict(date_begin=begin_date)))
end_date_query = Q(dict(match=dict(date_end=end_date)))
if index_suffix is not None:
search_index = "smtp_tls_{0}*".format(index_suffix)
else:
search_index = "smtp_tls*"
if index_prefix is not None:
search_index = "{0}{1}".format(index_prefix, search_index)
search = Search(index=search_index)
query = org_name_query & report_id_query
query = query & begin_date_query & end_date_query
search.query = query
try:
existing = search.execute()
except Exception as error_:
raise OpenSearchError(
"OpenSearch's search for existing report \
error: {}".format(error_.__str__())
)
if len(existing) > 0:
raise AlreadySaved(
f"An SMTP TLS report ID {report_id} from "
f" {org_name} with a date range of "
f"{begin_date_human} UTC to "
f"{end_date_human} UTC already "
"exists in OpenSearch"
)
index = "smtp_tls"
if index_suffix:
index = "{0}_{1}".format(index, index_suffix)
if index_prefix:
index = "{0}{1}".format(index_prefix, index)
index = "{0}-{1}".format(index, index_date)
index_settings = dict(
number_of_shards=number_of_shards, number_of_replicas=number_of_replicas
)
smtp_tls_doc = _SMTPTLSReportDoc(
org_name=report["organization_name"],
date_range=[report["begin_date"], report["end_date"]],
date_begin=report["begin_date"],
date_end=report["end_date"],
contact_info=report["contact_info"],
report_id=report["report_id"],
)
for policy in report["policies"]:
policy_strings = None
mx_host_patterns = None
if "policy_strings" in policy:
policy_strings = policy["policy_strings"]
if "mx_host_patterns" in policy:
mx_host_patterns = policy["mx_host_patterns"]
policy_doc = _SMTPTLSPolicyDoc(
policy_domain=policy["policy_domain"],
policy_type=policy["policy_type"],
successful_session_count=policy["successful_session_count"],
failed_session_count=policy["failed_session_count"],
policy_string=policy_strings,
mx_host_patterns=mx_host_patterns,
)
if "failure_details" in policy:
for failure_detail in policy["failure_details"]:
receiving_mx_hostname = None
additional_information_uri = None
failure_reason_code = None
ip_address = None
receiving_ip = None
receiving_mx_helo = None
sending_mta_ip = None
if "receiving_mx_hostname" in failure_detail:
receiving_mx_hostname = failure_detail["receiving_mx_hostname"]
if "additional_information_uri" in failure_detail:
additional_information_uri = failure_detail[
"additional_information_uri"
]
if "failure_reason_code" in failure_detail:
failure_reason_code = failure_detail["failure_reason_code"]
if "ip_address" in failure_detail:
ip_address = failure_detail["ip_address"]
if "receiving_ip" in failure_detail:
receiving_ip = failure_detail["receiving_ip"]
if "receiving_mx_helo" in failure_detail:
receiving_mx_helo = failure_detail["receiving_mx_helo"]
if "sending_mta_ip" in failure_detail:
sending_mta_ip = failure_detail["sending_mta_ip"]
policy_doc.add_failure_details(
result_type=failure_detail["result_type"],
ip_address=ip_address,
receiving_ip=receiving_ip,
receiving_mx_helo=receiving_mx_helo,
failed_session_count=failure_detail["failed_session_count"],
sending_mta_ip=sending_mta_ip,
receiving_mx_hostname=receiving_mx_hostname,
additional_information_uri=additional_information_uri,
failure_reason_code=failure_reason_code,
)
smtp_tls_doc.policies.append(policy_doc)
create_indexes([index], index_settings)
smtp_tls_doc.meta.index = index
try:
smtp_tls_doc.save()
except Exception as e:
raise OpenSearchError("OpenSearch error: {0}".format(e.__str__()))


@@ -0,0 +1,7 @@
# About
`dbip-country-lite.mmdb` is provided by [dbip][dbip] under a
[Creative Commons Attribution 4.0 International License][cc].
[dbip]: https://db-ip.com/db/download/ip-to-country-lite
[cc]: http://creativecommons.org/licenses/by/4.0/


Binary file not shown.


@@ -0,0 +1,92 @@
# About
A mapping is meant to make it easier to identify who or what a sending source is. Please consider contributing
additional mappings in a GitHub Pull Request.
Do not open these CSV files in Excel. It will replace Unicode characters with question marks. Use LibreOffice Calc instead.
## base_reverse_dns_map.csv
A CSV file with three fields: `base_reverse_dns`, `name`, and `type`.
Most of the time, the base reverse DNS of a sending service is closely related to the name of the
service, but not always. Sometimes services will use multiple reverse DNS domains for the same service. For example,
Intuit Mailchimp uses the base domains `mcdlv.net`, `mcsv.net`,
and `rsgsv.net`. Having all of these mapped to the same service name and type makes it easier to answer questions like:
"How many emails is Intuit Mailchimp sending as my domains?"
The `service_type` is based on the following rule precedence:
1. All email security services are identified as `Email Security`, no matter how or where they are hosted.
2. All marketing services are identified as `Marketing`, no matter how or where they are hosted.
3. All telecommunications providers that offer internet access are identified as `ISP`, even if they also offer other services, such as web hosting or email hosting.
4. All web hosting providers are identified as `Web Hosting`, even if the service also offers email hosting.
5. All email account providers are identified as `Email Provider`, no matter how or where they are hosted.
6. All legitimate platforms offering their Software as a Service (SaaS) are identified as `SaaS`, regardless of industry. This helps simplify metrics.
7. All other senders that use their own domain as a Reverse DNS base domain should be identified based on their industry:
- Agriculture
- Automotive
- Beauty
- Conglomerate
- Construction
- Consulting
- Defense
- Education
- Email Provider
- Email Security
- Entertainment
- Event Planning
- Finance
- Food
- Government
- Government Media
- Healthcare
- IaaS
- Industrial
- ISP
- Legal
- Logistics
- Manufacturing
- Marketing
- MSP
- MSSP
- News
- Nonprofit
- PaaS
- Photography
- Physical Security
- Print
- Publishing
- Real Estate
- Retail
- SaaS
- Science
- Search Engine
- Social Media
- Sports
- Staffing
- Technology
- Travel
- Web Host
The file currently contains over 1,400 mappings from a wide variety of email sending sources.
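For illustration, the Intuit Mailchimp domains mentioned above would each map to the same name and type (example rows, not copied verbatim from the actual file):

base_reverse_dns,name,type
mcdlv.net,Intuit Mailchimp,Marketing
mcsv.net,Intuit Mailchimp,Marketing
rsgsv.net,Intuit Mailchimp,Marketing

A consumer can then key lookups on the base reverse DNS domain, for example (a minimal sketch in Python):

import csv

with open("base_reverse_dns_map.csv", newline="") as f:
    reverse_dns_map = {
        row["base_reverse_dns"].lower(): {"name": row["name"], "type": row["type"]}
        for row in csv.DictReader(f)
    }
print(reverse_dns_map.get("mcsv.net"))  # e.g. {'name': 'Intuit Mailchimp', 'type': 'Marketing'}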
## known_unknown_base_reverse_dns.txt
A list of reverse DNS base domains that could not be identified as belonging to a particular organization, service, or industry.
## base_reverse_dns.csv
A CSV with the fields `source_name` and optionally `message_count`. This CSV can be generated by exporting the base DNS data from the Kibana or Splunk dashboards provided by parsedmarc. This file is not tracked by Git.
## unknown_base_reverse_dns.csv
A CSV file with the fields `source_name` and `message_count`. This file is not tracked by Git.
## find_bad_utf8.py
Locates invalid UTF-8 bytes in files and optionally tries to correct them. Generated by GPT-5. It helped me find where I had introduced invalid bytes in `base_reverse_dns_map.csv`.
## find_unknown_base_reverse_dns.py
This is a Python script that reads the domains in `base_reverse_dns.csv` and writes the domains that are not in `base_reverse_dns_map.csv` or `known_unknown_base_reverse_dns.txt` to `unknown_base_reverse_dns.csv`. This is useful for identifying potential additional domains to contribute to `base_reverse_dns_map.csv` and `known_unknown_base_reverse_dns.txt`.


File diff suppressed because it is too large


@@ -0,0 +1,44 @@
Agriculture
Automotive
Beauty
Conglomerate
Construction
Consulting
Defense
Education
Email Provider
Email Security
Entertainment
Event Planning
Finance
Food
Government
Government Media
Healthcare
ISP
IaaS
Industrial
Legal
Logistics
MSP
MSSP
Manufacturing
Marketing
News
Nonprofit
PaaS
Photography
Physical Security
Print
Publishing
Real Estate
Retail
SaaS
Science
Search Engine
Social Media
Sports
Staffing
Technology
Travel
Web Host


@@ -0,0 +1,488 @@
#!/usr/bin/env python3
import argparse
import codecs
import os
import sys
import shutil
from typing import List, Tuple
"""
Locates and optionally corrects bad UTF-8 bytes in a file.
Generated by GPT-5. Use at your own risk.
"""
# -------------------------
# UTF-8 scanning
# -------------------------
def scan_line_for_utf8_errors(
line_bytes: bytes, line_no: int, base_offset: int, context: int
):
"""
Scan one line of raw bytes for UTF-8 decoding errors.
Returns a list of dicts describing each error.
"""
pos = 0
results = []
while pos < len(line_bytes):
dec = codecs.getincrementaldecoder("utf-8")("strict")
try:
dec.decode(line_bytes[pos:], final=True)
break
except UnicodeDecodeError as e:
rel_index = e.start
abs_index_in_line = pos + rel_index
abs_offset = base_offset + abs_index_in_line
start_ctx = max(0, abs_index_in_line - context)
end_ctx = min(len(line_bytes), abs_index_in_line + 1 + context)
ctx_bytes = line_bytes[start_ctx:end_ctx]
bad_byte = line_bytes[abs_index_in_line : abs_index_in_line + 1]
col = abs_index_in_line + 1 # 1-based byte column
results.append(
{
"line": line_no,
"column": col,
"abs_offset": abs_offset,
"bad_byte_hex": bad_byte.hex(),
"context_hex": ctx_bytes.hex(),
"context_preview": ctx_bytes.decode("utf-8", errors="replace"),
}
)
# Move past the offending byte and continue
pos = abs_index_in_line + 1
return results
def scan_file_for_utf8_errors(path: str, context: int, limit: int):
errors_found = 0
limit_val = limit if limit != 0 else float("inf")
with open(path, "rb") as f:
total_offset = 0
line_no = 0
while True:
line = f.readline()
if not line:
break
line_no += 1
results = scan_line_for_utf8_errors(line, line_no, total_offset, context)
for r in results:
errors_found += 1
print(
f"[ERROR {errors_found}] Line {r['line']}, Column {r['column']}, "
f"Absolute byte offset {r['abs_offset']}"
)
print(f" Bad byte: 0x{r['bad_byte_hex']}")
print(f" Context (hex): {r['context_hex']}")
print(f" Context (preview): {r['context_preview']}")
print()
if errors_found >= limit_val:
print(f"Reached limit of {limit} errors. Stopping.")
return errors_found
total_offset += len(line)
if errors_found == 0:
print("No invalid UTF-8 bytes found. 🎉")
else:
print(f"Found {errors_found} invalid UTF-8 byte(s).")
return errors_found
# -------------------------
# Whole-file conversion
# -------------------------
def detect_encoding_text(path: str) -> Tuple[str, str]:
"""
Use charset-normalizer to detect file encoding.
Return (encoding_name, decoded_text). Falls back to cp1252 if needed.
"""
try:
from charset_normalizer import from_path
except ImportError:
print(
"Please install charset-normalizer: pip install charset-normalizer",
file=sys.stderr,
)
sys.exit(4)
matches = from_path(path)
match = matches.best()
if match is None or match.encoding is None:
# Fallback heuristic for Western single-byte text
with open(path, "rb") as fb:
data = fb.read()
try:
return "cp1252", data.decode("cp1252", errors="strict")
except UnicodeDecodeError:
print("Unable to detect encoding reliably.", file=sys.stderr)
sys.exit(5)
return match.encoding, str(match)
def convert_to_utf8(src_path: str, out_path: str, src_encoding: str = None) -> str:
"""
Convert an entire file to UTF-8 (re-decoding everything).
If src_encoding is provided, use it; else auto-detect.
Returns the encoding actually used.
"""
if src_encoding:
with open(src_path, "rb") as fb:
data = fb.read()
try:
text = data.decode(src_encoding, errors="strict")
except LookupError:
print(f"Unknown encoding: {src_encoding}", file=sys.stderr)
sys.exit(6)
except UnicodeDecodeError as e:
print(f"Decoding failed with {src_encoding}: {e}", file=sys.stderr)
sys.exit(7)
used = src_encoding
else:
used, text = detect_encoding_text(src_path)
with open(out_path, "w", encoding="utf-8", newline="") as fw:
fw.write(text)
return used
def verify_utf8_file(path: str) -> Tuple[bool, str]:
try:
with open(path, "rb") as fb:
fb.read().decode("utf-8", errors="strict")
return True, ""
except UnicodeDecodeError as e:
return False, str(e)
# -------------------------
# Targeted single-byte fixer
# -------------------------
def iter_lines_with_offsets(b: bytes):
"""
Yield (line_bytes, line_start_abs_offset). Preserves LF/CRLF/CR in bytes.
"""
start = 0
for i, byte in enumerate(b):
if byte == 0x0A: # LF
yield b[start : i + 1], start
start = i + 1
if start < len(b):
yield b[start:], start
def detect_probable_fallbacks() -> List[str]:
# Good defaults for Western/Portuguese text
return ["cp1252", "iso-8859-1", "iso-8859-15"]
def repair_mixed_utf8_line(line: bytes, base_offset: int, fallback_chain: List[str]):
"""
Strictly validate UTF-8 and fix *only* the exact offending byte when an error occurs.
This avoids touching adjacent valid UTF-8 (prevents mojibake such as 'Ã©' appearing in place of 'é').
"""
out_fragments: List[str] = []
fixes = []
pos = 0
n = len(line)
while pos < n:
dec = codecs.getincrementaldecoder("utf-8")("strict")
try:
s = dec.decode(line[pos:], final=True)
out_fragments.append(s)
break
except UnicodeDecodeError as e:
# Append the valid prefix before the error
if e.start > 0:
out_fragments.append(
line[pos : pos + e.start].decode("utf-8", errors="strict")
)
bad_index = pos + e.start # absolute index in 'line'
bad_slice = line[bad_index : bad_index + 1] # FIX EXACTLY ONE BYTE
# Decode that single byte using the first working fallback
decoded = None
used_enc = None
for enc in fallback_chain:
try:
decoded = bad_slice.decode(enc, errors="strict")
used_enc = enc
break
except Exception:
continue
if decoded is None:
# latin-1 always succeeds (byte->same code point)
decoded = bad_slice.decode("latin-1")
used_enc = "latin-1 (fallback)"
out_fragments.append(decoded)
# Log the fix
col_1based = bad_index + 1 # byte-based column
fixes.append(
{
"line_base_offset": base_offset,
"line": None, # caller fills line number
"column": col_1based,
"abs_offset": base_offset + bad_index,
"bad_bytes_hex": bad_slice.hex(),
"used_encoding": used_enc,
"replacement_preview": decoded,
}
)
# Advance exactly one byte past the offending byte and continue
pos = bad_index + 1
return "".join(out_fragments), fixes
def targeted_fix_to_utf8(
src_path: str,
out_path: str,
fallback_chain: List[str],
dry_run: bool,
max_fixes: int,
):
with open(src_path, "rb") as fb:
data = fb.read()
total_fixes = 0
repaired_lines: List[str] = []
line_no = 0
max_val = max_fixes if max_fixes != 0 else float("inf")
for line_bytes, base_offset in iter_lines_with_offsets(data):
line_no += 1
# Fast path: keep lines that are already valid UTF-8
try:
repaired_lines.append(line_bytes.decode("utf-8", errors="strict"))
continue
except UnicodeDecodeError:
pass
fixed_text, fixes = repair_mixed_utf8_line(
line_bytes, base_offset, fallback_chain=fallback_chain
)
for f in fixes:
f["line"] = line_no
repaired_lines.append(fixed_text)
# Log fixes
for f in fixes:
total_fixes += 1
print(
f"[FIX {total_fixes}] Line {f['line']}, Column {f['column']}, Abs offset {f['abs_offset']}"
)
print(f" Bad bytes: 0x{f['bad_bytes_hex']}")
print(f" Used encoding: {f['used_encoding']}")
preview = f["replacement_preview"].replace("\r", "\\r").replace("\n", "\\n")
if len(preview) > 40:
preview = preview[:40] + "…"
print(f" Replacement preview: {preview}")
print()
if total_fixes >= max_val:
print(f"Reached max fixes limit ({max_fixes}). Stopping scan.")
break
if total_fixes >= max_val:
break
if dry_run:
print(f"Dry run complete. Detected {total_fixes} fix(es). No file written.")
return total_fixes
# Join and verify result can be encoded to UTF-8
repaired_text = "".join(repaired_lines)
try:
repaired_text.encode("utf-8", errors="strict")
except UnicodeEncodeError as e:
print(f"Internal error: repaired text not valid UTF-8: {e}", file=sys.stderr)
sys.exit(3)
with open(out_path, "w", encoding="utf-8", newline="") as fw:
fw.write(repaired_text)
print(f"Fixed file written to: {out_path}")
print(f"Total fixes applied: {total_fixes}")
return total_fixes
# -------------------------
# CLI
# -------------------------
def main():
ap = argparse.ArgumentParser(
description=(
"Scan for invalid UTF-8; optionally convert whole file or fix only invalid bytes.\n\n"
"By default, --convert and --fix **edit the input file in place** and create a backup "
"named '<input>.bak' before writing. If you pass --output, the original file is left "
"unchanged and no backup is created. Use --dry-run to preview fixes without writing."
),
formatter_class=argparse.RawTextHelpFormatter,
)
ap.add_argument("path", help="Path to the CSV/text file")
ap.add_argument(
"--context",
type=int,
default=20,
help="Bytes of context to show around errors (default: 20)",
)
ap.add_argument(
"--limit",
type=int,
default=100,
help="Max errors to report during scan (0 = unlimited)",
)
ap.add_argument(
"--skip-scan", action="store_true", help="Skip initial scan for speed"
)
# Whole-file convert
ap.add_argument(
"--convert",
action="store_true",
help="Convert entire file to UTF-8 using auto/forced encoding "
"(in-place by default; creates '<input>.bak').",
)
ap.add_argument(
"--encoding",
help="Force source encoding for --convert or first fallback for --fix",
)
ap.add_argument(
"--output",
help="Write to this path instead of in-place (no .bak is created in that case)",
)
# Targeted fix
ap.add_argument(
"--fix",
action="store_true",
help="Fix only invalid byte(s) via fallback encodings "
"(in-place by default; creates '<input>.bak').",
)
ap.add_argument(
"--fallbacks",
help="Comma-separated fallback encodings (default: cp1252,iso-8859-1,iso-8859-15)",
)
ap.add_argument(
"--dry-run",
action="store_true",
help="(fix) Print fixes but do not write or create a .bak",
)
ap.add_argument(
"--max-fixes",
type=int,
default=0,
help="(fix) Stop after N fixes (0 = unlimited)",
)
args = ap.parse_args()
path = args.path
if not os.path.isfile(path):
print(f"File not found: {path}", file=sys.stderr)
sys.exit(2)
# Optional scan first
if not args.skip_scan:
scan_file_for_utf8_errors(path, context=args.context, limit=args.limit)
# Mode selection guards
if args.convert and args.fix:
print("Choose either --convert or --fix (not both).", file=sys.stderr)
sys.exit(9)
if not args.convert and not args.fix and args.skip_scan:
print("No action selected (use --convert or --fix).")
return
if not args.convert and not args.fix:
# User only wanted a scan
return
# Determine output path and backup behavior
# In-place by default: create '<input>.bak' before overwriting.
if args.output:
out_path = args.output
in_place = False
else:
out_path = path
in_place = True
# CONVERT mode
if args.convert:
print("\n[CONVERT MODE] Converting file to UTF-8...")
if in_place:
# Create backup before overwriting original
backup_path = path + ".bak"
shutil.copy2(path, backup_path)
print(f"Backup created: {backup_path}")
used = convert_to_utf8(path, out_path, src_encoding=args.encoding)
print(f"Source encoding used: {used}")
print(f"Saved UTF-8 file as: {out_path}")
ok, err = verify_utf8_file(out_path)
if ok:
print("Verification: output is valid UTF-8 ✅")
else:
print(f"Verification failed: {err}")
sys.exit(8)
return
# FIX mode (targeted, single-byte)
if args.fix:
print("\n[FIX MODE] Fixing only invalid bytes to UTF-8...")
if args.dry_run:
# Dry-run: never write or create backup
out_path_effective = os.devnull
in_place_effective = False
else:
out_path_effective = out_path
in_place_effective = in_place
# Build fallback chain (if --encoding provided, try it first)
if args.fallbacks:
fallback_chain = [e.strip() for e in args.fallbacks.split(",") if e.strip()]
else:
fallback_chain = detect_probable_fallbacks()
if args.encoding and args.encoding not in fallback_chain:
fallback_chain = [args.encoding] + fallback_chain
if in_place_effective:
# Create backup before overwriting original (only when actually writing)
backup_path = path + ".bak"
shutil.copy2(path, backup_path)
print(f"Backup created: {backup_path}")
fix_count = targeted_fix_to_utf8(
path,
out_path_effective,
fallback_chain=fallback_chain,
dry_run=args.dry_run,
max_fixes=args.max_fixes,
)
if not args.dry_run:
ok, err = verify_utf8_file(out_path_effective)
if ok:
print("Verification: output is valid UTF-8 ✅")
print(f"Fix mode completed — {fix_count} byte(s) corrected.")
else:
print(f"Verification failed: {err}")
sys.exit(8)
return
if __name__ == "__main__":
main()
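As a quick usage sketch (the argparse CLI in main() is the intended entry point), the scanner above can also be called directly from Python; the module name below assumes the script is saved as find_bad_utf8.py, as described in the README above:

# Equivalent CLI: python3 find_bad_utf8.py base_reverse_dns_map.csv --limit 0
from find_bad_utf8 import scan_file_for_utf8_errors

# limit=0 reports every invalid byte; context controls how many bytes are shown around each error
error_count = scan_file_for_utf8_errors("base_reverse_dns_map.csv", context=20, limit=0)
if error_count:
    raise SystemExit(1)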


@@ -0,0 +1,78 @@
#!/usr/bin/env python
import os
import csv
def _main():
input_csv_file_path = "base_reverse_dns.csv"
base_reverse_dns_map_file_path = "base_reverse_dns_map.csv"
known_unknown_list_file_path = "known_unknown_base_reverse_dns.txt"
psl_overrides_file_path = "psl_overrides.txt"
output_csv_file_path = "unknown_base_reverse_dns.csv"
csv_headers = ["source_name", "message_count"]
known_unknown_domains = []
psl_overrides = []
known_domains = []
output_rows = []
def load_list(file_path, list_var):
if not os.path.exists(file_path):
print(f"Error: {file_path} does not exist")
exit(1)
print(f"Loading {file_path}")
with open(file_path) as f:
for line in f.readlines():
domain = line.lower().strip()
if domain in list_var:
print(f"Error: {domain} is in {file_path} multiple times")
exit(1)
elif domain != "":
list_var.append(domain)
load_list(known_unknown_list_file_path, known_unknown_domains)
load_list(psl_overrides_file_path, psl_overrides)
if not os.path.exists(base_reverse_dns_map_file_path):
print(f"Error: {base_reverse_dns_map_file_path} does not exist")
exit(1)
print(f"Loading {base_reverse_dns_map_file_path}")
with open(base_reverse_dns_map_file_path) as f:
for row in csv.DictReader(f):
domain = row["base_reverse_dns"].lower().strip()
if domain in known_domains:
print(
f"Error: {domain} is in {base_reverse_dns_map_file_path} multiple times"
)
exit(1)
else:
known_domains.append(domain)
if domain in known_unknown_domains:
print(
f"Error: {domain} is in {known_unknown_list_file_path} and "
f"{base_reverse_dns_map_file_path}"
)
exit(1)
if not os.path.exists(input_csv_file_path):
print(f"Error: {base_reverse_dns_map_file_path} does not exist")
exit(1)
with open(input_csv_file_path) as f:
for row in csv.DictReader(f):
domain = row["source_name"].lower().strip()
if domain == "":
continue
for psl_domain in psl_overrides:
if domain.endswith(psl_domain):
domain = psl_domain.strip(".").strip("-")
break
if domain not in known_domains and domain not in known_unknown_domains:
print(f"New unknown domain found: {domain}")
output_rows.append(row)
print(f"Writing {output_csv_file_path}")
with open(output_csv_file_path, "w") as f:
writer = csv.DictWriter(f, fieldnames=csv_headers)
writer.writeheader()
writer.writerows(output_rows)
if __name__ == "__main__":
_main()
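The PSL override matching step above collapses hosting-provider subdomains onto a single base value before the lookup. A minimal sketch of that step, using entries taken from psl_overrides.txt (the hostname is a made-up example):

psl_overrides = [".amazonaws.com", "-clientes-izzi.mx"]
domain = "ec2-192-0-2-1.us-east-1.compute.amazonaws.com"
for psl_domain in psl_overrides:
    if domain.endswith(psl_domain):
        # Strip the leading "." or "-" so the override becomes the base domain
        domain = psl_domain.strip(".").strip("-")
        break
print(domain)  # amazonaws.com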


@@ -0,0 +1,601 @@
1jli.site
26.107
444qcuhilla.com
4xr1.com
9services.com
a7e.ru
a94434500-blog.com
aams8.jp
abv-10.top
acemail.co.in
activaicon.com
adcritic.net
adlucrumnewsletter.com
admin.corpivensa.gob.ve
advantageiq.com
advrider.ro
aerospacevitro.us.com
agenturserver.de
aghories.com
ai270.net
albagroup-eg.com
alchemy.net
alohabeachcamp.net
alsiscad.com
aluminumpipetubing.com
americanstorageca.com
amplusserver.info
anchorfundhub.com
anglishment.com
anteldata.net.uy
antis.edu
antonaoll.com
anviklass.org
anwrgrp.lat
aosau.net
arandomserver.com
aransk.ru
ardcs.cn
armninl.met
as29550.net
asahachimaru.com
aserv.co.za
asmecam.it
ateky.net.br
aurelienvos.com
automatech.lat
avistaadvantage.com
b8sales.com
bahjs.com
baliaura.com
banaras.co
bearandbullmarketnews.com
bestinvestingtime.com
bhjui.com
biocorp.com
biosophy.net
bitter-echo.com
bizhostingservices.com
blguss.com
bluenet.ch
bluhosting.com
bnasg.com
bodiax.pp.ua
bost-law.com
brainity.com
brazalnde.net
brellatransplc.shop
brnonet.cz
broadwaycover.com
brushinglegal.de
brw.net
btes.tv
budgeteasehub.com
buoytoys.com
buyjapanese.jp
c53dw7m24rj.com
cahtelrandom.org
casadelmarsamara.com
cashflowmasterypro.com
cavabeen.com
cbti.net
centralmalaysia.com
chauffeurplan.co.uk
checkpox.fun
chegouseuvlache.org
chinaxingyu.xyz
christus.mx
churchills.market
ci-xyz.fit
cisumrecords.com
ckaik.cn
clcktoact.com
cli-eurosignal.cz
cloud-admin.it
cloud-edm.com
cloudflare-email.org
cloudhosting.rs
cloudlogin.co
cloudplatformpro.com
cnode.io
cntcloud.com
code-it.net
codefriend.top
colombiaceropapel.org
commerceinsurance.com
comsharempc.com
conexiona.com
coolblaze.com
coowo.com
corpemail.net
cp2-myorderbox.com
cps.com.ar
crnagora.net
cross-d-bar-troutranch.com
ctla.co.kr
cumbalikonakhotel.com
currencyexconverter.com
daakbabu.com
daikinmae.com
dairyvalley.com.my
dastans.ru
datahost36.de
ddii.network
deep-sek.shop
deetownsounds.com
descarca-counter-strike.net
detrot.xyz
dettlaffinc.com
dextoolse.net
digestivedaily.com
digi.net.my
dinofelis.cn
diwkyncbi.top
dkginternet.com
dnexpress.info
dns-oid.com
dnsindia.net
domainserver.ne.jp
domconfig.com
doorsrv.com
dreampox.fun
dreamtechmedia.com
ds.network
dss-group.net
dvj.theworkpc.com
dwlcka.com
dynamic-wiretel.in
dyntcorp.com
easternkingspei.com
economiceagles.com
egosimail.com
eliotporterphotos.us
emailgids.net
emailperegrine.com
entendercopilot.com
entretothom.net
epaycontrol.com
epicinvestmentsreview.co
epicinvestmentsreview.com
epik.com
epsilon-group.com
erestaff.com
euro-trade-gmbh.com
example.com
exposervers.com-new
extendcp.co.uk
eyecandyhosting.xyz
fastwebnet.it
fd9ing7wfn.com
feipnghardware.com
fetscorp.shop
fewo-usedom.net
fin-crime.com
financeaimpoint.com
financeupward.com
firmflat.com
flex-video.bnr.la
flourishfusionlife.com
formicidaehunt.net
fosterheap.com
fredi.shop
frontiernet.net
ftifb7tk3c.com
gamersprotectionvpn.online
gendns.com
getgreencardsfast.com
getthatroi.com
gibbshosting.com
gigidea.net
giize.com
ginous.eu.com
gis.net
gist-th.com
globalglennpartners.com
goldsboroughplace.com
gophermedia.com
gqlists.us.com
gratzl.de
greatestworldnews.com
greennutritioncare.com
gsbb.com
gumbolimbo.net
h-serv.co.uk
haedefpartners.com
halcyon-aboveboard.com
hanzubon.org
healthfuljourneyjoy.com
hgnbroken.us.com
highwey-diesel.com
hirofactory.com
hjd.asso.fr
hongchenggco.pro
hongkongtaxi.co
hopsinthehanger.com
hosted-by-worldstream.net
hostelsucre.com
hosting1337.com
hostinghane.com
hostinglotus.cloud
hostingmichigan.com
hostiran.name
hostmnl.com
hostname.localhost
hostnetwork.com
hosts.net.nz
hostserv.eu
hostwhitelabel.com
hpms1.jp
hunariojmk.net
hunriokinmuim.net
hypericine.com
i-mecca.net
iaasdns.com
iam.net.ma
iconmarketingguy.com
idcfcloud.net
idealconcept.live
igmohji.com
igppevents.org.uk
ihglobaldns.com
ilmessicano.com
imjtmn.cn
immenzaces.com
in-addr-arpa
in-addr.arpa
indsalelimited.com
indulgent-holistic.com
industechint.org
inshaaegypt.com
intal.uz
interfarma.kz
intocpanel.com
ip-147-135-108.us
ip-178-33-109.eu
ip-ptr.tech
iswhatpercent.com
itsidc.com
itwebs.com
iuon.net
ivol.co
jalanet.co.id
jimishare.com
jlccptt.net.cn
jlenterprises.co.uk
jmontalto.com
joyomokei.com
jumanra.org
justlongshirts.com
kahlaa.com
kaw.theworkpc.com
kbronet.com.tw
kdnursing.org
kielnet.net
kihy.theworkpc.com
kingschurchwirral.org
kitchenaildbd.com
klaomi.shop
knkconsult.net
kohshikai.com
krhfund.org
krillaglass.com
lancorhomes.com
landpedia.org
lanzatuseo.es
layerdns.cloud
learninglinked.com
legenditds.com
levertechcentre.com
lhost.no
lideri.net.br
lighthouse-media.com
lightpath.net
limogesporcelainboxes.com
lindsaywalt.net
linuxsunucum.com
listertermoformadoa.com
llsend.com
local.net
lohkal.com
londionrtim.net
lonestarmm.net
longmarquis.com
longwoodmgmt.com
lse.kz
lunvoy.com
luxarpro.ru
lwl-puehringer.at
lynx.net.lb
lyse.net
m-sender.com.ua
maggiolicloud.it
magnetmail.net
magnumgo.uz
maia11.com
mail-fire.com
mailsentinel.net
mailset.cn
malardino.net
managed-vps.net
manhattanbulletpoint.com
manpowerservices.com
marketmysterycode.com
marketwizardspro.com
masterclassjournal.com
matroguel.cam
maximpactipo.com
mechanicalwalk.store
mediavobis.com
meqlobal.com
mgts.by
migrans.net
miixta.com
milleniumsrv.com
mindworksunlimited.com
mirth-gale.com
misorpresa.com
mitomobile.com
mitsubachi-kibako.net
mjinn.com
mkegs.shop
mobius.fr
model-ac.ink
moderntradingnews.com
monnaiegroup.com
monopolizeright.com
moonjaws.com
morningnewscatcher.com
motion4ever.net
mschosting.com
msdp1.com
mspnet.pro
mts-nn.ru
multifamilydesign.com
mxserver.ro
mxthunder.net
my-ihor.ru
mycloudmailbox.com
myfriendforum.com
myrewards.net
mysagestore.com
mysecurewebserver.com
myshanet.net
myvps.jp
mywedsite.net
mywic.eu
name.tools
nanshenqfurniture.com
nask.pl
navertise.net
ncbb.kz
ncport.ru
ncsdi.ws
nebdig.com
neovet-base.ru
netbri.com
netcentertelecom.net.br
neti.ee
netkl.org
newinvestingguide.com
newwallstreetcode.com
ngvcv.cn
nic.name
nidix.net
nieuwedagnetwerk.net
nlscanme.com
nmeuh.cn
noisndametal.com
nucleusemail.com
nutriboostlife.com
nwo.giize.com
nwwhalewatchers.org
ny.adsl
nyt1.com
offerslatedeals.com
office365.us
ogicom.net
olivettilexikon.co.uk
omegabrasil.inf.br
onnet21.com
onumubunumu.com
oppt-ac.fit
orbitel.net.co
orfsurface.com
orientalspot.com
outsidences.com
ovaltinalization.co
overta.ru
ox28vgrurc.com
pamulang.net
panaltyspot.space
panolacountysheriffms.com
passionatesmiles.com
paulinelam.com
pdi-corp.com
peloquinbeck.com
perimetercenter.net
permanentscreen.com
permasteellisagroup.com
perumkijhyu.net
pesnia.com.ua
ph8ltwdi12o.com
pharmada.com.de
phdns3.es
pigelixval1.com
pipefittingsindia.com
planethoster.net
playamedia.io
plesk.page
pmnhost.net
pokiloandhu.net
pokupki5.ru
polandi.net
popiup.com
ports.net
posolstvostilya.com
potia.net
prima.com.ar
prima.net.ar
profsol.co.uk
prohealthmotion.com
promooffermarket.site
proudserver.com
proxado.com
psnm.ru
pvcwindowsprices.live
qontenciplc.autos
quakeclick.com
quasarstate.store
quatthonggiotico.com
qxyxab44njd.com
radianthealthrenaissance.com
rapidns.com
raxa.host
reberte.com
reethvikintl.com
regruhosting.ru
reliablepanel.com
rgb365.eu
riddlecamera.net
riddletrends.com
roccopugliese.com
runnin-rebels.com
rupar.puglia.it
rwdhosting.ca
s500host.com
sageevents.co.ke
sahacker-2020.com
samsales.site
sante-lorraine.fr
saransk.ru
satirogluet.com
scioncontacts.com
sdcc.my
seaspraymta3.net
secorp.mx
securen.net
securerelay.in
securev.net
seductiveeyes.com
seizethedayconsulting.com
serroplast.shop
server290.com
server342.com
server3559.cc
servershost.biz
sfek.kz
sgnetway.net
shopfox.ca
silvestrejaguar.sbs
silvestreonca.sbs
simplediagnostics.org
siriuscloud.jp
sisglobalresearch.com
sixpacklink.net
sjestyle.com
smallvillages.com
smartape-vps.com
solusoftware.com
sourcedns.com
southcoastwebhosting12.com
specialtvvs.com
spiritualtechnologies.io
sprout.org
srv.cat
stableserver.net
statlerfa.co.uk
stock-smtp.top
stockepictigers.com
stockexchangejournal.com
subterranean-concave.com
suksangroup.com
swissbluetopaz.com
switer.shop
sysop4.com
system.eu.com
szhongbing.com
t-jon.com
tacaindo.net
tacom.tj
tankertelz.co
tataidc.com
teamveiw.com
tecnoxia.net
tel-xyz.fit
tenkids.net
terminavalley.com
thaicloudsolutions.com
thaikinghost.com
thaimonster.com
thegermainetruth.net
thehandmaderose.com
thepushcase.com
ticdns.com
tigo.bo
toledofibra.net.br
topdns.com
totaal.net
totalplay.net
tqh.ro
traderlearningcenter.com
tradeukraine.site
traveleza.com
trwww.com
tsuzakij.com
tullostrucking.com
turbinetrends.com
twincitiesdistinctivehomes.com
tylerfordonline.com
uiyum.com
ultragate.com
uneedacollie.com
unified.services
unite.services
urawasl.com
us.servername.us
vagebond.net
varvia.de
vbcploo.com
vdc.vn
vendimetry.com
vibrantwellnesscorp.com
virtualine.org
visit.docotor
viviotech.us
vlflgl.com
volganet.ru
vrns.net
vulterdi.edu
vvondertex.com
wallstreetsgossip.com
wamego.net
wanekoohost.com
wealthexpertisepro.com
web-login.eu
weblinkinternational.com
webnox.io
websale.net
welllivinghive.com
westparkcom.com
wetransfer-eu.com
wheelch.me
whoflew.com
whpservers.com
wisdomhard.com
wisewealthcircle.com
wisvis.com
wodeniowa.com
wordpresshosting.xyz
wsiph2.com
xnt.mx
xodiax.com
xpnuf.cn
xsfati.us.com
xspmail.jp
yourciviccompass.com
yourinvestworkbook.com
yoursitesecure.net
zerowebhosting.net
zmml.uk
znlc.jp
ztomy.com


@@ -0,0 +1,23 @@
-applefibernet.com
-c3.net.pl
-celsiainternet.com
-clientes-izzi.mx
-clientes-zap-izzi.mx
-imnet.com.br
-mcnbd.com
-smile.com.bd
-tataidc.co.in
-veloxfiber.com.br
-wconect.com.br
.amazonaws.com
.cloudaccess.net
.ddnsgeek.com
.fastvps-server.com
.in-addr-arpa
.in-addr.arpa
.kasserver.com
.kinghost.net
.linode.com
.linodeusercontent.com
.na4u.ru
.sakura.ne.jp


@@ -0,0 +1,184 @@
#!/usr/bin/env python3
from __future__ import annotations
import os
import csv
from pathlib import Path
from typing import Mapping, Iterable, Optional, Collection, Union, List, Dict
class CSVValidationError(Exception):
def __init__(self, errors: list[str]):
super().__init__("\n".join(errors))
self.errors = errors
def sort_csv(
filepath: Union[str, Path],
field: str,
*,
sort_field_value_must_be_unique: bool = True,
strip_whitespace: bool = True,
fields_to_lowercase: Optional[Iterable[str]] = None,
case_insensitive_sort: bool = False,
required_fields: Optional[Iterable[str]] = None,
allowed_values: Optional[Mapping[str, Collection[str]]] = None,
) -> List[Dict[str, str]]:
"""
Read a CSV, optionally normalize rows (strip whitespace, lowercase certain fields),
validate field values, and write the sorted CSV back to the same path.
- filepath: Path to the CSV to sort.
- field: The field name to sort by.
- fields_to_lowercase: Permanently lowercases these field(s) in the data.
- strip_whitespace: Remove all whitespace at the beginning and end of field values.
- case_insensitive_sort: Ignore case when sorting without changing values.
- required_fields: A list of fields that must have data in all rows.
- allowed_values: A mapping of allowed values for fields.
"""
path = Path(filepath)
required_fields = set(required_fields or [])
lower_set = set(fields_to_lowercase or [])
allowed_sets = {k: set(v) for k, v in (allowed_values or {}).items()}
if sort_field_value_must_be_unique:
seen_sort_field_values = []
with path.open("r", newline="") as infile:
reader = csv.DictReader(infile)
fieldnames = reader.fieldnames or []
if field not in fieldnames:
raise CSVValidationError([f"Missing sort column: {field!r}"])
missing_headers = required_fields - set(fieldnames)
if missing_headers:
raise CSVValidationError(
[f"Missing required header(s): {sorted(missing_headers)}"]
)
rows = list(reader)
def normalize_row(row: Dict[str, str]) -> None:
if strip_whitespace:
for k, v in row.items():
if isinstance(v, str):
row[k] = v.strip()
for fld in lower_set:
if fld in row and isinstance(row[fld], str):
row[fld] = row[fld].lower()
def validate_row(
row: Dict[str, str], sort_field: str, line_no: int, errors: list[str]
) -> None:
if sort_field_value_must_be_unique:
if row[sort_field] in seen_sort_field_values:
errors.append(f"Line {line_no}: Duplicate row for '{row[sort_field]}'")
else:
seen_sort_field_values.append(row[sort_field])
for rf in required_fields:
val = row.get(rf)
if val is None or val == "":
errors.append(
f"Line {line_no}: Missing value for required field '{rf}'"
)
for field, allowed_values in allowed_sets.items():
if field in row:
val = row[field]
if val not in allowed_values:
errors.append(
f"Line {line_no}: '{val}' is not an allowed value for '{field}' "
f"(allowed: {sorted(allowed_values)})"
)
errors: list[str] = []
for idx, row in enumerate(rows, start=2): # header is line 1
normalize_row(row)
validate_row(row, field, idx, errors)
if errors:
raise CSVValidationError(errors)
def sort_key(r: Dict[str, str]):
v = r.get(field, "")
if isinstance(v, str) and case_insensitive_sort:
return v.casefold()
return v
rows.sort(key=sort_key)
with open(filepath, "w", newline="") as outfile:
writer = csv.DictWriter(outfile, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
def sort_list_file(
filepath: Union[str, Path],
*,
lowercase: bool = True,
strip: bool = True,
deduplicate: bool = True,
remove_blank_lines: bool = True,
ending_newline: bool = True,
newline: Optional[str] = "\n",
):
"""Read a list from a file, sort it, optionally strip and deduplicate the values,
then write that list back to the file.
- filepath: The path to the file.
- lowercase: Lowercase all values prior to sorting.
- remove_blank_lines: Remove any blank lines.
- ending_newline: End the file with a newline, even if remove_blank_lines is true.
- newline: The newline character to use.
"""
with open(filepath, mode="r", newline=newline) as infile:
lines = infile.readlines()
for i in range(len(lines)):
if lowercase:
lines[i] = lines[i].lower()
if strip:
lines[i] = lines[i].strip()
if deduplicate:
lines = list(set(lines))
if remove_blank_lines:
while "" in lines:
lines.remove("")
lines = sorted(lines)
if ending_newline:
if lines[-1] != "":
lines.append("")
with open(filepath, mode="w", newline=newline) as outfile:
outfile.write("\n".join(lines))
def _main():
map_file = "base_reverse_dns_map.csv"
map_key = "base_reverse_dns"
list_files = ["known_unknown_base_reverse_dns.txt", "psl_overrides.txt"]
types_file = "base_reverse_dns_types.txt"
with open(types_file) as f:
types = [line.strip() for line in f.readlines()]  # strip newlines so values compare equal to CSV field values
while "" in types:
types.remove("")
map_allowed_values = {"Type": types}
for list_file in list_files:
if not os.path.exists(list_file):
print(f"Error: {list_file} does not exist")
exit(1)
sort_list_file(list_file)
if not os.path.exists(types_file):
print(f"Error: {types_file} does not exist")
exit(1)
sort_list_file(types_file, lowercase=False)
if not os.path.exists(map_file):
print(f"Error: {map_file} does not exist")
exit(1)
try:
sort_csv(map_file, map_key, allowed_values=map_allowed_values)
except CSVValidationError as e:
print(f"{map_file} did not validate: {e}")
if __name__ == "__main__":
_main()
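A minimal sketch of calling the helpers above directly (field names follow the README for base_reverse_dns_map.csv; the set of allowed type values is truncated here for illustration):

try:
    sort_csv(
        "base_reverse_dns_map.csv",
        "base_reverse_dns",
        fields_to_lowercase=["base_reverse_dns"],
        case_insensitive_sort=True,
        required_fields=["base_reverse_dns", "name", "type"],
        allowed_values={"type": ["Email Security", "Marketing", "ISP"]},
    )
except CSVValidationError as e:
    print(f"Validation failed:\n{e}")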

95
parsedmarc/s3.py Normal file

@@ -0,0 +1,95 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Any
import json
import boto3
from parsedmarc.log import logger
from parsedmarc.utils import human_timestamp_to_datetime
class S3Client(object):
"""A client for interacting with Amazon S3"""
def __init__(
self,
bucket_name: str,
bucket_path: str,
region_name: str,
endpoint_url: str,
access_key_id: str,
secret_access_key: str,
):
"""
Initializes the S3Client
Args:
bucket_name (str): The S3 Bucket
bucket_path (str): The path to save reports
region_name (str): The region name
endpoint_url (str): The endpoint URL
access_key_id (str): The access key id
secret_access_key (str): The secret access key
"""
self.bucket_name = bucket_name
self.bucket_path = bucket_path
self.metadata_keys = [
"org_name",
"org_email",
"report_id",
"begin_date",
"end_date",
]
# https://github.com/boto/boto3/blob/1.24.7/boto3/session.py#L312
self.s3 = boto3.resource(
"s3",
region_name=region_name,
endpoint_url=endpoint_url,
aws_access_key_id=access_key_id,
aws_secret_access_key=secret_access_key,
)
self.bucket: Any = self.s3.Bucket(self.bucket_name)
def save_aggregate_report_to_s3(self, report: dict[str, Any]):
self.save_report_to_s3(report, "aggregate")
def save_forensic_report_to_s3(self, report: dict[str, Any]):
self.save_report_to_s3(report, "forensic")
def save_smtp_tls_report_to_s3(self, report: dict[str, Any]):
self.save_report_to_s3(report, "smtp_tls")
def save_report_to_s3(self, report: dict[str, Any], report_type: str):
if report_type == "smtp_tls":
report_date = report["begin_date"]
report_id = report["report_id"]
else:
report_date = human_timestamp_to_datetime(
report["report_metadata"]["begin_date"]
)
report_id = report["report_metadata"]["report_id"]
path_template = "{0}/{1}/year={2}/month={3:02d}/day={4:02d}/{5}.json"
object_path = path_template.format(
self.bucket_path,
report_type,
report_date.year,
report_date.month,
report_date.day,
report_id,
)
logger.debug(
"Saving {0} report to s3://{1}/{2}".format(
report_type, self.bucket_name, object_path
)
)
object_metadata = {
k: v
for k, v in report["report_metadata"].items()
if k in self.metadata_keys
}
self.bucket.put_object(
Body=json.dumps(report), Key=object_path, Metadata=object_metadata
)
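A minimal usage sketch for this client; bucket and path values are placeholders, and passing None for the endpoint and keys lets boto3 fall back to its normal credential and configuration chain:

from parsedmarc.s3 import S3Client

s3_client = S3Client(
    bucket_name="my-dmarc-reports",
    bucket_path="parsedmarc",
    region_name="us-east-1",
    endpoint_url=None,
    access_key_id=None,
    secret_access_key=None,
)
# aggregate_report is assumed to be a parsed aggregate report dict produced by parsedmarc
s3_client.save_aggregate_report_to_s3(aggregate_report)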

211
parsedmarc/splunk.py Normal file

@@ -0,0 +1,211 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Any, Union
from urllib.parse import urlparse
import socket
import json
import urllib3
import requests
from parsedmarc.constants import USER_AGENT
from parsedmarc.log import logger
from parsedmarc.utils import human_timestamp_to_unix_timestamp
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
class SplunkError(RuntimeError):
"""Raised when a Splunk API error occurs"""
class HECClient(object):
"""A client for a Splunk HTTP Events Collector (HEC)"""
# http://docs.splunk.com/Documentation/Splunk/latest/Data/AboutHEC
# http://docs.splunk.com/Documentation/Splunk/latest/RESTREF/RESTinput#services.2Fcollector
def __init__(
self,
url: str,
access_token: str,
index: str,
source: str = "parsedmarc",
verify=True,
timeout=60,
):
"""
Initializes the HECClient
Args:
url (str): The URL of the HEC
access_token (str): The HEC access token
index (str): The name of the index
source (str): The source name
verify (bool): Verify SSL certificates
timeout (float): Number of seconds to wait for the server to send
data before giving up
"""
parsed_url = urlparse(url)
self.url = "{0}://{1}/services/collector/event/1.0".format(
parsed_url.scheme, parsed_url.netloc
)
self.access_token = access_token.lstrip("Splunk ")
self.index = index
self.host = socket.getfqdn()
self.source = source
self.session = requests.Session()
self.timeout = timeout
self.session.verify = verify
self._common_data: dict[str, Union[str, int, float, dict]] = dict(
host=self.host, source=self.source, index=self.index
)
self.session.headers = {
"User-Agent": USER_AGENT,
"Authorization": "Splunk {0}".format(self.access_token),
}
def save_aggregate_reports_to_splunk(
self,
aggregate_reports: Union[list[dict[str, Any]], dict[str, Any]],
):
"""
Saves aggregate DMARC reports to Splunk
Args:
aggregate_reports: A list of aggregate report dictionaries
to save in Splunk
"""
logger.debug("Saving aggregate reports to Splunk")
if isinstance(aggregate_reports, dict):
aggregate_reports = [aggregate_reports]
if len(aggregate_reports) < 1:
return
data = self._common_data.copy()
json_str = ""
for report in aggregate_reports:
for record in report["records"]:
new_report: dict[str, Union[str, int, float, dict]] = dict()
for metadata in report["report_metadata"]:
new_report[metadata] = report["report_metadata"][metadata]
new_report["interval_begin"] = record["interval_begin"]
new_report["interval_end"] = record["interval_end"]
new_report["normalized_timespan"] = record["normalized_timespan"]
new_report["published_policy"] = report["policy_published"]
new_report["source_ip_address"] = record["source"]["ip_address"]
new_report["source_country"] = record["source"]["country"]
new_report["source_reverse_dns"] = record["source"]["reverse_dns"]
new_report["source_base_domain"] = record["source"]["base_domain"]
new_report["source_type"] = record["source"]["type"]
new_report["source_name"] = record["source"]["name"]
new_report["message_count"] = record["count"]
new_report["disposition"] = record["policy_evaluated"]["disposition"]
new_report["spf_aligned"] = record["alignment"]["spf"]
new_report["dkim_aligned"] = record["alignment"]["dkim"]
new_report["passed_dmarc"] = record["alignment"]["dmarc"]
new_report["header_from"] = record["identifiers"]["header_from"]
new_report["envelope_from"] = record["identifiers"]["envelope_from"]
if "dkim" in record["auth_results"]:
new_report["dkim_results"] = record["auth_results"]["dkim"]
if "spf" in record["auth_results"]:
new_report["spf_results"] = record["auth_results"]["spf"]
data["sourcetype"] = "dmarc:aggregate"
timestamp = human_timestamp_to_unix_timestamp(
new_report["interval_begin"]
)
data["time"] = timestamp
data["event"] = new_report.copy()
json_str += "{0}\n".format(json.dumps(data))
if not self.session.verify:
logger.debug("Skipping certificate verification for Splunk HEC")
try:
response = self.session.post(self.url, data=json_str, timeout=self.timeout)
response = response.json()
except Exception as e:
raise SplunkError(e.__str__())
if response["code"] != 0:
raise SplunkError(response["text"])
def save_forensic_reports_to_splunk(
self,
forensic_reports: Union[list[dict[str, Any]], dict[str, Any]],
):
"""
Saves forensic DMARC reports to Splunk
Args:
forensic_reports (list): A list of forensic report dictionaries
to save in Splunk
"""
logger.debug("Saving forensic reports to Splunk")
if isinstance(forensic_reports, dict):
forensic_reports = [forensic_reports]
if len(forensic_reports) < 1:
return
json_str = ""
for report in forensic_reports:
data = self._common_data.copy()
data["sourcetype"] = "dmarc:forensic"
timestamp = human_timestamp_to_unix_timestamp(report["arrival_date_utc"])
data["time"] = timestamp
data["event"] = report.copy()
json_str += "{0}\n".format(json.dumps(data))
if not self.session.verify:
logger.debug("Skipping certificate verification for Splunk HEC")
try:
response = self.session.post(self.url, data=json_str, timeout=self.timeout)
response = response.json()
except Exception as e:
raise SplunkError(e.__str__())
if response["code"] != 0:
raise SplunkError(response["text"])
def save_smtp_tls_reports_to_splunk(
self, reports: Union[list[dict[str, Any]], dict[str, Any]]
):
"""
Saves SMTP TLS reports to Splunk
Args:
reports: A list of SMTP TLS report dictionaries
to save in Splunk
"""
logger.debug("Saving SMTP TLS reports to Splunk")
if isinstance(reports, dict):
reports = [reports]
if len(reports) < 1:
return
data = self._common_data.copy()
json_str = ""
for report in reports:
data["sourcetype"] = "smtp:tls"
timestamp = human_timestamp_to_unix_timestamp(report["begin_date"])
data["time"] = timestamp
data["event"] = report.copy()
json_str += "{0}\n".format(json.dumps(data))
if not self.session.verify:
logger.debug("Skipping certificate verification for Splunk HEC")
try:
response = self.session.post(self.url, data=json_str, timeout=self.timeout)
response = response.json()
except Exception as e:
raise SplunkError(e.__str__())
if response["code"] != 0:
raise SplunkError(response["text"])

57
parsedmarc/syslog.py Normal file

@@ -0,0 +1,57 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
import logging
import logging.handlers
from typing import Any
import json
from parsedmarc import (
parsed_aggregate_reports_to_csv_rows,
parsed_forensic_reports_to_csv_rows,
parsed_smtp_tls_reports_to_csv_rows,
)
class SyslogClient(object):
"""A client for Syslog"""
def __init__(self, server_name: str, server_port: int):
"""
Initializes the SyslogClient
Args:
server_name (str): The Syslog server
server_port (int): The Syslog UDP port
"""
self.server_name = server_name
self.server_port = server_port
self.logger = logging.getLogger("parsedmarc_syslog")
self.logger.setLevel(logging.INFO)
log_handler = logging.handlers.SysLogHandler(address=(server_name, server_port))
self.logger.addHandler(log_handler)
def save_aggregate_report_to_syslog(
self, aggregate_reports: list[dict[str, Any]]
):
rows = parsed_aggregate_reports_to_csv_rows(aggregate_reports)
for row in rows:
self.logger.info(json.dumps(row))
def save_forensic_report_to_syslog(
self, forensic_reports: list[dict[str, Any]]
):
rows = parsed_forensic_reports_to_csv_rows(forensic_reports)
for row in rows:
self.logger.info(json.dumps(row))
def save_smtp_tls_report_to_syslog(
self, smtp_tls_reports: list[dict[str, Any]]
):
rows = parsed_smtp_tls_reports_to_csv_rows(smtp_tls_reports)
for row in rows:
self.logger.info(json.dumps(row))
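A minimal usage sketch; the server name is a placeholder and 514 is the conventional syslog UDP port used by SysLogHandler:

from parsedmarc.syslog import SyslogClient

syslog_client = SyslogClient("syslog.example.com", 514)
# aggregate_reports is assumed to be a list of parsed aggregate report dicts
syslog_client.save_aggregate_report_to_syslog(aggregate_reports)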

721
parsedmarc/utils.py Normal file

@@ -0,0 +1,721 @@
# -*- coding: utf-8 -*-
"""Utility functions that might be useful for other projects"""
from __future__ import annotations
from typing import Optional, Union, TypedDict, Any
import logging
import os
from datetime import datetime
from datetime import timezone
from datetime import timedelta
from expiringdict import ExpiringDict
import tempfile
import subprocess
import shutil
import mailparser
import json
import hashlib
import base64
import mailbox
import re
import csv
import io
try:
from importlib.resources import files
except ImportError:
# Try backported to PY<3 `importlib_resources`
from importlib.resources import files
from dateutil.parser import parse as parse_date
import dns.reversename
import dns.resolver
import dns.exception
import geoip2.database
import geoip2.errors
import publicsuffixlist
import requests
from parsedmarc.log import logger
import parsedmarc.resources.dbip
import parsedmarc.resources.maps
from parsedmarc.constants import USER_AGENT
parenthesis_regex = re.compile(r"\s*\(.*\)\s*")
null_file = open(os.devnull, "w")
mailparser_logger = logging.getLogger("mailparser")
mailparser_logger.setLevel(logging.CRITICAL)
psl = publicsuffixlist.PublicSuffixList()
psl_overrides_path = str(files(parsedmarc.resources.maps).joinpath("psl_overrides.txt"))
with open(psl_overrides_path) as f:
psl_overrides = [line.rstrip() for line in f.readlines()]
while "" in psl_overrides:
psl_overrides.remove("")
class EmailParserError(RuntimeError):
"""Raised when an error parsing the email occurs"""
class DownloadError(RuntimeError):
"""Raised when an error occurs when downloading a file"""
class EmailAddress(TypedDict):
"""Parsed email address information"""
display_name: Optional[str]
address: str
local: Optional[str]
domain: Optional[str]
def decode_base64(data: str) -> bytes:
"""
Decodes a base64 string, with padding being optional
Args:
data (str): A base64 encoded string
Returns:
bytes: The decoded bytes
"""
data = bytes(data, encoding="ascii")
missing_padding = len(data) % 4
if missing_padding != 0:
data += b"=" * (4 - missing_padding)
return base64.b64decode(data)
def get_base_domain(domain: str) -> str:
"""
Gets the base domain name for the given domain
.. note::
Results are based on a list of public domain suffixes at
https://publicsuffix.org/list/public_suffix_list.dat and overrides included in
parsedmarc.resources.maps.psl_overrides.txt
Args:
domain (str): A domain or subdomain
Returns:
str: The base domain of the given domain
"""
domain = domain.lower()
publicsuffix = psl.privatesuffix(domain)
for override in psl_overrides:
if domain.endswith(override):
return override.strip(".").strip("-")
return publicsuffix
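# Illustrative examples (not part of this module): with the bundled public
# suffix list, get_base_domain("mail.example.co.uk") returns "example.co.uk",
# while an override such as ".amazonaws.com" in psl_overrides.txt maps
# "ec2-192-0-2-1.us-east-1.compute.amazonaws.com" to "amazonaws.com".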
def query_dns(
domain: str,
record_type: str,
*,
cache: Optional[ExpiringDict] = None,
nameservers: Optional[list[str]] = None,
timeout: float = 2.0,
) -> list[str]:
"""
Queries DNS
Args:
domain (str): The domain or subdomain to query about
record_type (str): The record type to query for
cache (ExpiringDict): Cache storage
nameservers (list): A list of one or more nameservers to use
(Cloudflare's public DNS resolvers by default)
timeout (float): Sets the DNS timeout in seconds
Returns:
list: A list of answers
"""
domain = str(domain).lower()
record_type = record_type.upper()
cache_key = "{0}_{1}".format(domain, record_type)
if cache:
records = cache.get(cache_key, None)
if records:
return records
resolver = dns.resolver.Resolver()
timeout = float(timeout)
if nameservers is None:
nameservers = [
"1.1.1.1",
"1.0.0.1",
"2606:4700:4700::1111",
"2606:4700:4700::1001",
]
resolver.nameservers = nameservers
resolver.timeout = timeout
resolver.lifetime = timeout
if record_type == "TXT":
resource_records = list(
map(
lambda r: r.strings,
resolver.resolve(domain, record_type, lifetime=timeout),
)
)
_resource_record = [
resource_record[0][:0].join(resource_record)
for resource_record in resource_records
if resource_record
]
records = [r.decode() for r in _resource_record]
else:
records = list(
map(
lambda r: r.to_text().replace('"', "").rstrip("."),
resolver.resolve(domain, record_type, lifetime=timeout),
)
)
if cache:
cache[cache_key] = records
return records
def get_reverse_dns(
ip_address,
*,
cache: Optional[ExpiringDict] = None,
nameservers: Optional[list[str]] = None,
timeout: float = 2.0,
) -> str:
"""
Resolves an IP address to a hostname using a reverse DNS query
Args:
ip_address (str): The IP address to resolve
cache (ExpiringDict): Cache storage
nameservers (list): A list of one or more nameservers to use
(Cloudflare's public DNS resolvers by default)
timeout (float): Sets the DNS query timeout in seconds
Returns:
str: The reverse DNS hostname (if any)
"""
hostname = None
try:
address = dns.reversename.from_address(ip_address)
hostname = query_dns(
address, "PTR", cache=cache, nameservers=nameservers, timeout=timeout
)[0]
except dns.exception.DNSException as e:
logger.warning(f"get_reverse_dns({ip_address}) exception: {e}")
pass
return hostname
def timestamp_to_datetime(timestamp: int) -> datetime:
"""
Converts a UNIX/DMARC timestamp to a Python ``datetime`` object
Args:
timestamp (int): The timestamp
Returns:
datetime: The converted timestamp as a Python ``datetime`` object
"""
return datetime.fromtimestamp(int(timestamp))
def timestamp_to_human(timestamp: int) -> str:
"""
Converts a UNIX/DMARC timestamp to a human-readable string
Args:
timestamp: The timestamp
Returns:
str: The converted timestamp in ``YYYY-MM-DD HH:MM:SS`` format
"""
return timestamp_to_datetime(timestamp).strftime("%Y-%m-%d %H:%M:%S")
def human_timestamp_to_datetime(
human_timestamp: str, *, to_utc: Optional[bool] = False
) -> datetime:
"""
Converts a human-readable timestamp into a Python ``datetime`` object
Args:
human_timestamp (str): A timestamp string
to_utc (bool): Convert the timestamp to UTC
Returns:
datetime: The converted timestamp
"""
human_timestamp = human_timestamp.replace("-0000", "")
human_timestamp = parenthesis_regex.sub("", human_timestamp)
dt = parse_date(human_timestamp)
return dt.astimezone(timezone.utc) if to_utc else dt
def human_timestamp_to_unix_timestamp(human_timestamp: str) -> float:
"""
Converts a human-readable timestamp into a UNIX timestamp
Args:
human_timestamp (str): A timestamp in ``YYYY-MM-DD HH:MM:SS`` format
Returns:
float: The converted timestamp
"""
human_timestamp = human_timestamp.replace("T", " ")
return human_timestamp_to_datetime(human_timestamp).timestamp()
def get_ip_address_country(ip_address: str, *, db_path: Optional[str] = None) -> str:
"""
Returns the ISO code for the country associated
with the given IPv4 or IPv6 address
Args:
ip_address (str): The IP address to query for
db_path (str): Path to a MMDB file from MaxMind or DBIP
Returns:
str: An ISO country code associated with the given IP address
"""
db_paths = [
"GeoLite2-Country.mmdb",
"/usr/local/share/GeoIP/GeoLite2-Country.mmdb",
"/usr/share/GeoIP/GeoLite2-Country.mmdb",
"/var/lib/GeoIP/GeoLite2-Country.mmdb",
"/var/local/lib/GeoIP/GeoLite2-Country.mmdb",
"/usr/local/var/GeoIP/GeoLite2-Country.mmdb",
"%SystemDrive%\\ProgramData\\MaxMind\\GeoIPUpdate\\GeoIP\\"
"GeoLite2-Country.mmdb",
"C:\\GeoIP\\GeoLite2-Country.mmdb",
"dbip-country-lite.mmdb",
"dbip-country.mmdb",
]
if db_path is not None:
if not os.path.isfile(db_path):
logger.warning(
f"No file exists at {db_path}. Falling back to an "
"included copy of the DBIP IP to Country "
"Lite database."
)
db_path = None
if db_path is None:
for system_path in db_paths:
if os.path.exists(system_path):
db_path = system_path
break
if db_path is None:
db_path = str(
files(parsedmarc.resources.dbip).joinpath("dbip-country-lite.mmdb")
)
db_age = datetime.now() - datetime.fromtimestamp(os.stat(db_path).st_mtime)
if db_age > timedelta(days=30):
logger.warning("IP database is more than a month old")
db_reader = geoip2.database.Reader(db_path)
country = None
try:
country = db_reader.country(ip_address).country.iso_code
except geoip2.errors.AddressNotFoundError:
pass
return country
def get_service_from_reverse_dns_base_domain(
base_domain: str,
*,
always_use_local_file: Optional[bool] = False,
local_file_path: Optional[str] = None,
url: Optional[str] = None,
offline: Optional[bool] = False,
reverse_dns_map: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
"""
Returns the service name of a given base domain name from reverse DNS.
Args:
base_domain (str): The base domain of the reverse DNS lookup
always_use_local_file (bool): Always use a local map file
local_file_path (str): Path to a local map file
url (str): URL to a reverse DNS map file
offline (bool): Use the built-in copy of the reverse DNS map
reverse_dns_map (dict): A reverse DNS map
Returns:
dict: A dictionary containing name and type.
If the service is unknown, the name will be
the supplied base_domain and the type will be None
"""
def load_csv(_csv_file):
reader = csv.DictReader(_csv_file)
for row in reader:
key = row["base_reverse_dns"].lower().strip()
reverse_dns_map[key] = dict(name=row["name"], type=row["type"])
base_domain = base_domain.lower().strip()
if url is None:
url = (
"https://raw.githubusercontent.com/domainaware"
"/parsedmarc/master/parsedmarc/"
"resources/maps/base_reverse_dns_map.csv"
)
if reverse_dns_map is None:
reverse_dns_map = dict()
csv_file = io.StringIO()
if not (offline or always_use_local_file) and len(reverse_dns_map) == 0:
try:
logger.debug(f"Trying to fetch reverse DNS map from {url}...")
headers = {"User-Agent": USER_AGENT}
response = requests.get(url, headers=headers)
response.raise_for_status()
csv_file.write(response.text)
csv_file.seek(0)
load_csv(csv_file)
except requests.exceptions.RequestException as e:
logger.warning(f"Failed to fetch reverse DNS map: {e}")
except Exception:
logger.warning("Not a valid CSV file")
csv_file.seek(0)
logging.debug("Response body:")
logger.debug(csv_file.read())
if len(reverse_dns_map) == 0:
logger.info("Loading included reverse DNS map...")
path = str(
files(parsedmarc.resources.maps).joinpath("base_reverse_dns_map.csv")
)
if local_file_path is not None:
path = local_file_path
with open(path) as csv_file:
load_csv(csv_file)
try:
service = reverse_dns_map[base_domain]
except KeyError:
service = dict(name=base_domain, type=None)
return service
def get_ip_address_info(
    ip_address: str,
    *,
    ip_db_path: Optional[str] = None,
    reverse_dns_map_path: Optional[str] = None,
    always_use_local_files: Optional[bool] = False,
    reverse_dns_map_url: Optional[str] = None,
    cache: Optional[ExpiringDict] = None,
    reverse_dns_map: Optional[dict[str, Any]] = None,
    offline: Optional[bool] = False,
    nameservers: Optional[list[str]] = None,
    timeout: Optional[float] = 2.0,
) -> dict[str, Any]:
    """
    Returns reverse DNS and country information for the given IP address

    Args:
        ip_address (str): The IP address to check
        ip_db_path (str): path to a MMDB file from MaxMind or DBIP
        reverse_dns_map_path (str): Path to a reverse DNS map file
        reverse_dns_map_url (str): URL to the reverse DNS map file
        always_use_local_files (bool): Do not download files
        cache (ExpiringDict): Cache storage
        reverse_dns_map (dict): A reverse DNS map
        offline (bool): Do not make online queries for geolocation or DNS
        nameservers (list): A list of one or more nameservers to use
            (Cloudflare's public DNS resolvers by default)
        timeout (float): Sets the DNS timeout in seconds

    Returns:
        dict: ``ip_address``, ``reverse_dns``, ``country``, ``base_domain``,
        ``name``, and ``type``
    """
    ip_address = ip_address.lower()
    if cache is not None:
        info = cache.get(ip_address, None)
        if info:
            logger.debug(f"IP address {ip_address} was found in cache")
            return info
    info = dict()
    info["ip_address"] = ip_address
    if offline:
        reverse_dns = None
    else:
        reverse_dns = get_reverse_dns(
            ip_address, nameservers=nameservers, timeout=timeout
        )
    country = get_ip_address_country(ip_address, db_path=ip_db_path)
    info["country"] = country
    info["reverse_dns"] = reverse_dns
    info["base_domain"] = None
    info["name"] = None
    info["type"] = None
    if reverse_dns is not None:
        base_domain = get_base_domain(reverse_dns)
        if base_domain is not None:
            service = get_service_from_reverse_dns_base_domain(
                base_domain,
                offline=offline,
                local_file_path=reverse_dns_map_path,
                url=reverse_dns_map_url,
                always_use_local_file=always_use_local_files,
                reverse_dns_map=reverse_dns_map,
            )
            info["base_domain"] = base_domain
            info["type"] = service["type"]
            info["name"] = service["name"]
        if cache is not None:
            cache[ip_address] = info
            logger.debug(f"IP address {ip_address} added to cache")
    else:
        logger.debug(f"IP address {ip_address} reverse_dns not found")
    return info
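
A brief, hypothetical usage sketch (not part of the file): only the IP address is required, and the remaining arguments are keyword-only per the signature above.

# Illustrative sketch only; without an MMDB file the country may be None,
# and offline=False (the default) allows the reverse DNS lookup.
info = get_ip_address_info("8.8.8.8", nameservers=["1.1.1.1"], timeout=2.0)
print(info["reverse_dns"], info["country"], info["name"], info["type"])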
def parse_email_address(original_address: str) -> EmailAddress:
    if original_address[0] == "":
        display_name = None
    else:
        display_name = original_address[0]
    address = original_address[1]
    address_parts = address.split("@")
    local = None
    domain = None
    if len(address_parts) > 1:
        local = address_parts[0].lower()
        domain = address_parts[-1].lower()

    return {
        "display_name": display_name,
        "address": address,
        "local": local,
        "domain": domain,
    }
def get_filename_safe_string(string: str) -> str:
    """
    Converts a string to a string that is safe for a filename

    Args:
        string (str): A string to make safe for a filename

    Returns:
        str: A string safe for a filename
    """
    invalid_filename_chars = ["\\", "/", ":", '"', "*", "?", "|", "\n", "\r"]
    if string is None:
        string = "None"
    for char in invalid_filename_chars:
        string = string.replace(char, "")
    string = string.rstrip(".")
    string = (string[:100]) if len(string) > 100 else string

    return string
def is_mbox(path: str) -> bool:
    """
    Checks if the file at the given path is an MBOX mailbox file

    Args:
        path (str): Path of the file to check

    Returns:
        bool: A flag that indicates if the file is an MBOX mailbox file
    """
    _is_mbox = False
    try:
        mbox = mailbox.mbox(path)
        if len(mbox.keys()) > 0:
            _is_mbox = True
    except Exception as e:
        logger.debug("Error checking for MBOX file: {0}".format(e.__str__()))

    return _is_mbox
def is_outlook_msg(content: Union[bytes, Any]) -> bool:
    """
    Checks if the given content is an Outlook msg OLE/MSG file

    Args:
        content: Content to check

    Returns:
        bool: A flag that indicates if the file is an Outlook MSG file
    """
    return isinstance(content, bytes) and content.startswith(
        b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"
    )
def convert_outlook_msg(msg_bytes: bytes) -> str:
    """
    Uses the ``msgconvert`` Perl utility to convert an Outlook MSG file to
    standard RFC 822 format

    Args:
        msg_bytes (bytes): the content of the .msg file

    Returns:
        An RFC 822 string
    """
    if not is_outlook_msg(msg_bytes):
        raise ValueError("The supplied bytes are not an Outlook MSG file")
    orig_dir = os.getcwd()
    tmp_dir = tempfile.mkdtemp()
    os.chdir(tmp_dir)
    with open("sample.msg", "wb") as msg_file:
        msg_file.write(msg_bytes)
    rfc822_bytes: bytes
    try:
        subprocess.check_call(
            ["msgconvert", "sample.msg"], stdout=null_file, stderr=null_file
        )
        eml_path = "sample.eml"
        with open(eml_path, "rb") as eml_file:
            rfc822_bytes = eml_file.read()
    except FileNotFoundError:
        raise EmailParserError(
            "Failed to convert Outlook MSG: msgconvert utility not found"
        )
    finally:
        os.chdir(orig_dir)
        shutil.rmtree(tmp_dir)

    return rfc822_bytes.decode("utf-8", errors="replace")
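
A short, hypothetical sketch of calling the converter directly (not part of the file); it assumes the msgconvert Perl utility is installed and on PATH, otherwise the EmailParserError above is raised.

# Illustrative sketch only; "report.msg" is a placeholder path.
with open("report.msg", "rb") as msg_file:
    msg_bytes = msg_file.read()
if is_outlook_msg(msg_bytes):
    rfc822_message = convert_outlook_msg(msg_bytes)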
def parse_email(
    data: Union[bytes, str], *, strip_attachment_payloads: Optional[bool] = False
) -> dict[str, Any]:
    """
    A simplified email parser

    Args:
        data: The RFC 822 message string, or MSG binary
        strip_attachment_payloads (bool): Remove attachment payloads

    Returns:
        dict: Parsed email data
    """
    if isinstance(data, bytes):
        if is_outlook_msg(data):
            data = convert_outlook_msg(data)
        else:
            data = data.decode("utf-8", errors="replace")
    parsed_email = mailparser.parse_from_string(data)
    headers = json.loads(parsed_email.headers_json).copy()
    parsed_email = json.loads(parsed_email.mail_json).copy()
    parsed_email["headers"] = headers

    if "received" in parsed_email:
        for received in parsed_email["received"]:
            if "date_utc" in received:
                if received["date_utc"] is None:
                    del received["date_utc"]
                else:
                    received["date_utc"] = received["date_utc"].replace("T", " ")

    if "from" not in parsed_email:
        if "From" in parsed_email["headers"]:
            parsed_email["from"] = parsed_email["headers"]["From"]
        else:
            parsed_email["from"] = None

    if parsed_email["from"] is not None:
        parsed_email["from"] = parse_email_address(parsed_email["from"][0])

    if "date" in parsed_email:
        parsed_email["date"] = parsed_email["date"].replace("T", " ")
    else:
        parsed_email["date"] = None

    if "reply_to" in parsed_email:
        parsed_email["reply_to"] = list(
            map(lambda x: parse_email_address(x), parsed_email["reply_to"])
        )
    else:
        parsed_email["reply_to"] = []

    if "to" in parsed_email:
        parsed_email["to"] = list(
            map(lambda x: parse_email_address(x), parsed_email["to"])
        )
    else:
        parsed_email["to"] = []

    if "cc" in parsed_email:
        parsed_email["cc"] = list(
            map(lambda x: parse_email_address(x), parsed_email["cc"])
        )
    else:
        parsed_email["cc"] = []

    if "bcc" in parsed_email:
        parsed_email["bcc"] = list(
            map(lambda x: parse_email_address(x), parsed_email["bcc"])
        )
    else:
        parsed_email["bcc"] = []

    if "delivered_to" in parsed_email:
        parsed_email["delivered_to"] = list(
            map(lambda x: parse_email_address(x), parsed_email["delivered_to"])
        )

    if "attachments" not in parsed_email:
        parsed_email["attachments"] = []
    else:
        for attachment in parsed_email["attachments"]:
            if "payload" in attachment:
                payload = attachment["payload"]
                try:
                    if "content_transfer_encoding" in attachment:
                        if attachment["content_transfer_encoding"] == "base64":
                            payload = decode_base64(payload)
                        else:
                            payload = str.encode(payload)
                    attachment["sha256"] = hashlib.sha256(payload).hexdigest()
                except Exception as e:
                    logger.debug("Unable to decode attachment: {0}".format(e.__str__()))
        if strip_attachment_payloads:
            for attachment in parsed_email["attachments"]:
                if "payload" in attachment:
                    del attachment["payload"]

    if "subject" not in parsed_email:
        parsed_email["subject"] = None

    parsed_email["filename_safe_subject"] = get_filename_safe_string(
        parsed_email["subject"]
    )

    if "body" not in parsed_email:
        parsed_email["body"] = None

    return parsed_email
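
A brief, hypothetical usage sketch (not part of the file), assuming an RFC 822 message saved on disk; MSG bytes would be converted automatically by the branch at the top of the function.

# Illustrative sketch only; "sample.eml" is a placeholder path.
with open("sample.eml", "rb") as eml_file:
    parsed = parse_email(eml_file.read(), strip_attachment_payloads=True)
print(parsed["subject"], parsed["date"], len(parsed["attachments"]))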

parsedmarc/webhook.py Normal file

@@ -0,0 +1,65 @@
# -*- coding: utf-8 -*-

from __future__ import annotations

from typing import Any, Optional, Union

import requests

from parsedmarc import logger
from parsedmarc.constants import USER_AGENT


class WebhookClient(object):
    """A client for webhooks"""

    def __init__(
        self,
        aggregate_url: str,
        forensic_url: str,
        smtp_tls_url: str,
        timeout: Optional[int] = 60,
    ):
        """
        Initializes the WebhookClient

        Args:
            aggregate_url (str): The aggregate report webhook url
            forensic_url (str): The forensic report webhook url
            smtp_tls_url (str): The smtp_tls report webhook url
            timeout (int): The timeout to use when calling the webhooks
        """
        self.aggregate_url = aggregate_url
        self.forensic_url = forensic_url
        self.smtp_tls_url = smtp_tls_url
        self.timeout = timeout
        self.session = requests.Session()
        self.session.headers = {
            "User-Agent": USER_AGENT,
            "Content-Type": "application/json",
        }

    def save_forensic_report_to_webhook(self, report: str):
        try:
            self._send_to_webhook(self.forensic_url, report)
        except Exception as error_:
            logger.error("Webhook Error: {0}".format(error_.__str__()))

    def save_smtp_tls_report_to_webhook(self, report: str):
        try:
            self._send_to_webhook(self.smtp_tls_url, report)
        except Exception as error_:
            logger.error("Webhook Error: {0}".format(error_.__str__()))

    def save_aggregate_report_to_webhook(self, report: str):
        try:
            self._send_to_webhook(self.aggregate_url, report)
        except Exception as error_:
            logger.error("Webhook Error: {0}".format(error_.__str__()))

    def _send_to_webhook(
        self, webhook_url: str, payload: Union[bytes, str, dict[str, Any]]
    ):
        try:
            self.session.post(webhook_url, data=payload, timeout=self.timeout)
        except Exception as error_:
            logger.error("Webhook Error: {0}".format(error_.__str__()))

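A minimal, hypothetical usage sketch for the new client (not part of the file); the URLs are placeholders and the report payload is assumed to already be a JSON string, matching the Content-Type header the session sets.

# Illustrative sketch only; the URLs and report_json are placeholders.
client = WebhookClient(
    aggregate_url="https://example.com/hooks/dmarc/aggregate",
    forensic_url="https://example.com/hooks/dmarc/forensic",
    smtp_tls_url="https://example.com/hooks/smtp-tls",
    timeout=30,
)
report_json = '{"report": "example"}'
client.save_aggregate_report_to_webhook(report_json)
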
publish-docs.sh Executable file

@@ -0,0 +1,10 @@
#!/bin/bash
git pull
cd ../parsedmarc-docs || exit
git pull
cd ../parsedmarc || exit
./build.sh
cd ../parsedmarc-docs || exit
git add .
git commit -m "Update docs"
git push

pyproject.toml Normal file

@@ -0,0 +1,98 @@
[build-system]
requires = [
    "hatchling>=1.27.0",
]
build-backend = "hatchling.build"
[project]
name = "parsedmarc"
dynamic = [
"version",
]
description = "A Python package and CLI for parsing aggregate and forensic DMARC reports"
readme = "README.md"
license = "Apache-2.0"
authors = [
{ name = "Sean Whalen", email = "whalenster@gmail.com" },
]
keywords = [
"DMARC",
"parser",
"reporting",
]
classifiers = [
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Developers",
"Intended Audience :: Information Technology",
"License :: OSI Approved :: Apache Software License",
"Operating System :: OS Independent",
"Programming Language :: Python :: 3"
]
requires-python = ">=3.9, <3.14"
dependencies = [
"azure-identity>=1.8.0",
"azure-monitor-ingestion>=1.0.0",
"boto3>=1.16.63",
"dateparser>=1.1.1",
"dnspython>=2.0.0",
"elasticsearch-dsl==7.4.0",
"elasticsearch<7.14.0",
"expiringdict>=1.1.4",
"geoip2>=3.0.0",
"google-api-core>=2.4.0",
"google-api-python-client>=2.35.0",
"google-auth-httplib2>=0.1.0",
"google-auth-oauthlib>=0.4.6",
"google-auth>=2.3.3",
"imapclient>=2.1.0",
"kafka-python-ng>=2.2.2",
"lxml>=4.4.0",
"mailsuite>=1.9.18",
"msgraph-core==0.2.2",
"opensearch-py>=2.4.2,<=3.0.0",
"publicsuffixlist>=0.10.0",
"pygelf>=0.4.2",
"requests>=2.22.0",
"tqdm>=4.31.1",
"urllib3>=1.25.7",
"xmltodict>=0.12.0",
"PyYAML>=6.0.3"
]
[project.optional-dependencies]
build = [
"hatch>=1.14.0",
"myst-parser[linkify]",
"nose",
"pytest",
"pytest-cov",
"ruff",
"sphinx",
"sphinx_rtd_theme",
]
[project.scripts]
parsedmarc = "parsedmarc.cli:_main"
[project.urls]
Homepage = "https://domainaware.github.io/parsedmarc"
[tool.hatch.version]
path = "parsedmarc/constants.py"
[tool.hatch.build.targets.sdist]
include = [
"/parsedmarc",
]
[tool.hatch.build]
exclude = [
"base_reverse_dns.csv",
"find_bad_utf8.py",
"find_unknown_base_reverse_dns.py",
"unknown_base_reverse_dns.csv",
"sortmaps.py",
"README.md",
"*.bak"
]


@@ -1,10 +0,0 @@
dnspython
requests
publicsuffix
xmltodict
geoip2
flake8
sphinx
sphinx_rtd_theme
collective.checkdocs
wheel


@@ -0,0 +1,40 @@
<?xml version="1.0" encoding="UTF-8" ?>
<feedback>
<report_metadata>
<org_name></org_name>
<email>administrator@accurateplastics.com</email>
<report_id>example.com:1538463741</report_id>
<date_range>
<begin>1538413632</begin>
<end>1538413632</end>
</date_range>
</report_metadata>
<policy_published>
<domain>example.com</domain>
<adkim>r</adkim>
<aspf>r</aspf>
<p>none</p>
<sp>reject</sp>
<pct>100</pct>
</policy_published>
<record>
<row>
<source_ip>12.20.127.122</source_ip>
<count>1</count>
<policy_evaluated>
<disposition>none</disposition>
<dkim>fail</dkim>
<spf>fail</spf>
</policy_evaluated>
</row>
<identifiers>
<header_from>example.com</header_from>
</identifiers>
<auth_results>
<spf>
<domain></domain>
<result>none</result>
</spf>
</auth_results>
</record>
</feedback>

File diff suppressed because it is too large


@@ -0,0 +1,181 @@
Received: from SN6PR04MB4480.namprd04.prod.outlook.com (2603:10b6:803:41::31)
by SN6PR04MB4477.namprd04.prod.outlook.com with HTTPS via
SN4PR0501CA0054.NAMPRD05.PROD.OUTLOOK.COM; Wed, 13 Feb 2019 10:48:13 +0000
Received: from DM5PR04CA0035.namprd04.prod.outlook.com (2603:10b6:3:12b::21)
by SN6PR04MB4480.namprd04.prod.outlook.com (2603:10b6:805:a5::17) with
Microsoft SMTP Server (version=TLS1_2,
cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1622.16; Wed, 13 Feb
2019 10:48:12 +0000
Received: from BY2NAM03FT014.eop-NAM03.prod.protection.outlook.com
(2a01:111:f400:7e4a::207) by DM5PR04CA0035.outlook.office365.com
(2603:10b6:3:12b::21) with Microsoft SMTP Server (version=TLS1_2,
cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.20.1622.16 via Frontend
Transport; Wed, 13 Feb 2019 10:48:12 +0000
Authentication-Results: spf=softfail (sender IP is 199.230.200.198)
smtp.mailfrom=google.com; cardinalhealth.mail.onmicrosoft.com; dkim=fail
(signature did not verify)
header.d=google.com;cardinalhealth.mail.onmicrosoft.com; dmarc=fail
action=oreject header.from=google.com;
Received-SPF: SoftFail (protection.outlook.com: domain of transitioning
google.com discourages use of 199.230.200.198 as permitted sender)
Received: from SMTP10.cardinalhealth.com (199.230.200.198) by
BY2NAM03FT014.mail.protection.outlook.com (10.152.84.239) with Microsoft SMTP
Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id
15.20.1580.10 via Frontend Transport; Wed, 13 Feb 2019 10:48:12 +0000
Received: from WPOH0019EXHYB01.cardinalhealth.net (10.212.25.115) by
smtp10.cardinalhealth.com (10.208.183.149) with Microsoft SMTP Server (TLS)
id 14.3.361.1; Wed, 13 Feb 2019 05:47:36 -0500
Received: from WPIL0210EXCAS23.cardinalhealth.net (161.244.3.66) by
WPOH0019EXHYB01.cardinalhealth.net (10.212.25.115) with Microsoft SMTP Server
(TLS) id 14.3.361.1; Wed, 13 Feb 2019 05:47:32 -0500
Received: from smtp2.cardinal.com (198.89.161.108) by
WPIL0210EXCAS23.cardinalhealth.net (161.244.3.66) with Microsoft SMTP Server
(TLS) id 14.3.361.1; Wed, 13 Feb 2019 04:47:32 -0600
Authentication-Results-Original: smtp2.cardinal.com; spf=Pass
smtp.mailfrom=noreply-dmarc-support@google.com
Received-SPF: Pass (smtp2.cardinal.com: domain of
noreply-dmarc-support@google.com designates 209.85.166.201 as
permitted sender) identity=mailfrom;
client-ip=209.85.166.201; receiver=smtp2.cardinal.com;
envelope-from="noreply-dmarc-support@google.com";
x-sender="noreply-dmarc-support@google.com";
x-conformance=spf_only; x-record-type="v=spf1"
X-SenderGrp: None
X-SBRS: 3.5
X-ExtWarning: Yes
X-SLBL-Result: SAFE-LISTED
X-Amp-Result: UNKNOWN
X-Amp-Original-Verdict: FILE UNKNOWN
X-Amp-File-Uploaded: False
IronPort-PHdr: =?us-ascii?q?9a23=3AQPa/HRJPWXWEsohNPdmcpTZcNBhigK39O0su0rRi?=
=?us-ascii?q?jrtPdqq5+JG7Zh7Q4vJiiFKPVoLeuatJ?=
X-IPAS-Result: =?us-ascii?q?A0HNBQBz9WNch8mmVdFjHAEBAR8EAQEFAQEGEIE2AoJpgQM?=
=?us-ascii?q?nh0+FPIsFlRGCXYIygSQDGCoSCAEEGAMTgQKDOQECg1YhATQJDQEDAQEBAQEBA?=
=?us-ascii?q?QECAhABAQEKCwkIKSMIBIJ0BE06MAEBAQEBAQEBAQEBAQEBAQEBAQEBAQEUAjM?=
=?us-ascii?q?XKQUYGwwBAThiMgEFAYNZAYIBngU8iikBAQGCCxOCeAEBBXSBFzuDPIEdBwgSg?=
=?us-ascii?q?luFFIF9gkSCFoFXgkyBJoF4AoF4NoR5GoEshjgJiViIBIhVgQsJgXaBRIEPgRu?=
=?us-ascii?q?BVosZGYFeAWOQLRmYRgGDVQ8hgSVugSFwXIESgU4JIxeBczSDIIp0ITCPbwEB?=
X-IronPort-Outbreak-Status: No, level 0, Unknown - Unknown
Received: from mail-it1-f201.google.com ([209.85.166.201]) by
smtp2.cardinal.com with ESMTP/TLS/AES128-GCM-SHA256; 13 Feb 2019 04:47:15
-0600
Received: by mail-it1-f201.google.com with SMTP id p21so3350239itb.8
for <dmarcreports@cardinalhealth.com>; Wed, 13 Feb 2019 02:47:15 -0800
(PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=google.com; s=20161025;
h=mime-version:date:message-id:subject:from:to:content-disposition
:content-transfer-encoding;
bh=9auxxiIHA4359CcIJ8D48sw6ZXttCPhnCsgP3zpHWC4=;
b=QSiY0EPGhBY11lvmfniPOmA71xd6uAv17KYGbEtmOtocmGen1BQ96kZA2rNtAtZDUx
IfLoJgkzO31GmVXh9k0kBi+r8vR16zXebZHuBgfBesayykHOfSSWOTZtwSukaWV3RChV
PPMRPMksnrITfFHNo3u6xbcx6usplxn8XS8XyENgua30BBjweJXYMrQrRkrjgLv+JpoY
o6fVvAtcuSnwtm3fv9j1GsqSK05sw2aVFZkJLP9DvMfhW+bXJJ2rVp4MvVqlleua20Ro
Y0vbFMWtbvFZseSOc+AYGvv6oL9eBilXal26kS2BrRJQ+B4Yt4GFiRDbjF4QqVSTHOd4
YDSw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=1e100.net; s=20161025;
h=x-gm-message-state:mime-version:date:message-id:subject:from:to
:content-disposition:content-transfer-encoding;
bh=9auxxiIHA4359CcIJ8D48sw6ZXttCPhnCsgP3zpHWC4=;
b=lqkAygJJ/s8iZJI+AnwWegB47wiLE67qFfi26J+5Uu5lQuObEqK2KTlBZGwnd51c8R
A2y47YQ9RqBKWTibQH9pLBKT5ChyxP/+Vk8e3wK+IfU720wG8P2eDXn91rBSBNIXOTOn
McNwB/z6Ukurt8GFiy4aMvni0t3yWa4/xL0c5zFQJAgwm01jhxOjgOpnnqF0ppYatzf4
5maCMzhvRJ9FFsDhk8sE0pJjdaWS9ybDGKOteSYr1wlGSnydTkt2z6z38IO8fgVJwT3e
AUiqhNgNmDdyNI3Aom8dnfJHsyqjKC3iTXGxmqsMDVui4faHwOqMx2lgjuv7bbQFyv1K
b//w==
X-Gm-Message-State: AHQUAubgr+0/AsmLjETdSMNb9+rAZ5j0xlExGd75CusDHQJl4eIy2avN
XnhZqrYsvbPhKCNFyDE0KQ==
X-Google-Smtp-Source: AHgI3IZS0dawFR+Q0mnSaOenxA+M5W6V6z1dMorYgiX855zYf4aqZyS2Jjol+TCgY3f/lSsVDbuiefhqmZNtNA==
MIME-Version: 1.0
X-Received: by 2002:a24:1c87:: with SMTP id c129mr1998419itc.11.1550054834929;
Wed, 13 Feb 2019 02:47:14 -0800 (PST)
Date: Tue, 12 Feb 2019 15:59:59 -0800
Message-ID: <949348866075514174@google.com>
Subject: Report domain: borschow.com Submitter: google.com Report-ID: 949348866075514174
From: <noreply-dmarc-support@google.com>
To: <dmarcreports@cardinalhealth.com>
name="google.com!borschow.com!1549929600!1550015999.zip"
filename="google.com!borschow.com!1549929600!1550015999.zip"
Return-Path: noreply-dmarc-support@google.com
X-OrganizationHeadersPreserved: WPIL0210EXCAS23.cardinalhealth.net
X-MS-Exchange-Organization-ExpirationStartTime: 13 Feb 2019 10:48:12.5214
(UTC)
X-MS-Exchange-Organization-ExpirationStartTimeReason: OriginalSubmit
X-MS-Exchange-Organization-ExpirationInterval: 2:00:00:00.0000000
X-MS-Exchange-Organization-ExpirationIntervalReason: OriginalSubmit
X-MS-Exchange-Organization-Network-Message-Id:
e9371fc9-cf12-4fcb-502a-08d691a0c038
X-EOPAttributedMessage: 0
X-MS-Exchange-Organization-MessageDirectionality: Originating
X-CrossPremisesHeadersPromoted:
BY2NAM03FT014.eop-NAM03.prod.protection.outlook.com
X-CrossPremisesHeadersFiltered:
BY2NAM03FT014.eop-NAM03.prod.protection.outlook.com
X-Forefront-Antispam-Report:
CIP:199.230.200.198;IPV:NLI;CTRY:US;EFV:NLI;SFV:SKN;SFS:;DIR:INB;SFP:;SCL:-1;SRVR:SN6PR04MB4480;H:SMTP10.cardinalhealth.com;FPR:;SPF:None;LANG:en;
X-Microsoft-Exchange-Diagnostics:
1;BY2NAM03FT014;1:9gD0HAosLjXNiAwpOsmGDn2zgTJAIEDY0eKyvNoIEz6oio7FckNeMUQRNa3AUeC/Qil0Sn2rntyy6LjTvutGN6e2BsGQyNaiKEsKI3so3l0Kpb9i3CdYF21/lNVHrjKuxxEJ8F7WUBlR88jcBwDjl6x0CO2FBckAmUnBJOJv2zg=
X-MS-Exchange-Organization-AuthSource: WPIL0210EXCAS23.cardinalhealth.net
X-MS-Exchange-Organization-AuthAs: Anonymous
X-OriginatorOrg: cardinalhealth.onmicrosoft.com
X-MS-PublicTrafficType: Email
X-MS-Office365-Filtering-Correlation-Id: e9371fc9-cf12-4fcb-502a-08d691a0c038
X-Microsoft-Antispam:
BCL:0;PCL:0;RULEID:(2390118)(7020095)(4652040)(8989299)(4534185)(4627221)(201703031133081)(8559020)(8990200)(5600110)(711020)(4605077)(4710068)(4711035)(2017052603328)(7153060)(49563074);SRVR:SN6PR04MB4480;
X-MS-TrafficTypeDiagnostic: SN6PR04MB4480:
X-MS-Exchange-Organization-SCL: -1
X-Microsoft-Exchange-Diagnostics:
1;SN6PR04MB4480;20:9lFp0O5yeS9rEVtgFCaVjrPpXZAA0REuLHin4MfFWihk274IOvh7BRRMQfNNBwtC3q0+vTeNPc37wIBQlwVq6T7j1bNe06DnEjGgP5GWNU7zOUjt6qeq21ebYk/RV2QcTM85ZcFxr8SF2bGAKcNkj7GQLnnogH5o1GotLtqwXOht4qNZxhp46eCrIt+yQJJHFJyHFoflM9+z2WAYSl6yY8Wauhp05LBIqjduLdEN6MmU7bpPrzOmBrc4nuOmA4s1D8A3KdzBCdx0wIEwBv1zq6RjAB43UrfhpwMmh07U60CU/0QPhSrEBfn19eZLv4eTRJozsA313tp/mPylGCegahxmClixw/ku/GENI6pWxTCz3Jz1x8YCMLqJS7M+UOg7IosLPbUr26Q2CtSduue1vhk6v8peX5c5di6b9WftMKup3YMESA0RrqNbS6HbBCmH+iHSUwWjTBVva4L0fGiG82SbFbkH/UH+ZoFzkFnMtDZwDK+9pK+oHS2k97XwVzzYrzFh98JBdnk6jI/L2MmHWldt75NqJjSifAf2P/PjlploUQ8CAHsZZSRx5cu5tLaptOzUq/NVXF23VavhTslxK5C0/ntAAZAAvhmR8v/FNFU=;23:m8XEp0VuraCd8j9yukaQaVZE+Ufz0qQ9v4369t8CewCI5dikl+UkdVfYaJsMrwHTAtH3it3YrgDXpo7py6m/RDwgDnLGxviIfy/puyEEN3Qq99TJUMn19W9S5U7VJ8DgMZsnEv6RVCrjf05bNshRyA==
X-Microsoft-Exchange-Diagnostics:
1;SN6PR04MB4480;20:fJacS7QTNHPZGJt2KoBiyZLfHf3R2G+vFZ1EUyyFaqoQUdJU3WDLMmHMUqn36br0oZNxyMkV05SJMoFAz3mVO2hO/nsUX6SR7RMpr5XHYxLD+tPbbTTT2aGzo5IR+GOrJc5l3z4uX34x0WdoggvjUhi6DWaqwMn/OnkEBJ7ozYg=
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 13 Feb 2019 10:48:12.1984
(UTC)
X-MS-Exchange-CrossTenant-Network-Message-Id: e9371fc9-cf12-4fcb-502a-08d691a0c038
X-MS-Exchange-CrossTenant-Id: a86dc674-e6a2-4c25-b4ba-e36c31382530
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=a86dc674-e6a2-4c25-b4ba-e36c31382530;Ip=[199.230.200.198];Helo=[SMTP10.cardinalhealth.com]
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: SN6PR04MB4480
X-MS-Exchange-Transport-EndToEndLatency: 00:00:01.1339945
X-MS-Exchange-Processed-By-BccFoldering: 15.20.1601.011
X-Microsoft-Antispam-Mailbox-Delivery:
ucf:0;jmr:0;ex:0;auth:0;dest:I;ENG:(750119)(520011016)(944506303)(944626516);
X-Microsoft-Antispam-Message-Info:
Cqz80Cj8FVW4uTBbPiVvb9OH0/VEl4Uz5BiS+YWHErndckPxKMInYe6J09QImrgTO+t2bYNNpL66Km1sbVKa2o+iWj1pSIxAONbkfZuosS0y7Xbj/NEw1eqGtwavoj5WckV7MfJmBINAEBVg6UPnNhmW5rY8PTa27tmGQgYoE7wm5JLH0EH8ARuebjtTf8j+WuBg/BY6uaK7FLOdAUnnlvAtoHDKTnL/oH5E6GG59HTarZyC4wMl5sN2ibbMqZ80Aj4EczyuoWz1N2thodsW/4yzTk2w2dtHgwMdKgPI+4xALQC81kQAlgVyN9ukvBpJnRKrA4bvx+XhUUsMKsoXmpWdQCIoALqAfXheY/96JepEYN05Fqa7wzDRLkbejIfsfPq16asiX/kw8Dq2N/WG5LeQpC28iOkY4TB/GlI6CQPVd8rMY3DvzBYZPyCAcUdPhXC3nR+qxLea+zH/s1IRKaXolnF0r0zaiCki952hC6UwfdeK9F/nCA75kRb930QXJbmOS9emnEf+xqWhIhJuMdd8gV1NLSz6SDimeHRfDgXMTgNUjkk/BQQ0bCWAEQrRPxdvt/5PEiUZMZzMKZAqYh67j2RpV8FC/qJLjHPljagvtH4bUvGmpn/W9MArWgsUkk2skhNcLVletMwYbVXvJfJPr7K9Pnfpnd4p1ETHwjlzXaKcvlziIE29MYEXPUcg9rnk2t33YTM1NJHhgyiKebbrHC2/BU1O+BNrkZYQhqlkvsAu4JxBdyzld2sDz9CQdvwOSAwOkMpdlkaHV26Y0e6gPLkaprWVqXPr5KFXSUfuz2fvVUNM+FuHGV/fIFkcKdK4lw0MRufwhBz1gqudL/PSQuI8r9lQmuh7K3+gIprdWqOiYlYEELj+TMnSnZaFkbIX70rhAAkB7MoNfs/A38hIooGzxlRYzTrlPqwoIkOpyqQykDzXoNRODHwo7QJx
Content-type: multipart/mixed;
boundary="B_3632898004_720446853"
> This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.
--B_3632898004_720446853
Content-type: text/plain;
charset="UTF-8"
Content-transfer-encoding: 7bit
--B_3632898004_720446853
Content-type: application/zip; name="google.com!borschow.com!1549929600!1550015999.zip";
x-mac-creator="4F50494D"
Content-ID: <B399946F5954C04A8523535C1E5699A5@namprd04.prod.outlook.com>
Content-disposition: attachment;
filename="google.com!borschow.com!1549929600!1550015999.zip"
Content-transfer-encoding: base64
UEsDBAoAAAAIAPhTTU5+28OP0QEAACgEAAAxAAAAZ29vZ2xlLmNvbSFib3JzY2hvdy5jb20h
MTU0OTkyOTYwMCExNTUwMDE1OTk5LnhtbI1TwXakIBC85yvmzX1EJ2rGfYTsab9g9+xjoFU2
CjzATPL3iwHRnexhT2J1d3VXNeCX92k8vIGxQsnnY5HlxwNIpriQ/fPx188fp8vx8EIecAfA
r5S9kofDARvQyrh2Akc5dXTBPKpM30o6AemV6kfImJowSmDIgYmKkUjlGcaPE5+oYSc764Xu
+74s5MWad2doy5R0lLlWyE6RwTltvyEUS7OtFFFEpb2BQeeyrqtL7rm+1gfiKENw0pTNY3m5
1HX+VFVFWTyVGG3RkO2VQmuo7KMWD12hF5IUVdk056bOfa+ArHGQ3EerPC+qpmn8JHIlQ3+z
pW57S7FWo2AfrZ6vo7ADpEGUN0eSqzKWDeoWDItgyKD8VUzEYBQOEbS6+8SWb4A0MfAbmMNI
R8RukF0xzRwpFnHL4XPYfw3m3WTKrDMadUsuWDUbBq3QpDln1WNWFHVW5GffIQXWVKZm6Zth
FA4rHPvBGx1n7xtfA4sZwmplhfPXN02+x3aZixWdv1Y+IbkSRXcxkKzZqbzr6le1asOCg3Si
E/75pLIBKAfTdkZNdyvaRyLVFwJMZze0Buw8uo1zN9b/7D9e7oUj6oo/Sdp2BdB9wyU5LBKj
7dH/AVBLAQIKAAoAAAAIAPhTTU5+28OP0QEAACgEAAAxAAAAAAAAAAAAAAAAAAAAAABnb29n
bGUuY29tIWJvcnNjaG93LmNvbSExNTQ5OTI5NjAwITE1NTAwMTU5OTkueG1sUEsFBgAAAAAB
AAEAXwAAACACAAAAAA==
--B_3632898004_720446853--


@@ -0,0 +1,44 @@
<?xml version="1.0"?>
<feedback>
<version>1.0</version>
<report_metadata>
<org_name>addisonfoods.com</org_name>
<email>postmaster@addisonfoods.com</email>
<report_id>3ceb5548498640beaeb47327e202b0b9</report_id>
<date_range>
<begin>1536105600</begin>
<end>1536191999</end>
</date_range>
</report_metadata>
<policy_published>
<domain>example.com</domain>
<adkim>r</adkim>
<aspf>r</aspf>
<p>none</p>
<sp>none</sp>
<pct>100</pct>
<fo>0</fo>
</policy_published>
<record>
<row>
<source_ip>109.203.100.17</source_ip>
<count>1</count>
<policy_evaluated>
<disposition>none</disposition>
<dkim>fail</dkim>
<spf>fail</spf>
</policy_evaluated>
</row>
<identifiers>
<envelope_from>example.com</envelope_from>
<header_from>example.com</header_from>
</identifiers>
<auth_results>
<dkim>
<domain>toptierhighticket.club</domain>
<selector>default</selector>
<result>pass</result>
</dkim>
</auth_results>
</record>
</feedback>


@@ -0,0 +1,56 @@
<?xml version="1.0" encoding="UTF-8"?>
<feedback>
<version>1.0</version>
<report_metadata>
<org_name>example.org</org_name>
<email>noreply-dmarc-support@example.org</email>
<extra_contact_info>https://support.example.org/dmarc</extra_contact_info>
<report_id>20240125141224705995</report_id>
<date_range>
<begin>1706159544</begin>
<end>1706185733</end>
</date_range>
</report_metadata>
<policy_published>
<domain>example.com</domain>
<adkim>r</adkim>
<aspf>r</aspf>
<p>quarantine</p>
<sp>quarantine</sp>
<pct>100</pct>
<fo>1</fo>
</policy_published>
<record>
<row>
<source_ip>198.51.100.123</source_ip>
<count>2</count>
<policy_evaluated>
<disposition>none</disposition>
<dkim>pass</dkim>
<spf>fail</spf>
<reason>
<type></type>
<comment></comment>
</reason>
</policy_evaluated>
</row>
<identifiers>
<envelope_to>example.net</envelope_to>
<envelope_from>example.edu</envelope_from>
<header_from>example.com</header_from>
</identifiers>
<auth_results>
<dkim>
<domain>example.com</domain>
<selector>example</selector>
<result>pass</result>
<human_result>2048-bit key</human_result>
</dkim>
<spf>
<domain>example.edu</domain>
<scope>mfrom</scope>
<result>pass</result>
</spf>
</auth_results>
</record>
</feedback>

Some files were not shown because too many files have changed in this diff