Compare commits

...

47 Commits

Author SHA1 Message Date
copilot-swe-agent[bot]
2174f23eb5 Add comprehensive TypedDicts to minimize Any usage in public APIs
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 22:28:19 +00:00
copilot-swe-agent[bot]
febbb107c4 Fix Python 3.9 compatibility: replace pipe union syntax with Union/Optional
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 22:18:57 +00:00
copilot-swe-agent[bot]
9a64b494e7 Fix code review issues: incomplete isinstance and variable name mismatch
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 21:45:21 +00:00
copilot-swe-agent[bot]
e93209c766 Fix function signatures and improve type annotations
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 21:42:25 +00:00
copilot-swe-agent[bot]
d1c22466be Replace OrderedDict with dict and add TypedDict definitions
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 21:36:57 +00:00
copilot-swe-agent[bot]
3d1b2522d3 Initial plan 2025-12-17 21:19:30 +00:00
Sean Whalen
af9ad568ec Specify Python version requirements in pyproject.toml 2025-12-17 16:18:24 -05:00
Sean Whalen
748164d177 Fix #638 2025-12-17 16:09:26 -05:00
Sean Whalen
487e5e1149 Format on build 2025-12-12 15:56:52 -05:00
Sean Whalen
73010cf964 Use ruff for code formatting 2025-12-12 15:44:46 -05:00
Sean Whalen
a4a5475aa8 Fix another typo before releasing 9.0.5 2025-12-08 15:29:48 -05:00
Sean Whalen
dab78880df Actual 9.0.5 release
Fix typo
2025-12-08 15:26:58 -05:00
Sean Whalen
fb54e3b742 9.0.5
- Fix report type detection bug introduced in `9.0.4` (yanked).
2025-12-08 15:22:02 -05:00
Sean Whalen
6799f10364 9.0.4
Fixes

- Fix saving reports to OpenSearch ([#637](https://github.com/domainaware/parsedmarc/issues/637))
- Fix parsing certain DMARC failure/forensic reports
- Some fixes to type hints (incomplete, but published as-is due to the above bugs)
2025-12-08 13:26:59 -05:00
Sean Whalen
445c9565a4 Update bug link in docs 2025-12-06 15:05:19 -05:00
Sean Whalen
4b786846ae Remove Python 3.14 from testing
Until cpython bug https://github.com/python/cpython/issues/142307 is fixed
2025-12-05 11:05:29 -05:00
Sean Whalen
23ae563cd8 Update Python version support details in documentation 2025-12-05 10:48:04 -05:00
Sean Whalen
cdd000e675 9.0.3
- Set `requires-python` to `>=3.9, <3.14` to avoid [this bug](https://github.com/python/cpython/issues/142307)
2025-12-05 10:43:28 -05:00
Sean Whalen
7d58abc67b Add shebang and encoding declaration to tests.py 2025-12-04 10:21:53 -05:00
Sean Whalen
a18ae439de Fix typo in RHEL version support description in documentation 2025-12-04 10:18:15 -05:00
Sean Whalen
d7061330a8 Use None for blank fields in the Top 1000 Message Sources by Name DMARC Summary dashboard widget 2025-12-03 09:22:33 -05:00
Sean Whalen
9d5654b8ec Fix bugs with the Top 1000 Message Sources by Name DMARC Summary dashboard widget 2025-12-03 09:14:52 -05:00
Sean Whalen
a0e0070dd0 Bump version to 9.0.2 2025-12-02 20:12:58 -05:00
Sean Whalen
cf3b7f2c29 ## 9.0.2
## Improvements

- Type hinting is now used properly across the entire library. (#445)

## Fixes

- Decompress report files as needed when passed via the CLI.
- Fixed the incomplete removal (in `8.15.0`) of the ability for `parsedmarc.utils.extract_report` to accept a file path directly.

## Breaking changes

This version of the library requires consumers to pass certain arguments as keyword-only. Internally, the API uses a bare `*` in the function signature. This is standard per [PEP 3102](https://peps.python.org/pep-3102/) and is documented in the Python Language Reference.
2025-12-02 19:41:14 -05:00
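For readers unfamiliar with the pattern, a bare `*` in a parameter list makes every parameter that follows it keyword-only. A minimal sketch, modeled loosely on the `set_hosts()` signature that appears in the `elastic.py` diff further down this page (simplified here for illustration, not the actual implementation):

```python
def set_hosts(hosts, *, use_ssl=False, api_key=None, timeout=60.0):
    """Everything after the bare * must be passed by keyword."""
    return {"hosts": hosts, "use_ssl": use_ssl, "api_key": api_key, "timeout": timeout}


set_hosts(["localhost:9200"], use_ssl=True, timeout=30.0)  # OK
# set_hosts(["localhost:9200"], True)  # TypeError: takes 1 positional argument but 2 were given
```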
Sean Whalen
d312522ab7 Enhance type hints and argument formatting in multiple files for improved clarity and consistency 2025-12-02 17:06:57 -05:00
Sean Whalen
888d717476 Enhance type hints and argument formatting in utils.py for improved clarity and consistency 2025-12-02 16:21:30 -05:00
Sean Whalen
1127f65fbb Enhance type hints and argument formatting in webhook.py for improved clarity and consistency 2025-12-02 15:52:31 -05:00
Sean Whalen
d017dfcddf Enhance type hints and argument formatting across multiple files for improved clarity and consistency 2025-12-02 15:17:37 -05:00
Sean Whalen
5fae99aacc Enhance type hints for improved clarity and consistency in __init__.py, elastic.py, and opensearch.py 2025-12-02 14:14:06 -05:00
Sean Whalen
ba57368ac3 Refactor argument formatting and type hints in elastic.py for consistency 2025-12-02 13:13:25 -05:00
Sean Whalen
dc6ee5de98 Add type hints to methods in opensearch.py for improved clarity and type checking 2025-12-02 13:11:59 -05:00
Sean Whalen
158d63d205 Complete annotations on elastic.py 2025-12-02 12:59:03 -05:00
Oscar Mattsson
f1933b906c Fix 404 link to maxmind docs (#635) 2025-12-02 09:26:01 -05:00
Anael Mobilia
4b98d795ff Define minimal Python version on pyproject (#634) 2025-12-01 20:22:49 -05:00
Sean Whalen
b1356f7dfc 9.0.1
- Allow multiple `records` for the same aggregate DMARC report in Elasticsearch and Opensearch (fixes issue in 9.0.0)
- Fix typos
2025-12-01 18:57:23 -05:00
Sean Whalen
1969196e1a Switch CHANGELOG headers 2025-12-01 18:01:54 -05:00
Sean Whalen
553f15f6a9 Code formatting 2025-12-01 17:24:10 -05:00
Sean Whalen
1fc9f638e2 9.0.0 (#629)
* Normalize report volumes when a report timespan exceeds 24 hours
2025-12-01 17:06:58 -05:00
Sean Whalen
48bff504b4 Fix build script to properly publish docs 2025-12-01 11:08:21 -05:00
Sean Whalen
681b7cbf85 Formatting 2025-12-01 10:56:08 -05:00
Sean Whalen
0922d6e83a Add supported Python versions to the documentation index 2025-12-01 10:24:19 -05:00
Sean Whalen
baf3f95fb1 Update README with clarification on Python 3.6 support 2025-12-01 10:20:56 -05:00
Anael Mobilia
a51f945305 Clearly define supported Python versions policy (#633)
* Clearly define supported Python versions.

Support policy based on author's comment on https://github.com/domainaware/parsedmarc/pull/458#issuecomment-2002516299 #458

* Compile Python 3.6 from source, because ubuntu-latest now runs Ubuntu 24.04 (which does not include Python 3.6) and the Ubuntu 20.04 runner is no longer available
https://raw.githubusercontent.com/actions/python-versions/main/versions-manifest.json

* Use latest versions of GH Actions

* Silence some technical GH Actions steps

* Elasticsearch / opensearch: use supported versions + align used versions

* Delete .github/workflows/python-tests-3.6.yml

Drop Python 3.6 test

* Update Python 3.6 support status in README

---------

Co-authored-by: Sean Whalen <44679+seanthegeek@users.noreply.github.com>
2025-12-01 10:02:47 -05:00
Sean Whalen
55dbf8e3db Add sources by name table to the Kibana DMARC Summary dashboard
This matches the table in the Splunk DMARC Aggregate reports dashboard
2025-11-30 19:44:14 -05:00
Anael Mobilia
00267c9847 Codestyle cleanup (#631)
* Fix typos

* Copyright - Update date

* Codestyle xxx is False -> not xxx

* Ensure "_find_label_id_for_label" always returns a str

* PEP-8 : apiKey -> api_key + backward compatibility for config files

* Duplicate variable initialization

* Fix format
2025-11-30 19:13:57 -05:00
Anael Mobilia
51356175e1 Get options using the type described in the documentation (#632) 2025-11-30 19:00:04 -05:00
Anael Mobilia
3be10d30dd Fix warnings in docker-compose.yml (#630)
* Fix level=warning msg="...\parsedmarc\docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion"

* Fix "Unquoted port mapping not recommended"
2025-11-30 18:59:01 -05:00
36 changed files with 1620 additions and 1029 deletions


@@ -24,11 +24,11 @@ jobs:
steps:
- name: Checkout repository
uses: actions/checkout@v3
uses: actions/checkout@v5
- name: Docker meta
id: meta
uses: docker/metadata-action@v3
uses: docker/metadata-action@v5
with:
images: |
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
@@ -40,16 +40,14 @@ jobs:
type=semver,pattern={{major}}.{{minor}}
- name: Log in to the Container registry
# https://github.com/docker/login-action/releases/tag/v2.0.0
uses: docker/login-action@49ed152c8eca782a232dede0303416e8f356c37b
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push Docker image
# https://github.com/docker/build-push-action/releases/tag/v3.0.0
uses: docker/build-push-action@e551b19e49efd4e98792db7592c17c09b89db8d8
uses: docker/build-push-action@v6
with:
context: .
push: ${{ github.event_name == 'release' }}


@@ -15,7 +15,7 @@ jobs:
services:
elasticsearch:
image: elasticsearch:8.18.2
image: elasticsearch:8.19.7
env:
discovery.type: single-node
cluster.name: parsedmarc-cluster
@@ -33,15 +33,15 @@ jobs:
python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
uses: actions/setup-python@v6
with:
python-version: ${{ matrix.python-version }}
- name: Install system dependencies
run: |
sudo apt-get update
sudo apt-get install -y libemail-outlook-message-perl
sudo apt-get -q update
sudo apt-get -qy install libemail-outlook-message-perl
- name: Install Python dependencies
run: |
python -m pip install --upgrade pip
@@ -65,6 +65,6 @@ jobs:
run: |
hatch build
- name: Upload coverage to Codecov
uses: codecov/codecov-action@v4
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}

.vscode/launch.json

@@ -19,20 +19,11 @@
"console": "integratedTerminal"
},
{
"name": "sample.eml",
"name": "sample",
"type": "debugpy",
"request": "launch",
"module": "parsedmarc.cli",
"args": ["samples/private/sample.eml"]
},
{
"name": "find_sus_domains.py",
"type": "debugpy",
"request": "launch",
"program": "find_sus_domains.py",
"args": ["-i", "unknown_domains.txt", "-o", "sus_domains.csv"],
"cwd": "${workspaceFolder}/parsedmarc/resources/maps",
"console": "integratedTerminal"
"args": ["samples/private/sample"]
},
{
"name": "sortlists.py",

.vscode/settings.json

@@ -1,4 +1,14 @@
{
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnSave": true,
// Let Ruff handle lint fixes + import sorting on save
"editor.codeActionsOnSave": {
"source.fixAll.ruff": "explicit",
"source.organizeImports.ruff": "explicit"
}
},
"markdownlint.config": {
"MD024": false
},
@@ -36,6 +46,7 @@
"exampleuser",
"expiringdict",
"fieldlist",
"GELF",
"genindex",
"geoip",
"geoipupdate",
@@ -65,17 +76,20 @@
"mailrelay",
"mailsuite",
"maxdepth",
"MAXHEADERS",
"maxmind",
"mbox",
"mfrom",
"michaeldavie",
"mikesiegel",
"Mimecast",
"mitigations",
"MMDB",
"modindex",
"msgconvert",
"msgraph",
"MSSP",
"multiprocess",
"Munge",
"ndjson",
"newkey",
@@ -86,6 +100,7 @@
"nosniff",
"nwettbewerb",
"opensearch",
"opensearchpy",
"parsedmarc",
"passsword",
"Postorius",
@@ -123,6 +138,7 @@
"truststore",
"Übersicht",
"uids",
"Uncategorized",
"unparasable",
"uper",
"urllib",

File diff suppressed because it is too large

@@ -23,11 +23,10 @@ ProofPoint Email Fraud Defense, and Valimail.
## Help Wanted
This project is maintained by one developer. Please consider
reviewing the open
[issues](https://github.com/domainaware/parsedmarc/issues) to see how
you can contribute code, documentation, or user support. Assistance on
the pinned issues would be particularly helpful.
This project is maintained by one developer. Please consider reviewing the open
[issues](https://github.com/domainaware/parsedmarc/issues) to see how you can
contribute code, documentation, or user support. Assistance on the pinned
issues would be particularly helpful.
Thanks to all
[contributors](https://github.com/domainaware/parsedmarc/graphs/contributors)!
@@ -42,6 +41,24 @@ Thanks to all
- Consistent data structures
- Simple JSON and/or CSV output
- Optionally email the results
- Optionally send the results to Elasticsearch, Opensearch, and/or Splunk, for use
with premade dashboards
- Optionally send the results to Elasticsearch, Opensearch, and/or Splunk, for
use with premade dashboards
- Optionally send reports to Apache Kafka
## Python Compatibility
This project supports the following Python versions, which are either actively maintained or are the default versions
for RHEL or Debian.
| Version | Supported | Reason |
|---------|-----------|------------------------------------------------------------|
| < 3.6 | ❌ | End of Life (EOL) |
| 3.6 | ❌ | Used in RHEL 8, but not supported by project dependencies |
| 3.7 | ❌ | End of Life (EOL) |
| 3.8 | ❌ | End of Life (EOL) |
| 3.9 | ✅ | Supported until August 2026 (Debian 11); May 2032 (RHEL 9) |
| 3.10 | ✅ | Actively maintained |
| 3.11 | ✅ | Actively maintained; supported until June 2028 (Debian 12) |
| 3.12 | ✅ | Actively maintained; supported until May 2035 (RHEL 10) |
| 3.13 | ✅ | Actively maintained; supported until June 2030 (Debian 13) |
| 3.14 | ❌ | Not currently supported due to [this imapclient bug](https://github.com/mjs/imapclient/issues/618)|


@@ -9,12 +9,11 @@ fi
. venv/bin/activate
pip install .[build]
ruff format .
ruff check .
cd docs
make clean
make html
touch build/html/.nojekyll
if [ -d "./../parsedmarc-docs" ]; then
if [ -d "../../parsedmarc-docs" ]; then
cp -rf build/html/* ../../parsedmarc-docs/
fi
cd ..


@@ -1,8 +1,6 @@
version: '3.7'
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:8.3.1
image: docker.elastic.co/elasticsearch/elasticsearch:8.19.7
environment:
- network.host=127.0.0.1
- http.host=0.0.0.0
@@ -14,7 +12,7 @@ services:
- xpack.security.enabled=false
- xpack.license.self_generated.type=basic
ports:
- 127.0.0.1:9200:9200
- "127.0.0.1:9200:9200"
ulimits:
memlock:
soft: -1
@@ -30,7 +28,7 @@ services:
retries: 24
opensearch:
image: opensearchproject/opensearch:2.18.0
image: opensearchproject/opensearch:2
environment:
- network.host=127.0.0.1
- http.host=0.0.0.0
@@ -41,7 +39,7 @@ services:
- bootstrap.memory_lock=true
- OPENSEARCH_INITIAL_ADMIN_PASSWORD=${OPENSEARCH_INITIAL_ADMIN_PASSWORD}
ports:
- 127.0.0.1:9201:9200
- "127.0.0.1:9201:9200"
ulimits:
memlock:
soft: -1


@@ -20,7 +20,7 @@ from parsedmarc import __version__
# -- Project information -----------------------------------------------------
project = "parsedmarc"
copyright = "2018 - 2023, Sean Whalen and contributors"
copyright = "2018 - 2025, Sean Whalen and contributors"
author = "Sean Whalen and contributors"
# The version info for the project you're documenting, acts as replacement for


@@ -45,6 +45,24 @@ and Valimail.
with premade dashboards
- Optionally send reports to Apache Kafka
## Python Compatibility
This project supports the following Python versions, which are either actively maintained or are the default versions
for RHEL or Debian.
| Version | Supported | Reason |
|---------|-----------|------------------------------------------------------------|
| < 3.6 | ❌ | End of Life (EOL) |
| 3.6 | ❌ | Used in RHEL 8, but not supported by project dependencies |
| 3.7 | ❌ | End of Life (EOL) |
| 3.8 | ❌ | End of Life (EOL) |
| 3.9 | ✅ | Supported until August 2026 (Debian 11); May 2032 (RHEL 9) |
| 3.10 | ✅ | Actively maintained |
| 3.11 | ✅ | Actively maintained; supported until June 2028 (Debian 12) |
| 3.12 | ✅ | Actively maintained; supported until May 2035 (RHEL 10) |
| 3.13 | ✅ | Actively maintained; supported until June 2030 (Debian 13) |
| 3.14 | ❌ | Not currently supported due to [this imapclient bug](https://github.com/mjs/imapclient/issues/618)|
```{toctree}
:caption: 'Contents'
:maxdepth: 2


@@ -199,7 +199,7 @@ sudo apt-get install libemail-outlook-message-perl
[geoipupdate releases page on github]: https://github.com/maxmind/geoipupdate/releases
[ip to country lite database]: https://db-ip.com/db/download/ip-to-country-lite
[license keys]: https://www.maxmind.com/en/accounts/current/license-key
[maxmind geoipupdate page]: https://dev.maxmind.com/geoip/geoipupdate/
[maxmind geoipupdate page]: https://dev.maxmind.com/geoip/updating-databases/
[maxmind geolite2 country database]: https://dev.maxmind.com/geoip/geolite2-free-geolocation-data
[registering for a free geolite2 account]: https://www.maxmind.com/en/geolite2/signup
[to comply with various privacy regulations]: https://blog.maxmind.com/2019/12/18/significant-changes-to-accessing-and-using-geolite2-databases/


@@ -23,6 +23,8 @@ of the report schema.
"report_id": "9391651994964116463",
"begin_date": "2012-04-27 20:00:00",
"end_date": "2012-04-28 19:59:59",
"timespan_requires_normalization": false,
"original_timespan_seconds": 86399,
"errors": []
},
"policy_published": {
@@ -39,8 +41,10 @@ of the report schema.
"source": {
"ip_address": "72.150.241.94",
"country": "US",
"reverse_dns": "adsl-72-150-241-94.shv.bellsouth.net",
"base_domain": "bellsouth.net"
"reverse_dns": null,
"base_domain": null,
"name": null,
"type": null
},
"count": 2,
"alignment": {
@@ -74,7 +78,10 @@ of the report schema.
"result": "pass"
}
]
}
},
"normalized_timespan": false,
"interval_begin": "2012-04-28 00:00:00",
"interval_end": "2012-04-28 23:59:59"
}
]
}
@@ -83,8 +90,10 @@ of the report schema.
### CSV aggregate report
```text
xml_schema,org_name,org_email,org_extra_contact_info,report_id,begin_date,end_date,errors,domain,adkim,aspf,p,sp,pct,fo,source_ip_address,source_country,source_reverse_dns,source_base_domain,count,spf_aligned,dkim_aligned,dmarc_aligned,disposition,policy_override_reasons,policy_override_comments,envelope_from,header_from,envelope_to,dkim_domains,dkim_selectors,dkim_results,spf_domains,spf_scopes,spf_results
draft,acme.com,noreply-dmarc-support@acme.com,http://acme.com/dmarc/support,9391651994964116463,2012-04-27 20:00:00,2012-04-28 19:59:59,,example.com,r,r,none,none,100,0,72.150.241.94,US,adsl-72-150-241-94.shv.bellsouth.net,bellsouth.net,2,True,False,True,none,,,example.com,example.com,,example.com,none,fail,example.com,mfrom,pass
xml_schema,org_name,org_email,org_extra_contact_info,report_id,begin_date,end_date,normalized_timespan,errors,domain,adkim,aspf,p,sp,pct,fo,source_ip_address,source_country,source_reverse_dns,source_base_domain,source_name,source_type,count,spf_aligned,dkim_aligned,dmarc_aligned,disposition,policy_override_reasons,policy_override_comments,envelope_from,header_from,envelope_to,dkim_domains,dkim_selectors,dkim_results,spf_domains,spf_scopes,spf_results
draft,acme.com,noreply-dmarc-support@acme.com,http://acme.com/dmarc/support,9391651994964116463,2012-04-28 00:00:00,2012-04-28 23:59:59,False,,example.com,r,r,none,none,100,0,72.150.241.94,US,,,,,2,True,False,True,none,,,example.com,example.com,,example.com,none,fail,example.com,mfrom,pass
draft,acme.com,noreply-dmarc-support@acme.com,http://acme.com/dmarc/support,9391651994964116463,2012-04-28 00:00:00,2012-04-28 23:59:59,False,,example.com,r,r,none,none,100,0,72.150.241.94,US,,,,,2,True,False,True,none,,,example.com,example.com,,example.com,none,fail,example.com,mfrom,pass
```
## Sample forensic report output


@@ -4,47 +4,50 @@
```text
usage: parsedmarc [-h] [-c CONFIG_FILE] [--strip-attachment-payloads] [-o OUTPUT]
[--aggregate-json-filename AGGREGATE_JSON_FILENAME]
[--forensic-json-filename FORENSIC_JSON_FILENAME]
[--aggregate-csv-filename AGGREGATE_CSV_FILENAME]
[--forensic-csv-filename FORENSIC_CSV_FILENAME]
[-n NAMESERVERS [NAMESERVERS ...]] [-t DNS_TIMEOUT] [--offline]
[-s] [--verbose] [--debug] [--log-file LOG_FILE] [-v]
[file_path ...]
[--aggregate-json-filename AGGREGATE_JSON_FILENAME] [--forensic-json-filename FORENSIC_JSON_FILENAME]
[--smtp-tls-json-filename SMTP_TLS_JSON_FILENAME] [--aggregate-csv-filename AGGREGATE_CSV_FILENAME]
[--forensic-csv-filename FORENSIC_CSV_FILENAME] [--smtp-tls-csv-filename SMTP_TLS_CSV_FILENAME]
[-n NAMESERVERS [NAMESERVERS ...]] [-t DNS_TIMEOUT] [--offline] [-s] [-w] [--verbose] [--debug]
[--log-file LOG_FILE] [--no-prettify-json] [-v]
[file_path ...]
Parses DMARC reports
Parses DMARC reports
positional arguments:
file_path one or more paths to aggregate or forensic report
files, emails, or mbox files'
positional arguments:
file_path one or more paths to aggregate or forensic report files, emails, or mbox files'
optional arguments:
-h, --help show this help message and exit
-c CONFIG_FILE, --config-file CONFIG_FILE
a path to a configuration file (--silent implied)
--strip-attachment-payloads
remove attachment payloads from forensic report output
-o OUTPUT, --output OUTPUT
write output files to the given directory
--aggregate-json-filename AGGREGATE_JSON_FILENAME
filename for the aggregate JSON output file
--forensic-json-filename FORENSIC_JSON_FILENAME
filename for the forensic JSON output file
--aggregate-csv-filename AGGREGATE_CSV_FILENAME
filename for the aggregate CSV output file
--forensic-csv-filename FORENSIC_CSV_FILENAME
filename for the forensic CSV output file
-n NAMESERVERS [NAMESERVERS ...], --nameservers NAMESERVERS [NAMESERVERS ...]
nameservers to query
-t DNS_TIMEOUT, --dns_timeout DNS_TIMEOUT
number of seconds to wait for an answer from DNS
(default: 2.0)
--offline do not make online queries for geolocation or DNS
-s, --silent only print errors and warnings
--verbose more verbose output
--debug print debugging information
--log-file LOG_FILE output logging to a file
-v, --version show program's version number and exit
options:
-h, --help show this help message and exit
-c CONFIG_FILE, --config-file CONFIG_FILE
a path to a configuration file (--silent implied)
--strip-attachment-payloads
remove attachment payloads from forensic report output
-o OUTPUT, --output OUTPUT
write output files to the given directory
--aggregate-json-filename AGGREGATE_JSON_FILENAME
filename for the aggregate JSON output file
--forensic-json-filename FORENSIC_JSON_FILENAME
filename for the forensic JSON output file
--smtp-tls-json-filename SMTP_TLS_JSON_FILENAME
filename for the SMTP TLS JSON output file
--aggregate-csv-filename AGGREGATE_CSV_FILENAME
filename for the aggregate CSV output file
--forensic-csv-filename FORENSIC_CSV_FILENAME
filename for the forensic CSV output file
--smtp-tls-csv-filename SMTP_TLS_CSV_FILENAME
filename for the SMTP TLS CSV output file
-n NAMESERVERS [NAMESERVERS ...], --nameservers NAMESERVERS [NAMESERVERS ...]
nameservers to query
-t DNS_TIMEOUT, --dns_timeout DNS_TIMEOUT
number of seconds to wait for an answer from DNS (default: 2.0)
--offline do not make online queries for geolocation or DNS
-s, --silent only print errors
-w, --warnings print warnings in addition to errors
--verbose more verbose output
--debug print debugging information
--log-file LOG_FILE output logging to a file
--no-prettify-json output JSON in a single line without indentation
-v, --version show program's version number and exit
```
:::{note}
@@ -169,7 +172,7 @@ The full set of configuration options are:
IDLE response or the number of seconds until the next
mail check (Default: `30`)
- `since` - str: Search for messages since certain time. (Examples: `5m|3h|2d|1w`)
Acceptable units - {"m":"minutes", "h":"hours", "d":"days", "w":"weeks"}).
Acceptable units - {"m":"minutes", "h":"hours", "d":"days", "w":"weeks"}.
Defaults to `1d` if incorrect value is provided.
- `imap`
- `host` - str: The IMAP server hostname or IP address
@@ -254,7 +257,7 @@ The full set of configuration options are:
:::
- `user` - str: Basic auth username
- `password` - str: Basic auth password
- `apiKey` - str: API key
- `api_key` - str: API key
- `ssl` - bool: Use an encrypted SSL/TLS connection
(Default: `True`)
- `timeout` - float: Timeout in seconds (Default: 60)
@@ -277,7 +280,7 @@ The full set of configuration options are:
:::
- `user` - str: Basic auth username
- `password` - str: Basic auth password
- `apiKey` - str: API key
- `api_key` - str: API key
- `ssl` - bool: Use an encrypted SSL/TLS connection
(Default: `True`)
- `timeout` - float: Timeout in seconds (Default: 60)
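As a concrete illustration of the `apiKey` → `api_key` rename documented above, a minimal configuration sketch (assuming the INI-style file passed to `parsedmarc -c`; the host and key values below are placeholders):

```ini
[general]
save_aggregate = True

[elasticsearch]
hosts = https://elasticsearch.example.com:9200
ssl = True
# Formerly spelled "apiKey"; the old spelling is still read for backward
# compatibility, as shown in the cli.py diff further down this page.
api_key = BASE64_ENCODED_API_KEY
```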

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large

@@ -10,7 +10,6 @@ from glob import glob
import logging
import math
import yaml
from collections import OrderedDict
import json
from ssl import CERT_NONE, create_default_context
from multiprocessing import Pipe, Process
@@ -77,6 +76,7 @@ def cli_parse(
always_use_local_files,
reverse_dns_map_path,
reverse_dns_map_url,
normalize_timespan_threshold_hours,
conn,
):
"""Separated this function for multiprocessing"""
@@ -91,6 +91,7 @@ def cli_parse(
nameservers=nameservers,
dns_timeout=dns_timeout,
strip_attachment_payloads=sa,
normalize_timespan_threshold_hours=normalize_timespan_threshold_hours,
)
conn.send([file_results, file_path])
except ParserError as error:
@@ -591,7 +592,7 @@ def _main():
elasticsearch_monthly_indexes=False,
elasticsearch_username=None,
elasticsearch_password=None,
elasticsearch_apiKey=None,
elasticsearch_api_key=None,
opensearch_hosts=None,
opensearch_timeout=60,
opensearch_number_of_shards=1,
@@ -603,7 +604,7 @@ def _main():
opensearch_monthly_indexes=False,
opensearch_username=None,
opensearch_password=None,
opensearch_apiKey=None,
opensearch_api_key=None,
kafka_hosts=None,
kafka_username=None,
kafka_password=None,
@@ -659,6 +660,7 @@ def _main():
webhook_forensic_url=None,
webhook_smtp_tls_url=None,
webhook_timeout=60,
normalize_timespan_threshold_hours=24.0,
)
args = arg_parser.parse_args()
@@ -674,8 +676,11 @@ def _main():
if "general" in config.sections():
general_config = config["general"]
if "silent" in general_config:
if general_config["silent"].lower() == "false":
opts.silent = False
opts.silent = general_config.getboolean("silent")
if "normalize_timespan_threshold_hours" in general_config:
opts.normalize_timespan_threshold_hours = general_config.getfloat(
"normalize_timespan_threshold_hours"
)
if "index_prefix_domain_map" in general_config:
with open(general_config["index_prefix_domain_map"]) as f:
index_prefix_domain_map = yaml.safe_load(f)
@@ -723,11 +728,11 @@ def _main():
)
exit(-1)
if "save_aggregate" in general_config:
opts.save_aggregate = general_config["save_aggregate"]
opts.save_aggregate = general_config.getboolean("save_aggregate")
if "save_forensic" in general_config:
opts.save_forensic = general_config["save_forensic"]
opts.save_forensic = general_config.getboolean("save_forensic")
if "save_smtp_tls" in general_config:
opts.save_smtp_tls = general_config["save_smtp_tls"]
opts.save_smtp_tls = general_config.getboolean("save_smtp_tls")
if "debug" in general_config:
opts.debug = general_config.getboolean("debug")
if "verbose" in general_config:
@@ -798,8 +803,9 @@ def _main():
if "ssl" in imap_config:
opts.imap_ssl = imap_config.getboolean("ssl")
if "skip_certificate_verification" in imap_config:
imap_verify = imap_config.getboolean("skip_certificate_verification")
opts.imap_skip_certificate_verification = imap_verify
opts.imap_skip_certificate_verification = imap_config.getboolean(
"skip_certificate_verification"
)
if "user" in imap_config:
opts.imap_user = imap_config["user"]
else:
@@ -975,8 +981,12 @@ def _main():
opts.elasticsearch_username = elasticsearch_config["user"]
if "password" in elasticsearch_config:
opts.elasticsearch_password = elasticsearch_config["password"]
# Until 8.20
if "apiKey" in elasticsearch_config:
opts.elasticsearch_apiKey = elasticsearch_config["apiKey"]
# Since 8.20
if "api_key" in elasticsearch_config:
opts.elasticsearch_apiKey = elasticsearch_config["api_key"]
if "opensearch" in config:
opensearch_config = config["opensearch"]
@@ -1011,8 +1021,12 @@ def _main():
opts.opensearch_username = opensearch_config["user"]
if "password" in opensearch_config:
opts.opensearch_password = opensearch_config["password"]
# Until 8.20
if "apiKey" in opensearch_config:
opts.opensearch_apiKey = opensearch_config["apiKey"]
# Since 8.20
if "api_key" in opensearch_config:
opts.opensearch_apiKey = opensearch_config["api_key"]
if "splunk_hec" in config.sections():
hec_config = config["splunk_hec"]
@@ -1169,7 +1183,9 @@ def _main():
)
opts.gmail_api_scopes = _str_to_list(opts.gmail_api_scopes)
if "oauth2_port" in gmail_api_config:
opts.gmail_api_oauth2_port = gmail_api_config.get("oauth2_port", 8080)
opts.gmail_api_oauth2_port = gmail_api_config.getint(
"oauth2_port", 8080
)
if "maildir" in config.sections():
maildir_api_config = config["maildir"]
@@ -1271,11 +1287,11 @@ def _main():
es_smtp_tls_index = "{0}{1}".format(prefix, es_smtp_tls_index)
elastic.set_hosts(
opts.elasticsearch_hosts,
opts.elasticsearch_ssl,
opts.elasticsearch_ssl_cert_path,
opts.elasticsearch_username,
opts.elasticsearch_password,
opts.elasticsearch_apiKey,
use_ssl=opts.elasticsearch_ssl,
ssl_cert_path=opts.elasticsearch_ssl_cert_path,
username=opts.elasticsearch_username,
password=opts.elasticsearch_password,
api_key=opts.elasticsearch_api_key,
timeout=opts.elasticsearch_timeout,
)
elastic.migrate_indexes(
@@ -1303,11 +1319,11 @@ def _main():
os_smtp_tls_index = "{0}{1}".format(prefix, os_smtp_tls_index)
opensearch.set_hosts(
opts.opensearch_hosts,
opts.opensearch_ssl,
opts.opensearch_ssl_cert_path,
opts.opensearch_username,
opts.opensearch_password,
opts.opensearch_apiKey,
use_ssl=opts.opensearch_ssl,
ssl_cert_path=opts.opensearch_ssl_cert_path,
username=opts.opensearch_username,
password=opts.opensearch_password,
api_key=opts.opensearch_api_key,
timeout=opts.opensearch_timeout,
)
opensearch.migrate_indexes(
@@ -1445,6 +1461,7 @@ def _main():
opts.always_use_local_files,
opts.reverse_dns_map_path,
opts.reverse_dns_map_url,
opts.normalize_timespan_threshold_hours,
child_conn,
),
)
@@ -1463,7 +1480,7 @@ def _main():
pbar.update(counter - pbar.n)
for result in results:
if type(result[0]) is ParserError:
if isinstance(result[0], ParserError) or result[0] is None:
logger.error("Failed to parse {0} - {1}".format(result[1], result[0]))
else:
if result[0]["report_type"] == "aggregate":
@@ -1495,6 +1512,7 @@ def _main():
reverse_dns_map_path=opts.reverse_dns_map_path,
reverse_dns_map_url=opts.reverse_dns_map_url,
offline=opts.offline,
normalize_timespan_threshold_hours=opts.normalize_timespan_threshold_hours,
)
aggregate_reports += reports["aggregate_reports"]
forensic_reports += reports["forensic_reports"]
@@ -1513,7 +1531,7 @@ def _main():
if opts.imap_skip_certificate_verification:
logger.debug("Skipping IMAP certificate verification")
verify = False
if opts.imap_ssl is False:
if not opts.imap_ssl:
ssl = False
mailbox_connection = IMAPConnection(
@@ -1604,6 +1622,7 @@ def _main():
test=opts.mailbox_test,
strip_attachment_payloads=opts.strip_attachment_payloads,
since=opts.mailbox_since,
normalize_timespan_threshold_hours=opts.normalize_timespan_threshold_hours,
)
aggregate_reports += reports["aggregate_reports"]
@@ -1614,7 +1633,7 @@ def _main():
logger.exception("Mailbox Error")
exit(1)
results = OrderedDict(
results = dict(
[
("aggregate_reports", aggregate_reports),
("forensic_reports", forensic_reports),
@@ -1666,6 +1685,7 @@ def _main():
reverse_dns_map_path=opts.reverse_dns_map_path,
reverse_dns_map_url=opts.reverse_dns_map_url,
offline=opts.offline,
normalize_timespan_threshold_hours=opts.normalize_timespan_threshold_hours,
)
except FileExistsError as error:
logger.error("{0}".format(error.__str__()))


@@ -1,2 +1,3 @@
__version__ = "8.19.1"
__version__ = "9.0.5"
USER_AGENT = f"parsedmarc/{__version__}"


@@ -1,6 +1,9 @@
# -*- coding: utf-8 -*-
from collections import OrderedDict
from __future__ import annotations
from typing import Optional, Union, Any
from elasticsearch_dsl.search import Q
from elasticsearch_dsl import (
@@ -67,6 +70,8 @@ class _AggregateReportDoc(Document):
date_range = Date()
date_begin = Date()
date_end = Date()
normalized_timespan = Boolean()
original_timespan_seconds = Integer
errors = Text()
published_policy = Object(_PublishedPolicy)
source_ip_address = Ip()
@@ -87,15 +92,15 @@ class _AggregateReportDoc(Document):
dkim_results = Nested(_DKIMResult)
spf_results = Nested(_SPFResult)
def add_policy_override(self, type_, comment):
def add_policy_override(self, type_: str, comment: str):
self.policy_overrides.append(_PolicyOverride(type=type_, comment=comment))
def add_dkim_result(self, domain, selector, result):
def add_dkim_result(self, domain: str, selector: str, result: _DKIMResult):
self.dkim_results.append(
_DKIMResult(domain=domain, selector=selector, result=result)
)
def add_spf_result(self, domain, scope, result):
def add_spf_result(self, domain: str, scope: str, result: _SPFResult):
self.spf_results.append(_SPFResult(domain=domain, scope=scope, result=result))
def save(self, **kwargs):
@@ -131,21 +136,21 @@ class _ForensicSampleDoc(InnerDoc):
body = Text()
attachments = Nested(_EmailAttachmentDoc)
def add_to(self, display_name, address):
def add_to(self, display_name: str, address: str):
self.to.append(_EmailAddressDoc(display_name=display_name, address=address))
def add_reply_to(self, display_name, address):
def add_reply_to(self, display_name: str, address: str):
self.reply_to.append(
_EmailAddressDoc(display_name=display_name, address=address)
)
def add_cc(self, display_name, address):
def add_cc(self, display_name: str, address: str):
self.cc.append(_EmailAddressDoc(display_name=display_name, address=address))
def add_bcc(self, display_name, address):
def add_bcc(self, display_name: str, address: str):
self.bcc.append(_EmailAddressDoc(display_name=display_name, address=address))
def add_attachment(self, filename, content_type, sha256):
def add_attachment(self, filename: str, content_type: str, sha256: str):
self.attachments.append(
_EmailAttachmentDoc(
filename=filename, content_type=content_type, sha256=sha256
@@ -197,15 +202,15 @@ class _SMTPTLSPolicyDoc(InnerDoc):
def add_failure_details(
self,
result_type,
ip_address,
receiving_ip,
receiving_mx_helo,
failed_session_count,
sending_mta_ip=None,
receiving_mx_hostname=None,
additional_information_uri=None,
failure_reason_code=None,
result_type: Optional[str] = None,
ip_address: Optional[str] = None,
receiving_ip: Optional[str] = None,
receiving_mx_helo: Optional[str] = None,
failed_session_count: Optional[int] = None,
sending_mta_ip: Optional[str] = None,
receiving_mx_hostname: Optional[str] = None,
additional_information_uri: Optional[str] = None,
failure_reason_code: Union[str, int, None] = None,
):
_details = _SMTPTLSFailureDetailsDoc(
result_type=result_type,
@@ -235,13 +240,14 @@ class _SMTPTLSReportDoc(Document):
def add_policy(
self,
policy_type,
policy_domain,
successful_session_count,
failed_session_count,
policy_string=None,
mx_host_patterns=None,
failure_details=None,
policy_type: str,
policy_domain: str,
successful_session_count: int,
failed_session_count: int,
*,
policy_string: Optional[str] = None,
mx_host_patterns: Optional[list[str]] = None,
failure_details: Optional[str] = None,
):
self.policies.append(
policy_type=policy_type,
@@ -259,24 +265,25 @@ class AlreadySaved(ValueError):
def set_hosts(
hosts,
use_ssl=False,
ssl_cert_path=None,
username=None,
password=None,
apiKey=None,
timeout=60.0,
hosts: Union[str, list[str]],
*,
use_ssl: Optional[bool] = False,
ssl_cert_path: Optional[str] = None,
username: Optional[str] = None,
password: Optional[str] = None,
api_key: Optional[str] = None,
timeout: Optional[float] = 60.0,
):
"""
Sets the Elasticsearch hosts to use
Args:
hosts (str): A single hostname or URL, or list of hostnames or URLs
use_ssl (bool): Use a HTTPS connection to the server
hosts (Union[str, list[str]]): A single hostname or URL, or list of hostnames or URLs
use_ssl (bool): Use an HTTPS connection to the server
ssl_cert_path (str): Path to the certificate chain
username (str): The username to use for authentication
password (str): The password to use for authentication
apiKey (str): The Base64 encoded API key to use for authentication
api_key (str): The Base64 encoded API key to use for authentication
timeout (float): Timeout in seconds
"""
if not isinstance(hosts, list):
@@ -289,14 +296,14 @@ def set_hosts(
conn_params["ca_certs"] = ssl_cert_path
else:
conn_params["verify_certs"] = False
if username:
if username and password:
conn_params["http_auth"] = username + ":" + password
if apiKey:
conn_params["api_key"] = apiKey
if api_key:
conn_params["api_key"] = api_key
connections.create_connection(**conn_params)
def create_indexes(names, settings=None):
def create_indexes(names: list[str], settings: Optional[dict[str, Any]] = None):
"""
Create Elasticsearch indexes
@@ -319,7 +326,10 @@ def create_indexes(names, settings=None):
raise ElasticsearchError("Elasticsearch error: {0}".format(e.__str__()))
def migrate_indexes(aggregate_indexes=None, forensic_indexes=None):
def migrate_indexes(
aggregate_indexes: Optional[list[str]] = None,
forensic_indexes: Optional[list[str]] = None,
):
"""
Updates index mappings
@@ -366,18 +376,18 @@ def migrate_indexes(aggregate_indexes=None, forensic_indexes=None):
def save_aggregate_report_to_elasticsearch(
aggregate_report,
index_suffix=None,
index_prefix=None,
monthly_indexes=False,
number_of_shards=1,
number_of_replicas=0,
aggregate_report: dict[str, Any],
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: Optional[int] = 1,
number_of_replicas: Optional[int] = 0,
):
"""
Saves a parsed DMARC aggregate report to Elasticsearch
Args:
aggregate_report (OrderedDict): A parsed forensic report
aggregate_report (dict): A parsed forensic report
index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily indexes
@@ -395,15 +405,11 @@ def save_aggregate_report_to_elasticsearch(
domain = aggregate_report["policy_published"]["domain"]
begin_date = human_timestamp_to_datetime(metadata["begin_date"], to_utc=True)
end_date = human_timestamp_to_datetime(metadata["end_date"], to_utc=True)
begin_date_human = begin_date.strftime("%Y-%m-%d %H:%M:%SZ")
end_date_human = end_date.strftime("%Y-%m-%d %H:%M:%SZ")
if monthly_indexes:
index_date = begin_date.strftime("%Y-%m")
else:
index_date = begin_date.strftime("%Y-%m-%d")
aggregate_report["begin_date"] = begin_date
aggregate_report["end_date"] = end_date
date_range = [aggregate_report["begin_date"], aggregate_report["end_date"]]
org_name_query = Q(dict(match_phrase=dict(org_name=org_name)))
report_id_query = Q(dict(match_phrase=dict(report_id=report_id)))
@@ -425,6 +431,9 @@ def save_aggregate_report_to_elasticsearch(
try:
existing = search.execute()
except Exception as error_:
begin_date_human = begin_date.strftime("%Y-%m-%d %H:%M:%SZ")
end_date_human = end_date.strftime("%Y-%m-%d %H:%M:%SZ")
raise ElasticsearchError(
"Elasticsearch's search for existing report \
error: {}".format(error_.__str__())
@@ -450,6 +459,17 @@ def save_aggregate_report_to_elasticsearch(
)
for record in aggregate_report["records"]:
begin_date = human_timestamp_to_datetime(record["interval_begin"], to_utc=True)
end_date = human_timestamp_to_datetime(record["interval_end"], to_utc=True)
normalized_timespan = record["normalized_timespan"]
if monthly_indexes:
index_date = begin_date.strftime("%Y-%m")
else:
index_date = begin_date.strftime("%Y-%m-%d")
aggregate_report["begin_date"] = begin_date
aggregate_report["end_date"] = end_date
date_range = [aggregate_report["begin_date"], aggregate_report["end_date"]]
agg_doc = _AggregateReportDoc(
xml_schema=aggregate_report["xml_schema"],
org_name=metadata["org_name"],
@@ -457,8 +477,9 @@ def save_aggregate_report_to_elasticsearch(
org_extra_contact_info=metadata["org_extra_contact_info"],
report_id=metadata["report_id"],
date_range=date_range,
date_begin=aggregate_report["begin_date"],
date_end=aggregate_report["end_date"],
date_begin=begin_date,
date_end=end_date,
normalized_timespan=normalized_timespan,
errors=metadata["errors"],
published_policy=published_policy,
source_ip_address=record["source"]["ip_address"],
@@ -517,18 +538,18 @@ def save_aggregate_report_to_elasticsearch(
def save_forensic_report_to_elasticsearch(
forensic_report,
index_suffix=None,
index_prefix=None,
monthly_indexes=False,
number_of_shards=1,
number_of_replicas=0,
forensic_report: dict[str, Any],
index_suffix: Optional[Any] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: int = 1,
number_of_replicas: int = 0,
):
"""
Saves a parsed DMARC forensic report to Elasticsearch
Args:
forensic_report (OrderedDict): A parsed forensic report
forensic_report (dict): A parsed forensic report
index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily
@@ -548,7 +569,7 @@ def save_forensic_report_to_elasticsearch(
sample_date = forensic_report["parsed_sample"]["date"]
sample_date = human_timestamp_to_datetime(sample_date)
original_headers = forensic_report["parsed_sample"]["headers"]
headers = OrderedDict()
headers = dict()
for original_header in original_headers:
headers[original_header.lower()] = original_headers[original_header]
@@ -684,18 +705,18 @@ def save_forensic_report_to_elasticsearch(
def save_smtp_tls_report_to_elasticsearch(
report,
index_suffix=None,
index_prefix=None,
monthly_indexes=False,
number_of_shards=1,
number_of_replicas=0,
report: dict[str, Any],
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: Optional[int] = 1,
number_of_replicas: Optional[int] = 0,
):
"""
Saves a parsed SMTP TLS report to Elasticsearch
Args:
report (OrderedDict): A parsed SMTP TLS report
report (dict): A parsed SMTP TLS report
index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily indexes
@@ -781,7 +802,7 @@ def save_smtp_tls_report_to_elasticsearch(
policy_doc = _SMTPTLSPolicyDoc(
policy_domain=policy["policy_domain"],
policy_type=policy["policy_type"],
succesful_session_count=policy["successful_session_count"],
successful_session_count=policy["successful_session_count"],
failed_session_count=policy["failed_session_count"],
policy_string=policy_strings,
mx_host_patterns=mx_host_patterns,


@@ -1,5 +1,9 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Any
import logging
import logging.handlers
import json
@@ -48,7 +52,9 @@ class GelfClient(object):
)
self.logger.addHandler(self.handler)
def save_aggregate_report_to_gelf(self, aggregate_reports):
def save_aggregate_report_to_gelf(
self, aggregate_reports: list[dict[str, Any]]
):
rows = parsed_aggregate_reports_to_csv_rows(aggregate_reports)
for row in rows:
log_context_data.parsedmarc = row
@@ -56,12 +62,14 @@ class GelfClient(object):
log_context_data.parsedmarc = None
def save_forensic_report_to_gelf(self, forensic_reports):
def save_forensic_report_to_gelf(
self, forensic_reports: list[dict[str, Any]]
):
rows = parsed_forensic_reports_to_csv_rows(forensic_reports)
for row in rows:
self.logger.info(json.dumps(row))
def save_smtp_tls_report_to_gelf(self, smtp_tls_reports):
def save_smtp_tls_report_to_gelf(self, smtp_tls_reports: dict[str, Any]):
rows = parsed_smtp_tls_reports_to_csv_rows(smtp_tls_reports)
for row in rows:
self.logger.info(json.dumps(row))


@@ -1,11 +1,15 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Any, Optional, Union
from ssl import SSLContext
import json
from ssl import create_default_context
from kafka import KafkaProducer
from kafka.errors import NoBrokersAvailable, UnknownTopicOrPartitionError
from collections import OrderedDict
from parsedmarc.utils import human_timestamp_to_datetime
from parsedmarc import __version__
@@ -18,7 +22,13 @@ class KafkaError(RuntimeError):
class KafkaClient(object):
def __init__(
self, kafka_hosts, ssl=False, username=None, password=None, ssl_context=None
self,
kafka_hosts: list[str],
*,
ssl: Optional[bool] = False,
username: Optional[str] = None,
password: Optional[str] = None,
ssl_context: Optional[SSLContext] = None,
):
"""
Initializes the Kafka client
@@ -28,7 +38,7 @@ class KafkaClient(object):
ssl (bool): Use a SSL/TLS connection
username (str): An optional username
password (str): An optional password
ssl_context: SSL context options
ssl_context (SSLContext): SSL context options
Notes:
``use_ssl=True`` is implied when a username or password are
@@ -55,7 +65,7 @@ class KafkaClient(object):
raise KafkaError("No Kafka brokers available")
@staticmethod
def strip_metadata(report):
def strip_metadata(report: dict[str, Any]):
"""
Duplicates org_name, org_email and report_id into JSON root
and removes report_metadata key to bring it more inline
@@ -69,7 +79,7 @@ class KafkaClient(object):
return report
@staticmethod
def generate_daterange(report):
def generate_date_range(report: dict[str, Any]):
"""
Creates a date_range timestamp with format YYYY-MM-DD-T-HH:MM:SS
based on begin and end dates for easier parsing in Kibana.
@@ -86,7 +96,11 @@ class KafkaClient(object):
logger.debug("date_range is {}".format(date_range))
return date_range
def save_aggregate_reports_to_kafka(self, aggregate_reports, aggregate_topic):
def save_aggregate_reports_to_kafka(
self,
aggregate_reports: Union[dict[str, Any], list[dict[str, Any]]],
aggregate_topic: str,
):
"""
Saves aggregate DMARC reports to Kafka
@@ -96,16 +110,14 @@ class KafkaClient(object):
aggregate_topic (str): The name of the Kafka topic
"""
if isinstance(aggregate_reports, dict) or isinstance(
aggregate_reports, OrderedDict
):
if isinstance(aggregate_reports, dict):
aggregate_reports = [aggregate_reports]
if len(aggregate_reports) < 1:
return
for report in aggregate_reports:
report["date_range"] = self.generate_daterange(report)
report["date_range"] = self.generate_date_range(report)
report = self.strip_metadata(report)
for slice in report["records"]:
@@ -129,7 +141,11 @@ class KafkaClient(object):
except Exception as e:
raise KafkaError("Kafka error: {0}".format(e.__str__()))
def save_forensic_reports_to_kafka(self, forensic_reports, forensic_topic):
def save_forensic_reports_to_kafka(
self,
forensic_reports: Union[dict[str, Any], list[dict[str, Any]]],
forensic_topic: str,
):
"""
Saves forensic DMARC reports to Kafka, sends individual
records (slices) since Kafka requires messages to be <= 1MB
@@ -159,7 +175,11 @@ class KafkaClient(object):
except Exception as e:
raise KafkaError("Kafka error: {0}".format(e.__str__()))
def save_smtp_tls_reports_to_kafka(self, smtp_tls_reports, smtp_tls_topic):
def save_smtp_tls_reports_to_kafka(
self,
smtp_tls_reports: Union[list[dict[str, Any]], dict[str, Any]],
smtp_tls_topic: str,
):
"""
Saves SMTP TLS reports to Kafka, sends individual
records (slices) since Kafka requires messages to be <= 1MB


@@ -1,4 +1,9 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Any
from parsedmarc.log import logger
from azure.core.exceptions import HttpResponseError
from azure.identity import ClientSecretCredential
@@ -102,7 +107,12 @@ class LogAnalyticsClient(object):
"Invalid configuration. " + "One or more required settings are missing."
)
def publish_json(self, results, logs_client: LogsIngestionClient, dcr_stream: str):
def publish_json(
self,
results,
logs_client: LogsIngestionClient,
dcr_stream: str,
):
"""
Background function to publish given
DMARC report to specific Data Collection Rule.
@@ -121,7 +131,11 @@ class LogAnalyticsClient(object):
raise LogAnalyticsException("Upload failed: {error}".format(error=e))
def publish_results(
self, results, save_aggregate: bool, save_forensic: bool, save_smtp_tls: bool
self,
results: dict[str, dict[str, Any]],
save_aggregate: bool,
save_forensic: bool,
save_smtp_tls: bool,
):
"""
Function to publish DMARC and/or SMTP TLS reports to Log Analytics


@@ -1,3 +1,7 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from base64 import urlsafe_b64decode
from functools import lru_cache
from pathlib import Path
@@ -152,3 +156,4 @@ class GmailConnection(MailboxConnection):
for label in labels:
if label_name == label["id"] or label_name == label["name"]:
return label["id"]
return ""


@@ -1,3 +1,7 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from enum import Enum
from functools import lru_cache
from pathlib import Path


@@ -1,3 +1,9 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Optional
from time import sleep
from imapclient.exceptions import IMAPClientError
@@ -11,14 +17,15 @@ from parsedmarc.mail.mailbox_connection import MailboxConnection
class IMAPConnection(MailboxConnection):
def __init__(
self,
host=None,
user=None,
password=None,
port=None,
ssl=True,
verify=True,
timeout=30,
max_retries=4,
host: Optional[str] = None,
*,
user: Optional[str] = None,
password: Optional[str] = None,
port: Optional[str] = None,
ssl: Optional[bool] = True,
verify: Optional[bool] = True,
timeout: Optional[int] = 30,
max_retries: Optional[int] = 4,
):
self._username = user
self._password = password
@@ -45,13 +52,13 @@ class IMAPConnection(MailboxConnection):
else:
return self._client.search()
def fetch_message(self, message_id):
def fetch_message(self, message_id: int):
return self._client.fetch_message(message_id, parse=False)
def delete_message(self, message_id: str):
def delete_message(self, message_id: int):
self._client.delete_messages([message_id])
def move_message(self, message_id: str, folder_name: str):
def move_message(self, message_id: int, folder_name: str):
self._client.move_messages([message_id], folder_name)
def keepalive(self):


@@ -1,5 +1,8 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from abc import ABC
from typing import List
class MailboxConnection(ABC):
@@ -10,7 +13,7 @@ class MailboxConnection(ABC):
def create_folder(self, folder_name: str):
raise NotImplementedError
def fetch_messages(self, reports_folder: str, **kwargs) -> List[str]:
def fetch_messages(self, reports_folder: str, **kwargs) -> list[str]:
raise NotImplementedError
def fetch_message(self, message_id) -> str:


@@ -1,3 +1,9 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Optional
from time import sleep
from parsedmarc.log import logger
@@ -9,8 +15,8 @@ import os
class MaildirConnection(MailboxConnection):
def __init__(
self,
maildir_path=None,
maildir_create=False,
maildir_path: Optional[bool] = None,
maildir_create: Optional[bool] = False,
):
self._maildir_path = maildir_path
self._maildir_create = maildir_create
@@ -36,7 +42,7 @@ class MaildirConnection(MailboxConnection):
def fetch_messages(self, reports_folder: str, **kwargs):
return self._client.keys()
def fetch_message(self, message_id):
def fetch_message(self, message_id: str):
return self._client.get(message_id).as_string()
def delete_message(self, message_id: str):


@@ -1,6 +1,9 @@
# -*- coding: utf-8 -*-
from collections import OrderedDict
from __future__ import annotations
from typing import Optional, Union, Any
from opensearchpy import (
Q,
@@ -67,6 +70,8 @@ class _AggregateReportDoc(Document):
date_range = Date()
date_begin = Date()
date_end = Date()
normalized_timespan = Boolean()
original_timespan_seconds = Integer
errors = Text()
published_policy = Object(_PublishedPolicy)
source_ip_address = Ip()
@@ -87,15 +92,15 @@ class _AggregateReportDoc(Document):
dkim_results = Nested(_DKIMResult)
spf_results = Nested(_SPFResult)
def add_policy_override(self, type_, comment):
def add_policy_override(self, type_: str, comment: str):
self.policy_overrides.append(_PolicyOverride(type=type_, comment=comment))
def add_dkim_result(self, domain, selector, result):
def add_dkim_result(self, domain: str, selector: str, result: _DKIMResult):
self.dkim_results.append(
_DKIMResult(domain=domain, selector=selector, result=result)
)
def add_spf_result(self, domain, scope, result):
def add_spf_result(self, domain: str, scope: str, result: _SPFResult):
self.spf_results.append(_SPFResult(domain=domain, scope=scope, result=result))
def save(self, **kwargs):
@@ -131,21 +136,21 @@ class _ForensicSampleDoc(InnerDoc):
body = Text()
attachments = Nested(_EmailAttachmentDoc)
def add_to(self, display_name, address):
def add_to(self, display_name: str, address: str):
self.to.append(_EmailAddressDoc(display_name=display_name, address=address))
def add_reply_to(self, display_name, address):
def add_reply_to(self, display_name: str, address: str):
self.reply_to.append(
_EmailAddressDoc(display_name=display_name, address=address)
)
def add_cc(self, display_name, address):
def add_cc(self, display_name: str, address: str):
self.cc.append(_EmailAddressDoc(display_name=display_name, address=address))
def add_bcc(self, display_name, address):
def add_bcc(self, display_name: str, address: str):
self.bcc.append(_EmailAddressDoc(display_name=display_name, address=address))
def add_attachment(self, filename, content_type, sha256):
def add_attachment(self, filename: str, content_type: str, sha256: str):
self.attachments.append(
_EmailAttachmentDoc(
filename=filename, content_type=content_type, sha256=sha256
@@ -197,15 +202,15 @@ class _SMTPTLSPolicyDoc(InnerDoc):
def add_failure_details(
self,
result_type,
ip_address,
receiving_ip,
receiving_mx_helo,
failed_session_count,
sending_mta_ip=None,
receiving_mx_hostname=None,
additional_information_uri=None,
failure_reason_code=None,
result_type: Optional[str] = None,
ip_address: Optional[str] = None,
receiving_ip: Optional[str] = None,
receiving_mx_helo: Optional[str] = None,
failed_session_count: Optional[int] = None,
sending_mta_ip: Optional[str] = None,
receiving_mx_hostname: Optional[str] = None,
additional_information_uri: Optional[str] = None,
failure_reason_code: Union[str, int, None] = None,
):
_details = _SMTPTLSFailureDetailsDoc(
result_type=result_type,
@@ -235,13 +240,14 @@ class _SMTPTLSReportDoc(Document):
def add_policy(
self,
policy_type,
policy_domain,
successful_session_count,
failed_session_count,
policy_string=None,
mx_host_patterns=None,
failure_details=None,
policy_type: str,
policy_domain: str,
successful_session_count: int,
failed_session_count: int,
*,
policy_string: Optional[str] = None,
mx_host_patterns: Optional[list[str]] = None,
failure_details: Optional[str] = None,
):
self.policies.append(
policy_type=policy_type,
@@ -259,24 +265,25 @@ class AlreadySaved(ValueError):
def set_hosts(
hosts,
use_ssl=False,
ssl_cert_path=None,
username=None,
password=None,
apiKey=None,
timeout=60.0,
hosts: Union[str, list[str]],
*,
use_ssl: Optional[bool] = False,
ssl_cert_path: Optional[str] = None,
username: Optional[str] = None,
password: Optional[str] = None,
api_key: Optional[str] = None,
timeout: Optional[float] = 60.0,
):
"""
Sets the OpenSearch hosts to use
Args:
hosts (str|list): A hostname or URL, or list of hostnames or URLs
hosts (str|list[str]): A single hostname or URL, or list of hostnames or URLs
use_ssl (bool): Use an HTTPS connection to the server
ssl_cert_path (str): Path to the certificate chain
username (str): The username to use for authentication
password (str): The password to use for authentication
apiKey (str): The Base64 encoded API key to use for authentication
api_key (str): The Base64 encoded API key to use for authentication
timeout (float): Timeout in seconds
"""
if not isinstance(hosts, list):
@@ -289,14 +296,14 @@ def set_hosts(
conn_params["ca_certs"] = ssl_cert_path
else:
conn_params["verify_certs"] = False
if username:
if username and password:
conn_params["http_auth"] = username + ":" + password
if apiKey:
conn_params["api_key"] = apiKey
if api_key:
conn_params["api_key"] = api_key
connections.create_connection(**conn_params)
def create_indexes(names, settings=None):
def create_indexes(names: list[str], settings: Optional[dict[str, Any]] = None):
"""
Create OpenSearch indexes
@@ -319,7 +326,10 @@ def create_indexes(names, settings=None):
raise OpenSearchError("OpenSearch error: {0}".format(e.__str__()))
def migrate_indexes(aggregate_indexes=None, forensic_indexes=None):
def migrate_indexes(
aggregate_indexes: Optional[list[str]] = None,
forensic_indexes: Optional[list[str]] = None,
):
"""
Updates index mappings
@@ -366,18 +376,18 @@ def migrate_indexes(aggregate_indexes=None, forensic_indexes=None):
def save_aggregate_report_to_opensearch(
aggregate_report,
index_suffix=None,
index_prefix=None,
monthly_indexes=False,
number_of_shards=1,
number_of_replicas=0,
aggregate_report: dict[str, Any],
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: Optional[int] = 1,
number_of_replicas: Optional[int] = 0,
):
"""
Saves a parsed DMARC aggregate report to OpenSearch
Args:
aggregate_report (OrderedDict): A parsed forensic report
aggregate_report (dict): A parsed aggregate report
index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily indexes
@@ -395,15 +405,11 @@ def save_aggregate_report_to_opensearch(
domain = aggregate_report["policy_published"]["domain"]
begin_date = human_timestamp_to_datetime(metadata["begin_date"], to_utc=True)
end_date = human_timestamp_to_datetime(metadata["end_date"], to_utc=True)
begin_date_human = begin_date.strftime("%Y-%m-%d %H:%M:%SZ")
end_date_human = end_date.strftime("%Y-%m-%d %H:%M:%SZ")
if monthly_indexes:
index_date = begin_date.strftime("%Y-%m")
else:
index_date = begin_date.strftime("%Y-%m-%d")
aggregate_report["begin_date"] = begin_date
aggregate_report["end_date"] = end_date
date_range = [aggregate_report["begin_date"], aggregate_report["end_date"]]
org_name_query = Q(dict(match_phrase=dict(org_name=org_name)))
report_id_query = Q(dict(match_phrase=dict(report_id=report_id)))
@@ -425,6 +431,9 @@ def save_aggregate_report_to_opensearch(
try:
existing = search.execute()
except Exception as error_:
begin_date_human = begin_date.strftime("%Y-%m-%d %H:%M:%SZ")
end_date_human = end_date.strftime("%Y-%m-%d %H:%M:%SZ")
raise OpenSearchError(
"OpenSearch's search for existing report \
error: {}".format(error_.__str__())
@@ -450,6 +459,17 @@ def save_aggregate_report_to_opensearch(
)
for record in aggregate_report["records"]:
begin_date = human_timestamp_to_datetime(record["interval_begin"], to_utc=True)
end_date = human_timestamp_to_datetime(record["interval_end"], to_utc=True)
normalized_timespan = record["normalized_timespan"]
if monthly_indexes:
index_date = begin_date.strftime("%Y-%m")
else:
index_date = begin_date.strftime("%Y-%m-%d")
aggregate_report["begin_date"] = begin_date
aggregate_report["end_date"] = end_date
date_range = [aggregate_report["begin_date"], aggregate_report["end_date"]]
agg_doc = _AggregateReportDoc(
xml_schema=aggregate_report["xml_schema"],
org_name=metadata["org_name"],
@@ -457,8 +477,9 @@ def save_aggregate_report_to_opensearch(
org_extra_contact_info=metadata["org_extra_contact_info"],
report_id=metadata["report_id"],
date_range=date_range,
date_begin=aggregate_report["begin_date"],
date_end=aggregate_report["end_date"],
date_begin=begin_date,
date_end=end_date,
normalized_timespan=normalized_timespan,
errors=metadata["errors"],
published_policy=published_policy,
source_ip_address=record["source"]["ip_address"],
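The record loop above now derives the dates and the index date from each record's interval rather than the report-level metadata; in isolation, and with made-up sample values, the same calculation looks roughly like this:
from parsedmarc.utils import human_timestamp_to_datetime

record = {"interval_begin": "2025-12-16 00:00:00", "interval_end": "2025-12-17 00:00:00"}
monthly_indexes = False

begin_date = human_timestamp_to_datetime(record["interval_begin"], to_utc=True)
end_date = human_timestamp_to_datetime(record["interval_end"], to_utc=True)
index_date = begin_date.strftime("%Y-%m") if monthly_indexes else begin_date.strftime("%Y-%m-%d")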
@@ -517,18 +538,18 @@ def save_aggregate_report_to_opensearch(
def save_forensic_report_to_opensearch(
forensic_report,
index_suffix=None,
index_prefix=None,
monthly_indexes=False,
number_of_shards=1,
number_of_replicas=0,
forensic_report: dict[str, Any],
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: int = 1,
number_of_replicas: int = 0,
):
"""
Saves a parsed DMARC forensic report to OpenSearch
Args:
forensic_report (OrderedDict): A parsed forensic report
forensic_report (dict): A parsed forensic report
index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily
@@ -548,7 +569,7 @@ def save_forensic_report_to_opensearch(
sample_date = forensic_report["parsed_sample"]["date"]
sample_date = human_timestamp_to_datetime(sample_date)
original_headers = forensic_report["parsed_sample"]["headers"]
headers = OrderedDict()
headers = dict()
for original_header in original_headers:
headers[original_header.lower()] = original_headers[original_header]
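Plain dicts preserve insertion order on all supported Python versions, so OrderedDict is no longer needed here; the lowercased-header copy above is equivalent to this small sketch with invented values:
original_headers = {"From": "sender@example.com", "Subject": "DMARC failure"}
headers = {name.lower(): value for name, value in original_headers.items()}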
@@ -684,18 +705,18 @@ def save_forensic_report_to_opensearch(
def save_smtp_tls_report_to_opensearch(
report,
index_suffix=None,
index_prefix=None,
monthly_indexes=False,
number_of_shards=1,
number_of_replicas=0,
report: dict[str, Any],
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: Optional[int] = 1,
number_of_replicas: Optional[int] = 0,
):
"""
Saves a parsed SMTP TLS report to OpenSearch
Args:
report (OrderedDict): A parsed SMTP TLS report
report (dict): A parsed SMTP TLS report
index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily indexes
@@ -705,7 +726,7 @@ def save_smtp_tls_report_to_opensearch(
Raises:
AlreadySaved
"""
logger.info("Saving aggregate report to OpenSearch")
logger.info("Saving SMTP TLS report to OpenSearch")
org_name = report["organization_name"]
report_id = report["report_id"]
begin_date = human_timestamp_to_datetime(report["begin_date"], to_utc=True)
@@ -781,7 +802,7 @@ def save_smtp_tls_report_to_opensearch(
policy_doc = _SMTPTLSPolicyDoc(
policy_domain=policy["policy_domain"],
policy_type=policy["policy_type"],
succesful_session_count=policy["successful_session_count"],
successful_session_count=policy["successful_session_count"],
failed_session_count=policy["failed_session_count"],
policy_string=policy_strings,
mx_host_patterns=mx_host_patterns,

View File

@@ -132,6 +132,7 @@ asu-vei.ru,ASU-VEI,Industrial
atextelecom.com.br,ATEX Telecom,ISP
atmailcloud.com,atmail,Email Provider
ats.ca,ATS Healthcare,Healthcare
att.net,AT&T,ISP
atw.ne.jp,ATW,Web Host
au-net.ne.jp,KDDI,ISP
au.com,au,ISP
@@ -242,6 +243,7 @@ carandainet.com.br,CN Internet,ISP
cardhealth.com,Cardinal Health,Healthcare
cardinal.com,Cardinal Health,Healthcare
cardinalhealth.com,Cardinal Health,Healthcare
cardinalscriptnet.com,Cardinal Health,Healthcare
carecentrix.com,CareCentrix,Healthcare
carleton.edu,Carlton College,Education
carrierzone.com,carrierzone,Email Security
@@ -697,6 +699,7 @@ hdsupply-email.com,HD Supply,Retail
healthall.com,UC Health,Healthcare
healthcaresupplypros.com,Healthcare Supply Pros,Healthcare
healthproductsforyou.com,Health Products For You,Healthcare
healthtouch.com,Cardinal Health,Healthcare
helloserver6.com,1st Source Web,Marketing
helpforcb.com,InterServer,Web Host
helpscout.net,Help Scout,SaaS
@@ -753,6 +756,8 @@ hostwindsdns.com,Hostwinds,Web Host
hotnet.net.il,Hot Net Internet Services,ISP
hp.com,HP,Technology
hringdu.is,Hringdu,ISP
hslda.net,Home School Legal Defense Association (HSLDA),Education
hslda.org,Home School Legal Defense Association (HSLDA),Education
hspherefilter.com,"DynamicNet, Inc. (DNI)",Web Host
htc.net,HTC,ISP
htmlservices.it,HTMLServices.it,MSP
@@ -763,6 +768,7 @@ hughston.com,Hughston Clinic,Healthcare
hvvc.us,Hivelocity,Web Host
i2ts.ne.jp,i2ts,Web Host
i4i.com,i4i,Technology
ibindley.com,Cardinal Health,Healthcare
ice.co.cr,Grupo ICE,Industrial
icehosting.nl,IceHosting,Web Host
icewarpcloud.in,IceWrap,Email Provider
@@ -832,6 +838,7 @@ ip-5-196-151.eu,OVH,Web Host
ip-51-161-36.net,OVH,Web Host
ip-51-195-53.eu,OVH,Web Host
ip-51-254-53.eu,OVH,Web Host
ip-51-38-67.eu,OVH,Web Host
ip-51-77-42.eu,OVH,Web Host
ip-51-83-140.eu,OVH,Web Host
ip-51-89-240.eu,OVH,Web Host
@@ -1217,6 +1224,7 @@ nettoday.co.th,Net Today,Web Host
netventure.pl,Netventure,MSP
netvigator.com,HKT,ISP
netvision.net.il,013 Netvision,ISP
network-tech.com,Network Technologies International (NTI),SaaS
network.kz,network.kz,ISP
network80.com,Network80,Web Host
neubox.net,Neubox,Web Host

View File

@@ -13,8 +13,6 @@ def _main():
csv_headers = ["source_name", "message_count"]
output_rows = []
known_unknown_domains = []
psl_overrides = []
known_domains = []

View File

@@ -1,5 +1,9 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Any
import json
import boto3
@@ -8,16 +12,16 @@ from parsedmarc.utils import human_timestamp_to_datetime
class S3Client(object):
"""A client for a Amazon S3"""
"""A client for interacting with Amazon S3"""
def __init__(
self,
bucket_name,
bucket_path,
region_name,
endpoint_url,
access_key_id,
secret_access_key,
bucket_name: str,
bucket_path: str,
region_name: str,
endpoint_url: str,
access_key_id: str,
secret_access_key: str,
):
"""
Initializes the S3Client
@@ -47,18 +51,18 @@ class S3Client(object):
aws_access_key_id=access_key_id,
aws_secret_access_key=secret_access_key,
)
self.bucket = self.s3.Bucket(self.bucket_name)
self.bucket: Any = self.s3.Bucket(self.bucket_name)
def save_aggregate_report_to_s3(self, report):
def save_aggregate_report_to_s3(self, report: dict[str, Any]):
self.save_report_to_s3(report, "aggregate")
def save_forensic_report_to_s3(self, report):
def save_forensic_report_to_s3(self, report: dict[str, Any]):
self.save_report_to_s3(report, "forensic")
def save_smtp_tls_report_to_s3(self, report):
def save_smtp_tls_report_to_s3(self, report: dict[str, Any]):
self.save_report_to_s3(report, "smtp_tls")
def save_report_to_s3(self, report, report_type):
def save_report_to_s3(self, report: dict[str, Any], report_type: str):
if report_type == "smtp_tls":
report_date = report["begin_date"]
report_id = report["report_id"]
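A hedged construction example for the newly typed initializer; the module path parsedmarc.s3 is assumed and all values are placeholders:
from parsedmarc.s3 import S3Client

client = S3Client(
    bucket_name="dmarc-reports",
    bucket_path="parsedmarc",
    region_name="us-east-1",
    endpoint_url="https://s3.us-east-1.amazonaws.com",
    access_key_id="AKIAEXAMPLE",         # placeholder credential
    secret_access_key="secret-example",  # placeholder credential
)
# A parsed report dict can then be passed to client.save_aggregate_report_to_s3(report)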

View File

@@ -1,3 +1,10 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Any, Union
from urllib.parse import urlparse
import socket
import json
@@ -23,7 +30,13 @@ class HECClient(object):
# http://docs.splunk.com/Documentation/Splunk/latest/RESTREF/RESTinput#services.2Fcollector
def __init__(
self, url, access_token, index, source="parsedmarc", verify=True, timeout=60
self,
url: str,
access_token: str,
index: str,
source: str = "parsedmarc",
verify=True,
timeout=60,
):
"""
Initializes the HECClient
@@ -37,9 +50,9 @@ class HECClient(object):
timeout (float): Number of seconds to wait for the server to send
data before giving up
"""
url = urlparse(url)
parsed_url = urlparse(url)
self.url = "{0}://{1}/services/collector/event/1.0".format(
url.scheme, url.netloc
parsed_url.scheme, parsed_url.netloc
)
self.access_token = access_token.lstrip("Splunk ")
self.index = index
@@ -48,14 +61,19 @@ class HECClient(object):
self.session = requests.Session()
self.timeout = timeout
self.session.verify = verify
self._common_data = dict(host=self.host, source=self.source, index=self.index)
self._common_data: dict[str, Union[str, int, float, dict]] = dict(
host=self.host, source=self.source, index=self.index
)
self.session.headers = {
"User-Agent": USER_AGENT,
"Authorization": "Splunk {0}".format(self.access_token),
}
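A minimal sketch of constructing the client with the annotated signature; the URL and token below are placeholders:
from parsedmarc.splunk import HECClient

hec = HECClient(
    "https://splunk.example.com:8088",
    "00000000-0000-0000-0000-000000000000",  # placeholder HEC token
    "email",
    source="parsedmarc",
    verify=True,
    timeout=60,
)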
def save_aggregate_reports_to_splunk(self, aggregate_reports):
def save_aggregate_reports_to_splunk(
self,
aggregate_reports: Union[list[dict[str, Any]], dict[str, Any]],
):
"""
Saves aggregate DMARC reports to Splunk
@@ -75,9 +93,12 @@ class HECClient(object):
json_str = ""
for report in aggregate_reports:
for record in report["records"]:
new_report = dict()
new_report: dict[str, Union[str, int, float, dict]] = dict()
for metadata in report["report_metadata"]:
new_report[metadata] = report["report_metadata"][metadata]
new_report["interval_begin"] = record["interval_begin"]
new_report["interval_end"] = record["interval_end"]
new_report["normalized_timespan"] = record["normalized_timespan"]
new_report["published_policy"] = report["policy_published"]
new_report["source_ip_address"] = record["source"]["ip_address"]
new_report["source_country"] = record["source"]["country"]
@@ -98,7 +119,9 @@ class HECClient(object):
new_report["spf_results"] = record["auth_results"]["spf"]
data["sourcetype"] = "dmarc:aggregate"
timestamp = human_timestamp_to_unix_timestamp(new_report["begin_date"])
timestamp = human_timestamp_to_unix_timestamp(
new_report["interval_begin"]
)
data["time"] = timestamp
data["event"] = new_report.copy()
json_str += "{0}\n".format(json.dumps(data))
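With this change, each Splunk event is timestamped from its own record's interval_begin instead of the report-level begin_date; standing alone, the conversion is simply this (sample value invented):
from parsedmarc.utils import human_timestamp_to_unix_timestamp

event_time = human_timestamp_to_unix_timestamp("2025-12-16 00:00:00")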
@@ -113,7 +136,10 @@ class HECClient(object):
if response["code"] != 0:
raise SplunkError(response["text"])
def save_forensic_reports_to_splunk(self, forensic_reports):
def save_forensic_reports_to_splunk(
self,
forensic_reports: Union[list[dict[str, Any]], dict[str, Any]],
):
"""
Saves forensic DMARC reports to Splunk
@@ -147,7 +173,9 @@ class HECClient(object):
if response["code"] != 0:
raise SplunkError(response["text"])
def save_smtp_tls_reports_to_splunk(self, reports):
def save_smtp_tls_reports_to_splunk(
self, reports: Union[list[dict[str, Any]], dict[str, Any]]
):
"""
Saves SMTP TLS reports to Splunk

View File

@@ -1,7 +1,14 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
import logging
import logging.handlers
from typing import Any
import json
from parsedmarc import (
@@ -14,7 +21,7 @@ from parsedmarc import (
class SyslogClient(object):
"""A client for Syslog"""
def __init__(self, server_name, server_port):
def __init__(self, server_name: str, server_port: int):
"""
Initializes the SyslogClient
Args:
@@ -28,17 +35,23 @@ class SyslogClient(object):
log_handler = logging.handlers.SysLogHandler(address=(server_name, server_port))
self.logger.addHandler(log_handler)
def save_aggregate_report_to_syslog(self, aggregate_reports):
def save_aggregate_report_to_syslog(
self, aggregate_reports: list[dict[str, Any]]
):
rows = parsed_aggregate_reports_to_csv_rows(aggregate_reports)
for row in rows:
self.logger.info(json.dumps(row))
def save_forensic_report_to_syslog(self, forensic_reports):
def save_forensic_report_to_syslog(
self, forensic_reports: list[dict[str, Any]]
):
rows = parsed_forensic_reports_to_csv_rows(forensic_reports)
for row in rows:
self.logger.info(json.dumps(row))
def save_smtp_tls_report_to_syslog(self, smtp_tls_reports):
def save_smtp_tls_report_to_syslog(
self, smtp_tls_reports: list[dict[str, Any]]
):
rows = parsed_smtp_tls_reports_to_csv_rows(smtp_tls_reports)
for row in rows:
self.logger.info(json.dumps(row))
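A hedged usage sketch; the module path parsedmarc.syslog is assumed and the server address is a placeholder:
from parsedmarc.syslog import SyslogClient

client = SyslogClient("syslog.example.com", 514)
# Lists of parsed report dicts can then be passed to the save_*_to_syslog methods,
# e.g. client.save_aggregate_report_to_syslog(aggregate_reports)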

View File

@@ -1,11 +1,17 @@
# -*- coding: utf-8 -*-
"""Utility functions that might be useful for other projects"""
from __future__ import annotations
from typing import Optional, Union, TypedDict, Any
import logging
import os
from datetime import datetime
from datetime import timezone
from datetime import timedelta
from collections import OrderedDict
from expiringdict import ExpiringDict
import tempfile
import subprocess
import shutil
@@ -60,12 +66,20 @@ class DownloadError(RuntimeError):
"""Raised when an error occurs when downloading a file"""
def decode_base64(data):
class EmailAddress(TypedDict):
"""Parsed email address information"""
display_name: Optional[str]
address: str
local: Optional[str]
domain: Optional[str]
def decode_base64(data: str) -> bytes:
"""
Decodes a base64 string, with padding being optional
Args:
data: A base64 encoded string
data (str): A base64 encoded string
Returns:
bytes: The decoded bytes
@@ -78,7 +92,7 @@ def decode_base64(data):
return base64.b64decode(data)
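A quick example of the annotated helper; padding may be omitted from the input:
from parsedmarc.utils import decode_base64

decode_base64("cGFyc2VkbWFyYw")  # returns b"parsedmarc" even without trailing padding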
def get_base_domain(domain):
def get_base_domain(domain: str) -> str:
"""
Gets the base domain name for the given domain
@@ -102,7 +116,14 @@ def get_base_domain(domain):
return publicsuffix
def query_dns(domain, record_type, cache=None, nameservers=None, timeout=2.0):
def query_dns(
domain: str,
record_type: str,
*,
cache: Optional[ExpiringDict] = None,
nameservers: Optional[list[str]] = None,
timeout: Optional[float] = 2.0,
) -> list[str]:
"""
Queries DNS
@@ -163,7 +184,13 @@ def query_dns(domain, record_type, cache=None, nameservers=None, timeout=2.0):
return records
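Callers now pass the optional arguments by keyword; a hedged example using a public resolver as a placeholder:
from parsedmarc.utils import query_dns

txt_records = query_dns("example.com", "TXT", nameservers=["9.9.9.9"], timeout=2.0)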
def get_reverse_dns(ip_address, cache=None, nameservers=None, timeout=2.0):
def get_reverse_dns(
ip_address,
*,
cache: Optional[ExpiringDict] = None,
nameservers: Optional[list[str]] = None,
timeout: Optional[float] = 2.0,
) -> str:
"""
Resolves an IP address to a hostname using a reverse DNS query
@@ -191,7 +218,7 @@ def get_reverse_dns(ip_address, cache=None, nameservers=None, timeout=2.0):
return hostname
def timestamp_to_datetime(timestamp):
def timestamp_to_datetime(timestamp: int) -> datetime:
"""
Converts a UNIX/DMARC timestamp to a Python ``datetime`` object
@@ -204,7 +231,7 @@ def timestamp_to_datetime(timestamp):
return datetime.fromtimestamp(int(timestamp))
def timestamp_to_human(timestamp):
def timestamp_to_human(timestamp: int) -> str:
"""
Converts a UNIX/DMARC timestamp to a human-readable string
@@ -217,7 +244,9 @@ def timestamp_to_human(timestamp):
return timestamp_to_datetime(timestamp).strftime("%Y-%m-%d %H:%M:%S")
def human_timestamp_to_datetime(human_timestamp, to_utc=False):
def human_timestamp_to_datetime(
human_timestamp: str, *, to_utc: Optional[bool] = False
) -> datetime:
"""
Converts a human-readable timestamp into a Python ``datetime`` object
@@ -236,7 +265,7 @@ def human_timestamp_to_datetime(human_timestamp, to_utc=False):
return dt.astimezone(timezone.utc) if to_utc else dt
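The to_utc flag is now keyword-only; for example (sample timestamp invented):
from parsedmarc.utils import human_timestamp_to_datetime

dt = human_timestamp_to_datetime("2025-12-17 16:09:26", to_utc=True)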
def human_timestamp_to_unix_timestamp(human_timestamp):
def human_timestamp_to_unix_timestamp(human_timestamp: str) -> int:
"""
Converts a human-readable timestamp into a UNIX timestamp
@@ -250,7 +279,7 @@ def human_timestamp_to_unix_timestamp(human_timestamp):
return human_timestamp_to_datetime(human_timestamp).timestamp()
def get_ip_address_country(ip_address, db_path=None):
def get_ip_address_country(ip_address: str, *, db_path: Optional[str] = None) -> str:
"""
Returns the ISO code for the country associated
with the given IPv4 or IPv6 address
@@ -277,7 +306,7 @@ def get_ip_address_country(ip_address, db_path=None):
]
if db_path is not None:
if os.path.isfile(db_path) is False:
if not os.path.isfile(db_path):
db_path = None
logger.warning(
f"No file exists at {db_path}. Falling back to an "
@@ -313,13 +342,14 @@ def get_ip_address_country(ip_address, db_path=None):
def get_service_from_reverse_dns_base_domain(
base_domain,
always_use_local_file=False,
local_file_path=None,
url=None,
offline=False,
reverse_dns_map=None,
):
base_domain: str,
*,
always_use_local_file: Optional[bool] = False,
local_file_path: Optional[str] = None,
url: Optional[str] = None,
offline: Optional[bool] = False,
reverse_dns_map: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
"""
Returns the service name of a given base domain name from reverse DNS.
@@ -388,17 +418,18 @@ def get_service_from_reverse_dns_base_domain(
def get_ip_address_info(
ip_address,
ip_db_path=None,
reverse_dns_map_path=None,
always_use_local_files=False,
reverse_dns_map_url=None,
cache=None,
reverse_dns_map=None,
offline=False,
nameservers=None,
timeout=2.0,
):
ip_address: str,
*,
ip_db_path: Optional[str] = None,
reverse_dns_map_path: Optional[str] = None,
always_use_local_files: Optional[bool] = False,
reverse_dns_map_url: Optional[str] = None,
cache: Optional[ExpiringDict] = None,
reverse_dns_map: Optional[dict[str, Any]] = None,
offline: Optional[bool] = False,
nameservers: Optional[list[str]] = None,
timeout: Optional[float] = 2.0,
) -> dict[str, Any]:
"""
Returns reverse DNS and country information for the given IP address
@@ -416,7 +447,7 @@ def get_ip_address_info(
timeout (float): Sets the DNS timeout in seconds
Returns:
OrderedDict: ``ip_address``, ``reverse_dns``
dict: ``ip_address``, ``reverse_dns``, ``country``
"""
ip_address = ip_address.lower()
@@ -425,7 +456,7 @@ def get_ip_address_info(
if info:
logger.debug(f"IP address {ip_address} was found in cache")
return info
info = OrderedDict()
info = dict()
info["ip_address"] = ip_address
if offline:
reverse_dns = None
@@ -463,7 +494,7 @@ def get_ip_address_info(
return info
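A hedged example of the keyword-only call style; the address is a documentation-range placeholder, and per the code above offline=True simply skips the live reverse DNS lookup:
from parsedmarc.utils import get_ip_address_info

info = get_ip_address_info("203.0.113.7", offline=True)
# info is a plain dict with ip_address, reverse_dns, and country keys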
def parse_email_address(original_address):
def parse_email_address(original_address: str) -> EmailAddress:
if original_address[0] == "":
display_name = None
else:
@@ -476,17 +507,15 @@ def parse_email_address(original_address):
local = address_parts[0].lower()
domain = address_parts[-1].lower()
return OrderedDict(
[
("display_name", display_name),
("address", address),
("local", local),
("domain", domain),
]
)
return {
"display_name": display_name,
"address": address,
"local": local,
"domain": domain,
}
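The function now returns a plain dict matching the EmailAddress TypedDict; judging from the indexing above, the input is a (display name, address) pair, so a hedged example looks like:
from parsedmarc.utils import parse_email_address

info = parse_email_address(("Sean", "sean@example.com"))
# {"display_name": "Sean", "address": "sean@example.com", "local": "sean", "domain": "example.com"}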
def get_filename_safe_string(string):
def get_filename_safe_string(string: str) -> str:
"""
Converts a string to a string that is safe for a filename
@@ -508,7 +537,7 @@ def get_filename_safe_string(string):
return string
def is_mbox(path):
def is_mbox(path: str) -> bool:
"""
Checks if the given content is an MBOX mailbox file
@@ -529,7 +558,7 @@ def is_mbox(path):
return _is_mbox
def is_outlook_msg(content):
def is_outlook_msg(content: Union[bytes, Any]) -> bool:
"""
Checks if the given content is an Outlook msg OLE/MSG file
@@ -544,7 +573,7 @@ def is_outlook_msg(content):
)
def convert_outlook_msg(msg_bytes):
def convert_outlook_msg(msg_bytes: bytes) -> str:
"""
Uses the ``msgconvert`` Perl utility to convert an Outlook MS file to
standard RFC 822 format
@@ -562,13 +591,14 @@ def convert_outlook_msg(msg_bytes):
os.chdir(tmp_dir)
with open("sample.msg", "wb") as msg_file:
msg_file.write(msg_bytes)
rfc822_bytes: bytes
try:
subprocess.check_call(
["msgconvert", "sample.msg"], stdout=null_file, stderr=null_file
)
eml_path = "sample.eml"
with open(eml_path, "rb") as eml_file:
rfc822 = eml_file.read()
rfc822_bytes = eml_file.read()
except FileNotFoundError:
raise EmailParserError(
"Failed to convert Outlook MSG: msgconvert utility not found"
@@ -577,10 +607,12 @@ def convert_outlook_msg(msg_bytes):
os.chdir(orig_dir)
shutil.rmtree(tmp_dir)
return rfc822
return rfc822_bytes.decode("utf-8", errors="replace")
def parse_email(data, strip_attachment_payloads=False):
def parse_email(
data: Union[bytes, str], *, strip_attachment_payloads: Optional[bool] = False
) -> dict[str, Any]:
"""
A simplified email parser
@@ -595,7 +627,8 @@ def parse_email(data, strip_attachment_payloads=False):
if isinstance(data, bytes):
if is_outlook_msg(data):
data = convert_outlook_msg(data)
data = data.decode("utf-8", errors="replace")
else:
data = data.decode("utf-8", errors="replace")
parsed_email = mailparser.parse_from_string(data)
headers = json.loads(parsed_email.headers_json).copy()
parsed_email = json.loads(parsed_email.mail_json).copy()
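A hedged usage sketch; strip_attachment_payloads is keyword-only, and the sample path is a placeholder:
from parsedmarc.utils import parse_email

with open("sample.eml", "rb") as eml_file:
    parsed = parse_email(eml_file.read(), strip_attachment_payloads=True)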

View File

@@ -1,3 +1,9 @@
# -*- coding: utf-8 -*-
from __future__ import annotations
from typing import Any, Optional, Union
import requests
from parsedmarc import logger
@@ -7,7 +13,13 @@ from parsedmarc.constants import USER_AGENT
class WebhookClient(object):
"""A client for webhooks"""
def __init__(self, aggregate_url, forensic_url, smtp_tls_url, timeout=60):
def __init__(
self,
aggregate_url: str,
forensic_url: str,
smtp_tls_url: str,
timeout: Optional[int] = 60,
):
"""
Initializes the WebhookClient
Args:
@@ -26,25 +38,27 @@ class WebhookClient(object):
"Content-Type": "application/json",
}
def save_forensic_report_to_webhook(self, report):
def save_forensic_report_to_webhook(self, report: str):
try:
self._send_to_webhook(self.forensic_url, report)
except Exception as error_:
logger.error("Webhook Error: {0}".format(error_.__str__()))
def save_smtp_tls_report_to_webhook(self, report):
def save_smtp_tls_report_to_webhook(self, report: str):
try:
self._send_to_webhook(self.smtp_tls_url, report)
except Exception as error_:
logger.error("Webhook Error: {0}".format(error_.__str__()))
def save_aggregate_report_to_webhook(self, report):
def save_aggregate_report_to_webhook(self, report: str):
try:
self._send_to_webhook(self.aggregate_url, report)
except Exception as error_:
logger.error("Webhook Error: {0}".format(error_.__str__()))
def _send_to_webhook(self, webhook_url, payload):
def _send_to_webhook(
self, webhook_url: str, payload: Union[bytes, str, dict[str, Any]]
):
try:
self.session.post(webhook_url, data=payload, timeout=self.timeout)
except Exception as error_:

View File

@@ -2,6 +2,7 @@
requires = [
"hatchling>=1.27.0",
]
requires_python = ">=3.9,<3.14"
build-backend = "hatchling.build"
[project]
@@ -28,6 +29,7 @@ classifiers = [
"Operating System :: OS Independent",
"Programming Language :: Python :: 3"
]
requires-python = ">=3.9, <3.14"
dependencies = [
"azure-identity>=1.8.0",
"azure-monitor-ingestion>=1.0.0",
@@ -86,11 +88,11 @@ include = [
[tool.hatch.build]
exclude = [
"base_reverse_dns.csv",
"find_bad_utf8.py",
"find_unknown_base_reverse_dns.py",
"unknown_base_reverse_dns.csv",
"sortmaps.py",
"README.md",
"*.bak"
"base_reverse_dns.csv",
"find_bad_utf8.py",
"find_unknown_base_reverse_dns.py",
"unknown_base_reverse_dns.csv",
"sortmaps.py",
"README.md",
"*.bak"
]

View File

@@ -1,3 +1,6 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from __future__ import absolute_import, print_function, unicode_literals
import os
@@ -74,7 +77,7 @@ class Test(unittest.TestCase):
print()
file = "samples/extract_report/nice-input.xml"
print("Testing {0}: ".format(file), end="")
xmlout = parsedmarc.extract_report(file)
xmlout = parsedmarc.extract_report_from_file_path(file)
xmlin_file = open("samples/extract_report/nice-input.xml")
xmlin = xmlin_file.read()
xmlin_file.close()