Compare commits


22 Commits

Author SHA1 Message Date
copilot-swe-agent[bot]
2174f23eb5 Add comprehensive TypedDicts to minimize Any usage in public APIs
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 22:28:19 +00:00
copilot-swe-agent[bot]
febbb107c4 Fix Python 3.9 compatibility: replace pipe union syntax with Union/Optional
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 22:18:57 +00:00
copilot-swe-agent[bot]
9a64b494e7 Fix code review issues: incomplete isinstance and variable name mismatch
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 21:45:21 +00:00
copilot-swe-agent[bot]
e93209c766 Fix function signatures and improve type annotations
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 21:42:25 +00:00
copilot-swe-agent[bot]
d1c22466be Replace OrderedDict with dict and add TypedDict definitions
Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-17 21:36:57 +00:00
copilot-swe-agent[bot]
3d1b2522d3 Initial plan 2025-12-17 21:19:30 +00:00
Sean Whalen
af9ad568ec Specify Python version requirements in pyproject.toml 2025-12-17 16:18:24 -05:00
Sean Whalen
748164d177 Fix #638 2025-12-17 16:09:26 -05:00
Sean Whalen
487e5e1149 Format on build 2025-12-12 15:56:52 -05:00
Sean Whalen
73010cf964 Use ruff for code formatting 2025-12-12 15:44:46 -05:00
Sean Whalen
a4a5475aa8 Fix another typo before releasing 9.0.5 2025-12-08 15:29:48 -05:00
Sean Whalen
dab78880df Actual 9.0.5 release
Fix typo
2025-12-08 15:26:58 -05:00
Sean Whalen
fb54e3b742 9.0.5
- Fix report type detection bug introduced in `9.0.4` (yanked).
2025-12-08 15:22:02 -05:00
Sean Whalen
6799f10364 9.0.4
Fixes

- Fix saving reports to OpenSearch ([#637](https://github.com/domainaware/parsedmarc/issues/637))
- Fix parsing certain DMARC failure/forensic reports
- Some fixes to type hints (incomplete, but published as-is due to the above bugs)
2025-12-08 13:26:59 -05:00
Sean Whalen
445c9565a4 Update bug link in docs 2025-12-06 15:05:19 -05:00
Sean Whalen
4b786846ae Remove Python 3.14 from testing
Until cpython bug https://github.com/python/cpython/issues/142307 is fixed
2025-12-05 11:05:29 -05:00
Sean Whalen
23ae563cd8 Update Python version support details in documentation 2025-12-05 10:48:04 -05:00
Sean Whalen
cdd000e675 9.0.3
- Set `requires-python` to `>=3.9, <3.14` to avoid [this bug](https://github.com/python/cpython/issues/142307)
2025-12-05 10:43:28 -05:00
Sean Whalen
7d58abc67b Add shebang and encoding declaration to tests.py 2025-12-04 10:21:53 -05:00
Sean Whalen
a18ae439de Fix typo in RHEL version support description in documentation 2025-12-04 10:18:15 -05:00
Sean Whalen
d7061330a8 Use None for blank fields in the Top 1000 Message Sources by Name DMARC Summary dashboard widget 2025-12-03 09:22:33 -05:00
Sean Whalen
9d5654b8ec Fix bugs with the Top 1000 Message Sources by Name DMARC Summary dashboard widget 2025-12-03 09:14:52 -05:00
23 changed files with 540 additions and 300 deletions


@@ -30,7 +30,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.9", "3.10", "3.11", "3.12", "3.13", "3.14"]
+        python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
     steps:
       - uses: actions/checkout@v5

.vscode/settings.json

@@ -1,4 +1,14 @@
 {
+  "[python]": {
+    "editor.defaultFormatter": "charliermarsh.ruff",
+    "editor.formatOnSave": true,
+    // Let Ruff handle lint fixes + import sorting on save
+    "editor.codeActionsOnSave": {
+      "source.fixAll.ruff": "explicit",
+      "source.organizeImports.ruff": "explicit"
+    }
+  },
   "markdownlint.config": {
     "MD024": false
   },


@@ -1,5 +1,25 @@
 # Changelog

+## 9.0.5
+
+## Fixes
+
+- Fix report type detection introduced in `9.0.4`.
+
+## 9.0.4 (Yanked)
+
+### Fixes
+
+- Fix saving reports to OpenSearch ([#637](https://github.com/domainaware/parsedmarc/issues/637))
+- Fix parsing certain DMARC failure/forensic reports
+- Some fixes to type hints (incomplete, but published as-is due to the above bugs)
+
+## 9.0.3
+
+### Fixes
+
+- Set `requires-python` to `>=3.9, <3.14` to avoid [this bug](https://github.com/python/cpython/issues/142307)
+
 ## 9.0.2
 ## Improvements


@@ -53,12 +53,12 @@ for RHEL or Debian.
 | Version | Supported | Reason |
 |---------|-----------|------------------------------------------------------------|
 | < 3.6 | ❌ | End of Life (EOL) |
-| 3.6 | ❌ | Used in RHHEL 8, but not supported by project dependencies |
+| 3.6 | ❌ | Used in RHEL 8, but not supported by project dependencies |
 | 3.7 | ❌ | End of Life (EOL) |
 | 3.8 | ❌ | End of Life (EOL) |
 | 3.9 | ✅ | Supported until August 2026 (Debian 11); May 2032 (RHEL 9) |
 | 3.10 | ✅ | Actively maintained |
 | 3.11 | ✅ | Actively maintained; supported until June 2028 (Debian 12) |
 | 3.12 | ✅ | Actively maintained; supported until May 2035 (RHEL 10) |
 | 3.13 | ✅ | Actively maintained; supported until June 2030 (Debian 13) |
-| 3.14 | | Actively maintained |
+| 3.14 | | Not currently supported due to [this imapclient bug](https://github.com/mjs/imapclient/issues/618) |


@@ -9,7 +9,6 @@ fi
 . venv/bin/activate
 pip install .[build]
 ruff format .
-ruff check .
 cd docs
 make clean
 make html


@@ -53,16 +53,15 @@ for RHEL or Debian.
 | Version | Supported | Reason |
 |---------|-----------|------------------------------------------------------------|
 | < 3.6 | ❌ | End of Life (EOL) |
-| 3.6 | ❌ | Used in RHHEL 8, but not supported by project dependencies |
+| 3.6 | ❌ | Used in RHEL 8, but not supported by project dependencies |
 | 3.7 | ❌ | End of Life (EOL) |
 | 3.8 | ❌ | End of Life (EOL) |
 | 3.9 | ✅ | Supported until August 2026 (Debian 11); May 2032 (RHEL 9) |
 | 3.10 | ✅ | Actively maintained |
 | 3.11 | ✅ | Actively maintained; supported until June 2028 (Debian 12) |
 | 3.12 | ✅ | Actively maintained; supported until May 2035 (RHEL 10) |
 | 3.13 | ✅ | Actively maintained; supported until June 2030 (Debian 13) |
-| 3.14 | | Actively maintained |
+| 3.14 | | Not currently supported due to [this imapclient bug](https://github.com/mjs/imapclient/issues/618) |
 ```{toctree}
 :caption: 'Contents'

File diff suppressed because one or more lines are too long


@@ -4,7 +4,7 @@
 from __future__ import annotations

-from typing import Dict, List, Any, Union, Optional, IO, Callable
+from typing import Any, Union, Optional, IO, Callable, TypedDict, NotRequired

 import binascii
 import email
@@ -19,7 +19,6 @@ import xml.parsers.expat as expat
 import zipfile
 import zlib
 from base64 import b64decode
-from collections import OrderedDict
 from csv import DictWriter
 from datetime import datetime, timedelta, timezone, tzinfo
 from io import BytesIO, StringIO
@@ -57,6 +56,17 @@ MAGIC_GZIP = b"\x1f\x8b"
 MAGIC_XML = b"\x3c\x3f\x78\x6d\x6c\x20"
 MAGIC_JSON = b"\7b"
+EMAIL_SAMPLE_CONTENT_TYPES = (
+    "text/rfc822",
+    "text-rfc-822",
+    "text/rfc822-headers",
+    "text/rfc-822-headers",
+    "message/rfc822",
+    "message/rfc-822",
+    "message/rfc822-headers",
+    "message/rfc-822-headers",
+)
 IP_ADDRESS_CACHE = ExpiringDict(max_len=10000, max_age_seconds=14400)
 SEEN_AGGREGATE_REPORT_IDS = ExpiringDict(max_len=100000000, max_age_seconds=3600)
 REVERSE_DNS_MAP = dict()
@@ -82,11 +92,188 @@ class InvalidForensicReport(InvalidDMARCReport):
     """Raised when an invalid DMARC forensic report is encountered"""

+
+# TypedDict definitions for structured data
+class DateIntervalBucket(TypedDict):
+    """Represents a time bucket for interval normalization"""
+
+    begin: datetime
+    end: datetime
+    count: int
+
+
+class IPAddressInfo(TypedDict, total=False):
+    """Information about an IP address"""
+
+    ip_address: str
+    country: NotRequired[Optional[str]]
+    reverse_dns: NotRequired[Optional[str]]
+    base_domain: NotRequired[Optional[str]]
+    type: NotRequired[Optional[str]]
+    name: NotRequired[Optional[str]]
+
+
+class AlignmentInfo(TypedDict):
+    """DMARC alignment information"""
+
+    spf: bool
+    dkim: bool
+    dmarc: bool
+
+
+class PolicyOverrideReason(TypedDict, total=False):
+    """DMARC policy override reason"""
+
+    type: NotRequired[str]
+    comment: NotRequired[Optional[str]]
+
+
+class PolicyEvaluated(TypedDict):
+    """DMARC policy evaluation result"""
+
+    disposition: str
+    dkim: str
+    spf: str
+    policy_override_reasons: list[PolicyOverrideReason]
+
+
+class DKIMAuthResult(TypedDict):
+    """DKIM authentication result"""
+
+    domain: str
+    selector: str
+    result: str
+
+
+class SPFAuthResult(TypedDict):
+    """SPF authentication result"""
+
+    domain: str
+    scope: str
+    result: str
+
+
+class AuthResults(TypedDict):
+    """Authentication results for DKIM and SPF"""
+
+    dkim: list[DKIMAuthResult]
+    spf: list[SPFAuthResult]
+
+
+class Identifiers(TypedDict):
+    """Message identifiers"""
+
+    header_from: str
+    envelope_from: Optional[str]
+    envelope_to: Optional[str]
+
+
+class ParsedReportRecord(TypedDict):
+    """A parsed DMARC aggregate report record"""
+
+    source: IPAddressInfo
+    count: int
+    alignment: AlignmentInfo
+    policy_evaluated: PolicyEvaluated
+    identifiers: Identifiers
+    auth_results: AuthResults
+
+
+class ParsedReportRecordWithMetadata(ParsedReportRecord, total=False):
+    """A parsed DMARC report record with normalization metadata"""
+
+    normalized_timespan: bool
+    interval_begin: NotRequired[str]
+    interval_end: NotRequired[str]
+
+
+class ReportMetadata(TypedDict, total=False):
+    """DMARC report metadata"""
+
+    org_name: str
+    org_email: NotRequired[Optional[str]]
+    org_extra_contact_info: NotRequired[Optional[str]]
+    report_id: str
+    begin_date: str
+    end_date: str
+    errors: NotRequired[list[str]]
+
+
+class PolicyPublished(TypedDict, total=False):
+    """DMARC policy as published in DNS"""
+
+    domain: str
+    adkim: NotRequired[str]
+    aspf: NotRequired[str]
+    p: str
+    sp: NotRequired[str]
+    pct: NotRequired[str]
+    fo: NotRequired[str]
+
+
+class ParsedAggregateReport(TypedDict):
+    """A complete parsed DMARC aggregate report"""
+
+    xml_schema: str
+    report_metadata: ReportMetadata
+    policy_published: PolicyPublished
+    records: list[ParsedReportRecord]
+
+
+class SMTPTLSFailureDetails(TypedDict):
+    """SMTP TLS failure details"""
+
+    result_type: str
+    sending_mta_ip: NotRequired[Optional[str]]
+    receiving_mx_hostname: NotRequired[Optional[str]]
+    receiving_mx_helo: NotRequired[Optional[str]]
+    receiving_ip: NotRequired[Optional[str]]
+    failed_session_count: int
+    additional_information: NotRequired[Optional[str]]
+    failure_reason_code: NotRequired[Optional[str]]
+
+
+class SMTPTLSPolicy(TypedDict, total=False):
+    """SMTP TLS policy information"""
+
+    policy_type: str
+    policy_domain: str
+    policy_strings: NotRequired[list[str]]
+    mx_host_patterns: NotRequired[list[str]]
+    successful_session_count: int
+    failed_session_count: int
+    failure_details: NotRequired[list[SMTPTLSFailureDetails]]
+
+
+class ParsedSMTPTLSReport(TypedDict):
+    """A complete parsed SMTP TLS report"""
+
+    organization_name: str
+    begin_date: str
+    end_date: str
+    contact_info: str
+    report_id: str
+    policies: list[SMTPTLSPolicy]
+
+
+class ParsedForensicReport(TypedDict, total=False):
+    """A parsed DMARC forensic report"""
+
+    feedback_type: str
+    user_agent: NotRequired[Optional[str]]
+    version: NotRequired[Optional[str]]
+    original_envelope_id: NotRequired[Optional[str]]
+    original_mail_from: NotRequired[Optional[str]]
+    original_rcpt_to: NotRequired[Optional[str]]
+    arrival_date: str
+    arrival_date_utc: str
+    subject: NotRequired[str]
+    message_id: str
+    authentication_results: str
+    delivery_result: str
+    auth_failure: list[str]
+    reported_domain: str
+    source: IPAddressInfo
+    authentication_mechanisms: list[str]
+    dkim_domain: NotRequired[Optional[str]]
+    sample_headers_only: bool
+    sample: NotRequired[str]
+    parsed_sample: NotRequired[dict[str, Any]]
+
+
+class ReportTypeWrapper(TypedDict):
+    """Wrapper for report type identification"""
+
+    report_type: str
+    report: Union[ParsedAggregateReport, ParsedForensicReport, ParsedSMTPTLSReport]
+
+
 def _bucket_interval_by_day(
     begin: datetime,
     end: datetime,
     total_count: int,
-) -> List[Dict[Any]]:
+) -> list[DateIntervalBucket]:
     """
     Split the interval [begin, end) into daily buckets and distribute
     `total_count` proportionally across those buckets.
@@ -148,7 +335,7 @@ def _bucket_interval_by_day(
     if day_cursor > begin:
         day_cursor -= timedelta(days=1)

-    day_buckets: List[Dict[str, Any]] = []
+    day_buckets: list[dict[str, Any]] = []
     while day_cursor < end:
         day_start = day_cursor
@@ -180,12 +367,12 @@ def _bucket_interval_by_day(
     # Then apply a "largest remainder" rounding strategy to ensure the sum
     # equals exactly total_count.
-    exact_values: List[float] = [
+    exact_values: list[float] = [
         (b["seconds"] / interval_seconds) * total_count for b in day_buckets
     ]
-    floor_values: List[int] = [int(x) for x in exact_values]
-    fractional_parts: List[float] = [x - int(x) for x in exact_values]
+    floor_values: list[int] = [int(x) for x in exact_values]
+    fractional_parts: list[float] = [x - int(x) for x in exact_values]

     # How many counts do we still need to distribute after flooring?
     remainder = total_count - sum(floor_values)
@@ -205,7 +392,7 @@ def _bucket_interval_by_day(
             final_counts[idx] += 1

     # --- Step 3: Build the final per-day result list -------------------------
-    results: List[Dict[str, Any]] = []
+    results: list[DateIntervalBucket] = []
     for bucket, count in zip(day_buckets, final_counts):
         if count > 0:
             results.append(
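The hunks above apply "largest remainder" rounding so the per-day counts always sum to the original total. Stripped of the report-specific bucketing, the strategy can be sketched like this (`largest_remainder` and its signature are illustrative helpers, not parsedmarc functions):

```python
import math

def largest_remainder(weights: list[float], total: int) -> list[int]:
    """Distribute an integer `total` across `weights` proportionally:
    floor each exact share, then hand the leftover units to the entries
    with the largest fractional parts."""
    scale = sum(weights)
    exact = [w / scale * total for w in weights]
    floors = [math.floor(x) for x in exact]
    fractions = [x - f for x, f in zip(exact, floors)]
    remainder = total - sum(floors)
    # Indices sorted by descending fractional part receive the leftovers,
    # so the rounded counts sum to exactly `total`.
    order = sorted(range(len(weights)), key=lambda i: fractions[i], reverse=True)
    for idx in order[:remainder]:
        floors[idx] += 1
    return floors
```

Naive rounding of each share independently can over- or under-shoot the total by several counts; flooring first and distributing the remainder guarantees an exact sum.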
@@ -220,8 +407,8 @@
 def _append_parsed_record(
-    parsed_record: OrderedDict[str, Any],
-    records: OrderedDict[str, Any],
+    parsed_record: dict[str, Any],
+    records: list[dict[str, Any]],
     begin_dt: datetime,
     end_dt: datetime,
     normalize: bool,
@@ -264,22 +451,22 @@ def _append_parsed_record(
 def _parse_report_record(
-    record: OrderedDict,
+    record: dict[str, Any],
     *,
     ip_db_path: Optional[str] = None,
-    always_use_local_files: bool = False,
+    always_use_local_files: Optional[bool] = False,
     reverse_dns_map_path: Optional[str] = None,
     reverse_dns_map_url: Optional[str] = None,
-    offline: bool = False,
+    offline: Optional[bool] = False,
     nameservers: Optional[list[str]] = None,
     dns_timeout: Optional[float] = 2.0,
-) -> OrderedDict[str, Any]:
+) -> dict[str, Any]:
     """
     Converts a record from a DMARC aggregate report into a more consistent
     format

     Args:
-        record (OrderedDict): The record to convert
+        record (dict): The record to convert
         always_use_local_files (bool): Do not download files
         reverse_dns_map_path (str): Path to a reverse DNS map file
         reverse_dns_map_url (str): URL to a reverse DNS map file
@@ -290,10 +477,10 @@ def _parse_report_record(
         dns_timeout (float): Sets the DNS timeout in seconds

     Returns:
-        OrderedDict: The converted record
+        dict: The converted record
     """
     record = record.copy()
-    new_record = OrderedDict()
+    new_record: dict[str, Any] = {}
     if record["row"]["source_ip"] is None:
         raise ValueError("Source IP address is empty")
     new_record_source = get_ip_address_info(
@@ -311,14 +498,12 @@
     new_record["source"] = new_record_source
     new_record["count"] = int(record["row"]["count"])
     policy_evaluated = record["row"]["policy_evaluated"].copy()
-    new_policy_evaluated = OrderedDict(
-        [
-            ("disposition", "none"),
-            ("dkim", "fail"),
-            ("spf", "fail"),
-            ("policy_override_reasons", []),
-        ]
-    )
+    new_policy_evaluated: dict[str, Any] = {
+        "disposition": "none",
+        "dkim": "fail",
+        "spf": "fail",
+        "policy_override_reasons": [],
+    }
     if "disposition" in policy_evaluated:
         new_policy_evaluated["disposition"] = policy_evaluated["disposition"]
         if new_policy_evaluated["disposition"].strip().lower() == "pass":
@@ -355,7 +540,7 @@ def _parse_report_record(
         new_record["identifiers"] = record["identities"].copy()
     else:
         new_record["identifiers"] = record["identifiers"].copy()
-    new_record["auth_results"] = OrderedDict([("dkim", []), ("spf", [])])
+    new_record["auth_results"] = {"dkim": [], "spf": []}
     if type(new_record["identifiers"]["header_from"]) is str:
         lowered_from = new_record["identifiers"]["header_from"].lower()
     else:
@@ -374,7 +559,7 @@ def _parse_report_record(
             auth_results["dkim"] = [auth_results["dkim"]]
         for result in auth_results["dkim"]:
             if "domain" in result and result["domain"] is not None:
-                new_result = OrderedDict([("domain", result["domain"])])
+                new_result: dict[str, str] = {"domain": result["domain"]}
                 if "selector" in result and result["selector"] is not None:
                     new_result["selector"] = result["selector"]
                 else:
@@ -389,16 +574,16 @@ def _parse_report_record(
             auth_results["spf"] = [auth_results["spf"]]
         for result in auth_results["spf"]:
             if "domain" in result and result["domain"] is not None:
-                new_result = OrderedDict([("domain", result["domain"])])
+                new_spf_result: dict[str, str] = {"domain": result["domain"]}
                 if "scope" in result and result["scope"] is not None:
-                    new_result["scope"] = result["scope"]
+                    new_spf_result["scope"] = result["scope"]
                 else:
-                    new_result["scope"] = "mfrom"
+                    new_spf_result["scope"] = "mfrom"
                 if "result" in result and result["result"] is not None:
-                    new_result["result"] = result["result"]
+                    new_spf_result["result"] = result["result"]
                 else:
-                    new_result["result"] = "none"
-                new_record["auth_results"]["spf"].append(new_result)
+                    new_spf_result["result"] = "none"
+                new_record["auth_results"]["spf"].append(new_spf_result)
     if "envelope_from" not in new_record["identifiers"]:
         envelope_from = None
@@ -427,12 +612,12 @@ def _parse_report_record(
     return new_record

-def _parse_smtp_tls_failure_details(failure_details: dict[str, Any]):
+def _parse_smtp_tls_failure_details(failure_details: dict[str, Any]) -> SMTPTLSFailureDetails:
     try:
-        new_failure_details = OrderedDict(
-            result_type=failure_details["result-type"],
-            failed_session_count=failure_details["failed-session-count"],
-        )
+        new_failure_details: dict[str, Any] = {
+            "result_type": failure_details["result-type"],
+            "failed_session_count": failure_details["failed-session-count"],
+        }
         if "sending-mta-ip" in failure_details:
             new_failure_details["sending_mta_ip"] = failure_details["sending-mta-ip"]
@@ -463,7 +648,7 @@ def _parse_smtp_tls_failure_details(failure_details: dict[str, Any]):
         raise InvalidSMTPTLSReport(str(e))

-def _parse_smtp_tls_report_policy(policy: dict[str, Any]):
+def _parse_smtp_tls_report_policy(policy: dict[str, Any]) -> SMTPTLSPolicy:
     policy_types = ["tlsa", "sts", "no-policy-found"]
     try:
         policy_domain = policy["policy"]["policy-domain"]
@@ -471,7 +656,10 @@ def _parse_smtp_tls_report_policy(policy: dict[str, Any]):
         failure_details = []
         if policy_type not in policy_types:
             raise InvalidSMTPTLSReport(f"Invalid policy type {policy_type}")
-        new_policy = OrderedDict(policy_domain=policy_domain, policy_type=policy_type)
+        new_policy: dict[str, Any] = {
+            "policy_domain": policy_domain,
+            "policy_type": policy_type,
+        }
         if "policy-string" in policy["policy"]:
             if isinstance(policy["policy"]["policy-string"], list):
                 if len(policy["policy"]["policy-string"]) > 0:
@@ -500,7 +688,7 @@ def _parse_smtp_tls_report_policy(policy: dict[str, Any]):
         raise InvalidSMTPTLSReport(str(e))

-def parse_smtp_tls_report_json(report: dict[str, Any]):
+def parse_smtp_tls_report_json(report: str) -> ParsedSMTPTLSReport:
     """Parses and validates an SMTP TLS report"""
     required_fields = [
         "organization-name",
@@ -512,47 +700,49 @@
     try:
         policies = []
-        report = json.loads(report)
+        report_dict = json.loads(report)
         for required_field in required_fields:
-            if required_field not in report:
+            if required_field not in report_dict:
                 raise Exception(f"Missing required field: {required_field}]")
-        if not isinstance(report["policies"], list):
-            policies_type = type(report["policies"])
+        if not isinstance(report_dict["policies"], list):
+            policies_type = type(report_dict["policies"])
             raise InvalidSMTPTLSReport(f"policies must be a list, not {policies_type}")
-        for policy in report["policies"]:
+        for policy in report_dict["policies"]:
             policies.append(_parse_smtp_tls_report_policy(policy))
-        new_report = OrderedDict(
-            organization_name=report["organization-name"],
-            begin_date=report["date-range"]["start-datetime"],
-            end_date=report["date-range"]["end-datetime"],
-            contact_info=report["contact-info"],
-            report_id=report["report-id"],
-            policies=policies,
-        )
+        new_report: dict[str, Any] = {
+            "organization_name": report_dict["organization-name"],
+            "begin_date": report_dict["date-range"]["start-datetime"],
+            "end_date": report_dict["date-range"]["end-datetime"],
+            "contact_info": report_dict["contact-info"],
+            "report_id": report_dict["report-id"],
+            "policies": policies,
+        }
         return new_report
     except KeyError as e:
-        InvalidSMTPTLSReport(f"Missing required field: {e}")
+        raise InvalidSMTPTLSReport(f"Missing required field: {e}")
     except Exception as e:
         raise InvalidSMTPTLSReport(str(e))

-def parsed_smtp_tls_reports_to_csv_rows(reports: OrderedDict[str, Any]):
-    """Converts one oor more parsed SMTP TLS reports into a list of single
-    layer OrderedDict objects suitable for use in a CSV"""
-    if type(reports) is OrderedDict:
+def parsed_smtp_tls_reports_to_csv_rows(
+    reports: Union[ParsedSMTPTLSReport, list[ParsedSMTPTLSReport]],
+) -> list[dict[str, Any]]:
+    """Converts one or more parsed SMTP TLS reports into a list of single
+    layer dict objects suitable for use in a CSV"""
+    if type(reports) is dict:
         reports = [reports]
     rows = []
     for report in reports:
-        common_fields = OrderedDict(
-            organization_name=report["organization_name"],
-            begin_date=report["begin_date"],
-            end_date=report["end_date"],
-            report_id=report["report_id"],
-        )
+        common_fields: dict[str, Any] = {
+            "organization_name": report["organization_name"],
+            "begin_date": report["begin_date"],
+            "end_date": report["end_date"],
+            "report_id": report["report_id"],
+        }
         record = common_fields.copy()
         for policy in report["policies"]:
             if "policy_strings" in policy:
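The reworked `parse_smtp_tls_report_json` now takes a JSON string, decodes it into `report_dict`, and validates required fields before building the result. A simplified standalone sketch of that validation flow (`validate_smtp_tls_json` is a hypothetical helper, and the full `REQUIRED_FIELDS` tuple is an assumption based on RFC 8460, since the diff truncates the `required_fields` list):

```python
import json

# Assumed field list per RFC 8460; the diff above only shows "organization-name".
REQUIRED_FIELDS = ("organization-name", "date-range", "contact-info",
                   "report-id", "policies")

def validate_smtp_tls_json(report: str) -> dict:
    """Decode the JSON string, check required fields, and require
    `policies` to be a list -- mirroring the flow in the diff."""
    report_dict = json.loads(report)
    for field in REQUIRED_FIELDS:
        if field not in report_dict:
            raise ValueError(f"Missing required field: {field}")
    if not isinstance(report_dict["policies"], list):
        raise ValueError("policies must be a list")
    return report_dict

sample = json.dumps({
    "organization-name": "Example Org",
    "date-range": {"start-datetime": "2025-12-01T00:00:00Z",
                   "end-datetime": "2025-12-01T23:59:59Z"},
    "contact-info": "smtp-tls@example.com",
    "report-id": "2025-12-01T00:00:00Z_example.com",
    "policies": [],
})
parsed = validate_smtp_tls_json(sample)
```

Note also the bug fixed in the same hunk: the `except KeyError` branch previously constructed `InvalidSMTPTLSReport(...)` without `raise`, silently swallowing the error.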
@@ -574,7 +764,7 @@ def parsed_smtp_tls_reports_to_csv_rows(reports: OrderedDict[str, Any]):
     return rows

-def parsed_smtp_tls_reports_to_csv(reports: OrderedDict[str, Any]) -> str:
+def parsed_smtp_tls_reports_to_csv(reports: ParsedSMTPTLSReport) -> str:
     """
     Converts one or more parsed SMTP TLS reports to flat CSV format, including
     headers
@@ -622,17 +812,17 @@
 def parse_aggregate_report_xml(
     xml: str,
     *,
-    ip_db_path: Optional[bool] = None,
+    ip_db_path: Optional[str] = None,
     always_use_local_files: Optional[bool] = False,
-    reverse_dns_map_path: Optional[bool] = None,
-    reverse_dns_map_url: Optional[bool] = None,
+    reverse_dns_map_path: Optional[str] = None,
+    reverse_dns_map_url: Optional[str] = None,
     offline: Optional[bool] = False,
     nameservers: Optional[list[str]] = None,
     timeout: Optional[float] = 2.0,
-    keep_alive: Optional[callable] = None,
-    normalize_timespan_threshold_hours: Optional[float] = 24.0,
-) -> OrderedDict[str, Any]:
-    """Parses a DMARC XML report string and returns a consistent OrderedDict
+    keep_alive: Optional[Callable] = None,
+    normalize_timespan_threshold_hours: float = 24.0,
+) -> ParsedAggregateReport:
+    """Parses a DMARC XML report string and returns a consistent dict

     Args:
         xml (str): A string of DMARC aggregate report XML
@@ -648,7 +838,7 @@ def parse_aggregate_report_xml(
         normalize_timespan_threshold_hours (float): Normalize timespans beyond this

     Returns:
-        OrderedDict: The parsed aggregate DMARC report
+        dict: The parsed aggregate DMARC report
     """
     errors = []
     # Parse XML and recover from errors
@@ -680,8 +870,8 @@ def parse_aggregate_report_xml(
     schema = "draft"
     if "version" in report:
         schema = report["version"]
-    new_report = OrderedDict([("xml_schema", schema)])
-    new_report_metadata = OrderedDict()
+    new_report: dict[str, Any] = {"xml_schema": schema}
+    new_report_metadata: dict[str, Any] = {}
     if report_metadata["org_name"] is None:
         if report_metadata["email"] is not None:
             report_metadata["org_name"] = report_metadata["email"].split("@")[-1]
@@ -742,7 +932,7 @@ def parse_aggregate_report_xml(
     policy_published = report["policy_published"]
     if type(policy_published) is list:
         policy_published = policy_published[0]
-    new_policy_published = OrderedDict()
+    new_policy_published: dict[str, Any] = {}
     new_policy_published["domain"] = policy_published["domain"]
     adkim = "r"
     if "adkim" in policy_published:
@@ -847,14 +1037,14 @@ def extract_report(content: Union[bytes, str, IO[Any]]) -> str:
         str: The extracted text
     """
-    file_object = None
+    file_object: Union[BytesIO, IO[Any]]
     try:
         if isinstance(content, str):
             try:
                 file_object = BytesIO(b64decode(content))
             except binascii.Error:
                 return content
-        elif type(content) is bytes:
+        elif isinstance(content, bytes):
             file_object = BytesIO(content)
         else:
             file_object = content
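One of the review fixes above swaps `type(content) is bytes` for `isinstance(content, bytes)`. The difference shows up with subclasses, as this standalone sketch demonstrates (`RawReport` is a hypothetical stand-in, not a parsedmarc type):

```python
class RawReport(bytes):
    """Hypothetical bytes subclass standing in for a tagged payload."""
    pass

payload = RawReport(b"\x1f\x8b")

# An exact type check rejects subclass instances...
exact_match = type(payload) is bytes
# ...while isinstance accepts anything in the bytes hierarchy.
instance_match = isinstance(payload, bytes)
```

With the old check, a bytes subclass would have fallen through to the `else` branch and been treated as a file-like object.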
@@ -909,12 +1099,12 @@ def parse_aggregate_report_file(
dns_timeout: Optional[float] = 2.0, dns_timeout: Optional[float] = 2.0,
keep_alive: Optional[Callable] = None, keep_alive: Optional[Callable] = None,
normalize_timespan_threshold_hours: Optional[float] = 24.0, normalize_timespan_threshold_hours: Optional[float] = 24.0,
) -> OrderedDict[str, any]: ) -> ParsedAggregateReport:
"""Parses a file at the given path, a file-like object. or bytes as an """Parses a file at the given path, a file-like object. or bytes as an
aggregate DMARC report aggregate DMARC report
Args: Args:
_input (str | bytes | IO): A path to a file, a file like object, or bytes _input (Union[str, bytes, IO]): A path to a file, a file like object, or bytes
offline (bool): Do not query online for geolocation or DNS offline (bool): Do not query online for geolocation or DNS
always_use_local_files (bool): Do not download files always_use_local_files (bool): Do not download files
reverse_dns_map_path (str): Path to a reverse DNS map file reverse_dns_map_path (str): Path to a reverse DNS map file
@@ -927,7 +1117,7 @@ def parse_aggregate_report_file(
normalize_timespan_threshold_hours (float): Normalize timespans beyond this normalize_timespan_threshold_hours (float): Normalize timespans beyond this
Returns: Returns:
OrderedDict: The parsed DMARC aggregate report dict: The parsed DMARC aggregate report
""" """
try: try:
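The return annotations change from `OrderedDict[str, any]` (which was also a typo: `any` is a builtin function, not `typing.Any`) to TypedDicts. An illustrative sketch of the shapes the new signatures name; the field sets here are assumptions, the real definitions live in the library:

```python
from typing import Any, TypedDict

# Illustrative sketch of the TypedDict return types named in the new
# signatures (ParsedAggregateReport, ReportTypeWrapper); the actual
# field sets in parsedmarc may differ.
class ParsedAggregateReport(TypedDict):
    xml_schema: str
    report_metadata: dict[str, Any]
    policy_published: dict[str, Any]
    records: list[dict[str, Any]]

class ReportTypeWrapper(TypedDict):
    report_type: str  # "aggregate", "forensic", or "smtp_tls"
    report: dict[str, Any]

wrapped: ReportTypeWrapper = {"report_type": "aggregate", "report": {}}
```

At runtime a TypedDict instance is a plain dict, so callers indexing the result are unaffected.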
@@ -950,7 +1140,7 @@ def parse_aggregate_report_file(
def parsed_aggregate_reports_to_csv_rows( def parsed_aggregate_reports_to_csv_rows(
reports: list[OrderedDict[str, Any]], reports: list[ParsedAggregateReport],
) -> list[dict[str, Any]]: ) -> list[dict[str, Any]]:
""" """
Converts one or more parsed aggregate reports to list of dicts in flat CSV Converts one or more parsed aggregate reports to list of dicts in flat CSV
@@ -967,7 +1157,7 @@ def parsed_aggregate_reports_to_csv_rows(
def to_str(obj): def to_str(obj):
return str(obj).lower() return str(obj).lower()
if type(reports) is OrderedDict: if type(reports) is dict:
reports = [reports] reports = [reports]
rows = [] rows = []
@@ -1075,7 +1265,7 @@ def parsed_aggregate_reports_to_csv_rows(
return rows return rows
def parsed_aggregate_reports_to_csv(reports: list[OrderedDict[str, Any]]) -> str: def parsed_aggregate_reports_to_csv(reports: list[ParsedAggregateReport]) -> str:
""" """
Converts one or more parsed aggregate reports to flat CSV format, including Converts one or more parsed aggregate reports to flat CSV format, including
headers headers
@@ -1148,15 +1338,15 @@ def parse_forensic_report(
*, *,
always_use_local_files: Optional[bool] = False, always_use_local_files: Optional[bool] = False,
reverse_dns_map_path: Optional[str] = None, reverse_dns_map_path: Optional[str] = None,
reverse_dns_map_url: str = None, reverse_dns_map_url: Optional[str] = None,
offline: Optional[bool] = False, offline: Optional[bool] = False,
ip_db_path: Optional[str] = None, ip_db_path: Optional[str] = None,
nameservers: Optional[list[str]] = None, nameservers: Optional[list[str]] = None,
dns_timeout: Optional[float] = 2.0, dns_timeout: Optional[float] = 2.0,
strip_attachment_payloads: Optional[bool] = False, strip_attachment_payloads: Optional[bool] = False,
) -> OrderedDict[str, Any]: ) -> ParsedForensicReport:
""" """
Converts a DMARC forensic report and sample to an ``OrderedDict`` Converts a DMARC forensic report and sample to a ``dict``
Args: Args:
feedback_report (str): A message's feedback report as a string feedback_report (str): A message's feedback report as a string
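The pipe-union annotations were rewritten with `typing` forms for Python 3.9 compatibility; on 3.9, `str | None` raises `TypeError` when evaluated at runtime. A sketch with a hypothetical stand-in body:

```python
from typing import IO, Optional, Union

# On Python 3.9, "str | None" raises TypeError at runtime
# (types.UnionType arrived in 3.10), so signatures use Optional/Union.
def load_report(
    input_: Union[str, bytes, IO],
    reverse_dns_map_url: Optional[str] = None,
) -> dict:
    # hypothetical stand-in body for illustration
    return {"input": input_, "reverse_dns_map_url": reverse_dns_map_url}
```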
@@ -1174,12 +1364,12 @@ def parse_forensic_report(
forensic report results forensic report results
Returns: Returns:
OrderedDict: A parsed report and sample dict: A parsed report and sample
""" """
delivery_results = ["delivered", "spam", "policy", "reject", "other"] delivery_results = ["delivered", "spam", "policy", "reject", "other"]
try: try:
parsed_report = OrderedDict() parsed_report: dict[str, Any] = {}
report_values = feedback_report_regex.findall(feedback_report) report_values = feedback_report_regex.findall(feedback_report)
for report_value in report_values: for report_value in report_values:
key = report_value[0].lower().replace("-", "_") key = report_value[0].lower().replace("-", "_")
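The key-normalization loop above turns feedback-report field names like `Feedback-Type` into `feedback_type`. A sketch, assuming a simplified `feedback_report_regex` (the library's actual pattern may differ):

```python
import re

# Assumed pattern: "Name: value" lines; the library's actual
# feedback_report_regex may differ.
feedback_report_regex = re.compile(r"^([\w\-]+): (.+)$", re.MULTILINE)

sample = "Feedback-Type: auth-failure\nSource-IP: 192.0.2.1\n"
parsed_report = {}
for key, value in feedback_report_regex.findall(sample):
    # "Feedback-Type" -> "feedback_type"
    parsed_report[key.lower().replace("-", "_")] = value
```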
@@ -1282,7 +1472,7 @@ def parse_forensic_report(
raise InvalidForensicReport("Unexpected error: {0}".format(error.__str__())) raise InvalidForensicReport("Unexpected error: {0}".format(error.__str__()))
def parsed_forensic_reports_to_csv_rows(reports: list[OrderedDict[str, Any]]): def parsed_forensic_reports_to_csv_rows(reports: list[ParsedForensicReport]):
""" """
Converts one or more parsed forensic reports to a list of dicts in flat CSV Converts one or more parsed forensic reports to a list of dicts in flat CSV
format format
@@ -1293,7 +1483,7 @@ def parsed_forensic_reports_to_csv_rows(reports: list[OrderedDict[str, Any]]):
Returns: Returns:
list: Parsed forensic report data as a list of dicts in flat CSV format list: Parsed forensic report data as a list of dicts in flat CSV format
""" """
if type(reports) is OrderedDict: if type(reports) is dict:
reports = [reports] reports = [reports]
rows = [] rows = []
@@ -1318,7 +1508,7 @@ def parsed_forensic_reports_to_csv_rows(reports: list[OrderedDict[str, Any]]):
return rows return rows
def parsed_forensic_reports_to_csv(reports: list[dict[str, Any]]) -> str: def parsed_forensic_reports_to_csv(reports: list[ParsedForensicReport]) -> str:
""" """
Converts one or more parsed forensic reports to flat CSV format, including Converts one or more parsed forensic reports to flat CSV format, including
headers headers
@@ -1383,7 +1573,7 @@ def parse_report_email(
strip_attachment_payloads: Optional[bool] = False, strip_attachment_payloads: Optional[bool] = False,
keep_alive: Optional[callable] = None, keep_alive: Optional[callable] = None,
normalize_timespan_threshold_hours: Optional[float] = 24.0, normalize_timespan_threshold_hours: Optional[float] = 24.0,
) -> OrderedDict[str, Any]: ) -> dict[str, Any]:
""" """
Parses a DMARC report from an email Parses a DMARC report from an email
@@ -1402,7 +1592,7 @@ def parse_report_email(
normalize_timespan_threshold_hours (float): Normalize timespans beyond this normalize_timespan_threshold_hours (float): Normalize timespans beyond this
Returns: Returns:
OrderedDict: dict:
* ``report_type``: ``aggregate`` or ``forensic`` * ``report_type``: ``aggregate`` or ``forensic``
* ``report``: The parsed report * ``report``: The parsed report
""" """
@@ -1426,6 +1616,7 @@ def parse_report_email(
feedback_report = None feedback_report = None
smtp_tls_report = None smtp_tls_report = None
sample = None sample = None
is_feedback_report: bool = False
if "From" in msg_headers: if "From" in msg_headers:
logger.info("Parsing mail from {0} on {1}".format(msg_headers["From"], date)) logger.info("Parsing mail from {0} on {1}".format(msg_headers["From"], date))
if "Subject" in msg_headers: if "Subject" in msg_headers:
@@ -1441,6 +1632,7 @@ def parse_report_email(
if content_type == "text/html": if content_type == "text/html":
continue continue
elif content_type == "message/feedback-report": elif content_type == "message/feedback-report":
is_feedback_report = True
try: try:
if "Feedback-Type" in payload: if "Feedback-Type" in payload:
feedback_report = payload feedback_report = payload
@@ -1451,23 +1643,23 @@ def parse_report_email(
feedback_report = feedback_report.replace("\\n", "\n") feedback_report = feedback_report.replace("\\n", "\n")
except (ValueError, TypeError, binascii.Error): except (ValueError, TypeError, binascii.Error):
feedback_report = payload feedback_report = payload
elif content_type == "text/rfc822-headers": elif is_feedback_report and content_type in EMAIL_SAMPLE_CONTENT_TYPES:
sample = payload
elif content_type == "message/rfc822":
sample = payload sample = payload
elif content_type == "application/tlsrpt+json": elif content_type == "application/tlsrpt+json":
if not payload.strip().startswith("{"): if not payload.strip().startswith("{"):
payload = str(b64decode(payload)) payload = str(b64decode(payload))
smtp_tls_report = parse_smtp_tls_report_json(payload) smtp_tls_report = parse_smtp_tls_report_json(payload)
return OrderedDict( return {
[("report_type", "smtp_tls"), ("report", smtp_tls_report)] "report_type": "smtp_tls",
) "report": smtp_tls_report,
}
elif content_type == "application/tlsrpt+gzip": elif content_type == "application/tlsrpt+gzip":
payload = extract_report(payload) payload = extract_report(payload)
smtp_tls_report = parse_smtp_tls_report_json(payload) smtp_tls_report = parse_smtp_tls_report_json(payload)
return OrderedDict( return {
[("report_type", "smtp_tls"), ("report", smtp_tls_report)] "report_type": "smtp_tls",
) "report": smtp_tls_report,
}
elif content_type == "text/plain": elif content_type == "text/plain":
if "A message claiming to be from you has failed" in payload: if "A message claiming to be from you has failed" in payload:
try: try:
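The new `is_feedback_report` flag gates the sample branch: a part only counts as the failed message sample once a `message/feedback-report` part has been seen, which avoids mistaking unrelated rfc822 attachments for the sample. A sketch, with `EMAIL_SAMPLE_CONTENT_TYPES` assumed and a plain parts list standing in for walking a real MIME message:

```python
# EMAIL_SAMPLE_CONTENT_TYPES is an assumption here; the gating mirrors
# the change above.
EMAIL_SAMPLE_CONTENT_TYPES = ("message/rfc822", "text/rfc822-headers")

def find_sample(parts):
    is_feedback_report = False
    sample = None
    for content_type, payload in parts:
        if content_type == "message/feedback-report":
            is_feedback_report = True
        elif is_feedback_report and content_type in EMAIL_SAMPLE_CONTENT_TYPES:
            sample = payload
    return sample

parts = [
    ("text/plain", "This is an email abuse report..."),
    ("message/feedback-report", "Feedback-Type: auth-failure"),
    ("message/rfc822", "From: spoofed@example.com ..."),
]
```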
@@ -1498,9 +1690,10 @@ def parse_report_email(
payload = payload.decode("utf-8", errors="replace") payload = payload.decode("utf-8", errors="replace")
if payload.strip().startswith("{"): if payload.strip().startswith("{"):
smtp_tls_report = parse_smtp_tls_report_json(payload) smtp_tls_report = parse_smtp_tls_report_json(payload)
result = OrderedDict( result = {
[("report_type", "smtp_tls"), ("report", smtp_tls_report)] "report_type": "smtp_tls",
) "report": smtp_tls_report,
}
elif payload.strip().startswith("<"): elif payload.strip().startswith("<"):
aggregate_report = parse_aggregate_report_xml( aggregate_report = parse_aggregate_report_xml(
payload, payload,
@@ -1514,9 +1707,10 @@ def parse_report_email(
keep_alive=keep_alive, keep_alive=keep_alive,
normalize_timespan_threshold_hours=normalize_timespan_threshold_hours, normalize_timespan_threshold_hours=normalize_timespan_threshold_hours,
) )
result = OrderedDict( result = {
[("report_type", "aggregate"), ("report", aggregate_report)] "report_type": "aggregate",
) "report": aggregate_report,
}
return result return result
@@ -1560,7 +1754,7 @@ def parse_report_email(
except Exception as e: except Exception as e:
raise InvalidForensicReport(e.__str__()) raise InvalidForensicReport(e.__str__())
result = OrderedDict([("report_type", "forensic"), ("report", forensic_report)]) result = {"report_type": "forensic", "report": forensic_report}
return result return result
if result is None: if result is None:
@@ -1581,12 +1775,12 @@ def parse_report_file(
offline: Optional[bool] = False, offline: Optional[bool] = False,
keep_alive: Optional[Callable] = None, keep_alive: Optional[Callable] = None,
normalize_timespan_threshold_hours: Optional[float] = 24, normalize_timespan_threshold_hours: Optional[float] = 24,
) -> OrderedDict[str, Any]: ) -> ReportTypeWrapper:
"""Parses a DMARC aggregate or forensic file at the given path, a """Parses a DMARC aggregate or forensic file at the given path, a
file-like object, or bytes file-like object, or bytes
Args: Args:
input_ (str | bytes | IO): A path to a file, a file-like object, or bytes input_ (Union[str, bytes, IO]): A path to a file, a file-like object, or bytes
nameservers (list): A list of one or more nameservers to use nameservers (list): A list of one or more nameservers to use
(Cloudflare's public DNS resolvers by default) (Cloudflare's public DNS resolvers by default)
dns_timeout (float): Sets the DNS timeout in seconds dns_timeout (float): Sets the DNS timeout in seconds
@@ -1600,7 +1794,7 @@ def parse_report_file(
keep_alive (callable): Keep alive function keep_alive (callable): Keep alive function
Returns: Returns:
OrderedDict: The parsed DMARC report dict: The parsed DMARC report
""" """
if type(input_) is str: if type(input_) is str:
logger.debug("Parsing {0}".format(input_)) logger.debug("Parsing {0}".format(input_))
@@ -1627,11 +1821,11 @@ def parse_report_file(
keep_alive=keep_alive, keep_alive=keep_alive,
normalize_timespan_threshold_hours=normalize_timespan_threshold_hours, normalize_timespan_threshold_hours=normalize_timespan_threshold_hours,
) )
results = OrderedDict([("report_type", "aggregate"), ("report", report)]) results = {"report_type": "aggregate", "report": report}
except InvalidAggregateReport: except InvalidAggregateReport:
try: try:
report = parse_smtp_tls_report_json(content) report = parse_smtp_tls_report_json(content)
results = OrderedDict([("report_type", "smtp_tls"), ("report", report)]) results = {"report_type": "smtp_tls", "report": report}
except InvalidSMTPTLSReport: except InvalidSMTPTLSReport:
try: try:
sa = strip_attachment_payloads sa = strip_attachment_payloads
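`parse_report_file` detects the format by trying each parser in turn and wrapping the first success with its `report_type`. A sketch of the cascade; the parser bodies here are stand-ins, not the library's parsers:

```python
# Exception names mirror the code above; parser bodies are stand-ins.
class InvalidAggregateReport(Exception): ...
class InvalidSMTPTLSReport(Exception): ...

def parse_aggregate(content: str) -> dict:
    if not content.strip().startswith("<"):
        raise InvalidAggregateReport("not XML")
    return {"xml": content}

def parse_smtp_tls(content: str) -> dict:
    if not content.strip().startswith("{"):
        raise InvalidSMTPTLSReport("not JSON")
    return {"json": content}

def detect_report(content: str) -> dict:
    # aggregate XML first, then SMTP TLS JSON, then fall back to forensic
    try:
        return {"report_type": "aggregate", "report": parse_aggregate(content)}
    except InvalidAggregateReport:
        try:
            return {"report_type": "smtp_tls", "report": parse_smtp_tls(content)}
        except InvalidSMTPTLSReport:
            return {"report_type": "forensic", "report": {"raw": content}}
```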
@@ -1665,7 +1859,7 @@ def get_dmarc_reports_from_mbox(
reverse_dns_map_url: Optional[str] = None, reverse_dns_map_url: Optional[str] = None,
offline: Optional[bool] = False, offline: Optional[bool] = False,
normalize_timespan_threshold_hours: Optional[float] = 24.0, normalize_timespan_threshold_hours: Optional[float] = 24.0,
) -> OrderedDict[str, OrderedDict[str, Any]]: ) -> dict[str, dict[str, Any]]:
"""Parses a mailbox in mbox format containing e-mails with attached """Parses a mailbox in mbox format containing e-mails with attached
DMARC reports DMARC reports
@@ -1684,7 +1878,7 @@ def get_dmarc_reports_from_mbox(
normalize_timespan_threshold_hours (float): Normalize timespans beyond this normalize_timespan_threshold_hours (float): Normalize timespans beyond this
Returns: Returns:
OrderedDict: Lists of ``aggregate_reports``, ``forensic_reports``, and ``smtp_tls_reports`` dict: Lists of ``aggregate_reports``, ``forensic_reports``, and ``smtp_tls_reports``
""" """
aggregate_reports = [] aggregate_reports = []
@@ -1733,13 +1927,11 @@ def get_dmarc_reports_from_mbox(
logger.warning(error.__str__()) logger.warning(error.__str__())
except mailbox.NoSuchMailboxError: except mailbox.NoSuchMailboxError:
raise InvalidDMARCReport("Mailbox {0} does not exist".format(input_)) raise InvalidDMARCReport("Mailbox {0} does not exist".format(input_))
return OrderedDict( return {
[ "aggregate_reports": aggregate_reports,
("aggregate_reports", aggregate_reports), "forensic_reports": forensic_reports,
("forensic_reports", forensic_reports), "smtp_tls_reports": smtp_tls_reports,
("smtp_tls_reports", smtp_tls_reports), }
]
)
def get_dmarc_reports_from_mailbox( def get_dmarc_reports_from_mailbox(
@@ -1750,19 +1942,19 @@ def get_dmarc_reports_from_mailbox(
delete: Optional[bool] = False, delete: Optional[bool] = False,
test: Optional[bool] = False, test: Optional[bool] = False,
ip_db_path: Optional[str] = None, ip_db_path: Optional[str] = None,
always_use_local_files: Optional[str] = False, always_use_local_files: Optional[bool] = False,
reverse_dns_map_path: Optional[str] = None, reverse_dns_map_path: Optional[str] = None,
reverse_dns_map_url: Optional[str] = None, reverse_dns_map_url: Optional[str] = None,
offline: Optional[bool] = False, offline: Optional[bool] = False,
nameservers: Optional[list[str]] = None, nameservers: Optional[list[str]] = None,
dns_timeout: Optional[float] = 6.0, dns_timeout: Optional[float] = 6.0,
strip_attachment_payloads: Optional[bool] = False, strip_attachment_payloads: Optional[bool] = False,
results: Optional[OrderedDict[str, any]] = None, results: Optional[dict[str, Any]] = None,
batch_size: Optional[int] = 10, batch_size: Optional[int] = 10,
since: Optional[datetime] = None, since: Optional[datetime] = None,
create_folders: Optional[bool] = True, create_folders: Optional[bool] = True,
normalize_timespan_threshold_hours: Optional[float] = 24, normalize_timespan_threshold_hours: Optional[float] = 24,
) -> OrderedDict[str, OrderedDict[str, Any]]: ) -> dict[str, dict[str, Any]]:
""" """
Fetches and parses DMARC reports from a mailbox Fetches and parses DMARC reports from a mailbox
@@ -1791,7 +1983,7 @@ def get_dmarc_reports_from_mailbox(
normalize_timespan_threshold_hours (float): Normalize timespans beyond this normalize_timespan_threshold_hours (float): Normalize timespans beyond this
Returns: Returns:
OrderedDict: Lists of ``aggregate_reports``, ``forensic_reports``, and ``smtp_tls_reports`` dict: Lists of ``aggregate_reports``, ``forensic_reports``, and ``smtp_tls_reports``
""" """
if delete and test: if delete and test:
raise ValueError("delete and test options are mutually exclusive") raise ValueError("delete and test options are mutually exclusive")
@@ -2025,13 +2217,11 @@ def get_dmarc_reports_from_mailbox(
except Exception as e: except Exception as e:
e = "Error moving message UID {0}: {1}".format(msg_uid, e) e = "Error moving message UID {0}: {1}".format(msg_uid, e)
logger.error("Mailbox error: {0}".format(e)) logger.error("Mailbox error: {0}".format(e))
results = OrderedDict( results: dict[str, Any] = {
[ "aggregate_reports": aggregate_reports,
("aggregate_reports", aggregate_reports), "forensic_reports": forensic_reports,
("forensic_reports", forensic_reports), "smtp_tls_reports": smtp_tls_reports,
("smtp_tls_reports", smtp_tls_reports), }
]
)
if current_time: if current_time:
total_messages = len( total_messages = len(
@@ -2172,7 +2362,7 @@ def append_csv(filename, csv):
def save_output( def save_output(
results: OrderedDict[str, Any], results: dict[str, Any],
*, *,
output_directory: Optional[str] = "output", output_directory: Optional[str] = "output",
aggregate_json_filename: Optional[str] = "aggregate.json", aggregate_json_filename: Optional[str] = "aggregate.json",
@@ -2186,7 +2376,7 @@ def save_output(
Save report data in the given directory Save report data in the given directory
Args: Args:
results (OrderedDict): Parsing results results (dict): Parsing results
output_directory (str): The path to the directory to save in output_directory (str): The path to the directory to save in
aggregate_json_filename (str): Filename for the aggregate JSON file aggregate_json_filename (str): Filename for the aggregate JSON file
forensic_json_filename (str): Filename for the forensic JSON file forensic_json_filename (str): Filename for the forensic JSON file
@@ -2258,12 +2448,12 @@ def save_output(
sample_file.write(sample) sample_file.write(sample)
def get_report_zip(results: OrderedDict[str, Any]) -> bytes: def get_report_zip(results: dict[str, Any]) -> bytes:
""" """
Creates a zip file of parsed report output Creates a zip file of parsed report output
Args: Args:
results (OrderedDict): The parsed results results (dict): The parsed results
Returns: Returns:
bytes: zip file bytes bytes: zip file bytes
@@ -2304,7 +2494,7 @@ def get_report_zip(results: OrderedDict[str, Any]) -> bytes:
def email_results( def email_results(
results: OrderedDict, results: dict[str, Any],
*, *,
host: str, host: str,
mail_from: str, mail_from: str,
@@ -2324,7 +2514,7 @@ def email_results(
Emails parsing results as a zip file Emails parsing results as a zip file
Args: Args:
results (OrderedDict): Parsing results results (dict): Parsing results
host (str): Mail server hostname or IP address host (str): Mail server hostname or IP address
mail_from: The value of the message from header mail_from: The value of the message from header
mail_to (list): A list of addresses to mail to mail_to (list): A list of addresses to mail to


@@ -10,7 +10,6 @@ from glob import glob
import logging import logging
import math import math
import yaml import yaml
from collections import OrderedDict
import json import json
from ssl import CERT_NONE, create_default_context from ssl import CERT_NONE, create_default_context
from multiprocessing import Pipe, Process from multiprocessing import Pipe, Process
@@ -1481,7 +1480,7 @@ def _main():
pbar.update(counter - pbar.n) pbar.update(counter - pbar.n)
for result in results: for result in results:
if type(result[0]) is ParserError: if isinstance(result[0], ParserError) or result[0] is None:
logger.error("Failed to parse {0} - {1}".format(result[1], result[0])) logger.error("Failed to parse {0} - {1}".format(result[1], result[0]))
else: else:
if result[0]["report_type"] == "aggregate": if result[0]["report_type"] == "aggregate":
@@ -1634,7 +1633,7 @@ def _main():
logger.exception("Mailbox Error") logger.exception("Mailbox Error")
exit(1) exit(1)
results = OrderedDict( results = dict(
[ [
("aggregate_reports", aggregate_reports), ("aggregate_reports", aggregate_reports),
("forensic_reports", forensic_reports), ("forensic_reports", forensic_reports),


@@ -1,2 +1,3 @@
__version__ = "9.0.2" __version__ = "9.0.5"
USER_AGENT = f"parsedmarc/{__version__}" USER_AGENT = f"parsedmarc/{__version__}"


@@ -4,7 +4,6 @@ from __future__ import annotations
from typing import Optional, Union, Any from typing import Optional, Union, Any
from collections import OrderedDict
from elasticsearch_dsl.search import Q from elasticsearch_dsl.search import Q
from elasticsearch_dsl import ( from elasticsearch_dsl import (
@@ -203,11 +202,11 @@ class _SMTPTLSPolicyDoc(InnerDoc):
def add_failure_details( def add_failure_details(
self, self,
result_type: str, result_type: Optional[str] = None,
ip_address: str, ip_address: Optional[str] = None,
receiving_ip: str, receiving_ip: Optional[str] = None,
receiving_mx_helo: str, receiving_mx_helo: Optional[str] = None,
failed_session_count: int, failed_session_count: Optional[int] = None,
sending_mta_ip: Optional[str] = None, sending_mta_ip: Optional[str] = None,
receiving_mx_hostname: Optional[str] = None, receiving_mx_hostname: Optional[str] = None,
additional_information_uri: Optional[str] = None, additional_information_uri: Optional[str] = None,
@@ -279,7 +278,7 @@ def set_hosts(
Sets the Elasticsearch hosts to use Sets the Elasticsearch hosts to use
Args: Args:
hosts (str | list[str]): A single hostname or URL, or list of hostnames or URLs hosts (Union[str, list[str]]): A single hostname or URL, or list of hostnames or URLs
use_ssl (bool): Use an HTTPS connection to the server use_ssl (bool): Use an HTTPS connection to the server
ssl_cert_path (str): Path to the certificate chain ssl_cert_path (str): Path to the certificate chain
username (str): The username to use for authentication username (str): The username to use for authentication
@@ -297,7 +296,7 @@ def set_hosts(
conn_params["ca_certs"] = ssl_cert_path conn_params["ca_certs"] = ssl_cert_path
else: else:
conn_params["verify_certs"] = False conn_params["verify_certs"] = False
if username: if username and password:
conn_params["http_auth"] = username + ":" + password conn_params["http_auth"] = username + ":" + password
if api_key: if api_key:
conn_params["api_key"] = api_key conn_params["api_key"] = api_key
@@ -377,7 +376,7 @@ def migrate_indexes(
def save_aggregate_report_to_elasticsearch( def save_aggregate_report_to_elasticsearch(
aggregate_report: OrderedDict[str, Any], aggregate_report: dict[str, Any],
index_suffix: Optional[str] = None, index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None, index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False, monthly_indexes: Optional[bool] = False,
@@ -388,7 +387,7 @@ def save_aggregate_report_to_elasticsearch(
Saves a parsed DMARC aggregate report to Elasticsearch Saves a parsed DMARC aggregate report to Elasticsearch
Args: Args:
aggregate_report (OrderedDict): A parsed aggregate report aggregate_report (dict): A parsed aggregate report
index_suffix (str): The suffix of the name of the index to save to index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily indexes monthly_indexes (bool): Use monthly indexes instead of daily indexes
@@ -539,8 +538,8 @@ def save_aggregate_report_to_elasticsearch(
def save_forensic_report_to_elasticsearch( def save_forensic_report_to_elasticsearch(
forensic_report: OrderedDict[str, Any], forensic_report: dict[str, Any],
index_suffix: Optional[any] = None, index_suffix: Optional[Any] = None,
index_prefix: Optional[str] = None, index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False, monthly_indexes: Optional[bool] = False,
number_of_shards: int = 1, number_of_shards: int = 1,
@@ -550,7 +549,7 @@ def save_forensic_report_to_elasticsearch(
Saves a parsed DMARC forensic report to Elasticsearch Saves a parsed DMARC forensic report to Elasticsearch
Args: Args:
forensic_report (OrderedDict): A parsed forensic report forensic_report (dict): A parsed forensic report
index_suffix (str): The suffix of the name of the index to save to index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily monthly_indexes (bool): Use monthly indexes instead of daily
@@ -570,7 +569,7 @@ def save_forensic_report_to_elasticsearch(
sample_date = forensic_report["parsed_sample"]["date"] sample_date = forensic_report["parsed_sample"]["date"]
sample_date = human_timestamp_to_datetime(sample_date) sample_date = human_timestamp_to_datetime(sample_date)
original_headers = forensic_report["parsed_sample"]["headers"] original_headers = forensic_report["parsed_sample"]["headers"]
headers = OrderedDict() headers = dict()
for original_header in original_headers: for original_header in original_headers:
headers[original_header.lower()] = original_headers[original_header] headers[original_header.lower()] = original_headers[original_header]
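The loop above normalizes header names to lowercase before indexing; the same thing as a comprehension over a sample header mapping:

```python
# Equivalent to the lowercasing loop above, over illustrative headers.
original_headers = {"From": "alice@example.com", "Subject": "DMARC report"}
headers = {name.lower(): value for name, value in original_headers.items()}
```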
@@ -706,9 +705,9 @@ def save_forensic_report_to_elasticsearch(
def save_smtp_tls_report_to_elasticsearch( def save_smtp_tls_report_to_elasticsearch(
report: OrderedDict[str, Any], report: dict[str, Any],
index_suffix: str = None, index_suffix: Optional[str] = None,
index_prefix: str = None, index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False, monthly_indexes: Optional[bool] = False,
number_of_shards: Optional[int] = 1, number_of_shards: Optional[int] = 1,
number_of_replicas: Optional[int] = 0, number_of_replicas: Optional[int] = 0,
@@ -717,7 +716,7 @@ def save_smtp_tls_report_to_elasticsearch(
Saves a parsed SMTP TLS report to Elasticsearch Saves a parsed SMTP TLS report to Elasticsearch
Args: Args:
report (OrderedDict): A parsed SMTP TLS report report (dict): A parsed SMTP TLS report
index_suffix (str): The suffix of the name of the index to save to index_suffix (str): The suffix of the name of the index to save to
index_prefix (str): The prefix of the name of the index to save to index_prefix (str): The prefix of the name of the index to save to
monthly_indexes (bool): Use monthly indexes instead of daily indexes monthly_indexes (bool): Use monthly indexes instead of daily indexes


@@ -8,7 +8,6 @@ import logging
import logging.handlers import logging.handlers
import json import json
import threading import threading
from collections import OrderedDict
from parsedmarc import ( from parsedmarc import (
parsed_aggregate_reports_to_csv_rows, parsed_aggregate_reports_to_csv_rows,
@@ -53,7 +52,9 @@ class GelfClient(object):
) )
self.logger.addHandler(self.handler) self.logger.addHandler(self.handler)
def save_aggregate_report_to_gelf(self, aggregate_reports: OrderedDict[str, Any]): def save_aggregate_report_to_gelf(
self, aggregate_reports: list[dict[str, Any]]
):
rows = parsed_aggregate_reports_to_csv_rows(aggregate_reports) rows = parsed_aggregate_reports_to_csv_rows(aggregate_reports)
for row in rows: for row in rows:
log_context_data.parsedmarc = row log_context_data.parsedmarc = row
@@ -61,12 +62,14 @@ class GelfClient(object):
log_context_data.parsedmarc = None log_context_data.parsedmarc = None
def save_forensic_report_to_gelf(self, forensic_reports: OrderedDict[str, Any]): def save_forensic_report_to_gelf(
self, forensic_reports: list[dict[str, Any]]
):
rows = parsed_forensic_reports_to_csv_rows(forensic_reports) rows = parsed_forensic_reports_to_csv_rows(forensic_reports)
for row in rows: for row in rows:
self.logger.info(json.dumps(row)) self.logger.info(json.dumps(row))
def save_smtp_tls_report_to_gelf(self, smtp_tls_reports: OrderedDict[str, Any]): def save_smtp_tls_report_to_gelf(self, smtp_tls_reports: dict[str, Any]):
rows = parsed_smtp_tls_reports_to_csv_rows(smtp_tls_reports) rows = parsed_smtp_tls_reports_to_csv_rows(smtp_tls_reports)
for row in rows: for row in rows:
self.logger.info(json.dumps(row)) self.logger.info(json.dumps(row))


@@ -2,7 +2,7 @@
from __future__ import annotations from __future__ import annotations
from typing import Any, Optional from typing import Any, Optional, Union
from ssl import SSLContext from ssl import SSLContext
import json import json
@@ -10,7 +10,6 @@ from ssl import create_default_context
from kafka import KafkaProducer from kafka import KafkaProducer
from kafka.errors import NoBrokersAvailable, UnknownTopicOrPartitionError from kafka.errors import NoBrokersAvailable, UnknownTopicOrPartitionError
from collections import OrderedDict
from parsedmarc.utils import human_timestamp_to_datetime from parsedmarc.utils import human_timestamp_to_datetime
from parsedmarc import __version__ from parsedmarc import __version__
@@ -66,7 +65,7 @@ class KafkaClient(object):
raise KafkaError("No Kafka brokers available") raise KafkaError("No Kafka brokers available")
@staticmethod @staticmethod
def strip_metadata(report: OrderedDict[str, Any]): def strip_metadata(report: dict[str, Any]):
""" """
Duplicates org_name, org_email and report_id into JSON root Duplicates org_name, org_email and report_id into JSON root
and removes report_metadata key to bring it more inline and removes report_metadata key to bring it more inline
@@ -80,7 +79,7 @@ class KafkaClient(object):
return report return report
@staticmethod @staticmethod
def generate_date_range(report: OrderedDict[str, Any]): def generate_date_range(report: dict[str, Any]):
""" """
Creates a date_range timestamp with format YYYY-MM-DD-T-HH:MM:SS Creates a date_range timestamp with format YYYY-MM-DD-T-HH:MM:SS
based on begin and end dates for easier parsing in Kibana. based on begin and end dates for easier parsing in Kibana.
@@ -98,7 +97,9 @@ class KafkaClient(object):
return date_range return date_range
def save_aggregate_reports_to_kafka( def save_aggregate_reports_to_kafka(
self, aggregate_reports: list[OrderedDict][str, Any], aggregate_topic: str self,
aggregate_reports: Union[dict[str, Any], list[dict[str, Any]]],
aggregate_topic: str,
): ):
""" """
Saves aggregate DMARC reports to Kafka Saves aggregate DMARC reports to Kafka
@@ -109,9 +110,7 @@ class KafkaClient(object):
aggregate_topic (str): The name of the Kafka topic aggregate_topic (str): The name of the Kafka topic
""" """
if isinstance(aggregate_reports, dict) or isinstance( if isinstance(aggregate_reports, dict):
aggregate_reports, OrderedDict
):
aggregate_reports = [aggregate_reports] aggregate_reports = [aggregate_reports]
if len(aggregate_reports) < 1: if len(aggregate_reports) < 1:
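The two-way `isinstance(x, dict) or isinstance(x, OrderedDict)` check collapsed to a single test because `OrderedDict` is a `dict` subclass. A sketch of the normalization:

```python
from collections import OrderedDict

# OrderedDict subclasses dict, so one isinstance check covers both
# input shapes when promoting a single report to a list.
def as_report_list(reports):
    if isinstance(reports, dict):
        reports = [reports]
    return reports

assert as_report_list({"org_name": "a"}) == [{"org_name": "a"}]
assert as_report_list(OrderedDict(org_name="a")) == [{"org_name": "a"}]
assert as_report_list([{"org_name": "a"}]) == [{"org_name": "a"}]
```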
@@ -143,7 +142,9 @@ class KafkaClient(object):
raise KafkaError("Kafka error: {0}".format(e.__str__())) raise KafkaError("Kafka error: {0}".format(e.__str__()))
def save_forensic_reports_to_kafka( def save_forensic_reports_to_kafka(
self, forensic_reports: OrderedDict[str, Any], forensic_topic: str self,
forensic_reports: Union[dict[str, Any], list[dict[str, Any]]],
forensic_topic: str,
): ):
""" """
Saves forensic DMARC reports to Kafka, sends individual Saves forensic DMARC reports to Kafka, sends individual
@@ -175,7 +176,9 @@ class KafkaClient(object):
raise KafkaError("Kafka error: {0}".format(e.__str__())) raise KafkaError("Kafka error: {0}".format(e.__str__()))
def save_smtp_tls_reports_to_kafka( def save_smtp_tls_reports_to_kafka(
self, smtp_tls_reports: list[OrderedDict[str, Any]], smtp_tls_topic: str self,
smtp_tls_reports: Union[list[dict[str, Any]], dict[str, Any]],
smtp_tls_topic: str,
): ):
""" """
Saves SMTP TLS reports to Kafka, sends individual Saves SMTP TLS reports to Kafka, sends individual


@@ -3,7 +3,6 @@
from __future__ import annotations from __future__ import annotations
from typing import Any from typing import Any
from collections import OrderedDict
from parsedmarc.log import logger from parsedmarc.log import logger
from azure.core.exceptions import HttpResponseError from azure.core.exceptions import HttpResponseError
@@ -110,7 +109,7 @@ class LogAnalyticsClient(object):
def publish_json( def publish_json(
self, self,
results: OrderedDict[str, OrderedDict[str, Any]], results,
logs_client: LogsIngestionClient, logs_client: LogsIngestionClient,
dcr_stream: str, dcr_stream: str,
): ):
@@ -133,7 +132,7 @@ class LogAnalyticsClient(object):
def publish_results( def publish_results(
self, self,
results: OrderedDict[str, OrderedDict[str, Any]], results: dict[str, dict[str, Any]],
save_aggregate: bool, save_aggregate: bool,
save_forensic: bool, save_forensic: bool,
save_smtp_tls: bool, save_smtp_tls: bool,


@@ -4,7 +4,6 @@ from __future__ import annotations

 from typing import Optional, Union, Any
-from collections import OrderedDict

 from opensearchpy import (
     Q,
@@ -203,11 +202,11 @@ class _SMTPTLSPolicyDoc(InnerDoc):

     def add_failure_details(
         self,
-        result_type: str,
-        ip_address: str,
-        receiving_ip: str,
-        receiving_mx_helo: str,
-        failed_session_count: int,
+        result_type: Optional[str] = None,
+        ip_address: Optional[str] = None,
+        receiving_ip: Optional[str] = None,
+        receiving_mx_helo: Optional[str] = None,
+        failed_session_count: Optional[int] = None,
         sending_mta_ip: Optional[str] = None,
         receiving_mx_hostname: Optional[str] = None,
         additional_information_uri: Optional[str] = None,
@@ -297,7 +296,7 @@ def set_hosts(
         conn_params["ca_certs"] = ssl_cert_path
     else:
         conn_params["verify_certs"] = False
-    if username:
+    if username and password:
         conn_params["http_auth"] = username + ":" + password
     if api_key:
         conn_params["api_key"] = api_key
@@ -376,8 +375,8 @@ def migrate_indexes(
         pass

-def save_aggregate_report_to_elasticsearch(
-    aggregate_report: OrderedDict[str, Any],
+def save_aggregate_report_to_opensearch(
+    aggregate_report: dict[str, Any],
     index_suffix: Optional[str] = None,
     index_prefix: Optional[str] = None,
     monthly_indexes: Optional[bool] = False,
@@ -388,7 +387,7 @@ def save_aggregate_report_to_elasticsearch(
     Saves a parsed DMARC aggregate report to OpenSearch

     Args:
-        aggregate_report (OrderedDict): A parsed forensic report
+        aggregate_report (dict): A parsed forensic report
         index_suffix (str): The suffix of the name of the index to save to
         index_prefix (str): The prefix of the name of the index to save to
         monthly_indexes (bool): Use monthly indexes instead of daily indexes
@@ -538,9 +537,9 @@ def save_aggregate_report_to_elasticsearch(
         raise OpenSearchError("OpenSearch error: {0}".format(e.__str__()))

-def save_forensic_report_to_elasticsearch(
-    forensic_report: OrderedDict[str, Any],
-    index_suffix: Optional[any] = None,
+def save_forensic_report_to_opensearch(
+    forensic_report: dict[str, Any],
+    index_suffix: Optional[str] = None,
     index_prefix: Optional[str] = None,
     monthly_indexes: Optional[bool] = False,
     number_of_shards: int = 1,
@@ -550,7 +549,7 @@ def save_forensic_report_to_elasticsearch(
     Saves a parsed DMARC forensic report to OpenSearch

     Args:
-        forensic_report (OrderedDict): A parsed forensic report
+        forensic_report (dict): A parsed forensic report
         index_suffix (str): The suffix of the name of the index to save to
         index_prefix (str): The prefix of the name of the index to save to
         monthly_indexes (bool): Use monthly indexes instead of daily
@@ -570,7 +569,7 @@ def save_forensic_report_to_elasticsearch(
     sample_date = forensic_report["parsed_sample"]["date"]
     sample_date = human_timestamp_to_datetime(sample_date)
     original_headers = forensic_report["parsed_sample"]["headers"]
-    headers = OrderedDict()
+    headers = dict()
     for original_header in original_headers:
         headers[original_header.lower()] = original_headers[original_header]
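Swapping `OrderedDict()` for `dict()` above is behavior-preserving: plain dicts are guaranteed to preserve insertion order since Python 3.7, which covers every version these commits target. The header-lowercasing loop works identically either way (sample header values are illustrative):

```python
original_headers = {"From": "a@example.com", "To": "b@example.com", "Subject": "test"}

headers = dict()  # insertion-ordered on Python 3.7+, no OrderedDict needed
for original_header in original_headers:
    headers[original_header.lower()] = original_headers[original_header]
```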
@@ -705,10 +704,10 @@ def save_forensic_report_to_elasticsearch(
     )

-def save_smtp_tls_report_to_elasticsearch(
-    report: OrderedDict[str, Any],
-    index_suffix: str = None,
-    index_prefix: str = None,
+def save_smtp_tls_report_to_opensearch(
+    report: dict[str, Any],
+    index_suffix: Optional[str] = None,
+    index_prefix: Optional[str] = None,
     monthly_indexes: Optional[bool] = False,
     number_of_shards: Optional[int] = 1,
     number_of_replicas: Optional[int] = 0,
@@ -717,7 +716,7 @@ def save_smtp_tls_report_to_elasticsearch(
     Saves a parsed SMTP TLS report to OpenSearch

     Args:
-        report (OrderedDict): A parsed SMTP TLS report
+        report (dict): A parsed SMTP TLS report
         index_suffix (str): The suffix of the name of the index to save to
         index_prefix (str): The prefix of the name of the index to save to
         monthly_indexes (bool): Use monthly indexes instead of daily indexes

View File

@@ -132,6 +132,7 @@ asu-vei.ru,ASU-VEI,Industrial
 atextelecom.com.br,ATEX Telecom,ISP
 atmailcloud.com,atmail,Email Provider
 ats.ca,ATS Healthcare,Healthcare
+att.net,AT&T,ISP
 atw.ne.jp,ATW,Web Host
 au-net.ne.jp,KDDI,ISP
 au.com,au,ISP
@@ -242,6 +243,7 @@ carandainet.com.br,CN Internet,ISP
 cardhealth.com,Cardinal Health,Healthcare
 cardinal.com,Cardinal Health,Healthcare
 cardinalhealth.com,Cardinal Health,Healthcare
+cardinalscriptnet.com,Cardinal Health,Healthcare
 carecentrix.com,CareCentrix,Healthcare
 carleton.edu,Carlton College,Education
 carrierzone.com,carrierzone,Email Security
@@ -697,6 +699,7 @@ hdsupply-email.com,HD Supply,Retail
 healthall.com,UC Health,Healthcare
 healthcaresupplypros.com,Healthcare Supply Pros,Healthcare
 healthproductsforyou.com,Health Products For You,Healthcare
+healthtouch.com,Cardinal Health,Healthcare
 helloserver6.com,1st Source Web,Marketing
 helpforcb.com,InterServer,Web Host
 helpscout.net,Help Scout,SaaS
@@ -753,6 +756,8 @@ hostwindsdns.com,Hostwinds,Web Host
 hotnet.net.il,Hot Net Internet Services,ISP
 hp.com,HP,Technology
 hringdu.is,Hringdu,ISP
+hslda.net,Home School Legal Defense Association (HSLDA),Education
+hslda.org,Home School Legal Defense Association (HSLDA),Education
 hspherefilter.com,"DynamicNet, Inc. (DNI)",Web Host
 htc.net,HTC,ISP
 htmlservices.it,HTMLServices.it,MSP
@@ -763,6 +768,7 @@ hughston.com,Hughston Clinic,Healthcare
 hvvc.us,Hivelocity,Web Host
 i2ts.ne.jp,i2ts,Web Host
 i4i.com,i4i,Technology
+ibindley.com,Cardinal Health,Healthcare
 ice.co.cr,Grupo ICE,Industrial
 icehosting.nl,IceHosting,Web Host
 icewarpcloud.in,IceWrap,Email Provider
@@ -832,6 +838,7 @@ ip-5-196-151.eu,OVH,Web Host
 ip-51-161-36.net,OVH,Web Host
 ip-51-195-53.eu,OVH,Web Host
 ip-51-254-53.eu,OVH,Web Host
+ip-51-38-67.eu,OVH,Web Host
 ip-51-77-42.eu,OVH,Web Host
 ip-51-83-140.eu,OVH,Web Host
 ip-51-89-240.eu,OVH,Web Host
@@ -1217,6 +1224,7 @@ nettoday.co.th,Net Today,Web Host
 netventure.pl,Netventure,MSP
 netvigator.com,HKT,ISP
 netvision.net.il,013 Netvision,ISP
+network-tech.com,Network Technologies International (NTI),SaaS
 network.kz,network.kz,ISP
 network80.com,Network80,Web Host
 neubox.net,Neubox,Web Host

View File

@@ -7,8 +7,6 @@ from typing import Any

 import json
 import boto3
-from collections import OrderedDict

 from parsedmarc.log import logger
 from parsedmarc.utils import human_timestamp_to_datetime
@@ -53,18 +51,18 @@ class S3Client(object):
             aws_access_key_id=access_key_id,
             aws_secret_access_key=secret_access_key,
         )
-        self.bucket = self.s3.Bucket(self.bucket_name)
+        self.bucket: Any = self.s3.Bucket(self.bucket_name)

-    def save_aggregate_report_to_s3(self, report: OrderedDict[str, Any]):
+    def save_aggregate_report_to_s3(self, report: dict[str, Any]):
         self.save_report_to_s3(report, "aggregate")

-    def save_forensic_report_to_s3(self, report: OrderedDict[str, Any]):
+    def save_forensic_report_to_s3(self, report: dict[str, Any]):
         self.save_report_to_s3(report, "forensic")

-    def save_smtp_tls_report_to_s3(self, report: OrderedDict[str, Any]):
+    def save_smtp_tls_report_to_s3(self, report: dict[str, Any]):
         self.save_report_to_s3(report, "smtp_tls")

-    def save_report_to_s3(self, report: OrderedDict[str, Any], report_type: str):
+    def save_report_to_s3(self, report: dict[str, Any], report_type: str):
         if report_type == "smtp_tls":
             report_date = report["begin_date"]
             report_id = report["report_id"]

View File

@@ -2,9 +2,8 @@
 from __future__ import annotations

-from typing import Any
-from collections import OrderedDict
+from typing import Any, Union
 from urllib.parse import urlparse
 import socket
@@ -35,7 +34,7 @@ class HECClient(object):
         url: str,
         access_token: str,
         index: str,
-        source: bool = "parsedmarc",
+        source: str = "parsedmarc",
         verify=True,
         timeout=60,
     ):
@@ -51,9 +50,9 @@ class HECClient(object):
         timeout (float): Number of seconds to wait for the server to send
             data before giving up
         """
-        url = urlparse(url)
+        parsed_url = urlparse(url)
         self.url = "{0}://{1}/services/collector/event/1.0".format(
-            url.scheme, url.netloc
+            parsed_url.scheme, parsed_url.netloc
         )
         self.access_token = access_token.lstrip("Splunk ")
         self.index = index
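Renaming the local to `parsed_url` above stops the `url` parameter (a `str`) from being shadowed by a `ParseResult`, which keeps the parameter's annotation honest. The URL rewrite itself, extracted as a standalone sketch (the `collector_url` wrapper and sample host are illustrative):

```python
from urllib.parse import urlparse


def collector_url(url: str) -> str:
    # Keep only scheme and host; any path the caller passed is replaced
    # with the fixed HEC event-collector endpoint.
    parsed_url = urlparse(url)
    return "{0}://{1}/services/collector/event/1.0".format(
        parsed_url.scheme, parsed_url.netloc
    )
```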
@@ -62,7 +61,9 @@ class HECClient(object):
         self.session = requests.Session()
         self.timeout = timeout
         self.session.verify = verify
-        self._common_data = dict(host=self.host, source=self.source, index=self.index)
+        self._common_data: dict[str, Union[str, int, float, dict]] = dict(
+            host=self.host, source=self.source, index=self.index
+        )

         self.session.headers = {
             "User-Agent": USER_AGENT,
@@ -70,7 +71,8 @@ class HECClient(object):
         }

     def save_aggregate_reports_to_splunk(
-        self, aggregate_reports: list[OrderedDict[str, Any]]
+        self,
+        aggregate_reports: Union[list[dict[str, Any]], dict[str, Any]],
     ):
         """
         Saves aggregate DMARC reports to Splunk
@@ -91,7 +93,7 @@ class HECClient(object):
         json_str = ""
         for report in aggregate_reports:
             for record in report["records"]:
-                new_report = dict()
+                new_report: dict[str, Union[str, int, float, dict]] = dict()
                 for metadata in report["report_metadata"]:
                     new_report[metadata] = report["report_metadata"][metadata]
                 new_report["interval_begin"] = record["interval_begin"]
@@ -135,7 +137,8 @@ class HECClient(object):
             raise SplunkError(response["text"])

     def save_forensic_reports_to_splunk(
-        self, forensic_reports: list[OrderedDict[str, Any]]
+        self,
+        forensic_reports: Union[list[dict[str, Any]], dict[str, Any]],
     ):
         """
         Saves forensic DMARC reports to Splunk
@@ -170,7 +173,9 @@ class HECClient(object):
         if response["code"] != 0:
             raise SplunkError(response["text"])

-    def save_smtp_tls_reports_to_splunk(self, reports: OrderedDict[str, Any]):
+    def save_smtp_tls_reports_to_splunk(
+        self, reports: Union[list[dict[str, Any]], dict[str, Any]]
+    ):
         """
         Saves aggregate DMARC reports to Splunk

View File

@@ -8,7 +8,6 @@ import logging.handlers

 from typing import Any
-from collections import OrderedDict
 import json
@@ -37,21 +36,21 @@ class SyslogClient(object):
         self.logger.addHandler(log_handler)

     def save_aggregate_report_to_syslog(
-        self, aggregate_reports: list[OrderedDict[str, Any]]
+        self, aggregate_reports: list[dict[str, Any]]
     ):
         rows = parsed_aggregate_reports_to_csv_rows(aggregate_reports)
         for row in rows:
             self.logger.info(json.dumps(row))

     def save_forensic_report_to_syslog(
-        self, forensic_reports: list[OrderedDict[str, Any]]
+        self, forensic_reports: list[dict[str, Any]]
     ):
         rows = parsed_forensic_reports_to_csv_rows(forensic_reports)
         for row in rows:
             self.logger.info(json.dumps(row))

     def save_smtp_tls_report_to_syslog(
-        self, smtp_tls_reports: list[OrderedDict[str, Any]]
+        self, smtp_tls_reports: list[dict[str, Any]]
     ):
         rows = parsed_smtp_tls_reports_to_csv_rows(smtp_tls_reports)
         for row in rows:

View File

@@ -4,14 +4,13 @@

 from __future__ import annotations

-from typing import Optional, Union
+from typing import Optional, Union, TypedDict, Any
 import logging
 import os
 from datetime import datetime
 from datetime import timezone
 from datetime import timedelta
-from collections import OrderedDict
 from expiringdict import ExpiringDict
 import tempfile
 import subprocess
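The `Union`/`Optional` spellings used throughout these hunks (rather than PEP 604 `X | Y`) matter for the Python 3.9 floor: with `from __future__ import annotations` the pipe syntax would parse inside annotations, but it still fails wherever a type expression is evaluated at runtime on 3.9, such as `isinstance(x, int | None)`. A small illustration of the compatible style:

```python
from __future__ import annotations

from typing import Optional, Union


def first_or_default(items: list[int], default: Optional[int] = None) -> Union[int, None]:
    """Return the first item, or the default when the list is empty."""
    # Optional/Union work on 3.9 everywhere; `int | None` only works
    # inside (lazily evaluated) annotations, not in runtime expressions.
    return items[0] if items else default
```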
@@ -67,6 +66,14 @@ class DownloadError(RuntimeError):
     """Raised when an error occurs when downloading a file"""

+class EmailAddress(TypedDict):
+    """Parsed email address information"""
+
+    display_name: Optional[str]
+    address: str
+    local: Optional[str]
+    domain: Optional[str]
+

 def decode_base64(data: str) -> bytes:
     """
     Decodes a base64 string, with padding being optional
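The new `EmailAddress` TypedDict gives `parse_email_address` a concrete return shape instead of the old `OrderedDict[str, str]`, so type checkers can verify per-key types. A sketch of how such a TypedDict is produced and consumed (the `make_address` helper and its splitting logic are illustrative, not parsedmarc's parser):

```python
from typing import Optional, TypedDict


class EmailAddress(TypedDict):
    """Parsed email address information (mirrors the definition in the diff)."""

    display_name: Optional[str]
    address: str
    local: Optional[str]
    domain: Optional[str]


def make_address(address: str) -> EmailAddress:
    # Hypothetical helper: split the local part from the domain on "@".
    local, _, domain = address.partition("@")
    return {
        "display_name": None,
        "address": address,
        "local": local.lower(),
        "domain": domain.lower() or None,
    }
```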
@@ -335,14 +342,14 @@ def get_ip_address_country(ip_address: str, *, db_path: Optional[str] = None) ->

 def get_service_from_reverse_dns_base_domain(
-    base_domain,
+    base_domain: str,
     *,
     always_use_local_file: Optional[bool] = False,
-    local_file_path: Optional[bool] = None,
-    url: Optional[bool] = None,
+    local_file_path: Optional[str] = None,
+    url: Optional[str] = None,
     offline: Optional[bool] = False,
-    reverse_dns_map: Optional[bool] = None,
-) -> str:
+    reverse_dns_map: Optional[dict[str, Any]] = None,
+) -> dict[str, Any]:
     """
     Returns the service name of a given base domain name from reverse DNS.
@@ -411,18 +418,18 @@ def get_service_from_reverse_dns_base_domain(

 def get_ip_address_info(
-    ip_address,
+    ip_address: str,
     *,
     ip_db_path: Optional[str] = None,
     reverse_dns_map_path: Optional[str] = None,
     always_use_local_files: Optional[bool] = False,
-    reverse_dns_map_url: Optional[bool] = None,
+    reverse_dns_map_url: Optional[str] = None,
     cache: Optional[ExpiringDict] = None,
-    reverse_dns_map: Optional[bool] = None,
+    reverse_dns_map: Optional[dict[str, Any]] = None,
     offline: Optional[bool] = False,
     nameservers: Optional[list[str]] = None,
     timeout: Optional[float] = 2.0,
-) -> OrderedDict[str, str]:
+) -> dict[str, Any]:
     """
     Returns reverse DNS and country information for the given IP address
@@ -440,7 +447,7 @@ def get_ip_address_info(
         timeout (float): Sets the DNS timeout in seconds

     Returns:
-        OrderedDict: ``ip_address``, ``reverse_dns``, ``country``
+        dict: ``ip_address``, ``reverse_dns``, ``country``
     """
     ip_address = ip_address.lower()
@@ -449,7 +456,7 @@ def get_ip_address_info(
         if info:
             logger.debug(f"IP address {ip_address} was found in cache")
             return info
-    info = OrderedDict()
+    info = dict()
     info["ip_address"] = ip_address
     if offline:
         reverse_dns = None
@@ -487,7 +494,7 @@ def get_ip_address_info(
     return info

-def parse_email_address(original_address: str) -> OrderedDict[str, str]:
+def parse_email_address(original_address: str) -> EmailAddress:
     if original_address[0] == "":
         display_name = None
     else:
@@ -500,14 +507,12 @@ def parse_email_address(original_address: str) -> OrderedDict[str, str]:
     local = address_parts[0].lower()
     domain = address_parts[-1].lower()

-    return OrderedDict(
-        [
-            ("display_name", display_name),
-            ("address", address),
-            ("local", local),
-            ("domain", domain),
-        ]
-    )
+    return {
+        "display_name": display_name,
+        "address": address,
+        "local": local,
+        "domain": domain,
+    }

 def get_filename_safe_string(string: str) -> str:
@@ -553,7 +558,7 @@ def is_mbox(path: str) -> bool:
     return _is_mbox

-def is_outlook_msg(content) -> bool:
+def is_outlook_msg(content: Union[bytes, Any]) -> bool:
     """
     Checks if the given content is an Outlook msg OLE/MSG file
@@ -586,13 +591,14 @@ def convert_outlook_msg(msg_bytes: bytes) -> str:
     os.chdir(tmp_dir)
     with open("sample.msg", "wb") as msg_file:
         msg_file.write(msg_bytes)
+    rfc822_bytes: bytes
     try:
         subprocess.check_call(
             ["msgconvert", "sample.msg"], stdout=null_file, stderr=null_file
         )
         eml_path = "sample.eml"
         with open(eml_path, "rb") as eml_file:
-            rfc822 = eml_file.read()
+            rfc822_bytes = eml_file.read()
     except FileNotFoundError:
         raise EmailParserError(
             "Failed to convert Outlook MSG: msgconvert utility not found"
@@ -601,12 +607,12 @@ def convert_outlook_msg(msg_bytes: bytes) -> str:
     os.chdir(orig_dir)
     shutil.rmtree(tmp_dir)

-    return rfc822
+    return rfc822_bytes.decode("utf-8", errors="replace")

 def parse_email(
     data: Union[bytes, str], *, strip_attachment_payloads: Optional[bool] = False
-):
+) -> dict[str, Any]:
     """
     A simplified email parser
@@ -621,7 +627,8 @@ def parse_email(
     if isinstance(data, bytes):
         if is_outlook_msg(data):
             data = convert_outlook_msg(data)
-        data = data.decode("utf-8", errors="replace")
+        else:
+            data = data.decode("utf-8", errors="replace")
     parsed_email = mailparser.parse_from_string(data)
     headers = json.loads(parsed_email.headers_json).copy()
     parsed_email = json.loads(parsed_email.mail_json).copy()
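The `else:` fix above matters because `convert_outlook_msg` now returns an already-decoded `str`; decoding unconditionally, as the old code did, would call `.decode()` on that `str` and raise `AttributeError`. The control flow, sketched with a stubbed converter (the function names here are illustrative stand-ins; the OLE magic-byte check is an assumption about how the MSG format is detected):

```python
from typing import Union


def _convert_outlook_msg(msg_bytes: bytes) -> str:
    # Stand-in for the real msgconvert-based converter, which returns text.
    return "converted: %d bytes" % len(msg_bytes)


def normalize_email_data(data: Union[bytes, str]) -> str:
    if isinstance(data, bytes):
        if data.startswith(b"\xd0\xcf\x11\xe0"):  # OLE magic marks an Outlook .msg
            data = _convert_outlook_msg(data)  # already a str; must not decode again
        else:
            data = data.decode("utf-8", errors="replace")
    return data
```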

View File

@@ -4,8 +4,6 @@ from __future__ import annotations

 from typing import Any, Optional, Union
-from collections import OrderedDict

 import requests

 from parsedmarc import logger
@@ -40,19 +38,19 @@ class WebhookClient(object):
             "Content-Type": "application/json",
         }

-    def save_forensic_report_to_webhook(self, report: OrderedDict[str, Any]):
+    def save_forensic_report_to_webhook(self, report: str):
         try:
             self._send_to_webhook(self.forensic_url, report)
         except Exception as error_:
             logger.error("Webhook Error: {0}".format(error_.__str__()))

-    def save_smtp_tls_report_to_webhook(self, report: OrderedDict[str, Any]):
+    def save_smtp_tls_report_to_webhook(self, report: str):
         try:
             self._send_to_webhook(self.smtp_tls_url, report)
         except Exception as error_:
             logger.error("Webhook Error: {0}".format(error_.__str__()))

-    def save_aggregate_report_to_webhook(self, report: OrderedDict[str, Any]):
+    def save_aggregate_report_to_webhook(self, report: str):
         try:
             self._send_to_webhook(self.aggregate_url, report)
         except Exception as error_:

View File

@@ -2,6 +2,7 @@
 requires = [
     "hatchling>=1.27.0",
 ]
+requires_python = ">=3.9,<3.14"
 build-backend = "hatchling.build"

 [project]
@@ -28,7 +29,7 @@ classifiers = [
     "Operating System :: OS Independent",
     "Programming Language :: Python :: 3"
 ]
-requires-python = ">= 3.9"
+requires-python = ">=3.9, <3.14"
 dependencies = [
     "azure-identity>=1.8.0",
     "azure-monitor-ingestion>=1.0.0",

View File

@@ -1,3 +1,6 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
 from __future__ import absolute_import, print_function, unicode_literals

 import os