Compare commits

...

19 Commits

Author SHA1 Message Date
copilot-swe-agent[bot]
a4ac5accd2 Add TypedDict definitions to replace dict[str, Any] for better typing
- Added comprehensive TypedDict definitions for all major data structures
- IPAddressInfo, EmailAddress, ReverseDNSService in utils.py
- AggregateReport, ForensicReport, SMTPTLSReport and related types in __init__.py
- Updated function signatures throughout codebase to use TypedDict types
- Reduced dict[str, Any] usage from 50+ to 11 (remaining are legitimate generic cases)
- All tests pass successfully

Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-19 21:29:14 +00:00
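Not part of this compare: a minimal sketch of the pattern the commit above applies. The `IPAddressInfo` fields come from the `utils.py` hunk further down in this diff; the `describe_*` helpers are hypothetical and exist only to show what a static type checker gains from swapping `dict[str, Any]` for a `TypedDict`.

```python
from __future__ import annotations

from typing import Any, Optional, TypedDict


class IPAddressInfo(TypedDict, total=False):
    """Information about an IP address (fields as defined in utils.py)."""

    ip_address: str
    country: Optional[str]
    reverse_dns: Optional[str]
    base_domain: Optional[str]
    name: Optional[str]
    type: Optional[str]


# Before: any string key and any value type pass the type checker.
def describe_untyped(info: dict[str, Any]) -> str:
    return f"{info['ip_address']} ({info.get('country')})"


# After: mypy/pyright reject misspelled keys (e.g. info["contry"]) and
# mismatched value types, while the object remains a plain dict at runtime.
def describe_typed(info: IPAddressInfo) -> str:
    return f"{info['ip_address']} ({info.get('country')})"


print(describe_typed({"ip_address": "203.0.113.7", "country": "US"}))
```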
copilot-swe-agent[bot]
639f8f674b Replace dict() and OrderedDict() with literal dictionaries for better typing
- Converted all dict() calls to {} literals
- Replaced OrderedDict() with {} literals (Python 3.7+ preserves insertion order)
- Updated all OrderedDict type hints to dict type hints
- Removed unused OrderedDict imports from all files
- All tests pass successfully
- Code passes ruff checks

Co-authored-by: seanthegeek <44679+seanthegeek@users.noreply.github.com>
2025-12-19 21:04:56 +00:00
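Not part of this compare: a quick sketch of why the `OrderedDict()`-to-literal swap in the commit above is behavior-preserving. The keys mirror the `policy_evaluated` defaults changed in the `__init__.py` hunk below.

```python
from collections import OrderedDict

# Before: OrderedDict constructed from a list of key/value pairs.
old_style = OrderedDict(
    [
        ("disposition", "none"),
        ("dkim", "fail"),
        ("spf", "fail"),
        ("policy_override_reasons", []),
    ]
)

# After: a plain dict literal. Since Python 3.7, dict preserves insertion
# order, so iteration order and serialized output are unchanged.
new_style = {
    "disposition": "none",
    "dkim": "fail",
    "spf": "fail",
    "policy_override_reasons": [],
}

assert list(old_style) == list(new_style)  # same key order
assert dict(old_style) == new_style        # same contents
```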
copilot-swe-agent[bot]
380df4df7c Initial plan 2025-12-19 20:49:40 +00:00
Sean Whalen
af9ad568ec Specify Python version requirements in pyproject.toml 2025-12-17 16:18:24 -05:00
Sean Whalen
748164d177 Fix #638 2025-12-17 16:09:26 -05:00
Sean Whalen
487e5e1149 Format on build 2025-12-12 15:56:52 -05:00
Sean Whalen
73010cf964 Use ruff for code formatting 2025-12-12 15:44:46 -05:00
Sean Whalen
a4a5475aa8 Fix another typo before releasing 9.0.5 2025-12-08 15:29:48 -05:00
Sean Whalen
dab78880df Actual 9.0.5 release
Fix typo
2025-12-08 15:26:58 -05:00
Sean Whalen
fb54e3b742 9.0.5
- Fix report type detection bug introduced in `9.0.4` (yanked).
2025-12-08 15:22:02 -05:00
Sean Whalen
6799f10364 9.0.4
Fixes

- Fix saving reports to OpenSearch ([#637](https://github.com/domainaware/parsedmarc/issues/637))
- Fix parsing certain DMARC failure/forensic reports
- Some fixes to type hints (incomplete, but published as-is due to the above bugs)
2025-12-08 13:26:59 -05:00
Sean Whalen
445c9565a4 Update bug link in docs 2025-12-06 15:05:19 -05:00
Sean Whalen
4b786846ae Remove Python 3.14 from testing
Until cpython bug https://github.com/python/cpython/issues/142307 is fixed
2025-12-05 11:05:29 -05:00
Sean Whalen
23ae563cd8 Update Python version support details in documentation 2025-12-05 10:48:04 -05:00
Sean Whalen
cdd000e675 9.0.3
- Set `requires-python` to `>=3.9, <3.14` to avoid [this bug](https://github.com/python/cpython/issues/142307)
2025-12-05 10:43:28 -05:00
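Not part of this compare: a small check of how the `requires-python` pin described above behaves, assuming the third-party `packaging` library is installed (pip applies the same specifier semantics when resolving installs).

```python
from packaging.specifiers import SpecifierSet

# The specifier added to pyproject.toml in 9.0.3.
spec = SpecifierSet(">=3.9, <3.14")

for version in ("3.8", "3.9", "3.13", "3.14"):
    print(version, version in spec)
# 3.8 False, 3.9 True, 3.13 True, 3.14 False
```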
Sean Whalen
7d58abc67b Add shebang and encoding declaration to tests.py 2025-12-04 10:21:53 -05:00
Sean Whalen
a18ae439de Fix typo in RHEL version support description in documentation 2025-12-04 10:18:15 -05:00
Sean Whalen
d7061330a8 Use None for blank fields in the Top 1000 Message Sources by Name DMARC Summary dashboard widget 2025-12-03 09:22:33 -05:00
Sean Whalen
9d5654b8ec Fix bugs with the Top 1000 Message Sources by Name DMARC Summary dashboard widget 2025-12-03 09:14:52 -05:00
23 changed files with 531 additions and 270 deletions

View File

@@ -30,7 +30,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["3.9", "3.10", "3.11", "3.12", "3.13", "3.14"]
python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
steps:
- uses: actions/checkout@v5

.vscode/settings.json vendored
View File

@@ -1,4 +1,14 @@
{
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnSave": true,
// Let Ruff handle lint fixes + import sorting on save
"editor.codeActionsOnSave": {
"source.fixAll.ruff": "explicit",
"source.organizeImports.ruff": "explicit"
}
},
"markdownlint.config": {
"MD024": false
},

View File

@@ -1,5 +1,25 @@
# Changelog
## 9.0.5
### Fixes
- Fix report type detection bug introduced in `9.0.4`.
## 9.0.4 (Yanked)
### Fixes
- Fix saving reports to OpenSearch ([#637](https://github.com/domainaware/parsedmarc/issues/637))
- Fix parsing certain DMARC failure/forensic reports
- Some fixes to type hints (incomplete, but published as-is due to the above bugs)
## 9.0.3
### Fixes
- Set `requires-python` to `>=3.9, <3.14` to avoid [this bug](https://github.com/python/cpython/issues/142307)
## 9.0.2
## Improvements

View File

@@ -53,12 +53,12 @@ for RHEL or Debian.
| Version | Supported | Reason |
|---------|-----------|------------------------------------------------------------|
| < 3.6 | ❌ | End of Life (EOL) |
| 3.6 | ❌ | Used in RHHEL 8, but not supported by project dependencies |
| 3.7 | ❌ | End of Life (EOL) |
| 3.8 | ❌ | End of Life (EOL) |
| 3.9 | ✅ | Supported until August 2026 (Debian 11); May 2032 (RHEL 9) |
| 3.10 | ✅ | Actively maintained |
| 3.11 | ✅ | Actively maintained; supported until June 2028 (Debian 12) |
| 3.12 | ✅ | Actively maintained; supported until May 2035 (RHEL 10) |
| 3.13 | ✅ | Actively maintained; supported until June 2030 (Debian 13) |
| 3.14 | | Actively maintained |
| 3.6 | ❌ | Used in RHEL 8, but not supported by project dependencies |
| 3.7 | ❌ | End of Life (EOL) |
| 3.8 | ❌ | End of Life (EOL) |
| 3.9 | ✅ | Supported until August 2026 (Debian 11); May 2032 (RHEL 9) |
| 3.10 | ✅ | Actively maintained |
| 3.11 | ✅ | Actively maintained; supported until June 2028 (Debian 12) |
| 3.12 | ✅ | Actively maintained; supported until May 2035 (RHEL 10) |
| 3.13 | ✅ | Actively maintained; supported until June 2030 (Debian 13) |
| 3.14 | | Not currently supported due to [this imapclient bug](https://github.com/mjs/imapclient/issues/618)|

View File

@@ -9,7 +9,6 @@ fi
. venv/bin/activate
pip install .[build]
ruff format .
ruff check .
cd docs
make clean
make html

View File

@@ -53,16 +53,15 @@ for RHEL or Debian.
| Version | Supported | Reason |
|---------|-----------|------------------------------------------------------------|
| < 3.6 | ❌ | End of Life (EOL) |
| 3.6 | ❌ | Used in RHHEL 8, but not supported by project dependencies |
| 3.7 | ❌ | End of Life (EOL) |
| 3.8 | ❌ | End of Life (EOL) |
| 3.9 | ✅ | Supported until August 2026 (Debian 11); May 2032 (RHEL 9) |
| 3.10 | ✅ | Actively maintained |
| 3.11 | ✅ | Actively maintained; supported until June 2028 (Debian 12) |
| 3.12 | ✅ | Actively maintained; supported until May 2035 (RHEL 10) |
| 3.13 | ✅ | Actively maintained; supported until June 2030 (Debian 13) |
| 3.14 | | Actively maintained |
| 3.6 | ❌ | Used in RHEL 8, but not supported by project dependencies |
| 3.7 | ❌ | End of Life (EOL) |
| 3.8 | ❌ | End of Life (EOL) |
| 3.9 | ✅ | Supported until August 2026 (Debian 11); May 2032 (RHEL 9) |
| 3.10 | ✅ | Actively maintained |
| 3.11 | ✅ | Actively maintained; supported until June 2028 (Debian 12) |
| 3.12 | ✅ | Actively maintained; supported until May 2035 (RHEL 10) |
| 3.13 | ✅ | Actively maintained; supported until June 2030 (Debian 13) |
| 3.14 | | Not currently supported due to [this imapclient bug](https://github.com/mjs/imapclient/issues/618)|
```{toctree}
:caption: 'Contents'

File diff suppressed because one or more lines are too long

View File

@@ -4,7 +4,7 @@
from __future__ import annotations
from typing import Dict, List, Any, Union, Optional, IO, Callable
from typing import Dict, List, Any, Union, Optional, IO, Callable, TypedDict
import binascii
import email
@@ -19,7 +19,6 @@ import xml.parsers.expat as expat
import zipfile
import zlib
from base64 import b64decode
from collections import OrderedDict
from csv import DictWriter
from datetime import datetime, timedelta, timezone, tzinfo
from io import BytesIO, StringIO
@@ -39,12 +38,177 @@ from parsedmarc.mail import (
)
from parsedmarc.constants import __version__
from parsedmarc.utils import get_base_domain, get_ip_address_info
from parsedmarc.utils import get_base_domain, get_ip_address_info, IPAddressInfo
from parsedmarc.utils import is_outlook_msg, convert_outlook_msg
from parsedmarc.utils import parse_email
from parsedmarc.utils import timestamp_to_human, human_timestamp_to_datetime
# TypedDict definitions for DMARC report structures
class PolicyOverrideReason(TypedDict, total=False):
"""Reason for DMARC policy override"""
type: str
comment: Optional[str]
class PolicyEvaluated(TypedDict):
"""DMARC policy evaluation result"""
disposition: str
dkim: str
spf: str
policy_override_reasons: list[PolicyOverrideReason]
class Alignment(TypedDict):
"""DMARC alignment information"""
spf: bool
dkim: bool
dmarc: bool
class DKIMResult(TypedDict, total=False):
"""DKIM authentication result"""
domain: str
selector: str
result: str
class SPFResult(TypedDict, total=False):
"""SPF authentication result"""
domain: str
scope: str
result: str
class AuthResults(TypedDict):
"""Authentication results"""
dkim: list[DKIMResult]
spf: list[SPFResult]
class DMARCIdentifiers(TypedDict):
"""DMARC identifiers"""
header_from: str
envelope_from: str
envelope_to: Optional[str]
class DMARCRecord(TypedDict):
"""Parsed DMARC aggregate record"""
source: IPAddressInfo
count: int
alignment: Alignment
policy_evaluated: PolicyEvaluated
identifiers: DMARCIdentifiers
auth_results: AuthResults
class PublishedPolicy(TypedDict):
"""Published DMARC policy"""
domain: str
adkim: str
aspf: str
p: str
sp: str
pct: str
fo: str
class ReportMetadata(TypedDict, total=False):
"""DMARC report metadata"""
org_name: str
org_email: str
org_extra_contact_info: Optional[str]
report_id: str
begin_date: str
end_date: str
errors: list[str]
timespan_requires_normalization: bool
original_timespan_seconds: int
class AggregateReport(TypedDict):
"""Parsed DMARC aggregate report"""
xml_schema: str
report_metadata: ReportMetadata
policy_published: PublishedPolicy
records: list[DMARCRecord]
class SMTPTLSFailureDetails(TypedDict, total=False):
"""SMTP TLS failure details"""
result_type: str
failed_session_count: int
sending_mta_ip: Optional[str]
receiving_ip: Optional[str]
receiving_mx_hostname: Optional[str]
receiving_mx_helo: Optional[str]
additional_info_uri: Optional[str]
failure_reason_code: Optional[str]
class SMTPTLSPolicy(TypedDict, total=False):
"""SMTP TLS policy"""
policy_domain: str
policy_type: str
policy_strings: Optional[list[str]]
mx_host_patterns: Optional[list[str]]
successful_session_count: int
total_successful_session_count: int
total_failure_session_count: int
failure_details: list[SMTPTLSFailureDetails]
class SMTPTLSReport(TypedDict):
"""Parsed SMTP TLS report"""
organization_name: str
begin_date: str
end_date: str
contact_info: str
report_id: str
policies: list[SMTPTLSPolicy]
class ForensicReport(TypedDict, total=False):
"""Parsed DMARC forensic report"""
feedback_type: str
user_agent: Optional[str]
version: Optional[str]
original_envelope_id: Optional[str]
original_mail_from: Optional[str]
original_rcpt_to: Optional[str]
arrival_date: str
arrival_date_utc: str
subject: Optional[str]
message_id: Optional[str]
authentication_results: Optional[str]
dkim_domain: Optional[str]
source_ip_address: Optional[str]
source_country: Optional[str]
source_reverse_dns: Optional[str]
source_base_domain: Optional[str]
delivery_result: Optional[str]
auth_failure: Optional[list[str]]
reported_domain: Optional[str]
arrival_date_utc: str
sample: Optional[str]
parsed_sample: Optional[dict]
sample_headers_only: bool
class ParsedReport(TypedDict):
"""Container for parsed report with type"""
report_type: str
report: Union[AggregateReport, ForensicReport, SMTPTLSReport]
class ParseResults(TypedDict):
"""Results from parsing multiple reports"""
aggregate_reports: list[AggregateReport]
forensic_reports: list[ForensicReport]
smtp_tls_reports: list[SMTPTLSReport]
logger.debug("parsedmarc v{0}".format(__version__))
feedback_report_regex = re.compile(r"^([\w\-]+): (.+)$", re.MULTILINE)
@@ -57,9 +221,20 @@ MAGIC_GZIP = b"\x1f\x8b"
MAGIC_XML = b"\x3c\x3f\x78\x6d\x6c\x20"
MAGIC_JSON = b"\7b"
EMAIL_SAMPLE_CONTENT_TYPES = (
"text/rfc822",
"text-rfc-822",
"text/rfc822-headers",
"text/rfc-822-headers",
"message/rfc822",
"message/rfc-822",
"message/rfc822-headers",
"message/rfc-822-headers",
)
IP_ADDRESS_CACHE = ExpiringDict(max_len=10000, max_age_seconds=14400)
SEEN_AGGREGATE_REPORT_IDS = ExpiringDict(max_len=100000000, max_age_seconds=3600)
REVERSE_DNS_MAP = dict()
REVERSE_DNS_MAP = {}
class ParserError(RuntimeError):
@@ -86,7 +261,7 @@ def _bucket_interval_by_day(
begin: datetime,
end: datetime,
total_count: int,
) -> List[Dict[Any]]:
) -> List[Dict[str, Any]]:
"""
Split the interval [begin, end) into daily buckets and distribute
`total_count` proportionally across those buckets.
@@ -220,8 +395,8 @@ def _bucket_interval_by_day(
def _append_parsed_record(
parsed_record: OrderedDict[str, Any],
records: OrderedDict[str, Any],
parsed_record: DMARCRecord,
records: list[DMARCRecord],
begin_dt: datetime,
end_dt: datetime,
normalize: bool,
@@ -264,22 +439,22 @@ def _append_parsed_record(
def _parse_report_record(
record: OrderedDict,
record: dict,
*,
ip_db_path: Optional[str] = None,
always_use_local_files: bool = False,
always_use_local_files: Optional[bool] = False,
reverse_dns_map_path: Optional[str] = None,
reverse_dns_map_url: Optional[str] = None,
offline: bool = False,
offline: Optional[bool] = False,
nameservers: Optional[list[str]] = None,
dns_timeout: Optional[float] = 2.0,
) -> OrderedDict[str, Any]:
) -> DMARCRecord:
"""
Converts a record from a DMARC aggregate report into a more consistent
format
Args:
record (OrderedDict): The record to convert
record (dict): The record to convert
always_use_local_files (bool): Do not download files
reverse_dns_map_path (str): Path to a reverse DNS map file
reverse_dns_map_url (str): URL to a reverse DNS map file
@@ -293,7 +468,7 @@ def _parse_report_record(
OrderedDict: The converted record
"""
record = record.copy()
new_record = OrderedDict()
new_record = {}
if record["row"]["source_ip"] is None:
raise ValueError("Source IP address is empty")
new_record_source = get_ip_address_info(
@@ -311,14 +486,12 @@ def _parse_report_record(
new_record["source"] = new_record_source
new_record["count"] = int(record["row"]["count"])
policy_evaluated = record["row"]["policy_evaluated"].copy()
new_policy_evaluated = OrderedDict(
[
("disposition", "none"),
("dkim", "fail"),
("spf", "fail"),
("policy_override_reasons", []),
]
)
new_policy_evaluated = {
"disposition": "none",
"dkim": "fail",
"spf": "fail",
"policy_override_reasons": [],
}
if "disposition" in policy_evaluated:
new_policy_evaluated["disposition"] = policy_evaluated["disposition"]
if new_policy_evaluated["disposition"].strip().lower() == "pass":
@@ -337,10 +510,11 @@ def _parse_report_record(
and policy_evaluated["dkim"].lower() == "pass"
)
dmarc_aligned = spf_aligned or dkim_aligned
new_record["alignment"] = dict()
new_record["alignment"]["spf"] = spf_aligned
new_record["alignment"]["dkim"] = dkim_aligned
new_record["alignment"]["dmarc"] = dmarc_aligned
new_record["alignment"] = {
"spf": spf_aligned,
"dkim": dkim_aligned,
"dmarc": dmarc_aligned,
}
if "reason" in policy_evaluated:
if type(policy_evaluated["reason"]) is list:
reasons = policy_evaluated["reason"]
@@ -355,7 +529,7 @@ def _parse_report_record(
new_record["identifiers"] = record["identities"].copy()
else:
new_record["identifiers"] = record["identifiers"].copy()
new_record["auth_results"] = OrderedDict([("dkim", []), ("spf", [])])
new_record["auth_results"] = {"dkim": [], "spf": []}
if type(new_record["identifiers"]["header_from"]) is str:
lowered_from = new_record["identifiers"]["header_from"].lower()
else:
@@ -374,7 +548,7 @@ def _parse_report_record(
auth_results["dkim"] = [auth_results["dkim"]]
for result in auth_results["dkim"]:
if "domain" in result and result["domain"] is not None:
new_result = OrderedDict([("domain", result["domain"])])
new_result = {"domain": result["domain"]}
if "selector" in result and result["selector"] is not None:
new_result["selector"] = result["selector"]
else:
@@ -389,7 +563,7 @@ def _parse_report_record(
auth_results["spf"] = [auth_results["spf"]]
for result in auth_results["spf"]:
if "domain" in result and result["domain"] is not None:
new_result = OrderedDict([("domain", result["domain"])])
new_result = {"domain": result["domain"]}
if "scope" in result and result["scope"] is not None:
new_result["scope"] = result["scope"]
else:
@@ -429,10 +603,10 @@ def _parse_report_record(
def _parse_smtp_tls_failure_details(failure_details: dict[str, Any]):
try:
new_failure_details = OrderedDict(
result_type=failure_details["result-type"],
failed_session_count=failure_details["failed-session-count"],
)
new_failure_details = {
"result_type": failure_details["result-type"],
"failed_session_count": failure_details["failed-session-count"],
}
if "sending-mta-ip" in failure_details:
new_failure_details["sending_mta_ip"] = failure_details["sending-mta-ip"]
@@ -471,7 +645,10 @@ def _parse_smtp_tls_report_policy(policy: dict[str, Any]):
failure_details = []
if policy_type not in policy_types:
raise InvalidSMTPTLSReport(f"Invalid policy type {policy_type}")
new_policy = OrderedDict(policy_domain=policy_domain, policy_type=policy_type)
new_policy = {
"policy_domain": policy_domain,
"policy_type": policy_type,
}
if "policy-string" in policy["policy"]:
if isinstance(policy["policy"]["policy-string"], list):
if len(policy["policy"]["policy-string"]) > 0:
@@ -500,7 +677,7 @@ def _parse_smtp_tls_report_policy(policy: dict[str, Any]):
raise InvalidSMTPTLSReport(str(e))
def parse_smtp_tls_report_json(report: dict[str, Any]):
def parse_smtp_tls_report_json(report: str) -> SMTPTLSReport:
"""Parses and validates an SMTP TLS report"""
required_fields = [
"organization-name",
@@ -512,24 +689,24 @@ def parse_smtp_tls_report_json(report: dict[str, Any]):
try:
policies = []
report = json.loads(report)
report_dict = json.loads(report)
for required_field in required_fields:
if required_field not in report:
if required_field not in report_dict:
raise Exception(f"Missing required field: {required_field}]")
if not isinstance(report["policies"], list):
policies_type = type(report["policies"])
if not isinstance(report_dict["policies"], list):
policies_type = type(report_dict["policies"])
raise InvalidSMTPTLSReport(f"policies must be a list, not {policies_type}")
for policy in report["policies"]:
for policy in report_dict["policies"]:
policies.append(_parse_smtp_tls_report_policy(policy))
new_report = OrderedDict(
organization_name=report["organization-name"],
begin_date=report["date-range"]["start-datetime"],
end_date=report["date-range"]["end-datetime"],
contact_info=report["contact-info"],
report_id=report["report-id"],
policies=policies,
)
new_report = {
"organization_name": report_dict["organization-name"],
"begin_date": report_dict["date-range"]["start-datetime"],
"end_date": report_dict["date-range"]["end-datetime"],
"contact_info": report_dict["contact-info"],
"report_id": report_dict["report-id"],
"policies": policies,
}
return new_report
@@ -539,20 +716,22 @@ def parse_smtp_tls_report_json(report: dict[str, Any]):
raise InvalidSMTPTLSReport(str(e))
def parsed_smtp_tls_reports_to_csv_rows(reports: OrderedDict[str, Any]):
def parsed_smtp_tls_reports_to_csv_rows(
reports: Union[SMTPTLSReport, list[SMTPTLSReport]],
):
"""Converts one oor more parsed SMTP TLS reports into a list of single
layer OrderedDict objects suitable for use in a CSV"""
if type(reports) is OrderedDict:
if type(reports) is dict:
reports = [reports]
rows = []
for report in reports:
common_fields = OrderedDict(
organization_name=report["organization_name"],
begin_date=report["begin_date"],
end_date=report["end_date"],
report_id=report["report_id"],
)
common_fields = {
"organization_name": report["organization_name"],
"begin_date": report["begin_date"],
"end_date": report["end_date"],
"report_id": report["report_id"],
}
record = common_fields.copy()
for policy in report["policies"]:
if "policy_strings" in policy:
@@ -574,7 +753,7 @@ def parsed_smtp_tls_reports_to_csv_rows(reports: OrderedDict[str, Any]):
return rows
def parsed_smtp_tls_reports_to_csv(reports: OrderedDict[str, Any]) -> str:
def parsed_smtp_tls_reports_to_csv(reports: SMTPTLSReport) -> str:
"""
Converts one or more parsed SMTP TLS reports to flat CSV format, including
headers
@@ -622,16 +801,16 @@ def parsed_smtp_tls_reports_to_csv(reports: OrderedDict[str, Any]) -> str:
def parse_aggregate_report_xml(
xml: str,
*,
ip_db_path: Optional[bool] = None,
ip_db_path: Optional[str] = None,
always_use_local_files: Optional[bool] = False,
reverse_dns_map_path: Optional[bool] = None,
reverse_dns_map_url: Optional[bool] = None,
reverse_dns_map_path: Optional[str] = None,
reverse_dns_map_url: Optional[str] = None,
offline: Optional[bool] = False,
nameservers: Optional[list[str]] = None,
timeout: Optional[float] = 2.0,
keep_alive: Optional[callable] = None,
normalize_timespan_threshold_hours: Optional[float] = 24.0,
) -> OrderedDict[str, Any]:
keep_alive: Optional[Callable] = None,
normalize_timespan_threshold_hours: float = 24.0,
) -> AggregateReport:
"""Parses a DMARC XML report string and returns a consistent OrderedDict
Args:
@@ -680,8 +859,8 @@ def parse_aggregate_report_xml(
schema = "draft"
if "version" in report:
schema = report["version"]
new_report = OrderedDict([("xml_schema", schema)])
new_report_metadata = OrderedDict()
new_report = {"xml_schema": schema}
new_report_metadata = {}
if report_metadata["org_name"] is None:
if report_metadata["email"] is not None:
report_metadata["org_name"] = report_metadata["email"].split("@")[-1]
@@ -742,7 +921,7 @@ def parse_aggregate_report_xml(
policy_published = report["policy_published"]
if type(policy_published) is list:
policy_published = policy_published[0]
new_policy_published = OrderedDict()
new_policy_published = {}
new_policy_published["domain"] = policy_published["domain"]
adkim = "r"
if "adkim" in policy_published:
@@ -909,7 +1088,7 @@ def parse_aggregate_report_file(
dns_timeout: Optional[float] = 2.0,
keep_alive: Optional[Callable] = None,
normalize_timespan_threshold_hours: Optional[float] = 24.0,
) -> OrderedDict[str, any]:
) -> AggregateReport:
"""Parses a file at the given path, a file-like object. or bytes as an
aggregate DMARC report
@@ -950,7 +1129,7 @@ def parse_aggregate_report_file(
def parsed_aggregate_reports_to_csv_rows(
reports: list[OrderedDict[str, Any]],
reports: list[AggregateReport],
) -> list[dict[str, Any]]:
"""
Converts one or more parsed aggregate reports to list of dicts in flat CSV
@@ -967,7 +1146,7 @@ def parsed_aggregate_reports_to_csv_rows(
def to_str(obj):
return str(obj).lower()
if type(reports) is OrderedDict:
if type(reports) is dict:
reports = [reports]
rows = []
@@ -1075,7 +1254,7 @@ def parsed_aggregate_reports_to_csv_rows(
return rows
def parsed_aggregate_reports_to_csv(reports: list[OrderedDict[str, Any]]) -> str:
def parsed_aggregate_reports_to_csv(reports: list[AggregateReport]) -> str:
"""
Converts one or more parsed aggregate reports to flat CSV format, including
headers
@@ -1148,13 +1327,13 @@ def parse_forensic_report(
*,
always_use_local_files: Optional[bool] = False,
reverse_dns_map_path: Optional[str] = None,
reverse_dns_map_url: str = None,
reverse_dns_map_url: Optional[str] = None,
offline: Optional[bool] = False,
ip_db_path: Optional[str] = None,
nameservers: Optional[list[str]] = None,
dns_timeout: Optional[float] = 2.0,
strip_attachment_payloads: Optional[bool] = False,
) -> OrderedDict[str, Any]:
) -> ForensicReport:
"""
Converts a DMARC forensic report and sample to an ``OrderedDict``
@@ -1179,7 +1358,7 @@ def parse_forensic_report(
delivery_results = ["delivered", "spam", "policy", "reject", "other"]
try:
parsed_report = OrderedDict()
parsed_report = {}
report_values = feedback_report_regex.findall(feedback_report)
for report_value in report_values:
key = report_value[0].lower().replace("-", "_")
@@ -1282,7 +1461,7 @@ def parse_forensic_report(
raise InvalidForensicReport("Unexpected error: {0}".format(error.__str__()))
def parsed_forensic_reports_to_csv_rows(reports: list[OrderedDict[str, Any]]):
def parsed_forensic_reports_to_csv_rows(reports: list[AggregateReport]):
"""
Converts one or more parsed forensic reports to a list of dicts in flat CSV
format
@@ -1293,7 +1472,7 @@ def parsed_forensic_reports_to_csv_rows(reports: list[OrderedDict[str, Any]]):
Returns:
list: Parsed forensic report data as a list of dicts in flat CSV format
"""
if type(reports) is OrderedDict:
if type(reports) is dict:
reports = [reports]
rows = []
@@ -1318,7 +1497,7 @@ def parsed_forensic_reports_to_csv_rows(reports: list[OrderedDict[str, Any]]):
return rows
def parsed_forensic_reports_to_csv(reports: list[dict[str, Any]]) -> str:
def parsed_forensic_reports_to_csv(reports: list[AggregateReport]) -> str:
"""
Converts one or more parsed forensic reports to flat CSV format, including
headers
@@ -1383,7 +1562,7 @@ def parse_report_email(
strip_attachment_payloads: Optional[bool] = False,
keep_alive: Optional[callable] = None,
normalize_timespan_threshold_hours: Optional[float] = 24.0,
) -> OrderedDict[str, Any]:
) -> ParsedReport:
"""
Parses a DMARC report from an email
@@ -1426,6 +1605,7 @@ def parse_report_email(
feedback_report = None
smtp_tls_report = None
sample = None
is_feedback_report: bool = False
if "From" in msg_headers:
logger.info("Parsing mail from {0} on {1}".format(msg_headers["From"], date))
if "Subject" in msg_headers:
@@ -1441,6 +1621,7 @@ def parse_report_email(
if content_type == "text/html":
continue
elif content_type == "message/feedback-report":
is_feedback_report = True
try:
if "Feedback-Type" in payload:
feedback_report = payload
@@ -1451,29 +1632,29 @@ def parse_report_email(
feedback_report = feedback_report.replace("\\n", "\n")
except (ValueError, TypeError, binascii.Error):
feedback_report = payload
elif content_type == "text/rfc822-headers":
sample = payload
elif content_type == "message/rfc822":
elif is_feedback_report and content_type in EMAIL_SAMPLE_CONTENT_TYPES:
sample = payload
elif content_type == "application/tlsrpt+json":
if not payload.strip().startswith("{"):
payload = str(b64decode(payload))
smtp_tls_report = parse_smtp_tls_report_json(payload)
return OrderedDict(
[("report_type", "smtp_tls"), ("report", smtp_tls_report)]
)
return {
"report_type": "smtp_tls",
"report": smtp_tls_report,
}
elif content_type == "application/tlsrpt+gzip":
payload = extract_report(payload)
smtp_tls_report = parse_smtp_tls_report_json(payload)
return OrderedDict(
[("report_type", "smtp_tls"), ("report", smtp_tls_report)]
)
return {
"report_type": "smtp_tls",
"report": smtp_tls_report,
}
elif content_type == "text/plain":
if "A message claiming to be from you has failed" in payload:
try:
parts = payload.split("detected.", 1)
field_matches = text_report_regex.findall(parts[0])
fields = dict()
fields = {}
for match in field_matches:
field_name = match[0].lower().replace(" ", "-")
fields[field_name] = match[1].strip()
@@ -1497,10 +1678,11 @@ def parse_report_email(
if isinstance(payload, bytes):
payload = payload.decode("utf-8", errors="replace")
if payload.strip().startswith("{"):
smtp_tls_report = parse_smtp_tls_report_json(payload)
result = OrderedDict(
[("report_type", "smtp_tls"), ("report", smtp_tls_report)]
)
result = parse_smtp_tls_report_json(payload)
result = {
"report_type": "smtp_tls",
"report": smtp_tls_report,
}
elif payload.strip().startswith("<"):
aggregate_report = parse_aggregate_report_xml(
payload,
@@ -1514,9 +1696,10 @@ def parse_report_email(
keep_alive=keep_alive,
normalize_timespan_threshold_hours=normalize_timespan_threshold_hours,
)
result = OrderedDict(
[("report_type", "aggregate"), ("report", aggregate_report)]
)
result = {
"report_type": "aggregate",
"report": aggregate_report,
}
return result
@@ -1560,7 +1743,10 @@ def parse_report_email(
except Exception as e:
raise InvalidForensicReport(e.__str__())
result = OrderedDict([("report_type", "forensic"), ("report", forensic_report)])
result = {
"report_type": "forensic",
"report": forensic_report,
}
return result
if result is None:
@@ -1581,7 +1767,7 @@ def parse_report_file(
offline: Optional[bool] = False,
keep_alive: Optional[Callable] = None,
normalize_timespan_threshold_hours: Optional[float] = 24,
) -> OrderedDict[str, Any]:
) -> ParsedReport:
"""Parses a DMARC aggregate or forensic file at the given path, a
file-like object, or bytes
@@ -1627,11 +1813,17 @@ def parse_report_file(
keep_alive=keep_alive,
normalize_timespan_threshold_hours=normalize_timespan_threshold_hours,
)
results = OrderedDict([("report_type", "aggregate"), ("report", report)])
results = {
"report_type": "aggregate",
"report": report,
}
except InvalidAggregateReport:
try:
report = parse_smtp_tls_report_json(content)
results = OrderedDict([("report_type", "smtp_tls"), ("report", report)])
results = {
"report_type": "smtp_tls",
"report": report,
}
except InvalidSMTPTLSReport:
try:
sa = strip_attachment_payloads
@@ -1665,7 +1857,7 @@ def get_dmarc_reports_from_mbox(
reverse_dns_map_url: Optional[str] = None,
offline: Optional[bool] = False,
normalize_timespan_threshold_hours: Optional[float] = 24.0,
) -> OrderedDict[str, OrderedDict[str, Any]]:
) -> ParseResults:
"""Parses a mailbox in mbox format containing e-mails with attached
DMARC reports
@@ -1733,13 +1925,11 @@ def get_dmarc_reports_from_mbox(
logger.warning(error.__str__())
except mailbox.NoSuchMailboxError:
raise InvalidDMARCReport("Mailbox {0} does not exist".format(input_))
return OrderedDict(
[
("aggregate_reports", aggregate_reports),
("forensic_reports", forensic_reports),
("smtp_tls_reports", smtp_tls_reports),
]
)
return {
"aggregate_reports": aggregate_reports,
"forensic_reports": forensic_reports,
"smtp_tls_reports": smtp_tls_reports,
}
def get_dmarc_reports_from_mailbox(
@@ -1750,19 +1940,19 @@ def get_dmarc_reports_from_mailbox(
delete: Optional[bool] = False,
test: Optional[bool] = False,
ip_db_path: Optional[str] = None,
always_use_local_files: Optional[str] = False,
always_use_local_files: Optional[bool] = False,
reverse_dns_map_path: Optional[str] = None,
reverse_dns_map_url: Optional[str] = None,
offline: Optional[bool] = False,
nameservers: Optional[list[str]] = None,
dns_timeout: Optional[float] = 6.0,
strip_attachment_payloads: Optional[bool] = False,
results: Optional[OrderedDict[str, any]] = None,
results: Optional[dict[str, Any]] = None,
batch_size: Optional[int] = 10,
since: Optional[datetime] = None,
create_folders: Optional[bool] = True,
normalize_timespan_threshold_hours: Optional[float] = 24,
) -> OrderedDict[str, OrderedDict[str, Any]]:
) -> ParseResults:
"""
Fetches and parses DMARC reports from a mailbox
@@ -2025,13 +2215,11 @@ def get_dmarc_reports_from_mailbox(
except Exception as e:
e = "Error moving message UID {0}: {1}".format(msg_uid, e)
logger.error("Mailbox error: {0}".format(e))
results = OrderedDict(
[
("aggregate_reports", aggregate_reports),
("forensic_reports", forensic_reports),
("smtp_tls_reports", smtp_tls_reports),
]
)
results = {
"aggregate_reports": aggregate_reports,
"forensic_reports": forensic_reports,
"smtp_tls_reports": smtp_tls_reports,
}
if current_time:
total_messages = len(
@@ -2172,7 +2360,7 @@ def append_csv(filename, csv):
def save_output(
results: OrderedDict[str, Any],
results: dict[str, Any],
*,
output_directory: Optional[str] = "output",
aggregate_json_filename: Optional[str] = "aggregate.json",
@@ -2258,7 +2446,7 @@ def save_output(
sample_file.write(sample)
def get_report_zip(results: OrderedDict[str, Any]) -> bytes:
def get_report_zip(results: ParseResults) -> bytes:
"""
Creates a zip file of parsed report output
@@ -2304,7 +2492,7 @@ def get_report_zip(results: OrderedDict[str, Any]) -> bytes:
def email_results(
results: OrderedDict,
results: dict,
*,
host: str,
mail_from: str,

View File

@@ -10,7 +10,6 @@ from glob import glob
import logging
import math
import yaml
from collections import OrderedDict
import json
from ssl import CERT_NONE, create_default_context
from multiprocessing import Pipe, Process
@@ -1481,7 +1480,7 @@ def _main():
pbar.update(counter - pbar.n)
for result in results:
if type(result[0]) is ParserError:
if isinstance(result[0], ParserError) or result[0] is None:
logger.error("Failed to parse {0} - {1}".format(result[1], result[0]))
else:
if result[0]["report_type"] == "aggregate":
@@ -1634,13 +1633,11 @@ def _main():
logger.exception("Mailbox Error")
exit(1)
results = OrderedDict(
[
("aggregate_reports", aggregate_reports),
("forensic_reports", forensic_reports),
("smtp_tls_reports", smtp_tls_reports),
]
)
results = {
"aggregate_reports": aggregate_reports,
"forensic_reports": forensic_reports,
"smtp_tls_reports": smtp_tls_reports,
}
process_reports(results)

View File

@@ -1,2 +1,3 @@
__version__ = "9.0.2"
__version__ = "9.0.5"
USER_AGENT = f"parsedmarc/{__version__}"

View File

@@ -4,7 +4,6 @@ from __future__ import annotations
from typing import Optional, Union, Any
from collections import OrderedDict
from elasticsearch_dsl.search import Q
from elasticsearch_dsl import (
@@ -25,7 +24,7 @@ from elasticsearch.helpers import reindex
from parsedmarc.log import logger
from parsedmarc.utils import human_timestamp_to_datetime
from parsedmarc import InvalidForensicReport
from parsedmarc import InvalidForensicReport, AggregateReport, ForensicReport, SMTPTLSReport
class ElasticsearchError(Exception):
@@ -203,11 +202,11 @@ class _SMTPTLSPolicyDoc(InnerDoc):
def add_failure_details(
self,
result_type: str,
ip_address: str,
receiving_ip: str,
receiving_mx_helo: str,
failed_session_count: int,
result_type: Optional[str] = None,
ip_address: Optional[str] = None,
receiving_ip: Optional[str] = None,
receiving_mx_helo: Optional[str] = None,
failed_session_count: Optional[int] = None,
sending_mta_ip: Optional[str] = None,
receiving_mx_hostname: Optional[str] = None,
additional_information_uri: Optional[str] = None,
@@ -297,7 +296,7 @@ def set_hosts(
conn_params["ca_certs"] = ssl_cert_path
else:
conn_params["verify_certs"] = False
if username:
if username and password:
conn_params["http_auth"] = username + ":" + password
if api_key:
conn_params["api_key"] = api_key
@@ -377,7 +376,7 @@ def migrate_indexes(
def save_aggregate_report_to_elasticsearch(
aggregate_report: OrderedDict[str, Any],
aggregate_report: SMTPTLSReport,
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
@@ -539,8 +538,8 @@ def save_aggregate_report_to_elasticsearch(
def save_forensic_report_to_elasticsearch(
forensic_report: OrderedDict[str, Any],
index_suffix: Optional[any] = None,
forensic_report: SMTPTLSReport,
index_suffix: Optional[Any] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: int = 1,
@@ -570,7 +569,7 @@ def save_forensic_report_to_elasticsearch(
sample_date = forensic_report["parsed_sample"]["date"]
sample_date = human_timestamp_to_datetime(sample_date)
original_headers = forensic_report["parsed_sample"]["headers"]
headers = OrderedDict()
headers = {}
for original_header in original_headers:
headers[original_header.lower()] = original_headers[original_header]
@@ -597,7 +596,7 @@ def save_forensic_report_to_elasticsearch(
else:
headers["from"] = " <".join(headers["from"]) + ">"
from_ = dict()
from_ = {}
from_["sample.headers.from"] = headers["from"]
from_query = Q(dict(match_phrase=from_))
q = q & from_query
@@ -609,7 +608,7 @@ def save_forensic_report_to_elasticsearch(
else:
headers["to"] = " <".join(headers["to"]) + ">"
to_ = dict()
to_ = {}
to_["sample.headers.to"] = headers["to"]
to_query = Q(dict(match_phrase=to_))
q = q & to_query
@@ -706,9 +705,9 @@ def save_forensic_report_to_elasticsearch(
def save_smtp_tls_report_to_elasticsearch(
report: OrderedDict[str, Any],
index_suffix: str = None,
index_prefix: str = None,
report: SMTPTLSReport,
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: Optional[int] = 1,
number_of_replicas: Optional[int] = 0,

View File

@@ -3,12 +3,12 @@
from __future__ import annotations
from typing import Any
from parsedmarc import AggregateReport, ForensicReport, SMTPTLSReport, ParseResults
import logging
import logging.handlers
import json
import threading
from collections import OrderedDict
from parsedmarc import (
parsed_aggregate_reports_to_csv_rows,
@@ -53,7 +53,9 @@ class GelfClient(object):
)
self.logger.addHandler(self.handler)
def save_aggregate_report_to_gelf(self, aggregate_reports: OrderedDict[str, Any]):
def save_aggregate_report_to_gelf(
self, aggregate_reports: list[AggregateReport]
):
rows = parsed_aggregate_reports_to_csv_rows(aggregate_reports)
for row in rows:
log_context_data.parsedmarc = row
@@ -61,12 +63,14 @@ class GelfClient(object):
log_context_data.parsedmarc = None
def save_forensic_report_to_gelf(self, forensic_reports: OrderedDict[str, Any]):
def save_forensic_report_to_gelf(
self, forensic_reports: list[ForensicReport]
):
rows = parsed_forensic_reports_to_csv_rows(forensic_reports)
for row in rows:
self.logger.info(json.dumps(row))
def save_smtp_tls_report_to_gelf(self, smtp_tls_reports: OrderedDict[str, Any]):
def save_smtp_tls_report_to_gelf(self, smtp_tls_reports: SMTPTLSReport):
rows = parsed_smtp_tls_reports_to_csv_rows(smtp_tls_reports)
for row in rows:
self.logger.info(json.dumps(row))

View File

@@ -2,7 +2,8 @@
from __future__ import annotations
from typing import Any, Optional
from typing import Any, Optional, Union
from parsedmarc import AggregateReport, ForensicReport, SMTPTLSReport, ParseResults
from ssl import SSLContext
import json
@@ -10,7 +11,6 @@ from ssl import create_default_context
from kafka import KafkaProducer
from kafka.errors import NoBrokersAvailable, UnknownTopicOrPartitionError
from collections import OrderedDict
from parsedmarc.utils import human_timestamp_to_datetime
from parsedmarc import __version__
@@ -66,7 +66,7 @@ class KafkaClient(object):
raise KafkaError("No Kafka brokers available")
@staticmethod
def strip_metadata(report: OrderedDict[str, Any]):
def strip_metadata(report: dict[str, Any]):
"""
Duplicates org_name, org_email and report_id into JSON root
and removes report_metadata key to bring it more inline
@@ -80,7 +80,7 @@ class KafkaClient(object):
return report
@staticmethod
def generate_date_range(report: OrderedDict[str, Any]):
def generate_date_range(report: dict[str, Any]):
"""
Creates a date_range timestamp with format YYYY-MM-DD-T-HH:MM:SS
based on begin and end dates for easier parsing in Kibana.
@@ -98,7 +98,9 @@ class KafkaClient(object):
return date_range
def save_aggregate_reports_to_kafka(
self, aggregate_reports: list[OrderedDict][str, Any], aggregate_topic: str
self,
aggregate_reports: Union[AggregateReport, list[AggregateReport]],
aggregate_topic: str,
):
"""
Saves aggregate DMARC reports to Kafka
@@ -143,7 +145,9 @@ class KafkaClient(object):
raise KafkaError("Kafka error: {0}".format(e.__str__()))
def save_forensic_reports_to_kafka(
self, forensic_reports: OrderedDict[str, Any], forensic_topic: str
self,
forensic_reports: Union[AggregateReport, list[AggregateReport]],
forensic_topic: str,
):
"""
Saves forensic DMARC reports to Kafka, sends individual
@@ -175,7 +179,9 @@ class KafkaClient(object):
raise KafkaError("Kafka error: {0}".format(e.__str__()))
def save_smtp_tls_reports_to_kafka(
self, smtp_tls_reports: list[OrderedDict[str, Any]], smtp_tls_topic: str
self,
smtp_tls_reports: Union[list[SMTPTLSReport], SMTPTLSReport],
smtp_tls_topic: str,
):
"""
Saves SMTP TLS reports to Kafka, sends individual

View File

@@ -3,7 +3,7 @@
from __future__ import annotations
from typing import Any
from collections import OrderedDict
from parsedmarc import AggregateReport, ForensicReport, SMTPTLSReport, ParseResults
from parsedmarc.log import logger
from azure.core.exceptions import HttpResponseError
@@ -110,7 +110,7 @@ class LogAnalyticsClient(object):
def publish_json(
self,
results: OrderedDict[str, OrderedDict[str, Any]],
results,
logs_client: LogsIngestionClient,
dcr_stream: str,
):
@@ -133,7 +133,7 @@ class LogAnalyticsClient(object):
def publish_results(
self,
results: OrderedDict[str, OrderedDict[str, Any]],
results: ParseResults,
save_aggregate: bool,
save_forensic: bool,
save_smtp_tls: bool,

View File

@@ -4,7 +4,6 @@ from __future__ import annotations
from typing import Optional, Union, Any
from collections import OrderedDict
from opensearchpy import (
Q,
@@ -25,7 +24,7 @@ from opensearchpy.helpers import reindex
from parsedmarc.log import logger
from parsedmarc.utils import human_timestamp_to_datetime
from parsedmarc import InvalidForensicReport
from parsedmarc import InvalidForensicReport, AggregateReport, ForensicReport, SMTPTLSReport
class OpenSearchError(Exception):
@@ -203,11 +202,11 @@ class _SMTPTLSPolicyDoc(InnerDoc):
def add_failure_details(
self,
result_type: str,
ip_address: str,
receiving_ip: str,
receiving_mx_helo: str,
failed_session_count: int,
result_type: Optional[str] = None,
ip_address: Optional[str] = None,
receiving_ip: Optional[str] = None,
receiving_mx_helo: Optional[str] = None,
failed_session_count: Optional[int] = None,
sending_mta_ip: Optional[str] = None,
receiving_mx_hostname: Optional[str] = None,
additional_information_uri: Optional[str] = None,
@@ -297,7 +296,7 @@ def set_hosts(
conn_params["ca_certs"] = ssl_cert_path
else:
conn_params["verify_certs"] = False
if username:
if username and password:
conn_params["http_auth"] = username + ":" + password
if api_key:
conn_params["api_key"] = api_key
@@ -376,8 +375,8 @@ def migrate_indexes(
pass
def save_aggregate_report_to_elasticsearch(
aggregate_report: OrderedDict[str, Any],
def save_aggregate_report_to_opensearch(
aggregate_report: AggregateReport,
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
@@ -538,9 +537,9 @@ def save_aggregate_report_to_elasticsearch(
raise OpenSearchError("OpenSearch error: {0}".format(e.__str__()))
def save_forensic_report_to_elasticsearch(
forensic_report: OrderedDict[str, Any],
index_suffix: Optional[any] = None,
def save_forensic_report_to_opensearch(
forensic_report: ForensicReport,
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: int = 1,
@@ -570,7 +569,7 @@ def save_forensic_report_to_elasticsearch(
sample_date = forensic_report["parsed_sample"]["date"]
sample_date = human_timestamp_to_datetime(sample_date)
original_headers = forensic_report["parsed_sample"]["headers"]
headers = OrderedDict()
headers = {}
for original_header in original_headers:
headers[original_header.lower()] = original_headers[original_header]
@@ -597,7 +596,7 @@ def save_forensic_report_to_elasticsearch(
else:
headers["from"] = " <".join(headers["from"]) + ">"
from_ = dict()
from_ = {}
from_["sample.headers.from"] = headers["from"]
from_query = Q(dict(match_phrase=from_))
q = q & from_query
@@ -609,7 +608,7 @@ def save_forensic_report_to_elasticsearch(
else:
headers["to"] = " <".join(headers["to"]) + ">"
to_ = dict()
to_ = {}
to_["sample.headers.to"] = headers["to"]
to_query = Q(dict(match_phrase=to_))
q = q & to_query
@@ -705,10 +704,10 @@ def save_forensic_report_to_elasticsearch(
)
def save_smtp_tls_report_to_elasticsearch(
report: OrderedDict[str, Any],
index_suffix: str = None,
index_prefix: str = None,
def save_smtp_tls_report_to_opensearch(
report: dict[str, Any],
index_suffix: Optional[str] = None,
index_prefix: Optional[str] = None,
monthly_indexes: Optional[bool] = False,
number_of_shards: Optional[int] = 1,
number_of_replicas: Optional[int] = 0,

View File

@@ -132,6 +132,7 @@ asu-vei.ru,ASU-VEI,Industrial
atextelecom.com.br,ATEX Telecom,ISP
atmailcloud.com,atmail,Email Provider
ats.ca,ATS Healthcare,Healthcare
att.net,AT&T,ISP
atw.ne.jp,ATW,Web Host
au-net.ne.jp,KDDI,ISP
au.com,au,ISP
@@ -242,6 +243,7 @@ carandainet.com.br,CN Internet,ISP
cardhealth.com,Cardinal Health,Healthcare
cardinal.com,Cardinal Health,Healthcare
cardinalhealth.com,Cardinal Health,Healthcare
cardinalscriptnet.com,Cardinal Health,Healthcare
carecentrix.com,CareCentrix,Healthcare
carleton.edu,Carlton College,Education
carrierzone.com,carrierzone,Email Security
@@ -697,6 +699,7 @@ hdsupply-email.com,HD Supply,Retail
healthall.com,UC Health,Healthcare
healthcaresupplypros.com,Healthcare Supply Pros,Healthcare
healthproductsforyou.com,Health Products For You,Healthcare
healthtouch.com,Cardinal Health,Healthcare
helloserver6.com,1st Source Web,Marketing
helpforcb.com,InterServer,Web Host
helpscout.net,Help Scout,SaaS
@@ -753,6 +756,8 @@ hostwindsdns.com,Hostwinds,Web Host
hotnet.net.il,Hot Net Internet Services,ISP
hp.com,HP,Technology
hringdu.is,Hringdu,ISP
hslda.net,Home School Legal Defense Association (HSLDA),Education
hslda.org,Home School Legal Defense Association (HSLDA),Education
hspherefilter.com,"DynamicNet, Inc. (DNI)",Web Host
htc.net,HTC,ISP
htmlservices.it,HTMLServices.it,MSP
@@ -763,6 +768,7 @@ hughston.com,Hughston Clinic,Healthcare
hvvc.us,Hivelocity,Web Host
i2ts.ne.jp,i2ts,Web Host
i4i.com,i4i,Technology
ibindley.com,Cardinal Health,Healthcare
ice.co.cr,Grupo ICE,Industrial
icehosting.nl,IceHosting,Web Host
icewarpcloud.in,IceWrap,Email Provider
@@ -832,6 +838,7 @@ ip-5-196-151.eu,OVH,Web Host
ip-51-161-36.net,OVH,Web Host
ip-51-195-53.eu,OVH,Web Host
ip-51-254-53.eu,OVH,Web Host
ip-51-38-67.eu,OVH,Web Host
ip-51-77-42.eu,OVH,Web Host
ip-51-83-140.eu,OVH,Web Host
ip-51-89-240.eu,OVH,Web Host
@@ -1217,6 +1224,7 @@ nettoday.co.th,Net Today,Web Host
netventure.pl,Netventure,MSP
netvigator.com,HKT,ISP
netvision.net.il,013 Netvision,ISP
network-tech.com,Network Technologies International (NTI),SaaS
network.kz,network.kz,ISP
network80.com,Network80,Web Host
neubox.net,Neubox,Web Host

View File

@@ -3,11 +3,11 @@
from __future__ import annotations
from typing import Any
from parsedmarc import AggregateReport, ForensicReport, SMTPTLSReport, ParseResults
import json
import boto3
from collections import OrderedDict
from parsedmarc.log import logger
from parsedmarc.utils import human_timestamp_to_datetime
@@ -53,18 +53,18 @@ class S3Client(object):
aws_access_key_id=access_key_id,
aws_secret_access_key=secret_access_key,
)
self.bucket = self.s3.Bucket(self.bucket_name)
self.bucket = self.s3.Bucket(self.bucket_name) # type: ignore
def save_aggregate_report_to_s3(self, report: OrderedDict[str, Any]):
def save_aggregate_report_to_s3(self, report: Union[AggregateReport, ForensicReport, SMTPTLSReport]):
self.save_report_to_s3(report, "aggregate")
def save_forensic_report_to_s3(self, report: OrderedDict[str, Any]):
def save_forensic_report_to_s3(self, report: Union[AggregateReport, ForensicReport, SMTPTLSReport]):
self.save_report_to_s3(report, "forensic")
def save_smtp_tls_report_to_s3(self, report: OrderedDict[str, Any]):
def save_smtp_tls_report_to_s3(self, report: Union[AggregateReport, ForensicReport, SMTPTLSReport]):
self.save_report_to_s3(report, "smtp_tls")
def save_report_to_s3(self, report: OrderedDict[str, Any], report_type: str):
def save_report_to_s3(self, report: Union[AggregateReport, ForensicReport, SMTPTLSReport], report_type: str):
if report_type == "smtp_tls":
report_date = report["begin_date"]
report_id = report["report_id"]

View File

@@ -2,9 +2,9 @@
from __future__ import annotations
from typing import Any
from typing import Any, Union
from parsedmarc import AggregateReport, ForensicReport, SMTPTLSReport, ParseResults
from collections import OrderedDict
from urllib.parse import urlparse
import socket
@@ -35,7 +35,7 @@ class HECClient(object):
url: str,
access_token: str,
index: str,
source: bool = "parsedmarc",
source: str = "parsedmarc",
verify=True,
timeout=60,
):
@@ -51,9 +51,9 @@ class HECClient(object):
timeout (float): Number of seconds to wait for the server to send
data before giving up
"""
url = urlparse(url)
parsed_url = urlparse(url)
self.url = "{0}://{1}/services/collector/event/1.0".format(
url.scheme, url.netloc
parsed_url.scheme, parsed_url.netloc
)
self.access_token = access_token.lstrip("Splunk ")
self.index = index
@@ -62,7 +62,9 @@ class HECClient(object):
self.session = requests.Session()
self.timeout = timeout
self.session.verify = verify
self._common_data = dict(host=self.host, source=self.source, index=self.index)
self._common_data: dict[str, Union[str, int, float, dict]] = dict(
host=self.host, source=self.source, index=self.index
)
self.session.headers = {
"User-Agent": USER_AGENT,
@@ -70,7 +72,8 @@ class HECClient(object):
}
def save_aggregate_reports_to_splunk(
self, aggregate_reports: list[OrderedDict[str, Any]]
self,
aggregate_reports: Union[list[AggregateReport], AggregateReport],
):
"""
Saves aggregate DMARC reports to Splunk
@@ -91,7 +94,7 @@ class HECClient(object):
json_str = ""
for report in aggregate_reports:
for record in report["records"]:
new_report = dict()
new_report: dict[str, Union[str, int, float, dict]] = {}
for metadata in report["report_metadata"]:
new_report[metadata] = report["report_metadata"][metadata]
new_report["interval_begin"] = record["interval_begin"]
@@ -135,7 +138,8 @@ class HECClient(object):
raise SplunkError(response["text"])
def save_forensic_reports_to_splunk(
self, forensic_reports: list[OrderedDict[str, Any]]
self,
forensic_reports: Union[list[AggregateReport], AggregateReport],
):
"""
Saves forensic DMARC reports to Splunk
@@ -170,7 +174,9 @@ class HECClient(object):
if response["code"] != 0:
raise SplunkError(response["text"])
def save_smtp_tls_reports_to_splunk(self, reports: OrderedDict[str, Any]):
def save_smtp_tls_reports_to_splunk(
self, reports: Union[list[AggregateReport], AggregateReport]
):
"""
Saves aggregate DMARC reports to Splunk

View File

@@ -7,8 +7,8 @@ import logging
import logging.handlers
from typing import Any
from parsedmarc import AggregateReport, ForensicReport, SMTPTLSReport, ParseResults
from collections import OrderedDict
import json
@@ -37,21 +37,21 @@ class SyslogClient(object):
self.logger.addHandler(log_handler)
def save_aggregate_report_to_syslog(
self, aggregate_reports: list[OrderedDict[str, Any]]
self, aggregate_reports: list[AggregateReport]
):
rows = parsed_aggregate_reports_to_csv_rows(aggregate_reports)
for row in rows:
self.logger.info(json.dumps(row))
def save_forensic_report_to_syslog(
self, forensic_reports: list[OrderedDict[str, Any]]
self, forensic_reports: list[ForensicReport]
):
rows = parsed_forensic_reports_to_csv_rows(forensic_reports)
for row in rows:
self.logger.info(json.dumps(row))
def save_smtp_tls_report_to_syslog(
self, smtp_tls_reports: list[OrderedDict[str, Any]]
self, smtp_tls_reports: list[SMTPTLSReport]
):
rows = parsed_smtp_tls_reports_to_csv_rows(smtp_tls_reports)
for row in rows:

View File

@@ -4,14 +4,13 @@
from __future__ import annotations
from typing import Optional, Union
from typing import Optional, Union, TypedDict
import logging
import os
from datetime import datetime
from datetime import timezone
from datetime import timedelta
from collections import OrderedDict
from expiringdict import ExpiringDict
import tempfile
import subprocess
@@ -46,6 +45,32 @@ import parsedmarc.resources.dbip
import parsedmarc.resources.maps
from parsedmarc.constants import USER_AGENT
# TypedDict definitions for better typing
class IPAddressInfo(TypedDict, total=False):
"""Information about an IP address"""
ip_address: str
country: Optional[str]
reverse_dns: Optional[str]
base_domain: Optional[str]
name: Optional[str]
type: Optional[str]
class EmailAddress(TypedDict, total=False):
"""Parsed email address information"""
display_name: Optional[str]
address: str
local: Optional[str]
domain: Optional[str]
class ReverseDNSService(TypedDict):
"""Reverse DNS service information"""
name: str
type: Optional[str]
parenthesis_regex = re.compile(r"\s*\(.*\)\s*")
null_file = open(os.devnull, "w")
@@ -67,7 +92,7 @@ class DownloadError(RuntimeError):
"""Raised when an error occurs when downloading a file"""
def decode_base64(data: str) -> bytes:
def decode_base64(data) -> bytes:
"""
Decodes a base64 string, with padding being optional
@@ -342,7 +367,7 @@ def get_service_from_reverse_dns_base_domain(
url: Optional[bool] = None,
offline: Optional[bool] = False,
reverse_dns_map: Optional[bool] = None,
) -> str:
) -> ReverseDNSService:
"""
Returns the service name of a given base domain name from reverse DNS.
@@ -363,7 +388,7 @@ def get_service_from_reverse_dns_base_domain(
reader = csv.DictReader(_csv_file)
for row in reader:
key = row["base_reverse_dns"].lower().strip()
reverse_dns_map[key] = dict(name=row["name"], type=row["type"])
reverse_dns_map[key] = {"name": row["name"], "type": row["type"]}
base_domain = base_domain.lower().strip()
if url is None:
@@ -373,7 +398,7 @@ def get_service_from_reverse_dns_base_domain(
"resources/maps/base_reverse_dns_map.csv"
)
if reverse_dns_map is None:
reverse_dns_map = dict()
reverse_dns_map = {}
csv_file = io.StringIO()
if not (offline or always_use_local_file) and len(reverse_dns_map) == 0:
@@ -405,7 +430,7 @@ def get_service_from_reverse_dns_base_domain(
try:
service = reverse_dns_map[base_domain]
except KeyError:
service = dict(name=base_domain, type=None)
service = {"name": base_domain, "type": None}
return service
@@ -416,13 +441,13 @@ def get_ip_address_info(
ip_db_path: Optional[str] = None,
reverse_dns_map_path: Optional[str] = None,
always_use_local_files: Optional[bool] = False,
reverse_dns_map_url: Optional[bool] = None,
reverse_dns_map_url: Optional[str] = None,
cache: Optional[ExpiringDict] = None,
reverse_dns_map: Optional[bool] = None,
reverse_dns_map: Optional[dict] = None,
offline: Optional[bool] = False,
nameservers: Optional[list[str]] = None,
timeout: Optional[float] = 2.0,
) -> OrderedDict[str, str]:
) -> IPAddressInfo:
"""
Returns reverse DNS and country information for the given IP address
@@ -449,7 +474,7 @@ def get_ip_address_info(
if info:
logger.debug(f"IP address {ip_address} was found in cache")
return info
info = OrderedDict()
info = {}
info["ip_address"] = ip_address
if offline:
reverse_dns = None
@@ -487,7 +512,7 @@ def get_ip_address_info(
return info
def parse_email_address(original_address: str) -> OrderedDict[str, str]:
def parse_email_address(original_address: str) -> EmailAddress:
if original_address[0] == "":
display_name = None
else:
@@ -500,14 +525,12 @@ def parse_email_address(original_address: str) -> OrderedDict[str, str]:
local = address_parts[0].lower()
domain = address_parts[-1].lower()
return OrderedDict(
[
("display_name", display_name),
("address", address),
("local", local),
("domain", domain),
]
)
return {
"display_name": display_name,
"address": address,
"local": local,
"domain": domain,
}
def get_filename_safe_string(string: str) -> str:

View File

@@ -4,8 +4,6 @@ from __future__ import annotations
from typing import Any, Optional, Union
from collections import OrderedDict
import requests
from parsedmarc import logger
@@ -40,19 +38,19 @@ class WebhookClient(object):
"Content-Type": "application/json",
}
def save_forensic_report_to_webhook(self, report: OrderedDict[str, Any]):
def save_forensic_report_to_webhook(self, report: str):
try:
self._send_to_webhook(self.forensic_url, report)
except Exception as error_:
logger.error("Webhook Error: {0}".format(error_.__str__()))
def save_smtp_tls_report_to_webhook(self, report: OrderedDict[str, Any]):
def save_smtp_tls_report_to_webhook(self, report: str):
try:
self._send_to_webhook(self.smtp_tls_url, report)
except Exception as error_:
logger.error("Webhook Error: {0}".format(error_.__str__()))
def save_aggregate_report_to_webhook(self, report: OrderedDict[str, Any]):
def save_aggregate_report_to_webhook(self, report: str):
try:
self._send_to_webhook(self.aggregate_url, report)
except Exception as error_:

View File

@@ -2,6 +2,7 @@
requires = [
"hatchling>=1.27.0",
]
requires_python = ">=3.9,<3.14"
build-backend = "hatchling.build"
[project]
@@ -28,7 +29,7 @@ classifiers = [
"Operating System :: OS Independent",
"Programming Language :: Python :: 3"
]
requires-python = ">= 3.9"
requires-python = ">=3.9, <3.14"
dependencies = [
"azure-identity>=1.8.0",
"azure-monitor-ingestion>=1.0.0",

View File

@@ -1,3 +1,6 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from __future__ import absolute_import, print_function, unicode_literals
import os