API reference
parsedmarc
A Python package for parsing DMARC reports
- exception parsedmarc.InvalidAggregateReport[source]
Raised when an invalid DMARC aggregate report is encountered
- exception parsedmarc.InvalidFailureReport[source]
Raised when an invalid DMARC failure report is encountered
- parsedmarc.InvalidForensicReport
alias of InvalidFailureReport
- exception parsedmarc.InvalidSMTPTLSReport[source]
Raised when an invalid SMTP TLS report is encountered
- parsedmarc.append_json(filename: str, reports: Sequence[AggregateReport] | Sequence[FailureReport] | Sequence[SMTPTLSReport]) None[source]
Append reports to a JSON array on disk, creating the file if needed.
Reads the existing array (if the file exists and parses cleanly), merges the new reports onto the end, and rewrites the file as a single valid JSON array. An earlier version of this used an open(..., "a+") + seek() + overwrite pattern, but Python’s documentation is explicit that on POSIX, a/a+ writes always go to EOF regardless of seek position, so the second call onto an existing file produced [...],\n[...]-style corrupted output. Read-merge-write is the only way to get a valid JSON array out of repeated appends.
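The read-merge-write approach described above can be sketched with the standard library alone. This is a minimal illustration, not the package's implementation: append_json_sketch and the report_id key are hypothetical names, and the handling of a corrupt file (start over with an empty array) is an assumption.

```python
import json
import os
import tempfile


def append_json_sketch(filename, reports):
    """Append reports to a JSON array on disk via read-merge-write."""
    existing = []
    if os.path.exists(filename):
        try:
            with open(filename, "r", encoding="utf-8") as f:
                loaded = json.load(f)
            # Only reuse the old content if it really is a JSON array.
            if isinstance(loaded, list):
                existing = loaded
        except (json.JSONDecodeError, OSError):
            # Assumption: an unreadable or corrupt file is replaced.
            existing = []
    merged = existing + list(reports)
    # Mode "w" truncates, so the file always holds exactly one array.
    with open(filename, "w", encoding="utf-8") as f:
        json.dump(merged, f, indent=2)


# Usage: two successive appends still leave one valid array on disk.
path = os.path.join(tempfile.mkdtemp(), "aggregate.json")
append_json_sketch(path, [{"report_id": "a"}])
append_json_sketch(path, [{"report_id": "b"}])
with open(path, encoding="utf-8") as f:
    on_disk = json.load(f)
```

Rewriting the whole file costs more I/O than a true append, but the file is a valid JSON array after every call.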
- parsedmarc.email_results(results: ParsingResults, host: str, mail_from: str, mail_to: list[str] | None, *, mail_cc: list[str] | None = None, mail_bcc: list[str] | None = None, port: int = 0, require_encryption: bool = False, verify: bool = True, username: str | None = None, password: str | None = None, subject: str | None = None, attachment_filename: str | None = None, message: str | None = None)[source]
Emails parsing results as a zip file
- Parameters:
results (dict) – Parsing results
host (str) – Mail server hostname or IP address
mail_from – The value of the message from header
mail_to (list) – A list of addresses to mail to
mail_cc (list) – A list of addresses to CC
mail_bcc (list) – A list of addresses to BCC
port (int) – Port to use
require_encryption (bool) – Require a secure connection from the start
verify (bool) – Verify the SSL/TLS certificate
username (str) – An optional username
password (str) – An optional password
subject (str) – Overrides the default message subject
attachment_filename (str) – Override the default attachment filename
message (str) – Override the default plain text body
- parsedmarc.extract_report(content: bytes | str | BinaryIO) str[source]
Extracts text from a zip or gzip file supplied as a base64-encoded string, a file-like object, or bytes.
- Parameters:
content – Report file as a base64-encoded string, file-like object, or bytes.
- Returns:
The extracted text
- Return type:
str
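One way to accept all three input forms is to normalize them to bytes and then branch on the archive's magic number. The sketch below assumes that approach; extract_report_sketch is a hypothetical name, and both the "first archive member" choice and the plain-text fallback are assumptions rather than documented behavior.

```python
import base64
import gzip
import io
import zipfile

ZIP_MAGIC = b"PK\x03\x04"
GZIP_MAGIC = b"\x1f\x8b"


def extract_report_sketch(content):
    """Return the text inside a zip or gzip payload."""
    if isinstance(content, str):
        # Assumption: a str input is base64-encoded bytes.
        data = base64.b64decode(content)
    elif isinstance(content, (bytes, bytearray)):
        data = bytes(content)
    else:
        # Otherwise treat it as a binary file-like object.
        data = content.read()

    if data.startswith(ZIP_MAGIC):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            # Assumption: the report is the first member of the archive.
            first = archive.namelist()[0]
            raw = archive.read(first)
    elif data.startswith(GZIP_MAGIC):
        raw = gzip.decompress(data)
    else:
        # Neither archive type: assume the payload is already plain text.
        raw = data
    return raw.decode("utf-8", errors="ignore")


# Usage: the same XML comes back from gzip bytes and from a base64 zip.
xml = "<feedback></feedback>"
gz_bytes = gzip.compress(xml.encode("utf-8"))
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("report.xml", xml)
zip_b64 = base64.b64encode(zip_buffer.getvalue()).decode("ascii")
```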
- parsedmarc.extract_report_from_file_path(file_path: str | bytes | PathLike[str] | PathLike[bytes]) str[source]
Extracts report from a file at the given file_path
- parsedmarc.get_dmarc_reports_from_mailbox(connection: MailboxConnection, *, reports_folder: str = 'INBOX', archive_folder: str = 'Archive', delete: bool = False, test: bool = False, ip_db_path: str | None = None, always_use_local_files: bool = False, reverse_dns_map_path: str | None = None, reverse_dns_map_url: str | None = None, offline: bool = False, nameservers: list[str] | None = None, dns_timeout: float = 6.0, dns_retries: int = 0, strip_attachment_payloads: bool = False, results: ParsingResults | None = None, batch_size: int = 10, since: datetime | date | str | None = None, create_folders: bool = True, normalize_timespan_threshold_hours: float = 24) ParsingResults[source]
Fetches and parses DMARC reports from a mailbox
- Parameters:
connection – A Mailbox connection object
reports_folder (str) – The folder where reports can be found
archive_folder (str) – The folder to move processed mail to
delete (bool) – Delete messages after processing them
test (bool) – Do not move or delete messages after processing them
ip_db_path (str) – Path to a MMDB file from IPinfo, MaxMind, or DBIP
always_use_local_files (bool) – Do not download files
reverse_dns_map_path (str) – Path to a reverse DNS map file
reverse_dns_map_url (str) – URL to a reverse DNS map file
offline (bool) – Do not query online for geolocation or DNS
nameservers (list) – A list of DNS nameservers to query
dns_timeout (float) – Set the DNS query timeout
dns_retries (int) – Number of times to retry DNS queries on timeout or other transient errors
strip_attachment_payloads (bool) – Remove attachment payloads from failure report results
results (dict) – Results from the previous run
batch_size (int) – Number of messages to read and process before saving (use 0 for no limit)
since – Search for messages since a certain time (units: m = minutes, h = hours, d = days, w = weeks)
create_folders (bool) – Whether to create the destination folders (not used in watch)
normalize_timespan_threshold_hours (float) – Normalize timespans beyond this
- Returns:
Lists of aggregate_reports, failure_reports, and smtp_tls_reports
- Return type:
dict
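The unit letters listed for since map naturally onto datetime.timedelta keywords. The sketch below assumes a string value has the form "number followed by one unit letter" (for example 2h); that format and the since_to_timedelta helper are assumptions, and datetime or date inputs are not covered.

```python
import re
from datetime import timedelta

# Unit letters as listed in the parameter description.
UNITS = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}


def since_to_timedelta(since):
    """Convert a string such as "2h" into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhdw])", since.strip())
    if match is None:
        raise ValueError(f"unsupported since value: {since!r}")
    amount, unit = match.groups()
    # Build the keyword dynamically, e.g. timedelta(hours=2).
    return timedelta(**{UNITS[unit]: int(amount)})


two_hours = since_to_timedelta("2h")
one_week = since_to_timedelta("1w")
```

Subtracting the result from the current time gives a cutoff that a mailbox search could use.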
- parsedmarc.get_dmarc_reports_from_mbox(input_: str, *, nameservers: list[str] | None = None, dns_timeout: float = 2.0, dns_retries: int = 0, strip_attachment_payloads: bool = False, ip_db_path: str | None = None, always_use_local_files: bool = False, reverse_dns_map_path: str | None = None, reverse_dns_map_url: str | None = None, offline: bool = False, normalize_timespan_threshold_hours: float = 24.0) ParsingResults[source]
Parses a mailbox in mbox format containing e-mails with attached DMARC reports
- Parameters:
input_ (str) – A path to an mbox file
nameservers (list) – A list of one or more nameservers to use (Cloudflare’s public DNS resolvers by default)
dns_timeout (float) – Sets the DNS timeout in seconds
dns_retries (int) – Number of times to retry DNS queries on timeout or other transient errors
strip_attachment_payloads (bool) – Remove attachment payloads from failure report results
always_use_local_files (bool) – Do not download files
reverse_dns_map_path (str) – Path to a reverse DNS map file
reverse_dns_map_url (str) – URL to a reverse DNS map file
ip_db_path (str) – Path to a MMDB file from IPinfo, MaxMind, or DBIP
offline (bool) – Do not make online queries for geolocation or DNS
normalize_timespan_threshold_hours (float) – Normalize timespans beyond this
- Returns:
Lists of aggregate_reports, failure_reports, and smtp_tls_reports
- Return type:
dict
- parsedmarc.get_report_zip(results: ParsingResults) bytes[source]
Creates a zip file of parsed report output
- Parameters:
results – The parsed results
- Returns:
zip file bytes
- Return type:
bytes
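Building a zip archive entirely in memory and returning its bytes can be done with zipfile and io.BytesIO. This is a sketch under assumptions: build_report_zip is a hypothetical helper, and the member names simply reuse the default filenames from save_output for illustration.

```python
import io
import json
import zipfile


def build_report_zip(files):
    """Pack a mapping of archive names to text into zip file bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    # getvalue() is read after the archive is closed, so the
    # central directory has already been written.
    return buffer.getvalue()


zip_bytes = build_report_zip({
    "aggregate.json": json.dumps([{"report_id": "a"}]),
    "aggregate.csv": "report_id\r\na\r\n",
})
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    names = sorted(archive.namelist())
```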
- parsedmarc.parse_aggregate_report_file(_input: str | bytes | BinaryIO, *, offline: bool = False, always_use_local_files: bool = False, reverse_dns_map_path: str | None = None, reverse_dns_map_url: str | None = None, ip_db_path: str | None = None, nameservers: list[str] | None = None, dns_timeout: float = 2.0, dns_retries: int = 0, keep_alive: Callable | None = None, normalize_timespan_threshold_hours: float = 24.0) AggregateReport[source]
Parses a file at the given path, a file-like object, or bytes as an aggregate DMARC report
- Parameters:
_input (str | bytes | IO) – A path to a file, a file-like object, or bytes
offline (bool) – Do not query online for geolocation or DNS
always_use_local_files (bool) – Do not download files
reverse_dns_map_path (str) – Path to a reverse DNS map file
reverse_dns_map_url (str) – URL to a reverse DNS map file
ip_db_path (str) – Path to a MMDB file from IPinfo, MaxMind, or DBIP
nameservers (list) – A list of one or more nameservers to use (Cloudflare’s public DNS resolvers by default)
dns_timeout (float) – Sets the DNS timeout in seconds
dns_retries (int) – Number of times to retry DNS queries on timeout or other transient errors
keep_alive (callable) – Keep alive function
normalize_timespan_threshold_hours (float) – Normalize timespans beyond this
- Returns:
The parsed DMARC aggregate report
- Return type:
dict
- parsedmarc.parse_aggregate_report_xml(xml: str, *, ip_db_path: str | None = None, always_use_local_files: bool = False, reverse_dns_map_path: str | None = None, reverse_dns_map_url: str | None = None, offline: bool = False, nameservers: list[str] | None = None, timeout: float = 2.0, retries: int = 0, keep_alive: Callable | None = None, normalize_timespan_threshold_hours: float = 24.0) AggregateReport[source]
Parses a DMARC XML report string and returns a consistent dict
- Parameters:
xml (str) – A string of DMARC aggregate report XML
ip_db_path (str) – Path to a MMDB file from IPinfo, MaxMind, or DBIP
always_use_local_files (bool) – Do not download files
reverse_dns_map_path (str) – Path to a reverse DNS map file
reverse_dns_map_url (str) – URL to a reverse DNS map file
offline (bool) – Do not query online for geolocation or DNS
nameservers (list) – A list of one or more nameservers to use (Cloudflare’s public DNS resolvers by default)
timeout (float) – Sets the DNS timeout in seconds
retries (int) – Number of times to retry DNS queries on timeout or other transient errors
keep_alive (callable) – Keep alive function
normalize_timespan_threshold_hours (float) – Normalize timespans beyond this
- Returns:
The parsed aggregate DMARC report
- Return type:
dict
- parsedmarc.parse_failure_report(feedback_report: str, sample: str, msg_date: datetime, *, always_use_local_files: bool = False, reverse_dns_map_path: str | None = None, reverse_dns_map_url: str | None = None, offline: bool = False, ip_db_path: str | None = None, nameservers: list[str] | None = None, dns_timeout: float = 2.0, dns_retries: int = 0, strip_attachment_payloads: bool = False) FailureReport[source]
Converts a DMARC failure report and sample to a dict
- Parameters:
feedback_report (str) – A message’s feedback report as a string
sample (str) – The RFC 822 headers or RFC 822 message sample
ip_db_path (str) – Path to a MMDB file from IPinfo, MaxMind, or DBIP
always_use_local_files (bool) – Do not download files
reverse_dns_map_path (str) – Path to a reverse DNS map file
reverse_dns_map_url (str) – URL to a reverse DNS map file
offline (bool) – Do not query online for geolocation or DNS
msg_date (datetime) – The message’s date header
nameservers (list) – A list of one or more nameservers to use (Cloudflare’s public DNS resolvers by default)
dns_timeout (float) – Sets the DNS timeout in seconds
dns_retries (int) – Number of times to retry DNS queries on timeout or other transient errors
strip_attachment_payloads (bool) – Remove attachment payloads from failure report results
- Returns:
A parsed report and sample
- Return type:
dict
- parsedmarc.parse_forensic_report(feedback_report: str, sample: str, msg_date: datetime, *, always_use_local_files: bool = False, reverse_dns_map_path: str | None = None, reverse_dns_map_url: str | None = None, offline: bool = False, ip_db_path: str | None = None, nameservers: list[str] | None = None, dns_timeout: float = 2.0, dns_retries: int = 0, strip_attachment_payloads: bool = False) FailureReport
Converts a DMARC failure report and sample to a dict
- Parameters:
feedback_report (str) – A message’s feedback report as a string
sample (str) – The RFC 822 headers or RFC 822 message sample
ip_db_path (str) – Path to a MMDB file from IPinfo, MaxMind, or DBIP
always_use_local_files (bool) – Do not download files
reverse_dns_map_path (str) – Path to a reverse DNS map file
reverse_dns_map_url (str) – URL to a reverse DNS map file
offline (bool) – Do not query online for geolocation or DNS
msg_date (datetime) – The message’s date header
nameservers (list) – A list of one or more nameservers to use (Cloudflare’s public DNS resolvers by default)
dns_timeout (float) – Sets the DNS timeout in seconds
dns_retries (int) – Number of times to retry DNS queries on timeout or other transient errors
strip_attachment_payloads (bool) – Remove attachment payloads from failure report results
- Returns:
A parsed report and sample
- Return type:
dict
- parsedmarc.parse_report_email(input_: bytes | str, *, offline: bool = False, ip_db_path: str | None = None, always_use_local_files: bool = False, reverse_dns_map_path: str | None = None, reverse_dns_map_url: str | None = None, nameservers: list[str] | None = None, dns_timeout: float = 2.0, dns_retries: int = 0, strip_attachment_payloads: bool = False, keep_alive: Callable | None = None, normalize_timespan_threshold_hours: float = 24.0) AggregateParsedReport | FailureParsedReport | SMTPTLSParsedReport[source]
Parses a DMARC report from an email
- Parameters:
input_ – An emailed DMARC report in RFC 822 format, as bytes or a string
ip_db_path (str) – Path to a MMDB file from IPinfo, MaxMind, or DBIP
always_use_local_files (bool) – Do not download files
reverse_dns_map_path (str) – Path to a reverse DNS map
reverse_dns_map_url (str) – URL to a reverse DNS map
offline (bool) – Do not query online for geolocation or DNS
nameservers (list) – A list of one or more nameservers to use
dns_timeout (float) – Sets the DNS timeout in seconds
dns_retries (int) – Number of times to retry DNS queries on timeout or other transient errors
strip_attachment_payloads (bool) – Remove attachment payloads from failure report results
keep_alive (callable) – Keep alive function
normalize_timespan_threshold_hours (float) – Normalize timespans beyond this
- Returns:
report_type: aggregate or failure
report: The parsed report
- Return type:
dict
- parsedmarc.parse_report_file(input_: bytes | str | PathLike[str] | PathLike[bytes] | BinaryIO, *, nameservers: list[str] | None = None, dns_timeout: float = 2.0, dns_retries: int = 0, strip_attachment_payloads: bool = False, ip_db_path: str | None = None, always_use_local_files: bool = False, reverse_dns_map_path: str | None = None, reverse_dns_map_url: str | None = None, offline: bool = False, keep_alive: Callable | None = None, normalize_timespan_threshold_hours: float = 24) AggregateParsedReport | FailureParsedReport | SMTPTLSParsedReport[source]
Parses a DMARC aggregate or failure file at the given path, a file-like object, or bytes
- Parameters:
input_ (str | os.PathLike | bytes | BinaryIO) – A path to a file, a file-like object, or bytes
nameservers (list) – A list of one or more nameservers to use (Cloudflare’s public DNS resolvers by default)
dns_timeout (float) – Sets the DNS timeout in seconds
dns_retries (int) – Number of times to retry DNS queries on timeout or other transient errors
strip_attachment_payloads (bool) – Remove attachment payloads from failure report results
ip_db_path (str) – Path to a MMDB file from IPinfo, MaxMind, or DBIP
always_use_local_files (bool) – Do not download files
reverse_dns_map_path (str) – Path to a reverse DNS map
reverse_dns_map_url (str) – URL to a reverse DNS map
offline (bool) – Do not make online queries for geolocation or DNS
keep_alive (callable) – Keep alive function
- Returns:
The parsed DMARC report
- Return type:
dict
- parsedmarc.parse_smtp_tls_report_json(report: str | bytes) SMTPTLSReport[source]
Parses and validates an SMTP TLS report
- parsedmarc.parsed_aggregate_reports_to_csv(reports: AggregateReport | list[AggregateReport]) str[source]
Converts one or more parsed aggregate reports to flat CSV format, including headers
- Parameters:
reports – A parsed aggregate report or list of parsed aggregate reports
- Returns:
Parsed aggregate report data in flat CSV format, including headers
- Return type:
str
- parsedmarc.parsed_aggregate_reports_to_csv_rows(reports: AggregateReport | list[AggregateReport]) list[dict[str, Any]][source]
Converts one or more parsed aggregate reports to list of dicts in flat CSV format
- Parameters:
reports – A parsed aggregate report or list of parsed aggregate reports
- Returns:
Parsed aggregate report data as a list of dicts in flat CSV format
- Return type:
list
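The relationship between the "rows" form and the CSV text form can be illustrated with csv.DictWriter: one flat dict per record, then a header plus one line per dict. The field names below (org_name, report_id, source_ip, count) and both helper functions are hypothetical; the real column set is not described here.

```python
import csv
import io


def report_to_rows(report):
    """Emit one flat dict per record, repeating the report-level fields."""
    shared = {"org_name": report["org_name"], "report_id": report["report_id"]}
    return [{**shared, **record} for record in report["records"]]


def rows_to_csv(rows):
    """Serialize a list of flat dicts to CSV text with a header row."""
    if not rows:
        return ""
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return output.getvalue()


report = {
    "org_name": "example.net",
    "report_id": "abc123",
    "records": [
        {"source_ip": "192.0.2.1", "count": 3},
        {"source_ip": "192.0.2.2", "count": 1},
    ],
}
rows = report_to_rows(report)
csv_text = rows_to_csv(rows)
```

Keeping the row-building step separate means the same dicts can feed a CSV writer or any other flat sink.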
- parsedmarc.parsed_failure_reports_to_csv(reports: FailureReport | list[FailureReport]) str[source]
Converts one or more parsed failure reports to flat CSV format, including headers
- Parameters:
reports – A parsed failure report or list of parsed failure reports
- Returns:
Parsed failure report data in flat CSV format, including headers
- Return type:
str
- parsedmarc.parsed_failure_reports_to_csv_rows(reports: FailureReport | list[FailureReport]) list[dict[str, Any]][source]
Converts one or more parsed failure reports to a list of dicts in flat CSV format
- Parameters:
reports – A parsed failure report or list of parsed failure reports
- Returns:
Parsed failure report data as a list of dicts in flat CSV format
- Return type:
list
- parsedmarc.parsed_forensic_reports_to_csv(reports: FailureReport | list[FailureReport]) str
Converts one or more parsed failure reports to flat CSV format, including headers
- Parameters:
reports – A parsed failure report or list of parsed failure reports
- Returns:
Parsed failure report data in flat CSV format, including headers
- Return type:
str
- parsedmarc.parsed_forensic_reports_to_csv_rows(reports: FailureReport | list[FailureReport]) list[dict[str, Any]]
Converts one or more parsed failure reports to a list of dicts in flat CSV format
- Parameters:
reports – A parsed failure report or list of parsed failure reports
- Returns:
Parsed failure report data as a list of dicts in flat CSV format
- Return type:
list
- parsedmarc.parsed_smtp_tls_reports_to_csv(reports: SMTPTLSReport | list[SMTPTLSReport]) str[source]
Converts one or more parsed SMTP TLS reports to flat CSV format, including headers
- Parameters:
reports – A parsed SMTP TLS report or list of parsed SMTP TLS reports
- Returns:
Parsed SMTP TLS report data in flat CSV format, including headers
- Return type:
str
- parsedmarc.parsed_smtp_tls_reports_to_csv_rows(reports: SMTPTLSReport | list[SMTPTLSReport]) list[dict[str, Any]][source]
Converts one or more parsed SMTP TLS reports into a list of single-layer dict objects suitable for use in a CSV
- parsedmarc.save_output(results: ParsingResults, *, output_directory: str = 'output', aggregate_json_filename: str = 'aggregate.json', failure_json_filename: str = 'failure.json', smtp_tls_json_filename: str = 'smtp_tls.json', aggregate_csv_filename: str = 'aggregate.csv', failure_csv_filename: str = 'failure.csv', smtp_tls_csv_filename: str = 'smtp_tls.csv')[source]
Save report data in the given directory
- Parameters:
results – Parsing results
output_directory (str) – The path to the directory to save in
aggregate_json_filename (str) – Filename for the aggregate JSON file
failure_json_filename (str) – Filename for the failure JSON file
smtp_tls_json_filename (str) – Filename for the SMTP TLS JSON file
aggregate_csv_filename (str) – Filename for the aggregate CSV file
failure_csv_filename (str) – Filename for the failure CSV file
smtp_tls_csv_filename (str) – Filename for the SMTP TLS CSV file
- parsedmarc.watch_inbox(mailbox_connection: MailboxConnection, callback: Callable, *, reports_folder: str = 'INBOX', archive_folder: str = 'Archive', delete: bool = False, test: bool = False, check_timeout: int = 30, ip_db_path: str | None = None, always_use_local_files: bool = False, reverse_dns_map_path: str | None = None, reverse_dns_map_url: str | None = None, offline: bool = False, nameservers: list[str] | None = None, dns_timeout: float = 6.0, dns_retries: int = 0, strip_attachment_payloads: bool = False, batch_size: int = 10, since: datetime | date | str | None = None, normalize_timespan_threshold_hours: float = 24, config_reloading: Callable | None = None)[source]
Watches the mailbox for new messages and sends the results to a callback function
- Parameters:
mailbox_connection – The mailbox connection object
callback – The callback function to receive the parsing results
reports_folder (str) – The IMAP folder where reports can be found
archive_folder (str) – The folder to move processed mail to
delete (bool) – Delete messages after processing them
test (bool) – Do not move or delete messages after processing them
check_timeout (int) – Number of seconds to wait for an IMAP IDLE response or the number of seconds until the next mail check
ip_db_path (str) – Path to a MMDB file from IPinfo, MaxMind, or DBIP
always_use_local_files (bool) – Do not download files
reverse_dns_map_path (str) – Path to a reverse DNS map file
reverse_dns_map_url (str) – URL to a reverse DNS map file
offline (bool) – Do not query online for geolocation or DNS
nameservers (list) – A list of one or more nameservers to use (Cloudflare’s public DNS resolvers by default)
dns_timeout (float) – Set the DNS query timeout
dns_retries (int) – Number of times to retry DNS queries on timeout or other transient errors
strip_attachment_payloads (bool) – Replace attachment payloads in failure report samples with None
batch_size (int) – Number of messages to read and process before saving
since – Search for messages since a certain time
normalize_timespan_threshold_hours (float) – Normalize timespans beyond this
config_reloading – Optional callable that returns True when a config reload has been requested (e.g. via SIGHUP)
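The callback and config_reloading arguments are both plain callables. A minimal sketch of what a caller might supply follows; ReloadFlag and on_results are hypothetical names, and using a threading.Event as the flag is a design assumption, not something the package prescribes.

```python
import threading


class ReloadFlag:
    """Callable that reports whether a config reload was requested."""

    def __init__(self):
        self._event = threading.Event()

    def request(self, *_args):
        # Accepts (signum, frame) so it can double as a signal handler.
        self._event.set()

    def __call__(self):
        return self._event.is_set()


collected = []


def on_results(results):
    """Callback: receive each batch of parsing results."""
    collected.append(results)


reload_requested = ReloadFlag()
# On POSIX the flag could be wired to SIGHUP with:
#   signal.signal(signal.SIGHUP, reload_requested.request)
```

These objects would then be passed as callback=on_results and config_reloading=reload_requested.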
parsedmarc.elastic
- exception parsedmarc.elastic.AlreadySaved[source]
Raised when a report to be saved matches an existing report
- parsedmarc.elastic.create_indexes(names: list[str], settings: dict[str, Any] | None = None)[source]
Create Elasticsearch indexes
- Parameters:
names (list) – A list of index names
settings (dict) – Index settings. In Serverless mode, keys in _SERVERLESS_REJECTED_SETTINGS are filtered out and the remaining keys are passed through; defaults are skipped entirely.
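The filtering described for Serverless mode amounts to a dict comprehension over a rejected-key set. The sketch below assumes that set contains number_of_shards and number_of_replicas, the two keys named in the set_hosts description; REJECTED_IN_SERVERLESS and filter_index_settings are hypothetical names, and the "defaults are skipped" behavior is not modeled.

```python
# Assumption: mirrors the two keys named in the set_hosts description.
REJECTED_IN_SERVERLESS = frozenset({"number_of_shards", "number_of_replicas"})


def filter_index_settings(settings, serverless):
    """Drop settings that a Serverless project would reject."""
    if settings is None:
        return {}
    if not serverless:
        # Outside Serverless mode everything is passed through.
        return dict(settings)
    return {
        key: value
        for key, value in settings.items()
        if key not in REJECTED_IN_SERVERLESS
    }


requested = {"number_of_shards": 1, "number_of_replicas": 0, "refresh_interval": "30s"}
serverless_settings = filter_index_settings(requested, serverless=True)
standard_settings = filter_index_settings(requested, serverless=False)
```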
- parsedmarc.elastic.migrate_indexes(aggregate_indexes: list[str] | None = None, failure_indexes: list[str] | None = None)[source]
Updates index mappings
- Parameters:
aggregate_indexes (list) – A list of aggregate index names
failure_indexes (list) – A list of failure index names
- parsedmarc.elastic.save_aggregate_report_to_elasticsearch(aggregate_report: dict[str, Any], index_suffix: str | None = None, index_prefix: str | None = None, monthly_indexes: bool | None = False, number_of_shards: int = 1, number_of_replicas: int = 0)[source]
Saves a parsed DMARC aggregate report to Elasticsearch
- Parameters:
aggregate_report (dict) – A parsed aggregate report
index_suffix (str) – The suffix of the name of the index to save to
index_prefix (str) – The prefix of the name of the index to save to
monthly_indexes (bool) – Use monthly indexes instead of daily indexes
number_of_shards (int) – The number of shards to use in the index
number_of_replicas (int) – The number of replicas to use in the index
- Raises:
AlreadySaved
- parsedmarc.elastic.save_failure_report_to_elasticsearch(failure_report: dict[str, Any], index_suffix: Any | None = None, index_prefix: str | None = None, monthly_indexes: bool | None = False, number_of_shards: int = 1, number_of_replicas: int = 0)[source]
Saves a parsed DMARC failure report to Elasticsearch
- Parameters:
failure_report (dict) – A parsed failure report
index_suffix (str) – The suffix of the name of the index to save to
index_prefix (str) – The prefix of the name of the index to save to
monthly_indexes (bool) – Use monthly indexes instead of daily indexes
number_of_shards (int) – The number of shards to use in the index
number_of_replicas (int) – The number of replicas to use in the index
- Raises:
AlreadySaved
- parsedmarc.elastic.save_forensic_report_to_elasticsearch(failure_report: dict[str, Any], index_suffix: Any | None = None, index_prefix: str | None = None, monthly_indexes: bool | None = False, number_of_shards: int = 1, number_of_replicas: int = 0)
Saves a parsed DMARC failure report to Elasticsearch
- Parameters:
failure_report (dict) – A parsed failure report
index_suffix (str) – The suffix of the name of the index to save to
index_prefix (str) – The prefix of the name of the index to save to
monthly_indexes (bool) – Use monthly indexes instead of daily indexes
number_of_shards (int) – The number of shards to use in the index
number_of_replicas (int) – The number of replicas to use in the index
- Raises:
AlreadySaved
- parsedmarc.elastic.save_smtp_tls_report_to_elasticsearch(report: dict[str, Any], index_suffix: str | None = None, index_prefix: str | None = None, monthly_indexes: bool = False, number_of_shards: int = 1, number_of_replicas: int = 0)[source]
Saves a parsed SMTP TLS report to Elasticsearch
- Parameters:
report (dict) – A parsed SMTP TLS report
index_suffix (str) – The suffix of the name of the index to save to
index_prefix (str) – The prefix of the name of the index to save to
monthly_indexes (bool) – Use monthly indexes instead of daily indexes
number_of_shards (int) – The number of shards to use in the index
number_of_replicas (int) – The number of replicas to use in the index
- Raises:
AlreadySaved
- parsedmarc.elastic.set_hosts(hosts: str | list[str], *, use_ssl: bool = False, ssl_cert_path: str | None = None, skip_certificate_verification: bool = False, username: str | None = None, password: str | None = None, api_key: str | None = None, timeout: float = 60.0, serverless: bool = False)[source]
Sets the Elasticsearch hosts to use
- Parameters:
hosts (str | list[str]) – A single hostname or URL, or list of hostnames or URLs
use_ssl (bool) – Use an HTTPS connection to the server
ssl_cert_path (str) – Path to the certificate chain
skip_certificate_verification (bool) – Skip certificate verification
username (str) – The username to use for authentication
password (str) – The password to use for authentication
api_key (str) – The Base64 encoded API key to use for authentication
timeout (float) – Timeout in seconds
serverless (bool) – Target an Elastic Cloud Serverless project. When True, create_indexes strips number_of_shards/number_of_replicas from its settings (which Serverless rejects with HTTP 400) and passes any other settings through unchanged.
parsedmarc.opensearch
- exception parsedmarc.opensearch.AlreadySaved[source]
Raised when a report to be saved matches an existing report
- parsedmarc.opensearch.create_indexes(names: list[str], settings: dict[str, Any] | None = None)[source]
Create OpenSearch indexes
- Parameters:
names (list) – A list of index names
settings (dict) – Index settings
- parsedmarc.opensearch.migrate_indexes(aggregate_indexes: list[str] | None = None, failure_indexes: list[str] | None = None)[source]
Updates index mappings
- Parameters:
aggregate_indexes (list) – A list of aggregate index names
failure_indexes (list) – A list of failure index names
- parsedmarc.opensearch.save_aggregate_report_to_opensearch(aggregate_report: dict[str, Any], index_suffix: str | None = None, index_prefix: str | None = None, monthly_indexes: bool = False, number_of_shards: int = 1, number_of_replicas: int = 0)[source]
Saves a parsed DMARC aggregate report to OpenSearch
- Parameters:
aggregate_report (dict) – A parsed aggregate report
index_suffix (str) – The suffix of the name of the index to save to
index_prefix (str) – The prefix of the name of the index to save to
monthly_indexes (bool) – Use monthly indexes instead of daily indexes
number_of_shards (int) – The number of shards to use in the index
number_of_replicas (int) – The number of replicas to use in the index
- Raises:
AlreadySaved
- parsedmarc.opensearch.save_failure_report_to_opensearch(failure_report: dict[str, Any], index_suffix: str | None = None, index_prefix: str | None = None, monthly_indexes: bool = False, number_of_shards: int = 1, number_of_replicas: int = 0)[source]
Saves a parsed DMARC failure report to OpenSearch
- Parameters:
failure_report (dict) – A parsed failure report
index_suffix (str) – The suffix of the name of the index to save to
index_prefix (str) – The prefix of the name of the index to save to
monthly_indexes (bool) – Use monthly indexes instead of daily indexes
number_of_shards (int) – The number of shards to use in the index
number_of_replicas (int) – The number of replicas to use in the index
- Raises:
AlreadySaved
- parsedmarc.opensearch.save_forensic_report_to_opensearch(failure_report: dict[str, Any], index_suffix: str | None = None, index_prefix: str | None = None, monthly_indexes: bool = False, number_of_shards: int = 1, number_of_replicas: int = 0)
Saves a parsed DMARC failure report to OpenSearch
- Parameters:
failure_report (dict) – A parsed failure report
index_suffix (str) – The suffix of the name of the index to save to
index_prefix (str) – The prefix of the name of the index to save to
monthly_indexes (bool) – Use monthly indexes instead of daily indexes
number_of_shards (int) – The number of shards to use in the index
number_of_replicas (int) – The number of replicas to use in the index
- Raises:
AlreadySaved
- parsedmarc.opensearch.save_smtp_tls_report_to_opensearch(report: dict[str, Any], index_suffix: str | None = None, index_prefix: str | None = None, monthly_indexes: bool = False, number_of_shards: int = 1, number_of_replicas: int = 0)[source]
Saves a parsed SMTP TLS report to OpenSearch
- Parameters:
report (dict) – A parsed SMTP TLS report
index_suffix (str) – The suffix of the name of the index to save to
index_prefix (str) – The prefix of the name of the index to save to
monthly_indexes (bool) – Use monthly indexes instead of daily indexes
number_of_shards (int) – The number of shards to use in the index
number_of_replicas (int) – The number of replicas to use in the index
- Raises:
AlreadySaved
- parsedmarc.opensearch.set_hosts(hosts: str | list[str], *, use_ssl: bool | None = False, ssl_cert_path: str | None = None, skip_certificate_verification: bool = False, username: str | None = None, password: str | None = None, api_key: str | None = None, timeout: float | None = 60.0, auth_type: str = 'basic', aws_region: str | None = None, aws_service: str = 'es')[source]
Sets the OpenSearch hosts to use
- Parameters:
hosts (str | list[str]) – A single hostname or URL, or list of hostnames or URLs
use_ssl (bool) – Use an HTTPS connection to the server
ssl_cert_path (str) – Path to the certificate chain
skip_certificate_verification (bool) – Skip certificate verification
username (str) – The username to use for authentication
password (str) – The password to use for authentication
api_key (str) – The Base64 encoded API key to use for authentication
timeout (float) – Timeout in seconds
auth_type (str) – OpenSearch auth mode: basic (default) or awssigv4
aws_region (str) – AWS region for SigV4 auth (required for awssigv4)
aws_service (str) – AWS service for SigV4 signing (default: es)
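The dependency between auth_type and aws_region (a region is required only for awssigv4) can be checked before any connection is attempted. The validator below is a hypothetical illustration of that rule; the function name, the error messages, and the choice of ValueError are assumptions.

```python
def validate_opensearch_auth(auth_type="basic", aws_region=None, aws_service="es"):
    """Check the documented auth options and return them as a dict."""
    if auth_type not in ("basic", "awssigv4"):
        raise ValueError(f"unsupported auth_type: {auth_type!r}")
    if auth_type == "awssigv4" and not aws_region:
        # The parameter list marks aws_region as required for SigV4.
        raise ValueError("aws_region is required when auth_type is awssigv4")
    return {
        "auth_type": auth_type,
        "aws_region": aws_region,
        "aws_service": aws_service,
    }


basic_options = validate_opensearch_auth()
sigv4_options = validate_opensearch_auth("awssigv4", aws_region="us-east-1")
```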
parsedmarc.splunk
- class parsedmarc.splunk.HECClient(url: str, access_token: str, index: str, source: str = 'parsedmarc', verify=True, timeout=60)[source]
Initializes the HECClient
- Parameters:
url (str) – The URL of the HEC
access_token (str) – The HEC access token
index (str) – The name of the index
source (str) – The source name
verify (bool) – Verify SSL certificates
timeout (float) – Number of seconds to wait for the server to send data before giving up
- save_aggregate_reports_to_splunk(aggregate_reports: list[dict[str, Any]] | dict[str, Any])[source]
Saves aggregate DMARC reports to Splunk
- Parameters:
aggregate_reports – A list of aggregate report dictionaries to save in Splunk
- save_failure_reports_to_splunk(failure_reports: list[dict[str, Any]] | dict[str, Any])[source]
Saves failure DMARC reports to Splunk
- Parameters:
failure_reports (list) – A list of failure report dictionaries to save in Splunk
- save_forensic_reports_to_splunk(failure_reports: list[dict[str, Any]] | dict[str, Any])
Saves failure DMARC reports to Splunk
- Parameters:
failure_reports (list) – A list of failure report dictionaries to save in Splunk
parsedmarc.types
- parsedmarc.types.ForensicParsedReport
alias of FailureParsedReport
- parsedmarc.types.ForensicReport
alias of FailureReport
- class parsedmarc.types.ParsedEmail
parsedmarc.utils
Utility functions that might be useful for other projects
- exception parsedmarc.utils.DownloadError[source]
Raised when an error occurs when downloading a file
- exception parsedmarc.utils.InvalidIPinfoAPIKey[source]
Raised when the IPinfo API rejects the configured token.
- parsedmarc.utils.configure_ipinfo_api(token: str | None, *, probe: bool = True) None[source]
Configure the IPinfo Lite REST API as the primary source for IP lookups.
When a token is configured, get_ip_address_db_record() hits the API first for every lookup and falls back to the MMDB on network errors. An invalid token raises InvalidIPinfoAPIKey; the CLI catches that and exits fatally.
- Parameters:
token – IPinfo API token. None or empty disables the API.
probe – If True, verify the token by looking up 1.1.1.1. A 401/403 raises InvalidIPinfoAPIKey; other errors are logged and the token is still accepted so per-request fallback can take over.
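The probe rule described above can be sketched as a small decision function. This is an illustration only, not parsedmarc's implementation: `TokenRejected` and `check_probe_status` are hypothetical names, and the sketch assumes the probe result is available as a plain HTTP status code.

```python
import logging

logger = logging.getLogger(__name__)


class TokenRejected(Exception):
    """Hypothetical stand-in for InvalidIPinfoAPIKey."""


def check_probe_status(status_code: int) -> bool:
    """Return True if the probe confirmed the token, False if it was inconclusive.

    A 401/403 is treated as fatal. Any other non-200 status is logged and the
    token is kept, so individual lookups can still fall back later.
    """
    if status_code in (401, 403):
        raise TokenRejected(f"token rejected (HTTP {status_code})")
    if status_code != 200:
        logger.warning("probe returned HTTP %s; keeping token", status_code)
        return False
    return True
```

Only the authentication failure stops startup; a transient outage during the probe does not.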
- parsedmarc.utils.convert_outlook_msg(msg_bytes: bytes) bytes[source]
Uses the msgconvert Perl utility to convert an Outlook MSG file to standard RFC 822 format
- Parameters:
msg_bytes (bytes) – the content of the .msg file
- Returns:
An RFC 822 bytes payload
- parsedmarc.utils.decode_base64(data: str) bytes[source]
Decodes a base64 string, with padding being optional
- Parameters:
data (str) – A base64 encoded string
- Returns:
The decoded bytes
- Return type:
bytes
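Padding-optional decoding can be sketched with the standard library alone. The helper name `decode_base64_lenient` is hypothetical, and the approach (re-adding the missing `=` characters before decoding) is one common technique rather than a statement of how parsedmarc does it.

```python
import base64


def decode_base64_lenient(data: str) -> bytes:
    """Decode base64 text whose trailing '=' padding may be missing."""
    raw = data.encode("ascii")
    # Valid base64 has a length that is a multiple of 4; add the missing '='.
    missing = -len(raw) % 4
    return base64.b64decode(raw + b"=" * missing)
```

Both `"aGVsbG8="` and the unpadded `"aGVsbG8"` decode to `b"hello"` with this helper.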
- parsedmarc.utils.get_base_domain(domain: str) str | None[source]
Gets the base domain name for the given domain
Note
Results are based on a list of public domain suffixes at https://publicsuffix.org/list/public_suffix_list.dat and overrides included in parsedmarc.resources.maps.psl_overrides.txt
- Parameters:
domain (str) – A domain or subdomain
- Returns:
The base domain of the given domain
- Return type:
str
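The idea behind a base-domain lookup can be sketched as "longest matching public suffix plus one label". The function below is a simplified illustration under loud assumptions: `base_domain_sketch` is a hypothetical name, the suffix set is passed in by hand, and the wildcard and exception rules of the real Public Suffix List are not handled.

```python
from __future__ import annotations


def base_domain_sketch(domain: str, suffixes: set[str]) -> str | None:
    """Return the longest known suffix plus one more label, or None."""
    labels = domain.lower().strip(".").split(".")
    # Walk from the longest candidate suffix to the shortest,
    # so that "co.uk" wins over "uk".
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            # i == 0 means the input is itself a public suffix.
            return ".".join(labels[i - 1:]) if i > 0 else None
    return None


# A tiny hand-written suffix set, for illustration only.
SUFFIXES = {"com", "uk", "co.uk"}
print(base_domain_sketch("mail.example.co.uk", SUFFIXES))
```

An overrides file such as the one mentioned in the note would, in this sketch, simply contribute extra entries to the suffix set.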
- parsedmarc.utils.get_filename_safe_string(string: str) str[source]
Converts a string to a string that is safe for a filename
- Parameters:
string (str) – A string to make safe for a filename
- Returns:
A string safe for a filename
- Return type:
str
- parsedmarc.utils.get_ip_address_country(ip_address: str, *, db_path: str | None = None) str | None[source]
Returns the ISO code for the country associated with the given IPv4 or IPv6 address.
- Parameters:
ip_address (str) – The IP address to query for
db_path (str) – Path to a MMDB file from IPinfo, MaxMind, or DBIP
- Returns:
An ISO country code associated with the given IP address
- Return type:
str
- parsedmarc.utils.get_ip_address_db_record(ip_address: str, *, db_path: str | None = None) _IPDatabaseRecord[source]
Look up an IP and return country + ASN fields.
If the IPinfo Lite API is configured via configure_ipinfo_api(), the API is queried first; any non-fatal failure (rate limit, quota, network) falls through to the MMDB. An invalid API token raises InvalidIPinfoAPIKey and is not caught here.
IPinfo Lite carries country_code, as_name, and as_domain on every record. MaxMind/DBIP country-only databases carry only country, so as_name/as_domain come back None for those users.
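The API-first, MMDB-second order can be sketched as follows. Everything here is hypothetical scaffolding (`lookup_with_fallback`, `FatalTokenError`, and the two lookup callables are not parsedmarc names); the sketch only shows the control flow: fatal token errors propagate, other API failures fall through to the local database.

```python
from typing import Callable, Optional


class FatalTokenError(Exception):
    """Hypothetical stand-in for InvalidIPinfoAPIKey."""


def lookup_with_fallback(
    ip_address: str,
    api_lookup: Optional[Callable[[str], dict]],
    mmdb_lookup: Callable[[str], dict],
) -> dict:
    """Try the API first (if configured), then the local database."""
    if api_lookup is not None:
        try:
            return api_lookup(ip_address)
        except FatalTokenError:
            raise  # an invalid token is never swallowed
        except Exception:
            pass  # rate limit, quota, network error: fall through
    return mmdb_lookup(ip_address)


def failing_api(ip_address: str) -> dict:
    raise ConnectionError("simulated network failure")


def country_only_db(ip_address: str) -> dict:
    # A country-only database has no ASN fields to offer.
    return {"country": "AU", "as_name": None, "as_domain": None}


record = lookup_with_fallback("1.1.1.1", failing_api, country_only_db)
```

With a country-only database as the fallback, the ASN fields in the resulting record stay `None`, matching the behavior described above.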
- parsedmarc.utils.get_ip_address_info(ip_address, *, ip_db_path: str | None = None, reverse_dns_map_path: str | None = None, always_use_local_files: bool = False, reverse_dns_map_url: str | None = None, cache: ExpiringDict | None = None, reverse_dns_map: dict[str, ReverseDNSService] | None = None, offline: bool = False, nameservers: list[str] | None = None, timeout: float = 2.0, retries: int = 0) IPAddressInfo[source]
Returns reverse DNS and country information for the given IP address
- Parameters:
ip_address (str) – The IP address to check
ip_db_path (str) – Path to a MMDB file from MaxMind or DBIP
reverse_dns_map_path (str) – Path to a reverse DNS map file
reverse_dns_map_url (str) – URL to the reverse DNS map file
always_use_local_files (bool) – Do not download files
cache (ExpiringDict) – Cache storage
reverse_dns_map (dict) – A reverse DNS map
offline (bool) – Do not make online queries for geolocation or DNS
nameservers (list) – A list of one or more nameservers to use (Cloudflare’s public DNS resolvers by default)
timeout (float) – Sets the DNS timeout in seconds
retries (int) – Number of times to retry on timeout or other transient errors
- Returns:
ip_address, reverse_dns, country
- Return type:
dict
- parsedmarc.utils.get_reverse_dns(ip_address, *, cache: ExpiringDict | None = None, nameservers: list[str] | None = None, timeout: float = 2.0, retries: int = 0) str | None[source]
Resolves an IP address to a hostname using a reverse DNS query
- Parameters:
ip_address (str) – The IP address to resolve
cache (ExpiringDict) – Cache storage
nameservers (list) – A list of one or more nameservers to use (Cloudflare’s public DNS resolvers by default)
timeout (float) – Sets the DNS query timeout in seconds
retries (int) – Number of times to retry on timeout or other transient errors
- Returns:
The reverse DNS hostname (if any)
- Return type:
str
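A reverse DNS query is a PTR lookup on a name derived from the address. The sketch below shows only that derivation, using the standard library's `ipaddress` module; `reverse_pointer_name` is a hypothetical helper, and no DNS query is made.

```python
import ipaddress


def reverse_pointer_name(ip_address: str) -> str:
    """Return the DNS name that a PTR query for this address would use."""
    return ipaddress.ip_address(ip_address).reverse_pointer


print(reverse_pointer_name("192.0.2.10"))  # prints "10.2.0.192.in-addr.arpa"
```

IPv6 addresses produce a nibble-reversed name under `ip6.arpa` in the same way.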
- parsedmarc.utils.get_service_from_reverse_dns_base_domain(base_domain, *, always_use_local_file: bool = False, local_file_path: str | None = None, url: str | None = None, offline: bool = False, reverse_dns_map: dict[str, ReverseDNSService] | None = None) ReverseDNSService[source]
Returns the service name of a given base domain name from reverse DNS.
- Parameters:
base_domain (str) – The base domain of the reverse DNS lookup
always_use_local_file (bool) – Always use a local map file
local_file_path (str) – Path to a local map file
url (str) – URL to a reverse DNS map
offline (bool) – Use the built-in copy of the reverse DNS map
reverse_dns_map (dict) – A reverse DNS map
- Returns:
A dictionary containing name and type. If the service is unknown, the name will be the supplied reverse_dns_base_domain and the type will be None
- Return type:
dict
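The documented fallback for unknown services can be sketched as a dictionary lookup with a default. `service_for` and the sample map are hypothetical; only the shape of the result (a name and a type, with the type set to None for unknown domains) comes from the description above.

```python
from __future__ import annotations


def service_for(base_domain: str, reverse_dns_map: dict[str, dict]) -> dict:
    """Look up a base domain, falling back to the domain itself as the name."""
    entry = reverse_dns_map.get(base_domain)
    if entry is None:
        # Unknown service: echo the base domain back and leave the type unset.
        return {"name": base_domain, "type": None}
    return entry


# Illustrative map contents; the real map is loaded from a CSV.
sample_map = {"example.net": {"name": "Example Mail", "type": "Email Provider"}}
```

Callers can then treat a None type as "no known service" without a separate existence check.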
- parsedmarc.utils.human_timestamp_to_datetime(human_timestamp: str, *, to_utc: bool = False) datetime[source]
Converts a human-readable timestamp into a Python datetime object
- Parameters:
human_timestamp (str) – A timestamp string
to_utc (bool) – Convert the timestamp to UTC
- Returns:
The converted timestamp
- Return type:
datetime
- parsedmarc.utils.human_timestamp_to_unix_timestamp(human_timestamp: str) int[source]
Converts a human-readable timestamp into a UNIX timestamp
- Parameters:
human_timestamp (str) – A timestamp in YYYY-MM-DD HH:MM:SS format
- Returns:
The converted timestamp
- Return type:
int
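For the YYYY-MM-DD HH:MM:SS format named above, the conversion can be sketched with the standard library. `to_unix_timestamp_utc` is a hypothetical helper, and it assumes the input is in UTC so the result is deterministic; parsedmarc's own handling of time zones may differ.

```python
from datetime import datetime, timezone


def to_unix_timestamp_utc(human_timestamp: str) -> int:
    """Parse 'YYYY-MM-DD HH:MM:SS' as UTC and return whole seconds since the epoch."""
    parsed = datetime.strptime(human_timestamp, "%Y-%m-%d %H:%M:%S")
    # Assumption: the timestamp is UTC. A naive datetime would otherwise be
    # interpreted in the local time zone by .timestamp().
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


print(to_unix_timestamp_utc("1970-01-02 00:00:00"))  # prints 86400
```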
- parsedmarc.utils.is_mbox(path: str) bool[source]
Checks if the file at the given path is an MBOX mailbox file
- Parameters:
path – Path to the file to check
- Returns:
A flag that indicates if the file is an MBOX mailbox file
- Return type:
bool
- parsedmarc.utils.is_outlook_msg(content) bool[source]
Checks if the given content is an Outlook MSG (OLE) file
- Parameters:
content – Content to check
- Returns:
A flag that indicates if the file is an Outlook MSG file
- Return type:
bool
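Outlook .msg files are OLE compound files, which begin with a fixed 8-byte signature. The sketch below checks for that signature; `has_ole_signature` is a hypothetical helper, and whether parsedmarc performs exactly this check is an assumption. The signature alone does not prove the file is an Outlook message, since other OLE formats share it.

```python
# The OLE compound file signature: D0 CF 11 E0 A1 B1 1A E1.
OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def has_ole_signature(content: bytes) -> bool:
    """Return True if the content starts with the OLE compound file signature."""
    return content[:8] == OLE_MAGIC
```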
- parsedmarc.utils.load_ip_db(*, always_use_local_file: bool = False, local_file_path: str | None = None, url: str | None = None, offline: bool = False) None[source]
Downloads the IP-to-country MMDB database from a URL and caches it locally. Falls back to the bundled copy on failure or when offline.
- Parameters:
always_use_local_file – Always use a local/bundled database file
local_file_path – Path to a local MMDB file
url – URL to the MMDB database file
offline – Do not make online requests
- parsedmarc.utils.load_psl_overrides(*, always_use_local_file: bool = False, local_file_path: str | None = None, url: str | None = None, offline: bool = False) list[str][source]
Loads the PSL overrides list from a URL or local file.
Clears and repopulates the module-level psl_overrides list in place, then returns it. The URL is tried first; on failure (or when offline/always_use_local_file is set) the local path is used, defaulting to the bundled psl_overrides.txt.
- Parameters:
always_use_local_file (bool) – Always use a local overrides file
local_file_path (str) – Path to a local overrides file
url (str) – URL to a PSL overrides file
offline (bool) – Use the built-in copy of the overrides
- Returns:
the module-level psl_overrides list
- Return type:
list[str]
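Repopulating a list in place, rather than rebinding the name, matters because other code may already hold a reference to it. A minimal sketch of that pattern, with hypothetical names (`overrides`, `reload_overrides`) and an assumed one-entry-per-line input:

```python
# A module-level list that other code may have imported by reference.
overrides = ["stale.example"]


def reload_overrides(lines):
    """Clear and refill the shared list in place, then return it."""
    overrides.clear()
    # Assumption: one entry per line; blank lines are skipped.
    overrides.extend(line.strip() for line in lines if line.strip())
    return overrides


alias = overrides  # simulates a reference taken before the reload
reloaded = reload_overrides(["a.example\n", "\n", "b.example\n"])
```

After the call, `alias` and `reloaded` are the same list object and both see the new entries, which would not hold if the function assigned a fresh list.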
- parsedmarc.utils.load_reverse_dns_map(reverse_dns_map: dict[str, ReverseDNSService], *, always_use_local_file: bool = False, local_file_path: str | None = None, url: str | None = None, offline: bool = False, psl_overrides_path: str | None = None, psl_overrides_url: str | None = None) None[source]
Loads the reverse DNS map from a URL or local file.
Clears and repopulates the given map dict in place. If the map is fetched from a URL, that is tried first; on failure (or if offline/local mode is selected) the bundled CSV is used as a fallback.
psl_overrides.txt is reloaded at the same time using the same offline/always_use_local_file flags (with separate path/URL kwargs), so map entries that depend on a recent overrides entry fold correctly.
- Parameters:
reverse_dns_map (dict) – The map dict to populate (modified in place)
always_use_local_file (bool) – Always use a local map file
local_file_path (str) – Path to a local map file
url (str) – URL to a reverse DNS map
offline (bool) – Use the built-in copy of the reverse DNS map
psl_overrides_path (str) – Path to a local PSL overrides file
psl_overrides_url (str) – URL to a PSL overrides file
- parsedmarc.utils.parse_email(data: bytes | str, *, strip_attachment_payloads: bool = False) dict[source]
A simplified email parser
- Parameters:
data – The RFC 822 message string, or MSG binary
strip_attachment_payloads (bool) – Remove attachment payloads
- Returns:
Parsed email data
- Return type:
dict
- parsedmarc.utils.query_dns(domain: str, record_type: str, *, cache: ExpiringDict | None = None, nameservers: list[str] | None = None, timeout: float = 2.0, retries: int = 0, _attempt: int = 0) list[str][source]
Queries DNS
- Parameters:
domain (str) – The domain or subdomain to query about
record_type (str) – The record type to query for
cache (ExpiringDict) – Cache storage
nameservers (list) – A list of one or more nameservers to use (Cloudflare’s public DNS resolvers by default). Pass parsedmarc.constants.RECOMMENDED_DNS_NAMESERVERS for a cross-provider mix that fails over when one provider’s path is slow or broken.
timeout (float) – Overall DNS lifetime budget in seconds per configured nameserver. Per-query UDP attempts are capped at min(1.0, timeout) so dnspython retries within the lifetime on transient UDP packet loss (mirroring dig’s default +tries=3 behavior); with multiple nameservers configured this same cap also makes a slow or broken nameserver fall through to the next quickly.
retries (int) – Number of times to retry the whole query after a timeout or other transient error (LifetimeTimeout, NoNameservers, OSError). Failover between configured nameservers happens within each attempt.
- Returns:
A list of answers
- Return type:
list
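The interaction between timeout and retries can be sketched without any DNS library. The code below is an illustration under stated assumptions, not parsedmarc's implementation: `query_with_retries` and the `resolve` callable are hypothetical, and `OSError` stands in for the transient errors listed above.

```python
from typing import Callable, List


def query_with_retries(
    resolve: Callable[[float], List[str]],
    *,
    timeout: float = 2.0,
    retries: int = 0,
) -> List[str]:
    """Call resolve(per_query_timeout), retrying the whole query on transient errors."""
    # Cap each individual attempt so several fit inside the overall lifetime.
    per_query_timeout = min(1.0, timeout)
    attempt = 0
    while True:
        try:
            return resolve(per_query_timeout)
        except OSError:  # stand-in for LifetimeTimeout / NoNameservers / OSError
            if attempt >= retries:
                raise
            attempt += 1


calls = []


def flaky_resolver(per_query_timeout: float) -> List[str]:
    # Fails twice, then succeeds, to exercise the retry loop.
    calls.append(per_query_timeout)
    if len(calls) < 3:
        raise OSError("simulated transient failure")
    return ["v=DMARC1; p=none"]


answers = query_with_retries(flaky_resolver, timeout=2.0, retries=2)
```

With `retries=0` the first transient error propagates to the caller, which matches a default of no whole-query retries.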