diff --git a/_modules/index.html b/_modules/index.html
index 11ac7a0..cbb2225 100644
--- a/_modules/index.html
+++ b/_modules/index.html
@@ -5,14 +5,14 @@
-Extracts report from a file at the given file_path
Parses a DMARC aggregate or forensic file at the given path, a file-like object, or bytes
input (str | bytes | BinaryIO) – A path to a file, a file like object, or bytes
input (str | os.PathLike | bytes | BinaryIO) – A path to a file, +a file-like object, or bytes
nameservers (list) – A list of one or more nameservers to use (Cloudflare’s public DNS resolvers by default)
dns_timeout (float) – Sets the DNS timeout in seconds
sends the results to a callback function
strip_attachment_payloads (bool) – Replace attachment payloads in forensic report samples with None
batch_size (int) – Number of messages to read and process before saving
since – Search for messages since a certain time
normalize_timespan_threshold_hours (float) – Normalize timespans beyond this
Sets the OpenSearch hosts to use
password (str) – The password to use for authentication
api_key (str) – The Base64 encoded API key to use for authentication
timeout (float) – Timeout in seconds
auth_type (str) – OpenSearch auth mode: basic (default) or awssigv4
aws_region (str) – AWS region for SigV4 auth (required for awssigv4)
aws_service (str) – AWS service for SigV4 signing (default: es)
dns_timeout - float: DNS timeout period
debug - bool: Print debugging messages
silent - bool: Only print errors (Default: True)
fail_on_output_error - bool: Exit with a non-zero status code if
+any configured output destination fails while saving/publishing
+reports (Default: False)
log_file - str: Write log messages to a file at this path
n_procs - int: Number of processes to run in parallel when
parsing in CLI mode (Default: 1)
msgraph
auth_method - str: Authentication method, valid types are
-UsernamePassword, DeviceCode, or ClientSecret
+UsernamePassword, DeviceCode, ClientSecret, or Certificate
(Default: UsernamePassword).
user - str: The M365 user, required when the auth method is
UsernamePassword
client_id - str: The app registration’s client ID
client_secret - str: The app registration’s secret
certificate_path - str: Path to a PEM or PKCS12 certificate
+including the private key. Required when the auth method is
+Certificate
certificate_password - str: Optional password for the
+certificate file when using Certificate auth
tenant_id - str: The Azure AD tenant ID. This is required
for all auth methods except UsernamePassword.
mailbox - str: The mailbox name. This defaults to the
@@ -338,6 +347,8 @@ group and use that as the group id.
The same application permission and mailbox scoping guidance
+applies to the Certificate auth method.
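
The msgraph requirements listed above can be checked before a run. The following is a minimal sketch, not parsedmarc's own validation: the helper name `missing_msgraph_keys`, the sample IDs, and the certificate path are hypothetical, and only the rules stated in the option list are encoded (`user` for UsernamePassword, `tenant_id` for every other method, `certificate_path` for Certificate).

```python
import configparser

# Hypothetical sample values; replace them with your own app registration details.
SAMPLE = """
[msgraph]
auth_method = Certificate
client_id = 00000000-0000-0000-0000-000000000000
tenant_id = 11111111-1111-1111-1111-111111111111
certificate_path = /etc/parsedmarc/msgraph.pem
mailbox = dmarc@example.com
"""


def missing_msgraph_keys(section):
    """Return the required keys that are absent or empty in the section."""
    method = section.get("auth_method", fallback="UsernamePassword")
    required = []
    if method == "UsernamePassword":
        required.append("user")
    else:
        # tenant_id is required for all auth methods except UsernamePassword.
        required.append("tenant_id")
    if method == "Certificate":
        required.append("certificate_path")
    return [key for key in required if not section.get(key)]


config = configparser.ConfigParser()
config.read_string(SAMPLE)
print(missing_msgraph_keys(config["msgraph"]))  # prints []
```

An empty list means the section satisfies the documented requirements; `certificate_password` is left out of the check because it is optional.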
127.0
user - str: Basic auth username
password - str: Basic auth password
api_key - str: API key
+auth_type - str: Authentication type: basic (default) or awssigv4 (the key authentication_type is accepted as an alias for this option)
+aws_region - str: AWS region for SigV4 authentication
+(required when auth_type = awssigv4)
+aws_service - str: AWS service for SigV4 signing (Default: es)
ssl - bool: Use an encrypted SSL/TLS connection
(Default: True)
timeout - float: Timeout in seconds (Default: 60)
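
As an illustration of how the SigV4 options above fit together, here is a small sketch that resolves the auth settings from a config section. It assumes the section is named `opensearch` and uses a made-up host; `resolve_opensearch_auth` is a hypothetical helper, not part of parsedmarc. It honors the `authentication_type` alias, the `es` default for `aws_service`, and the requirement that `aws_region` be set for `awssigv4`.

```python
import configparser

# Hypothetical sample; the section name and host value are assumptions.
SAMPLE = """
[opensearch]
hosts = https://search.example.com:443
authentication_type = awssigv4
aws_region = us-east-1
"""


def resolve_opensearch_auth(section):
    """Return the effective auth settings, applying the documented defaults."""
    auth_type = (
        section.get("auth_type")
        or section.get("authentication_type")  # accepted alias
        or "basic"
    )
    if auth_type not in ("basic", "awssigv4"):
        raise ValueError(f"unsupported auth_type: {auth_type}")
    settings = {"auth_type": auth_type}
    if auth_type == "awssigv4":
        region = section.get("aws_region")
        if not region:
            raise ValueError("aws_region is required when auth_type = awssigv4")
        settings["aws_region"] = region
        settings["aws_service"] = section.get("aws_service", fallback="es")
    return settings


config = configparser.ConfigParser()
config.read_string(SAMPLE)
print(resolve_opensearch_auth(config["opensearch"]))
```

With the sample above, the result carries `awssigv4`, the configured region, and the default `es` service name.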
@@ -506,11 +521,21 @@ so use %%
credentials_file - str: Path to file containing the
credentials, None to disable (Default: None)
token_file - str: Path to save the token file
-(Default: .token)
+(Default: .token)
+auth_mode - str: Authentication mode, installed_app (default)
+or service_account
+service_account_user - str: Delegated mailbox user for Gmail
+service account auth (required for domain-wide delegation). Also
+accepted as delegated_user for backward compatibility.
Note
credentials_file and token_file can be obtained by following the quickstart. Please change the scope to https://www.googleapis.com/auth/gmail.modify.
+
+Note
+When auth_mode = service_account, credentials_file must point to a
+Google service account key JSON file, and token_file is not used.
+
include_spam_trash - bool: Include messages in Spam and
Trash when searching reports (Default: False)
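
The Gmail service account options above interact in a few ways: `token_file` is ignored, `credentials_file` must point to a key file, and the delegated user may appear under either of two key names. The sketch below summarizes those rules, assuming the section is named `gmail_api`; the helper `gmail_auth_summary` and the sample paths are hypothetical.

```python
import configparser

# Hypothetical sample; the section name and paths are assumptions.
SAMPLE = """
[gmail_api]
credentials_file = /etc/parsedmarc/service-account.json
auth_mode = service_account
service_account_user = dmarc@example.com
include_spam_trash = False
"""


def gmail_auth_summary(section):
    """Summarize which Gmail settings apply for the selected auth_mode."""
    mode = section.get("auth_mode", fallback="installed_app")
    if mode not in ("installed_app", "service_account"):
        raise ValueError(f"unsupported auth_mode: {mode}")
    summary = {"auth_mode": mode, "credentials_file": section.get("credentials_file")}
    if mode == "service_account":
        if not summary["credentials_file"]:
            raise ValueError("service_account needs a key JSON in credentials_file")
        # delegated_user is accepted for backward compatibility.
        summary["delegated_user"] = (
            section.get("service_account_user") or section.get("delegated_user")
        )
    else:
        # token_file only matters for the installed_app flow.
        summary["token_file"] = section.get("token_file", fallback=".token")
    return summary


config = configparser.ConfigParser()
config.read_string(SAMPLE)
print(gmail_auth_summary(config["gmail_api"]))
```

Note that the summary for `service_account` has no `token_file` entry, which mirrors the note that the token file is not used in that mode.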
@@ -621,6 +646,33 @@ Check current usage (from Management -> Dev Tools -> Console):
Increasing this value increases resource usage.
+
+Performance tuning
+For large mailbox imports or backfills, parsedmarc can consume a noticeable amount
+of memory, especially when it runs on the same host as Elasticsearch or
+OpenSearch. The following settings can reduce peak memory usage and make long
+imports more predictable:
+
+Reduce mailbox.batch_size to smaller values such as 100-500 instead of
+processing a very large message set at once. Smaller batches trade throughput
+for lower peak memory use and less sink pressure.
+Keep n_procs low for mailbox-heavy runs. In practice, 1-2 workers is often
+a safer starting point for large backfills than aggressive parallelism.
+Use mailbox.since to process reports in smaller time windows such as 1d,
+7d, or another interval that fits the backlog. This makes it easier to catch
+up incrementally instead of loading an entire mailbox history in one run.
+Set strip_attachment_payloads = True when forensic reports contain large
+attachments and you do not need to retain the raw payloads in the parsed
+output.
+Prefer running parsedmarc separately from Elasticsearch or OpenSearch, or
+reserve enough RAM for both services if they must share a host.
+For very large imports, prefer incremental supervised runs, such as a
+scheduler or systemd service, over infrequent massive backfills.
+
+These are operational tuning recommendations rather than hard requirements, but
+they are often enough to avoid memory pressure and reduce failures during
+high-volume mailbox processing.
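
The recommendations above can be turned into a quick pre-flight check. This is a rough sketch under stated assumptions: `n_procs` and `strip_attachment_payloads` are placed in a section named `general` (an assumption), `batch_size` and `since` are in `mailbox` as the text indicates, and `tuning_warnings` is a hypothetical helper. The thresholds come from the ranges suggested above and are advisory, not hard limits.

```python
import configparser

# A conservative starting point for a large backfill (illustrative values).
TUNED = """
[general]
n_procs = 1
strip_attachment_payloads = True

[mailbox]
batch_size = 200
since = 7d
"""


def tuning_warnings(config):
    """Return advisory messages for settings outside the suggested ranges."""
    warnings = []
    if config.getint("mailbox", "batch_size", fallback=0) > 500:
        warnings.append("mailbox.batch_size is above 500")
    if config.getint("general", "n_procs", fallback=1) > 2:
        warnings.append("n_procs is above 2")
    if not config.has_option("mailbox", "since"):
        warnings.append("mailbox.since is not set")
    if not config.getboolean("general", "strip_attachment_payloads", fallback=False):
        warnings.append("strip_attachment_payloads is not enabled")
    return warnings


config = configparser.ConfigParser()
config.read_string(TUNED)
print(tuning_warnings(config))  # prints []
```

An empty list means the file already follows the suggested starting points; any messages returned are hints to review, since the right values depend on available RAM and backlog size.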
+
Multi-tenant support
Starting in 8.19.0, ParseDMARC provides multi-tenant support by placing data into separate OpenSearch or Elasticsearch index prefixes. To set this up, create a YAML file in which each key is a tenant name and each value is a list of that tenant's domains, not including subdomains, like this: