mirror of
https://github.com/domainaware/parsedmarc.git
synced 2026-03-26 08:22:45 +00:00
Compare commits
4 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 12c4676b79 | | |
| | cda039ee27 | | |
| | ff0ca6538c | | |
| | 2032438d3b | | |
33
CHANGELOG.md
@@ -1,5 +1,38 @@
# Changelog

## 9.5.1

### Changes

- Correct ISO format for MSGraphConnection timestamps (PR #706)

## 9.5.0

### Added

- Environment variable configuration support: any config option can now be set via `PARSEDMARC_{SECTION}_{KEY}` environment variables (e.g. `PARSEDMARC_IMAP_PASSWORD`, `PARSEDMARC_SPLUNK_HEC_TOKEN`). Environment variables override config file values but are overridden by CLI arguments.
- `PARSEDMARC_CONFIG_FILE` environment variable to specify the config file path without the `-c` flag.
- Env-only mode: parsedmarc can now run without a config file when `PARSEDMARC_*` environment variables are set, enabling fully file-less Docker deployments.
- Explicit read permission check on config file, giving a clear error message when the container UID cannot read the file (e.g. `chmod 600` with a UID mismatch).

## 9.4.0

### Added

- Extracted `load_reverse_dns_map()` utility function in `utils.py` for loading the reverse DNS map independently of individual IP lookups.
- SIGHUP reload now re-downloads/reloads the reverse DNS map, so changes take effect without restarting.
- Add premade OpenSearch index patterns, visualizations, and dashboards

### Changed

- When `index_prefix_domain_map` is configured, SMTP TLS reports for domains not in the map are now silently dropped instead of being output. Unlike DMARC, TLS-RPT has no DNS authorization records, so this filtering prevents processing reports for unrelated domains.
- Bump OpenSearch support to `< 4`

### Fixed

- Fixed `get_index_prefix` using wrong key (`domain` instead of `policy_domain`) for SMTP TLS reports, which prevented domain map matching from working for TLS reports.
- Domain matching in `get_index_prefix` now lowercases the domain for case-insensitive comparison.

## 9.3.1

### Breaking changes

@@ -15,7 +15,7 @@ services:
         condition: service_healthy

   opensearch-dashboards:
-    image: opensearchproject/opensearch-dashboards:2
+    image: opensearchproject/opensearch-dashboards:3
     environment:
       - OPENSEARCH_HOSTS=["https://opensearch:9200"]
     ports:

@@ -531,6 +531,96 @@ PUT _cluster/settings
Increasing this value increases resource usage.
:::

## Environment variable configuration

Any configuration option can be set via environment variables using the
naming convention `PARSEDMARC_{SECTION}_{KEY}` (uppercase). This is
especially useful for Docker deployments where file permissions make it
difficult to use config files for secrets.

**Priority order:** CLI arguments > environment variables > config file > defaults

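The priority order can be pictured as a first-match lookup across four sources. The sketch below is only an illustration under that assumption: `resolve_option` and its dictionary arguments are hypothetical names and are not part of parsedmarc, which merges these sources inside its own CLI code.

```python
from typing import Any, Mapping, Optional


def resolve_option(
    name: str,
    cli: Mapping[str, Any],
    env: Mapping[str, Any],
    file_values: Mapping[str, Any],
    defaults: Mapping[str, Any],
) -> Optional[Any]:
    """Return the first non-None value, checking sources in priority order."""
    # Hypothetical helper: CLI > environment > config file > defaults.
    for source in (cli, env, file_values, defaults):
        value = source.get(name)
        if value is not None:
            return value
    return None
```

With this shape, a value present only in the config file is still used, while the same key set in the environment shadows it.
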
### Examples

```bash
# Set IMAP credentials via env vars
export PARSEDMARC_IMAP_HOST=imap.example.com
export PARSEDMARC_IMAP_USER=dmarc@example.com
export PARSEDMARC_IMAP_PASSWORD=secret

# Elasticsearch
export PARSEDMARC_ELASTICSEARCH_HOSTS=http://localhost:9200
export PARSEDMARC_ELASTICSEARCH_SSL=false

# Splunk HEC (note: section name splunk_hec becomes SPLUNK_HEC)
export PARSEDMARC_SPLUNK_HEC_URL=https://splunk.example.com
export PARSEDMARC_SPLUNK_HEC_TOKEN=my-hec-token
export PARSEDMARC_SPLUNK_HEC_INDEX=email

# General settings
export PARSEDMARC_GENERAL_SAVE_AGGREGATE=true
export PARSEDMARC_GENERAL_DEBUG=true
```

### Specifying the config file via environment variable

```bash
export PARSEDMARC_CONFIG_FILE=/etc/parsedmarc.ini
parsedmarc
```

### Running without a config file (env-only mode)

When no config file is given (neither `-c` flag nor `PARSEDMARC_CONFIG_FILE`),
parsedmarc will still pick up any `PARSEDMARC_*` environment variables. This
enables fully file-less deployments:

```bash
export PARSEDMARC_GENERAL_SAVE_AGGREGATE=true
export PARSEDMARC_GENERAL_OFFLINE=true
export PARSEDMARC_ELASTICSEARCH_HOSTS=http://elasticsearch:9200
parsedmarc /path/to/reports/*
```

### Docker Compose example

```yaml
services:
  parsedmarc:
    image: parsedmarc:latest
    environment:
      PARSEDMARC_IMAP_HOST: imap.example.com
      PARSEDMARC_IMAP_USER: dmarc@example.com
      PARSEDMARC_IMAP_PASSWORD: ${IMAP_PASSWORD}
      PARSEDMARC_MAILBOX_WATCH: "true"
      PARSEDMARC_ELASTICSEARCH_HOSTS: http://elasticsearch:9200
      PARSEDMARC_GENERAL_SAVE_AGGREGATE: "true"
      PARSEDMARC_GENERAL_SAVE_FORENSIC: "true"
```

### Section name mapping

For sections with underscores in the name, the full section name is used:

| Section          | Env var prefix                |
|------------------|-------------------------------|
| `general`        | `PARSEDMARC_GENERAL_`         |
| `mailbox`        | `PARSEDMARC_MAILBOX_`         |
| `imap`           | `PARSEDMARC_IMAP_`            |
| `msgraph`        | `PARSEDMARC_MSGRAPH_`         |
| `elasticsearch`  | `PARSEDMARC_ELASTICSEARCH_`   |
| `opensearch`     | `PARSEDMARC_OPENSEARCH_`      |
| `splunk_hec`     | `PARSEDMARC_SPLUNK_HEC_`      |
| `kafka`          | `PARSEDMARC_KAFKA_`           |
| `smtp`           | `PARSEDMARC_SMTP_`            |
| `s3`             | `PARSEDMARC_S3_`              |
| `syslog`         | `PARSEDMARC_SYSLOG_`          |
| `gmail_api`      | `PARSEDMARC_GMAIL_API_`       |
| `maildir`        | `PARSEDMARC_MAILDIR_`         |
| `log_analytics`  | `PARSEDMARC_LOG_ANALYTICS_`   |
| `gelf`           | `PARSEDMARC_GELF_`            |
| `webhook`        | `PARSEDMARC_WEBHOOK_`         |

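The naming convention in the table can be reproduced mechanically. As a small sketch, the hypothetical helper `env_var_name` below (not a parsedmarc function) builds the variable name for a given INI section and key by joining them with the `PARSEDMARC_` prefix and uppercasing the result.

```python
def env_var_name(section: str, key: str) -> str:
    """Build the environment variable name for an INI section and key."""
    # Hypothetical helper for illustration; underscores inside the section
    # name (e.g. splunk_hec) are kept as-is.
    return f"PARSEDMARC_{section}_{key}".upper()


# Example: print the variable names for a few section/key pairs.
for section, key in [("imap", "password"), ("splunk_hec", "token")]:
    print(env_var_name(section, key))
```
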
## Performance tuning

For large mailbox imports or backfills, parsedmarc can consume a noticeable amount

28
opensearch/opensearch_dashboards.ndjson
Normal file
File diff suppressed because one or more lines are too long
@@ -1957,7 +1957,7 @@ def get_dmarc_reports_from_mailbox(
         elif isinstance(connection, MSGraphConnection):
             since = (
                 datetime.now(timezone.utc) - timedelta(minutes=_since)
-            ).isoformat() + "Z"
+            ).isoformat()
             current_time = datetime.now(timezone.utc).isoformat() + "Z"
         elif isinstance(connection, GmailConnection):
             since = (datetime.now(timezone.utc) - timedelta(minutes=_since)).strftime(

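Why the trailing `"Z"` was a problem can be seen with the standard library alone. A timezone-aware UTC `datetime` already serializes with a `+00:00` offset, so appending `"Z"` yields a string with two timezone designators. The snippet below is a standalone illustration using a fixed timestamp, not code from parsedmarc.

```python
from datetime import datetime, timezone

# A fixed, timezone-aware UTC timestamp so the output is reproducible.
now = datetime(2026, 3, 26, 8, 22, 45, tzinfo=timezone.utc)

old_style = now.isoformat() + "Z"  # offset and "Z" together
new_style = now.isoformat()        # offset only

print(old_style)  # 2026-03-26T08:22:45+00:00Z
print(new_style)  # 2026-03-26T08:22:45+00:00
```
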
@@ -19,6 +19,7 @@ import yaml
|
||||
from tqdm import tqdm
|
||||
|
||||
from parsedmarc import (
|
||||
REVERSE_DNS_MAP,
|
||||
SEEN_AGGREGATE_REPORT_IDS,
|
||||
InvalidDMARCReport,
|
||||
ParserError,
|
||||
@@ -48,7 +49,12 @@ from parsedmarc.mail import (
|
||||
)
|
||||
from parsedmarc.mail.graph import AuthMethod
|
||||
from parsedmarc.types import ParsingResults
|
||||
from parsedmarc.utils import get_base_domain, get_reverse_dns, is_mbox
|
||||
from parsedmarc.utils import (
|
||||
get_base_domain,
|
||||
get_reverse_dns,
|
||||
is_mbox,
|
||||
load_reverse_dns_map,
|
||||
)
|
||||
|
||||
# Increase the max header limit for very large emails. `_MAXHEADERS` is a
|
||||
# private stdlib attribute and may not exist in type stubs.
|
||||
@@ -69,6 +75,79 @@ def _str_to_list(s):
    return list(map(lambda i: i.lstrip(), _list))


# All known INI config section names, used for env var resolution.
_KNOWN_SECTIONS = frozenset(
    {
        "general",
        "mailbox",
        "imap",
        "msgraph",
        "elasticsearch",
        "opensearch",
        "splunk_hec",
        "kafka",
        "smtp",
        "s3",
        "syslog",
        "gmail_api",
        "maildir",
        "log_analytics",
        "gelf",
        "webhook",
    }
)


def _resolve_section_key(suffix: str) -> tuple:
    """Resolve an env var suffix like ``IMAP_PASSWORD`` to ``('imap', 'password')``.

    Uses longest-prefix matching against known section names so that
    multi-word sections like ``splunk_hec`` are handled correctly.

    Returns ``(None, None)`` when no known section matches.
    """
    suffix_lower = suffix.lower()

    best_section = None
    best_key = None
    for section in _KNOWN_SECTIONS:
        section_prefix = section + "_"
        if suffix_lower.startswith(section_prefix):
            key = suffix_lower[len(section_prefix) :]
            if key and (best_section is None or len(section) > len(best_section)):
                best_section = section
                best_key = key

    return best_section, best_key


def _apply_env_overrides(config: ConfigParser) -> None:
    """Inject ``PARSEDMARC_*`` environment variables into *config*.

    Environment variables matching ``PARSEDMARC_{SECTION}_{KEY}`` override
    (or create) the corresponding config-file values.  Sections are created
    automatically when they do not yet exist.
    """
    prefix = "PARSEDMARC_"

    for env_key, env_value in os.environ.items():
        if not env_key.startswith(prefix) or env_key == "PARSEDMARC_CONFIG_FILE":
            continue

        suffix = env_key[len(prefix) :]
        section, key = _resolve_section_key(suffix)

        if section is None:
            logger.debug("Ignoring unrecognized env var: %s", env_key)
            continue

        if not config.has_section(section):
            config.add_section(section)

        config.set(section, key, env_value)
        logger.debug("Config override from env: [%s] %s", section, key)


def _configure_logging(log_level, log_file=None):
    """
    Configure logging for the current process.
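The two helpers added in this hunk can be exercised outside the CLI with a reduced, self-contained sketch. The version below takes the environment as an explicit mapping so it is easy to test, uses a shortened section list, and passes `interpolation=None` to `ConfigParser`. These are all assumptions of this example, not what the diff does: with the default basic interpolation, `ConfigParser.set()` rejects values that contain a bare `%`, which may matter for passwords supplied via environment variables.

```python
from configparser import ConfigParser
from typing import Mapping, Optional, Tuple

# Shortened section list for the example; the diff defines the full set.
KNOWN_SECTIONS = frozenset({"general", "imap", "splunk_hec", "log_analytics"})
PREFIX = "PARSEDMARC_"


def resolve_section_key(suffix: str) -> Tuple[Optional[str], Optional[str]]:
    """Map e.g. 'SPLUNK_HEC_TOKEN' to ('splunk_hec', 'token') by longest prefix."""
    suffix_lower = suffix.lower()
    best: Tuple[Optional[str], Optional[str]] = (None, None)
    for section in KNOWN_SECTIONS:
        section_prefix = section + "_"
        if suffix_lower.startswith(section_prefix):
            key = suffix_lower[len(section_prefix):]
            if key and (best[0] is None or len(section) > len(best[0])):
                best = (section, key)
    return best


def apply_env_overrides(config: ConfigParser, environ: Mapping[str, str]) -> None:
    """Copy matching PARSEDMARC_* variables from *environ* into *config*."""
    for name, value in environ.items():
        if not name.startswith(PREFIX) or name == "PARSEDMARC_CONFIG_FILE":
            continue
        section, key = resolve_section_key(name[len(PREFIX):])
        if section is None:
            continue  # unrecognized section: ignore
        if not config.has_section(section):
            config.add_section(section)
        config.set(section, key, value)


# Usage: start from a file-style config and overlay a fake environment.
config = ConfigParser(interpolation=None)  # assumption: avoid '%' interpolation
config.add_section("imap")
config.set("imap", "host", "original.example.com")
apply_env_overrides(
    config,
    {
        "PARSEDMARC_IMAP_HOST": "new.example.com",
        "PARSEDMARC_IMAP_PASSWORD": "p%ss",
        "PARSEDMARC_SPLUNK_HEC_TOKEN": "my-hec-token",
        "PARSEDMARC_CONFIG_FILE": "/etc/parsedmarc.ini",
        "PARSEDMARC_UNKNOWN_FOO": "ignored",
    },
)
```
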
@@ -172,12 +251,39 @@ class ConfigurationError(Exception):
|
||||
pass
|
||||
|
||||
|
||||
def _parse_config_file(config_file, opts):
|
||||
"""Parse a config file and update opts in place.
|
||||
def _load_config(config_file: str | None = None) -> ConfigParser:
|
||||
"""Load configuration from an INI file and/or environment variables.
|
||||
|
||||
Args:
|
||||
config_file: Path to the .ini config file
|
||||
opts: Namespace object to update with parsed values
|
||||
config_file: Optional path to an .ini config file.
|
||||
|
||||
Returns:
|
||||
A ``ConfigParser`` populated from the file (if given) and from any
|
||||
``PARSEDMARC_*`` environment variables.
|
||||
|
||||
Raises:
|
||||
ConfigurationError: If *config_file* is given but does not exist.
|
||||
"""
|
||||
config = ConfigParser()
|
||||
if config_file is not None:
|
||||
abs_path = os.path.abspath(config_file)
|
||||
if not os.path.exists(abs_path):
|
||||
raise ConfigurationError("A file does not exist at {0}".format(abs_path))
|
||||
if not os.access(abs_path, os.R_OK):
|
||||
raise ConfigurationError(
|
||||
"Unable to read {0} — check file permissions".format(abs_path)
|
||||
)
|
||||
config.read(config_file)
|
||||
_apply_env_overrides(config)
|
||||
return config
|
||||
|
||||
|
||||
def _parse_config(config: ConfigParser, opts):
|
||||
"""Apply a loaded ``ConfigParser`` to *opts* in place.
|
||||
|
||||
Args:
|
||||
config: A ``ConfigParser`` (from ``_load_config``).
|
||||
opts: Namespace object to update with parsed values.
|
||||
|
||||
Returns:
|
||||
index_prefix_domain_map or None
|
||||
@@ -185,13 +291,8 @@ def _parse_config_file(config_file, opts):
|
||||
Raises:
|
||||
ConfigurationError: If required settings are missing or invalid.
|
||||
"""
|
||||
abs_path = os.path.abspath(config_file)
|
||||
if not os.path.exists(abs_path):
|
||||
raise ConfigurationError("A file does not exist at {0}".format(abs_path))
|
||||
opts.silent = True
|
||||
config = ConfigParser()
|
||||
index_prefix_domain_map = None
|
||||
config.read(config_file)
|
||||
if "general" in config.sections():
|
||||
general_config = config["general"]
|
||||
if "silent" in general_config:
|
||||
@@ -1068,20 +1169,22 @@ def _main():
|
||||
elif "reported_domain" in report:
|
||||
domain = report["reported_domain"]
|
||||
elif "policies" in report:
|
||||
domain = report["policies"][0]["domain"]
|
||||
domain = report["policies"][0]["policy_domain"]
|
||||
if domain:
|
||||
domain = get_base_domain(domain)
|
||||
for prefix in index_prefix_domain_map:
|
||||
if domain in index_prefix_domain_map[prefix]:
|
||||
prefix = (
|
||||
prefix.lower()
|
||||
.strip()
|
||||
.strip("_")
|
||||
.replace(" ", "_")
|
||||
.replace("-", "_")
|
||||
)
|
||||
prefix = f"{prefix}_"
|
||||
return prefix
|
||||
if domain:
|
||||
domain = domain.lower()
|
||||
for prefix in index_prefix_domain_map:
|
||||
if domain in index_prefix_domain_map[prefix]:
|
||||
prefix = (
|
||||
prefix.lower()
|
||||
.strip()
|
||||
.strip("_")
|
||||
.replace(" ", "_")
|
||||
.replace("-", "_")
|
||||
)
|
||||
prefix = f"{prefix}_"
|
||||
return prefix
|
||||
return None
|
||||
|
||||
def process_reports(reports_):
|
||||
@@ -1092,6 +1195,22 @@ def _main():
|
||||
logger.error(message)
|
||||
output_errors.append(message)
|
||||
|
||||
if index_prefix_domain_map is not None:
|
||||
filtered_tls = []
|
||||
for report in reports_.get("smtp_tls_reports", []):
|
||||
if get_index_prefix(report) is not None:
|
||||
filtered_tls.append(report)
|
||||
else:
|
||||
domain = "unknown"
|
||||
if "policies" in report and report["policies"]:
|
||||
domain = report["policies"][0].get("policy_domain", "unknown")
|
||||
logger.debug(
|
||||
"Ignoring SMTP TLS report for domain not in "
|
||||
"index_prefix_domain_map: %s",
|
||||
domain,
|
||||
)
|
||||
reports_["smtp_tls_reports"] = filtered_tls
|
||||
|
||||
indent_value = 2 if opts.prettify_json else None
|
||||
output_str = "{0}\n".format(
|
||||
json.dumps(reports_, ensure_ascii=False, indent=indent_value)
|
||||
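The filtering added in this hunk can be sketched in isolation. The functions below are hypothetical stand-ins (`index_prefix_for`, `filter_tls_reports`), and they deliberately skip one step that the real `get_index_prefix` performs, namely reducing the domain to its base domain with `get_base_domain`. Only the `policy_domain` lookup, the lowercasing, and the prefix normalization are shown.

```python
from typing import Dict, List, Optional


def index_prefix_for(report: dict, domain_map: Dict[str, List[str]]) -> Optional[str]:
    """Return the normalized index prefix for a TLS report, or None if unmapped."""
    policies = report.get("policies") or []
    if not policies:
        return None
    domain = policies[0].get("policy_domain")
    if not domain:
        return None
    # Simplification: the real code also reduces this to the base domain.
    domain = domain.lower()
    for prefix, domains in domain_map.items():
        if domain in domains:
            cleaned = (
                prefix.lower().strip().strip("_").replace(" ", "_").replace("-", "_")
            )
            return f"{cleaned}_"
    return None


def filter_tls_reports(
    reports: List[dict], domain_map: Dict[str, List[str]]
) -> List[dict]:
    """Keep only reports whose policy domain appears in the domain map."""
    return [r for r in reports if index_prefix_for(r, domain_map) is not None]
```

Reports with an empty `policies` list are treated as unmapped here, which matches the hunk's behavior of falling back to `"unknown"` in the debug message and dropping the report.
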
@@ -1659,9 +1778,16 @@ def _main():
|
||||
|
||||
index_prefix_domain_map = None
|
||||
|
||||
if args.config_file:
|
||||
config_file = args.config_file or os.environ.get("PARSEDMARC_CONFIG_FILE")
|
||||
has_env_config = any(
|
||||
k.startswith("PARSEDMARC_") and k != "PARSEDMARC_CONFIG_FILE"
|
||||
for k in os.environ
|
||||
)
|
||||
|
||||
if config_file or has_env_config:
|
||||
try:
|
||||
index_prefix_domain_map = _parse_config_file(args.config_file, opts)
|
||||
config = _load_config(config_file)
|
||||
index_prefix_domain_map = _parse_config(config, opts)
|
||||
except ConfigurationError as e:
|
||||
logger.critical(str(e))
|
||||
exit(-1)
|
||||
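The decision of whether any configuration should be loaded at all can be isolated into a small function. This is a sketch under the assumption that the environment is passed in as a mapping; `find_config_sources` is a hypothetical name, and the real logic lives inline in `_main()`.

```python
from typing import Mapping, Optional, Tuple


def find_config_sources(
    cli_config_file: Optional[str], environ: Mapping[str, str]
) -> Tuple[Optional[str], bool]:
    """Return (config file path or None, whether PARSEDMARC_* settings exist)."""
    # The -c flag wins over PARSEDMARC_CONFIG_FILE.
    config_file = cli_config_file or environ.get("PARSEDMARC_CONFIG_FILE")
    # PARSEDMARC_CONFIG_FILE itself is a pointer, not a setting.
    has_env_config = any(
        k.startswith("PARSEDMARC_") and k != "PARSEDMARC_CONFIG_FILE"
        for k in environ
    )
    return config_file, has_env_config
```

Config loading proceeds when either element of the returned pair is truthy, which is what makes env-only mode possible.
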
@@ -2078,15 +2204,25 @@ def _main():
|
||||
# Build a fresh opts starting from CLI-only defaults so that
|
||||
# sections removed from the config file actually take effect.
|
||||
new_opts = Namespace(**vars(opts_from_cli))
|
||||
new_index_prefix_domain_map = _parse_config_file(
|
||||
args.config_file, new_opts
|
||||
)
|
||||
new_config = _load_config(config_file)
|
||||
new_index_prefix_domain_map = _parse_config(new_config, new_opts)
|
||||
new_clients = _init_output_clients(new_opts)
|
||||
|
||||
# All steps succeeded — commit the changes atomically.
|
||||
_close_output_clients(clients)
|
||||
clients = new_clients
|
||||
index_prefix_domain_map = new_index_prefix_domain_map
|
||||
|
||||
# Reload the reverse DNS map so changes to the
|
||||
# map path/URL in the config take effect.
|
||||
load_reverse_dns_map(
|
||||
REVERSE_DNS_MAP,
|
||||
always_use_local_file=new_opts.always_use_local_files,
|
||||
local_file_path=new_opts.reverse_dns_map_path,
|
||||
url=new_opts.reverse_dns_map_url,
|
||||
offline=new_opts.offline,
|
||||
)
|
||||
|
||||
for k, v in vars(new_opts).items():
|
||||
setattr(opts, k, v)
|
||||
|
||||
|
||||
@@ -1,3 +1,3 @@
-__version__ = "9.3.1"
+__version__ = "9.5.1"

 USER_AGENT = f"parsedmarc/{__version__}"

@@ -335,6 +335,76 @@ def get_ip_address_country(
|
||||
return country
|
||||
|
||||
|
||||
def load_reverse_dns_map(
|
||||
reverse_dns_map: ReverseDNSMap,
|
||||
*,
|
||||
always_use_local_file: bool = False,
|
||||
local_file_path: Optional[str] = None,
|
||||
url: Optional[str] = None,
|
||||
offline: bool = False,
|
||||
) -> None:
|
||||
"""
|
||||
Loads the reverse DNS map from a URL or local file.
|
||||
|
||||
Clears and repopulates the given map dict in place. If the map is
|
||||
fetched from a URL, that is tried first; on failure (or if offline/local
|
||||
mode is selected) the bundled CSV is used as a fallback.
|
||||
|
||||
Args:
|
||||
reverse_dns_map (dict): The map dict to populate (modified in place)
|
||||
always_use_local_file (bool): Always use a local map file
|
||||
local_file_path (str): Path to a local map file
|
||||
url (str): URL to a reverse DNS map
|
||||
offline (bool): Use the built-in copy of the reverse DNS map
|
||||
"""
|
||||
if url is None:
|
||||
url = (
|
||||
"https://raw.githubusercontent.com/domainaware"
|
||||
"/parsedmarc/master/parsedmarc/"
|
||||
"resources/maps/base_reverse_dns_map.csv"
|
||||
)
|
||||
|
||||
reverse_dns_map.clear()
|
||||
|
||||
def load_csv(_csv_file):
|
||||
reader = csv.DictReader(_csv_file)
|
||||
for row in reader:
|
||||
key = row["base_reverse_dns"].lower().strip()
|
||||
reverse_dns_map[key] = {
|
||||
"name": row["name"].strip(),
|
||||
"type": row["type"].strip(),
|
||||
}
|
||||
|
||||
csv_file = io.StringIO()
|
||||
|
||||
if not (offline or always_use_local_file):
|
||||
try:
|
||||
logger.debug(f"Trying to fetch reverse DNS map from {url}...")
|
||||
headers = {"User-Agent": USER_AGENT}
|
||||
response = requests.get(url, headers=headers)
|
||||
response.raise_for_status()
|
||||
csv_file.write(response.text)
|
||||
csv_file.seek(0)
|
||||
load_csv(csv_file)
|
||||
except requests.exceptions.RequestException as e:
|
||||
logger.warning(f"Failed to fetch reverse DNS map: {e}")
|
||||
except Exception:
|
||||
logger.warning("Not a valid CSV file")
|
||||
csv_file.seek(0)
|
||||
logging.debug("Response body:")
|
||||
logger.debug(csv_file.read())
|
||||
|
||||
if len(reverse_dns_map) == 0:
|
||||
logger.info("Loading included reverse DNS map...")
|
||||
path = str(
|
||||
files(parsedmarc.resources.maps).joinpath("base_reverse_dns_map.csv")
|
||||
)
|
||||
if local_file_path is not None:
|
||||
path = local_file_path
|
||||
with open(path) as csv_file:
|
||||
load_csv(csv_file)
|
||||
|
||||
|
||||
def get_service_from_reverse_dns_base_domain(
|
||||
base_domain,
|
||||
*,
|
||||
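The key property of `load_reverse_dns_map` is that it clears and refills the caller's dict rather than returning a new one, so every holder of the shared map sees the refreshed data. The sketch below shows only that in-place CSV step with an inline CSV string; `load_map_from_csv` is a hypothetical helper, and the URL fetch and bundled-file fallback from the diff are left out.

```python
import csv
import io
from typing import Dict


def load_map_from_csv(target: Dict[str, dict], csv_text: str) -> None:
    """Clear *target* and repopulate it from CSV text, modifying it in place."""
    target.clear()
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row["base_reverse_dns"].lower().strip()
        target[key] = {
            "name": row["name"].strip(),
            "type": row["type"].strip(),
        }


# Usage: a stale entry disappears and the same dict object is reused.
shared_map = {"stale.example.com": {"name": "Stale", "type": "stale"}}
alias = shared_map
load_map_from_csv(
    shared_map,
    "base_reverse_dns,name,type\nExample.COM , Example Mail ,Email Provider\n",
)
```
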
@@ -361,55 +431,21 @@ def get_service_from_reverse_dns_base_domain(
|
||||
"""
|
||||
|
||||
base_domain = base_domain.lower().strip()
|
||||
if url is None:
|
||||
url = (
|
||||
"https://raw.githubusercontent.com/domainaware"
|
||||
"/parsedmarc/master/parsedmarc/"
|
||||
"resources/maps/base_reverse_dns_map.csv"
|
||||
)
|
||||
reverse_dns_map_value: ReverseDNSMap
|
||||
if reverse_dns_map is None:
|
||||
reverse_dns_map_value = {}
|
||||
else:
|
||||
reverse_dns_map_value = reverse_dns_map
|
||||
|
||||
def load_csv(_csv_file):
|
||||
reader = csv.DictReader(_csv_file)
|
||||
for row in reader:
|
||||
key = row["base_reverse_dns"].lower().strip()
|
||||
reverse_dns_map_value[key] = {
|
||||
"name": row["name"],
|
||||
"type": row["type"],
|
||||
}
|
||||
|
||||
csv_file = io.StringIO()
|
||||
|
||||
if not (offline or always_use_local_file) and len(reverse_dns_map_value) == 0:
|
||||
try:
|
||||
logger.debug(f"Trying to fetch reverse DNS map from {url}...")
|
||||
headers = {"User-Agent": USER_AGENT}
|
||||
response = requests.get(url, headers=headers)
|
||||
response.raise_for_status()
|
||||
csv_file.write(response.text)
|
||||
csv_file.seek(0)
|
||||
load_csv(csv_file)
|
||||
except requests.exceptions.RequestException as e:
|
||||
logger.warning(f"Failed to fetch reverse DNS map: {e}")
|
||||
except Exception:
|
||||
logger.warning("Not a valid CSV file")
|
||||
csv_file.seek(0)
|
||||
logging.debug("Response body:")
|
||||
logger.debug(csv_file.read())
|
||||
|
||||
if len(reverse_dns_map_value) == 0:
|
||||
logger.info("Loading included reverse DNS map...")
|
||||
path = str(
|
||||
files(parsedmarc.resources.maps).joinpath("base_reverse_dns_map.csv")
|
||||
load_reverse_dns_map(
|
||||
reverse_dns_map_value,
|
||||
always_use_local_file=always_use_local_file,
|
||||
local_file_path=local_file_path,
|
||||
url=url,
|
||||
offline=offline,
|
||||
)
|
||||
if local_file_path is not None:
|
||||
path = local_file_path
|
||||
with open(path) as csv_file:
|
||||
load_csv(csv_file)
|
||||
|
||||
service: ReverseDNSService
|
||||
try:
|
||||
service = reverse_dns_map_value[base_domain]
|
||||
|
||||
@@ -50,7 +50,7 @@ dependencies = [
     "lxml>=4.4.0",
     "mailsuite>=1.11.2",
     "msgraph-core==0.2.2",
-    "opensearch-py>=2.4.2,<=3.0.0",
+    "opensearch-py>=2.4.2,<=4.0.0",
     "publicsuffixlist>=0.10.0",
     "pygelf>=0.4.2",
     "requests>=2.22.0",

412
tests.py
@@ -4,12 +4,14 @@
|
||||
from __future__ import absolute_import, print_function, unicode_literals
|
||||
|
||||
import io
|
||||
import json
|
||||
import os
|
||||
import signal
|
||||
import sys
|
||||
import tempfile
|
||||
import unittest
|
||||
from base64 import urlsafe_b64encode
|
||||
from configparser import ConfigParser
|
||||
from glob import glob
|
||||
from pathlib import Path
|
||||
from tempfile import NamedTemporaryFile, TemporaryDirectory
|
||||
@@ -1984,7 +1986,8 @@ watch = true
|
||||
"SIGHUP not available on this platform",
|
||||
)
|
||||
@patch("parsedmarc.cli._init_output_clients")
|
||||
@patch("parsedmarc.cli._parse_config_file")
|
||||
@patch("parsedmarc.cli._parse_config")
|
||||
@patch("parsedmarc.cli._load_config")
|
||||
@patch("parsedmarc.cli.get_dmarc_reports_from_mailbox")
|
||||
@patch("parsedmarc.cli.watch_inbox")
|
||||
@patch("parsedmarc.cli.IMAPConnection")
|
||||
@@ -1993,6 +1996,7 @@ watch = true
|
||||
mock_imap,
|
||||
mock_watch,
|
||||
mock_get_reports,
|
||||
mock_load_config,
|
||||
mock_parse_config,
|
||||
mock_init_clients,
|
||||
):
|
||||
@@ -2006,7 +2010,9 @@ watch = true
|
||||
"smtp_tls_reports": [],
|
||||
}
|
||||
|
||||
def parse_side_effect(config_file, opts):
|
||||
mock_load_config.return_value = ConfigParser()
|
||||
|
||||
def parse_side_effect(config, opts):
|
||||
opts.imap_host = "imap.example.com"
|
||||
opts.imap_user = "user"
|
||||
opts.imap_password = "pass"
|
||||
@@ -2045,7 +2051,7 @@ watch = true
|
||||
self.assertEqual(cm.exception.code, 1)
|
||||
# watch_inbox was called twice: initial run + after reload
|
||||
self.assertEqual(mock_watch.call_count, 2)
|
||||
# _parse_config_file called for initial load + reload
|
||||
# _parse_config called for initial load + reload
|
||||
self.assertGreaterEqual(mock_parse_config.call_count, 2)
|
||||
|
||||
@unittest.skipUnless(
|
||||
@@ -2053,7 +2059,8 @@ watch = true
|
||||
"SIGHUP not available on this platform",
|
||||
)
|
||||
@patch("parsedmarc.cli._init_output_clients")
|
||||
@patch("parsedmarc.cli._parse_config_file")
|
||||
@patch("parsedmarc.cli._parse_config")
|
||||
@patch("parsedmarc.cli._load_config")
|
||||
@patch("parsedmarc.cli.get_dmarc_reports_from_mailbox")
|
||||
@patch("parsedmarc.cli.watch_inbox")
|
||||
@patch("parsedmarc.cli.IMAPConnection")
|
||||
@@ -2062,6 +2069,7 @@ watch = true
|
||||
mock_imap,
|
||||
mock_watch,
|
||||
mock_get_reports,
|
||||
mock_load_config,
|
||||
mock_parse_config,
|
||||
mock_init_clients,
|
||||
):
|
||||
@@ -2075,11 +2083,13 @@ watch = true
|
||||
"smtp_tls_reports": [],
|
||||
}
|
||||
|
||||
mock_load_config.return_value = ConfigParser()
|
||||
|
||||
# Initial parse sets required opts; reload parse raises
|
||||
initial_map = {"prefix_": ["example.com"]}
|
||||
call_count = [0]
|
||||
|
||||
def parse_side_effect(config_file, opts):
|
||||
def parse_side_effect(config, opts):
|
||||
call_count[0] += 1
|
||||
opts.imap_host = "imap.example.com"
|
||||
opts.imap_user = "user"
|
||||
@@ -2129,7 +2139,8 @@ watch = true
|
||||
"SIGHUP not available on this platform",
|
||||
)
|
||||
@patch("parsedmarc.cli._init_output_clients")
|
||||
@patch("parsedmarc.cli._parse_config_file")
|
||||
@patch("parsedmarc.cli._parse_config")
|
||||
@patch("parsedmarc.cli._load_config")
|
||||
@patch("parsedmarc.cli.get_dmarc_reports_from_mailbox")
|
||||
@patch("parsedmarc.cli.watch_inbox")
|
||||
@patch("parsedmarc.cli.IMAPConnection")
|
||||
@@ -2138,6 +2149,7 @@ watch = true
|
||||
mock_imap,
|
||||
mock_watch,
|
||||
mock_get_reports,
|
||||
mock_load_config,
|
||||
mock_parse_config,
|
||||
mock_init_clients,
|
||||
):
|
||||
@@ -2151,7 +2163,9 @@ watch = true
|
||||
"smtp_tls_reports": [],
|
||||
}
|
||||
|
||||
def parse_side_effect(config_file, opts):
|
||||
mock_load_config.return_value = ConfigParser()
|
||||
|
||||
def parse_side_effect(config, opts):
|
||||
opts.imap_host = "imap.example.com"
|
||||
opts.imap_user = "user"
|
||||
opts.imap_password = "pass"
|
||||
@@ -2278,6 +2292,390 @@ watch = true
|
||||
# Second init (after reload with v2 config): kafka_hosts should be None
|
||||
self.assertIsNone(init_opts_captures[1].kafka_hosts)
|
||||
|
||||
@unittest.skipUnless(
|
||||
hasattr(signal, "SIGHUP"),
|
||||
"SIGHUP not available on this platform",
|
||||
)
|
||||
@patch("parsedmarc.cli._init_output_clients")
|
||||
@patch("parsedmarc.cli._parse_config")
|
||||
@patch("parsedmarc.cli._load_config")
|
||||
@patch("parsedmarc.cli.get_dmarc_reports_from_mailbox")
|
||||
@patch("parsedmarc.cli.watch_inbox")
|
||||
@patch("parsedmarc.cli.IMAPConnection")
|
||||
def testReloadRefreshesReverseDnsMap(
|
||||
self,
|
||||
mock_imap,
|
||||
mock_watch,
|
||||
mock_get_reports,
|
||||
mock_load_config,
|
||||
mock_parse_config,
|
||||
mock_init_clients,
|
||||
):
|
||||
"""SIGHUP reload repopulates the reverse DNS map so lookups still work."""
|
||||
import signal as signal_module
|
||||
|
||||
from parsedmarc import REVERSE_DNS_MAP
|
||||
|
||||
mock_imap.return_value = object()
|
||||
mock_get_reports.return_value = {
|
||||
"aggregate_reports": [],
|
||||
"forensic_reports": [],
|
||||
"smtp_tls_reports": [],
|
||||
}
|
||||
|
||||
mock_load_config.return_value = ConfigParser()
|
||||
|
||||
def parse_side_effect(config, opts):
|
||||
opts.imap_host = "imap.example.com"
|
||||
opts.imap_user = "user"
|
||||
opts.imap_password = "pass"
|
||||
opts.mailbox_watch = True
|
||||
return None
|
||||
|
||||
mock_parse_config.side_effect = parse_side_effect
|
||||
mock_init_clients.return_value = {}
|
||||
|
||||
# Snapshot the map state after each watch_inbox call
|
||||
map_snapshots = []
|
||||
|
||||
watch_calls = [0]
|
||||
|
||||
def watch_side_effect(*args, **kwargs):
|
||||
watch_calls[0] += 1
|
||||
if watch_calls[0] == 1:
|
||||
if hasattr(signal_module, "SIGHUP"):
|
||||
import os
|
||||
|
||||
os.kill(os.getpid(), signal_module.SIGHUP)
|
||||
return
|
||||
else:
|
||||
# Capture the map state after reload, before we stop the loop
|
||||
map_snapshots.append(dict(REVERSE_DNS_MAP))
|
||||
raise FileExistsError("stop")
|
||||
|
||||
mock_watch.side_effect = watch_side_effect
|
||||
|
||||
with tempfile.NamedTemporaryFile("w", suffix=".ini", delete=False) as cfg:
|
||||
cfg.write(self._BASE_CONFIG)
|
||||
cfg_path = cfg.name
|
||||
self.addCleanup(lambda: os.path.exists(cfg_path) and os.remove(cfg_path))
|
||||
|
||||
# Pre-populate the map so we can verify it gets refreshed
|
||||
REVERSE_DNS_MAP.clear()
|
||||
REVERSE_DNS_MAP["stale.example.com"] = {
|
||||
"name": "Stale",
|
||||
"type": "stale",
|
||||
}
|
||||
original_contents = dict(REVERSE_DNS_MAP)
|
||||
|
||||
with patch.object(sys, "argv", ["parsedmarc", "-c", cfg_path]):
|
||||
with self.assertRaises(SystemExit):
|
||||
parsedmarc.cli._main()
|
||||
|
||||
self.assertEqual(mock_watch.call_count, 2)
|
||||
# The map should have been repopulated (not empty, not the stale data)
|
||||
self.assertEqual(len(map_snapshots), 1)
|
||||
refreshed = map_snapshots[0]
|
||||
self.assertGreater(len(refreshed), 0, "Map should not be empty after reload")
|
||||
self.assertNotEqual(
|
||||
refreshed,
|
||||
original_contents,
|
||||
"Map should have been refreshed, not kept stale data",
|
||||
)
|
||||
self.assertNotIn(
|
||||
"stale.example.com",
|
||||
refreshed,
|
||||
"Stale entry should have been cleared by reload",
|
||||
)
|
||||
|
||||
|
||||
class TestIndexPrefixDomainMapTlsFiltering(unittest.TestCase):
|
||||
"""Tests that SMTP TLS reports for unmapped domains are filtered out
|
||||
when index_prefix_domain_map is configured."""
|
||||
|
||||
@patch("parsedmarc.cli.get_dmarc_reports_from_mailbox")
|
||||
@patch("parsedmarc.cli.IMAPConnection")
|
||||
def testTlsReportsFilteredByDomainMap(
|
||||
self,
|
||||
mock_imap_connection,
|
||||
mock_get_reports,
|
||||
):
|
||||
"""TLS reports for domains not in the map should be silently dropped."""
|
||||
mock_imap_connection.return_value = object()
|
||||
mock_get_reports.return_value = {
|
||||
"aggregate_reports": [],
|
||||
"forensic_reports": [],
|
||||
"smtp_tls_reports": [
|
||||
{
|
||||
"organization_name": "Allowed Org",
|
||||
"begin_date": "2024-01-01T00:00:00Z",
|
||||
"end_date": "2024-01-01T23:59:59Z",
|
||||
"report_id": "allowed-1",
|
||||
"contact_info": "tls@allowed.example.com",
|
||||
"policies": [
|
||||
{
|
||||
"policy_domain": "allowed.example.com",
|
||||
"policy_type": "sts",
|
||||
"successful_session_count": 1,
|
||||
"failed_session_count": 0,
|
||||
}
|
||||
],
|
||||
},
|
||||
{
|
||||
"organization_name": "Unmapped Org",
|
||||
"begin_date": "2024-01-01T00:00:00Z",
|
||||
"end_date": "2024-01-01T23:59:59Z",
|
||||
"report_id": "unmapped-1",
|
||||
"contact_info": "tls@unmapped.example.net",
|
||||
"policies": [
|
||||
{
|
||||
"policy_domain": "unmapped.example.net",
|
||||
"policy_type": "sts",
|
||||
"successful_session_count": 5,
|
||||
"failed_session_count": 0,
|
||||
}
|
||||
],
|
||||
},
|
||||
{
|
||||
"organization_name": "Mixed Case Org",
|
||||
"begin_date": "2024-01-01T00:00:00Z",
|
||||
"end_date": "2024-01-01T23:59:59Z",
|
||||
"report_id": "mixed-case-1",
|
||||
"contact_info": "tls@mixedcase.example.com",
|
||||
"policies": [
|
||||
{
|
||||
"policy_domain": "MixedCase.Example.Com",
|
||||
"policy_type": "sts",
|
||||
"successful_session_count": 2,
|
||||
"failed_session_count": 0,
|
||||
}
|
||||
],
|
||||
},
|
||||
],
|
||||
}
|
||||
|
||||
domain_map = {"tenant_a": ["example.com"]}
|
||||
with NamedTemporaryFile("w", suffix=".yaml", delete=False) as map_file:
|
||||
import yaml
|
||||
|
||||
yaml.dump(domain_map, map_file)
|
||||
map_path = map_file.name
|
||||
self.addCleanup(lambda: os.path.exists(map_path) and os.remove(map_path))
|
||||
|
||||
config = f"""[general]
|
||||
save_smtp_tls = true
|
||||
silent = false
|
||||
index_prefix_domain_map = {map_path}
|
||||
|
||||
[imap]
|
||||
host = imap.example.com
|
||||
user = test-user
|
||||
password = test-password
|
||||
"""
|
||||
with NamedTemporaryFile("w", suffix=".ini", delete=False) as config_file:
|
||||
config_file.write(config)
|
||||
config_path = config_file.name
|
||||
self.addCleanup(lambda: os.path.exists(config_path) and os.remove(config_path))
|
||||
|
||||
captured = io.StringIO()
|
||||
with patch.object(sys, "argv", ["parsedmarc", "-c", config_path]):
|
||||
with patch("sys.stdout", captured):
|
||||
parsedmarc.cli._main()
|
||||
|
||||
output = json.loads(captured.getvalue())
|
||||
tls_reports = output["smtp_tls_reports"]
|
||||
self.assertEqual(len(tls_reports), 2)
|
||||
report_ids = {r["report_id"] for r in tls_reports}
|
||||
self.assertIn("allowed-1", report_ids)
|
||||
self.assertIn("mixed-case-1", report_ids)
|
||||
self.assertNotIn("unmapped-1", report_ids)


class TestEnvVarConfig(unittest.TestCase):
    """Tests for environment variable configuration support."""

    def test_resolve_section_key_simple(self):
        """Simple section names resolve correctly."""
        from parsedmarc.cli import _resolve_section_key

        self.assertEqual(_resolve_section_key("IMAP_PASSWORD"), ("imap", "password"))
        self.assertEqual(_resolve_section_key("GENERAL_DEBUG"), ("general", "debug"))
        self.assertEqual(_resolve_section_key("S3_BUCKET"), ("s3", "bucket"))
        self.assertEqual(_resolve_section_key("GELF_HOST"), ("gelf", "host"))

    def test_resolve_section_key_underscore_sections(self):
        """Multi-word section names (splunk_hec, gmail_api, etc.) resolve correctly."""
        from parsedmarc.cli import _resolve_section_key

        self.assertEqual(
            _resolve_section_key("SPLUNK_HEC_TOKEN"), ("splunk_hec", "token")
        )
        self.assertEqual(
            _resolve_section_key("GMAIL_API_CREDENTIALS_FILE"),
            ("gmail_api", "credentials_file"),
        )
        self.assertEqual(
            _resolve_section_key("LOG_ANALYTICS_CLIENT_ID"),
            ("log_analytics", "client_id"),
        )

    def test_resolve_section_key_unknown(self):
        """Unknown prefixes return (None, None)."""
        from parsedmarc.cli import _resolve_section_key

        self.assertEqual(_resolve_section_key("UNKNOWN_FOO"), (None, None))
        # Just a section name with no key should not match
        self.assertEqual(_resolve_section_key("IMAP"), (None, None))

    def test_apply_env_overrides_injects_values(self):
        """Env vars are injected into an existing ConfigParser."""
        from configparser import ConfigParser
        from parsedmarc.cli import _apply_env_overrides

        config = ConfigParser()
        config.add_section("imap")
        config.set("imap", "host", "original.example.com")

        env = {
            "PARSEDMARC_IMAP_HOST": "new.example.com",
            "PARSEDMARC_IMAP_PASSWORD": "secret123",
        }
        with patch.dict(os.environ, env, clear=False):
            _apply_env_overrides(config)

        self.assertEqual(config.get("imap", "host"), "new.example.com")
        self.assertEqual(config.get("imap", "password"), "secret123")

    def test_apply_env_overrides_creates_sections(self):
        """Env vars create new sections when they don't exist."""
        from configparser import ConfigParser
        from parsedmarc.cli import _apply_env_overrides

        config = ConfigParser()

        env = {"PARSEDMARC_ELASTICSEARCH_HOSTS": "http://localhost:9200"}
        with patch.dict(os.environ, env, clear=False):
            _apply_env_overrides(config)

        self.assertTrue(config.has_section("elasticsearch"))
        self.assertEqual(config.get("elasticsearch", "hosts"), "http://localhost:9200")

    def test_apply_env_overrides_ignores_config_file_var(self):
        """PARSEDMARC_CONFIG_FILE is not injected as a config key."""
        from configparser import ConfigParser
        from parsedmarc.cli import _apply_env_overrides

        config = ConfigParser()

        env = {"PARSEDMARC_CONFIG_FILE": "/some/path.ini"}
        with patch.dict(os.environ, env, clear=False):
            _apply_env_overrides(config)

        self.assertEqual(config.sections(), [])

    def test_load_config_with_file_and_env_override(self):
        """Env vars override values from an INI file."""
        from parsedmarc.cli import _load_config

        with NamedTemporaryFile(mode="w", suffix=".ini", delete=False) as f:
            f.write(
                "[imap]\nhost = file.example.com\nuser = alice\npassword = fromfile\n"
            )
            f.flush()
            config_path = f.name

        try:
            env = {"PARSEDMARC_IMAP_PASSWORD": "fromenv"}
            with patch.dict(os.environ, env, clear=False):
                config = _load_config(config_path)

            self.assertEqual(config.get("imap", "host"), "file.example.com")
            self.assertEqual(config.get("imap", "user"), "alice")
            self.assertEqual(config.get("imap", "password"), "fromenv")
        finally:
            os.unlink(config_path)

    def test_load_config_env_only(self):
        """Config can be loaded purely from env vars with no file."""
        from parsedmarc.cli import _load_config

        env = {
            "PARSEDMARC_GENERAL_DEBUG": "true",
            "PARSEDMARC_ELASTICSEARCH_HOSTS": "http://localhost:9200",
        }
        with patch.dict(os.environ, env, clear=False):
            config = _load_config(None)

        self.assertEqual(config.get("general", "debug"), "true")
        self.assertEqual(config.get("elasticsearch", "hosts"), "http://localhost:9200")

    def test_parse_config_from_env(self):
        """Full round-trip: env vars -> ConfigParser -> opts."""
        from argparse import Namespace
        from parsedmarc.cli import _load_config, _parse_config

        env = {
            "PARSEDMARC_GENERAL_DEBUG": "true",
            "PARSEDMARC_GENERAL_SAVE_AGGREGATE": "true",
            "PARSEDMARC_GENERAL_OFFLINE": "true",
        }
        with patch.dict(os.environ, env, clear=False):
            config = _load_config(None)

        opts = Namespace()
        _parse_config(config, opts)

        self.assertTrue(opts.debug)
        self.assertTrue(opts.save_aggregate)
        self.assertTrue(opts.offline)

    def test_config_file_env_var(self):
        """PARSEDMARC_CONFIG_FILE env var specifies the config file path."""
        from argparse import Namespace
        from parsedmarc.cli import _load_config, _parse_config

        with NamedTemporaryFile(mode="w", suffix=".ini", delete=False) as f:
            f.write("[general]\ndebug = true\noffline = true\n")
            f.flush()
            config_path = f.name

        try:
            env = {"PARSEDMARC_CONFIG_FILE": config_path}
            with patch.dict(os.environ, env, clear=False):
                config = _load_config(os.environ.get("PARSEDMARC_CONFIG_FILE"))

            opts = Namespace()
            _parse_config(config, opts)
            self.assertTrue(opts.debug)
            self.assertTrue(opts.offline)
        finally:
            os.unlink(config_path)

    def test_boolean_values_from_env(self):
        """Various boolean string representations work through ConfigParser."""
        from configparser import ConfigParser
        from parsedmarc.cli import _apply_env_overrides

        for true_val in ("true", "yes", "1", "on", "True", "YES"):
            config = ConfigParser()
            env = {"PARSEDMARC_GENERAL_DEBUG": true_val}
            with patch.dict(os.environ, env, clear=False):
                _apply_env_overrides(config)
            self.assertTrue(
                config.getboolean("general", "debug"),
                f"Expected truthy for {true_val!r}",
            )

        for false_val in ("false", "no", "0", "off", "False", "NO"):
            config = ConfigParser()
            env = {"PARSEDMARC_GENERAL_DEBUG": false_val}
            with patch.dict(os.environ, env, clear=False):
                _apply_env_overrides(config)
            self.assertFalse(
                config.getboolean("general", "debug"),
                f"Expected falsy for {false_val!r}",
            )


if __name__ == "__main__":
    unittest.main(verbosity=2)