parsedmarc

mirror of https://github.com/domainaware/parsedmarc.git synced 2026-05-25 21:35:22 +00:00

Author	SHA1	Message	Date
Sean Whalen	180fc581fe	fix: OSD Global-tenant import + dropped report files with glob metacharacters; validate dev stack on OpenSearch 3.x with PostgreSQL (#781) * fix: import OpenSearch dashboards into the real Global tenant dashboard-dev-bootstrap.sh sent `securitytenant: global_tenant`. The OpenSearch security plugin reads that header as a tenant name, and `global_tenant` is a sample custom tenant from the security demo config -- not the shared Global tenant, whose token is the literal `global`. The import therefore landed in a separate `global_tenant` tenant (its own `.kibana_<hash>_globaltenant_1` index) and the dashboards were invisible to anyone viewing the Global tenant in OpenSearch Dashboards. Verified against the live dev cluster: `_find` under `securitytenant: global` returned 26 objects and `.kibana_1` (the Global tenant index the UI reads) went from 2 to 67 docs after re-importing with the fix. An empty/omitted header read 0 from Global -- it falls back to the user's configured default tenant -- so `global` is the only reliable token. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: don't drop report files whose names contain glob metacharacters The CLI expanded every file argument with glob(), which treats [, ], *, and ? as pattern syntax. A literal path like "[Netease DMARC Failure Report] Rent Reminder.eml" -- the bracketed shape many providers use for emailed failure reports -- was read as a character class, matched nothing, and was dropped before reaching the parser, with no error. File arguments that exist on disk are now taken literally; only non-existent paths are globbed, so shell-style wildcards still expand. 
Also adds "postgresql" to _KNOWN_SECTIONS so PARSEDMARC_POSTGRESQL_* env vars (and their _FILE Docker-secret variants) resolve like every other backend -- the PostgreSQL backend is new in 10.0.0, so this completes the unreleased feature rather than fixing a released regression, and is documented under the PostgreSQL enhancement, not Bug fixes. Regression tests added for both. Verified end-to-end: all four samples/failure/*.eml now index (the bracketed Netease report included). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * dev: validate dashboards on OpenSearch 3.x and add PostgreSQL to the dev stack The dev stack ran OpenSearch Dashboards 3.x against OpenSearch 2.x, an unsupported cross-major pairing. Bump opensearch to :3 (validated on 3.6.0: OSD import into the Global tenant and all dashboards work). Add a postgresql service plus bootstrap wiring so the new PostgreSQL backend is exercised alongside the others: wait for PG, seed it via PARSEDMARC_POSTGRESQL_* env vars on the same parsedmarc run, wipe it on RESEED, create a Grafana grafana-postgresql-datasource (uid dmarc-pg), and import dashboards/grafana/Grafana-DMARC_Reports-PostgreSQL.json. PG seeding is gated on psycopg being importable: parsedmarc aborts the whole run (exit 1, nothing written to any backend) when a configured output backend can't initialize, so wiring in PG without the optional extra would silently zero ES/OS/Splunk too. When psycopg is absent the script warns and skips PG, leaving the other backends seeded. Also fix the Grafana admin password env: the container was given GRAFANA_PASSWORD, which Grafana ignores -- it reads GF_SECURITY_ADMIN_PASSWORD. Defaults to admin to match the script. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: list PostgreSQL on the premade-dashboards features bullet PostgreSQL ships a premade Grafana dashboard (dashboards/grafana/Grafana-DMARC_Reports-PostgreSQL.json), so it belongs on the "for use with premade dashboards" bullet alongside Elasticsearch, OpenSearch, and Splunk rather than on the plain-output-destinations line. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: clear stale org_email mapping conflict in the OpenSearch dashboards The aggregate index pattern in dashboards/opensearch/opensearch_dashboards.ndjson shipped a cached field-list snapshot where org_email was a text/object conflict, plus leftover org_email.#text and org_email.#text.keyword subfields. Those came from a cluster that had indexed a langAttrString email dict ({"#text": ..., "@lang": ...}) before the parser unwrapped it. org_email is mapped as Text() and parse_aggregate_report_xml now unwraps a dict email to a plain string, so current data is consistently text -- a clean cluster's _field_caps reports no conflict. Cleared the frozen conflict and the two artifact subfields, leaving org_email (text) and org_email.keyword, matching the live mapping. Verified: re-importing the corrected ndjson yields an index pattern with org_email as a plain text field and zero conflicts; only the aggregate index-pattern line changed, all other saved objects byte-identical. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * dev: seed the RFC 9990 (dmarc-2.0) aggregate samples samples/aggregate/rfc9990-sample.xml and rfc9990-example.net!...xml were not in the bootstrap's SAMPLE_FILES, so the dev stack only ever indexed RFC 7489 reports and the new DMARCbis fields (np, testing, discovery_method, generator, xml_namespace) never appeared in the OpenSearch/Kibana indices or were available to the dashboards. 
Added both samples (one declares the urn:ietf:params:xml:ns:dmarc-2.0 namespace, the other is namespaceless RFC 9990-shaped, covering both detection paths). Verified the seeded data now carries np/testing/ discovery_method/generator and xml_namespace=urn:ietf:params:xml:ns:dmarc-2.0; OpenSearch Dashboards surfaces them on an index-pattern field-list refresh. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * dev: auto-resolve (or create) a venv for the seed and ensure psycopg The seed previously required parsedmarc to be pre-installed and only warned-and-skipped PostgreSQL when psycopg was missing. Resolve the seed environment by precedence instead: 1. explicit PARSEDMARC_BIN -> used as-is, nothing installed 2. active $VIRTUAL_ENV 3. existing repo venv/ or .venv/ 4. otherwise create $REPO_ROOT/venv For cases 2-4, run `pip install -e .[postgresql]` only when the CLI or psycopg is missing, so the dev stack can populate Postgres out of the box without a manual install step. The explicit-PARSEDMARC_BIN path is left untouched (and the psycopg seed guard still warns/skips if that env lacks the extra). Verified: a RESEED run resolves the active venv, seeds ES/OS/Splunk/PG including the RFC 9990 fields, with no output-client errors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 15:42:41 -04:00
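The glob-metacharacter fix described in the commit above can be illustrated with a minimal Python sketch. This is not parsedmarc's actual CLI code; the function name `expand_file_args` is hypothetical, and the only behavior taken from the commit message is "existing paths are taken literally, everything else is globbed".

```python
import glob
import os


def expand_file_args(args):
    """Expand CLI file arguments (hypothetical helper, not parsedmarc's API).

    A path that exists on disk is kept literally, so names containing
    [, ], * or ? are not misread as patterns. Only non-existent paths
    are passed to glob(), so shell-style wildcards still expand.
    """
    paths = []
    for arg in args:
        if os.path.exists(arg):
            paths.append(arg)
        else:
            paths.extend(sorted(glob.glob(arg)))
    return paths
```

With this shape, a bracketed report name that exists on disk is returned unchanged, while a pattern such as `reports/*.eml` (which does not exist as a literal file) is still expanded by `glob.glob`.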
Sean Whalen	a6778707d7	Finish forensic→failure rename: archive-folder migration + dashboard/doc cleanup (#776 ) The forensic→failure rename (#659) left a few loose ends and one deliberate hold-back. This closes them. Leftover rename misses (broken paths / stale canonical names): - CONTRIBUTING.md, dashboard-dev-bootstrap.sh: samples/forensic/* → samples/failure/* - dashboard-dev-bootstrap.sh, dashboards/README.md: dmarc_forensic_dashboard.xml → dmarc_failure_dashboard.xml (the file was already renamed; the import path and view name were not) - docs/source/usage.md: PARSEDMARC_GENERAL_SAVE_FORENSIC → ..._SAVE_FAILURE example - samples/parsedmarc.ini: save_forensic → save_failure - pyproject.toml, README.md: canonical "failure" naming (ci.ini intentionally keeps save_forensic to smoke-test the deprecated alias.) Archive subfolder rename + on-startup migration: - New failure reports now archive to <archive>/Failure (was <archive>/Forensic). - _migrate_forensic_archive_folder() runs once on startup (best-effort): renames Forensic→Failure when no Failure folder exists yet, merges the two when both exist, no-ops when there's no legacy folder, and logs-and-skips a mailbox it can't reorganize (warn, don't crash). This consolidates pre- and post-rename failure reports into one folder, replacing the previously documented decision to keep the folder named Forensic to avoid a split archive. Uses the folder-management API (folder_exists / rename_folder / merge_folders) added in mailsuite 2.1.0; the pin is bumped to >=2.1.0. Grafana dashboard (the rename PR updated OSD/Splunk/ES-OS but not Grafana): - Forensic panel titles + the datasource label → Failure; the fo-column display label and its linked byName field-override matcher both → "Failure Policy" (changed together so the column-width override keeps matching). 
- dev-bootstrap Grafana ES datasource: dmarc_forensic* → dmarc_f* (matches both pre-rename dmarc_forensic* and post-rename dmarc_failure, like the OSD/Kibana dashboards); RESEED wipe loop now also clears dmarc_failure indices. - Removed dashboards/grafana/Grafana-DMARC_Reports.json-new_panel.json, an orphan export accidentally committed in #736 and referenced by nothing. Tests (tests/test_init.py): - TestMigrateForensicArchiveFolderMaildir: real on-disk Maildir round-trips via mailsuite's MaildirConnection (no mocks) — rename, merge, no-op, and the full get_dmarc_reports_from_mailbox orchestration. Runs in CI (no network/creds). - TestMigrateForensicArchiveFolderErrorHandling: the one path a real Maildir can't reproduce — a backend that raises mid-operation must warn, not crash. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-21 12:29:40 -04:00
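The rename / merge / no-op / warn-and-skip behavior described for the archive-folder migration can be sketched as below. This is an illustration only: `InMemoryMailbox` is a hypothetical stand-in, and although its method names mirror the `folder_exists` / `rename_folder` / `merge_folders` names mentioned in the commit, the argument order and semantics here are assumptions, not the mailsuite API.

```python
import logging

logger = logging.getLogger(__name__)


class InMemoryMailbox:
    """Hypothetical mailbox stand-in; signatures are assumed, not mailsuite's."""

    def __init__(self, folders):
        self.folders = {name: list(msgs) for name, msgs in folders.items()}

    def folder_exists(self, name):
        return name in self.folders

    def rename_folder(self, old, new):
        self.folders[new] = self.folders.pop(old)

    def merge_folders(self, source, destination):
        self.folders[destination].extend(self.folders.pop(source))


def migrate_archive_folder(mailbox, archive="Archive"):
    """Best-effort Forensic -> Failure migration; returns what was done."""
    legacy = f"{archive}/Forensic"
    current = f"{archive}/Failure"
    try:
        if not mailbox.folder_exists(legacy):
            return "noop"  # nothing to migrate
        if mailbox.folder_exists(current):
            mailbox.merge_folders(legacy, current)  # consolidate both folders
            return "merged"
        mailbox.rename_folder(legacy, current)
        return "renamed"
    except Exception as error:  # warn, don't crash
        logger.warning("Could not migrate %s: %s", legacy, error)
        return "skipped"
```

Returning a status string is a choice made for this sketch so each branch is easy to check; the real function may simply log.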
Sean Whalen	4e8c28bbc0	Align Kibana dashboards with OpenSearch Dashboards source-of-truth (#737 ) * Align Kibana dashboards with OpenSearch Dashboards source-of-truth OSD is a fork of Kibana 7.10 and Kibana 8.x's saved-object migration handlers accept OSD's saved-object format directly. Replace the legacy Kibana export with a byte-identical copy of the OSD ndjson, so the two backends ship the same panels, metric aggregations, panel titles, and field assignments instead of drifting independently. Verified against Kibana 8.19.7: import returns successCount=26 with no errors and Kibana auto-migrates each viz / dashboard to its current saved-object schema (typeMigrationVersion 8.5.0 for visualizations, 10.3.0 for dashboards) on import. Net effects for Kibana users on import: - Picks up the metric-aggregation fix from 9.10.3 — pies, tables, and the choropleth now sum(message_count) instead of counting OS docs, giving real message volume rather than distinct source-row counts. - Adds "Message sources by Autonomous System" and "Message sources by name and type" panels (previously only on OSD). - Forensic dashboard simplified to OSD's two-panel layout (markdown intro + samples table) — drops the Kibana-only IP-address and country-ISO tables and the choropleth. - Adds the "SMTP TLS reporting" dashboard (was absent from the bundled Kibana export). - Drops the extraneous "Evolution DMARC par source_reverse_DNS" Lens visualization that snuck in via a community contribution. Updates docs/source/kibana.md to reflect the new dashboard names ("DMARC aggregate reports" / "DMARC failure reports") and adds a brief section on the SMTP TLS reporting dashboard. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Drop the duplicate Kibana ndjson; point Kibana users at the OSD file Kibana 8.x's saved-object migration handlers accept the OpenSearch Dashboards saved-object format directly (verified by import returning successCount=26 with no errors), so a separate kibana/export.ndjson was just two copies of the same bytes that would inevitably drift. Drop it and update the bootstrap script and docs to point at the existing dashboards/opensearch/opensearch_dashboards.ndjson. Add a path-filtered CI workflow (.github/workflows/dashboards.yml) that fires only when the OSD ndjson changes. It stands up an Elasticsearch + Kibana 8.19.7 service pair, POSTs the file at the saved-objects import endpoint, and asserts success=true with no errors. That keeps the single-file source compatible with Kibana on every change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Sean Whalen <seanthegeek@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 01:30:48 -04:00
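The "assert success=true with no errors" check described for the CI workflow could look roughly like the following. This is a sketch under assumptions, not the workflow's actual script: the function name is hypothetical, and the response is assumed to be the already-parsed JSON body of a saved-objects import, with a boolean `success` and an optional `errors` list whose entries carry `type` and `id`.

```python
def import_failures(response):
    """Return human-readable problems found in a parsed import response.

    An empty list means the import would pass the check. The payload
    shape (success / errors[].type / errors[].id) is an assumption.
    """
    problems = []
    if response.get("success") is not True:
        problems.append("success is not true")
    for error in response.get("errors", []):
        problems.append(f"{error.get('type')}:{error.get('id')}")
    return problems
```

A CI step could fail the job whenever the returned list is non-empty and print the entries for debugging.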
Sean Whalen	826e78c390	Fix DMARC dashboard metrics (OSD + Splunk) and add dashboard-dev bootstrap (#736 ) * OSD: fix aggregate dashboard metrics to sum(message_count) 13 panels on the DMARC aggregate dashboard were aggregating with `count` (number of OSD docs) when they should have been summing `message_count`. Each parsedmarc OSD doc represents one (source_ip, auth_results) tuple from the XML and carries an integer message_count, so doc-counting reports "distinct sources" rather than "messages". Panels with titles like "Message volume by header from", "DMARC passage over time", etc. were producing misleading numbers. Affected panels: SPF/DKIM/Passed-DMARC pies; Reporting orgs; Sources by reverse DNS / header from / name+type / ASN / country / IP; Map; SPF and DKIM details. (DMARC failure email samples kept count — one OSD doc per RUF sample, so it's correct. SMTP TLS panels untouched — they sum the right session-count fields.) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Splunk: align dashboards with OSD and fix query bugs Aggregate dashboard: - Add "Message sources by Autonomous System" panel (source_asn / as_name / as_domain), formatted "AS<n>" at render with eval, matching the OSD addition. - DKIM details: add the missing dkim_aligned column. - SPF details: reorder columns to OSD order (spf_aligned at end). - Map / country titles renamed to match OSD ("Map of message sources by country", "Message sources by country"). - Map widget: stats count by Country -> stats sum(message_count) by Country, so the choropleth shades by message volume not record count. - fillnull "none"/"unknown" applied to source_reverse_dns, source_base_domain, source_country to mirror OSD's missing-bucket labels. - charting.fieldColors {true: green, false: red} on SPF/DKIM/Passed-DMARC pies and the DMARC-passage timechart. Forensic dashboard: - Restructure to match OSD's two-panel layout (markdown + samples table). 
- Drop the country map / IP table / country-ISO table panels (not in OSD). - Samples table columns aligned to OSD: arrival_date_utc, source.ip_address, from, subject, reply_to, authentication_results. - Tolerate null headers in the base_search filter (was: parsed_sample.headers.From=* required field to exist; LinkedIn RUF sample with null From was filtered out). SMTP TLS dashboard: - Reorder metrics to OSD order (successful before failed). - Domains panel: add policy_type bucket. - Failure details: replace search-time `failed_session_count>0` (which doesn't evaluate against multivalued JSON paths in Splunk) with `result_type=*` for presence + post-stats `where failed_sessions>0`. Drop _time/successful_sessions columns; reorder to match OSD. - Wire the existing policy_type input into all three searches. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Add dashboard-dev bootstrap script and VSCode task dashboard-dev-bootstrap.sh brings up docker-compose.dashboard-dev.yml, seeds parsedmarc sample data into ES + OS + Splunk via parsedmarc-dev.ini, and re-imports every dashboard into Kibana, OpenSearch Dashboards, Grafana, and Splunk. Idempotent: existence checks skip provisioning that's already done; only the dashboard imports re-run unconditionally on every invocation (that's the point of running it after a dashboard edit). Notable provisioning quirks the script handles: - Splunk's auto-created HEC token (from the SPLUNK_HEC_TOKEN env) ships with indexes=[] and index=default; rewrites it to allow the email index. - ES 8.x rejects wildcard DELETEs by default; RESEED=1 enumerates daily parsedmarc indexes via _cat/indices and deletes one at a time. - Splunk has no clean-in-place REST endpoint for live indexes; RESEED=1 deletes and recreates the email index (then re-applies the HEC token). 
- OSD security plugin tenants: imports target global_tenant explicitly via the securitytenant header so they're visible to the shared workspace rather than landing in the API user's private tenant. Override with OSD_TENANT=<name>. - Splunk ships an in-product announcement view (scheduled_export_dashboard) with sharing=global; the script narrows it to sharing=app so it stops showing up in every app's dashboards list. Adds a "Dev Dashboard: Bootstrap" task to .vscode/tasks.json that runs the script. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * CHANGELOG: 9.10.3 entry for the dashboard metric fix and alignment work Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Bump version to 9.10.3 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * CHANGELOG: warn against the "Create new objects with unique IDs" import mode OSD's import dialog has two modes: the default "Check for existing objects" (which honors saved-object IDs and overwrites in place when "Automatically overwrite conflicts" is on) and "Create new objects with unique IDs" (which imports under fresh UUIDs and leaves the buggy originals untouched). Picking the second one means the dashboards keep rendering the wrong numbers because the originals are never replaced. Spell that out so users don't fall into the trap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * OSD: label the metric column "messages" instead of "Sum of message_count" OSD's table column header defaults to "Sum of message_count" when the metric agg has no customLabel. "messages" reads better and matches what the panels are actually counting. Applies to all 15 aggregate-DMARC visualizations that use sum(message_count). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * CHANGELOG: tighten the 9.10.3 entry — clearer and more actionable Trim the verbose technical exposition; lead each fix with the user-visible symptom. 
Move the action-required call out to its own header in upgrade notes so the re-import instructions don't get lost in a wall of text. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Move per-tool dashboard exports under a single dashboards/ directory Consolidates the four sibling top-level folders (kibana/, opensearch/, grafana/, splunk/) into dashboards/{kibana,opensearch,grafana,splunk}/. Updates the only path references in tracked files: bootstrap script (5 lines), CHANGELOG.md (1 line), and the kibana/export.ndjson raw URL in docs/source/elasticsearch.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * OSD: restore the "DKIM alignment" panel title on the aggregate dashboard The DKIM alignment panel had no title override in panelsJSON, so OSD fell back to the visualization's own name ("Aggregate DMARC DKIM alignment"). Every other pie/table on the same dashboard sets a clean title (SPF alignment, Passed DMARC, etc.) — this was a stray regression. Set the panel title to "DKIM alignment" to match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Splunk: color the message-disposition timechart by severity Reject is red, quarantine is yellow, none is green — same semantic mapping as the SPF/DKIM/Passed-DMARC pies and the DMARC-passage timechart, applied via charting.fieldColors. Matches OSD's existing color overrides on the equivalent viz. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * CHANGELOG: clarify that "Create new objects with unique IDs" is the default The OSD import dialog defaults to that mode — users have to actively switch away from it, not just avoid picking it. Reword the upgrade note to lead with the switch and explain why the default would silently preserve the bug. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Sean Whalen <seanthegeek@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:40:01 -04:00
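The difference between counting documents and summing `message_count`, which the 9.10.3 dashboard fix addresses, can be shown with a small Python sketch. The documents below are made-up sample data, and the helper names are hypothetical; only the field names `header_from`, `source_ip`, and `message_count` come from the commit text.

```python
from collections import Counter

# Made-up documents: one per (source_ip, auth_results) tuple, each
# carrying an integer message_count, as the commit message describes.
docs = [
    {"header_from": "example.com", "source_ip": "192.0.2.1", "message_count": 40},
    {"header_from": "example.com", "source_ip": "192.0.2.2", "message_count": 2},
    {"header_from": "example.net", "source_ip": "198.51.100.7", "message_count": 5},
]


def doc_count(documents, field):
    """The old behavior: count documents per bucket (distinct source rows)."""
    return dict(Counter(doc[field] for doc in documents))


def message_volume(documents, field):
    """The fixed behavior: sum message_count per bucket (real volume)."""
    totals = Counter()
    for doc in documents:
        totals[doc[field]] += doc["message_count"]
    return dict(totals)


print(doc_count(docs, "header_from"))       # {'example.com': 2, 'example.net': 1}
print(message_volume(docs, "header_from"))  # {'example.com': 42, 'example.net': 5}
```

In this sample, a doc-count panel would rank the two domains 2 to 1, while the message-volume view shows 42 to 5, which is why panels titled "Message volume by ..." were misleading before the fix.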

4 Commits