parsedmarc/.vscode/settings.json
Sean Whalen 180fc581fe fix: OSD Global-tenant import + dropped report files with glob metacharacters; validate dev stack on OpenSearch 3.x with PostgreSQL (#781)
* fix: import OpenSearch dashboards into the real Global tenant

dashboard-dev-bootstrap.sh sent `securitytenant: global_tenant`. The
OpenSearch security plugin reads that header as a tenant *name*, and
`global_tenant` is a sample custom tenant from the security demo config
-- not the shared Global tenant, whose token is the literal `global`.
The import therefore landed in a separate `global_tenant` tenant (its
own `.kibana_<hash>_globaltenant_1` index) and the dashboards were
invisible to anyone viewing the Global tenant in OpenSearch Dashboards.

Verified against the live dev cluster: `_find` under `securitytenant:
global` returned 26 objects and `.kibana_1` (the Global tenant index the
UI reads) went from 2 to 67 docs after re-importing with the fix. An
empty/omitted header read 0 from Global -- it falls back to the user's
configured default tenant -- so `global` is the only reliable token.
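
A minimal sketch of the header difference, assuming an OpenSearch Dashboards
instance at http://localhost:5601 (a placeholder, not taken from the bootstrap
script). `build_find_request` is a hypothetical helper; it only constructs the
`_find` request and sends nothing.

```python
from urllib.request import Request

# Hypothetical base URL for illustration; the real dev stack may differ.
OSD_URL = "http://localhost:5601"


def build_find_request(tenant: str = "global") -> Request:
    """Build a saved-objects _find request scoped to one tenant."""
    return Request(
        f"{OSD_URL}/api/saved_objects/_find?type=dashboard",
        headers={
            # "global" addresses the shared Global tenant; "global_tenant"
            # would address a separate custom tenant of that name.
            "securitytenant": tenant,
            # OSD expects this header on write requests such as _import;
            # it is harmless on a GET.
            "osd-xsrf": "true",
        },
    )


req = build_find_request()
# urllib stores header names capitalized, hence "Securitytenant" here.
print(req.get_header("Securitytenant"))
```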

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: don't drop report files whose names contain glob metacharacters

The CLI expanded every file argument with glob(), which treats [, ], *,
and ? as pattern syntax. A literal path like
"[Netease DMARC Failure Report] Rent Reminder.eml" -- the bracketed shape
many providers use for emailed failure reports -- was read as a character
class, matched nothing, and was dropped before reaching the parser, with
no error. File arguments that exist on disk are now taken literally; only
non-existent paths are globbed, so shell-style wildcards still expand.
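
The rule above can be sketched as follows. This illustrates the approach and
is not the CLI's actual code; `expand_file_args` is a hypothetical name.

```python
import glob
import os


def expand_file_args(args: list[str]) -> list[str]:
    """Take existing paths literally; glob only the ones that don't exist."""
    paths: list[str] = []
    for arg in args:
        if os.path.exists(arg):
            # A real file such as "[Report] x.eml" must not be read as a pattern.
            paths.append(arg)
        else:
            # Not on disk as written, so treat it as a shell-style wildcard.
            paths.extend(sorted(glob.glob(arg)))
    return paths
```

Where a path must be globbed but part of it is known to be literal,
`glob.escape()` is the standard-library way to neutralize the metacharacters
in that part.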

Also adds "postgresql" to _KNOWN_SECTIONS so PARSEDMARC_POSTGRESQL_* env
vars (and their _FILE Docker-secret variants) resolve like every other
backend -- the PostgreSQL backend is new in 10.0.0, so this completes the
unreleased feature rather than fixing a released regression, and is
documented under the PostgreSQL enhancement, not Bug fixes.
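
A rough sketch of why the missing entry mattered, assuming env vars are
matched against section names by prefix. `resolve_env`, the section subset,
and the `host`/`password` keys are illustrative assumptions, not parsedmarc's
real implementation or option names.

```python
from pathlib import Path

# Illustrative subset; the real _KNOWN_SECTIONS list is longer.
KNOWN_SECTIONS = ("elasticsearch", "opensearch", "postgresql")


def resolve_env(environ, sections=KNOWN_SECTIONS):
    """Map PARSEDMARC_<SECTION>_<KEY>[_FILE] variables to {section: {key: value}}."""
    config = {}
    for name, value in environ.items():
        for section in sections:
            prefix = f"PARSEDMARC_{section.upper()}_"
            if not name.startswith(prefix):
                continue
            key = name[len(prefix):].lower()
            if key.endswith("_file"):
                # Docker-secret variant: the variable holds a path to the value.
                key = key[: -len("_file")]
                value = Path(value).read_text().strip()
            config.setdefault(section, {})[key] = value
            break
    return config
```

With "postgresql" absent from the section list, a PARSEDMARC_POSTGRESQL_*
variable matches no prefix and is silently ignored in this sketch.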

Regression tests added for both. Verified end-to-end: all four
samples/failure/*.eml now index (the bracketed Netease report included).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* dev: validate dashboards on OpenSearch 3.x and add PostgreSQL to the dev stack

The dev stack ran OpenSearch Dashboards 3.x against OpenSearch 2.x, an
unsupported cross-major pairing. Bump opensearch to :3 (validated on
3.6.0: OSD import into the Global tenant and all dashboards work).

Add a postgresql service plus bootstrap wiring so the new PostgreSQL
backend is exercised alongside the others: wait for PG, seed it via
PARSEDMARC_POSTGRESQL_* env vars on the same parsedmarc run, wipe it on
RESEED, create a Grafana grafana-postgresql-datasource (uid dmarc-pg),
and import dashboards/grafana/Grafana-DMARC_Reports-PostgreSQL.json.

PG seeding is gated on psycopg being importable: parsedmarc aborts the
whole run (exit 1, nothing written to any backend) when a configured
output backend can't initialize, so wiring in PG without the optional
extra would silently zero ES/OS/Splunk too. When psycopg is absent the
script warns and skips PG, leaving the other backends seeded.
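
The gate is implemented in the bootstrap shell script; the same idea rendered
in Python, as a sketch with hypothetical helper names, looks like this:

```python
import importlib.util
import sys


def psycopg_available() -> bool:
    """True when the optional psycopg driver can be imported."""
    return importlib.util.find_spec("psycopg") is not None


def select_backends(requested: list[str], have_psycopg: bool) -> list[str]:
    """Drop PostgreSQL (with a warning) rather than let it abort the whole run."""
    if have_psycopg:
        return list(requested)
    print("warning: psycopg not importable; skipping PostgreSQL", file=sys.stderr)
    return [b for b in requested if b != "postgresql"]


backends = select_backends(
    ["elasticsearch", "opensearch", "splunk", "postgresql"],
    psycopg_available(),
)
```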

Also fix the Grafana admin password env: the container was given
GRAFANA_PASSWORD, which Grafana ignores -- it reads
GF_SECURITY_ADMIN_PASSWORD. Defaults to admin to match the script.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: list PostgreSQL on the premade-dashboards features bullet

PostgreSQL ships a premade Grafana dashboard
(dashboards/grafana/Grafana-DMARC_Reports-PostgreSQL.json), so it belongs
on the "for use with premade dashboards" bullet alongside Elasticsearch,
OpenSearch, and Splunk rather than on the plain-output-destinations line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: clear stale org_email mapping conflict in the OpenSearch dashboards

The aggregate index pattern in dashboards/opensearch/opensearch_dashboards.ndjson
shipped a cached field-list snapshot where org_email was a text/object
conflict, plus leftover org_email.#text and org_email.#text.keyword
subfields. Those came from a cluster that had indexed a langAttrString
email dict ({"#text": ..., "@lang": ...}) before the parser unwrapped it.

org_email is mapped as Text() and parse_aggregate_report_xml now unwraps a
dict email to a plain string, so current data is consistently text -- a
clean cluster's _field_caps reports no conflict. Cleared the frozen
conflict and the two artifact subfields, leaving org_email (text) and
org_email.keyword, matching the live mapping.
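
For illustration, the unwrapping amounts to something like the following.
`unwrap_lang_string` is a hypothetical helper, not the code inside
parse_aggregate_report_xml.

```python
def unwrap_lang_string(value):
    """Return the text of a {"#text": ..., "@lang": ...} dict, else the value as-is."""
    if isinstance(value, dict):
        # xmltodict represents an element with attributes as a dict whose
        # text content sits under "#text".
        return value.get("#text")
    return value
```

Indexing the unwrapped string keeps org_email a plain text field, so no
org_email.#text subfields are created.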

Verified: re-importing the corrected ndjson yields an index pattern with
org_email as a plain text field and zero conflicts; only the aggregate
index-pattern line changed, all other saved objects byte-identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* dev: seed the RFC 9990 (dmarc-2.0) aggregate samples

samples/aggregate/rfc9990-sample.xml and rfc9990-example.net!...xml were
not in the bootstrap's SAMPLE_FILES, so the dev stack only ever indexed
RFC 7489 reports and the new DMARCbis fields (np, testing,
discovery_method, generator, xml_namespace) never appeared in the
OpenSearch/Kibana indices and so were unavailable to the dashboards.

Added both samples (one declares the urn:ietf:params:xml:ns:dmarc-2.0
namespace, the other is namespaceless RFC 9990-shaped, covering both
detection paths). Verified the seeded data now carries np/testing/
discovery_method/generator and xml_namespace=urn:ietf:params:xml:ns:dmarc-2.0;
OpenSearch Dashboards surfaces them on an index-pattern field-list refresh.
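
A loose sketch of the two detection paths, under stated assumptions: the
namespace check reads the root element's tag, and the namespaceless check
looks for any of the new field names as element names anywhere in the report.
Where those elements actually sit, and how parsedmarc detects them, is not
shown in this commit; `detect_rfc9990` is hypothetical.

```python
import xml.etree.ElementTree as ET

DMARC2_NS = "urn:ietf:params:xml:ns:dmarc-2.0"
# Field names from the commit message; treating them as element names that
# may appear anywhere in the report is an assumption of this sketch.
RFC9990_FIELDS = {"np", "testing", "discovery_method", "generator"}


def local_name(tag: str) -> str:
    """Strip an ElementTree "{namespace}" prefix from a tag."""
    return tag.rsplit("}", 1)[-1]


def detect_rfc9990(xml_text: str):
    """Return (looks_like_rfc9990, declared_namespace_or_None)."""
    root = ET.fromstring(xml_text)
    if root.tag.startswith("{" + DMARC2_NS + "}"):
        return True, DMARC2_NS
    has_new_fields = any(
        local_name(el.tag) in RFC9990_FIELDS for el in root.iter()
    )
    return has_new_fields, None
```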

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* dev: auto-resolve (or create) a venv for the seed and ensure psycopg

The seed previously required parsedmarc to be pre-installed and only
warned-and-skipped PostgreSQL when psycopg was missing. Resolve the seed
environment by precedence instead:

  1. explicit PARSEDMARC_BIN  -> used as-is, nothing installed
  2. active $VIRTUAL_ENV
  3. existing repo venv/ or .venv/
  4. otherwise create $REPO_ROOT/venv

For cases 2-4, run `pip install -e .[postgresql]` only when the CLI or
psycopg is missing, so the dev stack can populate Postgres out of the box
without a manual install step. The explicit-PARSEDMARC_BIN path is left
untouched (and the psycopg seed guard still warns/skips if that env lacks
the extra).
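
The precedence can be sketched in Python as below. The bootstrap itself is a
shell script, so `resolve_seed_env` and its return shape are illustrative
only.

```python
from pathlib import Path


def resolve_seed_env(repo_root: Path, environ) -> tuple[str, Path, bool]:
    """Return (source, path, may_install) following the precedence above."""
    explicit = environ.get("PARSEDMARC_BIN")
    if explicit:
        # Case 1: used as-is, nothing installed.
        return "explicit", Path(explicit), False
    active = environ.get("VIRTUAL_ENV")
    if active:
        # Case 2: the currently activated virtualenv.
        return "active", Path(active), True
    for name in ("venv", ".venv"):
        candidate = repo_root / name
        if candidate.is_dir():
            # Case 3: a venv that already exists in the repo.
            return "existing", candidate, True
    # Case 4: nothing found, so a new venv would be created here.
    return "create", repo_root / "venv", True
```

`may_install` only marks the cases where the install step is allowed to run.
Whether it actually runs still depends on the CLI or psycopg being missing.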

Verified: a RESEED run resolves the active venv, seeds ES/OS/Splunk/PG
including the RFC 9990 fields, with no output-client errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 15:42:41 -04:00

{
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnSave": true,
// Let Ruff handle lint fixes + import sorting on save
"editor.codeActionsOnSave": {
"source.fixAll.ruff": "explicit",
"source.organizeImports.ruff": "explicit"
}
},
"markdownlint.config": {
"MD024": false
},
"cSpell.ignorePaths": [
"parsedmarc/resources/**",
"samples/**",
"dashboards/**"
],
"cSpell.words": [
"abbp",
"adkim",
"AFRINIC",
"akamaiedge",
"AKIA",
"amsmath",
"andrewmcgilvray",
"Angkasa",
"antipattern",
"aoss",
"apikey",
"APNIC",
"arcname",
"ARIN",
"asahi",
"aspf",
"autoclass",
"automodule",
"awssigv",
"AWSV",
"Ayuntamiento",
"backported",
"baltcom",
"Bankstown",
"bayii",
"behaviour",
"bellsouth",
"bestbuy",
"Bhozar",
"BIGSERIAL",
"biznesu",
"Boldyn",
"bombbomb",
"borschow",
"boto",
"brakhane",
"Brightmail",
"Brightspace",
"Buildtech",
"Cadian",
"cafile",
"Centrale",
"certfile",
"CEST",
"CFWS",
"CHACHA",
"charliermarsh",
"charrefs",
"checkdmarc",
"chello",
"choropleth",
"CLOUDFLARENET",
"Codecov",
"colour",
"Comune",
"Concentrix",
"confnew",
"cooldown",
"cornerstoneondemand",
"CPAN",
"cprapid",
"creds",
"csec",
"cust",
"cyberfolks",
"datagram",
"Datech",
"dateparser",
"dateutil",
"Davmail",
"DBIP",
"dbname",
"ddgs",
"dearmor",
"dedup",
"dedups",
"defaultdict",
"defence",
"deflist",
"descr",
"devel",
"DGRAM",
"Dienstleister",
"digicelgroup",
"digicelsr",
"disambiguator",
"dlivry",
"DMARC",
"Dmarcian",
"dnspython",
"dollarmath",
"domainaware",
"Dotdigital",
"dpkg",
"Draffin",
"electrolyser",
"Energia",
"enloe",
"EPEL",
"estudio",
"Evolus",
"exampleuser",
"expanduser",
"expandvars",
"expiringdict",
"faxpipe",
"fieldlist",
"fintech",
"firstlight",
"firstsourceweb",
"firstwave",
"foohost",
"footguns",
"freseniusmedicalcare",
"fromenv",
"fspath",
"gaierror",
"GELF",
"Genesys",
"genindex",
"geoip",
"geoipupdate",
"Geolite",
"geolocation",
"gerenciados",
"geteuid",
"getpid",
"getuid",
"Gigantara",
"githubpages",
"Gmina",
"goco",
"Grafana",
"greenecountyny",
"Gurgaon",
"helpforcb",
"henkel",
"homelab",
"homelabs",
"Hostinger",
"hostnames",
"htpasswd",
"httpasswd",
"httplib",
"hugedomains",
"idens",
"ifhost",
"IMAP",
"imapclient",
"infile",
"infogérance",
"informatiques",
"Interaktive",
"interstitials",
"IPDB",
"IPFS",
"ipinfo",
"isinstance",
"isready",
"journalctl",
"junitxml",
"kafkaclient",
"keepalive",
"keycorpgroup",
"keyout",
"keyrings",
"kiota",
"kwarg",
"kwargs",
"LACNIC",
"lancastergeneralhealth",
"lastik",
"Leeman",
"libemail",
"libpq",
"linkify",
"LISTSERV",
"localonly",
"lodestonegroup",
"loganalytics",
"Lojistik",
"Loomis",
"Ltda",
"Luxembourgish",
"lxml",
"Maildir",
"mailparser",
"mailrelay",
"mailsuite",
"MAINPID",
"maxdepth",
"MAXHEADERS",
"maxmind",
"maxminddb",
"mbox",
"mcdlv",
"mcsv",
"metacharacters",
"mfrom",
"mhdw",
"Miasta",
"Miasto",
"michaeldavie",
"mikesiegel",
"Mimecast",
"misattributed",
"mitigations",
"mktemp",
"MMDB",
"modindex",
"Mosquée",
"msgconvert",
"msgraph",
"MSSP",
"multiprocess",
"multivalued",
"Munge",
"myshopify",
"namespaceless",
"ndjson",
"Netease",
"Newfold",
"newkey",
"Newswire",
"Newtek",
"Nhcm",
"nitelusa",
"nobre",
"nobreinternet",
"nojekyll",
"nologin",
"nondigest",
"nordictelecom",
"NORDU",
"Norlys",
"nosecureimap",
"nosniff",
"nwettbewerb",
"NXDOMAIN",
"Oberoi",
"opensearch",
"opensearchpy",
"organisation",
"orgname",
"oxfordnetworks",
"Paltalk",
"parsedmarc",
"passsword",
"pawyo",
"pbar",
"penyedia",
"perfdrive",
"PGPASSWORD",
"pharma",
"pipefail",
"plog",
"pmarc",
"pneuservis",
"Postorius",
"premade",
"prestataire",
"privatesuffix",
"procs",
"psql",
"psycopg",
"publicsuffix",
"publicsuffixlist",
"publixsuffix",
"pura",
"pygelf",
"pyproject",
"pypy",
"pytest",
"qasl",
"quickstart",
"RDAP",
"rdns",
"readlines",
"rebrands",
"regusnet",
"Reindex",
"replyto",
"researchable",
"reversename",
"Rollup",
"Rostelecom",
"Rpdm",
"rsgsv",
"SAMEORIGIN",
"Sangoma",
"Sarenet",
"saunalahti",
"sdist",
"seanthegeek",
"sekret",
"sendgrid",
"Servernameone",
"SERVFAIL",
"serviços",
"setuid",
"setuptools",
"signum",
"Sigorta",
"Sikt",
"Sinch",
"smartquotes",
"SMTPTLS",
"socktype",
"solusi",
"sortlists",
"sortmaps",
"sourcetype",
"splunkd",
"sqls",
"sslmode",
"STARTTLS",
"subfolders",
"subzones",
"sungardas",
"Talkdesk",
"tasklist",
"Techexpert",
"telco",
"telcos",
"Telecomunicaciones",
"Telecomunicações",
"Telefónica",
"Telekommunikation",
"Telekomunikasyon",
"Teleperformance",
"Telus",
"testpaths",
"tigobusiness",
"timechart",
"timespan",
"timespans",
"timestamptz",
"tlsa",
"tlsrpt",
"tmddedicated",
"toctree",
"tolower",
"TQDDM",
"tqdm",
"treewalk",
"Treten",
"truststore",
"Tutanota",
"typosquats",
"uids",
"Ukrinfosystems",
"umbler",
"Uncategorized",
"unfindable",
"Uninett",
"unparasable",
"unparseable",
"unwritable",
"uper",
"uplandsoftware",
"urllib",
"Urząd",
"usługi",
"Valimail",
"venv",
"Vertikal",
"Vhcw",
"viewcode",
"virtualenv",
"Voximplant",
"WBITS",
"webmail",
"webpass",
"WEMPI",
"Wettbewerber",
"Whalen",
"whalensolutions",
"whitespaces",
"WHOIS",
"worklist",
"Wylance",
"xennn",
"xmlchanged",
"xmlin",
"xmlnice",
"xmlout",
"xmltodict",
"xpack",
"zscholl",
"Übersicht",
"Şirketi",
"Δήμος",
"клуб",
"шиномонтаж",
"পবিত্র",
"মন্দির",
],
}