diff --git a/.vscode/settings.json b/.vscode/settings.json
index acce9a4..237bf27 100644
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@@ -26,6 +26,7 @@
         "automodule",
         "backported",
         "bellsouth",
+        "Bhozar",
         "boto",
         "brakhane",
         "Brightmail",
diff --git a/AGENTS.md b/AGENTS.md
index 3f2864d..0497d48 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -93,6 +93,10 @@ IP address info cached for 4 hours, seen aggregate report IDs cached for 1 hour
 - Token file writes must create parent directories before opening for write.
 - Store natively numeric values as numbers, not pre-formatted strings. Example: ASN is stored as `int 15169`, not `"AS15169"`; Elasticsearch / OpenSearch mappings for such fields use `Integer()` so consumers can do range queries and numeric sorts. Display layers format with a prefix at render time.
 
+## Local dev secrets
+
+If a config file is listed in `.gitignore`, treat its contents as secret. Do not paste its literal values into any tracked file — READMEs, docs, code comments, commit messages, PR descriptions, sample/test fixtures. Reference the variable name (e.g. `$SOME_PASSWORD`) or show a placeholder (`...`) instead, and tell the reader to pick their own values. This is both a real-leak hedge and a way to keep secret scanners (GitHub secret scanning, push protection, third-party scanners) from firing false positives on the repo. Defer to `.gitignore` as the source of truth on what's secret — the rule applies to any gitignored config file the project ever adds, not just the ones present today (currently `.env` and `parsedmarc*.ini`).
+
 ## Editing tracked data files
 
 Before rewriting a tracked list/data file from freshly-generated content (anything under `parsedmarc/resources/maps/`, CSVs, `.txt` lists), check the existing file first — `git show HEAD: | wc -l`, `git log -1 -- `, `git diff --stat`. Files like `known_unknown_base_reverse_dns.txt` and `base_reverse_dns_map.csv` accumulate manually-curated entries across many sessions, and a "fresh" regeneration that drops the row count is almost certainly destroying prior work. If the new content is meant to *add* rather than *replace*, use a merge/append pattern. Treat any unexpected row-count drop in the pending diff as a red flag.
diff --git a/dashboards/README.md b/dashboards/README.md
new file mode 100644
index 0000000..1ccd6f9
--- /dev/null
+++ b/dashboards/README.md
@@ -0,0 +1,97 @@
+# Dashboard development
+
+This directory holds the dashboard sources that ship with parsedmarc:
+
+- [opensearch/opensearch_dashboards.ndjson](opensearch/opensearch_dashboards.ndjson) — the source-of-truth saved-objects export. It is imported into both **OpenSearch Dashboards** and **Kibana** (the file format is compatible with both).
+- [grafana/Grafana-DMARC_Reports.json](grafana/Grafana-DMARC_Reports.json) — the Grafana dashboard, with two Elasticsearch datasources (`dmarc-ag`, `dmarc-fo`).
+- [splunk/](splunk/) — three Splunk dashboard XML views (`dmarc_aggregate`, `dmarc_forensic`, `smtp_tls`).
+
+Edits to any of these files should be exported from a running instance after authoring the change in the UI, not hand-edited (with the occasional exception of small XML tweaks for Splunk).
+
+## The dev stack
+
+[docker-compose.dashboard-dev.yml](../docker-compose.dashboard-dev.yml) brings up every viz target at once so a single dashboard change can be authored and re-exported across all four UIs in one session. It `include:`s [docker-compose.yml](../docker-compose.yml) for the Elasticsearch and OpenSearch backends, then layers on Kibana, OpenSearch Dashboards, Grafana, and Splunk.
+
+| Service               | URL                                              | Credentials                                            |
+| --------------------- | ------------------------------------------------ | ------------------------------------------------------ |
+| Elasticsearch         | http://localhost:9200                            | (security disabled)                                    |
+| OpenSearch            | https://localhost:9201                           | `admin` / `$OPENSEARCH_INITIAL_ADMIN_PASSWORD`         |
+| Kibana                | http://localhost:5601                            | (security disabled)                                    |
+| OpenSearch Dashboards | http://localhost:5602                            | `admin` / `$OPENSEARCH_INITIAL_ADMIN_PASSWORD`         |
+| Grafana               | http://localhost:3000                            | `admin` / `$GRAFANA_PASSWORD`                          |
+| Splunk Web / HEC      | http://localhost:8000 / https://localhost:8088   | `admin` / `$SPLUNK_PASSWORD`, HEC token `$SPLUNK_HEC_TOKEN` |
+
+All ports bind to `127.0.0.1` only.
+
+## Prerequisites
+
+1. Docker with the Compose v2 plugin.
+2. A repo-root `.env` defining the secrets the compose file references:
+
+   ```ini
+   OPENSEARCH_INITIAL_ADMIN_PASSWORD=...
+   SPLUNK_PASSWORD=...
+   SPLUNK_HEC_TOKEN=...
+   GRAFANA_PASSWORD=...
+   ```
+
+   Pick any values you like — these are local-only dev secrets. Both `.env` and `parsedmarc*.ini` are gitignored. The matching values must also appear in [parsedmarc-dev.ini](../parsedmarc-dev.ini), which the bootstrap script feeds to the parsedmarc CLI for sample-data ingestion.
+3. The parsedmarc CLI on `PATH` (or in `./venv/bin/`) — `pip install -e .[build]` from the repo root works. Override the lookup with `PARSEDMARC_BIN=/path/to/parsedmarc` if needed.
+
+## One-shot bootstrap
+
+[dashboard-dev-bootstrap.sh](../dashboard-dev-bootstrap.sh) is the normal entry point. It is idempotent — re-run it any time:
+
+```bash
+./dashboard-dev-bootstrap.sh
+```
+
+It does, in order:
+
+1. `docker compose -f docker-compose.dashboard-dev.yml up -d` and waits for every service's health endpoint.
+2. Provisions Splunk: creates the `email` index, creates the `DMARC` app, configures the auto-created HEC token to allow the `email` index, and scopes the search-app's "scheduled export" announcement view away from `global` so it stops appearing in the DMARC app's dashboard list.
+3. Seeds Elasticsearch, OpenSearch, and Splunk with parsedmarc-parsed sample reports (from [samples/](../samples/)) so the dashboards render against real data. Skipped when ES already has aggregate docs — pass `RESEED=1` to wipe and re-seed all three backends.
+4. Imports the dashboard files from this directory into the running services. This step always runs, so the typical edit loop is **edit in the UI → export → save into this directory → re-run the bootstrap script** to verify the file imports cleanly into a fresh service.
+
+VS Code users can run this via the **Dev Dashboard: Bootstrap** task in [.vscode/tasks.json](../.vscode/tasks.json). **Dev Dashboard: Up** brings the stack up without importing or seeding.
+
+## Editing a dashboard
+
+After running the bootstrap script once, the round trip for each platform is:
+
+### OpenSearch Dashboards (and Kibana)
+
+1. Edit the dashboard at http://localhost:5602/ (OpenSearch Dashboards) — this is the canonical authoring surface.
+2. **Stack Management → Saved Objects → Export**, select the DMARC dashboard, include related objects, and save the resulting `.ndjson` over [opensearch/opensearch_dashboards.ndjson](opensearch/opensearch_dashboards.ndjson).
+3. Re-run `./dashboard-dev-bootstrap.sh` to confirm it re-imports cleanly into both OSD and Kibana. The Kibana CI workflow ([.github/workflows/dashboards.yml](../.github/workflows/dashboards.yml)) also imports the same file on every PR that touches it.
+
+OSD imports default to the `global_tenant` so other admins on the instance can see the result. Set `OSD_TENANT=...` to import elsewhere.
+
+### Grafana
+
+1. Edit the dashboard at http://localhost:3000/.
+2. **Dashboard settings → JSON Model**, copy the JSON, save it to [grafana/Grafana-DMARC_Reports.json](grafana/Grafana-DMARC_Reports.json).
+3. Re-run the bootstrap script.
+
+The bootstrap script provisions two `elasticsearch` datasources (`dmarc-ag` for `dmarc_aggregate*`, `dmarc-fo` for `dmarc_forensic*`) on first run; existing datasources are left alone.
+
+### Splunk
+
+1. Edit the dashboard at http://localhost:8000/ inside the **DMARC** app.
+2. Open the dashboard's **Source** view, copy the XML, and paste it over the matching file in [splunk/](splunk/) (`dmarc_aggregate_dashboard.xml`, `dmarc_forensic_dashboard.xml`, or `smtp_tls_dashboard.xml`).
+3. Re-run the bootstrap script. It re-imports each view via `DELETE` + `POST` to the splunkd management API.
+
+## Reseeding sample data
+
+```bash
+RESEED=1 ./dashboard-dev-bootstrap.sh
+```
+
+Wipes every `dmarc_aggregate*` / `dmarc_forensic*` / `smtp_tls*` index from ES and OS, drops and recreates the Splunk `email` index, then re-runs the parsedmarc CLI against the curated sample list. Use this after changing parsedmarc's enrichment or output schemas.
+
+## Tearing the stack down
+
+```bash
+docker compose -f docker-compose.dashboard-dev.yml down # stop containers, keep volumes
+docker compose -f docker-compose.dashboard-dev.yml down -v # also drop volumes (full reset)
+```
diff --git a/dashboards/grafana/README.rst b/dashboards/grafana/README.md
similarity index 67%
rename from dashboards/grafana/README.rst
rename to dashboards/grafana/README.md
index 8dfd4fd..a4e93d3 100644
--- a/dashboards/grafana/README.rst
+++ b/dashboards/grafana/README.md
@@ -1 +1,3 @@
+# Grafana dashboards
+
 Dashboards contributed by Github user Bhozar.
diff --git a/dashboards/splunk/README.rst b/dashboards/splunk/README.rst
deleted file mode 100644
index acf7bce..0000000
--- a/dashboards/splunk/README.rst
+++ /dev/null
@@ -1,87 +0,0 @@
-===================
-Splunk Installation
-===================
-
-Install Splunk for use with Docker
-----------------------------------
-
-Download latest Splunk image::
-
-    docker pull splunk/splunk:latest
-
-Run Splunk with Docker
-----------------------
-
-Listen on all network interfaces::
-
-    docker run -d -p 8000:8000 -p 8088:8088 -e "SPLUNK_START_ARGS=--accept-license" -e "SPLUNK_PASSWORD=password1234" -e "SPLUNK_HEC_TOKEN=hec-token-1234" --name splunk splunk/splunk:latest
-
-Listen on localhost for use with reverse proxy with base URL ``/splunk``::
-
-    docker run -d -p 127.0.0.1:8000:8000 -p 127.0.0.1:8088:8088 -e "SPLUNK_START_ARGS=--accept-license" -e "SPLUNK_PASSWORD=password1234" -e "SPLUNK_HEC_TOKEN=hec-token-1234" -e "SPLUNK_ROOT_ENDPOINT=/splunk" --name splunk splunk/splunk:latest
-
-Set up reverse proxy, e.g. Apache2::
-
-    ProxyPass /splunk http://127.0.0.1:8000/splunk
-    ProxyPassReverse /splunk http://127.0.0.1:8000/splunk
-
-Splunk Configuration
---------------------
-
-Access web UI at http://127.0.0.1:8000 and log in with ``admin:password1234``.
-
-Create App and Index
-~~~~~~~~~~~~~~~~~~~~
-
-- Settings > Data > Indexes: New Index
-
-  - Index name: "email"
-
-- HEC token ``hec-token-1234`` should be already set up.
-
-  - Check under Settings > Data > Data inputs: HTTP Event Collector
-
-- Apps > Manage Apps: Create app
-
-  - Name: "parsedmarc"
-  - Folder name: "parsedmarc"
-
-Create Dashboards
-~~~~~~~~~~~~~~~~~
-
-1. Navigate to the app you want to add the dashboards to, or create a new app called DMARC
-2. Click Dashboards
-3. Click Create New Dashboard
-4. Use a descriptive title, such as "Aggregate DMARC Data"
-5. Click Create Dashboard
-6. Click on the Source button
-7. Paste the content of ''dmarc_aggregate_dashboard.xml`` into the source editor
-8. If the index storing the DMARC data is not named email, replace index="email" accordingly
-9. Click Save
-10. Click Dashboards
-11. Click Create New Dashboard
-12. Use a descriptive title, such as "Forensic DMARC Data"
-13. Click Create Dashboard
-14. Click on the Source button
-15. Paste the content of ''dmarc_forensic_dashboard.xml`` into the source editor
-16. If the index storing the DMARC data is not named email, replace index="email" accordingly
-17. Click Save
-
-==============
-Example Config
-==============
-
-parsedmarc.ini::
-
-    [splunk_hec]
-    url = https://127.0.0.1:8088/
-    token = hec-token-1234
-    index = email
-    skip_certificate_verification = True
-
-Note that ``skip_certificate_verification = True`` disables security checks.
-
-Run parsedmarc::
-
-    python3 -m parsedmarc.cli -c parsedmarc.ini
-