Sean Whalen ca27428713 Add Google SecOps (Chronicle) UDM parser for syslog output
A SecOps-side custom parser (CBN) that maps parsedmarc's [syslog] JSON
events to the Unified Data Model. No library changes: parsedmarc already
emits structured JSON, so the DMARC->UDM mapping lives in the parser and a
downstream UDM schema change is a parser edit, not a parsedmarc release.

Covers all three report types:
- aggregate -> EMAIL_TRANSACTION
- failure   -> EMAIL_TRANSACTION
- smtp_tls  -> GENERIC_EVENT (noun from policy_domain, present on every row)

Built strictly against the official UDM and parser-syntax docs (cited
inline). Sets metadata.event_timestamp from the report window via date{},
maps disposition / auth-failure to security_result with valid action and
category enums (AUTH_VIOLATION on DMARC fail), uses real network.email
field names, and strips syslog framing before JSON parsing. Ships real
sample events generated from the project's sample reports for validation.

Not yet validated against a live SecOps tenant; caveats are documented in
the README.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 09:24:20 -04:00


# Google SecOps (Chronicle) parser for parsedmarc
A [Google Security Operations](https://cloud.google.com/security/products/security-operations)
custom parser (configuration-based normalizer / CBN) that maps the JSON events
parsedmarc emits through its built-in `[syslog]` output to the Unified Data
Model (UDM).

This is a **SecOps-side parser only**: it requires no changes to parsedmarc.
parsedmarc already ships structured JSON over syslog; the DMARC→UDM mapping
lives here so that a downstream UDM schema change is a parser edit rather than a
parsedmarc release.
## Status
> [!IMPORTANT]
> This parser was written strictly against the official Google documentation
> linked at the bottom of this file, but it has **not yet been validated against
> a live SecOps tenant**. Before using it in production, paste it into the SecOps
> parser-validation tool and confirm each sample event below parses and that the
> assertions in [Caveats](#caveats) hold. Please report fixes back to the
> [parsedmarc](https://github.com/domainaware/parsedmarc) project.
## Supported report types
parsedmarc emits three flat JSON shapes (one object per syslog line). The parser
detects them by a field unique to each and maps them as follows:
| parsedmarc report | Detected by | UDM `metadata.event_type` |
|---|---|---|
| DMARC aggregate | `adkim` | `EMAIL_TRANSACTION` |
| DMARC failure | `feedback_type` | `EMAIL_TRANSACTION` |
| SMTP TLS (RFC 8460) | `policy_type` | `GENERIC_EVENT` |

`EMAIL_TRANSACTION` and `GENERIC_EVENT` are both valid `metadata.event_type`
values. Note that **`GENERIC_EVENT` events only appear in raw-log and UDM
search**, not in the curated SecOps views — that is the documented behaviour for
generic events, and it is why SMTP TLS reports surface differently from the two
DMARC types.
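
The detection rule in the table can be sketched in Python for offline testing of sample events. This is only an illustration of the lookup order, not the parser's actual CBN logic; the `classify` function and its return labels are hypothetical names.

```python
# Detection fields copied from the table above. The function name and the
# short report-type labels are hypothetical, used only for illustration.
DETECTORS = (
    ("adkim", "aggregate", "EMAIL_TRANSACTION"),
    ("feedback_type", "failure", "EMAIL_TRANSACTION"),
    ("policy_type", "smtp_tls", "GENERIC_EVENT"),
)


def classify(event: dict):
    """Return (report_type, udm_event_type), or None if no marker field is present."""
    for field, report_type, event_type in DETECTORS:
        if field in event:
            return report_type, event_type
    return None


print(classify({"adkim": "r", "aspf": "r"}))
print(classify({"feedback_type": "auth-failure"}))
print(classify({"policy_type": "sts"}))
```

Events that carry none of the three marker fields return `None` here; how the real parser handles such input should be confirmed in the validation tool.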
## Caveats
1. **Unvalidated** — see [Status](#status).
2. **Boolean coercion** — parsedmarc emits `dmarc_aligned`, `spf_aligned`,
`dkim_aligned`, `testing`, and `normalized_timespan` as JSON booleans. The
parser assumes the `json{}` filter exposes them as the strings `"true"` /
`"false"` (the CBN convention) and compares them as such. The
security-relevant consequence to confirm in the validation tool: a DMARC-fail record
(`dmarc_aligned=false`) must receive `security_result.category =
AUTH_VIOLATION`.
3. **Aggregate count** — a DMARC aggregate record summarizes `count` messages
from one source IP, not a single message. Each record becomes one
`EMAIL_TRANSACTION` with `count` carried in `additional.fields`. There is no
first-class per-message expansion (fanning out `count` copies would
misrepresent the data).
4. **Address format** — aggregate reports only carry the From *domain*, so
`network.email.from` holds a bare domain for aggregate events but a full
address for failure events. UDM email-address fields are expected to be
`local-mailbox@domain`; downstream consumers should account for the
aggregate-domain case.
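
To make caveat 2 concrete, the following Python sketch shows the check that has to hold whichever way the boolean is exposed. It accepts both a JSON `false` and the string `"false"`; the helper names are hypothetical, and the real comparison lives in the CBN parser, not in Python.

```python
def is_false(value) -> bool:
    """True for JSON false or the string "false" (case-insensitive)."""
    if isinstance(value, bool):
        return value is False
    return isinstance(value, str) and value.strip().lower() == "false"


def category_for(event: dict):
    """Return the UDM category for a DMARC-fail record, else None."""
    return "AUTH_VIOLATION" if is_false(event.get("dmarc_aligned")) else None


print(category_for({"dmarc_aligned": False}))    # JSON boolean form
print(category_for({"dmarc_aligned": "false"}))  # string form assumed by the parser
print(category_for({"dmarc_aligned": True}))
```

An empty string or a missing field is deliberately not treated as a failure in this sketch, since the source events use `""` for absent values.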
## UDM field mappings
All UDM field names below are from the
[UDM field list](https://cloud.google.com/chronicle/docs/reference/udm-field-list)
and [SecurityResult reference](https://cloud.google.com/chronicle/docs/reference/rest/v1alpha/SecurityResult).
### DMARC aggregate → `EMAIL_TRANSACTION`
| parsedmarc field | UDM field |
|---|---|
| `begin_date` | `metadata.event_timestamp` (via `date{}`) |
| `report_id` | `metadata.product_log_id` |
| `source_ip_address` | `principal.ip` |
| `source_reverse_dns` | `principal.hostname` |
| `source_country` | `principal.location.country_or_region` |
| `domain` | `target.hostname` |
| `header_from` | `network.email.from` (domain; see caveat 4) |
| `disposition` | `security_result.action` (`none` → `ALLOW`, `quarantine` → `QUARANTINE`, `reject` → `BLOCK`) |
| `dmarc_aligned=false` | `security_result.category = AUTH_VIOLATION` |
| `org_name`, `org_email`, `count`, `p`, `sp`, `np`, `pct`, `fo`, `adkim`, `aspf`, `testing`, `discovery_method`, `normalized_timespan`, `*_aligned`, `dkim_*`, `spf_*`, `policy_override_*`, `source_base_domain`, `source_name`, `source_type`, `source_asn`, `source_as_name`, `source_as_domain`, `envelope_from`, `envelope_to` | `additional.fields` |
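
As a rough cross-check of the table above, this Python sketch builds a UDM-shaped dictionary from one aggregate record. The nested layout is illustrative only and is not the UDM wire format; the function name is hypothetical, and the `UNKNOWN_ACTION` fallback for unmapped dispositions is an assumption rather than documented parser behaviour.

```python
# Value mapping copied from the `disposition` row above.
DISPOSITION_TO_ACTION = {"none": "ALLOW", "quarantine": "QUARANTINE", "reject": "BLOCK"}


def aggregate_to_udm(event: dict) -> dict:
    """Map one parsedmarc aggregate record to an illustrative UDM-shaped dict."""
    # Assumption: dispositions outside the table fall back to UNKNOWN_ACTION.
    action = DISPOSITION_TO_ACTION.get(event.get("disposition"), "UNKNOWN_ACTION")
    security_result = {"action": action}

    # Accept both the JSON boolean and the string form (see caveat 2).
    aligned = event.get("dmarc_aligned")
    if aligned is False or aligned == "false":
        security_result["category"] = "AUTH_VIOLATION"

    return {
        "metadata": {
            "event_type": "EMAIL_TRANSACTION",
            "product_log_id": event.get("report_id"),
        },
        "principal": {"ip": event.get("source_ip_address")},
        "target": {"hostname": event.get("domain")},
        "network": {"email": {"from": event.get("header_from")}},
        "security_result": security_result,
        # One record summarizes `count` messages, so the count is carried along
        # rather than expanded into per-message events (see caveat 3).
        "additional": {"fields": {"count": event.get("count")}},
    }
```

Feeding it the fields of the DMARC-fail sample event (`disposition` of `none`, `dmarc_aligned` of `false`) should yield an `ALLOW` action together with the `AUTH_VIOLATION` category, which is the combination to look for in the validation tool.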
### DMARC failure → `EMAIL_TRANSACTION`
| parsedmarc field | UDM field |
|---|---|
| `arrival_date_utc` | `metadata.event_timestamp` (via `date{}`) |
| `message_id` | `metadata.product_log_id`, `network.email.mail_id` |
| `source_ip_address` | `principal.ip` |
| `source_reverse_dns` | `principal.hostname` |
| `source_country` | `principal.location.country_or_region` |
| `reported_domain` | `target.hostname` |
| `original_mail_from` | `network.email.from` |
| `original_rcpt_to` | `network.email.to` |
| `subject` | `network.email.subject` |
| `auth_failure` | `security_result.category = AUTH_VIOLATION` + description |
| `delivery_result` | `security_result.action` (`reject` → `BLOCK`, `quarantine` → `QUARANTINE`, `delivered` → `ALLOW`) |
| `feedback_type`, `authentication_results`, `authentication_mechanisms`, `user_agent`, `dkim_domain`, `arrival_date` | `additional.fields` |
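
The `delivery_result` row lists three values, but failure reports can carry others (the failure sample in this file has `policy`). The Python sketch below shows one way to reason about that case. It assumes unmapped values fall back to `UNKNOWN_ACTION`, which may not match what the parser does, and the function name is hypothetical.

```python
# Value mapping copied from the `delivery_result` row above.
DELIVERY_RESULT_TO_ACTION = {
    "reject": "BLOCK",
    "quarantine": "QUARANTINE",
    "delivered": "ALLOW",
}


def action_for_delivery_result(value) -> str:
    """Return the UDM action for a delivery_result value."""
    if not isinstance(value, str):
        # null / missing delivery_result
        return "UNKNOWN_ACTION"
    # Assumption: values outside the table map to UNKNOWN_ACTION.
    return DELIVERY_RESULT_TO_ACTION.get(value.strip().lower(), "UNKNOWN_ACTION")


print(action_for_delivery_result("reject"))
print(action_for_delivery_result("policy"))
```

When validating the failure sample, check which action the parser actually assigns to `policy` instead of relying on this fallback.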
### SMTP TLS → `GENERIC_EVENT`
| parsedmarc field | UDM field |
|---|---|
| `begin_date` | `metadata.event_timestamp` (ISO 8601, via `date{}`) |
| `report_id` | `metadata.product_log_id` |
| `policy_domain` | `target.hostname` (always present → the noun) |
| `receiving_ip` | `target.ip` (failure rows only) |
| `sending_mta_ip` | `principal.ip` (failure rows only) |
| `result_type` | `security_result` (`action=FAIL`, `category=POLICY_VIOLATION`) |
| `organization_name`, `policy_type`, `policy_strings`, `mx_host_patterns`, `successful_session_count`, `failed_session_count`, `failure_reason_code`, `receiving_mx_hostname`, `receiving_mx_helo`, `additional_info_uri` | `additional.fields` |
> parsedmarc emits SMTP TLS reports as separate rows: one **success** row per
> policy (counts, no MTA IPs) and one **failure** row per failure detail (which
> may also lack MTA IPs, e.g. `sts-policy-fetch-error`). The noun therefore comes
> from `policy_domain`, which is present on every row.
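
The three report types use two timestamp layouts: ISO 8601 with a `Z` suffix for SMTP TLS, and `YYYY-MM-DD HH:MM:SS` for the DMARC types. The Python sketch below parses both, as a way to sanity-check the values the `date{}` filter should produce. Treating the space-separated form as UTC is an assumption here, and the function name is hypothetical.

```python
from datetime import datetime, timezone

# Layouts seen in the sample events: SMTP TLS first, then the DMARC types.
FORMATS = ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d %H:%M:%S")


def parse_report_time(value: str) -> datetime:
    """Parse a parsedmarc timestamp into a timezone-aware datetime."""
    for fmt in FORMATS:
        try:
            # Assumption: timestamps without an explicit zone are UTC.
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognized timestamp format: {value!r}")


print(parse_report_time("2025-12-07T19:00:00Z").isoformat())
print(parse_report_time("2018-10-01 09:20:27").isoformat())
```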
## Installation
### 1. Configure parsedmarc syslog output
```ini
[syslog]
server = your-collector.example.com
port = 514
```
parsedmarc writes each report record as a single-line JSON message.
### 2. Collect the syslog stream into SecOps
Syslog is ingested by a **collector**, not a Feed. Run the
[Bindplane agent](https://cloud.google.com/chronicle/docs/install/install-forwarder)
(Google's recommended on-premises collector; the legacy Chronicle forwarder is
end-of-life) with a **Syslog** collector pointed at the port above, and assign it
a custom log type (for example `PARSEDMARC`).
### 3. Install this parser for that log type
Associate `parsedmarc.conf` with the custom log type via the SecOps parser
management UI or API (see
[Manage parsers](https://cloud.google.com/chronicle/docs/event-processing/manage-parser-updates)).
Validate against the sample events below before activating.
## Sample events for validation
These are **real** single-line outputs from parsedmarc's `[syslog]` serializers
(generated from the project's sample reports). Use them in the parser-validation
tool. A live syslog line will also carry a `<PRI>` prefix; the parser strips any
leading framing before the first `{`.
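
The framing rule can be reproduced in a few lines of Python when preparing test input outside SecOps. This is a minimal sketch of "drop everything before the first `{`, then parse JSON"; the function name and the example syslog header are hypothetical, and the parser itself does this in CBN syntax.

```python
import json


def parse_syslog_line(line: str) -> dict:
    """Strip any syslog framing before the first '{' and parse the JSON payload."""
    start = line.find("{")
    if start == -1:
        raise ValueError("no JSON object found in syslog line")
    return json.loads(line[start:])


# Hypothetical framing; a real collector may format the header differently.
line = '<14>Jun  4 09:24:20 mailhost parsedmarc: {"policy_type": "sts", "failed_session_count": 0}'
print(parse_syslog_line(line))
```

This approach relies on the header never containing a `{` character of its own.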
### Aggregate — DMARC fail (`dmarc_aligned=false`)
```json
{"xml_schema": "draft", "org_name": "accurateplastics.com", "org_email": "administrator@accurateplastics.com", "org_extra_contact_info": "", "report_id": "example.com:1538463741", "begin_date": "2018-10-01 17:07:12", "end_date": "2018-10-01 17:07:12", "normalized_timespan": false, "errors": "", "domain": "example.com", "adkim": "r", "aspf": "r", "p": "none", "sp": "reject", "np": "", "pct": "100", "fo": "", "testing": "", "discovery_method": "", "source_ip_address": "12.20.127.122", "source_country": "US", "source_reverse_dns": "", "source_base_domain": "", "source_name": "AT&T", "source_type": "ISP", "source_asn": 7018, "source_as_name": "AT&T Enterprises, LLC", "source_as_domain": "att.com", "count": 1, "spf_aligned": false, "dkim_aligned": false, "dmarc_aligned": false, "disposition": "none", "policy_override_reasons": "", "policy_override_comments": "", "envelope_from": "", "header_from": "example.com", "envelope_to": "", "dkim_domains": "", "dkim_selectors": "", "dkim_results": "", "spf_domains": "", "spf_scopes": "", "spf_results": ""}
```
### Aggregate — DMARC pass (`dmarc_aligned=true`)
```json
{"xml_schema": "1.0", "org_name": "example.org", "org_email": "noreply-dmarc-support@example.org", "org_extra_contact_info": "https://support.example.org/dmarc", "report_id": "20240125141224705995", "begin_date": "2024-01-25 05:12:24", "end_date": "2024-01-25 12:28:53", "normalized_timespan": false, "errors": "", "domain": "example.com", "adkim": "r", "aspf": "r", "p": "quarantine", "sp": "quarantine", "np": "", "pct": "100", "fo": "1", "testing": "", "discovery_method": "", "source_ip_address": "198.51.100.123", "source_country": "", "source_reverse_dns": "", "source_base_domain": "", "source_name": "", "source_type": "", "source_asn": "", "source_as_name": "", "source_as_domain": "", "count": 2, "spf_aligned": false, "dkim_aligned": true, "dmarc_aligned": true, "disposition": "none", "policy_override_reasons": "none", "policy_override_comments": "none", "envelope_from": "example.edu", "header_from": "example.com", "envelope_to": "example.net", "dkim_domains": "example.com", "dkim_selectors": "example", "dkim_results": "pass", "spf_domains": "example.edu", "spf_scopes": "mfrom", "spf_results": "pass"}
```
### Failure
```json
{"feedback_type": "auth-failure", "user_agent": "Lua/1.0", "version": "1.0", "original_mail_from": "sharepoint@domain.de", "original_rcpt_to": "peter.pan@domain.de", "arrival_date": "Mon, 01 Oct 2018 11:20:27 +0200", "message_id": "<38.E7.30937.BD6E1BB5@ mailrelay.de>", "authentication_results": "dmarc=fail (p=none, dis=none) header.from=domain.de", "delivery_result": "policy", "auth_failure": "dmarc", "reported_domain": "domain.de", "arrival_date_utc": "2018-10-01 09:20:27", "authentication_mechanisms": "", "original_envelope_id": null, "dkim_domain": null, "sample_headers_only": false, "source_ip_address": "10.10.10.10", "source_reverse_dns": null, "source_base_domain": null, "source_name": null, "source_type": null, "source_asn": null, "source_as_name": null, "source_as_domain": null, "source_country": null, "subject": "Subject"}
```
### SMTP TLS — success row (counts only)
```json
{"organization_name": "Synametrics Technologies, Inc.", "begin_date": "2025-12-07T19:00:00Z", "end_date": "2025-12-08T18:59:59Z", "report_id": "1765256572301+dmarc-reports.dengage.com", "policy_strings": "version: STSv1|mode: enforce|mx: mta1.inboxsys.net|mx: mta2.inboxsys.net|max_age: 86400", "policy_domain": "dmarc-reports.dengage.com", "policy_type": "sts", "successful_session_count": 2, "failed_session_count": 0}
```
### SMTP TLS — failure-detail row
```json
{"organization_name": "Mail.ru", "begin_date": "2024-02-22T00:00:00Z", "end_date": "2024-02-23T00:00:00Z", "report_id": "b28254de-7b2e-be36-bb5c-4c3b92da8b25@mail.ru", "result_type": "sts-policy-fetch-error", "failed_session_count": 1, "failure_reason_code": "bad https response code: 404"}
```
## Official references
- [Overview of the UDM](https://cloud.google.com/chronicle/docs/event-processing/udm-overview)
- [Overview of log parsing](https://cloud.google.com/chronicle/docs/event-processing/parsing-overview)
- [Parser syntax reference](https://cloud.google.com/chronicle/docs/reference/parser-syntax)
- [UDM field list](https://cloud.google.com/chronicle/docs/reference/udm-field-list)
- [SecurityResult reference](https://cloud.google.com/chronicle/docs/reference/rest/v1alpha/SecurityResult)
- [Feed management](https://cloud.google.com/chronicle/docs/administration/feed-management-overview)
## License
Distributed under the same license as [parsedmarc](https://github.com/domainaware/parsedmarc).