Fix JSON type handling and pre-json field init in SecOps parser

Two CBN behaviors, confirmed against Google's own "How to parse JSON data"
guide (statedump shows JSON true/199 retaining boolean/integer type) and the
published Corelight production parser:

1. The json{} filter preserves the original JSON type, so parsedmarc's boolean
   *_aligned / testing / normalized_timespan and numeric count / *_session_count
   / source_asn would never match string comparisons. Add a mutate{convert} step
   turning them into strings before any == "true"/"false" test or %{...} use.

2. CBN raises _failed_parsing_ when an `if [field]` references a field absent
   from the log, and most detection/mapping fields are absent in 2 of the 3
   report shapes (or null within one). Initialize every conditionally-checked
   field to "" before the json{} filter.
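
The two steps above can be sketched for a single field roughly as follows.
This is a minimal illustration, not the parser changed in this commit: it
assumes `dmarc_aligned` is a top-level key, that the raw log is in `message`,
and the `on_error` flag names and the `category` scratch field are
hypothetical.

```
filter {
  # Step 2: pre-initialize, so a later conditional never references
  # a field that is absent from this report shape.
  mutate {
    replace => {
      "dmarc_aligned" => ""
    }
  }

  # json{} keeps the original JSON type: dmarc_aligned is a boolean here.
  json {
    source => "message"
    # hypothetical flag name
    on_error => "not_json"
  }

  # Step 1: coerce to string before any comparison or %{...} use.
  mutate {
    convert => {
      "dmarc_aligned" => "string"
    }
    # hypothetical flag name
    on_error => "dmarc_aligned_not_converted"
  }

  if [dmarc_aligned] == "false" {
    # hypothetical scratch field; the UDM mapping itself is omitted
    mutate {
      replace => {
        "category" => "AUTH_VIOLATION"
      }
    }
  }
}
```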

Without these, DMARC-fail records would not be categorized AUTH_VIOLATION and
aggregate/TLS reports could fail parsing outright. README caveat and PR
validation steps updated accordingly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sean Whalen
2026-06-04 10:22:02 -04:00
parent 784e3050bd
commit 2d9a2a2a8f
2 changed files with 127 additions and 17 deletions
+11 -7
@@ -40,13 +40,17 @@ DMARC types.
## Caveats
1. **Unvalidated** — see [Status](#status).
-2. **Boolean coercion** — parsedmarc emits `dmarc_aligned`, `spf_aligned`,
-`dkim_aligned`, `testing`, and `normalized_timespan` as JSON booleans. The
-parser assumes the `json{}` filter exposes them as the strings `"true"` /
-`"false"` (the CBN convention) and compares them as such. The security-
-relevant consequence to confirm in the validation tool: a DMARC-fail record
-(`dmarc_aligned=false`) must receive `security_result.category =
-AUTH_VIOLATION`.
+2. **JSON type handling** — parsedmarc emits `dmarc_aligned` / `spf_aligned` /
+`dkim_aligned` / `testing` / `normalized_timespan` as JSON booleans and
+`count` / `*_session_count` / `source_asn` as numbers. Chronicle's `json{}`
+filter **preserves the original JSON type**, so the parser explicitly
+converts these to strings (`mutate { convert => { … => "string" } }`) before
+any comparison — otherwise `[dmarc_aligned] == "false"` would never match.
+Relatedly, every field tested in an `if` is initialized to `""` *before* the
+`json` filter, because CBN raises `_failed_parsing_` on a conditional that
+references a field absent from the log. A DMARC-fail record
+(`dmarc_aligned=false`) should yield `security_result.category =
+AUTH_VIOLATION` — still worth confirming in the validation tool.
3. **Aggregate count** — a DMARC aggregate record summarizes `count` messages
from one source IP, not a single message. Each record becomes one
`EMAIL_TRANSACTION` with `count` carried in `additional.fields`. There is no