Identifying Malware Delivery, Port-Protocol Mismatches and Atypical Server Communications

Defenders who must protect networks that are new to them need to understand each network’s setup rapidly and to identify deviations from normal behavior quickly. While many networks follow conventional settings (e.g. DNS runs over port 53), any given network can be configured in a variety of ways. This presentation and the associated paper describe research on leveraging Apache Spark’s FPGrowth and association rule inference algorithms to identify strong categorical correlations, or “rules”, in enriched network data (e.g. Bro, YAF). The work then shows how the algorithms can rapidly identify violations of the induced rules, and how those violations can be filtered through purpose-built whitelists to surface the most security-relevant ones. This research is novel in that it helps analysts move beyond querying for “known knowns” in network traffic: the algorithms and workflow described here allow analysts to begin by identifying classes of atypical behavior and then prune those down to the ones they find most interesting. Running examples from (anonymized) enterprise data include identifying MIME-type and file-extension discrepancies, mismatched port/protocol combinations, and communications to/from servers over unusual ports. A set of extensions and other potential use cases is described as well.
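
The rule-induction, violation-detection and whitelisting workflow described above can be illustrated with a small sketch. This is not the authors' implementation, and it does not use FP-Growth: it is a brute-force, single-antecedent association rule miner in plain Python, chosen so the example runs without a Spark cluster. In Spark, the rule-induction step would presumably be handled by `pyspark.ml.fpm.FPGrowth` and the `associationRules` attribute of its fitted model. The flow records, field names, thresholds and whitelist entry below are all hypothetical.

```python
from collections import Counter
from itertools import permutations

# Hypothetical, hand-made flow records; field names are illustrative only.
FLOWS = (
    [{"service": "dns", "dst_port": "53"}] * 40
    + [{"service": "http", "dst_port": "80"}] * 30
    + [{"service": "ssh", "dst_port": "22"}] * 20
    + [{"service": "dns", "dst_port": "5353"}]
    + [{"service": "http", "dst_port": "8080"}] * 2
    + [{"service": "ssh", "dst_port": "80"}]
)


def to_items(record):
    """Encode one record as a transaction of 'field=value' items."""
    return frozenset(f"{k}={v}" for k, v in record.items())


def induce_rules(transactions, min_support=0.05, min_confidence=0.9):
    """Return (antecedent, consequent, confidence) for single-item rules."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        item_counts.update(t)
        # Ordered pairs, so both A -> B and B -> A are counted.
        pair_counts.update(permutations(sorted(t), 2))
    rules = []
    for (a, c), count in pair_counts.items():
        confidence = count / item_counts[a]
        if count / n >= min_support and confidence >= min_confidence:
            rules.append((a, c, confidence))
    return rules


def find_violations(transactions, rules):
    """A transaction violates A -> C if it contains A but not C."""
    return [
        (t, a, c)
        for t in transactions
        for a, c, _ in rules
        if a in t and c not in t
    ]


def apply_whitelist(violations, whitelist):
    """Drop violations whose transaction contains a whitelisted item set."""
    return [v for v in violations if not any(w <= v[0] for w in whitelist)]


transactions = [to_items(r) for r in FLOWS]
rules = induce_rules(transactions)
violations = find_violations(transactions, rules)

# Hypothetical whitelist: assume DNS on port 5353 is approved on this network.
WHITELIST = [frozenset({"service=dns", "dst_port=5353"})]
surfaced = apply_whitelist(violations, WHITELIST)

for t, a, c in surfaced:
    print(f"{sorted(t)} violates {a} -> {c}")
```

On this toy data the miner induces rules such as `service=dns -> dst_port=53`, and the flows that remain after whitelisting are HTTP on port 8080 and SSH on port 80. Restricting rules to one antecedent keeps the sketch short; a frequent-itemset algorithm such as FP-Growth is what makes multi-item antecedents tractable on real traffic volumes.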