Distributed Summary Statistics with Bro

Presented at Flocon 2014 by

When analyzing network traffic, a number of questions have historically been too difficult to answer, not just in real time but also in post-hoc analysis. As the amount of traffic has increased, so too has the complexity of queries such as "Which of my machines have talked to the largest number of unique external IPs?", "Which HTTP POST requests had the largest increase in volume from yesterday?", or "Which users had the highest number of failed DNS queries?" While the bottleneck for these types of queries has traditionally been their high memory cost, a new class of probabilistic algorithms can summarize billions of elements in a couple of kilobytes of RAM. Because these algorithms are also mergeable, the calculations can be distributed and scaled to the 100-gigabit level and beyond. Designed specifically to summarize huge datasets, Bro summary statistics offer a fresh view of network activity. With this new framework, answering these types of questions requires only a few lines of Bro scripting, while making use of Bro's advanced protocol analysis to summarize anything from layer 2 through layer 7. This talk is an example-heavy look at the new framework, designed to teach attendees how to write such scripts. No previous Bro experience is necessary. Users are encouraged to bring questions about their networks that they have been dying to answer but have never before been able to calculate.
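
To make the "mergeable" property concrete, here is a minimal sketch in Python of one well-known algorithm of this class, HyperLogLog. The abstract does not say which algorithms the framework uses, so treat the choice of HyperLogLog, the class name `HyperLogLog`, the 4096-register size, and the synthetic addresses as assumptions made for illustration, not as Bro's implementation. Two simulated workers each summarize their own share of traffic, and the two sketches are then combined without revisiting the raw data.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog cardinality sketch (not Bro's code)."""

    def __init__(self, p=12):
        self.p = p                          # number of bits used as register index
        self.m = 1 << p                     # number of registers (4096 for p=12)
        self.registers = bytearray(self.m)  # one byte per register, about 4 KB

    def add(self, item):
        # Derive a 64-bit hash of the item from SHA-1.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")
        idx = x >> (64 - self.p)                  # top p bits select a register
        rest = x & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1 bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # Merging is an element-wise maximum over the registers.
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        merged = HyperLogLog(self.p)
        merged.registers = bytearray(
            max(a, b) for a, b in zip(self.registers, other.registers)
        )
        return merged

    def estimate(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)          # bias correction, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)        # linear counting for small sets
        return raw


def synthetic_ip(i):
    # Hypothetical address generator, used only to produce distinct strings.
    return f"198.{(i >> 16) & 255}.{(i >> 8) & 255}.{i & 255}"


# Two workers observe overlapping sets of external addresses.
worker_a = HyperLogLog()
worker_b = HyperLogLog()
for i in range(60000):
    worker_a.add(synthetic_ip(i))
for i in range(40000, 100000):
    worker_b.add(synthetic_ip(i))

# A manager combines the two small sketches; the union holds 100000 distinct addresses.
merged = worker_a.merge(worker_b)
print(f"worker A estimate: {worker_a.estimate():.0f}")
print(f"worker B estimate: {worker_b.estimate():.0f}")
print(f"merged estimate:   {merged.estimate():.0f}")
```

Because merging is just an element-wise maximum, the merged sketch is identical to one built by a single process that saw all of the traffic, which is what lets the counting be spread across many workers. With 4096 registers the expected relative error is roughly 1.6%, and the sketch stays the same size no matter how many addresses are added.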