Processing Intelligence Feeds with Open Source Software

Presented at FIRST 2014 by

After almost three years at AusCERT, he moved to Japan to work for JPCERT/CC, the Japanese national CSIRT. This led to tasks as varied as phishing site monitoring system development, dashboard development, international training, malware analysis, and sensor network visualisations. After four years and a great amount of cultural broadening in Japan, he moved back to Australia to start CSIRT Foundry.

For incident response teams, gathering and processing event data from open source intelligence feeds is crucial for getting external, expert perspectives on their constituents' networks. Teams need a system which:

* Automatically gathers open source or private feeds provided by others;
* Adjusts the data in the feeds to have standard field names and data formats;
* Filters out unwanted data (e.g. only keeps country / ASN specific data);
* Stores the data in a searchable, scalable way;
* Allows searches and trend analysis over long periods of time;
* Has an attractive web interface, allowing an analyst to make on-demand reports and visualisations;
* Has an API to allow easy data export into other tools such as RTIR;
* Is freely open for modification and use by IR teams with limited budgets.

In this presentation, we explain that the software to achieve these goals already exists - we just need a little glue to put the pieces together. We will present various open source tools (Abusehelper, ContactDB, Logstash, Elasticsearch, Kibana, and IFAS) with demonstrations of their capabilities. Attendees will take away knowledge of how to start using each of these pieces of software, as well as an easy method for integrating them all together (IFAS). Finally, participants will be shown an "install wizard" way of quickly setting up the IFAS open source feed processing system we outline. This talk will especially help newer CSIRTs looking to build or extend their capability.
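To make the "standard field names and data formats" and "filter out unwanted data" requirements concrete, here is a minimal Python sketch. It is not taken from Abusehelper or IFAS. The alias table, the target field names (`source_ip`, `source_asn`, `source_country`, `observed_at`) and the helper names `harmonise` and `wanted` are all assumptions chosen for illustration.

```python
from datetime import datetime, timezone

# Hypothetical mapping from feed-specific field names to one standard schema.
FIELD_ALIASES = {
    "ip": "source_ip",
    "src_ip": "source_ip",
    "address": "source_ip",
    "asn": "source_asn",
    "as_number": "source_asn",
    "cc": "source_country",
    "country": "source_country",
    "timestamp": "observed_at",
    "first_seen": "observed_at",
}


def harmonise(raw_event):
    """Rename known fields and normalise value formats; unknown fields are kept."""
    event = {}
    for key, value in raw_event.items():
        name = key.strip().lower()
        name = FIELD_ALIASES.get(name, name)
        event[name] = value.strip() if isinstance(value, str) else value

    if "source_asn" in event:
        # Accept "AS64500", "as64500" or 64500 and store a plain integer.
        asn = str(event["source_asn"]).upper()
        if asn.startswith("AS"):
            asn = asn[2:]
        event["source_asn"] = int(asn)

    if "source_country" in event:
        event["source_country"] = str(event["source_country"]).upper()

    if isinstance(event.get("observed_at"), (int, float)):
        # Convert Unix epoch seconds to an ISO 8601 UTC string.
        event["observed_at"] = datetime.fromtimestamp(
            event["observed_at"], tz=timezone.utc
        ).isoformat()
    return event


def wanted(event, countries=frozenset(), asns=frozenset()):
    """Keep an event only if it matches a constituency country or ASN."""
    return (
        event.get("source_country") in countries
        or event.get("source_asn") in asns
    )


raw = {"IP": " 192.0.2.10 ", "ASN": "AS64500", "cc": "au", "timestamp": 0}
event = harmonise(raw)
print(event)
print(wanted(event, countries={"AU"}))
```

A team would typically keep the alias table per feed rather than globally, since the same source field name can mean different things in different feeds.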
Overview
-------

* Problem
  * Teams need widespread awareness of incident activity
    * Must be fully automated and reliable
  * Many feeds available, all in different formats
    * Getting everyone to standardise today is a hard problem - many long-standing feeds have formats that cannot be easily changed
    * Feed gathering must be fully automated, flexible, and resilient on failure
  * Once we have all this data, how do we use it?
    * Large volumes of data
    * Must be quickly searchable
    * Must be able to report / visualise
    * Must be capable of extending with future logs and new formats
  * Solutions exist, but...
    * Splunk: an excellent product, but expensive
      * Out of budget for small / developing teams
      * Requires teams to write extraction rules for many feed types
      * Does not foster uninhibited collaborative development - divides the Splunk "haves" and "have-nots"
* Solution
  * Build something we can all use
    * Open source all the way
    * Can share and collaborate using the same building blocks
  * Introducing IFAS: Information Feed Analysis System
    * Take best-in-class open source tools, and integrate them together to tackle the problems above
    * Gets a CSIRT from no event feed processing to gathering, storage, processing and reporting quickly, an essential step in becoming a mature CSIRT
    * Use them pre-integrated and configured with IFAS, or mix and match tools as you like
    * Transparent and well documented API for data exchange
  * Data harmonisation: gather feeds and ensure all are harmonised
    * Collect and analyse feeds
    * Enrich data - Geo IP resolution, Whois lookups
    * Harmonise field names and formats
    * Deduplicate events
    * Write out as logs
    * IFAS uses: Abusehelper
  * Log parsing
    * Need Splunk-like log tailing and transform functionality
    * Then, need to put processed log event data into a datastore
    * IFAS uses: Logstash
  * Log storage
    * Don't know what future event types we might encounter, so want a flexible schema
    * Must be able to search quickly
    * Must be able to store at scale
    * IFAS uses: Elasticsearch
  * Event reporting
    * Find the best reliable contact to report the incidents automatically:
      * ContactDB
  * Log analysis
    * Need an analyst interface to enable:
      * Operational searches
        * Splunk-like ad-hoc searches for strings, IP addresses in event history
        * IFAS uses: Kibana
      * Analytical searches
        * Reporting on how incidents are trending, e.g. worst performing ISPs for phishing sites last year
        * IFAS uses: IFAS Reporter (Django web application)
  * Need a way to set up and integrate all these tools for a team with few resources
    * Some of these tools have complex setup, and need to be customised heavily before use
    * IFAS automated install: installs Abusehelper, Logstash, Kibana, IFAS Reporter as a single-box solution
    * The IFAS bundle sets up all of the above with sane defaults, ready to collect feeds
    * May be customised per-site
* Takeaways
  * Tools already exist to achieve open source feed gathering, parsing, normalisation, storage, and analysis
  * Use any of these excellent tools separately to achieve a particular goal, or IFAS for a turnkey solution integrating all of them for you
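
The "deduplicate events", "write out as logs" and "how incidents are trending" steps in the outline can be sketched in plain Python. This is an illustrative stand-in only: it does not use the Abusehelper, Logstash or Elasticsearch APIs, and the function names, the dedup key fields and the event fields (`source_ip`, `source_asn`, `type`, `observed_at`) are assumptions.

```python
import json
import sys
from collections import Counter


def deduplicate(events, key_fields=("source_ip", "type", "observed_at")):
    """Yield each event once, keyed on a tuple of harmonised fields."""
    seen = set()
    for event in events:
        key = tuple(event.get(field) for field in key_fields)
        if key not in seen:
            seen.add(key)
            yield event


def write_log(events, stream):
    """Write one JSON object per line, a format a log shipper can tail."""
    for event in events:
        stream.write(json.dumps(event, sort_keys=True) + "\n")


def worst_asns(events, event_type, top=3):
    """Count events of one type per ASN, most frequent first."""
    counts = Counter(
        event["source_asn"]
        for event in events
        if event.get("type") == event_type and "source_asn" in event
    )
    return counts.most_common(top)


# Made-up sample events using documentation IP ranges and private-use ASNs.
events = [
    {"source_ip": "192.0.2.1", "source_asn": 64500, "type": "phishing",
     "observed_at": "2014-01-01T00:00:00+00:00"},
    {"source_ip": "192.0.2.1", "source_asn": 64500, "type": "phishing",
     "observed_at": "2014-01-01T00:00:00+00:00"},  # exact duplicate
    {"source_ip": "192.0.2.2", "source_asn": 64500, "type": "phishing",
     "observed_at": "2014-01-02T00:00:00+00:00"},
    {"source_ip": "198.51.100.7", "source_asn": 64501, "type": "phishing",
     "observed_at": "2014-01-03T00:00:00+00:00"},
    {"source_ip": "203.0.113.9", "source_asn": 64502, "type": "malware",
     "observed_at": "2014-01-03T00:00:00+00:00"},
]

unique = list(deduplicate(events))
write_log(unique, sys.stdout)
print(worst_asns(unique, "phishing"))
```

In a deployment at scale, the in-memory `seen` set and `Counter` would give way to the datastore's own facilities; the sketch only shows the shape of the data passing between stages.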