Don't Show Me Yours, I Won't Show You Mine: Security Research with Non-Public Data

Presented at HotSec 2015 by

In recent years, papers at top security conferences have increasingly relied on non-public data, such as passwords, telemetry, or other confidential data from inside universities and corporations. This model has both important benefits and important risks. The benefits include access to real-world data that could not be obtained any other way and larger-scale experiments than would otherwise be possible. The risks include disclosure of users' private data, difficulty of reproduction, and limitations on who has the access and connections to conduct this kind of work, among many others. Despite the risks, this kind of research is not going away anytime soon. In this session, we will discuss, as case studies, several recent examples of research on proprietary data and how the data was obtained and protected. We will discuss when this model is or is not appropriate, how proprietary data can be properly protected, and whether and how we can promote as much reproducibility as possible in this situation. We will also discuss what the best practices are (or should be) for researchers considering a study of non-public data. Our hope is to spark a broader discussion in the community about sharing data responsibly and using non-public data sets in security research. Note: Please fill out this short survey before the session. Doing so in advance will allow us to incorporate your responses into the discussion.