Mapping the Privacy-Utility Tradeoff in Mobile Phone Data for Development

Today's age of data holds high potential to enhance the way we pursue and monitor progress in the fields of development and humanitarian action. We study the relation between data utility and privacy risk in large-scale behavioral data, focusing on mobile phone metadata as a paradigmatic domain. To measure utility, we survey experts about the value of mobile phone metadata at various spatial and temporal granularity levels. To measure privacy, we propose a formal and intuitive measure of reidentification risk, the information ratio, and compute it at each granularity level. Our results confirm the existence of a stark tradeoff between data utility and reidentifiability, where the most valuable datasets are also the most prone to reidentification. When data is specified at ZIP-code and hourly levels, outside knowledge of only 7% of a person's data suffices for retrieval of the remaining 93%. When data is specified at municipality and daily levels, reidentification requires on average outside knowledge of 51% of a person's data to retrieve the remaining 49%. Our findings show that coarsening data directly erodes its value, and highlight the need for using data-coarsening, not as a stand-alone mechanism, but in combination with data-sharing models that provide adjustable degrees of accountability and security.
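To make the idea of "outside knowledge needed for reidentification" concrete, the toy sketch below counts how many of a person's points must be revealed before no other trace matches them, first at fine and then at coarse granularity. It is a simplified illustration, not the paper's information ratio, whose formal definition is not given in this abstract. The `(location, hour)` schema, the `points_needed` and `coarsen` helpers, and all of the data are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[str, int]  # (location label, time bin); hypothetical schema


def points_needed(target: str, traces: Dict[str, List[Point]]) -> Optional[int]:
    """Reveal the target's points in order until only the target matches them all.

    Returns the number of points revealed, or None if the target never
    becomes unique among the traces.
    """
    candidates = set(traces)
    for k, point in enumerate(traces[target], start=1):
        # Keep only users whose trace contains the newly revealed point.
        candidates = {u for u in candidates if point in traces[u]}
        if candidates == {target}:
            return k
    return None


def coarsen(traces: Dict[str, List[Point]],
            zip_to_city: Dict[str, str],
            hours_per_bin: int = 24) -> Dict[str, List[Point]]:
    """Map (ZIP code, hour) points to (municipality, day) points."""
    return {
        user: [(zip_to_city[z], h // hours_per_bin) for z, h in pts]
        for user, pts in traces.items()
    }


# Invented toy data: three users, three points each, at ZIP/hour granularity.
fine = {
    "a": [("z1", 8), ("z2", 9), ("z3", 30)],
    "b": [("z1", 8), ("z4", 9), ("z3", 30)],
    "c": [("z5", 8), ("z2", 10), ("z6", 31)],
}
zip_to_city = {"z1": "M1", "z2": "M1", "z4": "M1", "z5": "M1",
               "z3": "M2", "z6": "M2"}
coarse = coarsen(fine, zip_to_city)

k_fine = points_needed("a", fine)      # points needed at ZIP/hour level
k_coarse = points_needed("a", coarse)  # points needed at municipality/day level
```

In this toy data, user "a" is singled out after two of three points at the fine level, while at the coarse level all three traces collapse to the same sequence and "a" never becomes unique. Dividing the returned count by the trace length gives a known-fraction figure in the spirit of the percentages quoted above, under the stated assumptions.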
