Protecting Sensory Data against Sensitive Inferences
There is growing concern about how personal data are used when users grant applications direct access to the sensors in their mobile devices. For example, time-series data generated by motion sensors directly reflect users' activities and indirectly their personalities. It is therefore important to design privacy-preserving data analysis methods that can run on mobile devices. In this paper, we propose a feature learning architecture that can be deployed in distributed environments to provide flexible and negotiable privacy-preserving data transmission. It is flexible because the internal architecture of each component can be changed independently according to the needs of users or service providers. It is negotiable because the expected privacy and utility can be negotiated based on the requirements of the data subject and of the underlying application. For the specific use case of activity recognition, we conducted experiments on two real-world datasets of smartphone motion-sensor data, one of which was collected by the authors and is made publicly available with this paper for the first time. Results indicate that the proposed framework establishes a good trade-off between the application's utility and the data subjects' privacy: it maintains the usefulness of the transformed data for activity recognition (with an average loss of around three percentage points) while almost eliminating the possibility of gender classification (from more than 90% accuracy to around 50%, the level of random guessing). These results also have implications for moving from the current binary setting of granting an application permission or not, toward a situation where users can grant each application permission over a limited range of inferences, according to the services provided.
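The abstract does not specify the training objective, but one plausible instantiation of such a utility/privacy trade-off is adversarial feature learning with a gradient-reversal layer: an encoder is trained so that its output stays useful for activity recognition while a gender classifier trained on the same output fails. The sketch below is illustrative only; the layer sizes, class counts, the `lam` weight, and the use of PyTorch are all assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) gradients backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class PrivacyEncoder(nn.Module):
    """Transforms raw motion-sensor windows of shape (batch, channels, time)."""
    def __init__(self, in_ch=6, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, feat_dim, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )

    def forward(self, x):
        return self.net(x)

encoder = PrivacyEncoder()
activity_head = nn.Linear(64, 4)  # utility task: 4 activity classes (assumed)
gender_head = nn.Linear(64, 2)    # sensitive attribute the adversary targets

params = (list(encoder.parameters()) + list(activity_head.parameters())
          + list(gender_head.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)
ce = nn.CrossEntropyLoss()

def train_step(x, y_act, y_gen, lam=1.0):
    """One update: keep activity recoverable, make gender unrecoverable."""
    z = encoder(x)
    loss_act = ce(activity_head(z), y_act)
    # Gradient reversal: the gender head itself learns to classify, but the
    # encoder receives the negated gradient, pushing it to erase gender cues.
    loss_gen = ce(gender_head(GradReverse.apply(z, lam)), y_gen)
    loss = loss_act + loss_gen
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss_act.item(), loss_gen.item()

# Example with random stand-in data: 8 windows of 6-axis motion data.
x = torch.randn(8, 6, 128)
y_act = torch.randint(0, 4, (8,))
y_gen = torch.randint(0, 2, (8,))
print(train_step(x, y_act, y_gen))
```

In a setup like this, the weight `lam` is one place where the "negotiable" trade-off could live: lowering it favors the application's utility, while raising it prioritizes the data subject's privacy.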