Privacy-preserving release of mobility data: a clean-slate approach
The quantity of mobility data available today is overwhelming, providing tremendous potential for various value-added services. While the benefits of these mobility datasets are apparent, they also pose a significant threat to location privacy. Although a multitude of anonymization schemes have been proposed to release location data, they all suffer from the inherent sparseness and high dimensionality of location trajectories, which render most techniques inapplicable in practice. In this paper, we revisit the problem of releasing location trajectories with strong privacy guarantees. We propose a general approach to synthesize location trajectories while providing differential privacy. We model the generator distribution of the dataset by first constructing a model to generate the source and destination locations of trajectories along with time information, and then computing all transition probabilities between close locations given the destination of the synthetic trajectory. Finally, an optimization algorithm is used to find the most probable trajectory between the given source and destination at a given time using the computed transition probabilities. We exploit several inherent properties of location data to boost the performance of our model, and demonstrate its usability on a public location dataset. We also develop a novel composition of generative neural networks to synthesize location trajectories, which might be of independent interest.
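The final step, finding the most probable trajectory between a source and a destination from transition probabilities, can be illustrated with a minimal sketch. The abstract does not name the optimization algorithm, so the shortest-path search below (Dijkstra over negative log-probabilities) is an assumption, not necessarily the authors' method. The function `most_probable_path`, the table `toy_transitions`, and the location labels are hypothetical. The table stands in for transition probabilities already conditioned on one fixed destination and time, and the sketch leaves out the differentially private noise entirely.

```python
import heapq
import math


def most_probable_path(transitions, source, destination):
    """Return (path, probability) maximizing the product of transition probabilities.

    `transitions` maps a location to {neighbor: probability}. It is assumed
    to be already conditioned on the destination (hypothetical layout).
    """
    # Maximizing a product of probabilities is the same as minimizing
    # the sum of -log(p), so a shortest-path search applies.
    best_cost = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == destination:
            break
        for neighbor, prob in transitions.get(node, {}).items():
            if prob <= 0.0:
                continue  # impossible transition, skip it
            new_cost = cost - math.log(prob)
            if new_cost < best_cost.get(neighbor, math.inf):
                best_cost[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if destination not in best_cost:
        return None, 0.0  # destination unreachable
    # Walk the predecessor links back from the destination.
    path = [destination]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path, math.exp(-best_cost[destination])


# Toy transition table between four made-up locations.
toy_transitions = {
    "A": {"B": 0.6, "C": 0.4},
    "B": {"D": 0.5, "C": 0.5},
    "C": {"D": 0.9, "A": 0.1},
    "D": {},
}
path, prob = most_probable_path(toy_transitions, "A", "D")
print(path, round(prob, 2))  # ['A', 'C', 'D'] 0.36
```

In this toy table the route through C wins (0.4 × 0.9 = 0.36) even though the first hop to B is individually more likely (0.6 × 0.5 = 0.30), which is why a global search is used rather than greedily following the largest transition probability.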