Differentially Private Generative Adversarial Networks for Time Series, Continuous, and Discrete Open Data
Open data plays a fundamental role in the 21th century by stimulating economic growth and by enabling more transparent and inclusive societies. However, it is always difficult to create new high-quality datasets with the required privacy guarantees for many use cases. This paper aims at creating a framework for releasing new open data while protecting the individuality of the users through a strict definition of privacy called differential privacy. Unlike previous work, this paper provides a framework for privacy preserving data publishing that can be easily adapted to different use cases, from the generation of time-series to continuous data, and discrete data; no previous work has focused on the later class. Indeed, many use cases expose discrete data or at least a combination between categorical and numerical values. Thanks to the latest developments in deep learning and generative models, it is now possible to model rich-semantic data maintaining both the original distribution of the features and the correlations between them. The output of this framework is a deep network, namely a generator, able to create new data on demand. We demonstrate the efficiency of our approach on real datasets from the French public administration and classic benchmark datasets.
READ FULL TEXT