Together or Alone: The Price of Privacy in Joint Learning
Machine learning is a widely used method for generating predictions, and these predictions become more accurate when the model is trained on a larger dataset. In practice, however, the data is often divided among different entities. To preserve privacy, training can be performed locally and the resulting models can be securely aggregated among the participants. Yet when the number of participants in the joint learning is small -- for example, only two data holders -- secure aggregation loses its power, since the output of the training already reveals substantial information about the other participant(s). To resolve this issue, the participants must employ privacy-preserving mechanisms, which inevitably reduce the accuracy of the model. In this paper we model the training process as a two-player game in which each player aims to achieve higher accuracy while preserving its privacy. We describe three privacy-preserving mechanisms and apply them to two real-world datasets. Furthermore, we develop a theoretical model for different player types, and for each we either find or prove the existence of a Nash equilibrium. Finally, to facilitate our analysis, we introduce the notion of Price of Privacy, a novel measure of the effect of privacy protection on model accuracy.
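The abstract does not state the formal definition of the Price of Privacy; a minimal sketch of one plausible formulation -- the relative accuracy lost when a privacy-preserving mechanism is applied -- is shown below. The function name, normalization, and example numbers are illustrative assumptions, not the paper's definition.

```python
# Hypothetical sketch of a Price-of-Privacy-style metric: the relative
# accuracy loss a participant suffers when training with privacy protection.
# The paper's exact definition may differ; this is only an illustration.

def price_of_privacy(acc_without_privacy: float, acc_with_privacy: float) -> float:
    """Relative accuracy loss caused by privacy protection (0 = no loss)."""
    if acc_without_privacy <= 0:
        raise ValueError("baseline accuracy must be positive")
    return (acc_without_privacy - acc_with_privacy) / acc_without_privacy

# Illustrative usage: a model whose accuracy drops from 0.90 to 0.81
# under a privacy mechanism pays a relative price of about 10%.
print(price_of_privacy(0.90, 0.81))
```

Under this formulation, a value of 0 means privacy protection is free, while values approaching 1 mean the privacy mechanism has destroyed nearly all of the model's predictive value.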