Author Obfuscation Using Generalised Differential Privacy
The problem of obfuscating the authorship of a text document has received little attention in the literature to date. Current approaches are ad-hoc and rely on assumptions about an adversary's auxiliary knowledge which makes it difficult to reason about the privacy properties of these methods. Differential privacy is a well-known and robust privacy approach, but its reliance on the notion of adjacency between datasets has prevented its application to text document privacy. However, generalised differential privacy permits the application of differential privacy to arbitrary datasets endowed with a metric and has been demonstrated on problems involving the release of individual data points. In this paper we show how to apply generalised differential privacy to author obfuscation by utilising existing tools and methods from the stylometry and natural language processing literature.
READ FULL TEXT