Efficient Private Storage of Sparse Machine Learning Data
We consider the problem of maintaining sparsity in private distributed storage of confidential machine learning data. In many applications, e.g., face recognition, the data used in machine learning algorithms is represented by sparse matrices which can be stored and processed efficiently. However, mechanisms maintaining perfect information-theoretic privacy require encoding the sparse matrices into randomized dense matrices. It has been shown that, under some restrictions on the storage nodes, sparsity can be maintained at the expense of relaxing the perfect information-theoretic privacy requirement, i.e., allowing some information leakage. In this work, we lift the restrictions imposed on the storage nodes and show that there exists a trade-off between sparsity and the achievable privacy guarantees. We focus on the setting of non-colluding nodes and construct a coding scheme that encodes the sparse input matrices into matrices with the desired sparsity level while limiting the information leakage.
READ FULL TEXT