Towards a Zero-One Law for Entrywise Low Rank Approximation
There are a number of approximation algorithms for NP-hard versions of low rank approximation, such as finding a rank-k matrix B minimizing the sum of absolute values of differences to a given matrix A, min_{rank-k B} ‖A-B‖_1, or more generally finding a rank-k matrix B which minimizes the sum of p-th powers of absolute values of differences, min_{rank-k B} ‖A-B‖_p^p. Many of these algorithms are linear time column subset selection algorithms, returning a subset of poly(k log n) columns whose cost is no more than a poly(k) factor larger than the cost of the best rank-k matrix. The above error measures are special cases of the following general entrywise low rank approximation problem: given an arbitrary function g: R → R_≥0, find a rank-k matrix B which minimizes ‖A-B‖_g = ∑_{i,j} g(A_{i,j} - B_{i,j}). A natural question is: which functions g admit efficient approximation algorithms? Indeed, this is a central question of recent work studying generalized low rank models. We give approximation algorithms for every function g which is approximately monotone and satisfies an approximate triangle inequality, and we show both of these conditions are necessary. Further, our algorithm is efficient if the function g admits an efficient approximate regression algorithm. Our algorithm handles functions which are not even scale-invariant, such as the Huber loss function. Our algorithms have a poly(k) approximation ratio in general. We further improve the approximation ratio to (1+ϵ) for ℓ_1 loss when the entries of the error matrix are i.i.d. random variables drawn from a distribution μ whose (1+γ)-th moment exists, for an arbitrarily small constant γ>0. We also show that our moment assumption is necessary.
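To make the objective concrete, here is a minimal Python sketch (not the paper's algorithm) that evaluates the entrywise cost ‖A-B‖_g = ∑_{i,j} g(A_{i,j} - B_{i,j}) for the Huber loss, using a truncated SVD as the rank-k candidate B. The truncated SVD minimizes the Frobenius (g(x) = x²) error rather than the Huber cost, so it serves only as a convenient baseline; the helper names huber, g_cost, and truncated_svd are illustrative and do not appear in the paper.

```python
import numpy as np

def huber(x, delta=1.0):
    """Huber loss: quadratic near 0, linear in the tails (not scale-invariant)."""
    ax = np.abs(x)
    return np.where(ax <= delta, 0.5 * ax**2, delta * (ax - 0.5 * delta))

def g_cost(A, B, g):
    """Entrywise cost ||A - B||_g = sum over (i,j) of g(A_ij - B_ij)."""
    return g(A - B).sum()

def truncated_svd(A, k):
    """Best rank-k approximation in Frobenius norm (Eckart-Young baseline)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

# Toy data: a dense matrix with a few sparse outliers, where robust losses
# such as Huber or l_1 behave very differently from the Frobenius norm.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 40)) + 5.0 * (rng.random((50, 40)) < 0.02)
B = truncated_svd(A, k=3)
print("Huber cost of rank-3 SVD candidate:", g_cost(A, B, huber))
```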