Almost Linear Constant-Factor Sketching for ℓ_1 and Logistic Regression

03/31/2023
by   Alexander Munteanu, et al.
0

We improve upon previous oblivious sketching and turnstile streaming results for ℓ_1 and logistic regression, giving a much smaller sketching dimension achieving O(1)-approximation and yielding an efficient optimization problem in the sketch space. Namely, we achieve for any constant c>0 a sketching dimension of Õ(d^1+c) for ℓ_1 regression and Õ(μ d^1+c) for logistic regression, where μ is a standard measure that captures the complexity of compressing the data. For ℓ_1-regression our sketching dimension is near-linear and improves previous work which either required Ω(log d)-approximation with this sketching dimension, or required a larger poly(d) number of rows. Similarly, for logistic regression previous work had worse poly(μ d) factors in its sketching dimension. We also give a tradeoff that yields a 1+ε approximation in input sparsity time by increasing the total size to (dlog(n)/ε)^O(1/ε) for ℓ_1 and to (μ dlog(n)/ε)^O(1/ε) for logistic regression. Finally, we show that our sketch can be extended to approximate a regularized version of logistic regression where the data-dependent regularizer corresponds to the variance of the individual logistic losses.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset