Tree-Projected Gradient Descent for Estimating Gradient-Sparse Parameters on Graphs
We study estimation of a gradient-sparse parameter vector θ^* ∈ ℝ^p, having strong gradient-sparsity s^* := ‖∇_G θ^*‖_0 on an underlying graph G. Given observations Z_1, …, Z_n and a smooth, convex loss function ℒ for which θ^* minimizes the population risk 𝔼[ℒ(θ; Z_1, …, Z_n)], we propose to estimate θ^* by a projected gradient descent algorithm that iteratively and approximately projects gradient steps onto spaces of vectors having small gradient-sparsity over low-degree spanning trees of G. We show that, under suitable restricted strong convexity and smoothness assumptions for the loss, the resulting estimator achieves the squared-error risk (s^*/n) log(1 + p/s^*) up to a multiplicative constant that is independent of G. In contrast, previous polynomial-time algorithms have only been shown to achieve this guarantee in more specialized settings, or under additional assumptions on G and/or the sparsity pattern of ∇_G θ^*. We apply our general framework to the examples of linear models and generalized linear models with random design.
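To make the iteration concrete, below is a minimal Python sketch of the projected gradient step in the simplest special case: a path graph, which is itself a spanning tree of maximum degree 2. On a path, the exact Euclidean projection onto vectors with at most s breakpoints (i.e. ‖∇_T θ‖_0 ≤ s) can be computed by the classic optimal-segmentation dynamic program. This is an illustration under these assumptions, not the paper's general approximate-projection routine for arbitrary G, and the helper names (project_path, tree_pgd_linear) are hypothetical.

```python
import numpy as np

def _segment_cost(prefix, prefix_sq, i, j):
    # Sum of squared deviations of v[i:j] around its mean, via prefix sums.
    n = j - i
    s = prefix[j] - prefix[i]
    return (prefix_sq[j] - prefix_sq[i]) - s * s / n

def project_path(v, s):
    # Exact Euclidean projection of v onto vectors that are piecewise constant
    # with at most s breakpoints on a path graph (||grad_T v||_0 <= s), via the
    # classic optimal-segmentation dynamic program, O(s p^2) time.
    p, k = len(v), s + 1
    prefix = np.concatenate(([0.0], np.cumsum(v)))
    prefix_sq = np.concatenate(([0.0], np.cumsum(v * v)))
    D = np.full((k + 1, p + 1), np.inf)   # D[m, j]: best cost of v[:j] in m segments
    arg = np.zeros((k + 1, p + 1), dtype=int)
    D[0, 0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, p + 1):
            for i in range(m - 1, j):
                c = D[m - 1, i] + _segment_cost(prefix, prefix_sq, i, j)
                if c < D[m, j]:
                    D[m, j], arg[m, j] = c, i
    # Backtrack and fill each segment with its mean (extra splits never hurt,
    # so using exactly k segments is optimal).
    out = np.empty(p)
    j = p
    for m in range(k, 0, -1):
        i = arg[m, j]
        out[i:j] = v[i:j].mean()
        j = i
    return out

def tree_pgd_linear(X, y, s, step, iters=100):
    # Projected gradient descent for the linear-model loss
    # L(theta) = ||y - X theta||^2 / (2n): gradient step, then projection
    # onto the set of gradient-s-sparse vectors on the path.
    n, p = X.shape
    theta = np.zeros(p)
    for _ in range(iters):
        grad = X.T @ (X @ theta - y) / n
        theta = project_path(theta - step * grad, s)
    return theta
```

A small synthetic usage example, with a piecewise-constant θ^* having s^* = 2 breakpoints:

```python
rng = np.random.default_rng(0)
n, p = 200, 60
theta_star = np.repeat([0.0, 2.0, -1.0], p // 3)   # s^* = 2 breakpoints
X = rng.standard_normal((n, p))
y = X @ theta_star + 0.5 * rng.standard_normal(n)
theta_hat = tree_pgd_linear(X, y, s=2, step=0.3)
```

On a general graph G the exact ℓ0-gradient projection is not tractable, which is why the paper's algorithm instead projects approximately over low-degree spanning trees of G; the path case above keeps the projection exact and the iteration transparent.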