Large-Scale Classification using Multinomial Regression and ADMM

01/27/2019
by Samy Wu Fung, et al.

We present a novel method for learning the weights in multinomial logistic regression based on the alternating direction method of multipliers (ADMM). In each iteration, our algorithm decomposes the training into three steps: a linear least-squares problem for the weights, a global variable update involving a separable cross-entropy loss function, and a trivial dual variable update. The least-squares problem can be factorized in the off-line phase, and the separability of the global variable update allows for efficient parallelization, leading to faster convergence. We compare our method with stochastic gradient descent for linear classification as well as for transfer learning, and show that the proposed ADMM-Softmax leads to improved generalization and convergence.
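The three steps described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the splitting Z = XW, the scaled dual variable U, the Tikhonov regularizer `alpha`, the penalty parameter `rho`, and all function and variable names are assumptions made for illustration. The per-sample global variable update is solved here with a few plain gradient steps, chosen only for brevity; the paper may use a different inner solver.

```python
import numpy as np


def softmax(Z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)


def admm_softmax(X, C, rho=1.0, alpha=1e-2, iters=50, inner=20):
    """Illustrative ADMM for multinomial logistic regression.

    X: (n, d) features, C: (n, k) one-hot labels.
    Assumed problem: min_W  sum_i CE(x_i W, c_i) + alpha/2 ||W||_F^2,
    split as Z = X W with scaled dual variable U.
    """
    n, d = X.shape
    k = C.shape[1]
    Z = np.zeros((n, k))
    U = np.zeros((n, k))
    W = np.zeros((d, k))

    # Off-line phase: factorize the least-squares system matrix once.
    L = np.linalg.cholesky(rho * X.T @ X + alpha * np.eye(d))
    step = 1.0 / (1.0 + rho)  # safe step size for the inner gradient steps

    for _ in range(iters):
        # Step 1: least-squares problem for the weights (reuses the factor L).
        rhs = rho * X.T @ (Z + U)
        W = np.linalg.solve(L.T, np.linalg.solve(L, rhs))

        # Step 2: global variable update. The objective separates over rows,
        # so every sample could be handled in parallel; here it is vectorized.
        XW = X @ W
        V = XW - U
        for _ in range(inner):
            grad = softmax(Z) - C + rho * (Z - V)
            Z = Z - step * grad

        # Step 3: dual variable update.
        U = U + Z - XW

    return W
```

In this sketch the Cholesky factor is computed once and reused in every outer iteration, which corresponds to the off-line factorization mentioned in the abstract; predictions for new samples would be taken as the row-wise argmax of `X @ W`.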
