DS-MLR: Exploiting Double Separability for Scaling up Distributed Multinomial Logistic Regression
Scaling multinomial logistic regression to datasets with a very large number of data points and classes is not trivial. This is primarily because the log-partition function must be computed for every data point, which makes distributing the computation hard. In this paper, we present DS-MLR, a distributed optimization method based on stochastic gradient descent, for scaling multinomial logistic regression to very large data. Our algorithm exploits double-separability, an attractive property that we observe in the objective functions of several machine learning models, which allows us to achieve data and model parallelism simultaneously. In addition to being parallelizable, our algorithm can easily be made asynchronous. We demonstrate the effectiveness of our method empirically on several real-world datasets, for instance a Reddit dataset with data and parameter sizes of 200 GB and 300 GB, respectively.
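
To make the bottleneck concrete, the following is a minimal sketch of the standard multinomial logistic regression loss, not of the DS-MLR algorithm itself. The function names (`log_partition`, `mlr_nll`) and the list-of-lists data layout are illustrative assumptions and do not come from the paper. The point to notice is that the log-partition term for a single data point sums over the weight vectors of all classes, so the model cannot simply be split across workers without extra communication.

```python
import math


def log_partition(weights, x):
    """Return log(sum_k exp(w_k . x)) using the max-shift trick for stability.

    `weights` holds one weight vector per class, so this term touches
    every class's parameters for every single data point.
    """
    scores = [sum(w_j * x_j for w_j, x_j in zip(w_k, x)) for w_k in weights]
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))


def mlr_nll(weights, xs, ys):
    """Average negative log-likelihood of multinomial logistic regression."""
    total = 0.0
    for x, y in zip(xs, ys):
        # Score of the true class for this data point.
        score_y = sum(w_j * x_j for w_j, x_j in zip(weights[y], x))
        # Per-point loss: log-partition minus the true-class score.
        total += log_partition(weights, x) - score_y
    return total / len(xs)
```

With all weights set to zero, every class is equally likely, so the loss equals the logarithm of the number of classes, which is a convenient sanity check for a sketch like this.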