Heterogeneous Matrix Factorization: When Features Differ by Datasets
In myriad statistical applications, data are collected from related but heterogeneous sources. These sources share some commonalities while also containing idiosyncratic characteristics. More specifically, consider the setting where observation matrices from N sources, $\{M_i\}_{i=1}^N$, are generated from a few common factors and a few source-specific factors. Is it possible to recover the shared and source-specific factors? We show that under appropriate conditions on the alignment of source-specific factors, the problem is well-defined and both shared and source-specific factors are identifiable under a constrained matrix factorization objective. To solve this objective, we propose a new class of matrix factorization algorithms, called Heterogeneous Matrix Factorization (HMF). HMF is easy to implement, enjoys local linear convergence under suitable assumptions, and is intrinsically distributed. Through a variety of empirical studies, we showcase the advantageous properties of HMF and its potential applications in feature extraction and change detection.
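To make the setting concrete, the display below sketches one plausible way to write the generative model and the constrained objective. The abstract gives neither in closed form, so every symbol other than $M_i$ and $N$ is an assumption introduced here for illustration: the shared factor $U_g$, the source-specific factors $U_{i,l}$, the coefficient matrices $V_{i,g}$ and $V_{i,l}$, the noise term $E_i$, and in particular the orthogonality constraint.

```latex
% Hypothetical formalization; not taken from the abstract.
% U_g      : factors shared by all N sources (assumed notation)
% U_{i,l}  : factors specific to source i (assumed notation)
% V_{i,g}, V_{i,l} : per-source coefficient matrices (assumed notation)
% E_i      : noise in source i (assumed)
\[
  M_i = U_g V_{i,g}^{\top} + U_{i,l} V_{i,l}^{\top} + E_i,
  \qquad i = 1, \dots, N.
\]
% One possible constrained objective: least-squares fit with the shared and
% source-specific column spaces kept orthogonal (an assumed constraint).
\[
  \min_{U_g,\, \{V_{i,g},\, U_{i,l},\, V_{i,l}\}_{i=1}^{N}}
  \; \sum_{i=1}^{N}
  \bigl\| M_i - U_g V_{i,g}^{\top} - U_{i,l} V_{i,l}^{\top} \bigr\|_F^2
  \quad \text{subject to} \quad
  U_g^{\top} U_{i,l} = 0, \quad i = 1, \dots, N.
\]
```

Under a formulation of this kind, each source only needs its own $M_i$ to update its source-specific terms, which is consistent with the abstract's description of the method as intrinsically distributed. Some constraint separating the two sets of factors is needed because, without one, any shared component could be absorbed into the source-specific terms and the decomposition would not be unique.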