SparCA: Sparse Compressed Agglomeration for Feature Extraction and Dimensionality Reduction

01/26/2023
by   Leland Barnard, et al.
0

The most effective dimensionality reduction procedures produce interpretable features from the raw input space while also providing good performance for downstream supervised learning tasks. For many methods, this requires optimizing one or more hyperparameters for a specific task, which can limit generalizability. In this study we propose sparse compressed agglomeration (SparCA), a novel dimensionality reduction procedure that involves a multistep hierarchical feature grouping, compression, and feature selection process. We demonstrate the characteristics and performance of the SparCA method across heterogenous synthetic and real-world datasets, including images, natural language, and single cell gene expression data. Our results show that SparCA is applicable to a wide range of data types, produces highly interpretable features, and shows compelling performance on downstream supervised learning tasks without the need for hyperparameter tuning.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset