Model-based Clustering with Sparse Covariance Matrices

11/21/2017
by Michael Fop, et al.

Finite Gaussian mixture models are widely used for model-based clustering of continuous data. Nevertheless, they can suffer from overparameterization. Thus, parsimonious models have been developed via covariance matrix decompositions or by assuming local independence. However, these remedies do not allow for direct estimation of sparse covariance matrices, nor do they take into account that the structure of association among the variables can vary from one cluster to another. To this end, we introduce mixtures of Gaussian covariance graph models for model-based clustering with sparse covariance matrices. A penalized likelihood approach is employed for estimation, and a general penalty term on the graph configurations can be used to induce different levels of sparsity and incorporate prior knowledge. Optimization is carried out using a genetic algorithm in conjunction with a structural-EM algorithm for parameter and graph structure estimation. With this approach, sparse component covariance matrices are directly obtained. The framework results in a parsimonious model-based clustering of the data via a flexible model for the within-group joint distribution of the variables. Extensive simulated data experiments and applications to illustrative datasets show that the proposed method attains good classification performance and model quality.
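
As a rough illustration of the penalized likelihood idea described above, the following sketch evaluates a Gaussian mixture log-likelihood and subtracts a penalty on the component graphs. It is not the authors' implementation: the function names (`gaussian_logpdf`, `respects_graph`, `penalized_loglik`) are hypothetical, and the penalty used here, `lam` times the total number of edges, is only one assumed instance of the "general penalty term on the graph configurations". The genetic algorithm and structural-EM steps are not shown.

```python
import numpy as np

def gaussian_logpdf(X, mean, cov):
    """Log-density of N(mean, cov) evaluated at each row of X."""
    d = X.shape[1]
    chol = np.linalg.cholesky(cov)              # cov = chol @ chol.T
    z = np.linalg.solve(chol, (X - mean).T)     # whitened residuals, shape (d, n)
    maha = np.sum(z ** 2, axis=0)               # squared Mahalanobis distances
    logdet = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def respects_graph(cov, graph, tol=1e-12):
    """True if every off-diagonal covariance without an edge in `graph` is zero."""
    off_diag = ~np.eye(cov.shape[0], dtype=bool)
    missing = off_diag & (np.asarray(graph) == 0)
    return bool(np.all(np.abs(cov[missing]) <= tol))

def penalized_loglik(X, weights, means, covs, graphs, lam):
    """Mixture log-likelihood minus lam * (total edge count).

    Assumption: the edge-count penalty is an illustrative choice only.
    Each graph is a symmetric 0/1 adjacency matrix for one component.
    """
    comp = np.stack(
        [np.log(w) + gaussian_logpdf(X, m, S)
         for w, m, S in zip(weights, means, covs)],
        axis=1,
    )
    # Log-sum-exp over components for numerical stability.
    mx = comp.max(axis=1, keepdims=True)
    loglik = np.sum(mx[:, 0] + np.log(np.exp(comp - mx).sum(axis=1)))
    n_edges = sum(int(np.triu(np.asarray(G), k=1).sum()) for G in graphs)
    return loglik - lam * n_edges
```

A larger `lam` favours graphs with fewer edges, and therefore sparser component covariance matrices; in a search over graph configurations, a check such as `respects_graph` could be used to confirm that each fitted covariance matrix has zeros wherever its graph has no edge.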


research
07/20/2023

Sparse model-based clustering of three-way data via lasso-type penalties

Mixtures of matrix Gaussian distributions provide a probabilistic framew...
research
05/17/2021

Group-wise shrinkage for multiclass Gaussian Graphical Models

Gaussian Graphical Models are widely employed for modelling dependence a...
research
03/27/2023

Regularized EM algorithm

Expectation-Maximization (EM) algorithm is a widely used iterative algor...
research
09/10/2020

A Family of Mixture Models for Biclustering

Biclustering is used for simultaneous clustering of the observations and...
research
02/22/2023

Improving Model Choice in Classification: An Approach Based on Clustering of Covariance Matrices

This work introduces a refinement of the Parsimonious Model for fitting ...
research
06/01/2018

Model-based clustering for populations of networks

We propose a model-based clustering method for populations of networks t...
research
02/21/2008

Penalized model-based clustering with cluster-specific diagonal covariance matrices and grouped variables

Clustering analysis is one of the most widely used statistical tools in ...
