Visualizing class specific heterogeneous tendencies in categorical data

11/05/2018
by Mariko Takagishi, et al.

In multiple correspondence analysis, both individuals (observations) and categories can be represented in a biplot. This biplot jointly depicts the relationships between categories, the relationships between individuals, and the associations between individuals and categories. Adding information about the individuals can enhance interpretation. Such additional information can consist, for example, of a set of categorical variables whose interdependencies are not of immediate concern, but that might assist in interpreting the plot, particularly the relationships between individuals and categories. In this paper, we propose a new method for incorporating such additional information. We introduce a multiple set cluster correspondence analysis approach that finds class-specific clusters, where classes are defined as subsets of the data corresponding to the categories of the additional variables. Our method can be used to construct a biplot that visualizes heterogeneous tendencies of the individuals, as well as their relationships to the original categorical variables. We investigate the performance of the proposed method through a simulation study and apply it to a data set on road accidents in the United Kingdom.
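
To make the notion of a biplot concrete, the following is a minimal sketch of ordinary multiple correspondence analysis, computed from the indicator matrix with a singular value decomposition. It is not the multiple set cluster correspondence analysis proposed in the paper: it involves no clustering and no class information. The function names `indicator_matrix` and `mca` and the toy data are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def indicator_matrix(data):
    """One-hot encode each column of a 2-D array of category labels."""
    blocks = []
    for j in range(data.shape[1]):
        levels, codes = np.unique(data[:, j], return_inverse=True)
        blocks.append(np.eye(len(levels))[codes.ravel()])
    return np.hstack(blocks)


def mca(data, n_dims=2):
    """Plain MCA (illustrative only): principal coordinates for
    individuals (rows) and categories (columns) of the indicator matrix."""
    Z = indicator_matrix(data)
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses (individuals)
    c = P.sum(axis=0)                    # column masses (categories)
    # Standardized residuals from the independence model r c^T.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    rows = (U[:, :n_dims] * sv[:n_dims]) / np.sqrt(r)[:, None]
    cols = (Vt[:n_dims].T * sv[:n_dims]) / np.sqrt(c)[:, None]
    return rows, cols, r, c


# Hypothetical toy data: six individuals, two categorical variables.
toy = np.array([["a", "x"], ["a", "y"], ["b", "x"],
                ["b", "y"], ["a", "x"], ["c", "y"]])
rows, cols, r, c = mca(toy)
```

Plotting `rows` and `cols` in the same two-dimensional scatter gives a joint display of individuals and categories. This sketch uses principal coordinates for both sets, which is one common scaling convention among several.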
