Generating Stereotypes Automatically For Complex Categorical Features
In the context of stereotype creation for recommender systems, we found that certain types of categorical variables pose particular challenges when simple clustering procedures are employed to create stereotypes. A categorical variable is defined as complex when it cannot be easily translated into a numerical variable, when the semantics of the categories potentially play an important role in the optimal determination of stereotypes, and when it is multi-choice (i.e., each item can be labelled with one or more applicable categories, in a number that is not predefined). The main objective of this paper is to analyse whether a viable recommender system can operate on stereotypes generated directly from the feature's metadata similarities, without using ratings information at the time the classes are generated. Encouraging results on an integrated MovieLens and IMDb data set show that the proposed algorithm performs better than other categorical clustering algorithms, such as k-modes, when clustering complex categorical features. Notably, the representation of complex categorical features can help to alleviate cold-start issues in recommender systems.
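To make the notion of a multi-choice categorical feature concrete, the following minimal sketch represents each item as a set of labels and compares items using metadata alone, with no ratings. The toy movie data, the `jaccard_distance` helper, and the choice of Jaccard distance are all illustrative assumptions and are not taken from the paper; the proposed algorithm may use a different similarity measure.

```python
from itertools import combinations

# Hypothetical toy data: each movie carries a variable number of genre labels,
# which is what makes the feature "multi-choice".
movies = {
    "A": {"Action", "Sci-Fi"},
    "B": {"Action", "Sci-Fi", "Thriller"},
    "C": {"Romance", "Comedy"},
    "D": {"Comedy"},
}


def jaccard_distance(a: set, b: set) -> float:
    """Return 1 - |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Pairwise distances computed from label sets only (assumed measure: Jaccard).
distances = {
    (i, j): jaccard_distance(movies[i], movies[j])
    for i, j in combinations(sorted(movies), 2)
}

for pair, dist in distances.items():
    print(pair, round(dist, 3))
```

A distance matrix of this kind could be passed to any clustering procedure that accepts precomputed distances. Note that it treats all labels as equally unrelated, so it does not capture the semantics of the categories that the abstract highlights as important.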