Inferring multiple consensus trees and supertrees using clustering: a review
Phylogenetic trees (i.e. evolutionary trees, additive trees or X-trees) play a key role in modeling and representing species evolution. Genome evolution of a given group of species is usually modeled by a species phylogenetic tree that represents the main patterns of vertical descent. However, the evolution of each gene is unique: it can be represented by its own gene tree, which can differ substantially from the general species tree. Consensus trees and supertrees have been widely used in evolutionary studies to combine the phylogenetic information contained in individual gene trees. Nevertheless, if the available gene trees differ considerably from each other, the resulting consensus tree or supertree can either include many unresolved subtrees, corresponding to internal nodes of high degree, or simply be a star tree. This may happen if the available gene trees have been affected by different reticulate evolutionary events, such as horizontal gene transfer, hybridization or genetic recombination. Thus, the problem of inferring multiple alternative consensus trees or supertrees using clustering becomes relevant, since it allows one to group gene trees with similar evolutionary patterns into separate clusters (e.g. gene trees representing genes that have undergone the same horizontal gene transfer or recombination events). We critically review recent advances and methods in the field of phylogenetic tree clustering, discuss the methods' mathematical properties, and describe the main advantages and limitations of multiple consensus tree and supertree approaches. In the application section, we show how the multiple supertree clustering approach can be used to cluster aminoacyl-tRNA synthetase (aaRS) gene trees according to their evolutionary patterns.
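The star-tree problem described above can be illustrated with a minimal Python sketch. It is not the procedure of any specific method covered by the review. It assumes rooted trees encoded as sets of non-trivial clades, a clade-based Robinson-Foulds distance, a majority-rule consensus, and a deliberately naive greedy clustering with a hypothetical distance threshold. The five taxa and four gene trees are invented for illustration.

```python
from collections import Counter


def tree(*groups):
    """Encode a rooted tree as a frozenset of its non-trivial clades."""
    return frozenset(frozenset(g) for g in groups)


def rf_distance(tree_a, tree_b):
    """Clade-based Robinson-Foulds distance: clades found in exactly one tree."""
    return len(tree_a ^ tree_b)


def majority_consensus(trees):
    """Return the clades present in more than half of the input trees."""
    counts = Counter(clade for t in trees for clade in t)
    return {clade for clade, n in counts.items() if n > len(trees) / 2}


def greedy_cluster(trees, threshold):
    """Put each tree in the first cluster whose first member is within threshold."""
    clusters = []
    for t in trees:
        for cluster in clusters:
            if rf_distance(t, cluster[0]) <= threshold:
                cluster.append(t)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append([t])
    return clusters


# Hypothetical gene trees on taxa A..E following two conflicting patterns.
pattern_1 = tree("AB", "ABC", "DE")  # (((A,B),C),(D,E))
pattern_2 = tree("AD", "ADE", "BC")  # (((A,D),E),(B,C))
gene_trees = [pattern_1, pattern_1, pattern_2, pattern_2]

# A single consensus keeps no clade at all, which corresponds to a star tree.
single_consensus = majority_consensus(gene_trees)

# Clustering first, then building one consensus per cluster, keeps both patterns.
clusters = greedy_cluster(gene_trees, threshold=2)
cluster_consensuses = [majority_consensus(c) for c in clusters]
```

Here no clade is supported by more than half of the four trees, so the single consensus is empty, while each of the two clusters yields a fully resolved consensus. A practical analysis would replace the greedy step with a proper clustering algorithm and a criterion for choosing the number of clusters.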