The ℓ^∞-Cophenetic Metric for Phylogenetic Trees as an Interleaving Distance
There are many metrics available to compare phylogenetic trees since this is a fundamental task in computational biology. In this paper, we focus on one such metric, the ℓ^∞-cophenetic metric introduced by Cardona et al. This metric works by representing a phylogenetic tree with n labeled leaves as a point in R^n(n+1)/2 known as the cophenetic vector, then comparing the two resulting Euclidean points using the ℓ^∞ distance. Meanwhile, the interleaving distance is a formal categorical construction generalized from the definition of Chazal et al., originally introduced to compare persistence modules arising from the field of topological data analysis. We show that the ℓ^∞-cophenetic metric is an example of an interleaving distance. To do this, we define phylogenetic trees as a category of merge trees with some additional structure; namely labelings on the leaves plus a requirement that morphisms respect these labels. Then we can use the definition of a flow on this category to give an interleaving distance. Finally, we show that, because of the additional structure given by the categories defined, the map sending a labeled merge tree to the cophenetic vector is, in fact, an isometric embedding, thus proving that the ℓ^∞-cophenetic metric is, in fact, an interleaving distance.
READ FULL TEXT