Partial Orthology, Paralogy and Xenology Relations - Satisfiability in terms of Di-Cographs

11/01/2017
by Nikolai Nøjgaard et al.

A variety of methods based on sequence similarity, reconciliation, synteny or functional characteristics can be used to infer homology relations, that is, orthology, paralogy and xenology relations between genes of a given gene family G. The inferred homology relations might not cover every pair of genes and thus provide only partial knowledge of the full set of homology relations. Moreover, for particular pairs of genes it might be known with a high degree of certainty that they are not orthologs (resp. paralogs, xenologs), which yields forbidden pairs of genes. The question arises whether such sets of (partial) homology relations, with or without forbidden gene pairs, are satisfiable, i.e., whether they can simultaneously co-exist in an evolutionary history for G. In this contribution, we characterize satisfiable homology relations. To this end, we employ the graph structure provided by these relations. In particular, this structure allows us to characterize full satisfiable homology relations as so-called di-cographs. Let m denote the total number of known and forbidden gene pairs. We provide a simple O(|G|^2 + m|G|)-time algorithm to determine whether such homology relations are satisfiable and, in the positive case, to construct event-labeled gene trees containing speciation, duplication and horizontal gene transfer (HGT) events that can explain the relations. The algorithm is evaluated on large-scale simulated data sets. As we shall see, a comparatively small amount of information about the original homology relations between the gene pairs is needed to reconstruct most of the original relations. The algorithm and data sets are freely available.
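
To make the graph-theoretic idea more concrete, the following minimal Python sketch tests the simpler undirected analogue of a di-cograph, namely whether a symmetric relation on genes forms a cograph. It relies on the standard recursive characterization: a graph is a cograph if and only if every induced subgraph on at least two vertices that arises in the decomposition is disconnected either in the graph itself or in its complement. The function names and the toy gene labels are hypothetical and not taken from the authors' software. The sketch handles neither directed (xenology) edges, partial relations nor forbidden pairs, and it is not the O(|G|^2 + m|G|) algorithm described above.

```python
def components(vertices, adj):
    """Connected components of the subgraph induced by `vertices`."""
    vertices = set(vertices)
    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            u = stack.pop()
            # Only follow edges that stay inside the induced subgraph.
            for w in adj[u] & vertices:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps


def is_cograph(vertices, edges):
    """Return True if the undirected graph (vertices, edges) is a cograph.

    Illustrative assumption: `edges` is an iterable of unordered pairs,
    e.g. pairs of genes assumed to be orthologs.
    """
    vertices = set(vertices)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Adjacency in the complement graph.
    co_adj = {v: vertices - adj[v] - {v} for v in vertices}

    def decompose(vs):
        if len(vs) <= 1:
            return True
        parts = components(vs, adj)
        if len(parts) == 1:
            # Connected: try to split in the complement instead.
            parts = components(vs, co_adj)
            if len(parts) == 1:
                # Connected in both graph and complement: not a cograph.
                return False
        return all(decompose(p) for p in parts)

    return decompose(vertices)


# Toy example with hypothetical gene names.
genes = ["a", "b", "c", "d"]
path = [("a", "b"), ("b", "c"), ("c", "d")]                # induced path on four vertices
cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]   # four-cycle
print(is_cograph(genes, path))   # False
print(is_cograph(genes, cycle))  # True
```

The recursion mirrors how a tree can be read off such a graph: each successful split corresponds to an inner node whose children are the resulting parts. For simplicity the sketch recomputes components at every level, so it is meant for small illustrative inputs only.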
