Time-irreversibility tests for random-length time series: the matching-time approach applied to DNA
In this work we implement the so-called matching-time estimators for the entropy rate and the entropy production rate of symbolic sequences. These estimators are based on recurrence properties of the system, which have been shown to be appropriate for testing irreversibility, especially when the sequences have strong correlations or memory. Based on limit theorems for matching times, we derive a maximum likelihood estimator for the entropy rate, assuming that we have a set of moderately short symbolic time series of finite random duration. We show that the proposed estimator has several properties that make it suitable for estimating the entropy rate and the entropy production rate (or for testing irreversibility) when the sample sequences have different lengths, such as the coding sequences of DNA. We test our approach on some controlled examples of Markov chains. We also apply our estimators to genomic sequences to show that the degree of irreversibility of coding sequences of human DNA is significantly larger than that of the corresponding non-coding sequences.
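As a rough illustration of the matching-time idea, the following Python sketch computes a naive plug-in estimate from a single pair of sequences. It is not the maximum likelihood estimator derived in the paper. It assumes the standard limit behavior in which the shortest unmatched prefix length grows like log(n)/h for two independent samples of the same source, and it takes the entropy production rate as the difference between the reversed and the forward estimates. All function names are hypothetical, and a single pair gives a very noisy value, so in practice one would combine many pairs.

```python
import math
import random


def shortest_unmatched_length(x: str, y: str) -> int:
    """Smallest l such that the prefix x[:l] does not occur as a block in y.

    If every prefix of x occurs in y, the value is censored by the length
    of x and len(x) + 1 is returned (an assumption of this sketch).
    """
    for l in range(1, len(x) + 1):
        if x[:l] not in y:
            return l
    return len(x) + 1


def entropy_rate_estimate(x: str, y: str) -> float:
    """Naive plug-in estimate h ~ log(n) / L, with n = len(y)."""
    return math.log(len(y)) / shortest_unmatched_length(x, y)


def reversed_entropy_rate_estimate(x: str, y: str) -> float:
    """Same ratio, but matching time-reversed prefixes of x in y.

    Searching for reversed(x[:l]) in y is equivalent to searching for
    x[:l] in the reversed sequence y[::-1].
    """
    return math.log(len(y)) / shortest_unmatched_length(x, y[::-1])


def entropy_production_estimate(x: str, y: str) -> float:
    """Difference between the reversed and the forward estimates."""
    return reversed_entropy_rate_estimate(x, y) - entropy_rate_estimate(x, y)


if __name__ == "__main__":
    # Two independent samples of a fair coin (a reversible toy source,
    # chosen here only for illustration).
    rng = random.Random(0)
    x = "".join(rng.choice("01") for _ in range(200))
    y = "".join(rng.choice("01") for _ in range(10_000))
    print(entropy_rate_estimate(x, y), entropy_production_estimate(x, y))
```

For a reversible source such as the fair coin in the usage example, the forward and reversed estimates should fluctuate around the same value, so the difference is expected to be close to zero up to sampling noise.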