Vine dependence graphs with latent variables as summaries for gene expression data
The advent of high-throughput sequencing technologies has led to vast collections of comparative genome sequences. The construction of gene-gene interaction networks or dependence graphs on the genome scale is vital for understanding the regulation of biological processes. Different dependence graphs can provide different information. Some existing methods based on high-order partial correlations produce graphs that are sparse and uninformative when there are latent variables that can explain much of the dependence in groups of genes. Other methods, based on correlations and first-order partial correlations, can produce dense graphs. For the case where genes can be divided into groups with stronger within-group than between-group dependence in gene expression, we present a dependence graph based on truncated vines with latent variables that makes use of group information and low-order partial correlations. The resulting graphs are not dense, and the genes that might be more central have more neighbors in the vine dependence graph. We demonstrate the use of our dependence graph construction on two RNA-seq data sets: yeast and prostate cancer. There is some biological evidence to support the relationships between genes in the resulting dependence graphs. A flexible framework is provided for building dependence graphs via low-order partial correlations and formation of groups, leading to graphs that are neither too sparse nor too dense. We anticipate that this approach will help to identify groups that might be central to different biological functions.
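To make the role of low-order partial correlations and latent variables concrete, the following minimal Python sketch computes a first-order partial correlation from pairwise correlations using the standard formula. It is an illustration only, not the truncated-vine construction described above: the function name `first_order_partial_corr`, the simulated "genes", and the loadings 0.8 and 0.7 are all assumptions made for this example.

```python
import numpy as np


def first_order_partial_corr(r_ij, r_ik, r_jk):
    """Partial correlation of variables i and j given k (standard formula)."""
    return (r_ij - r_ik * r_jk) / np.sqrt((1.0 - r_ik**2) * (1.0 - r_jk**2))


# Simulated data (not from the paper): two hypothetical genes whose
# dependence comes entirely from one shared latent factor.
rng = np.random.default_rng(0)
n = 5000
latent = rng.standard_normal(n)
gene_a = 0.8 * latent + 0.6 * rng.standard_normal(n)
gene_b = 0.7 * latent + np.sqrt(1.0 - 0.7**2) * rng.standard_normal(n)

# Pairwise sample correlations among gene_a, gene_b and the latent factor.
corr = np.corrcoef(np.vstack([gene_a, gene_b, latent]))
r_ab, r_al, r_bl = corr[0, 1], corr[0, 2], corr[1, 2]

# Conditioning on the latent factor removes most of the gene-gene dependence.
partial_ab_given_latent = first_order_partial_corr(r_ab, r_al, r_bl)

print("correlation(a, b):", round(float(r_ab), 3))
print("partial correlation(a, b | latent):", round(float(partial_ab_given_latent), 3))
```

In this simulated setting the marginal correlation between the two genes is clearly positive, while the partial correlation given the latent factor is close to zero. This is the sense in which a latent variable can explain much of the dependence within a group of genes.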