Toward A Scalable Exploratory Framework for Complex High-Dimensional Phenomics Data
Phenomics is an emerging branch of modern biology, which uses high throughput phenotyping tools to capture multiple environment and phenotypic trait measurements, at a massive scale. The resulting high dimensional data sets represent a treasure trove of information for providing an indepth understanding of how multiple factors interact and contribute to control the growth and behavior of different plant crop genotypes. However, computational tools that can parse through such high dimensional data sets and aid in extracting plausible hypothesis are currently lacking. In this paper, we present a new algorithmic approach to effectively decode and characterize the role of environment on phenotypic traits, from complex phenomic data. To the best of our knowledge, this effort represents the first application of topological data analysis on phenomics data. We applied this novel algorithmic approach on a real-world maize data set. Our results demonstrate the ability of our approach to delineate emergent behavior among subpopulations, as dictated by one or more environmental factors; notably, our approach shows how the environment plays a key role in determining the phenotypic behavior of one of the two genotypes. Source code for our implementation and test data are freely available with detailed instructions at https://xperthut.github.io/HYPPO-X
READ FULL TEXT