Distributional data analysis with accelerometer data in a NHANES database with nonparametric survey regression models
Accelerometers enable objective measurement of physical activity levels among groups of individuals in free-living environments, providing high-resolution detail about physical activity changes at different time scales. Current approaches in the literature for analyzing such data typically employ summary measures such as total inactivity time or compositional metrics. However, at the conceptual level, these methods can discard important information from the recorded data, since the summaries and metrics typically depend on cut-offs for exercise intensity zones that are chosen subjectively or even arbitrarily. Much of the data collected in these studies follows complex survey designs, which makes the direct application of standard statistical tools such as non-parametric regression models inappropriate and makes estimation procedures specific to the sampling design mandatory. For functional data or other complex objects, there is barely any literature that handles complex sampling designs in the statistical analysis. The aim of this paper is two-fold. First, we introduce a new functional representation of accelerometer data of a distributional nature to build a complete individualized profile of each subject's physical activity levels. Second, using the NHANES accelerometer data (2003-2006), we show the potential advantages of this new representation for predicting outcomes of patients over 68 years of age. A critical component of our statistical modeling is that we extend the non-parametric functional models used, kernel smoother and kernel ridge regression, to handle the specific effect of the complex sampling design, in order to provide reliable conclusions about the influence of physical activity in the distinct analyses performed.
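To make the two ideas in the abstract concrete, the following is a minimal sketch, not the paper's estimator. It assumes that each subject's distributional profile is summarized by an empirical quantile function on a fixed grid, that profiles are compared with an L2 distance between quantile functions, and that the survey design enters only through per-subject sampling weights multiplying the kernel weights of a Nadaraya-Watson smoother. The function names, the Gaussian kernel, and the bandwidth are all illustrative assumptions.

```python
import numpy as np


def quantile_profile(activity_counts, grid):
    """Empirical quantile function of one subject's accelerometer readings.

    Assumption: the distributional representation is the quantile function
    evaluated on a fixed probability grid (hypothetical choice for this sketch).
    """
    return np.quantile(np.asarray(activity_counts, dtype=float), grid)


def weighted_kernel_smoother(Q_train, y_train, w_train, Q_new, bandwidth):
    """Survey-weighted Nadaraya-Watson prediction on quantile profiles.

    Q_train : (n, m) array, one quantile profile per sampled subject
    y_train : (n,) array of outcomes
    w_train : (n,) array of survey sampling weights (assumed positive)
    Q_new   : (m,) or (k, m) array of profiles to predict for
    """
    Q_train = np.asarray(Q_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    w_train = np.asarray(w_train, dtype=float)
    Q_new = np.atleast_2d(np.asarray(Q_new, dtype=float))

    # Squared L2 distance between quantile functions, approximated by the
    # mean squared difference over the grid points.
    d2 = ((Q_new[:, None, :] - Q_train[None, :, :]) ** 2).mean(axis=2)

    # Gaussian kernel weights, rescaled by the sampling weights so that each
    # subject counts in proportion to the population share it represents.
    K = np.exp(-d2 / (2.0 * bandwidth ** 2))
    W = K * w_train[None, :]

    # Note: a very small bandwidth can underflow every kernel weight to zero,
    # in which case the ratio below is undefined.
    return (W @ y_train) / W.sum(axis=1)


if __name__ == "__main__":
    # Simulated data only; nothing here comes from NHANES.
    rng = np.random.default_rng(0)
    grid = np.linspace(0.05, 0.95, 19)
    profiles = np.vstack(
        [quantile_profile(rng.gamma(2.0, s, size=500), grid) for s in rng.uniform(1, 3, 50)]
    )
    outcomes = profiles.mean(axis=1) + rng.normal(0, 0.1, 50)
    weights = rng.uniform(0.5, 2.0, 50)
    print(weighted_kernel_smoother(profiles, outcomes, weights, profiles[:3], bandwidth=1.0))
```

With all sampling weights equal, this reduces to the ordinary unweighted kernel smoother, which is one way to see what the survey adjustment changes. Variance estimation under the survey design, which the paper's setting also requires, is not covered by this sketch.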