Some Cautionary Comments on Principal Component Analysis for Time Series Data

08/04/2020
by Xinyu Zhang, et al.

Principal component analysis (PCA) is one of the most frequently used statistical tools in almost all branches of data science. However, like many other statistical tools, it carries a risk of misuse or even abuse. In this short note, we highlight possible pitfalls in applying theoretical results for PCA that assume independent data when the data are in fact time series. For time series, we state a central limit theorem for the eigenvalues and eigenvectors, give analytical and bootstrap methods to estimate the covariance, and assess their efficacy via simulation. An empirical example illustrates the pitfalls of a common misuse of PCA. We conclude that while the conventional scree plot continues to be useful for time series data, the interpretation of the principal component loadings requires careful attention.
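To make the pitfall concrete, the following is a minimal sketch rather than the authors' procedure. It contrasts the classical standard error for a sample eigenvalue under independent Gaussian data, approximately lambda * sqrt(2/n), with a generic moving-block bootstrap applied to serially dependent data. The simulated AR(1) design, the block length, and all function names (`simulate_ar1_panel`, `pca_eigenvalues`, `block_bootstrap_se`, `iid_gaussian_se`) are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def simulate_ar1_panel(n, scales, phi, rng):
    """Simulate independent AR(1) series (illustrative design, not from the paper).

    Column j is scales[j] times an AR(1) process with coefficient phi,
    so the population eigenvalues are distinct when the scales differ.
    """
    p = len(scales)
    eps = rng.standard_normal((n, p))
    x = np.zeros((n, p))
    x[0] = eps[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x * np.asarray(scales)


def pca_eigenvalues(x):
    """Eigenvalues of the sample covariance matrix, in descending order."""
    cov = np.cov(x, rowvar=False)
    return np.linalg.eigvalsh(cov)[::-1]


def iid_gaussian_se(eigvals, n):
    """Classical asymptotic SE assuming i.i.d. Gaussian rows: lambda * sqrt(2/n)."""
    return eigvals * np.sqrt(2.0 / n)


def block_bootstrap_se(x, block_len, n_boot, rng):
    """Moving-block bootstrap SE of the eigenvalues (a generic scheme, assumed here)."""
    n, p = x.shape
    n_blocks = int(np.ceil(n / block_len))
    offsets = np.arange(block_len)
    draws = np.empty((n_boot, p))
    for b in range(n_boot):
        # Resample whole blocks of consecutive rows to preserve serial dependence.
        starts = rng.integers(0, n - block_len + 1, size=n_blocks)
        idx = (starts[:, None] + offsets[None, :]).ravel()[:n]
        draws[b] = pca_eigenvalues(x[idx])
    return draws.std(axis=0, ddof=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = simulate_ar1_panel(n=2000, scales=[3.0, 2.0, 1.0], phi=0.8, rng=rng)
    eigvals = pca_eigenvalues(x)
    print("eigenvalues:       ", eigvals)
    print("i.i.d. formula SE: ", iid_gaussian_se(eigvals, x.shape[0]))
    print("block bootstrap SE:", block_bootstrap_se(x, 50, 200, rng))
```

With strongly autocorrelated series, the block bootstrap standard errors would be expected to exceed those from the i.i.d. formula, which is the kind of understated uncertainty the note warns about. The block length of 50 is an arbitrary choice for this sketch; in practice it should be tuned to the strength of the dependence.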
