M5 Competition Uncertainty: Overdispersion, distributional forecasting, GAMLSS and beyond
The uncertainty track of the M5 competition targets probabilistic forecasting of the sales of thousands of Walmart retail goods. We show that the M5 competition data exhibit strong overdispersion and sporadic demand, in particular frequent zero demand. We discuss the resulting modeling issues for adequate probabilistic forecasting of such count data processes. Unfortunately, the majority of popular prediction methods used in the M5 competition (e.g. the GBMs LightGBM and XGBoost) fail to address these data characteristics because of the objective functions they use. Distributional forecasting provides a suitable modeling approach to overcome these problems. The GAMLSS framework allows flexible probabilistic forecasting using low-dimensional distributions. We illustrate how the GAMLSS approach can be applied to the M5 competition data by modeling the location and scale parameters of various distributions, e.g. the negative binomial distribution. Finally, we discuss software packages for distributional modeling, such as the R package gamlss with its extension packages and (deep) distributional forecasting libraries such as TensorFlow Probability, along with their drawbacks.
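To make the notions of overdispersion and zero demand concrete, the following minimal Python sketch computes a dispersion index (variance divided by mean) and the share of zeros for a count series, then fits a negative binomial by the method of moments. The sales series is made up for illustration and is not M5 data; the function names are hypothetical; and the parameterization with variance mu + mu^2 / size is an assumption chosen for this sketch. It is not the GAMLSS estimation procedure, which models the distribution parameters as functions of covariates.

```python
import math
from statistics import mean, variance


def dispersion_summary(counts):
    """Return mean, sample variance, dispersion index and share of zeros."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 in the denominator)
    return {
        "mean": m,
        "variance": v,
        "dispersion_index": v / m,  # equals 1 in expectation for Poisson data
        "zero_share": sum(c == 0 for c in counts) / len(counts),
    }


def nb_moment_fit(counts):
    """Method-of-moments fit of a negative binomial.

    Assumed parameterization: Var = mu + mu**2 / size.
    """
    m = mean(counts)
    v = variance(counts)
    if v <= m:
        raise ValueError("no overdispersion: variance does not exceed the mean")
    size = m * m / (v - m)
    return m, size


def poisson_zero_prob(mu):
    """P(Y = 0) under a Poisson distribution with mean mu."""
    return math.exp(-mu)


def nb_zero_prob(mu, size):
    """P(Y = 0) under the negative binomial parameterized by (mu, size)."""
    return (size / (size + mu)) ** size


# Hypothetical daily unit sales of one sporadically sold item (not M5 data).
sales = [0, 0, 0, 1, 0, 0, 5, 0, 0, 2, 0, 0, 0, 8, 0, 1, 0, 0, 0, 3]

summary = dispersion_summary(sales)
mu_hat, size_hat = nb_moment_fit(sales)

print(summary)
print("Poisson P(0):", poisson_zero_prob(mu_hat))
print("NegBin  P(0):", nb_zero_prob(mu_hat, size_hat))
```

On such a series the dispersion index lies well above 1, and the fitted negative binomial assigns more probability to zero demand than a Poisson distribution with the same mean, which is the kind of mismatch that a mean-only objective cannot capture.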