What is Statistical Learning Theory?
Statistical learning theory is the broad framework for studying inference in both supervised and unsupervised machine learning. Inference here covers the full spectrum of machine learning: gaining knowledge, making predictions or decisions, and constructing models from a set of labeled or unlabeled data. The entire process is cast in a statistical framework, with each assumption stated mathematically as a testable hypothesis.
How does Statistical Learning Theory Work in Practice?
The practical goals of this approach are to make machine learning more precise (reliably reproducible) and to create new or improved modeling algorithms. This is primarily accomplished by giving formal, statistical definitions to abstract concepts such as learning, generalization, overfitting, and performance, and then testing these hypotheses one parameter at a time.
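As an illustration, one of these abstract concepts, generalization, has a simple empirical proxy: the gap between a model's error on the data it was trained on and its error on held-out data. The sketch below estimates that gap; the synthetic dataset and the choice of a decision tree are illustrative assumptions, not anything prescribed by the theory.

```python
# A minimal sketch of measuring a "generalization gap" empirically.
# Assumptions: scikit-learn is available; the dataset and model are
# arbitrary stand-ins chosen only to make the gap visible.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# An unconstrained tree can memorize its training set (overfitting).
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

train_err = 1 - model.score(X_train, y_train)  # empirical (training) error
test_err = 1 - model.score(X_test, y_test)     # estimate of the true error
print(f"training error:     {train_err:.3f}")
print(f"test error:         {test_err:.3f}")
print(f"generalization gap: {test_err - train_err:.3f}")
```

A large gap between the two errors is the empirical signature of overfitting; a formal definition of generalization bounds exactly this quantity.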
The general learning approach is the same as any other scientific discipline:
1. Observe a phenomenon
2. Construct a model of that phenomenon
3. Make predictions using this model
But in statistical machine learning, this entire process must be automated so that a computer program can carry it out and learn from the data, as in the sketch below.
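In code, the three steps collapse into a single automated pipeline: sample observations, fit a model of the data-generating process, and predict on unseen inputs. This is a minimal sketch; the noisy linear phenomenon and the least-squares model are illustrative assumptions.

```python
# A minimal sketch of the automated observe -> model -> predict loop.
# Assumptions: a noisy linear phenomenon and a least-squares fit,
# chosen purely for illustration.
import numpy as np

rng = np.random.default_rng(0)

# 1. Observe a phenomenon: samples from an unknown process y = 3x + noise.
x = rng.uniform(0, 1, size=200)
y = 3.0 * x + rng.normal(scale=0.1, size=200)

# 2. Construct a model of that phenomenon: fit a line by least squares.
slope, intercept = np.polyfit(x, y, deg=1)

# 3. Make predictions using this model on new, unseen inputs.
x_new = np.array([0.25, 0.75])
print(slope * x_new + intercept)  # close to the true values [0.75, 2.25]
```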
So each step of the scientific method is assumed to be governed by a probabilistic model of the phenomenon (the data-generating process). At its simplest level, this means assuming that all past and future observations are sampled randomly and independently from the same underlying distribution; under that assumption, information about the phenomenon (the probability distribution itself) can be reliably inferred from data. Just as important, this allows the machine to use learning algorithms, such as k-nearest neighbors with an appropriate choice of k, that are consistent: as more and more data is processed, the algorithm's predictions get closer and closer to the optimal solution. This convergence is what makes learning at scale, including deep learning, possible.
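The k-nearest neighbors example can be made concrete: if k is allowed to grow with the sample size, test error should shrink as more data arrives. The following is a rough sketch of that behavior; the synthetic dataset and the k = sqrt(n) rule are assumptions, the latter being one common heuristic for letting k grow with n.

```python
# A rough sketch of consistency: k-NN test error shrinking as the
# training set grows. Assumptions: synthetic data from scikit-learn
# and the heuristic k ~ sqrt(n); neither is prescribed by the theory.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=21000, n_features=10, random_state=0)
X_test, y_test = X[:1000], y[:1000]          # held-out evaluation set

for n in (100, 1000, 20000):
    X_train, y_train = X[1000:1000 + n], y[1000:1000 + n]
    k = max(1, int(np.sqrt(n)))              # let k grow with n
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    err = 1 - model.score(X_test, y_test)
    print(f"n={n:6d}  k={k:3d}  test error={err:.3f}")
```

On data like this, the printed test error should decrease as n grows, which is the practical face of consistency: more data moves the algorithm's predictions toward the optimal solution.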