We propose an optimistic estimate to evaluate the best possible fitting...
Dropout is a widely utilized regularization technique in the training of...
In this work, we study the mechanism underlying loss spikes observed dur...
Models with nonlinear architectures/parameterizations such as deep neura...
It is important to understand how the popular regularization method drop...
We prove a general Embedding Principle of the loss landscape of deep neural...
Although dropout has achieved great success in deep learning, little is ...
Understanding the structure of the loss landscape of deep neural networks (D...