Noise-induced degeneration in online learning
To elucidate the plateau phenomena caused by vanishing gradients, we analyse the stability of stochastic gradient descent dynamics near degenerate subspaces in a multi-layer perceptron. We show that, in the Fukumizu-Amari model, attracting regions exist in the degenerate subspace, and that a novel type of strong plateau phenomenon emerges as a noise-induced effect, making learning much slower than under deterministic gradient descent dynamics. The noise-induced degeneration observed here is expected to occur in a broad class of online learning in perceptrons.
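As a rough illustration of the setting, the following sketch compares online learning (one sample per update) with a large-batch approximation of deterministic gradient descent in a toy perceptron with two hidden units. It is not the Fukumizu-Amari model as analysed in the paper: the architecture, the teacher weights, the initial condition near the subspace v1 = v2, the learning rate, and the names `student`, `grad`, and `run` are all assumptions chosen for illustration only.

```python
import numpy as np

def student(x, w, v):
    # Toy model (assumed): f(x) = w1*tanh(v1*x) + w2*tanh(v2*x), scalar input x.
    return np.tanh(np.outer(x, v)) @ w

def grad(x, y, w, v):
    # Gradient of the loss 0.5 * mean((f(x) - y)^2) with respect to w and v.
    h = np.tanh(np.outer(x, v))                    # hidden activations, shape (n, 2)
    err = h @ w - y                                # residuals, shape (n,)
    gw = h.T @ err / len(x)
    gv = ((1.0 - h**2) * w).T @ (err * x) / len(x)
    return gw, gv

def run(batch_size, steps=2000, lr=0.05, seed=0):
    # batch_size=1 gives online SGD; a large batch approximates deterministic GD.
    rng = np.random.default_rng(seed)
    w_t, v_t = np.array([1.0, -1.0]), np.array([0.5, 2.0])  # hypothetical teacher
    w, v = np.array([0.3, 0.3]), np.array([1.0, 1.01])      # start near v1 = v2
    gap = np.empty(steps)
    for t in range(steps):
        x = rng.normal(size=batch_size)            # fresh samples at every step
        y = student(x, w_t, v_t)                   # noiseless teacher output
        gw, gv = grad(x, y, w, v)
        w -= lr * gw
        v -= lr * gv
        gap[t] = abs(v[0] - v[1])                  # distance from the subspace v1 = v2
    return gap

if __name__ == "__main__":
    online = run(batch_size=1)
    batch = run(batch_size=1000)
    print("final |v1 - v2|, online SGD:     ", online[-1])
    print("final |v1 - v2|, large-batch GD: ", batch[-1])
```

The quantity `gap` tracks how far the student is from the subspace where the two hidden units coincide. Whether this toy setup reproduces the noise-induced plateau described above depends on the assumed parameters and is not claimed here.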