Bilevel Programming and Deep Learning: A Unifying View on Inference Learning Methods
In this work we unify a number of inference learning methods that have been proposed in the literature as alternatives to training algorithms based on standard error back-propagation. These inference learning methods were developed with very diverse motivations, mainly to enhance the biological plausibility of deep neural networks and to increase the intrinsic parallelism of training. We show that these superficially very different methods can all be obtained by successively applying a particular reformulation of bilevel optimization programs. As a by-product, it also becomes evident that all considered inference learning methods include back-propagation as a special case, and therefore at least approximate error back-propagation in typical settings. Finally, we propose Fenchel back-propagation, which replaces the propagation of infinitesimal corrections performed in standard back-propagation with finite targets as the learning signal. Fenchel back-propagation can therefore be seen as an instance of learning via explicit target propagation.
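To make the bilevel view concrete, here is a minimal sketch in illustrative notation of our own (the symbols $z_\ell$, $f_\ell$, $D$, and $\beta$ are assumptions, not taken from the paper): training a network with layer maps $f_\ell$ can be written as a bilevel program whose inner problems fix the activations,

\[
\min_{W}\; \ell\bigl(z_L, y\bigr)
\quad\text{s.t.}\quad
z_\ell \in \arg\min_{u}\; D\bigl(u,\, f_\ell(z_{\ell-1}; W_\ell)\bigr), \qquad \ell = 1,\dots,L,
\]

and a standard penalty-type relaxation turns the inner problems into soft constraints over jointly optimized activations,

\[
\min_{W,\,z}\; \ell(z_L, y) \;+\; \frac{1}{\beta} \sum_{\ell=1}^{L} D\bigl(z_\ell,\, f_\ell(z_{\ell-1}; W_\ell)\bigr).
\]

For a squared-Euclidean discrepancy $D(u,v) = \tfrac{1}{2}\|u - v\|^2$ and $\beta \to 0$, the optimal $z_\ell$ approach the forward-pass activations and the resulting weight gradients approach those of error back-propagation, which is one way to read the claim that back-propagation arises as a special case; the relaxed activations $z_\ell$ then act as the finite targets alluded to for Fenchel back-propagation.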