PR Product: A Substitute for Inner Product in Neural Networks

04/30/2019
by   Zhennan Wang, et al.
0

In this paper, we analyze the inner product of weight vector and input vector in neural networks from the perspective of vector orthogonal decomposition and prove that the local direction gradient of weight vector decreases as the angle between them gets closer to 0 or π. We propose the PR Product, a substitute for the inner product, which makes the local direction gradient of weight vector independent of the angle and consistently larger than the one in the conventional inner product while keeping the forward propagation identical. As the basic operation in neural networks, the PR Product can be applied into many existing deep learning modules, so we develop the PR Product version of the fully connected layer, convolutional layer, and LSTM layer. In static image classification, the experiments on CIFAR10 and CIFAR100 datasets demonstrate that the PR Product can robustly enhance the ability of various state-of-the-art classification networks. On the task of image captioning, even without any bells and whistles, our PR Product version of captioning model can compete or outperform the state-of-the-art models on MS COCO dataset.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset