Dynamic Normalization

01/15/2021

Batch Normalization (BN) has become an essential component of CNNs. It allows the network to train with a higher learning rate, which speeds up training, and the network does not need to be initialized carefully. In this work, however, we find that a simple extension of BN can further improve network performance. First, we extend BN to adaptively generate scale and shift parameters for each mini-batch, a variant we call DN-C (Batch-shared and Channel-wise). It uses the statistics of the mini-batch (E[X], Std[X] ∈ ℝ^c) as the input to the SC module. Then we extend BN to adaptively generate scale and shift parameters for each channel of each sample, a variant we call DN-B (Batch and Channel-wise). Our experiments show that the DN-C model fails to train normally, whereas the DN-B model is very robust. In the classification task, DN-B improves the accuracy of MobileNetV2 on ImageNet-100 by more than 2%. In the detection task, DN-B improves the accuracy of SSDLite on MS-COCO by nearly 4% under the same settings. Compared with BN, DN-B remains stable when using a higher learning rate or a smaller batch size.
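The difference between the two variants can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the SC module, so `sc_module` and `make_sc_params` below are hypothetical: a small two-layer MLP stands in for it. The input used for DN-B (per-sample channel statistics) is likewise an assumption, since the abstract states the SC input only for DN-C.

```python
import numpy as np


def normalize(x, eps=1e-5):
    """BN-style normalization of x with shape (N, C, H, W) over N, H, W."""
    mean = x.mean(axis=(0, 2, 3))  # E[X], shape (C,)
    std = x.std(axis=(0, 2, 3))    # Std[X], shape (C,)
    x_hat = (x - mean[None, :, None, None]) / (std[None, :, None, None] + eps)
    return x_hat, mean, std


def make_sc_params(c, hidden, rng):
    """Hypothetical SC-module weights: 2C statistics -> hidden -> 2C outputs."""
    return {
        "w1": rng.normal(scale=0.1, size=(2 * c, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(scale=0.1, size=(hidden, 2 * c)),
        "b2": np.zeros(2 * c),
    }


def sc_module(stats, params):
    """Assumed SC module: maps (..., 2C) statistics to scale and shift (..., C)."""
    h = np.maximum(stats @ params["w1"] + params["b1"], 0.0)  # ReLU
    out = h @ params["w2"] + params["b2"]
    scale, shift = np.split(out, 2, axis=-1)
    return 1.0 + scale, shift  # scale is predicted as an offset from 1


def dn_c(x, params):
    """DN-C: one scale/shift per channel, shared by the whole mini-batch."""
    x_hat, mean, std = normalize(x)
    scale, shift = sc_module(np.concatenate([mean, std]), params)  # (C,), (C,)
    return x_hat * scale[None, :, None, None] + shift[None, :, None, None]


def dn_b(x, params):
    """DN-B: one scale/shift per channel of each sample."""
    x_hat, _, _ = normalize(x)
    # Assumption: the SC module sees per-sample channel statistics.
    s_mean = x.mean(axis=(2, 3))  # (N, C)
    s_std = x.std(axis=(2, 3))    # (N, C)
    stats = np.concatenate([s_mean, s_std], axis=1)  # (N, 2C)
    scale, shift = sc_module(stats, params)          # (N, C), (N, C)
    return x_hat * scale[:, :, None, None] + shift[:, :, None, None]
```

In both functions the normalization step is ordinary BN. Only the origin of the scale and shift changes: in DN-C they have shape (C,) and are recomputed for each mini-batch, while in DN-B they have shape (N, C), so each sample receives its own affine parameters.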
