DPRed: Making Typical Activation Values Matter In Deep Learning Computing

04/17/2018
by Alberto Delmas, et al.

We show that selecting a fixed precision for all activations in Convolutional Neural Networks, even if that precision differs per layer, amounts to worst-case design. Much lower precisions can be used if we instead target the common case by tailoring precision at a much finer granularity than that of a layer. We propose Dynamic Precision Reduction (DPRed), where hardware detects on-the-fly the precision activations need, at a much finer granularity than a whole layer. We demonstrate a practical implementation of DPRed with DPRed Stripes (DPRS), a data-parallel hardware accelerator that adjusts precision on-the-fly to accommodate the values of the activations it processes concurrently. DPRS accelerates convolutional layers and executes unmodified convolutional neural networks. DPRS is 2.61x faster and 1.84x more energy efficient than a fixed-precision accelerator for a set of convolutional neural networks. We further extend DPRS to exploit activation and weight precisions for fully-connected layers. The enhanced design improves average performance and energy efficiency by 2.59x and 1.19x, respectively, over the fixed-precision accelerator for a broader set of neural networks. We also consider a lower-cost variant that supports only even precision widths and offers better energy efficiency.
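The sketch below illustrates the core idea in software terms: rather than using one fixed bit-width for a whole layer, the precision is chosen per small group of activations from the largest value actually present in that group. The group size, the unsigned fixed-point (post-ReLU) encoding, and the `group_precision` helper are illustrative assumptions, not the paper's hardware parameters.

```python
def group_precision(activations, group_size=16):
    """Return the bit-width needed by each group of activations.

    Assumes activations are non-negative integers (e.g., quantized
    post-ReLU values); the width a group needs is the position of the
    highest set bit among its values.
    """
    widths = []
    for start in range(0, len(activations), group_size):
        group = activations[start:start + group_size]
        # Width is driven by the largest magnitude in the group,
        # with a floor of 1 bit for all-zero groups.
        widths.append(max(1, max(group).bit_length()))
    return widths


if __name__ == "__main__":
    # A 16-bit fixed-precision design would spend 16 bits on every value,
    # yet these groups only need 3 and 7 bits respectively.
    acts = [0, 3, 1, 0, 7, 2, 0, 1, 0, 0, 5, 0, 1, 0, 2, 0,
            120, 64, 3, 0, 1, 0, 0, 9, 0, 2, 0, 0, 1, 0, 0, 4]
    print(group_precision(acts))  # -> [3, 7]
```

In a design like DPRS the analogous decision is made in hardware for the activations processed concurrently, so narrower common-case values translate directly into fewer processing cycles.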
