Towards a Uniform Architecture for the Efficient Implementation of 2D and 3D Deconvolutional Neural Networks on FPGAs
Three-dimensional deconvolution is widely used in many computer vision applications. However, most previous works have only focused on accelerating 2D deconvolutional neural networks (DCNNs) on FPGAs, while the acceleration of 3D DCNNs has not been studied in depth as they have higher computational complexity and sparsity than 2D DCNNs. In this paper, we focus on the acceleration of both 2D and 3D DCNNs on FPGAs by proposing efficient schemes for mapping 2D and 3D DCNNs on a uniform architecture. By implementing our design on the Xilinx VC709 platform for four real-life 2D and 3D DCNNs, we can achieve up to 3.0 TOPS with high hardware efficiency. Comparisons with CPU and GPU solutions demonstrate that we can achieve an improvement of up to 63.3X in throughput relative to a CPU solution and an improvement of up to 8.3X in energy efficiency compared to a GPU solution.
READ FULL TEXT