Fast Neural Machine Translation Implementation

05/24/2018


This paper describes the submissions to the efficiency track for GPUs by members of the University of Edinburgh, Adam Mickiewicz University, Tilde, and the University of Alicante. We focus on an efficient implementation of the recurrent deep-learning model in Amun, the fast inference engine for neural machine translation. We improve performance with an efficient mini-batching algorithm and by fusing the softmax operation with the k-best extraction algorithm.
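To make the idea of fusing softmax with k-best extraction concrete, the following is a minimal CPU sketch in Python, not the GPU kernel used in Amun. It assumes the general observation that softmax preserves the ordering of the logits, so the k best candidates can be picked from the raw scores while the normalizer is accumulated in the same pass, and only those k entries are normalized. The function name `fused_log_softmax_topk` and its interface are hypothetical and chosen for illustration only.

```python
import heapq
import math


def fused_log_softmax_topk(logits, k):
    """Return the k best (log_prob, index) pairs, best first.

    Illustrative sketch only: the full softmax vector is never built.
    """
    if k < 1 or not logits:
        raise ValueError("need a non-empty list of logits and k >= 1")

    running_max = -math.inf  # largest logit seen so far
    running_sum = 0.0        # sum of exp(logit - running_max) so far
    heap = []                # min-heap holding the k largest (logit, index)

    # Single pass: update the log-sum-exp normalizer and the k-best heap together.
    for i, x in enumerate(logits):
        if x > running_max:
            # Rescale the accumulated sum to the new maximum.
            running_sum = running_sum * math.exp(running_max - x) + 1.0
            running_max = x
        else:
            running_sum += math.exp(x - running_max)

        if len(heap) < k:
            heapq.heappush(heap, (x, i))
        elif x > heap[0][0]:
            heapq.heapreplace(heap, (x, i))

    log_z = running_max + math.log(running_sum)

    # Normalize only the k survivors instead of the whole vocabulary.
    return [(x - log_z, i) for x, i in sorted(heap, reverse=True)]
```

In a beam-search decoder the input would be the output-layer scores over the vocabulary for one hypothesis, and k would be the beam size. The sketch only shows why the two steps can share one pass over the scores; it makes no claim about how the work is laid out on a GPU.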
