Bi-Directional Differentiable Input Reconstruction for Low-Resource Neural Machine Translation
We aim to better exploit the limited amounts of parallel text available in low-resource settings by introducing a differentiable reconstruction loss for neural machine translation (NMT). We reconstruct the input from sampled translations and leverage differentiable sampling and bi-directional NMT to build a compact model that can be trained end-to-end. This approach achieves small but consistent BLEU improvements on four language pairs in both translation directions, and outperforms an alternative differentiable reconstruction strategy based on hidden states.
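The differentiable sampling the abstract refers to is commonly realized with a Gumbel-softmax relaxation: target tokens are sampled as soft probability vectors so the reconstruction loss can backpropagate through the sampling step. Below is a minimal NumPy sketch of that relaxation, not the authors' implementation; the function name and toy logits are illustrative only.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Draw a differentiable (soft) sample from a categorical
    distribution: perturb logits with Gumbel noise, then apply a
    temperature-controlled softmax. As tau -> 0 the sample approaches
    a hard one-hot vector."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Gumbel(0, 1) noise via the inverse-CDF transform.
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape) + 1e-20) + 1e-20)
    y = (logits + gumbel) / tau
    y = y - y.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

# Toy distribution over a 5-word "target vocabulary".
logits = np.array([2.0, 0.5, 0.1, -1.0, 0.0])
soft_sample = gumbel_softmax(logits, tau=0.5)
# soft_sample is a probability vector; in a reconstruction setup it can
# be fed to the reverse translation direction as a mixture of word
# embeddings, keeping the whole pipeline differentiable end-to-end.
print(soft_sample)
```

In the bi-directional setting the same relaxed sample would be consumed by the reverse model that reconstructs the input, so both translation directions share one end-to-end training signal.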