Top-down Flow Transformer Networks
We study the deformation fields of feature maps across convolutional network layers under explicit top-down spatial transformations. We propose the top-down flow transformer (TFT), which focuses on three transformations: translation, rotation, and scaling. We learn flow transformation generators that account for the hidden-layer deformations while maintaining overall consistency across layers. The learned generators are shown to capture the underlying feature transformation processes independently of the particular training images. We observe favorable experimental results compared to existing methods that tie transformations to fixed datasets. A comprehensive study on various datasets including MNIST, shapes, and natural images, with both intra-dataset and inter-dataset evaluation (trained on MNIST and validated on a number of other datasets), demonstrates the advantages of the proposed TFT framework, which can be adopted in a variety of computer vision applications.
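The abstract does not give implementation details, but the core operation behind a flow transformation generator is warping a feature map by a dense flow field. As a hypothetical illustration (not the paper's learned generator), the sketch below hand-codes the simplest case, a rigid translation expressed as a constant flow, and applies it with bilinear sampling in NumPy; `flow_warp` and its conventions are assumptions for this sketch only.

```python
import numpy as np

def flow_warp(feat, flow):
    """Warp a feature map (H, W, C) by a dense flow field (H, W, 2).

    flow[y, x] = (dy, dx) gives the source offset sampled for output
    location (y, x), with bilinear interpolation and zero padding
    outside the map. This is an illustrative sketch, not the TFT model.
    """
    H, W, C = feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    src_y = ys + flow[..., 0]
    src_x = xs + flow[..., 1]
    y0 = np.floor(src_y).astype(int)
    x0 = np.floor(src_x).astype(int)
    y1, x1 = y0 + 1, x0 + 1
    wy, wx = src_y - y0, src_x - x0
    out = np.zeros_like(feat, dtype=float)
    # Accumulate the four bilinear corners, skipping out-of-bounds samples.
    for yy, xx, w in [(y0, x0, (1 - wy) * (1 - wx)),
                      (y0, x1, (1 - wy) * wx),
                      (y1, x0, wy * (1 - wx)),
                      (y1, x1, wy * wx)]:
        valid = (yy >= 0) & (yy < H) & (xx >= 0) & (xx < W)
        out[valid] += w[valid, None] * feat[yy[valid], xx[valid]]
    return out

# A translation as a constant flow: every output location samples one
# pixel to its left, so content shifts one pixel to the right.
feat = np.zeros((4, 4, 1))
feat[1, 1, 0] = 1.0
flow = np.tile(np.array([0.0, -1.0]), (4, 4, 1))  # (dy, dx) = (0, -1)
shifted = flow_warp(feat, flow)  # activation moves from (1, 1) to (1, 2)
```

Rotation and scaling fit the same interface: they only change how the flow field is generated (from the transformation parameters, or, in TFT, by a learned generator), while the warping step stays identical.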