Transfer Value Iteration Networks
Value iteration networks (VINs) have been shown to predict outcomes effectively when sufficient training data is available in the target domain. In this paper, we propose a transfer learning approach that carries knowledge from a source domain over to a target domain by automatically learning similarities between the actions of the two domains, so that the target VIN can be trained with only limited data. The proposed architecture, called Transfer Value Iteration Network (TVIN), empirically outperforms VIN when the two domains have similar state and action spaces. Furthermore, we show that this performance gap is consistent across different maze environments, maze sizes, dataset sizes, and hyperparameters such as iteration counts and kernel sizes.
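To make the two hyperparameters named above concrete, the following is a minimal NumPy sketch of the value-iteration recurrence that a VIN unrolls: a convolution over the reward and value maps produces one Q map per action, and a max over actions yields the next value map. It is an illustration under stated assumptions, not the TVIN architecture itself. The names `conv2d_same` and `vi_module`, the hand-set kernels, the 3x3 maze, and the discount of 0.9 are all hypothetical; in a VIN the kernels would be learned rather than fixed.

```python
import numpy as np


def conv2d_same(x, kernel):
    """Zero-padded 'same' cross-correlation of a 2-D map with an odd-sized square kernel."""
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, pad)  # zero padding on every side
    out = np.zeros_like(x, dtype=float)
    for i in range(k):
        for j in range(k):
            out += kernel[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out


def vi_module(reward, w_r, w_v, num_iters):
    """Run num_iters steps of Q_a = w_r[a] * reward + w_v[a] * value, value = max_a Q_a.

    reward: (H, W) map; w_r, w_v: (A, k, k) kernels, one pair per action.
    num_iters is the iteration count and k the kernel size.
    """
    value = np.zeros_like(reward, dtype=float)
    for _ in range(num_iters):
        q = np.stack([
            conv2d_same(reward, w_r[a]) + conv2d_same(value, w_v[a])
            for a in range(w_r.shape[0])
        ])
        value = q.max(axis=0)  # max over the action channel
    return value


# Hypothetical hand-set kernels for four deterministic moves (assumption:
# a trained network would learn these weights instead).
gamma = 0.9
offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
w_r = np.zeros((4, 3, 3))
w_v = np.zeros((4, 3, 3))
for a, (dy, dx) in enumerate(offsets):
    w_r[a, 1, 1] = 1.0               # reward of the current cell
    w_v[a, 1 + dy, 1 + dx] = gamma   # discounted value of the neighbouring cell

reward = np.zeros((3, 3))
reward[2, 2] = 1.0                   # goal in the bottom-right corner
value = vi_module(reward, w_r, w_v, num_iters=2)
```

In this sketch the kernel size bounds how far value can propagate in one step and the iteration count bounds how many steps are taken, which is why both matter more as the maze grows.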