Negative Inner-Loop Learning Rates Learn Universal Features

03/18/2022
by Tom Starshak, et al.

Model Agnostic Meta-Learning (MAML) consists of two optimization loops: the outer loop learns a meta-initialization of model parameters that is shared across tasks, and the inner loop performs task-specific adaptation. A variant of MAML, Meta-SGD, uses the same two-loop structure but also learns the learning rate for the adaptation step. Little attention has been paid to how the learned learning rate of Meta-SGD affects feature reuse. In this paper, we study the effect that a learned learning rate has on the per-task feature representations in Meta-SGD. The learned learning rate of Meta-SGD often contains negative values. During the adaptation phase, these negative learning rates push features away from task-specific features and towards task-agnostic features. We performed several experiments on the Mini-ImageNet dataset. Two neural networks were trained, one with MAML and one with Meta-SGD. The feature quality of both models was tested as follows: strip away the linear classification layer, pass labeled and unlabeled samples through the remaining encoder, and classify each unlabeled sample according to the label of its nearest labeled neighbor in feature space. This process was performed: 1) after training, using the meta-initialization parameters; 2) after adaptation to a task, evaluated on that same task; and 3) after adaptation to a task, evaluated on a different task. The MAML-trained model improved on the task it was adapted to but had worse performance on other tasks. The Meta-SGD-trained model was the opposite: it had worse performance on the task it was adapted to but improved on other tasks. This confirms the hypothesis that Meta-SGD's negative learning rates cause the model to learn task-agnostic features rather than simply adapt to task-specific features.
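To make the Meta-SGD update concrete, here is a minimal numpy sketch of one inner-loop adaptation step. In Meta-SGD the learning rate alpha is a learned vector with the same shape as the parameters (rather than MAML's single scalar), and nothing constrains its entries to be positive; a negative entry moves the corresponding parameter against the task gradient. All values below are hypothetical toy numbers, not taken from the paper.

```python
import numpy as np

def meta_sgd_adapt(theta, grad, alpha):
    """One Meta-SGD inner-loop step: theta' = theta - alpha * grad.

    alpha is a learned per-parameter learning-rate vector; entries may
    be negative, which pushes those parameters *away* from the
    task-specific gradient direction.
    """
    return theta - alpha * grad

# Toy illustration with a negative learning-rate entry:
theta = np.array([0.5, -0.2, 1.0])    # meta-initialized parameters
grad = np.array([0.1, 0.3, -0.2])     # task-specific gradient
alpha = np.array([0.4, -0.5, 0.1])    # learned rates; note alpha[1] < 0

adapted = meta_sgd_adapt(theta, grad, alpha)
# adapted[1] moved opposite to the gradient step a positive rate would take
```

In MAML, alpha would simply be a fixed scalar such as 0.01; the elementwise product is the only structural change Meta-SGD makes to the inner loop.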
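The feature-quality probe described above can be sketched as a nearest-neighbor classifier over encoder outputs. This is an illustrative reconstruction, not the paper's code: the distance metric (Euclidean here) and array shapes are assumptions.

```python
import numpy as np

def nearest_neighbor_classify(support_feats, support_labels, query_feats):
    """Label each query sample with the label of its nearest support sample.

    Mirrors the evaluation in the abstract: the linear classification
    layer is stripped, both labeled (support) and unlabeled (query)
    samples are passed through the encoder, and queries are classified
    by their nearest labeled neighbor in feature space.
    """
    # Pairwise squared Euclidean distances, shape (n_query, n_support)
    dists = ((query_feats[:, None, :] - support_feats[None, :, :]) ** 2).sum(-1)
    nearest = dists.argmin(axis=1)
    return support_labels[nearest]

# Toy 2-D "encoder features" (hypothetical values):
support = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = np.array([0, 1])
queries = np.array([[0.1, -0.1], [0.9, 1.2]])

preds = nearest_neighbor_classify(support, labels, queries)
```

Running this probe with the meta-initialization, with parameters adapted to the evaluation task, and with parameters adapted to a different task yields the three comparisons the paper reports.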
