Overcoming the Stability Gap in Continual Learning
In many real-world applications, deep neural networks are retrained from scratch as a dataset grows in size. Given the computational expense of retraining networks, it has been argued that continual learning could make updating networks more efficient. An obstacle to achieving this goal is the stability gap, the observation that when a network is updated on new data, its performance on previously learned data degrades before recovering. Addressing this problem would enable continual learning systems to learn new data with fewer network updates, resulting in increased computational efficiency. We study how to mitigate the stability gap in rehearsal (or experience replay), a widely employed continual learning method. We test a variety of hypotheses to understand why the stability gap occurs, which leads us to discover a method that vastly reduces this gap. In experiments on a large-scale incremental class learning setting, we significantly reduce the number of network updates needed to recover performance. Our work has the potential to advance the state of the art in continual learning for real-world applications and to reduce the carbon footprint required to keep neural networks up to date.
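For readers unfamiliar with rehearsal, the sketch below illustrates the generic idea in PyTorch: each update on a stream of new data is mixed with samples replayed from a small memory buffer of past data. This is only a minimal, assumed illustration of standard experience replay, not the method proposed in the paper; the buffer capacity, reservoir-sampling policy, toy model, and batch sizes are arbitrary choices for demonstration.

```python
# Minimal sketch of rehearsal / experience replay (illustrative only, not
# the paper's proposed method). New mini-batches are interleaved with
# samples drawn from a fixed-capacity memory buffer of past data.
import random
import torch
import torch.nn as nn
import torch.nn.functional as F


class ReplayBuffer:
    """Fixed-capacity memory filled by reservoir sampling (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []       # list of (x, y) example pairs
        self.num_seen = 0

    def add(self, x, y):
        for xi, yi in zip(x, y):
            self.num_seen += 1
            if len(self.data) < self.capacity:
                self.data.append((xi.clone(), yi.clone()))
            else:
                j = random.randrange(self.num_seen)
                if j < self.capacity:
                    self.data[j] = (xi.clone(), yi.clone())

    def sample(self, batch_size):
        batch = random.sample(self.data, min(batch_size, len(self.data)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.stack(ys)


def rehearsal_step(model, optimizer, buffer, x_new, y_new, replay_batch=32):
    """One gradient update on new data combined with replayed old data."""
    model.train()
    x, y = x_new, y_new
    if buffer.data:
        x_old, y_old = buffer.sample(replay_batch)
        x = torch.cat([x_new, x_old])
        y = torch.cat([y_new, y_old])
    loss = F.cross_entropy(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    buffer.add(x_new, y_new)   # store the new samples for future replay
    return loss.item()


# Toy usage: a tiny classifier updated on a simulated stream of new batches.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
buffer = ReplayBuffer(capacity=500)
for _ in range(10):
    x_new = torch.randn(32, 20)
    y_new = torch.randint(0, 10, (32,))
    rehearsal_step(model, optimizer, buffer, x_new, y_new)
```

Under plain rehearsal of this kind, the stability gap manifests as a transient drop in accuracy on buffered (old) classes immediately after the stream switches to new classes; the paper studies why this drop occurs and how to reduce the number of updates needed to recover.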