A Scalable Finite Difference Method for Deep Reinforcement Learning

10/14/2022
by Matthew Allen, et al.

Several low-bandwidth, distributable black-box optimization algorithms have recently been shown to perform nearly as well as more refined modern methods in some Deep Reinforcement Learning domains. In this work we investigate a core problem with the use of distributed workers in such systems. We also examine the dramatic differences in performance between the popular Adam gradient descent algorithm and the simplest form of stochastic gradient descent. These investigations produce a stable, low-bandwidth learning algorithm that achieves 100% usage of all connected CPUs under typical conditions.
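
As a rough illustration of the kind of finite-difference, black-box approach the title refers to, the sketch below estimates a gradient of an episodic return by perturbing policy parameters along random directions and then applies a plain SGD-style update (the simplest form of stochastic gradient descent mentioned in the abstract). It is a minimal sketch, not the paper's method: the names episode_return, fd_gradient, and sgd_step, and all hyperparameter values, are hypothetical.

    # Minimal sketch of finite-difference gradient estimation over policy
    # parameters. `episode_return(theta)` is a hypothetical black-box objective
    # returning the total reward of one rollout with parameters theta.
    import numpy as np

    def fd_gradient(episode_return, theta, sigma=0.02, num_directions=16, rng=None):
        """Estimate the gradient of episode_return at theta using central
        differences along random directions (antithetic sampling)."""
        rng = np.random.default_rng() if rng is None else rng
        grad = np.zeros_like(theta)
        for _ in range(num_directions):
            # Each perturbation can be reproduced from a shared random seed, so a
            # distributed worker only needs to send back a scalar return value,
            # which keeps the required bandwidth low.
            eps = rng.standard_normal(theta.shape)
            r_plus = episode_return(theta + sigma * eps)
            r_minus = episode_return(theta - sigma * eps)
            grad += (r_plus - r_minus) / (2.0 * sigma) * eps
        return grad / num_directions

    def sgd_step(theta, grad, lr=1e-2):
        """Plain gradient-ascent update, the simple SGD variant contrasted with
        Adam in the abstract."""
        return theta + lr * grad

Because each worker only evaluates returns at perturbed parameter vectors, the per-step communication is a handful of scalars per direction, which is what makes this family of methods easy to distribute across many CPUs.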
