Discounted Reinforcement Learning is Not an Optimization Problem

10/04/2019
by Abhishek Naik, et al.

Discounted reinforcement learning is fundamentally incompatible with function approximation for control in continuing tasks. This is because it is not an optimization problem — it lacks an objective function. After substantiating these claims, we go on to address some misconceptions about discounting and its connection to the average reward formulation. We encourage researchers to adopt rigorous optimization approaches for reinforcement learning in continuing tasks, such as average reward.
