Deep Reinforcement Learning in System Optimization
The recent advancements in deep reinforcement learning have opened new horizons and opportunities to tackle various problems in system optimization. Such problems are generally tailored to delayed, aggregated, and sequential rewards, which is an inherent behavior in the reinforcement learning setting, where an agent collects rewards while exploring and exploiting the environment to maximize the long term reward. However, in some cases, it is not clear why deep reinforcement learning is a good fit for the problem. Sometimes, it does not perform better than the state-of-the-art solutions. And in other cases, random search or greedy algorithms could outperform deep reinforcement learning. In this paper, we review, discuss, and evaluate the recent trends of using deep reinforcement learning in system optimization. We propose a set of essential metrics to guide future works in evaluating the efficacy of using deep reinforcement learning in system optimization. Our evaluation includes challenges, the types of problems, their formulation in the deep reinforcement learning setting, embedding, the model used, efficiency, and robustness. We conclude with a discussion on open challenges and potential directions for pushing further the integration of reinforcement learning in system optimization.
READ FULL TEXT