Personalized Cancer Chemotherapy Schedule: a numerical comparison of performance and robustness in model-based and model-free scheduling methodologies
Reinforcement learning algorithms are gaining popularity in fields where optimal scheduling is important, and oncology is no exception. The complex and uncertain dynamics of cancer limit the performance of traditional model-based scheduling strategies such as Optimal Control. Preliminary efforts have already been made to design chemotherapy schedules using Q-learning over a discrete action space. Motivated by the recent success of model-free Deep Reinforcement Learning (DRL) in challenging control tasks, we propose the use of the Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG) algorithms to design a personalized cancer chemotherapy schedule. We show that both succeed in the task and outperform the Optimal Control solution in the presence of uncertainty. Furthermore, we show that DDPG can exterminate cancer more efficiently than DQN thanks to its continuous action space. Finally, we provide some intuition regarding the number of samples required for training.
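To make the discrete-versus-continuous distinction concrete, the following minimal sketch contrasts DQN-style action selection (greedy over a fixed grid of dose levels) with DDPG-style action selection (a deterministic policy emitting any dose in the admissible range). All names, the 4-dimensional patient state, and the linear "networks" are hypothetical illustrative stand-ins, not the models used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-dim patient state (e.g. tumor cells, immune cells,
# normal cells, circulating drug) -- purely illustrative.
state = rng.random(4)

# DQN-style selection: Q-values over a *discrete* grid of dose levels.
dose_grid = np.linspace(0.0, 1.0, 5)       # 5 admissible doses
W_q = rng.standard_normal((5, 4))          # stand-in for a trained Q-network
q_values = W_q @ state                     # Q(s, a) for each discrete dose
dqn_dose = dose_grid[np.argmax(q_values)]  # greedy action: restricted to grid

# DDPG-style selection: a deterministic policy mu(s) emitting a
# *continuous* dose, clipped to the admissible range [0, 1].
w_mu = rng.standard_normal(4)
ddpg_dose = float(np.clip(np.tanh(w_mu @ state), 0.0, 1.0))

print(dqn_dose, ddpg_dose)
```

The point of the contrast: DQN can only administer one of the pre-specified dose levels, whereas DDPG can fine-tune the dose continuously, which is why a continuous action space can eliminate the tumor with less total drug.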