CARL: Conditional-value-at-risk Adversarial Reinforcement Learning
In this paper we present a risk-averse reinforcement learning (RL) method called Conditional value-at-risk Adversarial Reinforcement Learning (CARL). To the best of our knowledge, CARL is the first game formulation of Conditional Value-at-Risk (CVaR) RL. The game is played between a policy player and an adversary that perturbs the policy player's state transitions subject to a finite budget. We prove that, at the maximin equilibrium, the learned policy is CVaR-optimal with a risk tolerance explicitly related to the adversary's budget. We provide a gradient-based training procedure for CARL by formulating it as a zero-sum Stackelberg game, enabling the use of deep reinforcement learning architectures and training algorithms. Finally, we show that solving the CARL game leads to risk-averse behaviour in a toy grid environment, and confirm that a larger adversary budget produces increasingly cautious policies.
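To make the CVaR objective concrete, the following is a minimal sketch (not from the paper) of the empirical CVaR of a batch of episode returns: the mean of the worst alpha-fraction of outcomes. A CVaR-optimal policy maximizes this tail mean rather than the ordinary expected return; the function name and setup here are illustrative assumptions.

```python
import numpy as np

def cvar(returns, alpha):
    """Empirical Conditional Value-at-Risk of a sample of returns.

    CVaR at level alpha is the mean of the worst alpha-fraction of
    returns, so it penalizes bad tail outcomes that the plain mean hides.
    """
    returns = np.sort(np.asarray(returns, dtype=float))  # ascending: worst returns first
    k = max(1, int(np.ceil(alpha * len(returns))))       # size of the alpha-tail
    return returns[:k].mean()

# Illustration: two return distributions with equal mean but different tails.
rng = np.random.default_rng(0)
safe = rng.normal(1.0, 0.1, size=10_000)   # low-variance policy
risky = rng.normal(1.0, 1.0, size=10_000)  # high-variance policy

# A CVaR criterion prefers the safe policy even though the means are equal,
# which is the risk-averse behaviour the CARL objective targets.
print(cvar(safe, 0.05) > cvar(risky, 0.05))
```

Smaller alpha corresponds to stronger risk aversion, mirroring the paper's link between the adversary's budget and the policy's risk tolerance.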