Safety and robustness are two desired properties for any reinforcement l...
In this paper, we focus on the problem of robustifying reinforcement lea...
Having a perfect model to compute the optimal policy is often infeasible...
We propose a version of the WalkSAT algorithm, named BetaWalkSAT. This me...
Optimal policies in Markov decision processes (MDPs) are very sensitive ...
Robust MDPs are a promising framework for computing robust policies in r...
A reinforcement learning agent tries to maximize its cumulative payoff b...
Robustness is important for sequential decision making in a stochastic d...
Multi-armed bandits are a quintessential machine learning problem requir...