research
∙
11/14/2021
Explicit Explore, Exploit, or Escape (E^4): near-optimal safety-constrained reinforcement learning in polynomial time
In reinforcement learning (RL), an agent must explore an initially unkno...
research
∙
10/26/2021