Lecture Notes on Partially Known MDPs

12/06/2021
by Guillermo A. Perez, et al.

In these notes we tackle the problem of finding optimal policies for Markov decision processes (MDPs) that are not fully known to us. Our intention is to transition gradually from an offline setting to an online (learning) setting; that is, we are moving towards reinforcement learning.

