We introduce the receding-horizon policy gradient (RHPG) algorithm, the ...
We revisit in this paper the discrete-time linear quadratic regulator (L...
We develop the first end-to-end sample complexity of model-free policy g...
Direct policy search serves as one of the workhorses in modern reinforce...
Making decisions in the presence of a strategic opponent requires one to...