Ensuring the correctness of quantum programs is crucial for quantum soft...
Existing online learning algorithms for adversarial Markov Decision Proc...
Code classification is a difficult issue in program understanding and au...
We study the problem of designing adaptive multi-armed bandit algorithms...
The standard assumption in reinforcement learning (RL) is that agents ob...
We consider the best-of-both-worlds problem for learning an episodic Mar...
This work studies the problem of learning episodic Markov Decision Proce...
We consider the problem of learning in episodic finite-horizon Markov de...
Order dispatching and driver repositioning (also known as fleet manageme...