We propose a novel master-slave architecture to solve the top-K
combinat...
The goal of program synthesis, or code generation, is to generate execut...
Recent research in offline reinforcement learning (RL) has demonstrated ...
Programming screencasts (e.g., video tutorials on YouTube or live coding...
Reinforcement learning (RL) has shown promise for decision-making tasks ...
Recent success in Deep Reinforcement Learning (DRL) methods has shown th...
We revisit the estimation bias in policy gradients for the discounted
ep...
Whole-slide images (WSI) in computational pathology have high resolution...
The past few years have seen rapid progress in combining reinforcement
l...
We introduce CAMRL, the first curriculum-based asymmetric multi-task lea...
A promising paradigm for offline reinforcement learning (RL) is to const...
In cooperative multi-agent reinforcement learning (MARL), combining valu...
We study the adaptation of soft actor-critic (SAC) from continuous action ...
A key challenge of continual reinforcement learning (CRL) in dynamic
env...
Data explosion and an increase in model size drive the remarkable advanc...
Graph neural networks (GNNs) have been applied to a variety of graph t...
Reinforcement learning competitions advance the field by providing
appro...
Learning rational behaviors in open-world games like Minecraft remains t...
We present Coordinated Proximal Policy Optimization (CoPPO), an algorith...
Deep reinforcement learning (DRL) has achieved super-human performance o...
Offline reinforcement learning (RL) tries to learn the near-optimal poli...
In goal-oriented reinforcement learning, relabeling the raw goals in pas...
ExploitDB is one of the important public websites, which contributes a l...
Hero drafting is essential in MOBA game playing as it builds the team of...
MOBA games, e.g., Honor of Kings, League of Legends, and Dota 2, pose gr...
We present JueWu-SL, the first supervised-learning-based artificial
inte...