Deheng Ye

research

∙ 08/24/2023

Master-slave Deep Architecture for Top-K Multi-armed Bandits with Non-linear Bandit Feedback and Diversity Constraints

We propose a novel master-slave architecture to solve the top-K combinat...

0 Hanchi Huang, et al. ∙

research

∙ 07/10/2023

RLTF: Reinforcement Learning from Unit Test Feedback

The goal of program synthesis, or code generation, is to generate execut...

0 Jiate Liu, et al. ∙

research

∙ 05/26/2023

Future-conditioned Unsupervised Pretraining for Decision Transformer

Recent research in offline reinforcement learning (RL) has demonstrated ...

0 Zhihui Xie, et al. ∙

research

∙ 04/27/2023

SeeHow: Workflow Extraction from Programming Screencasts through Action-Aware Video Analytics

Programming screencasts (e.g., video tutorials on YouTube or live coding...

0 Dehai Zhao, et al. ∙

research

∙ 03/13/2023

Deploying Offline Reinforcement Learning with Human Feedback

Reinforcement learning (RL) has shown promise for decision-making tasks ...

0 Ziniu Li, et al. ∙

research

∙ 02/05/2023

Sample Dropout: A Simple yet Effective Variance Reduction Technique in Deep Policy Optimization

Recent success in Deep Reinforcement Learning (DRL) methods has shown th...

0 Zichuan Lin, et al. ∙

research

∙ 01/20/2023

Revisiting Estimation Bias in Policy Gradients for Deep Reinforcement Learning

We revisit the estimation bias in policy gradients for the discounted ep...

0 Haoxuan Pan, et al. ∙

research

∙ 12/04/2022

RLogist: Fast Observation Strategy on Whole-slide Images with Deep Reinforcement Learning

Whole-slide images (WSI) in computational pathology have high resolution...

27 Boxuan Zhao, et al. ∙

research

∙ 11/08/2022

Pretraining in Deep Reinforcement Learning: A Survey

The past few years have seen rapid progress in combining reinforcement l...

0 Zhihui Xie, et al. ∙

research

∙ 11/07/2022

Curriculum-based Asymmetric Multi-task Reinforcement Learning

We introduce CAMRL, the first curriculum-based asymmetric multi-task lea...

0 Hanchi Huang, et al. ∙

research

∙ 10/19/2022

Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation

A promising paradigm for offline reinforcement learning (RL) is to const...

0 Chengqian Gao, et al. ∙

research

∙ 09/26/2022

More Centralized Training, Still Decentralized Execution: Multi-Agent Conditional Policy Factorization

In cooperative multi-agent reinforcement learning (MARL), combining valu...

0 Jiangxing Wang, et al. ∙

research

∙ 09/21/2022

Revisiting Discrete Soft Actor-Critic

We study the adaption of soft actor-critic (SAC) from continuous action ...

0 Haibin Zhou, et al. ∙

research

∙ 09/01/2022

Dynamics-Adaptive Continual Reinforcement Learning via Progressive Contextualization

A key challenge of continual reinforcement learning (CRL) in dynamic env...

2 Tiantian Zhang, et al. ∙

research

∙ 08/11/2022

Quantized Adaptive Subgradient Algorithms and Their Applications

Data explosion and an increase in model size drive the remarkable advanc...

10 Ke Xu, et al. ∙

research

∙ 05/12/2022

GPN: A Joint Structural Learning Framework for Graph Neural Networks

Graph neural networks (GNNs) have been applied into a variety of graph t...

0 Qianggang Ding, et al. ∙

research

∙ 02/17/2022

MineRL Diamond 2021 Competition: Overview, Results, and Lessons Learned

Reinforcement learning competitions advance the field by providing appro...

0 Anssi Kanervisto, et al. ∙

research

∙ 12/07/2021

JueWu-MC: Playing Minecraft with Sample-efficient Hierarchical Reinforcement Learning

Learning rational behaviors in open-world games like Minecraft remains t...

0 Zichuan Lin, et al. ∙

research

∙ 11/07/2021

Coordinated Proximal Policy Optimization

We present Coordinated Proximal Policy Optimization (CoPPO), an algorith...

0 Zifan Wu, et al. ∙

research

∙ 10/09/2021

TiKick: Toward Playing Multi-agent Football Full Games from Single-agent Demonstrations

Deep reinforcement learning (DRL) has achieved super-human performance o...

0 Shiyu Huang, et al. ∙

research

∙ 06/19/2021

Boosting Offline Reinforcement Learning with Residual Generative Modeling

Offline reinforcement learning (RL) tries to learn the near-optimal poli...

7 Hua Wei, et al. ∙

research

∙ 05/13/2021

MapGo: Model-Assisted Policy Optimization for Goal-Oriented Tasks

In Goal-oriented Reinforcement learning, relabeling the raw goals in pas...

0 Menghui Zhu, et al. ∙

research

∙ 01/05/2021

Generating Informative CVE Description From ExploitDB Posts by Extractive Summarization

ExploitDB is one of the important public websites, which contributes a l...

0 Jiamou Sun, et al. ∙

research

∙ 12/18/2020

Which Heroes to Pick? Learning to Draft in MOBA Games with Neural Networks and Tree Search

Hero drafting is essential in MOBA game playing as it builds the team of...

0 Sheng Chen, et al. ∙

research

∙ 11/25/2020

Towards Playing Full MOBA Games with Deep Reinforcement Learning

MOBA games, e.g., Honor of Kings, League of Legends, and Dota 2, pose gr...

5 Deheng Ye, et al. ∙

research

∙ 11/25/2020

Supervised Learning Achieves Human-Level Performance in MOBA Games: A Case Study of Honor of Kings

We present JueWu-SL, the first supervised-learning-based artificial inte...

6 Deheng Ye, et al. ∙
