Zhihua Wu

research

∙ 08/07/2023

RecycleGPT: An Autoregressive Language Model with Recyclable Module

Existing large language models have to run K times to generate a sequenc...

0 Yufan Jiang, et al. ∙

research

∙ 02/20/2023

TA-MoE: Topology-Aware Large Scale Mixture-of-Expert Training

Sparsely gated Mixture-of-Expert (MoE) has demonstrated its effectivenes...

0 Chang Chen, et al. ∙

research

∙ 11/01/2022

Efficient AlphaFold2 Training using Parallel Evoformer and Branch Parallelism

The accuracy of AlphaFold2, a frontier end-to-end structure prediction s...

0 Guoxia Wang, et al. ∙

research

∙ 08/17/2022

Boosting Distributed Training Performance of the Unpadded BERT Model

Pre-training models are an important tool in Natural Language Processing...

0 Jinle Zeng, et al. ∙

research

∙ 07/12/2022

HelixFold: An Efficient Implementation of AlphaFold2 using PaddlePaddle

Accurate protein structure prediction can significantly accelerate the d...

0 Guoxia Wang, et al. ∙

research

∙ 05/20/2022

SE-MoE: A Scalable and Efficient Mixture-of-Experts Distributed Training and Inference System

With the increasing diversity of ML infrastructures nowadays, distribute...

0 Liang Shen, et al. ∙

research

∙ 05/19/2022

Nebula-I: A General Framework for Collaboratively Training Deep Learning Models on Low-Bandwidth Cloud Clusters

The ever-growing model size and scale of compute have attracted increasi...

8 Yang Xiang, et al. ∙

research

∙ 12/31/2021

ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation

Conventional methods for the image-text generation tasks mainly tackle t...

6 Han Zhang, et al. ∙

research

∙ 12/23/2021

ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation

Pre-trained language models have achieved state-of-the-art results in va...

4 Shuohuan Wang, et al. ∙

research

∙ 12/06/2021

End-to-end Adaptive Distributed Training on PaddlePaddle

Distributed training has become a pervasive and effective approach for t...

0 Yulong Ao, et al. ∙

research

∙ 11/20/2021

HeterPS: Distributed Deep Learning With Reinforcement Learning Based Scheduling in Heterogeneous Environments

Deep neural networks (DNNs) exploit many layers and a large number of pa...

1 Ji Liu, et al. ∙

research

∙ 09/20/2021

PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation

To explore the limit of dialogue generation pre-training, we present the...

0 Siqi Bao, et al. ∙

research

∙ 07/05/2021

ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation

Pre-trained models have achieved state-of-the-art results in various Nat...

12 Yu Sun, et al. ∙

research

∙ 02/27/2020

MNN: A Universal and Efficient Inference Engine

Deploying deep learning models on mobile devices draws more and more att...

43 Xiaotang Jiang, et al. ∙

research

∙ 02/18/2020

Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability

Federated learning is a new distributed machine learning framework, wher...

0 Yikai Yan, et al. ∙

research

∙ 11/06/2019

Secure Federated Submodel Learning

Federated learning was proposed with an intriguing vision of achieving c...

0 Chaoyue Niu, et al. ∙

