Sheng Shen

research

∙ 08/07/2023

AgentBench: Evaluating LLMs as Agents

Large Language Models (LLMs) are becoming increasingly smart and autonom...

0 Xiao Liu, et al. ∙

research

∙ 06/13/2023

SqueezeLLM: Dense-and-Sparse Quantization

Generative Large Language Models (LLMs) have demonstrated remarkable res...

0 Sehoon Kim, et al. ∙

research

∙ 06/02/2023

Towards Robust GAN-generated Image Detection: a Multi-view Completion Representation

GAN-generated image detection now becomes the first line of defense agai...

0 Chi Liu, et al. ∙

research

∙ 05/24/2023

Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts

The explosive growth of language models and their applications have led ...

0 Sheng Shen, et al. ∙

research

∙ 05/01/2023

Poisoning Language Models During Instruction Tuning

Instruction-tuned LMs such as ChatGPT, FLAN, and InstructGPT are finetun...

0 Alexander Wan, et al. ∙

research

∙ 03/13/2023

Scaling Vision-Language Models with Sparse Mixture of Experts

The field of natural language processing (NLP) has made significant stri...

0 Sheng Shen, et al. ∙

research

∙ 12/31/2022

New Challenges in Reinforcement Learning: A Survey of Security and Privacy

Reinforcement learning (RL) is one of the most important branches of AI....

0 Yunjiao Lei, et al. ∙

research

∙ 11/21/2022

Multitask Vision-Language Prompt Tuning

Prompt Tuning, conditioning on task-specific learned prompt vectors, has...

0 Sheng Shen, et al. ∙

research

∙ 11/03/2022

Crosslingual Generalization through Multitask Finetuning

Multitask prompted finetuning (MTF) has been shown to help large languag...

0 Niklas Muennighoff, et al. ∙

research

∙ 10/27/2022

What Language Model to Train if You Have One Million GPU Hours?

The crystallization of modeling methods around the Transformer architect...

4 Teven Le Scao, et al. ∙

research

∙ 10/17/2022

ITSRN++: Stronger and Better Implicit Transformer Network for Continuous Screen Content Image Super-Resolution

Nowadays, online screen sharing and remote cooperation are becoming ubiq...

6 Sheng Shen, et al. ∙

research

∙ 04/20/2022

K-LITE: Learning Transferable Visual Models with External Knowledge

Recent state-of-the-art computer vision systems are trained from natural...

3 Sheng Shen, et al. ∙

research

∙ 03/13/2022

One Parameter Defense – Defending against Data Inference Attacks via Differential Privacy

Machine learning models are vulnerable to data inference attacks, such a...

7 Dayong Ye, et al. ∙

research

∙ 03/11/2022

Staged Training for Transformer Language Models

The current standard approach to scaling transformer language models tra...

8 Sheng Shen, et al. ∙

research

∙ 12/12/2021

Implicit Transformer Network for Screen Content Image Continuous Super-Resolution

Nowadays, there is an explosive growth of screen contents due to the wid...

2 Jingyu Yang, et al. ∙

research

∙ 10/15/2021

Multitask Prompted Training Enables Zero-Shot Task Generalization

Large language models have recently been shown to attain reasonable zero...

10 Victor Sanh, et al. ∙

research

∙ 09/08/2021

What's Hidden in a One-layer Randomly Weighted Transformer?

We demonstrate that, hidden within one-layer randomly weighted neural ne...

17 Sheng Shen, et al. ∙

research

∙ 07/13/2021

How Much Can CLIP Benefit Vision-and-Language Tasks?

Most existing Vision-and-Language (V&L) models rely on pre-trained vis...

7 Sheng Shen, et al. ∙

research

∙ 07/02/2021

Learned Token Pruning for Transformers

A major challenge in deploying transformer models is their prohibitive i...

8 Sehoon Kim, et al. ∙

research

∙ 05/30/2021

MLPruning: A Multilevel Structured Pruning Framework for Transformer-based Models

Pruning is an effective method to reduce the memory footprint and comput...

21 Zhewei Yao, et al. ∙

research

∙ 12/30/2020

Reservoir Transformer

We demonstrate that transformers obtain impressive performance even when...

23 Sheng Shen, et al. ∙

research

∙ 10/19/2020

From Distributed Machine Learning To Federated Learning: In The View Of Data Privacy And Security

Federated learning is an improved version of distributed machine learnin...

0 Sheng Shen, et al. ∙

research

∙ 10/12/2020

MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding

Phrase localization is a task that studies the mapping from textual phra...

1 Qinxin Wang, et al. ∙

research

∙ 09/15/2020

Noisy Self-Knowledge Distillation for Text Summarization

In this paper we apply self-knowledge distillation to text summarization...

0 Yang Liu, et al. ∙

research

∙ 08/16/2020

Differentially Private Multi-Agent Planning for Logistic-like Problems

Planning is one of the main approaches used to improve agents' working e...

13 Dayong Ye, et al. ∙

research

∙ 06/01/2020

ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning

We introduce AdaHessian, a second order stochastic optimization algorith...

9 Zhewei Yao, et al. ∙

research

∙ 03/17/2020

Rethinking Batch Normalization in Transformers

The standard normalization method for neural network (NN) models used in...

0 Sheng Shen, et al. ∙

research

∙ 02/26/2020

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Since hardware resources are limited, the objective of training deep lea...

0 Zhuohan Li, et al. ∙

research

∙ 09/12/2019

Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT

Transformer based architectures have become de-facto models used for a r...

0 Sheng Shen, et al. ∙

research

∙ 04/02/2019

Pragmatically Informative Text Generation

We improve the informativeness of models for conditional text generation...

0 Sheng Shen, et al. ∙

research

∙ 11/01/2018

On the Generation of Medical Question-Answer Pairs

Question answering (QA) has achieved promising progress recently. Howeve...

0 Sheng Shen, et al. ∙

research

∙ 06/07/2018

Ermes: Emoji-Powered Representation Learning for Cross-Lingual Sentiment Classification

Most existing sentiment analysis approaches heavily rely on a large amou...

0 Zhenpeng Chen, et al. ∙

research

∙ 05/16/2017

Through a Gender Lens: An Empirical Study of Emoji Usage over Large-Scale Android Users

Emojis have gained incredible popularity in recent years and become a ne...

0 Zhenpeng Chen, et al. ∙

