The mixture proportions of pretraining data domains (e.g., Wikipedia, books, ...)
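As a concrete illustration of what domain mixture proportions mean in pretraining, here is a minimal sketch, with hypothetical domain names, weights, and data pools, of a loader that samples each batch according to fixed mixture weights:

```python
import random

# Hypothetical domains and mixture proportions (illustrative values; must sum to 1).
mixture = {"wikipedia": 0.25, "books": 0.15, "web_text": 0.60}

# Toy per-domain document pools standing in for real corpus shards.
pools = {d: [f"{d}_doc_{i}" for i in range(1000)] for d in mixture}

def sample_batch(batch_size: int) -> list[str]:
    """Draw a batch whose expected domain composition follows the mixture weights."""
    domains = random.choices(list(mixture), weights=list(mixture.values()), k=batch_size)
    return [random.choice(pools[d]) for d in domains]

print(sample_batch(8))
```

Changing the weights in `mixture` changes what the model sees most during pretraining, which is the quantity this abstract refers to.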
Reward design in reinforcement learning (RL) is challenging since specifying ...
Selecting a suitable training dataset is crucial for both general-domain...
Language models (LMs) are becoming the foundation for almost all major l...
Language modeling on large-scale datasets leads to impressive performance ...
We consider unsupervised domain adaptation (UDA), where labeled data from ...
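To make the UDA setting concrete: the learner gets labels only in the source domain and raw, unlabeled inputs in the target domain. Below is a minimal sketch using scikit-learn, synthetic data, and a simple pseudo-labeling loop chosen purely for illustration (not any particular paper's method):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Labeled source data and unlabeled target data (synthetic stand-ins).
X_src = rng.normal(0.0, 1.0, size=(500, 10))
y_src = (X_src[:, 0] > 0).astype(int)
X_tgt = rng.normal(0.5, 1.0, size=(500, 10))  # shifted target domain, no labels

# 1) Fit on labeled source data only.
clf = LogisticRegression().fit(X_src, y_src)

# 2) Pseudo-label confident target examples and retrain on the union.
confident = clf.predict_proba(X_tgt).max(axis=1) > 0.9
X_aug = np.vstack([X_src, X_tgt[confident]])
y_aug = np.concatenate([y_src, clf.predict(X_tgt[confident])])
clf = LogisticRegression().fit(X_aug, y_aug)
```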
Machine learning systems deployed in the wild are often trained on a source ...
Large pretrained language models such as GPT-3 have the surprising ability ...
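"In-context learning" here means conditioning a frozen model on a few input-output demonstrations placed in the prompt, with no gradient updates. A minimal sketch of prompt construction (the format and examples are hypothetical):

```python
def build_icl_prompt(demos: list[tuple[str, str]], query: str) -> str:
    """Format few-shot demonstrations followed by the test query."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

demos = [("great movie!", "positive"), ("terrible plot.", "negative")]
prompt = build_icl_prompt(demos, "I loved every minute.")
print(prompt)  # pass this string to any LM; the model infers the task from the demos
```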
Out-of-distribution detection is an important component of reliable ML systems ...
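A common baseline for OOD detection, shown here only to illustrate the task, thresholds the model's maximum softmax probability: in-distribution inputs tend to receive confident predictions, while OOD inputs often do not. A NumPy sketch with made-up logits and an arbitrary threshold:

```python
import numpy as np

def max_softmax_score(logits: np.ndarray) -> np.ndarray:
    """Maximum softmax probability per example; low scores suggest OOD."""
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return probs.max(axis=1)

logits = np.array([[4.0, 0.1, 0.2],   # peaked -> likely in-distribution
                   [0.9, 1.0, 1.1]])  # flat -> possibly OOD
scores = max_softmax_score(logits)
is_ood = scores < 0.5  # threshold is a tunable, illustrative choice
print(scores, is_ood)
```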
Pretrained language models have achieved state-of-the-art performance wh...
Distribution shifts can cause significant degradation in a broad range of ...
Consider a prediction setting where a few inputs (e.g., satellite images...
We focus on prediction problems with high-dimensional outputs that are s...
Adversarial training augments the training set with perturbations to improve ...
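As an illustration of that mechanism, here is a minimal PyTorch sketch of one adversarial training step using the FGSM perturbation, a standard choice used here only as an example; the model, data, and epsilon are placeholders:

```python
import torch
import torch.nn.functional as F

def adversarial_step(model, x, y, optimizer, eps=0.03):
    """One training step on FGSM-perturbed inputs."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    x_adv = (x + eps * grad.sign()).detach()  # worst-case step within an L_inf ball

    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()

# Toy usage with placeholder model and data.
model = torch.nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
print(adversarial_step(model, x, y, opt))
```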
While adversarial training can improve robust accuracy (against an adversary) ...
Many machine learning tasks require sampling a subset of items from a collection ...
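One standard way to sample a size-k subset according to item weights is the Gumbel-top-k trick, named here as a well-known technique rather than this paper's specific method: perturb each log-weight with Gumbel noise and keep the k largest. A NumPy sketch with illustrative weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_top_k(log_weights: np.ndarray, k: int) -> np.ndarray:
    """Sample k distinct indices; equivalent to weighted sampling without replacement."""
    gumbel = -np.log(-np.log(rng.uniform(size=log_weights.shape)))
    return np.argsort(log_weights + gumbel)[-k:]

weights = np.array([0.1, 0.5, 0.2, 0.15, 0.05])  # illustrative item weights
subset = gumbel_top_k(np.log(weights), k=2)
print(subset)
```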
Large amounts of labeled data are typically required to train deep learning ...