Dani Yogatama

research

∙ 07/21/2022

Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?

There has been a lot of interest in the scaling properties of Transform...

Yi Tay, et al.

research

∙ 06/21/2022

Questions Are All You Need to Train a Dense Passage Retriever

We introduce ART, a new corpus-level autoencoding approach for training ...

Devendra Singh Sachan, et al.

research

∙ 03/02/2022

HighMMT: Towards Modality and Task Generalization for High-Modality Representation Learning

Learning multimodal representations involves discovering correspondences...

Paul Pu Liang, et al.

research

∙ 02/13/2022

A Contrastive Framework for Neural Text Generation

Text generation is of great importance to many natural language processi...

Yixuan Su, et al.

research

∙ 01/24/2022

Relational Memory Augmented Language Models

We present a memory-augmented approach to condition an autoregressive la...

Qi Liu, et al.

research

∙ 10/12/2021

Balancing Average and Worst-case Accuracy in Multitask Learning

When training and evaluating machine learning models on a large number o...

Paul Michel, et al.

research

∙ 10/06/2021

ABC: Attention with Bounded-memory Control

Transformer architectures have achieved state-of-the-art results on a va...

Hao Peng, et al.

research

∙ 09/22/2021

Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers

There remain many open questions pertaining to the scaling behaviour of ...

Yi Tay, et al.

research

∙ 06/09/2021

End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering

We present an end-to-end differentiable training method for retrieval-au...

Devendra Singh Sachan, et al.

research

∙ 03/24/2021

Finetuning Pretrained Transformers into RNNs

Transformers have outperformed recurrent neural networks (RNNs) in natur...

Jungo Kasai, et al.

research

∙ 03/03/2021

Random Feature Attention

Transformers are state-of-the-art models for a variety of sequence model...

Hao Peng, et al.

research

∙ 02/04/2021

Adaptive Semiparametric Language Models

We present a language model that combines a large parametric neural netw...

Dani Yogatama, et al.

research

∙ 02/03/2021

Pitfalls of Static Language Modelling

Our world is open-ended, non-stationary and constantly evolving; thus wh...

Angeliki Lazaridou, et al.

research

∙ 05/27/2020

Syntactic Structure Distillation Pretraining For Bidirectional Encoders

Textual representation learners trained on large amounts of data have ac...

Adhiguna Kuncoro, et al.

research

∙ 04/30/2020

A Call for More Rigor in Unsupervised Cross-lingual Learning

We review motivations, definition, approaches, and methodology for unsup...

Mikel Artetxe, et al.

research

∙ 02/21/2020

Modelling Latent Skills for Multitask Language Generation

We present a generative model for multitask conditional language generat...

Kris Cao, et al.

research

∙ 11/08/2019

Reducing Sentiment Bias in Language Models via Counterfactual Evaluation

Recent improvements in large-scale language models have driven progress ...

Po-Sen Huang, et al.

research

∙ 10/25/2019

On the Cross-lingual Transferability of Monolingual Representations

State-of-the-art unsupervised multilingual models (e.g., multilingual BE...

Mikel Artetxe, et al.

research

∙ 10/18/2019

A Mutual Information Maximization Perspective of Language Representation Learning

We show state-of-the-art word representation learning methods maximize a...

Lingpeng Kong, et al.

research

∙ 09/03/2019

Achieving Verified Robustness to Symbol Substitutions via Interval Bound Propagation

Neural networks are part of many contemporary NLP systems, yet their emp...

Po-Sen Huang, et al.

research

∙ 06/03/2019

Episodic Memory in Lifelong Language Learning

We introduce a lifelong language learning setup where a model needs to l...

Cyprien de Masson d'Autume, et al.

research

∙ 01/31/2019

Learning and Evaluating General Linguistic Intelligence

We define general linguistic intelligence as the ability to reuse previo...

Dani Yogatama, et al.

research

∙ 01/27/2019

Variational Smoothing in Recurrent Neural Network Language Models

We present a new theoretical perspective of data noising in recurrent ne...

Lingpeng Kong, et al.

research

∙ 05/25/2017

Jointly Learning Sentence Embeddings and Syntax with Unsupervised Tree-LSTMs

We introduce a neural network that represents sentences by composing the...

Jean Maillard, et al.

research

∙ 05/11/2017

Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems

Solving algebraic word problems requires executing a series of arithmeti...

Wang Ling, et al.

research

∙ 03/06/2017

Generative and Discriminative Text Classification with Recurrent Neural Networks

We empirically characterize the performance of discriminative and genera...

Dani Yogatama, et al.

research

∙ 11/28/2016

Learning to Compose Words into Sentences with Reinforcement Learning

We use reinforcement learning to learn tree-structured neural networks f...

Dani Yogatama, et al.

research

∙ 12/08/2015

Deep Speech 2: End-to-End Speech Recognition in English and Mandarin

We show that an end-to-end deep learning approach can be used to recogni...

Dario Amodei, et al.

research

∙ 03/02/2015

Bayesian Optimization of Text Representations

When applying machine learning to problems in NLP, there are many choice...

Dani Yogatama, et al.

research

∙ 10/09/2013

A Sparse and Adaptive Prior for Time-Dependent Model Parameters

We consider the scenario where the parameters of a probabilistic model a...

Dani Yogatama, et al.
