Online speech recognition, where the model only accesses context to the ...
Recurrent Neural Networks (RNNs) offer fast inference on long sequences ...
Structured state space sequence (S4) models have recently achieved state...
Performant Convolutional Neural Network (CNN) architectures must be tail...
Transformers have been essential to pretraining success in NLP. Other ar...
Visual data such as images and videos are typically modeled as discretiz...
Linear time-invariant state space models (SSMs) are a classical model fro...
State space models (SSMs) have recently been shown to be very effective a...
The use of Convolutional Neural Networks (CNNs) is widespread in Deep Le...
Developing architectures suitable for modeling raw audio is a challengin...
A central goal of sequence modeling is designing a single principled mod...
Recurrent neural networks (RNNs), temporal convolutions, and neural diff...
This paper studies Principal Component Analysis (PCA) for data lying in ...
Modern neural network architectures use structured linear transformation...
In real-world classification tasks, each class often comprises multiple ...
Similarity-based Hierarchical Clustering (HC) is a classical unsupervise...
A central problem in learning from sequential data is representing cumul...
Classifiers in machine learning are often brittle when deployed. Particu...
Gating mechanisms are widely used in neural network models, where they a...
In this paper we consider the following sparse recovery problem. We have...
Fast linear transforms are ubiquitous in machine learning, including the...
The low displacement rank (LDR) framework for structured matrices repres...
Hyperbolic embeddings offer excellent quality with few dimensions when e...
Data augmentation, a technique in which a training set is expanded with ...