Real-time recurrent learning (RTRL) for sequence-processing recurrent ne...
Current state-of-the-art object-centric models use slots and attention-b...
Few-shot learning with sequence-processing neural networks (NNs) has rec...
Unsupervised learning of discrete representations from continuous ones i...
Short-term memory in standard, general-purpose, sequence-processing recu...
Well-designed diagnostic tasks have played a key role in studying the fa...
Work on fast weight programmers has demonstrated the effectiveness of ke...
Neural ordinary differential equations (ODEs) have attracted much attent...
The discovery of reusable sub-routines simplifies decision-making and pl...
Linear layers in neural networks (NNs) trained by gradient descent can b...
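A minimal NumPy sketch of one such equivalence, assuming the standard "dual form" of a linear layer updated by rank-one gradient steps (all names and shapes here are illustrative, not taken from the paper): the trained layer's output equals the initial layer's output plus unnormalised attention over the stored training patterns.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, T = 5, 3, 8

W0 = rng.normal(size=(d_out, d_in))
xs = rng.normal(size=(T, d_in))    # hypothetical training inputs
es = rng.normal(size=(T, d_out))   # hypothetical error signals (-lr * dL/dy_t)

# Primal view: apply the rank-one updates to the weight matrix.
W = W0.copy()
for x_t, e_t in zip(xs, es):
    W += np.outer(e_t, x_t)

x_test = rng.normal(size=d_in)
y_primal = W @ x_test

# Dual view: keep W0 fixed and attend over the stored training inputs,
# weighting each error signal by the dot product <x_t, x_test>.
attn = xs @ x_test                 # unnormalised attention scores
y_dual = W0 @ x_test + es.T @ attn

assert np.allclose(y_primal, y_dual)
```

Both views compute the same test-time output; the dual view simply leaves the initial weights untouched and re-reads the training patterns at query time.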
The weight matrix (WM) of a neural network (NN) is its program. The prog...
We share our experience with the recently released WILDS benchmark, a co...
The inputs and/or outputs of some neural nets are weight matrices of oth...
Despite successes across a broad range of applications, Transformers hav...
Recently, many datasets have been proposed to test the systematic genera...
Transformers with linearised attention ("linear Transformers") have demo...
We show the formal equivalence of linearised self-attention mechanisms a...
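As a hedged illustration of that equivalence, the sketch below (plain NumPy, assuming unnormalised linear attention with identity feature maps; variable names are illustrative) computes the same outputs two ways: by summing over all past key/value pairs, and by incrementally programming a fast weight matrix with outer products.

```python
import numpy as np

rng = np.random.default_rng(1)
d_k, d_v, T = 4, 3, 6
ks = rng.normal(size=(T, d_k))   # keys
vs = rng.normal(size=(T, d_v))   # values
qs = rng.normal(size=(T, d_k))   # queries

# Linearised self-attention: y_t = sum_{j<=t} v_j * (k_j . q_t)
ys_attn = []
for t in range(T):
    scores = ks[: t + 1] @ qs[t]
    ys_attn.append(vs[: t + 1].T @ scores)

# Fast-weight view: W_t = W_{t-1} + v_t k_t^T, then query with q_t.
W = np.zeros((d_v, d_k))
ys_fw = []
for t in range(T):
    W += np.outer(vs[t], ks[t])
    ys_fw.append(W @ qs[t])

assert np.allclose(ys_attn, ys_fw)
```

The fast-weight loop never revisits past keys and values; it accumulates their outer products into a single matrix, which is what makes the linearised formulation attractive for long sequences.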
We present a complete training pipeline to build a state-of-the-art hybr...
We explore multi-layer autoregressive Transformer models in language mod...
We present state-of-the-art automatic speech recognition (ASR) systems e...
Lingvo is a Tensorflow framework offering a complete solution for collab...
We evaluate attention-based encoder-decoder models along two dimensions:...
Sequence-to-sequence attention-based models on subword units allow simpl...