Shusuke Takahashi

research

∙ 09/17/2023

Zero- and Few-shot Sound Event Localization and Detection

Sound event localization and detection (SELD) systems estimate direction...

0 Kazuki Shimada, et al. ∙

research

∙ 08/14/2023

The Sound Demixing Challenge 2023 – Cinematic Demixing Track

This paper summarizes the cinematic demixing (CDX) track of the Sound De...

0 Stefan Uhlich, et al. ∙

research

∙ 06/15/2023

STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events

While direction of arrival (DOA) of sound events is generally estimated ...

5 Kazuki Shimada, et al. ∙

research

∙ 05/18/2023

Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders

Diffusion-based speech enhancement (SE) has been investigated recently, ...

0 Hao Shi, et al. ∙

research

∙ 05/13/2023

The Whole Is Greater than the Sum of Its Parts: Improving DNN-based Music Source Separation

This paper presents the crossing scheme (X-scheme) for improving the per...

0 Ryosuke Sawata, et al. ∙

research

∙ 05/11/2023

Extending Audio Masked Autoencoders Toward Audio Restoration

Audio classification and restoration are among major downstream tasks in...

0 Zhi Zhong, et al. ∙

research

∙ 05/10/2023

Diffusion-based Signal Refiner for Speech Separation

We have developed a diffusion-based speech refiner that improves the ref...

0 Masato Hirano, et al. ∙

research

∙ 02/16/2023

An Attention-based Approach to Hierarchical Multi-label Music Instrument Classification

Although music is typically multi-label, many works have studied hierarc...

0 Zhi Zhong, et al. ∙

research

∙ 10/27/2022

A Versatile Diffusion-based Generative Refiner for Speech Enhancement

Although deep neural network (DNN)-based speech enhancement (SE) methods...

0 Ryosuke Sawata, et al. ∙

research

∙ 10/11/2022

DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability

In this paper we propose a novel generative approach, DiffRoll, to tackl...

16 Kin Wai Cheuk, et al. ∙

research

∙ 06/04/2022

STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events

This report presents the Sony-TAu Realistic Spatial Soundscapes 2022 (ST...

0 Archontis Politis, et al. ∙

research

∙ 05/16/2022

SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization

One noted issue of vector-quantized variational autoencoder (VQ-VAE) is ...

26 Yuhta Takida, et al. ∙

research

∙ 10/14/2021

Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training

Sound event localization and detection (SELD) involves identifying the d...

0 Kazuki Shimada, et al. ∙

research

∙ 10/13/2021

Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection

Recording and annotating real sound events for a sound event localizatio...

0 Yuichiro Koyama, et al. ∙

research

∙ 10/13/2021

Music Source Separation with Deep Equilibrium Models

While deep neural network-based music source separation (MSS) is very ef...

0 Yuichiro Koyama, et al. ∙

research

∙ 10/12/2021

Spatial mixup: Directional loudness modification as data augmentation for sound event localization and detection

Data augmentation methods have shown great importance in diverse supervi...

0 Ricardo Falcon-Perez, et al. ∙

research

∙ 10/12/2021

Improving Character Error Rate Is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-box Acoustic Models

A deep neural network (DNN)-based speech enhancement (SE) aiming to maxi...

0 Ryosuke Sawata, et al. ∙

research

∙ 06/21/2021

Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection

This report describes our systems submitted to the DCASE2021 challenge t...

0 Kazuki Shimada, et al. ∙

research

∙ 06/04/2021

Manifold-Aware Deep Clustering: Maximizing Angles between Embedding Vectors Based on Regular Simplex

This paper presents a new deep clustering (DC) method called manifold-aw...

0 Keitaro Tanaka, et al. ∙

research

∙ 02/17/2021

Preventing Posterior Collapse Induced by Oversmoothing in Gaussian VAE

Variational autoencoders (VAEs) often suffer from posterior collapse, wh...

26 Yuhta Takida, et al. ∙

research

∙ 10/29/2020

ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection

Neural-network (NN)-based methods show high performance in sound event l...

0 Kazuki Shimada, et al. ∙

research

∙ 10/08/2020

All for One and One for All: Improving Music Separation by Bridging Networks

This paper proposes several improvements for music separation with deep ...

0 Ryosuke Sawata, et al. ∙

research

∙ 06/22/2020

Sound Event Localization and Detection Using Activity-Coupled Cartesian DOA Vector and RD3net

Our systems submitted to the DCASE2020 task 3: Sound Event Localization ...

0 Kazuki Shimada, et al. ∙

