In this paper, we explore a continuous modeling approach for deep-learni...
We propose a novel neural speaker diarization system using memory-aware ...
We propose a first step toward multilingual end-to-end automatic speech ...
Previous Multimodal Information based Speech Processing (MISP) challenge...
This technical report details our submission system to the CHiME-7 DASR ...
In recent research, slight performance improvement is observed from auto...
The goal of this study is to implement diffusion models for speech enhan...
We propose a multi-dimensional structured state space (S4) approach to s...
The Multi-modal Information based Speech Processing (MISP) challenge aim...
We propose a quantum kernel learning (QKL) framework to address the inhe...
In this paper, we propose a deep learning based multi-speaker direction ...
We propose an ensemble learning framework with Poisson sub-sampling to e...
Differential privacy (DP) is one data protection avenue to safeguard use...
In this paper, we propose two techniques, namely joint modeling and data...
Audio-only-based wake word spotting (WWS) is challenging under noisy con...
We propose two improvements to target-speaker voice activity detection (...
Multimodal emotion recognition is a challenging task in emotion computin...
We propose a variational Bayesian (VB) approach to learning distribution...
We propose a separation guided speaker diarization (SGSD) approach by fu...
We propose a novel neural model compression strategy combining data augm...
We propose using an adversarial autoencoder (AAE) to replace generative ...
This system description describes our submission system to the Third DIH...
In this paper, we propose a novel four-stage data augmentation approach ...
In this paper, we propose a novel deep learning architecture to improvin...
To improve device robustness, a highly desirable key feature of a compet...
We propose a novel decentralized feature extraction approach in federate...
In this paper, we propose a visual embedding approach to improving embed...
In this paper, we exploit the properties of mean absolute error (MAE) as...
In this paper, we show that, in vector-to-vector regression utilizing de...
In this paper, we propose a domain adaptation framework to address the d...
In this paper, we propose a sub-utterance unit selection framework to re...
This paper investigates different trade-offs between the number of model...
In this technical report, we present a joint effort of four groups, name...
We propose a novel neural label embedding (NLE) scheme for the domain ad...
Recent studies have highlighted adversarial examples as ubiquitous threa...
Recent deep neural networks based techniques, especially those equipped ...
We propose a tensor-to-vector regression approach to multi-channel speec...
One challenging problem of robust automatic speech recognition (ASR) is ...
While standard cell layouts are drawn with minimum design rules to maxim...
Starting from 22-nm, a standard cell must be designed to be full lithogr...
While standard cell layouts are drawn with minimum design rules for maxi...
We propose an end-to-end model based on convolutional and recurrent neur...
In this paper, we present a probabilistic framework for goal-driven spok...
We present a Bayesian approach to adapting parameters of a well-trained ...