Nam Soo Kim

research

∙ 06/14/2023

EM-Network: Oracle Guided Self-distillation for Sequence Learning

We introduce EM-Network, a novel self-distillation approach that effecti...

0 Ji Won Yoon, et al. ∙

research

∙ 05/30/2023

Towards single integrated spoofing-aware speaker verification embeddings

This study aims to develop a single integrated spoofing-aware speaker ve...

3 Sung Hwan Mun, et al. ∙

research

∙ 11/30/2022

SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech

Zero-shot multi-speaker text-to-speech (ZSM-TTS) models aim to generate ...

0 Byoung Jin Choi, et al. ∙

research

∙ 11/28/2022

Inter-KD: Intermediate Knowledge Distillation for CTC-Based Automatic Speech Recognition

Recently, the advance in deep learning has brought a considerable improv...

0 Ji Won Yoon, et al. ∙

research

∙ 10/12/2022

Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-to-Speech

Several recently proposed text-to-speech (TTS) models achieved to genera...

0 Byoung Jin Choi, et al. ∙

research

∙ 08/17/2022

Disentangled Speaker Representation Learning via Mutual Information Minimization

Domain mismatch problem caused by speaker-unrelated feature has been a m...

0 Sung Hwan Mun, et al. ∙

research

∙ 04/13/2022

HuBERT-EE: Early Exiting HuBERT for Efficient Speech Recognition

Pre-training with self-supervised models, such as Hidden-unit BERT (HuBE...

0 Ji Won Yoon, et al. ∙

research

∙ 04/03/2022

Selective Kernel Attention for Robust Speaker Verification

Recent state-of-the-art speaker verification architectures adopt multi-s...

0 Sung Hwan Mun, et al. ∙

research

∙ 03/29/2022

Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus

Training a text-to-speech (TTS) model requires a large scale text labele...

0 Minchan Kim, et al. ∙

research

∙ 11/05/2021

Oracle Teacher: Towards Better Knowledge Distillation

Knowledge distillation (KD), best known as an effective method for model...

0 Ji Won Yoon, et al. ∙

research

∙ 07/06/2021

Kosp2e: Korean Speech to English Translation Corpus

Most speech-to-text (S2T) translation studies use English speech as a so...

0 Won Ik Cho, et al. ∙

research

∙ 04/03/2021

Diff-TTS: A Denoising Diffusion Model for Text-to-Speech

Although neural text-to-speech (TTS) models have attracted a lot of atte...

0 Myeonghun Jeong, et al. ∙

research

∙ 03/24/2021

StyleKQC: A Style-Variant Paraphrase Corpus for Korean Questions and Commands

Paraphrasing is often performed with less concern for controlled style c...

0 Won Ik Cho, et al. ∙

research

∙ 02/06/2021

Continuous Monitoring of Blood Pressure with Evidential Regression

Photoplethysmogram (PPG) signal-based blood pressure (BP) estimation is ...

0 Hyeongju Kim, et al. ∙

research

∙ 10/22/2020

Unsupervised Representation Learning for Speaker Recognition via Contrastive Equilibrium Learning

In this paper, we propose a simple but powerful unsupervised learning me...

0 Sung Hwan Mun, et al. ∙

research

∙ 10/22/2020

Robust Text-Dependent Speaker Verification via Character-Level Information Preservation for the SdSV Challenge 2020

This paper describes our submission to Task 1 of the Short-duration Spea...

0 Sung Hwan Mun, et al. ∙

research

∙ 08/07/2020

Disentangled speaker and nuisance attribute embedding for robust speaker verification

Over the recent years, various deep learning-based embedding methods hav...

0 Woo Hyun Kang, et al. ∙

research

∙ 07/25/2020

Robust Front-End for Multi-Channel ASR using Flow-Based Density Estimation

For multi-channel speech recognition, speech enhancement techniques such...

0 Hyeongju Kim, et al. ∙

research

∙ 07/10/2020

Gated Recurrent Context: Softmax-free Attention for Online Encoder-Decoder Speech Recognition

Recently, attention-based encoder-decoder (AED) models have shown state-...

0 Hyeonseung Lee, et al. ∙

research

∙ 06/08/2020

SoftFlow: Probabilistic Framework for Normalizing Flow on Manifolds

Flow-based generative models are composed of invertible transformations ...

0 Hyeongju Kim, et al. ∙

research

∙ 06/08/2020

WaveNODE: A Continuous Normalizing Flow for Speech Synthesis

In recent years, various flow-based generative models have been proposed...

0 Hyeongju Kim, et al. ∙

research

∙ 05/17/2020

Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation

Speech is one of the most effective means of communication and is full o...

0 Won Ik Cho, et al. ∙

research

∙ 12/01/2019

Machines Getting with the Program: Understanding Intent Arguments of Non-Canonical Directives

Modern dialog managers face the challenge of having to fulfill human-lev...

0 Won Ik Cho, et al. ∙

research

∙ 10/21/2019

Disambiguating Speech Intention via Audio-Text Co-attention Framework: A Case of Prosody-semantics Interface

Understanding the intention of an utterance is challenging for some pros...

0 Won Ik Cho, et al. ∙

research

∙ 05/31/2019

Investigating an Effective Character-level Embedding in Korean Sentence Classification

Different from the writing systems of many Romance and Germanic language...

0 Won Ik Cho, et al. ∙

research

∙ 05/28/2019

On Measuring Gender Bias in Translation of Gender-neutral Pronouns

Ethics regarding social bias has recently thrown striking issues in natu...

0 Won Ik Cho, et al. ∙

research

∙ 11/10/2018

Speech Intention Understanding in a Head-final Language: A Disambiguation Utilizing Intonation-dependency

For a large portion of real-life utterances, the intention cannot be sol...

0 Won Ik Cho, et al. ∙

research

∙ 10/31/2018

Real-time Automatic Word Segmentation for User-generated Text

For readability and possibly for disambiguation, appropriate word segmen...

0 Won Ik Cho, et al. ∙

research

∙ 10/10/2018

Structured Argument Extraction of Korean Question and Command

Intention identification and slot filling is a core issue in dialog mana...

0 Won Ik Cho, et al. ∙
