Tan Lee

research

∙ 09/21/2023

CoMFLP: Correlation Measure based Fast Search on ASR Layer Pruning

Transformer-based speech recognition (ASR) model with deep layers exhibi...

0 Wei Liu, et al. ∙

research

∙ 09/21/2023

Sparsely Shared LoRA on Whisper for Child Speech Recognition

Whisper is a powerful automatic speech recognition (ASR) model. Neverthe...

0 Wei Liu, et al. ∙

research

∙ 02/21/2023

Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring

Recent studies on pronunciation scoring have explored the effect of intr...

0 Wei Liu, et al. ∙

research

∙ 12/06/2022

Label-free Knowledge Distillation with Contrastive Loss for Light-weight Speaker Recognition

Very deep models for speaker recognition (SR) have demonstrated remarkab...

0 Zhiyuan Peng, et al. ∙

research

∙ 12/06/2022

Covariance Regularization for Probabilistic Linear Discriminant Analysis

Probabilistic linear discriminant analysis (PLDA) is commonly used in sp...

0 Zhiyuan Peng, et al. ∙

research

∙ 10/31/2022

Model Compression for DNN-Based Text-Independent Speaker Verification Using Weight Quantization

DNN-based models achieve high performance in the speaker verification (S...

0 Jingyu Li, et al. ∙

research

∙ 10/31/2022

Convolution-Based Channel-Frequency Attention for Text-Independent Speaker Verification

Deep convolutional neural networks (CNNs) have been applied to extractin...

0 Jingyu Li, et al. ∙

research

∙ 06/29/2022

iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre

The capability of generating speech with specific type of emotion is des...

0 Guangyan Zhang, et al. ∙

research

∙ 06/27/2022

iExam: A Novel Online Exam Monitoring and Analysis System Based on Face Detection and Recognition

Online exams via video conference software like Zoom have been adopted i...

0 Xu Yang, et al. ∙

research

∙ 06/26/2022

Transport-Oriented Feature Aggregation for Speaker Embedding Learning

Pooling is needed to aggregate frame-level features into utterance-level...

0 Yusheng Tian, et al. ∙

research

∙ 05/25/2022

An Investigation on Applying Acoustic Feature Conversion to ASR of Adult and Child Speech

The performance of child speech recognition is generally less satisfacto...

0 Wei Liu, et al. ∙

research

∙ 04/22/2022

Unifying Cosine and PLDA Back-ends for Speaker Verification

State-of-art speaker verification (SV) systems use a back-end model to s...

0 Zhiyuan Peng, et al. ∙

research

∙ 04/12/2022

CorrectSpeech: A Fully Automated System for Speech Correction and Accent Reduction

This study extends our previous work on text-based speech editing to dev...

0 Daxin Tan, et al. ∙

research

∙ 03/29/2022

Automatic Detection of Speech Sound Disorder in Child Speech Using Posterior-based Speaker Representations

This paper presents a macroscopic approach to automatic detection of spe...

0 Si-Ioi Ng, et al. ∙

research

∙ 11/20/2021

Acoustical Analysis of Speech Under Physical Stress in Relation to Physical Activities and Physical Literacy

Human speech production encompasses physiological processes that natural...

0 Si-Ioi Ng, et al. ∙

research

∙ 10/09/2021

Data Augmentation with Locally-time Reversed Speech for Automatic Speech Recognition

Psychoacoustic studies have shown that locally-time reversed (LTR) speec...

0 Si-Ioi Ng, et al. ∙

research

∙ 10/08/2021

Environment Aware Text-to-Speech Synthesis

This study aims at designing an environment-aware text-to-speech (TTS) s...

0 Daxin Tan, et al. ∙

research

∙ 10/08/2021

A study on the efficacy of model pre-training in developing neural text-to-speech system

In the development of neural text-to-speech systems, model pre-training ...

3 Guangyan Zhang, et al. ∙

research

∙ 09/20/2021

Improving Text-Independent Speaker Verification with Auxiliary Speakers Using Graph

The paper presents a novel approach to refining similarity scores betwee...

0 Jingyu Li, et al. ∙

research

∙ 09/16/2021

Utterance-level neural confidence measure for end-to-end children speech recognition

Confidence measure is a performance index of particular importance for a...

0 Wei Liu, et al. ∙

research

∙ 08/11/2021

Robust Feature Learning on Long-Duration Sounds for Acoustic Scene Classification

Acoustic scene classification (ASC) aims to identify the type of scene (...

0 Yuzhong Wu, et al. ∙

research

∙ 08/05/2021

Applying the Information Bottleneck Principle to Prosodic Representation Learning

This paper describes a novel design of a neural network-based speech gen...

0 Guangyan Zhang, et al. ∙

research

∙ 07/04/2021

EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion

This paper presents the design, implementation and evaluation of a speec...

0 Daxin Tan, et al. ∙

research

∙ 06/16/2021

Detection of Consonant Errors in Disordered Speech Based on Consonant-vowel Segment Embedding

Speech sound disorder (SSD) refers to a type of developmental disorder i...

0 Si-Ioi Ng, et al. ∙

research

∙ 03/30/2021

Enhancing Segment-Based Speech Emotion Recognition by Deep Self-Learning

Despite the widespread utilization of deep neural networks (DNNs) for sp...

0 Shuiyang Mao, et al. ∙

research

∙ 03/08/2021

CUHK-EE Voice Cloning System for ICASSP 2021 M2VoC Challenge

This paper presents the CUHK-EE voice cloning system for ICASSP 2021 M2V...

0 Daxin Tan, et al. ∙

research

∙ 12/14/2020

Bayesian Learning for Deep Neural Network Adaptation

A key task for speech recognition systems is to reduce the mismatch betw...

40 Xurong Xie, et al. ∙

research

∙ 11/28/2020

Unsupervised Spoken Term Discovery Based on Re-clustering of Hypothesized Speech Segments with Siamese and Triplet Networks

Spoken term discovery from untranscribed speech audio could be achieved ...

0 Man-Ling Sung, et al. ∙

research

∙ 11/12/2020

The CUHK-TUDELFT System for The SLT 2021 Children Speech Recognition Challenge

This technical report describes our submission to the 2021 SLT Children ...

0 Si-Ioi Ng, et al. ∙

research

∙ 11/03/2020

Unsupervised Pattern Discovery from Thematic Speech Archives Based on Multilingual Bottleneck Features

The present study tackles the problem of automatically discovering spoke...

0 Man-Ling Sung, et al. ∙

research

∙ 08/15/2020

Advancing Multiple Instance Learning with Attention Modeling for Categorical Speech Emotion Recognition

Categorical speech emotion recognition is typically performed as a seque...

0 Shuiyang Mao, et al. ∙

research

∙ 08/15/2020

EigenEmo: Spectral Utterance Representation Using Dynamic Mode Decomposition for Speech Emotion Classification

Human emotional speech is, by its very nature, a variant signal. This re...

0 Shuiyang Mao, et al. ∙

research

∙ 08/12/2020

Emotion Profile Refinery for Speech Emotion Classification

Human emotions are inherently ambiguous and impure. When designing syste...

0 Shuiyang Mao, et al. ∙

research

∙ 08/07/2020

Automatic Detection of Phonological Errors in Child Speech Using Siamese Recurrent Autoencoder

Speech sound disorder (SSD) refers to the developmental disorder in whic...

0 Si-Ioi Ng, et al. ∙

research

∙ 08/07/2020

CUCHILD: A Large-Scale Cantonese Corpus of Child Speech for Phonology and Articulation Assessment

This paper describes the design and development of CUCHILD, a large-scal...

0 Si-Ioi Ng, et al. ∙

research

∙ 10/30/2019

Mixture factorized auto-encoder for unsupervised hierarchical deep factorization of speech signal

Speech signal is constituted and contributed by various informative fact...

0 Zhiyuan Peng, et al. ∙

research

∙ 08/09/2019

Exploiting Cross-Lingual Speaker and Phonetic Diversity for Unsupervised Subword Modeling

This research addresses the problem of acoustic modeling of low-resource...

0 Siyuan Feng, et al. ∙

research

∙ 06/17/2019

Improving Unsupervised Subword Modeling via Disentangled Speech Representation Learning and Transformation

This study tackles unsupervised subword modeling in the zero-resource sc...

0 Siyuan Feng, et al. ∙

research

∙ 06/17/2019

Combining Adversarial Training and Disentangled Speech Representation for Robust Zero-Resource Subword Modeling

This study addresses the problem of unsupervised subword unit discovery ...

0 Siyuan Feng, et al. ∙

research

∙ 01/06/2019

Enhancing Sound Texture in CNN-Based Acoustic Scene Classification

Acoustic scene classification is the task of identifying the scene from ...

0 Yuzhong Wu, et al. ∙

research

∙ 11/01/2017

Reducing Model Complexity for DNN Based Large-Scale Audio Classification

Audio classification is the task of identifying the sound categories tha...

0 Yuzhong Wu, et al. ∙

