S. Umesh

research

∙ 08/02/2023

SALTTS: Leveraging Self-Supervised Speech Representations for improved Text-to-Speech Synthesis

While FastSpeech2 aims to integrate aspects of speech such as pitch, ene...

Ramanan Sivaguru, et al.

research

∙ 05/31/2023

The Tag-Team Approach: Leveraging CLS and Language Tagging for Enhancing Multilingual ASR

Building a multilingual Automated Speech Recognition (ASR) system in a l...

Kaousheik Jayakumar, et al.

research

∙ 03/10/2023

UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation

In this paper, we introduce UnFuSeD, a novel approach to leverage self-s...

Ashish Seth, et al.

research

∙ 11/03/2022

Channel-Aware Pretraining of Joint Encoder-Decoder Self-Supervised Model for Telephonic-Speech ASR

This paper proposes a novel technique to obtain better downstream ASR pe...

Vrunda N. Sukhadia, et al.

research

∙ 11/02/2022

SLICER: Learning universal audio representations using low-resource self-supervised pre-training

We present a new Self-Supervised Learning (SSL) approach to pre-train en...

Ashish Seth, et al.

research

∙ 11/02/2022

MAST: Multiscale Audio Spectrogram Transformers

We present Multiscale Audio Spectrogram Transformer (MAST) for audio cla...

Sreyan Ghosh, et al.

research

∙ 11/02/2022

data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup

In this paper, we propose a new Self-Supervised Learning (SSL) algorithm...

Vasista Sai Lodagala, et al.

research

∙ 10/05/2022

CCC-wav2vec 2.0: Clustering aided Cross Contrastive Self-supervised learning of speech representations

While Self-Supervised Learning has helped reap the benefit of the scale ...

Vasista Sai Lodagala, et al.

research

∙ 06/11/2022

Investigation of Ensemble features of Self-Supervised Pretrained Models for Automatic Speech Recognition

Self-supervised learning (SSL) based models have been shown to generate ...

A Arunkumar, et al.

research

∙ 03/31/2022

Analyzing the factors affecting usefulness of Self-Supervised Pre-trained Representations for Speech Recognition

Self-supervised learning (SSL) to learn high-level speech representation...

Lodagala V S V Durga Prasad, et al.

research

∙ 03/31/2022

PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations

While self-supervised speech representation learning (SSL) models serve ...

Lodagala V S V Durga Prasad, et al.

research

∙ 03/31/2022

A Discourse Aware Sequence Learning Approach for Emotion Recognition in Conversations

The expression of emotions is a crucial part of daily human communicatio...

Sreyan Ghosh, et al.

research

∙ 03/31/2022

MMER: Multimodal Multi-task learning for Emotion Recognition in Spoken Utterances

Emotion Recognition (ER) aims to classify human utterances into differen...

Harshvardhan Srivastava, et al.

research

∙ 03/30/2022

Span Classification with Structured Information for Disfluency Detection in Spoken Utterances

Existing approaches in disfluency detection focus on solving a token-lev...

Sreyan Ghosh, et al.

research

∙ 03/25/2022

DeLoRes: Decorrelating Latent Spaces for Low-Resource Audio Representation Learning

Inspired by the recent progress in self-supervised learning for computer...

Sreyan Ghosh, et al.

research

∙ 02/18/2022

Domain Adaptation of low-resource Target-Domain models using well-trained ASR Conformer Models

In this paper, we investigate domain adaptation for low-resource Automat...

Vrunda N. Sukhadia, et al.

research

∙ 10/17/2021

Deep Clustering For General-Purpose Audio Representations

We introduce DECAR, a self-supervised pre-training approach for learning...

Sreyan Ghosh, et al.

research

∙ 08/11/2020

S-vectors: Speaker Embeddings based on Transformer's Encoder for Text-Independent Speaker Verification

X-vectors have become the standard for speaker-embeddings in automatic s...

Metilda Sagaya Mary N J, et al.

research

∙ 08/07/2020

Investigation of Speaker-adaptation methods in Transformer based ASR

End-to-end models are fast replacing conventional hybrid models in autom...

Vishwas M. Shetty, et al.

research

∙ 07/15/2013

Modified SPLICE and its Extension to Non-Stereo Data for Noise Robust Speech Recognition

In this paper, a modification to the training process of the popular SPL...

D. S. Pavan Kumar, et al.
