Jaesung Huh

research

∙ 07/18/2023

OxfordVGG Submission to the EGO4D AV Transcription Challenge

This report presents the technical details of our submission on the EGO4...

Jaesung Huh, et al.

research

∙ 03/01/2023

WhisperX: Time-Accurate Speech Transcription of Long-Form Audio

Large-scale, weakly-supervised speech recognition models, such as Whispe...

Max Bain, et al.

research

∙ 02/01/2023

Epic-Sounds: A Large-scale Dataset of Actions That Sound

We introduce EPIC-SOUNDS, a large-scale dataset of audio annotations cap...

Jaesung Huh, et al.

research

∙ 11/01/2021

With a Little Help from my Temporal Context: Multimodal Egocentric Action Recognition

In egocentric videos, actions occur in quick succession. We capitalise o...

Evangelos Kazakos, et al.

research

∙ 11/30/2020

Look who's not talking

The objective of this work is speaker diarisation of speech recordings '...

Youngki Kwon, et al.

research

∙ 09/29/2020

Clova Baseline System for the VoxCeleb Speaker Recognition Challenge 2020

This report describes our submission to the VoxCeleb Speaker Recognition...

Hee-Soo Heo, et al.

research

∙ 07/23/2020

Augmentation adversarial training for unsupervised speaker recognition

The goal of this work is to train robust speaker recognition models with...

Jaesung Huh, et al.

research

∙ 07/02/2020

Spot the conversation: speaker diarisation in the wild

The goal of this paper is speaker diarisation of videos collected 'in th...

Joon Son Chung, et al.

research

∙ 05/18/2020

Metric Learning for Keyword Spotting

The goal of this work is to train effective representations for keyword ...

Jaesung Huh, et al.

research

∙ 03/26/2020

In defence of metric learning for speaker recognition

The objective of this paper is 'open-set' speaker recognition of unseen ...

Joon Son Chung, et al.

research

∙ 02/10/2020

Modeling Musical Onset Probabilities via Neural Distribution Learning

Musical onset detection can be formulated as a time-to-event (TTE) or ti...

Jaesung Huh, et al.

research

∙ 11/06/2019

The sound of my voice: speaker representation loss for target voice separation

Research on content and style representations has been widely studied in...

Seongkyu Mun, et al.

research

∙ 10/24/2019

Delving into VoxCeleb: environment invariant speaker recognition

Research in speaker recognition has recently seen significant progress d...

Joon Son Chung, et al.

research

∙ 03/07/2019

Phase-aware Speech Enhancement with Deep Complex U-Net

Most deep learning-based models for speech enhancement have mainly focus...

Hyeong-Seok Choi, et al.

