The local and global features are both essential for automatic speech re...
Attention-based encoder-decoder (AED) models have shown impressive perfo...
The use of Transformer represents a recent success in speech enhancement...
Talking face generation, also known as speech-to-lip generation, reconst...
Audio and visual signals complement each other in human speech perceptio...
Sound source localization aims to seek the direction of arrival (DOA) of...
Speaker extraction seeks to extract the clean speech of a target speaker...
In this work, we present the development of a new database, namely Sound...
Active speaker detection (ASD) seeks to detect who is speaking in a visu...
Most of the prior studies in the spatial DoA domain focus on a single mo...
This document describes our submission to the 2018 LOCalization And TrAc...