Xinyuan Qian

research

∙ 05/24/2023

InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition

The local and global features are both essential for automatic speech re...

Zhi-Hao Lai, et al.

research

∙ 05/23/2023

Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding

Attention-based encoder-decoder (AED) models have shown impressive perfo...

Tian-Hao Zhang, et al.

research

∙ 05/15/2023

Ripple sparse self-attention for monaural speech enhancement

The use of Transformer represents a recent success in speech enhancement...

Qiquan Zhang, et al.

research

∙ 03/29/2023

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

Talking face generation, also known as speech-to-lip generation, reconst...

Jiadong Wang, et al.

research

∙ 09/05/2022

Predict-and-Update Network: Audio-Visual Speech Recognition Inspired by Human Speech Perception

Audio and visual signals complement each other in human speech perceptio...

Jiadong Wang, et al.

research

∙ 06/24/2022

Iterative Sound Source Localization for Unknown Number of Sources

Sound source localization aims to seek the direction of arrival (DOA) of...

Yanjie Fu, et al.

research

∙ 03/31/2022

Speaker Extraction with Co-Speech Gestures Cue

Speaker extraction seeks to extract the clean speech of a target speaker...

Zexu Pan, et al.

research

∙ 08/05/2021

SLoClas: A Database for Joint Sound Localization and Classification

In this work, we present the development of a new database, namely Sound...

Xinyuan Qian, et al.

research

∙ 07/14/2021

Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection

Active speaker detection (ASD) seeks to detect who is speaking in a visu...

Ruijie Tao, et al.

research

∙ 05/13/2021

Multi-target DoA Estimation with an Audio-visual Fusion Mechanism

Most of the prior studies in the spatial DoA domain focus on a single mo...

Xinyuan Qian, et al.

research

∙ 01/25/2019

LOCATA challenge: speaker localization with a planar array

This document describes our submission to the 2018 LOCalization And TrAc...

Xinyuan Qian, et al.

