We introduce region-customizable sound extraction (ReZero), a general an...
Echo cancellation and noise reduction are essential for full-duplex
comm...
This paper summarizes the cinematic demixing (CDX) track of the Sound
De...
Modern neural-network-based speech processing systems are typically requ...
Multi-channel speech separation using speaker's directional information ...
Recently, frequency domain all-neural beamforming methods have achieved
...
Recently, pre-trained Transformer models have received a rising inte...
Hand-crafted spatial features, such as inter-channel intensity differenc...
Most research adopts supervised training for speaker extraction, wh...
Recently, end-to-end speaker extraction has attracted increasing attenti...
Keyword spotting (KWS) and speaker verification (SV) are two important t...
For keyword spotting (KWS), it remains challenging to achieve the trade-off betw...
To date, mainstream target speech separation (TSS) approaches are formul...
Transformer-based self-supervised models are trained as feature extracto...
Target speech separation refers to extracting a target speaker's voice f...
Hand-crafted spatial features (e.g., inter-channel phase difference, IPD...
Target speech separation refers to extracting the target speaker's speec...
Speech separation has been studied widely for single-channel close-talk
...
The end-to-end approach for single-channel speech separation has been st...