The INTERSPEECH 2020 Far-Field Speaker Verification Challenge
The INTERSPEECH 2020 Far-Field Speaker Verification Challenge (FFSVC 2020) addresses three different research problems under well-defined conditions: far-field text-dependent speaker verification from single microphone array, far-field text-independent speaker verification from single microphone array, and far-field text-dependent speaker verification from distributed microphone arrays. All three tasks pose a cross-channel challenge to the participants. To simulate the real-life scenario, the enrollment utterances are recorded from close-talk cellphone, while the test utterances are recorded from the far-field microphone arrays. In this paper, we describe the database, the challenge, and the baseline system, which is based on a ResNet-based deep speaker network with cosine similarity scoring. For a given utterance, the speaker embeddings of different channels are equally averaged as the final embedding. The baseline system achieves minDCFs of 0.62, 0.66, and 0.64 and EERs of 6.27 7.18
READ FULL TEXT