It is challenging to extract semantic meanings directly from audio signa...
For on-device automatic speech recognition (ASR), quantization-aware tra...
The recurrent neural network transducer (RNN-T) is a prominent streaming...
We present a streaming, Transformer-based end-to-end automatic speech re...
Dialogue act classification (DAC) is a critical task for spoken language...
End-to-end Spoken Language Understanding (E2E SLU) has attracted increas...
End-to-end (E2E) automatic speech recognition (ASR) systems often have d...
Although speech recognition has become a widespread technology, inferrin...
Spoken language understanding (SLU) systems translate voice input comman...
Multi-channel inputs offer several advantages over single-channel, to im...
In order to evaluate the performance of the attention-based neural ASR u...
Transformers are powerful neural architectures that allow integrating di...
We propose a novel Transformer encoder-based architecture with syntactic...
End-to-end (E2E) spoken language understanding (SLU) systems can infer t...
Spoken language understanding (SLU) refers to the process of inferring t...