Non-autoregressive (NAR) modeling has gained significant interest in spe...
Text language models have shown remarkable zero-shot capability in gener...
Collecting audio-text pairs is expensive; however, it is much easier to ...
Although frame-based models, such as CTC and transducers, have an affini...
There has been an increased interest in the integration of pretrained sp...
End-to-end speech summarization has been shown to improve performance ov...
Conformer, a convolution-augmented Transformer variant, has become the d...
Recently there have been efforts to introduce new benchmark tasks for sp...
This paper describes our system for the low-resource domain adaptation t...
Most human interactions occur in the form of spoken conversations where ...
Spoken language understanding (SLU) tasks have been studied for many dec...
Disfluency detection has mainly been solved in a pipeline approach, as p...
Collecting sufficient labeled data for spoken language understanding (SL...
This paper presents BERT-CTC, a novel formulation of end-to-end speech r...
End-to-end spoken language understanding (SLU) systems are gaining popul...
End-to-end (E2E) models are becoming increasingly popular for spoken lan...
The landscape of privacy laws and regulations around the world is comple...
Although Transformers have gained success in several speech processing t...
In attempts to "explain" predictions of machine learning models, researc...
As Automatic Speech Recognition (ASR) systems are getting better, there i...
Decomposable tasks are complex and comprise a hierarchy of sub-tasks....
In this paper, we address the problem of learning low-dimensional represen...
Knowledge Graphs are becoming increasingly popular for a variety of down...
Knowledge Graphs (KGs) extracted from text sources are often noisy and l...
We address the problem of learning a distributed representation of entit...
We consider algorithm selection in the context of ad-hoc information ret...