This paper presents FunCodec, a fundamental neural speech codec toolkit,...
In this paper, we present a new question-answering (QA) based key-value ...
Notwithstanding the promise of Lipschitz-based approaches to
determinist...
This paper investigates a phenomenon where query-based object detectors
...
The goal of expressive Text-to-speech (TTS) is to synthesize natural spe...
This report describes our VolcTrans system for the WMT22 shared task on
...
Echo state network (ESN), a kind of recurrent neural networks, consists ...
Facial semantic guidance (facial landmarks, facial parsing maps, facial
...
Recent grid-based document representations like BERTgrid allow the
simul...
A text to image generation (T2I) model aims to generate photo-realistic
...
Smart contracts are the artifact of the blockchain that provide immutabl...
In this paper, we propose a novel regularization method, RotationOut, fo...
Capturing spatiotemporal correlations is an essential topic in video
cla...
Conditional random fields (CRFs) have been shown to be one of the most
s...
In this paper, we introduce our submissions for the tasks of trimmed act...
Salient object detection is a fundamental problem and has been received ...