The current speech anti-spoofing countermeasures (CMs) show excellent pe...
The detection of spoofing speech generated by unseen algorithms remains ...
The wav2vec 2.0 and integrated spectro-temporal graph attention network ...
Many news comment mining studies are based on the assumption that commen...
Learning-based approaches to monocular motion capture have recently show...
We propose a new paradigm for universal information extraction (IE) that...
Named-entity recognition (NER) detects texts with predefined semantic la...
Face reenactment methods attempt to restore and re-animate portrait vide...
Creating animatable avatars from static scans requires the modeling of c...
Inferring the full transportation network, including sidewalks and cycle...
The inspection of the Public Right of Way (PROW) for accessibility barri...
Objects in a scene are not always related. The execution efficiency of t...
This paper describes the deepfake audio detection system submitted to th...
Text information including extensive prior knowledge about land cover cl...
Currently, cross-scene hyperspectral image (HSI) classification has draw...
This report describes a pre-trained language model Erlangshen with prope...
We present PyMAF-X, a regression-based approach to recovering a full-bod...
The voice conversion task is to modify the speaker identity of continuou...
We propose DeepMultiCap, a novel method for multi-person performance cap...
Semantic segmentation aims to robustly predict coherent class labels for...
Realistic speech-driven 3D facial animation is a challenging problem due...
This paper contributes a novel realtime multi-person motion capture algo...
Fully convolutional neural networks give accurate, per-pixel prediction ...
This paper presents an open source tool for testing the recognition accu...