Cameras are rapidly becoming the choice for on-board sensors towards spa...
It has long been an ill-posed problem to predict absolute depth maps fro...
This paper studies a novel energy-based cooperative learning framework f...
On the one hand, the dehazing task is an illposedness problem, which mea...
The 4D millimeter-wave (mmWave) radar, capable of measuring the range,
a...
An efficient team is essential for the company to successfully complete ...
3D reconstruction of medical images from 2D images has increasingly beco...
Neural text-to-speech (TTS) generally consists of cascaded architecture ...
We introduce a language modeling approach for text to speech synthesis (...
Visual entailment (VE) is to recognize whether the semantics of a hypoth...
Direct speech-to-speech translation (S2ST) is an attractive research top...
Detectingandsegmentingobjectswithinwholeslideimagesis essential in
compu...
Current text to speech (TTS) systems usually leverage a cascaded acousti...
Whole slide image (WSI) classification often relies on deep weakly super...
Expressive speech synthesis, like audiobook synthesis, is still challeng...
Binaural audio plays a significant role in constructing immersive augmen...
Text to speech (TTS) has made rapid progress in both academia and indust...
Adaptive text to speech (TTS) can synthesize new voices in zero-shot
sce...
Denoising diffusion probabilistic models (diffusion models for short) re...
In cross-lingual speech synthesis, the speech in various languages can b...
X-ray image plays an important role in manufacturing for quality assuran...
End-to-end TTS suffers from high data requirements as it is difficult fo...
Light-weight convolutional neural networks (CNNs) have small complexity ...
3D teeth reconstruction from X-ray is important for dental diagnosis and...
Cross-speaker style transfer is crucial to the applications of multi-sty...
Deep learning models are notoriously data-hungry. Thus, there is an urgi...
Up-to-date High-Definition (HD) maps are essential for self-driving cars...
This paper presents a speech BERT model to extract embedded prosody
info...
In this paper, several works are proposed to address practical challenge...
In recent years, transformer-based models have shown state-of-the-art re...
Machine Speech Chain, which integrates both end-to-end (E2E) automatic s...
We present a multilingual end-to-end Text-To-Speech framework that maps ...
Convolutional networks (ConvNets) have achieved promising accuracy for
v...
Depth estimation and semantic segmentation play essential roles in scene...
Patient's understanding on forthcoming dental surgeries is required by
p...
The ability of deep learning to predict with uncertainty is recognized a...
Neural end-to-end text-to-speech (TTS) , which adopts either a recurrent...
Modern deep reinforcement learning plays an important role to solve a wi...
In this paper, a deep reinforcement learning (DRL) method is proposed to...
Because of its streaming nature, recurrent neural network transducer (RN...
Visual object tracking is an important application of computer vision.
R...
End-to-end neural TTS has achieved superior performance on reading style...
Panoramic X-ray and Cone Beam Computed Tomography (CBCT) are two of the ...
Existing deep learning methods depend on an encoder-decoder structure to...
Due to a lack of medical resources or oral health awareness, oral diseas...
Due to a lack of medical resources or oral health awareness, oral diseas...
As an emerging technology, blockchain has achieved great success in nume...
Label propagation is a popular technique for anatomical segmentation. In...
The Order Acceptance and Scheduling (OAS) problem describes a class of
r...
Inspired by the great success of convolutional neural networks on struct...