Engaging video comments play an important role in video social media, as...
Improving the generalization capabilities of general-purpose robotic age...
We propose a novel framework for learning high-level cognitive capabilit...
Text to Speech (TTS) models can generate natural and high-quality speech...
A storyboard is a roadmap for video creation which consists of shot-by-s...
The pre-trained image-text models, like CLIP, have demonstrated the stro...
AI creation, such as poem or lyrics generation, has attracted increasing...
Expressive speech synthesis, like audiobook synthesis, is still challeng...
Adaptive text to speech (TTS) can synthesize new voices in zero-shot
sce...
Audiovisual scenes are pervasive in our daily life. It is commonplace fo...
The ability of a dialog system to express consistent language style duri...
It is appealing to have a system that generates a story or scripts
autom...
As the wide adoption of intelligent chatbot in human daily life, user de...
A storyboard is a sequence of images to illustrate a story containing
mu...
Sound effects play an essential role in producing high-quality radio sto...
Vision is a common source of inspiration for poetry. The objects and the...
Divergent word usages reflect differences among people. In this paper, w...