Musicians and fans often produce lyric videos, a form of music videos th...
Zero-shot skeleton-based action recognition aims to recognize actions of...
With the advance of text-to-image models (e.g., Stable Diffusion) and
co...
Factor model is a fundamental investment tool in quantitative investment...
Vision-language models have achieved tremendous progress far beyond what...
Self-supervised learning has demonstrated remarkable capability in
repre...
Amateurs working on mini-films and short-form videos usually spend lots ...
The ability to choose an appropriate camera view among multiple cameras ...
Training a generalizable 3D part segmentation network is quite challengi...
Neural Radiance Field (NeRF) has achieved outstanding performance in mod...
Shots are key narrative elements of various videos, e.g. movies, TV seri...
The task of searching certain people in videos has seen increasing poten...
Recent years have seen remarkable advances in visual understanding. Howe...
Scene, as the crucial unit of storytelling in movies, contains complex
a...
Automatic musical accompaniment is where a human musician is accompanied...
Adversarial examples expose vulnerabilities of machine learning models. ...