In spite of the rapidly evolving landscape of text-to-image generation, ...
Recent years have witnessed great progress in creating vivid audio-drive...
Large language models achieve state-of-the-art performance on sequence g...
Applying Reinforcement Learning (RL) to sequence generation models enabl...
In the Euclidean k-TSP (resp. Euclidean k-MST), we are given n points in...
Noise plagues many numerical datasets, where the recorded values in the ...
The issue of detecting deepfakes has garnered significant attention in t...
Despite recent advances in syncing lip movements with any audio waves, c...
We study the unit-demand capacitated vehicle routing problem in the rand...
Neural Radiance Fields (NeRF) have constituted a remarkable breakthrough...
Creating the photo-realistic version of people sketched portraits is use...
Learning discriminative features for effectively separating abnormal eve...
In the past few years, the widespread use of 3D printing technology enab...
Previous studies have explored generating accurately lip-synced talking ...
Co-speech gesture is crucial for human-machine interaction and digital e...
Deep 3D point cloud models are sensitive to adversarial attacks, which p...
Occluded person re-identification (ReID) is a challenging problem due to...
While dynamic Neural Radiance Fields (NeRF) have shown success in high-f...
Person text-image matching, also known as text based person search, aims...
Digital images are vulnerable to nefarious tampering attacks such as con...
Whereas cryptography easily arouses attacks by means of encrypting a sec...
We study the capacitated vehicle routing problem in graphic metrics (gra...
In the Distance-constrained Vehicle Routing Problem (DVRP), we are given...
Time series analysis is of immense importance in extensive applications,...
For discretely observed functional data, estimating eigenfunctions with ...
Notwithstanding the prominent performance achieved in various applicatio...
In the unsplittable capacitated vehicle routing problem, we are given a ...
Realistic generative face video synthesis has long been a pursuit in bot...
Despite encouraging progress in deepfake detection, generalization to un...
Recent advances in face forgery techniques produce nearly visually untra...
The Agriculture-Vision Challenge in CVPR is one of the most famous and c...
Earth observation satellites have been continuously monitoring the earth...
Image cropping is an inexpensive and effective operation of maliciously ...
Although significant progress has been made to audio-driven talking face...
This paper focuses on the weakly-supervised audio-visual video parsing t...
Recent years have witnessed the success of deep learning on the visual s...
Generating speech-consistent body and gesture movements is a long-standi...
Adversary and invisibility are two fundamental but conflict characters o...
The task of audio-visual sound source localization has been well studied...
We give a polynomial time (3/2+ϵ)-approximation algorithm for the unspli...
Animating high-fidelity video portrait with speech audio is crucial for ...
How efficiently can we find an unknown graph using distance queries betw...
We propose SUB-Depth, a universal multi-task training framework for self...
Recently, the low-rank property of different components extracted from t...
We give a polynomial time approximation scheme (PTAS) for the unit deman...
We introduce Imuge, an image tamper resilient generative scheme for imag...
Self-supervised learning for depth estimation uses geometry in image seq...
The existing image embedding networks are basically vulnerable to malici...
Previous image forensics schemes for crop detection are only limited on ...
Robust 3D mesh watermarking is a traditional research topic in computer ...