Multimodal recommendation exploits the rich multimodal information assoc...
In this technical report, we present our findings from the research cond...
Training an effective video action recognition model poses significant c...
The main challenge in video question answering (VideoQA) is to capture a...
The existing deepfake detection methods have reached a bottleneck in gen...
Human-Object Interaction Detection is a crucial aspect of human-centric ...
In this technical report, we present our findings from a study conducted...
Continual Learning (CL) aims at incrementally learning new tasks without...
We empirically investigate proper pre-training methods to build good vis...
Sensitivity to severe occlusion and large view angles limits the usage s...
Adversarial contrastive learning (ACL), without requiring labels, incorp...
Adversarial contrastive learning (ACL) does not require expensive data a...
Visual Commonsense Reasoning (VCR) remains a significant yet challenging...
Automatically localizing a position based on a few natural language inst...
Recommendation systems make predictions chiefly based on users' historic...
Detecting Human-Object Interaction (HOI) in images is an important step ...
Human-Object Interaction (HOI) detection has received considerable atten...
Knowledge-based Visual Question Answering (VQA) expects models to rely o...
As a step towards improving the abstract reasoning capability of machine...
Point cloud sequences are irregular and unordered in the spatial dimensi...
Many multimodal recommender systems have been proposed to exploit the ri...
Making each modality in multi-modal data contribute is of vital importan...
Visual Commonsense Reasoning (VCR), deemed as one challenging extension ...
In this paper, we study a new data mining problem of obstacle detection ...
Non-parametric two-sample tests (TSTs) that judge whether two sets of sa...
A key challenge for machine intelligence is to learn new visual concepts...
The learning process of deep learning methods usually updates the model'...
Raven's Progressive Matrices (RPM) is highly correlated with human intel...
Motivated by scenarios where data is used for diverse prediction tasks, ...
This paper proposes a novel model for recognizing images with composite ...
Noisy labels (NL) and adversarial examples both undermine trained models...
Hyperspectral compressive imaging takes advantage of compressive sensing...
In adversarial machine learning, there was a common belief that robustne...
Improper or erroneous labelling can pose a hindrance to reliable general...
In this work, we present an approach for unsupervised domain adaptation ...
Good training data is a prerequisite to develop useful ML applications. ...
Participation on social media platforms has many benefits but also poses...
This work explores the utility of implicit behavioral cues, namely, Elec...
Federated learning facilitates collaboration among self-interested agent...
Locality-Sensitive Hashing (LSH) is one of the most popular methods for ...
We have recently seen great progress in image classification due to the ...
Salient object detection is evaluated using binary ground truth with the...
Adversarial training based on the minimax formulation is necessary for o...
The recent development of commodity 360° cameras has enabled a single ...
The computer vision community is witnessing an unprecedented rate of new...
Raven's Progressive Matrices (RPM) have been widely used for Intelligenc...
Accuracy and processing speed are two important factors that affect the ...
Human action analysis and understanding in videos is an important and ch...
Most existing recommender systems represent a user's preference with a f...
Supervised machine learning applications in the health domain often face...