The ubiquity of camera-enabled devices has led to large amounts of unlab...
Due to effective pattern mining and feature representation, neural
forec...
Federated Learning (FL) is the privacy-preserving machine learning parad...
It is often necessary for drones to complete delivery, photography, and
...
In this paper, we introduce AdaSelection, an adaptive sub-sampling metho...
Content Warning: This work contains examples that potentially implicate
...
With the popularity of automatic code generation tools, such as Copilot,...
Hybrid Question-Answering (HQA), which targets reasoning over tables and...
How humans understand and recognize the actions of others is a complex
n...
CLIP (Contrastive Language-Image Pretraining) is well-developed for
open...
Bullet time is a type of visual effect commonly used in film, television...
Converting a parametric curve into the implicit form, which is called
im...
In this paper, we consider the problem of simultaneously detecting objec...
In this paper, we study the problem of knowledge-intensive text-to-SQL, ...
Text-to-SQL semantic parsing is an important NLP task, which greatly
fac...
For guiding the UAV swarm to pass through narrow openings, a trapezoid
v...
The robustness of Text-to-SQL parsers against adversarial perturbations ...
The task of text-to-SQL is to convert a natural language question to its...
Stochastic volatility often implies increasing risks that are difficult ...
Self-supervised learning (SSL) has proven vital in speech and audio-rela...
When the available hardware cannot meet the memory and compute requireme...
In order to guide the multi-agent system in a cluttered environment, a
c...
The ubiquity of camera-enabled mobile devices has led to large amounts ...
To guide the movement of a robotic swarm in a corridor-like environment,...
We present LogiGAN, an unsupervised adversarial pre-training framework f...
Person re-identification aims to retrieve persons in highly varying sett...
The ubiquity of microphone-enabled devices has led to large amounts of
...
Existing text-to-SQL semantic parsers are typically designed for particu...
Non-maximum suppression (NMS) is widely used in object detection pipelin...
Reasoning over natural language is a long-standing goal for the research...
Robotic swarm systems are now becoming increasingly attractive for many
...
Although deep learning methods have achieved advanced video object
recog...
Tables are often created with hierarchies, but existing works on table
r...
SpeechBrain is an open-source and all-in-one speech toolkit. It is desig...
Training Automatic Speech Recognition (ASR) models under federated learn...
Federated Learning (FL) allows edge devices to collaboratively learn a s...
Can our video understanding systems perceive objects when a heavy occlus...
Unmanned Aerial Vehicles (UAVs) are now becoming increasingly accessible...
In this paper, we investigate the recommendation task in the most common...
In Natural Language Interfaces to Databases systems, the text-to-SQL
tec...
Compositional generalization is a basic but essential intellective capab...
Knowledge distillation has been widely used to compress existing deep
le...
This article introduces the solutions of the two champion teams, `MMfrui...
This paper presents a novel approach to translating natural language
que...
This paper presents a novel approach to translating natural language
que...
Weakly supervised object detection (WSOD) focuses on training object det...
We present a neural approach called IRNet for complex and cross-domain
T...
Perinatal stroke (PS) is a serious condition that, if undetected and thu...
Gait is an important biometric trait for surveillance and forensic
appli...