Previous Multimodal Information based Speech Processing (MISP) challenge...
This technical report details our submission system to the CHiME-7 DASR
...
Proximity sensing detects an object's presence without contact. However,...
Domain adaptation (DA) has demonstrated significant promise for real-tim...
Transducer is one of the mainstream frameworks for streaming speech
reco...
A realistic long-term microscopic traffic simulator is necessary for
und...
During the diagnostic process, clinicians leverage multimodal informatio...
Motion deblurring is a critical ill-posed problem that is important in m...
Calibrating the extrinsic parameters of sensory devices is crucial for f...
Imitation learning holds great promise for addressing the complex task o...
With the increasing prevalence of robots in daily life, it is crucial to...
In this paper, we propose a novel framework for tactile-based dexterous
...
The Multi-modal Information based Speech Processing (MISP) challenge aim...
Operating unmanned aerial vehicles (UAVs) in complex environments that
f...
Human skin can accurately sense the self-decoupled normal and shear forc...
Traffic congestion is a persistent problem in our society. Existing meth...
Intersections are essential road infrastructures for traffic in modern
m...
Speech pre-training has shown great success in learning useful and gener...
Self-supervised learning (SSL) models have achieved considerable improve...
Multilingual end-to-end models have shown great improvement over monolin...
In this work, we present a novel method, named AV2vec, for learning
audi...
Although the manipulating of the unmanned aerial manipulator (UAM) has b...
HD map reconstruction is crucial for autonomous driving. LiDAR-based met...
Motion planning is challenging for autonomous systems in multi-obstacle
...
Understanding dynamic hand motions and actions from egocentric RGB video...
Deep reinforcement learning has achieved great success in laser-based
co...
Emergence of massive dynamic objects will diversify spatial structures w...
In this paper, we introduce a generalized continuous collision detection...
Understanding human intentions during interactions has been a long-lasti...
Deep learning models are able to approximate one specific dynamical syst...
Lip region-of-interest (ROI) is conventionally used for visual input in ...
In this paper, we aim to forecast a future trajectory distribution of a
...
We present a novel high-resolution face swapping method using the inhere...
Focus control (FC) is crucial for cameras to capture sharp images in
cha...
The challenges to solving the collision avoidance problem lie in adaptiv...
We propose a deep visuo-tactile model for realtime estimation of the liq...
Recently, there have been many advances in autonomous driving society,
a...
Dynamic state representation learning is an important task in robot lear...
This paper demonstrates that multilingual pretraining, a proper fine-tun...
Learning communication via deep reinforcement learning (RL) or imitation...
The general inverse kinematics (IK) problem of a manipulator, namely tha...
Previous works mainly focus on improving cross-lingual transfer for NLU ...
This system description describes our submission system to the Third DIH...
This paper presents a novel design of a soft tactile finger with
omni-di...
In this paper, we propose a novel four-stage data augmentation approach ...
Navigation in dense crowds is a well-known open problem in robotics with...
Over the past few decades, efforts have been made towards robust robotic...
In this paper, we present a novel optimization-based framework for auton...
COVID-19 pandemic has become a global challenge faced by people all over...
In recent years, vision-based crowd analysis has been studied extensivel...