Xiaolin Wei

research

∙ 08/14/2023

Orthogonal Temporal Interpolation for Zero-Shot Video Recognition

Zero-shot video recognition (ZSVR) is a task that aims to recognize vide...

0 Yan Zhu, et al. ∙

research

∙ 06/30/2023

Exploration and Exploitation of Unlabeled Data for Open-Set Semi-Supervised Learning

In this paper, we address a complex but practical scenario in semi-super...

0 Ganlong Zhao, et al. ∙

research

∙ 06/11/2023

3rd Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation

In order to deal with the task of video panoptic segmentation in the wil...

0 Jinming Su, et al. ∙

research

∙ 03/25/2023

Towards Accurate Post-Training Quantization for Vision Transformer

Vision transformer emerges as a potential architecture for vision tasks....

0 Yifu Ding, et al. ∙

research

∙ 02/24/2023

Pose-Controllable 3D Facial Animation Synthesis using Hierarchical Audio-Vertex Attention

Most of the existing audio-driven 3D facial animation methods suffered f...

0 Bin Liu, et al. ∙

research

∙ 02/11/2023

3D Colored Shape Reconstruction from a Single RGB Image through Diffusion

We propose a novel 3d colored shape reconstruction method from a single ...

0 Bo Li, et al. ∙

research

∙ 12/07/2022

Multiple Object Tracking Challenge Technical Report for Team MT_IoT

This is a brief technical report of our proposed method for Multiple-Obj...

0 Feng Yan, et al. ∙

research

∙ 11/30/2022

Uncertainty-Aware Image Captioning

It is well believed that the higher uncertainty in a word of the caption...

0 Zhengcong Fei, et al. ∙

research

∙ 10/22/2022

HAM: Hierarchical Attention Model with High Performance for 3D Visual Grounding

This paper tackles an emerging and challenging vision-language task, 3D ...

0 Jiaming Chen, et al. ∙

research

∙ 10/05/2022

Progressive Denoising Model for Fine-Grained Text-to-Image Generation

Recently, vector quantized autoregressive (VQ-AR) models have shown rema...

0 Zhengcong Fei, et al. ∙

research

∙ 10/05/2022

Meta-Ensemble Parameter Learning

Ensemble of machine learning models yields improved performance as well ...

0 Zhengcong Fei, et al. ∙

research

∙ 09/16/2022

Weakly Supervised Semantic Segmentation via Progressive Patch Learning

Most of the existing semantic segmentation approaches with image-level c...

0 Jinlong Li, et al. ∙

research

∙ 09/16/2022

Expansion and Shrinkage of Localization for Weakly-Supervised Semantic Segmentation

Generating precise class-aware pseudo ground-truths, a.k.a, class activa...

0 Jinlong Li, et al. ∙

research

∙ 09/07/2022

MSMDFusion: Fusing LiDAR and Camera at Multiple Scales with Multi-Depth Seeds for 3D Object Detection

Fusing LiDAR and camera information is essential for achieving accurate ...

0 Yang Jiao, et al. ∙

research

∙ 09/07/2022

YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications

For years, the YOLO series has been the de facto industry-level standard...

0 Chuyi Li, et al. ∙

research

∙ 08/11/2022

PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding

Panoptic Narrative Grounding (PNG) is an emerging task whose goal is to ...

1 Zihan Ding, et al. ∙

research

∙ 07/22/2022

Efficient Modeling of Future Context for Image Captioning

Existing approaches to image captioning usually generate the sentence wo...

0 Zhengcong Fei, et al. ∙

research

∙ 07/11/2022

MT-Net Submission to the Waymo 3D Detection Leaderboard

In this technical report, we introduce our submission to the Waymo 3D De...

0 Shaoxiang Chen, et al. ∙

research

∙ 05/27/2022

Fully Convolutional One-Stage 3D Object Detection on LiDAR Range Images

We present a simple yet effective fully convolutional one-stage 3D objec...

0 Zhi Tian, et al. ∙

research

∙ 05/26/2022

Learn to Cluster Faces via Pairwise Classification

Face clustering plays an essential role in exploiting massive unlabeled ...

0 Junfu Liu, et al. ∙

research

∙ 03/15/2022

InsCon:Instance Consistency Feature Representation via Self-Supervised Learning

Feature representation via self-supervised learning has reached remarkab...

0 Junwei Yang, et al. ∙

research

∙ 12/20/2021

Contrastive Attention Network with Dense Field Estimation for Face Completion

Most modern face completion approaches adopt an autoencoder or its varia...

7 Xin Ma, et al. ∙

research

∙ 10/09/2021

Two-stage Visual Cues Enhancement Network for Referring Image Segmentation

Referring Image Segmentation (RIS) aims at segmenting the target object ...

0 Yang Jiao, et al. ∙

research

∙ 08/12/2021

Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning

Open-set semi-supervised learning (open-set SSL) investigates a challeng...

0 Junkai Huang, et al. ∙

research

∙ 05/12/2021

Structure Guided Lane Detection

Recently, lane detection has made great progress with the rapid developm...

6 Jinming Su, et al. ∙

research

∙ 04/28/2021

Twins: Revisiting Spatial Attention Design in Vision Transformers

Very recently, a variety of vision transformer architectures for dense p...

0 Xiangxiang Chu, et al. ∙

research

∙ 04/27/2021

Rethinking BiSeNet For Real-time Semantic Segmentation

BiSeNet has been proved to be a popular two-stream network for real-time...

0 Mingyuan Fan, et al. ∙

research

∙ 03/30/2021

Large Scale Visual Food Recognition

Food recognition plays an important role in food choice and intake, whic...

0 Weiqing Min, et al. ∙

research

∙ 02/22/2021

Do We Really Need Explicit Position Encodings for Vision Transformers?

Almost all visual transformers such as ViT or DeiT rely on predefined po...

0 Xiangxiang Chu, et al. ∙

research

∙ 11/26/2020

Beyond Single Instance Multi-view Unsupervised Representation Learning

Recent unsupervised contrastive representation learning follows a Single...

0 Xiangxiang Chu, et al. ∙

research

∙ 11/23/2020

ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradients Accumulation

Single-path based differentiable neural architecture search has great st...

0 Xiaoxing Wang, et al. ∙

research

∙ 10/29/2020

Free-Form Image Inpainting via Contrastive Attention Network

Most deep learning based image inpainting approaches adopt autoencoder o...

14 Xin Ma, et al. ∙

research

∙ 09/02/2020

DARTS-: Robustly Stepping out of Performance Collapse Without Indicators

Despite the fast development of differentiable architecture search (DART...

0 Xiangxiang Chu, et al. ∙

research

∙ 08/19/2020

Query Twice: Dual Mixture Attention Meta Learning for Video Summarization

Video summarization aims to select representative frames to retain high-...

0 Junyan Wang, et al. ∙

research

∙ 08/13/2020

ISIA Food-500: A Dataset for Large-Scale Food Recognition via Stacked Global-Local Attention Network

Food recognition has received more and more attention in the multimedia ...

0 Weiqing Min, et al. ∙

research

∙ 07/22/2020

FedOCR: Communication-Efficient Federated Learning for Scene Text Recognition

While scene text recognition techniques have been widely used in commerc...

0 Wenqing Zhang, et al. ∙

research

∙ 04/05/2020

ReADS: A Rectified Attentional Double Supervised Network for Scene Text Recognition

In recent years, scene text recognition is always regarded as a sequence...

6 Qi Song, et al. ∙

