Sucheng Ren

research

∙ 08/23/2023

SG-Former: Self-guided Transformer with Evolving Token Reallocation

Vision Transformer has demonstrated impressive success across various vi...

0 Sucheng Ren, et al. ∙

research

∙ 08/23/2023

NPF-200: A Multi-Modal Eye Fixation Dataset and Method for Non-Photorealistic Videos

Non-photorealistic videos are in demand with the wave of the metaverse, ...

0 Ziyu Yang, et al. ∙

research

∙ 03/15/2023

DeepMIM: Deep Supervision for Masked Image Modeling

Deep supervision, which involves extra supervisions to the intermediate ...

0 Sucheng Ren, et al. ∙

research

∙ 01/03/2023

TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models

Masked image modeling (MIM) performs strongly in pre-training large visi...

0 Sucheng Ren, et al. ∙

research

∙ 07/22/2022

Learning from Multiple Annotator Noisy Labels via Sample-wise Label Fusion

Data lies at the core of modern deep learning. The impressive performanc...

10 Zhengqi Gao, et al. ∙

research

∙ 07/13/2022

DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation

One key challenge of exemplar-guided image generation lies in establishi...

0 Songhua Liu, et al. ∙

research

∙ 06/15/2022

A Simple Data Mixing Prior for Improving Self-Supervised Learning

Data mixing (e.g., Mixup, Cutmix, ResizeMix) is an essential component f...

19 Sucheng Ren, et al. ∙

research

∙ 06/13/2022

The Modality Focusing Hypothesis: On the Blink of Multimodal Knowledge Distillation

Multimodal knowledge distillation (KD) extends traditional knowledge dis...

5 Zihui Xue, et al. ∙

research

∙ 05/29/2022

Glance to Count: Learning to Rank with Anchors for Weakly-supervised Crowd Counting

Crowd image is arguably one of the most laborious data to annotate. In t...

0 Zheng Xiong, et al. ∙

research

∙ 03/22/2022

Self-supervision through Random Segments with Autoregressive Coding (RandSAC)

Inspired by the success of self-supervised autoregressive representation...

1 Tianyu Hua, et al. ∙

research

∙ 11/30/2021

Shunted Self-Attention via Multi-Scale Token Aggregation

Recent Vision Transformer (ViT) models have demonstrated encouraging res...

0 Sucheng Ren, et al. ∙

research

∙ 08/06/2021

Fine-grained Domain Adaptive Crowd Counting via Point-derived Segmentation

Existing domain adaptation methods for crowd counting view each crowd im...

5 Yongtuo Liu, et al. ∙

research

∙ 08/06/2021

Reducing Spatial Labeling Redundancy for Semi-supervised Crowd Counting

Labeling is onerous for crowd counting as it should annotate each indivi...

6 Yongtuo Liu, et al. ∙

research

∙ 08/05/2021

Unifying Global-Local Representations in Salient Object Detection with Transformer

The fully convolutional network (FCN) has dominated salient object detec...

0 Sucheng Ren, et al. ∙

research

∙ 06/23/2021

Co-advise: Cross Inductive Bias Distillation

Transformers recently are adapted from the community of natural language...

7 Sucheng Ren, et al. ∙

research

∙ 03/26/2021

Multimodal Knowledge Expansion

The popularity of multimodal sensors and the accessibility of the Intern...

6 Zihui Xue, et al. ∙

research

∙ 07/20/2020

TENet: Triple Excitation Network for Video Salient Object Detection

In this paper, we propose a simple yet effective approach, named Triple ...

0 Sucheng Ren, et al. ∙

