Zhenfang Chen

research

∙ 07/24/2023

3D-LLM: Injecting the 3D World into Large Language Models

Large language models (LLMs) and Vision-Language Models (VLMs) have been...

0 Yining Hong, et al. ∙

research

∙ 06/27/2023

Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties

General physical scene understanding requires more than simply localizin...

0 Hsiao-Yu Tung, et al. ∙

research

∙ 06/07/2023

ModuleFormer: Learning Modular Large Language Models From Uncurated Data

Large Language Models (LLMs) have achieved remarkable results. But exist...

0 Yikang Shen, et al. ∙

research

∙ 05/04/2023

Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

Recent AI-assistant agents, such as ChatGPT, predominantly rely on super...

0 Zhiqing Sun, et al. ∙

research

∙ 04/07/2023

Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following

Humans, even at a very early age, can learn visual concepts and understa...

0 Mingyu Ding, et al. ∙

research

∙ 04/06/2023

Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention

Humans possess a versatile mechanism for extracting structured represent...

0 Mingyu Ding, et al. ∙

research

∙ 03/20/2023

3D Concept Learning and Reasoning from Multi-View Images

Humans are able to accurately reason in 3D by gathering multi-view obser...

0 Yining Hong, et al. ∙

research

∙ 03/09/2023

Planning with Large Language Models for Code Generation

Existing large language model-based code generation pipelines typically ...

0 Shun Zhang, et al. ∙

research

∙ 12/15/2022

Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners

Optimization in multi-task learning (MTL) is more challenging than singl...

1 Zitian Chen, et al. ∙

research

∙ 10/17/2022

S^3-NeRF: Neural Reflectance Field from Shading and Shadow under a Single Viewpoint

In this paper, we address the "dual problem" of multi-view scene reconst...

0 Wenqi Yang, et al. ∙

research

∙ 07/23/2022

PS-NeRF: Neural Inverse Rendering for Multi-view Photometric Stereo

Traditional multi-view photometric stereo (MVPS) methods are often compo...

0 Wenqi Yang, et al. ∙

research

∙ 05/02/2022

ComPhy: Compositional Physical Reasoning of Objects and Events from Videos

Objects' motions in nature are governed by complex interactions and thei...

0 Zhenfang Chen, et al. ∙

research

∙ 02/15/2022

A Unified Framework for Masked and Mask-Free Face Recognition via Feature Rectification

Face recognition under ideal conditions is now considered a well-solved ...

0 Shaozhe Hao, et al. ∙

research

∙ 10/28/2021

Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language

In this work, we propose a unified framework, called Visual Reasoning wi...

8 Mingyu Ding, et al. ∙

research

∙ 09/02/2021

Deep Face Video Inpainting via UV Mapping

This paper addresses the problem of face video inpainting. Existing vide...

12 Wenqi Yang, et al. ∙

research

∙ 03/30/2021

Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning

We study the problem of dynamic visual reasoning on raw videos. This is ...

5 Zhenfang Chen, et al. ∙

research

∙ 03/24/2021

The Blessings of Unlabeled Background in Untrimmed Videos

Weakly-supervised Temporal Action Localization (WTAL) aims to detect the...

0 Yuan Liu, et al. ∙

research

∙ 03/01/2020

Cops-Ref: A new Dataset and Task on Compositional Referring Expression Comprehension

Referring expression comprehension (REF) aims at identifying a particula...

0 Zhenfang Chen, et al. ∙

research

∙ 01/25/2020

Look Closer to Ground Better: Weakly-Supervised Temporal Grounding of Sentence in Video

In this paper, we study the problem of weakly-supervised temporal ground...

0 Zhenfang Chen, et al. ∙

research

∙ 06/06/2019

Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video

In this paper, we address a novel task, namely weakly-supervised spatio-...

0 Zhenfang Chen, et al. ∙

research

∙ 05/10/2018

Boosting up Scene Text Detectors with Guided CNN

Deep CNNs have achieved great success in text detection. Most of existin...

0 Xiaoyu Yu, et al. ∙

