Exploring Multi-Branch and High-Level Semantic Networks for Improving Pedestrian Detection
To better detect pedestrians of various scales, deep multi-scale methods usually detect pedestrians of different scales using different in-network layers. However, the semantic levels of features from different layers are usually inconsistent. In this paper, we propose a multi-branch and high-level semantic network built by gradually splitting a base network into multiple branches. As a result, the branches have the same depth, and their output features have similarly high-level semantics. Because their receptive fields differ, the branches are suited to detecting pedestrians of different scales. Meanwhile, because the branches share convolutional weights, the multi-branch network introduces no additional parameters. To further improve detection performance, skip-layer connections among branches add context to the branch with a relatively small receptive field, and dilated convolution is incorporated into some of the branches to enlarge the resolution of their output feature maps. When the network is embedded into the Faster R-CNN architecture, we further propose weighting the scores of the proposal generation network and the proposal classification network. Experiments on the KITTI, Caltech pedestrian, and CityPersons datasets demonstrate the effectiveness of the proposed method, which achieves state-of-the-art detection performance on these pedestrian datasets. Moreover, experiments on the COCO benchmark show that the proposed method is also suitable for general object detection.
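The claim that branches of equal depth can still differ in receptive field can be illustrated with the standard receptive-field recurrence for stacked convolutions. The sketch below is only an illustration under stated assumptions: the three branch configurations (names, strides, dilation rates) are hypothetical and are not taken from the paper. All branches use three 3x3 convolutions, so their kernels have identical shapes and could in principle share weights.

```python
def receptive_field(layers):
    """Return (receptive_field, output_stride) for a stack of conv layers.

    Each layer is a (kernel_size, stride, dilation) tuple.
    """
    rf, jump = 1, 1
    for kernel, stride, dilation in layers:
        # A dilated kernel spans (kernel - 1) * dilation + 1 input positions,
        # measured in units of the current cumulative stride (jump).
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf, jump


# Hypothetical branches (not from the paper): same depth and kernel size,
# differing only in stride or dilation.
branches = {
    "small": [(3, 1, 1), (3, 1, 1), (3, 1, 1)],
    "medium": [(3, 2, 1), (3, 1, 1), (3, 1, 1)],
    "large_dilated": [(3, 1, 2), (3, 1, 2), (3, 1, 2)],
}

for name, layers in branches.items():
    rf, stride = receptive_field(layers)
    print(f"{name}: receptive field {rf}, output stride {stride}")
```

In this toy setting the dilated branch reaches the largest receptive field while keeping an output stride of 1, which is the sense in which dilation preserves feature-map resolution.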