Mask-guided attention network and occlusion-sensitive hard example mining for occluded pedestrian detection
IEEE Transactions on Image Processing
  • Jin Xie, Tianjin University
  • Yanwei Pang, Tianjin University
  • Muhammad Haris Khan, Mohamed Bin Zayed University of Artificial Intelligence
  • Rao Muhammad Anwer, Mohamed Bin Zayed University of Artificial Intelligence
  • Fahad Shahbaz Khan, Mohamed Bin Zayed University of Artificial Intelligence
  • Ling Shao, Mohamed Bin Zayed University of Artificial Intelligence
Document Type
Article
Abstract

Pedestrian detection relying on deep convolutional neural networks has made significant progress. Though promising results have been achieved on standard pedestrians, the performance on heavily occluded pedestrians remains far from satisfactory. The main culprits are intra-class occlusions involving other pedestrians and inter-class occlusions caused by other objects, such as cars and bicycles. These result in a multitude of occlusion patterns. We propose an approach for occluded pedestrian detection with the following contributions. First, we introduce a novel mask-guided attention network that fits naturally into popular pedestrian detection pipelines. Our attention network emphasizes visible pedestrian regions while suppressing the occluded ones by modulating full-body features. Second, we propose an occlusion-sensitive hard example mining method and an occlusion-sensitive loss, which mine hard samples according to the occlusion level and assign higher weights to detection errors occurring at highly occluded pedestrians. Third, we empirically demonstrate that weak box-based segmentation annotations provide a reasonable approximation to their dense pixel-wise counterparts. Experiments are performed on the CityPersons, Caltech and ETH datasets. Our approach sets a new state of the art on all three datasets. Compared with the best reported results on the heavily occluded (HO) pedestrian set of the CityPersons test set, our approach obtains an absolute gain of 10.3% in log-average miss rate. Code and models are available at: https://github.com/Leotju/MGAN.
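
The abstract names three mechanisms without giving formulas: a coarse mask derived from a visible-region box, a spatial attention map that modulates full-body features, and a loss that weights errors by occlusion level. The sketch below is only a rough illustration of those ideas in plain Python, not the authors' implementation. The function names, the hard 0/1 mask, and the weighting form `1 + alpha * occlusion` are all assumptions made for this example; the actual method is in the linked repository.

```python
import math


def box_mask(height, width, visible_box):
    """Coarse 0/1 mask from a visible-region box (x1, y1, x2, y2).

    Upper bounds are exclusive. This stands in for the weak box-based
    segmentation annotation mentioned in the abstract (assumed form).
    """
    x1, y1, x2, y2 = visible_box
    return [
        [1.0 if (x1 <= x < x2 and y1 <= y < y2) else 0.0 for x in range(width)]
        for y in range(height)
    ]


def modulate(features, attention):
    """Scale every channel of `features` element-wise by a spatial map.

    `features` is a list of channels, each a list of rows; `attention`
    is a single map of the same spatial size. A learned attention map
    would hold soft values; a hard mask is used here for clarity.
    """
    return [
        [[v * a for v, a in zip(f_row, a_row)]
         for f_row, a_row in zip(channel, attention)]
        for channel in features
    ]


def occlusion_weighted_loss(probs, labels, occlusion, alpha=1.0):
    """Binary cross-entropy where each sample has weight 1 + alpha * occlusion.

    `occlusion` holds per-sample occlusion ratios in [0, 1]. The weighting
    form is hypothetical; it only shows errors on heavily occluded samples
    counting more. The result is normalized by the sum of weights.
    """
    eps = 1e-12
    total = 0.0
    weight_sum = 0.0
    for p, y, o in zip(probs, labels, occlusion):
        w = 1.0 + alpha * o
        ce = -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
        total += w * ce
        weight_sum += w
    return total / weight_sum
```

For instance, `modulate(features, box_mask(h, w, visible_box))` zeroes feature responses outside the visible box, and a confident miss on a sample with occlusion 0.9 contributes more to `occlusion_weighted_loss` than the same miss on an unoccluded sample.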

DOI
10.1109/TIP.2020.3040854
Publication Date
12-4-2020
Keywords
  • attention
  • convolutional neural networks
  • hard example mining
  • pedestrian detection
Comments

IR Deposit conditions:

  • OA version (pathway a)
  • Accepted version: no embargo
  • When accepted for publication, set statement to accompany deposit (see policy)
  • Must link to publisher version with DOI
  • Publisher copyright and source must be acknowledged
Citation Information
J. Xie, Y. Pang, M. H. Khan, R. M. Anwer, F. S. Khan and L. Shao, "Mask-guided attention network and occlusion-sensitive hard example mining for occluded pedestrian detection," in IEEE Transactions on Image Processing, vol. 30, pp. 3872-3884, 2021, doi: 10.1109/TIP.2020.3040854.