Heuristic-Based Automatic Pruning of Deep Neural Networks
Neural Computing and Applications
  • Tejalal Choudhary
  • Vipul Mishra
  • Anurag Goswami
  • Jagannathan Sarangapani, Missouri University of Science and Technology
Abstract

The performance of a deep neural network (deep NN) depends on a large number of weight parameters that must be trained, which is a computational bottleneck. The growing trend toward deeper architectures restricts training and inference on resource-constrained devices. Pruning is an important method for removing a deep NN's unimportant parameters, making deployment on resource-constrained devices easier for practical applications. In this paper, we propose a novel heuristics-based filter pruning method that automatically identifies and prunes unimportant filters, making inference faster on devices with limited resources. Unimportant filters are selected by a novel pruning estimator (γ). The proposed method is tested on various convolutional architectures (AlexNet, VGG16, ResNet34) and datasets (CIFAR10, CIFAR100, ImageNet). Experimental results on the large-scale ImageNet dataset show that the FLOPs of VGG16 can be reduced by up to 77.47%, achieving ≈5x inference speedup. The FLOPs of the more popular ResNet34 model are reduced by 41.94% while retaining competitive performance compared to other state-of-the-art methods.
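For intuition, the following is a minimal sketch of filter pruning in PyTorch. The paper's pruning estimator (γ) is not reproduced in this abstract, so the L1 norm of each filter stands in as a hypothetical importance score; prune_conv_filters is an illustrative helper, not the authors' implementation.

# Minimal filter-pruning sketch for a single Conv2d layer in PyTorch.
# NOTE: the paper's estimator (gamma) is not public in this abstract;
# the per-filter L1 norm below is a hypothetical stand-in score.
import torch
import torch.nn as nn

def prune_conv_filters(conv: nn.Conv2d, prune_ratio: float) -> nn.Conv2d:
    """Return a new Conv2d keeping only the highest-scoring filters."""
    weight = conv.weight.data                      # (out_ch, in_ch, kH, kW)
    # Hypothetical importance score: L1 norm of each output filter.
    scores = weight.abs().sum(dim=(1, 2, 3))
    n_keep = max(1, int(weight.size(0) * (1.0 - prune_ratio)))
    keep_idx = torch.argsort(scores, descending=True)[:n_keep]

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = weight[keep_idx].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idx].clone()
    return pruned

# Usage: shrink a 64-filter layer by 50%, then fine-tune the network.
layer = nn.Conv2d(3, 64, kernel_size=3, padding=1)
smaller = prune_conv_filters(layer, prune_ratio=0.5)
print(smaller.weight.shape)  # torch.Size([32, 3, 3, 3])

Note that removing a layer's output filters also changes the channel count expected by the next layer, so in a full pipeline the following convolution (and any batch normalization) must be resized to match, after which fine-tuning recovers most of the lost accuracy.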

Department(s)
Electrical and Computer Engineering
Keywords and Phrases
  • Convolutional neural network
  • Deep neural network
  • Efficient inference
  • Filter pruning
  • Model compression and acceleration
Document Type
Article - Journal
Document Version
Final Version
File Type
text
Language(s)
English
Rights
© 2023 Springer. All rights reserved.
Publication Date
01 Mar 2022
Citation Information
Tejalal Choudhary, Vipul Mishra, Anurag Goswami, and Jagannathan Sarangapani. "Heuristic-Based Automatic Pruning of Deep Neural Networks." Neural Computing and Applications 34.6 (2022): 4889-4903. ISSN: 0941-0643; 1433-3058.
Available at: http://works.bepress.com/jagannathan-sarangapani/267/