Article
PieSlicer: Dynamically Improving Response Time for Cloud-based CNN Inference
ICPE '21: Proceedings of the ACM/SPEC International Conference on Performance Engineering (2021)
  • Samuel S. Ogden, Worcester Polytechnic Institute
  • Xiangnan Kong, Worcester Polytechnic Institute
  • Tian Guo, Worcester Polytechnic Institute
Abstract
Executing deep-learning inference on cloud servers enables the use of high-complexity models for mobile devices with limited resources. However, pre-execution time (the time it takes to prepare and transfer data to the cloud) is variable and can take orders of magnitude longer to complete than inference execution itself. This pre-execution time can be reduced by dynamically deciding the order of two essential steps, preprocessing and data transfer, to better take advantage of on-device resources and network conditions. In this work, we present PieSlicer, a system for making dynamic preprocessing decisions to improve cloud inference performance using linear regression models. PieSlicer then leverages these models to select the appropriate preprocessing location. We show that for image classification applications PieSlicer reduces median and 99th percentile pre-execution time by up to 50.2 ms and 217.2 ms, respectively, when compared to static preprocessing methods.
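The decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of input size as the only regression feature, and the simple round-trip-plus-bandwidth transfer estimate are all assumptions made for illustration. The sketch fits one linear model per preprocessing location and picks whichever ordering of preprocessing and transfer has the lower estimated pre-execution time.

```python
def fit_linear(xs, ys):
    """Ordinary least-squares fit of y ~ slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Evaluate a (slope, intercept) model at x."""
    slope, intercept = model
    return slope * x + intercept


def transfer_ms(size_kb, bandwidth_kb_per_ms, rtt_ms):
    """Assumed transfer estimate: one round trip plus size over bandwidth."""
    return rtt_ms + size_kb / bandwidth_kb_per_ms


def choose_location(raw_kb, processed_kb, device_model, cloud_model,
                    bandwidth_kb_per_ms, rtt_ms):
    """Return (location, estimated pre-execution time in ms).

    "device": preprocess on the phone, then send the smaller processed input.
    "cloud":  send the raw input, then preprocess on the server.
    """
    on_device = (predict(device_model, raw_kb)
                 + transfer_ms(processed_kb, bandwidth_kb_per_ms, rtt_ms))
    in_cloud = (transfer_ms(raw_kb, bandwidth_kb_per_ms, rtt_ms)
                + predict(cloud_model, raw_kb))
    if on_device <= in_cloud:
        return "device", on_device
    return "cloud", in_cloud


# Hypothetical measurements: (input size in KB, preprocessing time in ms).
device_model = fit_linear([100, 200, 300], [12, 22, 32])
cloud_model = fit_linear([100, 200, 300], [3, 5, 7])

# The same 2000 KB image under a slow and a fast network.
slow = choose_location(2000, 150, device_model, cloud_model, 1.0, 50.0)
fast = choose_location(2000, 150, device_model, cloud_model, 100.0, 5.0)
print(slow[0], fast[0])
```

With these made-up numbers, the slow network favors preprocessing on the device because the processed input is much smaller to send, while the fast network favors shipping the raw image to the faster server.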
Keywords
  • Cloud inference
  • Mobile deep learning
  • Performance modeling
Publication Date
2021
DOI
10.1145/3427921.3450256
Citation Information
Samuel S. Ogden, Xiangnan Kong, and Tian Guo. "PieSlicer: Dynamically Improving Response Time for Cloud-based CNN Inference." ICPE '21: Proceedings of the ACM/SPEC International Conference on Performance Engineering (2021), pp. 249-256.
Available at: http://works.bepress.com/sam-ogden/2/