"PieSlicer: Dynamically Improving Response Time for Cloud-based CNN Inference" by Samuel S. Ogden

Article

PieSlicer: Dynamically Improving Response Time for Cloud-based CNN Inference

ICPE '21: Proceedings of the ACM/SPEC International Conference on Performance Engineering (2021)

Samuel S. Ogden, Worcester Polytechnic Institute
Xiangnan Kong, Worcester Polytechnic Institute
Tian Guo, Worcester Polytechnic Institute

Abstract

Executing deep-learning inference on cloud servers lets mobile devices with limited resources use high-complexity models. However, pre-execution time (the time it takes to prepare and transfer data to the cloud) is variable and can take orders of magnitude longer than inference execution itself. This pre-execution time can be reduced by dynamically deciding the order of two essential steps, preprocessing and data transfer, to better take advantage of on-device resources and network conditions. In this work, we present PieSlicer, a system that uses linear regression models to make dynamic preprocessing decisions and improve cloud inference performance. PieSlicer leverages these models to select the appropriate preprocessing location. We show that for image classification applications PieSlicer reduces median and 99th percentile pre-execution time by up to 50.2 ms and 217.2 ms, respectively, when compared to static preprocessing methods.
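The decision described in the abstract, choosing whether to preprocess before or after data transfer based on linear regression estimates, can be illustrated with a minimal sketch. This is not the authors' implementation. The choice of pixel count as the regression feature, the simple latency-plus-bandwidth transfer model, and every name below (`fit_line`, `LinearModel`, `transfer_ms`, `choose_location`) are assumptions made only for illustration.

```python
from dataclasses import dataclass


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


@dataclass
class LinearModel:
    """Hypothetical model: predicted preprocessing time (ms) from pixel count."""
    slope: float
    intercept: float

    def predict(self, x):
        return self.slope * x + self.intercept


def transfer_ms(num_bytes, bandwidth_bytes_per_ms, rtt_ms):
    # Assumed transfer model: one round trip plus payload size over bandwidth.
    return rtt_ms + num_bytes / bandwidth_bytes_per_ms


def choose_location(raw_bytes, raw_pixels, processed_bytes,
                    device_model, cloud_model,
                    bandwidth_bytes_per_ms, rtt_ms):
    """Return (location, estimated pre-execution time in ms).

    "device": preprocess on the device, then send the smaller result.
    "cloud":  send the raw input, then preprocess on the server.
    """
    on_device = (device_model.predict(raw_pixels)
                 + transfer_ms(processed_bytes, bandwidth_bytes_per_ms, rtt_ms))
    in_cloud = (transfer_ms(raw_bytes, bandwidth_bytes_per_ms, rtt_ms)
                + cloud_model.predict(raw_pixels))
    if on_device <= in_cloud:
        return "device", on_device
    return "cloud", in_cloud
```

In this sketch, `fit_line` would be run offline on measured (pixel count, preprocessing time) pairs to obtain the coefficients for each `LinearModel`, and `choose_location` would be called per request with the current bandwidth and round-trip estimates. On a slow network the smaller preprocessed payload tends to favor the device; on a fast network the quicker server-side preprocessing tends to favor the cloud.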

Keywords

Cloud inference, mobile deep learning, performance modeling

Disciplines

Computer Sciences

Publication Date

2021

DOI

10.1145/3427921.3450256

Citation Information

Samuel S. Ogden, Xiangnan Kong, and Tian Guo. "PieSlicer: Dynamically Improving Response Time for Cloud-based CNN Inference." ICPE '21: Proceedings of the ACM/SPEC International Conference on Performance Engineering (2021), pp. 249-256.
Available at: http://works.bepress.com/sam-ogden/2/