"Benchmarking vision kernels and neural network inference accelerators on embedded platforms" by Murad Qasaimeh

Selected Works of Joseph Zambreno

Article

Benchmarking vision kernels and neural network inference accelerators on embedded platforms

Journal of Systems Architecture

Murad Qasaimeh, Iowa State University
Kristof Denolf, Xilinx Research Labs
Alireza Khodamoradi, University of California, San Diego
Michaela Blott, Xilinx Research Labs
Jack Lo, Xilinx Research Labs
Lisa Halder, Xilinx Research Labs
Kees Vissers, Xilinx Research Labs
Joseph Zambreno, Iowa State University
Phillip H. Jones, Iowa State University

Document Type

Article

Disciplines

Systems and Communications

Publication Version

Submitted Manuscript

Publication Date

9-25-2020

DOI

10.1016/j.sysarc.2020.101896

Abstract

Developing efficient embedded vision applications requires exploring various algorithmic optimization trade-offs and a broad spectrum of hardware architecture choices. This makes navigating the solution space and finding the design points with optimal performance trade-offs a challenge for developers. To help provide a fair baseline comparison, we conducted comprehensive benchmarks of the accuracy, run-time, and energy efficiency of a wide range of vision kernels and neural networks on multiple embedded platforms: ARM Cortex-A57 CPU, Nvidia Jetson TX2 GPU and Xilinx ZCU102 FPGA. Each platform uses its own optimized libraries for vision kernels (OpenCV, VisionWorks and xfOpenCV) and neural networks (OpenCV DNN, TensorRT and Xilinx DPU). For vision kernels, our results show that the GPU achieves an energy/frame reduction ratio of 1.1–3.2 compared to the others for simple kernels. However, for more complicated kernels and complete vision pipelines, the FPGA outperforms the others with energy/frame reduction ratios of 1.2–22.3. For neural networks [Inception-v2, ResNet-50, ResNet-18, MobileNet-v2 and SqueezeNet], our results show that the FPGA achieves a speedup of [2.5, 2.1, 2.6, 2.9 and 2.5] and an energy-delay product (EDP) reduction ratio of [1.5, 1.1, 1.4, 2.4 and 1.7] compared to the GPU FP16 implementations, respectively.
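
The metrics named in the abstract (energy/frame, speedup, and EDP reduction ratio) can be illustrated with a minimal Python sketch. It assumes that energy per frame is average power multiplied by per-frame latency, that EDP is energy multiplied by latency, and that each ratio is the baseline value divided by the candidate value; the article's exact measurement procedure is not given on this page. The function names and all power and latency figures below are hypothetical, not results from the article.

```python
def energy_per_frame(avg_power_w: float, latency_s: float) -> float:
    """Energy in joules spent on one frame: average power times per-frame latency."""
    return avg_power_w * latency_s


def energy_delay_product(avg_power_w: float, latency_s: float) -> float:
    """EDP in joule-seconds: energy per frame times per-frame latency."""
    return energy_per_frame(avg_power_w, latency_s) * latency_s


def reduction_ratio(baseline: float, candidate: float) -> float:
    """How many times smaller the candidate value is than the baseline (assumed definition)."""
    return baseline / candidate


# Hypothetical measurements, chosen only to exercise the formulas.
gpu = {"power_w": 10.0, "latency_s": 0.010}
fpga = {"power_w": 8.0, "latency_s": 0.004}

speedup = reduction_ratio(gpu["latency_s"], fpga["latency_s"])
energy_ratio = reduction_ratio(
    energy_per_frame(gpu["power_w"], gpu["latency_s"]),
    energy_per_frame(fpga["power_w"], fpga["latency_s"]),
)
edp_ratio = reduction_ratio(
    energy_delay_product(gpu["power_w"], gpu["latency_s"]),
    energy_delay_product(fpga["power_w"], fpga["latency_s"]),
)

print(f"speedup: {speedup:.2f}x")
print(f"energy/frame reduction ratio: {energy_ratio:.2f}")
print(f"EDP reduction ratio: {edp_ratio:.2f}")
```

Because EDP weights latency twice, under these assumed definitions the EDP reduction ratio equals the energy/frame reduction ratio multiplied by the speedup.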

Comments

This is a manuscript of an article published as Qasaimeh, Murad, Kristof Denolf, Alireza Khodamoradi, Michaela Blott, Jack Lo, Lisa Halder, Kees Vissers, Joseph Zambreno, and Phillip H. Jones. "Benchmarking vision kernels and neural network inference accelerators on embedded platforms." Journal of Systems Architecture (2020): 101896. DOI: 10.1016/j.sysarc.2020.101896. Posted with permission.

Elsevier B.V.

2020

File Format

application/pdf

Citation Information

Murad Qasaimeh, Kristof Denolf, Alireza Khodamoradi, Michaela Blott, et al. "Benchmarking vision kernels and neural network inference accelerators on embedded platforms" Journal of Systems Architecture (2020) p. 101896
Available at: http://works.bepress.com/joseph-zambreno/13/