Article
FiM: Performance Prediction for Parallel Computation in Iterative Data Processing Applications
10th IEEE International Conference on Cloud Computing (CLOUD 2017) (2017)
  • Janki Bhimani, Northeastern University
  • Ningfang Mi, Northeastern University
  • Miriam Leeser, Northeastern University
  • Zhengyu Yang, Northeastern University
Abstract
Predicting the performance of an application running on high performance computing (HPC) platforms in a cloud environment is increasingly important because of its influence on development time and resource management. However, predicting performance with respect to the number of parallel processes is complex for iterative, multi-stage applications. This research proposes FiM, a performance approximation approach that models the computing performance of iterative, multi-stage applications running on a master-compute framework. FiM consists of two tightly coupled components: (1) a stochastic Markov model that captures non-deterministic runtime, which often depends on parallel resources such as the number of processes, and (2) a machine learning model that extrapolates the parameters used to calibrate the Markov model when application parameters, such as the dataset, change. The modeling approach considers design choices along four dimensions: (i) process-level parallelism, (ii) distribution of cores on multi-core processors in cloud computing, (iii) application-related parameters, and (iv) characteristics of datasets. The major contribution is that FiM provides an accurate prediction of parallel computation time for datasets much larger than the training datasets. Such prediction gives data analysts useful insight into the optimal configuration of parallel resources (e.g., number of processes and number of cores) and helps system designers investigate the impact of changes in application parameters on system performance.
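As a rough illustration only, and not the authors' FiM implementation, the coupling of the two components can be sketched as follows. The sketch assumes a two-stage iteration modeled as an absorbing Markov chain and a plain linear regression of per-stage time against dataset size. The function names, the repeat probability, and every timing value are hypothetical and chosen only to make the example runnable.

```python
import numpy as np

def expected_runtime(Q, stage_times):
    """Expected time to absorption, starting in the first transient state.

    Q: transient-to-transient transition matrix of an absorbing Markov chain.
    stage_times: mean time spent per visit to each transient state.
    """
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[0]
    # Fundamental matrix: N[i, j] = expected visits to state j starting from i.
    N = np.linalg.inv(np.eye(n) - Q)
    return float(N[0] @ np.asarray(stage_times, dtype=float))

def fit_linear(sizes, values):
    """Least-squares fit of value ~ a * size + b; returns (a, b)."""
    a, b = np.polyfit(sizes, values, deg=1)
    return a, b

# Hypothetical two-stage iteration: stage 0 -> stage 1, then either repeat
# (probability p_repeat) or finish.
p_repeat = 0.9
Q = [[0.0, 1.0],
     [p_repeat, 0.0]]

# Hypothetical per-visit stage times (seconds) measured on small datasets.
train_sizes = np.array([1e4, 2e4, 4e4])
stage0_times = np.array([0.10, 0.20, 0.40])
stage1_times = np.array([0.05, 0.10, 0.20])

# Regression step: extrapolate the Markov model's parameters to a larger dataset.
a0, b0 = fit_linear(train_sizes, stage0_times)
a1, b1 = fit_linear(train_sizes, stage1_times)
target_size = 1e6
predicted_stage_times = [a0 * target_size + b0, a1 * target_size + b1]

# Markov step: combine the extrapolated stage times with expected visit counts.
predicted_runtime = expected_runtime(Q, predicted_stage_times)
print(f"predicted runtime at size {target_size:.0f}: {predicted_runtime:.1f} s")
```

The design point the sketch tries to show is the division of labor: the chain accounts for how often each stage runs, while the regression supplies stage costs for dataset sizes that were never measured. A real model would also need to account for the number of processes and core placement, which this sketch leaves out.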
Keywords
  • Performance Modeling
  • Markov Model
  • Regression
  • Distributed Systems
  • Cloud Computing
  • Big Data Infrastructure
Publication Date
2017
Citation Information
Janki Bhimani, Ningfang Mi, Miriam Leeser and Zhengyu Yang. "FiM: Performance Prediction for Parallel Computation in Iterative Data Processing Applications" 10th IEEE International Conference on Cloud Computing (CLOUD 2017) (2017)
Available at: http://works.bepress.com/zhengyuyang/8/