Article
Heuristically-based Parametric Performance Optimization Algorithms for Big Data Computing
IJCIS International Journal of Computer & Information Science (2016)
  • Jongyeop Kim, Georgia Southern University
  • Noh-Jin Park, Oklahoma City University
  • Nohpill Park, Oklahoma City University
Abstract
Performance optimization for MapReduce computing on the Hadoop platform is a tedious and challenging problem because of the complexity of the system organization and the extensive list of configuration parameters to be considered. To address this problem, this research proposes several parametric optimization algorithms, ranging from a Naïve Exhaustive method to a Random method and a Heuristically-based Greedy method, to cope with the exponential nature of the search for the best possible parameter setting. Several variables must be taken into account to make each algorithm a viable option, such as the arity of each parameter, which determines the base of the exponential search time; the sampling methods used to keep the complexity of the search process under control or within budget; and the complexity of the heuristics, which must remain computationally feasible. The heuristic proposed in this research is a sensitivity-based parametric optimization applied in a greedy manner, combined with sampling techniques that keep the search space within a computationally feasible range. Extensive benchmark-based experiments have been conducted to validate the performance optimization of MapReduce computations on benchmark programs such as TestDFSIO and TeraSort. The experimental results demonstrate that the proposed heuristically-based greedy algorithm provides a promising answer to the problem of how to optimize the system's configuration parameter settings at a computationally feasible cost.
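The three search strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The parameter names, the candidate values, and the `synthetic_cost` function are hypothetical stand-ins for real Hadoop settings and benchmark run times. The sensitivity measure used here (the spread of cost when a single parameter is varied from a default configuration) is an assumption made for illustration, since the abstract does not define it.

```python
import itertools
import random

# Hypothetical parameter space: names and candidate values are illustrative only.
SPACE = {
    "io_sort_mb": [50, 100, 200],
    "reduce_tasks": [1, 2, 4, 8],
    "compress_map_output": [False, True],
}


def synthetic_cost(cfg):
    """Stand-in for a measured benchmark run time (lower is better)."""
    cost = 100.0
    cost -= 0.1 * cfg["io_sort_mb"]
    cost += 5.0 * abs(cfg["reduce_tasks"] - 4)
    cost -= 3.0 if cfg["compress_map_output"] else 0.0
    return cost


def exhaustive(space, cost):
    """Evaluate every combination: the count is the product of the arities."""
    names = list(space)
    combos = itertools.product(*(space[n] for n in names))
    best = min(combos, key=lambda vals: cost(dict(zip(names, vals))))
    return dict(zip(names, best))


def random_search(space, cost, budget, seed=0):
    """Evaluate only `budget` randomly sampled configurations."""
    rng = random.Random(seed)
    samples = [{n: rng.choice(vals) for n, vals in space.items()}
               for _ in range(budget)]
    return min(samples, key=cost)


def greedy_by_sensitivity(space, cost, defaults):
    """Rank parameters by sensitivity, then fix them one at a time."""
    def spread(name):
        # Assumed sensitivity measure: cost range when only `name` varies.
        costs = [cost({**defaults, name: v}) for v in space[name]]
        return max(costs) - min(costs)

    cfg = dict(defaults)
    for name in sorted(space, key=spread, reverse=True):
        # Keep the best value for this parameter given earlier choices.
        cfg[name] = min(space[name], key=lambda v: cost({**cfg, name: v}))
    return cfg


defaults = {name: values[0] for name, values in SPACE.items()}
best_exhaustive = exhaustive(SPACE, synthetic_cost)
best_random = random_search(SPACE, synthetic_cost, budget=5)
best_greedy = greedy_by_sensitivity(SPACE, synthetic_cost, defaults)
```

In this sketch the exhaustive search evaluates a number of configurations equal to the product of the parameter arities, while the greedy pass evaluates a number proportional to their sum. The greedy result matches the exhaustive one here only because the synthetic cost treats each parameter independently; with interacting parameters the greedy search may settle on a worse configuration.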
Publication Date
December, 2016
Publisher Statement
Copyright belongs to authors
Citation Information
Jongyeop Kim, Noh-Jin Park and Nohpill Park. "Heuristically-based Parametric Performance Optimization Algorithms for Big Data Computing" IJCIS International Journal of Computer & Information Science Vol. 17 Iss. 4 (2016) p. 17 - 31 ISSN: 2375-964X
Available at: http://works.bepress.com/jongyeop-kim/2/