Article
Scaling Bayesian Network Parameter Learning with Expectation Maximization using MapReduce
Proc. of Big Learning: Algorithms, Systems and Tools (2012)
  • Erik B Reed, Carnegie Mellon University
  • Ole J Mengshoel, Carnegie Mellon University
Abstract
Bayesian network (BN) parameter learning from incomplete data can be a computationally expensive task. The Expectation Maximization (EM) algorithm, commonly applied to learn BN parameters, is unfortunately susceptible to local optima and prone to premature convergence. We develop and experiment with two methods for improving EM parameter learning by using MapReduce: Age-Layered Expectation Maximization (ALEM) and Multiple Expectation Maximization (MEM). Leveraging MapReduce for distributed machine learning, these algorithms (i) operate on a (potentially large) population of BNs and (ii) partition the data set, as is traditionally done with MapReduce machine learning. For example, over 20,000 MEM and ALEM trials on the BN Asia, distributed ALEM run on the Hadoop implementation of MapReduce achieved gains in both parameter quality (likelihood) and number of iterations (runtime).
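The MEM idea described in the abstract, running a population of EM instances and keeping the best-scoring one, can be illustrated with a minimal, self-contained sketch. This is not the authors' Hadoop implementation or their BN code; as a stand-in for BN parameter learning it uses EM on a toy two-coin mixture, with the "map" phase running EM from many random restarts and the "reduce" phase selecting the run with the highest log-likelihood. All function names (`em_two_coins`, `mem`) are illustrative assumptions.

```python
import math
import random

def em_two_coins(flips, theta, iters=50):
    """EM from an initial guess theta = (pA, pB). Each element of `flips`
    is (heads, tosses) for one group drawn from coin A or coin B, with
    fixed equal mixing weights for simplicity."""
    pA, pB = theta
    for _ in range(iters):
        # E-step: soft responsibility of coin A for each group
        hA = tA = hB = tB = 0.0
        for h, n in flips:
            la = h * math.log(pA) + (n - h) * math.log(1 - pA)
            lb = h * math.log(pB) + (n - h) * math.log(1 - pB)
            d = max(min(lb - la, 500.0), -500.0)  # clamp to avoid overflow
            wa = 1.0 / (1.0 + math.exp(d))
            hA += wa * h; tA += wa * n
            hB += (1 - wa) * h; tB += (1 - wa) * n
        # M-step: re-estimate the coin biases (kept away from 0 and 1)
        pA = min(max(hA / tA, 1e-6), 1 - 1e-6)
        pB = min(max(hB / tB, 1e-6), 1 - 1e-6)
    # log-likelihood of the data under the learned parameters
    ll = sum(math.log(0.5 * math.exp(h * math.log(pA) + (n - h) * math.log(1 - pA))
                      + 0.5 * math.exp(h * math.log(pB) + (n - h) * math.log(1 - pB)))
             for h, n in flips)
    return ll, (pA, pB)

def mem(flips, restarts=20, seed=0):
    """MEM-style map/reduce over a population of EM runs:
    'map' launches EM from random initializations,
    'reduce' keeps the run with the highest likelihood."""
    rng = random.Random(seed)
    runs = map(lambda _: em_two_coins(
        flips, (0.1 + 0.8 * rng.random(), 0.1 + 0.8 * rng.random())),
        range(restarts))
    return max(runs, key=lambda r: r[0])  # reduce: best likelihood wins

# Toy data: three high-bias groups and two low-bias groups
flips = [(9, 10), (8, 10), (2, 10), (1, 10), (9, 10)]
best_ll, (pA, pB) = mem(flips)
```

In a real MapReduce deployment the restarts would be distributed across mappers and the argmax taken in a reducer; the point of the sketch is only the population-plus-selection structure, not the distribution mechanics.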
Keywords
  • Bayesian network,
  • expectation maximization,
  • MapReduce,
  • Hadoop
Publication Date
December, 2012
Publisher Statement
@inproceedings{reed12scaling,
 author = {Reed, E. B. and Mengshoel, O. J.},
 title = {Scaling {Bayesian} Network Parameter Learning with Expectation Maximization using {MapReduce}},
 booktitle = {Proc. of Big Learning: Algorithms, Systems and Tools},
 year = {2012},
 month = {December},
 address = {Lake Tahoe, NV}
}
Citation Information
Erik B Reed and Ole J Mengshoel. "Scaling Bayesian Network Parameter Learning with Expectation Maximization using MapReduce," Proc. of Big Learning: Algorithms, Systems and Tools (2012).
Available at: http://works.bepress.com/ole_mengshoel/37/