Article
Cost and data exploration considerations for big data prediction on the cloud
2015 IEEE International Conference on Big Data (Big Data) (2015)
  • Chris Tseng, San Jose State University
  • Tien Nguyen, San Jose State University
  • Chetan Sharma, San Jose State University
Abstract
Cloud services allow one to perform intensive big data calculations without personally owning a sufficiently powerful machine. Different cloud-based virtual machines, however, offer different processor speeds at different costs, and the most cost-effective machine size is not always obvious. We investigated different virtual machine sizes on the Microsoft Azure cloud service, as well as different data exploration methodologies, to carry out a big data prediction project using neural networks. We found that higher-end, more expensive virtual machine settings do not always yield proportionally better performance. Applying a neural network directly to a prediction problem typically runs into a performance bottleneck. We found that learning and prediction can be improved by taking the properties of the data and the nature of the problem into consideration. Some of our data preparation schemes will be useful for general big data prediction problems with noisy or non-uniformly distributed data.
Publication Date
December 28, 2015
DOI
10.1109/BigData.2015.7363930
Citation Information
Chris Tseng, Tien Nguyen and Chetan Sharma. "Cost and data exploration considerations for big data prediction on the cloud" 2015 IEEE International Conference on Big Data (Big Data) (2015) pp. 1622-1628
Available at: http://works.bepress.com/chris_tseng/4/