The complexity of many problems in science and engineering requires computational capacity exceeding what an average user can expect from a single computational center. While many of these problems can be viewed as a set of independent tasks, their collective complexity easily requires millions of core-hours on any state-of-the-art HPC resource, and throughput that cannot be sustained by a single multi-user queuing system. In this paper we explore the use of aggregated HPC resources to solve large-scale engineering problems. We show that it is possible to build a computational federation that is easy for end users to use and that is elastic, resilient, and scalable. We argue that the fusion of federated computing and real-life engineering problems can be brought to the average user if relevant middleware is provided. We report on the use of a federation of 10 distributed heterogeneous HPC resources to perform a large-scale interrogation of the parameter space in a microscale fluid flow problem.
Available at: http://works.bepress.com/baskar-ganapathysubramanian/26/
This is a manuscript of an article published as Diaz-Montes, Javier, Yu Xie, Ivan Rodero, Jaroslaw Zola, Baskar Ganapathysubramanian, and Manish Parashar. "Federated Computing for the Masses--Aggregating Resources to Tackle Large-Scale Engineering Problems." Computing in Science & Engineering 16, no. 4 (2014): 62-72. DOI:10.1109/MCSE.2013.134. Posted with permission.