Contribution to Book
A Study on Balancing Parallelism, Data Locality, and Recomputation in Existing PDE Solvers
Proceedings of SC14: The International Conference for High Performance Computing, Networking, Storage and Analysis (2014)
  • Catherine Olschanowsky, Colorado State University
  • Michelle Mills Strout, Colorado State University
  • Stephen Guzik, Colorado State University
  • John Loffeld, Lawrence Livermore National Laboratory
  • Jeffrey Hittinger, Lawrence Livermore National Laboratory
Abstract
Structured-grid PDE solver frameworks parallelize over boxes, which are rectangular domains of cells or faces in a structured grid. In the Chombo framework, the box sizes are typically 16<sup>3</sup> or 32<sup>3</sup>, but larger box sizes such as 128<sup>3</sup> would result in less surface area and therefore less storage, copying, and/or ghost-cell communication overhead. Unfortunately, current on-node parallelization schemes perform poorly for these larger box sizes. In this paper, we investigate 30 different inter-loop optimization strategies and demonstrate the parallel scaling advantages of some of these variants on NUMA multicore nodes. Shifted, fused, and communication-avoiding variants for 128<sup>3</sup> boxes result in close to ideal parallel scaling and come close to matching the performance of 16<sup>3</sup> boxes on three different multicore systems for a benchmark that is a proxy for program idioms found in Computational Fluid Dynamics (CFD) codes.
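The surface-area argument in the abstract can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes cubic boxes padded on every side by a ghost layer of fixed width, and the width of 2 cells and the function name ghost_overhead are hypothetical choices made only for illustration.

```python
def ghost_overhead(n: int, g: int) -> float:
    """Ratio of ghost cells to interior cells for an n^3 box.

    Assumes (for illustration only) a ghost layer of width g on every
    side, so the padded box has (n + 2g)^3 cells in total.
    """
    interior = n ** 3
    padded = (n + 2 * g) ** 3
    return (padded - interior) / interior


if __name__ == "__main__":
    GHOST_WIDTH = 2  # assumed value; the real width depends on the stencil
    for n in (16, 32, 128):
        ratio = ghost_overhead(n, GHOST_WIDTH)
        print(f"{n}^3 box: ghost cells / interior cells = {ratio:.3f}")
```

Under these assumptions the ghost-cell ratio falls steadily as the box side grows, which is the storage and communication saving that motivates the larger boxes.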
Keywords
  • parallel processing
  • schedules
  • computational fluid dynamics
  • multicore processing
  • optimization
  • equations
Publication Date
2014
Publisher
IEEE
ISBN
9781479955008
DOI
10.1109/SC.2014.70
Citation Information
Catherine Olschanowsky, Michelle Mills Strout, Stephen Guzik, John Loffeld, et al. "A Study on Balancing Parallelism, Data Locality, and Recomputation in Existing PDE Solvers." Proceedings of SC14: The International Conference for High Performance Computing, Networking, Storage and Analysis, Piscataway, NJ (2014), pp. 793-804
Available at: http://works.bepress.com/catherine-olschanowsky/9/