The parallel implementation of a three-dimensional direct simulation Monte Carlo (DSMC) code employing complex data structures and dynamic memory allocation is detailed for shared-memory systems using Open Multi-Processing (OpenMP). Several techniques to optimize the serial implementation of the DSMC method are first discussed. Specifically for a 3-level Cartesian grid, a Cartesian-based movement technique including particle indexing is demonstrated to result in a modest decrease in overall simulation expense of 34% compared with a ray-tracing technique combined with stored cell connectivity. Two strategies for data localization leading to optimal usage of cache memory are demonstrated to speed up certain cell-based functions (such as collision computations) by a factor of 3.38 to 4.36. The shared-memory parallel implementation using OpenMP is then described in detail. Synchronization points and related critical sections are identified as major factors that impact the OpenMP parallel performance. Techniques to remove all such synchronization points in the OpenMP implementation of the DSMC method are outlined. For dual-core and quad-core systems, speedups of 1.99 and 3.74, respectively, are obtained for a (free-stream flow) test simulation with low granularity. Finally, the parallel performance of identical source code employing OpenMP is shown to be strongly correlated with the underlying computer architecture. Both symmetric multiprocessor (SMP) and non-uniform memory access (NUMA) systems are studied in order to achieve a better understanding of their impact on parallel scalability when using OpenMP.
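The reported speedups can be put in context by converting them to parallel efficiency, that is, speedup divided by the number of cores. The short sketch below does this arithmetic for the two figures quoted in the abstract. The function name `parallel_efficiency` and the dictionary layout are illustrative assumptions and are not taken from the paper.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return speedup divided by core count (1.0 means ideal linear scaling)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Speedups quoted in the abstract for the free-stream test simulation,
# keyed by core count (hypothetical layout, chosen for illustration).
reported = {2: 1.99, 4: 3.74}

efficiencies = {
    cores: parallel_efficiency(speedup, cores)
    for cores, speedup in reported.items()
}

for cores, eff in efficiencies.items():
    print(f"{cores} cores: parallel efficiency = {eff:.1%}")
```

Under this definition the dual-core result corresponds to roughly 99.5% efficiency and the quad-core result to roughly 93.5%.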
Bibliographical note
Funding Information:
This work is partially supported through a seed grant from the University of Minnesota Supercomputing Institute (MSI). This work was carried out in part using computing resources at MSI. We would also like to thank Dr. David Porter from MSI for insightful discussions. This work is also supported by the Air Force Office of Scientific Research (AFOSR) under Grant No. FA9550-04-1-0341. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the AFOSR or the US Government.
- Direct simulation Monte Carlo
- Hypersonic flow