TY - GEN
T1 - Recent experiences in using MPI-3 RMA in the DASH PGAS Runtime
AU - Schuchart, Joseph
AU - Kowalewski, Roger
AU - Fuerlinger, Karl
N1 - Publisher Copyright: © 2018 Association for Computing Machinery.
PY - 2018/1/31
Y1 - 2018/1/31
AB - The Partitioned Global Address Space (PGAS) programming model has become a viable alternative to traditional message passing using MPI. The DASH project provides a PGAS abstraction entirely based on C++11. The underlying DASH RunTime, DART, provides communication and management functionality transparently to the user. In order to facilitate incremental transitions of existing MPI-parallel codes, the development of DART has focused on creating a PGAS runtime based on the MPI-3 RMA standard. From an MPI-RMA user perspective, this paper outlines our recent experiences in the development of DART and presents insights into issues that we faced and how we attempted to solve them, including issues surrounding memory allocation and memory consistency as well as communication latencies. We implemented a set of benchmarks for global memory allocation latency in the framework of the OSU micro-benchmark suite and present results for allocation and communication latency measurements of different global memory allocation strategies under three different MPI implementations.
KW - Communication latency
KW - DASH
KW - Global memory allocation
KW - MPI-RMA
KW - Partitioned global address space
KW - PGAS
UR - https://www.scopus.com/pages/publications/85043244236
U2 - 10.1145/3176364.3176367
DO - 10.1145/3176364.3176367
M3 - Conference contribution
AN - SCOPUS:85043244236
T3 - ACM International Conference Proceeding Series
SP - 21
EP - 30
BT - Proceedings of Workshops of HPC Asia 2018
PB - Association for Computing Machinery
T2 - 2018 Workshop on High Performance Computing Asia, HPC Asia 2018
Y2 - 31 January 2018
ER -