TY - GEN
T1 - Comprehensive Study for Just-In-Time Pack Functions in Open MPI
AU - Li, Yicheng
AU - Schuchart, Joseph
AU - Bosilca, George
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Among the many communication capabilities of the Message Passing Interface (MPI), the manipulation of datatypes, i.e., contiguous and non-contiguous memory locations, regular or not, has been heavily underrated and underutilized. This paper introduces an enhancement to the Open MPI Datatype Engine by incorporating Just-In-Time (JIT) generation of tailored packing functions. The proposed approach aims to optimize data serialization and communication performance by dynamically generating packing functions tailored to specific datatypes and communication patterns. Leveraging the JIT pack mechanism eliminates branching overhead and enables efficient handling of non-contiguous data movement. Our implementation demonstrates a speedup of up to 3.65x in synthetic scenarios. Furthermore, in real-world application communication patterns, we achieve a speedup of up to 3.60x, emphasizing the practical relevance of our approach in improving communication performance for various datatypes and application workloads within the Open MPI framework. Prior research has explored the advantages of employing JIT functions in the context of packing operations. In our study, we delve further into the limitations of JIT functions and the elimination of JIT creation overhead, and strive to systematically categorize the specific scenarios where JIT implementation proves beneficial.
KW - Datatype Engine
KW - JIT
KW - MPI
KW - Open MPI
UR - https://www.scopus.com/pages/publications/85200735601
U2 - 10.1109/IPDPSW63119.2024.00130
DO - 10.1109/IPDPSW63119.2024.00130
M3 - Conference contribution
AN - SCOPUS:85200735601
T3 - 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024
SP - 678
EP - 685
BT - 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024
Y2 - 27 May 2024 through 31 May 2024
ER -