TY - GEN
T1 - A Further Study of Linux Kernel Hugepages on A64FX with FLASH, an Astrophysical Simulation Code
AU - Feldman, Catherine
AU - Chheda, Smeet
AU - Calder, Alan
AU - Siegmann, Eva
AU - Dey, John
AU - Curtis, Tony
AU - Harrison, Robert
N1 - Publisher Copyright: © 2023 Owner/Author.
PY - 2023/7/23
Y1 - 2023/7/23
N2 - We present an expanded study of the performance of FLASH when using Linux Kernel Hugepages on Ookami, an HPE Apollo 80 A64FX platform. FLASH is a multi-scale, multi-physics simulation code written principally in modern Fortran and makes use of the PARAMESH library to manage a block-structured adaptive mesh. Our initial study used only the Fujitsu compiler to utilize standard hugepages (hp), but further investigation allowed us to utilize hp for multiple compilers by linking to the Fujitsu library libmpg and transparent hugepages (thp) by enabling it at the node level. By comparing the results of hardware counters and in-code timers, we found that hp and thp do not significantly impact the runtime performance of FLASH. Interestingly, there is a significant reduction in the TLB misses, differences in cache and memory access counters, and strange behavior is observed when using thp.
KW - A64FX architecture
KW - astrophysics
KW - high performance computing
UR - https://www.scopus.com/pages/publications/85176222362
U2 - 10.1145/3569951.3597583
DO - 10.1145/3569951.3597583
M3 - Conference contribution
AN - SCOPUS:85176222362
T3 - PEARC 2023 - Computing for the common good: Practice and Experience in Advanced Research Computing
SP - 186
EP - 195
BT - PEARC 2023 - Computing for the common good
PB - Association for Computing Machinery, Inc
T2 - 2023 Practice and Experience in Advanced Research Computing, PEARC 2023
Y2 - 23 July 2023 through 27 July 2023
ER -