TY - GEN
T1 - Performance Study on CPU-based Machine Learning with PyTorch
AU - Chheda, Smeet
AU - Curtis, Anthony
AU - Siegmann, Eva
AU - Chapman, Barbara
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/2/27
Y1 - 2023/2/27
N2 - Over the past decade, research in machine learning has surged. Deep neural networks are a computationally intensive subclass of machine learning. Traditionally, GPUs have been used to accelerate the training of such deep networks by taking advantage of parallelization and their many-core architecture. As datasets and models grow larger, scaling the training or inference task can help reduce the time to solution for research or production purposes. The supercomputer Fugaku established state-of-the-art results in multiple machine learning benchmarks by scaling ARM-based CPU technology. To that end, we study and present the performance of machine learning training and inference tasks on the 64-bit ARM CPU architecture by exploiting its features, namely the Scalable Vector Extension (SVE) in ARMv8-A.
AB - Over the past decade, research in machine learning has surged. Deep neural networks are a computationally intensive subclass of machine learning. Traditionally, GPUs have been used to accelerate the training of such deep networks by taking advantage of parallelization and their many-core architecture. As datasets and models grow larger, scaling the training or inference task can help reduce the time to solution for research or production purposes. The supercomputer Fugaku established state-of-the-art results in multiple machine learning benchmarks by scaling ARM-based CPU technology. To that end, we study and present the performance of machine learning training and inference tasks on the 64-bit ARM CPU architecture by exploiting its features, namely the Scalable Vector Extension (SVE) in ARMv8-A.
KW - Distributed Learning
KW - High Performance Computing
KW - Machine Learning
KW - Scalability
UR - https://www.scopus.com/pages/publications/85147735059
U2 - 10.1145/3581576.3581615
DO - 10.1145/3581576.3581615
M3 - Conference contribution
AN - SCOPUS:85147735059
T3 - ACM International Conference Proceeding Series
SP - 24
EP - 34
BT - Proceedings of International Conference on High Performance Computing in Asia-Pacific Region Workshops, HPC Asia 2023
PB - Association for Computing Machinery
T2 - 2023 International Conference on High Performance Computing in Asia-Pacific Region Workshops, HPC Asia 2023
Y2 - 27 February 2023 through 2 March 2023
ER -