TY - GEN
T1 - Applying Fast Fourier Transforms to Accelerate Spatially and Temporally Inhomogeneous Stencil Computations
AU - Bentley, Russell
AU - Chowdhury, Rezaul
AU - Gregory, Aaron
AU - Santomauro, Michael
N1 - Publisher Copyright:
© 2025 ACM.
PY - 2025/7/16
Y1 - 2025/7/16
N2 - Stencil computations are essential for simulating the evolution of physical systems across multi-dimensional grids over multiple timesteps. State-of-the-art techniques in this field fall into three major groups: tiled looping algorithms, divide-and-conquer trapezoidal algorithms, and Krylov subspace methods. The vast majority of stencil computation algorithms fundamentally operate by looping over all grid cells for all timesteps and computing each value as they go. For a grid of size N and T timesteps, all such looping methods perform Θ (NT) work. Recently however a class of Krylov subspace methods based on Fast Fourier transforms (FFTs) have been developed that allow for o (NT) work if applied to a problem with a constant linear stencil. In this paper we present an approach that supports a broad class of problems involving multiple time-varying linear stencils applied over spatial regions with arbitrary, time-varying boundaries subject to arbitrary boundary conditions. We develop a recursive divide-and-conquer strategy based on l-shell decompositions of arbitrary spatial regions, enabling efficient solutions over irregular domains in both space and time. These extensions allow us to apply FFT-based acceleration in settings that were previously out of reach. We maintain to within a logarithmic factor the complexity bounds achieved for the narrower problem statement addressed by previous FFT-based methods, again securing o (NT) work. Initial experimental results show that implementations of our algorithms that evolve grids of roughly 107 cells for around 105 timesteps run orders of magnitude faster than state-of-the-art implementations for periodic stencil problems, and substantially faster for aperiodic stencil problems. Code Repository: https://github.com/TEALab-org/nhls
AB - Stencil computations are essential for simulating the evolution of physical systems across multi-dimensional grids over multiple timesteps. State-of-the-art techniques in this field fall into three major groups: tiled looping algorithms, divide-and-conquer trapezoidal algorithms, and Krylov subspace methods. The vast majority of stencil computation algorithms fundamentally operate by looping over all grid cells for all timesteps and computing each value as they go. For a grid of size N and T timesteps, all such looping methods perform Θ (NT) work. Recently however a class of Krylov subspace methods based on Fast Fourier transforms (FFTs) have been developed that allow for o (NT) work if applied to a problem with a constant linear stencil. In this paper we present an approach that supports a broad class of problems involving multiple time-varying linear stencils applied over spatial regions with arbitrary, time-varying boundaries subject to arbitrary boundary conditions. We develop a recursive divide-and-conquer strategy based on l-shell decompositions of arbitrary spatial regions, enabling efficient solutions over irregular domains in both space and time. These extensions allow us to apply FFT-based acceleration in settings that were previously out of reach. We maintain to within a logarithmic factor the complexity bounds achieved for the narrower problem statement addressed by previous FFT-based methods, again securing o (NT) work. Initial experimental results show that implementations of our algorithms that evolve grids of roughly 107 cells for around 105 timesteps run orders of magnitude faster than state-of-the-art implementations for periodic stencil problems, and substantially faster for aperiodic stencil problems. Code Repository: https://github.com/TEALab-org/nhls
KW - Divide-and-Conquer
KW - Fast Fourier Transform
KW - Moving Boundaries
KW - Multi-region Stencil Grids
KW - Parallel Stencil Solver
KW - Stencil Computation
KW - Time-varying Stencils
UR - https://www.scopus.com/pages/publications/105012714638
U2 - 10.1145/3694906.3743354
DO - 10.1145/3694906.3743354
M3 - Conference contribution
AN - SCOPUS:105012714638
T3 - Annual ACM Symposium on Parallelism in Algorithms and Architectures
SP - 17
EP - 33
BT - SPAA 2025 - Proceedings of the 2025 37th ACM Symposium on Parallelism in Algorithms and Architectures
PB - Association for Computing Machinery
T2 - 37th ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2025
Y2 - 28 July 2025 through 1 August 2025
ER -