TY - GEN
T1 - Processing of large-scale biomedical images on a cluster of multicore CPUs and GPUs
AU - Catalyurek, Umit V.
AU - Hartley, Timothy D.R.
AU - Sertel, Olcay
AU - Ujaldon, Manuel
AU - Ruiz, Antonio
AU - Saltz, Joel
AU - Gurcan, Metin
PY - 2009
Y1 - 2009
N2 - Today's state-of-the-art cluster supercomputers include commodity components such as multi-core CPUs and graphics processing units. Together, these hardware devices provide unprecedented levels of performance in terms of raw GFLOPS and GFLOPS/cost. High-performance computing applications are always in search of lower execution times, greater system utilization, and better efficiency, which means that developers will need to leverage these disruptive technologies in order to take advantage of modern cluster computers' full potential processing power. New application models and middleware systems are needed to ease the developer's task of writing programs that efficiently use this processing capability. Here, we present the implementation of a biomedical image analysis application which serves as a case study for the development of applications for modern heterogeneous supercomputers. We present detailed application-specific optimizations which we generalize and combine with new programming models into a blueprint for future application development. Our techniques show good success executing on a modern heterogeneous GPU cluster providing 10 TFLOPS of peak processing capability.
AB - Today's state-of-the-art cluster supercomputers include commodity components such as multi-core CPUs and graphics processing units. Together, these hardware devices provide unprecedented levels of performance in terms of raw GFLOPS and GFLOPS/cost. High-performance computing applications are always in search of lower execution times, greater system utilization, and better efficiency, which means that developers will need to leverage these disruptive technologies in order to take advantage of modern cluster computers' full potential processing power. New application models and middleware systems are needed to ease the developer's task of writing programs that efficiently use this processing capability. Here, we present the implementation of a biomedical image analysis application which serves as a case study for the development of applications for modern heterogeneous supercomputers. We present detailed application-specific optimizations which we generalize and combine with new programming models into a blueprint for future application development. Our techniques show good success executing on a modern heterogeneous GPU cluster providing 10 TFLOPS of peak processing capability.
UR - https://www.scopus.com/pages/publications/84906546550
U2 - 10.3233/978-1-60750-073-5-341
DO - 10.3233/978-1-60750-073-5-341
M3 - Conference contribution
AN - SCOPUS:84906546550
SN - 9781607500735
T3 - Advances in Parallel Computing
SP - 341
EP - 364
BT - High Speed and Large Scale Scientific Computing
PB - IOS Press BV
ER -