TY - GEN
T1 - Haggis
T2 - 3rd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2014
AU - Aji, Ablimit
AU - Teodoro, George
AU - Wang, Fusheng
PY - 2014/11/4
Y1 - 2014/11/4
N2 - Spatial query processing involves complex multidimensional objects and compute-intensive spatial operations, and therefore requires a high-performance approach to meet the rapid data analytics requirements of modern spatial applications. Recently, MapReduce-based spatial query systems have become a viable solution for many data-intensive query tasks and have gained widespread adoption in both academia and industry. At the same time, GPUs have been successfully utilized in many applications that require high-performance computation. Both approaches, GPU and MapReduce, have their own limitations and advantages, and have been separately utilized in spatial query processing tasks to boost application performance. However, it is unclear how MapReduce and GPU, two vastly different parallelization techniques, can be fused together to deal effectively with spatial big data challenges. In this paper, we explore such a synergy of parallelization techniques for large-scale spatial query processing. We extend Hadoop-GIS, a MapReduce-based spatial query system, and provide GPU-accelerated spatial query processing capability at the engine level. We evaluate the system on a real-world dataset and demonstrate that the GPU-accelerated system can gain considerable performance improvements. We also show how other factors, such as partition granularity and task scheduling between CPU and GPU, can impact query performance.
AB - Spatial query processing involves complex multidimensional objects and compute-intensive spatial operations, and therefore requires a high-performance approach to meet the rapid data analytics requirements of modern spatial applications. Recently, MapReduce-based spatial query systems have become a viable solution for many data-intensive query tasks and have gained widespread adoption in both academia and industry. At the same time, GPUs have been successfully utilized in many applications that require high-performance computation. Both approaches, GPU and MapReduce, have their own limitations and advantages, and have been separately utilized in spatial query processing tasks to boost application performance. However, it is unclear how MapReduce and GPU, two vastly different parallelization techniques, can be fused together to deal effectively with spatial big data challenges. In this paper, we explore such a synergy of parallelization techniques for large-scale spatial query processing. We extend Hadoop-GIS, a MapReduce-based spatial query system, and provide GPU-accelerated spatial query processing capability at the engine level. We evaluate the system on a real-world dataset and demonstrate that the GPU-accelerated system can gain considerable performance improvements. We also show how other factors, such as partition granularity and task scheduling between CPU and GPU, can impact query performance.
KW - GPU
KW - Load balancing
KW - MapReduce
KW - Spatial data partition
KW - Spatial query processing
UR - https://www.scopus.com/pages/publications/84920273564
U2 - 10.1145/2676536.2676539
DO - 10.1145/2676536.2676539
M3 - Conference contribution
AN - SCOPUS:84920273564
T3 - Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2014
SP - 15
EP - 20
BT - Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2014
A2 - Chandola, Varun
A2 - Vatsavai, Ranga Raju
PB - Association for Computing Machinery
Y2 - 4 November 2014 through 4 November 2014
ER -