Project Details
Description
The rise of big data is changing our way of thinking about the world by providing new insights and creating new forms of value. The challenges for big data come not only from the volume but also from the complexity, such as the multi-dimensional nature of spatial data. In this CAREER project, we will deliver a scalable and efficient spatial big data processing system that can take advantage of the fast-increasing processing power of computers and their latest supporting programming environments. This software can be used for a variety of applications in medical image processing and in GIS (Geographical Information Systems), e.g., for city planning, transportation planning, disaster response, and military planning.
The fundamental goal of this CAREER project is to address the research challenges for delivering a high-performance software system for spatial queries and analytics of spatial big data on MapReduce and CPU-GPU hybrid platforms, promote the use of the created open source software to support problem solving in multiple disciplines, and educate the next-generation workforce in big data. Specifically, the following research aims will be pursued in this project: 1) Create new spatial data processing methods and pipelines with spatial partition-level parallelism through MapReduce and propose multi-level indexing methods to accelerate spatial data processing; 2) Research two critical components to enable data parallelism: effective and scalable spatial partitioning in MapReduce, and query normalization methods for partition effect; 3) Research efficient GPU-based spatial operations to support object-level and intra-object-level parallelism, and integrate them into MapReduce pipelines; 4) Investigate optimization methods for data processing pipelines, data skew mitigation, and CPU/GPU resource coordination in MapReduce; and 5) Provide declarative spatial queries and create a query translator to automatically translate the queries into MapReduce applications.
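As a rough illustration of the partition-level parallelism named in aims 1 and 2, the following Python sketch runs a toy spatial join over a fixed grid. It is not the project's software or its actual partitioning method: the uniform grid, the rectangle tuples, and every function name (`tiles_for`, `map_phase`, `reduce_phase`, `spatial_join`) are assumptions made for illustration. Objects that straddle tile boundaries are copied to every tile they touch, and duplicate result pairs are removed at the end, which is one simple way to handle the partition effect.

```python
from collections import defaultdict
from itertools import product

# Hypothetical record layout: (id, xmin, ymin, xmax, ymax).


def tiles_for(rect, tile_size):
    """Return the (col, row) grid tiles that a rectangle overlaps."""
    _, xmin, ymin, xmax, ymax = rect
    cols = range(int(xmin // tile_size), int(xmax // tile_size) + 1)
    rows = range(int(ymin // tile_size), int(ymax // tile_size) + 1)
    return list(product(cols, rows))


def intersects(a, b):
    """Axis-aligned rectangle intersection test (touching edges count)."""
    return a[1] <= b[3] and b[1] <= a[3] and a[2] <= b[4] and b[2] <= a[4]


def map_phase(left, right, tile_size):
    """Map step: assign each rectangle to every tile it overlaps."""
    partitions = defaultdict(lambda: {"L": [], "R": []})
    for side, rects in (("L", left), ("R", right)):
        for rect in rects:
            for tile in tiles_for(rect, tile_size):
                partitions[tile][side].append(rect)
    return partitions


def reduce_phase(partitions):
    """Reduce step: join each tile independently, then drop duplicates.

    Each tile could be processed by a separate worker; the set removes
    pairs reported by more than one tile because of boundary copies.
    """
    results = set()
    for sides in partitions.values():
        for a in sides["L"]:
            for b in sides["R"]:
                if intersects(a, b):
                    results.add((a[0], b[0]))
    return sorted(results)


def spatial_join(left, right, tile_size=10.0):
    """Toy grid-partitioned spatial join over two lists of rectangles."""
    return reduce_phase(map_phase(left, right, tile_size))
```

In this sketch each tile is joined independently, so the per-tile loops could in principle run in parallel; a skewed dataset would concentrate work in a few tiles, which is the kind of imbalance that aim 4 refers to as data skew.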
The project will provide a high-performance, scalable spatial computing infrastructure to be deployed by researchers and application users worldwide from various disciplines, and the source code will be released as open source and made fully available. The project will provide a strong foundation for solving spatial big data problems such as location-based services, remote-sensing-based applications, and map-based applications. It will also enable fast solutions to scientific problems such as large-scale pathology imaging. The educational activities include a revised undergraduate course with a new spatial big data theme, a revised graduate course with a focus on big data management, involvement of undergraduate, graduate, and underrepresented students in research, symposia and science projects for K-12 students, and a software infrastructure to support the education.
For further information see the project web site: http://fushengwang.net/hadoop-gis
Keywords: spatial big data, MapReduce, CPU-GPU, spatial queries, spatial analytics
| Status | Finished |
|---|---|
| Effective start/end date | 01/15/15 → 08/31/21 |
Funding
- National Science Foundation: $414,161.00