Abstract
Querying and processing very large, terabyte-scale datasets has become a key step in many scientific and engineering applications. In this paper, we describe the integrated application of two middleware frameworks to provide a scalable and efficient system for executing seismic data analysis on large datasets in a distributed environment. We investigate different strategies for efficient querying of large datasets and parallel implementations of a seismic image reconstruction algorithm. Our results on a state-of-the-art mass storage system coupled with a high-end compute cluster show that our implementation is scalable and can achieve a data processing rate of about 2.9 GB/s, roughly 70% of the storage platform's maximum application-level raw I/O bandwidth of 4.2 GB/s.
| Original language | English |
|---|---|
| Pages (from-to) | 423-438 |
| Number of pages | 16 |
| Journal | International Journal of High Performance Computing Applications |
| Volume | 20 |
| Issue number | 3 |
| State | Published - Sep 2006 |
Keywords
- Data-driven applications
- Seismic data analysis
Title
Supporting scalable and distributed data subsetting and aggregation in large-scale seismic data analysis