
Supporting SQL-3 aggregations on grid-based data repositories

  • Li Weng
  • Gagan Agrawal
  • Umit Catalyurek
  • Joel Saltz
  • Ohio State University

Research output: Contribution to journal, Conference article, peer-reviewed

Abstract

There is an increasing trend toward distributed and shared repositories for storing scientific datasets. Developing applications that retrieve and process data from such repositories involves a number of challenges. First, these data repositories store data in complex, low-level layouts, which should be abstracted from application developers. Second, because data repositories are shared resources, part of the computation on the data must be performed on a different set of machines than those hosting the data. Third, because of the volume of data and the amount of computation involved, parallel configurations must be used both for hosting the data and for processing the retrieved data. In this paper, we describe a system for executing SQL-3 queries over scientific data stored as flat files. A relational table-based virtual view is supported on these flat-file datasets. The class of queries we consider involves data retrieval using Select and Where clauses, and processing with user-defined aggregate functions and group-bys. We use a middleware system, STORM, to provide much of the low-level functionality. Our compiler analyzes the SQL-3 queries and generates many of the functions required by this middleware. Our experimental results show good scalability with respect to both the number of nodes and the dataset size.
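As a rough illustration of the query class the abstract describes (selection with a Where predicate, followed by a group-by and a user-defined aggregate over a table-like view of a flat file), the following minimal sketch may help. It is not the authors' compiler output and does not use the STORM API; the file layout, the column names (`time`, `x`, `value`), and the names `scan_rows`, `Mean`, and `aggregate_query` are all assumptions made for this example, and everything runs serially on one machine.

```python
import io
from collections import defaultdict

# Hypothetical query being mimicked (names are illustrative only):
#   SELECT time, Mean(value) FROM view WHERE x > 0.4 GROUP BY time

def scan_rows(flat_file, columns):
    """Expose a whitespace-delimited flat file as rows of a virtual table."""
    for line in flat_file:
        fields = line.split()
        if len(fields) != len(columns):
            continue  # skip blank or malformed lines
        yield dict(zip(columns, map(float, fields)))

class Mean:
    """A user-defined aggregate: accumulate per row, finalize per group."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def iterate(self, value):
        self.total += value
        self.count += 1

    def terminate(self):
        return self.total / self.count

def aggregate_query(rows, where, group_key, agg_column, agg_cls):
    """Apply the Where predicate, then aggregate agg_column per group."""
    groups = defaultdict(agg_cls)  # one aggregate state per group value
    for row in rows:
        if where(row):
            groups[row[group_key]].iterate(row[agg_column])
    return {key: agg.terminate() for key, agg in groups.items()}

# An in-memory stand-in for a flat-file dataset (assumed layout: time x value).
data = io.StringIO("1 0.5 10.0\n1 0.7 20.0\n2 0.2 30.0\n2 0.9 50.0\n")
rows = scan_rows(data, ["time", "x", "value"])
result = aggregate_query(rows, lambda r: r["x"] > 0.4, "time", "value", Mean)
print(result)
```

In a distributed setting of the kind the abstract mentions, the scan and filter would plausibly run near the data while the aggregation runs on separate compute nodes; this sketch does not attempt to model that split.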

Original language: English
Pages (from-to): 283-298
Number of pages: 16
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3602
State: Published - 2005
Event: 17th International Workshop on Languages and Compilers for High Performance Computing, LCPC 2004 - West Lafayette, IN, United States
Duration: Sep 22 2004 to Sep 24 2004
