Progressive clustering of big data with GPU acceleration and visualization

  • Jun Wang
  • Eric Papenhausen
  • Bing Wang
  • Sungsoo Ha
  • Alla Zelenyuk
  • Klaus Mueller
  • Stony Brook University
  • Pacific Northwest National Laboratory

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations

Abstract

Clustering has become an unavoidable step in big data analysis. It may be used to arrange data into a compact format, making operations on big data manageable. However, clustering of big data requires not only the capability of handling data with large volume and high dimensionality, but also the ability to process streaming data, and most current algorithms are underdeveloped in all three respects. Furthermore, big data processing is seldom interactive, which conflicts with the needs of users who seek answers immediately. The best one can do is to process incrementally, such that partial and, hopefully, accurate results become available relatively quickly and are then progressively refined over time. We propose a clustering framework that uses Multi-Dimensional Scaling for layout and GPU acceleration to accomplish these goals. Our domain application is the clustering of mass spectral data of individual aerosol particles, with 8 million data points of 450 dimensions each.
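
The idea of incremental processing with progressively refined partial results can be illustrated with a minimal sketch. It is not the framework described in the abstract: it uses neither Multi-Dimensional Scaling nor a GPU, and it substitutes plain sequential (online) k-means in pure Python as a stand-in. The function names `progressive_kmeans` and `squared_distance`, the chunk size, and the synthetic two-blob stream are all hypothetical choices made for illustration.

```python
import random
from typing import Iterable, Iterator, List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def progressive_kmeans(
    chunks: Iterable[List[Point]], k: int
) -> Iterator[List[List[float]]]:
    """Consume a stream of chunks and yield the current centroids after each one.

    Assumption: sequential k-means updates, where each centroid is the running
    mean of the points assigned to it so far. This is a generic stand-in, not
    the method of the paper.
    """
    centroids: List[List[float]] = []
    counts: List[int] = []
    for chunk in chunks:
        for point in chunk:
            if len(centroids) < k:
                # Simplification: seed the centroids with the first k points seen.
                centroids.append(list(point))
                counts.append(1)
                continue
            # Assign the point to its nearest centroid.
            nearest = min(
                range(k), key=lambda i: squared_distance(point, centroids[i])
            )
            # Move that centroid toward the point so it stays a running mean.
            counts[nearest] += 1
            rate = 1.0 / counts[nearest]
            centroids[nearest] = [
                c + rate * (x - c) for c, x in zip(centroids[nearest], point)
            ]
        # Partial result: a copy of the centroids after this chunk.
        yield [list(c) for c in centroids]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical stream: points alternate between two 2-D Gaussian blobs.
    stream: List[List[float]] = []
    for _ in range(500):
        stream.append([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
        stream.append([rng.gauss(10.0, 1.0), rng.gauss(10.0, 1.0)])
    chunk_iter = (stream[i:i + 100] for i in range(0, len(stream), 100))
    for step, partial in enumerate(progressive_kmeans(chunk_iter, k=2), start=1):
        print(step, partial)
```

Because the function is a generator, a caller could redraw a view after every yielded set of centroids instead of waiting for the whole stream, which is the interactive behavior the abstract argues for. Seeding with the first k points is fragile on real data; a more careful initialization would likely be needed in practice.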

Original language: English
Title of host publication: 2017 New York Scientific Data Summit, NYSDS 2017 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538631614
State: Published - Oct 25 2017
Event: 2017 New York Scientific Data Summit, NYSDS 2017 - New York, United States
Duration: Aug 6 2017 to Aug 9 2017

Publication series

Name: 2017 New York Scientific Data Summit, NYSDS 2017 - Proceedings

Conference

Conference: 2017 New York Scientific Data Summit, NYSDS 2017
Country/Territory: United States
City: New York
Period: 08/6/17 to 08/9/17

Keywords

  • big data
  • clustering
  • GPU
  • visualization
