Skip to main navigation Skip to search Skip to main content

FPGA-accelerated samplesort for large data sets

  • Stony Brook University

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

28 Scopus citations

Abstract

Sorting is a fundamental operation in many applications such as databases, search, and social networks. Although FPGAs have been shown very effective at sorting data sizes that fit on chip, systems that sort larger data sets by shuffling data on and off chip are bottlenecked by costly merge operations or data transfer time. We propose a new technique for sorting large data sets, which uses a variant of the samplesort algorithm on a server with a PCIeconnected FPGA. Samplesort avoids merging by randomly sampling values to determine how to partition data into non-overlapping buckets that can be independently sorted. The key to our design is a novel parallel multi-stage hardware partitioner, which is a scalable high-throughput solution that greatly accelerates the samplesort partitioning step. Using samplesort for FPGA-accelerated sorting provides several advantages over mergesort, while also presenting a number of new challenges that we address with cooperation between the FPGA and the software running on the host CPU. We prototype our design using Amazon Web Services FPGA instances, which pair a Xilinx Virtex UltraScale+ FPGA with a high-performance server. Our experiments demonstrate that our prototype system sorts 230 key-value records with a speed of 7.2 GB/s, limited only by the on-board DRAM capacity and available PCIe bandwidth. When sorting 230 records, our system exhibits a 37.4x speedup over the widely used GNU parallel sort on an 8-thread state-of-the-art CPU.

Original languageEnglish
Title of host publicationFPGA 2020 - 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
PublisherAssociation for Computing Machinery, Inc
Pages222-232
Number of pages11
ISBN (Electronic)9781450370998
DOIs
StatePublished - Feb 23 2020
Event2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA 2020 - Seaside, United States
Duration: Feb 23 2020Feb 25 2020

Publication series

NameFPGA 2020 - 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

Conference

Conference2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA 2020
Country/TerritoryUnited States
CitySeaside
Period02/23/2002/25/20

Keywords

  • FPGA
  • Partitioning
  • Samplesort
  • Sorting

Fingerprint

Dive into the research topics of 'FPGA-accelerated samplesort for large data sets'. Together they form a unique fingerprint.

Cite this