
Benchmarking and evaluating unified memory for OpenMP GPU offloading

  • Alok Mishra
  • Lingda Li
  • Martin Kong
  • Hal Finkel
  • Barbara Chapman
  • Stony Brook University
  • Brookhaven National Laboratory
  • Argonne National Laboratory

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

38 Scopus citations

Abstract

The latest OpenMP standard offers automatic device offloading capabilities which facilitate GPU programming. Despite this, there remain many challenges. One of these is the unified memory feature introduced in recent GPUs. GPUs in current and future HPC systems have enhanced support for a unified memory space. In such systems, the CPU and GPU can access each other's memory transparently; that is, data movement is managed automatically by the underlying system software and hardware. Memory oversubscription is also possible in these systems. However, there is a significant lack of knowledge about how this mechanism will perform and how programmers should use it. We have modified several benchmark codes in the Rodinia benchmark suite to study the behavior of OpenMP accelerator extensions, and have used them to explore the impact of unified memory in an OpenMP context. We also modified the open source LLVM compiler to allow OpenMP programs to exploit unified memory. The results of our evaluation reveal that, while the performance of unified memory is comparable with that of normal GPU offloading for benchmarks with little data reuse, it suffers from significant overhead when GPU memory is oversubscribed for benchmarks with a large amount of data reuse. Based on these results, we provide several guidelines for programmers to achieve better performance with unified memory.

Original language: English
Title of host publication: Proceedings of LLVM-HPC 2017
Subtitle of host publication: 4th Workshop on the LLVM Compiler Infrastructure in HPC - Held in conjunction with SC 2017: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Association for Computing Machinery, Inc
ISBN (Print): 9781450355650
State: Published - Nov 12 2017
Event: 4th Workshop on the LLVM Compiler Infrastructure in HPC, LLVM-HPC 2017 - Held in conjunction with the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017 - Denver, United States
Duration: Nov 12 2017 to Nov 17 2017

Publication series

Name: Proceedings of LLVM-HPC 2017: 4th Workshop on the LLVM Compiler Infrastructure in HPC - Held in conjunction with SC 2017: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 4th Workshop on the LLVM Compiler Infrastructure in HPC, LLVM-HPC 2017 - Held in conjunction with the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017
Country/Territory: United States
City: Denver
Period: 11/12/17 to 11/17/17

Keywords

  • Benchmarking
  • GPU
  • OpenMP offloading
  • Performance evaluation
  • Unified memory
