Advanced features
Overview
Teaching: 10 min
Exercises: 5 min
Questions
Some pointers on further OpenMP features
Objectives
Understand what further OpenMP features and material can be explored.
Using OpenMP on GPUs
GPUs are very efficient at parallel workloads, but data has to be offloaded to the device and processed there, because communication between the GPU and main memory is limited by the interface (e.g. PCIe).
NVIDIA GPUs are widely available and are usually programmed with NVIDIA's own CUDA technology. This leads to code that only works within NVIDIA's ecosystem, which limits choice for the programmer and makes it harder for others to run your code. OpenMP has supported offloading work to accelerator devices since version 4.0.
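As a rough illustration of the idea (this sketch is not from the course material, and the names n, x and y are made up for the example), the target construct below offloads a simple loop to a device, with map clauses copying the data to the GPU and the result back:

#include <stdio.h>

#define N 10000

int main(void)
{
    static double x[N], y[N];

    for (int i = 0; i < N; i++) {
        x[i] = i;
        y[i] = 2.0 * i;
    }

    /* Copy x and y to the device, run the loop there, copy y back. */
    #pragma omp target teams distribute parallel for map(to: x) map(tofrom: y)
    for (int i = 0; i < N; i++) {
        y[i] = y[i] + 3.0 * x[i];
    }

    printf("y[0] = %f, y[%d] = %f\n", y[0], N-1, y[N-1]);
    return 0;
}

If the compiler has no offload support, the target region simply runs on the host.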
Information from NVIDIA suggests this is possible but still a work in progress: the compiler has to be built with support for offloading to CUDA devices (LLVM/Clang is one such compiler).
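For example, with a suitably built Clang the sketch above might be compiled for offload roughly as follows (the exact flags depend on the compiler version and installation, and example.c is just an assumed file name):

clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda example.c -o example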
Example code is:
#ifdef GPU
#pragma omp target teams distribute parallel for reduction(max:error) \
            collapse(2) schedule(static,1)
#else
#pragma omp parallel for reduction(max:error)
#endif
for (int j = 1; j < n-1; j++)
{
    for (int i = 1; i < m-1; i++)
    {
        Anew[j][i] = 0.25 * ( A[j][i+1] + A[j][i-1] + A[j-1][i] + A[j+1][i] );
        error = fmax( error, fabs(Anew[j][i] - A[j][i]) );
    }
}
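Guarding the directive with #ifdef GPU lets the same loop be built either for the host, as an ordinary parallel for, or for offload to a GPU with target teams distribute; the collapse(2) clause merges the two nested loops into a single iteration space so there is enough parallelism to keep the device busy.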
If interested, come and talk to us and we can see how we can help.
Further material
- https://www.openmp.org
Key Points
OpenMP is still an evolving interface to parallel code.