OpenMP Tutorials at SC22

As in previous years, several OpenMP tutorial proposals have been accepted for SC22. I am really looking forward to being in the USA again and, among other things, to teaching OpenMP to real people instead of black tiles. In this summary, I would like to highlight the two tutorials in which I am involved.

And by the way: in addition to the content itself, I believe these tutorials provide the extra value of direct access to members of the OpenMP Language Committee. That means we are approachable beyond the tutorial outline to discuss any topics, or any issues, you have with OpenMP.

Mastering Tasking with OpenMP

Since version 3.0, released in 2008, OpenMP has offered tasking to support the creation of composable parallel software blocks and the parallelization of irregular algorithms. Mastering the tasking concept of OpenMP requires a change in the way developers reason about the structure of their code and how they expose its parallelism. Our tutorial addresses this critical aspect by examining the tasking concept in detail and presenting patterns as solutions to many common problems.

Presenters: Christian Terboven, Michael Klemm, Xavier Teruel, and Bronis R. de Supinski

Content summary:

  • OpenMP Overview (high-level summary, synchronization, memory model)
  • OpenMP Tasking Model (overview, data sharing, taskloop)
  • Improving Tasking Performance (if + final + mergeable clauses, cut-off strategies, task dependencies, task affinity)
  • Cancellation Construct
  • Future OpenMP directions
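
To give a flavor of the tasking model and of a cut-off strategy, here is a minimal sketch in C. It is not taken from the tutorial material: the function names (fib_serial, fib_task, fib) and the cut-off value of 20 are assumptions chosen purely for illustration. Below the cut-off the recursion runs serially, so that no tasks are created for very small pieces of work.

```c
#include <assert.h>

/* Serial version, used once the problem is too small to justify a task. */
static long fib_serial(int n) {
    return n < 2 ? n : fib_serial(n - 1) + fib_serial(n - 2);
}

/* Task-parallel recursion with a manual cut-off (illustrative only). */
static long fib_task(int n, int cutoff) {
    if (n <= cutoff)
        return fib_serial(n);

    long x, y;
    /* x and y live in the parent's stack frame, so they must be shared
       for the child tasks to write their results back. */
    #pragma omp task shared(x)
    x = fib_task(n - 1, cutoff);
    #pragma omp task shared(y)
    y = fib_task(n - 2, cutoff);
    /* Wait for both child tasks before combining their results. */
    #pragma omp taskwait
    return x + y;
}

/* Entry point: one thread creates the root of the task tree,
   the other threads of the team pick up the generated tasks. */
long fib(int n) {
    assert(n >= 0);
    long result = 0;
    #pragma omp parallel
    #pragma omp single
    result = fib_task(n, 20); /* cut-off of 20 is an arbitrary assumption */
    return result;
}
```

Compiled without OpenMP support, the pragmas are ignored and the code simply runs serially. The if and final clauses listed above are clause-based alternatives to such a hand-written cut-off; a suitable threshold depends on the machine and the workload.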

Advanced OpenMP: Host Performance and 5.2 Features

Developers usually find OpenMP easy to learn. However, they are often disappointed with the performance and scalability of the resulting code. This stems not from shortcomings of OpenMP, but rather from the lack of depth with which it is employed. Our “Advanced OpenMP Programming” tutorial addresses this critical need by exploring the implications of possible OpenMP parallelization strategies, both in terms of correctness and performance.

Presenters: Christian Terboven, Michael Klemm, Ruud van der Pas, and Bronis R. de Supinski

Content summary:

  • OpenMP Overview (high-level summary, synchronization, memory model)
  • Techniques to obtain High Performance with OpenMP: memory access (memory placement, binding, NUMA) and vectorization (understanding SIMD, vectorization in OpenMP)
  • Advanced Language Features (doacross loops, user-defined reductions, atomics)
  • Future OpenMP directions
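
As an illustration of one of the advanced language features listed above, the following C sketch shows a user-defined reduction that computes the minimum and maximum of an array in a single parallel loop. It is not taken from the tutorial material: the type range_t, the helper range_merge, the reduction identifier range_red, and the function find_range are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical result type: smallest and largest value seen so far. */
typedef struct { double min; double max; } range_t;

/* Combiner: fold the partial result 'in' into 'out'. */
static void range_merge(range_t *out, const range_t *in) {
    if (in->min < out->min) out->min = in->min;
    if (in->max > out->max) out->max = in->max;
}

/* Tell OpenMP how to combine two range_t values and how to
   initialize each thread's private copy. */
#pragma omp declare reduction(range_red : range_t : \
        range_merge(&omp_out, &omp_in)) \
        initializer(omp_priv = { DBL_MAX, -DBL_MAX })

range_t find_range(const double *a, int n) {
    assert(a != NULL && n > 0);
    range_t r = { DBL_MAX, -DBL_MAX };
    /* Each thread updates its private copy of r; the copies are
       merged with range_merge at the end of the loop. */
    #pragma omp parallel for reduction(range_red : r)
    for (int i = 0; i < n; i++) {
        if (a[i] < r.min) r.min = a[i];
        if (a[i] > r.max) r.max = a[i];
    }
    return r;
}
```

For example, calling find_range on the array {3.0, -1.5, 7.25, 0.0} with n = 4 yields min = -1.5 and max = 7.25. Without OpenMP support, the pragmas are ignored and the loop runs serially with the same result.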

For a complete list of SC22 activities around OpenMP and associated with the OpenMP organization, please see this page listing tutorials, the BoF, and booth talks.

Excellent price-performance of SC20 tutorials

You are probably aware that SC20 will be a virtual (= online) event. It will start in about two weeks with the Tutorials (November 9 to 11), followed by the Workshops (November 11 to 13), the Keynotes and Awards and Top500 (and more, November 16) and finally the Technical Program and Invited Talks (and more, November 17 to 19).

However, the switch to an online format brings a great advantage for the SC20 tutorials that I only became aware of very recently: tutorials will be recorded and available online on demand for six months. This gives you the unique chance to attend every tutorial you might be interested in!

If you are interested in OpenMP, there are three tutorials to choose from. The OpenMP web presence has a nice overview. As usual, I am part of the Advanced OpenMP: Host Performance and 5.0 Features tutorial. Our focus is on performance aspects, e.g., data/thread locality, false sharing, and exploitation of vector units. All topics are accompanied by case studies, and we will discuss the corresponding OpenMP language features in depth. Please note that we will solely cover performance programming for multi-core architectures (not accelerators):

Our title slide: Advanced OpenMP tutorial at SC20