When I was asked to give an answer to the question of how to kill OpenMP by 2011 during the OpenMP BoF panel discussion at SC08, I decided against listing the most prominent issues and challenges OpenMP is facing. It turned out that the first two speakers – Tim Mattson from Intel and Bronis de Supinski from LLNL – did exactly this very well. Instead, my claim is that OpenMP is doing quite well today and we “just” have to continue riding the multi-core momentum by outfitting OpenMP with a few more features. Our group is pretty involved in the OpenMP community, and I got the feeling that since around early 2008 OpenMP has been gaining momentum, so I tried to present this in an entertaining way. This is a brief textual summary of my panel contribution (please do not take everything too seriously).
RWTH Aachen University is a member of the OpenMP ARB (= Architecture Review Board), as OpenMP is very important for many of our applications: All large codes (in terms of compute cycle consumption) are hybrid today, and in order to serve some complex applications for which no MPI parallelization exists (so far) we offer the largest SPARC- and x86-based SMP systems one could buy. Obviously we would be very sad if OpenMP were to disappear, but in order to find an answer to the question of what a university could do to kill OpenMP by 2011, it just took a few domestic beers and a good chat with friends at one of the nice pubs in Austin, TX: Teach goto-based spaghetti-style programming, as branching in and out of Parallel Regions is not allowed by OpenMP, and as such this programming style is inherently incompatible with OpenMP.
By the next day this idea had lost some of its fascination :-), so I went off to evaluate OpenMP’s current momentum. In 2007, we were invited to write a chapter for David Bader’s book on Petascale Computing. What we did just recently was a keyword search (with some manual postprocessing):
This reveals at least the following interesting aspects:
- MPI is clearly assessed to be the most important programming paradigm for Petascale systems, but OpenMP is also well-recognized. Our own chapter on how to exploit SMP building blocks accounted for only 28 of the 150 hits on OpenMP.
- The term Thread is often used in conjunction with OpenMP, but other threading models are hardly touched at all.
- C/C++ and Fortran are the programming languages considered to be used to program current and future Petascale systems.
- There was one chapter on Chapel, which is why it had a comparably high number of hits, but otherwise the “new” parallel programming paradigms are not (yet?) considered to be significant.
In order to take an even closer look at the recognition of OpenMP we asked our friend Google:
One can clearly see that the interest in OpenMP is increasing, as opposed to Posix-Threads and Win32-Threads. At the end of 2007 there is a peak, when OpenMP 3.0 was announced and a draft standard was released for public comment. Since Q3/2008 we have compilers supporting OpenMP 3.0, which accounts for increasing interest again. As there is quite some momentum behind OpenMP, it is hard if not impossible for us, representing a university / the community, to kill OpenMP – which is actually quite nice.
But going back to finding an answer to the question posed to us, we found a suitable assassin: the trend of making shared-memory systems more and more complex in terms of their architecture. For example, all current x86-based systems (as announced this week at SC08) are cc-NUMA systems if you have more than one socket, and maybe we will eventually see NUCA (= non-uniform cache architecture) systems as well. So actually the hardware vendors have a chance to kill OpenMP by designing systems that are hard to exploit efficiently with multithreading. Thus the only chance to really kill OpenMP by 2011 is leaving it as it is and not equipping it with means to aid the programmer in squeezing performance out of such systems with an increasing depth of the memory hierarchy. In terms of OpenMP, the world is still flat:
OpenMP is hardware agnostic; it has no notion of data locality.
The Affinity problem: How to maintain or improve the nearness of threads and their most frequently used data.
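As a concrete sketch of what programmers are left doing by hand on cc-NUMA systems today: since OpenMP itself has no notion of data locality, a common idiom is to rely on the operating system's first-touch page-placement policy (an assumption that holds on Linux, for instance) and initialize data with the same parallel loop schedule that later works on it, so each page lands in the memory local to the thread that will use it. The function name below is illustrative, not from any standard API:

```c
#include <stdlib.h>

/* First-touch placement sketch. malloc() only reserves virtual
 * pages; physical pages are placed on the NUMA node of the thread
 * that first writes them (assuming a first-touch OS policy).
 * Initializing in parallel with schedule(static) means each thread
 * touches -- and thereby localizes -- exactly the chunk it will
 * process in later loops using the same static schedule. */
double *alloc_and_init(size_t n) {
    double *a = malloc(n * sizeof *a);
    if (!a)
        return NULL;
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i++)
        a[i] = 0.0;   /* first touch: page goes to this thread's node */
    return a;
}
```

Note that this only helps if the threads themselves stay pinned to their cores; if the OS migrates a thread to another socket, its “local” data suddenly becomes remote again, which is exactly the affinity problem described above.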