Tag Archives: SC08

Compiler Support for OpenMP 3.0

SC08 brought us some pretty good news regarding availability of (full) support for OpenMP 3.0:

  • Intel 11.0: Linux (x86), Windows (x86) and MacOS (x86)
  • Sun Studio Express 11/08: Linux (x86) and Solaris (SPARC + x86)
  • PGI 8.0: Linux (x86) and Windows (x86)
  • IBM 10.1: Linux (POWER) and AIX (POWER)

GCC 4.4 will support OpenMP 3.0 as well; it is currently in "regression fixes and docs only" mode (see http://gcc.gnu.org/ml/gcc/2008-11/msg00007.html). Great, hm?

How to kill OpenMP by 2011 ?!

When I was asked to answer the question of how to kill OpenMP by 2011 during the OpenMP BoF panel discussion at SC08, I decided against listing the most prominent issues and challenges OpenMP is facing. It turned out that the first two speakers, Tim Mattson from Intel and Bronis de Supinski from LLNL, did exactly this very well. Instead, my claim is that OpenMP is doing quite well today and we “just” have to keep riding the multi-core momentum by outfitting OpenMP with a few more features. Our group is pretty involved in the OpenMP community, and I got the feeling that OpenMP has been gaining momentum since around early 2008, so I tried to present this in an entertaining way. This is a brief textual summary of my panel contribution (please do not take everything too seriously).

RWTH Aachen University is a member of the OpenMP ARB (= Architecture Review Board), as OpenMP is very important for many of our applications: all large codes (in terms of compute cycle consumption) are hybrid today, and in order to serve some complex applications for which no MPI parallelization exists (so far) we offer the largest SPARC- and x86-based SMP systems one could buy. Obviously we would be very sad if OpenMP disappeared, but finding an answer to the question of what a university could do to kill OpenMP by 2011 just took a few domestic beers and a good chat with friends at one of the nice pubs in Austin, TX: teach goto-based spaghetti-style programming, as branching into and out of parallel regions is not allowed by OpenMP and as such this programming style is inherently incompatible with OpenMP.

By the next day this idea had lost some of its fascination :-), so I went off to evaluate OpenMP’s current momentum. In 2007, we were invited to write a chapter for David Bader’s book on Petascale Computing. What we did just recently was a keyword search through the book (with some manual postprocessing):

Petascale Computing: Algorithms and Applications, by David Bader.

X10, Fortress, Titanium: < 10

This reveals at least the following interesting aspects:

  • MPI is clearly assessed to be the most important programming paradigm for Petascale systems, but OpenMP is also well recognized. Our own chapter on how to exploit SMP building blocks accounted for only 28 of the 150 hits on OpenMP.
  • The term Thread is often used in conjunction with OpenMP, but other threading models are hardly touched at all.
  • C/C++ and Fortran are the programming languages considered to be used to program current and future Petascale systems.
  • There was one chapter on Chapel, which gave it a comparatively high number of hits, but otherwise the “new” parallel programming paradigms are not (yet?) considered to be significant.

In order to take an even closer look at the recognition of OpenMP we asked our friend Google:

Google Trends: OpenMP versus Native Threading.

One can clearly see that the interest in OpenMP is increasing, as opposed to POSIX threads and Win32 threads. At the end of 2007 there is a peak, when OpenMP 3.0 was announced and a draft standard was released for public comment. Since Q3/2008 we have had compilers supporting OpenMP 3.0, which accounts for the interest increasing again. As there is quite some momentum in OpenMP, it is hard, if not impossible, for us, representing a university / the community, to kill OpenMP, which is actually quite nice.

But going back to finding an answer to the question posed to us, we found a suitable assassin: the trend of making shared-memory systems more and more complex in terms of their architecture. For example, all current x86-based systems (as announced this week at SC08) are cc-NUMA systems if they have more than one socket, and maybe we will eventually see NUCA (= non-uniform cache architecture) systems as well. So the hardware vendors actually have a chance to kill OpenMP by designing systems that are hard to exploit efficiently with multithreading. The only chance to really kill OpenMP by 2011 is therefore to leave it as it is and not equip it with means to aid the programmer in squeezing performance out of such systems with an increasingly deep memory hierarchy. In terms of OpenMP, the world is still flat:

OpenMP 3.0: The World is still flat, no support for cc-NUMA (yet)!

OpenMP is hardware agnostic; it has no notion of data locality.

The Affinity problem: How to maintain or improve the nearness of threads and their most frequently used data.


  • Where to run threads?

  • Where to place data?
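
Since OpenMP 3.0 itself offers no answer to these two questions, a common workaround is to rely on the operating system. The following is a minimal sketch in C, assuming a first-touch page placement policy (a page ends up in memory near the thread that first writes to it) and threads that stay on their cores. The helper names `alloc_first_touch` and `scale_and_sum` are made up for illustration and are not part of OpenMP. The idea is to initialize the data in parallel with the same static schedule the compute loop uses later, so that each thread mostly works on memory close to it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n doubles and initialize them in parallel.
   Assumption: the OS places a page near the thread that first writes it. */
double *alloc_first_touch(long n, double value)
{
    assert(n > 0);
    double *a = malloc((size_t)n * sizeof *a);
    if (a == NULL)
        return NULL;
    /* Static schedule: each thread touches one contiguous chunk. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; i++)
        a[i] = value;
    return a;
}

/* Hypothetical compute kernel: the same static schedule over the same
   iteration range gives each thread the chunk it initialized above. */
double scale_and_sum(double *a, long n, double factor)
{
    assert(a != NULL && n > 0);
    double sum = 0.0;
    #pragma omp parallel for schedule(static) reduction(+:sum)
    for (long i = 0; i < n; i++) {
        a[i] *= factor;
        sum += a[i];
    }
    return sum;
}
```

A caller would allocate with `alloc_first_touch(n, 1.0)`, run `scale_and_sum(a, n, 2.0)`, and release the array with `free`. With GCC the pragmas take effect when compiling with `-fopenmp`; without that flag they are ignored and the code runs serially. Where the threads actually run is still left to the operating system or to vendor-specific settings, which is exactly the gap described above.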

SC08 + a visit at UH: Happy to be in the USA again

Yesterday evening I arrived in Austin, TX. I had to spend about six hours at the Chicago airport, and that was not as bad as I had anticipated: as this was my third visit to this airport, I knew where to find power outlets and the like (even when the official ones are taken). By the way, if you don’t know it yet, take a look at this wiki from Jeff Sandquist listing power outlets at several airports.

The next week is tightly packed with a couple of HPC events, and over the next couple of days I will try to summarize interesting notes picked up at these events:

  • SAT to MON (November 15 to 17): Sun HPC Consortium.
  • MON (November 17): 2nd Windows HPC Summit (USA).
  • MON to FRI (November 17 to 21): SC08.