If you attended SC11, you might have noticed some buzz around OpenACC. Well, at least I did. For example, today's OpenMP BOF had some information on this. I want to use this blog post to add some general comments and insights on the developments and direction of the OpenMP Language Committee, as well as on what has led to OpenACC. As always, you have to understand that these statements are mine only; on this blog I do not speak in any official role.
For quite a while now, OpenMP has been moving into the accelerator space, with the work done by the OpenMP for Accelerators subcommittee of the OpenMP Language Committee. That subcommittee publicly presented the status of their work at the last IWOMP, where James Beyer et al. had a paper on that particular topic (PDF of their presentation). They have invested a lot of effort and made good progress since then. In order to make support for accelerators happen in OpenMP, they have to achieve three goals: (i) provide support for Slicing and Shaping expressions, (ii) provide support for data management constructs and clauses, and finally (iii) provide support to denote kernels and constructs for execution on the accelerator. For all three items the subcommittee looked at other existing proposals, particularly from PGI, BSC and CAPS, but also from others. There are good proposals underway for (i) and (ii), which are probably backed by a majority in the language committee, since this functionality may turn out to be very handy to drive other features and proposals as well. Just as an example, we are aiming for improved support for Affinity of threads and data, which requires Slicing and Shaping of array expressions.
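To make the three items a bit more concrete, here is a minimal sketch in C. Please note that it uses the directive syntax of the OpenACC 1.0 specification, not the OpenMP proposal, whose syntax is still under discussion and may well look different. The function name `saxpy` and its parameters are made up for illustration only. The array sections `x[0:n]` and `y[0:n]` (start and length) stand for item (i), the `copyin` and `copy` clauses for item (ii), and the `parallel loop` directive for item (iii).

```c
/* Hypothetical example routine: y = a * x + y for vectors of length n.
 * The directive below follows OpenACC 1.0 syntax; it is an illustration,
 * not the syntax of the OpenMP accelerator proposal. */
void saxpy(int n, float a, const float *x, float *y)
{
    /* (iii) parallel loop: mark the loop for execution on the accelerator
     * (ii)  copyin / copy: x is only sent to the device, y is sent and
     *       brought back after the loop
     * (i)   x[0:n], y[0:n]: array sections given as start index and length,
     *       needed because a bare pointer carries no size information */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

A compiler without OpenACC support simply ignores the pragma and runs the loop serially on the host, so the result is the same either way.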
However, support for (iii) is really tough if one wants to integrate well with the rest of OpenMP and allow for future extensions. An important design goal is that OpenMP will support not just one particular type of accelerator, but rather be widely applicable to different kinds of devices from different vendors. These are the reasons why OpenMP is progressing as slowly as it is. We are planning for a public draft of OpenMP 4.0 for SC12, one year from now.
In order to allow for faster development, ignoring the OpenMP integration just for a moment, the OpenACC standard initiative was formed; it is basically a spin-off of the OpenMP Language Committee. Personally, I see this as a beta of OpenMP for Accelerators, and I hope that this initiative will help to collect valuable feedback on what pragma-based accelerator programming should look like. Cray, PGI and CAPS have all announced that they will implement the specification as it currently stands. When it comes to getting the resources for that, it is much easier to implement this spin-off spec than to implement an incomplete proposal draft. This is what I like about the OpenACC effort. And by the way, it was prominently promoted during the NVIDIA keynote at SC11 on Tuesday morning.
However, what I do not like is how it was marketed. People did not get the relation to OpenMP. The way it was published, it was not clear that other parties were involved in the development as well, not just the ones mentioned on the website. In fact, many people who visited the booth thought that OpenACC is about to become a competitor for OpenMP in the accelerator domain. This is not true: it is clearly the intent to feed the OpenACC development back into the next OpenMP specification. We clearly hope to release a draft in the SC12 time frame, but until then we have several technical problems to solve.