{"id":226,"date":"2024-12-03T17:15:02","date_gmt":"2024-12-03T15:15:02","guid":{"rendered":"https:\/\/terboven.com\/?p=226"},"modified":"2024-12-03T17:15:02","modified_gmt":"2024-12-03T15:15:02","slug":"affinity-in-openmp-6-0-on-taskloop-construct","status":"publish","type":"post","link":"https:\/\/terboven.com\/?p=226","title":{"rendered":"Affinity in OpenMP 6.0 on taskloop construct"},"content":{"rendered":"\n<p>During last week\u2019s SC24 conference in Atlanta, GA, I briefly reported on the activity of the Affinity subcommittee of the OpenMP language committee. One topic was that, together with the Tasking subcommittee, we brought support for <code>taskloop<\/code> affinity to OpenMP 6.0, which I am going to describe here.<\/p>\n\n\n\n<p>As you are probably well aware, the OpenMP specification currently allows for the use of the <code>depend<\/code> and <code>affinity<\/code> clauses on <code>task<\/code> constructs. The <code>depend<\/code> clause provides a mechanism for expressing data dependencies among tasks, and the <code>affinity<\/code> clause functions as a hint to guide the OpenMP runtime where to execute the tasks, preferably close to the data items specified in the clause. However, this functionality was not made available when the <code>taskloop<\/code> construct was added, which parallelizes a loop by creating a set of tasks, where each task typically handles one or more iterations of the loop. Specifically, the <code>depend<\/code> clause could not be used to express dependencies, either between tasks within a <code>taskloop<\/code> or between tasks generated by a <code>taskloop<\/code> and other tasks, limiting its applicability.<\/p>\n\n\n\n<p>OpenMP 6.0 introduced the <code>task_iteration<\/code> directive, which, when used with a <code>taskloop<\/code> construct, allows for fine-grained control over the creation and properties of individual tasks within the loop. 
Each <code>task_iteration<\/code> directive within a <code>taskloop<\/code> signals the creation of a new task with corresponding properties. With this functionality, one can express:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Dependencies: The <code>depend<\/code> clause on a <code>task_iteration<\/code> directive lets you specify data dependencies between tasks generated by the <code>taskloop<\/code> as well as between tasks of this <code>taskloop<\/code> and other tasks (standalone tasks or, e.g., tasks generated by other <code>taskloop<\/code> constructs).<\/li>\n\n\n\n<li>Affinity: The <code>affinity<\/code> clause can be used to specify data affinity for individual tasks. This enables optimizing data locality and improving cache utilization.<\/li>\n\n\n\n<li>Conditions: The <code>if<\/code> clause can be used to conditionally generate tasks within the <code>taskloop<\/code>. This can be helpful for situations where not all iterations of the loop need to generate a dependency, in particular to reduce overhead.<\/li>\n<\/ul>\n\n\n\n<p>Let&#8217;s consider the following artificial example code.<\/p>\n\n\n\n<p><code>\/\/ TL1 taskloop<br>#pragma omp taskloop nogroup<br>for (int i = 1; i &lt; n; i++)<br>{<br>\u00a0\u00a0\u00a0#pragma omp task_iteration depend(inout: A[i]) depend(in: A[i-1])<br>\u00a0\u00a0\u00a0A[i] += A[i] * A[i-1];<br>}<\/code><br><br><code>\/\/ TL2 taskloop + grainsize<br>#pragma omp taskloop grainsize(strict: 4) nogroup<br>for (int i = 1; i &lt; n; i++)<br>{<br>\u00a0\u00a0\u00a0#pragma omp task_iteration depend(inout: A[i]) depend(in: A[i-4])<\/code> \\<br>                                                                               <code>if ((i % 4) == 0 || i == n-1)<br>\u00a0\u00a0\u00a0A[i] += A[i] * A[i-1];<br>}<\/code><br><br><code>\/\/ T3 other task<br>#pragma omp task depend(in: A[n-1])<\/code><\/p>\n\n\n\n<p>The first <code>taskloop<\/code> construct, TL1, parallelizes a loop that has an obvious dependency: every iteration <code>i<\/code> depends on the 
previous iteration <code>i-1<\/code>. This is expressed with the <code>depend<\/code> clause accordingly. Consequently, this results in dependencies between the tasks generated by this <code>taskloop<\/code>.<\/p>\n\n\n\n<p>The second <code>taskloop<\/code> construct, TL2, parallelizes the loop by creating tasks that each execute four iterations, because of the <code>grainsize<\/code> clause with the <code>strict<\/code> modifier. In addition, a task dependency is only created if the expression of the <code>if<\/code> clause evaluates to <code>true<\/code>, limiting the overall number of dependencies per task.<\/p>\n\n\n\n<p>The remaining standalone task T3 is a regular explicit task that depends on the final element of array <code>A<\/code>, which is produced by the last task of TL2, and hence ensures the completion of all previously generated tasks.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n","protected":false},"excerpt":{"rendered":"<p>During last week\u2019s SC24 conference in Atlanta, GA, I briefly reported on the activity of the Affinity subcommittee of the OpenMP language committee. One topic was that, together with the Tasking subcommittee, we brought support for taskloop affinity to OpenMP 6.0, which I am going to describe here. 
As you are probably well aware, the &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/terboven.com\/?p=226\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Affinity in OpenMP 6.0 on taskloop construct&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[17,3,4],"tags":[12,5,35,8],"class_list":["post-226","post","type-post","status-publish","format-standard","hentry","category-hpc","category-openmp","category-talks","tag-affinity","tag-openmp","tag-sc24","tag-tasking"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/terboven.com\/index.php?rest_route=\/wp\/v2\/posts\/226","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/terboven.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/terboven.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/terboven.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/terboven.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=226"}],"version-history":[{"count":1,"href":"https:\/\/terboven.com\/index.php?rest_route=\/wp\/v2\/posts\/226\/revisions"}],"predecessor-version":[{"id":227,"href":"https:\/\/terboven.com\/index.php?rest_route=\/
wp\/v2\/posts\/226\/revisions\/227"}],"wp:attachment":[{"href":"https:\/\/terboven.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=226"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/terboven.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=226"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/terboven.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=226"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}