Result: Enhancing Kokkos with OpenACC.

Title:
Enhancing Kokkos with OpenACC.
Authors:
Valero-Lara, Pedro1 (AUTHOR) valerolarap@ornl.gov, Lee, Seyong1 (AUTHOR) lees2@ornl.gov, Gonzalez-Tallada, Marc2 (AUTHOR), Denny, Joel1 (AUTHOR), Teranishi, Keita1 (AUTHOR), Vetter, Jeffrey S.1 (AUTHOR)
Source:
International Journal of High Performance Computing Applications. Sep 2024, Vol. 38, Issue 5, pp. 409-426. 18 pp.
Database:
Business Source Elite

Further Information

C++ template metaprogramming has emerged as a prominent approach for achieving performance portability in heterogeneous computing. Kokkos represents a notable paradigm in this domain, offering programmers a suite of high-level abstractions for generic programming while deferring much of the device-specific code generation and optimization to the compiler through template specializations. Kokkos furnishes a range of device-specific code specializations across multiple back ends, including CUDA and HIP. Diverging from conventional back ends, the OpenACC implementation presents a high-level, multicompiler, multidevice, and directive-based programming model. This paper presents recent advancements in the OpenACC back end for Kokkos (i.e., KokkACC) and focuses on its integration into the Kokkos ecosystem, exploration of automatic device selection capabilities to enhance productivity, and performance evaluation on modern hardware such as NVIDIA H100 GPUs. The study includes implementation details and a thorough performance assessment across various computational benchmarks, including minibenchmarks (AXPY and DOT product), miniapps (LULESH, MiniFE, and SNAP-LAMMPS), and a scientific kernel based on the lattice Boltzmann method. [ABSTRACT FROM AUTHOR]
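
For readers unfamiliar with the directive-based style the abstract refers to, the two minibenchmarks it names (AXPY and DOT product) can be sketched in plain C++ annotated with OpenACC directives. This is an illustrative sketch only, not code from the paper and not the KokkACC implementation; the function names `axpy` and `dot` and the chosen data clauses are assumptions made for this example. A compiler without OpenACC support ignores the pragmas and runs the loops serially on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// AXPY: y = a * x + y, element by element.
// Hypothetical helper for illustration; not part of Kokkos or KokkACC.
void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const double* xp = x.data();
    double* yp = y.data();
    // Offload the loop; x is copied in, y is copied in and back out.
    #pragma acc parallel loop copyin(xp[0:n]) copy(yp[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}

// DOT product: returns the sum over i of x[i] * y[i].
// Hypothetical helper for illustration; not part of Kokkos or KokkACC.
double dot(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const double* xp = x.data();
    const double* yp = y.data();
    double sum = 0.0;
    // The reduction clause combines per-thread partial sums into sum.
    #pragma acc parallel loop reduction(+:sum) copyin(xp[0:n], yp[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        sum += xp[i] * yp[i];
    }
    return sum;
}
```

In Kokkos itself, loops of this kind would be expressed through its parallel-dispatch abstractions, with the back end supplying the device-specific code; the sketch above only shows what a directive-level formulation of the same kernels can look like.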

Copyright of International Journal of High Performance Computing Applications is the property of Sage Publications Inc. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)