Result: Parallelizing autotuning for HPC applications: Unveiling the potential of the speculation strategy in Bayesian optimization.
In the exascale computing era, tuning High-Performance Computing (HPC) applications has become a significant computational challenge. Although Bayesian optimization (BO) has emerged as a promising tool for HPC performance tuning, the BO workflow is inherently sequential (i.e., one function evaluation at a time) and cannot leverage the huge amount of parallel resources present in modern supercomputers, resulting in a considerable underutilization of their computational capabilities. This paper explores the trade-off between search quality and parallelism in BO, investigating a diverse set of methods. Building upon both previous approaches from the literature and novel methodologies introduced in this work, our study provides a deep analysis to accelerate BO performance tuning. By examining a set of synthetic functions and practical HPC applications, our exploration analyzes the interaction among various BO methods for parallelization, the quantity of parallel resources, the runtime distribution of target HPC applications, and the costs associated with different search orchestration mechanisms that have been overlooked in previous studies. Compared to sequential BO, our novel methodology achieves comparable quality while demonstrating robust scalability in search time as the amount of parallel resources increases; it also outperforms a state-of-the-art tuner, which supports parallelization, achieving up to 3.67x faster search time. We provide high-value insights for practitioners seeking to leverage the power of parallel computing for efficient HPC application tuning. Additionally, to further assist researchers in accelerating the performance tuning of their HPC applications, we provide an extension of an existing open-source tuning framework that incorporates our methods. [ABSTRACT FROM AUTHOR]
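The abstract's point about sequential evaluation and the runtime distribution of the target application can be illustrated with a little arithmetic. The following is a minimal sketch, not the authors' method: it assumes evaluations are dispatched in synchronous batches (each batch waits for its slowest member), and the function names and runtime values are hypothetical, chosen only for illustration.

```python
from typing import List


def sequential_search_time(runtimes: List[float]) -> float:
    """Wall-clock time when evaluations run one at a time."""
    return sum(runtimes)


def batch_search_time(runtimes: List[float], workers: int) -> float:
    """Wall-clock time for synchronous batches of up to `workers` evaluations.

    Assumption: a batch finishes only when its slowest evaluation finishes,
    so one long-running configuration stalls the whole batch.
    """
    if workers < 1:
        raise ValueError("workers must be >= 1")
    total = 0.0
    for start in range(0, len(runtimes), workers):
        total += max(runtimes[start:start + workers])
    return total


# Hypothetical application runtimes (seconds) for eight configurations.
runtimes = [4.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0]

seq = sequential_search_time(runtimes)   # all eight runs back to back
par = batch_search_time(runtimes, 4)     # two batches of four runs
print(f"sequential: {seq:.1f}s, 4 workers: {par:.1f}s, speedup: {seq / par:.2f}x")
```

With four workers the speedup in this toy case is below 4x because the first batch is held up by its single slow run, which is one way the runtime distribution can limit the benefit of added parallel resources. The sketch ignores surrogate-model fitting and orchestration costs, both of which the abstract names as factors in the study.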
Copyright of International Journal of High Performance Computing Applications is the property of Sage Publications Inc. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)