Result: Productivity, Portability, Performance, and Reproducibility: Data-Centric Python
https://doi.org/10.1109/TPDS.2025.3549310
https://hdl.handle.net/1871.1/b32513e8-9c4f-4e07-bc44-91c8951e2264
https://research.vu.nl/ws/files/424089872/Productivity_Portability_Performance_and_Reproducibility_Data-Centric_Python.pdf
https://www.scopus.com/pages/publications/105002742993
https://www.scopus.com/inward/citedby.url?scp=105002742993&partnerID=8YFLogxK
Further information
Python has become the de facto language for scientific computing. Programming in Python is highly productive, mainly due to its rich science-oriented software ecosystem built around the NumPy module. As a result, the demand for Python support in High-Performance Computing (HPC) has skyrocketed. However, the Python language itself does not necessarily offer high performance. This work presents a workflow that retains Python’s high productivity while achieving portable performance across different architectures. The workflow’s key features are HPC-oriented language extensions and a set of automatic optimizations powered by a data-centric intermediate representation. We show performance results and scaling across CPU, GPU, FPGA, and the Piz Daint supercomputer (up to 23,328 cores), with 2.47x and 3.75x speedups over previous-best solutions, first-ever Xilinx and Intel FPGA results of annotated Python, and up to 93.16% scaling efficiency on 512 nodes. Our benchmarks were reproduced in the Student Cluster Competition (SCC) during the Supercomputing Conference (SC) 2022. We present and discuss the student teams’ results.
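The scaling-efficiency figure quoted in the abstract can be read against the two standard definitions of parallel efficiency. The sketch below is illustrative only: the abstract does not state whether the 93.16% figure refers to weak or strong scaling, and the function names and timings are hypothetical, not taken from the paper.

```python
def weak_scaling_efficiency(t_base: float, t_scaled: float) -> float:
    """Weak scaling: the problem grows with the node count,
    so the ideal runtime stays equal to the baseline runtime."""
    return t_base / t_scaled


def strong_scaling_efficiency(
    t_base: float, t_scaled: float, n_base: int, n_scaled: int
) -> float:
    """Strong scaling: the problem size is fixed, so the ideal
    runtime shrinks by the factor n_base / n_scaled."""
    return (t_base * n_base) / (t_scaled * n_scaled)


if __name__ == "__main__":
    # Hypothetical timings in seconds (assumed values, not measured results).
    t_one_node = 10.0
    t_512_nodes = 10.8
    eff = weak_scaling_efficiency(t_one_node, t_512_nodes)
    print(f"weak scaling efficiency on 512 nodes: {eff:.2%}")
```

Under either definition, an efficiency of 100% means the run matches ideal scaling from the baseline; values below that reflect communication and other parallel overheads.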