Result: Productivity, Portability, Performance, and Reproducibility: Data-Centric Python

Title:

Productivity, Portability, Performance, and Reproducibility: Data-Centric Python

Authors:

Ziogas, Alexandros Nikolaos, Schneider, Timo, Ben-Nun, Tal, Calotoiu, Alexandru, De Matteis, Tiziano, de Fine Licht, Johannes, Lavarini, Luca, Hoefler, Torsten

Source:

Ziogas, A N, Schneider, T, Ben-Nun, T, Calotoiu, A, De Matteis, T, de Fine Licht, J, Lavarini, L & Hoefler, T 2025, 'Productivity, Portability, Performance, and Reproducibility: Data-Centric Python', IEEE Transactions on Parallel and Distributed Systems, vol. 36, no. 5, pp. 804-820. https://doi.org/10.1109/TPDS.2025.3549310

Publication Year:

2025

Subject Terms:

Computer languages, dataflow computing, distributed computing, high-performance computing, parallel programming, Python

Document Type:

Academic journal; article in journal/newspaper

File Description:

application/pdf

Language:

English

Relation:

info:eu-repo/semantics/altIdentifier/hdl/https://hdl.handle.net/1871.1/b32513e8-9c4f-4e07-bc44-91c8951e2264; info:eu-repo/semantics/altIdentifier/pissn/1045-9219

DOI:

10.1109/TPDS.2025.3549310

Availability:

https://research.vu.nl/en/publications/b32513e8-9c4f-4e07-bc44-91c8951e2264
https://doi.org/10.1109/TPDS.2025.3549310
https://hdl.handle.net/1871.1/b32513e8-9c4f-4e07-bc44-91c8951e2264
https://research.vu.nl/ws/files/424089872/Productivity_Portability_Performance_and_Reproducibility_Data-Centric_Python.pdf
https://www.scopus.com/pages/publications/105002742993
https://www.scopus.com/inward/citedby.url?scp=105002742993&partnerID=8YFLogxK

Rights:

info:eu-repo/semantics/openAccess ; https://ub.vu.nl/nl/onderwijs-onderzoek/open-access/b-you-share-we-care/end-user-agreement.aspx

Accession Number:

edsbas.2CB25F62

Database:

BASE

Further Information

Python has become the de facto language for scientific computing. Programming in Python is highly productive, mainly due to its rich science-oriented software ecosystem built around the NumPy module. As a result, the demand for Python support in High-Performance Computing (HPC) has skyrocketed. However, the Python language itself does not necessarily offer high performance. This work presents a workflow that retains Python’s high productivity while achieving portable performance across different architectures. The workflow’s key features are HPC-oriented language extensions and a set of automatic optimizations powered by a data-centric intermediate representation. We show performance results and scaling across CPU, GPU, FPGA, and the Piz Daint supercomputer (up to 23,328 cores), with 2.47x and 3.75x speedups over previous-best solutions, first-ever Xilinx and Intel FPGA results of annotated Python, and up to 93.16% scaling efficiency on 512 nodes. Our benchmarks were reproduced in the Student Cluster Competition (SCC) during the Supercomputing Conference (SC) 2022. We present and discuss the student teams’ results.
