Result: The use of Java in large scientific applications in HPC environments
Further information
Java is a very widely used programming language, although its use in the scientific and High Performance Computing (HPC) communities remains relatively low. This thesis investigates the option of using Java to develop scientific applications intended for execution in HPC environments. The data reduction pipeline for the Gaia space astronomy mission is an example of a large software project that has been written in Java and will run in HPC environments. The efficient execution of the Gaia data reduction pipeline was one of the main motivations behind this thesis, although the thesis remains largely a general investigation into the use of Java in HPC. HPC is a fast-changing field in terms of hardware, software, and the scale of the problems being tackled. Amongst the most significant trends in HPC in recent years have been the increase in the number of cores per computing node and the increase in the size of the datasets that must be processed. A significant challenge in HPC is ensuring that data is available on a particular node when a core is ready to process it, thereby avoiding dead time and sustaining high throughput. One threat to throughput is the degradation in the performance of shared storage devices as the number of concurrent processes accessing them increases. Given the trends mentioned above, efficient data communication is very important for many applications running in HPC environments. In this thesis, we present an investigation into the current options for providing efficient data communication to Java applications in HPC environments. We investigate a number of implementations of Message Passing in Java (MPJ) and compare their performance. We present a new communication middleware application, called MPJ-Cache. This middleware is built on an underlying MPJ implementation and adds prefetching, caching, and file-splitting functionality.
It presents application developers with a high-level API, thus providing ...