Shudler, Sergei (2018)
Scalability Engineering for Parallel Programs Using Empirical Performance Models.
Technische Universität Darmstadt
Ph.D. Thesis, Primary publication
|
Scalability Engineering for Parallel Programs Using Empirical Performance Models -
Text
20180608_Shudler_diss_final.pdf - Accepted Version Copyright Information: CC BY-NC-ND 4.0 International - Creative Commons, Attribution NonCommercial, NoDerivs. Download (2MB) | Preview |
Item Type: | Ph.D. Thesis | ||||
---|---|---|---|---|---|
Type of entry: | Primary publication | ||||
Title: | Scalability Engineering for Parallel Programs Using Empirical Performance Models | ||||
Language: | English | ||||
Referees: | Wolf, Prof. Dr. Felix ; Schulz, Prof. Dr. Martin | ||||
Date: | 11 June 2018 | ||||
Place of Publication: | Darmstadt | ||||
Date of oral examination: | 16 April 2018 | ||||
Abstract: | Performance engineering is a fundamental task in high-performance computing (HPC). By definition, HPC applications should strive for maximum performance. As HPC systems grow larger and more complex, the scalability of an application has become of primary concern. Scalability is the ability of an application to show satisfactory performance even when the number of processors or the problems size is increased. Although various analysis techniques for scalability were suggested in past, engineering applications for extreme-scale systems still occurs ad hoc. The challenge is to provide techniques that explicitly target scalability throughout the whole development cycle, thereby allowing developers to uncover bottlenecks earlier in the development process. In this work, we develop a number of fundamental approaches in which we use empirical performance models to gain insights into the code behavior at higher scales. In the first contribution, we propose a new software engineering approach for extreme-scale systems. Specifically, we develop a framework that validates asymptotic scalability expectations of programs against their actual behavior. The most important applications of this method, which is especially well suited for libraries encapsulating well-studied algorithms, include initial validation, regression testing, and benchmarking to compare implementation and platform alternatives. We supply a tool-chain that automates large parts of the framework, thus allowing it to be continuously applied throughout the development cycle with very little effort. We evaluate the framework with MPI collective operations, a data-mining code, and various OpenMP constructs. In addition to revealing unexpected scalability bottlenecks, the results also show that it is a viable approach for systematic validation of performance expectations. As the second contribution, we show how the isoefficiency function of a task-based program can be determined empirically and used in practice to control the efficiency. Isoefficiency, a concept borrowed from theoretical algorithm analysis, binds efficiency, core count, and the input size in one analytical expression, thereby allowing the latter two to be adjusted according to given (realistic) efficiency objectives. Moreover, we analyze resource contention by modeling the efficiency of contention-free execution. This allows poor scaling to be attributed either to excessive resource contention overhead or structural conflicts related to task dependencies or scheduling. Our results, obtained with applications from two benchmark suites, demonstrate that our approach provides insights into fundamental scalability limitations or excessive resource overhead and can help answer critical co-design questions. Our contributions for better scalability engineering can be used not only in the traditional software development cycle, but also in other, related fields, such as algorithm engineering. It is a field that uses the software engineering cycle to produce algorithms that can be utilized in applications more easily. Using our contributions, algorithm engineers can make informed design decisions, get better insights, and save experimentation time. |
||||
Alternative Abstract: |
|
||||
URN: | urn:nbn:de:tuda-tuprints-74714 | ||||
Classification DDC: | 000 Generalities, computers, information > 004 Computer science | ||||
Divisions: | 20 Department of Computer Science > Parallel Programming | ||||
Date Deposited: | 12 Jun 2018 09:14 | ||||
Last Modified: | 09 Jul 2020 02:07 | ||||
URI: | https://tuprints.ulb.tu-darmstadt.de/id/eprint/7471 | ||||
PPN: | 432521151 | ||||
Export: |
View Item |