Discovery of Potential Parallelism in Sequential Programs.
Technische Universität Darmstadt, Darmstadt
[Ph.D. Thesis], (2016)
Available under CC-BY-NC-ND 4.0 International - Creative Commons Attribution Non-commercial No-derivatives 4.0.
Download (4MB) | Preview
|Item Type:||Ph.D. Thesis|
|Title:||Discovery of Potential Parallelism in Sequential Programs|
In the era of multicore processors, the responsibility for performance gains has been shifted onto software developers. Once improvements of the sequential algorithm have been exhausted, software-managed parallelism is the only option left. However, writing parallel code is still difficult, especially when parallelizing sequential code written by someone else. A key task in this process is the identification of suitable parallelization targets in the source code. Parallelism discovery tools help developers to find such targets automatically. Unfortunately, tools that identify parallelism during compilation are usually conservative due to the lack of runtime information, and tools relying on runtime information primarily suffer from high overhead in terms of both time and memory. This dissertation presents a generic framework for parallelism discovery based on dynamic program analysis, supporting various types of parallelism while incurring practically affordable overhead. The framework contains two main components: an efficient data-dependence profiler and a set of parallelism discovery algorithms based on a language-independent concept called Computational Unit.
The data-dependence profiler serves as the foundation of the parallelism discovery framework. Traditional dependence profiling approaches introduce a tremendous amount of time and memory overhead. To lower the overhead, current methods limit their scope to the subset of the dependence information needed for the analysis they have been created for, sacrificing generality and discouraging reuse. In contrast, the profiler shown in this thesis addresses the problem via signature-based memory management and a lock-free parallel design. It produces detailed dependences not only for sequential but also for multi-threaded code without causing prohibitive overhead, allowing it to serve as a generic base for various program analysis techniques.
Computational Units (CUs) provide a language-independent foundation for parallelism discovery. CUs are computations that follow the read-compute-write pattern. Unlike other concepts, they are not restricted to predefined language constructs. A program is represented as a CU graph, in which vertexes are CUs and edges are data dependences. This allows parallelism to be detected that spreads across multiple language constructs, taking code refactoring into consideration. The parallelism discovery algorithms cover both loop and task parallelism.
Results of our experiments show that 1) the efficient data-dependence profiler has a very competitive average slowdown of around 80× with accuracy higher than 99.6%; 2) the framework discovers parallelism with high accuracy, identifying 92.5% of the parallel loops in NAS benchmarks; 3) when parallelizing well-known open-source software following the outputs of the framework, reasonable speedups are obtained. Finally, use cases beyond parallelism discovery are briefly demonstrated to show the generality of the framework.
|Place of Publication:||Darmstadt|
|Uncontrolled Keywords:||parallelism discovery, data-dependence profiling, program analysis, computational unit, parallel programming|
|Classification DDC:||000 Allgemeines, Informatik, Informationswissenschaft > 004 Informatik|
|Divisions:||20 Department of Computer Science
20 Department of Computer Science > Parallel Programming
|Date Deposited:||08 Nov 2016 10:10|
|Last Modified:||08 Nov 2016 10:10|
|Referees:||Wolf, Prof. Dr. Felix and Clauss, Prof. Dr. Philippe|
|Refereed:||28 October 2016|