|PPoPP 2008||START Conference Manager|
We use profile-driven control and data dependence analysis to overcome the limitations of static analysis. Dependences are measured between large regions of code, which in practice yields a correct analysis. Nonetheless, manual verification or a thread-level speculation system is needed to guarantee correct execution.
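As an illustration of measuring dependences between code regions from a profile, the following is a minimal sketch, not the authors' tool. It assumes a hypothetical trace format of `(region, op, address)` records and a hypothetical function name `profile_dependences`; the real profiler, its granularity, and its handling of control dependences are not described here.

```python
from collections import defaultdict

def profile_dependences(trace):
    """Collect cross-region data dependences observed in one profiled run.

    trace: iterable of (region, op, address) tuples, op is 'R' or 'W'.
           (Hypothetical format, assumed for this sketch.)
    Returns a set of (src_region, dst_region, kind) tuples,
    kind being 'RAW', 'WAR' or 'WAW'.
    """
    last_writer = {}                          # address -> region of last write
    readers_since_write = defaultdict(set)    # address -> regions reading since then
    deps = set()

    for region, op, addr in trace:
        if op == 'R':
            writer = last_writer.get(addr)
            if writer is not None and writer != region:
                deps.add((writer, region, 'RAW'))   # true (flow) dependence
            readers_since_write[addr].add(region)
        elif op == 'W':
            writer = last_writer.get(addr)
            if writer is not None and writer != region:
                deps.add((writer, region, 'WAW'))   # output dependence
            for reader in readers_since_write[addr]:
                if reader != region:
                    deps.add((reader, region, 'WAR'))  # anti dependence
            readers_since_write[addr].clear()
            last_writer[addr] = region
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return deps
```

Because the result reflects only the inputs that were profiled, a dependence that never occurs during profiling is missed, which is why verification or speculation remains necessary.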
DO-ACROSS parallelism is identified by analyzing the dependences between code regions. Interdependent code regions are recursively merged until the overall loop structure matches a preset template. This template is then used to steer parallelization and to privatize data structures when necessary.
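The merging step can be sketched as follows. This is a simplified illustration under stated assumptions: the template is taken to be "no cyclic dependences remain between region groups", and two groups are merged whenever each is reachable from the other. The actual template and merge rule are not specified here, and the names `merge_interdependent`, `regions`, and `deps` are hypothetical.

```python
def merge_interdependent(regions, deps):
    """Merge mutually dependent regions until no dependence cycle remains.

    regions: list of region names in program order.
    deps:    set of (src, dst) pairs meaning dst depends on src.
    Returns a list of frozensets (merged groups), each placed at the
    position of its earliest member.
    """
    groups = [frozenset([r]) for r in regions]

    def group_of(region):
        return next(g for g in groups if region in g)

    def group_edges():
        # Dependences between distinct groups only.
        return {(group_of(s), group_of(d))
                for s, d in deps if group_of(s) != group_of(d)}

    def reachable(start, edges):
        seen, stack = set(), [start]
        while stack:
            g = stack.pop()
            for s, d in edges:
                if s == g and d not in seen:
                    seen.add(d)
                    stack.append(d)
        return seen

    while True:
        edges = group_edges()
        pair = None
        for g in groups:
            for h in reachable(g, edges):
                if h != g and g in reachable(h, edges):
                    pair = (g, h)
                    break
            if pair:
                break
        if pair is None:
            return groups          # assumed template reached: acyclic stages
        g, h = pair
        merged = g | h
        groups = [merged if x == g else x for x in groups if x != h]


stages = merge_interdependent(
    ['read', 'compress', 'update', 'write'],
    {('read', 'compress'), ('compress', 'update'),
     ('update', 'compress'), ('update', 'write')},
)
```

In this example `compress` and `update` depend on each other and end up in one group, leaving three stages in program order.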
Applying our technique to the MiBench and SPEC CPU2000 benchmarks shows that the profile-based analysis is correct. It finds significant amounts of outer-loop parallelism, yielding a speedup of 5.45 for bzip2 compression on a 32-thread Sun Niagara CMP.