13.07.2015

automatically exploiting cross-invocation parallelism using runtime ...

2.4 Sequential Loop Example for DOACROSS and DSWP
2.5 Parallelization Execution Plan for DOACROSS and DSWP
2.6 Example Loop which cannot be parallelized by DOACROSS or DSWP
2.7 PDG after breaking the loop exit control dependence
2.8 TLS and SpecDSWP schedules for the loop shown in Figure 2.6
3.1 Example program: (a) Simplified code for a nested loop in CG. (b) PDG for inner loop. The dependence pattern allows DOALL parallelization. (c) PDG for outer loop. Cross-iteration dependence deriving from E to itself has manifest rate 72.4%.
3.2 Comparison of performance with and without cross-invocation parallelization: (a) DOALL is applied to the inner loop. Frequent barrier synchronization occurs between the boundary of the inner and outer loops. (b) After the partitioning phase, DOMORE has partitioned the code without inserting the runtime engine. A scheduler and three workers execute concurrently, but worker threads still synchronize after each invocation. (c) DOMORE finalizes by inserting the runtime engine to exploit cross-invocation parallelism. Assuming iteration 2 from invocation 2 (2.2) depends on iteration 5 from invocation 1 (1.5), the scheduler detects the dependence and synchronizes those two iterations.
3.3 Performance improvement of CG with and without DOMORE.
3.4 Overview of DOMORE compile-time transformation and runtime synchronization
3.5 Scheduler scheme running example: (a) Table showing original invocation/iteration, array element accessed in iteration, thread the iteration is scheduled to, combined iteration number, and helper data structure values. (b) Execution of the example.
