13.07.2015 Views

automatically exploiting cross-invocation parallelism using runtime ...


5.2 SPECCROSS Performance Evaluation

Eight programs are evaluated for SPECCROSS. As in the DOMORE evaluation, we compared two parallel versions of these programs: (a) pthreads-based [9] parallelization with non-speculative pthread barriers; and (b) pthreads-based parallelization with SPECCROSS. SPECCROSS is used with a pthreads-based implementation, since the recovery mechanism relies on the properties of POSIX threads. For the performance measurements, the best sequential execution of the parallelized loops is considered the baseline.

For most of these programs, the parallelized loops account for more than 90% of the execution time. When parallelizing using SPECCROSS, each loop iteration is regarded as a separate task, and the custom hash function used for calculating the access signatures keeps track of the range of memory locations (or array indices) accessed by each task. This choice is guided by the predominantly array-based accesses present in these programs. Each parallel program is first instrumented using the profiling functions provided by SPECCROSS. The profiling step recommends a minimum dependence distance value for use in speculative barrier execution.

All benchmark programs have multiple input sets. We used the training input set for the profiling run. Table 5.3 shows the minimum dependence distance results for the evaluated programs using two different input sets (a training input set for the profiling run and a reference input set for the performance run). The profiling functions detected runtime dependences in four of the eight programs and none in the other four. The minimum dependence distance between two inner loops in FLUIDANIMATE varies widely: some of the loops do not cause any runtime access conflicts, while others have a very small minimum dependence distance. In the latter case, SPECCROSS essentially behaves like a non-speculative barrier.
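To make the range-based access signatures and the profiled minimum dependence distance concrete, here is a minimal Python sketch. It is not the SPECCROSS implementation: the names `RangeSignature` and `min_dependence_distance` are hypothetical, and the sketch assumes that the distance between two conflicting tasks is the difference of their global task numbers across two consecutive epochs (loop invocations), which this section does not define explicitly.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class RangeSignature:
    """Closed interval [lo, hi] of array indices touched by one task (hypothetical)."""
    lo: int
    hi: int

    @staticmethod
    def of(indices: Iterable[int]) -> "RangeSignature":
        # Summarize all accessed indices by their minimum and maximum.
        idx = list(indices)
        return RangeSignature(min(idx), max(idx))

    def overlaps(self, other: "RangeSignature") -> bool:
        # Two closed intervals conflict when neither lies entirely before the other.
        return self.lo <= other.hi and other.lo <= self.hi


def min_dependence_distance(
    earlier: List[RangeSignature], later: List[RangeSignature]
) -> Optional[int]:
    """Smallest task-number gap between conflicting tasks of two consecutive epochs.

    Assumption: tasks are numbered globally, so the earlier epoch holds tasks
    0..n-1 and the later epoch holds tasks n..n+m-1. Returns None when no pair
    of signatures overlaps.
    """
    n = len(earlier)
    best: Optional[int] = None
    for i, a in enumerate(earlier):
        for j, b in enumerate(later):
            if a.overlaps(b):
                distance = (n + j) - i
                if best is None or distance < best:
                    best = distance
    return best


if __name__ == "__main__":
    # Three tasks per epoch; the second epoch reads a window shifted by two indices.
    epoch1 = [RangeSignature(0, 3), RangeSignature(4, 7), RangeSignature(8, 11)]
    epoch2 = [RangeSignature(2, 5), RangeSignature(6, 9), RangeSignature(10, 13)]
    print(min_dependence_distance(epoch1, epoch2))  # prints 2
```

A range signature is conservative: a task that touches only indices 0 and 100 is summarized as [0, 100], so the check can report conflicts that never occur but cannot miss a real one. Under the assumption above, a small result means a later-epoch task must wait for almost all earlier-epoch tasks, which matches the observation that SPECCROSS then behaves much like a non-speculative barrier.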
The results of the profiling run were passed to speculative barrier execution, which used the minimum dependence distance value to avoid misspeculation. Figure 5.2 compares the speedups achieved by the parallelized loops using pthread barriers and SPECCROSS. It demonstrates the benefits of reducing the overhead in barrier synchronization. The best sequential execution time of the parallelized loops is considered
