13.07.2015 Views

automatically exploiting cross-invocation parallelism using runtime ...



LOCALWRITE + Barriers because of the additional parallelism enabled by SPECCROSS. However, LOCALWRITE + SPECCROSS does not perform as well as DOMORE + Barrier. The benefits from cross-invocation parallelism are negated by the overhead of redundant computation. This also explains why, with large thread counts, LOCALWRITE + SPECCROSS does not perform as well as the manual parallelization.

DOMORE is capable of reducing the redundant execution and improving performance scalability. However, the SPECCROSS transformation does not support partition-based parallelization techniques such as DOMORE. To make SPECCROSS and DOMORE work together to achieve better performance scalability, we modify the DOMORE code generation (section 3.4). Instead of having a separate scheduler thread, the scheduler code is duplicated on each worker thread. This optimization works for FLUIDANIMATE because the duplication of scheduler code does not cause any side effects. Figure 5.6 shows that the combination of SPECCROSS and DOMORE achieves the best performance among all.

Another interesting thing to notice is that, compared to DOMORE with pthread barriers, DOMORE with SPECCROSS does not yield a much better performance gain. For high-confidence speculation, a speculative distance is applied to avoid conflict-prone speculation. According to the profiling results, some of the loop invocations have a very small speculative range. In that case, speculative barriers basically serve as non-speculative barriers and the effect of SPECCROSS is limited.

5.5 Limitations of Current Parallelizing Compiler Infrastructure

In the previous sections, we’ve demonstrated the applicability and scalability of the DOMORE and SPECCROSS systems using ten programs. Besides these ten programs, DOMORE and SPECCROSS were evaluated on many other programs. Some of those programs can be directly parallelized by DOALL, DOANY or PS-DSWP [32, 33, 55], so they are not good candidates
