automatically exploiting cross-invocation parallelism using runtime ...


dataspace.princeton.edu
13.07.2015 Views

[47] A. Nicolau, G. Li, A. V. Veidenbaum, and A. Kejariwal. Synchronization optimizations for efficient execution on multi-cores. In Proceedings of the 23rd International conference on Supercomputing (ICS), 2009.

[48] NAS Parallel Benchmarks 3. http://www.nas.nasa.gov/Resources/Software/npb.html.

[49] C. E. Oancea and A. Mycroft. Software thread-level speculation: an optimistic library implementation. In Proceedings of the 1st International Workshop on Multicore Software Engineering (IWMSE), 2008.

[50] M. F. P. O’Boyle, L. Kervella, and F. Bodin. Synchronization minimization in a SPMD execution model. J. Parallel Distrib. Comput., volume 29, pages 196–210, September 1995.

[51] G. Ottoni, R. Rangan, A. Stoler, and D. I. August. Automatic thread extraction with decoupled software pipelining. In Proceedings of the 38th annual IEEE/ACM international symposium on Microarchitecture (MICRO), 2005.

[52] C. D. Polychronopoulos and D. J. Kuck. Guided self-scheduling: a practical scheduling scheme for parallel supercomputers. IEEE Transactions on Computers, volume C-36, December 1987.

[53] R. Ponnusamy, J. Saltz, and A. Choudhary. Runtime compilation techniques for data partitioning and communication schedule reuse. In Proceedings of the 1993 ACM/IEEE conference on Supercomputing (SC), 1993.

[54] L.-N. Pouchet. PolyBench: the Polyhedral Benchmark suite. http://www-roc.inria.fr/ pouchet/software/polybench/download.

[55] P. Prabhu, S. Ghosh, Y. Zhang, N. P. Johnson, and D. I. August. Commutative set: A language extension for implicit parallel programming. In Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation (PLDI), 2011.

[56] P. Prabhu, T. B. Jablin, A. Raman, Y. Zhang, J. Huang, H. Kim, N. P. Johnson, F. Liu, S. Ghosh, S. Beard, T. Oh, M. Zoufaly, D. Walker, and D. I. August. A survey of the practice of computational science. In State of the Practice Reports, SC ’11, pages 19:1–19:12, 2011.

[57] R. Rajwar and J. Goodman. Speculative lock elision: enabling highly concurrent multithreaded execution. In Proceedings of the 34th international symposium on Microarchitecture (MICRO), 2001.

[58] A. Raman, H. Kim, T. R. Mason, T. B. Jablin, and D. I. August. Speculative parallelization using software multi-threaded transactions. In Proceedings of the 15th international conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2010.

[59] A. Raman, H. Kim, T. Oh, J. W. Lee, and D. I. August. Parallelism orchestration using DoPE: the Degree of Parallelism Executive. In Proceedings of the 32nd ACM SIGPLAN conference on Programming Language Design and Implementation (PLDI), 2011.

[60] L. Rauchwerger, N. M. Amato, and D. A. Padua. A scalable method for run-time loop parallelization. International Journal of Parallel Programming (IJPP), volume 26, pages 537–576, 1995.

[61] L. Rauchwerger and D. Padua. The Privatizing DOALL test: A run-time technique for DOALL loop identification and array privatization. In Proceedings of the 8th International Conference on Supercomputing (ICS), 1994.

