automatically exploiting cross-invocation parallelism using runtime ...


[24] R. Gupta. The fuzzy barrier: a mechanism for high speed synchronization of processors. In Proceedings of the 3rd international conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 1989.

[25] L. Hammond, V. Wong, M. Chen, B. D. Carlstrom, J. D. Davis, B. Hertzberg, M. K. Prabhu, H. Wijaya, C. Kozyrakis, and K. Olukotun. Transactional memory coherence and consistency. In Proceedings of the 31st annual International Symposium on Computer Architecture (ISCA), 2004.

[26] H. Han and C.-W. Tseng. Improving compiler and run-time support for irregular reductions using local writes. In Proceedings of the 11th international workshop on Languages and Compilers for Parallel Computing (LCPC), 1999.

[27] M. Herlihy and J. E. B. Moss. Transactional memory: Architectural support for lock-free data structures. In Proceedings of the 20th annual International Symposium on Computer Architecture (ISCA), 1993.

[28] J. Huang, T. B. Jablin, S. R. Beard, N. P. Johnson, and D. I. August. Automatically exploiting cross-invocation parallelism using runtime information. In Proceedings of the 2013 International Symposium on Code Generation and Optimization, April 2013.

[29] J. Huang, A. Raman, Y. Zhang, T. B. Jablin, T.-H. Hung, and D. I. August. Decoupled Software Pipelining Creates Parallelization Opportunities. In Proceedings of the 8th international symposium on Code Generation and Optimization (CGO), 2010.

[30] K. Z. Ibrahim and G. T. Byrd. On the exploitation of value predication and producer identification to reduce barrier synchronization time. In Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS), 2001.
