Hybrid MPI and OpenMP programming tutorial - Prace Training Portal

Hybrid MPI and OpenMP programming tutorial - Prace Training Portal Hybrid MPI and OpenMP programming tutorial - Prace Training Portal

training.prace.ri.eu
from training.prace.ri.eu More from this publisher
10.07.2015 Views

— skipped —Thread support within Open MPI• In order to enable thread support in Open MPI, configure with:configure --enable-mpi-threads• This turns on:– Support for full MPI_THREAD_MULTIPLE– internal checks when run with threads (--enable-debug)configure --enable-mpi-threads --enable-progress-threads• This (additionally) turns on:– Progress threads to asynchronously transfer/receive data pernetwork BTL.• Additional Feature:– Compiling with debugging support, but without threads willcheck for recursive lockingOutline• Introduction / Motivation• Programming models on clusters of SMP nodes• Case Studies / pure MPI vs hybrid MPI+OpenMP• Practical “How-To” on hybrid programming• Mismatch Problems• Opportunities:Application categories that can benefit from hybrid parallelization• Thread-safety quality of MPI libraries• Tools for debugging and profiling MPI+OpenMP• Other options on clusters of SMP nodes• SummaryThis section is skipped, seetalks on tools on ThursdayHybrid Parallel ProgrammingSlide 125 / 154Rabenseifner, Hager, JostHybrid Parallel ProgrammingSlide 126 / 154Rabenseifner, Hager, JostCourtesy of Rainer Keller, HLRS and ORNL— skipped —Thread Correctness – Intel ThreadChecker 1/3— skipped —Thread Correctness – Intel ThreadChecker 2/3• Intel ThreadChecker operates in a similar fashion to helgrind,• Compile with –tcheck, then run program using tcheck_cl:• One may output to HTML:tcheck_cl --format HTML --report pthread_race.html pthread_raceApplication finished_______________________________________________________________________________|ID|Short De|Sever|C|Contex|Description|1st Acc|2nd Acc|| |scriptio|ity |o|t[Best| |ess[Bes|ess[Bes|| |n |Name |u|] | |t] |t] || | | |n| | | | || | | |t| | | | |_______________________________________________________________________________|1 |Write ->|Error|1|"pthre|Memory write of global_variable at|"pthrea|"pthrea|| |Write da| | |ad_rac|"pthread_race.c":31 conflicts with|d_race.|d_race.|| |ta-race | | |e.c":2|a prior memory write of |c":31 |c":31 || | | | |5 |global_variable at | | || | | | | |"pthread_race.c":31 (output | | || | | | | |dependence) | | |_______________________________________________________________________________• Caution: Intel Inspector XE 2011 is a GUI based tool not suitable forhybrid code execution (?)Hybrid Parallel ProgrammingSlide 127 / 154Rabenseifner, Hager, JostHybrid Parallel ProgrammingSlide 128 / 154Rabenseifner, Hager, JostCourtesy of Rainer Keller, HLRS and ORNLCourtesy of Rainer Keller, HLRS and ORNL

— skipped —Thread Correctness – Intel ThreadChecker 3/3— skipped —Performance Tools Support for Hybrid Code• If one wants to compile with threaded Open MPI (option for IB):configure --enable-mpi-threads--enable-debug--enable-mca-no-build=memory-ptmalloc2CC=icc F77=ifort FC=ifortCFLAGS=‘-debug all –inline-debug-info tcheck’CXXFLAGS=‘-debug all –inline-debug-info tcheck’FFLAGS=‘-debug all –tcheck’ LDFLAGS=‘tcheck’• Paraver examples have alreadybeen shown, tracing is done withlinking against (closed-source)omptrace or ompitrace• Then run with:mpirun --mca tcp,sm,self -np 2 tcheck_cl \--reinstrument -u full --format html \--cache_dir '/tmp/my_username_$$__tc_cl_cache' \--report 'tc_mpi_test_suite_$$' \--options 'file=tc_my_executable_%H_%I, \pad=128, delay=2, stall=2' -- \./my_executable my_arg1 my_arg2 …• For Vampir/Vampirtrace performance analysis:./configure –enable-omp–enable-hyb–with-mpi-dir=/opt/OpenMPI/1.3-iccCC=icc F77=ifort FC=ifort(Attention: does not wrap MPI_Init_thread!)Hybrid Parallel ProgrammingSlide 129 / 154Rabenseifner, Hager, JostHybrid Parallel ProgrammingSlide 130 / 154Rabenseifner, Hager, JostCourtesy of Rainer Keller, HLRS and ORNLCourtesy of Rainer Keller, HLRS and ORNL— skipped —Scalasca – Example “Wait at Barrier”— skipped —Scalasca – Example “Wait at Barrier”, SolutionIndication ofnon-optimal loadbalanceBetter load balancingwith dynamicloop scheduleHybrid Parallel ProgrammingSlide 131 / 154Rabenseifner, Hager, JostHybrid Parallel ProgrammingSlide 132 / 154Rabenseifner, Hager, JostScreenshots, courtesy of KOJAK JSC, FZ JülichScreenshots, courtesy of KOJAK JSC, FZ Jülich

— skipped —Thread Correctness – Intel ThreadChecker 3/3— skipped —Performance Tools Support for <strong>Hybrid</strong> Code• If one wants to compile with threaded Open <strong>MPI</strong> (option for IB):configure --enable-mpi-threads--enable-debug--enable-mca-no-build=memory-ptmalloc2CC=icc F77=ifort FC=ifortCFLAGS=‘-debug all –inline-debug-info tcheck’CXXFLAGS=‘-debug all –inline-debug-info tcheck’FFLAGS=‘-debug all –tcheck’ LDFLAGS=‘tcheck’• Paraver examples have alreadybeen shown, tracing is done withlinking against (closed-source)omptrace or ompitrace• Then run with:mpirun --mca tcp,sm,self -np 2 tcheck_cl \--reinstrument -u full --format html \--cache_dir '/tmp/my_username_$$__tc_cl_cache' \--report 'tc_mpi_test_suite_$$' \--options 'file=tc_my_executable_%H_%I, \pad=128, delay=2, stall=2' -- \./my_executable my_arg1 my_arg2 …• For Vampir/Vampirtrace performance analysis:./configure –enable-omp–enable-hyb–with-mpi-dir=/opt/Open<strong>MPI</strong>/1.3-iccCC=icc F77=ifort FC=ifort(Attention: does not wrap <strong>MPI</strong>_Init_thread!)<strong>Hybrid</strong> Parallel ProgrammingSlide 129 / 154Rabenseifner, Hager, Jost<strong>Hybrid</strong> Parallel ProgrammingSlide 130 / 154Rabenseifner, Hager, JostCourtesy of Rainer Keller, HLRS <strong>and</strong> ORNLCourtesy of Rainer Keller, HLRS <strong>and</strong> ORNL— skipped —Scalasca – Example “Wait at Barrier”— skipped —Scalasca – Example “Wait at Barrier”, SolutionIndication ofnon-optimal loadbalanceBetter load balancingwith dynamicloop schedule<strong>Hybrid</strong> Parallel ProgrammingSlide 131 / 154Rabenseifner, Hager, Jost<strong>Hybrid</strong> Parallel ProgrammingSlide 132 / 154Rabenseifner, Hager, JostScreenshots, courtesy of KOJAK JSC, FZ JülichScreenshots, courtesy of KOJAK JSC, FZ Jülich

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!