Hybrid MPI and OpenMP programming tutorial - Prace Training Portal

Pure MPI (one MPI process on each core)

Advantages
– No modifications to existing MPI codes
– The MPI library need not support multiple threads

Major problems
– Does the MPI library internally use different protocols?
  • Shared memory inside the SMP nodes
  • Network communication between the nodes
– Does the application topology fit the hardware topology?
– Unnecessary MPI communication inside the SMP nodes!
(Discussed in detail later, in the section "Mismatch Problems".)

Hybrid Masteronly (MPI only outside of parallel regions)

for (iteration ...) {
  #pragma omp parallel
  {
    numerical code
  } /* end omp parallel */

  /* on master thread only */
  MPI_Send(original data to halo areas in other SMP nodes)
  MPI_Recv(halo data from the neighbors)
} /* end for loop */

Advantages
– No message passing inside the SMP nodes
– No topology problem

Major problems
– All other threads are sleeping while the master thread communicates!
– Which inter-node bandwidth is actually achieved?
– The MPI library must support at least MPI_THREAD_FUNNELED
(See the section "Thread-safety quality of MPI libraries".)

Overlapping communication and computation (MPI communication by one or a few threads while the other threads are computing)

if (my_thread_rank < ...) {
  MPI_Send/Recv ...   /* i.e., communicate all halo data */
} else {
  /* Execute those parts of the application
     that do not need halo data
     (on the non-communicating threads) */
}
/* Execute those parts of the application
   that need halo data
   (on all threads) */

Pure OpenMP on the cluster (OpenMP only, on distributed virtual shared memory)
– A distributed virtual shared memory system is needed
– It must support clusters of SMP nodes, e.g., Intel® Cluster OpenMP
– Shared-memory parallelism inside the SMP nodes
– Communication of the modified parts of pages at OpenMP flush
  (part of each OpenMP barrier)
– i.e., the OpenMP memory and parallelization model is prepared for clusters!
(Experience: see the "Mismatch" section.)

Hybrid Parallel Programming, Slides 9–12/154, Rabenseifner, Hager, Jost
