12.07.2015 Views

Ab initio molecular dynamics: Theory and Implementation

ing standard communication libraries and making no assumption on the topology of the processor network and memory access system.

Minimizing the communication was the major goal in the implementation of the parallel plane wave code in CPMD. Therefore, the algorithms had to be adapted to the distributed data model chosen. The most important decisions concern the data distribution of the largest arrays in the calculation. These arrays are the ones holding information on the wavefunctions. Three distribution strategies can be envisaged and were used before [90,137,687,688,117].

First, the data are distributed over the bands [687]. Each processor holds all expansion coefficients of an electronic band locally. Several problems arise with this choice. The number of bands is usually of the same magnitude as the number of processors. This leads to a severe load-balancing problem that can only be avoided for certain magic numbers, namely if the number of bands is a multiple of the number of CPUs. Furthermore, this approach requires performing three-dimensional Fourier transforms locally. The memory requirements for the Fourier transform increase only linearly with system size, but their prefactor is very large, and a distribution of these arrays is desirable. In addition, all parts of the program that do not contain loops over the number of bands have to be parallelized using another scheme, leading to additional communication and synchronization overhead.

Second, the data are distributed over the Fourier space components, and the real space grid is also distributed [90,137,117]. This scheme allows for a straightforward parallelization of all parts of the program that involve loops over the Fourier components or the real space grid. Only a few routines are not covered by this scheme.
The disadvantage is that all three-dimensional Fourier transforms require communication.

Third, it is possible to use a combination of the above two schemes [688]. This leads to the most complicated scheme, as only a careful arrangement of algorithms avoids the disadvantages of the other schemes while still keeping their advantages.

Additionally, it is possible to distribute the loop over k-points. As most calculations use only a limited number of k-points or even only the Γ-point, this method is of limited use. However, combining the distribution of the k-points with one of the other methods mentioned above might result in a very efficient approach.

The CPMD program is parallelized using the distribution in Fourier and real space. The data distribution is held fixed during a calculation, i.e. static load balancing is used. In all parts of the program where the distribution of the plane waves does not apply, an additional parallelization over the number of atoms or bands is used. However, the data structures involved are replicated on all processors.

A special situation exists for the case of path integral calculations (see Sect. 4.4), where an inherent parallelization over the Trotter slices is present. The problem is "embarrassingly parallel" in this variable and perfect parallelism can be observed on all types of computers, even on clusters of workstations or supercomputers ("meta-computing"). In practice the parallelization over the Trotter slices will be combined with one of the schemes mentioned above, allowing for good results even on massively parallel machines with several hundred processors.
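The load-balancing argument against the band distribution can be made concrete with a small calculation. The following Python sketch is an illustration only and is not taken from CPMD: the function names, the block-wise static distribution, and the sizes (130 bands, 100,000 Fourier components) are assumptions chosen for the example. It estimates the ideal parallel efficiency when the most heavily loaded processor sets the pace, ignoring communication entirely.

```python
def block_sizes(n_items, n_procs):
    """Split n_items as evenly as possible over n_procs (static distribution)."""
    base, extra = divmod(n_items, n_procs)
    # The first `extra` ranks receive one additional item.
    return [base + 1 if rank < extra else base for rank in range(n_procs)]


def parallel_efficiency(n_items, n_procs):
    """Ideal efficiency if the most heavily loaded processor sets the pace."""
    return n_items / (n_procs * max(block_sizes(n_items, n_procs)))


# Hypothetical sizes, chosen only for illustration.
n_bands = 130
n_fourier_components = 100_000

for n_procs in (64, 65, 128):
    eff_bands = parallel_efficiency(n_bands, n_procs)
    eff_fourier = parallel_efficiency(n_fourier_components, n_procs)
    print(f"{n_procs:4d} processors: bands {eff_bands:.3f}, "
          f"Fourier components {eff_fourier:.3f}")
```

With these assumed sizes, the band distribution is perfectly balanced only when the number of bands is a multiple of the processor count (here 65), whereas the far larger number of Fourier components stays close to ideal balance for every processor count. The sketch treats each coefficient as an independent unit of work, which is a simplification of how a real plane wave code groups its data.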
