14.09.2014 Views

CASINO manual - Theory of Condensed Matter

39.2 Implementation basics and performance

The general strategy of this implementation is to use OpenMP parallelism for the loops whose trip counts scale with the number of electrons or atoms. In the QMC algorithm the basic logical units that need to be parallelized are routines like

• orbital evaluation routines,
• Jastrow factor evaluation routines,
• inverse Slater matrix updating routine,
• potential energy evaluation routine,
• electron–electron and electron–nucleus distance evaluation routines,
• etc.

Extensive performance tests were done on pseudopotential systems that use blip3d and blip3dgamma. The best performance obtained on an AMD quad-core CPU was for a system of 1024 electrons. The speedup factor was close to 1.5 for 2 OpenMP threads and close to 2 for 4 OpenMP threads. For larger systems, update_dbar was an OpenMP bottleneck.
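The speedup figures quoted above can be restated as parallel efficiencies (speedup divided by thread count), which makes the diminishing return from extra threads explicit. The following is a minimal sketch of that arithmetic only; it is not part of CASINO, the function name `parallel_efficiency` is hypothetical, and the inputs are the approximate speedups quoted in the text.

```python
def parallel_efficiency(speedup, threads):
    """Return parallel efficiency: measured speedup divided by thread count."""
    if threads < 1:
        raise ValueError("threads must be a positive integer")
    return speedup / threads


# Approximate speedups quoted in the text for the 1024-electron system
# (thread count -> speedup relative to a single thread).
quoted_speedups = {2: 1.5, 4: 2.0}

for threads, speedup in quoted_speedups.items():
    efficiency = parallel_efficiency(speedup, threads)
    print(f"{threads} threads: speedup ~{speedup}, efficiency ~{efficiency:.0%}")
```

With these inputs the efficiency drops from roughly 75% on two threads to roughly 50% on four.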

39.3 Using OpenMP

To use the experimental OpenMP feature, compile the code with make Openmp on a supported architecture. Then use the option --tpp=<threads-per-process> on the runqmc command line to specify the number of OpenMP threads per process.

For example, to run two processes with two threads each you would type runqmc --nproc=2 --tpp=2, ideally on a 4-core machine. By default on batch-queueing systems the number of cores reserved for the job will be nproc * tpp.
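The core count described above is simply the product of the two runqmc options. A minimal sketch of that arithmetic follows; `cores_reserved` is a hypothetical helper written for illustration, not a CASINO function, and it assumes nothing about how runqmc performs the calculation internally.

```python
def cores_reserved(nproc, tpp):
    """Cores a batch job would reserve by default: processes times threads per process."""
    if nproc < 1 or tpp < 1:
        raise ValueError("nproc and tpp must be positive integers")
    return nproc * tpp


# The example from the text: runqmc --nproc=2 --tpp=2
print(cores_reserved(nproc=2, tpp=2))
```

For the example in the text this gives 4, matching the suggested 4-core machine.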

Note finally that when analysing timing data from OpenMP runs you need to look at the ‘Real Time’ data, rather than the ‘CPU time’ data, since the CPU time is summed over all OpenMP threads.

A Appendix 1: Programming guide for CASINO

Because you signed our legal agreement (you did, didn’t you?) you are not allowed to make modifications to CASINO without explicit written permission from the Cambridge group. This is generally very easy to obtain, though we do not guarantee to incorporate your changes into the public distribution if we don’t like them, or to keep them there afterwards if we think of a better way to do it (but then, who does?). If you have obtained this permission, then please read the following.

The main CASINO source code and many of the utilities are written in Fortran, which must conform to the Fortran 95 standard (later standards are not necessarily supported by all compilers). There are also a couple of simple C routines in the main code (for shared memory etc.). Utilities not in Fortran should generally be written as bash shell scripts (or possibly tcsh or csh, though this is deprecated). We did once accept a C++ pseudopotential conversion utility, though this adds to problems with portability. The ADF converter in utils/wfn_converters/adf is written in Python, though we provide no official support for this or any other programming language. Please understand that CASINO is meant to compile, set up and run out of the box on any computer in the world, and the fewer languages we depend on, the easier this is to achieve.

A.1 Style<br />

CASINO has a Fortran 95 ‘style’ which should be adhered to when writing code, both for the main source and for the Fortran utilities. This is because it is desirable that the package has a homogeneous look and feel (and because searching for text strings then works consistently). Everybody has their own style. Yours is different and may even be better, but we’ve decided on one for CASINO and there it is. If you don’t write your code like this, the likelihood is that MDT or someone else will reformat it for you, and they will probably accidentally delete a crucial minus sign while correcting your routine,
