IPP Annual Report 2007 - Max-Planck-Institut für Plasmaphysik
Introduction

The Rechenzentrum Garching (RZG) traditionally provides supercomputing power and archival services for the IPP and other Max Planck Institutes throughout Germany. Besides operating the systems, the RZG gives application support to Max Planck Institutes with high-end computing needs in fusion research, materials science, astrophysics, and other fields. Large amounts of experimental data from the fusion devices of the IPP, satellite data of the MPI for Extraterrestrial Physics (MPE) at the Garching site, and data from supercomputer simulations are administered and stored with long retention periods. In addition, the RZG provides network and standard services for the IPP and for some of the other MPIs at the Garching site. The experimental data acquisition software development group XDV, which serves both the W7-X fusion experiment and the current ASDEX Upgrade fusion experiment, operates as part of the RZG.

Furthermore, the RZG is engaged in several large projects in collaboration with other, partly international scientific institutions. One of these projects is a bioinformatics project dealing with genome research; another is the ATLAS project, which is part of the LHC experiment at CERN. Finally, the RZG is a member of DEISA, a consortium of the leading European supercomputing centers supporting the advancement of the computational sciences in Europe. In this project the RZG holds the task leaderships for global file systems, for the operation of the distributed infrastructure, for applications enabling, and for joint research activities in plasma physics and in materials science. All these projects are based on new software technologies, among them so-called Grid middleware tools. Since the importance of Grid technology for international collaborations has increased significantly in recent years, broad competence has to be established in this field as well.
Computer Center Garching
Head: Dipl.-Inf. Stefan Heinzel

A major task has been the optimization of complex applications from plasma physics, materials science and other disciplines. The data acquisition system of W7-X has been implemented on a smaller existing device (WEGA) and has reached its test phase. For the FP6 EU project DEISA, codes from plasma physics (GENE and ORB5) have been enabled for hyperscaling to make efficient use of up to 32,000 processors.

Major Hardware Changes

The supercomputer landscape, consisting of the IBM pSeries 690 based supercomputer and the IBM p575 based cluster of 8-way nodes, was augmented in September 2007 with an IBM BlueGene/P system with 8,192 PowerPC cores running at 850 MHz, which is especially suited for applications scaling to 1,024 cores and beyond. Furthermore, a series of Linux clusters with Intel Xeon and AMD Opteron processors is operated; it has been further extended with Intel Xeon quad-core and AMD dual-core Opteron based Linux clusters. Besides the generally available systems, dedicated compute servers are operated and maintained for the IPP, the Fritz-Haber-Institute, the MPI for Astrophysics, the MPI for Polymer Research, the MPI for Quantum Optics, the MPI for Extraterrestrial Physics, the MPI for Biochemistry, the MPI for Chemical Physics of Solids, the MPI for Physics and the MPI for Astronomy. In the mass-storage area, the capacity of the new automated tape library Sun SL8500 has been extended to 6 PB of compressed data. Both LTO3 and LTO2 tape drives are supported.

Developments for High-End Computing
The application group of the RZG provides support in the field of high-performance computing. This includes supervising the start-up of new parallel codes, giving advice on software and performance problems, and providing development software for the different platforms. One of the major tasks, however, is the optimization of complex codes from plasma physics, materials science and other disciplines for the respective, generally parallel, high-performance target architecture. This requires a deep algorithmic understanding and is usually done in close collaboration with the code authors from the respective disciplines. In what follows, selected optimization projects are presented in more detail.
GEM Code

The GEM (Gyrofluid ElectroMagnetic) code from the IPP plasma theory division solves nonlinear gyrofluid equations for electrons and one or more ion species in tokamak geometry. It is restricted to a local approach in geometry, a so-called flux-tube approach. With its parallelization concept of a one-dimensional domain decomposition along the magnetic field, the maximum number of processors that could be used was 16. The new code version GEMR treats the full geometry in the radial x-direction. Hence, more realistic simulations of turbulence in experiments like JET and ITER are now in progress. However, the necessary grid resolution of at least 1024×512×16 is far too large for a run on just 16 processors. Correspondingly, the scaling properties of the GEMR code had to be improved towards many hundreds of processors.
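The limit of the original scheme and the gain from the radial split can be illustrated with a short sketch. The grid size is the one quoted above, but the block-decomposition helper and the example process counts are hypothetical, not taken from the GEM/GEMR sources:

```python
# Illustrative sketch only: the 1024x512x16 grid is from the report,
# but the decomposition helper and process grid are hypothetical.

NX, NY, NZ = 1024, 512, 16  # minimum grid resolution quoted for GEMR runs

def block(n_global, n_procs, rank):
    """First index and size of the contiguous 1-D block owned by `rank`."""
    base, rem = divmod(n_global, n_procs)
    start = rank * base + min(rank, rem)
    size = base + (1 if rank < rem else 0)
    return start, size

# 1-D decomposition along the field direction: at most NZ = 16 processes.
max_procs_1d = NZ

# 2-D decomposition (radial x plus field direction): the x-split
# multiplies the available parallelism, e.g. 32 x-ranks * 16 z-ranks.
px, pz = 32, 16
max_procs_2d = px * pz  # 512 processes

x_start, x_size = block(NX, px, rank=5)  # x-range owned by x-rank 5
z_start, z_size = block(NZ, pz, rank=3)  # field-range owned by z-rank 3
```

Each rank then works on an x_size × NY × z_size sub-domain instead of a full-x slab, which is what makes runs on many hundreds of processors possible.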
After the single-processor performance had been increased by 50 %, the parallelization concept was extended to a two-dimensional domain decomposition by additionally parallelizing along the x-coordinate. For this purpose the index structure of the most important arrays had to be adapted to avoid unnecessary copying in connection with communication. The parallelization of the matrix solver in the x-direction was a non-trivial task, which was finally solved with an elaborate parallel transpose of the data and corresponding matrices.
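The idea behind such a parallel transpose can be sketched as follows. This is a serial simulation of the all-to-all exchange, hypothetical and not the GEMR implementation: each rank cuts its locally held rows into one column chunk per destination rank, the chunks are exchanged all-to-all, and afterwards each rank holds complete columns of its own column range, so a solver can run directly along the formerly distributed direction:

```python
# Serial sketch of a parallel transpose (hypothetical; it illustrates
# the technique named in the text, not the actual GEMR routines).

def split(seq, n):
    """Cut `seq` into n contiguous, nearly equal chunks."""
    base, rem = divmod(len(seq), n)
    out, pos = [], 0
    for r in range(n):
        size = base + (1 if r < rem else 0)
        out.append(seq[pos:pos + size])
        pos += size
    return out

def parallel_transpose(local_rows, n_ranks):
    """local_rows[p] = rows of the global matrix held by rank p.
    Returns result[q] = all global rows restricted to rank q's column
    range, i.e. after the exchange each rank owns full columns."""
    # sendbuf[p][q]: the column chunk of rank p's rows destined for rank q
    sendbuf = [[[split(row, n_ranks)[q] for row in rows]
                for q in range(n_ranks)] for rows in local_rows]
    # the all-to-all exchange: recvbuf[q][p] = sendbuf[p][q]
    recvbuf = [[sendbuf[p][q] for p in range(n_ranks)]
               for q in range(n_ranks)]
    # each rank concatenates the received row pieces in global row order
    return [[piece for part in parts for piece in part] for parts in recvbuf]
```

On a 4×4 matrix split over two ranks (two rows each), rank 0 ends up with columns 0-1 of all four rows and rank 1 with columns 2-3, after which column-wise operations are purely local. In the real code the middle step is a collective message exchange (e.g. an MPI all-to-all) rather than a list comprehension.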
As a result, the scalability could be increased by the envisaged factor of 32; a parallel efficiency of 89 % was observed when scaling from 64 to 512 processors. Hence, both the distributed