Jim Tomkins
Jim Tomkins
Jim Tomkins
Create successful ePaper yourself
Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.
The Red Storm Story<br />
Red Storm Update<br />
<strong>Jim</strong> <strong>Tomkins</strong><br />
<strong>Jim</strong> <strong>Tomkins</strong><br />
SOS 11<br />
Key West, Florida<br />
June 12 - 14, 2007<br />
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company,<br />
for the United States Department of Energy’s National Nuclear Security Administration<br />
under contract DE-AC04-94AL85000.<br />
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company,<br />
for the United States Department of Energy’s National Nuclear Security Administration<br />
under contract DE-AC04-94AL85000.
Upgrade Changes<br />
• SeaStar 2.1 NIC/Router Chips<br />
– 2 x Injection Bandwidth<br />
• 5th Row (25% more compute and service nodes)<br />
• Dual Core Opterons<br />
– 2.5 x More Compute Processors<br />
– 20% Higher Clock Rate<br />
• New OS’s (Linux 2.6, CVN)<br />
– Virtual node model –– 1PE or 2PE / node<br />
• New File Systems<br />
– Lustre 1.4 on Linux 2.6<br />
• Complete: mid-November, 2006
Normally<br />
Classified<br />
I/O and<br />
Service Nodes<br />
Upgraded Red Storm Layout<br />
(27 x 20 x 24 Compute Node Mesh)<br />
Switchable Nodes<br />
Disconnect Cabinets<br />
Disk storage system<br />
not shown<br />
Normally<br />
Unclassified<br />
I/O and<br />
Service Nodes
Upgraded Red Storm
Theoretical Peak Performance<br />
(Compute Nodes Only)<br />
Red Storm Comparison<br />
Red Storm<br />
(initial operations)<br />
41.47 TF 124.42 TF<br />
HPL Performance 36.19 TF 101.400 TF<br />
Red Storm<br />
(post-upgrade)<br />
Compute Nodes / Processors 10368 / 10368 12960 / 25920<br />
Service and I/O Nodes /<br />
Processors<br />
256 + 256<br />
256 + 256<br />
320 + 320<br />
640 + 640<br />
Processor 2.0 GHz Opteron 2.4 GHz Dual Core<br />
Opteron<br />
Total Memory (TB) 33.38 39.19<br />
System Memory B/W (TB/s) 55.2 82.9<br />
Topology 27 x 16 x 24 27 x 20 x 24<br />
Bi-Section B/W (TB/s) 3.7, 6.2, 8.3 4.6, 6.2, 10.4<br />
Power / Cooling (MW) 1.7 2.5<br />
System Size ~3100 ft 2 ~3800 ft 2
Some Performance Data
#2 on Top 500 list at 101.4 TF
Single Core Versus Dual Core
Single Core Versus Dual Core
Single Core Versus Dual Core<br />
1.20<br />
1.00<br />
0.80<br />
0.60<br />
0.40<br />
0.20<br />
0.00<br />
CTH (2000)<br />
SAGE (2048)<br />
UMT2K (3200)<br />
ITS (3200)<br />
Red Storm (SN vs. VN)<br />
SN = 1PE/socket, VN = 2PE/socket<br />
PRTSN (4096)<br />
SAGE (4500)<br />
UMT2K (4500)<br />
ITS (4500)<br />
sPPM (4500)<br />
Application (PEs)<br />
CTH (6480)<br />
PRTSN (6480)<br />
sPPM (6561)<br />
PRTSN (8930)<br />
SN (post-upgrade)<br />
VN (post-upgrade)<br />
CTH (9000)<br />
sPPM (9000)
John Daly’s Production Work<br />
• Recent results have shown that for John’s application we<br />
are getting a large fraction of the second core.<br />
– John’s application code runs on 3000 dual core nodes in<br />
the same time as 5000 single core nodes. This means<br />
that he is getting about 70% efficiency out of using the<br />
second core.
Red Storm Upgraded to ~500TF<br />
• Upgrade Existing System to Quad Core XT4 Specifications<br />
– Replace all Compute Boards<br />
– 2.4 GHz AMD Opteron Quad Core Processors<br />
– New DDR2 Memory (>75 TB)<br />
– Replace Power Supplies<br />
– Replace Cabinet Cooling Fans<br />
– Reuse Seastar Mezzanines, L0s, Cabinets, Cables, PDUs, etc.<br />
• A Few System Parameters<br />
– ~500 TF<br />
– >75 TB of Memory<br />
– >1 PB of Disk Storage<br />
• Timeframe - Early 2008
Red Storm Impact<br />
• Cray and Sandia with the support of NNSA/ASC<br />
have created one of the most successful new<br />
supercomputers ever.<br />
• There are now 17 sites with Red Storm / XT3 / XT4<br />
computer systems.
ARMY/HPC<br />
AWE (England)<br />
CSC (Finland)<br />
CSCS<br />
(Switzerland)<br />
DOD/ERDC<br />
EPSRC (UK)<br />
JAIST (Japan)<br />
NERSC<br />
ORNL<br />
PSC<br />
SNL<br />
Red Storm / XT3 / XT4 Sites<br />
SS-634 (non-US)<br />
SS-635<br />
SS-643<br />
SS-661<br />
U of Tokyo (JST)<br />
U of Western Australia<br />
Worldwide<br />
Count of Sites<br />
(Cumulative)<br />
20<br />
15<br />
10<br />
Number of XT3/XT4 Sites Worldwide<br />
5<br />
0<br />
1H05 1 2H05 2 1H06 3 2H06 4 1H07 5 2H07 6<br />
TFLOPS<br />
800<br />
600<br />
400<br />
200<br />
0<br />
Half Years<br />
TFLOPS available Worldwide<br />
1 2 3 4 5 6<br />
1H05 2H05 1H06 2H06 1H07 2H07<br />
Half Years