
HSL and the solution of sparse linear systems

Jennifer A. Scott
Computational Science and Engineering Department,
Rutherford Appleton Laboratory.
J.A.Scott@rl.ac.uk

Group homepage: www.cse.clrc.ac.uk/nag/


Overview

• Who we are
• HSL and software design
• Sparse linear solvers: a brief introduction

EPCC January 2005


The Numerical Analysis Group at RAL

• We belong to the Computational Science and Engineering (CSE) Department of CCLRC.
• CSE aims to provide world-class expertise and support for UK theoretical and computational science communities, in both academia and industry.
• We are based at the Rutherford Appleton Laboratory in Oxfordshire (about 15 miles south of Oxford).
• RAL employs around 1200 people (mainly scientists and engineers) plus a large number of visitors.
• We are a small Group (4 permanent members plus a consultant) ... there has only been one staff change in 15 years!
• Currently, much of the core funding for the Group is provided by an EPSRC grant (GR/S42170).


Why use a Numerical Library?

• Developing reliable, robust, accurate and efficient software for areas covered by numerical libraries requires considerable experience and takes years of effort.
• Thus it is cost effective to use a Library written and developed by experts:
  – Reduces programming time and effort
  – Increases productivity
  – Allows confidence in the results


• There is much free mathematical software available on the web (a useful site is gams.nist.gov).
• Some free software is excellent and is fully documented, tested, and maintained (e.g. the LAPACK linear algebra library for dense matrices is available in the public domain).
• BUT beware of the unknown - on the web there is no overall standard and no quality control.
• Free software often offers no guarantee of maintenance, user support, or continuity.
• The alternative is a commercial library, e.g. NAG, IMSL, HSL.
• Also available commercially are high-performance technical computing environments (MATLAB, Mathematica ...).


HSL

• Began as Harwell Subroutine Library in 1963.
• Portable, fully documented and tested Fortran packages.
• Primarily written and developed by the RAL Numerical Analysis Group.
• Each package performs a basic numerical task (e.g. solve linear system, find eigenvalues) and has been designed to be incorporated into programs.
• Particular strengths in:
  – sparse matrix computations
  – optimization
  – large-scale system solution

HSL has an international reputation for reliability and efficiency.


Benefits and advantages of HSL

For academics:
• Freely available to ALL UK academics
• Teaching aid (mainly MSc and PhD level)
• More time for concentrating on own area of research (avoid “reinventing the wheel”!)
• Can be used with confidence (“black box”)


Benefits and advantages of HSL (cont.)

For commercial organisations:
• Shorten application development cycle, cutting time-to-market and gaining competitive advantage
• Reduce overall development costs
• More time to focus on specialist aspects of applications
• Improve application accuracy and robustness
• Fully supported and maintained software

HSL routines have been incorporated into a large number of commercial products.


Current version

• A new version of HSL is released every 2-3 years (with UK academics given access to new routines as soon as testing is completed).
• Latest version: HSL 2004 ... released September 2004
• HSL is currently marketed by Aspen Technology.


How to get HSL

• HSL packages are available without charge, for academic purposes, to any user whose email address ends in .ac.uk .
• Access to HSL is via our website by means of a short-lived individual password-controlled account.
• Potential users are asked for brief details, including the use they intend to make of HSL.
• Please provide this data as it helps us (and our funding body) to evaluate the relevance of our software to UK academia.
• Users must accept a conditions-of-use form, and are not permitted to distribute any HSL codes they download to a third party.

Further details of HSL: www.cse.clrc.ac.uk/nag/hsl


Design of the HSL Library

• HSL is split into HSL 2004 and HSL Archive.
• HSL Archive consists of older packages that have been superseded either by improved HSL packages or by public domain libraries such as LAPACK.
• HSL Archive is free to all for non-commercial use but its use is not supported.
• All HSL usage (main library and Archive) requires a valid licence.

HSL provides users with source code.


• HSL packages are classified into chapters. These were decided on in the early days (certainly by Release 1 of the Catalogue).
• The chapters led to the HSL naming convention, e.g. AD02 for automatic differentiation belongs to the ‘A’ chapter on computer algebra and MA48 is part of the ‘MA’ chapter of matrix linear algebra packages.
• The prefix HSL_ is used to indicate that the package is written in Fortran 90 or 95 (some packages have Fortran 77 and Fortran 90 versions).
• The HSL catalogue provides a complete list of the packages in HSL 2004 and for each gives a brief outline of purpose, method, origin, language and other attributes.
• An extensive index assists potential users in choosing packages appropriately.


Software design aims within HSL

We aim to design our software so that it is
• Portable
• Efficient
• Reliable
• Straightforward to use
• General purpose
• Flexible
• Threadsafe


How we achieve these objectives

Portability:
• Software written in standard Fortran (older codes are Fortran 77; more recent codes are Fortran 90 and 95).
• Parallel codes use MPI for message passing.
• Small number of machine-dependent routines (e.g. FD05 returns real-valued machine constants).

Efficiency:
• Extensive experience of Fortran programming (in particular, sparse matrix coding)
• Use of (e.g.) BLAS and LAPACK, with options for tuning for different platforms
• Performance compared with other state-of-the-art packages


Reliability:
• Extensive testing using a comprehensive test deck
• Also testing on real applications of different sizes
• Tests performed on a range of computer platforms with a range of Fortran compilers.


Ease of use:
• Software is fully documented, with each package having its own specification sheets.
• These include a simple example illustrating the use of the code (may be used as a template).
• Parameters that must be set by the user are kept to a minimum.
• User interface simplified through use of Fortran 90 (dynamic memory allocation ... also allows easy restart).
• The main codes provide checks on the user’s data. In case of an error, a flag is set and, optionally, a message written. This assists the user with debugging their calling program and data.


General purpose:
• Packages are not designed for a particular problem arising from a single application area. This means that our software may not always be the best for a given problem but will perform well on a range of problems.

Flexibility:
• Many packages offer the more experienced user a range of options. These can include options on how to input the problem data and whether it is to be checked for errors, blocksizes for use with BLAS, and the stability threshold parameter (linear solvers).


Threadsafe:
• The first release to be threadsafe was HSL 2002.
• All use of (e.g.) COMMON and SAVE has been removed.
• Allows HSL packages to be used in multi-threaded applications.
• Note: the Archive is not threadsafe (and there are no plans to make it so, as the Archive is not actively developed).


Sparse systems

Problem: we wish to solve
    Ax = b
where A is LARGE and sparse.

Informal definition: A is sparse if
• many entries are zero
• it is worthwhile to exploit these zeros.
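
To make "exploiting the zeros" concrete, here is a minimal sketch of one common sparse storage scheme, compressed sparse row (CSR), together with a matrix-vector product that touches only the stored entries. The function names and the 4 × 4 test matrix are illustrative assumptions; HSL packages define their own input formats in their specification sheets, and those may differ from CSR.

```python
def dense_to_csr(dense):
    """Convert a dense matrix (list of rows) to CSR arrays.

    Returns (values, col_index, row_ptr): the nonzeros of row i are
    values[row_ptr[i]:row_ptr[i + 1]], in columns col_index[...].
    """
    values, col_index, row_ptr = [], [], [0]
    for row in dense:
        for j, a_ij in enumerate(row):
            if a_ij != 0.0:
                values.append(a_ij)
                col_index.append(j)
        row_ptr.append(len(values))
    return values, col_index, row_ptr


def csr_matvec(values, col_index, row_ptr, x):
    """Compute y = A x using only the stored nonzeros of A."""
    y = []
    for i in range(len(row_ptr) - 1):
        s = 0.0
        for p in range(row_ptr[i], row_ptr[i + 1]):
            s += values[p] * x[col_index[p]]
        y.append(s)
    return y


# Hypothetical 4 x 4 matrix with 6 nonzeros out of 16 entries.
A = [
    [4.0, 0.0, 0.0, 1.0],
    [0.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0],
    [1.0, 0.0, 0.0, 5.0],
]
values, col_index, row_ptr = dense_to_csr(A)
y = csr_matvec(values, col_index, row_ptr, [1.0, 1.0, 1.0, 1.0])
```

The work in `csr_matvec` is proportional to the number of stored entries rather than to the square of the matrix order.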


• The idea of what is LARGE has changed significantly over the last 30-40 years.
• Problems of order $> 10^6$ are common.
• The largest problems require iterative solvers (e.g. CG, GMRES, MINRES, ...).
• Our interest lies mainly in direct solvers.
• Direct methods involve explicit factorization, e.g. $A = LU$ ($L$, $U$ lower and upper triangular matrices).
• Recently, combining direct and iterative solvers has become an active area of research, e.g. direct solvers used to obtain preconditioners for iterative solvers.


Many application areas in science, engineering, and finance give rise to sparse systems:
• chemical engineering
• economic modelling
• fluid flow
• oceanography
• linear programming
• structural engineering ...

But all have different patterns and characteristics.


[Figure: sparsity pattern, circuit simulation (matrix circuit3); axes 0 to 12000, nz = 48137]


[Figure: sparsity pattern, reservoir modelling (matrix pores3); axes 0 to 500, nz = 3474]


[Figure: sparsity pattern, economic modelling; axes 0 to 1200, nz = 7682]


[Figure: sparsity pattern, structural engineering; axes 0 to 10000, nz = 428650]


[Figure: sparsity pattern, acoustics; axes 0 to 12000, nz = 342828]


[Figure: sparsity pattern, chemical engineering; axes 0 to 2000, nz = 14677]


[Figure: sparsity pattern, linear programming; axes 0 to 800, nz = 4841]


Solving sparse systems

Let A be n × n with nz nonzeros.

Gaussian elimination for a dense problem requires $O(n^2)$ storage and $O(n^3)$ flops.

Hence infeasible for large n.

A sparse algorithm aims to solve the equations in $O(n) + O(nz)$ time and space.
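
As a rough illustration of the gap between these estimates, the sketch below compares entry counts for dense storage with a CSR-like sparse scheme. The values n = 2021 and nz = 7353 are those quoted for WEST2021 in this talk; the storage model (nz values, nz column indices, n + 1 row pointers) and the 2n³/3 flop count for dense LU are assumptions chosen for illustration, not a description of any HSL package.

```python
def dense_storage(n):
    """Number of stored entries for a dense n x n matrix."""
    return n * n


def sparse_storage(n, nz):
    """Stored numbers for a CSR-like scheme: values, column indices, row pointers."""
    return 2 * nz + (n + 1)


def dense_lu_flops(n):
    """Leading-order flop count of dense LU factorization."""
    return 2 * n**3 / 3


n, nz = 2021, 7353                 # WEST2021, as quoted in this talk
dense = dense_storage(n)           # grows like n^2
sparse = sparse_storage(n, nz)     # grows like n + nz
density = nz / dense               # fraction of entries that are nonzero
flops = dense_lu_flops(n)

print(f"dense entries : {dense}")
print(f"sparse entries: {sparse}")
print(f"density       : {density:.4%}")
print(f"dense LU flops: {flops:.2e}")
```

Fill-in during factorization means a real sparse solver stores more than the nonzeros of A alone, so these figures are a lower bound on the sparse side.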


Why is it hard?

• We have to worry about the zero entries
• Need to use sparse data structures
• If we just go ahead and apply Gaussian elimination to sparse A, the zeros will, in general, rapidly fill in.
• We have to order carefully, e.g.

$$
\text{(a)}\;
\begin{pmatrix}
\times & \times & \times & \times & \times \\
\times & \times &        &        &        \\
\times &        & \times &        &        \\
\times &        &        & \times &        \\
\times &        &        &        & \times
\end{pmatrix}
\qquad
\text{(b)}\;
\begin{pmatrix}
\times &        &        &        & \times \\
       & \times &        &        & \times \\
       &        & \times &        & \times \\
       &        &        & \times & \times \\
\times & \times & \times & \times & \times
\end{pmatrix}
$$

(a) fills in totally; (b) has no fill-in.
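
The effect of the ordering can be checked with a small symbolic elimination that tracks only the sparsity pattern. This is a sketch under simplifying assumptions: pivots are taken down the diagonal in the given order, numerical cancellation is ignored, and the helper names `arrow` and `count_fill_in` are introduced here for illustration.

```python
def arrow(n, dense_first):
    """Pattern of an n x n arrow matrix as a list of column-index sets per row.

    dense_first=True puts the full row and column first (case (a));
    dense_first=False puts them last (case (b)).
    """
    t = 0 if dense_first else n - 1
    return [set(range(n)) if i == t else {i, t} for i in range(n)]


def count_fill_in(pattern):
    """Count entries created by Gaussian elimination with diagonal pivots."""
    rows = [set(r) for r in pattern]
    n = len(rows)
    fill = 0
    for k in range(n):
        # Columns to the right of the pivot in the pivot row.
        pivot_cols = {j for j in rows[k] if j > k}
        for i in range(k + 1, n):
            if k in rows[i]:                 # row i is updated by pivot row k
                new = pivot_cols - rows[i]   # zeros that become nonzero
                fill += len(new)
                rows[i] |= new
    return fill


fill_a = count_fill_in(arrow(5, dense_first=True))
fill_b = count_fill_in(arrow(5, dense_first=False))
print(fill_a, fill_b)  # 12 0
```

For the 5 × 5 case, ordering (a) creates 12 new entries (every zero becomes nonzero), while ordering (b) creates none.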


Note: A does not have to be very large for it to be worthwhile to exploit sparsity.

Here we compare the sparse solver MA48 (HSL) with a dense solver SGESV (LAPACK) on some problems from practical applications (timings in seconds).

Identifier      n      nz    MA48   SGESV
FS 680 3      680    2646    0.06    0.96
PORES 2      1224    9613    0.54    4.54
BCSSTK27     1224   56126    2.07    4.55
NNC1374      1374    8606    0.70    6.19
WEST2021     2021    7353    0.21   18.88
ORANI678     2529   90158    1.17   36.37


HSL contains several sparse direct solvers:
• Some are for symmetric systems, others for unsymmetric systems.
• There are solvers designed for element problems.
• There are solvers that use minimal storage.
• Some are designed for particular sparsity structures (banded, highly unsymmetric, KKT ...).
• There are solvers for real systems and solvers for complex systems.
• Some exploit high level BLAS.
• Some incorporate scaling / iterative refinement / different orderings ...


Chemical process engineering problems

• Realistic, industrial-scale process modelling problems for dynamic simulation and optimization require large-scale computation.
• Solving large, sparse linear systems is often a bottleneck (up to 95% of total computation time).
• These systems involve matrices that are:
  – Very sparse
  – Not diagonally dominant
  – Numerically indefinite
  – Highly unsymmetric in structure
  – Possibly ill-conditioned
• Consequently choice of algorithm/solver is limited.


[Figure: sparsity pattern, chemical process engineering (hydr1 matrix); axes 0 to 5000, nz = 23752]

A general-purpose solver such as the HSL code MA48 is typically used for these problems.


Phases in MA48

• (Optional) Preorder to block triangular form, e.g.

$$
PAQ = \begin{pmatrix}
B_{11} &        &        &        \\
B_{21} & B_{22} &        &        \\
\vdots &        & \ddots &        \\
B_{l1} & B_{l2} & \cdots & B_{ll}
\end{pmatrix}
$$

  (only need to factorize $B_{ii}$)
• Analyse - sparsity analysed to produce suitable ordering and data structures for efficient factorization (pivot sequence chosen to minimise fill-in and for numerical stability).
• Factorize - compute $L$ and $U$ using information from analyse
• Solve - forward elimination and back substitution

These phases are typical of a sparse direct solver.
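
The remark that only the diagonal blocks need factorizing follows from block forward substitution. As a sketch, assume $P$ and $Q$ are permutation matrices and write $y = Q^T x$ and $c = Pb$, partitioned conformally with the blocks (the symbols $y$ and $c$ are notation introduced here, not taken from the slides). Then $Ax = b$ becomes $(PAQ)\,y = c$, which can be solved block by block:

$$
B_{ii}\, y_i = c_i - \sum_{j=1}^{i-1} B_{ij}\, y_j , \qquad i = 1, 2, \ldots, l ,
\qquad\text{followed by}\qquad x = Q y .
$$

The off-diagonal blocks $B_{ij}$, $j < i$, appear only in matrix-vector products, so they are never factorized and suffer no fill-in.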


• The analyse phase selects a tentative pivot sequence to try to minimise fill-in.
• The analysis can be reused to factorize other matrices with the same sparsity pattern (important in many applications, e.g. solving a nonlinear system using a Newton-type method).
• The pivot sequence can be modified during the factorize phase to ensure stability or it can be fixed (fast factorize option).
• Once the factors are computed, they can be used to solve repeatedly for different right-hand sides b.
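
The separation of factorize and solve can be illustrated with a small dense stand-in. This sketch uses plain LU with partial pivoting, not MA48's algorithm or interface, and has no analyse phase; the function names and the 3 × 3 test matrix are assumptions for illustration. It shows only that the expensive factorization is done once and the cheap triangular solves are repeated per right-hand side.

```python
def lu_factorize(a):
    """Dense LU with partial pivoting.

    Returns (lu, perm): lu holds L (unit diagonal, below the diagonal)
    and U (on and above the diagonal); perm records the row order.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Choose the largest entry in column k as pivot.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b using stored factors: forward elimination, then back substitution."""
    n = len(lu)
    y = [b[perm[i]] for i in range(n)]
    for i in range(n):
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


A = [[4.0, 3.0, 0.0], [6.0, 3.0, 1.0], [0.0, 2.0, 5.0]]
lu, perm = lu_factorize(A)                     # factorize once
x1 = lu_solve(lu, perm, [7.0, 10.0, 7.0])      # first right-hand side
x2 = lu_solve(lu, perm, [4.0, 6.0, 0.0])       # second right-hand side, same factors
```

Here the two right-hand sides correspond to the solutions (1, 1, 1) and (1, 0, 0) respectively.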


MA48 results

Results on an SGI Origin (timings in seconds).

Identifier        n         nz   Analyse  Factorize  Fast Factorize  Solve
onetone2     36,057    227,628     10.43       3.47            2.97   0.10
bayer01      57,735    277,774      5.02       1.20            0.65   0.10
lhr71c       70,304  1,528,092     40.26       9.99            7.56   0.39
icomp        75,724    338,711      0.60       0.18            0.13   0.06


Parallel approach

Start by preordering A to Singly Bordered Block Diagonal (SBBD) form

$$
\begin{pmatrix}
A_{11} &        &        &        & C_1    \\
       & A_{22} &        &        & C_2    \\
       &        & \ddots &        & \vdots \\
       &        &        & A_{NN} & C_N
\end{pmatrix},
$$

where
• $A_{ll}$ are $m_l \times n_l$ matrices with $m_l \ge n_l$
• $C_l$ are $m_l \times k$ with $k \ll n_l$.


[Figure: bayer04 before and after reordering]


• Perform a partial LU decomposition of each $(A_{ll} \; C_l)$.
• Complete the factorization by forming and then factorizing an interface problem.

Advantages over designing a general parallel sparse solver:
• Allows us to exploit existing fully tested and developed sophisticated direct solvers.
• Processors are preassigned all the necessary matrix data before the factorization starts.
• Communications only required to send Schur complement matrices to the processor responsible for the interface problem (plus communication of interface data during the solve phase).
• Interface matrix much smaller than the original matrix; factorize using any existing sparse solver.
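
One way to picture the partial factorization is the following sketch. It assumes, for illustration only, that every one of the $n_l$ columns of $A_{ll}$ receives a pivot, and it omits the row and column permutations; the symbols $L_l$, $\bar{L}_l$, $U_l$, $\bar{U}_l$, $S_l$ and $F$ are notation introduced here rather than taken from the slides.

$$
\begin{pmatrix} A_{ll} & C_l \end{pmatrix}
=
\begin{pmatrix} L_l & 0 \\ \bar{L}_l & I \end{pmatrix}
\begin{pmatrix} U_l & \bar{U}_l \\ 0 & S_l \end{pmatrix},
$$

where $L_l$ and $U_l$ are $n_l \times n_l$ triangular factors, $\bar{L}_l$ is $(m_l - n_l) \times n_l$, $\bar{U}_l$ is $n_l \times k$, and $S_l$ is the $(m_l - n_l) \times k$ local Schur complement left over once the pivots in $A_{ll}$ have been eliminated. Since $A$ is $n \times n$, $\sum_l m_l = n = \sum_l n_l + k$, so $\sum_l (m_l - n_l) = k$ and assembling the rows of all the $S_l$ gives a $k \times k$ interface matrix $F$. With $k \ll n_l$, this interface problem is small compared with the original system.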


Unfortunately, factorization of the rectangular submatrices $(A_{ll} \; C_l)$ cannot be performed using an existing direct solver without modifications.

• Have to distinguish between columns of $A_{ll}$ that are candidates for elimination and those belonging to the border $C_l$ that must be passed to the interface problem.
• Pivots can ONLY be chosen from $A_{ll}$.
• Must have access to the Schur complement remaining at the end of each partial factorization.
• Submatrix may be rank deficient.


Efficiency

Our new parallel direct solver is called HSL_MP48.

Efficiency of HSL_MP48 depends on:
• SBBD form having a small interface. (The interface problem is solved using a single processor.)
• Obtaining good load balance.

A matrix may naturally arise in SBBD form (e.g. the components of a chemical processing plant) but not in general.

The MONET algorithm of Yifan Hu is very successful (available in HSL 2002 as routine HSL_MC66).


Results

Numerical results for the parallel direct solver HSL_MP48.

• 12-processor SGI Origin2000 (Manchester University)
• cpuset facility (exclusive access to the processors and their local memory)
• Fortran 90 compiler in 64 bit mode with optimization flags -O3 -OPT:Olimit=0
• Vendor-supplied BLAS
• SBBD with N = 8 blocks
• All timings are wallclock timings in seconds.


Results (cont.)

Example: bayer01 (chemical process simulation problem)

• n = 57735, nz = 277774.
• Time for Analyse + Factorize + Solve (Speedup)

MA48     HSL_MP48, p = 1    p = 2          p = 4          p = 8
6.37     4.23               2.39 (1.8)     1.48 (2.9)     0.97 (4.4)

• Time for interface problem is 0.06.
• Time for Solve (Speedup)

MA48     HSL_MP48, p = 1    p = 2          p = 4          p = 8
0.105    0.116              0.082 (1.4)    0.050 (2.3)    0.047 (2.5)
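
The bracketed speedups for bayer01 are consistent with being measured against the HSL_MP48 time on one processor rather than against MA48. The short check below uses the total timings quoted above; the parallel efficiency (speedup divided by p) is an extra quantity computed here for illustration and does not appear in the slides.

```python
def speedups(t1, timings):
    """Speedup T(1) / T(p) for each processor count p, rounded to one decimal."""
    return {p: round(t1 / t, 1) for p, t in timings.items()}


def efficiencies(t1, timings):
    """Parallel efficiency T(1) / (p * T(p)) for each processor count p."""
    return {p: t1 / (p * t) for p, t in timings.items()}


# bayer01, Analyse + Factorize + Solve, HSL_MP48 wallclock seconds.
t1 = 4.23
timings = {2: 2.39, 4: 1.48, 8: 0.97}

s = speedups(t1, timings)
e = efficiencies(t1, timings)
print(s)  # {2: 1.8, 4: 2.9, 8: 4.4}
```

The rounded values reproduce the figures in brackets; the efficiency on 8 processors comes out at a little over one half.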


Results (cont.)

Example: lhr71c (light hydrocarbon recovery problem)

• n = 70304, nz = 1528092.
• Time for Analyse + Factorize + Solve (Speedup)

MA48     HSL_MP48, p = 1    p = 2          p = 4          p = 8
50.6     71.2               39.8 (1.8)     22.3 (3.2)     12.4 (5.7)

• Time for interface problem is 0.7.
• Time for Solve (Speedup)

MA48     HSL_MP48, p = 1    p = 2          p = 4          p = 8
0.39     0.51               0.32 (1.6)     0.20 (2.6)     0.13 (4.1)


Results (cont.)

Example: 10cols (chemical process simulation problem)

• n = 29496, nz = 109588.
• Time for Analyse + Factorize + Solve (Speedup)

MA48     HSL_MP48, p = 1    p = 2          p = 4          p = 8
16.4     2.75               1.60 (1.7)     0.93 (3.0)     0.65 (4.2)

• Time for interface problem is 0.15.
• Flops ($\times 10^5$): MA48 = 1611; HSL_MP48 = 183
• Time for Solve (Speedup)

MA48     HSL_MP48, p = 1    p = 2          p = 4          p = 8
0.074    0.034              0.032 (1.6)    0.027 (2.0)    0.023 (2.4)


Concluding remarks

• HSL is an established but continually evolving library of mathematical software.
• Problems users want to solve are always increasing in size.
• Most of these are sparse.
• Thus new techniques/methods continue to be developed.
• New solvers to implement them continue to be written.
• Recently, parallel algorithms are being developed.

We always welcome new applications and test problems.
