THE EXPAND PARALLEL FILE SYSTEM

A FILE SYSTEM FOR CLUSTER AND GRID COMPUTING

José Daniel García Sánchez

ARCOS Group – University Carlos III of Madrid


Contents

The ARCOS Group.

Expand motivation.

Expand design.

Expand evaluation.

Conclusions.

Ongoing Work.

The Expand Parallel File System

José Daniel García Sánchez – ARCOS Group – University Carlos III of Madrid

July 2007 - University of Modena


University Carlos III of Madrid

Founded in 1989.

Three faculties:

Faculty of Social Sciences and Law.

Faculty of Humanities, Documentation and Communication.

Higher Technical School.


The ARCOS Group

The Computer Architecture, Communications and Systems Group is part of the Department of Computer Science.

20 full-time members.

9 PhDs (2 full professors + 4 associate professors + 3 visiting professors).

11 PhD students.


Research lines

Data management on Grid environments.

Parallel file systems.

Optimization of irregular applications.

OS for Wireless Sensor Networks.

Real-time systems.


Some products

Expand: A parallel file system for cluster and grid environments.

WinPFS: Windows Parallel File System.

MiMPI: MPI implementation for heterogeneous cluster environments.




Trends in the supercomputing environment

[Chart: Clusters in top500.org, vertical axis from 0 to 500, June 1993 to October 2006.]

75% of supercomputers in top500 are clusters.


Trends in the supercomputing environment

Number of transistors per chip still doubling every 1.5 years.

Does not mean doubling frequency, performance, …

More space means more cores per chip.


Trends in the supercomputing environment

Grid Computing: Interconnecting supercomputers to aggregate geographically distributed resources.

Applications are deployed somewhere in the grid.

Applications read input data and produce output data.


Trends in the supercomputing environment

Clusters are becoming the preferred option for supercomputing.

Processors with increasing capacity.

Grid computing uses clusters as a building block.

I/O will remain a major bottleneck.


Storage system typical architecture

[Diagram: Clients, Communication network, I/O server, Storage network.]


Problems with storage architectures

[Chart: NAS. Aggregated bandwidth (MB/s, 0 to 14) versus number of clients (1, 2, 4, 8, 16).]


Solution: Parallelism

Exploit parallelism at multiple layers:

Parallel applications.

Parallel computers.

Parallel file systems.

Parallel devices.


Parallel File System Architecture

[Diagram: computing nodes (Apps, Client), Communication Network, three I/O servers, File 1, File 2.]


Parallel File System Architecture

[Diagram, labelled GPFS: computing nodes (Apps, Client), Communication Network, three I/O servers, Storage Network, File 1, File 2.]


Parallel File System Architecture

[Diagram, labelled GPFS: computing nodes (Client, Apps), Storage Network, File 1, File 2.]


Expand Parallel File System: Motivation

Provide a high performance storage system by using standard protocols and servers.

Easy integration of heterogeneous systems.

Reuse and aggregation of existing resources.

Parallel data access.


Why Expand?

A standard data server already includes almost all the needed functionality.

Reuse.

Standard protocols and servers make resources universally available.

Easy to deploy.

Independence of the underlying storage infrastructure.

Portability.


Objective

Offer a new approach to building parallel file systems (PFS) for cluster and grid environments by using standard data servers.

Advantages:

No server change needed.

• Operations at client side.

Independence of client and server OSs.

• Operations through standard protocols.

Simplified PFS construction.

• Take advantage of high performance mechanisms already implemented in the servers.

Allows mixing servers with different platforms and OSs.

Easy installation and configuration.




Architecture

[Diagram: on the computing node, the POSIX, MPI-IO, … interfaces sit on top of Expand, which provides parallel access and uses the server protocols NFS, GridFTP, RNS-WS, Local, … to reach a distributed partition holding File 1 and File 2.]


File structure

Expand file:

Metadata sub-file.

Several data sub-files.

Data distributed across several servers.

File-to-server flexible mapping policy.

[Diagram: Expand file blocks 0 to 8 split into sub-files: Server 1 holds blocks 0, 3, 6; Server 2 holds blocks 1, 4, 7; Server 3 holds blocks 2, 5, 8.]
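The block layout shown for the file structure (blocks 0, 3, 6 on Server 1, and so on) matches a round-robin distribution. The following is a minimal sketch of that mapping, not Expand's actual code: the function name `locate_block` and the 0-based server index are assumptions made for illustration, and the slides also mention that other mapping policies are possible.

```python
def locate_block(block, num_servers):
    """Map a logical block number to (server index, block index inside that
    server's sub-file), assuming plain round-robin striping."""
    return block % num_servers, block // num_servers


# Rebuild the layout of the diagram: 9 blocks over 3 servers.
layout = {
    server: [b for b in range(9) if locate_block(b, 3)[0] == server]
    for server in range(3)
}
print(layout)
```

Server index 0 corresponds to "Server 1" in the diagram.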


Directory structure

[Diagram: the logical view is a tree rooted at /Expand containing Dir1, Dir2, Dir3, Dir4 and FileA. The mapping replicates this tree in the physical view: /export1 on Server 1, /export2 on Server 2 and /export3 on Server 3 each contain Dir1, Dir2, Dir3, Dir4 and FileA.]
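The logical-to-physical directory mapping can be sketched as follows. This is an illustration under stated assumptions: the directory tree is taken to be replicated under every server's export directory, the server and export names come from the diagram, and the function name, the dictionary layout and the example path are hypothetical.

```python
from posixpath import join, relpath

# Export directories as labelled in the diagram; the dict itself is an assumption.
EXPORTS = {"Server 1": "/export1", "Server 2": "/export2", "Server 3": "/export3"}


def physical_paths(logical_path, mount_point="/Expand"):
    """Return, for each server, the path that backs a logical Expand path."""
    rel = relpath(logical_path, mount_point)
    return {server: join(export, rel) for server, export in EXPORTS.items()}


# Hypothetical example path: the exact position of FileA in the tree is not
# recoverable from the slide.
paths = physical_paths("/Expand/Dir1/FileA")
print(paths)
```

Each entry names the sub-file of the same logical file on one server.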


Metadata management

Distributed metadata management.

Two levels.

Without locking.

No metadata manager.

Metadata distributed across servers.

Master node.

Hashing on name.

Load balancing.

[Diagram: Expand file blocks 0 to 8 striped over Servers 1 to 3, with the file's metadata block stored on one of the servers.]


Metadata management

[Diagram: the metadata blocks of FileA, FileB, FileC, FileD, FileE and FileF are spread over Servers 1 to 3.]
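Choosing a master node by hashing the file name can be sketched as below. The slides do not say which hash function Expand uses, so CRC32 is only a stand-in, and `master_server` is a hypothetical name. The point of the sketch is that every client computes the same server from the name alone, so no central metadata manager is needed.

```python
import zlib
from collections import Counter


def master_server(file_name, num_servers):
    """Pick the server that holds a file's metadata.

    Assumption: CRC32 of the name modulo the number of servers; the real
    hash function is not given in the slides.
    """
    return zlib.crc32(file_name.encode("utf-8")) % num_servers


# Spread of 600 hypothetical file names over 3 servers (load balancing).
spread = Counter(master_server(f"file{i}", 3) for i in range(600))
print(spread)
```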


Parallel access

read(fd, buffer, count)

[Diagram: the buffer covers data blocks 0 to 8 of the Expand file; Expand uses threads to fetch them from Server 1 (blocks 0, 3, 6), Server 2 (blocks 1, 4, 7) and Server 3 (blocks 2, 5, 8).]
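A single read that is served by several servers at once can be sketched as follows. This is a simplified model, not Expand's implementation: sub-files are in-memory byte strings instead of remote files, the block size is tiny on purpose (the slides do not give the real one), and one task is issued per block rather than per server. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4  # bytes per block; illustrative value only


def make_subfiles(data, num_servers):
    """Stripe data round-robin into one bytes object per server."""
    subfiles = [bytearray() for _ in range(num_servers)]
    for start in range(0, len(data), BLOCK):
        subfiles[(start // BLOCK) % num_servers] += data[start:start + BLOCK]
    return [bytes(s) for s in subfiles]


def read_block(subfiles, block):
    """Fetch one logical block from the sub-file that stores it."""
    server, local = block % len(subfiles), block // len(subfiles)
    return subfiles[server][local * BLOCK:(local + 1) * BLOCK]


def parallel_read(subfiles, offset, count):
    """Read count bytes at offset by fetching the covering blocks concurrently."""
    if count <= 0:
        return b""
    first, last = offset // BLOCK, (offset + count - 1) // BLOCK
    with ThreadPoolExecutor(max_workers=len(subfiles)) as pool:
        blocks = list(pool.map(lambda b: read_block(subfiles, b),
                               range(first, last + 1)))
    joined = b"".join(blocks)
    skip = offset - first * BLOCK
    return joined[skip:skip + count]


data = bytes(range(36))            # 9 blocks of 4 bytes
subfiles = make_subfiles(data, 3)
chunk = parallel_read(subfiles, 5, 20)
```

`pool.map` returns results in block order, so the pieces can be concatenated directly into the caller's buffer.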


MPI-IO interface using ROMIO

[Diagram: MPI-IO sits on top of ADIO, which has back ends for Unix, NFS, PVFS, Expand, IBM PIOFS and SGI XFS; the Expand back end accesses the distributed partition.]


Dynamic partition reconfiguration

Instantaneous.

Deferred.

[Diagram: Servers 1 to 4; blocks 0 to 8 of a file are stored on three of the servers (0, 3, 6; 1, 4, 7; 2, 5, 8), with hash(file) = server 3.]


Dynamic partition reconfiguration

[Diagram: Servers 1 to 4 after a server has been added; blocks 0 to 8 keep their previous placement and blocks 9 to 16 are laid out across the servers.]
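The slides do not describe how blocks are redistributed when a server is added. As a rough illustration of why reconfiguration has a cost, the sketch below assumes plain round-robin striping and counts the blocks whose server would change if a file were fully restriped. The function name and this restriping model are assumptions, not Expand's algorithm.

```python
def moved_blocks(num_blocks, old_servers, new_servers):
    """Count blocks whose round-robin server changes with the stripe width."""
    return sum(1 for b in range(num_blocks)
               if b % old_servers != b % new_servers)


moved = moved_blocks(12, 3, 4)
print(f"{moved} of 12 blocks change server when going from 3 to 4 servers")
```

Under this assumption most blocks move, which is one reason to offer a deferred mode alongside an instantaneous one.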


Expand cluster versions

[Diagram: two versions, Linux/NFS and Java/NFS, each accessing NFS servers that form a distributed partition.]


Contents

The ARCOS Group.

Expand motivation.

Expand design.

Expand adaptation for Grid Computing.

Expand evaluation.

Conclusions.

Ongoing Work.


Requirements for a Grid File System

Hierarchical logical namespace.

Resource Namespace Service (RNS).

Standard access interface.

POSIX and MPI-IO.

Data access.

GridFTP.

Security.

Grid Security Infrastructure (GSI).

Performance optimization and improvement.

Parallel I/O.


Adapting Expand to Grid environments

[Diagram: computing nodes run POSIX, MPI-IO, … on top of Expand, with the protocols NFS, GridFTP, RNS-WS, Local, …. Locally, NFS servers form a distributed partition. Through RNS and Internet + GSI, GridFTP servers at Sites 1 to 4 form another distributed partition.]




Evaluation

How does Expand behave compared to other existing solutions?

Cluster

• PVFS.

• GPFS.

Grid

• Globus Grid services.


Cluster environment

8 dual-processor nodes (Pentium IV, 3.2 GHz).

2 GB RAM per node.

Network: Gigabit Ethernet.

File systems compared: Expand, PVFS and GPFS.


Cluster benchmarking

High performance.

Parallel access to a file: IOR benchmark.

FLASH I/O benchmark.

Metadata operations.

High throughput.

Image processing.

Dynamic partition reconfiguration.


High performance: Parallel access to a file

Parallel program (IOR) making interleaved writes and reads to a single file with different access sizes.

MPI-IO interface.

[Diagram: Process 1 and Process 2 access interleaved blocks of a file with blocks 0 to 8.]
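The interleaved pattern in which several processes share one file can be sketched as a list of byte offsets per process. This only illustrates the access pattern suggested by the slide; it is not IOR's code or its option set, and the function and parameter names are hypothetical.

```python
def interleaved_offsets(rank, num_procs, access_size, num_accesses):
    """Byte offsets touched by one process when processes take turns,
    access by access, across a single shared file."""
    return [(i * num_procs + rank) * access_size for i in range(num_accesses)]


# Two processes, 8 KB accesses, three accesses each.
p0 = interleaved_offsets(0, 2, 8192, 3)
p1 = interleaved_offsets(1, 2, 8192, 3)
print(p0, p1)
```

Process 0 gets the even-numbered chunks and process 1 the odd-numbered ones, so consecutive chunks of the file come from different processes.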


High performance: Parallel access to a file for writing

[Chart: 8 processes (writing). Bandwidth (MB/s, 0 to 120) versus access size for XPN, PVFS and GPFS.]


High performance: Parallel access to a file for reading

[Chart: 8 processes (reading). Bandwidth (MB/s, 0 to 140) versus access size for XPN, PVFS and GPFS.]


High performance: Parallel access to a file for writing

[Chart: Parallel access, writing (8 KB). Bandwidth (MB/s, 0 to 140) versus number of processes (1, 2, 4, 8, 16) for XPN, PVFS and GPFS.]


High performance: Parallel access to a file for writing

[Chart: Parallel access, writing (256 KB). Bandwidth (MB/s, 0 to 100) versus number of processes (1, 2, 4, 8, 16) for XPN, PVFS and GPFS.]


High Performance: FLASH-IO

FLASH is a parallel application simulating thermonuclear flashes.

FLASH-IO simulates I/O operations performed by FLASH.

Data size is proportional to number of running processes.

1 process: 73.53 MB.

16 processes: 1.16 GB.


High Performance: FLASH-IO

[Chart: Benchmark FLASH-IO. Bandwidth (MB/s, 0 to 80) versus number of processes (1, 2, 4, 8, 16) for XPN, PVFS and GPFS.]


Metadata: Creating empty files

[Chart: File creation (empty files). Files/second (0 to 1400) per filesystem (XPN, PVFS, GPFS), for 1 process and 4 processes.]


Metadata: Creating small files

[Chart: File creation (small files). Files/second (0 to 1200) per filesystem (XPN, PVFS, GPFS), for 1 process and 4 processes.]


High throughput

Parallel application processing a set of 256 images.

Each process works on a subset of images independently.

No concurrent access to a file.

Sizes:

Image file: 5 MB.

Full dataset: 2.5 GB.

Each process applies a fixed bitmask to each image file to generate a new image file.
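The high-throughput workload (independent processes, each masking its own subset of images) can be sketched as follows. The way images are divided among processes and the mask value are assumptions; the slides only state that each process works on its own subset and applies a fixed bitmask. Function names are hypothetical.

```python
def images_for(rank, num_procs, num_images=256):
    """Image indices handled by one process.

    Assumption: contiguous, near-equal chunks; the slides do not say how
    the 256 images are divided.
    """
    base, extra = divmod(num_images, num_procs)
    start = rank * base + min(rank, extra)
    return range(start, start + base + (1 if rank < extra else 0))


def apply_mask(pixels, mask=0x0F):
    """Apply a fixed bitmask to every byte of an image buffer."""
    return bytes(p & mask for p in pixels)


share = images_for(0, 16)
masked = apply_mask(b"\xff\x12")
```

Because no two processes touch the same file, the file system sees many independent streams rather than concurrent access to one file.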


High throughput: Image processing in C

[Chart: Image processing (C application). Time (s, 0 to 100) versus number of processes (1, 2, 4, 8, 16) for XPN, PVFS and GPFS.]


High throughput: Image processing in Java

[Chart: Image processing (Java application). Time (s, 0 to 450) versus number of processes (1, 2, 4, 8, 16) for XPN, PVFS and GPFS.]


Dynamic partition reconfiguration: Adding new nodes

[Chart: Bandwidth (MB/s, 0 to 90) and reconstruction time (min, 0 to 9) for each reconstruction model: 4-5, 4-6, 4-7, 4-8, 5-6, 5-7, 5-8, 6-7, 6-8, 7-8.]


Grid evaluation environment

Evaluation for high throughput.

Perform 500 jobs.

Each job randomly selects one of 200 files to access.

File size is 200 MB.
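The grid workload can be reproduced in outline with a few lines. This is a sketch under assumptions: files are chosen uniformly at random, a fixed seed is used only to make the run repeatable, and the total volume assumes that each job reads its whole file, which the slides do not state. `make_workload` is a hypothetical name.

```python
import random

FILE_SIZE_MB = 200  # from the slide


def make_workload(num_jobs=500, num_files=200, seed=0):
    """Return the file index accessed by each job (uniform random choice)."""
    rng = random.Random(seed)
    return [rng.randrange(num_files) for _ in range(num_jobs)]


jobs = make_workload()
# Upper bound on data moved if every job reads its entire file.
total_mb = len(jobs) * FILE_SIZE_MB
print(len(set(jobs)), "distinct files touched;", total_mb, "MB requested")
```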


Testbed environment

[Diagram: testbed machines: Intel Pentium 4 at 3.20 GHz; Intel Xeon, dual processor, at 2.4 GHz; Intel Pentium 4 at 2.40 GHz; Intel Pentium 4 at 2.80 GHz; AMD Athlon. Four GridFTP servers are shown.]


Evaluated scenarios

Typical Grid

Completely transfer file to local node.

Processing starts after transfer finishes.

Globus services for transfer.

globus-url-copy

Expand

Direct remote access to file.

No previous transfer to node needed!


Model 1 / Scenario 1

1 server.

Complete transfer to local node.

Application accesses local copy.

[Diagram: one GridFTP server holding the files.]


Model 2 / Scenario 1

4 servers.

Distributed files.

Each server stores 50 files.

Complete transfer to local node.

Application accesses local copy.

[Diagram: four GridFTP servers holding the files.]


Scenario 2 (Expand)

Expand with 1, 2 and 4 servers.

The local node accesses the data it needs remotely.

No previous transfer needed.

[Diagram: the computing node runs POSIX, MPI-IO, … on top of Expand, with the protocols NFS, GridFTP, RNS-WS, Local, …. Through RNS and Internet + GSI it reaches GridFTP servers at Sites 1 to 4, which form the distributed partition.]


Grid Evaluation

[Chart: Time (min, 0 to 200) per model: 1 site; 4 sites (distributed files); Expand (1 server); Expand (2 servers); Expand (4 servers).]




Conclusions

It is feasible to build a parallel file system by using standard protocols and servers.

Our solution is easily adaptable to different environments and situations (cluster and grid are examples).

Performance results are comparable to other solutions (even commercial ones).




Ongoing work

Add new protocols (e.g. Web Services).

Evaluation in large clusters and grid environments.

Use Expand to improve performance when accessing replicated data.


Ongoing work

Use Expand as intermediate file system in large clusters.

[Diagram: on the compute nodes, Apps run on top of Expand; Expand provides parallel access over the network to I/O nodes running a cluster file system (PVFS, GPFS, etc.).]