
Although we have encouraged our users to create larger files, this has not happened, since it would require changes in application behavior. We have added quotas to manage usage and have provided an archive tool; however, these measures do not fundamentally address the scalability problem.

Our near-term solution to scalability is to allow multiple namespaces (and NameNodes) to share the physical storage within a cluster. We are extending our block IDs to be prefixed with block pool identifiers. Block pools are analogous to LUNs in a SAN storage system, and a namespace with its pool of blocks is analogous to a file system volume.
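To make the structure concrete, here is a minimal sketch of a block ID qualified by its block pool identifier. The class and field names are illustrative assumptions, not the paper's or HDFS's actual implementation:

    // Sketch of a block ID prefixed by a block pool identifier.
    // Names are illustrative only; HDFS's real classes differ.
    public final class ExtendedBlockId {
        private final String blockPoolId; // identifies a namespace's pool, e.g. "BP-1"
        private final long blockId;       // unique only within its block pool

        public ExtendedBlockId(String blockPoolId, long blockId) {
            this.blockPoolId = blockPoolId;
            this.blockId = blockId;
        }

        // Two blocks are equal only if both pool and ID match, so
        // independent namespaces may reuse the same numeric block IDs.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ExtendedBlockId)) return false;
            ExtendedBlockId other = (ExtendedBlockId) o;
            return blockPoolId.equals(other.blockPoolId) && blockId == other.blockId;
        }

        @Override
        public int hashCode() {
            return 31 * blockPoolId.hashCode() + Long.hashCode(blockId);
        }

        @Override
        public String toString() {
            return blockPoolId + ":" + blockId;
        }
    }

Qualifying every block by its pool is what would let independent NameNodes hand out numerically identical block IDs without collisions on the shared DataNodes.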

This approach is fairly simple and requires minimal changes to the system. It offers a number of advantages besides scalability: it isolates the namespaces of different sets of applications and improves the overall availability of the cluster. It also generalizes the block storage abstraction, allowing other services to use the block storage service, perhaps with a different namespace structure. In the future we plan to explore other approaches to scaling, such as keeping only a partial namespace in memory and implementing a truly distributed NameNode. In particular, our assumption that applications would create a small number of large files was flawed. As noted earlier, changing application behavior is hard. Furthermore, we are seeing new classes of HDFS applications that need to store a large number of smaller files.
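To see why many small files strain a single NameNode, consider a back-of-the-envelope estimate (the per-object figure is a commonly cited Hadoop rule of thumb, not a number from this paper): if each file, directory, and block costs on the order of 150 bytes of NameNode heap, then 100 million single-block files represent roughly 200 million namespace objects, or about 200,000,000 × 150 B ≈ 30 GB of heap, regardless of how few bytes those files actually hold.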

The main drawback of multiple independent namespaces is the cost of managing them, especially if the number of namespaces is large. We are also planning to use application- or job-centric namespaces rather than cluster-centric namespaces; this is analogous to the per-process namespaces used to deal with remote execution in distributed systems in the late 80s and early 90s [10][11][12].

Currently our clusters are smaller than 4000 nodes. We believe we can scale to much larger clusters with the solutions outlined above. However, we believe it is prudent to run multiple clusters rather than a single large one (say, three 6000-node clusters rather than a single 18,000-node cluster), as this allows much better availability and isolation. To that end we are planning to provide greater cooperation between clusters, for example caching remotely accessed files, or reducing the replication factor of blocks when file sets are replicated across clusters.
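As a sketch of the reduced-replication idea, assuming Hadoop's public FileSystem client API (setReplication is a real client call; the cross-cluster catalog lookup is a hypothetical placeholder), a maintenance job could lower the local replication factor of files known to have copies in another cluster:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Minimal sketch: if a file set already has replicas in a remote
    // cluster, fewer local replicas may suffice for availability.
    public class CrossClusterReplication {
        private static final short LOCAL_ONLY_REPLICATION = 3;
        private static final short CROSS_CLUSTER_REPLICATION = 2;

        public static void adjustReplication(FileSystem fs, Path file)
                throws IOException {
            short target = hasRemoteCopy(file)
                    ? CROSS_CLUSTER_REPLICATION   // remote copies add redundancy
                    : LOCAL_ONLY_REPLICATION;     // rely on local replicas alone
            // setReplication asks the NameNode to schedule replication
            // of missing replicas or deletion of excess ones.
            fs.setReplication(file, target);
        }

        // Hypothetical placeholder for whatever cross-cluster catalog
        // such a scheme would consult; always false in this sketch.
        private static boolean hasRemoteCopy(Path file) {
            return false;
        }

        public static void main(String[] args) throws IOException {
            FileSystem fs = FileSystem.get(new Configuration());
            adjustReplication(fs, new Path(args[0]));
        }
    }

The trade-off is the usual one: remote copies preserve durability, while fewer local replicas reduce storage cost at some expense in local read bandwidth and tolerance of local failures.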

VI. ACKNOWLEDGMENT

We would like to thank all members of the HDFS team at Yahoo!, present and past, for their hard work building the file system. We would like to thank all Hadoop committers and collaborators for their valuable contributions. Corinne Chandel drew the illustrations for this paper.


REFERENCES

[1] Apache Hadoop. http://hadoop.apache.org/
[2] P. H. Carns, W. B. Ligon III, R. B. Ross, and R. Thakur, "PVFS: A parallel file system for Linux clusters," in Proc. of the 4th Annual Linux Showcase and Conference, 2000, pp. 317–327.
[3] J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," in Proc. of the 6th Symposium on Operating Systems Design and Implementation, San Francisco, CA, Dec. 2004.
[4] A. Gates, O. Natkovich, S. Chopra, P. Kamath, S. Narayanam, C. Olston, B. Reed, S. Srinivasan, and U. Srivastava, "Building a High-Level Dataflow System on top of MapReduce: The Pig Experience," in Proc. of Very Large Data Bases, vol. 2, no. 2, 2009, pp. 1414–1425.
[5] S. Ghemawat, H. Gobioff, and S. Leung, "The Google file system," in Proc. of the ACM Symposium on Operating Systems Principles, Lake George, NY, Oct. 2003, pp. 29–43.
[6] F. P. Junqueira and B. C. Reed, "The life and times of a zookeeper," in Proc. of the 28th ACM Symposium on Principles of Distributed Computing, Calgary, AB, Canada, August 10–12, 2009.
[7] Lustre File System. http://www.lustre.org
[8] M. K. McKusick and S. Quinlan, "GFS: Evolution on Fast-forward," ACM Queue, vol. 7, no. 7, New York, NY, August 2009.
[9] O. O'Malley and A. C. Murthy, "Hadoop Sorts a Petabyte in 16.25 Hours and a Terabyte in 62 Seconds," May 2009. http://developer.yahoo.net/blogs/hadoop/2009/05/hadoop_sorts_a_petabyte_in_162.html
[10] R. Pike, D. Presotto, K. Thompson, H. Trickey, and P. Winterbottom, "The Use of Name Spaces in Plan 9," Operating Systems Review, vol. 27, no. 2, April 1993, pp. 72–76.
[11] S. Radia, "Naming Policies in the Spring System," in Proc. of the 1st IEEE Workshop on Services in Distributed and Networked Environments, June 1994, pp. 164–171.
[12] S. Radia and J. Pachl, "The Per-Process View of Naming and Remote Execution," IEEE Parallel and Distributed Technology, vol. 1, no. 3, August 1993, pp. 71–80.
[13] K. V. Shvachko, "HDFS Scalability: The limits to growth," ;login:, April 2010, pp. 6–16.
[14] W. Tantisiriroj, S. Patil, and G. Gibson, "Data-intensive file systems for Internet services: A rose by any other name ...," Technical Report CMU-PDL-08-114, Parallel Data Laboratory, Carnegie Mellon University, Pittsburgh, PA, October 2008.
[15] A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu, P. Wyckoff, and R. Murthy, "Hive – A Warehousing Solution Over a Map-Reduce Framework," in Proc. of Very Large Data Bases, vol. 2, no. 2, August 2009, pp. 1626–1629.
[16] J. Venner, Pro Hadoop. Apress, June 22, 2009.
[17] S. Weil, S. Brandt, E. Miller, D. Long, and C. Maltzahn, "Ceph: A Scalable, High-Performance Distributed File System," in Proc. of the 7th Symposium on Operating Systems Design and Implementation, Seattle, WA, November 2006.
[18] B. Welch, M. Unangst, Z. Abbasi, G. Gibson, B. Mueller, J. Small, J. Zelenka, and B. Zhou, "Scalable Performance of the Panasas Parallel File System," in Proc. of the 6th USENIX Conference on File and Storage Technologies, San Jose, CA, February 2008.
[19] T. White, Hadoop: The Definitive Guide. O'Reilly Media, Yahoo! Press, June 5, 2009.
