a Grid Computing System - Utopia
a Grid Computing System
Radu Niculita
February-August 2002
Computer System Laboratory, Computer Science, School of Computing, National University of Singapore
Automatic Control and Computers Faculty, Politehnica University of Bucharest
Advisor: A/P Teo Yong Meng
Advisor: Prof. Dr. Ing. Nicolae Tapus
“There is nothing more difficult to take in hand, more perilous to conduct, or more uncertain in its success than to take the lead in the introduction of a new order of things...but nothing is more thrilling.”
Niccolo Machiavelli
Abstract

ALiCE is a middleware that supports generic grid application development and deployment. It is built in Java using Sun Microsystems' JavaSpaces technology and is designed with platform independence, scalability, modularity, performance and programmability in mind. The system presented in this document is built on the concepts of the previous version of ALiCE, developed at the National University of Singapore.

This document presents the design and the features of the ALiCE system. The need to redesign ALiCE arose from the practical requirements of a flexible and functional grid computing middleware system, with an emphasis on scalability and modularity.

Although the programming templates that the programmer uses are largely the same and the basic concepts are unchanged, so porting old applications is easy, the new system is entirely rewritten from scratch. The design is much more scalable and very easy to build on, so future developments are possible with minimum effort. The new design also permits support for non-Java applications.

The means of communication in the new design are generic, involving live objects moving freely through the system, which leaves room for any kind of communication and synchronization inside the system and between new ALiCE applications.
Contents

I. Introduction
1. Introduction
1.1. Motivation of Grid systems
1.2. Related Work
1.2.1. Legion
1.2.2. GLOBE
1.2.3. Globus
1.2.4. Condor
1.2.5. SETI@home
1.2.6. Distributed.Net
1.3. Summary

II. Design, Architecture and Implementation
2. ALiCE Design Overview
2.1. Design Goals
2.2. The Advantages of the New Architecture
2.3. Design Decisions
2.3.1. Java, Jini and JavaSpaces
3. The Basic Building Block - ONTA (Object Network Transfer Architecture)
3.1. How Does ONTA Work?
3.2. The Object Writer
3.3. The Object Repository
3.4. The Remote Object Loader
3.5. The Object Loader
3.6. The File Manager
3.7. The Protocols
3.7.1. The Protocol Server
3.7.2. The Protocol Client
4. ALiCE Architecture and Implementation
4.1. An Overview of the Components of the System
4.1.1. Three-tier architecture
4.1.2. System Components Overview
4.2. The Communication Between Components of ALiCE
4.2.1. Communication Through Object References
4.2.2. Communication Through Messages
4.2.3. General Communication Scheme for an ALiCE Application
4.3. System Components
4.3.1. The Common Components
4.3.2. The Consumer
4.3.3. The Resource Broker
4.3.4. The Producer and the Task Producer
4.3.5. The Data Server

III. Sample Applications and Performance Testing
5. Examples of ALiCE Applications
5.1. Matrix Multiplication
5.2. Ray Tracing
5.3. DES Key Cracker
5.4. Protein Matching
6. Performance Testing
6.1. The Test Bed
6.2. Experiments
6.2.1. Performance Evolution with Variance of Task Size
6.2.2. Varying the Number of Producers
6.2.3. Overhead Variation with Task Size for Direct Result Delivery
6.2.4. Overhead Variation with Task Size for Delivery of Results Through Resource Broker
6.2.5. Performance Comparison with the Old Version of ALiCE

IV. ALiCE GRID Programming Model
7. Developing ALiCE Applications
7.1. The Model
7.2. Template Features
7.2.1. The Task Generator Template
7.2.2. The Task Template
7.2.3. The Result Collector Template
7.2.4. Data Files Usage
7.2.5. Inter-task Communication
7.3. Simple application examples
7.3.1. Simple Example and Data File Usage
7.3.2. Simple Inter-Task Communication and Spawning a New Task from a Task
8. ALiCE Programming Templates
8.1. The Task Generator Template
8.2. The Result Collector Template
8.3. The Task Template
9. Conclusions
9.1. Summary
9.2. Future work
Part I.
Introduction
1. Introduction

1.1. Motivation of Grid systems

Grid computing, the harnessing of the immense computational resources provided by geographically dispersed computers connected via a network into one large parallel machine (also called a grid or a metacomputer), has intrigued the minds of researchers for many years.

The basic idea is quite simple: a large part of the computing power available today remains unused, from supercomputers that run at as little as 40% of their capacity to home computers that are used only for desktop applications and hence waste much of their computational power. The number of computers connected to the Internet more than quadrupled between 1999 and 2002, from 43 million to a staggering 190 million (as of February 2002), and the trend continues. These numbers present ample opportunities for the Internet to be used as a powerful distributed computing system that can compete with and best any single-machine computer available today.

Projects such as SETI@home, Genome@home, the Globus Computational Grid, MIT's Bayanihan and Milan, among many others, have not only contributed ideas on how grid computing systems can be implemented, but have also demonstrated the potential of grid computing systems.

A grid computing system has several advantages. The first is the huge amount of available resources, on an intranet, on a cluster and especially on the Internet. Current and past projects have demonstrated that it is highly efficient to harness the idle cycles of hundreds or thousands of machines for a distributed application. Though this is very promising, the world of grid computing is still at its beginning and research in this field is in its infancy.

ALiCE offers a new approach, based on distributing persistent Java objects through the network, and thus offers much more expressive power than other systems available today.

With all its advantages, grid computing comes with many challenges that a grid system developer faces. The first challenge is the adaptive, non-uniform and unpredictable nature of the network. In a grid computing system, machines can leave and join the system at any time. A grid computing system, therefore, must be able to adapt to the dynamic behavior of the available resources.
Another challenge is imposed by the nature of the applications that can be executed on a grid computing system. Not all applications can be parallelized to a degree that lets them run on a distributed system, and some applications are inherently sequential. Even the best-suited applications need implementations especially developed to run on a grid computing system. In this light, a challenge arises from the need to provide a general middleware and a feasible grid programming model, which should make application development possible and easy.

A third challenge that a grid computing system faces is security. With the abundance of resources that the Internet provides comes the question of how to utilize those resources in a secure manner, especially when running on an infrastructure as large as the Internet. Employing a single machine that is not connected to the Internet gives users the comfort of not having to worry about malicious users from other hosts. In an Internet-based grid computing system, however, this is not true. Therefore, it is of crucial importance that grid computing systems provide means to ensure that security precautions are taken.

Yet another challenge is offering users the best performance possible. In a system running such varied applications on such heterogeneous resources, guaranteed performance is not possible, but making the best use of the resources is a must. This challenge translates into trying to obtain the best possible performance for the users of the grid computing system despite the heterogeneous nature of the available resources. Although some intranets consist of relatively homogeneous resources, others, and as an extreme example the Internet, are made up of computers with varying configurations and capabilities, connected by networks of varying latency, speed and reliability. Therefore, it is not trivial to answer the question of which machine the grid computing system should send a computational task to.
1.2. Related Work

1.2.1. Legion

Legion [16], from the University of Virginia, is an object-based grid computing middleware that provides a single address space (distributed shared memory) for all nodes to use as a medium for exchanging objects. Legion is totally distributed: the system has no centralized node of any kind that functions as a manager or coordinator of the shared memory. Objects are sent from one physical address space to another via message passing.

It is designed for wide area networks and supports a variety of programming languages, such as C++, Fortran and Mentat, as well as Parallel Virtual Machine (PVM) and Message Passing Interface (MPI). Though it is claimed to be scalable, it is dependent on the UNIX platform.
1.2.2. GLOBE

GLOBE [15] is a Java-based grid computing middleware developed at Vrije University, Netherlands. Like Legion, it provides a single address space for nodes to use as a distributed shared memory. Unlike Legion, however, GLOBE nodes keep a local object that functions as a representative of the remote object in local physical memory. The GLOBE system allows the sharing not only of computational resources, but of any resource.
1.2.3. Globus

The Globus [4, 14] project at Argonne National Lab focuses on building a toolkit, a middleware, for building grid computing systems. It provides several basic lower-level services that simplify the design of higher-level services that serve as a metacomputer. These services include naming services, security, and resource management.

Globus is the most well-known grid middleware available today. Even though it is much lighter, ALiCE has some definite advantages over Globus, such as platform independence, ease of use and administration, and better control over resources. ALiCE also targets the home user more than the large systems that Globus targets.
1.2.4. Condor

Condor [17], developed at the University of Wisconsin, is a grid computing system used to harness the idle cycles of computers residing in an intranet. Condor provides a set of libraries that a C program can link against. Through these libraries, the program gains access to checkpointing and remote system call mechanisms. Condor supports job migration and quality-of-service specifications by allowing users to specify a list of preferences and requirements. Requirements specify the minimum resources needed to execute the job, whereas preferences specify the ideal amount of resources that the job would want to run on. Despite the tremendous advantages that a Condor system can provide, it is limited to the NT and UNIX platforms.
1.2.5. SETI@home
SETI@home [12] is part of the SETI (Search for Extraterrestrial Intelligence) program, which tries to find intelligent patterns in the radio waves received from space. It is an application that runs as a screen saver on the machines of anyone who is willing to offer the idle cycles of his or her computer to process those signals.
1.2.6. Distributed.Net

Distributed.Net [13] is a project that develops specific distributed applications for key cracking, working on the same principle as SETI@home. The project has had considerable success, completing the crack of a 56-bit DES key, which led to the conclusion that a 56-bit key provides too little security.
1.3. Summary

In this section we briefly present the content of this report. The report is organized in four parts.

Part one is the introductory part and ends with this summary.

The second part is the main part, containing the design and architecture details of the ALiCE grid computing system, as well as a general presentation of the implementation, starting with an overview of the architecture. Next, we present the basic building block of ALiCE, ONTA, our library for transferring objects over the network. In the last chapter of the second part we present the details of the architecture and the implementation of each of ALiCE's components.

The third part is dedicated to presenting some sample applications, along with the results of the tests we conducted on the system.

The last part presents the programming model of ALiCE applications, consisting of a programmer's manual for ALiCE application developers.
Part II.
Design, Architecture and Implementation
2. ALiCE Design Overview

At this time there are many experimental grid systems, and much research is focused on grid computing. Since this is a very promising field of computer science that is still at its beginning, having many approaches is very useful and productive. We aim to address some of the deficiencies of the existing implementations.

In parallel and distributed systems, there are two main parallel paradigms used to model the system:

the master-slave paradigm - The master-slave programming model consists of a master program that controls the overall function of the application and several independent slave sub-programs whose task is to do computations for the master program; this model is also known as the task farming model;

the peer-to-peer paradigm - In this model there is no central control or entity that has a centralized view; the model consists of a number of totally independent tasks that work together to reach a result.

In addition to those, there are a number of approaches that further refine these paradigms, such as single program multiple data, data pipelining, divide and conquer, and speculative parallelism.

Our approach is to use a hybrid paradigm. The approach we took is closer to the master-slave model, since we do have a central program and several slave sub-programs that do the computations. However, we adopt the good parts of the peer-to-peer model by permitting each slave program to create its own sub-slave programs. This permits deploying more complicated parallel algorithms, including divide-and-conquer parallel algorithms. From this point of view, we are closer to the peer-to-peer model. However, we have only one entry point into the application, so the model mainly follows the master-slave paradigm.
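The hybrid scheme can be sketched with plain Java threads: a master farms out independent tasks, and any task may spawn further sub-tasks, giving the divide-and-conquer shape described above. This is only an illustration of the paradigm; the class and method names are not part of ALiCE.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Toy task farm: a master submits slave tasks, and each slave may
// spawn its own sub-slaves, as in the hybrid master-slave model.
public class TaskFarm {
    final ExecutorService pool = Executors.newFixedThreadPool(4);
    final AtomicInteger sum = new AtomicInteger();

    // A "task" does a unit of slave computation and may fork children.
    Future<?> submit(int depth, int value) {
        return pool.submit(() -> {
            sum.addAndGet(value);            // the slave computation
            if (depth > 0) {                 // peer-to-peer flavour: spawn sub-slaves
                try {
                    submit(depth - 1, value).get();
                    submit(depth - 1, value).get();
                } catch (Exception e) { throw new RuntimeException(e); }
            }
        });
    }

    public static void main(String[] args) throws Exception {
        TaskFarm farm = new TaskFarm();
        farm.submit(2, 1).get();             // the master is the single entry point
        farm.pool.shutdown();
        System.out.println(farm.sum.get());  // 1 + 2 + 4 tasks ran, each added 1 -> 7
    }
}
```

A single root submission fans out into a tree of tasks, yet there is only one entry point, which is exactly the compromise between the two paradigms described above.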
2.1. Design Goals

The following are the main goals of the ALiCE system.
Flexibility, Modularity, Scalability and Functionality Based on the ideas in the previous version of ALiCE, the new version is entirely redesigned. Although the old version came with some very good ideas that are still used, the functionality, performance and scalability are greatly enhanced. The previous version was unable to run more than one application at a time, and some very important functions were not implemented. Also, the design was not very modular and lacked the flexibility needed in a grid computing system.

The new design focuses most on functionality and scalability. The system is now fully functional, ready to be deployed in real-life conditions. It supports multiple applications and multiple clients, and it has a very scalable implementation.
Platform Independence for Java Applications Some grid computing systems are restricted by the operating system or hardware platform they run on, limiting the resources that the grid computing system can harness. We feel that such restrictions defeat the purpose of having a grid computing system. The Internet, as an example, consists of machines of diverse platforms, with different operating systems running on different hardware configurations. ALiCE is platform independent and can scale to harness all the resources on the Internet. This platform independence comes from using Java, a cross-platform language based on a virtual machine that has implementations for almost any computer platform in existence today.
Generic Infrastructure Support Some grid computing systems are restricted to the certain applications they are built for. These are systems specifically designed to solve certain problems that are computationally intractable on single-machine systems. Such grid computing systems, though very useful for the specific problems they address, are incapable of addressing problems they were not designed to solve.

ALiCE is a generic runtime infrastructure on which users can deploy any application. This is achieved through the use of programming templates.
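To give a rough idea of what template-based programming looks like, the sketch below uses simplified stand-in interfaces (the real ALiCE template signatures are presented in chapters 7 and 8): the developer fills in a task generator, a task body, and a result collector, and the runtime drives them.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the three ALiCE programming templates;
// the real signatures appear in the programming-model chapters.
interface Task extends Serializable { Serializable execute(); }
interface TaskGenerator { List<Task> generate(); }
interface ResultCollector { void collect(Serializable result); }

// A trivial "application": square the numbers 1..3.
public class SquaresApp implements TaskGenerator, ResultCollector {
    final List<Serializable> results = new ArrayList<>();

    public List<Task> generate() {
        List<Task> tasks = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            final int n = i;
            tasks.add(() -> n * n);   // each task is an independent unit of work
        }
        return tasks;
    }

    public void collect(Serializable result) { results.add(result); }

    public static void main(String[] args) {
        SquaresApp app = new SquaresApp();
        // A real runtime would ship each task to a remote producer; here we run locally.
        for (Task t : app.generate()) app.collect(t.execute());
        System.out.println(app.results); // [1, 4, 9]
    }
}
```

Because the tasks are serializable objects, the same application code works whether the runtime executes them locally or ships them across the network.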
Generic Communication Support Many grid computing systems, as well as the previous version of ALiCE, force the user to use a particular protocol to transfer files and information over the network. In a world in which security is a big issue, a more flexible approach is needed, in order neither to expose user data to others nor to overload the system with unnecessary cryptography. For this, ALiCE offers support for user-developed communication protocols, which can be plugged in as modules without even restarting the system. There are, however, a number of built-in protocols already developed that should suit any security needs a user could impose.
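A pluggable transfer protocol could look like the following sketch. The interface and class names here are hypothetical illustrations, not the actual ALiCE protocol classes (those are described in sections 3.7.1 and 3.7.2): each protocol implements one contract, and the system selects implementations by name at runtime, so a stronger cipher can be registered without a restart.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical plug-in contract: a protocol turns a payload into wire
// bytes and back, so encryption can be swapped in at runtime.
interface TransferProtocol {
    byte[] encode(byte[] payload);
    byte[] decode(byte[] wire);
}

// A do-nothing protocol and a trivial XOR "cipher" as stand-ins.
class PlainProtocol implements TransferProtocol {
    public byte[] encode(byte[] p) { return p.clone(); }
    public byte[] decode(byte[] w) { return w.clone(); }
}
class XorProtocol implements TransferProtocol {
    private static final byte KEY = 0x5A;
    public byte[] encode(byte[] p) {
        byte[] out = p.clone();
        for (int i = 0; i < out.length; i++) out[i] ^= KEY;
        return out;
    }
    public byte[] decode(byte[] w) { return encode(w); } // XOR is its own inverse
}

public class ProtocolRegistry {
    private final Map<String, TransferProtocol> plugins = new HashMap<>();
    public void register(String name, TransferProtocol p) { plugins.put(name, p); }
    public TransferProtocol get(String name) { return plugins.get(name); }

    public static void main(String[] args) {
        ProtocolRegistry reg = new ProtocolRegistry();
        reg.register("plain", new PlainProtocol());
        reg.register("xor", new XorProtocol());  // "plugged in" while the system runs
        byte[] msg = "hello".getBytes();
        TransferProtocol p = reg.get("xor");
        System.out.println(new String(p.decode(p.encode(msg)))); // hello
    }
}
```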
Non-Java Support Even though using Java as the language for developing ALiCE applications has many benefits, the most important being platform independence, Java is not the only language in use today. Many programmers still develop applications in other languages, like C or C++, and asking them all to migrate to Java in order to use a grid computing system is unreasonable. So ALiCE includes support for the C language in this version, as well as provisions in the design for developing support for other languages.
Performance Execution Having an abundance of resources available would not be of great advantage unless we can make good use of those resources. Therefore, we want ALiCE to be able to provide performance for the applications that run on it. There are also provisions for further enhancements, like developing new scheduling algorithms.
Ease of Setup and Maintenance From the user's perspective, it is inconvenient and undesirable to have a system that is difficult to set up and maintain. Considering that ALiCE is meant to be used by many users, possibly located in differing geographical locations, it is very important that ALiCE be easy to set up and maintain.

ALiCE is developed mainly as an application, so there is no need for special privileges or for any inside knowledge of the machine it is deployed on. The user just runs a program.
Anonymity and Security In an untrusted environment such as the Internet, it is of vital importance that we do not disclose information that may be used by malicious users. For this reason, ALiCE nodes do not have information about other nodes. Also, only authenticated nodes are allowed in the ALiCE system.
2.2. The Advantages of the New Architecture

Although the previous version of ALiCE comes with some brilliant and innovative ideas, the design and the implementation of the system are to some extent faulty, as they lack functionality and do not take full advantage of the possibilities opened by the live-object migration technique. The main concept is the same: the system is built around a central point, which is JavaSpaces from Sun Microsystems; but, as with all the other parts of the system, changing from JavaSpaces to another distributed shared memory implementation can be done without affecting the rest of the system. In fact, in the later stages of development, we switched to using GigaSpaces instead of JavaSpaces without requiring any change whatsoever to the rest of the system.
The new system still lacks the support needed in the fields of security and scheduling, but those are beyond the scope of this design. Security, as well as support for different schedulers, can be developed and added to the system later, with little or no change to the actual code. The system now supports multiple applications and multiple clients at the same time. The combination of the two is also supported, meaning one can have multiple applications submitted by the same client at the same time, or even distinct instances of the same application.

The new architecture and implementation aim to create a system which is functional, high-performance, flexible and open at the same time. Some major advantages of the new approach are:
the ability to move live objects through the system very easily

The main focus of the ALiCE approach to grid computing is the possibility of migrating a live object from one machine in the system to another. This opens a whole world of possibilities. The architecture is designed in such a way that object transfer is done in a very general manner, so adding new means of communication by way of live object transfer is easy. This means that one can do anything from simple tasks like synchronization and message communication to complex tasks like behavioral transfer, object request/delivery of any kind, and even creating new tasks from inside other tasks, opening the way to a direct approach for solving divide-and-conquer problems. The serialization/transfer/reload of objects is done through a general library developed especially for that purpose, with ease of use, portability and performance in mind. For more details, see the chapter about ONTA (chapter 3).
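The serialize/transfer/reload cycle can be illustrated with plain JDK serialization. ONTA additionally ships the class files needed to reload the object; this sketch assumes the class is already available on the receiving side, and the names used are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class LiveObjectTransfer {
    // A small live object whose state survives the trip.
    static class Counter implements Serializable {
        int value;
        void inc() { value++; }
    }

    // Serialize an object graph to bytes, as if sending it over the network.
    static byte[] freeze(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) { oos.writeObject(o); }
        return bos.toByteArray();
    }

    // Reload the object at the "other end" of the connection.
    static Object thaw(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Counter c = new Counter();
        c.inc(); c.inc();                        // live state: value == 2
        Counter moved = (Counter) thaw(freeze(c));
        moved.inc();                             // keeps computing after the "move"
        System.out.println(moved.value);         // 3
    }
}
```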
improved scalability

Since we are developing a grid programming system, scalability was a major concern in the design. We kept in mind that this is a totally distributed system, so new components of any kind can be added easily and with little or no overhead. The growth of the system, which is inherent since it is a grid system, is in this way sustained and imposes no change to the implementation or the design. This means you can add new resource brokers, new task producers and new data servers on the fly without adding stress to the system. No entity is aware of the presence of other entities of the same kind in the system, so the architecture is inherently distributed and highly scalable.
adaptability

The system design includes plug-in-like support for new protocols and runtime supports. That means that new protocols for file transfer (the files are the objects in serialized form) can be added on the fly, without restarting the system. The ALiCE developer will in this way be able to deploy new security techniques over network traffic and fix security holes without redeploying the system.
performance

Performance was a big issue in the system design. The main concern was that we are dealing with a very big system and that we are using a central distributed shared memory, namely JavaSpaces; we tried to keep the footprint of transfers in JavaSpaces to a minimum, so transferring a file or an object between parts of the system means just placing a reference to it in JavaSpaces, the reference being very small. Also, in order to improve performance, we use a multi-threaded model, which is detailed later in this report. There are some other measures to improve performance, like having a file manager that handles all file storing and retrieval (with caching support planned for the future) and the possibility to deploy a multiple resource broker system.
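The reference-passing idea is easy to quantify: the entry placed in the shared space carries only a locator, while the bulky payload travels out of band. A toy comparison (the field and class names are illustrative, not ALiCE's actual reference format):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ReferenceVsPayload {
    // What a reference-passing design puts in the space: just a locator.
    static class ObjectRef implements Serializable {
        String id; String host; int port;
        ObjectRef(String id, String host, int port) {
            this.id = id; this.host = host; this.port = port;
        }
    }

    // Size of an object's serialized form, i.e. what would cross the space.
    static int serializedSize(Object o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) { oos.writeObject(o); }
        return bos.size();
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = new byte[1_000_000];                 // the "real" object data
        ObjectRef ref = new ObjectRef("task-42", "10.0.0.7", 9000);
        System.out.println("payload bytes:   " + serializedSize(payload));
        System.out.println("reference bytes: " + serializedSize(ref));
        // Only the tiny reference ever passes through the shared space.
    }
}
```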
modularity and ease of future development

The whole architecture and implementation are very modular and the code is well organized, so future development will be easy. Also, the modules are as independent as possible, with well-defined interfaces, so changes inside a module should not affect other modules.
2.3. Design Decisions

The design of ALiCE involves several decisions that have been made to fulfill our design objectives.
2.3.1. Java, Jini and JavaSpaces

The Java language was chosen for ALiCE for various reasons. Java is a platform-independent language, and the Java Virtual Machine has been implemented on various platforms, allowing different platforms to share executables.

The second reason is the popularity of the Java language itself. Choosing a popular language for a grid computing system allows users to learn how to build applications for ALiCE quickly and easily.

Thirdly, Java provides various technologies that aid the development of a distributed system such as ALiCE. These technologies include Jini and JavaSpaces. Jini is a set of Java APIs that facilitate the building and deployment of distributed systems. Jini provides the "plumbing" that takes care of common but difficult parts of distributed systems.
Jini consists of a programming model and a runtime infrastructure. The programming model helps developers build distributed systems that are reliable, even though the underlying network is unreliable. The runtime infrastructure makes it easy to add, locate, access, and remove services from the network.
JavaSpaces is a Jini service that provides a distributed shared memory for Jini-enabled devices on the network. JavaSpaces helps simplify communication, coordination and sharing of Java Objects among the Jini-enabled devices.
Figure 2.1.: JavaSpaces Technology from Sun Microsystems
JavaSpaces provides persistent storage of Objects that are accessible by various machines connected over the network. These machines can be given access to write Objects into the JavaSpace as well as read, modify, or remove these Objects from it. Figure 2.1 demonstrates these operations.
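As a rough sketch of these three operations, and emphatically not of the real Jini API, consider a toy in-memory space where entries are maps and a template matches when every non-null template field agrees with the entry:

```java
import java.util.*;

// Toy stand-in for a JavaSpace: no Jini, no persistence, no transactions --
// just the write/read/take semantics described above.
public class ToySpace {
    private final List<Map<String, Object>> entries = new ArrayList<>();

    public synchronized void write(Map<String, Object> entry) { entries.add(entry); }

    // An entry matches a template when every non-null template field equals the entry's field.
    private static boolean matches(Map<String, Object> tmpl, Map<String, Object> e) {
        for (Map.Entry<String, Object> f : tmpl.entrySet())
            if (f.getValue() != null && !f.getValue().equals(e.get(f.getKey()))) return false;
        return true;
    }

    public synchronized Map<String, Object> read(Map<String, Object> tmpl) {  // copy stays in space
        for (Map<String, Object> e : entries) if (matches(tmpl, e)) return e;
        return null;
    }

    public synchronized Map<String, Object> take(Map<String, Object> tmpl) {  // removes from space
        for (Iterator<Map<String, Object>> it = entries.iterator(); it.hasNext(); ) {
            Map<String, Object> e = it.next();
            if (matches(tmpl, e)) { it.remove(); return e; }
        }
        return null;
    }
}
```

The null-as-wildcard matching mirrors the JavaSpaces template idea: a template with only a `type` field set retrieves any entry of that type.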
GigaSpaces The GigaSpaces Synchronization and Coordination Platform is a software infrastructure for information collaboration, aimed at enterprise distributed applications and Web services. The platform is an implementation of Sun Microsystems' JavaSpaces technology.
In the current stage of development we are using GigaSpaces for ALiCE. The main advantage over Sun Microsystems' implementation of JavaSpaces is that it is much faster and more reliable. The downside is that it is a commercial product.
3. The Basic Building Block - ONTA (Object Network Transfer Architecture)
As mentioned before, the whole ALiCE system revolves around moving objects and classes around the system. To support this, we developed a library designed to take a live object or a class, put it in an archive file and then load it back at the other end of a network connection. ONTA (the Object Network Transfer Architecture) does just that, offering the ALiCE core developer a general API to serialize and save objects together with their associated classes, and thus implement object persistence over the network. ONTA also uses a generic way to transport the serialized objects over the network, that is, a protocol model which is as general as possible. In fact, a protocol can be added to the system at any time, on the fly, making the system very flexible and modular. A new protocol is composed of two parts, the server side and the client side, each being a class implementing a simple and generic interface. ONTA handles protocol retrieval and the addition of new protocols to the system, meaning it dynamically loads them, transfers the client side to where it is needed and registers them.
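The plug-in protocol idea can be sketched roughly as follows; the interface and method names here are assumptions for illustration, not ALiCE's actual API:

```java
import java.util.*;

// Hedged sketch of pluggable protocols: each protocol contributes a server part
// and a client part behind small generic interfaces, and clients can be
// registered at runtime after being loaded dynamically.
public class ProtocolRegistry {
    public interface ProtocolServer {
        void serve(java.io.File f) throws java.io.IOException;
    }
    public interface ProtocolClient {
        java.io.File download(String host, String path) throws java.io.IOException;
    }

    private final Map<String, ProtocolClient> clients = new HashMap<>();

    // Called when a new protocol's client class has been downloaded and loaded.
    public void register(String name, ProtocolClient c) { clients.put(name, c); }

    public ProtocolClient lookup(String name) { return clients.get(name); }
}
```

A downloader would look up the protocol named inside a file reference and delegate the transfer to whatever client is registered under that name.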
There are six components inside ONTA:
- the Object Writer, which handles serializing objects and creating JAR archive files containing all that is needed to retrieve an object or class after the file has been transported over the network;
- the Object Repository, which stores the JAR archive files, advertises them to be downloaded by remote object loaders and introduces new protocols into the system; it can also send messages to other parts of the system;
- the Remote Object Loader, which basically just handles retrieving file references and downloading the files; it also retrieves the messages sent to the machine it runs on by others;
- the Object Loader, which restores a saved object from a file;
- the File Manager, which handles file naming and storage on the local disk;
- the protocols, which are moved around the system when needed.
All the communication in ALiCE is done through JavaSpaces. Thus, the space is also the means of synchronization between the components. This is, as far as we know, the first time that JavaSpaces technology has been used for grid computing. The fit is a natural one, since JavaSpaces provides exactly what is needed to communicate and synchronize efficiently in a distributed system: a distributed shared memory implemented over the network as a Jini service. Tests were also conducted using another implementation of JavaSpaces, namely GigaSpaces from GigaSpaces Technologies. The results were very promising, both in terms of reliability and of speed.
3.1. How Does ONTA Work?
A diagram showing how a live object is transferred from one machine to another is presented in figure 3.1 and explained next.
Figure 3.1.: Object transfer through ONTA. (1) The Object Repository serializes the object to a file; (2) it writes a file reference or message into JavaSpace; (3) the Remote Object Loader takes the reference or message from the space; (4) it downloads the file; (5) it loads the object from the file.
ONTA is actually an infrastructure used to freely move live objects over the network, with a strong emphasis on scalability and the ability to sustain high work loads in terms of the number, size and diversity of objects. Basically, any object that implements the Serializable interface (directly or through another interface or class it extends) can be sent over the network using ONTA.
The approach that we took in addressing the problem of remote object loading was to implement a mechanism ourselves rather than use the RMI mechanism, for two main reasons. First, we handle the transport of files that contain the objects, not the objects themselves. This means that the objects are not required to be in memory all the time, thus permitting the deployment of a grid computing system, which would be very limited by using RMI. At the same time, by not keeping the objects in memory and by putting references in JavaSpace instead of objects, the footprint in JavaSpace is very small and of constant size, thus the limit on how many references can be in JavaSpace at the same time is very high, above what one would encounter in a grid computing system; hence the scalability of ALiCE. The second reason for not using the Java RMI mechanism is that it lacks the security that a grid computing system needs. Java RMI uses the HTTP protocol to download files, with no possibility to change this. The security of HTTP is almost nonexistent, compared to the flexible approach of plug-in user-developed protocols used in ONTA.
There are two kinds of objects that we transfer through the space for inter-component communication in ALiCE:
- Object References - these are actually references to files containing a serialized object, together with all the classes needed to load the object. The reference to a file consists of an AliceURL object, which is actually a tuple of three string fields, one identifying the protocol, one the host (its IP address) and one the file location on that machine. There are other fields inside the reference, such as a destination field that names the address of the host that this reference is intended to reach, a type field, some additional identification fields and the application ID field;
- Messages - this type of object is used to exchange any kind of information that does not involve files or serialized objects; to implement different kinds of messages, this class is extended in several other classes.
Inside both of these kinds of objects there is an application ID. This means that all the objects transferred through the space in ALiCE carry a tag that relates them to an application submitted into the system. This is very useful for addressing and for keeping track of things.
Each application is uniquely identified in the system and this ID is used for all the objects related to the application. An application ID is actually the URL of the file created when the application is first submitted to the system, and this ID is unique system-wide, since the file manager of ONTA generates a unique file name for each object advertised by the Object Repository; this, together with the host's IP, is a unique identifier.
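A minimal sketch of such a three-field reference tuple follows; the class and field names are ours for illustration, and the real AliceURL may differ:

```java
import java.util.Objects;

// Sketch of the protocol/host/file tuple described above (names are assumptions).
public class AliceUrl {
    public final String protocol, host, file;

    public AliceUrl(String protocol, String host, String file) {
        this.protocol = protocol; this.host = host; this.file = file;
    }

    // host + file is unique system-wide: the file manager never reuses a file
    // name on one machine, and the host IP distinguishes machines.
    @Override public boolean equals(Object o) {
        if (!(o instanceof AliceUrl)) return false;
        AliceUrl u = (AliceUrl) o;
        return protocol.equals(u.protocol) && host.equals(u.host) && file.equals(u.file);
    }
    @Override public int hashCode() { return Objects.hash(protocol, host, file); }
    @Override public String toString() { return protocol + "://" + host + "/" + file; }
}
```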
Next, we will present each component of the ONTA system individually. In what follows, we present the implementation of ONTA in the case of Java applications. For non-Java applications, all that differs are the Object Writer and the Object Loader, which are built into the language support for that application. The ALiCE system is designed and implemented in such a way that support for new languages can be added to the system without changing the already existing components.
3.2. The Object Writer
In a grid computing system, network overhead is a problem. With machines contributing to the grid spread over a wide geographical area, some network connections could be slow, so we decided to transfer all the files over the network in a compressed form. That is, for each live object that is transferred through ONTA, the system creates a JAR archive that stores the object and all the classes it needs to be restored. At the site where the object is restored, first all the classes are dynamically loaded and then the actual serialized object is deserialized.
In many ways, saving the classes that are needed by an object is straightforward: one just needs to go through all the class references, starting at the class that the object is an instance of, and save all these classes.
We can distinguish two cases of object transfer in ALiCE: the transfer of live objects and the transfer of just the classes that are needed to instantiate an object. The latter is the case when a new application is submitted into the system. In this case, the programmer supplies the class files needed for the application, but there is no instance of any of those classes that needs to be created at the consumer site and then transmitted to the resource broker or to any producer. Instead, the class file for the task generator, as well as all the classes that are referred to from inside the task generator class, should be transferred to the task producer, and the task generator should be instantiated and run there. The need to transfer live instances of classes arises more often, being the case of the instantiated tasks that are transferred to the producers, of the results coming back to the consumer or of the user objects that the tasks are communicating through.
The usage of the ObjectWriter class is quite simple: when instantiated, this class will create a temporary JAR archive file that will store the classes and perhaps the live object. After instantiation, the API of the class consists of these public methods:
public void addFile(String _fileName);
public void addClass(String _classFileName);
public void addClass(Object _obj);
public void addObject(Object _obj);
public File getJar();
After all the intended files, classes and objects are added, the user can get a File object for the archive by calling the getJar() method.
The addFile() call is very straightforward: it adds the named file to the archive as a byte stream.
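Assuming the archive is an ordinary JAR, a minimal ObjectWriter-like sketch using the JDK's JarOutputStream might look as follows; this is an illustration of the mechanism, not the ALiCE implementation:

```java
import java.io.*;
import java.util.jar.*;

// Minimal sketch of an ObjectWriter-style archiver (method names follow the API
// above, but entry names and structure are assumptions).
public class MiniObjectWriter {
    private final File jar;
    private final JarOutputStream out;

    public MiniObjectWriter() throws IOException {
        jar = File.createTempFile("onta", ".jar");      // temporary archive file
        out = new JarOutputStream(new FileOutputStream(jar));
    }

    // Serializes one object into an archive entry. ObjectOutputStream writes the
    // object's fields (and reachable objects), but not its code -- the class
    // files would have to be added as separate entries.
    public void addObject(Object obj) throws IOException {
        out.putNextEntry(new JarEntry("object.ser"));
        ObjectOutputStream oos = new ObjectOutputStream(out);
        oos.writeObject(obj);
        oos.flush();            // flush, but do not close: that would close the jar
        out.closeEntry();
    }

    public File getJar() throws IOException {
        out.close();
        return jar;
    }
}
```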
The second call adds a class to the archive, given the filename of the class file. We should stress that the first class added to the archive in this manner will be the one instantiated by the ObjectLoader as the object carried in this archive (e.g. when packing an application, the first class added to the archive must be the TaskGenerator class). This method will also traverse all the references deriving from the given class file. Finding the references inside a class file is possible because all the references are saved in the class file by the Java compiler. This support is provided by a very simple parser that obtains the list of class references by analyzing the given class file. This amounts to interpreting the content of the file by following the class file structure, as published by Sun. It is done through another class named ClassFile, which returns a linked list of files when its method GetClasses() is called, after an instance of ClassFile has been initialized with the class file given to the object writer.
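To make the idea concrete, here is a sketch of such a parser: it scans the constant pool of a class file, as defined by the published class file format, and collects the names referenced by CONSTANT_Class entries. It is a simplified stand-in for ClassFile, not its actual code:

```java
import java.io.*;
import java.util.*;

// Every class a .class file refers to appears as a CONSTANT_Class entry in the
// constant pool, pointing at a Utf8 entry holding the internal class name.
public class ClassRefs {
    public static List<String> referencedClasses(InputStream in) throws IOException {
        DataInputStream d = new DataInputStream(in);
        if (d.readInt() != 0xCAFEBABE) throw new IOException("not a class file");
        d.readInt();                                   // minor + major version
        int count = d.readUnsignedShort();             // constant_pool_count
        String[] utf8 = new String[count];
        List<Integer> classIdx = new ArrayList<>();
        for (int i = 1; i < count; i++) {
            int tag = d.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = d.readUTF(); break;               // CONSTANT_Utf8
                case 7: classIdx.add(d.readUnsignedShort()); break; // CONSTANT_Class
                case 8: case 16: case 19: case 20:
                    d.readUnsignedShort(); break;                   // String, MethodType, Module, Package
                case 15: d.readUnsignedByte(); d.readUnsignedShort(); break; // MethodHandle
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    d.readInt(); break;                             // 4-byte entries
                case 5: case 6: d.readLong(); i++; break;           // Long/Double take two slots
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        List<String> names = new ArrayList<>();
        for (int idx : classIdx) names.add(utf8[idx]);
        return names;
    }
}
```

Running this on any compiled class yields its own name, its superclass and every class it references, which is exactly the input the traversal described next needs.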
The addObject() call just creates a new entry inside the archive and serializes the object by means of an ObjectOutputStream. Note that this call also saves all the objects contained inside the one written with the writeObject() call. But this only saves a snapshot of the live object, that is, of its fields, not the code. In order to be able to restore the live object, one should also save all the code for that object and for all the objects contained in its fields, in the fields of those fields and so on; this is done by starting with the class the object is an instance of, through the call addClass(Object _obj), the third call given above.
To see how this is done, some preliminaries first. The references in each object to other objects form a hierarchy of classes that is a directed graph. We should save all the classes for each of those references. The issue is traversing the graph of object references in such a way as to reach all the classes and save all of them, while at the same time not getting into cycles and not saving the same class twice if two instances of it are encountered. Let us consider an example to be more explicit. Suppose we must save an instance of class A, that is, a live object of that class, given the following definitions:
class A {
    B ab;
    D ad;
    ....
}
class B {
    C bc;
    D bd;
    ....
}
class C {
    A ca;
    D cd;
    ....
}
class D {
    B db;
    ....
}
Figure 3.2.: Example of an object's references hierarchy
The hierarchy of references created by this object being saved to the JAR archive is presented in figure 3.2. Starting with the object that instantiates class A, we should traverse all the references and save class B and class D. We stress here that the references are all resolved at compile time by Java, even the ones referring to an interface implementation (e.g. interf1 Obj = new Implementation()), and hence all the classes needed for an object can be found starting from an analysis of the class file for the class that the object is an instance of. Then we should check all the references in those classes too. Checking the references in class B, we end up saving class C, and then, checking the references of this class, we could end up in a loop. This should be avoided. At first glance, it looks as though we should avoid re-inspecting previously traversed objects rather than previously traversed classes, because other instances of the same class could lead to different points in the hierarchy. That is not the case: different instances of the same class will lead us to the same classes, so this is a static traversal of the class hierarchy graph. Even if there are multiple instances of the same class in the hierarchy, we only need to traverse the first instance encountered.
The implementation does an iterative depth-first traversal of the graph, keeping track of nodes in a stack and remembering all the classes that have already been saved. Actually, considering that we do not continue traversing any node that has been previously inspected, we conclude that the structure can be thought of as a tree, so we are doing an iterative depth-first traversal of a general tree. Since the algorithm for this is standard, we will not go into further detail about the implementation. We save first the classes that are referred to deepest in the tree, so that, when the hierarchy of classes is restored, there should be no delegation necessary for classes not yet loaded.
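A cycle-safe traversal over the Figure 3.2 hierarchy can be sketched as follows; note that this illustrative version visits classes in pre-order rather than deepest-first:

```java
import java.util.*;

// The class hierarchy of Figure 3.2 as a reference graph; the traversal saves
// each class exactly once, so the cycle C -> A does not loop.
public class ClassGraphWalk {
    static final Map<String, List<String>> REFS = Map.of(
        "A", List.of("B", "D"),
        "B", List.of("C", "D"),
        "C", List.of("A", "D"),   // C refers back to A: a cycle
        "D", List.of("B"));

    public static List<String> classesToSave(String root) {
        List<String> saved = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(root);
        seen.add(root);
        while (!stack.isEmpty()) {
            String c = stack.pop();
            saved.add(c);
            for (String ref : REFS.getOrDefault(c, List.of()))
                if (seen.add(ref)) stack.push(ref);   // visit each class exactly once
        }
        return saved;
    }
}
```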
A problem not yet solved is that of circular references. Since the delegation of class name resolution is done by the JVM and we have no control over it, we have no way to intercept delegations for circular references and resolve them; to be able to do so would mean making changes inside the JVM.
3.3. The Object Repository
The object repository is actually just an interface that hides JavaSpace from the rest of the system and provides logical means of putting object references and messages into the space. The implementation lets the components of the system that use the object repository create their own references to put in the space. This is because there is no representation of a reference general enough to be used by every language support that is or will be implemented in ALiCE. Together with the remote object loader, the object repository is the central point of ONTA.
Basically, the API of the object repository consists of calls that advertise different types of references and messages:
public AliceURL advertiseCode(ObjectReference or, String protoName)
public AliceURL advertiseTask(ObjectReference taskToSchedule)
public AliceURL advertiseTaskToSchedule(ObjectReference or, String protoName)
public AliceURL advertiseUserObject(ObjectReference or, String protoName)
public AliceURL advertiseProtocol(String name, String dir, String fileClient, String fileServer)
public void reAdvertiseProtocol(String name, String toWho)
public AliceURL advertiseData(String dataFileName, AliceURL appl, String protocol)
public AliceURL advertiseResult(ObjectReference or, String protoName)
public void sendMessage(Message msg)
Most of these calls do the same thing, namely take an object reference from the caller, set the protocol inside it and put it in JavaSpace. The methods that are a little more complicated are presented in the following paragraphs. All the methods that advertise a reference to a file return an AliceURL pointing to the file that has just been advertised. This is most useful for the advertiseCode call, which will actually return the application ID, as presented earlier.
The method advertising a new protocol, public AliceURL advertiseProtocol(String name, String dir, String fileClient, String fileServer), dynamically loads the class for the server side of the protocol, instantiates it and runs it. The protocol is also registered with the ONTA registry. The reAdvertiseProtocol call sends a protocol on request, if that protocol is locally known and has not yet reached the component that tries to download something from the machine on which the call is made. This call is directed to the particular machine that requests the protocol. The remote object loader thread handles protocol readvertisement.
The only other method that does something extra is public AliceURL advertiseData(String dataFileName, AliceURL appl, String protocol), which first locates a data server. Each time a data server thread starts, it advertises in JavaSpace, through a special message, that it is up and ready to receive data files. After a data server is located, the reference to the data file is created with that machine as the destination and the reference is written in JavaSpace.
3.4. The Remote Object Loader
Through the use of the download protocols and of the Object Loader (see section 3.5), the remote object loader is responsible for bringing objects to the local machine over the network and restoring them there. The process of bringing to life an object saved on another machine has two parts: the download of the serialized form of the object and the dynamic loading of the object itself, once the serialized form is available on the local machine. The first of these tasks is carried out by the remote object loader, that is, downloading a serialized object over the network. As presented in this chapter, for each object saved, ONTA keeps a single JAR archive file that contains the bytecode for the object and all the classes needed to restore it. Since each serialized object is actually a file, the process of transporting it over the network translates into transporting a file over the network, and this is basically what the remote object loader does. It is designed, like the whole ONTA system, with flexibility, modularity and security in mind. The downloading of a file translates into two steps: the first is retrieving an object reference from JavaSpace and the second is actually getting the file. The first step is achieved by calling one of the next two methods:
public ObjectReference getObjectReference (ObjectReference template)
public ObjectReference waitObjectReference (ObjectReference template)
For maximum flexibility, the call takes as a parameter a template for an ObjectReference to take from the space. This is because, for example, different language supports will need special information, not general information, inside the object reference, and hence we cannot implement the creation of templates here; instead, we let the programmer of a runtime support use his own kind of references and handle the templates himself. The difference between the above two calls is that the first one is non-blocking, using a time-out, while the second one blocks until a reference that matches the given template is found in the space.
Once the reference to an object (actually to the file that contains the serialized form of the object) is obtained, the download of the file, using the protocol named inside the reference, can easily be done with the call:
public File doDownload (ObjectReference ref)
The remote object loader also provides the means to receive messages (from the class Message or from a class extending it) from JavaSpace, with the methods:
public Message takeMessage (Message template)
public Message readMessage (Message template)
public Message tryMessage (Message template)
The first two calls block until a message matching the template is received, the difference between them being that takeMessage removes the message from the space while readMessage only reads it and returns a copy, leaving the original in the space. The third method is non-blocking and removes a message matching the template from the space, if such a message exists.
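The blocking versus non-blocking distinction can be sketched with a toy message box; the real implementation goes through JavaSpaces and matches templates, so the names and structure here are illustrative only:

```java
import java.util.*;

// Toy queue showing the takeMessage/tryMessage distinction: take blocks until a
// message arrives, try returns immediately (null when nothing is available).
public class MessageBox {
    private final Deque<String> messages = new ArrayDeque<>();

    public synchronized void post(String msg) {
        messages.add(msg);
        notifyAll();                           // wake any blocked takers
    }

    public synchronized String takeMessage() throws InterruptedException {
        while (messages.isEmpty()) wait();     // block until a message arrives
        return messages.poll();
    }

    public synchronized String tryMessage() {  // non-blocking variant
        return messages.poll();
    }
}
```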
The remote object loader has one other important function: retrieving new protocols advertised in JavaSpace by other parts of the system. To achieve this, a special thread runs and waits for an object reference containing a new protocol to arrive in the space; when one is found, it reads it, downloads the file, dynamically loads the client side of the protocol, instantiates it and registers it with the ONTA registry for further use in the download process.
3.5. The Object Loader
The object loader is the component of ONTA that does the delicate job of returning a live object, given a JAR archive file that contains a serialized object and all the classes needed to restore it. There are two other classes involved in the dynamic loading of classes in ALiCE, namely OntaClassLoader, a class loader class, and MyObjectInputStream, an extension of the ObjectInputStream class, necessary to intercept the class resolution requests issued by Java when loading an object by means of an ObjectInputStream.
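The interception of class resolution can be sketched with a small ObjectInputStream subclass in the spirit of MyObjectInputStream; this is a sketch of the standard resolveClass-override technique, not ALiCE's actual class:

```java
import java.io.*;

// Routes class resolution during deserialization through a chosen ClassLoader
// (e.g. the loader that holds classes extracted from a downloaded JAR) instead
// of the default caller's loader.
public class LoaderAwareStream extends ObjectInputStream {
    private final ClassLoader loader;

    public LoaderAwareStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            return Class.forName(desc.getName(), false, loader);  // delegate to our loader
        } catch (ClassNotFoundException e) {
            return super.resolveClass(desc);   // fall back for primitives etc.
        }
    }
}
```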
The dynamic loading mechanism in Java is one of the most powerful tools available to the developer. To understand the mechanism, some preliminaries first. Class loaders are a powerful mechanism for dynamically loading software components on the Java platform. They are unusual in supporting all of the following features: laziness, type-safe linkage, user-defined extensibility, and multiple communicating name spaces. The purpose of class loaders is to support dynamic loading of software components on the Java platform. The unit of software distribution is a class. Classes are distributed using a machine-independent, standard, binary representation known as the class file format. The representation of an individual class is referred to as a class file. Class files are produced by Java compilers, and can be loaded into any Java virtual machine. The Java virtual machine uses class loaders to load class files and create class objects. Class loaders are ordinary objects that can be defined in Java code. They are instances of subclasses of the class ClassLoader. A Java application may use several different kinds of class loaders to manage various software components. In Java, a class is defined by two components: its name and its class loader. So, two classes with the same name, but loaded by different class loaders, will be different.
The OntaClassLoader class loads classes either directly from a class file or from a JAR archive. The class loader is a tool used by the ObjectLoader class in the process of restoring an object. There are two use cases for the object loader, the first being the one in which we restore an object that was saved from a live instance (the cases of serializing tasks at the task generator's site or results at the producer's site). The second use case is when there is no actual live object to transmit, so only class files are put in the archive. This is the case when sending the application from the consumer to the task producer's site - the application starts only when the task generator starts, after being sent to the task producer. In this latter case, since there is no live instance serialized in the file, the first class saved inside the archive will be instantiated, and the result of loading the "object" from that archive will be this instance. Corresponding to the two use cases, the API of the object loader consists of two main methods, together with a clean-up call that will be explained a little later:
public Object loadFromSavedClass (File f)
public Object loadFromSavedObject (File f)
public void cleanUp (File f)
Internally, the main method used is getClasses(File f), which retrieves all the class file entries from the JAR archive and dynamically loads them with a new instance of the OntaClassLoader.
The restoration of the byte code for a saved instance of an object is done through the MyObjectInputStream class, which calls the ObjectInputStream methods but intercepts name resolution delegation for the classes encountered while restoring the object, for they should be delegated to the class loader that loaded them, which is an instance of the OntaClassLoader class. This poses the problem of saving the class loader for each class that was dynamically loaded, retrieving the right class loader, and also cleaning the loaders up when they are no longer needed: since we are dealing heavily with mobile code, not cleaning these stored class loaders would mean that the memory footprint of the system grows without bound. Also, we need to retrieve the class loader for a specific class whenever we try to save to a new archive a class that was loaded by one of our class loaders. For example, at the task producer, we load, say, a Result class, as it is required for the instantiation of the task generator (through a link in the tree of references - see the ObjectWriter, section 3.2). When we try to save the serialized task to an archive to be delivered to a producer, we end up needing the Result class. Since this class was loaded by an OntaClassLoader, a call to Class.forName to restore the class (in order to find the location of the class file on the disk) also requires providing the class loader that loaded the class; otherwise the call would be unsuccessful. To solve the problem of storing the class loaders, we use one class loader per restored object. So, for example, when retrieving a task from a JAR archive at a producer, we use one class loader for all the classes needed and associate this class loader with the instance of the restored task. When a class is needed from a previously used class loader, we have either the object that the class is related to (when saving new serialized object archives) or the class loader itself (when restoring an object - when reloading an object we have first loaded all the classes inside the archive using exactly the class loader that is needed). The problem of retrieving class loaders is thus resolved. As for cleaning up these class loaders, this is achieved, for all the references to class loaders saved in relation to a JAR archive, by calling the cleanUp (File f) method presented above. This should be done when the classes from that file are no longer necessary (e.g. after a task has returned its result or after a task generator has returned). All the class loaders are saved inside a hash table that is stored in the RemoteObjectLoader class.
3.6. The File Manager
In ALiCE, all the serialized objects, as well as new protocols, are stored in files. Since the job of serializing/deserializing objects and of transferring them over the network is handled by ONTA, one important component of ONTA is the file manager, which keeps the files in an organized manner. Also, the file manager is designed so that future development of the file storing system (e.g. implementation of caching or other similar mechanisms) can be isolated from the other parts of the system.
Since we can have multiple applications using the same filename, or even different instances of the same application using the same files, and also taking into account that the filename for each archive file containing an application should be unique, the approach we took is to create a new filename for each file that the file manager stores.
There are two situations in which a file is stored with the help of the file manager: when a<br />
new archive is created as the result of an object being serialized through the object writer and<br />
3. The Basic Building Block - ONTA (Object Network Transfer Architecture)<br />
when a file is downloaded by ONTA from the network. These two situations are different in the
sense that when storing an archive after using the object writer, a local temporary file has already been
created, and it should just be renamed to a unique file name. When downloading a file, a unique
filename should first be obtained and then this file name should be used to store the downloaded file.
For handling these cases the API consists of two calls:
public String put (File f)<br />
public String getFileName(AliceURL appl)<br />
All the files are stored starting at a root directory that is selected through a string parameter
passed to the constructor of the file manager when the class is instantiated; this directory is
specified by the user through the GUI. Since creating too many files inside the same directory is
an issue, the ONTA file manager uses a hierarchical approach, so there are never more than a
maximum number of files inside a directory. If that number is reached, a new subdirectory is
created and all further files under the same root directory will be stored in that subdirectory.
We do this in order to minimize the file system overhead induced by a large number of files
residing in the same directory, at the cost of an additional level of directory indirection.
The put(File f) method stores a local temporary file under a unique filename inside the ONTA root
directory. It actually just obtains a new file name, creates a new directory if needed (if the current
one is full - that is, the maximum number of files has been reached) and moves the file there.
The second method, getFileName(AliceURL appl), should be used by all client protocols
to store a file when it is downloaded. Since all the references contain the application that they
are related to inside the object reference taken from the space, the file manager has the ability
to store all the files downloaded on behalf of an application in the same directory. This greatly
improves the file structure organization and makes it much easier to clean up unused
files. So, given an application ID (in the form of an AliceURL), this call will return a
unique filename in a directory unique for that application (if this directory does not exist, it will
be created). Storing files for different applications in different directories also resolves the issue of
files with the same filename being downloaded into the same directory.
To clean up files that are no longer used, the file manager offers two calls:
public void markNotUsed(String name)<br />
public void deleteUnused()<br />
The first method adds a file with a given name (the name returned by one of the two calls
presented above should be used) to a list of files that are no longer needed. When deleteUnused()
is called, all the files in the list are physically deleted from the disk.
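The naming scheme described above can be sketched as follows. This is a simplified illustration of the idea, not the actual ALiCE code; the class name, the per-directory limit and the file name pattern are chosen for the example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the file manager's naming scheme: every file
// gets a fresh unique name under a root directory, and a new
// subdirectory is opened whenever the current one fills up.
public class FileManagerSketch {
    private static final int MAX_FILES = 100; // per-directory limit (example)
    private final File root;
    private File currentDir;
    private int filesInCurrentDir = 0;
    private int dirCounter = 0;
    private long fileCounter = 0;
    private final List<String> unused = new ArrayList<String>();

    public FileManagerSketch(String rootPath) {
        root = new File(rootPath);
        currentDir = root;
        root.mkdirs();
    }

    // Hand out a fresh unique path, opening a new subdirectory if full.
    public synchronized String newFileName() {
        if (filesInCurrentDir >= MAX_FILES) {
            currentDir = new File(root, "sub" + (dirCounter++));
            currentDir.mkdirs();
            filesInCurrentDir = 0;
        }
        filesInCurrentDir++;
        return new File(currentDir, "obj" + (fileCounter++) + ".jar").getPath();
    }

    // Defer deletion: just remember that the file is no longer needed.
    public synchronized void markNotUsed(String name) { unused.add(name); }

    // Physically delete everything that was marked as unused.
    public synchronized void deleteUnused() {
        for (String name : unused) new File(name).delete();
        unused.clear();
    }
}
```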
3.7. The Protocols<br />
One of the principal focuses of ONTA is plug-in protocol support. This means that new protocols
can be added to the system on the fly, so security fixes or new security levels for file
transfer can be implemented without restarting the system or any part of it. Developing a new
protocol is very straightforward, as all that is needed is to implement two classes, one for the client
side and one for the server side. Each of these classes should implement a corresponding interface.
In the system there is only one hard-coded protocol, needed to transfer new protocols (a protocol
known beforehand by all parties is necessary to permit communication in the first place).
So, to add a new protocol, the programmer should implement these two classes and advertise
them with the advertiseProtocol method from the object repository. What needs to be implemented
on the server and on the client side of the protocol is presented in the following subsections.
We stress here that protocols are identified by their name system-wide, so one should not
deploy a protocol if another protocol with the same name exists. Since no central information
system has yet been developed, this is a big flaw in the system, as there is no way to tell whether
another protocol with the same name exists somewhere else in the system. This could end up with the client
side of one protocol trying to communicate with the server side of another protocol. For this
reason, adding protocols to the system should, for now, be done under centralized human control.
3.7.1. The Protocol Server<br />
The server side of the protocol mainly consists of one method, called acceptConnections, which
should be implemented in a server manner, that is, it waits for connections and, when a connection
is requested, it serves that connection and waits for a new one. Since the connections are
initiated by the client side of the protocol, there is no guideline on how to implement the requests
or the communication protocol.
The interface for the protocol server is actually an abstract class that should be extended (it
also contains the name of the protocol, which is set by the system; that is why it is not an interface):
public abstract class ProtocolServer {
    public abstract void acceptConnections();
    public abstract void pushFileSupport();
}
There is one other method that can be implemented, the pushFileSupport() method. This
method is called whenever a protocol server is registered and started, together with acceptConnections().
If it is desirable to implement a way for connections to be initiated from inside,
because the machine the protocol runs on is behind a firewall, this method should be implemented.
There is a corresponding pushFileSupport() method on the client side of the protocol to support this
approach. The idea is that some firewalls don't allow connections initiated from outside. If this
is the case, a server could not function properly behind that firewall without changing the rules of
the firewall, which is usually not desired and in some cases cannot be done. In this case, another
approach should be used, supported by the push file support. In this approach, a download should
be implemented by the following steps:
1. When a download is required, put a special message in JavaSpace to request that file on the
client side of the protocol, then wait for a connection; waiting for a connection is possible by
implementing a server mechanism (inside the client) with the help of the pushFileSupport()
method, which will be called when the client is instantiated;
2. On the server side, wait for requests in the form of messages in JavaSpace, in a loop located inside
a thread that should be started from the pushFileSupport method; important: do not place
an infinite (while (true)) loop inside the call itself, as this would block the system. When such
a request is received, open a connection to the client side of the protocol located on the
machine that requested the download and push the file to that machine;
3. On the client side, when a connection is initiated to the pushFileSupport server, download
the file and store it locally, as if it had been downloaded normally.
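The wait-loop pattern from step 2 can be sketched as below: pushFileSupport() itself only spawns a serving thread and returns immediately, and the thread polls with a timeout so it can be shut down. This is an illustration only; the BlockingQueue stands in for the JavaSpace request messages, and all names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch of the pushFileSupport pattern: the method must not loop
// forever itself (that would block the system), so it only starts a
// daemon thread that does the blocking wait-loop.
public class PushFileSupportSketch {
    private final BlockingQueue<String> requests = new LinkedBlockingQueue<String>();
    private volatile boolean running = false;
    public final List<String> served =
            Collections.synchronizedList(new ArrayList<String>());

    // Called once when the protocol server is registered.
    public void pushFileSupport() {
        running = true;
        Thread server = new Thread(new Runnable() {
            public void run() {
                while (running) {
                    try {
                        // Poll with a timeout so the loop can be stopped.
                        String host = requests.poll(100, TimeUnit.MILLISECONDS);
                        if (host != null) pushFileTo(host);
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
        server.setDaemon(true);
        server.start();
    }

    // A download request arriving as a message (stand-in for JavaSpace).
    public void request(String host) { requests.add(host); }

    public void stop() { running = false; }

    // In ALiCE this would open a connection to the requesting machine
    // and push the file; here it only records the host.
    protected void pushFileTo(String host) { served.add(host); }
}
```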
The default protocol offers no push file support.<br />
3.7.2. The Protocol Client<br />
The client side of the protocol is responsible for downloading a file from the server side of the
protocol when this is requested by calling the download method on an instance of the protocol
client class. The ProtocolClient class is an interface that should be implemented by any protocol
developed. The interface is:
public interface ProtocolClient {
    public File download(AliceURL url, AliceURL appl)
        throws IOException, FileNotFoundException;
    public void pushFileSupport();
}
The main method that should be implemented is the download() method. It receives two
parameters from the system, the first being the actual AliceURL of the file that should be downloaded
(protocol name, host address and file name on that host). The other parameter is the application on
behalf of which the file is downloaded, taken from the reference in JavaSpace that led to this
download. This is necessary in order to pass it on to the file manager (see section 3.6) to obtain a file
name under which to put the downloaded file. Then the call should return a File object representing the
downloaded file. An implementation could look like:
public File download(AliceURL url, AliceURL appl) throws
        IOException, FileNotFoundException {
    String fileName = ALICE.getFileManager().getFileName(appl);
    request_file_to_server(url.getHost(), url.getFile());
    File f = new File(fileName);
    receive_file_from_server(f);
    return f;
}
The push file support should be implemented following the directions and with the functionality
presented in subsection 3.7.1.
4. ALiCE Architecture and Implementation
A distributed system, especially a grid computing system, has several components residing at various
locations, working together for a common purpose. Each of these components has
its own unique roles and responsibilities. The components of ALiCE together provide
a functional and performant middleware for a general grid computing environment.
4.1. An Overview of the Components of the System
4.1.1. Three-tier architecture<br />
The ALiCE system must support three basic functions: allowing users to submit the applications<br />
that they wish to run, allowing users to contribute the computational power of their machines to<br />
the ALiCE system, and lastly resource management - matching resource demand with available<br />
resources.<br />
The same approach as in the previous version of ALiCE is taken, with some new elements. Basically,
we are using a three-tier architecture consisting of three elements: the consumer, which
submits new applications into the system; the resource broker, which does resource management
and scheduling with a centralized view of the system; and the producers, which execute the code
supplied by the applications. There is an additional component used for data files, the data server.
This can be the same machine as the resource broker.
4.1.2. System Components Overview
The components of the system have changed a little, since the architecture and the approach changed.
There are some new components and some old ones have different roles. All the components are
presented in figure 4.1 and the functionality of each of them is explained in the following.
[Figure 4.1.: ALiCE components - consumers, producers, resource brokers, data servers and Java/C task producers (for Sparc Solaris, Intel Solaris, Intel Linux and Intel Windows), connected through the Internet / LAN]
The connection between the components is made through the network: either a Local Area Network,
if the work environment is a cluster, or the Internet, if the system is deployed over wide area
connections. All the communication is done through JavaSpace and the means of communication
will be further detailed later on.
The components of the ALiCE GRID middleware are:<br />
The Consumer<br />
The consumer is the component submitting applications to the system. It can be any machine that is
connected to the ALiCE system through a LAN or through the Internet and that runs the ALiCE
consumer/producer components and GUI (ultimately, any machine connected to the Internet can
use the ALiCE GRID system). This means that the user will use a GUI to submit a file containing
the ALiCE application in a form specific to the language it is written in. For Java, this file is a JAR
archive which should contain at least one task generator class and one result collector class (see
chapter 3 for more details about the programming model). The task generator is transported inside
the ALiCE system in order for tasks to be generated, initialized and sent to the producers. The result
collector is executed at the consumer and it receives the results generated by the tasks created by the
task generator.
The consumer is also the point from which new protocols and new runtime supports can be
added to the whole system. Although the plug-in support for new protocols was tested and works,
some tuning is still badly needed, as well as a unified support for using the new protocols and
selecting among them for security purposes. The plug-in support for new languages is not
fully deployed yet.
The Producer<br />
The producer is a machine that has volunteered its cycles to run ALiCE applications. The producer
will receive tasks from the ALiCE system in the form of serialized live objects, will dynamically
load them and execute them. The results obtained from each task will be sent back so they can be received
by the consumer that originally submitted the application.
The producer and the consumer can actually be the same machine; this is the most usual
case, when someone who volunteers to run applications from others also wants to run his/her own
applications in ALiCE. In order to support this, the GUI for the system is unified for the producer
and the consumer.
The Resource Broker<br />
The resource broker is the central point of the system. Basically, the only thing it does is scheduling.
The scheduling is needed for many reasons. The first is to have control over resource
allocation and usage, since we are dealing with multiple concurrent applications in the system
at the same time.
Then there are the objective needs imposed by supporting languages other than Java. In contrast
with the portability and platform independence of the Java programming language, other
languages are platform-dependent and even library-dependent. The scheduler should then choose
an appropriate platform for the producer that should run the application. There are two types of
scheduling done at the resource broker's site: application scheduling and task scheduling.
Even though there are many approaches to scheduling, we chose to have a centralized scheduler,
with options to do part of the scheduling in a distributed fashion by means of pattern matching when
retrieving objects from JavaSpace.
For more information about the resource broker and the scheduling, see section 3.3.3.<br />
The Task Producer<br />
The task producer is a machine that is part of the ALiCE core (but need not be so - it can be
outside; see the detailed chapter about system components) and is meant to run the task generator
classes of the applications. This will generate tasks, which will be scheduled by the resource
broker and then downloaded by the producers directly from the task producer. The separation of
this machine from the resource broker (in the previous version, they were running on the same
machine) was done for two principal reasons:
- since we are supporting non-Java applications, those applications are platform-dependent and
not all of them can be run at the resource broker;
- in order to separate and isolate the central point of ALiCE, the resource broker, from any
alien code. Since the task producer runs code submitted by consumers, we don't have total control
over what that code does. Even with strongly enforced security and code safety measures, we can't
guarantee total security. So the decision was made to run the code on another machine and in this
way achieve total safety of the resource broker.
Each task producer runs either Java code (which can be run on any platform) or code
compiled for the platform that the task producer offers.
The Data Server<br />
The data server is a machine dedicated to data file storage. Any data file used by an application
can be submitted to the data server. From inside any task, the programmer can obtain access to a
data file submitted for the application that generated the task. Through the reference obtained,
the task can read or write chunks of any size from that file, from 1 byte to the size of the whole file. For
more details, see the discussion of the data server and data files in section 3.3.5.
4.2. The Communication Between Components of ALiCE
In ALiCE, all communication is initiated and supported through JavaSpace. This means that there
will be no communication between parts of the system that does not leave a trace in the space. In
future development, this will help with implementing a central accounting and monitoring scheme
that will be able to register all the communications that take place in the system.
JavaSpace is also used as the means to synchronize the components of the system.
Since the API of JavaSpace places at our disposal a set of blocking calls, we can use those calls to
synchronize different threads running on different machines in the ALiCE grid computing system.
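The synchronizing effect of a blocking take can be illustrated with a purely local analogue: the consuming thread blocks until another thread writes a matching entry, exactly as an ALiCE component blocks on JavaSpace. This is an illustration only; the SynchronousQueue below stands in for the space, and the entry content is invented for the example.

```java
import java.util.concurrent.SynchronousQueue;

// Local illustration of synchronization through a blocking take():
// the reader blocks until the writer's entry becomes available.
public class SpaceSyncSketch {
    public static String rendezvous() {
        final SynchronousQueue<String> space = new SynchronousQueue<String>();
        Thread writer = new Thread(new Runnable() {
            public void run() {
                try {
                    space.put("result-ready"); // the "write" into the space
                } catch (InterruptedException ignored) { }
            }
        });
        writer.start();
        try {
            return space.take(); // blocks until the writer's entry arrives
        } catch (InterruptedException e) {
            return null;
        }
    }
}
```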
All the objects transferred through JavaSpace in ALiCE fall into one of two categories: file references
or messages. These two categories are represented by the classes ObjectReference and
Message, which are the core elements of data transfer in ONTA. Corresponding to the two kinds of
objects, we have two types of communication: one implies the transfer of serialized objects through
the network, implementing a means to have mobile code in our grid system. The other one is just
for transferring information between components of the system: to advertise capabilities and resources,
to synchronize parts of ALiCE and for any other purpose whose final goal is the
transfer of a specific piece of information from one machine in ALiCE to another. In the following,
we will present each of these two kinds of communication in detail.
4.2.1. Communication Through Object References<br />
Communication between components of the system through the use of Object References is
implemented by ONTA. The steps involved in transferring an object from one part of the system to
another are presented in figure 3.1. For example, let's say that component A has an object Obj
that it wants to send to B, another component of the system. The stages in this operation are:
1. A serializes Obj and obtains a file containing the byte code for the live object, together with
all the classes that Obj needs, packed inside a JAR archive;
2. A creates an ObjectReference object corresponding to the file containing the serialized Obj
and writes it in JavaSpace;
3. a specialized thread in B that was waiting for an Object Reference of the type sent by A
takes the reference from the space;
4. B downloads the actual file containing the serialized object directly from A, using the protocol
indicated in the reference it took from the space;
5. B deserializes Obj from the downloaded file and gets the live instance for future use.
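The serialize / advertise / fetch roundtrip above can be sketched locally as follows. This is a simplified illustration: in ALiCE the file is a JAR that also holds the needed classes, and the reference travels through JavaSpace; here a plain serialized file and a small record stand in for both, and all names are invented for the example.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Local sketch of the object transfer: A serializes an object to a
// file and publishes a reference to it; B follows the reference and
// gets the live instance back.
public class ReferenceTransferSketch {
    // Minimal stand-in for an ObjectReference: the file's location.
    public static class Ref {
        public final String path;
        public Ref(String path) { this.path = path; }
    }

    // Steps 1-2, at component A: serialize Obj and advertise it.
    public static Ref advertise(Serializable obj, String path) throws IOException {
        ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(path));
        try { out.writeObject(obj); } finally { out.close(); }
        return new Ref(path);
    }

    // Steps 4-5, at component B: "download" the file and deserialize.
    public static Object fetch(Ref ref) throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(new FileInputStream(ref.path));
        try { return in.readObject(); } finally { in.close(); }
    }
}
```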
Basically, an ObjectReference is a pointer to a file containing a serialized object. It also contains
some information needed for differentiated matching of different reference types and some other
fields needed by ALiCE to handle the objects. The fields in an ObjectReference instance are:
URL - this field points to the location of the file containing the serialized object. It is not an
actual java.net.URL instance (this would pose too much overhead, with unneeded information),
but an instance of the class AliceURL, which contains just three string fields:
– host - the IP address (or the host name) of the host holding the file;
– protocol - the string ID identifying the protocol that should be used to download this
reference (for more on protocols, see section 2.7);
– file - the full path of the file containing the serialized object on the
machine that advertised it.
Type - this field helps differentiate between the different kinds of objects referenced by ObjectReference
instances in JavaSpace. The types currently supported by ALiCE are:
– application - for the transfer of a new application reference between a consumer and a
resource broker for scheduling;
– code - for the actual file transfer of a new application from the consumer to the task
producer, after the scheduler chose a task producer to run the task generator of that
application;
– data - for the transfer of data files from the consumer to the data server;
– protocol - for transferring new protocols from any component of the system to any
other component of the system;
– taskToSchedule - for sending a reference to a task that is ready to be scheduled from
the task producer to the resource broker;
– task - for the file transfer of a serialized task from the task producer to a producer after
scheduling by the resource broker;
– result - for transferring a serialized result from the producer that produced the result
back to the result collector running at the consumer that started the application, either
directly or through a result manager running on a resource broker's machine;
– userObject - this type of reference is used to point to files containing serialized objects
used in the communication between the components of an ALiCE application; these
objects are sent and requested by the application, from within the user code.
Application - usually, every reference to an object is related to an application, being a result
of the application, a task of the application, the application itself or a user object. Keeping
track of which reference is part of which application is very important to the system from
multiple points of view, such as identifying which results will be delivered to a result collector,
identifying which user objects are related to an application, or for purposes like security and accounting.
Also, this identification is important in order to be able to clean the space of all the references
to an application if this is needed or requested. The ID is also a URL, namely the URL that
the application was advertised with by the consumer that sent it into the system. This ID is
unique system-wide, given that it is composed of the IP address of the machine (which
uniquely identifies that machine among all others) and of the file name that the
application was contained in, which is also unique (see section 2.6).
Destination - most of the references are intended to be sent to one specific machine that
runs ALiCE, with some exceptions (such as when an application is first submitted and any resource
broker could take it for scheduling). In order to be able to do this, an addressing scheme
should be in place. Since one machine can only run one instance of ALiCE at a given time,
we can safely use IP addresses (which are guaranteed to be unique for
each computer). The destination field in an object reference will then contain the IP address
(or the DNS name) of the machine that this reference is intended for. No instance
of ALiCE will take the reference other than the one running on the computer with the given IP
address.
Identification - this field is a string identifier that serves different purposes, depending on
what kind of object reference it is. The most common use for this field is to name the
language that the referenced code is written in, in the case of all binary code transported in ALiCE
(for now, this is limited to Java and C); this is the case for all the reference types except
the protocol type. In this latter case, the identification field contains the string that identifies
the protocol name.
Platform information - there are some additional fields in an object reference that are used
in the case of references pointing to code objects (applications, tasks, results and so on); these
serve the purpose of defining the platform that the code can run on. There are two string
fields in this category, one defining the processor and one the OS that the referenced binary code
can run on. Depending on these fields and the language of the code, a producer can decide
whether or not it can run the code contained in the object pointed to by the reference.
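The fields listed above, and the wildcard-style matching JavaSpace performs when a component takes a reference by template, can be sketched as follows. The field names follow the text, but this is only a model: the matching code illustrates the idea, it is not the Jini matching engine.

```java
// Sketch of the fields an ObjectReference carries, plus
// JavaSpace-style template matching: a null field in the template
// matches anything, a non-null field must be equal.
public class ObjectReferenceSketch {
    public String host, protocol, file;  // AliceURL parts
    public String type;                  // application, code, task, ...
    public String application;           // owning application's ID
    public String destination;           // intended machine, or null
    public String identification;        // language or protocol name
    public String processor, os;         // platform information

    // Template matching on the fields a taker typically filters by.
    public boolean matches(ObjectReferenceSketch tmpl) {
        return (tmpl.type == null || tmpl.type.equals(type))
            && (tmpl.destination == null || tmpl.destination.equals(destination))
            && (tmpl.application == null || tmpl.application.equals(application));
    }
}
```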
4.2.2. Communication Through Messages<br />
Messages are used to communicate small amounts of information from one component of the system
to another. The basic class that contains a message is defined in Message.java and contains just
four fields:
source - the IP address or the host name of the machine that sent the message;
destination - the IP address or the host name of the machine that the message is addressed
to;
application - the application ID of the application that the message is related to; there are
cases when a message has no connection to an application (e.g. a message advertising the
presence of a data server), in which case the application field is null;
type - an integer defining the type of the message; this is the field that is indispensable for
matching against a template when taking messages from JavaSpace.
The message class is just a frame to build on, so for implementing specific inter-component communication
in ALiCE, the Message class is extended to contain fields specific to that message
type. When creating a new message extension class, the programmer should also define a unique
identifier as a constant in the Message class itself.
There are many examples of communication through messages in ALiCE. It is used for
advertising information (the presence of a data server, the handler of a data
file, the scheduler of an application etc.), for request-reply purposes (asking for the number of results
ready when results are stored at a resource broker, a producer getting the templates to run with,
etc.) or even for sending simple messages on behalf of an application (the sendStringMessage -
getStringMessage mechanism between a result collector and a task generator).
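Extending the message frame as described can be sketched like this. The concrete message, its field names and the constant's value are invented for the example; only the shape (four base fields plus a type constant defined in the Message class itself) follows the text.

```java
// Sketch of extending the Message frame class for a specific
// inter-component message.
public class MessageSketch {
    public static class Message {
        // Unique type identifiers live as constants in Message itself.
        public static final int TYPE_RESULT_COUNT_REQUEST = 7; // example value
        public String source, destination, application;
        public Integer type; // Integer (not int) so null can act as a wildcard
    }

    // A concrete message: a consumer asks a resource broker how many
    // results are stored for its application.
    public static class ResultCountRequest extends Message {
        public String replyTo; // where the broker should answer
        public ResultCountRequest(String src, String dst, String appl) {
            source = src; destination = dst; application = appl;
            type = TYPE_RESULT_COUNT_REQUEST;
        }
    }
}
```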
4.2.3. General Communication Scheme for an ALiCE Application<br />
[Figure 4.2.: ALiCE ObjectReference transfers - (1) application, (2) code, (3) taskToSchedule, (4) task, (5) result (option 1: direct; option 2: through the resource broker), (A) data and (B) data chunks, exchanged between the consumer, resource broker, task producer, producer and data server]
In this section we present the algorithm that is followed for running an application in the ALiCE
system. All the steps in running an application are illustrated in figure 4.2 by following all the
object references that are put into JavaSpace. There are many additional messages that are
transferred between the components of the system. We will look in detail at each component of the
system in section 3.3, System Components.
The general algorithm of running an ALiCE application is presented next:<br />
1. A user submits a new application into the system from a consumer machine, in the form of
a JAR archive containing all the classes needed, including at least a task generator class and a
result collector class; for full functionality at least one task class
and one result class should also be included.
2. The result collector class is dynamically loaded, instantiated and started at the consumer.<br />
3. An object reference with the type TYPE_APPLICATION and with no specified destination<br />
is placed in the space.<br />
4. The reference is taken by a resource broker, which calls a scheduling function that will choose
a task producer to run the task generator of the application. A new object reference with the
type TYPE_CODE and with the chosen task producer as destination is created, pointing to the
original application archive at the site of the consumer that submitted the application.
5. The task producer that was chosen to execute the task generator of this application takes<br />
the reference that was destined to it from JavaSpace and downloads the file containing the<br />
application from the consumer.<br />
6. The task producer dynamically loads the task generator class and all the classes needed by
the task generator (through references, these also include the task classes and the
result classes), instantiates it, then starts it. This will initiate the process of creating tasks.
7. Each time a new task is created and submitted into the system by the application for processing
(through the process() call from the task generator), the task is first serialized and
added to a JAR archive file that will contain the serialized form of the object, together with
all the classes used and referred to by this object (including the result class). Then a new reference
with the type TASK_TO_SCHEDULE is instantiated, pointing to the JAR archive just
created, with no destination, and the reference is placed in JavaSpace.
8. A resource broker takes the reference and calls the scheduler to choose a producer that should
run the task referred to by the reference; this producer can be changed later without
affecting the reference (see the subsection about the resource broker in the System Components
section). A new object reference with the type TYPE_TASK, pointing to the file containing
the serialized task on the task producer's machine and destined for the chosen producer, is placed
in JavaSpace.
9. The producer that was chosen at step 8 takes the reference from the space and downloads<br />
the file from the task producer. By using the Object Loader (see section 2.5), the object is<br />
dynamically loaded, together with all the classes needed by the object. The newly obtained<br />
task instance is started by calling the execute() method on it.<br />
10. The execution of the task returns an Object, which is the result of the task. The result<br />
is serialized and placed in a JAR archive, and a new object reference with the type<br />
TYPE_RESULT is created, pointing to that file, with the destination set according<br />
to the result delivery mode chosen when the application was submitted. The result is<br />
sent to the resource broker that scheduled the application if the results are handled<br />
through the resource broker, or directly to the result collector of the application, running at<br />
the consumer, if the result delivery mode is direct delivery.<br />
11. If the result delivery mode is direct delivery, the result manager running at the consumer that<br />
submitted the application takes the reference to the result from JavaSpace, downloads the file<br />
containing the result and dynamically loads the result object, storing it locally in memory.<br />
When the result collector requests a new result, this object is returned to it.<br />
12. If the results are sent back through the resource broker:<br />
a) The resource broker takes the reference from JavaSpace, downloads the file containing<br />
the serialized result pointed to by the reference and stores it on secondary<br />
storage.<br />
b) The result collector running at the consumer decides to check for results and sends a<br />
message to the resource broker asking how many results are stored there for the<br />
application this result collector runs for. If there are results ready, the<br />
result collector can get them one by one (so as not to fill the JavaSpace with result<br />
references). To get a new result, the result collector puts a request for it in<br />
JavaSpace.<br />
c) The result manager running on the resource broker’s machine takes the request (in the<br />
form of a message) from JavaSpace and finds a file containing a serialized result that<br />
belongs to the respective application. A new object reference with the type TYPE_RESULT<br />
is created pointing to that file on the resource broker’s machine, with the destination<br />
set to the consumer machine that runs the result collector, and is written in JavaSpace.<br />
d) The result manager running at the consumer that submitted the application takes the<br />
reference to the result from JavaSpace, downloads the file containing the result and<br />
dynamically loads the result object, storing it locally in memory. When the result<br />
collector requests a new result, this object is returned to it.<br />
13. When the application’s result collector returns, all the references that are related to the ap-<br />
plication that has just terminated are removed from JavaSpace.<br />
4.3. System Components<br />
This section explains the functionality and the implementation of each component in the ALiCE<br />
grid computing system. Every component consists mainly of threads performing various<br />
functions, some common to all the components, some specific to either the<br />
consumer, the resource broker, the producer/task producer or the data server.<br />
ALiCE is implemented as a multi-threaded application. The system is designed<br />
in such a way that the components are separated only by threads, so the same Java Virtual<br />
Machine can even run all the components at the same time. This means that it is possible<br />
to run a consumer and a producer on the same machine at the same time, or (more usefully) a<br />
resource broker and a data server on the same machine. All this is possible as long as<br />
running in the same JVM does not pose a problem to the functionality of the system.<br />
4.3.1. The Common Components<br />
There are some parts of the system that are common to all the components. The main common<br />
subsystem is ONTA, which is used for object transfer by the consumers, the resource brokers, as<br />
well as the producers, task producers an data servers. ONTA is composed of some objects, some<br />
servers and a thread to collect new protocols. All the objects that are instantiated and used in all<br />
the components of ALiCE, as well as the common threads are presented next.<br />
Every time an ALiCE instance is started, there is an initialization phase that is done first, by<br />
calling a static init method on the ALICE class. This does some initialization of objects that are<br />
used by any instance of ALiCE, and starts some threads.<br />
The Java Space<br />
In ALiCE, we use one common and unique space to transfer object references and messages.<br />
We tried the system with both JavaSpaces and GigaSpaces and, although it is a commercial<br />
implementation, GigaSpaces is much faster and more stable.<br />
Any instance of ALiCE will first obtain a reference to the space by doing a static lookup. We<br />
do not use the discovery protocol, since we want control over the use of the space, so the<br />
user of ALiCE should provide the location of the space when the system starts up. Any operation<br />
of writing, taking or reading objects to/from the space is done using the reference obtained<br />
during the lookup at system initialization time; this reference can be obtained with a<br />
static call on the ALICE class.<br />
We do not use a JavaSpace class directly, but a wrapper for the calls to the space, meaning that<br />
any call to take, write, read etc. objects from/to JavaSpace is actually a call to a method in the<br />
FastJavaSpace class implemented in ALiCE. The purpose of wrapping these calls is to be able to<br />
control access to the space. This is required for implementing security over who performs<br />
operations on the space, as well as for implementing a centralized monitoring/accounting<br />
manager.<br />
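The wrapper idea can be sketched as follows. This is a hypothetical, simplified stand-in: a LinkedList plays the role of the real JavaSpaces/GigaSpaces backend, and the names only echo the FastJavaSpace class described above; the point is that every operation funnels through one class, where access control or accounting can later be hooked in.<br />

```java
import java.util.LinkedList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of the FastJavaSpace idea: every write/take/read on the
// space goes through this single wrapper, giving one place to add access
// control, monitoring or accounting. A LinkedList stands in for the real
// JavaSpaces/GigaSpaces backend.
class FastJavaSpace {
    private final List<Object> entries = new LinkedList<>();
    private long writes = 0;                 // trivial accounting hook

    synchronized void write(Object entry) {
        writes++;
        entries.add(entry);
    }

    // Removes and returns the first entry matching the template, or null.
    synchronized Object take(Predicate<Object> template) {
        for (Object e : entries) {
            if (template.test(e)) { entries.remove(e); return e; }
        }
        return null;
    }

    // Returns a matching entry without removing it, or null.
    synchronized Object read(Predicate<Object> template) {
        for (Object e : entries) if (template.test(e)) return e;
        return null;
    }

    synchronized long writeCount() { return writes; }
}
```

A real implementation would delegate to the space reference obtained during the static lookup at initialization time instead of to the local list.<br />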
The ONTA System<br />
The ONTA system is common to any instance of ALiCE, be it a consumer, a resource broker, a<br />
producer, a task producer, a data server or a combination of these; it has several components that<br />
are initialized and started by the initialization procedure of ALiCE.<br />
The File Manager - this controls the file naming and storage in ALiCE; it is instantiated<br />
with a root directory at system startup. For more details, see section 2.6;<br />
The Object Repository - through calls to methods of this class, it is possible to place<br />
object references or messages in the space, using a nominated protocol;<br />
The Remote Object Loader - this class provides the methods for taking object references<br />
and messages from the space, as well as for downloading the files referred to by object<br />
references. Once instantiated, this class also starts the new protocol download thread.<br />
This thread waits for an object reference pointing to a new protocol to appear in JavaSpace.<br />
When such a reference arrives, the thread downloads the file containing the<br />
code for the protocol and dynamically loads and instantiates it. This provides server<br />
support by starting the thread for the new protocol server, as well as an instance of the<br />
protocol client, giving the means to download files using the new protocol. After starting the<br />
protocol, it is registered with the ONTA registry;<br />
The ONTA Registry - there is an instance of the ONTA registry on each machine running<br />
ALiCE. It stores all the known protocols, and both the client-side and the server-side<br />
instances can be retrieved from it when needed. The registry<br />
also handles the unique protocol naming by not allowing two protocols with the same name<br />
to register.<br />
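The registry's behavior can be sketched as below. This is a hypothetical stand-in (plain Objects represent the protocol client and server sides, and the names are illustrative); it only shows the two properties described above: per-name storage of both sides of a protocol, and refusal of duplicate names.<br />

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the ONTA registry: one instance per machine, storing
// the client and server side of every known protocol under a unique name.
// register() refuses a second protocol with an already-used name.
class OntaRegistry {
    static final class Protocol {
        final Object client;   // stands in for the protocol's download client
        final Object server;   // stands in for the protocol's server thread
        Protocol(Object client, Object server) {
            this.client = client;
            this.server = server;
        }
    }

    private final Map<String, Protocol> protocols = new HashMap<>();

    // Returns false (and changes nothing) if the name is already taken,
    // which is how unique protocol naming is enforced.
    synchronized boolean register(String name, Object client, Object server) {
        if (protocols.containsKey(name)) return false;
        protocols.put(name, new Protocol(client, server));
        return true;
    }

    synchronized Protocol lookup(String name) { return protocols.get(name); }
}
```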
The Server Threads<br />
At the instantiation of the remote object loader, all the protocols built into the system are<br />
started and registered with the ONTA registry; there is a thread running for each protocol<br />
the system knows. The server side of each protocol built into ONTA consists of a thread<br />
started during the system initialization phase. In addition, for each new protocol<br />
received from another ALiCE instance during normal operation, a thread is started<br />
on every ALiCE instance.<br />
Each protocol thread serves the requests made by the client side of the same protocol<br />
running on another machine, which uses the respective protocol to download a file containing<br />
a serialized object or a data file.<br />
The Shutdown Thread<br />
Common to all ALiCE instances is a thread used for graceful system shutdown. The prob-<br />
lem is that if an instance of ALiCE is stopped by killing its process, there will<br />
be leaks in the system, such as references and messages in the space belonging to a dead<br />
application, as well as unused files that were not deleted. To provide a clean shutdown, a<br />
special thread listens on a predefined TCP/IP port for a connection. If such a connection<br />
is initiated and the shutdown is confirmed by receiving a certain string, all the references<br />
related to the host running the ALiCE instance that received the shutdown request are removed<br />
and all the locally stored files are deleted.<br />
This also provides the means to do a remote shutdown of an ALiCE instance, if desired,<br />
because the communication is implemented using TCP/IP sockets.<br />
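The shutdown listener can be sketched as follows. This is a hypothetical stand-in: the confirmation string and the cleanup action are illustrative (in ALiCE the cleanup would remove this host's references from the space and delete its local files), and the class name is not from the original code.<br />

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch of the graceful-shutdown thread: it listens on a TCP
// port and, when a connection delivers the agreed confirmation string, runs a
// cleanup action before the process exits.
class ShutdownListener extends Thread {
    private final ServerSocket server;
    private final String confirmation;
    private final Runnable cleanup;

    ShutdownListener(int port, String confirmation, Runnable cleanup) throws Exception {
        this.server = new ServerSocket(port);  // port 0 picks a free port
        this.confirmation = confirmation;
        this.cleanup = cleanup;
        setDaemon(true);
    }

    int port() { return server.getLocalPort(); }

    @Override public void run() {
        try (Socket s = server.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream()))) {
            // Only a matching confirmation string triggers the cleanup.
            if (confirmation.equals(in.readLine())) cleanup.run();
        } catch (Exception ignored) {
        }
    }
}
```

Because the trigger is an ordinary TCP connection, the same mechanism also allows a remote shutdown, as the text notes.<br />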
4.3.2. The Consumer<br />
The consumer is the component of ALiCE that has two main purposes: it submits new applications<br />
into the system and it starts the result collector, providing the means to get new results from the<br />
system as they are produced by the producers.<br />
Each language supported for ALiCE applications (Java and C support are implemented at the<br />
time this report is written) has its own consumer class, which extends the alice.consumer.Consumer<br />
class. To make other language support as easy as possible to develop, all that a consumer<br />
needs to implement by extending this class is the method:<br />
Figure 4.3.: An example of an ALiCE consumer instance (a Consumer with JavaConsumer and<br />
native-C CConsumer subclasses; result collectors for applications 1..n, each with its result<br />
retrieval threads; results stored at the resource broker)<br />
public void start(File pack, String taskGenerator, String resultCollector, LinkedList dataFiles,<br />
Boolean resByRB, int threads, String proto);<br />
This method is called by the GUI each time a new application is submitted into the sys-<br />
tem, on the consumer implementation instance for the language the application is written in. The<br />
method starts the result collector of the application by dynamically loading and instantiating the<br />
class provided by the user in the archive. That class should extend the class<br />
alice.result.ResultCollector. By extending this class, the result collector implementation gets access<br />
to the calls it needs to collect the actual results (inherited from the ResultCollector class), like<br />
getNewResult() or getResultsNoReady().<br />
The implementation of the Consumer class for any language should create a new thread to run<br />
the result collector, initialize it with the data it needs, start it and<br />
then return. The ResultCollector class therefore extends the java.lang.Thread class.<br />
This thread model is imposed so that result collectors for multiple applications can be<br />
run inside the same ALiCE instance by repeatedly calling the start() method on new consumer<br />
instances.<br />
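The thread model above can be sketched as follows. This is a hypothetical, simplified stand-in for the alice.result.ResultCollector class: only the Thread inheritance and the collect() entry point are taken from the text; the field and method names are illustrative.<br />

```java
// Hypothetical sketch of the thread model: ResultCollector extends Thread, so
// the language-specific Consumer subclass can load a user class, call start()
// to run it in its own thread, and return immediately. Several result
// collectors for different applications can therefore run in one JVM.
abstract class ResultCollector extends Thread {
    protected String applicationId;

    // Initialization performed by ALiCE before the collector is started.
    void init(String applicationId) { this.applicationId = applicationId; }

    // Entry point of the user's result collector.
    abstract void collect();

    @Override public final void run() { collect(); }
}

// A trivial user-supplied collector, standing in for a real one that would
// loop over getNewResult() calls.
class DemoCollector extends ResultCollector {
    volatile boolean collected = false;

    @Override void collect() { collected = true; }
}
```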
For the Java language, the class extending the Consumer class is named JavaConsumer. The GUI<br />
instantiates this class for each application that is submitted into the system and calls the start()<br />
method on each new instance, providing the specific parameters needed for each application. Mul-<br />
tiple applications can run at the same time (this translates into many result collectors for different<br />
applications, or even different instances of the same application) on the same ALiCE consumer in<br />
one Java Virtual Machine.<br />
The structure of an ALiCE consumer instance is presented in figure 4.3.<br />
Starting a New Application<br />
For each new Java application, a new instance of the class JavaConsumer is created and then the<br />
start() method is called on that instance, providing the application’s specific parameters. This<br />
creates a new thread that is also implemented in the JavaConsumer class.<br />
The thread will first unpack the archive file containing the application; any new Java application<br />
should be submitted into the system as a JAR archive containing all the classes needed and used<br />
by the application. The user should also provide the class names for the task generator and the<br />
result collector. If there are any data files used by the application, they should be advertised by<br />
specifying the file names in a linked list provided as a parameter to the start() method. There are<br />
two other parameters sent to the start() method, both regarding result collection. One of<br />
them indicates the number of result collecting threads that should be started. The other<br />
indicates the result delivery mode, which is chosen by the user that submits the application. For<br />
more details see the Result Collection paragraph below.<br />
The consumer will first choose a running directory. Running directories are predefined loca-<br />
tions that the classes of an application are unpacked into and run from. These directories should<br />
also be in the CLASSPATH environment variable for the result collector to work. The record of<br />
used running directories is kept statically by the superclass of all consumer classes, the Consumer<br />
class. After choosing a running directory, the Java consumer copies the entries from the JAR<br />
archive to that directory, one by one; if there are any directory entries in the archive, these<br />
subdirectories are also created in the chosen running directory and the entries in them are<br />
copied to the right location. After this, a new ALiCE JAR archive is created by the Object<br />
Writer, containing the task generator and all the classes it needs and refers to. Then a new object<br />
reference with the type TYPE_APPLICATION, not destined to anyone, pointing to the newly<br />
created archive, is written in JavaSpace, where it will be taken by a resource broker that will schedule<br />
the application. If the linked list of data files contains entries, each entry is advertised to a<br />
data server; the location of the data server is found by reading a message advertising a<br />
data server in the space. The data server downloads all advertised data files and has them<br />
available for future use by the tasks of the application.<br />
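The JAR-unpacking step described above can be sketched as follows. This is a hypothetical stand-in (the class name is not from the original code); it shows entries being copied one by one into the running directory, with subdirectories recreated so that the classes land where the class loader and CLASSPATH expect them.<br />

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical sketch of unpacking an application archive into a chosen
// running directory, recreating any subdirectories found in the archive.
class ArchiveUnpacker {
    static void unpack(Path jarPath, Path runningDir) throws Exception {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                Path target = runningDir.resolve(entry.getName()).normalize();
                if (!target.startsWith(runningDir)) continue; // skip escaping entries
                if (entry.isDirectory()) {
                    Files.createDirectories(target);          // recreate subdirectory
                } else {
                    Files.createDirectories(target.getParent());
                    try (InputStream in = jar.getInputStream(entry)) {
                        Files.copy(in, target);               // copy the entry's bytes
                    }
                }
            }
        }
    }
}
```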
Next, the result collector class from the archive is dynamically loaded, instantiated and<br />
initialized, and the collect() method is called on the new instance. This method is the entry<br />
point to the result collector of the application.<br />
The Result Collector<br />
The result collector is one of the main components of an application, the one that does the re-<br />
sult collection and visualization. Any result collector for an application should extend the class<br />
alice.result.ResultCollector. As mentioned before, for each application this class is dynamically<br />
loaded, initialized and run by ALiCE. Starting the result collector simply means calling the<br />
collect() method, its entry point.<br />
Initializing the result collector means calling the init() method from the ResultCollector class,<br />
which any result collector of an application inherits by extending this class. The initial-<br />
ization sets the right values for some fields in the result collector subclass instance, like the<br />
application ID of the application this result collector belongs to (so it knows which results to take<br />
from the space or to request from the resource broker that stores them), the result delivery mode<br />
and the running directory that the result collector runs in. Also, at the initialization of the result<br />
collector, the user specifies (in the parameters of the Consumer.start() method) the number of result<br />
retrieval threads to start. These threads are intended to enhance the performance of the result col-<br />
lector: a number of threads listen for new results in JavaSpace and, as soon as a result is<br />
ready and an object reference to it is found in the space, the result is downloaded by one of these<br />
threads and stored locally on the machine that runs the result collector. If the application<br />
starts a very large number of tasks, and hence a large number of results is expected to arrive at a<br />
high rate, having more threads doing the result retrieval work is useful: with only one thread, the<br />
bottleneck would be that while one result is downloaded, others could be ready with nobody to<br />
download them. With multiple threads, many results can be downloaded at the same time and they<br />
are available to the result collector at a faster rate.<br />
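The retrieval threads can be sketched as follows. This is a hypothetical stand-in: a BlockingQueue plays the role of JavaSpace, download() is a placeholder for the real file transfer through ONTA, and the class and method names are illustrative. It shows the key point above: several threads block on the space and append downloaded results to a local vector from which the collector is served.<br />

```java
import java.util.Vector;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the result retrieval threads: each thread waits for a
// result reference in the space (a BlockingQueue stands in for JavaSpace),
// "downloads" the result it points to, and stores it in a local vector.
class ResultRetriever {
    private final BlockingQueue<String> space;          // result references for this app
    private final Vector<String> localResults = new Vector<>();

    ResultRetriever(BlockingQueue<String> space) { this.space = space; }

    void startThreads(int n) {
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String ref = space.take();       // wait for a result reference
                        localResults.add(download(ref)); // fetch and store locally
                    }
                } catch (InterruptedException e) { /* shut down */ }
            });
            t.setDaemon(true);
            t.start();
        }
    }

    // Placeholder for the actual file transfer and object loading.
    private String download(String ref) { return "result-for-" + ref; }

    int resultsReady() { return localResults.size(); }

    // Returns and removes the first locally stored result, or null if none.
    String getNewResult() {
        return localResults.isEmpty() ? null : localResults.remove(0);
    }
}
```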
So, how do the methods from the ResultCollector class that return new results, or the number<br />
of new results, work? The functionality differs depending on the result delivery mode.<br />
If the results are delivered directly to the consumer, the results are available locally as soon as<br />
they are downloaded from the producers by the result retrieval threads. The number of results<br />
equals the number of elements in the local vector storing the result objects, and getting a new result<br />
means just returning (and removing) the first element of this vector. This result delivery mode is<br />
recommended, as it is much faster and imposes much less overhead on the system.<br />
If the results are delivered through the resource broker, the files containing them are stored<br />
there. Finding out the number of results that are ready means sending a message to the resource<br />
broker that stores the results for this application and waiting for a reply containing the number<br />
of results ready. Getting a new result translates into sending a special request message and<br />
then waiting for a corresponding object reference to be placed by the resource broker in the space.<br />
We only put one result object reference at a time in the space, in order to limit the space<br />
occupied in JavaSpace by these references. If an application had, say, 1000<br />
tasks that returned results, it would be unacceptable to keep all the references to them in the space<br />
until the application gets them. This result delivery mode should be chosen by applications<br />
that will take a long time to have all the results ready, when the consumer does not have a permanent<br />
connection to the Internet and will come back on-line later to get the results.<br />
The Consumer Threads<br />
To sum up the information in this section: the threads running on any ALiCE consumer<br />
instance, beside the ones common to all ALiCE components, are one thread running<br />
the result collector for each application; each of these threads starts a variable, user-specified<br />
number of result retrieval threads at the initialization of the result collector. Each of these<br />
latter threads gets new results for the application and stores them locally.<br />
The Consumer GUI<br />
The Consumer GUI, presented in figure 4.4, provides the user with easy access to new<br />
application submission and to the system output, as well as each application’s output. For<br />
each new application submitted, a new tab is added to the consumer GUI panel.<br />
4.3.3. The Resource Broker<br />
The resource broker is the central component of ALiCE, the one that runs the scheduler. This is<br />
the point where accounting and monitoring will be implemented, as all tasks and applications go<br />
through the scheduler.<br />
The approach taken in implementing the resource broker is very scalable, as there can be many<br />
resource brokers in the system, without any of them being aware of the presence of the others.<br />
When a task or application needs scheduling, one of the running resource brokers takes the<br />
reference from the space and handles it. This means that the references are scheduled much<br />
faster than when using just one resource broker.<br />
Figure 4.4.: The Consumer GUI<br />
An important feature of the design of the resource broker is that it is totally isolated from any<br />
alien code. Even though ALiCE is a grid middleware and deals heavily with mobile code from<br />
outside sources, the model keeps all the code that runs at the resource broker’s site inside<br />
ALiCE classes. This means that no outside code is run on the resource broker. The previous<br />
version of ALiCE ran the task generators on the resource broker, and this posed a very<br />
high security risk, which has been completely eliminated by moving the task generator execution<br />
to the task producers. In this way the resource broker, which is a centralized point and hence a<br />
single point of failure in the system, is much more secure.<br />
The resource broker also handles the results for applications that have chosen the resource-<br />
broker-stored result delivery mode.<br />
The Scheduler<br />
As mentioned before, the main purpose of the resource broker is running the scheduler. The<br />
scheduling process has two components: scheduling new applications to task producers and schedul-<br />
ing new tasks to producers. This translates into two types of object references that should be taken<br />
from JavaSpace (TYPE_APPLICATION for new applications and TYPE_TASK_TO_SCHEDULE<br />
for new tasks); then the appropriate machine should be chosen and new references (with the types<br />
TYPE_CODE and TYPE_TASK respectively) should be created, with the destination set to the<br />
chosen machines, and written in the space.<br />
Figure 4.5.: Tagging Producers (timeline of events: when a producer comes on-line, it creates a<br />
capabilities list and sends it through JavaSpace to the resource broker, which creates a templates<br />
list and sends it back; the producer then starts its task execution threads and begins executing<br />
user code; when the producer needs to be re-tagged, the broker creates a new templates list and<br />
the producer updates the templates used to take references from the space)<br />
The scheduling process looks simple, but it is not quite so. We should keep in mind that we are<br />
dealing with a highly dynamic system, namely a grid computing system, which has a property not<br />
common to other computing systems: it has at its disposal a large number of resources that are<br />
changing rapidly. In this environment the classical approach to scheduling, which is to take a new<br />
task and tag it to be run on some work engine, is not feasible any more. It would mean that if we<br />
delegate a task to be run by some producer, tag it to reflect this and write it in JavaSpace, and<br />
that producer later goes down (which happens very often in a grid system), the resource broker<br />
would have to take the reference back and re-tag it, causing tremendous overhead.<br />
Another approach would be to group the producers in previously established groups and tag<br />
a task to such a group of producers.<br />
But this is not a good approach either, as it means developing a complex algorithm for producer<br />
grouping, and we could end up having no producers to run a task, even though there are<br />
available producers in the system.<br />
The approach we implemented is not to tag the task references written in the space for a<br />
particular producer, but rather to tag a producer to take particular references from the space. In<br />
this approach we can change the destination of a task just by delegating another producer to run it,<br />
without modifying the reference we put in JavaSpace. This approach also permits the grouping<br />
of producers, but this time dynamically, and means that we can change which tasks a producer<br />
will run without restarting the producer and without any influence on the already scheduled tasks<br />
in JavaSpace. So if, say, we delegate one producer to run a certain task and then the<br />
producer goes down, we can simply tag another producer to run that task.<br />
The implementation of this approach relies on the usage of JavaSpace for object reference<br />
transfer. A producer takes from the space only references that match a certain template. Besides<br />
the application it belongs to, an object reference contains a series of fields that can differentiate<br />
between references linked to the same language and the same platform. And if a new scheduler is<br />
developed that needs more information inside an object reference, the ObjectReference class can<br />
be extended at any time with new fields, information that could be used by, or only be known<br />
to, the scheduler.<br />
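Template matching on object references can be sketched as follows, in the style of JavaSpaces matching, where a template's null fields match anything and non-null fields must be equal. This is a hypothetical, minimal stand-in for the real ObjectReference class; the field names are illustrative.<br />

```java
// Hypothetical sketch of template matching on object references: a producer
// tagged with a set of templates takes from the space only the references
// that match one of them.
class ObjectReference {
    final String type;          // e.g. "TYPE_TASK"
    final String language;      // e.g. "java"
    final String destination;   // host tagged to take this reference, if any

    ObjectReference(String type, String language, String destination) {
        this.type = type;
        this.language = language;
        this.destination = destination;
    }

    // Does this reference match the given template? Null template fields
    // match anything, non-null fields must be equal.
    boolean matches(ObjectReference template) {
        return (template.type == null || template.type.equals(type))
            && (template.language == null || template.language.equals(language))
            && (template.destination == null || template.destination.equals(destination));
    }
}
```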
So, to tag a producer to run specific tasks, all we need to do is provide that producer with some<br />
templates to use when matching object references to tasks from JavaSpace. When a producer<br />
starts up, it first advertises its capabilities, that is, a list of byte code it knows how to execute.<br />
Each capability is actually a collection of three strings: one naming the language the producer<br />
has runtime support for, one the processor of the machine it runs on and the third the OS it<br />
runs under. After the capabilities are advertised, the resource broker creates for each producer a<br />
list of templates and sends those templates back to it. Once it receives the first templates, the<br />
producer is ready to work and starts the task executor threads. The process is illustrated in<br />
figure 4.5. If at a later time the resource broker decides to change the templates for a producer,<br />
that is, to re-tag it to run some other kind of tasks, all it needs to do is create a new list of<br />
templates, put it in a special message and send the message to the producer. Each<br />
producer has a special thread that waits for template updates; when the message is received, it<br />
automatically updates the templates the producer runs with.<br />
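The tagging exchange above can be sketched as follows. This is a hypothetical stand-in: the class names and the string form of the templates are illustrative, and the message transport through JavaSpace is left out; the sketch only shows a producer advertising its capabilities and having its template list replaced at run time without a restart.<br />

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of producer tagging: the producer advertises its
// capabilities (language, processor, OS), the resource broker answers with a
// list of templates, and a later update message replaces the templates the
// producer matches tasks against.
class TaggedProducer {
    static final class Capability {
        final String language, processor, os;
        Capability(String language, String processor, String os) {
            this.language = language; this.processor = processor; this.os = os;
        }
    }

    private final List<Capability> capabilities;
    // CopyOnWriteArrayList lets the task-executor threads keep reading while
    // the template-update thread rewrites the list.
    private final List<String> templates = new CopyOnWriteArrayList<>();

    TaggedProducer(List<Capability> capabilities) { this.capabilities = capabilities; }

    // Sent to the resource broker when the producer comes on-line.
    List<Capability> advertise() { return capabilities; }

    // Called by the template-update thread when a broker message arrives.
    void updateTemplates(List<String> newTemplates) {
        templates.clear();
        templates.addAll(newTemplates);
    }

    // Would decide whether a reference in the space is taken by this producer.
    boolean accepts(String taskTag) { return templates.contains(taskTag); }
}
```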
Summing all this up, we can describe the components of a resource broker. They are presented<br />
in figure 4.6 and are described next:<br />
The Scheduler Module<br />
The scheduler module is the most important module in the resource broker. It does the task of<br />
choosing the producers that should run any new task submitted into the system and the task<br />
producers to run new applications.<br />
The design of the scheduler permits having multiple scheduler implementations, although only<br />
one implementation can be used at a time. To implement a new scheduler, the ALiCE programmer<br />
should just extend the Scheduler class. This class consists of two methods and a running thread.<br />
The methods are the calls used to schedule a new application/task, that is, to return a string<br />
containing the IP address or host name of the machine that was delegated to run the application/task<br />
referred to by the object reference passed as a parameter. The scheduler that will be used is chosen<br />
at the start of the ALiCE instance for the resource broker.<br />
The scheduler module consists of a thread that waits for an object reference for a new<br />
application or a new task to appear in JavaSpace. When such a reference is retrieved, one of the two<br />
scheduling functions is called with the current reference as a parameter, using the instance of the<br />
currently chosen scheduler; which of the two scheduling functions is called depends on the reference<br />
being an application reference or a task-to-schedule reference.<br />
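The Scheduler contract can be sketched as follows. This is a hypothetical, simplified stand-in for the real Scheduler class (here the references are plain strings and the names are illustrative); a trivial round-robin implementation shows how a new scheduler plugs in by extending the class.<br />

```java
import java.util.List;

// Hypothetical sketch of the Scheduler contract: an implementation returns
// the host delegated to run the application or task behind a reference.
abstract class Scheduler {
    // Returns the host chosen to run the application behind this reference.
    abstract String scheduleApplication(String applicationRef);

    // Returns the host chosen to run the task behind this reference.
    abstract String scheduleTask(String taskRef);
}

// A trivial round-robin scheduler over the known producers, standing in for
// a real implementation backed by the information system database.
class RoundRobinScheduler extends Scheduler {
    private final List<String> producers;
    private int next = 0;

    RoundRobinScheduler(List<String> producers) { this.producers = producers; }

    private synchronized String nextProducer() {
        String host = producers.get(next);
        next = (next + 1) % producers.size();
        return host;
    }

    @Override String scheduleApplication(String ref) { return nextProducer(); }
    @Override String scheduleTask(String ref) { return nextProducer(); }
}
```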
In the process of delegating new applications to task producers and new tasks to producers,<br />
the scheduler module needs an information repository to keep track of the producers in<br />
the system. This will be used for accounting in future developments, but for now it is used just for<br />
scheduling. All this information is kept in the Information System Database.<br />
The Producer Tagging Module<br />
The purpose of this module is to send templates and template updates to the producers. When a<br />
producer comes on-line, it advertises its capabilities, and the role of the resource broker is to send<br />
the templates to work with back to the producer. This task is accomplished by the producer tagging<br />
module, implemented as the run method of the running scheduler. Since tagging and re-<br />
tagging the producers depend on the chosen scheduler, the producer tagging module is entirely<br />
implemented in each scheduler implementation.<br />
If the scheduler decides to change the templates for a particular producer, all it needs to do is<br />
send an appropriate message to that producer with a list of updated templates. The new templates<br />
replace the old ones in that producer and are used as soon as they are received. The<br />
change of templates can be the result of a change in the topology of the system (a producer<br />
joining/leaving) or of a new application with a higher priority being submitted into the<br />
system.<br />
The Result Manager Module<br />
The result manager module is the module that stores the results of the applications that chose to<br />
have the results delivered back through the resource broker.<br />
4. ALiCE Architecture and Implementation<br />
When a new application is scheduled, if the result delivery mode is resource-broker-stored, the<br />
resource broker will also put a message in JavaSpace stating that all the results for that application<br />
should be sent back to it. Any producer that has a result ready for such an application will look for<br />
and read this message, and will send the result to the resource broker that<br />
scheduled the application.<br />
The resource broker will download any such result and store the file references for all the results<br />
in an internal hash table of vectors, indexed with the application ID.<br />
When an application’s result collector later decides to retrieve results, this is done via a<br />
request-reply approach. First, the result collector can find out from the resource broker how many<br />
results are ready for the application it runs for, by sending a special request message to the resource<br />
broker that stores the results (the address of this resource broker can be found by reading the<br />
message through which it advertised that it will hold the results for that application). Inside the<br />
resource broker, the result manager module thread will reply to any such request message with a<br />
reply message.<br />
After finding out how many results are ready, the result collector can get the results one by one<br />
by placing (for each of the results) a new result request message in JavaSpace, destined for the<br />
resource broker storing the results. In reply to this request message, the result manager module<br />
of the resource broker will write into the space an appropriate object reference that points to the<br />
file locally stored by the result manager.<br />
The Information System Database<br />
Any scheduler other than the eager scheduler needs a database of all the producers available,<br />
recording which of them are free. Since the eager scheduler was the only scheduler implemented<br />
at the time this report was written, this component of the resource broker is not yet implemented.<br />
The information system database should also be used for monitoring and especially for accounting<br />
purposes.<br />
Implementing a Scheduler<br />
Implementing a scheduler for ALiCE means extending the Scheduler abstract class, which consists<br />
of two abstract methods:<br />
public abstract class Scheduler extends Thread {<br />
    public abstract String scheduleTaskGenerator(ObjectReference code);<br />
    public abstract String scheduleTask(ObjectReference taskToSched);<br />
}<br />
Figure 4.6.: Resource Broker components (the scheduler module with its pluggable schedulers — eager, round-robin or any other — and the Information System Database, the producer tagging module and the result manager module, exchanging references and messages through JavaSpace/GigaSpace)<br />
The first method, scheduleTaskGenerator(), is called by the broker whenever a new application<br />
reference is retrieved through the space. The call should return, in the form of a string, the IP address<br />
or the host name of the task producer chosen to run the task generator for the new application. The<br />
second method, scheduleTask(), should return the address/name of the producer delegated<br />
to run any new task to schedule that was taken by the broker from the space. These methods should<br />
make use of an information system that should also be implemented as part of the scheduler.<br />
The scheduler should also implement the public void run() method from java.lang.Thread. This<br />
thread is started by the broker and is the one that should send the new-templates messages to<br />
any new producer, as a reply to the message carrying the list of its capabilities. The scheduler<br />
can then make changes to the templates a particular producer uses by sending a new message to<br />
update its templates.<br />
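For instance, a round-robin scheduler (shown in figure 4.6 but not implemented at the time of writing) would need little more than a rotating selection over the registered producers. The sketch below shows only that selection logic, with plain strings standing in for producer addresses; it is an assumption about how such a scheduler might pick producers, not ALiCE code.<br />

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative round-robin producer selection for a hypothetical
// round-robin scheduler; assumes at least one producer is registered
// before next() is called.
public class RoundRobinSelector {
    private final List<String> producers = new ArrayList<>();
    private int cursor = 0;

    public synchronized void register(String address) {
        producers.add(address);
    }

    // Returns producer addresses in strict rotation.
    public synchronized String next() {
        String chosen = producers.get(cursor);
        cursor = (cursor + 1) % producers.size();
        return chosen;
    }

    public static void main(String[] args) {
        RoundRobinSelector s = new RoundRobinSelector();
        s.register("ws00"); s.register("ws01"); s.register("ws02");
        for (int i = 0; i < 4; i++) System.out.println(s.next());
    }
}
```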
Example: the Eager Scheduler<br />
The only scheduler implemented at the time of writing this report is the eager scheduler. It is<br />
the simplest possible approach to scheduling: it does no tagging; instead, it creates the most<br />
general templates and sends them to each producer, instructing it to run any task or task<br />
generator it knows how to run.<br />
When a new application or task is scheduled, there is no particular destination returned, but<br />
rather a special wild-card string from the ObjectReference class that will match any producer’s<br />
address. Thus, all the producers will run both tasks and task generators, and they will begin running a<br />
new task or task generator as soon as they are free.<br />
We present next the source code for this scheduler implementation, since it is very basic and<br />
can be used as a hint of what one would need in order to implement a new scheduler:<br />
package alice.broker;<br />
import alice.onta.common.*;<br />
import alice.runtime.*;<br />
import alice.ALICE;<br />
public class EagerScheduler extends Scheduler {<br />
    public String scheduleTaskGenerator(ObjectReference code) {<br />
        return ObjectReference.toAnyone;<br />
    }<br />
    public String scheduleTask(ObjectReference taskToSched) {<br />
        return ObjectReference.toAnyone;<br />
    }<br />
    public void run() {<br />
        while (true) {<br />
            Message rpmTemplate = new Message(null, null, null,<br />
                Message.TYPE_REGISTER_PRODUCER);<br />
            RegisterProducerMessage msg = (RegisterProducerMessage)<br />
                ALICE.getRemoteObjectLoader().takeMessage(rpmTemplate);<br />
            ProducerTemplateMessage ptm = new ProducerTemplateMessage(ALICE.myIP(), msg.source);<br />
            ObjectReference templ;<br />
            for (int i = 0; i < msg.cap.size(); i++) {<br />
                Capability c = (Capability)(msg.cap.elementAt(i));<br />
                templ = new ObjectReference(null, ObjectReference.TYPE_CODE, c.language);<br />
                templ.proc = c.processor; templ.os = c.os;<br />
                ptm.addTemplate(templ);<br />
                templ = new ObjectReference(null, ObjectReference.TYPE_TASK, c.language);<br />
                templ.proc = c.processor; templ.os = c.os;<br />
                ptm.addTemplate(templ);<br />
            }<br />
            ALICE.getObjectRepository().sendMessage(ptm);<br />
        }<br />
    }<br />
}<br />
4.3.4. The Producer and the Task Producer<br />
The producers are the components of ALiCE that execute the user code of the applications<br />
submitted into the system. There are two kinds of producers: the task producers and the actual<br />
producers.<br />
The task producers are the machines that execute task generators. These machines are usually<br />
(but need not be) in the central node of ALiCE, under the same control as the resource broker.<br />
In the previous version of ALiCE, the task generators were executed on the same machine that<br />
the resource broker was running on. The current version separates the machine that runs a task<br />
generator from the machine that runs the resource broker, for three main reasons:<br />
non-Java support - In order to support execution of applications that are written in a language<br />
other than Java, we must deal with problems like platform and OS requirements, as other<br />
languages do not run in a virtual machine as Java does; so it would not be possible to run<br />
the task generators for the non-Java applications all on the same machine that the resource<br />
broker is running on;<br />
security - The resource broker is the central point of ALiCE, where scheduling takes place;<br />
it is the only component of ALiCE whose failure will make the whole system fail. To run<br />
task generators on the same machine as the resource broker would mean executing alien,<br />
possibly malicious, application code on that machine. Even with strong code safety<br />
and isolation techniques, the system could not be entirely safe. The best approach is not to<br />
run any alien code on the resource broker’s machine;<br />
performance - Since the resource broker is the centralized point of ALiCE, it becomes a<br />
bottleneck if it does not perform scheduling very fast and respond to a large number<br />
of requests to schedule tasks and applications in a short period of time. Further loading<br />
the resource broker by running task generators on it would mean an additional performance<br />
decrease.<br />
The difference between the producers and the task producers is made only by the templates that<br />
they use to retrieve references from the space; since those templates are given to the<br />
producers by the resource broker (see section 3.3.3), the code for the producer and for the task<br />
producer is exactly the same. The difference between the two is made by the resource broker,<br />
which decides whether to nominate a producer to run tasks, task generators or both.<br />
The task producer should be under the same control as the resource broker, since it is a vital<br />
point of the system. Given the architecture implemented by ALiCE, we cannot afford to have<br />
a task producer go off-line and come back on-line at arbitrary times. This is because, until all the tasks are<br />
downloaded by producers, the task producer must be up and running to be able to deliver the<br />
files containing the serialized tasks on request by a producer.<br />
The structure of a producer and its components are presented in figure 4.7. The components<br />
are explained next.<br />
Figure 4.7.: Producer/Task producer components (the RTS manager, RTS registry, RTS update thread and templates update thread, plus the per-language execution threads of the JavaRuntimeSupport and CRuntimeSupport)<br />
The RTS Manager RTS stands for runtime support. ALiCE is designed in such a way that<br />
support for new languages can be added very easily; the aim is to be able to do it on-the-fly,<br />
like with the protocols in ONTA, without even having to restart the system. The RTS manager<br />
handles the behaviour, initialization and start-up of a producer. It first initializes all the runtime<br />
supports that are built into the system (for now, this is the case for Java and C) by instantiating, for<br />
each of them, the class that extends the RuntimeSupport class for that language.<br />
Then, it creates a list of the capabilities of all the runtime supports and advertises the list<br />
through a message in JavaSpace. A resource broker will take that message and deliver to the producer<br />
a list of templates. The templates will later be used by the execution threads to take object<br />
references from the space. The RTS manager will not start the execution threads until it<br />
obtains a list of initial templates.<br />
The RTS manager will then start the templates update thread and a number of execution threads<br />
for each runtime support implemented.<br />
The RTS Update Thread This thread is implemented in the RTSManager class and has the<br />
purpose of receiving new runtime supports that are sent by other instances of ALiCE through<br />
object references in JavaSpace. Though this is not completely implemented and functional yet,<br />
the final purpose is to make an extremely flexible interface to the runtime support system that will<br />
permit adding support for new languages in the form of plug-ins. This thread will take each new<br />
runtime support class advertised through an object reference in JavaSpace by some other machine<br />
in ALiCE, dynamically load it, instantiate it and start it. The runtime supports will be stored in the<br />
RTS registry.<br />
The RTS Registry The runtime support registry has two purposes. The first purpose is to store<br />
the templates that are used by all the execution threads to retrieve object references to tasks/task<br />
generators from JavaSpace. These templates are updated whenever a new list of templates is received<br />
by the producer. The update is initiated and handled by the templates update thread.<br />
Although the templates are centrally stored in the RTS registry, it would be a bottleneck for all<br />
the execution threads to just hold references to them and use the object in the registry, as we would<br />
need synchronized access to this object, which would mean that only one execution thread could<br />
read the templates at a time. For this reason, each runtime support for each language<br />
keeps its own copy of the templates, obtained by cloning the list in the RTS registry.<br />
These clones are also updated whenever a new templates list is received.<br />
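The cloning idea can be sketched as follows; this is a simplified model (ALiCE’s actual template and registry classes differ), showing only why a per-runtime-support copy avoids contention on the shared list.<br />

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the RTS registry: a master template list plus
// independent snapshots, so execution threads read their own copy
// instead of synchronizing on the shared registry for every read.
public class TemplateRegistry {
    private final List<String> master = new ArrayList<>();

    // Called by the templates update thread when a new list arrives.
    public synchronized void update(List<String> newTemplates) {
        master.clear();
        master.addAll(newTemplates);
    }

    // Each runtime support refreshes its private clone from here.
    public synchronized List<String> snapshot() {
        return new ArrayList<>(master);
    }

    public static void main(String[] args) {
        TemplateRegistry reg = new TemplateRegistry();
        reg.update(List.of("java-task", "java-taskgen"));
        List<String> mine = reg.snapshot();   // private copy for one runtime support
        reg.update(List.of("c-task"));        // a later update...
        System.out.println(mine);             // ...does not disturb the earlier clone
        System.out.println(reg.snapshot());
    }
}
```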
The second purpose of the RTS registry is to keep a list of the runtime supports currently available<br />
to the instance of ALiCE it is running under. All the runtime supports received dynamically<br />
will be registered with the RTS registry. Also, the RTS registry is the one actually starting each<br />
runtime support after it is registered.<br />
The Templates Update Thread As explained in section 3.3.3 about scheduling, the approach<br />
we took in scheduling is to tag the producers to take a certain category of object references from<br />
JavaSpace for execution of the referred objects. We need to be able to change the tag of a producer<br />
at any time, by changing the templates it uses. For that purpose the RTS manager creates the<br />
templates update thread and starts it at system start-up on every producer.<br />
The templates update thread will wait to receive via JavaSpace any message destined to it that<br />
contains a list of new templates. As soon as such a message is received, the new templates will<br />
replace the old ones, and cloned copies of the list of new templates will replace the old templates for<br />
each runtime support implemented.<br />
The general algorithm of a producer is presented in figure 4.9.<br />
The Runtime Supports Each runtime support is actually just a class extending<br />
java.lang.Thread. The RTS manager will receive as a parameter, at start-up of an ALiCE<br />
producer, the number of execution threads for each language supported. This will be the number of<br />
execution threads of each language started by each runtime support.<br />
Usually we need more than one thread per producer for the execution of tasks/task generators. To<br />
see why, take the example of a task producer that has only one thread to run, say,<br />
Java task generators. If it starts executing a task generator that blocks waiting for an external event<br />
that will take a very long time to happen, or if the task generator just loops forever, the task producer<br />
would be blocked: it would not be able to run anything, even though it is not doing anything useful<br />
either. By having more threads in a producer executing the mobile code, we provide more<br />
efficiency and more performance. By making the number of threads variable and specifiable at<br />
producer start-up, we make the producer very flexible, so it can adapt to the limitations of<br />
the machine it runs on.<br />
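The effect of running several execution threads per language can be modelled with a small pool of workers that poll for work with a short timeout. In this sketch a BlockingQueue stands in for the JavaSpace take operation; the class and its parameters are illustrative, not ALiCE code.<br />

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Models n execution threads polling a "space" for tasks with a short
// timeout; with several workers, one slow task cannot stall the rest.
public class ExecutionPool {
    public static int runAll(int threads, int tasks) throws InterruptedException {
        BlockingQueue<Runnable> space = new LinkedBlockingQueue<>();
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) space.add(done::incrementAndGet);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    Runnable task;
                    // Poll with a short timeout, as the execution threads do;
                    // a null result here means no matching work is left.
                    while ((task = space.poll(100, TimeUnit.MILLISECONDS)) != null) {
                        task.run();
                    }
                } catch (InterruptedException ignored) { }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runAll(4, 20)); // prints 20
    }
}
```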
Each runtime support contains as a field the list of its capabilities; a capability is actually a<br />
class consisting of three strings (language, processor and OS) that defines a kind of code that<br />
this runtime support knows how to run. The list of capabilities for the whole producer that is<br />
advertised by the RTS manager is formed by gathering all the capabilities from all the runtime<br />
supports instantiated.<br />
The Execution Threads As mentioned, for each language supported, a number of execution<br />
threads will be started. What each of these threads does is very simple. It traverses the list of<br />
templates of the producer that was sent to it by a resource broker. For each template that matches<br />
a capability from the list of the runtime support it belongs to, it tries to retrieve an object<br />
reference matching that template from JavaSpace, waiting for it for a very short period of<br />
time.<br />
If such an object reference is found, it is taken from the space and the object it refers to is<br />
downloaded and dynamically loaded. First, some initializations are done on the dynamically<br />
loaded object (like tagging it with the application it belongs to). Depending on whether it is a task<br />
or a task generator, a different method is called on the new instance via the Java reflection API.<br />
This is the entry point into the mobile alien code.<br />
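The reflective entry-point call can be sketched like this. The class and method names are illustrative (in ALiCE the class bytes would come from a network class loader, and the actual entry-point method names may differ); here a local nested class stands in for downloaded user code.<br />

```java
import java.lang.reflect.Method;

// Demonstrates calling an entry-point method on a dynamically loaded
// class via the reflection API. SampleTask stands in for downloaded
// user code; the "execute" method name is an assumption for this sketch.
public class ReflectiveEntry {
    public static class SampleTask {
        public Object execute() { return 42; }
    }

    public static Object runTask(String className) throws Exception {
        Class<?> cls = Class.forName(className);            // dynamic load
        Object task = cls.getDeclaredConstructor().newInstance();
        Method entry = cls.getMethod("execute");            // entry point into alien code
        return entry.invoke(task);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runTask("ReflectiveEntry$SampleTask")); // prints 42
    }
}
```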
If the thread executed a task generator, it will just return. But if it executed a task, a result is<br />
collected, as returned by the execute() method of the task as a Java Object. The result is then<br />
advertised in the space to the resource broker that scheduled the application or to the consumer that<br />
is running the result collector for the application, depending on the result delivery mode chosen.<br />
The Producer GUI<br />
Figure 4.8.: The Producer GUI<br />
The Producer GUI, presented in figure 4.8, provides the user with easy access to<br />
producer information and a way to observe the system output.<br />
4.3.5. The Data Server<br />
The data server is a new component added to ALiCE. In the old version, data was transferred<br />
packed inside objects, or the files had to be shared via NFS or even be present as a copy on each<br />
producer machine. This is unacceptable and not feasible for a grid computing system.<br />
The architecture of the new ALiCE version has a special part dedicated to data files. There<br />
exists a component in ALiCE that handles these files, the data server.<br />
The data server can be run on a separate machine (let’s say a machine with vast secondary<br />
storage space) or can be run on the same machine as the resource broker, which is more usual.<br />
The data server consists of two components: the data files manager and the server itself.<br />
Data Server Issues In a grid computing system, where we are dealing with applications that<br />
are compute-intensive, the amount of data used is often large and the data files just as large.<br />
In this case the main problem that arises is how to transfer a large file over an unreliable network<br />
connection such as the Internet. The approach we took is to store the data files at a central location and<br />
then provide the application developer with the means to read or write chunks of each file.<br />
This approach means that the user can choose the amount of data that is transferred in a single<br />
burst, setting the right trade-off between the overhead imposed by initiating multiple transfers to<br />
get a small amount of data each time and the unreliable nature of the connections. The user might<br />
even choose to read the whole file in one go, making the chunk as large as the file size.<br />
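Chunked access of this kind can be sketched with RandomAccessFile; the offset and chunk-size parameters mirror what the text describes, but the class and method names here are illustrative, not the actual DataFile API.<br />

```java
import java.io.*;

// Reads an arbitrary chunk (offset, length) of a file, modelling the
// chunked transfers described above. Returns fewer bytes when the
// requested chunk runs past the end of the file.
public class ChunkReader {
    public static byte[] readChunk(File f, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            raf.seek(offset);
            byte[] buf = new byte[length];
            int n = raf.read(buf);                  // may be < length at end of file
            if (n < length) {
                byte[] shorter = new byte[Math.max(n, 0)];
                System.arraycopy(buf, 0, shorter, 0, shorter.length);
                return shorter;
            }
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("alice-data", ".bin");
        f.deleteOnExit();
        try (FileOutputStream out = new FileOutputStream(f)) {
            out.write("0123456789".getBytes());
        }
        System.out.println(new String(readChunk(f, 3, 4))); // prints 3456
    }
}
```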
Another benefit of our approach is that the whole data file is not distributed to all the producers<br />
that need it. Instead, the application can partition the data for the problem up front, decide what<br />
chunk of data each task needs, and then each task will retrieve just the data it needs. This imposes<br />
the lowest data-transfer overhead possible and adds flexibility.<br />
The Data Files Manager The data files manager is the component of the system that gets new<br />
advertisements of data files from consumers and then downloads them. It consists of a thread that<br />
takes the object references from JavaSpace, downloads each file and then stores it locally. The data<br />
files are stored in a directory that has the same name as the application ID of the application<br />
the data files were submitted for. Since the application IDs are unique, this way we make sure<br />
there will be no conflicts if different applications have data files with the same file name.<br />
When the data files manager starts up, it places a message in the space stating that a data<br />
server is present on the machine it runs on. Every time a consumer submits a data file, it<br />
will first look for such a message to know whom to send the data file to.<br />
After each data file is downloaded, the data files manager puts a message in the space to advertise<br />
the fact that it stores that data file for the application it was submitted for. When a task or task<br />
generator later needs access to that file, it will look for a message from the data server that holds<br />
the file and construct a new instance of the DataFile class that points to that location, all through<br />
some static calls from the Data class. The file reference obtained this way will be used by the user<br />
code to read from and write to that file. For more details on data file usage, see section 4.2.4.<br />
The Server The server is a simple implementation of raw TCP/IP socket transfer. It listens for<br />
connections on a port. A connection is initiated by method calls on a DataFile instance<br />
from inside some user code. Since the data file object is already initialized and knows the location<br />
of the data file (that is, the data server address and the full filename), the request will open a<br />
connection to the data server machine on the preset TCP port and send the file name as a<br />
request, followed by the request type and the parameters for the operation (chunk size, offset in<br />
file etc.). The server responds by sending or receiving the required chunk raw, through the<br />
opened connection.<br />
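The request described above can be sketched as a simple framed message. The field order (file name, request type, chunk size, offset) follows the description in the text, but the exact encoding is an assumption; ALiCE’s real wire format is not documented here.<br />

```java
import java.io.*;

// Illustrative encoding/decoding of a data-server request frame:
// file name, request type, chunk size and file offset.
public class DataRequest {
    public final String fileName;
    public final int type;        // e.g. 0 = read, 1 = write (assumed codes)
    public final int chunkSize;
    public final long offset;

    public DataRequest(String fileName, int type, int chunkSize, long offset) {
        this.fileName = fileName; this.type = type;
        this.chunkSize = chunkSize; this.offset = offset;
    }

    // Serializes the request in the order the text describes.
    public byte[] encode() throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeUTF(fileName);
        out.writeInt(type);
        out.writeInt(chunkSize);
        out.writeLong(offset);
        return bos.toByteArray();
    }

    public static DataRequest decode(byte[] frame) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(frame));
        return new DataRequest(in.readUTF(), in.readInt(), in.readInt(), in.readLong());
    }

    public static void main(String[] args) throws IOException {
        DataRequest r = DataRequest.decode(
            new DataRequest("app42/input.dat", 0, 4096, 8192).encode());
        System.out.println(r.fileName + " type=" + r.type
            + " chunk=" + r.chunkSize + " off=" + r.offset);
    }
}
```

In a full implementation the encoded frame would be written to the data server's TCP socket and the chunk bytes read back on the same connection.<br />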
Figure 4.9.: The Producer Algorithm<br />
The flowchart in figure 4.9 summarizes the producer’s operation: at start-up the producer gets the<br />
list of capabilities from each runtime support, advertises a message containing the list of all<br />
capabilities, waits for the list of initial templates from a resource broker and then starts the<br />
execution threads. From then on, whenever new templates are received, the templates list is<br />
updated; whenever a new object reference matching a template appears in JavaSpace, the file<br />
containing the serialized object is downloaded and the new object is initialized. A task generator<br />
is simply executed; for a task, the result is, depending on the result delivery mode, either returned<br />
to the consumer that submitted the application (direct delivery) or sent to the resource broker that<br />
scheduled the application (resource-broker-stored delivery).<br />
Part III.<br />
Sample Applications and Performance<br />
Testing<br />
5. Example of ALiCE Applications<br />
5.1. Matrix Multiplication<br />
The matrix multiplication is an application that takes in an integer n and computes the result of the<br />
multiplication of two nxn matrices. The two nxn matrices are randomly generated by the application<br />
itself.<br />
The matrix multiplication application is designed such that the number of tasks generated for<br />
every application execution is exactly n, where n is the parameter above. The integer n represents<br />
the problem size as well as the task size. (A problem consists of many tasks.) The purpose<br />
of this design is that with the increase of n, both the overall problem size of the application<br />
and the task size of the application increase. This property is desirable since we wish to<br />
test the performance of ALiCE under conditions of varying task sizes. The experiments and results<br />
can be observed in Chapter 6.<br />
Next, we present the algorithm for the Task Generator and the Result Collector components of<br />
the ALiCE Matrix Multiplication application.<br />
ALGORITHM ALiCE_MM( n )<br />
TASK_GENERATOR<br />
1: A ← new Matrix of size nxn<br />
2: B ← new Matrix of size nxn<br />
3: Initialize (A)<br />
4: Initialize (B)<br />
5: for x in 1 to n<br />
6: T ← new TASK containing (row x of A, matrix B, and x)<br />
7: send T to Resource Broker<br />
8: endfor<br />
RESULT_COLLECTOR<br />
1: C ← new Matrix of size nxn<br />
2: for j in 1 to n<br />
3: RESULT R ← incoming Result from Resource Broker<br />
4: C[R.x] = R.result_array<br />
5: endfor<br />
TASK_EXECUTE (Ax, B, x)<br />
1: n ← Ax.length<br />
2: result_array ← new array of n elements<br />
3: for j in 1 to n<br />
4: for m in 1 to n<br />
5: result_array[j] += Ax[m] * B[m][j]<br />
6: endfor<br />
7: endfor<br />
8: RESULT R ← new Result<br />
9: insert result_array into R<br />
10: return R<br />
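The TASK_EXECUTE step amounts to one row-times-matrix product. In Java it might look like the sketch below (an illustration using the conventional index order C[x][j] = Σ_m A[x][m]·B[m][j], not the actual ALiCE task class):<br />

```java
// One matrix-multiplication task: multiplies row x of A by matrix B,
// producing row x of the result matrix C.
public class RowTask {
    public static double[] execute(double[] rowOfA, double[][] b) {
        int n = b[0].length;
        double[] result = new double[n];
        for (int j = 0; j < n; j++) {
            for (int m = 0; m < rowOfA.length; m++) {
                result[j] += rowOfA[m] * b[m][j];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] b = { {1, 0}, {0, 1} };          // identity matrix
        double[] row = { 3, 4 };
        double[] out = execute(row, b);
        System.out.println(out[0] + " " + out[1]);  // prints 3.0 4.0
    }
}
```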
5.2. Ray Tracing<br />
Ray tracing is a method for producing views of a virtual three-dimensional scene on a computer.<br />
It tries to mimic the actual physical effects associated with the propagation of light. A ray tracer<br />
calculates the pixel value of a given coordinate in an image by tracing a path of light as it bounces<br />
off or is refracted through surfaces. This involves many calculations that may take too long for<br />
one machine to handle; hence, grid computing environments such as ALiCE are used to speed up<br />
the rendering of ray-traced images.<br />
The ray tracing application that we develop takes in two parameters m and n, and divides the<br />
task of rendering a 1024x768 sized image into several chunks of size m x n each. These chunks<br />
are then calculated separately and the results of the calculation of these chunks will be displayed<br />
on the screen.<br />
Unlike the Matrix Multiplication application, the problem size of the Ray Tracing application<br />
is fixed. Changing the parameters m and n only varies the task size, and hence the number<br />
of tasks the problem has.<br />
Next we present the overall algorithm for the ALiCE ray tracing sample application.<br />
Figure 5.1.: ALiCE Ray Tracing Visualizer<br />
ALGORITHM ALiCE_TRACE(m, n)<br />
TASK_GENERATOR<br />
1: for x in 0 to 480 step n<br />
2: for y in 0 to 640 step m<br />
3: TASK T ← new Task containing data of the rectangle (x,y) to (x+n-1, y+m-1) and the (x,y) coordinate<br />
4: send T to Resource Broker<br />
5: endfor<br />
6: endfor<br />
RESULT_COLLECTOR<br />
1: for x in 0 to 480 step n<br />
2: for y in 0 to 640 step m<br />
3: Result R ← collectResult from Resource Broker<br />
4: display R on screen<br />
5: endfor<br />
6: endfor<br />
TASK_EXECUTE<br />
1: compute rendering output<br />
2: Result R ← new Result<br />
3: Insert rendering output into R<br />
4: return R<br />
A screenshot of the ALiCE RayTrace visualizer while the application is still working is presented<br />
in Figure 5.1.<br />
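With the loop bounds from the pseudocode (x stepped by n over the height, y stepped by m over the width), the number of tasks for given m and n can be computed directly. This small sketch assumes those 640x480 bounds and exclusive upper limits; it is an illustration, not ALiCE code.<br />

```java
// Counts ray-tracing tasks produced by the two nested chunking loops:
// ceil(height/n) rows of chunks times ceil(width/m) columns of chunks.
public class ChunkCount {
    public static int tasks(int width, int height, int m, int n) {
        int rows = (height + n - 1) / n;   // ceil(height / n)
        int cols = (width + m - 1) / m;    // ceil(width / m)
        return rows * cols;
    }

    public static void main(String[] args) {
        System.out.println(tasks(640, 480, 64, 48)); // prints 100
    }
}
```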
5.3. DES Key Cracker<br />
Figure 5.2.: ALiCE DES key cracker<br />
The DES key cracker is a massively parallel application that tries to find a DES key by brute<br />
force, doing an exhaustive search in the key space. Each task is given an interval of keys in which<br />
to look for the solution.<br />
The application we develop takes in as parameters the length k of the key in bits (this actually<br />
determines the key search space) and the size t of each task, given in terms of how many keys<br />
each task searches. The number of tasks is given by the ratio between the maximum key<br />
possible and the task size.<br />
This application is very useful for seeing how ALiCE scales up, given the fact that we can<br />
vary the problem size as well as the task size.<br />
The general algorithm for the DES Key Cracker application is presented next and a screenshot<br />
of the viewer GUI is presented in Figure 5.2.<br />
ALGORITHM ALiCE_DES_CRACKER(k, t)<br />
TASK_GENERATOR<br />
1: Generate random key to look for<br />
2: Encrypt a preset short message using the generated key<br />
3: for i in 0 to 2^k/t do<br />
4: T ← new TASK searching from i*t to (i+1)*t, given the encrypted and the unencrypted messages generated in step 2<br />
5: send T to the Resource Broker<br />
6: endfor<br />
RESULT_COLLECTOR<br />
1: for i in 0 to 2^k/t do<br />
2: if key found, display result<br />
3: endfor<br />
TASK_EXECUTE<br />
1: Result R ← new Result<br />
2: for x in startKey to endKey do<br />
3: Encrypt the message received using the key x<br />
4: If the result of the encryption is equal to the encrypted message received, the key has been found, it is x; put x into R<br />
5: endfor<br />
6: if key not found, return NOT_FOUND_RESULT<br />
7: else return R<br />
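The task generator's partitioning of the key space into per-task intervals can be sketched as follows; this is an illustration using long arithmetic (so k must stay below 63 here), not the actual ALiCE task generator.<br />

```java
import java.util.ArrayList;
import java.util.List;

// Splits a 2^k key space into half-open intervals of t keys each, one
// interval per task, as the DES cracker's task generator does.
public class KeySpacePartition {
    public static List<long[]> intervals(int k, long t) {
        long total = 1L << k;                 // 2^k keys; requires k < 63
        List<long[]> out = new ArrayList<>();
        for (long start = 0; start < total; start += t) {
            out.add(new long[] { start, Math.min(start + t, total) }); // [start, end)
        }
        return out;
    }

    public static void main(String[] args) {
        List<long[]> parts = intervals(8, 64);        // 256 keys, 64 per task
        System.out.println(parts.size());             // prints 4
        long[] last = parts.get(parts.size() - 1);
        System.out.println(last[0] + ".." + last[1]); // prints 192..256
    }
}
```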
5.4. Protein Matching<br />
The protein matching application is a complex application written for ALiCE by Yew Kwong NG,<br />
ngyewkwo@comp.nus.edu.sg.<br />
Bioinformatic applications, which usually involve massively large volumes of chromosome<br />
data stored in geographically distributed databases, are good candidates for execution on a grid<br />
computing platform. The sequence comparison approach adopted here is the Smith-Waterman<br />
dynamic programming algorithm.<br />
Figure 5.3.: Protein Matching for ALiCE application visualizer<br />
The objective of this toolkit is to allow a user, typically a computational biologist, to obtain<br />
optimal alignments of specific query sequences against each gene<br />
sequence stored in known databases, which would otherwise be extremely tedious if performed<br />
manually. The scale of the problem can be very large, since we are considering computations<br />
involving possibly tens of thousands of gene sequences scattered in different nodes on the web.<br />
This application is presented as an example of a very practical and complex application<br />
developed for ALiCE. It is not used in the performance tests, but is rather an illustration of the<br />
power of ALiCE.<br />
A screenshot of the visualizer after collecting the results is presented in Figure 5.3.<br />
6. Performance Testing<br />
Based on a set of tests we carried out, this chapter presents our preliminary performance results.<br />
Aside from the experiments presented here, we carried out a number of other tests, like stress tests<br />
on JavaSpace and GigaSpace (and we concluded that we should use GigaSpace :-) ) and tests<br />
outside the actual test environment.<br />
Tests also included trying ALiCE on Windows and Solaris machines, as well as on other<br />
networks than the cluster environment that we used as the main testbed.<br />
6.1. The Test Bed<br />
Our experiments were carried out on a cluster of twenty-four nodes, shown in figure 6.1.<br />
Sixteen (named ws00 to ws15) are Intel PII 400MHz with 256MB of RAM, and eight (named<br />
ws17 to ws24) are Intel PIII 866MHz with 256MB of RAM. These nodes are connected to each<br />
other via a 100Mbps switch. All nodes run the RedHat Linux release 7.0 (Guinness) distribution,<br />
based on the 2.2.16-22 Linux kernel.<br />
ALiCE is developed using the Java TM 2 Software Development Kit versions 1.3.1 and 1.4.0<br />
and the Jini Starter Kit version 1.2. We used the GigaSpace TM Platform 2.0 at the current stage<br />
of development and testing. For development and experiments, we use the Java TM 2 Runtime<br />
Environment, Standard Edition, with the Java TM HotSpot Server and Client Virtual Machines,<br />
build 1.3.1_03-b03, mixed mode. The HotSpot Server Virtual Machine is used for the Resource<br />
Broker and Producer nodes. The Consumer nodes make use of the HotSpot Client Virtual Machine.<br />
During our tests the need arose for a machine with more memory to hold GigaSpace, so<br />
we moved 128 MB of RAM from ws04 to ws21, which held GigaSpace for all our experiments.<br />
We also tried to run GigaSpace on a Sun Ultra 30 station, based on an UltraSparc II 266MHz,<br />
but this machine proved to be much too slow in terms of processor power. The conclusion<br />
is that if GigaSpace is held on only one machine, that machine should be a powerful one<br />
with lots of memory.<br />
Figure 6.1.: Cluster-based experiment environment<br />
6.2. Experiments<br />
We have run a vast series of experiments to see how ALiCE performs under different conditions,<br />
in different environments and with different workloads. In this section we present the most<br />
important tests conducted, as well as the conclusions and the relevant results of these tests.<br />
6.2.1. Performance Evolution with Variance of Task Size<br />
Objectives: To observe the overhead imposed by varying the task size for a problem of a fixed<br />
total size.<br />
Platform:<br />
- GigaSpace on ws21<br />
- Resource Broker on ws10<br />
- Task Producer on ws20<br />
- 5 Producers on ws17, ws18, ws19, ws22 and ws23<br />
- Consumer on ws24<br />
Test Application: Ray Tracing<br />
Methodology: We ran the Ray Tracing application for an image of 1024x768 pixels (fixed<br />
problem size) on a set of 5 producers. We varied the size of each task to observe the influence<br />
of the number of tasks on the application execution time.<br />
Test Results:<br />
Chunk Size Tasks Time (seconds)<br />
20x20 1976 1230<br />
30x30 875 285<br />
40x40 494 270<br />
60x60 234 117<br />
80x80 130 115<br />
100x100 88 112<br />
140x140 48 107<br />
180x180 30 115<br />
200x200 24 120<br />
250x250 15 141<br />
300x300 12 135<br />
350x350 9 158<br />
400x400 6 163<br />
Graphical representation of results:<br />
Analysis of results: What we observed confirms our intuition that as the task size increases<br />
(so the number of tasks decreases), the overhead decreases. By having fewer tasks, the<br />
network/JavaSpace overhead decreases, and so does the execution time, up to a certain point.<br />
After that point, the execution time begins to increase slightly again.<br />
This is caused by two factors. One is that the number of tasks is not always a multiple of<br />
the number of producers, so there are times (towards the end of the application runtime) when<br />
some producers are idle. The other factor is that not all the tasks are of equal size in terms of<br />
computational time. Even if the chunks are equal, some tasks will require more computation<br />
than others. If the more computation-intensive tasks are taken last (we are using eager<br />
scheduling), then again some producers will remain idle. This explains the anomalous increase<br />
in execution time at the upper end of the task-size scale.<br />
6.2.2. Varying the Number of Producers<br />
Objectives: To explore the possible speedup that can be obtained by adding more producers,<br />
and to understand the factors that might limit the speedup.<br />
Platform:<br />
- Resource Broker on ws02<br />
- Task Producer on ws03<br />
- Producers on ws04-ws23<br />
- Consumer on ws24<br />
Test Application: Ray Tracing<br />
Methodology: We measure the application runtime of ALiCE with several producer<br />
configurations, using the normal eager scheduling algorithm to execute the Ray Tracing test<br />
application with the same task size (140x140 chunks for a 1024x768 image, which amounts<br />
to 48 tasks). We compare the speedup obtained via the use of ALiCE against the sequential<br />
runtime of the Ray Tracing test application.<br />
Test Results:<br />
Producers Time (seconds)<br />
1 491<br />
2 246<br />
3 177<br />
4 132<br />
5 111<br />
6 101<br />
7 98<br />
8 96<br />
9 93<br />
10 78<br />
11 68<br />
12 67<br />
13 66<br />
14 59<br />
15 65<br />
16 63<br />
17 63<br />
18 61<br />
Graphical Representation of Results:<br />
Analysis of results: Our experiments have shown that ALiCE improves the execution time<br />
of the ray tracing application by a significant factor. The best performance was obtained<br />
with 14 producers (an improvement of 88%).<br />
The fact that, at some point, increasing the number of producers leads to an increase in the<br />
total execution time can be explained by the machine running GigaSpace running out of<br />
physical memory and starting to use swap, which in turn led to an accelerated decrease<br />
in performance.<br />
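The improvement figure above is simple arithmetic over the table: speedup is the one-producer runtime divided by the best runtime, and the percentage improvement follows from the same two numbers. A minimal check, using the measured values T(1) = 491 s and T(14) = 59 s from the table:

```java
// Sketch: speedup and percentage improvement from the measured runtimes above.
public class Speedup {
    public static void main(String[] args) {
        double t1 = 491.0;   // runtime with 1 producer, in seconds (from the table)
        double t14 = 59.0;   // best runtime, obtained with 14 producers
        double speedup = t1 / t14;                     // how many times faster
        double improvement = 100.0 * (t1 - t14) / t1;  // percent of time saved
        System.out.printf("speedup=%.1f improvement=%.0f%%%n", speedup, improvement);
    }
}
```

This prints a speedup of about 8.3 on 14 producers, i.e. the 88% improvement quoted above.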
6.2.3. Overhead Variation with Task Size for Direct Result Delivery<br />
Objectives: To explore the percentage of the total execution time represented by overhead<br />
as the task size varies, maintaining a fixed total problem size. In this test the overhead is<br />
measured for the direct result delivery mode, i.e. the results are delivered directly from the<br />
producers to the consumer.<br />
Platform:<br />
- GigaSpace on ws21<br />
- Resource Broker on ws02<br />
- Task Producer on ws23<br />
- Producers on ws17, ws18, ws19 and ws20<br />
- Consumer on ws24<br />
Test Application: DES Key Cracker<br />
Methodology: We measure the application runtime of ALiCE with several task size<br />
configurations, using the normal eager scheduling algorithm to execute the DES Key Cracking<br />
test application on the same key length, 25 bits, using a fixed number of 4 producers.<br />
The result delivery mode was direct delivery, so the results came back directly from the<br />
producers to the consumer.<br />
Test Results: In the next table we present the measurements made during this test:<br />
TpT - Average time of execution per task, including the overhead;<br />
TpT-O - Average time of execution per task, excluding the overhead;<br />
T - Total execution time;<br />
CT - Computational time out of the total execution time;<br />
OT - Overhead time out of the total execution time;<br />
O (%) - The percentage of the execution time represented by the overhead.<br />
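As a sanity check on how these columns relate, here is the 400-task row of the table below recomputed; the only assumption is that the 4 producers work fully in parallel, so T = TpT × Tasks / Producers, and likewise for CT:

```java
// Sketch: reconstructing one table row (400 tasks) from the per-task times,
// assuming the 4 producers run fully in parallel.
public class OverheadRow {
    public static void main(String[] args) {
        int tasks = 400, producers = 4;
        double tptNoOverhead = 0.807;  // TpT-O: seconds per task, excluding overhead
        double tpt = 1.008;            // TpT: seconds per task, including overhead
        double t  = tpt * tasks / producers;           // T: total execution time
        double ct = tptNoOverhead * tasks / producers; // CT: computational part
        double ot = t - ct;                            // OT: overhead part
        double oPct = 100.0 * ot / t;                  // O(%): overhead fraction
        System.out.printf("T=%.1f CT=%.1f OT=%.1f O=%.1f%%%n", t, ct, ot, oPct);
    }
}
```

This reproduces the 400-task row: T = 100.8 s, CT = 80.7 s, OT = 20.1 s, O ≈ 19.9%.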
Tasks TpT - O TpT T CT OT O(%)<br />
4 69.69 73.6 73.6 69.69 3.92 5.3<br />
8 34.96 38.64 77.28 69.91 7.37 9.5<br />
20 14.02 14.93 74.67 70.1 4.57 6.1<br />
40 7.07 7.78 77.82 70.72 7.1 9.1<br />
80 3.62 3.88 77.58 72.34 5.24 6.8<br />
100 2.78 3.04 76.03 69.68 6.35 8.4<br />
140 2.04 2.12 77.56 71.3 6.27 8.1<br />
200 1.45 1.59 79.45 72.5 7 8.8<br />
300 1.016 1.209 90.68 76.2 14.48 16<br />
400 0.807 1.008 100.8 80.7 20.1 19.9<br />
800 0.403 0.997 199.4 80.6 118.8 59.6<br />
1000 0.324 0.996 249 81 168 67.5<br />
Graphical Representation of Results:<br />
The first graph presents the fractions of the execution time represented by actual computation<br />
and by overhead, for various task sizes.<br />
The second graph presents the increase of the percentage that the overhead represents in the<br />
total running time as the task size decreases and the number of tasks increases.<br />
Analysis of Results: The conclusion of this test was, as expected, that the overhead increases<br />
as the number of tasks increases and the size of each task decreases. The more interesting<br />
part was the trend of this increase. Up to a certain point, the overhead increases very<br />
slowly with the number of tasks; after that point is reached, the percentage that the overhead<br />
represents of the total execution time increases steeply.<br />
All these tests lead us to the conclusion that the choice of the task size and the partitioning<br />
of the problem are extremely important to how an application performs on ALiCE. For this<br />
reason, programmers should pay close attention to problem partitioning issues.<br />
6.2.4. Overhead Variation with Task Size for Delivery of Results<br />
Through Resource Broker<br />
Objectives: To explore the percentage of the total execution time represented by overhead<br />
as the task size varies, maintaining a fixed total problem size. In this test the overhead is<br />
measured in the case where the results are delivered through the resource broker.<br />
Platform:<br />
- GigaSpace on ws21<br />
- Resource Broker on ws02<br />
- Task Producer on ws23<br />
- Producers on ws17, ws18, ws19 and ws20<br />
- Consumer on ws24<br />
Test Application: DES Key Cracker<br />
Methodology: The same as in the previous test (section 6.2.3), but with the result delivery mode<br />
set so that the results are delivered to the consumer through the resource broker.<br />
Test Results:<br />
In the next table we present the measurements made during this test:<br />
TpT - Average time of execution per task, including the overhead;<br />
TpT-O - Average time of execution per task, excluding the overhead;<br />
T - Total execution time;<br />
CT - Computational time out of the total execution time;<br />
OT - Overhead time out of the total execution time;<br />
O (%) - The percentage of the execution time represented by the overhead.<br />
Tasks TpT - O TpT T CT OT O(%)<br />
4 67.97 77.86 77.86 67.97 9.86 12.7<br />
8 34.26 41.74 83.48 68.52 14.96 17.9<br />
20 13.77 14.36 71.8 63.85 7.95 11.1<br />
40 6.88 7.66 76.6 68.8 7.8 10.2<br />
80 3.47 4.72 94.4 69.4 25 26.5<br />
100 2.81 4.96 124 70.25 53.75 43.3<br />
140 2.065 4.538 158.83 72.28 86.55 54.5<br />
200 1.458 4.987 249.35 72.9 176.45 70.8<br />
300 1.016 5.083 381.23 76.2 305.03 80<br />
400 0.779 5.523 552.3 77.9 474.4 85.9<br />
800 0.425 5.276 1055.2 85 790.2 91.9<br />
1000 0.326 5.309 1327.2 81.5 1245.7 93.9<br />
Graphical Representation of Results:<br />
The first graph presents the fractions of the execution time represented by actual computation<br />
and by overhead, for various task sizes.<br />
The second graph presents the increase of the percentage that the overhead represents in the<br />
total running time as the task size decreases and the number of tasks increases.<br />
Comparison Between Result Delivery Modes: As expected, the overhead in the case of<br />
results being delivered through the resource broker is bigger than in the case of direct result<br />
delivery. It is very important, though, to notice that this overhead tends to increase much<br />
more rapidly when delivering the results through the resource broker. This is explained<br />
first by the fact that the resource broker acts as a huge bottleneck when there are many<br />
results, as they are delivered one by one from the same machine (the resource broker),<br />
whilst in the case of direct delivery the results are received from many machines; moreover,<br />
there are many threads running at the result collector, retrieving result objects in the<br />
background. The comparison between the overheads is presented in the next figure:<br />
Analyzing the overheads in the two cases leads to the conclusion that result delivery through<br />
the resource broker should be used only when it is absolutely needed, as in the case where the<br />
consumer cannot stay on-line to wait for all the results and comes back on-line at a later time<br />
to check for new results.<br />
6.2.5. Performance Comparison with the Old Version of ALiCE<br />
We did some performance comparisons between the last version of ALiCE and the new<br />
implementation. The results have shown that at very low system load, the older version<br />
did better. This can easily be explained by the fact that the old version's scalability was<br />
very poor: all the objects were kept in memory and the classes were transferred using RMI and<br />
the codebase property. This is faster, but at higher load the system would not only tend to be<br />
very slow (as memory would run out and objects would go into swap), but would crash once<br />
a certain load was reached. For a grid computing system such behavior is totally unacceptable,<br />
as a grid system used in real life deals with extremely high loads. The new version is extremely<br />
scalable, as neither objects nor classes are kept in memory, but rather on secondary storage,<br />
as serialized files. That is why, for high loads, even with one application running, the new<br />
system did better than the old version.<br />
To conclude this comparison, we can say that the new version is a truly scalable system whose<br />
execution time varies linearly with the system load. Also, since no objects or user classes are<br />
kept in memory, the system can be loaded up to the only limit of the space occupied by<br />
references in JavaSpace. Since the footprint of a reference in JavaSpace is very small, ALiCE<br />
would not crash even when faced with thousands of nodes in the system working at the<br />
same time.<br />
Part IV.<br />
ALiCE GRID Programming Model<br />
7. Developing ALiCE Applications<br />
7.1. The Model<br />
The ALiCE Programming Template for Java applications follows the same model as in the first<br />
version, namely the Task Generator - Result Collector model. The templates are mainly the<br />
same, with some differences and new features, which are presented in this chapter.<br />
The model defines two entities for a parallel ALiCE application:<br />
- Task Generator<br />
The task generator is the entity that creates and initializes the tasks. The tasks are the execution<br />
threads doing the computation-intensive work.<br />
- Result Collector<br />
The result collector is executed at the machine that submits the application, and its role is to<br />
receive the results obtained from the execution of the tasks generated by the task generator.<br />
Essentially, an ALiCE program works through the following steps:<br />
1. A new application, consisting of at least a task generator and a result collector, is submitted<br />
into the system. Optionally, one or more data files can be submitted into the system as<br />
belonging to this application; these files will then be available to the tasks and to the<br />
task generator;<br />
2. The application is downloaded by a resource broker, which finds an appropriate machine to<br />
run the task generator and schedules the application to run there;<br />
3. The application is downloaded by the selected machine and the task generator is dynamically<br />
loaded, instantiated and run; tasks are thus created, and as each one is created a reference<br />
to it is sent to the resource broker in order to schedule it;<br />
4. The resource broker schedules the tasks and sends the references to the designated machines<br />
via JavaSpace;<br />
5. The producer machine downloads a task, runs it, and then the result object obtained is sent<br />
back either to the resource broker or to the consumer that originally submitted the<br />
application; the choice between these two is made when the application is first<br />
submitted.<br />
7.2. Template Features<br />
The ALiCE programming template for Java applications includes the following specifications:<br />
- TaskGenerator, ResultCollector and Task classes, which should be extended by the<br />
programmer;<br />
- Methods used to send new Task objects to ALiCE, either from the TaskGenerator or from<br />
another Task;<br />
- Methods used to retrieve the Results from the system;<br />
- Methods used to submit a data file, to get a reference to it, and to read from and write to it;<br />
- Methods of simple communication between the result collector and the task generator;<br />
- Methods of generic communication between tasks, or between tasks and the task generator.<br />
7.2.1. The Task Generator Template<br />
As mentioned above, the task generator is the entry point into the application. The TaskGenerator<br />
template mainly requires the programmer to create a class that is executed at the task producer's<br />
site and that generates, initializes and submits computational tasks. There should be no<br />
computational part in the task generator (although this is not enforced); all that should be done<br />
there are some initializations and perhaps some message exchange with the result collector.<br />
The actual entry point into the task generator is the method public void main(String[] args),<br />
which should be implemented by any task generator class. We stress here that in the task<br />
generator there should be NO static methods, including the main(String[] args) method. This is<br />
because some information concerning the application that this task generator belongs to is stored<br />
inside the actual object instance of the task generator, which is created by dynamically loading<br />
the class submitted by the programmer. So, do not make the main method in the task generator<br />
static, and do not create new task generator instances. Besides this entry point, before calling<br />
the main() method on the task generator, the init() method is called. If there are any necessary<br />
initializations to be done before starting the application, the programmer can implement the<br />
method public void init() and perform them inside this call, since it is called on the task<br />
generator object before anything else.<br />
The template also requires the programmer to extend the alice.consumer.TaskGenerator class.<br />
This superclass has the means to send computational tasks into the system, to receive string<br />
messages from the result collector, and to send/receive objects to/from any other component of<br />
the application (result collector or tasks). Also, if desired, the user can get a reference to a data<br />
file and use it to read/write from/to that file. For more details on data file usage, see section 7.2.4.<br />
The method calls the programmer has at his/her disposal in the Task Generator are:<br />
- public void process(Task t) - This is the call that submits new tasks into the system to<br />
be produced (executed); since the purpose of the task generator is obviously to generate<br />
tasks, this is the most important call inside this component. The programmer should create<br />
objects that implement the Task class and then call process() with those objects to start a<br />
computational process on a producer machine, that is, to send the task to ALiCE for<br />
execution;<br />
- public String getStringMessage() - Sometimes it is useful to have a way to send simple<br />
messages from the result collector to the task generator with minimum overhead. This is<br />
the case when the result collector (which is the only part of an application that runs at<br />
the consumer) has a user interface that reads some basic inputs and/or commands, and these<br />
inputs/commands should be transmitted to the task generator. To receive a string sent by the<br />
result collector of an application, the task generator of that application can use this call. To<br />
transfer more complicated structures between parts of the system, use the<br />
requestObject/sendObject mechanism;<br />
- public Object requestObject(String id) - This call is used to request and wait for the<br />
reception of an object identified by the given identifier. This object can be sent by any other<br />
component of the application, that is, either the result collector or a task. For more details<br />
about this mechanism, see section 7.2.5;<br />
- public void sendObject(Object obj, String id) - This is the corresponding call to send an<br />
object identified by the given id to the components of this application.<br />
To better understand this mechanism, take a look at the examples presented in section 7.3 of<br />
this part and also at the model programming templates in Part V.<br />
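The overall shape of a task generator can be sketched as follows. This is a self-contained illustration, not ALiCE code: the real alice.consumer.TaskGenerator and Task superclasses are replaced here by minimal stand-ins (the stub process() only queues locally, whereas the real one ships the task to the resource broker), and RangeGenerator/RangeTask are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-ins for the ALiCE template classes, so this sketch runs on
// its own; the real classes live in the alice.* packages.
abstract class Task implements java.io.Serializable {
    public abstract Object execute();
}
abstract class TaskGenerator {
    final List<Task> submitted = new ArrayList<>();
    public void process(Task t) { submitted.add(t); } // real version sends t to ALiCE
    public abstract void init();
    public abstract void main(String[] args);
}

// A hypothetical generator that splits a range into fixed-size task chunks.
// Note: main() and init() are instance methods, NOT static (see text above).
class RangeGenerator extends TaskGenerator {
    public void init() { /* one-time setup, called before main() */ }
    public void main(String[] args) {
        for (int start = 0; start < 100; start += 25)
            process(new RangeTask(start, start + 25));
    }
}
class RangeTask extends Task {
    final int from, to;
    RangeTask(int from, int to) { this.from = from; this.to = to; }
    public Object execute() {            // runs on a producer machine
        long sum = 0;
        for (int i = from; i < to; i++) sum += i;
        return sum;
    }
}

public class GeneratorSketch {
    public static void main(String[] args) {
        RangeGenerator g = new RangeGenerator();
        g.init();                        // in ALiCE, the system makes these
        g.main(args);                    // calls on the loaded instance itself
        System.out.println("tasks=" + g.submitted.size());
    }
}
```

In a real ALiCE application the class would extend alice.consumer.TaskGenerator, and the system, not the programmer, would instantiate it and invoke init() and main().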
7.2.2. The Task Template<br />
The Task template is a class that any task submitted by the programmer should extend. The<br />
extending class should essentially implement the public Object execute() method, which the<br />
producer will call on any task received for execution. The returned object is the result of the<br />
task (also see section 7.2.3).<br />
Besides this, the template offers the means to communicate with other tasks and the means to<br />
create a new task. The programmer can get, from inside a task, a reference to a previously<br />
submitted data file and then use that reference to read or write chunks of data from/to that file.<br />
The calls available from inside a task to achieve these functions are:<br />
- public void process(Task t) - This is the call that submits a new task into the system to<br />
be produced (executed). This is very useful for developing applications in which there are<br />
major data dependencies, or ones in which the sizes of the tasks cannot be calculated in<br />
advance and depend on data computed in another task. Also, this ability to create tasks<br />
from inside other tasks permits the implementation of computational algorithms of the<br />
"divide and conquer" class, which opens up the ALiCE system to a whole new class of<br />
problem solving.<br />
- public Object requestObject(String id) - This call is used to request and wait for the<br />
reception of an object identified by the given identifier. This object can be sent by any other<br />
component of the application, that is, either the result collector or a task. For more details<br />
about this mechanism, see section 7.2.5. Using this inter-task communication system, any<br />
means of synchronization and data-dependency handling is possible, given the fact that the<br />
object requested/submitted can be of any kind (as long as it is serializable, i.e. implements<br />
the java.io.Serializable interface);<br />
- public void sendObject(Object obj, String id) - This is the corresponding call to send an<br />
object identified by the given id to the components of this application, frequently this being<br />
another task with which there is a need for synchronization or a data dependency.<br />
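The divide-and-conquer pattern enabled by calling process() from inside a task can be sketched as below. Again this is a self-contained stand-in, not ALiCE code: a local queue plays the role of the resource broker, and SumTask is a hypothetical task that splits large ranges into subtasks instead of computing them directly.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal stand-in for ALiCE's Task; a local queue replaces the resource
// broker so the divide-and-conquer pattern can actually be run here.
abstract class Task implements java.io.Serializable {
    static final Deque<Task> queue = new ArrayDeque<>(); // stub "scheduler"
    public void process(Task t) { queue.add(t); }        // real call sends t to ALiCE
    public abstract Object execute();
}

// Hypothetical task: sums the integers in [from, to); ranges above a
// threshold split themselves into two subtasks instead of computing.
class SumTask extends Task {
    final int from, to;
    SumTask(int from, int to) { this.from = from; this.to = to; }
    public Object execute() {
        if (to - from > 16) {                 // too big: divide and conquer
            int mid = (from + to) / 2;
            process(new SumTask(from, mid));  // new tasks created from a task
            process(new SumTask(mid, to));
            return null;                      // a split-only task has no result
        }
        long sum = 0;
        for (int i = from; i < to; i++) sum += i;
        return sum;                           // the task's result object
    }
}

public class TaskSketch {
    public static void main(String[] args) {
        Task.queue.add(new SumTask(0, 100));
        long total = 0;
        while (!Task.queue.isEmpty()) {       // stub producer loop
            Object r = Task.queue.poll().execute();
            if (r != null) total += (Long) r;
        }
        System.out.println("total=" + total); // sum of 0..99
    }
}
```

In ALiCE the subtasks would be scheduled on whatever producers are free, and their results would flow back to the result collector rather than being summed locally.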
7.2.3. The Result Collector Template<br />
Result Template<br />
The result can be any kind of object; this means that the result is as generic as it can get, thus<br />
permitting the programmer to implement any data structure inside the results delivered back to<br />
the result collector. The only requirement is that it implements the java.io.Serializable interface,<br />
since it will be transported over the network.<br />
The delivery of results back to the consumer can be done in two ways: directly, or through<br />
the resource broker that originally scheduled the application. Direct delivery is intended for<br />
all the cases where the consumer stays on-line and where the result collector is a thread<br />
running on this machine from the moment the application is submitted to the moment when<br />
all the results have been delivered, without going off-line. This kind of delivery imposes much<br />
less overhead and hence is much quicker. But there are cases when the consumer cannot stay<br />
on-line (e.g. when it uses a dial-up Internet connection). In these cases, it is convenient to have<br />
the results delivered and stored at the resource broker's site until the consumer later comes<br />
back on-line to retrieve them. In this case the result collector should be interrupted and<br />
executed later (or perhaps put in a waiting state) until the connection is available again. The<br />
selection between the first and the second delivery mode is made when the application is first<br />
submitted into the system; from then on, the system handles the chosen mode automatically.<br />
The Result Collector<br />
The ResultCollector template mainly requires the programmer to create a class that can be<br />
executed at the consumer. This implies that the programmer writes a class that extends the<br />
alice.result.ResultCollector class. The main method of this class, which is the entry point of the<br />
result collector and must be implemented by the programmer, is the public void collect() method.<br />
The ResultCollector superclass has the means to obtain a result (if one is ready) for the<br />
application, to find out how many results are ready, and also to send a simple string message to<br />
the task generator of this application, with the uses explained in section 7.2.1. These functions<br />
can be accessed through the following method calls from the class that extends ResultCollector:<br />
- public Object collectResult() - If there is a result available, this call will return the object<br />
that represents that result, as returned by the execute() method of the task that generated<br />
it. If there is no result available, it will return null. To find out the number of results<br />
available (or whether there are any), the programmer should use the following method;<br />
- public int getResultsNoReady() - This call returns the number of results that are available,<br />
i.e. that have already been computed by some tasks;<br />
- public void sendStringMessage(String str) - This can be used to send a string message to<br />
the task generator of this application, to pass some simple parameters or user-input options;<br />
- public String getRunDir() - This method returns the full path of the directory that the<br />
result collector runs in; this directory is chosen dynamically at the start of each application<br />
(see subsection 3.3.2 for more details).<br />
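A result collector built on these calls can be sketched as follows. As before, this is a self-contained stand-in: alice.result.ResultCollector is replaced by a stub backed by a local queue, and SumCollector is a hypothetical collector that waits for a known number of results.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Minimal stand-in for alice.result.ResultCollector, backed by a local
// queue instead of the real delivery machinery.
abstract class ResultCollector {
    final Queue<Object> delivered = new ConcurrentLinkedQueue<>();
    public Object collectResult()  { return delivered.poll(); } // null if none ready
    public int getResultsNoReady() { return delivered.size(); }
    public void sendStringMessage(String s) { /* real version reaches the task generator */ }
    public abstract void collect();  // entry point, called at the consumer
}

// Hypothetical collector: waits for a known number of results and sums them.
class SumCollector extends ResultCollector {
    static final int EXPECTED = 3;
    long total = 0;
    public void collect() {
        int got = 0;
        while (got < EXPECTED) {
            Object r = collectResult();
            if (r == null) continue;     // busy-wait for the sketch only
            total += (Long) r;
            got++;
        }
    }
}

public class CollectorSketch {
    public static void main(String[] args) {
        SumCollector c = new SumCollector();
        // Simulate three task results arriving from the producers.
        c.delivered.add(10L); c.delivered.add(20L); c.delivered.add(12L);
        c.collect();                     // ALiCE calls this on the consumer
        System.out.println("total=" + c.total);
    }
}
```

A real collector would typically poll getResultsNoReady() and sleep between checks rather than spin, since results arrive over the network.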
7.2.4. Data Files Usage<br />
Data files are submitted to a data server at the time a new application is submitted into the<br />
system. The model we implemented uses the random-access file paradigm; this means that from<br />
inside a task or a task generator, the programmer can get access to a file with a special open()<br />
call and then, using the reference returned, do random-access reads or writes to that file.<br />
The first step in data file usage is handled by the system and consists of submitting a data file<br />
for an application to the data server. This is done at the same time the actual code for the<br />
application is submitted. From the point the file is transferred, it is available to the task<br />
generator and to all the tasks of the application. If open is called for a data file that has not yet<br />
been entirely transferred to the data server, the call will block until the transfer completes.<br />
There are two calls for opening a file, one for handling file opening from inside tasks and one<br />
from inside task generators. These two calls are static methods of the alice.data.Data class:<br />
- static public DataFile openFile (String fileName, Task t)<br />
- static public DataFile openFile (String fileName, TaskGenerator tg)<br />
To get a reference to a data file, the actual call from a task or from a task generator looks the<br />
same, having a form similar to: DataFile f = Data.openFile(name, this). The reference to the<br />
calling task/task generator is needed in order to link the data file name to the application that is<br />
using it, since the system handles files with the same names in different applications at the same<br />
time. The name that is passed to the openFile call should be the same as that of the originally<br />
submitted data file, without the path; no directories are allowed for data files at the data server.<br />
This means that if, for example, the data file originally submitted had the path<br />
./data/files/DataFile1.dat at the site of the consumer, then to get a reference to that file after<br />
submitting it to ALiCE as a data file for an application, the programmer should call<br />
Data.openFile(“DataFile1.dat”, this), without using any path.<br />
The object obtained from an openFile call is a reference to the data file, which provides the<br />
following methods:<br />
- public byte[] read (int offset, int length) - This call will read, from the data file represented<br />
by the object it is called on, a chunk of length bytes starting at the given offset in the file.<br />
Since the chunk size is at the discretion of the programmer, the system is very flexible. In<br />
this way, the whole data file can be read at once (this should be done only when the data<br />
file is small), or any part of it, of any size, can be read at once. Each task can thus read<br />
only the data it needs, and can decide which part of the data file it needs based on prior<br />
calculations done inside the same task;<br />
- public void write ( byte[] buffer, int offset, int length) - This is the corresponding call to<br />
write data to a data file. The programmer can write any length of data. The data will be<br />
taken from the byte[] area referred to by the buffer parameter: length bytes will be read<br />
from the buffer and written to the data file starting at the given offset in the file. Since the<br />
programmer has write access to the data file, the data files could be used as a means of<br />
inter-task communication (although they are not intended for this, nor do we advise using<br />
them this way);<br />
- public long length() - This simply returns the actual size, in bytes, of the physical data file<br />
as it occupies the data file server's disk storage.<br />
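The random-access read/write pattern these methods support can be sketched as below. The DataFile class here is a self-contained in-memory stand-in with the same method signatures as above; the real one, obtained via Data.openFile(), talks to the data server.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Minimal in-memory stand-in for the DataFile reference returned by
// Data.openFile(); the real implementation reads/writes at the data server.
class DataFile {
    private byte[] bytes;
    DataFile(byte[] initial) { bytes = initial; }
    public byte[] read(int offset, int length) {           // random-access read
        return Arrays.copyOfRange(bytes, offset, offset + length);
    }
    public void write(byte[] buffer, int offset, int length) {
        if (offset + length > bytes.length)
            bytes = Arrays.copyOf(bytes, offset + length); // grow like a file
        System.arraycopy(buffer, 0, bytes, offset, length);
    }
    public long length() { return bytes.length; }
}

public class DataFileSketch {
    public static void main(String[] args) {
        // From a task this would be: DataFile f = Data.openFile("DataFile1.dat", this);
        DataFile f = new DataFile("chromosome data".getBytes(StandardCharsets.US_ASCII));
        byte[] chunk = f.read(0, 10);     // read only the chunk this task needs
        System.out.println(new String(chunk, StandardCharsets.US_ASCII));
        f.write("DATA".getBytes(StandardCharsets.US_ASCII), 11, 4); // in-place write
        System.out.println("length=" + f.length());
    }
}
```

The point of the chunked read is the one made above: a task handling a slice of a large gene database reads only its own offset range instead of pulling the whole file.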
7.2.5. Inter-task Communication<br />
The design of the new version of the ALiCE system adds the functionality needed to handle<br />
any kind of dependency between tasks, and provides the means for flexible and convenient<br />
communication between parts of the application.<br />
Communication Through User Objects<br />
To communicate between tasks, or between the task generator and tasks, the programmer has<br />
the ability to use any kind of object as the unit being transferred during the communication.<br />
The only requirement for the objects transferred is that they implement the java.io.Serializable<br />
interface. Each type of object used is associated with a string id to differentiate between classes<br />
of objects. The programmer can request, from inside a task, that another task send an object of<br />
a class that is identified by such a given identifier.<br />
There is no addressing implemented in the inter-task communication: on a grid computing
system there is no prior knowledge of where each task will execute, so at the time the
application is written the programmer does not know where each task will be located at
execution time. Hard-coding any kind of task identification into the ALiCE Task class would
have greatly diminished the flexibility of the model. The approach taken is the most general
and flexible one: give the programmer the ability to implement any means of communication,
which we achieve by using generic objects for inter-task communication. In this way the
programmer could, for example, identify a specific task with a type of object carrying a
certain identifier. The methods that the programmer can use to send/receive objects to/from
a task or task generator are:
• public Object requestObject(String id) - This call requests and waits for the reception of
an object defined by the given identifier. The call blocks until such an object is received;

• public void sendObject(Object obj, String id) - This is the corresponding call to send an
object defined by the given id to the components of this application, frequently another task
with which there is a synchronization need or a data dependency.
All user-object communication is done through JavaSpace references. The footprint in the
JavaSpace is always only as large as the reference, so it does not depend on the size of the
object transferred. The programmer therefore does not need to worry about the size of the
objects transferred between components of the application; the only limitation is imposed by
the resources of the machines running the producers, not by network overhead.
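To make the identifier convention concrete, here is a minimal, self-contained sketch of the send/request rendezvous. An in-memory map stands in for the JavaSpace, and the real requestObject blocks until the object arrives, which this sketch omits; the Payload class and the "task1_input" id are invented for illustration.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class ObjectExchangeSketch {
    // A Serializable payload, as required of all transferred objects.
    static class Payload implements Serializable {
        int value;
        Payload(int value) { this.value = value; }
    }

    private final Map<String, Object> space = new HashMap<>();

    // Deposit an object under an agreed string id (cf. ALiCE sendObject).
    void sendObject(Object obj, String id) { space.put(id, obj); }

    // Retrieve the object stored under an id (cf. ALiCE requestObject).
    Object requestObject(String id) { return space.get(id); }

    public static void main(String[] args) {
        ObjectExchangeSketch exchange = new ObjectExchangeSketch();
        // One task sends a result under an id both sides agreed on...
        exchange.sendObject(new Payload(42), "task1_input");
        // ...and another task later requests it by the same id.
        Payload p = (Payload) exchange.requestObject("task1_input");
        System.out.println("received: " + p.value);
    }
}
```

The point of the convention is that the string id, not a task address, is the shared contract between sender and receiver.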
Communication Through Data Files

Although inter-task communication by means of data files is possible, it is not recommended.
This kind of communication can be implemented by using a data file as the point at which to
send/receive communication data, with different tasks doing reads and writes on that file.
The main limitation is that no synchronization is possible except by busy waiting over the
network, which is very slow, incurs high overhead, and is not recommended.
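For completeness, the discouraged pattern looks roughly like the sketch below, where a shared byte array and two threads stand in for a data file accessed over the network; all names are illustrative. In ALiCE every poll of the flag byte would be a network round-trip, which is exactly why the pattern is so costly.

```java
public class BusyWaitSketch {
    // Flag byte at "offset 0" of a pretend shared data file.
    static final byte[] sharedFile = new byte[1];

    public static void main(String[] args) throws InterruptedException {
        // The "writer task" sets the flag after doing some work (cf. f.write).
        Thread writer = new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException e) { return; }
            synchronized (sharedFile) { sharedFile[0] = 1; }
        });
        writer.start();

        // The "reader task" busy-waits on the flag (cf. repeated f.read calls).
        boolean flagSet = false;
        while (!flagSet) {
            synchronized (sharedFile) { flagSet = (sharedFile[0] == 1); }
        }
        System.out.println("flag observed, reader proceeds");
        writer.join();
    }
}
```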
7.3. Simple application examples

This section presents some extremely simple application code examples, to give a glimpse of
how to use the functionality described above.
7.3.1. Simple Example and Data File Usage

The first example generates a task; from inside that task, it opens a data file, writes a
string at a position in the file, reads back what was written, and returns a result containing
this string. The string is then printed by the result collector.
THE RESULT COLLECTOR

import alice.result.*;

public class MyResultCollector extends ResultCollector {
    public void collect() {
        MyResult res = null;
        while (getResultsNoReady() < 1)
            ;  // busy-wait until a result is ready
        res = (MyResult)collectResult();
        System.out.println("String returned: " + res.str);
    }
}

THE RESULT

import java.io.*;

public class MyResult implements Serializable {
    public String str;

    public MyResult() {
        str = null;
    }
}
THE TASK GENERATOR

import alice.consumer.*;

public class MyTaskGenerator extends TaskGenerator {
    public MyTaskGenerator() {
    }

    public void generateTasks() {
        System.out.println("{MyTaskGenerator}: generating TASK");
        Task t = new MyTask();
        process(t);
    }

    public void main(String args[]) {
        this.generateTasks();
    }
}

THE TASK
import alice.consumer.*;
import alice.data.*;
import java.io.*;

public class MyTask extends Task {
    public MyTask() {
    }

    public Object execute() {
        byte[] testW = (new String("OOPS")).getBytes();
        System.out.println("{Task}: Executing task " + this.hashCode());
        DataFile f = Data.openFile("Datafile", this);
        System.out.println("{Task}: data file length: " + f.length());
        f.write(testW, 10, 4);
        byte[] testR = f.read(10, 4);
        String rd = new String(testR);
        System.out.println("{Task}: RESULT OF READING: " + rd);
        MyResult ret = new MyResult();
        ret.str = rd;
        return ret;
    }
}
7.3.2. Simple Inter-Task Communication and Spawning a New Task from a Task

This is a very simple example that generates a task; from inside that task, a new task of a
different kind is generated. The first task then blocks, waiting for an object from the second
one. When the second task executes, it sends an object of the requested kind. The files for
this application are presented below.
MYRESULTCOLLECTOR.JAVA

import alice.result.*;

public class MyResultCollector extends ResultCollector {
    public void collect() {
        MyResult res1, res2;
        while (getResultsNoReady() < 2)
            ;  // busy-wait until both results are ready
        res1 = (MyResult)collectResult();
        res2 = (MyResult)collectResult();
        System.out.println("Number returned: "
            + ((res1.i != -1) ? res1.i : res2.i));
    }
}

MYRESULT.JAVA

import java.io.*;

public class MyResult implements Serializable {
    public int i;

    public MyResult() {
        i = -1;
    }
}
MYTASKGENERATOR.JAVA

import alice.consumer.*;

public class MyTaskGenerator extends TaskGenerator {
    public MyTaskGenerator() {
    }

    public void generateTasks() {
        System.out.println("{MyTaskGenerator}: generating TASK");
        Task t1 = new MyTask1();
        process(t1);
    }

    public void main(String args[]) {
        this.generateTasks();
    }
}
MYTASK1.JAVA

import alice.consumer.*;
import java.io.*;

public class MyTask1 extends Task {
    public MyTask1() {
    }

    public Object execute() {
        System.out.println("{Task}: Executing task1");
        Task t = new MyTask2();
        process(t);
        Dummy d = (Dummy)requestObject("dummy_id");
        MyResult res = new MyResult();
        res.i = d.i;
        return res;
    }
}

MYTASK2.JAVA

import alice.consumer.*;
import java.io.*;

public class MyTask2 extends Task {
    public MyTask2() {
    }

    public Object execute() {
        System.out.println("{Task}: Executing task2");
        Dummy d = new Dummy();
        d.i = 69;
        sendObject(d, "dummy_id");
        MyResult ret = new MyResult();
        return ret;
    }
}
DUMMY.JAVA

import java.io.*;

public class Dummy implements Serializable {
    public int i;

    public Dummy() {
    }
}
8. ALiCE Programming Templates

Important guidelines

• Do NOT use circular references

• Do NOT use static methods for the task generator, the task or the results
8.1. The Task Generator Template

/**
 * ALiCE Task Generator Template
 **/

import alice.consumer.*;
import alice.data.*;

public class TASKGEN_CLASSNAME extends TaskGenerator {

    /**
     * The no-parameters constructor is a must
     **/
    public TASKGEN_CLASSNAME() {}

    public void init() {
        // Place your initialisation code here
    }

    /**
     * main method - DO NOT make it static;
     *             - this is the entry point and the point
     *               where tasks should be generated
     **/
    public void main(String args[]) {
        /*
         * This is where the tasks are generated, usually
         * in a loop
         */

        // To send a task for producing (this should be called
        // for each task):
        TASK_CLASSNAME t = new TASK_CLASSNAME();
        process(t);

        // To open a data file, read and write from/to it:
        DataFile f = Data.openFile("file_name", this);
        READ_BUFF = f.read(POSITION, LENGTH);
        f.write(WRITE_BUFF, POSITION, LENGTH);

        // To send/receive an object:
        OBJECT_CLASSNAME obj = new OBJECT_CLASSNAME();
        sendObject(obj, "snd_str_id");
        OBJECT_CLASSNAME rcvObj =
            (OBJECT_CLASSNAME)requestObject("rcv_str_id");

        // To receive a string message from the result collector:
        String msg = getStringMessage();
    }
} // end class
8.2. The Result Collector Template

/**
 * ALiCE Result Collector Template
 **/

import alice.result.*;

public class RESCOL_CLASSNAME extends ResultCollector {

    // Place variables here

    // Constructor
    public RESCOL_CLASSNAME() {
    }

    public void collect() {
        /*
         * Place here result collection and processing code
         */

        // To obtain the number of results ready, call:
        int resReady = getResultsNoReady();

        // To get a new result, call:
        RES_CLASSNAME res = (RES_CLASSNAME)collectResult();
    }
}
8.3. The Task Template

/**
 * ALiCE Task Template
 **/

import alice.consumer.*;
import alice.data.*;
import java.io.*;

public class TASK_CLASSNAME extends Task {

    // Place variables here

    // Constructor
    public TASK_CLASSNAME() {
    }

    public Object execute() {
        /*
         * This is where you do your calculations.
         * The result can be any kind of object.
         */

        // You can generate and send a new task to
        // be produced:
        O_TASK_CLASSNAME t = new O_TASK_CLASSNAME();
        process(t);

        // To open a data file, read and write from/to it:
        DataFile f = Data.openFile("file_name", this);
        READ_BUFF = f.read(POSITION, LENGTH);
        f.write(WRITE_BUFF, POSITION, LENGTH);

        // To send/receive an object:
        OBJECT_CLASSNAME obj = new OBJECT_CLASSNAME();
        sendObject(obj, "snd_str_id");
        OBJECT_CLASSNAME rcvObj =
            (OBJECT_CLASSNAME)requestObject("rcv_str_id");

        // Return any Serializable result object:
        return RESULT_OBJECT;
    }
}
9. Conclusions

9.1. Summary

In summary, this project achieved the following:

• Design and implementation of a working grid computing system based on Sun Microsystems'
JavaSpaces technology. To allow better control of the system, we implemented a three-tier
architecture consisting of consumers, producers, and resource brokers. We designed and
implemented a scalable, modular, portable and performant system based on a library, also
developed by us, for transferring any live Java objects over the network. The system also
supports other programming languages for application development, such as C and C++;

• Design of a programming template that users can employ to develop grid applications. Our
TaskGenerator-Tasks-ResultCollector programming model allows users to decouple visualization
from computation. We also implemented a mechanism to generate tasks from other tasks, making
it possible to implement peer-to-peer algorithms as well as master-slave ones. In addition,
we developed a very powerful communication system between the components of an application,
based on requesting/sending generic Java objects;

• Design of a data server for reliable and performant distribution of the data needed by
applications running on our system.
9.2. Future work

There are still many things that should be implemented in ALiCE to make it the powerful grid
computing system it can be and aims to become. These include:

• Integrating new scheduling techniques and load-balancing scheduling;

• Implementing a performant fault-tolerant architecture for the producers, possibly including
task migration and pre-emption, together with checkpointing;

• Implementing Quality of Service techniques to prioritize critical applications;

• Developing a centralized accounting and monitoring scheme.
Bibliography

[1] Johan Prawira (2002). ALiCE: Java-based Grid Computing System. Honours thesis, School of
Computing, National University of Singapore.

[2] Lee, Matsuoka, Talia, Sussman, Karonis, Allen and Thomas (2001). A Grid Programming
Primer. Programming Models Working Group, Grid Forum 1, Amsterdam.

[3] Baratloo, Karaul, Kedem and Wyckoff (1996). Charlotte: Metacomputing on the Web. In
Proceedings of the 9th International Conference on Parallel and Distributed Computing
Systems, 1996.

[4] Foster and Kesselman (1997). Globus: A Metacomputing Infrastructure Toolkit.
International Journal of Supercomputing Applications.

[5] Foster, Kesselman and Tuecke (2001). The Anatomy of the Grid: Enabling Scalable Virtual
Organizations. International Journal of Supercomputer Applications, 2001.

[6] Germain, Néri, Fedak and Cappello (2000). XtremWeb: Building an Experimental Platform
for Global Computing. Laboratoire de Recherche en Informatique, Université Paris Sud.

[7] Homburg, P. (2001). The Architecture of a Worldwide Distributed System. Ph.D. thesis,
Vrije Universiteit, Netherlands.

[8] Khunboa, C. and R. Simon (2001). On the Performance of Coordination Spaces for
Distributed Agent Systems. In Proceedings of the IEEE 34th Annual Simulation Symposium,
April 2001, Seattle, Washington, pp. 7-14.

[9] Lee, C.R. (2000). The Design and Implementation of a Computing Engine in ALiCE. Honours
thesis, School of Computing, National University of Singapore.

[10] Sarmenta, L.F.G. (1998). Bayanihan: Web-Based Volunteer Computing Using Java. In
Proceedings of the 2nd International Conference on World-Wide Computing and its Applications
(WWCA'98), Tsukuba, Japan, March 3-4, 1998. Lecture Notes in Computer Science 1368,
Springer-Verlag, 1998, pp. 444-461.

[11] Sarmenta, L.F.G. (2001). Volunteer Computing. Ph.D. thesis, Department of Electrical
Engineering and Computer Science, MIT, March 2001.

[12] SETI@home: http://setiathome.ssl.berkeley.edu

[13] Distributed.Net: http://www.distributed.net

[14] Globus: http://www.globus.org

[15] The GLOBE Project: http://www.cs.vu.nl/~steen/globe/

[16] Legion: http://www.cs.virginia.edu/~legion

[17] Condor: http://www.cs.wisc.edu/condor

[18] IEEE High Performance Distributed Computing (HPDC) Symposium 2001:
http://www-2.cs.cmu.edu/~hpdc

[19] High Performance Computing Symposium 2002: http://wwwteo.informatik.uni-rostock.de/HPC

[20] 3rd International Workshop on Grid Computing: http://www.gridcomputing.org/grid2002