20.01.2014 Views

Master Thesis - ICS



Abstract

The Grid is an emerging infrastructure that supports the discovery, access and use of distributed computational resources. Grids abstract over platform- or protocol-specific mechanisms for authentication, file access, data transfer, application invocation, etc., and allow dynamic deployment of applications on diverse hardware and software platforms. Scheduling computations and managing resources for Grid-aware applications is a challenging problem because the resources are distributed, heterogeneous in nature, owned by different individuals or organizations with their own policies and different access and cost models, and subject to dynamically varying loads and availability.

A high-performance scheduler generally improves the performance of individual applications by optimizing performance measures such as execution time. Devising a strategy for efficient, optimized query execution is a very challenging research problem. A-priori resource allocation and management is a hard problem as well. It is extremely important for researchers and distributed database designers to know in advance which, and how many, resources of the Grid architecture are involved in the execution of a given query. Communication and computation costs are two very important factors that determine the proper computational resources for this execution.
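To make the role of the two cost factors concrete, the following is a minimal sketch, not the scheduler described in this thesis. It assumes a simple additive cost model in which the estimated cost of running an operation on a resource is its computation time plus the time needed to ship the input data to it. The names `Resource`, `estimated_cost`, `cheapest_resource`, and the units of `cpu_rate` and `bandwidth` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A hypothetical Grid node, described by two illustrative rates."""
    name: str
    cpu_rate: float   # tuples processed per second (assumed unit)
    bandwidth: float  # bytes per second that can be shipped to the node (assumed unit)


def estimated_cost(resource: Resource, tuples: int, bytes_to_ship: int) -> float:
    """Assumed additive model: computation time plus communication time, in seconds."""
    computation = tuples / resource.cpu_rate
    communication = bytes_to_ship / resource.bandwidth
    return computation + communication


def cheapest_resource(resources: list[Resource], tuples: int, bytes_to_ship: int) -> Resource:
    """Pick the resource with the lowest estimated total cost."""
    return min(resources, key=lambda r: estimated_cost(r, tuples, bytes_to_ship))


if __name__ == "__main__":
    nodes = [
        Resource("fast_cpu_slow_link", cpu_rate=1000.0, bandwidth=100.0),
        Resource("slow_cpu_fast_link", cpu_rate=100.0, bandwidth=10000.0),
    ]
    best = cheapest_resource(nodes, tuples=1000, bytes_to_ship=10000)
    print(best.name)
```

Under this assumed model, a node with a slower processor can still be the better choice when moving the data to it is cheap, which is why both factors have to be weighed together.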

This work explores this aspect of service-based computing and resource management. We study the various data replication policies that can be followed in distributed database systems. In addition, we focus on how to optimize query processing on computational Grids and how to make resource allocation more efficient and effective. In particular, for the case of no data replication, we designed and implemented a high-performance application scheduler for relational join queries over a Grid-aware architecture. We transform given join expressions into directed acyclic graphs (DAGs) that contain all possible plans for the execution of the join. For that purpose, we developed the Query Plan Graph Constructor (QuPGC) algorithm. Once the query plan graph is constructed, we have to select the execution plan that yields optimal performance. For that reason, we developed the Heuristic Query Path Selector (HQuPaS) algorithm, which uses two heuristic functions for

