Master Thesis - ICS
Abstract
The Grid is an emerging infrastructure that supports the discovery, access and use of
distributed computational resources. Grids abstract over platform- or protocol-specific
mechanisms for authentication, file access, data transfer, application invocation, etc., and
allow dynamic deployment of applications on diverse hardware and software platforms.
Scheduling computations and managing resources for Grid-aware applications is a
challenging problem, as the resources are distributed, heterogeneous in nature, owned by
different individuals or organizations with their own policies and different access and cost
models, and have dynamically varying loads and availability.
A high-performance scheduler in general promotes the performance of individual
applications by optimizing performance measures such as execution time.
Devising a strategy for efficient and optimized query execution is a very challenging
research problem. A-priori resource allocation and management is a hard problem
as well. It is extremely important for researchers and
distributed database designers to know in advance which and how many resources of the
Grid architecture are involved in the execution of a given query. Communication and
computation costs are two very important factors that determine the proper computation
resources for this execution.
This work explores this aspect of service-based computing and resource
management. We study the various data replication policies that could be followed in
distributed database systems. In addition, we focus on how query
processing can be optimized on computational Grids and how resource
allocation can be made more efficient and effective. In particular, for the case of no data
replication, we designed and implemented a high-performance application scheduler for
relational join queries over a Grid-aware architecture. We transform given join
expressions into directed acyclic graphs (DAGs) that contain all possible plans for the
execution of the join. For that purpose, we developed the Query Plan Graph Constructor
(QuPGC) algorithm. Once the query plan graph is constructed, we have to select the
execution plan that yields optimal performance. For that reason, we developed the
Heuristic Query Path Selector (HQuPaS) algorithm, which uses two heuristic functions for