Cost-Based Optimization of Integration Flows - Datenbanken ...
2.1 Integration Flows
[Figure 2.6: Classification of Execution Approaches for Integration Flows. Labels: execution semantics (data flow, control flow); iterator model and materialized intermediates; data granularity (instance-local, instance-global, hybrid instance-global).]
Putting it all together, Figure 2.6 illustrates the overall classification. Using control-flow-oriented execution semantics (with temporal dependencies) directly implies the use of materialized intermediates in the form of variables. In this context, both instance-local and instance-global processing are possible. Typically, EAI servers and BPEL engines use this execution approach. Hence, this thesis uses the execution model of control-flow semantics and materialized intermediates as its conceptual foundation. Furthermore, data-flow execution semantics allow for both the iterator model and materialized intermediates, as well as for both instance-local and instance-global data granularity. Examples of the use of materialized intermediates are Demaq [BMK08] (instance-global) and some ETL tools (instance-local). The iterator model requires a more fine-grained classification. Iterator, instance-global refers to tuple-based processing over the data of multiple instances, where punctuations are used to distinguish the data of the different instances. An example of this model is the stream-based Web service approach [PVHL09a, PVHL09b]. In contrast, iterator, instance-local refers to tuple-based processing over the data of a single instance, which is the typical execution model of ETL tools. In addition, an iterator, hybrid instance-global model can be used, where the individual materialized intermediates of multiple instances are processed in a pipelined fashion and thus with the iterator model.
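The distinction between materialized intermediates and the iterator model with punctuations can be made concrete with a small sketch. The following Python code is only an illustration under stated assumptions: the operator names, the `Punctuation` class, and the sample data are hypothetical and are not taken from the thesis or from any of the cited systems. It runs the same selection and projection once with materialized intermediates per instance (instance-local) and once as a tuple-at-a-time pipeline over the data of several instances (iterator, instance-global).

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Union

Row = Dict[str, int]


@dataclass(frozen=True)
class Punctuation:
    """Hypothetical end-of-instance marker inside a shared tuple stream."""
    instance_id: int


# Materialized intermediates, instance-local: every operator consumes a
# complete intermediate result and binds its full output to a variable.
def run_instance_materialized(rows: List[Row], threshold: int) -> List[int]:
    msg1 = [r for r in rows if r["amount"] >= threshold]  # selection
    msg2 = [r["amount"] for r in msg1]                    # projection
    return msg2


# Iterator model, instance-global: operators pass single tuples along and
# forward punctuations unchanged so that instance boundaries survive.
def interleave(instances: Dict[int, List[Row]]) -> Iterator[Union[Row, Punctuation]]:
    for instance_id, rows in instances.items():
        yield from rows
        yield Punctuation(instance_id)


def select_iter(stream: Iterable[Union[Row, Punctuation]],
                threshold: int) -> Iterator[Union[Row, Punctuation]]:
    for item in stream:
        if isinstance(item, Punctuation) or item["amount"] >= threshold:
            yield item


def project_iter(stream: Iterable[Union[Row, Punctuation]]
                 ) -> Iterator[Union[int, Punctuation]]:
    for item in stream:
        yield item if isinstance(item, Punctuation) else item["amount"]


def split_by_punctuation(stream: Iterable[Union[int, Punctuation]]
                         ) -> Dict[int, List[int]]:
    results: Dict[int, List[int]] = {}
    current: List[int] = []
    for item in stream:
        if isinstance(item, Punctuation):
            # The punctuation closes the result of one instance.
            results[item.instance_id] = current
            current = []
        else:
            current.append(item)
    return results


# Hypothetical sample data: two flow instances with a few tuples each.
INSTANCES: Dict[int, List[Row]] = {
    1: [{"amount": 5}, {"amount": 20}],
    2: [{"amount": 30}, {"amount": 1}, {"amount": 12}],
}

materialized = {i: run_instance_materialized(rows, 10)
                for i, rows in INSTANCES.items()}
pipelined = split_by_punctuation(
    project_iter(select_iter(interleave(INSTANCES), 10)))
```

Both variants compute the same per-instance results. The difference lies in when intermediates exist: the materialized variant holds a complete intermediate per operator, whereas the pipelined variant only ever holds the tuple currently in flight plus the output being collected.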
Finally, we will use this classification of execution approaches in order to position the different optimization approaches as well as the results of this thesis.
2.1.5 Optimizing Integration Flows
Based on the different modeling and execution approaches, we now focus on the optimization of deployed integration flows. The main scope is the optimization of integration flows with control-flow execution semantics.
Because research on integration flow optimization is still at an early stage, an exhaustive classification of optimization approaches for integration flows does not exist. Typically, however, integration-flow-specific characteristics are exploited for optimization. First, the expensive access to external systems is tackled with approaches that speed up the external data transfer. Second, the control-flow-oriented execution is optimized by parallel execution of subplans of an integration flow or by operator reordering. Thus, we use these two categories to classify the existing approaches.
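As an illustration of the second category, the following Python sketch executes two independent subplans of a flow either one after the other or in parallel. It is a minimal sketch under stated assumptions and not the method of any cited approach: the subplan functions, the simulated latency, and the use of a thread pool are all hypothetical choices made for this example. The sketch also assumes that the subplans have no data dependencies on each other, which an optimizer would have to verify before parallelizing them.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def query_orders() -> list:
    """Hypothetical subplan: read from a first external system."""
    time.sleep(0.05)  # simulated external latency (assumption)
    return [101, 102, 103]


def query_customers() -> list:
    """Hypothetical subplan: read from a second external system."""
    time.sleep(0.05)  # simulated external latency (assumption)
    return ["alice", "bob"]


def run_sequential(subplans: Dict[str, Callable[[], list]]) -> Dict[str, list]:
    # Control-flow default: one subplan after the other, each result is
    # materialized in a named variable (here: a dictionary entry).
    return {name: plan() for name, plan in subplans.items()}


def run_parallel(subplans: Dict[str, Callable[[], list]]) -> Dict[str, list]:
    # Independent subplans are submitted together; waiting times for the
    # external systems overlap instead of adding up.
    with ThreadPoolExecutor(max_workers=max(1, len(subplans))) as pool:
        futures = {name: pool.submit(plan) for name, plan in subplans.items()}
        return {name: future.result() for name, future in futures.items()}


SUBPLANS: Dict[str, Callable[[], list]] = {
    "orders": query_orders,
    "customers": query_customers,
}

sequential_result = run_sequential(SUBPLANS)
parallel_result = run_parallel(SUBPLANS)
```

Both functions return the same variables, so the rewrite preserves the semantics of the flow. The benefit is that the time spent waiting for external systems overlaps, which matters because access to external systems is the expensive part of an integration flow.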