Cost-Based Optimization of Integration Flows - Datenbanken ...


teristics are non-blocking operators and the need for state-awareness (e.g., state migration on plan rewriting). Due to processing infinite streams, re-optimization is by definition intra-operator or per-tuple routing, where re-optimization can be done asynchronously.

2.2.4 Integration Flow Optimization

As we have shown in Subsection 2.1.5, existing techniques for the optimization of integration flows are mainly rule-based (optimize-once) [LZ05, VSS+07, BJ10] and thus do not address re-optimization, or they follow an optimize-always model [SVS05, SMWM06]. However, similar to the categories of plan-based adaptation in DBMS and continuous-query-based adaptation in DSMS, integration flows exhibit certain specific characteristics that could be exploited for a more efficient re-optimization approach.

First, integration flows are deployed once and executed many times. Because many instances are executed with rather small amounts of data (in contrast to plan-based adaptation in DBMS), there is no need for inter-operator or intra-operator re-optimization. Second, in contrast to CQ-based adaptation in DSMS, many independent instances of an integration flow are executed over time. Due to this execution model of independent instances, there is no need for state migration because a plan can be exchanged between two subsequent instances at low cost.

Based on these specific characteristics, integration flows require incremental statistics maintenance and inter-instance (inter-query) re-optimization. The advantages would be (1) asynchronous optimization that is independent of the execution of individual instances, (2) the fact that all subsequent instances, rather than only the current query, benefit from re-optimization, and (3) inter-instance plan changes without the need for state migration.
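The interplay of incremental statistics maintenance and inter-instance plan exchange can be illustrated with a minimal Python sketch. It is not the optimizer developed in this thesis: the names `RunningMean`, `DeployedFlow`, `run_instance`, and `reoptimize` are hypothetical, the flow is reduced to a sequence of commutative filters of equal cost, and the rewrite rule (most selective filter first) is only an assumed stand-in for a real cost model. Note also that a filter's observed pass rate depends on its position in the plan, since later filters only see messages that passed earlier ones.

```python
import threading


class RunningMean:
    """Incrementally maintained mean; no per-instance history is stored."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n


class DeployedFlow:
    """Hypothetical flow: deployed once, executed as many independent instances."""

    def __init__(self, filters, plan):
        self.filters = filters            # name -> predicate(message) -> bool
        self._plan = tuple(plan)          # current plan: filter names in order
        self._lock = threading.Lock()
        self.pass_rate = {name: RunningMean() for name in filters}

    def run_instance(self, message):
        # Each instance takes a snapshot of the plan when it starts, so a plan
        # exchange only affects subsequent instances (no state migration).
        with self._lock:
            plan = self._plan
        for name in plan:
            passed = self.filters[name](message)
            with self._lock:
                self.pass_rate[name].update(1.0 if passed else 0.0)
            if not passed:
                return False, plan
        return True, plan

    def reoptimize(self):
        # Assumed rule for illustration: order filters by ascending observed
        # pass rate. This may run in a separate thread, independent of any
        # currently executing instance.
        with self._lock:
            self._plan = tuple(
                sorted(self._plan, key=lambda n: self.pass_rate[n].mean)
            )
            return self._plan


flow = DeployedFlow(
    filters={"is_even": lambda m: m % 2 == 0, "is_big": lambda m: m >= 90},
    plan=["is_even", "is_big"],
)
for m in range(100):          # 100 independent instances under the initial plan
    flow.run_instance(m)
new_plan = flow.reoptimize()  # all later instances use the exchanged plan
```

In this sketch the plan is an immutable tuple that is replaced as a whole, which is what allows an instance that is already running to finish under its old plan while the next instance picks up the new one.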

To summarize, we infer that the specific characteristics of DBMS, DSMS, and integration platforms require tailor-made optimization approaches. While sophisticated approaches exist for plan-based adaptation in DBMS and continuous-query-based adaptation in DSMS, to the best of our knowledge, there is no optimization approach for integration flows that allows continuous adaptation to changing workload characteristics. This lack of a tailor-made cost-based optimization approach for integration flows is addressed in this thesis.

2.3 Integration Flow Meta Model

Based on the literature review of integration flows, we now define the integration flow meta model that is used as our formal foundation throughout the whole thesis. On the one hand, we introduce the basic notation of integration flows, including the message meta model and the flow meta model, as well as the execution semantics of integration flows. On the other hand, we discuss specific transactional properties of integration flows that must be ensured when rewriting such flows.

2.3.1 Notation of Integration Flows

As the basic notation of integration flows, we essentially use an extension of the so-called Message Transformation Model (MTM) [BWH+07, BHW+07]. This integration flow meta model consists of two vital parts. First, the message meta model describes the structural aspects, that is, the structure of data objects (materialized intermediates) processed by an
