Cost-Based Optimization of Integration Flows - Datenbanken ...
3.2 Prerequisites for Cost-Based Optimization
In [LRD06], the user models semantic constraints locally for each individual plan, using application knowledge to preserve semantic correctness. There, constraints between operators are explicitly specified in order to exclude these operators from any plan rewriting. In contrast to this approach, we define the semantic correctness of plans with global constraints, which are independent of specific plans and thus reduce the modeling and configuration effort required from the user, as follows:
Definition 3.1 (Semantic Correctness). Let P denote an original plan and let P′ denote a plan that was created by rewriting P. Then, semantic correctness of P′ refers to the semantic equivalence P ≡ P′. This equivalence property is given if the following constraints hold:
1. There are no dependencies δ between operators of concurrent subflows (parallel subflows of a Fork operator).
2. If there is a dependency δ between an interaction-oriented operator and another operator, their temporal order must be equivalent in P and P′.
3. If there are two interaction-oriented operators, where at least one performs a write operation and both refer to the same external system, their temporal order must be equivalent in P and P′.
4. If there exists an anti-dependency between two operators, the temporal order of these operators must be equivalent in P and P′.
5. If there is a data dependency between two operators, the sequential order of these operators must be equivalent in P and P′, or the applied optimization technique must guarantee semantic correctness (equivalent results) of the changed sequential order.
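Constraints 2 to 4 of Definition 3.1 can be read as a pairwise order check. The following is a minimal sketch under strong simplifying assumptions: plans are modeled as flat operator sequences (no Fork subflows, so constraint 1 is not covered), constraint 5 is left to the individual optimization technique, and the names `Op`, `must_keep_order`, and `violations` are hypothetical and not taken from the original work.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Optional


@dataclass(frozen=True)
class Op:
    name: str
    interaction: bool = False      # interaction-oriented operator?
    writes: bool = False           # performs a write on an external system?
    system: Optional[str] = None   # external system it refers to, if any


def must_keep_order(a, b, dep_kinds):
    """dep_kinds: set of dependency kinds between a and b, e.g. {"D"} or {"A"}."""
    # Constraint 2: any dependency that involves an interaction-oriented operator.
    if dep_kinds and (a.interaction or b.interaction):
        return True
    # Constraint 3: two interactions on the same external system, at least one write.
    if (a.interaction and b.interaction and (a.writes or b.writes)
            and a.system is not None and a.system == b.system):
        return True
    # Constraint 4: anti-dependency (assumed to be labeled "A" here).
    return "A" in dep_kinds


def violations(before, after, deps):
    """Return operator pairs whose required temporal order differs in P and P'."""
    pos_before = {op.name: i for i, op in enumerate(before)}
    pos_after = {op.name: i for i, op in enumerate(after)}
    bad = []
    for a, b in combinations(before, 2):
        kinds = deps.get(frozenset((a.name, b.name)), set())
        if must_keep_order(a, b, kinds):
            same = ((pos_before[a.name] < pos_before[b.name])
                    == (pos_after[a.name] < pos_after[b.name]))
            if not same:
                bad.append((a.name, b.name))
    return bad
```

A rewritten plan would be accepted by this sketch only if `violations(P, P_rewritten, deps)` returns an empty list.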
According to constraint 5, the specific optimization technique decides whether operators with dependencies between them can be reordered. This is necessary because the reordering decision must be made based on the concrete operators involved and their parameterizations. For example, two Selection operators can be reordered, while this is impossible for a sequence of Selection and Projection operators if the selection attribute is removed by the projection. If there is a dependency δ between two operators and their sequential order (and thus also their temporal order) is changed when rewriting P to P′, the parameters of the two operators (and hence the new data flow) must be changed accordingly. Thus, when rewriting a plan, the dependency graph is incrementally maintained (transformed) as well. Due to the importance of this dependency graph, we use Example 3.1 to illustrate its core concepts.
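The Selection/Projection case above can be illustrated with a small decision function. This is only a sketch: the classes `Selection` and `Projection` and the function `can_reorder` are hypothetical, each selection is assumed to refer to a single attribute, and all operator pairs not mentioned in the text are conservatively treated as not reorderable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selection:
    attribute: str      # attribute used by the selection predicate


@dataclass(frozen=True)
class Projection:
    kept: frozenset     # attributes that survive the projection


def can_reorder(a, b):
    """Decide whether two adjacent, dependent operators may swap positions."""
    kinds = {type(a), type(b)}
    if kinds == {Selection}:
        # Two selections can be applied in either order.
        return True
    if kinds == {Selection, Projection}:
        sel = a if isinstance(a, Selection) else b
        proj = b if sel is a else a
        # Swapping is only safe if the projection keeps the selection attribute.
        return sel.attribute in proj.kept
    # Conservative default for operator pairs not modeled in this sketch.
    return False
```

For instance, `can_reorder(Selection("price"), Projection(frozenset({"id"})))` rejects the swap because the projection removes `price`, which the selection still needs.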
Example 3.1 (Dependency Graph). We use the plan P₃ from Example 2.6. Figure 3.2(a) shows the related dependency graph. Consider the dependency δ^D_{msg₃}. It is a data dependency (D) over the message msg₃ from o₄ to o₃; i.e., operator o₄ reads the result of o₃ as one of its join operands. Hence, o₄ depends on o₃. The dependency graph is used to determine rewriting possibilities. For example, since there are no dependencies between operators o₂ and o₃ and neither of them is a writing interaction, we can insert a Fork operator and execute them as parallel subflows (Figure 3.2(b)). If there are data dependencies between local operators (no interaction-oriented operators), we can still exchange their sequential order (e.g., o₄ and o₅ by the optimization technique eager group-by). However, we are not allowed to exchange o₆ and o₇ because this would change the external behavior. Further, output dependencies and anti-dependencies prevent us from exchanging the order of the involved operators (e.g., o₃ and o₆).
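The Fork decision from Example 3.1 can be sketched as a lookup in a labeled dependency graph. The code below is an illustrative assumption rather than the original data structure: `DependencyGraph`, `can_fork`, and the `writers` set are hypothetical names, and only the single edge stated in the text (the data dependency over msg₃ from o₄ to o₃) is entered, because the remaining edges of Figure 3.2(a) are not given here.

```python
from collections import defaultdict


class DependencyGraph:
    def __init__(self):
        # (dependent, source) -> set of (kind, message) labels
        self._edges = defaultdict(set)

    def add(self, dependent, source, kind, message):
        """Record that `dependent` depends on `source` (kind "D" = data dependency)."""
        self._edges[(dependent, source)].add((kind, message))

    def between(self, a, b):
        """All dependencies between a and b, regardless of direction."""
        return self._edges.get((a, b), set()) | self._edges.get((b, a), set())


def can_fork(graph, a, b, writers):
    """Two operators may run as parallel subflows if they share no dependency
    and neither of them is a writing interaction (members of `writers`)."""
    return not graph.between(a, b) and a not in writers and b not in writers


graph = DependencyGraph()
graph.add("o4", "o3", "D", "msg3")   # o4 reads the result of o3

# The text states only that o2 and o3 are not writing interactions.
writers = set()
print(can_fork(graph, "o2", "o3", writers))
print(can_fork(graph, "o3", "o4", writers))
```

Keeping the dependency kind and message on each edge is what would allow the graph to be updated incrementally when a rewrite changes the data flow between two operators.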