Cost-Based Optimization of Integration Flows - Datenbanken ...
4.2 Plan Vectorization
Figure 4.5: Example Plan Vectorization
(a) Dependency Graph $DG(P_2)$ of Plan $P_2$
    Receive (o1) [service: s5, out: msg1]
    Assign (o2) [in: msg1, out: msg2]
    Invoke (o3) [service: s4, in: msg2, out: msg3]
    Join (o4) [in: msg1, msg3, out: msg4]
    Assign (o5) [in: msg4, out: msg5]
    Invoke (o6) [service: s3, in: msg5]
D (set of dependencies), one queue per data dependency:
    msg1: o2 depends on o1, queue q1
    msg1: o4 depends on o1, queue q2
    msg2: o3 depends on o2, queue q3
    msg3: o4 depends on o3, queue q4
    msg4: o5 depends on o4, queue q5
    msg5: o6 depends on o5, queue q6
(b) Vectorized Plan $P'_2$
    Copy (oc) [in: msg1, out: msg1]
    Assign (o2) [in: msg1, out: msg2]
    Invoke (o3) [service: s4, in: msg2, out: msg3]
    Join (o4) [in: msg1, msg3, out: msg4]
    Assign (o5) [in: msg4, out: msg5]
    Invoke (o6) [service: s3, in: msg5]
we create a queue $q_i$ for each dependency and connect the operators with these queues.
Finally, we simply remove all temporal dependencies and obtain $P'$.
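
The dependency analysis and queue creation described above can be illustrated with a minimal sketch. It is not the A-PV itself: the `Operator` record and the functions `analyze_dependencies` and `create_queues` are hypothetical names introduced here, and the sketch assumes that every message is written by exactly one operator. The input is plan $P_2$ from Figure 4.5(a), with service attributes omitted.

```python
from collections import namedtuple
from queue import Queue

# Hypothetical operator record: name, consumed messages, produced message.
Operator = namedtuple("Operator", ["name", "inputs", "output"])

# Plan P2 from Figure 4.5(a); service attributes are omitted.
PLAN_P2 = [
    Operator("o1", (), "msg1"),                # Receive
    Operator("o2", ("msg1",), "msg2"),         # Assign
    Operator("o3", ("msg2",), "msg3"),         # Invoke
    Operator("o4", ("msg1", "msg3"), "msg4"),  # Join
    Operator("o5", ("msg4",), "msg5"),         # Assign
    Operator("o6", ("msg5",), None),           # Invoke
]


def analyze_dependencies(plan):
    """Return data dependencies as (message, target, source) triples.

    Assumption: each message is written by exactly one operator, so a
    later operator reading that message always depends on its writer.
    """
    deps = []
    for position, source in enumerate(plan):
        if source.output is None:
            continue
        for target in plan[position + 1:]:
            if source.output in target.inputs:
                deps.append((source.output, target.name, source.name))
    return deps


def create_queues(deps):
    """Create one named queue q_i per dependency, keyed by the dependency."""
    return {dep: (f"q{i}", Queue()) for i, dep in enumerate(deps, start=1)}


deps = analyze_dependencies(PLAN_P2)
queues = create_queues(deps)
for dep in deps:
    message, target, source = dep
    print(f"{queues[dep][0]}: {source} -> {target} via {message}")
```

For $P_2$ this yields six dependencies and thus six queues, in the same order as the set $D$ of Figure 4.5. The sketch stops before the remaining steps of vectorization, such as inserting the Copy operator for msg1, which is read by both o2 and o4.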
Although the A-PV is executed only once, during the initial deployment of an integration
flow, it is important to note how its time complexity grows with the number of operators $m$.
Theorem 4.1. The A-PV exhibits a cubic worst-case time complexity of $O(m^3)$.
Proof. We prove the complexity separately for the two parts of the algorithm: the dependency
analysis and the graph creation. Assume an operator sequence $o$. We fix the number of
operators $m$ with $m = |o|$. Then, an arbitrary operator $o_i$ with $1 \le i \le m$ can, in the
worst case, be the target of $i - 1$ data dependencies $\delta_i^-$, and it can be the source of $m - i$
data dependencies $\delta_i^+$. Based on the equivalence $\delta^- = \delta^+$ and thus $|\delta^-| = |\delta^+|$, there
are at most
\[
\sum_{i=1}^{m} (i - 1) = \sum_{i=1}^{m-1} i = \frac{m \cdot (m - 1)}{2} \tag{4.4}
\]
dependencies between arbitrary operators of this sequence. Hence, the dependency analysis
is computable with quadratic complexity of $O(m^2)$.
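
As a quick sanity check of this bound, the following sketch counts the dependencies by brute force for the worst case in which every operator $o_i$ depends on all $i - 1$ predecessors. The function name `worst_case_dependencies` is an assumption introduced for illustration only.

```python
def worst_case_dependencies(m):
    """Count (target, source) pairs if each operator o_i depends on o_1..o_{i-1}."""
    return sum(1 for i in range(1, m + 1) for _ in range(1, i))


# Compare the brute-force count with the closed form m * (m - 1) / 2.
for m in (1, 2, 6, 50):
    assert worst_case_dependencies(m) == m * (m - 1) // 2

print(worst_case_dependencies(6))  # 15
```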
When evaluating operator $o_i$, there are at most $\sum_{i=1}^{m-1} i = m \cdot (m - 1)/2$ (for cases
$i = m - 1$ and $i = m$) dependencies in $D$. During the graph creation, for each operator,
each dependency $d \in D$ must be evaluated in order to connect queues and execution
buckets. In summary, for all operators, we must check at most
\[
\begin{aligned}
\sum_{i=1}^{m} \sum_{j=1}^{i} (m - j)
  &= m \cdot \frac{m \cdot (m + 1)}{2} - \sum_{i=1}^{m} i \cdot (m - i + 1) \\
  &= \frac{m^2 \cdot (m + 1)}{2} + \sum_{i=1}^{m} i^2 - \sum_{i=1}^{m} m \cdot i - \sum_{i=1}^{m} i
   = \frac{m^3 - m}{3}
\end{aligned}
\tag{4.5}
\]
dependencies for graph creation. Hence, the graph creation part of the algorithm is computed
with cubic complexity of $O(m^3)$. In summary, the plan vectorization algorithm exhibits a
cubic worst-case time complexity of $O(m^3 + m^2) = O(m^3)$. Hence, Theorem 4.1 holds.
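
The closed form used for the graph creation part can be checked numerically. The sketch below evaluates the double sum $\sum_{i=1}^{m} \sum_{j=1}^{i} (m - j)$ term by term and compares it with $(m^3 - m)/3$. The function name `graph_creation_checks` is a hypothetical label for this sum, not part of the A-PV.

```python
def graph_creation_checks(m):
    """Evaluate the double sum over i = 1..m and j = 1..i of (m - j)."""
    return sum(m - j for i in range(1, m + 1) for j in range(1, i + 1))


# m^3 - m = (m - 1) * m * (m + 1) is always divisible by 3,
# so integer division is exact here.
for m in range(1, 30):
    assert graph_creation_checks(m) == (m**3 - m) // 3

print(graph_creation_checks(6))  # 70
```

For $m = 6$, the sum gives 70 checks, against 15 dependencies found in the worst case by the dependency analysis, which illustrates why the graph creation dominates the overall bound.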
Note that this is the complexity analysis of our A-PV algorithm, while the P-PV problem
can be solved with quadratic time complexity of $O(m^2)$. For example, one can use