Cost-Based Optimization of Integration Flows - Datenbanken ...
3 Fundamentals of Optimizing Integration Flows
[Figure 3.4: Plan Cost Estimation Example, showing (a) plan $P_3$ and (b) plan $P_3'$]
Furthermore, we estimate the missing execution times using the monitored statistics of $P_3$
and the defined abstract costs of $P_3$ and $P_3'$ as follows:
$$\hat{W}(o_5') = \frac{|ds_{in}(o_5')| + \frac{|ds_{in}(o_5')| \cdot |ds_{out}(o_5')|}{2}}{|ds_{in}(o_5)| + \frac{|ds_{in}(o_5)| \cdot |ds_{out}(o_5)|}{2}} \cdot W(o_5) = \frac{2{,}505{,}000}{2{,}505{,}000} \cdot 35\,\text{ms} = 35\,\text{ms}$$

$$\hat{W}(o_4') = \frac{|ds_{in1}(o_4')| + |ds_{in1}(o_4')| \cdot |ds_{in2}(o_4')|}{|ds_{in1}(o_4)| + |ds_{in1}(o_4)| \cdot |ds_{in2}(o_4)|} \cdot W(o_4) = \frac{1{,}001{,}000}{5{,}005{,}000} \cdot 50\,\text{ms} = 10\,\text{ms}.$$
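The scaling step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the individual cardinalities (5,000 and 1,000 tuples) are read off Figure 3.4 and chosen so that they reproduce the totals 2,505,000, 1,001,000, and 5,005,000 used in the equations.

```python
def groupby_cost(n_in, n_out):
    # Abstract Groupby cost from the example: |ds_in| + |ds_in| * |ds_out| / 2.
    return n_in + n_in * n_out / 2


def join_cost(n_in1, n_in2):
    # Abstract Join cost from the example: |ds_in1| + |ds_in1| * |ds_in2|.
    return n_in1 + n_in1 * n_in2


def estimate_time_ms(cost_new, cost_monitored, time_monitored_ms):
    # Scale the monitored execution time by the ratio of abstract costs.
    return cost_new / cost_monitored * time_monitored_ms


# Groupby o5 sees the same cardinalities in both plans (assumed: 5000 in, 1000 out).
w_o5_new = estimate_time_ms(groupby_cost(5000, 1000), groupby_cost(5000, 1000), 35.0)

# Join o4 receives 1000 x 1000 tuples in the rewritten plan instead of 5000 x 1000.
w_o4_new = estimate_time_ms(join_cost(1000, 1000), join_cost(5000, 1000), 50.0)

print(f"W^(o5') = {w_o5_new:.1f} ms, W^(o4') = {w_o4_new:.1f} ms")
```

Because only the ratio of abstract costs enters the estimate, any constant factor in the abstract cost formulas cancels out; the monitored time supplies the absolute scale.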
Finally, we can use the computed cost estimates, aggregate the plan costs, and compare<br />
these costs as follows:<br />
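$$W(P_3) = 5\,\text{ms} + \max(60\,\text{ms} + 3\,\text{ms},\; 26\,\text{ms} + 2 \cdot 3\,\text{ms}) + 50\,\text{ms} + 35\,\text{ms} + 5\,\text{ms} + 40\,\text{ms} = 198\,\text{ms}$$

$$\hat{W}(P_3') = 5\,\text{ms} + \max(60\,\text{ms} + 35\,\text{ms} + 3\,\text{ms},\; 26\,\text{ms} + 2 \cdot 3\,\text{ms}) + 10\,\text{ms} + 5\,\text{ms} + 40\,\text{ms} = 158\,\text{ms}.$$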
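The aggregation can be checked with a small Python sketch. It is a minimal illustration, not the thesis implementation: `fork_cost` and its `overhead_ms` parameter are hypothetical names, and charging branch i an overhead of i · 3 ms simply mirrors the max(...) terms in the two equations above.

```python
def fork_cost(branches_ms, overhead_ms=3.0):
    # Parallel branches: the slowest branch dominates. Branch i (1-based) is
    # charged i * overhead_ms, mirroring max(60 + 3, 26 + 2 * 3) in the example.
    return max(b + i * overhead_ms for i, b in enumerate(branches_ms, start=1))


# Plan P3: Assign, Fork(Invoke o2 | Invoke o3), Join, Groupby, Assign, Invoke.
w_p3 = 5 + fork_cost([60, 26]) + 50 + 35 + 5 + 40

# Plan P3': Groupby (35 ms) moved into the first branch, Join estimated at 10 ms.
w_p3_new = 5 + fork_cost([60 + 35, 26]) + 10 + 5 + 40

best = "P3'" if w_p3_new < w_p3 else "P3"
print(f"W(P3) = {w_p3} ms, W^(P3') = {w_p3_new} ms, choose {best}")
```

Sequential operators contribute the sum of their execution times, whereas the Fork contributes only the maximum over its branches, which is why moving the Groupby into the longer branch still pays off here: the Join saving (40 ms) exceeds the 35 ms added to the critical path minus the 35 ms removed from the sequence.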
In our example, we would choose $P_3'$ as the execution plan because it is optimal on average,
under the assumption of precise monitored statistics. Note that although $o_2$ and $o_7$ are defined
with equal abstract costs (Invoke), we adapt to the concrete workload characteristics
by weighting those costs with monitored execution times. In addition, the double-metric
cost model enables us to use a single metric (the execution time) for data-flow-oriented
operators (e.g., Groupby), interaction-oriented operators (e.g., Invoke), and control-flow-oriented
operators (e.g., Fork).
To summarize, we proposed the first complete cost model for integration flows. This
double-metric cost model is self-adjusting because we weight the abstract costs with monitored
execution statistics. For this reason, the estimates converge over time to the real
costs of the concrete application environment, and hence this cost model enables
adaptation to changing workload characteristics. Further, the two metrics make it possible to integrate
the interaction-, control-flow-, and data-flow-oriented operators into a single cost model
and thus enable the comparison of plans with control-flow semantics.