Cost-Based Optimization of Integration Flows - Datenbanken ...
6 On-Demand Re-Optimization
$f_{\gamma((\gamma R) \bowtie S)} = f_{\gamma(R \bowtie S)} / f_{\gamma(R)}$. If we assume independence for this group-by selectivity, we
have $f_{\gamma((\gamma R) \bowtie S)} = 1$ and can set $f_{\gamma(R \bowtie S)} = f_{\gamma(R)}$. As a result, we can derive all variables
of the optimality conditions from statistics of the optimal plan.
[Figure 6.10, panels: (a) Example POT (statistics |R|, |S|, |R ⋈ S|, |γ(R ⋈ S)| with optimality conditions oc1 to oc4; * marks |R|, |S| as additional input), (b) Optimality Conditions, (c) Complexity Analysis]
Figure 6.10: Example Eager Group-By<br />
Figure 6.10(a) shows the resulting PlanOptTree, where we omitted some connections
(*) to atomic statistic nodes for simplicity of presentation. Note that for eager group-by,
no transitivity is used. Furthermore, only the plan optimality is modeled rather than
the whole plan search space. Hence, only four optimality conditions are required per join
operator, as shown in Figure 6.10(b). Accordingly, Figure 6.10(c) compares the number of
alternative plans of the full search space with the number of required optimality conditions.
The improvement stems from the fact that, for each join input, we model only whether
pre-aggregation is advantageous.
Union Distinct Example<br />
In contrast to join enumeration or eager group-by, there are many control-flow- and<br />
data-flow-oriented optimization techniques with fairly simple optimality conditions and<br />
thus, rather small PlanOptTrees. An example is the optimization technique WD11:<br />
Setoperation-Type Selection (set operations with distinctness).<br />
[Figure 6.11, panels: (a) Union Distinct Alternatives (R ∪ S and sort(R) ∪M sort(S)), (b) Example POT (statistics |R|, |S|, |R ∪ S| with optimality conditions oc1 and oc'2)]
Figure 6.11: Example Union Distinct<br />
There are three alternative subplans for a union distinct R∪S. First, there is the normal<br />
union distinct operator with costs that are given by C(R ∪ S) = |R| + |S| · |R ∪ S|/2 (two<br />
plans due to asymmetric costs), where |R| ≤ |R ∪ S| ≤ |R| + |S| holds. Second, we can<br />
sort both inputs and apply a merge algorithm with costs of

$$C(\mathrm{sort}(R) \cup_M \mathrm{sort}(S)) = |R| + |S| + |R| \cdot \log_2 |R| + |S| \cdot \log_2 |S|. \qquad (6.7)$$
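To make the comparison concrete, the following minimal Python sketch evaluates only the two cost formulas stated above: the plain union distinct in both operand orders, and the sort-merge variant of Equation 6.7. The function names, the plan labels, and the treatment of empty inputs are assumptions made for illustration and are not part of the original cost model.

```python
import math


def cost_union_distinct(first: int, second: int, result: int) -> float:
    """C(R ∪ S) = |R| + |S| * |R ∪ S| / 2, with R as the first input."""
    return first + second * result / 2


def cost_merge_union(r: int, s: int) -> float:
    """Equation 6.7: |R| + |S| + |R| * log2|R| + |S| * log2|S|."""
    def sort_cost(n: int) -> float:
        # Assumption: an empty input incurs no sort cost (log2(0) is undefined).
        return n * math.log2(n) if n > 0 else 0.0

    return r + s + sort_cost(r) + sort_cost(s)


def cheapest_union_plan(r: int, s: int, result: int):
    """Return (label, costs) for the cheapest of the alternatives modeled here."""
    # Bound stated in the text: |R| <= |R ∪ S| <= |R| + |S|.
    if not r <= result <= r + s:
        raise ValueError("expected |R| <= |R ∪ S| <= |R| + |S|")
    costs = {
        "R ∪ S": cost_union_distinct(r, s, result),
        "S ∪ R": cost_union_distinct(s, r, result),  # asymmetric costs
        "sort(R) ∪M sort(S)": cost_merge_union(r, s),
    }
    best = min(costs, key=costs.get)
    return best, costs


if __name__ == "__main__":
    for r, s, result in [(2, 2, 4), (8, 8, 16), (4, 16, 16)]:
        best, costs = cheapest_union_plan(r, s, result)
        print(f"|R|={r}, |S|={s}, |R ∪ S|={result}: {best} {costs}")
```

With these formulas, small inputs favor the plain operator, whereas larger inputs of similar size favor the merge variant, and strongly skewed input sizes can make the operand order of the plain operator the deciding factor.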