25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


3 Fundamentals of Optimizing Integration Flows

Optimization Algorithm

According to the defined integration flow optimization problem, we now explain the overall optimization algorithm, including the two aspects of (1) when and how to trigger re-optimization of a plan and (2) how to re-optimize the given plan using the set of available cost-based optimization techniques. The naïve algorithm for solving the P-PPO comprises three subproblems: (1) the complete creation of alternative plans (the search space), (2) the periodic cost evaluation of each created plan (search space evaluation), and (3) the choice of the plan with minimal costs. In contrast to this generation-based approach, we exploit the specific characteristic of an initially given imperative plan by using an iterative (transformation-based) optimization algorithm. In the following, we describe in detail how to trigger re-optimization and how to re-optimize the given plan.
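The difference between the generation-based and the transformation-based approach can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the text: the toy operators, the cost model (per-tuple cost times input cardinality), and the single rewrite rule (swapping two adjacent operators), which stands in for the actual cost-based optimization techniques.

```python
from itertools import permutations

# Hypothetical toy operators: (name, cost per input tuple, selectivity).
OPS = (("A", 4.0, 0.75), ("B", 1.0, 0.5), ("C", 2.0, 0.25))


def plan_cost(plan, cardinality=128.0):
    """Assumed cost model: each operator pays its per-tuple cost for every
    input tuple and passes on only the fraction given by its selectivity."""
    total = 0.0
    for _name, cost_per_tuple, selectivity in plan:
        total += cardinality * cost_per_tuple
        cardinality *= selectivity
    return total


def generation_based(ops):
    """Naive approach: create the complete search space (all orderings),
    evaluate the cost of each plan, and choose the cheapest one."""
    return min(permutations(ops), key=plan_cost)


def transformation_based(plan):
    """Iterative approach: start from the given plan and keep applying a
    rewrite (swap of adjacent operators) while it lowers the cost."""
    plan = list(plan)
    changed = True
    while changed:
        changed = False
        for i in range(len(plan) - 1):
            candidate = plan[:i] + [plan[i + 1], plan[i]] + plan[i + 2:]
            if plan_cost(candidate) < plan_cost(plan):
                plan, changed = candidate, True
    return tuple(plan)
```

In this toy cost model both functions arrive at the same plan, but the first evaluates all n! orderings, while the second only evaluates plans reachable from the given one. In general, an iterative rewrite of this kind is only guaranteed to stop at a plan that no single rewrite can improve.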

Algorithm 3.1 Trigger Re-Optimization (A-TR)

Require: plan identifier ptid, optimization interval ∆t, workload window size ∆w,
         aggregation method method, optimization algorithm algorithm

 1: while true do
 2:   sleep(∆t)
 3:   P ← getPlan(ptid)
 4:   DG ← getDependencyGraph(ptid)
 5:   Estimator.aggregateStatistics(P, ∆w, method) // see Subsection 3.3.3
 6:   ret ← Optimizer.optimizePlan(P, DG, algorithm)
 7:   if ret.isChanged() then
 8:     P ← ret.getPlan()
 9:     putPlan(ptid, P)
10:     putDependencyGraph(ptid, ret.getDG())
11:     Parser.recompilePlan(ptid, P)
12:     Runtime.exchangePlans(ptid, P)

Algorithm 3.1⁴ illustrates when and how re-optimization is triggered. Essentially, this algorithm is started as a background thread for each deployed integration flow and periodically issues plan re-optimization with period ∆t (line 2). For this purpose, the monitored execution statistics are aggregated with a given aggregation method (line 5), and re-optimization is initiated with one of our optimization algorithms (line 6). If the current plan has been changed during this re-optimization, we recompile the logical plan into an executable physical plan (line 11) and exchange the plan at the next possible point between two subsequent plan instances (line 12). The optimization algorithm is selected when re-optimization is triggered: patternMatchingOptimization (A-PMO) is the default algorithm, while several additional heuristic algorithms can be used for search space reduction.
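The control flow of Algorithm 3.1 could be sketched in Python roughly as follows. This is a sketch under stated assumptions, not the actual implementation: `ToyEngine`, `Result`, `reoptimize_once`, and `trigger_reoptimization` are hypothetical names, the engine class merely stands in for the plan store, Estimator, Optimizer, Parser, and Runtime, and sorting the operator list is a placeholder for real optimization techniques. A `threading.Event` replaces the plain `sleep(∆t)` so that the background loop can be stopped.

```python
import threading
from dataclasses import dataclass


@dataclass
class Result:
    """Hypothetical optimizer result (ret in Algorithm 3.1)."""
    plan: list
    dependency_graph: dict
    changed: bool


class ToyEngine:
    """Hypothetical in-memory stand-in for the plan store, Estimator,
    Optimizer, Parser, and Runtime components."""

    def __init__(self, ptid, plan):
        self.plans = {ptid: plan}
        self.graphs = {ptid: {}}
        self.log = []

    def aggregate_statistics(self, plan, window, method):
        self.log.append("aggregate")

    def optimize_plan(self, plan, dg, algorithm):
        # Placeholder rewrite: sorting stands in for real techniques.
        optimized = sorted(plan)
        return Result(optimized, dg, optimized != plan)

    def recompile_plan(self, ptid, plan):
        self.log.append("recompile")

    def exchange_plans(self, ptid, plan):
        self.log.append("exchange")


def reoptimize_once(engine, ptid, window, method, algorithm):
    """One pass over lines 3-12; returns True if the plan was changed."""
    plan = engine.plans[ptid]                               # line 3
    dg = engine.graphs[ptid]                                # line 4
    engine.aggregate_statistics(plan, window, method)       # line 5
    ret = engine.optimize_plan(plan, dg, algorithm)         # line 6
    if ret.changed:                                         # line 7
        engine.plans[ptid] = ret.plan                       # lines 8-9
        engine.graphs[ptid] = ret.dependency_graph          # line 10
        engine.recompile_plan(ptid, ret.plan)               # line 11
        engine.exchange_plans(ptid, ret.plan)               # line 12
    return ret.changed


def trigger_reoptimization(engine, ptid, interval, window, method,
                           algorithm, stop):
    """Periodic loop (lines 1-2); stop.wait(interval) acts as sleep(∆t)
    and ends the loop as soon as the stop event is set."""
    while not stop.wait(interval):
        reoptimize_once(engine, ptid, window, method, algorithm)
```

One such loop per deployed integration flow could then be started with `threading.Thread(target=trigger_reoptimization, args=(engine, "p1", 60.0, 10, "avg", "A-PMO", stop), daemon=True).start()`, where all argument values are illustrative. Recompilation and plan exchange happen only in rounds in which the optimizer reports a changed plan.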

Algorithm 3.2 illustrates our transformation-based optimization algorithm A-PMO. This algorithm is invoked for the complete plan, where it recursively iterates over the hierarchy of operator sequences (of the current plan) and applies optimization techniques according to the operator types (the included comments show the abbreviations of the applied optimization techniques, which we partly discuss in Section 3.4). There are four types of optimization techniques. First, we apply all techniques that need to be executed at the top level of a plan and before all other optimization techniques (line 2). For example, the join

⁴ Similar to the naming scheme of problems, we use the prefix A- to indicate names of algorithms.
