25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


2.1 Integration Flows

efficient processing of simple integration tasks, complex integration tasks require many indirections and thus cannot be realized efficiently or cannot be modeled at all. In addition, from the perspective of flow semantics, we distinguish between data-flow-oriented and control-flow-oriented modeling [MMLW05].

Example 2.2. Figure 2.5 shows an example integration flow under both flow semantics (data flow and control flow). Figure 2.5(a) illustrates an example plan with data-flow semantics, where we receive a message, execute two filters, and finally send the result to two external systems. Here, the edges describe data dependencies between operators, while temporal dependencies cannot be specified. For example, we cannot specify the temporal order in which to execute the two final writes. A specific characteristic is that if two or more operators require a certain intermediate result, this intermediate result must be copied in order to send it to each of these operators. In contrast, Figure 2.5(b) illustrates the same example plan using control-flow semantics. Here, the edges describe temporal dependencies, while data dependencies are given implicitly by input and output variables (for clarity, illustrated as dashed edges). Thus, the semantics of this integration flow additionally include the execution order of operators and do not require copy operations.

[Figure 2.5: Integration Flow Modeling with Directed Graphs. (a) Data-Flow Semantics: Receive, Selection, Selection, Copy, and two Write operators. (b) Control-Flow Semantics: Receive, Selection, Selection, Write, Write, with the variables var1, var2, and var3.]
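The difference between the two semantics in Example 2.2 can be sketched in a few lines of Python. This is only an illustrative model, not the representation used in this thesis: the operator names `Selection1`, `Selection2`, `Write1`, `Write2`, the helper functions, and the assignment of `var1` to `var3` to specific operators are assumptions made for the sketch.

```python
# Data-flow semantics: edges are data dependencies; a Copy operator
# duplicates the intermediate result for the two writes.
dataflow_edges = [
    ("Receive", "Selection1"),
    ("Selection1", "Selection2"),
    ("Selection2", "Copy"),
    ("Copy", "Write1"),
    ("Copy", "Write2"),
]


def valid_orders(edges):
    """Enumerate all execution orders that respect the given edges."""
    nodes = sorted({n for edge in edges for n in edge})
    preds = {n: {u for u, v in edges if v == n} for n in nodes}

    def extend(done):
        if len(done) == len(nodes):
            yield list(done)
            return
        for n in nodes:
            if n not in done and preds[n] <= set(done):
                yield from extend(done + [n])

    return list(extend([]))


# Control-flow semantics: the list order is the temporal order; data
# dependencies are implicit in (inputs, output) variables.
# The variable assignment below is an assumption for illustration.
control_flow = [
    ("Receive", [], "var1"),
    ("Selection1", ["var1"], "var2"),
    ("Selection2", ["var2"], "var3"),
    ("Write1", ["var3"], None),
    ("Write2", ["var3"], None),
]


def implied_data_edges(steps):
    """Derive the implicit data dependencies from the variables."""
    last_writer = {}
    edges = []
    for op, inputs, output in steps:
        for var in inputs:
            if var in last_writer:
                edges.append((last_writer[var], op))
        if output is not None:
            last_writer[output] = op
    return edges


if __name__ == "__main__":
    for order in valid_orders(dataflow_edges):
        print(" -> ".join(order))
    print(implied_data_edges(control_flow))
```

Under the data-flow model, the order of the two writes is left open, so more than one execution order is valid. Under the control-flow model, the order is fixed by the sequence, and both writes read the same variable without any copy operator.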

Examples of integration flow modeling with directed graphs and control-flow semantics are UML activity diagrams [OMG03] and BPMN process specifications [BMI06]. In contrast, directed graphs in combination with data-flow semantics are commonly used by traditional ETL tools. To summarize, control-flow semantics specify an integration flow more precisely than pure data-flow semantics because the control flow includes the data flow and additional temporal dependencies. However, note that the implicit data-flow specification (alongside the primary temporal dependencies) can cause semantic data-flow modeling errors such as lost or inconsistent data [TvdAS09].
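To make the risk of implicit data-flow specification concrete, the following sketch checks a linear control-flow plan for two simple error classes: a variable that is read before any operator has written it, and a value that is overwritten before any operator has read it. This is a deliberately simplified check written for illustration, not the analysis described in [TvdAS09]; the function name `check_variables` and the faulty example plan are assumptions.

```python
def check_variables(steps):
    """Scan a linear plan of (operator, inputs, output) triples.

    Returns two lists:
      missing: (operator, variable) pairs where a variable is read
               although no earlier operator has written it
      lost:    (operator, variable) pairs where the value written by
               that operator is overwritten before being read
    """
    written = set()
    unread = {}  # variable -> operator whose write has not been read yet
    missing, lost = [], []
    for op, inputs, output in steps:
        for var in inputs:
            if var not in written:
                missing.append((op, var))
            unread.pop(var, None)
        if output is not None:
            if output in unread:
                lost.append((unread[output], output))
            written.add(output)
            unread[output] = op
    return missing, lost


# Hypothetical faulty variant of the plan from Example 2.2.
faulty_plan = [
    ("Receive", [], "var1"),
    ("Selection1", ["var1"], "var2"),
    ("Selection2", ["var1"], "var2"),  # overwrites var2 before it is read
    ("Write1", ["var3"], None),        # var3 is never written
]

if __name__ == "__main__":
    missing, lost = check_variables(faulty_plan)
    print("missing:", missing)
    print("lost:", lost)
```

With explicit data-flow edges, neither mistake could be expressed, because each edge connects one producer to one consumer. With variables, both plans are syntactically valid, which is why such errors need a separate check.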

In addition to this classification, further aspects of modeling integration flows, which we outline in the following, are currently discussed in the literature. These include (1) the combination of control-flow and data-flow modeling (hybrid flow semantic), (2) the combination of hierarchical and source code structure (hybrid flow structure), (3) the model-driven development of integration flows, and (4) declarative flow modeling.

A. Hybrid Flow Semantic<br />

The strict distinction between data-flow and control-flow semantics has been considered a problem, especially in the context of data-intensive integration flows that also require rather complex procedural aspects. In consequence, two projects have addressed the combination of data flow and control flow using hybrid modeling semantics, where the data flow is modeled explicitly rather than only by input and output variables.

First, there is the concept of BPEL/SQL, where specific SQL activities can be used within BPEL process specifications in order to combine the advantages of data-flow and
