25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


2.3 Integration Flow Meta Model

access to heterogeneous systems and applications. There, the proprietary external messages and data representations are transformed into the described internal message meta model. In detail, the group of interaction-oriented operators includes the operators shown in Table 2.1. In contrast, the data-flow-oriented and control-flow-oriented operators are used as local processing steps within the integration platform, in the sense that they do not perform any interactions with external systems. These two groups of operators are shown in Table 2.2 and Table 2.3, respectively.

The instance-based plan execution has several implications for all operators. First, the operators use materialized intermediate results in the sense of local message variables. Second, the data flow is implicitly given by those input and output variables.
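The two implications above can be illustrated with a minimal sketch. It is not the thesis's operator model: the `Operator` class, the operator names (`receive`, `translate`, `invoke`) and the variable names (`msg1` to `msg3`) are hypothetical and chosen only for illustration. Each operator writes its result into a named message variable (materialized intermediate result), and the data flow is derived purely from which variables an operator reads and writes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Operator:
    """Hypothetical operator: reads input variables, writes one output variable."""
    name: str
    inputs: Tuple[str, ...]
    output: str
    fn: Callable[..., object]


def execute_plan(operators: List[Operator], variables: Dict[str, object]) -> Dict[str, object]:
    """Run the operators in order; every result is materialized in a message variable."""
    for op in operators:
        args = [variables[v] for v in op.inputs]
        variables[op.output] = op.fn(*args)
    return variables


def data_flow_edges(operators: List[Operator]) -> List[Tuple[str, str]]:
    """Derive producer -> consumer edges from the input and output variables alone."""
    producer: Dict[str, str] = {}
    edges: List[Tuple[str, str]] = []
    for op in operators:
        for v in op.inputs:
            if v in producer:
                edges.append((producer[v], op.name))
        producer[op.output] = op.name
    return edges


# Illustrative plan with made-up operators and payloads.
plan = [
    Operator("receive", (), "msg1", lambda: {"id": 7, "qty": 3}),
    Operator("translate", ("msg1",), "msg2", lambda m: {**m, "qty": m["qty"] * 2}),
    Operator("invoke", ("msg2",), "msg3", lambda m: f"sent {m['id']}"),
]

result = execute_plan(plan, {})
edges = data_flow_edges(plan)
```

No explicit edges between operators are declared anywhere in `plan`; `data_flow_edges` recovers them from the variable names, which is the sense in which the data flow is implicit.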

Moreover, we distinguish between external integration flow descriptions (e.g., BPEL), internal plans (logical representation) and internal compiled plans (physical representation). Here, the term plan is a shorthand for internal plans. In Section 2.4, we present several use cases that are used as example plans throughout the whole thesis.

2.3.2 Transactional Properties

There are several transactional properties of integration flows that must be guaranteed under all circumstances. In this section, we discuss problems that can occur while executing an integration flow, how they are typically addressed in integration platforms, and what they imply for the cost-based optimization of integration flows. For more details, see our analysis of problem categories [BHLW08a].

The most important risk of executing integration flows is that messages are lost when a simple send-and-forget execution principle is used.

Problem 2.1 (Message Lost). Assume that the stream of incoming messages M is collected using transient (in-memory) inbound message queues. If the server of the integration platform breaks down, all messages not yet sent to the target systems are lost. This is a problem because the messages often cannot be restored by the source systems.

As shown in Section 2.1.2, this problem is typically addressed by persistently storing all incoming messages at the inbound server side of the integration platform. The resulting implication is that all integration platforms follow a store-and-forward principle in order to guarantee that each received message will be successfully delivered to the external systems. Thus, if a failure occurs, the stored messages are used to resume the state of execution.
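A minimal sketch of the store-and-forward idea follows. It is an assumption-laden illustration, not the mechanism of any particular integration platform: the class `PersistentInbox`, the table layout and the payloads are hypothetical, and SQLite from the Python standard library stands in for the persistent inbound message store. Each message is written to disk before it is processed, so that undelivered messages can be found again after a simulated breakdown.

```python
import os
import sqlite3
import tempfile


class PersistentInbox:
    """Hypothetical persistent inbound queue backed by an SQLite file."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS inbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "payload TEXT NOT NULL, "
            "delivered INTEGER NOT NULL DEFAULT 0)"
        )
        self.conn.commit()

    def store(self, payload):
        """Persist the message before any processing takes place."""
        cur = self.conn.execute("INSERT INTO inbox (payload) VALUES (?)", (payload,))
        self.conn.commit()
        return cur.lastrowid

    def pending(self):
        """Return all stored messages that were not yet delivered."""
        rows = self.conn.execute(
            "SELECT id, payload FROM inbox WHERE delivered = 0 ORDER BY id"
        )
        return rows.fetchall()

    def mark_delivered(self, msg_id):
        self.conn.execute("UPDATE inbox SET delivered = 1 WHERE id = ?", (msg_id,))
        self.conn.commit()

    def close(self):
        self.conn.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "inbox.db")
    inbox = PersistentInbox(path)
    first = inbox.store("order-1")
    inbox.store("order-2")
    inbox.mark_delivered(first)   # order-1 reached its target system
    inbox.close()                 # simulated server breakdown

    recovered = PersistentInbox(path)  # restart: reopen the same store
    pending = recovered.pending()
    recovered.close()
    return pending
```

After the simulated restart, `demo()` finds only the undelivered message, which is the state from which execution would resume. With a transient in-memory queue, that message would be gone.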

A failure or server breakdown can occur at an arbitrary point in time. Thus, there might be operators and interactions with external systems that have already been successfully finished, while other operators have not. When re-executing the complete integration flow, the following problem arises.

Problem 2.2 (Message Double Processing). Assume that a server breakdown has occurred during execution of plan instance p₁. Recall that typically, each interaction with an external system is an atomic transaction. Thus, there might be successfully finished operators and currently unfinished operators. Furthermore, if the external system does not support transactional behavior, there might be partially successful interactions with external systems. If we re-execute the plan instance p₁ as p′₁, we might send the same message twice to an external system.
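The effect can be reproduced with a small simulation. This is only a sketch under stated assumptions: `ExternalSystem`, `ServerBreakdown` and `run_plan` are hypothetical names, the plan consists of just two interactions, and the breakdown is injected by a flag between them.

```python
class ExternalSystem:
    """Hypothetical external system that records every message it receives."""

    def __init__(self):
        self.received = []

    def send(self, message):
        self.received.append(message)


class ServerBreakdown(Exception):
    """Stands in for a breakdown of the integration platform."""


def run_plan(message, system_a, system_b, fail_before_b=False):
    """A plan with two interactions; the breakdown may hit between them."""
    system_a.send(message)          # first interaction finishes successfully
    if fail_before_b:
        raise ServerBreakdown("breakdown after the first interaction")
    system_b.send(message)          # second interaction


def demo():
    a, b = ExternalSystem(), ExternalSystem()
    try:
        run_plan("m1", a, b, fail_before_b=True)   # plan instance p1 breaks down
    except ServerBreakdown:
        pass
    run_plan("m1", a, b)                           # naive re-execution as p1'
    return a.received, b.received
```

After `demo()`, the first system has received `m1` twice while the second has received it once: the re-execution repeats the interaction that had already finished before the breakdown.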

In order to tackle this problem, specific recovery models for integration flows are used, where we distinguish between two types. First, there is the compensation-based transac-
