25.01.2015 Views

Cost-Based Optimization of Integration Flows - Datenbanken ...


2.1 Integration Flows

complex queries, Publish/Subscribe systems usually execute huge numbers of fairly simple queries. On the other hand, there are ETL tools that follow a time-based or data-driven event model. All three system categories conceptually use a hub-and-spoke or bus topology, where in both cases a central integration platform is used. Finally, all system types in the application area of information integration are query-based with respect to the specification method used for modeling integration tasks. ETL tools are a special case because they often use integration flows as the specification method as well.

The categories of application integration and process integration refer to a more loosely coupled type of integration, where integration flows are used as the specification method and message-oriented flows are hierarchically composed. The main distinction between the two is that application integration refers to the integration of heterogeneous systems and applications, while process integration refers to a business-process-oriented integration of homogeneous services (e.g., Web services). Both are classified as materialized integration approaches because messages are propagated to and physically stored by the target systems. However, application integration is data-centric in terms of efficiently exchanging data between the involved applications, while process integration focuses more on procedural aspects in the sense of controlling the overall business process and its involved systems. In this context, we see many different facets of system types such as (near) real-time ETL tools (that use the data-driven event model), MOM systems (that use standard messaging infrastructures such as Java Message Service), EAI systems, BPEL engines (Business Process Execution Language), and Web Service Management Systems (WSMS). Note that those system categories have converged more and more in the past [Sto02, HAB+05] in the form of overlapping functionalities [Sto02]. For example, standards from the area of process integration such as BPEL are also partially used to specify application integration tasks.

Finally, GUI integration (Graphical User Interface) describes the unique and integrated visualization of (or the access to) heterogeneous and distributed data sources. Portals provide a unique system for accessing heterogeneous data sources and applications, where data is only integrated for visualization purposes. In contrast, mashups dynamically compose Web content (feeds, maps, etc.) for creating small applications with a stronger focus on content integration. See Aumueller and Thor [AT08] for a detailed classification of existing mashup approaches. Both portals and mashups are classified as virtual integration approaches that use a hierarchical topology, and the specification is mainly user-interface-oriented.

The exclusive scope of this thesis is the category of integration flows. As a result, the proposed approaches can be applied to process integration, application integration, and partially to information integration as well.

2.1.2 System Architecture for Integration Flows

The integration of highly heterogeneous systems and applications that require fairly complex procedural aspects makes it almost impossible to realize these integration tasks using traditional techniques from the area of distributed query processing [Kos00] or replication techniques [CCA08, PGVA08]. In consequence, those complex integration tasks are typically modeled and executed as imperative integration flow specifications.
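To make the notion of an imperative integration flow specification more concrete, the following is a minimal sketch and not taken from the thesis. It assumes a hypothetical order message that is translated to a target schema and then routed to one of two target systems. All class, function, and field names (`Message`, `TargetSystem`, `translate`, `order_flow`, `custNo`, and so on) are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Message:
    """A message passed between the steps of the flow (hypothetical structure)."""
    payload: Dict[str, object]


@dataclass
class TargetSystem:
    """Stand-in for an external system that physically stores received messages."""
    name: str
    stored: List[Message] = field(default_factory=list)

    def send(self, msg: Message) -> None:
        self.stored.append(msg)


def translate(msg: Message) -> Message:
    """Map the assumed source schema to the assumed target schema."""
    return Message({
        "customer_id": msg.payload["custNo"],
        "amount": float(msg.payload["total"]),
        "region": str(msg.payload["country"]).upper(),
    })


def order_flow(msg: Message, eu: TargetSystem, other: TargetSystem) -> None:
    """Imperative flow: a fixed sequence of steps with explicit control flow."""
    translated = translate(msg)                       # step 1: schema translation
    if translated.payload["region"] in {"DE", "FR"}:  # step 2: content-based routing
        eu.send(translated)                           # step 3a: deliver to EU system
    else:
        other.send(translated)                        # step 3b: deliver elsewhere


eu_erp = TargetSystem("eu_erp")
us_erp = TargetSystem("us_erp")
order_flow(Message({"custNo": 7, "total": "19.90", "country": "de"}), eu_erp, us_erp)
order_flow(Message({"custNo": 8, "total": "5", "country": "us"}), eu_erp, us_erp)
```

In contrast to a declarative query, the order of the steps and the branching are spelled out explicitly here, which is the property the term "imperative" refers to in this sketch.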

Based on the specific characteristics of integration flows, a typical system architecture has evolved in the past. This architecture is commonly used by the major EAI products such as SAP eXchange Infrastructure (XI) / Process Integration (PI) [SAP10], IBM
