07.02.2013 Views

Best Practices for SAP BI using DB2 9 for z/OS - IBM Redbooks


Partitioning and clustering are both important performance tuning techniques for
DataStore objects. DataStore objects are typically large, containing many rows of
data. Queries against a DataStore object may retrieve thousands or more rows
of data before a DB2 aggregate function, such as SUM, is applied. Queries that
retrieve these large numbers of rows benefit most from partitioning and
clustering.

7.4.1 Choosing the partitioning key<br />

The partitioning key defines how a table is to be partitioned: it dictates which
records are in which partitions. With table-controlled partitioning, the partitioning
index is no longer required. The partitioning key and the limit key values for a
table in a partitioned tablespace can be specified using the PARTITION BY
clause and the PARTITION ENDING AT clause of the CREATE TABLE
statement.
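To make the role of the limit keys concrete, the following is a minimal Python sketch of how an ascending partitioning key value maps to a partition number. It is an illustration only, not DB2 code: the function name `partition_for`, the `LIMIT_KEYS` values, and the YYYYMM-style key are hypothetical, and the sketch assumes that each limit key is the highest value allowed in its partition and that a value above the last limit key is rejected.

```python
from bisect import bisect_left

# Hypothetical limit keys for a YYYYMM-style partitioning column,
# one partition per quarter; each value plays the role of one
# PARTITION ENDING AT limit key (assumed inclusive, ascending).
LIMIT_KEYS = ["200703", "200706", "200709", "200712"]


def partition_for(key, limit_keys):
    """Return the 1-based partition number whose range contains key.

    Partition n holds keys greater than limit key n-1 and less than
    or equal to limit key n.
    """
    # Index of the first limit key that is >= key.
    idx = bisect_left(limit_keys, key)
    if idx == len(limit_keys):
        # Assumption: values above the highest limit key are not accepted.
        raise ValueError(f"key {key!r} is above the last limit key")
    return idx + 1


if __name__ == "__main__":
    for value in ("200701", "200703", "200704", "200712"):
        print(value, "-> partition", partition_for(value, LIMIT_KEYS))
```

Because the limit keys are sorted, the lookup is a binary search, which is why a row's partition can be determined from its key value alone.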

So, how do you choose the partitioning key? You can decide later whether you also
want a partitioning index on the partitioning key. A partitioning index is an
index whose columns are the same as, or start with, the columns in the
PARTITION BY clause of the CREATE TABLE statement, with the same collating
sequence. In other words, it is an index that matches the partitioning key.
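The definition above amounts to a leading-columns check, which can be sketched in Python as follows. This is only an illustration of the rule as stated here; the function name `is_partitioning_index` and the column names are hypothetical, and the collating sequence is modeled simply as "ASC" or "DESC" per column.

```python
def is_partitioning_index(index_columns, partitioning_key):
    """Return True if the index matches the partitioning key.

    Both arguments are sequences of (column_name, order) tuples, where
    order is "ASC" or "DESC". The index qualifies when its leading
    columns equal the partitioning key columns, in the same order and
    with the same collating sequence.
    """
    n = len(partitioning_key)
    if n == 0 or len(index_columns) < n:
        return False
    return list(index_columns[:n]) == list(partitioning_key)


if __name__ == "__main__":
    key = [("CALMONTH", "ASC")]  # hypothetical partitioning key
    print(is_partitioning_index([("CALMONTH", "ASC"), ("DOC_NUMBER", "ASC")], key))
    print(is_partitioning_index([("DOC_NUMBER", "ASC"), ("CALMONTH", "ASC")], key))
    print(is_partitioning_index([("CALMONTH", "DESC")], key))
```

The first index qualifies because it starts with the key column; the second does not because the key column is not leading; the third does not because the collating sequence differs.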

The characteristics of a good partitioning key are:<br />

► It should provide partition elimination.

Users are not normally interested in all the data in a table. Instead, they are
only interested in a subset of the data. If this subset of data is stored in one or
a few partitions, then DB2 needs only to access these partitions and can
eliminate the others from its search. Accessing and searching less data
improves query performance.

► It should enable parallelism.

Each partition is a separate physical data set and has its own space map
page. This allows the partitions to be operated on in parallel. In general, work
done in parallel can be completed more quickly than work done serially.

► It should be able to provide similar-sized partitions.

A key reason for partitioning is to handle large tables and to manage the size
of the physical data sets behind the table. Having similar-sized partitions
usually optimizes performance and availability.
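Two of these characteristics, partition elimination and similar-sized partitions, can be checked for a candidate key with a small Python sketch. It illustrates the idea only and does not reproduce how DB2 evaluates predicates: the names `qualifying_partitions`, `size_skew`, and `LIMIT_KEYS`, the YYYYMM-style key, and the row counts are all hypothetical, and the limit keys are assumed to be ascending and inclusive.

```python
from bisect import bisect_left

# Hypothetical ascending, inclusive limit keys (one partition per quarter).
LIMIT_KEYS = ["200703", "200706", "200709", "200712"]


def qualifying_partitions(low, high, limit_keys):
    """Return the 1-based partitions that can hold keys in [low, high].

    All other partitions can be eliminated for a range predicate such
    as key BETWEEN low AND high.
    """
    first = bisect_left(limit_keys, low)
    last = min(bisect_left(limit_keys, high), len(limit_keys) - 1)
    return list(range(first + 1, last + 2))


def size_skew(row_counts):
    """Return the largest partition size divided by the average size.

    A value close to 1.0 means the partitions are similar in size.
    """
    average = sum(row_counts) / len(row_counts)
    return max(row_counts) / average


if __name__ == "__main__":
    # A two-month query touches one of four partitions.
    print(qualifying_partitions("200704", "200705", LIMIT_KEYS))
    # A query spanning a quarter boundary touches two partitions.
    print(qualifying_partitions("200705", "200707", LIMIT_KEYS))
    # Evenly loaded partitions versus one oversized partition.
    print(size_skew([100, 100, 100, 100]))
    print(size_skew([10, 10, 10, 370]))
```

A key that lets typical queries qualify only a few partitions, while keeping the skew figure near 1.0, satisfies both criteria; a key that satisfies one at the expense of the other calls for a trade-off.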

In this section of the book we look at choosing the partitioning key to optimize
query performance. Queries are executed against the active DataStore object
table, so we are really discussing choosing the partitioning key for the
