07.02.2013 Views

Best Practices for SAP BI using DB2 9 for z/OS - IBM Redbooks



partitioning and table-controlled partitioning, see DB2 9 for z/OS documentation or the DB2 Version 9.1 for z/OS Administration Guide, SES1-2935-01.

Partitioning provides many benefits in the areas of data management, data availability, and query performance, especially for large tables like DataStore objects. The support from DB2 allows addition and rotation of partitions and provides more options for managing data. DB2 utilities can operate on a subset of the partitions, thus allowing the rest of the data to be available. Partitioned tables may be able to take advantage of parallelism for online queries, batch processing, and utilities. Query performance can improve if partitions can be eliminated from the search. The DB2 feature of data partitioned secondary index (DPSI) promotes partition-independence, which can reduce lock contention and improve index availability.
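To make the idea of eliminating partitions from the search concrete, the following is a minimal conceptual sketch in Python, not DB2 code and not a description of DB2 internals. It assumes a range-partitioned table whose partitions are described by ascending, inclusive upper limit keys; the `LIMIT_KEYS` values (year and month as integers) and both function names are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left

# Hypothetical limit keys: partition i holds every key that is
# <= LIMIT_KEYS[i] and greater than the previous limit key.
LIMIT_KEYS = [200703, 200706, 200709, 200712]


def partition_for(key):
    """Return the 0-based partition whose key range contains `key`."""
    idx = bisect_left(LIMIT_KEYS, key)
    if idx == len(LIMIT_KEYS):
        raise ValueError(f"key {key} is above the last limit key")
    return idx


def partitions_to_scan(low, high):
    """Return the partitions that a predicate low <= key <= high can touch."""
    if low > high or low > LIMIT_KEYS[-1]:
        return []  # empty range, or range entirely above the last partition
    first = partition_for(low)
    last = partition_for(min(high, LIMIT_KEYS[-1]))
    return list(range(first, last + 1))


# A predicate covering April and May 2007 touches only the second partition.
print(partitions_to_scan(200704, 200705))
# A predicate covering the whole year touches every partition.
print(partitions_to_scan(200701, 200712))
```

In this model, a query restricted to two months reads one partition out of four, while an unrestricted query reads all of them. The same reasoning suggests why a predicate on the partitioning column matters for large DataStore object tables.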

Clustering is the physical sequence in which records are stored in a table. The clustering sequence is defined by the clustering index for the table. DB2 tries to insert rows into the table to maintain the order of the clustering index. If you do not specify one of the table’s indexes as clustering, then, by default, the clustering index is the first index that DB2 creates for the table.

You use the CLUSTER keyword on the CREATE INDEX statement to explicitly define the clustering index. When SAP creates tables, it arbitrarily chooses the primary index (index 0) as the clustering index. However, this may not necessarily be the best choice. The desired clustering sequence depends on how the data is accessed and used, which is mostly customer specific.

In DB2, the clustering index can be altered without the requirement to drop and recreate the affected indexes. In addition, with table-controlled partitioning, you have more flexibility over the choice of a clustering index, since it does not need to be the partitioning index.

When a range of rows is retrieved, the query performance can be improved if these rows are clustered and accessed by a clustering index. The rows can be read with fewer I/O operations, since the pages read are likely to contain more rows that need to be retrieved and DB2 can take advantage of its sequential prefetch feature.
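The I/O argument can be illustrated with a small Python model that counts how many data pages hold the rows of a key range, once when rows are stored in key order and once when they are stored in random order. This is a sketch under stated assumptions, not a measurement of DB2: the figure of 50 rows per page, the table size of 10,000 rows, and the function name `pages_touched` are all hypothetical, and prefetch is not modeled.

```python
import random

ROWS_PER_PAGE = 50  # assumed page capacity, for illustration only


def pages_touched(keys_in_storage_order, low, high, rows_per_page=ROWS_PER_PAGE):
    """Count the distinct pages that hold rows with low <= key <= high."""
    pages = set()
    for position, key in enumerate(keys_in_storage_order):
        if low <= key <= high:
            pages.add(position // rows_per_page)
    return len(pages)


keys = list(range(10_000))

# Clustered: rows are stored physically in key order.
clustered = keys

# Unclustered: the same rows stored in a random physical order.
unclustered = keys[:]
random.Random(42).shuffle(unclustered)

# Retrieve the 500 rows with keys 1000..1499 from each layout.
clustered_pages = pages_touched(clustered, 1000, 1499)
unclustered_pages = pages_touched(unclustered, 1000, 1499)

print("pages read, clustered:  ", clustered_pages)
print("pages read, unclustered:", unclustered_pages)
```

With the clustered layout the 500 qualifying rows sit on 10 consecutive pages, the minimum possible at 50 rows per page. With the random layout the same rows are scattered over far more of the table's 200 pages, so each page read returns only a few useful rows.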

Partitioning provides forced clustering, because a row can be stored only in the partition whose key range contains it; data within a partition cannot exceed the boundaries of that partition. Clustering, by contrast, is only a guide that DB2 tries to achieve. In that sense, partitioning is more precise than clustering.

Also, partitioning can be used instead of indexing in some situations. Lower cardinality columns that are chosen for partitioning may then be removed from the index chosen as the clustering index.

Chapter 7. Best practices for DataStore 109
