07.02.2013 Views

Best Practices for SAP BI using DB2 9 for z/OS - IBM Redbooks


For efficient join performance, it is important that an index exists to support the join predicates, regardless of table size (small or large). For anything other than very small dimension/master data tables (a few data pages), you must ensure that these tables have indexes that support both the join predicates and the local filtering predicates. This ensures efficient nested loop join performance for all tables.

Local predicates are preferred as the leading index columns because such an index supports access to the table whether it is the leading or a non-leading table in the join sequence. Additionally, they allow frequency statistics to be collected on the local predicate column with default RUNSTATS. Both the local and join predicates should be included in the index because both columns provide filtering as part of the join. The exception is a very small dimension/master data table, in which case an index on the join predicate alone may be sufficient for join performance.

Aside from data access and join performance, statistics collection on the dimension and master data tables becomes very important. An incorrect estimate of the filtering from a dimension/snowflake propagates to the large fact table. A 10% estimation error on a 100-row dimension is only 10 rows, but when that dimension is joined to the fact table (10 million rows), the same 10% error becomes 1 million rows.
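The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not something from DB2 itself: the function name `propagated_error_rows` is hypothetical, and the sketch assumes that fact table rows are spread evenly across the dimension rows, so that a relative error on the dimension carries over unchanged to the fact table.

```python
def propagated_error_rows(fact_rows: int, dim_rows: int, dim_error_rows: int) -> int:
    """Scale a dimension-level estimation error up to the joined fact table.

    Assumption (for illustration only): fact rows are distributed evenly
    across dimension rows, so the relative error is preserved by the join.
    """
    # Integer arithmetic avoids floating-point rounding: the relative error
    # dim_error_rows / dim_rows is applied to the fact table row count.
    return fact_rows * dim_error_rows // dim_rows


# The figures from the text: a 10% error on a 100-row dimension is 10 rows.
dimension_error = 10
fact_error = propagated_error_rows(
    fact_rows=10_000_000, dim_rows=100, dim_error_rows=dimension_error
)
print(f"dimension error: {dimension_error} rows, fact table error: {fact_error:,} rows")
```

The point of the sketch is that an absolute error that looks negligible on the dimension is multiplied by the ratio of fact rows to dimension rows once the join is estimated.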

Single-column indexes that support the local predicates on the dimension and master data tables allow frequency statistics to be collected by default. For dimension/master data tables that have multiple filtering predicates, indexes that contain all filtering columns allow default collection of correlation information on these tables. This information is crucial for correct access path selection (and table join sequence).
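To see why correlation information matters, the following Python sketch compares two filtering estimates on a toy dimension. It is a simplified illustration and not DB2's actual optimizer formula: the table contents, the column names `country` and `city`, and the helper `filter_factor` are all made up for the example. It assumes that, without multi-column statistics, the combined filter factor is estimated by multiplying the single-column filter factors as if the columns were independent.

```python
from fractions import Fraction

# Hypothetical 100-row dimension with two correlated columns (country, city):
# every city belongs to exactly one country.
rows = (
    [("DE", "Berlin")] * 25
    + [("DE", "Munich")] * 25
    + [("FR", "Paris")] * 25
    + [("FR", "Lyon")] * 25
)


def filter_factor(table, predicate):
    """Fraction of rows in the table that satisfy the predicate."""
    matching = sum(1 for row in table if predicate(row))
    return Fraction(matching, len(table))


# Single-column filter factors, as frequency statistics would describe them.
ff_country = filter_factor(rows, lambda r: r[0] == "DE")
ff_city = filter_factor(rows, lambda r: r[1] == "Berlin")

# Estimate under the independence assumption (no correlation information).
independent_estimate = ff_country * ff_city

# Actual combined filter factor, which correlation statistics would capture.
actual = filter_factor(rows, lambda r: r[0] == "DE" and r[1] == "Berlin")

print(f"independence estimate: {independent_estimate}, actual: {actual}")
```

In this toy data the independence estimate is half the actual filter factor, because knowing the city already determines the country. An error of that kind on the dimension is then magnified when the dimension is joined to the fact table.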

Dimension and master data tables (large or small) can support many more indexes than multi-million-row fact tables can. Additional indexes on these tables are encouraged, both for efficient data access and join performance and for effective statistics collection.

Chapter 10. Tips for SQL efficiency 225
