Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions (Apress, September 2005)

CHAPTER 13 ■ PARTITIONING

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS BY INDEX ROWID T (cr=5 pr=0 pw=0 time=62 us)
      1   INDEX RANGE SCAN T_IDX (cr=4 pr=0 pw=0 time=63 us)

You might immediately jump to the (erroneous) conclusion that partitioning causes a sevenfold increase in I/O: 5 query mode gets without partitioning and 34 with partitioning. If your system had an issue with high consistent gets (logical I/Os) before, it is worse now. If it didn't have one before, it might well get one. The same thing can be observed for the other two queries. In the following, the first total line is for the partitioned table and the second is for the nonpartitioned table:

select * from t where owner = :o and object_type = :t

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total         5      0.01       0.01          0         47          0         16
total         5      0.00       0.00          0         16          0         16

select * from t where owner = :o

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total         5      0.00       0.00          0         51          0         25
total         5      0.00       0.00          0         23          0         25

The queries each returned the same answer, but consumed 500 percent, 300 percent, or 200 percent of the I/Os to accomplish it; this is not good. The root cause? The index partitioning scheme. Notice in the preceding plan the partitions listed in the last line: 1 through 16.

      1  PARTITION HASH ALL PARTITION: 1 16 (cr=34 pr=0 pw=0 time=359 us)
      1   TABLE ACCESS BY LOCAL INDEX ROWID T PARTITION: 1 16 (cr=34 pr=0
      1    INDEX RANGE SCAN T_IDX PARTITION: 1 16 (cr=33 pr=0 pw=0 time=250

This query has to look at each and every index partition because entries for SCOTT may well be (in fact, probably are) in each and every index partition. The index is logically hash partitioned by OBJECT_ID, so any query that uses this index and does not also refer to the OBJECT_ID in the predicate must consider every index partition! The solution here is to globally partition your index. For example, continuing with the same T_IDX example, we could choose to hash partition the index in Oracle 10g:

■Note  Hash partitioning of indexes is a new Oracle 10g feature that is not available in Oracle9i. There are considerations to be taken into account with hash partitioned indexes regarding range scans, which we'll discuss later in this section.
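The DDL that produced this situation is not reproduced in this excerpt. As a rough, assumed sketch only, the kind of setup being described (a copy of ALL_OBJECTS hash partitioned by OBJECT_ID, with T_IDX created as a LOCAL index so that it silently inherits the table's OBJECT_ID partitioning even though its key is OWNER, OBJECT_TYPE, OBJECT_NAME) might look like this:

-- Assumed setup, not shown in this excerpt: T is a copy of ALL_OBJECTS,
-- hash partitioned by OBJECT_ID into 16 partitions.
create table t
partition by hash (object_id)
partitions 16
as
select * from all_objects;

-- T_IDX is created LOCAL, so each of its 16 partitions is equipartitioned
-- with the table on OBJECT_ID, a column that does not appear in the index key.
-- A predicate on OWNER alone therefore cannot eliminate any index partitions.
create index t_idx
on t(owner,object_type,object_name)
local;

The GLOBAL definition shown next partitions the index on its own leading column, OWNER, instead of on the table's partitioning key.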

ops$tkyte@ORA10G> create index t_idx
  2  on t(owner,object_type,object_name)
  3  global
  4  partition by hash(owner)
  5  partitions 16
  6  /
Index created.

Much like the hash partitioned tables we investigated earlier, Oracle will take the OWNER value, hash it to a partition between 1 and 16, and place the index entry in there. Now when we review the TKPROF information for these three queries again

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total         4      0.00       0.00          0          4          0          1
total         5      0.00       0.00          0         19          0         16
total         5      0.01       0.00          0         28          0         25

we can see we are much closer to the work performed by the nonpartitioned table earlier; that is, we have not negatively impacted the work performed by our queries. It should be noted, however, that a hash partitioned index cannot be range scanned. In general, it is most suitable for exact equality (equals or in-lists). If you were to query WHERE OWNER > :X using the preceding index, it would not be able to perform a simple range scan using partition elimination; you would be back to inspecting all 16 hash partitions.

Does that mean that partitioning won't have any positive impact on OLTP performance? No, not entirely; you just have to look in a different place. In general, it will not positively impact the performance of your data retrieval in OLTP; rather, care has to be taken to ensure data retrieval isn't affected negatively. But for data modification in highly concurrent environments, partitioning may provide salient benefits.

Consider the preceding a rather simple example of a single table with a single index, and add into the mix a primary key. Without partitioning, there is, in fact, a single table: all insertions go into this single table. There is contention, perhaps, for the freelists on this table. Additionally, the primary key index on the OBJECT_ID column would be a heavy right-hand-side index, as we discussed in Chapter 11. Presumably it would be populated by a sequence; hence, all inserts would go after the rightmost block, leading to buffer busy waits. Also, there would be a single index structure, T_IDX, for which people would be contending. So far, a lot of "single" items.

Enter partitioning. You hash partition the table by OBJECT_ID into 16 partitions. There are now 16 "tables" to contend for, and each table has one-sixteenth the number of users hitting it simultaneously. You locally partition the primary key index on OBJECT_ID into 16 partitions. You now have 16 "right-hand sides," and each index structure will receive one-sixteenth the workload it had before. And so on. That is, you can use partitioning in a highly concurrent environment to reduce contention, much like we used a reverse key index in Chapter 11 to reduce buffer busy waits. However, you must be aware that the very process of partitioning out the data consumes more CPU than not partitioning at all; that is, it takes more CPU to figure out where to put the data than it would if the data had but one place to go.
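To make that last point concrete, here is a minimal, hypothetical sketch of the kind of DDL being described; the table name, columns, and sequence are assumptions for illustration only, not objects from the text:

-- Hypothetical illustration (names and columns are assumed, not from the text):
-- hash partition the table on OBJECT_ID and make the primary key index LOCAL,
-- giving 16 table segments and 16 index "right-hand sides" instead of one each.
create table t2
( object_id   number       not null,
  owner       varchar2(30),
  object_type varchar2(30),
  object_name varchar2(30)
)
partition by hash (object_id)
partitions 16;

-- A unique LOCAL index is allowed here because the index key contains
-- the partitioning key (OBJECT_ID).
create unique index t2_pk
on t2(object_id)
local;

alter table t2
add constraint t2_pk primary key (object_id);

create sequence t2_seq;

-- Concurrent sessions inserting sequence-generated keys now spread their
-- index maintenance across 16 hash partitions instead of piling onto the
-- rightmost block of a single index structure.
insert into t2 (object_id, owner, object_type, object_name)
values (t2_seq.nextval, user, 'TABLE', 'SOME_OBJECT');

In a sketch like this, consecutive sequence values hash to different partitions, so the buffer busy waits concentrated on one right-hand side are diluted across 16 smaller hot spots, at the cost of the extra CPU needed to compute the hash for every row.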

