Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005

rekharaghuram
05.11.2015 Views

CHAPTER 11 ■ INDEXES

Indicates the amount of order of the rows in the table based on the values of the index:

• If the value is near the number of blocks, then the table is very well ordered. In this case, the index entries in a single leaf block tend to point to rows in the same data blocks.

• If the value is near the number of rows, then the table is very randomly ordered. In this case, it is unlikely that index entries in the same leaf block point to rows in the same data blocks.

We could also view the clustering factor as a number that represents the number of logical I/Os against the table that would be performed to read the entire table via the index. That is, the CLUSTERING_FACTOR is an indication of how ordered the table is with respect to the index itself, and when we look at these indexes we find the following:

    ops$tkyte@ORA10G> select a.index_name,
      2         b.num_rows,
      3         b.blocks,
      4         a.clustering_factor
      5    from user_indexes a, user_tables b
      6   where index_name in ('COLOCATED_PK', 'DISORGANIZED_PK' )
      7     and a.table_name = b.table_name
      8  /

    INDEX_NAME        NUM_ROWS     BLOCKS CLUSTERING_FACTOR
    --------------- ---------- ---------- -----------------
    COLOCATED_PK        100000       1252              1190
    DISORGANIZED_PK     100000       1219             99932

■Note I used an ASSM-managed tablespace for this section’s example, which explains why the clustering factor for the COLOCATED table is less than the number of blocks in the table. There are unformatted blocks in the COLOCATED table below the HWM that do not contain data, as well as blocks used by ASSM itself to manage space, and we will never read these blocks in an index range scan. Chapter 10 explains HWMs and ASSM in more detail.

So the database is saying, “If we were to read every row in COLOCATED via the index COLOCATED_PK from start to finish, we would perform 1,190 I/Os. However, if we did the same to DISORGANIZED, we would perform 99,932 I/Os against the table.” The reason for the large difference is that as Oracle range scans through the index structure, if it discovers the next row in the index is on the same database block as the prior row, it does not perform another I/O to get the table block from the buffer cache. It already has a handle to one and just uses it. However, if the next row is not on the same block, then it will release that block and perform another I/O into the buffer cache to retrieve the next block to be processed. Hence the COLOCATED_PK index, as we range scan through it, will discover that the next row is almost always on the same block as the prior row. The DISORGANIZED_PK index will discover the opposite is true. In fact, we can see that this measurement is very accurate. Using hints to have the optimizer use an index full scan to read the entire table and just count the number of non-null Y values, we can see exactly how many I/Os it will take to read the entire table via the index:

    select count(Y)
    from
     (select /*+ INDEX(COLOCATED COLOCATED_PK) */ * from colocated)

    call     count       cpu    elapsed       disk      query    current       rows
    ------- ------  -------- ---------- ---------- ---------- ---------- ----------
    Parse        1      0.00       0.00          0          0          0          0
    Execute      1      0.00       0.00          0          0          0          0
    Fetch        2      0.10       0.16          0       1399          0          1
    ------- ------  -------- ---------- ---------- ---------- ---------- ----------
    total        4      0.10       0.16          0       1399          0          1

    Rows     Row Source Operation
    -------  ---------------------------------------------------
          1  SORT AGGREGATE (cr=1399 pr=0 pw=0 time=160325 us)
     100000   TABLE ACCESS BY INDEX ROWID COLOCATED (cr=1399 pr=0 pw=0 time=500059 us)
     100000    INDEX FULL SCAN COLOCATED_PK (cr=209 pr=0 pw=0 time=101057 us)(object ...
    ********************************************************************************
    select count(Y)
    from
     (select /*+ INDEX(DISORGANIZED DISORGANIZED_PK) */ * from disorganized)

    call     count       cpu    elapsed       disk      query    current       rows
    ------- ------  -------- ---------- ---------- ---------- ---------- ----------
    Parse        1      0.00       0.00          0          0          0          0
    Execute      1      0.00       0.00          0          0          0          0
    Fetch        2      0.34       0.40          0     100141          0          1
    ------- ------  -------- ---------- ---------- ---------- ---------- ----------
    total        4      0.34       0.40          0     100141          0          1

    Rows     Row Source Operation
    -------  ---------------------------------------------------
          1  SORT AGGREGATE (cr=100141 pr=0 pw=0 time=401109 us)
     100000   TABLE ACCESS BY INDEX ROWID OBJ#(66615) (cr=100141 pr=0 pw=0 time=800058...
     100000    INDEX FULL SCAN OBJ#(66616) (cr=209 pr=0 pw=0 time=101129 us)(object...

In both cases, the index needed to perform 209 logical I/Os (cr=209 in the Row Source Operation lines). If you subtract 209 from the total consistent reads and measure just the number of I/Os against the table, then you’ll find that they are identical to the clustering factor for each respective index: 1,399 - 209 = 1,190 for COLOCATED_PK and 100,141 - 209 = 99,932 for DISORGANIZED_PK. COLOCATED_PK is a classic “the table is well ordered” example, whereas DISORGANIZED_PK is a classic “the table is very randomly ordered” example. It is interesting to see how this affects the optimizer now. If we attempt to retrieve 25,000 rows, Oracle will now choose a full table scan for both queries (retrieving 25 percent of the rows via
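The block-switch counting described above can be illustrated outside the database. What follows is a minimal sketch in Python, assuming a simplified model in which each row is represented only by the number of the table block that holds it. The function name `clustering_factor`, the packing density `ROWS_PER_BLOCK = 84`, and the shuffled layout are hypothetical choices for illustration; this is not how Oracle gathers the statistic. The last lines simply redo the subtraction from the trace output.

```python
import random


def clustering_factor(block_of_row_in_key_order):
    """Count table-block switches while walking index entries in key order.

    The first entry always costs one block visit; each later entry costs
    another visit only when its block differs from the previous entry's.
    """
    visits = 0
    previous_block = None
    for block in block_of_row_in_key_order:
        if block != previous_block:
            visits += 1
            previous_block = block
    return visits


ROWS = 100_000
ROWS_PER_BLOCK = 84  # hypothetical packing density, chosen for illustration

# Colocated layout: rows are stored in key order, so neighbours share a block.
colocated = [key // ROWS_PER_BLOCK for key in range(ROWS)]

# Disorganized layout: same rows and blocks, but placement is shuffled.
rng = random.Random(42)
disorganized = colocated[:]
rng.shuffle(disorganized)

blocks_used = len(set(colocated))
cf_colocated = clustering_factor(colocated)
cf_disorganized = clustering_factor(disorganized)

print("blocks used:            ", blocks_used)
print("colocated clustering:   ", cf_colocated)
print("disorganized clustering:", cf_disorganized)

# Figures quoted in the text: total consistent reads minus the index reads.
INDEX_READS = 209
print(1399 - INDEX_READS, 100141 - INDEX_READS)  # prints "1190 99932"
```

In this toy model the colocated layout yields a count equal to the number of blocks, while the shuffled layout yields a count close to the number of rows, which mirrors the two cases in the definition. The final line reproduces the CLUSTERING_FACTOR values reported for COLOCATED_PK and DISORGANIZED_PK.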

