Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions (Apress, September 2005)
CHAPTER 11 ■ INDEXES

select * from colocated a15 where x between 20000 and 40000

Rows     Row Source Operation
-------  ---------------------------------------------------
  20001  TABLE ACCESS BY INDEX ROWID COLOCATED (cr=2899 pr=0 pw=0 time=120125...
  20001   INDEX RANGE SCAN COLOCATED_PK (cr=1374 pr=0 pw=0 time=40072 us)(...

select * from colocated a100 where x between 20000 and 40000

Rows     Row Source Operation
-------  ---------------------------------------------------
  20001  TABLE ACCESS BY INDEX ROWID COLOCATED (cr=684 pr=0 pw=0 ...)
  20001   INDEX RANGE SCAN COLOCATED_PK (cr=245 pr=0 pw=0 ...

The first query was executed with an ARRAYSIZE of 15, and the (cr=nnnn) values in the Row Source Operation show that we performed 1,374 logical I/Os against the index and then 1,525 logical I/Os against the table (2,899 - 1,374; the numbers are cumulative in the Row Source Operation steps). When we increased the ARRAYSIZE from 15 to 100, the amount of logical I/O against the index dropped to 245, which was the direct result of not having to reread the index leaf blocks from the buffer cache every 15 rows, but only every 100 rows.

To understand this, assume that we were able to store 200 rows per leaf block. As we scan through the index reading 15 rows at a time, we would have to retrieve the first leaf block 14 times to get all 200 entries off it. On the other hand, when we array fetch 100 rows at a time, we need to retrieve this same leaf block only two times from the buffer cache to exhaust all of its entries.

The same thing happened in this case with the table blocks. Since the table was sorted in the same order as the index keys, we would tend to retrieve each table block less often, as we would get more of the rows from it with each fetch call.

So, if this was good for the COLOCATED table, it must have been just as good for the DISORGANIZED table, right? Not so.
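The leaf block arithmetic above can be checked with a small model. The following Python sketch is not from the original text: the function name index_leaf_gets is hypothetical, and the figure of 200 entries per leaf block is the same assumption used in the explanation above. It charges one buffer cache get for every leaf block that a fetch call touches, which is a simplification of what Oracle actually does.

```python
def index_leaf_gets(total_rows: int, rows_per_leaf: int, arraysize: int) -> int:
    """Count leaf block gets for an index range scan fetched in batches.

    Simplified model (an assumption, not Oracle's exact accounting):
    every fetch call must get, from the buffer cache, each leaf block
    that holds at least one of the rows it returns.
    """
    gets = 0
    for start in range(0, total_rows, arraysize):
        # Index of the last row returned by this fetch call.
        end = min(start + arraysize, total_rows) - 1
        first_block = start // rows_per_leaf
        last_block = end // rows_per_leaf
        gets += last_block - first_block + 1
    return gets


# One leaf block holding 200 entries, as in the explanation above.
print(index_leaf_gets(200, 200, 15))   # 14
print(index_leaf_gets(200, 200, 100))  # 2
```

The model ignores branch blocks and the real number of entries per leaf block, so it is not expected to reproduce the 1,374 and 245 figures from the trace; it only illustrates why the count falls as the ARRAYSIZE grows.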
The results from the DISORGANIZED table would look like this:

select /*+ index( a15 disorganized_pk ) */ * from disorganized a15 where x between 20000 and 40000

Rows     Row Source Operation
-------  ---------------------------------------------------
  20001  TABLE ACCESS BY INDEX ROWID DISORGANIZED (cr=21357 pr=0 pw=0 ...
  20001   INDEX RANGE SCAN DISORGANIZED_PK (cr=1374 pr=0 pw=0 ...

select /*+ index( a100 disorganized_pk ) */ * from disorganized a100 where x between 20000 and 40000

Rows     Row Source Operation
-------  ---------------------------------------------------
  20001  TABLE ACCESS BY INDEX ROWID OBJ#(75652) (cr=20228 pr=0 pw=0 ...
  20001   INDEX RANGE SCAN OBJ#(75653) (cr=245 pr=0 pw=0 time=20281 us)(...
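Because the cr figures in a Row Source Operation are cumulative, the table's share of the logical I/O is the cr of the top step minus the cr of the index step. A minimal Python sketch of that subtraction for the four traces shown so far follows; the dictionary layout and the function name table_lio are hypothetical, and the numbers are copied from the traces.

```python
# (table, ARRAYSIZE) -> (cr of TABLE ACCESS BY INDEX ROWID, cr of INDEX RANGE SCAN)
plans = {
    ("COLOCATED", 15): (2899, 1374),
    ("COLOCATED", 100): (684, 245),
    ("DISORGANIZED", 15): (21357, 1374),
    ("DISORGANIZED", 100): (20228, 245),
}


def table_lio(total_cr: int, index_cr: int) -> int:
    """Logical I/O against the table alone: the parent step's cumulative
    cr minus the cr of its child index step."""
    return total_cr - index_cr


breakdown = {key: table_lio(*crs) for key, crs in plans.items()}

for (table, arraysize), lio in breakdown.items():
    print(f"{table:<12} ARRAYSIZE {arraysize:>3}: table LIO = {lio}")
```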
The results against the index here were identical, which makes sense, as the data stored in the index is the same regardless of how the table is organized. The logical I/O against the index went from 1,374 for a single execution of this query to 245, just as before. But overall, the amount of logical I/O performed by this query did not differ significantly: 21,357 versus 20,228. The reason? The amount of logical I/O performed against the table did not differ at all. If you subtract the logical I/O against the index from the total logical I/O performed by each query, you'll find that both queries did 19,983 logical I/Os against the table. This is because every time we wanted N rows from the database, the odds that any two of those rows would be on the same block were very small, so there was no opportunity to get multiple rows from a table block in a single call.

Every professional programming language I have seen that can interact with Oracle implements this concept of array fetching. In PL/SQL, you may use BULK COLLECT or rely on the implicit array fetch of 100 that is performed for implicit cursor FOR loops. In Java/JDBC, there is a prefetch method on a connection or statement object. Oracle Call Interface (OCI; a C API) allows you to programmatically set the prefetch size, as does Pro*C. As you can see, this can have a material and measurable effect on the amount of logical I/O performed by your query, and it deserves your attention.
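The difference between the two tables can be illustrated with a small simulation. This Python sketch is not from the original text, and every name and number in it is an assumption chosen for illustration: 20,000 rows, 50 rows per table block, and a model in which a table block costs one get each time a fetch call moves to a block other than the one it just read, with nothing carried over between fetch calls.

```python
import random


def table_block_gets(block_of_row: list[int], arraysize: int) -> int:
    """Count table block gets when rows are visited in index key order.

    block_of_row[i] is the table block holding the i-th row in key order.
    Simplified model (an assumption): a get is charged whenever the next
    row lives on a different block than the previous row, and the current
    block is released at the end of every fetch call.
    """
    gets = 0
    for start in range(0, len(block_of_row), arraysize):
        previous = None  # nothing stays pinned across fetch calls
        for block in block_of_row[start:start + arraysize]:
            if block != previous:
                gets += 1
                previous = block
    return gets


ROWS = 20_000          # hypothetical row count
ROWS_PER_BLOCK = 50    # hypothetical rows per table block

# COLOCATED: rows sit on disk in the same order as the index keys.
colocated = [key // ROWS_PER_BLOCK for key in range(ROWS)]

# DISORGANIZED: same blocks, but rows are scattered at random.
disorganized = colocated[:]
random.Random(42).shuffle(disorganized)

for name, layout in (("COLOCATED", colocated), ("DISORGANIZED", disorganized)):
    for arraysize in (15, 100):
        gets = table_block_gets(layout, arraysize)
        print(f"{name:<12} ARRAYSIZE {arraysize:>3}: {gets} table block gets")
```

In this model the colocated layout needs 1,600 gets at an ARRAYSIZE of 15 and 400 at 100, while the scattered layout stays close to one get per row at either setting. That is the same pattern the traces show, although the absolute numbers depend entirely on the assumed row and block counts.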
Just to wrap up this example, let's look at what happens when we full scan the DISORGANIZED table:

select * from disorganized where x between 20000 and 40000

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ---------- ----------
Parse        5      0.00       0.00          0          0          0           0
Execute      5      0.00       0.00          0          0          0           0
Fetch     6675      0.53       0.54          0      12565          0      100005
------- ------  -------- ---------- ---------- ---------- ---------- ----------
total     6685      0.53       0.54          0      12565          0      100005

Rows     Row Source Operation
-------  ---------------------------------------------------
  20001  TABLE ACCESS FULL DISORGANIZED (cr=2513 pr=0 pw=0 time=60115 us)

That shows that in this particular case, the full scan is very appropriate due to the way the data is physically stored on disk: a single execution needed only 2,513 logical I/Os (cr=2513; the 12,565 in the query column covers five executions), compared with more than 20,000 through the index. This raises the question, “Why didn't the optimizer full scan in the first place for this query?” Well, it would have if left to its own devices, but in the first example query against DISORGANIZED I purposely hinted the query and told the optimizer to construct a plan that used the index. In the second case, I let the optimizer pick the best overall plan.

The Clustering Factor

Next, let's look at some of the information Oracle will use. We are specifically going to look at the CLUSTERING_FACTOR column found in the USER_INDEXES view. The Oracle Reference manual tells us this column has the following meaning: