Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions (Apress, September 2005)
CHAPTER 11 ■ INDEXES

an index is not the optimal plan, even for the very ordered table). However, if we select only 10 percent of the table data, we observe the following:

ops$tkyte@ORA10G> select * from colocated where x between 20000 and 30000;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=143 Card=10002 Bytes=800160)
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'COLOCATED' (TABLE) (Cost=143 ...
2 1 INDEX (RANGE SCAN) OF 'COLOCATED_PK' (INDEX (UNIQUE)) (Cost=22 ...

ops$tkyte@ORA10G> select * from disorganized where x between 20000 and 30000;

Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=337 Card=10002 Bytes=800160)
1 0 TABLE ACCESS (FULL) OF 'DISORGANIZED' (TABLE) (Cost=337 Card=10002 ...

Here we have the same table structures and the same indexes, but different clustering factors. The optimizer, in this case, chose an index access plan for the COLOCATED table and a full scan access plan for the DISORGANIZED table. Bear in mind that 10 percent is not a threshold value; it is just a number that is less than 25 percent and that caused an index range scan to happen in this case (for the COLOCATED table).

The key point to this discussion is that indexes are not always the appropriate access method. The optimizer may very well be correct in choosing to not use an index, as the preceding example demonstrates. Many factors influence the use of an index by the optimizer, including physical data layout. You might be tempted, therefore, to run out and try to rebuild all of your tables now to make all indexes have a good clustering factor, but that would be a waste of time in most cases. The clustering factor matters only in cases where you do index range scans of a large percentage of a table. Additionally, you must keep in mind that in general the table will have only one index with a good clustering factor! The rows in a table may be sorted in only one way.
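To see why the same index can be cheap on one table and expensive on another, it can help to model the clustering factor directly. The following is a minimal Python sketch, not Oracle code. It assumes a hypothetical table of 10,000 rows stored 100 rows per block (both numbers are made up for illustration) and counts how many times the table block changes as the rows are visited in index key order, which is the idea behind the clustering factor statistic.

```python
import random


def clustering_factor(blocks_in_key_order):
    """Count how often the table block changes while walking rows in key order."""
    changes = 0
    previous = None
    for block in blocks_in_key_order:
        if block != previous:
            changes += 1
            previous = block
    return changes


ROWS = 10_000          # assumed table size, for illustration only
ROWS_PER_BLOCK = 100   # assumed rows per block, for illustration only

# Colocated layout: rows arrived in key order, so key i sits in block i // ROWS_PER_BLOCK.
colocated = [i // ROWS_PER_BLOCK for i in range(ROWS)]

# Disorganized layout: the same blocks, but keys are scattered across them at random.
rng = random.Random(42)
disorganized = colocated[:]
rng.shuffle(disorganized)

cf_colocated = clustering_factor(colocated)
cf_disorganized = clustering_factor(disorganized)

print("colocated clustering factor:   ", cf_colocated)
print("disorganized clustering factor:", cf_disorganized)
```

In this model the colocated layout yields a clustering factor equal to the number of blocks, while the shuffled layout yields a value close to the number of rows. A range scan through the shuffled layout would therefore visit a different block for nearly every row, which is consistent with the optimizer preferring a full scan for the DISORGANIZED table.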
In the example with the COLOCATED and DISORGANIZED tables, if I had another index on the column Y, it would be very poorly clustered in the COLOCATED table, but very nicely clustered in the DISORGANIZED table. If having the data physically clustered is important to you, consider the use of an IOT, a B*Tree cluster, or a hash cluster over continuous table rebuilds.

B*Trees Wrap-Up

B*Tree indexes are by far the most common and well-understood indexing structures in the Oracle database. They are an excellent general-purpose indexing mechanism. They provide very scalable access times, returning data from a 1,000-row index in about the same amount of time as a 100,000-row index structure.

When to index and what columns to index are things you need to pay attention to in your design. An index does not always mean faster access; in fact, you will find that indexes will decrease performance in many cases if Oracle uses them. It is purely a function of how large a percentage of the table you will need to access via the index and how the data happens to be
laid out. If you can use the index to “answer the question,” then accessing a large percentage of the rows makes sense, since you are avoiding the extra scattered I/O to read the table. If you use the index to access the table, then you will need to ensure that you are processing a small percentage of the total table.

You should consider the design and implementation of indexes during the design of your application, not as an afterthought (as I so often see). With careful planning and due consideration of how you are going to access the data, the indexes you need will be apparent in almost all cases.

Bitmap Indexes

Bitmap indexes were added to Oracle in version 7.3 of the database. They are currently available with the Oracle Enterprise and Personal Editions, but not the Standard Edition. Bitmap indexes are designed for data warehousing/ad hoc query environments where the full set of queries that may be asked of the data is not totally known at system implementation time. They are specifically not designed for OLTP systems or systems where data is frequently updated by many concurrent sessions.

Bitmap indexes are structures that store pointers to many rows with a single index key entry, as compared to a B*Tree structure where there is parity between the index keys and the rows in a table. In a bitmap index, there will be a very small number of index entries, each of which points to many rows. In a conventional B*Tree, one index entry points to a single row.

Let’s say we are creating a bitmap index on the JOB column in the EMP table as follows:

ops$tkyte@ORA10G> create BITMAP index job_idx on emp(job);
Index created.

Oracle will store something like what is shown in Table 11-6 in the index.

Table 11-6. Representation of How Oracle Would Store the JOB_IDX Bitmap Index

Value/Row   1  2  3  4  5  6  7  8  9 10 11 12 13 14
ANALYST     0  0  0  0  0  0  0  1  0  1  0  0  1  0
CLERK       1  0  0  0  0  0  0  0  0  0  1  1  0  1
MANAGER     0  0  0  1  0  1  1  0  0  0  0  0  0  0
PRESIDENT   0  0  0  0  0  0  0  0  1  0  0  0  0  0
SALESMAN    0  1  1  0  1  0  0  0  0  0  0  0  0  0

Table 11-6 shows that rows 8, 10, and 13 have the value ANALYST, whereas rows 4, 6, and 7 have the value MANAGER. It also shows us that no rows are null (bitmap indexes store null entries; the lack of a null entry in the index implies there are no null rows). If we wanted to count the rows that have the value MANAGER, the bitmap index would do this very rapidly. If we wanted to find all the rows such that the JOB was CLERK or MANAGER, we could simply combine their bitmaps from the index, as shown in Table 11-7.
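The counting and combining operations just described can be sketched in a few lines of Python. This is only a conceptual model that uses the bit strings from Table 11-6. It is not how Oracle physically stores or processes a bitmap index (the real structure is compressed and addressed by rowid ranges), and the helper names `rows_set` and `bitmap_or` are hypothetical, introduced here purely for illustration.

```python
# Bit strings copied from Table 11-6; position 1 is the leftmost character.
BITMAPS = {
    "ANALYST":   "00000001010010",
    "CLERK":     "10000000001101",
    "MANAGER":   "00010110000000",
    "PRESIDENT": "00000000100000",
    "SALESMAN":  "01101000000000",
}


def rows_set(bitmap):
    """Return the 1-based row numbers whose bit is set."""
    return [i + 1 for i, bit in enumerate(bitmap) if bit == "1"]


def bitmap_or(a, b):
    """Combine two equal-length bitmaps with a bitwise OR."""
    return "".join("1" if x == "1" or y == "1" else "0" for x, y in zip(a, b))


# Counting MANAGER rows needs only the MANAGER bitmap, never the table.
manager_count = BITMAPS["MANAGER"].count("1")

# JOB = 'CLERK' OR JOB = 'MANAGER' is the OR of the two bitmaps.
clerk_or_manager = bitmap_or(BITMAPS["CLERK"], BITMAPS["MANAGER"])

print("MANAGER rows:", manager_count)
print("CLERK or MANAGER bitmap:", clerk_or_manager)
print("CLERK or MANAGER rows:", rows_set(clerk_or_manager))
```

The point of the sketch is that both questions are answered from the bitmaps alone. Only the final list of row numbers would have to be turned into rowids to fetch the matching rows from the table.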