Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions (Apress, September 2005)
CHAPTER 11 ■ INDEXES 465

ops$tkyte@ORA10G> update big_table set temporary = decode(temporary,'N','Y','N');
1000000 rows updated.

And we’ll check out the ratio of Ys to Ns:

ops$tkyte@ORA10G> select temporary, cnt,
  2         round( (ratio_to_report(cnt) over ()) * 100, 2 ) rtr
  3    from (
  4  select temporary, count(*) cnt
  5    from big_table
  6   group by temporary
  7         )
  8  /

T        CNT        RTR
- ---------- ----------
N       1779        .18
Y     998221      99.82

As we can see, of the 1,000,000 records in the table, only about one-fifth of 1 percent of the data should be indexed. If we use a conventional index on the TEMPORARY column (which is playing the role of the PROCESSED_FLAG column in this example), we would discover that the index has 1,000,000 entries, consumes over 14MB of space, and has a height of 3:

ops$tkyte@ORA10G> create index processed_flag_idx
  2  on big_table(temporary);
Index created.

ops$tkyte@ORA10G> analyze index processed_flag_idx
  2  validate structure;
Index analyzed.

ops$tkyte@ORA10G> select name, btree_space, lf_rows, height
  2    from index_stats;

NAME                           BTREE_SPACE    LF_ROWS     HEIGHT
------------------------------ ----------- ---------- ----------
PROCESSED_FLAG_IDX                14528892    1000000          3

Any retrieval via this index would incur three I/Os to get to the leaf blocks. This index is not only “wide,” but also “tall.” To get the first unprocessed record, we will have to perform at least four I/Os (three against the index and one against the table).

How can we change all of this? We need to make it so the index is much smaller and easier to maintain (with less runtime overhead during the updates). Enter the function-based index, which allows us to simply write a function that returns NULL when we don’t want to index a given row and returns a non-NULL value when we do. For example, since we are interested just in the N records, let’s index just those:
ops$tkyte@ORA10G> drop index processed_flag_idx;
Index dropped.

ops$tkyte@ORA10G> create index processed_flag_idx
  2  on big_table( case temporary when 'N' then 'N' end );
Index created.

ops$tkyte@ORA10G> analyze index processed_flag_idx
  2  validate structure;
Index analyzed.

ops$tkyte@ORA10G> select name, btree_space, lf_rows, height
  2    from index_stats;

NAME                           BTREE_SPACE    LF_ROWS     HEIGHT
------------------------------ ----------- ---------- ----------
PROCESSED_FLAG_IDX                   40012       1779          2

That is quite a difference: the index is some 40KB, not 14.5MB. The height has decreased as well. If we use this index, we’ll perform one fewer I/O than we would using the previous, taller index.
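
The figures in the two INDEX_STATS listings can be cross-checked with a little arithmetic. What follows is a minimal Python sketch, used here only as a calculator. The variable names are illustrative and not from the book, and the cost model (one I/O per index level plus one I/O against the table) is the same simplifying assumption the text makes:

```python
# Figures copied from the query outputs shown in the text.
total_rows = 1_000_000
n_rows = 1_779                                  # rows with TEMPORARY = 'N'

full_index_bytes, full_height = 14_528_892, 3   # conventional index
fbi_bytes, fbi_height = 40_012, 2               # function-based index

# Share of rows that actually need an index entry (the RTR column for 'N').
pct_indexed = round(n_rows / total_rows * 100, 2)

# How many times smaller the function-based index is.
size_ratio = full_index_bytes / fbi_bytes

# Assumed cost model: one I/O per index level plus one table I/O.
ios_full = full_height + 1
ios_fbi = fbi_height + 1

print(f"rows indexed: {pct_indexed}%")
print(f"function-based index is about {size_ratio:.0f}x smaller")
print(f"I/Os to reach the first unprocessed row: {ios_full} vs {ios_fbi}")
```

Under these assumptions the function-based index holds entries for 0.18 percent of the rows, is roughly 363 times smaller, and saves one I/O per lookup.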

Implementing Selective Uniqueness

Another useful technique with function-based indexes is to use them to enforce certain types of complex constraints. For example, suppose you have a table with versioned information, such as a projects table. Projects have one of two statuses: either ACTIVE or INACTIVE. You need to enforce a rule such that “Active projects must have a unique name; inactive projects do not.” That is, there can only be one active “project X,” but you could have as many inactive project Xs as you like.

The first response from a developer when they hear this requirement is typically, “We’ll just run a query to see if there are any active project Xs, and if not, we’ll create ours.” If you read Chapter 7 (which covers concurrency control and multi-versioning), you understand that such a simple implementation cannot work in a multiuser environment. If two people attempt to create a new active project X at the same time, they’ll both succeed. We need to serialize the creation of project X, but the only way to do that is to lock the entire projects table (not very concurrent) or use a function-based index and let the database do it for us.
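
To see why the check-then-insert approach fails, here is a minimal Python sketch. It rests on several assumptions: SQLite (via the standard `sqlite3` module) stands in for Oracle, the two "sessions" are replayed in a fixed order on one connection rather than run concurrently, and the helper names `active_exists` and `create_active` are hypothetical. In Oracle, the same outcome arises because neither session can see the other's uncommitted row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table projects (name text, status text)")

def active_exists(name):
    """The naive check: is there already an active project with this name?"""
    row = conn.execute(
        "select count(*) from projects where name = ? and status = 'ACTIVE'",
        (name,),
    ).fetchone()
    return row[0] > 0

def create_active(name):
    """Insert a new active project without any uniqueness protection."""
    conn.execute("insert into projects values (?, 'ACTIVE')", (name,))

# Both sessions run their check before either one has inserted a row ...
session_a_sees_conflict = active_exists("X")
session_b_sees_conflict = active_exists("X")

# ... so both conclude that it is safe to create project X.
if not session_a_sees_conflict:
    create_active("X")
if not session_b_sees_conflict:
    create_active("X")

active_count = conn.execute(
    "select count(*) from projects where name = 'X' and status = 'ACTIVE'"
).fetchone()[0]
print(active_count)  # 2: the business rule has been violated
```

The check and the insert are separate steps, so nothing stops another session from slipping in between them. The rule has to be enforced by something that sees every insert, which is what a unique index does.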

Building on the fact that we can create indexes on functions, that entirely NULL entries are not made in B*Tree indexes, and that we can create a UNIQUE index, we can easily do the following:

create unique index active_projects_must_be_unique
on projects ( case when status = 'ACTIVE' then name end );

That would do it. When the status column is ACTIVE, the NAME column will be uniquely indexed. For any other status the CASE expression returns NULL, so the row gets no index entry and is exempt from the uniqueness check. Any attempt to create active projects with the same name will be detected, and concurrent access to this table is not compromised at all.
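
The behavior of this index can be tried out with a small Python sketch. Again SQLite stands in for Oracle, which is an assumption of this example and not something the book does, and `add_project` is a hypothetical helper. The mechanism differs slightly: Oracle leaves entirely NULL keys out of the B*Tree, whereas SQLite stores them but treats NULLs as distinct in a unique index. The effect on this rule is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table projects (name text, status text)")

# Same index definition as in the text: only ACTIVE rows yield a non-NULL key.
conn.execute(
    "create unique index active_projects_must_be_unique "
    "on projects ( case when status = 'ACTIVE' then name end )"
)

def add_project(name, status):
    """Insert a row; return True on success, False if the unique index rejects it."""
    try:
        conn.execute("insert into projects values (?, ?)", (name, status))
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    add_project("X", "ACTIVE"),    # first active X: accepted
    add_project("X", "INACTIVE"),  # inactive rows are not constrained
    add_project("X", "INACTIVE"),  # ... so duplicates are fine
    add_project("X", "ACTIVE"),    # second active X: rejected
]
print(results)  # [True, True, True, False]
```

Because the database itself performs the uniqueness check as part of the insert, the application needs no separate query and no table lock.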