Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005


CHAPTER 11 ■ INDEXES

ops$tkyte@ORA10G> update big_table set temporary = decode(temporary,'N','Y','N');
1000000 rows updated.

And we’ll check out the ratio of Ys to Ns:

ops$tkyte@ORA10G> select temporary, cnt,
  2         round( (ratio_to_report(cnt) over ()) * 100, 2 ) rtr
  3    from (
  4  select temporary, count(*) cnt
  5    from big_table
  6   group by temporary
  7         )
  8  /

T        CNT        RTR
- ---------- ----------
N       1779        .18
Y     998221      99.82

As we can see, of the 1,000,000 records in the table, only about one-fifth of 1 percent of the data should be indexed. If we use a conventional index on the TEMPORARY column (which is playing the role of the PROCESSED_FLAG column in this example), we would discover that the index has 1,000,000 entries, consumes over 14MB of space, and has a height of 3:

ops$tkyte@ORA10G> create index processed_flag_idx
  2  on big_table(temporary);
Index created.

ops$tkyte@ORA10G> analyze index processed_flag_idx
  2  validate structure;
Index analyzed.

ops$tkyte@ORA10G> select name, btree_space, lf_rows, height
  2  from index_stats;

NAME                           BTREE_SPACE    LF_ROWS     HEIGHT
------------------------------ ----------- ---------- ----------
PROCESSED_FLAG_IDX                14528892    1000000          3

Any retrieval via this index would incur three I/Os to get to the leaf blocks. This index is not only “wide,” but also “tall.” To get the first unprocessed record, we will have to perform at least four I/Os (three against the index and one against the table).

How can we change all of this? We need to make it so the index is much smaller and easier to maintain (with less runtime overhead during the updates). Enter the function-based index, which allows us to simply write a function that returns NULL when we don’t want to index a given row and returns a non-NULL value when we do. For example, since we are interested just in the N records, let’s index just those:

ops$tkyte@ORA10G> drop index processed_flag_idx;
Index dropped.

ops$tkyte@ORA10G> create index processed_flag_idx
  2  on big_table( case temporary when 'N' then 'N' end );
Index created.

ops$tkyte@ORA10G> analyze index processed_flag_idx
  2  validate structure;
Index analyzed.

ops$tkyte@ORA10G> select name, btree_space, lf_rows, height
  2  from index_stats;

NAME                           BTREE_SPACE    LF_ROWS     HEIGHT
------------------------------ ----------- ---------- ----------
PROCESSED_FLAG_IDX                   40012       1779          2

That is quite a difference: the index is some 40KB, not 14.5MB. The height has decreased as well. If we use this index, we’ll perform one fewer I/O than we would using the previous taller index.

Implementing Selective Uniqueness

Another useful technique with function-based indexes is to use them to enforce certain types of complex constraints. For example, suppose you have a table with versioned information, such as a projects table. Projects have one of two statuses: either ACTIVE or INACTIVE. You need to enforce a rule such that “Active projects must have a unique name; inactive projects do not.” That is, there can only be one active “project X,” but you could have as many inactive project Xs as you like.

The first response from a developer when they hear this requirement is typically, “We’ll just run a query to see if there are any active project Xs, and if not, we’ll create ours.” If you read Chapter 7 (which covers concurrency control and multi-versioning), you understand that such a simple implementation cannot work in a multiuser environment. If two people attempt to create a new active project X at the same time, they’ll both succeed. We need to serialize the creation of project X, but the only ways to do that are to lock the entire projects table (not very concurrent) or to use a function-based index and let the database do it for us.

Building on the fact that we can create indexes on functions, that entirely NULL entries are not made in B*Tree indexes, and that we can create a UNIQUE index, we can easily do the following:

create unique index active_projects_must_be_unique
on projects ( case when status = 'ACTIVE' then name end );

That would do it. When the status column is ACTIVE, the NAME column will be uniquely indexed. Any attempt to create active projects with the same name will be detected, and concurrent access to this table is not compromised at all.
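
To see the rule in action, the following is a minimal sketch rather than anything taken from the text: the PROJECTS table definition (its column names, sizes, and check constraint) and the sample rows are assumptions made up for illustration. Only the index itself comes from the example above.

```sql
-- Hypothetical table: the columns and the check constraint are
-- assumptions for this sketch, not a definition given in the text.
create table projects
( name    varchar2(30) not null,
  status  varchar2(8)  not null
          check ( status in ('ACTIVE','INACTIVE') )
);

-- The selective unique index from the text: rows that are not ACTIVE
-- evaluate to NULL and therefore get no entry in the B*Tree.
create unique index active_projects_must_be_unique
on projects ( case when status = 'ACTIVE' then name end );

-- Inactive rows may repeat a name freely.
insert into projects ( name, status ) values ( 'Project X', 'INACTIVE' );
insert into projects ( name, status ) values ( 'Project X', 'INACTIVE' );

-- The first active "Project X" is accepted and indexed under its name.
insert into projects ( name, status ) values ( 'Project X', 'ACTIVE' );

-- A second active "Project X" should be rejected with ORA-00001
-- (unique constraint violated), because the key already exists.
insert into projects ( name, status ) values ( 'Project X', 'ACTIVE' );
```

Run in a single session, the first three inserts should succeed and the last should fail. With two sessions, a second session that inserts the same active name before the first has committed simply waits on that one index key. It receives ORA-00001 if the first session commits, and it proceeds if the first session rolls back. That wait is the serialization the text asks for, and it is confined to the one conflicting name rather than the whole table.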

