Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005

rekharaghuram
05.11.2015 Views

CHAPTER 11 ■ INDEXES

an index is not the optimal plan, even for the very ordered table). However, if we select only 10 percent of the table data, we observe the following:

ops$tkyte@ORA10G> select * from colocated where x between 20000 and 30000;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=143 Card=10002 Bytes=800160)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'COLOCATED' (TABLE) (Cost=143 ...
   2    1     INDEX (RANGE SCAN) OF 'COLOCATED_PK' (INDEX (UNIQUE)) (Cost=22 ...

ops$tkyte@ORA10G> select * from disorganized where x between 20000 and 30000;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=337 Card=10002 Bytes=800160)
   1    0   TABLE ACCESS (FULL) OF 'DISORGANIZED' (TABLE) (Cost=337 Card=10002 ...

Here we have the same table structures and the same indexes, but different clustering factors. The optimizer, in this case, chose an index access plan for the COLOCATED table and a full scan access plan for the DISORGANIZED table. Bear in mind that 10 percent is not a threshold value; it is just a number that is less than 25 percent and that caused an index range scan to happen in this case (for the COLOCATED table).

The key point to this discussion is that indexes are not always the appropriate access method. The optimizer may very well be correct in choosing not to use an index, as the preceding example demonstrates. Many factors influence the use of an index by the optimizer, including physical data layout. You might be tempted, therefore, to run out and try to rebuild all of your tables now to make all indexes have a good clustering factor, but that would be a waste of time in most cases. The clustering factor matters mainly in cases where you do index range scans of a large percentage of a table. Additionally, you must keep in mind that in general the table will have only one index with a good clustering factor! The rows in a table may be sorted in only one way.
In the example just shown, if I had another index on the column Y, it would be very poorly clustered in the COLOCATED table, but very nicely clustered in the DISORGANIZED table. If having the data physically clustered is important to you, consider the use of an IOT, a B*Tree cluster, or a hash cluster over continuous table rebuilds.

B*Trees Wrap-Up

B*Tree indexes are by far the most common and well-understood indexing structures in the Oracle database. They are an excellent general-purpose indexing mechanism. They provide very scalable access times, returning data from a 1,000-row index in about the same amount of time as a 100,000-row index structure.

When to index and what columns to index are things you need to pay attention to in your design. An index does not always mean faster access; in fact, you will find that indexes will decrease performance in many cases if Oracle uses them. It is purely a function of how large a percentage of the table you will need to access via the index and how the data happens to be laid out. If you can use the index to “answer the question,” then accessing a large percentage of the rows makes sense, since you are avoiding the extra scattered I/O to read the table. If you use the index to access the table, then you will need to ensure that you are processing a small percentage of the total table.

You should consider the design and implementation of indexes during the design of your application, not as an afterthought (as I so often see). With careful planning and due consideration of how you are going to access the data, the indexes you need will be apparent in almost all cases.

Bitmap Indexes

Bitmap indexes were added to Oracle in version 7.3 of the database. They are currently available with the Oracle Enterprise and Personal Editions, but not the Standard Edition. Bitmap indexes are designed for data warehousing/ad hoc query environments where the full set of queries that may be asked of the data is not totally known at system implementation time. They are specifically not designed for OLTP systems or systems where data is frequently updated by many concurrent sessions.

Bitmap indexes are structures that store pointers to many rows with a single index key entry, as compared to a B*Tree structure where there is parity between the index keys and the rows in a table. In a bitmap index, there will be a very small number of index entries, each of which points to many rows. In a conventional B*Tree, one index entry points to a single row.

Let’s say we are creating a bitmap index on the JOB column in the EMP table as follows:

ops$tkyte@ORA10G> create BITMAP index job_idx on emp(job);
Index created.

Oracle will store something like what is shown in Table 11-6 in the index.

Table 11-6. Representation of How Oracle Would Store the JOB_IDX Bitmap Index

Value/Row   1  2  3  4  5  6  7  8  9 10 11 12 13 14
ANALYST     0  0  0  0  0  0  0  1  0  1  0  0  1  0
CLERK       1  0  0  0  0  0  0  0  0  0  1  1  0  1
MANAGER     0  0  0  1  0  1  1  0  0  0  0  0  0  0
PRESIDENT   0  0  0  0  0  0  0  0  1  0  0  0  0  0
SALESMAN    0  1  1  0  1  0  0  0  0  0  0  0  0  0

Table 11-6 shows that rows 8, 10, and 13 have the value ANALYST, whereas rows 4, 6, and 7 have the value MANAGER. It also shows us that no rows are null (bitmap indexes store null entries; the lack of a null entry in the index implies there are no null rows). If we wanted to count the rows that have the value MANAGER, the bitmap index would do this very rapidly. If we wanted to find all the rows such that the JOB was CLERK or MANAGER, we could simply combine their bitmaps from the index, as shown in Table 11-7.
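
To make the combining step concrete, here is a minimal Python sketch that ORs the CLERK and MANAGER bitmaps from Table 11-6 and counts the MANAGER rows. It is only an illustration of the idea: the bit strings are copied from the table, the names JOB_BITMAPS, bitmap_or, and rows_set are hypothetical, and Oracle's actual on-disk bitmap format (which is compressed and maps bits to rowids) is not modeled here.

```python
# Conceptual model only: one bit per row, leftmost bit = row 1.
# Bit strings are copied from Table 11-6; this is not Oracle's storage format.
JOB_BITMAPS = {
    "ANALYST":   "00000001010010",
    "CLERK":     "10000000001101",
    "MANAGER":   "00010110000000",
    "PRESIDENT": "00000000100000",
    "SALESMAN":  "01101000000000",
}


def bitmap_or(a, b):
    """Return the bitwise OR of two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("bitmaps must cover the same number of rows")
    # Convert to integers, OR them, and pad back to the original width.
    return format(int(a, 2) | int(b, 2), "0{}b".format(len(a)))


def rows_set(bits):
    """Return the 1-based row numbers whose bit is set."""
    return [row for row, bit in enumerate(bits, start=1) if bit == "1"]


# JOB = 'CLERK' OR JOB = 'MANAGER': a single OR over two bitmaps.
clerk_or_manager = bitmap_or(JOB_BITMAPS["CLERK"], JOB_BITMAPS["MANAGER"])
matching_rows = rows_set(clerk_or_manager)

# Counting MANAGER rows only requires counting set bits, not visiting the table.
manager_count = JOB_BITMAPS["MANAGER"].count("1")

print(clerk_or_manager)
print(matching_rows)
print(manager_count)
```

The point of the sketch is that both the OR and the count operate on the bitmaps alone; the table rows would be visited only afterward, and only for the rows whose bit is set.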

