
Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005


CHAPTER 11 ■ INDEXES

Table 11-7. Representation of a Bitwise OR

Value/Row          1  2  3  4  5  6  7  8  9  10  11  12  13  14
CLERK              1  0  0  0  0  0  0  0  0  0   1   1   0   1
MANAGER            0  0  0  1  0  1  1  0  0  0   0   0   0   0
CLERK or MANAGER   1  0  0  1  0  1  1  0  0  0   1   1   0   1

Table 11-7 rapidly shows us that rows 1, 4, 6, 7, 11, 12, and 14 satisfy our criteria. The bitmap Oracle stores with each key value is set up so that each position represents a rowid in the underlying table, in case we need to actually retrieve the row for further processing. Queries such as the following:

select count(*) from emp where job = 'CLERK' or job = 'MANAGER'

will be answered directly from the bitmap index. A query such as this:

select * from emp where job = 'CLERK' or job = 'MANAGER'

on the other hand, will need to get to the table. Here, Oracle will apply a function to turn the fact that the ith bit is on in a bitmap into a rowid that can be used to access the table.
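The OR operation in Table 11-7 can be modeled in a few lines of Python. This is only a conceptual sketch: the lists `CLERK` and `MANAGER` copy the bit patterns from the table, and the helper names `bitmap_or` and `matching_rows` are made up for illustration. It says nothing about how Oracle physically stores its bitmaps or maps a bit position to a rowid.

```python
# Bit patterns copied from Table 11-7; list index 0 corresponds to row 1.
CLERK   = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1]
MANAGER = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0]


def bitmap_or(a, b):
    """Combine two equal-length bitmaps with a bitwise OR (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("bitmaps must cover the same set of rows")
    return [x | y for x, y in zip(a, b)]


def matching_rows(bitmap):
    """Return the 1-based row positions whose bit is set (hypothetical helper)."""
    return [pos for pos, bit in enumerate(bitmap, start=1) if bit]


combined = bitmap_or(CLERK, MANAGER)

# Counting set bits is analogous to the COUNT(*) query: only the bitmap is needed.
count = sum(combined)

# Listing the set positions is analogous to the SELECT * query: each position
# would still have to be turned into a rowid before the table could be read.
rows = matching_rows(combined)
```

Here `rows` holds the same row numbers the text reads off the table, and `count` is obtained without ever looking at the rows themselves.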

When Should You Use a Bitmap Index?<br />

Bitmap indexes are most appropriate on low distinct cardinality data (i.e., data with relatively few discrete values when compared to the cardinality of the entire set). It is not really possible to put a value on this; in other words, it is difficult to define what low distinct cardinality truly is. In a set of a couple thousand records, 2 would be low distinct cardinality, but 2 would not be low distinct cardinality in a two-row table. In a table of tens or hundreds of millions of records, 100,000 could be low distinct cardinality. So, low distinct cardinality is relative to the size of the result set. This is data where the number of distinct items in the set of rows divided by the number of rows is a small number (near zero). For example, a GENDER column might take on the values M, F, and NULL. If you have a table with 20,000 employee records in it, then you would find that 3/20000 = 0.00015. Likewise, 100,000 unique values out of 10,000,000 results in a ratio of 0.01, which is again very small. These columns would be candidates for bitmap indexes. They probably would not be candidates for B*Tree indexes, as each of the values would tend to retrieve an extremely large percentage of the table. B*Tree indexes should be selective in general, as outlined earlier. Bitmap indexes should not be selective; on the contrary, they should be very “unselective” in general.
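The ratio described above is simple enough to check by hand, but a short Python sketch makes the comparison explicit. The function name `distinct_ratio` is an assumption introduced here for illustration. The inputs are the figures quoted in the text, plus the two-row table mentioned earlier as a counterexample.

```python
def distinct_ratio(distinct_values, total_rows):
    """Number of distinct values divided by the number of rows (hypothetical helper)."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return distinct_values / total_rows


# GENDER with values M, F, and NULL in a 20,000-row employee table.
gender_ratio = distinct_ratio(3, 20_000)

# 100,000 unique values in a 10,000,000-row table.
large_table_ratio = distinct_ratio(100_000, 10_000_000)

# Two distinct values in a two-row table: the ratio is as far from zero as it can be.
two_row_ratio = distinct_ratio(2, 2)
```

The first two ratios are close to zero, which is the property the text associates with bitmap index candidates. The third is not, even though the absolute number of distinct values is the smallest of the three.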

Bitmap indexes are extremely useful in environments where you have lots of ad hoc queries, especially queries that reference many columns in an ad hoc fashion or produce aggregations such as COUNT. For example, suppose you have a large table with three columns: GENDER, LOCATION, and AGE_GROUP. In this table, GENDER has a value of M or F, LOCATION can take on the values 1 through 50, and AGE_GROUP is a code representing 18 and under, 19-25, 26-30, 31-40, and 41 and over. You have to support a large number of ad hoc queries that take the following form:
