
Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions (Apress, September 2005)



CHAPTER 10 ■ DATABASE TABLES

Index Clustered Tables Wrap-Up

Clustered tables give you the ability to physically “prejoin” data together. You use clusters to store related data from many tables on the same database block. Clusters can help read-intensive operations that always join data together or access related sets of data (e.g., everyone in department 10).

Clustered tables reduce the number of blocks that Oracle must cache. Instead of keeping ten blocks for ten employees in the same department, Oracle will put them in one block and therefore increase the efficiency of your buffer cache. On the downside, unless you can calculate your SIZE parameter setting correctly, clusters may use space inefficiently and can slow down DML-heavy operations.

Hash Clustered Tables

Hash clustered tables are very similar in concept to the index clustered tables just described with one main exception: the cluster key index is replaced with a hash function. The data in the table is the index; there is no physical index. Oracle will take the key value for a row, hash it using either an internal function or one you supply, and use that to figure out where the data should be on disk. One side effect of using a hashing algorithm to locate data, however, is that you cannot range scan a table in a hash cluster without adding a conventional index to the table. In an index cluster, the query

select * from emp where deptno between 10 and 20

would be able to make use of the cluster key index to find these rows. In a hash cluster, this query would result in a full table scan unless you had an index on the DEPTNO column. Only exact equality searches (including IN lists and subqueries) may be made on the hash key without using an index that supports range scans.
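The difference between equality and range access can be illustrated with a minimal Python sketch. This is a toy model, not Oracle's implementation: the class `ToyHashCluster`, its stand-in hash function, and the idea of one bucket per hash value are all assumptions made for illustration. It shows only that a hash function discards key order, so an equality search reads one bucket while a BETWEEN search has to read them all.

```python
class ToyHashCluster:
    """Toy model (hypothetical): each hash value maps to one bucket."""

    def __init__(self, hash_keys):
        self.hash_keys = hash_keys
        self.buckets = [[] for _ in range(hash_keys)]

    def _bucket_no(self, deptno):
        # Stand-in hash function (an assumption, not Oracle's).
        # It scatters adjacent keys, so bucket order says nothing
        # about key order.
        return (deptno * 7919) % self.hash_keys

    def insert(self, deptno, ename):
        self.buckets[self._bucket_no(deptno)].append((deptno, ename))

    def find_equal(self, deptno):
        # Equality: hash the key and read exactly one bucket.
        bucket = self.buckets[self._bucket_no(deptno)]
        rows = [row for row in bucket if row[0] == deptno]
        return rows, 1  # (matching rows, buckets read)

    def find_between(self, low, high):
        # Range: the hash gives no ordering to exploit, so every
        # bucket must be read (the analogue of a full table scan).
        rows = []
        for bucket in self.buckets:
            rows.extend(row for row in bucket if low <= row[0] <= high)
        return sorted(rows), self.hash_keys


cluster = ToyHashCluster(hash_keys=16)
for deptno, ename in [(10, "CLARK"), (20, "SMITH"), (30, "ALLEN"), (10, "KING")]:
    cluster.insert(deptno, ename)

print(cluster.find_equal(10))
print(cluster.find_between(10, 20))
```

In this model the equality search reports one bucket read and the range search reports sixteen, one for every bucket in the structure.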

In a perfect world, with nicely distributed hash key values and a hash function that distributes them evenly over all of the blocks allocated to the hash cluster, we can go straight from a query to the data with one I/O. In the real world, we will end up with more hash key values hashing to the same database block address than fit on that block. This will result in Oracle having to chain blocks together in a linked list to hold all of the rows that hash to this block. Now, when we need to retrieve the rows that match our hash key, we might have to visit more than one block.
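The chaining behavior just described can be sketched in Python. Everything here is a simplifying assumption for illustration: the names `Block` and `ChainedHashCluster`, a capacity of three rows per block, integer keys hashed with a plain modulus, and one primary block per hash value. It is not how Oracle lays out blocks; it only shows why a lookup can cost more than one block visit once a primary block overflows.

```python
ROWS_PER_BLOCK = 3  # assumed block capacity, chosen for the sketch


class Block:
    def __init__(self):
        self.rows = []
        self.next = None  # overflow block chained off this one


class ChainedHashCluster:
    """Toy model (hypothetical) of hash buckets with overflow chains."""

    def __init__(self, hash_keys):
        self.hash_keys = hash_keys  # fixed when the structure is created
        self.blocks = [Block() for _ in range(hash_keys)]

    def insert(self, key, row):
        block = self.blocks[key % self.hash_keys]
        # Walk the chain to the first block with free space,
        # adding a new overflow block when the chain is full.
        while len(block.rows) >= ROWS_PER_BLOCK:
            if block.next is None:
                block.next = Block()
            block = block.next
        block.rows.append((key, row))

    def lookup(self, key):
        # Visit the primary block and every overflow block behind it.
        block = self.blocks[key % self.hash_keys]
        rows, visited = [], 0
        while block is not None:
            visited += 1
            rows.extend(r for k, r in block.rows if k == key)
            block = block.next
        return rows, visited


cluster = ChainedHashCluster(hash_keys=4)
for n in range(3):
    cluster.insert(10, f"emp{n}")
print(cluster.lookup(10)[1])  # the three rows fit in the primary block

for n in range(3, 7):
    cluster.insert(10, f"emp{n}")
print(cluster.lookup(10)[1])  # seven rows now span a chain of blocks
```

Note that in this model a key that merely collides with a crowded key (for example 14, which lands in the same bucket as 10 when there are four buckets) also pays for walking the whole chain, even though none of the rows belong to it.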

Like a hash table in a programming language, hash tables in the database have a fixed “size.” When you create the table, you must determine the number of hash keys your table will have, forever. That does not limit the number of rows you can put in there.

Figure 10-9 shows a graphical representation of a hash cluster with table EMP created in it. When the client issues a query that uses the hash cluster key in the predicate, Oracle will apply the hash function to determine which block the data should be in. It will then read that one block to find the data. If there have been many collisions, or the SIZE parameter to the CREATE CLUSTER was underestimated, Oracle will have allocated overflow blocks that are chained off the original block.
