Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions (Apress, Sep. 2005)

rekharaghuram
05.11.2015

CHAPTER 10 ■ DATABASE TABLES

Figure 10-9. Hash cluster depiction

When you create a hash cluster, you’ll use the same CREATE CLUSTER statement you used to create the index cluster, with different options. You’ll just add a HASHKEYS option to specify the size of the hash table. Oracle will take your HASHKEYS value and round it up to the next prime number (the number of hash keys will always be a prime). Oracle will then compute a value based on the SIZE parameter multiplied by the modified HASHKEYS value, and it will allocate at least that much space, in bytes, for the cluster. This is a big difference from the preceding index cluster, which dynamically allocates space as it needs it. A hash cluster preallocates enough space to hold (HASHKEYS/trunc(blocksize/SIZE)) blocks of data. For example, if you set your SIZE to 1,500 bytes and you have a 4KB blocksize, Oracle will expect to store two keys per block. If you plan on having 1,000 HASHKEYS, Oracle will allocate about 500 blocks (slightly more once HASHKEYS has been rounded up to the next prime).

It is interesting to note that, unlike a conventional hash table in a computer language, it is OK to have hash collisions; in fact, it is desirable in many cases. If you take the same DEPT/EMP example from earlier, you could set up a hash cluster based on the DEPTNO column. Obviously, many rows will hash to the same value, and you expect them to (they have the same DEPTNO). This is what the cluster is about in some respects: clustering like data together. This is why Oracle asks you to specify the HASHKEYS (how many department numbers you anticipate over time) and SIZE (the size of the data that will be associated with each department number). It allocates a hash table to hold HASHKEYS departments of SIZE bytes each. What you do want to avoid is unintended hash collisions. If you set the size of the hash table to 1,000 (really 1,009, since the hash table size is always a prime number and Oracle rounds up for you) and you put 1,010 departments in the table, there will be at least one collision (two different departments hashing to the same value). Unintended hash collisions are to be avoided, as they add overhead and increase the probability of block chaining.

To see what sort of space hash clusters take, we’ll use a small utility stored procedure, SHOW_SPACE (for details on this procedure, see the “Setup” section at the beginning of the book), in this chapter and in the next. This routine just uses the supplied DBMS_SPACE package to get details about the storage used by segments in the database.
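The sizing rule and the collision argument above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not Oracle's actual implementation: the helper names `next_prime`, `preallocated_blocks`, and `bucket_sizes` are hypothetical, the block count simply applies the HASHKEYS/trunc(blocksize/SIZE) rule from the text after rounding HASHKEYS up to a prime, and `key % buckets` is only a stand-in for whatever hash function Oracle really applies.

```python
import math
from collections import Counter


def next_prime(n: int) -> int:
    """Return the smallest prime >= n (trial division; fine for small n)."""
    def is_prime(k: int) -> bool:
        if k < 2:
            return False
        return all(k % d for d in range(2, math.isqrt(k) + 1))

    while not is_prime(n):
        n += 1
    return n


def preallocated_blocks(hashkeys: int, size: int, blocksize: int) -> int:
    """Apply the text's rule: HASHKEYS / trunc(blocksize / SIZE), in blocks.

    Assumption: SIZE <= blocksize, so at least one key fits in a block.
    """
    if size > blocksize:
        raise ValueError("this sketch only covers SIZE <= blocksize")
    keys_per_block = blocksize // size          # trunc(blocksize / SIZE)
    return math.ceil(next_prime(hashkeys) / keys_per_block)


def bucket_sizes(keys, buckets: int) -> Counter:
    """Count keys per bucket, using key % buckets as a stand-in hash."""
    return Counter(k % buckets for k in keys)


# HASHKEYS 1000 is rounded up to the next prime.
buckets = next_prime(1000)
print(buckets)                                   # 1009

# SIZE 1,500 bytes in a 4KB block: two keys per block.
print(preallocated_blocks(1000, 1500, 4096))     # 505

# 1,010 distinct department numbers cannot fit in 1,009 buckets
# without at least two of them sharing a bucket.
counts = bucket_sizes(range(1, 1011), buckets)
collided = [b for b, c in counts.items() if c > 1]
print(collided)                                  # [1]
```

With the prime rounding included, the 1,000-key example works out to 505 blocks rather than an even 500. The collision list is never empty once the number of distinct keys exceeds the number of buckets, whichever hash function is used; which bucket collides depends on the function.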

Now if we issue a CREATE CLUSTER statement such as the following, we can see the storage it allocated:

ops$tkyte@ORA10GR1> create cluster hash_cluster
  2  ( hash_key number )
  3  hashkeys 1000
  4  size 8192
  5  tablespace mssm
  6  /
Cluster created.

ops$tkyte@ORA10GR1> exec show_space( 'HASH_CLUSTER', user, 'CLUSTER' )
Free Blocks............................. 0
Total Blocks............................ 1,024
Total Bytes............................. 8,388,608
Total MBytes............................ 8
Unused Blocks........................... 14
Unused Bytes............................ 114,688
Last Used Ext FileId.................... 9
Last Used Ext BlockId................... 1,033
Last Used Block......................... 114
PL/SQL procedure successfully completed.

We can see that the total number of blocks allocated to the cluster is 1,024. Fourteen of these blocks are unused (free). One block goes to overhead, to manage the extents. Therefore, 1,009 blocks are under the HWM of this object, and these are used by the cluster. The prime 1,009 just happens to be the next prime above 1,000, and since the blocksize is 8KB and SIZE is 8,192, each key gets a block to itself: Oracle did in fact allocate (8192 × 1009) bytes, that is, 1,009 blocks. The total allocation is a little higher than this because of the way extents are rounded, for example when using locally managed tablespaces with uniformly sized extents.

This example points out one of the issues with hash clusters you need to be aware of. Normally, if you create an empty table, the number of blocks under the HWM for that table is 0. If you full scan it, it reaches the HWM and stops. With a hash cluster, the tables will start out big and will take longer to create, as Oracle must initialize each block, an action that normally takes place as data is added to the table. They have the potential to have data in their first block and their last block, with nothing in between. Full scanning a virtually empty hash cluster will take as long as full scanning a full hash cluster. This is not necessarily a bad thing; you built the hash cluster to have very fast access to the data by a hash key lookup. You did not build it to full scan it frequently.

Now we can start placing tables into the hash cluster in the same fashion we did with index clusters:

ops$tkyte@ORA10GR1> create table hashed_table
  2  ( x number, data1 varchar2(4000), data2 varchar2(4000) )
  3  cluster hash_cluster(x);
Table created.
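The block accounting read off the SHOW_SPACE output above can be re-derived with simple arithmetic. This is a small Python sketch that only recombines numbers already quoted in the text (total blocks, unused blocks, one overhead block, 8KB blocksize); the variable names are illustrative and nothing here queries a database.

```python
BLOCK_SIZE = 8192       # 8KB blocksize, as stated in the text
total_blocks = 1024     # "Total Blocks" reported by SHOW_SPACE
unused_blocks = 14      # "Unused Blocks" reported by SHOW_SPACE
overhead_blocks = 1     # one block used to manage the extents, per the text

# Blocks under the high-water mark that belong to the hash table itself.
blocks_under_hwm = total_blocks - unused_blocks - overhead_blocks

total_bytes = total_blocks * BLOCK_SIZE
unused_bytes = unused_blocks * BLOCK_SIZE
hash_table_bytes = blocks_under_hwm * BLOCK_SIZE

print(blocks_under_hwm)    # 1009
print(total_bytes)         # 8388608
print(unused_bytes)        # 114688
print(hash_table_bytes)    # 8265728

# A full scan reads every block under the HWM whether or not it holds rows.
# A newly created ordinary table has 0 such blocks; this empty cluster
# already has blocks_under_hwm of them.
empty_heap_table_blocks = 0
print(blocks_under_hwm - empty_heap_table_blocks)   # 1009
```

The 1,009 blocks under the HWM match HASHKEYS after rounding to a prime, and the byte totals match the "Total Bytes" and "Unused Bytes" lines of the report.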

