05.11.2015 Views

Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005


CHAPTER 10 ■ DATABASE TABLES 387

Rows     Row Source Operation
-------  ---------------------------------------------------
  48178  TABLE ACCESS BY INDEX ROWID T_HEAP (cr=144534 pr=0 pw=0 time=1331049 us)
  48178   INDEX UNIQUE SCAN T_HEAP_PK (cr=96356 pr=0 pw=0 time=710295 us)(object...

Hash Clustered Tables Wrap-Up

That is the nuts and bolts of a hash cluster. Hash clusters are similar in concept to index clusters, except a cluster index is not used. The data is the index in this case. The cluster key is hashed into a block address and the data is expected to be there. The important things to understand about hash clusters are as follows:

• The hash cluster is allocated right from the beginning. Oracle takes your HASHKEYS/trunc(blocksize/SIZE), and allocates and formats that many blocks right away. As soon as the first table is put in that cluster, any full scan will hit every allocated block. In this respect, a hash cluster is different from every other table.

• The number of HASHKEYs in a hash cluster is fixed. You cannot change the size of the hash table without rebuilding the cluster. This does not in any way limit the amount of data you can store in the cluster; it simply limits the number of unique hash keys that can be generated for it. That may hurt performance through unintended hash collisions if the value is set too low.

• Range scanning on the cluster key is not available. Predicates such as WHERE cluster_key BETWEEN 50 AND 60 cannot use the hashing algorithm. There are an infinite number of possible values between 50 and 60, and the server would have to generate them all, hash each one, and see whether there was any data there. This is not possible. The cluster will be full scanned if you use a range on a cluster key and have not indexed it using a conventional index.
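The first and third points above can be made concrete with a small Python sketch. It is a toy model under stated assumptions, not Oracle's storage engine: `initial_blocks` applies the HASHKEYS/trunc(blocksize/SIZE) arithmetic from the text and ignores any rounding or overhead Oracle adds, while `ToyHashCluster`, its method names, and its modulo-based hash function are hypothetical stand-ins chosen for illustration.

```python
import math


def initial_blocks(hashkeys: int, block_size: int, size: int) -> int:
    """Blocks formatted up front: HASHKEYS / trunc(blocksize / SIZE)."""
    keys_per_block = block_size // size  # trunc(blocksize / SIZE)
    if keys_per_block < 1:
        raise ValueError("SIZE larger than the block size is not modeled here")
    return math.ceil(hashkeys / keys_per_block)


class ToyHashCluster:
    """Toy model with one bucket per hash key (not Oracle's real format)."""

    def __init__(self, hashkeys: int) -> None:
        self.hashkeys = hashkeys
        self.buckets = [[] for _ in range(hashkeys)]

    def _bucket(self, key) -> int:
        # Stand-in hash function (assumption); Oracle's internal hash differs.
        return hash(key) % self.hashkeys

    def insert(self, key, row) -> None:
        self.buckets[self._bucket(key)].append((key, row))

    def lookup(self, key):
        """Equality predicate: hash the key and read exactly one bucket."""
        bucket = self.buckets[self._bucket(key)]
        return [row for k, row in bucket if k == key], 1

    def range_scan(self, low, high):
        """Range predicate: there is no single value to hash, so every
        bucket must be read, which mirrors the full scan in the text."""
        rows = [
            row
            for bucket in self.buckets
            for k, row in bucket
            if low <= k <= high
        ]
        return rows, len(self.buckets)
```

Under these assumptions, `initial_blocks(1000, 8192, 1024)` gives 125 blocks, because eight 1,024-byte key slots fit in one 8 KB block. In the toy cluster, `lookup` always reports one bucket visited, while `range_scan` reports all of them no matter how narrow the range is.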

Hash clusters are suitable in the following situations:

• You know with a good degree of accuracy how many rows the table will have over its life, or you have some reasonable upper bound. Getting the HASHKEYS and SIZE parameters right is crucial to avoid a rebuild.

• DML, especially inserts, is light with respect to retrieval. This means you have to balance optimizing data retrieval with new data creation. Light inserts might be 100,000 per unit of time for one person and 100 per unit of time for another, all depending on their data retrieval patterns. Updates do not introduce significant overhead, unless you update the HASHKEY, which would not be a good idea, as it would cause the row to migrate.

• You access the data by the HASHKEY value constantly. For example, say you have a table of parts, and these parts are accessed by part number. Lookup tables are especially appropriate for hash clusters.
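To see why a good estimate of the number of keys matters, the following Python sketch counts collisions when a set of part numbers is spread over a fixed number of hash slots. It is illustrative only: `collision_report` is a hypothetical helper, and the modulo hash is an assumption standing in for whatever hash function the cluster actually uses.

```python
from collections import Counter


def collision_report(keys, hashkeys: int) -> dict:
    """Summarize how distinct keys spread over a fixed number of hash slots."""
    distinct = set(keys)
    # Stand-in hash (assumption): key modulo the number of hash keys.
    slots = Counter(hash(k) % hashkeys for k in distinct)
    return {
        "distinct_keys": len(distinct),
        "slots_used": len(slots),
        # Keys that landed in a slot some other key already occupies.
        "colliding_keys": len(distinct) - len(slots),
        "max_keys_per_slot": max(slots.values(), default=0),
    }


# 1,000 hypothetical part numbers, first with enough slots, then too few.
well_sized = collision_report(range(1, 1001), 1009)
under_sized = collision_report(range(1, 1001), 101)
```

In this toy model the well-sized case has no colliding keys, while the under-sized case packs up to 10 part numbers into a single slot, so every lookup in such a slot has to sift through rows belonging to other keys.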
