Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions (Apress, Sep. 2005)
CHAPTER 10 ■ DATABASE TABLES

Figure 10-9. Hash cluster depiction

When you create a hash cluster, you’ll use the same CREATE CLUSTER statement you used to create the index cluster, with different options. You’ll just add a HASHKEYS option to specify the size of the hash table. Oracle will take your HASHKEYS value and round it up to the next prime number (the number of hash keys will always be a prime). Oracle will then compute a value based on the SIZE parameter multiplied by the modified HASHKEYS value, and it will allocate at least that much space, in bytes, for the cluster. This is a big difference from the preceding index cluster, which dynamically allocates space as it needs it. A hash cluster preallocates enough space to hold (HASHKEYS/trunc(blocksize/SIZE)) blocks of data. For example, if you set your SIZE to 1,500 bytes and you have a 4KB blocksize, Oracle will expect to store two keys per block. If you plan on having 1,000 HASHKEYS, Oracle will allocate about 500 blocks.

It is interesting to note that, unlike a conventional hash table in a programming language, it is OK to have hash collisions; in fact, it is desirable in many cases. If you take the same DEPT/EMP example from earlier, you could set up a hash cluster based on the DEPTNO column. Obviously, many rows will hash to the same value, and you expect them to (they have the same DEPTNO). This is what the cluster is about in some respects: clustering like data together. This is why Oracle asks you to specify HASHKEYS (how many department numbers you anticipate over time) and SIZE (the size of the data that will be associated with each department number). It allocates a hash table to hold HASHKEYS departments of SIZE bytes each. What you do want to avoid is unintended hash collisions. If you set the size of the hash table to 1,000 (really 1,009, since the hash table size is always a prime number and Oracle rounds up for you) and you put 1,010 departments in the table, there will be at least one collision (two different departments hashing to the same value). Unintended hash collisions are to be avoided, as they add overhead and increase the probability of block chaining.

To see what sort of space hash clusters take, we’ll use a small utility stored procedure, SHOW_SPACE (for details on this procedure, see the “Setup” section at the beginning of the book), in this chapter and the next. This routine just uses the supplied DBMS_SPACE package to get details about the storage used by segments in the database.
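
The sizing rules and the collision argument above can be checked with a little arithmetic. The following is a minimal sketch, not Oracle's implementation: the function names are hypothetical, block overhead is ignored (the whole blocksize is treated as usable), and `key % buckets` is only a stand-in for Oracle's internal hash function, used to illustrate the pigeonhole argument.

```python
import math
from collections import Counter


def next_prime(n: int) -> int:
    """Return the smallest prime >= n (trial division; fine for small n)."""
    candidate = max(n, 2)
    while any(candidate % d == 0 for d in range(2, math.isqrt(candidate) + 1)):
        candidate += 1
    return candidate


def estimate_hash_cluster_blocks(hashkeys: int, size_bytes: int, block_bytes: int):
    """Estimate (buckets, keys_per_block, blocks) for a hash cluster.

    Assumption: mirrors the text's formula HASHKEYS / trunc(blocksize / SIZE),
    applied to the prime-rounded HASHKEYS and rounded up to whole blocks.
    """
    buckets = next_prime(hashkeys)
    # Assumption: at least one key per block, even if SIZE exceeds the blocksize.
    keys_per_block = max(1, block_bytes // size_bytes)
    blocks = math.ceil(buckets / keys_per_block)
    return buckets, keys_per_block, blocks


def max_bucket_load(keys, buckets: int) -> int:
    """Largest number of distinct keys landing in one bucket (toy modulo hash)."""
    return max(Counter(key % buckets for key in keys).values())


# SIZE 1,500 in a 4KB block: two keys per block, roughly 500 blocks.
print(estimate_hash_cluster_blocks(1000, 1500, 4096))  # (1009, 2, 505)

# SIZE 8,192 in an 8KB block: one key per block.
print(estimate_hash_cluster_blocks(1000, 8192, 8192))  # (1009, 1, 1009)

# 1,009 distinct department numbers fit without collision in this toy model;
# 1,010 cannot, so at least one bucket holds two departments.
print(max_bucket_load(range(1, 1010), 1009))  # 1
print(max_bucket_load(range(1, 1011), 1009))  # 2
```

With the prime-rounded value of 1,009, the first estimate comes out at 505 blocks rather than exactly 500; the text's figure corresponds to dividing the unrounded 1,000 by two.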
Now if we issue a CREATE CLUSTER statement such as the following, we can see the storage it allocated:

    ops$tkyte@ORA10GR1> create cluster hash_cluster
      2  ( hash_key number )
      3  hashkeys 1000
      4  size 8192
      5  tablespace mssm
      6  /
    Cluster created.

    ops$tkyte@ORA10GR1> exec show_space( 'HASH_CLUSTER', user, 'CLUSTER' )
    Free Blocks............................. 0
    Total Blocks............................ 1,024
    Total Bytes............................. 8,388,608
    Total MBytes............................ 8
    Unused Blocks........................... 14
    Unused Bytes............................ 114,688
    Last Used Ext FileId.................... 9
    Last Used Ext BlockId................... 1,033
    Last Used Block......................... 114
    PL/SQL procedure successfully completed.

We can see that the total number of blocks allocated to the cluster is 1,024. Fourteen of these blocks are unused (free). One block goes to overhead, to manage the extents. Therefore, 1,009 blocks are under the HWM of this object, and these are used by the cluster. The prime 1,009 just happens to be the next prime above 1,000, and since the blocksize is 8KB, we can see that Oracle did in fact allocate (8192 ✕ 1009) bytes, that is, 1,009 blocks. The total allocation is a little higher than this because of the way extents are rounded and/or the use of locally managed tablespaces with uniformly sized extents.

This example points out one of the issues with hash clusters you need to be aware of. Normally, if you create an empty table, the number of blocks under the HWM for that table is 0. If you full scan it, it reaches the HWM and stops. With a hash cluster, the tables will start out big and will take longer to create, as Oracle must initialize each block, an action that normally takes place as data is added to the table. They have the potential to have data in their first block and their last block, with nothing in between. Full scanning a virtually empty hash cluster will take as long as full scanning a full hash cluster. This is not necessarily a bad thing; you built the hash cluster to have very fast access to the data by a hash key lookup. You did not build it to full scan it frequently.

Now we can start placing tables into the hash cluster in the same fashion we did with index clusters:

    ops$tkyte@ORA10GR1> create table hashed_table
      2  ( x number, data1 varchar2(4000), data2 varchar2(4000) )
      3  cluster hash_cluster(x);
    Table created.
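
The SHOW_SPACE figures above can be reconciled with a few lines of arithmetic. This is a sketch only: the numbers are copied from the output shown, the constant and function names are hypothetical, and the one-block overhead is taken from the text rather than measured.

```python
# Figures copied from the SHOW_SPACE output above.
BLOCK_BYTES = 8192      # 8KB blocksize
TOTAL_BLOCKS = 1024
UNUSED_BLOCKS = 14
OVERHEAD_BLOCKS = 1     # one block for extent management, per the text


def blocks_under_hwm(total: int, unused: int, overhead: int) -> int:
    """Blocks below the high-water mark that are formatted for cluster data."""
    return total - unused - overhead


used = blocks_under_hwm(TOTAL_BLOCKS, UNUSED_BLOCKS, OVERHEAD_BLOCKS)

print(TOTAL_BLOCKS * BLOCK_BYTES)   # 8388608, matches Total Bytes
print(UNUSED_BLOCKS * BLOCK_BYTES)  # 114688, matches Unused Bytes
print(used)                         # 1009, the prime-rounded HASHKEYS
print(used * BLOCK_BYTES)           # 8265728 bytes preallocated for hash keys

# A full scan reads every block below the HWM regardless of how many rows exist.
# An empty conventional table has 0 blocks under its HWM; this empty hash
# cluster already has 1,009.
empty_table_scan_blocks = 0
empty_hash_cluster_scan_blocks = used
print(empty_hash_cluster_scan_blocks - empty_table_scan_blocks)  # 1009
```

The last line is the cost the text warns about: the hash cluster pays for 1,009 blocks of scanning before a single row has been inserted.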