Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005
CHAPTER 11 ■ INDEXES

      l_last_digit    number default 0;

      type vcArray is table of varchar2(10) index by binary_integer;
      l_code_table    vcArray;

    begin
      stats.cnt := stats.cnt+1;

      l_code_table(1) := 'BPFV';
      l_code_table(2) := 'CSKGJQXZ';
      l_code_table(3) := 'DT';
      l_code_table(4) := 'L';
      l_code_table(5) := 'MN';
      l_code_table(6) := 'R';

      for i in 1 .. length(p_string)
      loop
        exit when (length(l_return_string) = 6);
        l_char := upper(substr( p_string, i, 1 ) );

        for j in 1 .. l_code_table.count
        loop
          if (instr(l_code_table(j), l_char ) > 0 AND j <> l_last_digit)
          then
            l_return_string := l_return_string || to_char(j,'fm9');
            l_last_digit := j;
          end if;
        end loop;
      end loop;

      return rpad( l_return_string, 6, '0' );
    end;
    /

    Function created.

Notice that in this function we are using a new keyword, DETERMINISTIC. It declares that the preceding function, when given the same inputs, will always return the exact same output. This is needed to create an index on a user-written function: we are telling Oracle that the function can be trusted to return the same value, call after call, given the same inputs. If this were not the case, we would receive different answers when accessing the data via the index versus a full table scan. This deterministic setting implies, for example, that we cannot create an index on the function DBMS_RANDOM.RANDOM, the random number generator. Its results are not deterministic; given the same inputs, we’ll get random output. The built-in SQL function UPPER used in the first example, on the other hand, is deterministic, so we can create an index on the UPPER value of a column.
Now that we have the function MY_SOUNDEX, let’s see how it performs without an index.
This uses the EMP table we created earlier with about 10,000 rows in it:
    ops$tkyte@ORA10G> set timing on
    ops$tkyte@ORA10G> set autotrace on explain
    ops$tkyte@ORA10G> select ename, hiredate
      2  from emp
      3  where my_soundex(ename) = my_soundex('Kings')
      4  /

    ENAME      HIREDATE
    ---------- ---------
    Ku$_Chunk_ 10-AUG-04
    Ku$_Chunk_ 10-AUG-04

    Elapsed: 00:00:01.07

    Execution Plan
    ----------------------------------------------------------
       0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=32 Card=100 Bytes=1900)
       1    0   TABLE ACCESS (FULL) OF 'EMP' (TABLE) (Cost=32 Card=100 Bytes=1900)

    ops$tkyte@ORA10G> set autotrace off
    ops$tkyte@ORA10G> set timing off
    ops$tkyte@ORA10G> set serveroutput on
    ops$tkyte@ORA10G> exec dbms_output.put_line( stats.cnt );
    19998

    PL/SQL procedure successfully completed.
We can see this query took over one second to execute and had to do a full scan on the
table. The function MY_SOUNDEX was invoked almost 20,000 times (according to our counter),
twice for each row.
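The arithmetic behind that counter can be modeled with a small Python sketch. It is only an analogy, not a description of Oracle internals: `expensive_key` is a hypothetical stand-in for MY_SOUNDEX, the 9,999 rows are synthetic, and the dictionary plays the role of a precomputed index.

```python
from collections import defaultdict

calls = 0  # hypothetical counter, playing the role of stats.cnt


def expensive_key(name: str) -> str:
    """Hypothetical stand-in for MY_SOUNDEX that counts its invocations."""
    global calls
    calls += 1
    return name[:1].upper() + str(len(name))


# Synthetic data: 9,999 rows, two of which share a key with 'Kings'.
rows = ["Kings", "Kinks"] + [f"name{i}" for i in range(9_997)]

# Full scan: the predicate calls the function on both sides for every row.
calls = 0
full_scan = [r for r in rows if expensive_key(r) == expensive_key("Kings")]
full_scan_calls = calls              # 2 calls * 9,999 rows

# "Index": compute the key once per row at build time and store it.
index = defaultdict(list)
for r in rows:
    index[expensive_key(r)].append(r)

# Lookup: only the search value needs to be run through the function.
calls = 0
indexed = index.get(expensive_key("Kings"), [])
indexed_calls = calls

if __name__ == "__main__":
    print(full_scan_calls, indexed_calls)
```

With 9,999 rows the full-scan count comes to 19,998, which is consistent with the figure reported above. How many calls Oracle actually makes once an index exists depends on the execution plan, so the single call in this model should not be read as a prediction.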
Let’s see how indexing the function can speed things up. The first thing we’ll do is create
the index as follows:
    ops$tkyte@ORA10G> create index emp_soundex_idx on
      2  emp( substr(my_soundex(ename),1,6) )
      3  /

    Index created.
The interesting thing to note in this CREATE INDEX command is the use of the SUBSTR function.
This is because we are indexing a function that returns a string. If we were indexing a
function that returned a number or date, this SUBSTR would not be necessary. The reason we
must SUBSTR the user-written function that returns a string is that such functions return
VARCHAR2(4000) types. That may well be too big to be indexed; index entries must fit within
about three quarters the size of a block. If we tried, we would receive (in a tablespace with a
4KB blocksize) the following: