Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005
CHAPTER 14 ■ PARALLEL EXECUTION 643

  8  /
48250 rows created.

ops$tkyte-ORA10G> commit;
Commit complete.

Just to see what happened here, we can query the newly inserted data and group by SESSION_ID to see first how many parallel execution servers were used, and second how many rows each processed:

ops$tkyte-ORA10G> select session_id, count(*)
  2    from t2
  3   group by session_id;

SESSION_ID   COUNT(*)
---------- ----------
       241       8040
       246       8045
       253       8042
       254       8042
       258       8040
       260       8041

6 rows selected.

Apparently, we used six parallel execution servers for the SELECT component of this parallel operation, and each one processed about 8,040 records.

As you can see, Oracle parallelized our process, but only after we put it through a fairly radical rewrite. This is a long way from the original implementation. So, while Oracle can process our routine in parallel, we may well not have any routines that are coded to be parallelized. If a rather large rewrite of your procedure is not feasible, you may be interested in the next implementation: DIY parallelism.

Do-It-Yourself Parallelism

Say we have that same process as in the preceding section: the serial, simple procedure. We cannot afford a rather extensive rewrite of the implementation, but we would like to execute it in parallel. What can we do? My approach has often been to use rowid ranges to break up the table into some number of ranges that don’t overlap (yet completely cover the table).

Conceptually, this is very similar to how Oracle performs a parallel query. If you think of a full table scan, Oracle processes that by coming up with some method to break the table into many “small” tables, each of which is processed by a parallel execution server. We are going to do the same thing using rowid ranges. In early releases, Oracle’s parallel implementation actually used rowid ranges itself.
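To make the idea of non-overlapping rowid ranges concrete, here is a minimal sketch that builds one range per extent. It is an illustration under stated assumptions, not necessarily the method used in this chapter: it assumes BIG_TABLE is a non-partitioned heap table owned by the current user, that the session can query DBA_EXTENTS, and that 10000 is a safely high row number for the last block of each extent. The column aliases LO_RID and HI_RID are chosen for illustration.

```sql
-- Sketch only: one rowid range per extent of BIG_TABLE.
-- Assumptions: non-partitioned heap table owned by the current user,
-- SELECT access to DBA_EXTENTS, and 10000 as an arbitrary row number
-- assumed to exceed the number of rows in any single block.
select dbms_rowid.rowid_create( 1,                    -- 1 = extended rowid
                                o.data_object_id,
                                e.relative_fno,
                                e.block_id,           -- first block of the extent
                                0 ) lo_rid,
       dbms_rowid.rowid_create( 1,
                                o.data_object_id,
                                e.relative_fno,
                                e.block_id + e.blocks - 1,  -- last block of the extent
                                10000 ) hi_rid
  from dba_extents e, user_objects o
 where e.owner = user
   and e.segment_name = 'BIG_TABLE'
   and o.object_name = e.segment_name
   and o.object_type = 'TABLE'
 order by e.relative_fno, e.block_id;
```

Each (LO_RID, HI_RID) pair can then drive a predicate such as WHERE ROWID BETWEEN lo_rid AND hi_rid. Because extents are disjoint runs of blocks, the resulting ranges do not overlap, and together they cover every row in the segment. In practice, you would likely group adjacent extents so that the number of ranges matches the number of workers you intend to run.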
We’ll use a BIG_TABLE of 1,000,000 rows, as this technique works best on big tables with lots of extents, and the method I use for creating rowid ranges depends on extent boundaries. The more extents used, the better the data distribution. So, after creating the BIG_TABLE with 1,000,000 rows, we’ll create T2 like this:

big_table-ORA10G> create table t2
  2  as
  3  select object_id id, object_name text, 0 session_id
  4    from big_table
  5   where 1=0;
Table created.

We are going to use the job queues built into the database to process our procedure in parallel. We will schedule some number of jobs. Each job is our procedure, slightly modified to process just the rows in a given rowid range.

■Note In Oracle 10g, you could use the scheduler for something this simple, but in order to make the example 9i compatible, we’ll use the job queues here.

To efficiently support the job queues, we’ll use a parameter table to pass inputs to our jobs:

big_table-ORA10G> create table job_parms
  2  ( job number primary key,
  3    lo_rid rowid,
  4    hi_rid rowid
  5  )
  6  /
Table created.

This will allow us to just pass the job ID into our procedure, so it can query this table to get the rowid range it is to process. Now for our procedure. The new code takes the job ID as a parameter, looks up that job’s rowid range in JOB_PARMS, and restricts the query against BIG_TABLE to that range:

big_table-ORA10G> create or replace
  2  procedure serial( p_job in number )
  3  is
  4      l_rec job_parms%rowtype;
  5  begin
  6      select * into l_rec
  7        from job_parms
  8       where job = p_job;
  9
 10      for x in ( select object_id id, object_name text
 11                   from big_table
 12                  where rowid between l_rec.lo_rid
 13                                  and l_rec.hi_rid )
 14      loop
 15          -- complex process here
 16          insert into t2 (id, text, session_id )
 17          values ( x.id, x.text, p_job );
 18      end loop;