We’ll start by creating the USER_INFO table, enabling it for parallel operations, and then gathering statistics on it:

big_table@ORA10GR1> create table user_info as select * from all_users;
Table created.

big_table@ORA10GR1> alter table user_info parallel;
Table altered.

big_table@ORA10GR1> exec dbms_stats.gather_table_stats( user, 'USER_INFO' );
PL/SQL procedure successfully completed.

Now, we would like to parallel direct path load a new table with this information. The query we’ll use is simply

create table new_table parallel
as
select a.*, b.user_id, b.created user_created
  from big_table a, user_info b
 where a.owner = b.username

The plan for that particular CREATE TABLE AS SELECT looked like this in Oracle 10g:

---------------------------------------------------------------------------
| Id  | Operation                | Name      |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT   |           |        |      |            |
|   1 |  PX COORDINATOR          |           |        |      |            |
|   2 |   PX SEND QC (RANDOM)    | :TQ10001  |  Q1,01 | P->S | QC (RAND)  |
|   3 |    LOAD AS SELECT        |           |  Q1,01 | PCWP |            |
|*  4 |     HASH JOIN            |           |  Q1,01 | PCWP |            |
|   5 |      PX RECEIVE          |           |  Q1,01 | PCWP |            |
|   6 |       PX SEND BROADCAST  | :TQ10000  |  Q1,00 | P->P | BROADCAST  |
|   7 |        PX BLOCK ITERATOR |           |  Q1,00 | PCWC |            |
|   8 |         TABLE ACCESS FULL| USER_INFO |  Q1,00 | PCWP |            |
|   9 |      PX BLOCK ITERATOR   |           |  Q1,01 | PCWC |            |
|  10 |       TABLE ACCESS FULL  | BIG_TABLE |  Q1,01 | PCWP |            |
---------------------------------------------------------------------------

If you look at the steps from 4 on down, that is the query (SELECT) component. The scan of BIG_TABLE and hash join to USER_INFO were performed in parallel, and each of the subresults was loaded into a portion of the table (step 3, the LOAD AS SELECT). After each of the parallel execution servers finishes its part of the join and load, it sends its results up to the query coordinator. In this case, the results simply indicated “success” or “failure,” as the work had already been performed.
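If you would like to generate such a plan for yourself, one approach (a sketch using EXPLAIN PLAN and the DBMS_XPLAN package, which is available in Oracle 9i Release 2 and later) is to explain the CREATE TABLE AS SELECT and then format the contents of the plan table:

-- Populate the plan table without actually creating NEW_TABLE...
explain plan for
create table new_table parallel
as
select a.*, b.user_id, b.created user_created
  from big_table a, user_info b
 where a.owner = b.username;

-- ...and then format and display the resulting plan.
select * from table(dbms_xplan.display);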
And that is all there is to it—parallel direct path loads made easy. The most important thing to consider with these operations is how space is used (or not used). Of particular importance is a side effect called extent trimming. I’d like to spend some time investigating that now.

Parallel DDL and Extent Trimming

Parallel DDL relies on direct path operations. That is, the data is not passed to the buffer cache to be written later; rather, an operation such as a CREATE TABLE AS SELECT will create new extents and write directly to them, and the data goes straight from the query to disk, in those newly allocated extents. Each parallel execution server performing its part of the CREATE TABLE AS SELECT will write to its own extent. The INSERT /*+ APPEND */ (a direct path insert) writes “above” a segment’s high-water mark (HWM), and each parallel execution server will again write to its own set of extents, never sharing them with other parallel execution servers. Therefore, if you do a parallel CREATE TABLE AS SELECT and use four parallel execution servers to create the table, then you will have at least four extents—maybe more. But each of the parallel execution servers will allocate its own extent, write to it and, when it fills up, allocate another new extent. The parallel execution servers will never use an extent allocated by some other parallel execution server.

Figure 14-3 depicts this process. We have a CREATE TABLE NEW_TABLE AS SELECT being executed by four parallel execution servers. In the figure, each parallel execution server is represented by a different color (white, light gray, dark gray, or black). The boxes in the “disk drum” represent the extents that were created in some data file by this CREATE TABLE statement. Each extent is presented in one of the aforementioned four colors, for the simple reason that all of the data in any given extent was loaded by only one of the four parallel execution servers—P003 is depicted as having created and then loaded four of these extents. P000, on the other hand, is depicted as having five extents, and so on.

Figure 14-3. Parallel DDL extent allocation depiction
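To see this extent behavior for yourself, one approach (a sketch only; it assumes the NEW_TABLE created earlier, and the counts you observe will vary with your tablespace’s extent management settings and the degree of parallelism) is to perform a parallel direct path insert into the table and then count its extents:

-- Parallel DML must be explicitly enabled in the session.
alter session enable parallel dml;

-- A direct path insert: each parallel execution server writes above
-- the HWM into its own newly allocated extents, never sharing them.
insert /*+ append */ into new_table
select a.*, b.user_id, b.created
  from big_table a, user_info b
 where a.owner = b.username;

commit;

-- With four parallel execution servers, we would expect at least four
-- extents from this load, on top of those allocated by the original
-- CREATE TABLE AS SELECT.
select count(*)
  from user_extents
 where segment_name = 'NEW_TABLE';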