Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005

rekharaghuram
05.11.2015 Views

CHAPTER 14 ■ PARALLEL EXECUTION 629

We’ll start by creating the USER_INFO table, enabling it for parallel operations, and then gathering statistics on it:

big_table@ORA10GR1> create table user_info as select * from all_users;
Table created.

big_table@ORA10GR1> alter table user_info parallel;
Table altered.

big_table@ORA10GR1> exec dbms_stats.gather_table_stats( user, 'USER_INFO' );
PL/SQL procedure successfully completed.

Now, we would like to parallel direct path load a new table with this information. The query we’ll use is simply

create table new_table parallel
as
select a.*, b.user_id, b.created user_created
  from big_table a, user_info b
 where a.owner = b.username

The plan for that particular CREATE TABLE AS SELECT looked like this in Oracle 10g:

---------------------------------------------------------------------------
| Id  | Operation                 | Name      | TQ    |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT    |           |       |      |            |
|   1 |  PX COORDINATOR           |           |       |      |            |
|   2 |   PX SEND QC (RANDOM)     | :TQ10001  | Q1,01 | P->S | QC (RAND)  |
|   3 |    LOAD AS SELECT         |           | Q1,01 | PCWP |            |
|*  4 |     HASH JOIN             |           | Q1,01 | PCWP |            |
|   5 |      PX RECEIVE           |           | Q1,01 | PCWP |            |
|   6 |       PX SEND BROADCAST   | :TQ10000  | Q1,00 | P->P | BROADCAST  |
|   7 |        PX BLOCK ITERATOR  |           | Q1,00 | PCWC |            |
|   8 |         TABLE ACCESS FULL | USER_INFO | Q1,00 | PCWP |            |
|   9 |      PX BLOCK ITERATOR    |           | Q1,01 | PCWC |            |
|  10 |       TABLE ACCESS FULL   | BIG_TABLE | Q1,01 | PCWP |            |
---------------------------------------------------------------------------

If you look at the steps from 4 on down, that is the query (SELECT) component. The scan of BIG_TABLE and the hash join to USER_INFO were performed in parallel, and each of the subresults was loaded into a portion of the table (step 3, the LOAD AS SELECT). After each of the parallel execution servers finishes its part of the join and load, it sends its results up to the query coordinator. In this case, the results simply indicated “success” or “failure,” as the work had already been performed.

And that is all there is to it: parallel direct path loads made easy. The most important thing to consider with these operations is how space is used (or not used). Of particular importance is a side effect called extent trimming. I’d like to spend some time investigating that now.
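
Before moving on, note that a plan like the one discussed in this section can be generated without actually running the load. The following is a minimal sketch, assuming the BIG_TABLE and USER_INFO tables from this section exist in the current schema and that a plan table is available to the session. The output on a real system will typically include more columns (such as Rows, Bytes, and Cost) than the trimmed plan printed in the text.

```sql
-- Sketch only: explain the CREATE TABLE AS SELECT without executing it.
explain plan for
create table new_table parallel
as
select a.*, b.user_id, b.created user_created
  from big_table a, user_info b
 where a.owner = b.username;

-- Display the most recently explained statement from the plan table.
select * from table( dbms_xplan.display );
```

Because EXPLAIN PLAN only records the plan, NEW_TABLE is not created by this step; the CREATE TABLE statement must still be run on its own to perform the load.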

Parallel DDL and Extent Trimming

Parallel DDL relies on direct path operations. That is, the data is not passed to the buffer cache to be written later; rather, an operation such as a CREATE TABLE AS SELECT will create new extents and write directly to them, and the data goes straight from the query to disk, in those newly allocated extents. Each parallel execution server performing its part of the CREATE TABLE AS SELECT will write to its own extent. The INSERT /*+ APPEND */ (a direct path insert) writes “above” a segment’s high-water mark (HWM), and each parallel execution server will again write to its own set of extents, never sharing them with other parallel execution servers. Therefore, if you do a parallel CREATE TABLE AS SELECT and use four parallel execution servers to create the table, then you will have at least four extents, and maybe more. Each of the parallel execution servers will allocate its own extent, write to it and, when it fills up, allocate another new extent. The parallel execution servers will never use an extent allocated by some other parallel execution server.

Figure 14-3 depicts this process. We have a CREATE TABLE NEW_TABLE AS SELECT being executed by four parallel execution servers. In the figure, each parallel execution server is represented by a different color (white, light gray, dark gray, or black). The boxes in the “disk drum” represent the extents that were created in some data file by this CREATE TABLE statement. Each extent is presented in one of the aforementioned four colors, for the simple reason that all of the data in any given extent was loaded by only one of the four parallel execution servers. P003 is depicted as having created and then loaded four of these extents; P000, on the other hand, is depicted as having five extents, and so on.

Figure 14-3. Parallel DDL extent allocation depiction
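
One way to look at the extents that result from such a load is to query the USER_EXTENTS dictionary view. This is a minimal sketch, assuming NEW_TABLE was created in the current schema by the parallel CREATE TABLE AS SELECT shown earlier. The view lists each extent and its size, but it does not record which parallel execution server allocated it.

```sql
-- List every extent of NEW_TABLE with its size in blocks and bytes.
select extent_id, blocks, bytes
  from user_extents
 where segment_name = 'NEW_TABLE'
 order by extent_id;

-- Summarize how many extents of each size were allocated.
select blocks, count(*) extent_count
  from user_extents
 where segment_name = 'NEW_TABLE'
 group by blocks
 order by blocks;
```

If four parallel execution servers performed the load, the first query should return at least four rows, since each server writes only to extents it allocated itself.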

