Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions (Apress, September 2005)

rekharaghuram
05.11.2015 Views

CHAPTER 14 ■ ■ ■ Parallel Execution

Parallel execution, a feature of Oracle Enterprise Edition (it is not available in the Standard Edition), was first introduced in Oracle version 7.1.6 in 1994. It is the ability to physically break a large serial task (any DML, or DDL in general) into many smaller bits that may all be processed simultaneously. Parallel executions in Oracle mimic the real-life processes we see all of the time. Rarely would you expect to see a single individual build a house; it is far more common for many teams of people to work concurrently to rapidly assemble the house. In that way, certain operations can be divided into smaller tasks and performed concurrently. For example, the plumbing and electrical wiring can take place at the same time to reduce the total amount of time required for the job as a whole.

Parallel execution in Oracle follows much the same logic. It is often possible for Oracle to divide a certain large “job” into smaller parts and to perform each part concurrently. For example, if a full table scan of a large table is required, there is no reason why Oracle cannot have four parallel sessions, P001–P004, perform the full scan together, with each session reading a different portion of the table. If the data scanned by P001–P004 needs to be sorted, this could be carried out by four more parallel sessions, P005–P008, which could ultimately send the results to an overall coordinating session for the query.

Parallel execution is a tool that, when wielded properly, may improve the response time of certain operations by orders of magnitude. When it’s wielded as a “fast = true” switch, the results are typically quite the opposite. In this chapter, the goal is not to explain precisely how parallel query is implemented in Oracle, the myriad combinations of plans that can result from parallel operations, and the like.
I feel that much of that material is covered quite well in the Oracle Administrator’s Guide, the Oracle Concepts Guide and, in particular, the Oracle Data Warehousing Guide. This chapter’s goal is to give you an understanding of what class of problems parallel execution is and isn’t appropriate for. Specifically, after looking at when to use parallel execution, we will cover the following:

• Parallel query: The ability to perform a single query using many operating system processes or threads. Oracle will find operations it can perform in parallel, such as full table scans or large sorts, and create a query plan to do so.

• Parallel DML (PDML): This is very similar in nature to parallel query, but it is used in reference to performing modifications (INSERT, UPDATE, DELETE, and MERGE) using parallel processing. In this chapter, we’ll look at PDML and discuss some of the inherent limitations associated with it.

• Parallel DDL: Parallel DDL is the ability of Oracle to perform large DDL operations in parallel. For example, an index rebuild, creation of a new index, loading of data, and reorganization of large tables may all use parallel processing. This, I believe, is the “sweet spot” for parallelism in the database, so we will focus most of the discussion on this topic.

• Parallel recovery: This is the ability of the database to perform instance or even media recovery in parallel in an attempt to reduce the time it takes to recover from failures.

• Procedural parallelism: This is the ability to run developed code in parallel. In this chapter, I’ll discuss two approaches to this. In the first approach, Oracle runs our developed PL/SQL code in parallel in a fashion transparent to developers (developers are not developing parallel code; rather, Oracle is parallelizing their code for them transparently). The other approach is something I term “do-it-yourself parallelism,” whereby the developed code is designed to be executed in parallel.

When to Use Parallel Execution

Parallel execution can be fantastic. It can allow you to take a process that executes over many hours or days and complete it in minutes. Breaking down a huge problem into small components may, in some cases, dramatically reduce the processing time. However, one underlying concept that will be useful to keep in mind while considering parallel execution is summarized by this very short quote from Practical Oracle8i: Building Efficient Databases (Addison-Wesley, 2001) by Jonathan Lewis:

“PARALLEL QUERY option is essentially nonscalable.”

Parallel execution is essentially a nonscalable solution. It was designed to allow an individual user or a particular SQL statement to consume all resources of a database.
If you have a feature that allows an individual to make use of everything that is available, and then you allow two individuals to use that feature, you’ll have obvious contention issues. As the number of concurrent users on your system begins to overwhelm the resources you have (memory, CPU, and I/O), the ability to deploy parallel operations becomes questionable. If you have a four-CPU machine, for example, and on average you have 32 users executing queries simultaneously, then the odds are that you do not want to parallelize their operations. If you allowed each user to perform just a “parallel 2” query, then you would now have 64 concurrent operations taking place on a machine with only four CPUs. If the machine were not overwhelmed before parallel execution, it almost certainly would be now.

In short, parallel execution can also be a terrible idea. In many cases, the application of parallel processing will only lead to increased resource consumption, as parallel execution attempts to use all available resources. In a system where resources must be shared by many concurrent transactions, such as an OLTP system, you would likely observe increased response times due to this. The reason is that, for a parallel plan, Oracle avoids certain execution techniques that it can use efficiently in a serial execution plan and adopts execution paths such as full scans, in the hope that performing many pieces of the larger, bulk operation in parallel will be better than the serial plan. Parallel execution, when applied inappropriately, may be the cause of your performance problem, not the solution for it.
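The four-CPU arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and it follows the text's simplified count of one operation per user per degree of parallelism. It does not model how Oracle actually allocates parallel execution servers.

```python
def concurrent_operations(users: int, degree: int) -> int:
    """Operations running at once if every user runs at the given degree.

    Simplifying assumption (taken from the text's own count): a
    "parallel N" query contributes N concurrent operations.
    """
    return users * degree


def operations_per_cpu(users: int, degree: int, cpus: int) -> float:
    """Average number of concurrent operations competing for each CPU."""
    return concurrent_operations(users, degree) / cpus


if __name__ == "__main__":
    cpus = 4    # the four-CPU machine from the example
    users = 32  # average number of users querying simultaneously
    for degree in (1, 2, 4):
        total = concurrent_operations(users, degree)
        load = operations_per_cpu(users, degree, cpus)
        print(f"parallel {degree}: {total} concurrent operations, "
              f"{load:.0f} per CPU")
```

Even serially (degree 1), the example machine already has eight operations competing for each CPU. Going to "parallel 2" doubles that to sixteen without adding any capacity, which is the contention the text describes.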

