
Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005



CHAPTER 14 ■ PARALLEL EXECUTION

need ample free resources such as CPU, I/O, and memory. If you are lacking in any of these, then parallel query may well push your utilization of that resource over the edge, negatively impacting overall performance and runtime.

In the past, parallel query was considered mandatory for many data warehouses simply because back then (say, in 1995) data warehouses were rare and typically had a very small, focused user base. Today, in 2005, data warehouses are literally everywhere and support user communities as large as those found for many transactional systems. This means that you may well not have sufficient free resources at any given point in time to enable parallel query on these systems. That doesn’t mean parallel execution in general is not useful in this case; it just might be more of a DBA tool, as we’ll see in the section “Parallel DDL,” than a parallel query tool.

Parallel DML

The Oracle documentation limits the scope of the term parallel DML (PDML) to include only INSERT, UPDATE, DELETE, and MERGE (it does not include SELECT as normal DML does). During PDML, Oracle may use many parallel execution servers to perform your INSERT, UPDATE, DELETE, or MERGE instead of a single serial process. On a multi-CPU machine with plenty of I/O bandwidth, the potential increase in speed may be large for mass DML operations.

However, you should not look to PDML as a feature to speed up your OLTP-based applications. As stated previously, parallel operations are designed to fully and totally maximize the utilization of a machine. They are designed so that a single user can completely use all of the disks, CPU, and memory on the machine. In certain data warehouses (with lots of data and few users), this is something you may want to achieve. In an OLTP system (with a lot of users all doing short, fast transactions), you do not want to give a user the ability to fully take over the machine resources.

This sounds contradictory: we use parallel query to scale up, so how could it not be scalable? When applied to an OLTP system, the statement is quite accurate. Parallel query is not something that scales up as the number of concurrent users increases. Parallel query was designed to allow a single session to generate as much work as 100 concurrent sessions would. In our OLTP system, we really do not want a single user to generate the work of 100 users.

PDML is useful in a large data warehousing environment to facilitate bulk updates to massive amounts of data. The PDML operation is executed in much the same way as a distributed query would be executed by Oracle, with each parallel execution server acting like a process in a separate database instance. Each slice of the table is modified by a separate thread with its own independent transaction (and hence its own undo segment, hopefully). After they are all done, the equivalent of a fast two-phase commit (2PC) is performed to commit the separate, independent transactions. Figure 14-2 depicts a parallel update using four parallel execution servers. Each of the parallel execution servers has its own independent transaction, and either all of them commit along with the PDML coordinating session or none of them do.
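One visible consequence of these separate transactions can be sketched as follows. This is a minimal illustration and not the book's own example: it assumes a table named big_table with a hypothetical STATUS column, and it assumes the update really does run in parallel (parallel DML has to be enabled in the session first, and the database must have parallel execution servers available). Under those assumptions, the session that issued the PDML statement cannot read the table again until it ends the transaction.

```sql
-- Assumption: BIG_TABLE exists and has a STATUS column (hypothetical here).
alter session enable parallel dml;

-- Request a parallel update; each parallel execution server works on
-- its own slice of the table in its own transaction.
update /*+ parallel(big_table) */ big_table
   set status = 'done';

-- If the update above was executed in parallel, this query is expected
-- to fail with ORA-12838 (cannot read/modify an object after modifying
-- it in parallel), because the changes are spread over several
-- not-yet-committed transactions.
select count(*) from big_table;

-- Ending the transaction in the coordinating session commits (or, with
-- ROLLBACK, undoes) the work of all the servers together.
commit;

-- After the commit, the table can be queried again as usual.
select count(*) from big_table;
```

If the statement ends up running serially (for example, because no parallel execution servers were free), the first query would simply succeed, so the error is a hint about how the work was done rather than something to rely on.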

We can actually observe the fact that there are separate independent transactions created for the parallel execution servers. We’ll use two sessions again, as before. In the session with SID=162, we explicitly enable parallel DML. PDML differs from parallel query in that regard; unless you explicitly ask for it, you will not get it.

big_table@ORA10GR1> alter session enable parallel dml;<br />

Session altered.
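From here, one way the separate transactions might be observed is sketched below. The statements are assumptions for illustration only: the STATUS column and the value 'done' are hypothetical, the second session is assumed to have SELECT access to the V$ views, and 162 is the coordinator's SID mentioned above.

```sql
-- Session 1 (SID=162), after enabling parallel DML.
-- The PARALLEL hint requests parallel execution of the update;
-- the STATUS column and the value 'done' are hypothetical.
update /*+ parallel(big_table) */ big_table
   set status = 'done';

-- Session 2, while the update above is running or not yet committed:
-- list the transactions belonging to the sessions that V$PX_SESSION
-- associates with query coordinator 162.
select s.sid,
       s.program,
       t.start_time,
       t.used_ublk,
       t.xidusn || '.' || t.xidslot || '.' || t.xidsqn as trans_id
  from v$session s
  join v$transaction t
    on s.taddr = t.addr
 where s.sid in (select px.sid
                   from v$px_session px
                  where px.qcsid = 162)
 order by s.sid;
```

If the update is in fact running in parallel, each session returned should show a different TRANS_ID, that is, its own undo segment number, slot, and sequence, which is what "separate independent transactions" means in practice.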
