
Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005


CHAPTER 7 ■ CONCURRENCY AND MULTI-VERSIONING

3. They then pull all of the rows from the transactional system (a full SELECT * FROM TABLE) to get the data warehouse initially populated.

4. To refresh the data warehouse, they again remember what time it is right now. For example, suppose an hour has gone by and it is now 10:00 am on the source system. They will remember that fact. They then pull all changed records since 9:00 am (the moment before they started the first pull) and merge them in.

■Note This technique may “pull” the same record twice in two consecutive refreshes. This is unavoidable due to the granularity of the clock. A MERGE operation will not be affected by this (i.e., it will update the existing record in the data warehouse or insert a new record).
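The refresh step and the note above can be sketched as a single MERGE. This is only an illustration under assumptions: the table names (`dw_t` for the warehouse copy, `t@src` for the source table reached over a database link), the columns (`id`, `data`, `last_updated`), and the bind variable `:last_pull_time` are hypothetical and do not come from the text.

```sql
-- Hypothetical names throughout: dw_t, t@src, id, data, last_updated,
-- and the bind variable :last_pull_time (the time remembered before
-- the previous pull).
merge into dw_t w
using ( select id, data, last_updated
          from t@src
         where last_updated >= :last_pull_time ) s
   on ( w.id = s.id )
 when matched then
      -- Row already in the warehouse: overwrite it with the current values.
      update set w.data         = s.data,
                 w.last_updated = s.last_updated
 when not matched then
      -- Row not yet in the warehouse: add it.
      insert ( id, data, last_updated )
      values ( s.id, s.data, s.last_updated );
```

Because a matched row is simply updated to the same values again, pulling a record in two consecutive refreshes leaves the warehouse unchanged, which is why the clock-granularity overlap is harmless.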

They believe that they now have all of the records in the data warehouse that were modified since they did the initial pull. They may actually have all of the records, but just as likely they may not. This technique does work on some other databases, ones that employ a locking system whereby reads are blocked by writes and vice versa. But in a system where you have non-blocking reads, the logic is flawed.

To see the flaw in this example, all we need to do is assume that at 9:00 am there was at least one open, uncommitted transaction. At 8:59:30 am, it had updated a row in the table we were to copy. At 9:00 am, when we started pulling the data, reading the data in this table, we would not see the modifications to that row; we would see the last committed version of it. If it was locked when we got to it in our query, we would read around the lock. If it was committed by the time we got to it, we would still read around it since read consistency permits us to read only data that was committed in the database when our statement began. We would not read that new version of the row during the 9:00 am initial pull, but nor would we read the modified row during the 10:00 am refresh. The reason? The 10:00 am refresh would only pull records modified since 9:00 am that morning, but this record was modified at 8:59:30 am. We would never pull this changed record.
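The timeline just described can be laid out as a two-session script. It is a sketch rather than a runnable test: the table `t`, its columns `id`, `data`, and `last_updated`, the key value 42, and the bind variable `:nine_am` are all hypothetical, and it assumes the application stamps `last_updated` with SYSDATE at the moment of the update.

```sql
-- Session 1, 8:59:30 am: modify a row but do not commit yet.
update t
   set data = 'new value',
       last_updated = sysdate      -- stamped 8:59:30 am
 where id = 42;

-- Session 2, 9:00:00 am: remember "now" (9:00 am), then do the initial pull.
-- Read consistency returns the last committed version of row 42,
-- so the uncommitted change is not copied.
select * from t;

-- Session 1, shortly after 9:00 am: the change becomes permanent.
commit;

-- Session 2, 10:00 am: refresh with everything changed since 9:00 am.
-- Row 42 carries last_updated = 8:59:30 am, so it is filtered out.
select *
  from t
 where last_updated >= :nine_am;
```

Neither query in session 2 ever returns the new version of row 42, so the warehouse keeps the stale copy indefinitely.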

In many other databases where reads are blocked by writes and a committed but inconsistent read is implemented, this refresh process would work perfectly. If at 9:00 am, when we did the initial pull of data, we hit that row and it was locked, we would have blocked and waited for it, and read the committed version. If it were not locked, we would just read whatever was there, committed.

So, does this mean the preceding logic just cannot be used? No, it means that we need to get the “right now” time a little differently. We need to query V$TRANSACTION and find out which is the earliest of the current time and the time recorded in the START_TIME column of this view. We will need to pull all records changed since the start time of the oldest transaction (or the current SYSDATE value if there are no active transactions):

select nvl( min(to_date(start_time,'mm/dd/rr hh24:mi:ss')), sysdate )
  from v$transaction;

In this example, that would be 8:59:30 am, when the transaction that modified the row started. When we go to refresh the data at 10:00 am, we pull all of the changes that had occurred since that time, and when we merge these into the data warehouse, we’ll have
