Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005

rekharaghuram
05.11.2015

CHAPTER 13 ■ PARTITIONING 613

Audit trail information is the one piece of data in your database that you might well insert but never retrieve during the normal course of operation. It is there predominantly as a forensic, after-the-fact trail of evidence. We need to have it, but from many perspectives, it is just something that sits on our disks and consumes space, lots and lots of space. And then every month or year or some other time interval, we have to purge or archive it. Auditing is something that, if not properly designed from the beginning, can kill you at the end. Seven years from now, when you are faced with your first purge or archive of the old data, is not when you want to be thinking about how to accomplish it. Unless you designed for it, getting that old information out is going to be painful.

Enter two technologies that make auditing not only bearable, but also pretty easy to manage, and that make it consume less space. These technologies are partitioning and segment space compression, as we discussed in Chapter 10. That second one might not be as obvious, since segment space compression only works with large bulk operations like a direct path load, and audit trails are typically inserted into a row at a time, as events happen. The trick is to combine sliding window partitions with segment space compression.

Suppose we decide to partition the audit trail by month. During the first month of business, we just insert into the partitioned table; these inserts go in using the “conventional path,” not a direct path, and hence are not compressed. Now, before the month ends, we’ll add a new partition to the table to accommodate next month’s auditing activity. Shortly after the beginning of next month, we will perform a large bulk operation on last month’s audit trail: specifically, we’ll use the ALTER TABLE command to move last month’s partition, which will have the effect of compressing the data as well.
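The monthly cycle just described might look roughly like the following. This is only a sketch under stated assumptions: the table AUDIT_TRAIL, its columns, the partition names, and the tablespace AUDIT_CURRENT are all hypothetical names chosen for illustration, and the dates are arbitrary.

```sql
-- Hypothetical audit table, range partitioned by month on its timestamp.
-- All names here (audit_trail, p2005_09, audit_current, ...) are assumptions.
CREATE TABLE audit_trail
( ts        DATE NOT NULL,
  username  VARCHAR2(30),
  action    VARCHAR2(200)
)
PARTITION BY RANGE (ts)
( PARTITION p2005_09 VALUES LESS THAN (TO_DATE('2005-10-01', 'YYYY-MM-DD'))
    TABLESPACE audit_current,
  PARTITION p2005_10 VALUES LESS THAN (TO_DATE('2005-11-01', 'YYYY-MM-DD'))
    TABLESPACE audit_current
);

-- During the month: ordinary single-row, conventional path inserts.
-- These rows are stored uncompressed.
INSERT INTO audit_trail (ts, username, action)
VALUES (SYSDATE, USER, 'LOGIN');

-- Before the month ends: add the partition for next month's activity.
ALTER TABLE audit_trail
  ADD PARTITION p2005_11
  VALUES LESS THAN (TO_DATE('2005-12-01', 'YYYY-MM-DD'))
  TABLESPACE audit_current;

-- Shortly after the new month begins: rebuild last month's partition
-- as a bulk operation, compressing its rows in the process.
ALTER TABLE audit_trail MOVE PARTITION p2005_09 COMPRESS;
```

Note that moving a partition relocates its rows, so any index partitions that point into it will generally need to be rebuilt (or maintained as part of the move) before they can be used again.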
If we, in fact, take this a step further, we could move this partition from a read-write tablespace, which it must have been in, into a tablespace that is normally read-only (and contains other partitions for this audit trail). In that fashion, we can back up that tablespace once a month, after we move the partition in there; ensure we have a good, clean, current readable copy of the tablespace; and then not back it up anymore that month. We might have these tablespaces for our audit trail:

• A current online, read-write tablespace that gets backed up like every other normal tablespace in our system. The audit trail information in this tablespace is not compressed, and it is constantly inserted into.

• A read-only tablespace containing “this year to date” audit trail partitions in a compressed format. At the beginning of each month, we make this tablespace read-write, move and compress last month’s audit information into this tablespace, make it read-only again, and back it up.

• A series of tablespaces for last year, the year before, and so on. These are all read-only and might even be on slow, cheap media. In the event of a media failure, we just need to restore from backup. We would occasionally pick a year at random from our backup sets to ensure they are still restorable (tapes go bad sometimes).

In this fashion, we have made purging easy (i.e., drop a partition). We have made archiving easy, too: you could just transport a tablespace off and restore it later. We have reduced our space utilization by implementing compression. We have reduced our backup volumes, since in many systems the single largest set of data is audit trail data. If you can remove some or all of that from your day-to-day backups, the difference will be measurable.

In short, audit trail requirements and partitioning are two things that go hand in hand, regardless of the underlying system type, be it data warehouse or OLTP.
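As a rough illustration of the tablespace rotation and the purge described above, the beginning-of-month steps could be scripted along these lines. The names AUDIT_TRAIL, AUDIT_YTD, and the partition names are assumptions made up for this sketch, and the backup itself is left as a comment because it depends on the tools in use at your site.

```sql
-- Beginning of the month: open the "this year to date" tablespace.
-- audit_ytd is a hypothetical tablespace name.
ALTER TABLESPACE audit_ytd READ WRITE;

-- Move last month's partition out of the read-write tablespace and
-- compress it in the same bulk operation.
ALTER TABLE audit_trail
  MOVE PARTITION p2005_09
  TABLESPACE audit_ytd
  COMPRESS;

-- Close the tablespace again; its contents will not change until
-- the next monthly move.
ALTER TABLESPACE audit_ytd READ ONLY;

-- (Back up audit_ytd here, once, with whatever backup tool you use.)

-- Purging: when a month ages out of the retention period, drop its
-- partition rather than deleting rows. p1998_09 is a hypothetical name.
ALTER TABLE audit_trail DROP PARTITION p1998_09;
```

The appeal of this arrangement is that each step is a single DDL statement against one partition or one tablespace, so the monthly job stays short no matter how large the audit trail grows. Archiving by transporting a tablespace involves export utilities outside of SQL and is not shown here.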

Summary

Partitioning is extremely useful in scaling up large objects in the database. This scaling is visible from the perspective of performance scaling, availability scaling, and administrative scaling. All three are extremely important to different people. The DBA is concerned with administrative scaling. The owners of the system are concerned with availability, because downtime is lost money, and anything that reduces downtime, or reduces the impact of downtime, boosts the payback for a system. The end users of the system are concerned with performance scaling. No one likes to use a slow system, after all.

We also looked at the fact that in an OLTP system, partitions may not increase performance, especially if applied improperly. Partitions can increase the performance of certain classes of queries, but those queries are generally not applied in an OLTP system. This point is important to understand, as many people associate partitioning with “free performance increase.” This does not mean that partitions should not be used in OLTP systems (they do provide many other salient benefits in this environment); just don’t expect a massive increase in throughput. Expect reduced downtime. Expect the same good performance (partitioning will not slow you down when applied appropriately). Expect easier manageability, which may lead to increased performance, because the DBAs can perform some maintenance operations more frequently than they otherwise could.

We investigated the various table-partitioning schemes offered by Oracle (range, hash, list, and composite) and talked about when they are most appropriately used. We spent the bulk of our time looking at partitioned indexes and examining the differences between prefixed and nonprefixed and local and global indexes. We investigated partition operations in data warehouses combined with global indexes, and the tradeoff between resource consumption and availability.
Over time, I see this feature becoming more relevant to a broader audience as the size and scale of database applications grow. The Internet and its database-hungry nature along with legislation requiring longer retention of audit data are leading to more and more extremely large collections of data, and partitioning is a natural tool to help manage that problem.

