Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005
CHAPTER 13 ■ PARTITIONING

Audit trail information is the one piece of data in your database that you might well insert but never retrieve during the normal course of operation. It is there predominantly as a forensic, after-the-fact trail of evidence. We need to have it, but from many perspectives, it is just something that sits on our disks and consumes space, lots and lots of space. And then every month or year or some other time interval, we have to purge or archive it. Auditing is something that, if not properly designed from the beginning, can kill you at the end. Seven years from now, when you are faced with your first purge or archive of the old data, is not the time to start thinking about how to accomplish it. Unless you designed for it, getting that old information out is going to be painful.

Enter two technologies that make auditing not only bearable but also pretty easy to manage, and that let the audit trail consume less space. These technologies are partitioning and segment space compression, as we discussed in Chapter 10. That second one might not be as obvious, since segment space compression only works with large bulk operations like a direct path load, and audit trails are typically inserted into a row at a time, as events happen. The trick is to combine sliding window partitions with segment space compression.

Suppose we decide to partition the audit trail by month. During the first month of business, we just insert into the partitioned table; these inserts go in using the “conventional path,” not a direct path, and hence are not compressed. Now, before the month ends, we’ll add a new partition to the table to accommodate next month’s auditing activity. Shortly after the beginning of next month, we will perform a large bulk operation on last month’s audit trail: specifically, we’ll use the ALTER TABLE command to move last month’s partition, which will have the effect of compressing the data as well.
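
The monthly cycle just described might look roughly like the following. This is only a sketch: the table name AUDIT_TRAIL, its columns, the partition names, and the tablespace AUDIT_CURRENT are all made up for illustration and do not come from the text.

```sql
-- Hypothetical names throughout: audit_trail, audit_current, p_2005_01, p_2005_02.

-- An audit table range partitioned by month on the event timestamp.
-- Rows arrive via conventional-path inserts, so they are stored uncompressed.
CREATE TABLE audit_trail
( ts        DATE          NOT NULL,
  username  VARCHAR2(30),
  action    VARCHAR2(200)
)
PARTITION BY RANGE (ts)
( PARTITION p_2005_01
    VALUES LESS THAN (TO_DATE('01-02-2005', 'dd-mm-yyyy'))
    TABLESPACE audit_current
);

-- Before January ends: add the partition that will receive February's rows.
ALTER TABLE audit_trail
  ADD PARTITION p_2005_02
    VALUES LESS THAN (TO_DATE('01-03-2005', 'dd-mm-yyyy'))
    TABLESPACE audit_current;

-- Shortly after February begins: rebuild January's partition.
-- The move is a bulk operation, so the rows are rewritten in compressed form.
ALTER TABLE audit_trail
  MOVE PARTITION p_2005_01 COMPRESS;

-- Check which partitions are now flagged as compressed.
SELECT partition_name, compression
  FROM user_tab_partitions
 WHERE table_name = 'AUDIT_TRAIL';
```

If the table had indexes, moving a partition would leave the affected index partitions unusable until they were rebuilt, so a real procedure would need a rebuild step after the move.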
Taking this a step further, we could move this partition from the read-write tablespace it must have been in into a tablespace that is normally read-only (and that contains other partitions for this audit trail). In that fashion, we can back up that tablespace once a month, after we move the partition in there; ensure we have a good, clean, current, readable copy of the tablespace; and then not back it up anymore that month. We might have these tablespaces for our audit trail:

• A current online, read-write tablespace that gets backed up like every other normal tablespace in our system. The audit trail information in this tablespace is not compressed, and it is constantly inserted into.

• A read-only tablespace containing “this year to date” audit trail partitions in a compressed format. At the beginning of each month, we make this tablespace read-write, move and compress last month’s audit information into this tablespace, make it read-only again, and back it up.

• A series of tablespaces for last year, the year before, and so on. These are all read-only and might even be on slow, cheap media. In the event of a media failure, we just need to restore from backup. We would occasionally pick a year at random from our backup sets to ensure they are still restorable (tapes go bad sometimes).

In this fashion, we have made purging easy (i.e., drop a partition). We have made archiving easy, too: you could just transport a tablespace off and restore it later. We have reduced our space utilization by implementing compression. We have reduced our backup volumes, because in many systems the single largest set of data is audit trail data. If you can remove some or all of that from your day-to-day backups, the difference will be measurable.

In short, audit trail requirements and partitioning are two things that go hand in hand, regardless of the underlying system type, be it data warehouse or OLTP.
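
As a rough sketch of the tablespace rotation and the purge described above, the start-of-month steps might be scripted along these lines. The names AUDIT_TRAIL, AUDIT_YTD, P_2005_01, and P_1998_01 are hypothetical, and the example assumes those partitions and that tablespace already exist.

```sql
-- Hypothetical names: audit_trail, audit_ytd, p_2005_01, p_1998_01.

-- Open the normally read-only "year to date" tablespace for this month's move.
ALTER TABLESPACE audit_ytd READ WRITE;

-- Move last month's partition out of the read-write tablespace,
-- compressing it as it is rewritten into audit_ytd.
ALTER TABLE audit_trail
  MOVE PARTITION p_2005_01
  TABLESPACE audit_ytd
  COMPRESS;

-- Close the tablespace again. It can be backed up once at this point
-- and then left out of backups until next month's move.
ALTER TABLESPACE audit_ytd READ ONLY;

-- Purging is a single DDL statement against the oldest partition.
ALTER TABLE audit_trail
  DROP PARTITION p_1998_01;
```

Archiving by transporting a tablespace involves additional export and file-copy steps that are not shown here.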
Summary

Partitioning is extremely useful for scaling up large objects in the database. This scaling is visible from the perspective of performance scaling, availability scaling, and administrative scaling. All three are extremely important to different people. The DBA is concerned with administrative scaling. The owners of the system are concerned with availability, because downtime is lost money, and anything that reduces downtime (or reduces the impact of downtime) boosts the payback for a system. The end users of the system are concerned with performance scaling. No one likes to use a slow system, after all.

We also looked at the fact that in an OLTP system, partitions may not increase performance, especially if applied improperly. Partitions can increase the performance of certain classes of queries, but those queries are generally not run in an OLTP system. This point is important to understand, as many people associate partitioning with a “free performance increase.” This does not mean that partitions should not be used in OLTP systems, since they do provide many other salient benefits in this environment; just don’t expect a massive increase in throughput. Expect reduced downtime. Expect the same good performance (partitioning will not slow you down when applied appropriately). Expect easier manageability, which may lead to increased performance because the DBAs can perform some maintenance operations more frequently.

We investigated the various table-partitioning schemes offered by Oracle (range, hash, list, and composite) and talked about when they are most appropriately used. We spent the bulk of our time looking at partitioned indexes and examining the differences between prefixed and nonprefixed and local and global indexes. We investigated partition operations in data warehouses combined with global indexes, and the tradeoff between resource consumption and availability.

Over time, I see this feature becoming more relevant to a broader audience as the size and scale of database applications grow. The Internet and its database-hungry nature along with legislation requiring longer retention of audit data are leading to more and more extremely large collections of data, and partitioning is a natural tool to help manage that problem.