Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions (Apress, Sep. 2005)


CHAPTER 12 ■ DATATYPES 549

Note the increased I/O usage, on both reads and writes. All in all, this shows that if you use a CLOB, and many of the strings are expected to fit “in the row” (i.e., will be less than 4,000 bytes), then using the default of ENABLE STORAGE IN ROW is a good idea.

CHUNK Clause

The CREATE TABLE statement returned from DBMS_METADATA previously included the following:

    LOB ("TXT") STORE AS ( ... CHUNK 8192 ... )

LOBs are stored in chunks; the index that points to the LOB data points to individual chunks of data. Chunks are logically contiguous sets of blocks and are the smallest unit of allocation for LOBs, whereas normally a block is the smallest unit of allocation. The CHUNK size must be an integer multiple of your Oracle blocksize; no other values are valid.

You must take care to choose a CHUNK size from two perspectives. First, each LOB instance (each LOB value stored out of line) will consume at least one CHUNK. A single CHUNK is used by a single LOB value. If a table has 100 rows and each row has a LOB with 7KB of data in it, you can be sure that there will be 100 chunks allocated. If you set the CHUNK size to 32KB, you will have 100 32KB chunks allocated. If you set the CHUNK size to 8KB, you will have (probably) 100 8KB chunks allocated. The point is, a chunk is used by only one LOB entry (two LOBs will not use the same CHUNK). If you pick a CHUNK size that does not match your expected LOB sizes, you could end up wasting an excessive amount of space. For example, if you have that table with 7KB LOBs on average, and you use a CHUNK size of 32KB, you will be “wasting” approximately 25KB of space per LOB instance. On the other hand, if you use an 8KB CHUNK, you will minimize any sort of waste.

Second, you also need to be careful when you want to minimize the number of CHUNKs you have per LOB instance. As you have seen, there is a lobindex used to point to the individual chunks, and the more chunks you have, the larger this index is.
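Both sizing concerns come down to simple arithmetic. The following query is a minimal sketch and is not part of the original listing. The LOB sizes and candidate CHUNK sizes are illustrative values, the inline view and column names are hypothetical, and the calculation assumes each LOB occupies whole chunks with no additional per-chunk or per-block overhead:

```sql
-- Illustrative sizing check (all sizes in bytes).
-- For each LOB size / CHUNK size pair, compute how many chunks one LOB
-- needs and how many allocated bytes would be left unused.
select lob_bytes,
       chunk_bytes,
       ceil(lob_bytes / chunk_bytes)                           as chunks_needed,
       ceil(lob_bytes / chunk_bytes) * chunk_bytes - lob_bytes as unused_bytes
  from ( select 7 * 1024 as lob_bytes from dual          -- a 7KB LOB
         union all
         select 4 * 1024 * 1024 from dual ) lob_sizes,   -- a 4MB LOB
       ( select 8192 as chunk_bytes from dual            -- 8KB CHUNK
         union all
         select 32768 from dual ) chunk_sizes            -- 32KB CHUNK
 order by lob_bytes, chunk_bytes;
```

Under these assumptions, the 7KB LOB needs one chunk either way, leaving 1KB unused with an 8KB CHUNK and 25KB unused with a 32KB CHUNK, while the 4MB LOB needs 512 chunks at 8KB but only 128 at 32KB.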
If you have a 4MB LOB and use an 8KB CHUNK, you will need at least 512 CHUNKs to store that information. That means you need at least as many lobindex entries to point to these chunks. That might not sound like a lot, but you have to remember that this is per LOB instance, so if you have thousands of 4MB LOBs, you now have many thousands of entries. This will also affect your retrieval performance, as it takes longer to read and manage many small chunks than it does to read fewer, but larger, chunks. The ultimate goal is to use a CHUNK size that minimizes your “waste,” but also efficiently stores your data.

PCTVERSION Clause

The CREATE TABLE statement returned from DBMS_METADATA previously included the following:

    LOB ("TXT") STORE AS ( ... PCTVERSION 10 ... )

This is used to control the read consistency of the LOB. In previous chapters, we’ve discussed read consistency, multi-versioning, and the role that undo plays in them. When it comes to LOBs, the way read consistency is implemented changes. The lobsegment does not use undo to record its changes; rather, it versions the information directly in the lobsegment itself. The lobindex generates undo just as any other segment would, but the lobsegment does not. Instead, when you modify a LOB, Oracle allocates a new CHUNK and leaves the old CHUNK in place. If you roll back your transaction, the changes to the LOB index are rolled back and the index will point to the old CHUNK again. So the undo maintenance is performed right in the LOB segment itself. As you modify the data, the old data is left in place and new data is created.

This is also relevant when reading the LOB data. LOBs are read consistent, just as all other segments are. If you retrieve a LOB locator at 9:00 am, the LOB data you retrieve from it will be “as of 9:00 am.” Similarly, if you open a cursor (a resultset) at 9:00 am, the rows it produces will be as of that point in time. Even if someone else comes along and modifies the LOB data and commits (or not), your LOB locator will be “as of 9:00 am,” just like your resultset would be. Here, Oracle uses the lobsegment along with the read-consistent view of the lobindex to undo the changes to the LOB, to present you with the LOB data as it existed when you retrieved the LOB locator. It does not use the undo information for the lobsegment, since none was generated for the lobsegment itself.

We can easily demonstrate that LOBs are read consistent. Consider this small table with an out-of-line LOB (it is stored in the lobsegment):

    ops$tkyte@ORA10G> create table t
      ( id int primary key,
        txt clob
      )
      lob( txt) store as ( disable storage in row )
    /
    Table created.

    ops$tkyte@ORA10G> insert into t values ( 1, 'hello world' );
    1 row created.

    ops$tkyte@ORA10G> commit;
    Commit complete.

If we fetch out the LOB locator and open a cursor on this table as follows:

    ops$tkyte@ORA10G> declare
        l_clob clob;

        cursor c is select id from t;
        l_id number;
    begin
        select txt into l_clob from t;
        open c;

and then we modify that row and commit:

        update t set id = 2, txt = 'Goodbye';
        commit;

we’ll see, upon working with the LOB locator and opened cursor, that the data is presented “as of the point in time we retrieved or opened them”:
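A minimal, self-contained sketch of the whole experiment might look like the following. It is an illustration and not the original listing: the final fetch and the DBMS_OUTPUT calls are assumptions about how to inspect the result. It also assumes the table t created above holds only the single committed row (1, 'hello world') and that SET SERVEROUTPUT ON has been issued in SQL*Plus:

```sql
-- Sketch only: assumes table T from the text with one committed row
-- (1, 'hello world') and SET SERVEROUTPUT ON in the SQL*Plus session.
declare
    l_clob  clob;
    cursor c is select id from t;
    l_id    number;
begin
    -- Capture the LOB locator and open the cursor BEFORE the change.
    select txt into l_clob from t;
    open c;

    -- Modify the row and commit the change.
    update t set id = 2, txt = 'Goodbye';
    commit;

    -- Work with the cursor and the locator obtained earlier.
    fetch c into l_id;
    dbms_output.put_line( 'id  = ' || l_id );
    dbms_output.put_line( 'txt = ' || l_clob );
    close c;
end;
/
```

If read consistency works as described above, both values printed should be the ones that existed when the locator was retrieved and the cursor was opened, not the updated ones.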

