Expert Oracle Database Architecture: 9i and 10g Programming Techniques and Solutions (Apress, September 2005)


CHAPTER 10 ■ DATABASE TABLES 371

different PCTFREEs would not make sense. Therefore, a CREATE CLUSTER looks a lot like a CREATE TABLE with a small number of columns (just the cluster key columns):

ops$tkyte@ORA10GR1> create cluster emp_dept_cluster
  ( deptno number(2) )
  size 1024
/

Cluster created.

Here, we have created an index cluster (the other type being a hash cluster, which we’ll look at in the next section). The clustering column for this cluster will be the DEPTNO column. The columns in the tables do not have to be called DEPTNO, but they must be NUMBER(2), to match this definition. We have, on the cluster definition, a SIZE 1024 option. This is used to tell Oracle that we expect about 1,024 bytes of data to be associated with each cluster key value. Oracle will use that to compute the maximum number of cluster keys that could fit per block. Given that we have an 8KB blocksize, Oracle will fit up to seven cluster keys per database block (8,192 bytes would hold eight 1,024-byte keys, but part of every block is taken up by block overhead), and maybe fewer if the data is larger than expected. That is, the data for departments 10, 20, 30, 40, 50, 60, and 70 would tend to go onto one block, and as soon as we insert department 80, a new block will be used. That does not mean that the data is stored in a sorted manner; it just means that if we inserted the departments in that order, they would naturally tend to be put together. If we inserted the departments in the order 10, 80, 20, 30, 40, 50, 60, and then 70, the final department, 70, would tend to be on the newly added block. As we’ll see shortly, both the size of the data and the order in which the data is inserted will affect the number of keys we can store per block.

The SIZE parameter therefore controls the maximum number of cluster keys per block. It is the single largest influence on the space utilization of our cluster. Set the size too high, and we’ll get very few keys per block and we’ll use more space than we need. Set the size too low, and we’ll get excessive chaining of data, which defeats the purpose of the cluster: to store all of the data for a key together on a single block. It is the most important parameter for a cluster.

Now for the cluster index on our cluster. We need to index the cluster before we can put data in it. We could create tables in the cluster right now, but we’re going to create and populate the tables simultaneously, and we need a cluster index before we can have any data. The cluster index’s job is to take a cluster key value and return the block address of the block that contains that key. It is a primary key in effect, where each cluster key value points to a single block in the cluster itself. So, when we ask for the data in department 10, Oracle will read the cluster key, determine the block address for that key, and then read the data. The cluster key index is created as follows:

ops$tkyte@ORA10GR1> create index emp_dept_cluster_idx
  on cluster emp_dept_cluster
/

Index created.

It can have all of the normal storage parameters of an index and can be stored in another tablespace. It is just a regular index, so it can be on multiple columns; it just happens to index into a cluster and can also include an entry for a completely null value (see Chapter 11 for the reason why this is interesting to note). Note that we do not specify a list of columns in this CREATE INDEX statement; that list is derived from the CLUSTER definition itself. Now we are ready to create our tables in the cluster:
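Before the tables go in, the data dictionary can be used to confirm what has been built so far. The following is only a sketch, assuming the cluster and its index live in the current schema under the names used above, and that the standard USER_CLUSTERS and USER_INDEXES views are queried from that schema. KEY_SIZE is expected to reflect the SIZE 1024 setting, and the cluster index should be recorded against the cluster rather than against any table:

```sql
-- Cluster attributes: CLUSTER_TYPE distinguishes index clusters from hash
-- clusters, and KEY_SIZE is expected to show the SIZE value given at creation.
select cluster_name, cluster_type, key_size
  from user_clusters
 where cluster_name = 'EMP_DEPT_CLUSTER';

-- Cluster index: the indexed object is the cluster itself, so TABLE_NAME
-- holds the cluster name here (assumes no other object shares that name).
select index_name, index_type, table_name, table_type
  from user_indexes
 where table_name = 'EMP_DEPT_CLUSTER';
```

With that checked, the two CREATE TABLE statements follow.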

ops$tkyte@ORA10GR1> create table dept
  ( deptno number(2) primary key,
    dname  varchar2(14),
    loc    varchar2(13)
  )
  cluster emp_dept_cluster(deptno)
/

Table created.

ops$tkyte@ORA10GR1> create table emp
  ( empno    number primary key,
    ename    varchar2(10),
    job      varchar2(9),
    mgr      number,
    hiredate date,
    sal      number,
    comm     number,
    deptno   number(2) references dept(deptno)
  )
  cluster emp_dept_cluster(deptno)
/

Table created.

Here, the only difference from a “normal” table is that we used the CLUSTER keyword and told Oracle which column of the base table will map to the cluster key in the cluster itself. Remember, the cluster is the segment here, therefore this table will never have segment attributes such as TABLESPACE, PCTFREE, and so on; they are attributes of the cluster segment, not the table we just created. We can now load them up with the initial set of data:

ops$tkyte@ORA10GR1> begin
      for x in ( select * from scott.dept )
      loop
          insert into dept
          values ( x.deptno, x.dname, x.loc );
          insert into emp
          select *
            from scott.emp
           where deptno = x.deptno;
      end loop;
  end;
/

PL/SQL procedure successfully completed.

You might be wondering, “Why didn’t we just insert all of the DEPT data and then all of the EMP data, or vice versa? Why did we load the data DEPTNO by DEPTNO like that?” The reason is in the design of the cluster. We are simulating a large, initial bulk load of a cluster. If we had loaded all of the DEPT rows first, we definitely would have gotten our seven keys per block (based on the SIZE 1024 setting we made), since the DEPT rows are very small (just a couple of

