Apress.Expert.Oracle.Database.Architecture.9i.and.10g.Programming.Techniques.and.Solutions.Sep.2005
CHAPTER 15 ■ DATA LOADING AND UNLOADING 667 A word of caution to those of you lucky enough to work on both Windows and UNIX: the end-of-line marker is different on these platforms. On UNIX, it is simply \n (CHR(10) in SQL). On Windows NT, is it \r\n (CHR(13)||CHR(10) in SQL). In general, if you use the FIX approach, make sure to create and load the file on a homogenous platform (UNIX and UNIX, or Windows and Windows). Use the VAR Attribute Another method of loading data with embedded newline characters is to use the VAR attribute. When using this format, each record will begin with some fixed number of bytes that represent the total length of the incoming record. Using this format, we can load variable-length records that contain embedded newlines, but only if we have a record length field at the beginning of each and every record. So, if we use a control file such as this: LOAD DATA INFILE demo.dat "var 3" INTO TABLE DEPT REPLACE FIELDS TERMINATED BY ',' TRAILING NULLCOLS (DEPTNO, DNAME "upper(:dname)", LOC "upper(:loc)", COMMENTS ) then the VAR 3 says that the first 3 bytes of each input record will be the length of that input record. If we take a data file such as the following: [tkyte@desktop tkyte]$ cat demo.dat 05510,Sales,Virginia,This is the Sales Office in Virginia 06520,Accounting,Virginia,This is the Accounting Office in Virginia 06530,Consulting,Virginia,This is the Consulting Office in Virginia 05940,Finance,Virginia,This is the Finance Office in Virginia [tkyte@desktop tkyte]$ we can load it using that control file. In our input data file, we have four rows of data. The first row starts with 055, meaning that the next 55 bytes represent the first input record. These 55 bytes include the terminating newline after the word Virginia. The next row starts with 065. It has 65 bytes of text, and so on. Using this format data file, we can easily load our data with embedded newlines. Again, if you are using UNIX and Windows (the preceding example was with UNIX, where a newline is one character long), you would have to adjust the length field for each record. On Windows, the preceding example’s .dat file would have to have 56, 66, 66, and 60 as the length fields.
668 CHAPTER 15 ■ DATA LOADING AND UNLOADING Use the STR Attribute This is perhaps the most flexible method of loading data with embedded newlines. Using the STR attribute, we can specify a new end-of-line character (or sequence of characters). This allows us to create an input data file that has some special character at the end of each line— the newline is no longer “special.” I prefer to use a sequence of characters, typically some special marker, and then a newline. This makes it easy to see the end-of-line character when viewing the input data in a text editor or some utility, as each record still has a newline at the end of it. The STR attribute is specified in hexadecimal, and perhaps the easiest way to get the exact hexadecimal string we need is to use SQL and UTL_RAW to produce the hexadecimal string for us. For example, assuming we are on UNIX where the end-of-line marker is CHR(10) (linefeed) and our special marker character is a pipe symbol (|), we can write this: ops$tkyte@ORA10G> select utl_raw.cast_to_raw( '|'||chr(10) ) from dual; UTL_RAW.CAST_TO_RAW('|'||CHR(10)) ------------------------------------------------------------------------------- 7C0A which shows us that the STR we need to use on UNIX is X'7C0A'. ■Note On Windows, you would use UTL_RAW.CAST_TO_RAW( '|'||chr(13)||chr(10) ). To use this, we might have a control file like this: LOAD DATA INFILE demo.dat "str X'7C0A'" INTO TABLE DEPT REPLACE FIELDS TERMINATED BY ',' TRAILING NULLCOLS (DEPTNO, DNAME "upper(:dname)", LOC "upper(:loc)", COMMENTS ) So, if our input data looks like this: [tkyte@desktop tkyte]$ cat demo.dat 10,Sales,Virginia,This is the Sales Office in Virginia| 20,Accounting,Virginia,This is the Accounting Office in Virginia| 30,Consulting,Virginia,This is the Consulting
- Page 662 and 663: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 664 and 665: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 666 and 667: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 668 and 669: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 670 and 671: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 672 and 673: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 674 and 675: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 676 and 677: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 678 and 679: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 680 and 681: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 682 and 683: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 684 and 685: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 686 and 687: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 688 and 689: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 690 and 691: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 692 and 693: CHAPTER 14 ■ PARALLEL EXECUTION 6
- Page 694 and 695: CHAPTER 15 ■ ■ ■ Data Loading
- Page 696 and 697: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 698 and 699: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 700 and 701: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 702 and 703: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 704 and 705: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 706 and 707: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 708 and 709: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 710 and 711: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 714 and 715: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 716 and 717: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 718 and 719: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 720 and 721: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 722 and 723: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 724 and 725: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 726 and 727: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 728 and 729: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 730 and 731: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 732 and 733: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 734 and 735: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 736 and 737: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 738 and 739: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 740 and 741: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 742 and 743: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 744 and 745: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 746 and 747: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 748: CHAPTER 15 ■ DATA LOADING AND UNL
- Page 751 and 752: 706 ■INDEX autonomous transaction
- Page 753 and 754: 708 ■INDEX CREATE ANY DIRECTORY f
- Page 755 and 756: 710 ■INDEX dedicated server, 57-5
- Page 757 and 758: 712 ■INDEX flat files, 66, 113-14
- Page 759 and 760: 714 ■INDEX job queue coordinator
- Page 761 and 762: 716 ■INDEX MAXTRANS parameter, 21
668<br />
CHAPTER 15 ■ DATA LOADING AND UNLOADING<br />
Use the STR Attribute<br />
This is perhaps the most flexible method of loading data with embedded newlines. Using the<br />
STR attribute, we can specify a new end-of-line character (or sequence of characters). This<br />
allows us to create an input data file that has some special character at the end of each line—<br />
the newline is no longer “special.”<br />
I prefer to use a sequence of characters, typically some special marker, <strong>and</strong> then a newline.<br />
This makes it easy to see the end-of-line character when viewing the input data in a text<br />
editor or some utility, as each record still has a newline at the end of it. The STR attribute is<br />
specified in hexadecimal, <strong>and</strong> perhaps the easiest way to get the exact hexadecimal string we<br />
need is to use SQL <strong>and</strong> UTL_RAW to produce the hexadecimal string for us. For example, assuming<br />
we are on UNIX where the end-of-line marker is CHR(10) (linefeed) <strong>and</strong> our special marker<br />
character is a pipe symbol (|), we can write this:<br />
ops$tkyte@ORA10G> select utl_raw.cast_to_raw( '|'||chr(10) ) from dual;<br />
UTL_RAW.CAST_TO_RAW('|'||CHR(10))<br />
-------------------------------------------------------------------------------<br />
7C0A<br />
which shows us that the STR we need to use on UNIX is X'7C0A'.<br />
■Note On Windows, you would use UTL_RAW.CAST_TO_RAW( '|'||chr(13)||chr(10) ).<br />
To use this, we might have a control file like this:<br />
LOAD DATA<br />
INFILE demo.dat "str X'7C0A'"<br />
INTO TABLE DEPT<br />
REPLACE<br />
FIELDS TERMINATED BY ','<br />
TRAILING NULLCOLS<br />
(DEPTNO,<br />
DNAME "upper(:dname)",<br />
LOC<br />
"upper(:loc)",<br />
COMMENTS<br />
)<br />
So, if our input data looks like this:<br />
[tkyte@desktop tkyte]$ cat demo.dat<br />
10,Sales,Virginia,This is the Sales<br />
Office in Virginia|<br />
20,Accounting,Virginia,This is the Accounting<br />
Office in Virginia|<br />
30,Consulting,Virginia,This is the Consulting