19.06.2013 Views

DB2 UDB for z/OS Version 8 Performance Topics - IBM Redbooks


9.1 Unicode catalog

DB2 for z/OS supports three types of encoding schemes:

► EBCDIC
► ASCII
► UNICODE

Traditionally, IBM mainframes have been based on EBCDIC, while Unix and Windows applications are based on ASCII. Beginning with Windows NT®, everything stored in Windows was stored in Unicode (UTF-16). To handle the larger character sets that Asian languages require, double-byte characters and mixtures of single-byte and double-byte characters were introduced.

Each national language developed its own code pages in EBCDIC or ASCII. The character set for a language is identified by a numeric CCSID (coded character set identifier). An encoding scheme consists of a single-byte character set (SBCS) and, optionally, a double-byte character set (DBCS) along with a mixed character set. For example, the EBCDIC CCSID used by the z/OS operating system itself is 37, but Japanese can use 8482 for SBCS, 16684 for DBCS, and 1390 for mixed.
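As a rough illustration of how the same characters take different byte values under each encoding scheme, the following Python sketch encodes one short string in EBCDIC, ASCII, and two Unicode forms. It assumes Python's `cp037` codec as a stand-in for CCSID 37; the codec names are Python's, not DB2 identifiers, and the sample string and the `encode_all` helper are hypothetical.

```python
# Illustrative only: Python codec names stand in for CCSIDs (an assumption,
# not a DB2 interface). cp037 corresponds to EBCDIC code page 37.
SAMPLE = "DB2"

ENCODINGS = {
    "EBCDIC (cp037)": "cp037",
    "ASCII": "ascii",
    "Unicode UTF-8": "utf-8",
    "Unicode UTF-16 (big-endian)": "utf-16-be",
}


def encode_all(text):
    """Return a mapping of scheme label to the hex bytes of `text`."""
    return {label: text.encode(codec).hex() for label, codec in ENCODINGS.items()}


if __name__ == "__main__":
    for label, hex_bytes in encode_all(SAMPLE).items():
        print(f"{label:28} {hex_bytes}")
```

The EBCDIC and ASCII byte values differ for every character in the sample, while UTF-8 matches ASCII for these characters and UTF-16 uses two bytes per character.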

Terminology: Translation is what we do going from one language to another, while conversion is what we do to character strings. CCSIDs (and code pages) are never converted; they are just definitions. The individual character strings are converted.

9.1.1 Character conversion

9.1.2 Unicode parsing

The variety of CCSIDs made it very difficult for multinational corporations to combine data from different sources. Unicode has arisen to solve the problem. By storing data in a single Unicode CCSID, text data from different languages in different countries can be easily managed. However, making the transition to Unicode can be difficult, and a lot of data conversion is unavoidable. When data is stored into or retrieved from DB2, DB2 converts it where necessary. It is preferable that no conversion be carried out, because conversion impacts performance. However, considerable work has been done to improve conversion performance on zSeries. See 4.7, “Unicode” on page 174.

Character conversion is necessary whenever there is a mismatch between the CCSID of a source and target string, such as between a host variable and its associated column. Such conversion support in DB2 has existed since DB2 began to support client/server connections. In DB2 V2.3, such conversions started out between different EBCDIC CCSIDs, as well as between EBCDIC and ASCII. To perform such character conversion (not involving Unicode), DB2 uses a translate table, which is stored in SYSIBM.SYSSTRINGS. We refer to such conversions as SYSSTRINGS conversions, which have particular performance characteristics.
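The idea of a table-driven conversion between two single-byte CCSIDs can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not DB2's implementation: the table is derived from Python's `cp037` and `latin-1` codecs rather than read from SYSIBM.SYSSTRINGS, the code page pair is an arbitrary example, and the names `build_translate_table` and `convert` are hypothetical.

```python
# Hypothetical sketch of a table-driven single-byte conversion.
# The table is derived from Python's codecs; DB2 reads its tables from
# SYSIBM.SYSSTRINGS, whose layout is not modelled here.


def build_translate_table(source_codec, target_codec):
    """Build a 256-byte table mapping each source byte to its target byte."""
    table = bytearray(256)
    for byte in range(256):
        char = bytes([byte]).decode(source_codec)
        table[byte] = char.encode(target_codec)[0]
    return bytes(table)


# EBCDIC code page 37 to ISO 8859-1 (Latin-1), chosen as an example pair.
EBCDIC_TO_LATIN1 = build_translate_table("cp037", "latin-1")


def convert(data, table=EBCDIC_TO_LATIN1):
    """Convert a byte string with one table lookup per byte."""
    return data.translate(table)


if __name__ == "__main__":
    ebcdic = "HELLO".encode("cp037")
    print(ebcdic.hex(), "->", convert(ebcdic))
```

Because each output byte comes from a single lookup, the cost of this style of conversion grows with string length and the output is always the same length as the input, which is only possible when both code pages are single-byte.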

In DB2 V8, because the catalog has changed to Unicode, data being sent or received by the application must be verified and possibly converted by the DBM1 address space.

Figure 9-1 depicts a legacy COBOL application running in z/OS using CCSID 37. The database has been converted to Unicode. The table contains one CHAR column called COLC (CCSID 1208) and one GRAPHIC column called COLG (CCSID 1200). When the application inserts host variables HV1 and HV2 into the table, the strings are converted by the DBM1 address space into Unicode. The CPU time for these conversions is added to class 2 CPU time.
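The effect of the two conversions in Figure 9-1 on the stored bytes can be approximated with the short Python sketch below. It assumes Python codecs as stand-ins for the CCSIDs (`cp037` for CCSID 37, `utf-8` for CCSID 1208, `utf-16-be` for CCSID 1200), treats both host variables as single-byte EBCDIC strings, and uses invented host variable contents; the function names are hypothetical, and nothing here models the DBM1 address space or its CPU accounting.

```python
# Sketch of the Figure 9-1 conversions using Python codecs as stand-ins:
# cp037 for CCSID 37, utf-8 for CCSID 1208, utf-16-be for CCSID 1200.
# The host variable contents are invented for illustration.


def to_colc(host_variable):
    """EBCDIC bytes -> UTF-8 bytes, as for the CHAR column COLC."""
    return host_variable.decode("cp037").encode("utf-8")


def to_colg(host_variable):
    """EBCDIC bytes -> UTF-16 bytes, as for the GRAPHIC column COLG."""
    return host_variable.decode("cp037").encode("utf-16-be")


if __name__ == "__main__":
    hv1 = "Caf\u00e9".encode("cp037")  # 4 bytes in EBCDIC
    hv2 = "Caf\u00e9".encode("cp037")
    print("HV1", len(hv1), "bytes -> COLC", len(to_colc(hv1)), "bytes")
    print("HV2", len(hv2), "bytes -> COLG", len(to_colg(hv2)), "bytes")
```

Unlike a single-byte table lookup, these conversions can change the length of the string: the accented character occupies one byte in EBCDIC but two bytes in UTF-8, and every character occupies two bytes in UTF-16.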
