Nonstop Database, HiRDB Version 9 SQL Reference

1.3 Character sets

Organization of this section
(1) Description
(2) Format
(3) Rules
(4) Notes

(1) Description

A character set defines the properties of character data, based on the following three attributes: usage format, character repertoire, and default collation sequence.

(2) Format

character-set-specification ::= [MASTER.]character-set-name

character-set-name ::= {EBCDIK|UTF16}
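
The format above can be checked mechanically. The following Python sketch is illustrative only and is not part of HiRDB: the function name parse_character_set_specification is hypothetical, and case-insensitive matching is an assumption that this section does not state.

```python
import re

# Names permitted by the format above.
CHARACTER_SET_NAMES = {"EBCDIK", "UTF16"}

# Optional "MASTER." qualifier followed by a single name token.
_SPEC = re.compile(r"(?:(MASTER)\.)?([A-Z0-9_]+)")


def parse_character_set_specification(text):
    """Return (qualifier, name) for a valid specification, else raise ValueError.

    Assumption: input is matched case-insensitively and normalized to
    upper case; the section itself does not say how case is handled.
    """
    match = _SPEC.fullmatch(text.strip().upper())
    if match is None or match.group(2) not in CHARACTER_SET_NAMES:
        raise ValueError(f"not a character-set-specification: {text!r}")
    return match.group(1), match.group(2)
```

For example, "MASTER.UTF16" yields the qualifier MASTER and the name UTF16, while a string such as "SJIS" is rejected because it is not one of the two names in the format.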

(3) Rules

  1. The following table lists the character sets that are available in HiRDB.

    Table 1-9 Character sets available in HiRDB

    Character set name | Usage format | Character repertoire | Default collation sequence
    EBCDIK | EBCDIK code. Characters are represented by 8-bit (single-byte) character codes. | All EBCDIK-encoded characters | Code ordering based on bit combinations
    UTF16 | Characters are represented in the character encoding format defined by JIS X 0221 (ISO/IEC 10646), in which each character is encoded as two or four bytes. Byte order is big-endian. | All Unicode characters | Code ordering based on bit combinations
  2. EBCDIK can be specified as the character set only if sjis is specified as the character code classification in the pdntenv command (pdsetup command in the UNIX edition).
  3. UTF16 can be specified as the character set only if utf-8 is specified as the character code classification in the pdntenv command (pdsetup command in the UNIX edition).
  4. A character set can be specified in any place where a character data type can be specified. A character set cannot be specified for the mixed character data type or national character data type.
  5. If no character set is specified, the default character set is assumed. The default character set is determined by the character code classification specified in the pdntenv command (pdsetup command in the UNIX edition), as listed in the following table.

    Table 1-10 Default character sets for character codes specified in the pdntenv (pdsetup) command

    Character code specified in command | Default character set
    sjis | Shift JIS kanji code
    chinese | EUC Chinese kanji code
    ujis | EUC Japanese kanji code
    utf-8 | Unicode (UTF-8)
    lang-c | Single-byte character code
    chinese-gb18030 | Chinese kanji code (GB18030)
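
Rules 2, 3, and 5 together determine which character set applies to a character data item. The following Python sketch models that decision as a lookup. It is an illustration under stated assumptions, not HiRDB behavior: the function effective_character_set and both dictionaries are hypothetical names, and the values are copied from Table 1-10 and from rules 2 and 3.

```python
# Table 1-10: character code classification -> default character set.
DEFAULT_CHARACTER_SET = {
    "sjis": "Shift JIS kanji code",
    "chinese": "EUC Chinese kanji code",
    "ujis": "EUC Japanese kanji code",
    "utf-8": "Unicode (UTF-8)",
    "lang-c": "Single-byte character code",
    "chinese-gb18030": "Chinese kanji code (GB18030)",
}

# Rules 2 and 3: classification required before each name can be specified.
REQUIRED_CLASSIFICATION = {"EBCDIK": "sjis", "UTF16": "utf-8"}


def effective_character_set(classification, character_set_name=None):
    """Return the character set that applies, or raise ValueError.

    classification is the character code classification given to
    pdntenv (pdsetup); character_set_name is None when no character
    set is specified, in which case the default applies (rule 5).
    """
    if classification not in DEFAULT_CHARACTER_SET:
        raise ValueError(f"unknown classification: {classification!r}")
    if character_set_name is None:
        return DEFAULT_CHARACTER_SET[classification]
    required = REQUIRED_CLASSIFICATION.get(character_set_name)
    if required is None:
        raise ValueError(f"unknown character set: {character_set_name!r}")
    if required != classification:
        raise ValueError(
            f"{character_set_name} requires classification {required!r}"
        )
    return character_set_name
```

With the classification utf-8, omitting the character set gives the default "Unicode (UTF-8)", specifying UTF16 is accepted, and specifying EBCDIK is rejected because rule 2 requires sjis.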

(4) Notes

To use UTF-16 encoded data in a ? parameter, specify the character set name in the character set descriptor area. UTF-16 encoded data can also be used in embedded variables if UTF-16 data handling is specified in the preprocessing options or in the embedded variable definitions. In this case, the SQL preprocessor determines the character set name based on the specified preprocessing options and the embedded variable.

In addition to UTF16, either UTF-16LE or UTF-16BE can be specified as the character set name.

In the following descriptions, the UTF-16 character set name is assumed to include UTF-16LE and UTF-16BE.
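
Table 1-9 describes UTF16 as two or four bytes per character in big-endian order, and the note above also allows UTF-16LE and UTF-16BE. The following Python sketch shows what those byte layouts look like. It uses Python's own codec names (utf-16-be, utf-16-le), which are not HiRDB identifiers, and the helper names are hypothetical. Treating "code ordering based on bit combinations" as a byte-wise comparison of the encoded data is an assumption made for illustration.

```python
def utf16_bytes(text, byte_order="big"):
    """Encode text as UTF-16 without a byte order mark.

    byte_order "big" corresponds to big-endian data, "little" to
    little-endian data.
    """
    codec = {"big": "utf-16-be", "little": "utf-16-le"}[byte_order]
    return text.encode(codec)


def sort_by_code(strings):
    """Sort strings by their big-endian UTF-16 bytes.

    Assumption: this byte-wise comparison stands in for the
    "code ordering based on bit combinations" collation.
    """
    return sorted(strings, key=utf16_bytes)
```

The letter A encodes to the two bytes 00 41 in big-endian order and 41 00 in little-endian order. A character outside the Basic Multilingual Plane, such as U+1F600, takes four bytes (D8 3D DE 00 in big-endian order). Under byte-wise ordering such a character sorts before U+FF21, even though its code point is larger, because its first byte D8 is smaller than FF.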

For details about specifying the character set in preprocessing options, embedded variable definitions, or the character set descriptor area, see the HiRDB Version 9 UAP Development Guide.