1.1.5 SQL character set

The following table lists the characters that can be used in SQL statements.

Table 1-1 SQL character set

Type | Permissible characters in SQL
Character string literal | One-byte character codes (not including X'00')
National character string literal | All two-byte code characters
Mixed character string literal | One-byte character codes (not including X'00') and all two-byte code characters
Other than above |
  • Following one-byte code characters:
    Upper-case alphabetic characters (A to Z, $, @, #)
    Lower-case alphabetic characters (a to z)
    Numeric characters (0 to 9)
    Space
    Underscore character (_)
    Kana characters
  • All two-byte code characters
  • Following special characters (one-byte character codes):
    Comma (,)
    Period (.)
    Hyphen or minus sign (-)
    Plus sign (+)
    Asterisk (*)
    Single quotation mark (')
    Double quotation mark (")
    Left parenthesis (()
    Right parenthesis ())
    Less than sign (<)
    Greater than sign (>)
    Equals sign (=)
    Circumflex (^)
    Exclamation mark (!)
    Forward slash (/)
    Question mark (?)
    Colon (:)
    Semicolon (;)
    Percent sign (%)
    Vertical bar (|)
    Left square bracket ([)
    Right square bracket (])
    TAB (X'09')
    NL (X'0a')
    CR (X'0d')
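
The "Other than above" row of Table 1-1 can be turned into a quick check for stray characters. The following Python sketch is illustrative only and is not part of HiRDB: the helper name unexpected_ascii is hypothetical, it covers only the ASCII range (one-byte kana characters and two-byte characters are ignored), and it does not distinguish literals, inside which Table 1-1 permits any one-byte code other than X'00'.

```python
import string

# ASCII characters that Table 1-1 lists for text outside of literals.
# Transcribed by hand from the table; treat it as an assumption to verify.
PERMITTED_ASCII = set(
    string.ascii_uppercase + "$@#"      # upper-case alphabetic characters
    + string.ascii_lowercase            # lower-case alphabetic characters
    + string.digits                     # numeric characters
    + " _"                              # space and underscore
    + ",.-+*'\"()<>=^!/?:;%|[]"         # special characters
    + "\t\n\r"                          # TAB, NL, CR
)


def unexpected_ascii(sql_text):
    """Return the sorted ASCII characters in sql_text that the table does not list."""
    return sorted(
        {ch for ch in sql_text if ord(ch) < 0x80 and ch not in PERMITTED_ASCII}
    )
```

For example, unexpected_ascii("SELECT C1 & C2 FROM T1") reports the ampersand, which Table 1-1 does not list.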

Characters that can be used in SQL vary depending on the character code type specified in the pdsetup command. For details about the pdsetup command, see the manual HiRDB Version 9 Command Reference.

SQL allows the use of one-byte and two-byte characters. The two types of characters require different character codes: two-byte characters cannot be used when a single-byte character code is specified. The following table shows the relationship between each character code type and the character sets it uses:

Specified character code | Single-byte character | Double-byte character | Remarks
Multiple-byte character code:
sjis #3 (Shift JIS kanji) | JISX0201 | JISX0208 | Double-byte characters include gaiji characters.
ujis #2 (EUC Japanese kanji) | JISX0201 | JISX0208 | Double-byte characters do not include gaiji characters. #1
chinese #6 (EUC Chinese kanji) | ISO-8859-1 (other than 80 to FF) | GB2312-80 | Double-byte characters do not include gaiji characters. #1
utf-8 #3, #4 (Unicode (UTF-8)) | JISX0221 | JISX0221 | Double-byte characters include gaiji characters. For characters in the ASCII code range, these characters are treated the same as other characters, except that in some cases a single character is represented in six bytes. #5
utf-8 #3, #4 (Unicode (UTF-8)) | MS-Unicode | MS-Unicode | Same as above.
chinese-gb18030 #6 (Chinese kanji GB18030) | ISO-8859-1 (other than 80 to FF) | GB18030-2000 | Double-byte characters include gaiji characters. For characters in the ASCII code range, these characters are treated the same as other characters, except that in some cases a single character is represented in four bytes.
Single-byte character code:
lang-c #2, #6 (8-bit code) | Same as the specified code | -- | These codes can be used in US ASCII and 8-bit codes.
Legend:
--: Not applicable
#1: Gaiji codes assigned to EUC Code Set 3 (character codes that are represented in three bytes as X'8F' followed by X'xxxx') cannot be used.
#2: Cannot be used in the Windows edition.
#3: Passing and receiving Japanese data through a String class or a class inheriting that class between a Java UAP and HiRDB or between HiRDB and a Java routine is performed according to the rules regarding the mapping of Java character codes (mapping between a given character code and Unicode). In this case, some gaiji codes may fail to be converted correctly.
#4: HiRDB itself depends only on the UTF-8 encoding rules; the mapping between codes and characters is transparent to HiRDB, so you can use any characters that comply with the UTF-8 encoding rules. When performing character code conversion, however, you must pay attention to the relationship between the character set and the encoding rules. For example, to specify PDCLTCNVMODE in the client environment definition when the character codes of the HiRDB client are SJIS and the character codes of the HiRDB server are UTF-8, you must determine whether JISX0221 or MS-Unicode is being used. For details about PDCLTCNVMODE, see the HiRDB Version 9 UAP Development Guide.
#5: To use characters of four bytes or longer, you may need to specify the pd_substr_length operand of the system definition and PDSUBSTRLEN in the client environment definition. For details about the pd_substr_length operand, see the manual HiRDB Version 9 System Definition; for details about PDSUBSTRLEN, see the HiRDB Version 9 UAP Development Guide.
In ISO/IEC 10646, characters are allocated only to encodings of one to four bytes. Five-byte and six-byte encodings are reserved for future specifications, and no characters are allocated to them. Therefore, if you use five-byte or six-byte encodings, there is no assurance that a conflict will not occur in the future.
#6: Cannot be used if XDS is used.
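
The byte lengths described in the table and the Code Set 3 restriction in note #1 can be explored with a short Python sketch. Everything here is an assumption made for illustration: the mapping from pdsetup character code names to Python codec names is approximate (Python's codecs do not reproduce HiRDB's gaiji handling or the JISX0221 and MS-Unicode distinction), and the names ASSUMED_CODECS, byte_lengths, and uses_euc_code_set_3 are hypothetical.

```python
# Assumed mapping from pdsetup character code names to Python codec names.
# Illustrative only; HiRDB's own conversion tables may differ.
ASSUMED_CODECS = {
    "sjis": "shift_jis",
    "ujis": "euc_jp",
    "chinese": "gb2312",
    "utf-8": "utf_8",
    "chinese-gb18030": "gb18030",
}


def byte_lengths(ch):
    """Return {character code name: encoded length in bytes, or None}."""
    lengths = {}
    for name, codec in ASSUMED_CODECS.items():
        try:
            lengths[name] = len(ch.encode(codec))
        except UnicodeEncodeError:
            lengths[name] = None  # not representable in this codec
    return lengths


def uses_euc_code_set_3(data):
    """Return True if EUC-encoded bytes contain a three-byte X'8F' sequence."""
    i = 0
    while i < len(data):
        b = data[i]
        if b == 0x8F:      # single shift 3: Code Set 3, three bytes in total
            return True
        if b == 0x8E:      # single shift 2: Code Set 2, two bytes in total
            i += 2
        elif b >= 0xA1:    # Code Set 1: two-byte character
            i += 2
        else:              # Code Set 0: one-byte character
            i += 1
    return False
```

Under these assumptions, byte_lengths("中") gives two bytes for every entry except utf-8, where the character takes three bytes, and a character outside the Basic Multilingual Plane such as U+20000 takes four bytes in both utf-8 and chinese-gb18030, which is the case note #5 is concerned with.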