1.1.5 SQL character set

Table 1-1 shows the characters that are available in SQL.

Table 1-1 SQL character set

Type | Permissible characters in SQL
Character string literal | One-byte character codes (not including X'00')
National character string literal | All two-byte code characters
Mixed character string literal | One-byte character codes (not including X'00') and all two-byte code characters
Other than above |
  • Following one-byte code characters:
    Uppercase alphabetic characters (A-Z, $, @, #)
    Lowercase alphabetic characters (a-z)
    Numeric characters (0-9)
    Space
    Underline character (_)
    Kana characters
  • All two-byte code characters
  • Following special characters (one-byte character codes):
    Comma (,)
    Period (.)
    Hyphen or minus sign (-)
    Plus sign (+)
    Asterisk (*)
    Single quotation mark (')
    Quotation mark (")
    Left parenthesis (( )
    Right parenthesis ( ))
    Less than operator (<)
    Greater than operator (>)
    Equals sign (=)
    Circumflex (^)
    Exclamation mark (!)
    Slash (/)
    Question mark (?)
    Colon (:)
    Semicolon (;)
    Percent sign (%)
    Vertical bar (|)
    Left bracket ([)
    Right bracket (])
    TAB (X'09')
    NL (X'0a')
    CR (X'0d')
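
As an illustration of the "Other than above" row, the following Python sketch collects the one-byte characters listed in Table 1-1 and reports any one-byte character in a piece of SQL text that the table does not list. The function name and the set names are hypothetical. Kana characters and two-byte characters are not modeled, and the check does not distinguish text inside string literals, where the table permits any one-byte code other than X'00'.

```python
import string

# One-byte characters listed in the "Other than above" row of Table 1-1.
# Kana characters and two-byte characters are deliberately left out of this sketch.
LETTERS = set(string.ascii_uppercase) | set("$@#") | set(string.ascii_lowercase)
DIGITS = set(string.digits)
SPECIALS = set(",.-+*'\"()<>=^!/?:;%|[]")
WHITESPACE = {" ", "\t", "\n", "\r"}  # space, TAB (X'09'), NL (X'0a'), CR (X'0d')
ALLOWED = LETTERS | DIGITS | SPECIALS | WHITESPACE | {"_"}


def disallowed_one_byte_chars(sql_text):
    """Return, sorted, the ASCII characters in sql_text that Table 1-1 does not list.

    Hypothetical helper: characters at or above X'80' are ignored, and string
    literals are not treated specially.
    """
    return sorted({ch for ch in sql_text if ord(ch) < 0x80 and ch not in ALLOWED})
```

For example, `disallowed_one_byte_chars("SELECT C1 FROM T1 WHERE C2 = ?")` returns an empty list, while text containing a backquote or an ampersand returns those characters.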

Characters that can be used in SQL vary depending on the character code type specified in the pdsetup command. For details about the pdsetup command, see the manual HiRDB Version 8 Command Reference.

SQL allows the use of one-byte and two-byte characters. The two types of characters use different character codes, and two-byte characters cannot be represented in a single-byte character code. The following table shows the relationships between characters and the character code types:

Specified character code | Single-byte character | Double-byte character | Remarks
Multiple-byte character code:
  sjis (Shift JIS kanji) (notes 1, 4) | JISX0201 | JISX0208 | Double-byte characters include gaiji characters.
  ujis (EUC Japanese kanji) (note 3) | JISX0201 | JISX0208 | Double-byte characters do not include gaiji characters. (note 2)
  chinese (EUC Chinese kanji) | ISO-8859-1 (exclusive of 80-FF) | GB2312-80 | Double-byte characters do not include gaiji characters. (note 2)
  utf-8 (Unicode (UTF-8)) (notes 4, 5) | JISX0221 | JISX0221 | Double-byte characters include gaiji characters. Within the ASCII code range, these characters are treated identically with other character codes, except that in some cases a single character may be represented in 6 bytes. (note 6)
  utf-8 (Unicode (UTF-8)) (notes 4, 5) | MS-Unicode | MS-Unicode | (same as above)
Single-byte character code:
  lang-c (8-bit code) (note 3) | Same as the specified code | -- | These codes can be used in US ASCII and 8-bit codes.
Legend:
--: Not applicable
1 Cannot be used in the Linux version.
2 Gaiji codes assigned to EUC Code Set 3 (character codes that are represented in 3 bytes as (8F)₁₆ (xxxx)₁₆) cannot be used.
3 Cannot be used in the Windows version.
4 Passing and receiving Japanese data through a String class or a class inheriting that class between a Java UAP and HiRDB or between HiRDB and a Java routine is performed according to the rules regarding the mapping of Java character codes (mapping between a given character code and Unicode). In this case, some gaiji codes may fail to be converted correctly.
5 HiRDB is governed by the UTF-8 encoding rules only; the mapping of codes and characters is transparent to HiRDB. Therefore, you can use characters that comply with the UTF-8 encoding rules. However, when performing character code conversion, you must pay attention to the relationship between the character set and the encoding rules. Therefore, to specify PDCLTCNVMODE in the client environment definition when the character codes of the HiRDB client are SJIS and the character codes of the HiRDB server are UTF-8, you must determine whether JISX0221 or MS-Unicode is being used. For details about PDCLTCNVMODE, see the HiRDB Version 8 UAP Development Guide.
6 To use characters of four bytes or longer, you may need to specify the pd_substr_length operand of the system definition and PDSUBSTRLEN in the client environment definition. For details about the pd_substr_length operand, see the manual HiRDB Version 8 System Definition; for details about PDSUBSTRLEN, see the HiRDB Version 8 UAP Development Guide.
In ISO/IEC 10646, characters are allocated to bytes 1 through 4. Bytes 5 and 6 are reserved for future specifications, and no characters are allocated. Therefore, if you use bytes 5 or 6, there is no assurance that a conflict will not occur in the future.
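
To make the byte lengths discussed in the table and notes concrete, the following Python sketch measures how many bytes a character occupies under each multi-byte character code type. The mapping from the character code types to Python codec names is an assumption made for illustration only. It does not model gaiji characters, the MS-Unicode mapping, or how HiRDB itself counts characters.

```python
# Python codec names assumed (for illustration) to correspond to the
# character code types in the table; this mapping is not taken from HiRDB.
CODECS = {
    "sjis": "shift_jis",
    "ujis": "euc_jp",
    "chinese": "gb2312",
    "utf-8": "utf-8",
}


def encoded_length(ch, code_type):
    """Return the number of bytes ch occupies under the given character code type."""
    return len(ch.encode(CODECS[code_type]))


# Compare an ASCII letter with a kanji present in both JIS X 0208 and GB2312-80.
for code_type in CODECS:
    print(code_type, encoded_length("A", code_type), encoded_length("中", code_type))

# A character outside the Basic Multilingual Plane takes 4 bytes in UTF-8,
# the case for which note 6 mentions pd_substr_length and PDSUBSTRLEN.
print(encoded_length("\U00020BB7", "utf-8"))
```

In this sketch the ASCII letter takes one byte under every code type, while the kanji takes two bytes under the Shift JIS, EUC Japanese, and EUC Chinese codecs and three bytes under UTF-8.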

If the HiRDB External Data Access facility is used in a HiRDB system that uses multi-byte character codes, you can use either the Shift JIS kanji code or the EUC Japanese kanji code. Therefore, for the character code to be set in the pdsetup command, you need to specify either sjis or ujis.

In addition, for access to a foreign server using either the Shift JIS kanji code or the EUC Japanese kanji code, you need to provide appropriate settings for the foreign server or the client for the foreign server. For details about settings for foreign servers and foreign server clients, see the appropriate DBMS manuals. The following table shows the relationship between foreign server character codes and HiRDB character codes:

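Type of foreign server | Foreign server character code | HiRDB character code: Shift JIS kanji | HiRDB character code: EUC Japanese kanji
HiRDB | Shift JIS Japanese kanji | Y | N
HiRDB | EUC Japanese kanji | N | Y
XDM/RD E2 | EBCDIK or KEIS | Y (note 1) | N
ORACLE | Shift JIS kanji | Y | Y (note 1)
ORACLE | EUC Japanese kanji | Y (note 1) | Y
DB2 | 2-byte EBCDIC | Y (note 1) | Y (notes 1, 2)
DB2 | 1-byte, 2-byte mixed ASCII | Y (note 1) | Y (notes 1, 2)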
Legend:
Y: Can be connected.
N: Cannot be connected.
1 Character code conversion is performed by a DBMS client library or gateway software for connection to a DBMS.
2 The GRAPHIC or VARGRAPHIC type cannot be used. Execution of an SQL statement containing a ? parameter corresponding to the GRAPHIC or VARGRAPHIC type can cause an error.
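
The connectivity table above can also be written as a lookup structure, for example to check a planned configuration before setting up a foreign server. The following Python sketch is one possible encoding. The dictionary layout and function names are hypothetical, the row labels are copied from the table, and the keys sjis and ujis stand for the Shift JIS kanji and EUC Japanese kanji columns.

```python
# Keys: (type of foreign server, foreign server character code), as labeled in the table.
# Values: for each HiRDB character code (sjis or ujis), a pair of
# (connectable, footnote numbers that apply to that cell).
CONNECTIVITY = {
    ("HiRDB", "Shift JIS Japanese kanji"): {"sjis": (True, ()), "ujis": (False, ())},
    ("HiRDB", "EUC Japanese kanji"): {"sjis": (False, ()), "ujis": (True, ())},
    ("XDM/RD E2", "EBCDIK or KEIS"): {"sjis": (True, (1,)), "ujis": (False, ())},
    ("ORACLE", "Shift JIS kanji"): {"sjis": (True, ()), "ujis": (True, (1,))},
    ("ORACLE", "EUC Japanese kanji"): {"sjis": (True, (1,)), "ujis": (True, ())},
    ("DB2", "2-byte EBCDIC"): {"sjis": (True, (1,)), "ujis": (True, (1, 2))},
    ("DB2", "1-byte, 2-byte mixed ASCII"): {"sjis": (True, (1,)), "ujis": (True, (1, 2))},
}


def can_connect(server_type, server_code, hirdb_code):
    """Return True if the table marks the combination as Y (can be connected)."""
    connectable, _notes = CONNECTIVITY[(server_type, server_code)][hirdb_code]
    return connectable


def needs_conversion(server_type, server_code, hirdb_code):
    """Return True if note 1 (conversion by a client library or gateway) applies."""
    _connectable, notes = CONNECTIVITY[(server_type, server_code)][hirdb_code]
    return 1 in notes
```

For instance, `can_connect("XDM/RD E2", "EBCDIK or KEIS", "ujis")` returns `False`, matching the N in the table.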