8.3.3 Format of database load utility input files

If you specify the -W option at the time of unloading, you can use the unload data file as the input data file to the database load utility.

Organization of this subsection
(1) DAT and extended DAT formats
(2) Binary format
(3) Fixed-size data format

(1) DAT and extended DAT formats

In the DAT and extended DAT formats, data is output as character string data. For details about the differences between the DAT and extended DAT formats, see (c) Differences between DAT and extended DAT formats. Except for the cited differences, the DAT and extended DAT formats are identical.

To output data, use the following rules:

  1. The system outputs one table row as one line in the file (a linefeed character is output at the end of each line).
  2. The system inserts a separator character between column data. For details about column data output formats (in DAT format) for various data types, see Section 5.5.1 DAT format.
    Character string, national character, mixed character string, and BINARY data are enclosed in double quotation marks (") in the output format.
    The output format of an abstract data type consists of as many data items as there are specified by the constructor parameter reverse creation function, each of which is delimited by a separator character. If the constructor parameter reverse creation function's return value is BLOB, the system outputs consecutive separator characters as the null value.
  3. The system outputs LOB column data as the null value, regardless of the specification of the -j option.
  4. If the constructor parameter reverse creation function's return value is BLOB, the system outputs the column data of abstract data type as the null value.
  5. An error results if the line length exceeds 512 MB after conversion. If you assign a streaming tape device to the unload data file and the line length exceeds 32 KB, an error results.
  6. If you specify the spacelvl=3 and sup options in the option statement, the system converts two consecutive single-byte spaces of NCHAR type and suppresses the output of a pair of single-byte spaces.
  7. If the data type of a column is character string, national character data, mixed character string, or binary data, and the column value contains a combination of the enclosing character (") + separator character or the separator character + enclosing character ("), the utility may treat the value as the end or start of column data during data loading. To avoid this, use the -W option to change the separator character during unloading or output the data in the binary format.
(a) Format for repetition columns

For repetition columns, the system outputs data from each element by separating it with separator characters. The row data format depends on what is specified in the array statement in the control information file. Figure 8-13 shows the row data output format when ff is specified in the array statement: Figure 8-14 shows the row data output format when vv is specified in the array statement.

Figure 8-13 Row data output format with ff specified in the array statement

[Figure]

Explanation
  1. Data is stored in all the elements.
  2. Elements 1-6 contain data, and element 5 contains a null value. Elements 7-10 contain either no data or a null value.
  3. Data is stored in all the elements. Elements 3, 8, and 9 contain a null value.
  4. Either the column value is null or all elements contain a null value.

    Figure 8-14 Row data output format with vv specified in the array statement

    [Figure]

Explanation
  1. Data is stored in all the elements. A value of 10 is output as the current number of elements.
  2. Elements 1-6 contain data, and element 5 contains a null value. A value of 6 is output as the current number of elements.
  3. Data is stored in all the elements. Elements 3 and 8-10 contain null values. A value of 10 is output as the current number of elements.
  4. Because a null value is stored in the column value, a value of 0 is output as the current number of elements.
  5. All elements contain a null value. A value of 10 is output as the current number of elements.
(b) Format used when the sup option is specified

If you specify the sup option during unload operation, the system compresses any trailing spaces in any column that is shorter than the column length specified in the table definitions and then outputs them to the unload data file. The following shows the format depending on the specification of the sup option:

Data type: CHAR or MCHAR
Example 1
  • Table definition:

CREATE TABLE T1(C1 INTEGER,C2 CHAR(10),...);

Input data: 1,A,A,...
sup option: Specified

1,"AA",...

Explanation: The system compresses the spaces (equal to eight characters) in a column that is shorter than the defined length.
sup option: Not specified

1,"AA........",...

Note: A period (.) indicates a single-byte space.
Explanation: The system outputs single-byte spaces (.) (equal to eight characters) to pad the column that is shorter than the defined length.
Example 2
  • Table definition:

CREATE TABLE T1(C1 INTEGER,C2 MCHAR(10),...1,);

Input data: 1,..........,... (a period (.) indicates a single-byte space)
sup option: Specified

1,".",...

Explanation: The system outputs one single-byte space.
sup option: Not specified

1,"..........",...

Explanation: The system outputs as many single-byte spaces as there are defined for the table.
Data type: NCHAR
Example 1
  • Table definition:

CREATE TABLE T1(C1 INTEGER,C2 NCHAR(10),...);

Input data: 1,A,A,...
sup option: Specified

1,"AA",...

Explanation: The system compresses the spaces (equal to eight characters) in a column that is shorter than the defined length.
sup option: Not specified

1,"AA[Figure] [Figure] [Figure] [Figure] [Figure] [Figure] [Figure] [Figure] [Figure]",...

Note: A triangle ([Figure]) indicates a double-byte space.
Explanation: The system outputs double-byte spaces ([Figure]) (equal to eight characters) to pad the column that is shorter than the defined length.
Example 2
  • Table definition:

CREATE TABLE T1(C1 INTEGER,C2 NCHAR(10),...1,);

Input data: 1, [Figure][Figure][Figure][Figure][Figure][Figure][Figure][Figure][Figure][Figure],... (a triangle ([Figure]) indicates a single-byte space)
sup option: Specified

1,"[Figure]",...

Explanation: The system outputs one double-byte space.
sup option: Not specified

1,"[Figure] [Figure] [Figure] [Figure] [Figure] [Figure] [Figure] [Figure] [Figure] [Figure]",...

Explanation: The system outputs as many double-byte spaces as there are defined for the table.
Data type: NCHAR (spacelvl=3 specified in the option statement)
Example 1
  • Table definition:

CREATE TABLE T1(C1 INTEGER,C2 NCHAR(10),...);

Input data: 1,A,A,...
sup option: Specified

1,"AA",...

Explanation: The system compresses the spaces (equal to eight characters) in a column that is shorter than the defined length.
sup option: Not specified

1,"AA................",...

Explanation: The system outputs single-byte spaces (equal to eight characters) to pad the column that is shorter than the defined length.
Example 2
  • Table definition:

CREATE TABLE T1(C1 INTEGER,C2 NCHAR(10),...1,);

Input data: 1,[Figure][Figure][Figure][Figure][Figure][Figure][Figure][Figure][Figure][Figure],... (a triangle ([Figure]) indicates a single-byte space)
sup option: Specified

1,"..",...

Note: A period (.) indicates a single-byte space.
Explanation: The system outputs two single-byte spaces.
sup option: Not specified

1,"....................",...

Note: A period (.) indicates a single-byte space.
Explanation: The system outputs as many single-byte spaces as there are defined for the table.
(c) Differences between DAT and extended DAT formats

The following table describes the differences between the DAT and extended DAT formats:

ItemDAT formatExtended DAT format
When the data contains the null character (\0) or linefeed character (\n)The row containing the corresponding column data value is not output.The row containing the corresponding column data value is output (\0 and \n are output as is).
When the enclosing character (") is used as part of the dataThe enclosing character used as part of the corresponding column data value is output as is.Two consecutive enclosing characters are output for an enclosing character that is used as part of the column data value.*
Example: "[Figure] ""
* Because multi-byte characters are also subject to conversion, you must pay special attention if the file contents are to be referenced, such as by an application program. Data loading can be performed because pdload converts "" to " whether or not the data contains multi-byte characters.

(2) Binary format

In the binary format, data is output in an internal format based on the database column definition. If this type of file is used as an input to the database load utility, the unloaded table and the data loading target table must have the same column definitions. For a non-FIX table, data cannot be loaded into a FIX table, even if the column definitions are the same.

To output data, use the following rules:

  1. For FIX tables, the system outputs data continuously without inserting separator characters between the column data items. For non-FIX tables, the system outputs not only the column data but also the data length and data position offset. Figure 8-15 shows the row data output format for non-FIX tables. For a description of column data output formats (in binary format) for various data types, see Section 5.5.2 Binary format.
  2. For a table with LOB columns defined, if the -j option is specified, the system outputs the LOB data value; otherwise, the system outputs the null value.
  3. In the case of data output to a fixed-length block tape such as CMT, the system attaches a 512-byte header to the data.

    Figure 8-15 Row data output format for non-fix tables

    [Figure]

Explanation
The system outputs the row length, column data offset, and row data in this order.
L
Row length (4-byte binary number)
Column data storage position offset
Offset from the beginning of the line to the column data header (4-byte binary number). The system outputs as many column data offsets as there are columns in the table. If a column's data is the null value, the system stores a value of 0 and does not output column data.
Column data
Column value.
If the column data requires a word boundary to be referenced by a UAP, first copy it to a boundary-adjusted area, then reference the column data.
For details about the output format for columns whose column data type or abstract data type attribute is BINARY, see 5.5.2 Binary format.
The following shows the order following the LOB column structure base table of the BINARY data, BINARY attribute data of abstract data type, LOB attribute data of abstract data type, and LOB column data:
[Figure]
Repetition columns take the following format:
[Figure]
Column data storage position offset (abstract data type with BLOB attribute)
The system outputs as many data offsets as there are specified by the constructor parameter reverse creation function in the specified order. Therefore, the number of columns in an unloaded table may not match the number of offsets in the unloaded data. The following shows an example:
[Figure]
LOB attribute data
The system outputs the data returned from the constructor parameter reverse creation function in the same format as for a column of defined type. The order of output is the same as the order of offsets, which is the order specified by the constructor parameter reverse creation function.
In the data indicated by row length L, the system outputs two-byte control information for each column of abstract data type. There is no need to consider this control information to edit the data.
If the constructor parameter reverse creation function returns a BLOB type, the system outputs the data in the same manner as with a defined BLOB type in the following order:
  1. The system outputs the column data of defined types and the data of abstract data type that is output by the constructor parameter reverse creation function (for a BLOB type, only the data length) as the LOB column structure base table data with a length of L.
  2. Immediately after the LOB column structure base table data, the system outputs the BLOB data that is output by the constructor parameter reverse creation function. The order of output is the same as the order defined for the column of abstract data type. If multiple constructor parameter reverse creation functions have been specified for this abstract data type, the system outputs the data in the order the functions are specified.
  3. The system outputs all other BLOB data that is not an abstract data type. The order of output is the same as the order in which the BLOB column is defined.
The following shows an example:
[Figure]

(3) Fixed-size data format

In the fixed-size data format, data is output as fixed-length character string data.

(a) Output format
Legend:
[Figure]: Not applicable
* If you have specified the enclose operand in the fixtext_option statement, add 2 bytes to the output length.
(b) Rules
  1. A maximum of 512 megabytes (536,870,912 bytes) can be output per row (including linefeed characters).
  2. If the table contains a column of the abstract data type, you must specify the unld_func statement.