Nonstop Database, HiRDB Version 9 Command Reference


2.89 pdparaload (Perform parallel loading)

Organization of this section
(1) Function
(2) Executor
(3) Format
(4) Options
(5) Rules
(6) Notes

(1) Function

The pdparaload command performs data loading (pdload command) from a single input data file concurrently on multiple RDAREAs that constitute a row-partitioned table (parallel loading facility).

(2) Executor

User with the same execution privilege as for the pdload command

(3) Format

 
  pdparaload [pdload-command-options] [-I execution-interval] [authorization-identifier.]table-identifier
 
              pdparaload-control-statements-file-name
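
The argument order in the format above can be sketched as a small helper that assembles the command line as a list of arguments. This is an illustrative sketch only: the function name build_pdparaload_argv and its parameter names are hypothetical, and the helper only arranges the arguments without running the command.

```python
from typing import List, Optional, Sequence


def build_pdparaload_argv(
    table: str,
    control_file: str,
    pdload_options: Sequence[str] = (),
    interval_ms: Optional[int] = None,
) -> List[str]:
    """Arrange arguments in the order given by the format:
    pdload options, then -I, then the table, then the control statements file.
    """
    argv = ["pdparaload"]
    argv.extend(pdload_options)          # pdload-command-options, passed through as-is
    if interval_ms is not None:
        argv.extend(["-I", str(interval_ms)])  # -I execution-interval (milliseconds)
    argv.append(table)                   # [authorization-identifier.]table-identifier
    argv.append(control_file)            # pdparaload-control-statements-file-name
    return argv
```

The table name and file path passed to the helper are whatever the caller supplies; the helper does not validate them.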
 

(4) Options

(a) pdload-command-options

Specifies the pdload command's options. The pdparaload command uses the specified options when it executes the pdload command.

Some pdload command options cannot be specified in the pdparaload command. The table below shows whether each pdload command option can be specified in the pdparaload command.

If an unsupported option is specified, the pdparaload command terminates with an error during data loading.

Table 2-23 pdload command options that can and cannot be specified

No. Option Specification in pdparaload Description
1 -d Y Executes the pdload command in the creation mode.
2 -a|-b|-U Y (-w all cannot be specified) Specifies the format of the input data file.
Note that -w all (data loading using a table transfer unload file that involves table and index definitions) cannot be specified.
3 -i Y Specifies the index creation method.
4 -l Y Specifies the update log acquisition mode for the database during execution of the pdload command.
5 -k Y Specifies the data input method for storing LOB data in a LOB column, if a LOB parameter is used as an argument of the constructor function that generates the values to be stored in an abstract data type column.
With the pdparaload command, we recommend that you use the -k f option to create files.
If this option is specified, one file is created for each item of LOB data. Therefore, duplicate LOB data will not be read during parallel processing, thereby reducing the number of I/O operations. However, when this option is specified, a LOB middle file is created during data loading for each RDAREA. The created LOB middle file is retained even after the pdparaload command terminates, so the user must delete the LOB middle file after termination of the pdparaload command.
6 -c Y Specifies the name of the column structure information file.
7 -v Y Specifies the name of the null value/function information file.
8 -n Y Specifies the number of buffer sectors when a local buffer is used.
9 -u N The authorization identifier of the user who executes the pdload command cannot be specified.
10 -x Y Specifies that checking for whether the input data is in ascending or descending order of the cluster key values is not to be performed.
11 -f N EasyMT cannot be specified for the input data file or LOB input file.
12 -s Y Specifies that the separator character between data items is to be changed for an input data file in DAT format.
13 -e Y Specifies that processing is to be cancelled if an error is detected in the input data.
14 -r Y Specifies that data input is to begin at a specified line, not at the beginning of the input data file.
15 -z Y Specifies that variable-length character string data, variable-length national character string data, and variable-length mixed character string data with a length of 0 is to be stored.
16 -y Y Specifies that data is to be stored in unused area in used pages during data loading if all unused pages become completely full.
17 -o N The index information file specified in the index statement cannot be deleted automatically after batch index creation has terminated normally.
18 -m Y Specifies the interval for display of the message indicating the progress of the current process.
19 -X Y Specifies the response monitoring time for dictionary manipulation performed by commands in order to detect failures.
20 -q Y Specifies the generation number of the RDAREAs subject to data loading when the inner replica facility is used.
21 -K Y Specifies the format of input data values for XML-type parameters.
22 -G Y Specifies the type of XML data specified as the input data file (XML document or ESIS-B format).
23 -F Y Specifies that an input data value of the FLOAT or SMALLFLT type is to be corrected if it is outside the value range permitted by the OS.
24 -E N Data loading to a table that is expanded to the memory database cannot be performed forcibly.

Legend:
Y: Can be specified in the pdparaload command
N: Cannot be specified in the pdparaload command
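
Because an unsupported option is reported only during data loading, it can be convenient to screen the option list beforehand. The following is a minimal sketch of such a pre-check, based only on the rows marked N in Table 2-23 and on the -w all restriction as written there. The function name is hypothetical, and the check looks at option tokens only; it does not parse option values.

```python
from typing import List, Sequence

# Options marked N in Table 2-23.
UNSUPPORTED_PDLOAD_OPTIONS = {"-u", "-f", "-o", "-E"}


def find_unsupported_options(args: Sequence[str]) -> List[str]:
    """Return the option tokens that Table 2-23 marks as not specifiable."""
    found = []
    for i, arg in enumerate(args):
        if arg in UNSUPPORTED_PDLOAD_OPTIONS:
            found.append(arg)
        elif arg == "-w" and i + 1 < len(args) and args[i + 1] == "all":
            # Table 2-23, No. 2: -w all cannot be specified.
            found.append("-w all")
    return found
```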
(b) -I execution-interval ~<unsigned integer>((10~600000))<<1000>> (milliseconds)

Specifies the execution interval between data loading sessions. An interval applies because the pdparaload command performs data loading separately for each RDAREA.

When you specify an execution interval, only one pdload command can be executing at the same time, which avoids concentration of accesses to the data dictionary table that could result in an execution wait status.

Guideline for the value to be specified

Normally, the default value is used.

If pdload's preprocessing time exceeds the specified value, add 1,000 each time you re-execute the pdparaload command.
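
The permitted range and the guideline above can be sketched as follows. The constants come from the option description (10 to 600000, default 1000). The function names are hypothetical, and capping the increased value at the upper limit is an assumption of this sketch, not something the guideline states.

```python
MIN_INTERVAL_MS = 10
MAX_INTERVAL_MS = 600000
DEFAULT_INTERVAL_MS = 1000


def validate_interval(ms: int) -> int:
    """Check that a -I value is an integer within the documented range."""
    if isinstance(ms, bool) or not isinstance(ms, int):
        raise TypeError("execution interval must be an integer")
    if not MIN_INTERVAL_MS <= ms <= MAX_INTERVAL_MS:
        raise ValueError("execution interval must be between 10 and 600000")
    return ms


def next_interval(current_ms: int = DEFAULT_INTERVAL_MS) -> int:
    """Add 1,000 for the next re-execution, per the guideline.

    Assumption: the result is capped at the documented maximum.
    """
    return min(validate_interval(current_ms) + 1000, MAX_INTERVAL_MS)
```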

(c) [authorization-identifier.]table-identifier ~<identifier>((1-30 bytes))

Specifies the table identifier of the table on which data loading is to be performed by using the pdparaload command. The specification rules are the same as for the pdload command. For details about [authorization-identifier.]table-identifier in the pdload command, see 5.4.2(25) [authorization-identifier]table-identifier.

(d) pdparaload-control-statements-file-name ~<path name>((1-1023 bytes))

Specifies the path of the pdparaload control statements file. This file contains the pdload command control statements that are to be executed by the pdparaload command. For details about the pdload command control statements, see 5.4 Command format. Note that some pdload command control statements cannot be specified in the pdparaload control statements file. The table below shows whether each pdload command control statement can be specified in the pdparaload control statements file.

If an unsupported control statement is specified, a control statement error occurs during data loading because the pdparaload command does not check the control statements.

Table 2-24 Whether pdload command control statements can be specified

No. Control statement Option Specification in pdparaload Description
1 mtguide -- N A tape device cannot be used.
2 emtdef -- N
3 source RDAREA-area N The user cannot specify RDAREA-area because it is specified by the pdparaload command.
4 server-name|host-name For a HiRDB parallel server configuration: M
For a HiRDB single server configuration: O
This option must be specified for a HiRDB parallel server configuration. If you perform data loading without specifying this option, files on the servers that contain each RDAREA will be processed as input data files.
5 input-data-file-name M Specifies the absolute path name of the input data file that contains the input data.
6 (uoc) O Specifies that a UOC is to be used to input/output the input data file.
7 error O Specifies the absolute path name of the file to which error information is to be output.
You must pay attention to the length of the specified path name because the pdparaload command adds an RDAREA name and enclosing double quotation marks ("") to this file name during data loading. For details, see item 4 in (6) Notes.
8 errdata O Specifies that the erroneous rows of data are to be output if the input data results in an error.
You must pay attention to the length of this file name because the pdparaload command adds an RDAREA name and enclosing double quotation marks ("") to this file name during data loading. For details, see item 4 in (6) Notes.
9 errwork O Specifies the buffer size (in kilobytes) for creating an error data file when the errdata option is specified.
10 maxreclen O Specifies the input data length when the input data file is in DAT format, extended DAT format, binary format, or pdrorg-output binary format.
11 EasyMT-information N EasyMT cannot be specified because it is not supported.
12 validate-sign N
13 index -- N Information about an index information file cannot be specified.
14 idxwork server-name O Specifies the name of the server at which the index information file is to be created.
15 directory-name Y Specifies the absolute path name of the directory in which the index information file is to be created.
16 sort server-name O Specifies the name of the server in which sort work files are to be created.
17 directory-name Y Specifies the absolute path name of the directory under which sort work files are to be created.
18 lobdata LOB-input-file-name N# Specifies the LOB information when loading data to a table containing LOB columns or entering LOB data as an input parameter for a constructor function.
If a BLOB column or a column of abstract data type with the BLOB parameter is defined for the target table and f or v is specified in the -k option, the lobdata statement must be specified. If the lobdata statement is omitted, data loading is performed only for the base table; BLOB data will not be loaded.
19 LOB-input-file-directory-name O#
20 EasyMT-information N#
21 lobcolumn -- N LOB input files by the column cannot be used.
22 lobmid -- N LOB middle files cannot be specified.
LOB middle files are created in the work directory under the following name:
/work-directory/LOBMID-xxxxxxxxx
xxxxxxxxx is a name containing each process's execution time and process ID. Such a file is created for each RDAREA.
23 srcuoc -- O Specifies UOC information in order to use a UOC to edit data and then store the data in a database.
24 array -- O Specifies the handling of the array data format and null values specified in the input data file for a table containing repetition columns.
25 extdat -- O Uses extended functions with input data files in DAT format.
26 src_work -- N Divided-input data files cannot be created.
27 constraint -- O Specifies settings for check pending status.
28 option spacelvl O Specifies whether space conversion is to be executed on the input data.
29 tblfree O Changes the percentage of free space specified with CREATE TABLE (value of PCTFREE) for storing data during data loading.
30 idxfree O Changes the percentage of unused area specified with CREATE INDEX (value of PCTFREE) when creating indexes in the batch index creation mode.
31 job N Data loading with the synchronization point specification is not supported.
32 cutdtmsg O Specifies whether a warning message is to be output to the error information file if data truncation occurs during data loading on an input data file in DAT format.
33 nowait N NOWAIT search is not supported.
34 bloblimit O Specifies the size of the area for retaining data when executing data conversion using a pdrorg-output binary-format input data file before loading data.
35 exectime O Specifies an interval in minutes for monitoring the pdload execution time.
36 null_string O Specifies whether the default value set in the DEFAULT clause or the null value is to be stored when input data is the null value ("*" or omitted) during data loading on a table with the DEFAULT clause specified.
37 dataerr O Specifies that data storage processing is to be ignored (rollback) when an input data error (logical error) is detected.
If a non-partitioning key index is used and an option other than -i s (index update mode) is specified, a control statement error occurs during each data loading session because data loading is performed for each RDAREA.
38 lengover O Specifies that an input data error is to be detected when the input data in a DAT-format (including extended DAT format) input data file that is to be stored in a CHAR, VARCHAR, NCHAR, NVARCHAR, MCHAR, or MVARCHAR data type column is longer than the defined column length.
39 divermsg N The user cannot specify divermsg because it is specified by the pdparaload command.
40 srcendian O Specifies that pdrorg-output binary-format files are to be used to transfer data between platforms that use different endians.
41 allspace O Specifies that the character data to be stored in a numeric data type column (INTEGER, SMALLINT, DECIMAL, FLOAT, or SMALLFLT) is to be converted to 0 and then stored when input data files in the fixed-size data format are used for data loading.
42 whitespace O Specifies how to handle spaces contained in XML documents when data loading is performed from XML documents to XML-type columns.
43 seq_range O Specifies how to acquire the sequence number when the automatic numbering facility is used for data loading.
44 file_buff_size O Specifies the input buffer memory size when data is loaded from the input data file to an input buffer.
45 charset O Specifies the endian for input data files when data is loaded from input data files for UTF-16 to a table defined in a UTF-8 environment.

Legend:
M: Specification is mandatory.
Y: Must be specified if the control statement is specified.
O: Specification is optional.
N: Cannot be specified.
--: Not applicable
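
Because the pdparaload command does not check the control statements itself, a simple scan of the control statements file before execution may catch some mistakes early. The sketch below flags only the statements that Table 2-24 marks as N for the whole statement (mtguide, emtdef, index, lobcolumn, lobmid, src_work). It assumes that each statement begins with its name as the first whitespace-delimited token on a line; that layout, the function name, and the sample text are assumptions made for illustration. Operand-level restrictions (for example, job or nowait in the option statement) are not checked.

```python
from typing import List, Tuple

# Statements that Table 2-24 marks as N for the entire statement.
UNSUPPORTED_STATEMENTS = {
    "mtguide", "emtdef", "index", "lobcolumn", "lobmid", "src_work",
}


def find_unsupported_statements(text: str) -> List[Tuple[int, str]]:
    """Return (line number, statement name) for each unsupported statement.

    Assumption: a statement name is the first token of its line.
    """
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        tokens = line.split()
        if tokens and tokens[0] in UNSUPPORTED_STATEMENTS:
            problems.append((lineno, tokens[0]))
    return problems
```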

#
If a BLOB column or a column of abstract data type with the BLOB parameter is defined for the target table and f or v is specified in the -k option, the lobdata statement must be specified. The table below provides the details:
No. -k option BLOB column Column of abstract data type with BLOB-type input parameter
1 d N N
2 f Y Y
3 v Y N
Legend:
Y: The lobdata statement is required.
N: The lobdata statement is not required.
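
The footnote table above can be expressed as a lookup. This is a minimal sketch: the keys "blob_column" and "adt_blob_param" are hypothetical labels for the two table columns, and the function name is likewise illustrative.

```python
from typing import Iterable

# (-k option value, kind of column) -> whether the lobdata statement is required.
LOBDATA_REQUIRED = {
    ("d", "blob_column"): False,
    ("d", "adt_blob_param"): False,
    ("f", "blob_column"): True,
    ("f", "adt_blob_param"): True,
    ("v", "blob_column"): True,
    ("v", "adt_blob_param"): False,
}


def lobdata_required(k_option: str, column_kinds: Iterable[str]) -> bool:
    """Return True if any column kind in the table requires lobdata."""
    return any(LOBDATA_REQUIRED[(k_option, kind)] for kind in column_kinds)
```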

(5) Rules

  1. You can execute the pdparaload command only while HiRDB is active.
  2. The pdparaload command can perform data loading only on row-partitioned tables.
    The pdparaload command does not support data loading on a table for which flexible hash partitioning is defined (including flexible hash partitioning defined for matrix partitioning). If a table for which flexible hash partitioning is defined is specified, the pdparaload command terminates with an error.
  3. The RDAREAs constituting the row-partitioned table must be in a status in which the database load utility (pdload) can be executed. For details about RDAREA status, see C.2 Availability of utility or UAP execution depending on RDAREA status.
  4. In order to start the pdparaload command, you need as many locked resources as the total number of locked resources required for data loading for all the RDAREAs. For details about the number of locked resources required for data loading for an RDAREA, see 5.4.2(4)(a) Notes.
  5. The pdparaload command cannot be executed more than once concurrently on the same table. You can execute the pdparaload command concurrently on different tables, but those tables must not share any RDAREAs. The pdparaload command locks each RDAREA internally in order to perform data loading for the RDAREA. If the pdparaload command is executed on more than one table stored in the same RDAREA, the command terminates with a lock error.
    For details about the lock mode for data loading for an RDAREA, see B.2 Lock mode for utilities.
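
Rule 5 can be checked ahead of time if the RDAREAs of each table are known. The following sketch assumes the caller already has a mapping from table name to the set of RDAREA names that store the table; how that mapping is obtained is outside the scope of this example, and the function name is hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Tuple


def find_rdarea_conflicts(
    tables: Dict[str, Iterable[str]],
) -> List[Tuple[str, str, List[str]]]:
    """Return pairs of tables that share at least one RDAREA.

    Tables in a returned pair should not be loaded concurrently
    with the pdparaload command.
    """
    conflicts = []
    ordered = sorted(tables.items(), key=lambda item: item[0])
    for (name_a, areas_a), (name_b, areas_b) in combinations(ordered, 2):
        shared = set(areas_a) & set(areas_b)
        if shared:
            conflicts.append((name_a, name_b, sorted(shared)))
    return conflicts
```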

(6) Notes

  1. The following are the pdparaload command's return codes:
    0: Normal termination
    4: Normal termination (database storage processing was skipped because a logical error occurred in some of the input data)
    8: Abnormal termination
  2. The pdparaload command generates a name for the pdload control statements file under the following naming convention:
    LOD_CTL_authorization-identifier_table-identifier_RDAREA-name
    If a file with the same name already exists, the command terminates with an error.
  3. If a file with the file name generated by the pdparaload command from the following control statements already exists, the command overwrites the existing file:
    • Error information file name specified in error in the source statement
    • Error data file name specified in errdata in the source statement
  4. The pdparaload command adds an RDAREA name and enclosing double quotation marks ("") to the files specified in the source statement. You must specify path names and file names so that their lengths satisfy the rules described in the following.
    • For the source statement
      The lengths of path names and file names specified in the error and errdata operands of the source statement must be no greater than the values obtained from the formulas shown below:
      In the error operand:
      Length of a path name (bytes) ≤ Maximum length of a path name for the OS - (length of RDAREA name + 1)
      Length of a file name (bytes) ≤ Maximum length of a file name for the OS - (length of RDAREA name + 1)
      In the errdata operand:
      Length of a path name (bytes) ≤ (Maximum length of a path name for the OS - 8) - (length of RDAREA name + 1)
      Length of a file name (bytes) ≤ (Maximum length of a file name for the OS - 8) - (length of RDAREA name + 1)
      Note also that the length of a line of the source statement must not exceed 1,023 bytes. If a line exceeds 1,023 bytes, the command terminates with an error.
      The following figure shows the rules for the RDAREA name and "" that are added to the file names specified in the source statement.

      [Figure: rules for the added RDAREA name and ""]
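
The notes above can be sketched as a few small helpers: a lookup for the return codes in note 1, the naming convention from note 2, and the length limits from note 4. The function names are hypothetical. The OS maximum lengths are passed in by the caller because they depend on the platform, and measuring the RDAREA name length as UTF-8 bytes is an assumption of this sketch.

```python
RETURN_CODES = {
    0: "Normal termination",
    4: "Normal termination (database storage processing was skipped "
       "because a logical error occurred in some of the input data)",
    8: "Abnormal termination",
}


def control_file_name(auth_id: str, table_id: str, rdarea: str) -> str:
    """Name generated for the pdload control statements file (note 2)."""
    return f"LOD_CTL_{auth_id}_{table_id}_{rdarea}"


def max_operand_length(os_max_bytes: int, rdarea: str, operand: str) -> int:
    """Upper limit for a path or file name length (note 4).

    os_max_bytes: maximum path or file name length for the OS.
    operand: "error" or "errdata"; errdata reserves 8 more bytes.
    Assumption: the RDAREA name length is counted in UTF-8 bytes.
    """
    if operand not in ("error", "errdata"):
        raise ValueError("operand must be 'error' or 'errdata'")
    reserve = 8 if operand == "errdata" else 0
    rdarea_len = len(rdarea.encode("utf-8"))
    return (os_max_bytes - reserve) - (rdarea_len + 1)
```

For example, with a hypothetical OS file name limit of 255 bytes and an 8-byte RDAREA name, the error operand allows a file name of up to 246 bytes and the errdata operand up to 238 bytes.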