5.4.2 Options

Organization of this subsection
(1) -d
(2) {-a|-b}
(3) -i index-creation-method
(4) -l log-acquisition-method
(5) -W
(6) -k LOB-creation-type
(7) -c column-structure-information-filename
(8) -v null-value/function-information-filename
(9) -n [batch-output-local-buffer-sectors-count],[div],[local-buffer-sectors-count-for-random-access]
(10) -u authorization-identifier
(11) -x
(12) -f input-data-file-type-or-LOB-input-file-type
(13) -s separator-character
(14) -e
(15) -r input-begin-line
(16) -z
(17) -y
(18) -o
(19) -m progress-message-output-interval
(20) -X response-monitoring-time-for-server-to-server-communication
(21) -q generation-number
(22) [authorization-identifier.] table-identifier
(23) control-information-filename

(1) -d

[Figure]<<addition mode>>

Specifies that the database load utility is to be executed in the creation mode. If you omit this option, the system assumes the addition mode.

In the creation mode, all existing data is deleted from the table that is to be loaded, and then the input data is stored in the table.

In the addition mode, the input data is added to the existing data in the table.

Rules
  1. In the addition mode, the input data is stored in the pages following the last data in the table. If you use the addition mode to load data on a table with a cluster key defined, the additional data is not stored in the order of cluster key values.
  2. In the addition mode, data is always stored in unused pages. If the system cannot allocate unused pages, an error results. Normally, the system will not use available space in used pages that contain data. However, if you specify the -y option, the system will use used pages that already contain data.
  3. If you execute data loading in units of RDAREAs in the creation mode, the existing table data is deleted only in the specified RDAREA.
  4. During data deletion, the system does not allow a NOWAIT search, even if nowait=yes is specified in the option statement.
  5. Once data loading with the synchronization point specification terminates abnormally, the creation mode is ignored during the re-execution, if specified.
  6. If you execute data loading on a LOB column that is stored in an RDAREA in frozen update status or on a table that contains an abstract data type column with the LOB attribute, specifying the creation mode results in an error because data cannot be deleted. Even when you execute data loading only on a LOB column structure base table, specifying the creation mode results in an error.
  7. For a falsification prevented table, data loading in the creation mode is not permitted.

(2) {-a|-b}

[Figure]<<DAT format>>

Specifies the format of the input data file.

Omitted: DAT format

-a: Fixed-size data format

-b: Binary format

For details about the input data files, see section 5.5 Input data file.

(3) -i index-creation-method

[Figure]<<c>>

Specifies the index creation method. There are four ways to create indexes:

c
This indicates the batch index creation mode. If you specify this mode, the system creates indexes in batch mode after creating the table.
When row data is being stored, the system outputs index creation information to an index information file without actually creating an index. The system creates indexes after storing the row data.
Criterion
When a large amount of data is loaded, this method can create indexes at high speed.
Notes
  1. For data loading in the addition mode, the system outputs all index information, including both existing data and added data, to the index information file; therefore, all index entries are re-created.
    For additional data loading involving a plug-in index, the system outputs only the index information on the additional data. Therefore, the plug-in index is created only for the index entries that have been added.
  2. If you execute data loading with the synchronization point specification in the batch index creation mode, the processing performance decreases for the following reasons:
    [Figure]For a B-Tree index, the system stores all data and then searches the data pages again to create an index information file.
    [Figure]For a plug-in index, the system starts the index creation process each time the transaction is settled, thereby alternately executing data loading and index creation.
n
This indicates the index information output mode. If you specify this mode, the system outputs only the index information to an index information file.
When row data is being stored, the system outputs index creation information to an index information file without actually creating an index.
Criterion
You can use the index information file obtained to achieve high-speed index creation by executing multiple batch index creation processes concurrently with the database reorganization utility. This method is especially effective for a table that is partitioned and stored in the multiple servers that constitute a HiRDB/Parallel Server.
Notes
  1. If a table is partitioned into multiple RDAREAs in a server and a part of the partitioned table is created by data loading in units of RDAREAs, the system assumes the index information output mode for the non-partitioning key index that is not row-partitioned in the server.
  2. You cannot specify the index information output mode for a table for which a plug-in index is defined.
  3. When you specify the index information output mode, you need to use the database reorganization utility to create the index in batch mode after completing the data loading.
s
This indicates the index update mode. If you specify this mode, the system updates indexes each time a row of data is stored.
Criterion
Use the index update mode when loading a small amount of data to a table that already contains a large amount of data, or when a unique key index or primary key index has been defined and the key values of the data to be stored may be duplicated.
x
This indicates the index information output suppression mode. If you specify this mode, the system does not update indexes nor output index information to an index information file.
Criterion
Use this mode when executing data loading in multiple segments. For example, you should use the index information output suppression mode if data loading needs to be executed n times because there are n volumes of MT.
Notes
  1. If you are executing data loading using the -ix option, specify -ice (batch index creation mode) or -in (index information output mode) at the final execution. If you specify -in, you need to execute batch index creation (-k imp) later using the database reorganization utility.
    If you have loaded all data using the -ix option, you need to re-create indexes (-k irk) later using the database reorganization utility.
  2. You cannot specify the index information output suppression mode for a table for which a plug-in index is defined.
  3. If you execute data loading using the -ix option, the system does not update indexes, leaving indexes in uncreated status.
(a) Notes about the creation of an index
  1. If you execute pdload in a mode other than the index update mode (-i s), the system creates an index in batch mode after storing all row data in a table. If a duplicate key error or an out-of-limit duplicate key error occurs during the index creation process, a mismatch may result between the table and the index.
    To avoid this problem, create a backup copy in either of the following cases, irrespective of which log-acquisition mode is used:
    • Tables with a unique key index or primary key index defined
    • Tables with an index defined on a repetition column
  2. Figure 5-9 shows the index creation method for a row-partitioned index and a non-partitioned index. If a table is partitioned and stored in multiple RDAREAs in a server, there are a row-partitioned index that is stored in as many index storage RDAREAs as there are table storage RDAREAs and a non-partitioned index that is stored in a single index storage RDAREA.
    The row-partitioned index is partitioned and stored in the same manner as with the table that is partitioned in the server (RDAREAs INDEX1_1 and INDEX1_2 in the figure).
    The non-partitioned index is stored in a single RDAREA in the server, regardless of the number of table storage RDAREAs (RDAREAs INDEX1_3, INDEX2_1, and INDEX2_2 in the figure).

    Figure 5-9 Index creation method for partitioning key index and non-partitioning key index

    [Figure]

    Explanation
    After completion of data loading, the system creates indexes according to the specified -i option. If data loading is executed on TABLE1_1 in units of RDAREAs, row-partitioned index INDEX1_1 is created (there is no effect on INDEX1_2). To create non-partitioned index INDEX2_1, however, information about both TABLE1_1 and TABLE1_2 is required. Therefore, in this case, the non-partitioned index is not created (only the index information file for TABLE1_1 is created). To create the non-partitioned index, you need to execute data loading on TABLE1_2 also to create the index information file for TABLE1_2, then execute prong's batch index creation process using the index information files for TABLE1_1 and TABLE1_2.
  3. After table data has been stored, the corresponding indexes are unfinished and unavailable until the batch index creation process is completed.
  4. If you specify -i c or -i n, but there is no data in the input data file, the system outputs index information only for the existing data to the index information file. You can use this file as an input to pdrorg to re-create the index. In this case, be sure to execute pdload in the addition mode (-d option omitted). If you execute it in the creation mode (-d option specified), pdload deletes the table data at first, thereby losing the index information.
  5. If you execute data loading using the -i n option and then use ALTER TABLE to add RDAREAs before executing prong's batch index creation process (-k imp), the index information file output by the -i n option becomes unusable. If you are adding RDAREAs with ALTER TABLE, be sure to complete prong's batch index creation process beforehand. If you have already added RDAREAs using ALTER TABLE, execute prong's index re-creation process (-k ixrc).
  6. When you specify -i c or -i n, the utility creates as many index information files as the number of indexes [Figure]RDAREAs storing the table. Because these files are opened at the same time during data loading, they may exceed the maximum number of files permitted per process. If this maximum number of files is exceeded, increase the value of the pd_max_open_fds operand in the system definition. However, if the maximum value of the pd_max_open_fds operand is exceeded, either check and, if necessary, revise the number of table partitions per server and the number of defined indexes or specify -i s.
(b) Notes about the creation of a plug-in index
  1. If a plug-in index is defined, you cannot specify the index information output mode nor the index information output suppression mode. In these cases, specify either the batch index creation mode or the index update mode.
  2. For a B-tree structure index, execution of data loading in the batch index creation mode creates all indexes. For a plug-in index, an index is created only on the additional data. Therefore, if you execute data loading involving 0 data records, a B-tree structure index is re-created, but not a plug-in index. However, for a plug-in index, loading 0 data records after re-initializing the index storage RDAREA results in the re-creation of an index. Use this option to repair an index in the event of abnormal termination during an index creation process, or if the index has been damaged due to a disk error.
(c) Notes about specifying -i c or -i n

When -i c or -i n is specified and the index or idxwork statement is omitted, the utility outputs the index information file to the /tmp directory using the following naming conventions:

/tmp/INDEX-index-name-index-storage-RDAREA-name-unique-character-string

If pdload terminates abnormally, this file is not deleted. If you re-execute pdload as is, a new file with a different name is created; this may result in a space shortage in the /tmp directory. If the idxwork statement is specified, the utility also outputs the index information file to the directory specified in the idxwork statement using the same naming conventions. In this case also, if pdload terminates abnormally, the file is not deleted. You should use the rm OS command to delete unneeded index information files.

(4) -l log-acquisition-method

[Figure]<<p>>

Specifies the method for acquiring the database updating log when pdload is executed.

If pdload terminates abnormally during execution, the database is not restored to its status before pdload was executed even when the update log has been acquired. For details about what to do in the event of abnormal termination of pdload, see 5.12 Database status in the event of an error and recovery methods.

a
This indicates the log acquisition mode. This method collects database updating log information required for rollback and rollforward.
Criteria
This mode is suitable for loading a small amount of data.
When you execute data loading in the log acquisition mode, there is no need to make backups before and after execution of pdload, but performance is lower than in the other modes.
p
This indicates the pre-update log acquisition mode. This method collects database updating log information required for rollback, but it does not collect database updating log information required for rollforward.
Criteria
This mode is suitable for loading a large amount of data.
When you execute data loading in the pre-update log acquisition mode, the execution time is shorter than in the log acquisition mode. However, to handle possible media errors, you should back up the RDAREAs that store tables and indexes after executing pdload.
In the event of an error during execution of pdload, pdload restores the database to the synchronization point immediately before the occurrence of the error. This protects the RDAREAs from being placed in no-log shutdown status; however, the target table is not restored to its status before execution of pdload.
Notes
If data loading is executed on a table for which an abstract data type provided by a plug-in is defined, whether or not the p option (pre-update log acquisition mode) takes effect on the log output by the plug-in depends on the installed plug-in. If the plug-in does not support the p option, the a option (log acquisition mode) is assumed.
n
This indicates the no-log mode. The system does not update the database updating mode information.
Criteria
If only the table subject to data loading and its indexes are defined in the RDAREA, this mode is suitable for loading a large amount of data.
Data loading in the no-log mode is faster than in the other modes. However, to restore the database from its backup in the event of an error during utility execution or from its backup copy and log information in the event of an error on a medium, you must back up the RDAREAs storing the table and indexes both before and after executing the database load utility.
In the event of an error, you can restore the database only up to the point at which the backup was made.
Notes
  1. If an error occurs while executing pdload in the no-log mode, you must either use a backup copy to restore the RDAREAs that were shut down due to the error or re-initialize them.
  2. You cannot use the no-log mode for data loading with the synchronization point specification.
(a) Notes
  1. If you specify n and the database can be restored from the previously acquired backup copy and log information or from the input data (if the table subject to data loading contains no data and only this table is to be stored in applicable RDAREAs), there is no need to make a backup copy prior to execution of pdload.
  2. For details about how to operate when p or n is specified (database updating log information is not collected), see the HiRDB Version 8 System Operation Guide.
  3. A transaction log is always collected by a transaction (T) created by pdload, regardless of the -l option's specification. The system creates the following amount of transaction log information per server; therefore, the formula for determining the amount of log information during execution of pdload is as shown below:

    Amount of log information = (1328 + 176 [Figure] 3) [Figure] T + A (bytes)

    T = (x [Figure] 2) + (y [Figure] 2) + (z [Figure] 2)

    x: Number of tables*
    y: Number of RDAREAs storing LOB columns (LOB attribute)
    z: Number of indexes [Figure] number of RDAREAs storing indexes (not required if -i s is specified)
    * The value is the number of synchronization points in the case of data loading with synchronization point specification; otherwise, the value is 1.
    A: Amount of system log information that is output according to the database manipulation (amount of database updating log information). This value depends on the value of the -l option. For details about determining the amount of system log information, see the HiRDB Version 8 Installation and Design Guide.
    If p or n is specified, the system collects ENQ log information in lock mode. The following is the amount of ENQ log information collected per server:

    ENQ log information = (p + q + r) [Figure] T

    p: Number of RDAREAs storing the table
    q: Number of RDAREAs storing LOB columns (LOB attribute)
    r: Number of RDAREAs storing indexes
    Therefore, the amount of system log information the system outputs is the system log file record length [Figure] ENQ log information.
  4. If Real Time SAN Replication based on the log-only synchronous method is used, and pdload with -l p or -l n specified was executed at the transaction execution site, you must execute the preparations for log application.

(5) -W

Specifies that the file output in binary format by the database reorganization utility with the -W option specified is to be used as the input data file.

Data loading fails if the table definitions, such as the column data types, are different between the unloaded table and the table subject to data loading. However, if you specify a column structure information file, you can load data to a non-FIX table with the column definition sequence and number of columns changed. When you use a UAP or UOC to change the contents of the input data file, you must edit the data in the format in which there are no spaces between the row length, column data offset, and row data.

If you specify this option, make sure that the -b option is also specified.

(6) -k LOB-creation-type

[Figure]<<f or d>>

Specifies the data input method for storing LOB data in a LOB column, if a LOB parameter is used as an argument of the constructor function that generates the values to be stored in an abstract data type column. When this option is omitted, f is assumed, except that d is assumed when the -W option is specified.

f
Specify this option to prepare a file for each unit of LOB data. These files are called LOB input files.
c
Specify this option to prepare one file for all LOB data contained in one LOB column. This file is called a LOB column input file.
A column-unit LOB column input file is created when data is migrated from another database using a program.
You cannot specify this option for LOB data that is specified as an input parameter of the function that creates data for an abstract data type column. Therefore, if a table contains an abstract data type column that uses LOB as an input parameter, you cannot use a LOB column input file even for another LOB column that is not the abstract data type.
v
For the data to be stored in a BLOB column, provide a LOB input file for each set of LOB data to be stored in the column. Specify the data (LOB parameter) that serves as an input parameter for the constructor function that generates the values for the abstract data type columns if it is to be stored in an input file.
This specification takes effect only when the input data file is a binary-format file.
d
Specify this option to specify LOB data and LOB parameter data for abstract data type columns directly in an input data file.
This specification takes effect only when the input data file is a binary-format file.
When you specify -k d, you cannot execute data loading using UOC.
(a) Criteria

Determine the -k option according to whether or not there are LOB columns and LOB parameters for abstract data type columns and the format of input data file as follows:

LOB columnFormat of input data fileAbstract data type column
NoYes
Without LOB parameterWith LOB parameter
NoDAT format[Figure][Figure]-k f
Binary format-k v or -k d
Fixed-size data format-k f
pdrorg-generated binary format[Figure][Figure]-k d
YesDAT format-k f or -k c-k f or -k c-k f
Binary format-k f, -k c, or -k d-k f, -k c, or -k d-k v or -k d
Fixed-size data format-k f or-k c-k f or -k c-k f
pdrorg-generated binary format-k d-k d-k d

[Figure]: There is no need to specify the -k option.

(b) Rules
  1. If you are creating a LOB column structure base table separately from LOB columns, specify the same options when creating them.
  2. The contents of an input data file and LOB input files depends on the -k option specification as shown as follows:
    -k optionInput data fileLOB input file
    Data stored in LOB columnAbstract data type LOB parameterData stored in LOB columnAbstract data type LOB parameter
    fSpecify the name of the LOB input fileSpecify the name of the LOB input fileProvide as many files as there are data items to be storedProvide as many files as there are data items to be stored
    cSpecify dummy data[Figure]Provide as many files as there are columns[Figure]
    vSpecify the name of the LOB input fileSpecify the contents of LOB parameterProvide as many files as there are data items to be storedNot required
    dSpecify the contents of LOB dataSpecify the contents of LOB parameterNot requiredNot required
    [Figure]: Not applicable
  3. For data loading with the synchronization point specification, specify d in the -k option.Neither f, nor c, nor v can be specified.

(7) -c column-structure-information-filename

Specifies the name of a column structure information file.

For details about the column structure information file, see section 5.7 Column structure information file. Specify a column structure information file in the following cases:

(a) Rules
  1. If an input data file is in fixed-size data format, be sure to specify a column structure information file. Specification of a column structure information file is optional for an input data file in DAT or pdrorg-generated binary format. For an input data file in the pdrorg-generated binary format, you can specify a column structure information file to load data to a non-FIX table. An attempt to load data to a FIX table using a column structure information file results in an error. To modify the column structure of a FIX table using a column structure information file, treat the corresponding file as the input data file in the fixed-size data format.
  2. If an input data file is in the DAT format and satisfies all the following conditions, there is no need to specify a column structure information file:
    • Columns of input data are in the same order as the order in which the table columns are defined.
    • Input data has the same number of columns as the columns in the table.
    • The constructor function that generates the data to be stored in the abstract data type has the same name as the column data type, and only one constructor function is used.
    • The number of input data elements to be stored in a repetition column is the same as the maximum number of elements specified for the applicable columns during table definition.
  3. Provide the column structure information file in the host where the database load utility is executed (the pdload command is entered).
  4. A column structure information file and a null value/function information file are mutually exclusive.

(8) -v null-value/function-information-filename

Specifies the name of a null value/function information file.

You can use a null value/function information file to convert input data to a value that represents a null value or to specify information about a constructor function that generates values for the abstract data type columns.

For details about the null value/function information file, see section 5.8 Null value/function information file.

(a) Rules
  1. You can specify a null value/function information file if the input data file is in the binary format, not in the DAT or fixed-size data format.
  2. A null value/function information file and a column structure information file are mutually exclusive.

(9) -n [batch-output-local-buffer-sectors-count],[div],[local-buffer-sectors-count-for-random-access]

Specifies that a local buffer is to be used for loading data into the table. Specifying this option reduces the number of input/output operations because the system uses the local buffer for batch output to access the database.

When this option is omitted, the system uses the global buffer to output one page at a time.

batch-output-local-buffer-sectors-count[Figure]<unsigned integer> ((2-4096))
Specifies the number of local buffer sectors for batch output. The batch output local buffer is used for data pages.
For the number of batch output local buffers, we recommend a value in the range 16-32. A guidelines is 64 kilobytes/page length.
div
Specify div when all the conditions listed below are applicable. If div is not specified when these conditions are all applicable, the number of input/output operations may increase, thereby affecting performance adversely.
  • Data is to be loaded by table into a row partitioned table.
  • Data is to be loaded into a row partitioned table that uses a hash function (HASH0-HASH6), or the key values of the input data that is stored in a table partitioned by key ranges are distributed randomly.
  • There are multiple RDAREAs at one server in which the table is stored
When div is specified, the required memory size increases because the system allocates as many buffer sectors as there are table partitions in the server.
local-buffer-sectors-count-for-random-access[Figure]<unsigned integer> ((4-125000))
Specifies the number of local buffer sectors for random access. The random access local buffer is used for index pages.

It is recommended that you change the combination of the number of batch input/output local sectors and the number of random access local buffer sectors according to the table definition. Table 5-1 shows the recommended -n option specification.

Table 5-1 Recommended -n option specification (pdload)

Table typeColumn definitionTable partitioning in server
YesNo
FIX table[Figure]-n x, div-n x
Non-FIX tableVariable-length character string whose column length exceeds 256 bytes or a BINARY column is defined-n, ,y
Abstract data type column is defined-n x, div, y-n x, ,y
Repetition column is defined
Other-n x, div-n x
Legend:
x: Number of batch output local buffer sectors
y: Number of random access local buffer sectors
[Figure]: Not applicable
(a) Buffer used by pdload
  1. When the -n option is omitted, the system uses the global buffer. In this case, the transaction performance of a UAP that uses the global buffer drops because a large amount of global buffer space is used during data loading. Specifying the -n option eliminates such buffer contention. Figure 5-10 show the relationship between pdload and buffer.

    Figure 5-10 Relationship between pdload and buffer

    [Figure]

    Explanation:
    If only the global buffer is used (-n option is omitted), buffer contention occurs between pdload and the UAP.
    When both local and global buffers are used (-n option is specified), buffer contention does not occur between pdload and the UAP. However, for data loading to LOB columns, the system uses the global buffer even when the -n option is specified.
  2. If only the number of batch output local buffer sectors is specified in the -n option, the system uses a single batch output local buffer per RDAREA. If there are multiple RDAREAs, buffer contention occurs because only one batch output local buffer is used. If buffer contention occurs, the number of input/output operations increases, thereby affecting performance adversely. In such a case, specify div. When div is specified, buffer contention will not occur because the system allocates as many batch output local buffers as there are RDAREAs (one batch output local buffer per RDAREA).
  3. Even when the batch output local buffer is specified, the system may use the global buffer depending on the conditions. Table 5-2 describes the relationships between the conditions and the buffer that is used.

    Table 5-2 Relationship between conditions and the buffer that is used (pdload)

    ConditionSpecification of random access local buffer
    Not specifiedSpecified
    Global bufferBatch output local bufferGlobal bufferBatch output local bufferRandom access local buffer
    Data pageRDAREA storing LOB column structure base tableFIX tableNYNYN
    Non-FIX tableWhen variable-length character strings or BINARY columns whose column length exceeds 256 bytes are definedYNYNN
    When abstract data type columns or repetition columns are defined and a row of data cannot fit in one pageYNYNN
    OtherNYNYN
    RDAREA for storing LOB columnsYNYNN
    RDAREA for storing LOB attributesYNYNN
    Index pageWhen key values are searched from the table (-i c or n is specified)NYNYN
    When indexes are created concurrently (-i s is specified)YNNNY
    When batch index creation is executed (-i c is specified)YNNNY
    Directory pageYNYNN
Legend:
Y: Used
N: Not used
(b) Rules
  1. If the global buffer is not sufficient for a rebalancing table, performance may drop considerably. Therefore, if there is no column whose definition length exceeds 256 bytes, we recommend that you specify the number of batch output pages.
    If the local buffer specification is not effective for a reason such as the existence of a column whose definition length is 256 bytes or greater, allocate at least the following number of global buffer sectors:
    Number of buffer sectors required per RDAREA
    = 1024 [Figure] (number of table storage RDAREAs) [Figure] 2+ 3
  2. For a rebalancing table with FIX hash partitioning, the system allocates a buffer equivalent to the specified number of pages for each hash group, thereby consuming more memory than for a table that is not partitioned or partitioned by other conditions.

(10) -u authorization-identifier

Specifies the authorization identifier of the user executing the database load utility.

For the default value, see (b) Default value.

When this option is specified, a message requesting password entry is displayed. If no password is required, enter null in response to the message. The utility checks the authorization identifier entered and password to verify that the user is authorized to establish connection with the HiRDB system and to access the table.

(a) Criterion

Specify this option to use an authorization identifier that is not defined in the PDUSER environment variable.

(b) Default value

When this option is omitted, the system assumes the authorization identifier and password as follows:

  1. The system assumes the value of the PDUSER environment variable during the execution of database load utility. Be sure to specify PDUSER if you are executing the utility in the background with & attached by the shell, or in a remote shell environment in which a password cannot be entered. Following are examples of the PDUSER environment variable:
    Examples for C shell:
    Specifying a password:
    setenv PDUSER '"authorization-identifier"/"password"'
    Not specifying a password:
    setenv PDUSER '"authorization-identifier"'
  2. If the PDUSER environment variable is not set, the system assumes the login window's user name. Enter the password when a message is displayed requesting password entry. If no password is required, enter null in response to the message.
(c) Rules
  1. Do not specify this option if you are executing the utility in the background with & attached by the shell, or in a remote shell environment in which a password cannot be entered.
  2. If you enclose an authorization identifier in double quotation marks ("), the system treats it as being case sensitive; otherwise, the system treats it as in all uppercase letters. If you use the Bourne shell (sh), C shell (csh), or Korn shell (ksh), you need to enclose the authorization identifier in single quotation marks (').

(11) -x

When a cluster key has been defined for the table, this option specifies that data loading is to be performed in the order of the input data without checking to see whether the input data is in ascending or descending order of the cluster key values.

When this option is omitted, the utility performs cluster key checking and treats any out-of-order input data as an error.

(a) Criterion

If you know that the input data is sorted by the cluster key, specify this option to reduce the overhead of cluster key checking.

(b) Note
  1. For a table for which a UNIQUE cluster key has been defined, checking of duplicate keys is not available; therefore, an attempt to store data with a duplicate key will result in an error (rollback).
  2. This option enables input data that is not sorted by the cluster key to be stored in a table for which a cluster key has been defined; however, it does not provide the clustering effect.

(12) -f input-data-file-type-or-LOB-input-file-type

Specifies easymt if the input data files or the LOB input files are EasyMT.

(a) Rules
  1. If you are creating a LOB column structure base table and LOB columns at the same time, you need MTguide for mounting operations to use EasyMT.
  2. If the input data files are EasyMT and the LOB input files are regular files (or vice versa), you need to perform data loading separately for the LOB column structure base table and the LOB columns.
  3. When data is loaded to a table containing abstract data type columns, if LOB data is present as a parameter for the function that generates the data to be stored in an abstract data type column, the LOB input file cannot be assigned to EasyMT. In this case, you cannot specify this option.

(13) -s separator-character

[Figure]<character string>

For an input data file in DAT format, this option specifies the separator character to be used as the delimiter between data items. When this option is omitted, the comma (,) is assumed.

In the binary or fixed-size data format, this option is ignored, if specified.

(a) Criterion

Specify this option to use a character other than the comma (,) as the separator between data items in the input data.

(b) Rules
  1. You can specify the tab symbol and the pipe symbol (|) as separator characters if they do not occur in the input data. However, so they can be recognized by the shell that processes command line input, these special characters should be enclosed in double quotation marks (").
  2. None of the following characters can be specified as the separator character:
    • Uppercase letters (A-Z) and lowercase letters (a-z)
    • Numeric characters (0-9)
    • The following characters that are used as reserved characters by the utility:
      Asterisk (*), double quotation mark ("), underline (_)
  3. The following characters are not suitable for use as separator characters because they can occur in input data codes. Separator characters are single-byte codes; therefore, you cannot specify a double-byte code as a separator character.
    • Signs for numeric data input (+ or -)
    • Characters that can occur as Japanese input character codes:
      |, \, [, ], (, ), {, },zueng045.tif
    • The hyphen (-) for date data input
    • The colon (:) for time data input
    • The period (.) for time and date interval data input

(14) -e

Specifies that processing is to be cancelled if an error is detected in the input data.

To cancel processing and ignore the storage processing that had been performed up to that point (to roll back), specify the dataerr operand in the option statement.

If you omit both the -e option and the dataerr operand in the option statement, the utility continues processing and stores only normal data in the database even if it detects invalid input data.

When the -e option is specified, the system outputs the cause of error to the error information file only for the first data item resulting in an error. When the -e option is omitted, the system outputs the cause of error for all data resulting in an error.

(a) Criteria

You can cancel processing when the first error is detected in the input data.

Suppose that you are loading data to a repetition column using an input data file in DAT format. Although the input data is created in VV format, it is treated as FF format if the elmtype operand is omitted from the array statement, in which case all input data results in an error and the system creates an unneeded error information file as well as an error data file. In this case, if you specify the -e option, the system cancels the processing when it detects the first error, thereby avoiding the creation of unneeded files.

(b) Rules
  1. If the following conditions are satisfied, the utility cancels processing and rolls back regardless of the specification of this option:
    ConditionReason
    In a binary-format input data file, the length field of variable-length data is a negative value.In a binary-format with no delimitation between rows, row data must be identified because data is divided by the column length. Because data cannot be divided by a negative value, the utility cannot continue processing.
    The dataerr operand is specified in the option statement.When the dataerr operand is specified, the utility rolls back processing. Therefore, continuing with the processing serves no purposes.
  2. The name of an error information file is specified in the control information file. If there is no such specification, the database load utility creates the file in the /tmp directory at the host that contains the input data file. In this case, the database load utility assigns a file name and displays it in a message.
  3. If the server where an input file was prepared using a HiRDB/Parallel Server (the server specified in the source statement) is different from the server containing the table subject to data loading, or an input file is prepared on a utility special unit on a HiRDB/Single Server, the system cannot cancel the processing when it detects any of the following errors:
    • Duplicate key error on a unique index key or a primary key index
    • Invalid value for an abstract data type column
    If you specify the -e option to execute data loading in units of tables into a table that is divided into multiple servers on a HiRDB/Parallel Server, the system cannot guarantee the extent to which input row data has been stored in a table in the event of an error. Therefore, do not specify the -e option if all of the following conditions are met:
    • HiRDB/Parallel Server
    • Data loading in units of tables into a table divided into multiple servers
    • Tables with a unique key index or primary key index defined, or tables containing columns of abstract data type
  4. If the dataerr operand is specified in the option statement and error data is detected, the utility cancels processing even when the -e option is omitted.

(15) -r input-begin-line

[Figure]<unsigned integer> ((2-4294967295)) <<1>>

Specifies that data input is to begin at the specified line, not the beginning of the input data file.

(a) Criteria

If the data loading process specifying the -e option is cancelled, specify this option to restart the data loading.

(b) Notes
  1. When a UOC is used for data loading (to read the input data file), the -r option if specified, is ignored.
  2. If you re-execute data loading with a synchronization point specified using the -r option after the initial data loading rolled back due to abnormal termination, the system determines the input begin line as follows:
    StatusInput begin line during re-execution
    -r input-begin-line
    > Line saved during the data loading with a synchronization point specified
    -r input-begin-line
    -r input-begin-line
    = Line saved during the data loading with a synchronization point specified
    Line saved during the data loading with a synchronization point specified*
    -r input-begin-line
    < Line saved during the data loading with a synchronization point specified
    Line saved during the data loading with a synchronization point specified*
    * The KFPL00810-I message is displayed, indicating the re-execution of pdload.

(16) -z

Specifies that variable-length character string data, variable-length national character string data, and variable-length mixed character string data with a length of 0 is to be stored.

(a) Criteria

Specify this option to avoid storing the null value or single-byte space (one space character) for variable-length character string data, variable-length national character string data, or variable-length mixed character string data.

(b) Notes
  1. For data in the DAT format, the system handles data with a length of zero as follows:
    ..., " ", ... (Treated as data with a length of 0)
    ...,, ... (Treated as the null value)
  2. For data in the binary format, the system sets 0 as the actual data length and treats data without the actual data section as having a length of 0.

(17) -y

Specifies that data is to be stored in unused area in used pages during data loading if all unused pages become completely full. When this option is specified, the system displays the KFPH26010-I message before storing data in the unused area.

(a) Criteria

Normally, when the unused pages are used up during data loading, an error results and the processing is rolled back. To avoid this, specify this option so that data loading can be completed by storing data in unused area.

(b) Rules
  1. You cannot specify p (pre-update log acquisition mode) with the -l option.
  2. Once an unused page has been completely used, the specified percentage of unused space per page is not applicable to the data storage.
  3. The -n option, if specified, has no effect after an unused page has been used up.

(18) -o

Specifies that the index information file specified in the index statement is to be deleted automatically after batch index creation terminates normally. Note that the directory specified in the idxwork statement and the index information file in the /tmp directory that was created automatically by pdload are deleted after batch index creation terminates normally, regardless of the specification of this option.

(a) Criterion

An index information file created for plug-in indexes tends to be large and requires a large amount of disk space, if kept on the disk. To make sure that this index information file is deleted, you can specify the -o option to automatically delete index information files after the batch index creation process.

(b) Rules

For data loading in units of RDAREAs, it is necessary to execute pdrorg (-k ixmk) using the index information file output after data loading for the following indexes. Therefore, the -o option, if specified, is ignored in this case.

(19) -m progress-message-output-interval

[Figure]<unsigned integer> ((1-1000)) <<10>>

Specifies, in units of 10,000 lines, an interval at which a message is displayed indicating the progress of the current process.

(a) Criterion

If the default value, which is 10,000 lines, is too many or too few, specify a desired value.

(b) Rule

During batch index creation, the system ignores this option and displays messages only when the index creation begins and ends.

(20) -X response-monitoring-time-for-server-to-server-communication

[Figure]<unsigned integer> ((1-65535)) <<300>>

If an error (such as a communication error) occurs at the server where the command was executed, the command may stop responding and the application may stop. To help you detect errors, pdload enables you to monitor the response time for communication during dictionary manipulation performed by the command.

The -X option specifies the response monitoring time for dictionary manipulation (in seconds). If the execution time during dictionary manipulation exceeds the value set in the -X option, pdload assumes a dictionary access error and cancels processing with return code 8.

Criteria
  • If you want to detect an error in less time than 300 seconds in the event of a no-response from the server due to a communication error or unit down, specify a value that is smaller than 300 in the -X option.
  • If the system switchover facility is used, the command may keep waiting for a response even though system switchover has been completed. In such a case, you can terminate the command immediately by reducing the monitoring time.
  • The specified monitoring time may result in a timeout if a response from the dictionary is delayed and the utility's preprocessing is not completed within 300 seconds (which is the default value for the -X option). This can happen when many applications and utilities are executing concurrently. In such an environment, specify a value greater than 300 in the -X option.

(21) -q generation-number

[Figure]<unsigned integer> ((0-10))

Specifies the generation number of the RDAREA subject to data loading when the inner replica facility is used.

Specify the generation number as follows:

0: Original RDAREA is to be subject to data loading.
1 to 10: Replica RDAREA generation that is to be subject to data loading.
(a) Criteria

Specify this option if you are using the inner replica facility and loading data to an RDAREA other than the current RDAREA.

(b) Rules
  1. When the inner replica facility is not used, this option cannot be specified.
  2. When this option is omitted, the current RDAREA becomes subject to data loading.
  3. If a replica RDAREA is subject to data loading, specify the name of the original RDAREA in the source statement and the generation number subject to data loading in the -q option.
  4. pdload checks the RDAREA subject to data loading for its generation. It performs the following checking; if the utility detects an error, it displays a message and terminates with return code 8.
    Checking as to whether the generation numbers match between row-partitioned tables:
    When data loading is executed on a row-partitioned table, the utility checks the generation of the target RDAREA.
    When the -q option is specified, the utility checks each storage target for the RDAREA of the specified generation. When the -q option is omitted, the utility checks each current RDAREA at the storage target to determine whether their generation numbers match.
    Checking as to whether the generation numbers of the RDAREAs for storing the table and index match:
    The utility checks the generation numbers of the RDAREAs for storing the table and index.
    When the -q option is specified, the utility checks the RDAREAs subject to data loading for the specified generation. When the -q option is omitted, the utility checks all current RDAREAs subject to data loading to determine whether their generations are the same.
    Checking as to whether the replica status of the target RDAREAs is the same:
    When the -q option is specified, the utility checks the target RDAREAs to determine whether the current RDAREAs are intermixed with non-current RDAREAs.
  5. The following table describes whether or not each file can be used depending on the use of the inner replica facility:
    File output conditionWhether or not the file can be used
    When inner replica facility is not usedWhen inner replica facility is used
    Original RDAREAReplica RDAREA
    Same generation*Different generation
    Index information fileWhen inner replica facility is not usedYYNN
    When inner replica facility is usedOriginal RDAREAYYNN
    Replica RDAREANNYN
    LOB middle fileWhen inner replica facility is not usedYYNN
    When inner replica facility is usedOriginal RDAREAYYNN
    Replica RDAREANNYN

Y: File can be used.

N: File cannot be used.

* The generation information during file output is the same as the generation information specified in the -q option. When the -q option is omitted, the generation information during file output is the same as the generation information for the current RDAREA.

(22) [authorization-identifier.] table-identifier

Specifies the table identifier of the table subject to data loading. When the authorization identifier is omitted, the system assumes the user name used to establish connection with HiRDB.

(a) Rule

If you enclose an authorization identifier in double quotation marks ("), the system treats it as case sensitive; otherwise, the system treats it as in all uppercase letters. If you use the Bourne shell (sh), C shell (csh), or Korn shell (ksh), you need to enclose the authorization identifier in single quotation marks (').

(23) control-information-filename

Specifies the name of the control information file that contains the control statements of the database load utility.

You can specify any of the control statements listed below in the control information file. For details about the control statements, see sections 5.4.3 through 5.4.17.

(a) Number of specifiable control statements

The following table shows the maximum number of control statements permitted in a control information file:

Control statementMaximum number of control statements
mtguide statement1
emtdef statement1
source statement1
index statementNumber of indexes [Figure] number of storage RDAREAs
idxwork statementNumber of index storage servers
sort statementNumber of index storage servers
lobdata statement1
lobcolumn statementNumber of LOB columns
lobmid statement1
srcuoc statement1
array statement1
extdat statement1
src_work statement1
constraint statement1
option statement1
report statement1
(b) Relationship between control statements and options

The following shows the relationship between control statements and options.

(c) Files and directories specified in the control statements

The following rules apply to specifying files and directories in the control statements:

  1. Access privileges must be granted to the HiRDB administrator. If some control statements or operands are omitted, the utility may assume that directories or files are to be created in the /tmp or /usr/tmp directory; therefore, you must also grant access privileges to the /tmp or /usr/tmp directory.