5.4.2 Options
(1) -d
~<<addition mode>>
Specifies that the database load utility is to be executed in the creation mode. If you omit this option, the system assumes the addition mode.
In the creation mode, all existing data is deleted from the table that is to be loaded, and then the input data is stored in the table.
In the addition mode, the input data is added to the existing data in the table.
- Rules
- In the addition mode, the input data is stored in the pages following the last data in the table. If you use the addition mode to load data on a table with a cluster key defined, the additional data is not stored in the order of cluster key values.
- In the addition mode, data is always stored in unused pages. If the system cannot allocate unused pages, an error results. Normally, the system will not use available space in used pages that contain data. However, if you specify the -y option, the system will use used pages that already contain data.
- If you execute data loading in units of RDAREAs in the creation mode, the existing table data is deleted only in the specified RDAREA.
- During data deletion, the system does not allow a NOWAIT search, even if nowait=yes is specified in the option statement.
- If data loading with the synchronization point specification terminates abnormally, the creation mode, even if specified, is ignored during re-execution.
- If you execute data loading on a LOB column that is stored in an RDAREA in frozen update status or on a table that contains an abstract data type column with the LOB attribute, specifying the creation mode results in an error because data cannot be deleted. Even when you execute data loading only on a LOB column structure base table, specifying the creation mode results in an error.
- For a falsification prevented table, data loading in the creation mode is not permitted.
(2) {-a|-b [-W [-w {all|data} [,rdaomit][,csetnck]]] | -U}
~<<DAT format>>
Specifies the format of the input data file. For details about input data files, see 5.5 Input data file.
- Omitted: DAT format
- -a: Fixed-size data format
- -b: Binary format
- If you perform data loading on an audit trail table, you must specify the -b option because audit trail tables are always in binary format.
- If you also specify the -W option, the utility treats the file as a pdload unload file.
- -W: pdload unload file
- When you specify this option, you must also specify the -b option. Data loading fails if the table definitions, such as the column data types, are different between the unloaded table and the table subject to data loading. However, if you specify a column structure information file, you can load data into a non-FIX table with the column definition sequence and number of columns changed. When you use a UAP or UOC to change the contents of the input data file, you must edit the data in a format in which there are no spaces between the row length, column data offset, and row data.
- You must specify this option when you perform data loading on an audit trail table.
- -w: Table transfer unload file
- When you specify this option, you must also specify the -b -W option.
- {all|data}
- all: Stores the table, index definition, and table data.
- data: Stores only the table data.
- rdaomit
- Specify this option if the storage RDAREA names differ between the transfer source and the transfer target. This option is applicable only to a table that satisfies the following conditions:
- Table to be imported contains no BLOB columns.
- Table is not partitioned.
- csetnck
- Specifies that the system is not to check whether the character encoding used in the table transfer unload file matches the character encoding used in HiRDB.
- -U: pdrorg unload file
- When you specify this option to perform data loading, the column definitions must match between the unloaded table and the table subject to data loading. When you unload a table containing a column for which SUPPRESS is specified, you cannot use an unload data file created with the -S option specified (to compress spaces); nor can you use an unload data file by schema. The only supported medium for unload data files is regular files (tape and HiRDB files are not supported). If unloaded data was output to multiple files, specify in pdload's source statement only the names of the unload data files to which actual data was output during execution of pdrorg, in the order specified in the unload statement. If the order of specification is incorrect, data might not be loaded correctly.
- Note that if this option is specified when the table subject to data loading satisfies any of the conditions listed below, a control statement error occurs:
- Table containing a column of an abstract data type
- Table containing a column of BINARY type
- Table containing a repetition column
- FIX table
- When you perform data loading on these tables, you must use pdload unload files.
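The -W description above notes that when a UAP or UOC edits an unload-format input file, the row length, column data offsets, and row data must be packed with no intervening spaces. The sketch below illustrates that kind of contiguous packing; the field widths and byte order (4-byte little-endian integers) are assumptions for illustration, not the actual pdload record layout.

```python
import struct

def pack_row(columns: list) -> bytes:
    """Pack one row as: row length, column data offsets, row data,
    laid out contiguously with no padding between the parts.
    The 4-byte little-endian fields are an assumption for illustration."""
    header = 4 + 4 * len(columns)          # row-length field + offset table
    offsets, body = [], b""
    for col in columns:
        offsets.append(header + len(body)) # offset of this column's data
        body += col
    row_len = header + len(body)
    return struct.pack("<%dI" % (1 + len(columns)), row_len, *offsets) + body
```

Packing the two columns `b"abc"` and `b"de"` yields a 17-byte record: a 12-byte header (row length 17, offsets 12 and 15) followed immediately by the column data, with no gaps.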
(3) -i index-creation-method
~<<c>>
Specifies the index creation method. There are four ways to create indexes:
- c
- This indicates the batch index creation mode. If you specify this mode, the system creates indexes in batch mode after storing the row data in the table.
- When row data is being stored, the system outputs index creation information to an index information file without actually creating an index. The system creates indexes after storing the row data.
- Criterion
- When a large amount of data is loaded, this method can create indexes at high speed.
- Notes
- For data loading in the addition mode, the system outputs all index information, including both existing data and added data, to the index information file; therefore, all index entries are re-created.
For additional data loading involving a plug-in index, the system outputs only the index information on the additional data. Therefore, the plug-in index is created only for the index entries that have been added.
- If you execute data loading with the synchronization point specification in the batch index creation mode, the processing performance decreases for the following reasons:
For a B-tree index, the system stores all data and then searches the data pages again to create an index information file.
For a plug-in index, the system starts the index creation process each time the transaction is settled, thereby alternately executing data loading and index creation.
- n
- This indicates the index information output mode. If you specify this mode, the system outputs only the index information to an index information file.
- When row data is being stored, the system outputs index creation information to an index information file without actually creating an index.
- Criterion
- You can use the index information file obtained to achieve high-speed index creation by executing multiple batch index creation processes concurrently with the database reorganization utility. This method is especially effective for a table that is partitioned and stored in the multiple servers that constitute a HiRDB parallel server configuration.
- Notes
- If a table is partitioned into multiple RDAREAs in a server and a part of the partitioned table is created by data loading in units of RDAREAs, the system assumes the index information output mode for the non-partitioning key index that is not row-partitioned in the server.
- You cannot specify the index information output mode for a table for which a plug-in index is defined.
- When you specify the index information output mode, you need to use the database reorganization utility to create the index in batch mode after completing the data loading.
- s
- This indicates the index update mode. If you specify this mode, the system updates indexes each time a row of data is stored.
- Criterion
- Use the index update mode when loading a small amount of data to a table that already contains a large amount of data, or when a unique key index or primary key index has been defined and the key values of the data to be stored may be duplicated.
- x
- This indicates the index information output suppression mode. If you specify this mode, the system does not update indexes nor output index information to an index information file.
- Criterion
- Use this mode when executing data loading in multiple segments. For example, you should use the index information output suppression mode if data loading needs to be executed n times because the input data spans n volumes of magnetic tape (MT).
- Notes
- If you are executing data loading using the -i x option, specify -i c (batch index creation mode) or -i n (index information output mode) at the final execution. If you specify -i n, you need to execute batch index creation (-k ixmk) later using the database reorganization utility.
If you have loaded all data using the -i x option, you need to re-create indexes (-k ixrc) later using the database reorganization utility.
- You cannot specify the index information output suppression mode for a table for which a plug-in index is defined.
- If you execute data loading using the -i x option, the system does not update indexes, leaving the indexes in uncreated status.
(a) Notes about the creation of an index
- If you execute pdload in a mode other than the index update mode (-i s), the system creates an index in batch mode after storing all row data in a table. If a duplicate key error or an out-of-limit duplicate key error occurs during the index creation process, a mismatch may result between the table and the index.
To avoid this problem, create a backup copy in either of the following cases, irrespective of which log-acquisition mode is used:
- Tables with a unique key index or primary key index defined
- Tables with an index defined on a repetition column
- The figure below shows the index creation method for a row-partitioned index and a non-partitioned index.
If a table is partitioned and stored in multiple RDAREAs in a server, there are a row-partitioned index that is stored in as many index storage RDAREAs as there are table storage RDAREAs and a non-partitioned index that is stored in a single index storage RDAREA.
The row-partitioned index is partitioned and stored in the same manner as with the table that is partitioned in the server (RDAREAs INDEX1_1 and INDEX1_2 in the figure).
The non-partitioned index is stored in a single RDAREA in the server, regardless of the number of table storage RDAREAs (RDAREAs INDEX1_3, INDEX2_1, and INDEX2_2 in the figure).
Figure 5-11 Index creation method for partitioning key index and non-partitioning key index
![[Figure]](figure/zu050030.gif)
- Explanation
- After completion of data loading, the system creates indexes according to the specified -i option. If data loading is executed on TABLE1_1 in units of RDAREAs, row-partitioned index INDEX1_1 is created (there is no effect on INDEX1_2). To create non-partitioned index INDEX2_1, however, information about both TABLE1_1 and TABLE1_2 is required. Therefore, in this case, the non-partitioned index is not created (only the index information file for TABLE1_1 is created). To create the non-partitioned index, you need to execute data loading on TABLE1_2 also to create the index information file for TABLE1_2, then execute pdrorg's batch index creation process using the index information files for TABLE1_1 and TABLE1_2.
- After table data has been stored, the corresponding indexes are unfinished and unavailable until the batch index creation process is completed.
- If you specify -i c or -i n, but there is no data in the input data file, the system outputs index information only for the existing data to the index information file. You can use this file as an input to pdrorg to re-create the index. In this case, be sure to execute pdload in the addition mode (-d option omitted). If you execute it in the creation mode (-d option specified), pdload deletes the table data at first, thereby losing the index information.
- If you execute data loading using the -i n option and then use ALTER TABLE to add RDAREAs before executing pdrorg's batch index creation process (-k ixmk), the index information file output by the -i n option becomes unusable. If you are adding RDAREAs with ALTER TABLE, be sure to complete pdrorg's batch index creation process beforehand. If you have already added RDAREAs using ALTER TABLE, execute pdrorg's index re-creation process (-k ixrc).
- When you specify -i c or -i n, the utility creates as many index information files as the number of indexes × the number of RDAREAs storing the table. Because these files are all opened at the same time during data loading, they may exceed the maximum number of files permitted per process. If the maximum number is exceeded, performance might suffer. For this reason, we recommend that you perform data loading for each RDAREA.
The maximum number of file descriptors that can be used per process is the physical limit or 7,872, whichever is smaller.
- If you have specified the index information output mode (-i n), make sure that you do not rename the index (with ALTER INDEX) before batch index creation (-k ixmk) is completed. If the index is renamed, batch index creation will stop. If you have renamed an index by mistake before batch index creation was completed, use ALTER INDEX to restore the original index name, perform batch index creation, and then rename the index.
- If you have specified the index information output mode (-i n), make sure that you do not rename an RDAREA containing the index before batch index creation (-k ixmk) is completed. If an RDAREA containing the index is renamed, batch index creation will stop. If you have renamed an RDAREA by mistake before batch index creation was completed, use pdmod to restore the original RDAREA name, perform batch index creation, and then rename the RDAREA.
- If indexes for a table consisting of many partitions are created with the index batch output mode (-i c) or the index information output mode (-i n), perform data loading for each RDAREA.
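As a quick sanity check for the file-descriptor note above, the number of simultaneously open index information files and the quoted per-process cap can be estimated with a trivial sketch (the 7,872 figure comes from the text; everything else is straightforward arithmetic):

```python
FD_CAP = 7872  # per-process file descriptor cap quoted in the text

def index_info_file_count(num_indexes: int, num_rdareas: int) -> int:
    """With -i c or -i n, pdload opens one index information file per
    index per table-storage RDAREA, all at the same time."""
    return num_indexes * num_rdareas

def within_fd_limit(num_indexes: int, num_rdareas: int, os_limit: int) -> bool:
    """The effective cap is the OS physical limit or 7,872, whichever is smaller."""
    return index_info_file_count(num_indexes, num_rdareas) <= min(os_limit, FD_CAP)
```

For example, 10 indexes over a table stored in 1,000 RDAREAs would need 10,000 descriptors and exceed the cap, which is why per-RDAREA loading is recommended for heavily partitioned tables.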
(b) Notes about the creation of a plug-in index
- If a plug-in index is defined, you cannot specify the index information output mode nor the index information output suppression mode. In these cases, specify either the batch index creation mode or the index update mode.
- For a B-tree structure index, execution of data loading in the batch index creation mode creates all indexes. For a plug-in index, an index is created only on the additional data. Therefore, if you execute data loading involving 0 data records, a B-tree structure index is re-created, but not a plug-in index. However, for a plug-in index, loading 0 data records after re-initializing the index storage RDAREA results in the re-creation of the index. Use this method to repair an index in the event of abnormal termination during an index creation process, or if the index has been damaged due to a disk error.
(c) Notes about specifying -i c or -i n
When -i c or -i n is specified and the index or idxwork statement is omitted, the utility outputs the index information file to the /tmp directory using the following naming conventions:
/tmp/INDEX-index-name-index-storage-RDAREA-name-unique-character-string
If pdload terminates abnormally, this file is not deleted. If you re-execute pdload as is, a new file with a different name is created; this may result in a space shortage in the /tmp directory. If the idxwork statement is specified, the utility also outputs the index information file to the directory specified in the idxwork statement using the same naming conventions. In this case also, if pdload terminates abnormally, the file is not deleted. You should use the rm OS command to delete unneeded index information files.
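When scripting cleanup of leftover index information files, a small helper can reproduce the default naming convention above. The 8-character unique-string generation here is a placeholder assumption for illustration; pdload generates its own unique string.

```python
import os
import uuid

def index_info_path(index_name: str, rdarea_name: str, tmpdir: str = "/tmp") -> str:
    """Build a file name of the form
    /tmp/INDEX-<index-name>-<RDAREA-name>-<unique-string>.
    The 8-character unique string is illustrative only."""
    unique = uuid.uuid4().hex[:8]
    return os.path.join(tmpdir, "INDEX-%s-%s-%s" % (index_name, rdarea_name, unique))
```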
(4) -l log-acquisition-method
~<<p>>
Specifies the method for acquiring the database updating log when pdload is executed.
If pdload terminates abnormally during execution, the database is not restored to its status before pdload was executed even when the update log has been acquired. For details about what to do in the event of abnormal termination of pdload, see 5.13 Database status in the event of an error and recovery methods.
- a
- This indicates the log acquisition mode. This method collects database updating log information required for rollback and rollforward.
- Criteria
- This mode is suitable for loading a small amount of data.
- When you execute data loading in the log acquisition mode, there is no need to make backups before and after execution of pdload, but performance is lower than in the other modes.
- p
- This indicates the pre-update log acquisition mode. This method collects database updating log information required for rollback, but it does not collect database updating log information required for rollforward.
- Criteria
- This mode is suitable for loading a large amount of data.
- When you execute data loading in the pre-update log acquisition mode, the execution time is shorter than in the log acquisition mode. However, to handle possible media errors, you should back up the RDAREAs that store tables and indexes after executing pdload.
- In the event of an error during execution of pdload, pdload restores the database to the synchronization point immediately before the occurrence of the error. This protects the RDAREAs from being placed in no-log shutdown status; however, the target table is not restored to its status before execution of pdload.
- Notes
- If data loading is executed on a table for which an abstract data type provided by a plug-in is defined, whether or not the p option (pre-update log acquisition mode) takes effect on the log output by the plug-in depends on the installed plug-in. If the plug-in does not support the p option, the a option (log acquisition mode) is assumed.
- n
- This indicates the no-log mode. The system does not collect database updating log information.
- Criteria
- If only the table subject to data loading and its indexes are defined in the RDAREA, this mode is suitable for loading a large amount of data.
- Data loading in the no-log mode is faster than in the other modes. However, to restore the database from its backup in the event of an error during utility execution or from its backup copy and log information in the event of an error on a medium, you must back up the RDAREAs storing the table and indexes both before and after executing the database load utility.
- In the event of an error, you can restore the database only up to the point at which the backup was made.
- Notes
- If an error occurs while executing pdload in the no-log mode, you must either use a backup copy to restore the RDAREAs that were shut down due to the error or re-initialize them.
(a) Notes
- If you specify n and the database can be restored from the previously acquired backup copy and log information or from the input data (if the table subject to data loading contains no data and only this table is to be stored in applicable RDAREAs), there is no need to make a backup copy prior to execution of pdload.
- For details about how to operate when p or n is specified (database updating log information is not collected), see the HiRDB Version 9 System Operation Guide.
- A transaction log is always collected by the transactions created by pdload, regardless of the -l option's specification. The system creates the following amount of transaction log information per server; therefore, the formula for determining the amount of log information during execution of pdload is as shown below:
- Amount of log information = (1328 + 176 × 3) × T + A (bytes)
- T = (x × 2) + (y × 2) + (z × 2)
x: Number of tables#
y: Number of RDAREAs storing LOB columns (LOB attribute)
z: Number of indexes × number of RDAREAs storing indexes (not required if -i s is specified)
- #: The value is the number of synchronization points in the case of data loading with synchronization point specification; otherwise, the value is 1.
A: Amount of system log information that is output according to the database manipulation (amount of database updating log information). This value depends on the value of the -l option. For details about determining the amount of system log information, see the HiRDB Version 9 Installation and Design Guide.
If p or n is specified, the system collects ENQ log information in lock mode. The following is the amount of ENQ log information collected per server:
- Amount of ENQ log information = (p + q + r) × T
p: Number of RDAREAs storing the table
q: Number of RDAREAs storing LOB columns (LOB attribute)
r: Number of RDAREAs storing indexes
Therefore, the amount of system log information the system outputs is the system log file record length × the amount of ENQ log information.
- If Real Time SAN Replication based on the log-only synchronous method is used, and pdload with -l p or -l n specified was executed at the transaction execution site, you must execute the preparations for log application.
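The per-server log estimate in the notes above can be turned into a small calculator. This is a direct transcription of the stated formulas, with A (the database update log amount, which depends on the -l option) supplied by the caller:

```python
def pdload_log_bytes(x: int, y: int, z: int, a: int) -> int:
    """Amount of log information = (1328 + 176 * 3) * T + A (bytes),
    where T = (x * 2) + (y * 2) + (z * 2).
    x: tables (or sync points), y: LOB-storage RDAREAs,
    z: indexes * index-storage RDAREAs, a: update-log amount (A)."""
    t = (x * 2) + (y * 2) + (z * 2)
    return (1328 + 176 * 3) * t + a

def enq_log_entries(p: int, q: int, r: int, t: int) -> int:
    """ENQ log information collected per server when -l p or -l n is
    specified: (p + q + r) * T."""
    return (p + q + r) * t
```

For one table, no LOB RDAREAs, and two index-RDAREA combinations, T is 6 and the fixed part contributes (1328 + 528) × 6 = 11,136 bytes on top of A.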
(5) -k LOB-creation-type
~<<f or d>>
Specifies the data input method for storing LOB data in a LOB column, if a LOB parameter is used as an argument of the constructor function that generates the values to be stored in an abstract data type column. When this option is omitted, f is assumed, except that d is assumed if the -W option is specified and the job operand is omitted from the option control statement.
- f
- Specify this option to prepare a file for each unit of LOB data. These files are called LOB input files.
- c
- Specify this option to prepare one file for all LOB data contained in one LOB column. This file is called a LOB column input file.
- A LOB column input file is typically created by a program when data is migrated from another database.
- You cannot specify this option for LOB data that is specified as an input parameter of the function that creates data for an abstract data type column. Therefore, if a table contains an abstract data type column that uses LOB as an input parameter, you cannot use a LOB column input file even for another LOB column that is not the abstract data type.
- v
- For the data to be stored in a BLOB column, provide a LOB input file for each set of LOB data to be stored in the column. Specify the data (LOB parameter) that serves as an input parameter for the constructor function that generates the values for the abstract data type columns if it is to be stored in an input file.
- This specification takes effect only when the input data file is a binary-format file.
- d
- Specify this option to specify LOB data and LOB parameter data for abstract data type columns directly in an input data file.
- This specification takes effect only when the input data file is a binary-format file.
- When you specify -k d, you cannot execute data loading using UOC.
(a) Criteria
Determine the -k option according to whether or not there are LOB columns and LOB parameters for abstract data type columns and the format of input data file as follows:
| LOB column | Format of input data file | No abstract data type column | Abstract data type column without LOB parameter | Abstract data type column with LOB parameter |
|---|---|---|---|---|
| No | DAT format | -- | -- | -k f |
| No | Binary format | -- | -- | -k v or -k d |
| No | Fixed-size data format | -- | -- | -k f |
| No | pdrorg-output binary format | -- | -- | -k d |
| Yes | DAT format | -k f or -k c | -k f or -k c | -k f |
| Yes | Binary format | -k f, -k c, or -k d | -k f, -k c, or -k d | -k v or -k d |
| Yes | Fixed-size data format | -k f or -k c | -k f or -k c | -k f |
| Yes | pdrorg-output binary format | -k d | -k d | -k d |
--: There is no need to specify the -k option.
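The criteria table can also be encoded as a lookup, which is handy when validating job scripts. This mapping simply transcribes the table; a missing entry corresponds to "--", meaning the -k option need not be specified.

```python
# (LOB column present?, input file format, abstract-data-type case) -> valid -k values.
# adt cases: "none", "no_lob_param", "lob_param". Missing keys mean "--".
VALID_K = {
    (False, "dat",    "lob_param"): ("f",),
    (False, "binary", "lob_param"): ("v", "d"),
    (False, "fixed",  "lob_param"): ("f",),
    (False, "pdrorg", "lob_param"): ("d",),
    (True,  "dat",    "none"): ("f", "c"),
    (True,  "dat",    "no_lob_param"): ("f", "c"),
    (True,  "dat",    "lob_param"): ("f",),
    (True,  "binary", "none"): ("f", "c", "d"),
    (True,  "binary", "no_lob_param"): ("f", "c", "d"),
    (True,  "binary", "lob_param"): ("v", "d"),
    (True,  "fixed",  "none"): ("f", "c"),
    (True,  "fixed",  "no_lob_param"): ("f", "c"),
    (True,  "fixed",  "lob_param"): ("f",),
    (True,  "pdrorg", "none"): ("d",),
    (True,  "pdrorg", "no_lob_param"): ("d",),
    (True,  "pdrorg", "lob_param"): ("d",),
}

def valid_k(has_lob_column, file_format, adt_case):
    """Return the permitted -k values, or None when -k need not be specified."""
    return VALID_K.get((has_lob_column, file_format, adt_case))
```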
(b) Rules
- If you are creating a LOB column structure base table separately from LOB columns, specify the same options when creating them.
- The contents of the input data file and LOB input files depend on the -k option specification, as follows:
| -k option | Input data file: data stored in LOB column | Input data file: abstract data type LOB parameter | LOB input file: data stored in LOB column | LOB input file: abstract data type LOB parameter |
|---|---|---|---|---|
| f | Specify the name of the LOB input file | Specify the name of the LOB input file | Provide as many files as there are data items to be stored | Provide as many files as there are data items to be stored |
| c | Specify dummy data | -- | Provide as many files as there are columns | -- |
| v | Specify the name of the LOB input file | Specify the contents of the LOB parameter | Provide as many files as there are data items to be stored | Not required |
| d | Specify the contents of the LOB data | Specify the contents of the LOB parameter | Not required | Not required |
--: Not applicable
- For data loading with the synchronization point specification, specify d in the -k option. You cannot specify f, c, or v.
(6) -c column-structure-information-filename
Specifies the name of a column structure information file.
For details about the column structure information file, see section 5.7 Column structure information file. Specify a column structure information file in the following cases:
- The order of columns in the input data file does not match the order of columns in the table.
- The number of columns in the input data file does not match the number of columns in the table.
- Data in the input data file is to be converted to the data type of the table columns before being stored.
- Any column data in the input data file that matches the null comparison value is to be stored as the null value.
- A constructor function is used to generate the values for the abstract data type columns.
- The number of repetition column elements in the input data file is less than the maximum number of elements specified during table definition.
(a) Rules
- If an input data file is in fixed-size data format, be sure to specify a column structure information file. Specification of a column structure information file is optional for an input data file in DAT or pdrorg-output binary format. In the case of an input data file in pdrorg-output binary format, you can specify a column structure information file to load data into a non-FIX table. An attempt to load data to a FIX table using a column structure information file results in an error. To modify the column structure of a FIX table using a column structure information file, treat the corresponding file as the input data file in the fixed-size data format.
- If an input data file is in the DAT format and satisfies all the following conditions, there is no need to specify a column structure information file:
- Columns of input data are in the same order as the order in which the table columns are defined.
- Input data has the same number of columns as the columns in the table.
- The constructor function that generates the data to be stored in the abstract data type has the same name as the column data type, and only one constructor function is used.
- The number of input data elements to be stored in a repetition column is the same as the maximum number of elements specified for the applicable columns during table definition.
- Provide the column structure information file in the host where the database load utility is executed (the pdload command is entered).
- A column structure information file and a null value/function information file are mutually exclusive.
(7) -v null-value/function-information-filename
Specifies the name of a null value/function information file.
You can use a null value/function information file to convert input data to a value that represents a null value or to specify information about a constructor function that generates values for the abstract data type columns.
For details about the null value/function information file, see section 5.8 Null value/function information file.
(a) Rules
- You can specify a null value/function information file if the input data file is in the binary format, not in the DAT or fixed-size data format.
- A null value/function information file and a column structure information file are mutually exclusive.
(8) -n {batch-output-local-buffer-sectors-count[,div] | batch-output-local-buffer-sectors-count,[div],local-buffer-sectors-count-for-random-access | ,,local-buffer-sectors-count-for-random-access}
Specifies the number of local buffer sectors to be used when data is imported into a table during data loading. If this option is specified but the number of batch output local buffer sectors or the number of local buffer sectors for random access is not specified, the corresponding local buffer is not used. If this option is omitted, the local buffer is not used. In such a case, the system uses the global buffer to output one page at a time.
- batch-output-local-buffer-sectors-count ~<unsigned integer> ((2-4096))
- Specifies the number of local buffer sectors for batch output. The batch output local buffer is used for data pages.
- For the number of batch output local buffer sectors, we recommend that you specify a value in the range from 16 to 32. Determine the maximum value to specify by using the formula 184 kilobytes ÷ page length.
- div
- Specify div when all the conditions listed below are applicable. If div is not specified when these conditions are all applicable, the number of input/output operations may increase, thereby affecting performance adversely.
- Data is to be loaded by table into a row partitioned table.
- Data is to be loaded into a row partitioned table that uses a hash function (HASH0-HASH6, HASHZ), or the key values of the input data that is stored in a table partitioned by key ranges are distributed randomly.
- There are multiple RDAREAs at one server in which the table is stored.
- When div is specified, the required memory size increases because the system allocates as many buffer sectors as there are table partitions in the server.
- local-buffer-sectors-count-for-random-access ~<unsigned integer> ((4-125000))
- Specifies the number of local buffer sectors for random access. The random access local buffer is used for index pages. Specify this operand when you create indexes in the index update mode. In the index update mode, index pages with a B-tree structure are referenced each time a row is stored. If you use the random access local buffer, you can reduce the effects on online processing of global buffer contention because these index pages are placed in the corresponding buffer.
The table below shows the recommended -n option specification.
Table 5-5 Recommended -n option specification (pdload)
| Table type | Column definition | Table partitioned in the server | Table not partitioned in the server |
|---|---|---|---|
| FIX table | -- | -n x,div | -n x |
| Non-FIX table | A variable-length character string column whose length exceeds 256 bytes or a BINARY column is defined | -n is omitted | -n is omitted |
| Non-FIX table | Other | -n x,div | -n x |
- Legend:
- x: Number of batch output local buffer sectors
- --: Not applicable
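The sizing guidance above (a recommended range of 16 to 32 sectors, and a maximum of 184 kilobytes divided by the page length, within the option's allowed range of 2 to 4,096) can be sketched as a helper. Treating a kilobyte as 1,024 bytes is an assumption here:

```python
RECOMMENDED_RANGE = (16, 32)  # recommended batch output sector count

def max_batch_output_sectors(page_length_bytes: int) -> int:
    """Maximum -n batch-output sector count: 184 KB / page length,
    clamped to the option's allowed range of 2-4096."""
    return max(2, min(184 * 1024 // page_length_bytes, 4096))
```

With a 4,096-byte page, for instance, the maximum works out to 46 sectors, comfortably above the recommended 16-32 range.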
(a) Buffer used by pdload
- When the -n option is omitted, the system uses the global buffer. In this case, the transaction performance of a UAP that uses the global buffer drops because a large amount of global buffer space is used during data loading. Specifying the -n option eliminates such buffer contention. The figure below shows the relationship between pdload and the buffer.
Figure 5-12 Relationship between pdload and buffer
![[Figure]](figure/zu050820.gif)
- Explanation
- If only the global buffer is used (-n option is omitted), buffer contention occurs between pdload and the UAP.
- When both local and global buffers are used (-n option is specified), buffer contention does not occur between pdload and the UAP. However, for data loading to LOB columns, the system uses the global buffer even when the -n option is specified.
- If only the number of batch output local buffer sectors is specified in the -n option, the system uses a single batch output local buffer per RDAREA. If there are multiple RDAREAs, buffer contention occurs because only one batch output local buffer is used. If buffer contention occurs, the number of input/output operations increases, thereby affecting performance adversely. In such a case, specify div. When div is specified, buffer contention will not occur because the system allocates as many batch output local buffers as there are RDAREAs (one batch output local buffer per RDAREA).
- Even when the batch output local buffer is specified, the system may use the global buffer depending on the conditions. The table below describes the relationships between the conditions and the buffer that is used.
Table 5-6 Relationship between conditions and the buffer that is used (pdload)
In the table, A through E identify the buffer that is used:
- A: Global buffer (random access local buffer not specified)
- B: Batch output local buffer (random access local buffer not specified)
- C: Global buffer (random access local buffer specified)
- D: Batch output local buffer (random access local buffer specified)
- E: Random access local buffer

| Page type | Condition | A | B | C | D | E |
|---|---|---|---|---|---|---|
| Data page (user RDAREA) | FIX table: rows can no longer be stored in the specified segments of a table with clustering specified | Y | N | Y | N | N |
| Data page (user RDAREA) | FIX table: other | N | Y | N | Y | N |
| Data page (user RDAREA) | Non-FIX table: variable-length character string or BINARY columns whose column length exceeds 256 bytes are defined | Y | N | Y | N | N |
| Data page (user RDAREA) | Non-FIX table: abstract data type columns or repetition columns are defined and a row of data cannot fit in one page | Y | N | Y | N | N |
| Data page (user RDAREA) | Non-FIX table: rows can no longer be stored in the specified segments of a table with clustering specified | Y | N | Y | N | N |
| Data page (user RDAREA) | Non-FIX table: other | N | Y | N | Y | N |
| Data page (user LOB RDAREA) | RDAREA for storing LOB columns | Y | N | Y | N | N |
| Data page (user LOB RDAREA) | RDAREA for storing LOB attributes | Y | N | Y | N | N |
| Index page | Key values are searched from the table (-i c or -i n specified) | N | Y | N | Y | N |
| Index page | Indexes are created concurrently (-i s specified) | Y | N | N | N | Y |
| Index page | Batch index creation is executed (-i c specified) | Y | N | N | N | Y |
| Directory page | (all cases) | Y | N | Y | N | N |
- Legend:
- Y: Used
- N: Not used
(b) Rules
- If the global buffer is not sufficient for a rebalancing table, performance may drop considerably. Therefore, if there is no column whose definition length exceeds 256 bytes, we recommend that you specify the number of batch output pages.
If the local buffer specification is not effective for a reason such as the existence of a column whose definition length is 256 bytes or greater, allocate at least the following number of global buffer sectors:
Number of buffer sectors required per RDAREA
= (1,024 ÷ number of table storage RDAREAs) × 2 + 3
- For a rebalancing table with FIX hash partitioning, the system allocates a buffer equivalent to the specified number of pages for each hash group, thereby consuming more memory than for a table that is not partitioned or partitioned by other conditions.
(9) -u authorization-identifier
Specifies the authorization identifier of the user executing the database load utility.
For the default value, see (b) Default value.
When this option is specified, a message requesting password entry is displayed. If no password is required, enter null in response to the message. The utility checks the entered authorization identifier and password to verify that the user is authorized to establish a connection with the HiRDB system and to access the table.
(a) Criterion
Specify this option to use an authorization identifier that is not defined in the PDUSER environment variable.
(b) Default value
When this option is omitted, the system assumes the authorization identifier and password as follows:
- The system assumes the value of the PDUSER environment variable at the time the database load utility executes. Be sure to specify PDUSER if you are executing the utility in the background with & attached from the shell, or in a remote shell environment in which a password cannot be entered. Following are examples of setting the PDUSER environment variable:
- Examples for C shell:
- Specifying a password:
- setenv PDUSER '"authorization-identifier"/"password"'
- Not specifying a password:
- setenv PDUSER '"authorization-identifier"'
- If the PDUSER environment variable is not set, the system assumes the login window's user name. Enter the password when a message is displayed requesting password entry. If no password is required, enter null in response to the message.
(c) Rules
- Do not specify this option if you are executing the utility in the background with & attached by the shell, or in a remote shell environment in which a password cannot be entered.
- If you enclose an authorization identifier in double quotation marks ("), the system treats it as being case sensitive; otherwise, the system treats it as in all uppercase letters. If you use the Bourne shell (sh), C shell (csh), or Korn shell (ksh), you need to enclose the authorization identifier in single quotation marks (').
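The quoting rules above can be sketched for the Bourne shell as well; this mirrors the C-shell examples given earlier, and the credentials user01/pass01 are placeholders, not values from this manual.

```shell
# Hypothetical Bourne-shell (sh/ksh) counterpart of the C-shell examples.
# The single quotation marks preserve the inner double quotation marks,
# which make the authorization identifier case sensitive.
PDUSER='"user01"/"pass01"'
export PDUSER
echo "$PDUSER"
```

Without the outer single quotation marks, the shell would strip the double quotation marks and the identifier would be treated as all uppercase.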
(10) -x
When a cluster key has been defined for the table, this option specifies that data loading is to be performed in the order of the input data without checking to see whether the input data is in ascending or descending order of the cluster key values.
When this option is omitted, the utility performs cluster key checking and treats any out-of-order input data as an error.
(a) Criterion
If you know that the input data is sorted by the cluster key, specify this option to reduce the overhead of cluster key checking.
(b) Note
- For a table for which a UNIQUE cluster key has been defined, checking of duplicate keys is not available; therefore, an attempt to store data with a duplicate key will result in an error (rollback).
- This option enables input data that is not sorted by the cluster key to be stored in a table for which a cluster key has been defined; however, it does not provide the clustering effect.
(11) -f input-data-file-type-or-LOB-input-file-type
Specifies easymt if the input data files or the LOB input files are EasyMT.
(a) Rules
- If you are creating a LOB column structure base table and LOB columns at the same time, you need MTguide for mounting operations to use EasyMT.
- If the input data files are EasyMT and the LOB input files are regular files (or vice versa), you need to perform data loading separately for the LOB column structure base table and the LOB columns.
- When data is loaded to a table containing abstract data type columns, if LOB data is present as a parameter for the function that generates the data to be stored in an abstract data type column, the LOB input file cannot be assigned to EasyMT. In this case, you cannot specify this option.
(12) -s separator-character
~<character string>
For an input data file in DAT format, this option specifies the separator character to be used as the delimiter between data items. When this option is omitted, the comma (,) is assumed.
For an input data file in the binary or fixed-size data format, this option, if specified, is ignored.
(a) Criterion
Specify this option to use a character other than the comma (,) as the separator between data items in the input data.
(b) Rules
- You can specify the tab symbol and the pipe symbol (|) as separator characters if they do not occur in the input data. However, so they can be recognized by the shell that processes command line input, these special characters should be enclosed in double quotation marks (").
- None of the following characters can be specified as the separator character:
- Uppercase letters (A-Z) and lowercase letters (a-z)
- Numeric characters (0-9)
- The following characters that are used as reserved characters by the utility:
Asterisk (*), double quotation mark ("), underline (_)
- The following characters are not suitable for use as separator characters because they can occur in the input data. Note also that a separator character must be a single-byte character; you cannot specify a double-byte character as a separator character.
- Signs for numeric data input (+ or -)
- Characters that can occur as Japanese input characters:
|, \, [, ], (, ), {, },![[Figure]](figure/zueng045.gif)
- The hyphen (-) for date data input
- The colon (:) for time data input
- The period (.) for time and date interval data input
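As a sketch of the rules above, the following creates a DAT-format input file that uses the pipe symbol (|) as the separator; the table name, file paths, and data values are hypothetical, and the pdload invocation is shown only as a comment.

```shell
# Hypothetical DAT-format input file with | as the separator character.
cat << 'EOF' > /tmp/stock.dat
101|PENCIL|120
102|ERASER|60
EOF
# The separator is enclosed in double quotation marks so that the shell
# does not interpret the pipe (hypothetical invocation):
#   pdload -d -s "|" stock /tmp/stock_control
cat /tmp/stock.dat
```

Note that this only works because the pipe symbol does not occur anywhere in the data values themselves.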
(13) -e
Specifies that processing is to be cancelled if an error is detected in the input data.
To cancel processing and ignore the storage processing that had been performed up to that point (to roll back), specify the dataerr operand in the option statement.
If you omit both the -e option and the dataerr operand in the option statement, the utility continues processing and stores only normal data in the database even if it detects invalid input data.
When the -e option is specified, the system outputs the cause of error to the error information file only for the first data item resulting in an error. When the -e option is omitted, the system outputs the cause of error for all data resulting in an error.
(a) Criteria
Specify this option to cancel processing as soon as the first error is detected in the input data.
Suppose that you are loading data to a repetition column using an input data file in DAT format. Although the input data is created in VV format, it is treated as FF format if the elmtype operand is omitted from the array statement, in which case all input data results in an error and the system creates an unneeded error information file as well as an error data file. In this case, if you specify the -e option, the system cancels the processing when it detects the first error, thereby avoiding the creation of unneeded files.
(b) Rules
- If the following conditions are satisfied, the utility cancels processing and rolls back regardless of the specification of this option:
| Condition | Reason |
|---|---|
| In a binary-format input data file, the length field of variable-length data is a negative value. | The binary format has no delimitation between rows, so row data is identified by dividing the data by column length. Because the data cannot be divided using a negative value, the utility cannot continue processing. |
| The dataerr operand is specified in the option statement. | When the dataerr operand is specified, the utility rolls back the processing, so continuing with the processing serves no purpose. |
- Specify the name of the error information file in the control information file. If there is no such specification, the database load utility creates the file in the directory at the host that contains the input data file; in this case, the database load utility assigns a file name and displays it in a message. For details about the error information file name, see 5.4.4(2)(e) error=error-information-file-name.
- In a HiRDB parallel server configuration, if the server where the input file was prepared (the server specified in the source statement) is different from the server containing the table subject to data loading, or, in a HiRDB single server configuration, if the input file was prepared on a utility special unit, the system cannot cancel the processing when it detects either of the following errors:
- Duplicate key error on a unique index key or a primary key index
- Invalid value for an abstract data type column
If you specify the -e option to execute data loading in units of tables into a table that is divided into multiple servers in a HiRDB parallel server configuration, the system cannot guarantee the extent to which input row data has been stored in a table in the event of an error. Therefore, do not specify the -e option if all of the following conditions are met:
- HiRDB parallel server configuration
- Data loading in units of tables into a table divided into multiple servers
- Tables with a unique key index or primary key index defined, or tables containing columns of abstract data type
- If the dataerr operand is specified in the option statement and error data is detected, the utility cancels processing even when the -e option is omitted.
(14) -r input-begin-line
~<unsigned integer> ((2-4294967295)) <<1>>
Specifies that data input is to begin at the specified line, not the beginning of the input data file.
(a) Criteria
If a data loading process with the -e option specified has been cancelled, specify this option to restart the data loading.
(b) Notes
- When a UOC is used for data loading (to read the input data file), the -r option, if specified, is ignored.
- If you re-execute data loading with a synchronization point specified using the -r option after the initial data loading rolled back due to abnormal termination, the system determines the input begin line as follows:
| Status | Input begin line during re-execution |
|---|---|
| -r input-begin-line > line saved during the data loading with a synchronization point specified | -r input-begin-line |
| -r input-begin-line = line saved during the data loading with a synchronization point specified | Line saved during the data loading with a synchronization point specified# |
| -r input-begin-line < line saved during the data loading with a synchronization point specified | Line saved during the data loading with a synchronization point specified# |
- #: The KFPL00810-I message is displayed, indicating the re-execution of pdload.
(15) -z
Specifies that variable-length character string data, variable-length national character string data, and variable-length mixed character string data with a length of 0 is to be stored.
(a) Criteria
Specify this option to avoid storing the null value or single-byte space (one space character) for variable-length character string data, variable-length national character string data, or variable-length mixed character string data.
(b) Notes
- For data in the DAT format, the system handles data with a length of zero as follows:
..., " ", ... (Treated as data with a length of 0)
...,, ... (Treated as the null value)
- For data in the binary format, the system sets 0 as the actual data length and treats data without the actual data section as having a length of 0.
(16) -y
Specifies that data is to be stored in unused area in used pages during data loading if all unused pages become completely full. When this option is specified, the system displays the KFPH26010-I message before storing data in the unused area.
(a) Criteria
Normally, when the unused pages are used up during data loading, an error results and the processing is rolled back. To avoid this, specify this option so that data loading can be completed by storing data in unused area.
(b) Rules
- You cannot specify p (pre-update log acquisition mode) with the -l option.
- Once an unused page has been completely used, the specified percentage of unused space per page is not applicable to the data storage.
- The -n option, if specified, has no effect after an unused page has been used up.
(17) -o
Specifies that the index information file specified in the index statement is to be deleted automatically after batch index creation terminates normally. Note that the directory specified in the idxwork statement and the index information file in the /tmp directory that was created automatically by pdload are deleted after batch index creation terminates normally, regardless of the specification of this option.
(a) Criterion
An index information file created for a plug-in index tends to be large and occupies a large amount of disk space if it is kept on disk. To ensure that such index information files are deleted, specify the -o option to delete them automatically after batch index creation terminates normally.
(b) Rules
For data loading in units of RDAREAs, it is necessary to execute pdrorg (-k ixmk) using the index information file output after data loading for the following indexes. Therefore, the -o option, if specified, is ignored in this case.
- There are multiple table storage RDAREAs in a single server, but there is only one index storage RDAREA for them.
(18) [-m[ progress-message-output-interval][,information-message-output-suppression-level]]
- progress-message-output-interval ~<unsigned integer> ((1-1000)) <<10>>
- Specifies, in units of 10,000 lines, the interval at which the message that indicates the progress of the current process is to be displayed. If you specify lvl2 in the -m option, this message will also be output to the work file for troubleshooting information.
- Criterion
- You need to specify a value only if the default value of 100,000 lines makes the message interval too long or too short.
- Notes
- During batch index creation, the system ignores this option and displays messages only when index creation begins and ends.
- information-message-output-suppression-level ~<<lvl0>>
- Specifies a value that controls suppression of output of information messages to the standard output at the terminal that executed the utility, to the syslogfile, and to the message log file. Even when output is suppressed to all three (the standard output, the syslogfile, and the message log file), troubleshooting information is output to a work file. The table below shows the options that can be specified and the message output destinations.
Table 5-7 Option values and message output destinations
| Option value | Description | Standard output | syslogfile and message log file | Work file |
|---|---|---|---|---|
| lvl0 | Output is not suppressed | Y | Y | N |
| lvl1 | Output to the standard output is suppressed | N | Y | N |
| lvl2 | Output to the standard output, syslogfile, and message log file is suppressed | N | N | Y |
- Legend:
- Y: Messages are output
- N: Messages are not output
- Because messages are always output to a work file for troubleshooting purposes (to check utility operation), we recommend that under normal circumstances you specify lvl1, unless there is some special need to check messages.
- Work file
- When you specify lvl2 in the -m option, a work file is created for each server each time the utility executes. Therefore, we recommend that you specify a directory in the pd_tmp_directory operand so that the work files will be deleted periodically by HiRDB with the pdcspool command. The table below shows the work file output destinations.
Table 5-8 Work file output destinations
| pd_tmp_directory operand in the system definition | TMPDIR environment variable | Work file output destination |
|---|---|---|
| Specified | -- | Directory specified in the pd_tmp_directory operand |
| Omitted | Specified | Directory specified in the TMPDIR environment variable |
| Omitted | Omitted | /tmp directory |
- Legend:
- --: Not applicable
- If output of information messages to the work file fails due to a file open error or an I/O error, an error message is displayed, but processing continues.
- HiRDB creates the work files automatically. The file name is always in the format PDLOAD-aa-bb:
- PDLOAD: Fixed prefix indicating that the message was output by pdload.
- aa: Process's server name if the inner replica facility is not used. If the inner replica facility is used, this character string consists of the process's server name, GN, and the generation number.
- bb: File creation time and process ID.
(19) -X response-monitoring-time-for-server-to-server-communication
~<unsigned integer> ((1-65535)) <<300>>
If an error (such as a communication error) occurs at the server where the command was executed, the command may stop responding, and the application may stall. To help you detect such errors, pdload enables you to monitor the response time of communications during dictionary manipulation performed by the command.
The -X option specifies the response monitoring time for dictionary manipulation (in seconds). If the execution time during dictionary manipulation exceeds the value set in the -X option, pdload assumes a dictionary access error and cancels processing with return code 8.
- Criteria
- If you want to detect an error in less time than 300 seconds in the event of a no-response from the server due to a communication error or unit down, specify a value that is smaller than 300 in the -X option.
- If the system switchover facility is used, the command may keep waiting for a response even though system switchover has been completed. In such a case, you can terminate the command immediately by reducing the monitoring time.
- The specified monitoring time may result in a timeout if a response from the dictionary is delayed and the utility's preprocessing is not completed within 300 seconds (which is the default value for the -X option). This can happen when many applications and utilities are executing concurrently. In such an environment, specify a value greater than 300 in the -X option.
(20) -q generation-number
~<unsigned integer> ((0-10))
Specifies the generation number of the RDAREA subject to data loading when the inner replica facility is used.
Specify the generation number as follows:
- 0: Original RDAREA is to be subject to data loading.
- 1 to 10: Replica RDAREA generation that is to be subject to data loading.
(a) Criteria
Specify this option if you are using the inner replica facility and loading data to an RDAREA other than the current RDAREA.
(b) Rules
- When this option is omitted, the current RDAREA becomes subject to data loading.
- If a replica RDAREA is subject to data loading in units of RDAREAs, specify the name of the original RDAREA in the source statement and specify the generation number subject to data loading in the -q option.
- pdload checks the generation of the RDAREAs subject to data loading. It performs the checks described below; if it detects an error, it displays a message and terminates with return code 8.
- Checking as to whether the generation numbers match between row-partitioned tables:
- When data loading is executed on a row-partitioned table, the utility checks the generation of the target RDAREA.
- When the -q option is specified, the utility checks each storage target for the RDAREA of the specified generation. When the -q option is omitted, the utility checks each current RDAREA at the storage target to determine whether their generation numbers match.
- Checking as to whether the generation numbers of the RDAREAs for storing the table and index match:
- The utility checks the generation numbers of the RDAREAs for storing the table and index.
- When the -q option is specified, the utility checks the RDAREAs subject to data loading for the specified generation. When the -q option is omitted, the utility checks all current RDAREAs subject to data loading to determine whether their generations are the same.
- Checking as to whether the replica status of the target RDAREAs is the same:
- When the -q option is specified, the utility checks the target RDAREAs to determine whether the current RDAREAs are intermixed with non-current RDAREAs.
(21) -K{ i | f }
Specifies the format for the input data values for XML-type parameters for all XML-type columns defined in the table. If no XML-type column is defined for the table, specifying this option results in an error.
- i:
- Specifies that the input data values for XML-type parameters are specified directly in the input data file.
- f:
- Specifies that a file is provided for each input data value for an XML-type parameter and specifies the corresponding file path name in the input data file.
(a) Format of input data value
The table below describes the format for the input data values that you can select when you perform data loading on a table in which an XML-type column is defined. You use the -K option to specify the format for the input data values.
Table 5-9 Format of input data values when data loading is performed on a table for which an XML-type column is defined
| Input data format | Description | Remarks |
|---|---|---|
| Embedded format (-K i specified) | XML documents are embedded in the input data file in the same input file format as the other column data. | The data must be specified in BINARY format in the input data file. |
| File path specification format (-K f specified) | An individual file is provided for each XML data value, and the corresponding file path name is specified in the input data file. | The file path name must be specified in BINARY or CHAR type in the input data file. |
(b) Permitted input data file formats
The table below shows the input data file formats that you can specify when you load XML documents and data in the ESIS-B format.
Table 5-10 Permitted input data file formats
The column headings combine the format of the input data value (-K option value) and the input data type (-G option value).

| Input data file format | -K i, -G esisb | -K i, -G xml | -K f, -G esisb | -K f, -G xml |
|---|---|---|---|---|
| DAT format | N | N | Y | Y |
| Extended DAT format | N | N | Y | Y |
| Fixed-size data format (CHAR type specified) | N | N | Y | Y |
| Fixed-size data format (BINARY type specified) | Y | Y | Y | Y |
| Binary format | Y | Y | N | N |
| pdrorg-output binary format | Y | N | N | N |
- Legend:
- Y: Can be specified in pdload
- N: Cannot be specified in pdload
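To make the file path specification format concrete, the sketch below prepares a DAT-format input file for -K f with -G xml: each XML value in the input data file is the path of a separate file holding the XML document. The table columns, file names, and data are hypothetical.

```shell
# Hypothetical XML document stored in its own file (-K f, -G xml).
printf '<product><name>pencil</name></product>' > /tmp/p101.xml
# The DAT-format input file carries the file path, not the XML itself.
cat << 'EOF' > /tmp/products.dat
101,/tmp/p101.xml
EOF
cat /tmp/products.dat
```

With -K i instead, the XML document itself would have to be embedded in the input data file in BINARY format, which per Table 5-10 rules out the DAT format.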
(22) -G{ xml | esisb }
Specifies the data type (XML document or ESIS-B format) of XML data that is specified for the input data file for all XML-type columns defined in the table. If no XML-type column is defined in the table, specifying this option results in an error.
- xml:
- Specifies that the data specified as the XML-type parameters in the input data file constitutes an XML document.
- esisb:
- Specifies that the data specified as the XML-type parameters in the input data file is in ESIS-B format.
(a) Input data type
You can select the data listed below as the XML-type input data. You use the -G option to specify the input data type.
- XML document
XML text data created by the user
- ESIS-B-format data
Data in the format produced when XML documents are parsed by the XML conversion command (phdxmlcnv) or the XML conversion library.
(b) Specifying a column structure information file and a null value/function information file used for loading XML-type data
If you use a column structure information file or a null value/function information file to load data into a table for which an XML-type column is defined, you must specify the constructor functions shown below.
The table below shows the constructor functions to be specified when you use a column structure information file.
- DAT format and extended DAT format:
  - ESIS-B format (-G esisb): column-name,func=(XML,param=binary)[,filedir=path-name]
  - XML document (-G xml): column-name,func=(XMLPARSE,param=integer,param=binary,param=integer)[,filedir=path-name]
- Fixed-size data format (CHAR type specified):
  - ESIS-B format (-G esisb): func=(XML,param=binary,type=char())[,filedir=path-name]
  - XML document (-G xml): column-name,func=(XMLPARSE,param=integer,type=integer,param=binary,type=char()[,null=(start-location,[{c|x}]'comparison-value')][,filldata=filler-data],param=integer,type=integer)[,filedir=path-name]
- Fixed-size data format (BINARY type specified):
  - ESIS-B format (-G esisb): func=(XML,param=binary,type=binary())[,filedir=path-name]
  - XML document (-G xml): column-name,func=(XMLPARSE,param=integer,type=integer,param=binary,type=binary()[,null=(start-location,[{c|x}]'comparison-value')],param=integer,type=integer)[,filedir=path-name]
- Binary format: cannot be specified.
- pdrorg-output binary format:
  - ESIS-B format (-G esisb): column-name,func=(XML,param=binary)
  - XML document (-G xml): cannot be specified.
- Note:
- If the input data type is an XML document, specify input data only in the second argument of the XMLPARSE function. For details about how to specify the data, see 5.5.1 DAT format or 5.5.3 Fixed-size data format. pdload sets data values in the first and third arguments.
The table below shows the constructor function to be specified when you use a null value/function information file.
- DAT format, extended DAT format, and fixed-size data format (CHAR or BINARY type specified): cannot be specified.
- Binary format:
  - ESIS-B format (-G esisb): func=(XML,param=binary[,null=(start-location,[{c|x}]'comparison-value')])
  - XML document (-G xml): func=(XMLPARSE,param=integer,param=binary[,null=(start-location,[{c|x}]'comparison-value')],param=integer)
- pdrorg-output binary format: cannot be specified.
- Note:
- If the input data type is an XML document, specify input data only in the second argument of the XMLPARSE function. For details about how to specify the data, see 5.5.2 Binary format. pdload sets data values in the first and third arguments.
(23) -F
Specifies that when an input data value of the FLOAT or SMALLFLT type is less than the minimum value or greater than the maximum value supported by the OS, the input data value is to be corrected so that it is within the OS-supported value range. The figure below shows the range of input data value correction when the -F option is specified.
Figure 5-13 Range of input data value correction when the -F option is specified
![[Figure]](figure/zu050890.gif)
(a) Rules
- If an input data value is less than the OS's minimum value and the -F option has been omitted, the input data value is used as is. If an input data value is greater than the OS's maximum value when the -F option has been omitted, an error results.
- The -F option is applicable to the following input data formats:
- DAT format
- Extended DAT format
- Fixed-size data format
- pdload unload file in DAT format
- pdload unload file in extended DAT format
- pdload unload file in fixed-size data format
(24) -E
Specifies that data loading is to be performed forcibly on a table to be expanded to the memory database.
Normally, the database load utility cannot be used to perform data loading on a table to be expanded to the memory database. Specifying this option enables you to forcibly perform data loading on a table to be expanded to the memory database.
(25) [authorization-identifier] table-identifier
Specifies the table identifier of the table subject to data loading. When the authorization identifier is omitted, the system assumes the user name used to establish connection with HiRDB.
(a) Rule
If you enclose an authorization identifier in double quotation marks ("), the system treats it as case sensitive; otherwise, the system treats it as in all uppercase letters. If you use the Bourne shell (sh), C shell (csh), or Korn shell (ksh), you need to enclose the authorization identifier in single quotation marks (').
(26) control-information-filename
Specifies the name of the control information file that contains the control statements of the database load utility.
You can specify any of the control statements listed below in the control information file. For details about the control statements, see sections 5.4.3 through 5.4.16. Note that comments cannot be specified in control information files.
- mtguide statement (specification of MT information)
- emtdef statement (specification of MT information)
- source statement (specification of input data file information)
- index statement (specification of index information file information)
- idxwork statement (specification of index information file directory information)
- sort statement (specification of sort work directory information)
- lobdata statement (specification of LOB input file information)
- lobcolumn statement (specification of LOB column input file information)
- lobmid statement (specification of LOB middle file information)
- srcuoc statement (specification of UOC storage library information)
- array statement (specification of array data format for a table containing repetition columns)
- extdat statement (specification of information about the extended DAT format)
- src_work statement (specification of the output destination of divided-input data files)
- constraint statement (specification of check pending status)
- option statement (specification of data processing information)
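Combining a source statement with an option statement (the nowait operand appears earlier in this section), a minimal control information file might be sketched as follows. The exact operand syntax is defined in 5.4.3 onward; the file path and statement lines here are illustrative placeholders only, not verified HiRDB syntax.

```shell
# Hypothetical minimal control information file for pdload.
# Statement wording is illustrative; see 5.4.3 onward for real syntax.
cat << 'EOF' > /tmp/load_control
source /data/stock.dat
option nowait=yes
EOF
# The file name would then be passed as the final pdload argument:
#   pdload -d stock /tmp/load_control    (hypothetical invocation)
wc -l < /tmp/load_control
```

Remember that comments cannot be specified in control information files, so the file must contain only control statements.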
(a) Number of specifiable control statements
The following table shows the maximum number of control statements permitted in a control information file:
| Control statement | Maximum number of control statements |
|---|---|
| mtguide statement | 1 |
| emtdef statement | 1 |
| source statement | 1 |
| index statement | Number of indexes × number of storage RDAREAs |
| idxwork statement | Number of index storage servers |
| sort statement | Number of index storage servers |
| lobdata statement | 1 |
| lobcolumn statement | Number of LOB columns |
| lobmid statement | 1 |
| srcuoc statement | 1 |
| array statement | 1 |
| extdat statement | 1 |
| src_work statement | 1 |
| constraint statement | 1 |
| option statement | 1 |
(b) Relationship between control statements and options
The following shows the relationship between control statements and options.
- Creating an index
Specify the following control statements according to the value of the -i option:

| Control statement | -i c | -i n | -i s | -i x |
|---|---|---|---|---|
| source statement | R | R | R | R |
| index statement | O | O | -- | -- |
| idxwork statement | O | O | -- | -- |
| sort statement | O | -- | -- | -- |

R: Required
O: Optional
--: Not required
- Entering LOB data and parameters
When you enter LOB data and parameters, the control statements to be specified depend on whether or not the LOB column structure base table and LOB columns are created at the same time.
In the table, B+L means loading data to the LOB column structure base table and LOB columns, B means loading data to the LOB column structure base table only, and L means loading data to LOB columns only.

| Control statement | B+L -k f | B+L -k c | B+L -k v | B+L -k d | B -k f | B -k c | L -k f | L -k c |
|---|---|---|---|---|---|---|---|---|
| source statement | R | R | R | R | R | R | -- | -- |
| lobdata statement | R | R | R | -- | -- | -- | R | O |
| lobcolumn statement | -- | R | -- | -- | -- | -- | -- | R |
| lobmid statement | R# | R | R# | -- | R# | R | R# | R |

R: Required
O: Optional
--: Not required
#: For a LOB parameter to a constructor function that generates data for abstract data type columns, this statement, if specified, is ignored.
- Using EasyMT
Specify the following control statements in the -f option:
| Control statement | Coding | -f easymt specified | -f omitted |
|---|---|---|---|
| mtguide statement | EasyMT information | O | -- |
| emtdef statement | EasyMT information | O | -- |
| source statement | Input data file information | R | R |

R: Required
O: Optional
--: Not required
(c) Files and directories specified in the control statements
The following rules apply to specifying files and directories in the control statements:
- Access privileges must be granted to the HiRDB administrator.