3.3 Importing data

There are two ways to store data in a HiRDB table:

By table (xtrep command with the -j option omitted)
By RDAREA (xtrep command with the -j option specified)

(a) Storing data by table

This method uses the HiRDB table as the unit of storage. HiRDB Dataextractor assumes this method when the -j option is omitted from the xtrep command. If the target HiRDB table is row-partitioned, storing data by RDAREA instead of by table can reduce the storage time, because HiRDB Dataextractor can then store data in multiple RDAREAs at the same time.

(b) Storing data by RDAREA

This method uses the RDAREA as the unit of storage into a HiRDB table. HiRDB Dataextractor assumes this method when the -j option is specified in the xtrep command. If the target HiRDB table is row-partitioned, you can save processing time by using this method because you can start an instance of HiRDB Dataextractor for each RDAREA in order to execute multiple storage processes in parallel.
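The parallel execution described above can be sketched as a small driver that starts one xtrep instance per RDAREA. This is only an illustration under stated assumptions: it assumes that the -j option takes the RDAREA name as its argument, the remaining xtrep arguments are left to the caller, and the function names (build_command, dry_run, import_by_rdarea) are hypothetical. The default runner only prints each command and does not execute xtrep.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def build_command(rdarea: str, extra_args: Sequence[str] = ()) -> list[str]:
    # ASSUMPTION: -j is followed by the RDAREA name. All other xtrep
    # arguments come from the caller and are not spelled out here.
    return ["xtrep", "-j", rdarea, *extra_args]


def dry_run(command: list[str]) -> int:
    # Hypothetical runner: prints the command instead of executing it.
    print(" ".join(command))
    return 0


def import_by_rdarea(
    rdareas: Sequence[str],
    extra_args: Sequence[str] = (),
    runner: Callable[[list[str]], int] = dry_run,
) -> dict[str, int]:
    """Run one storage process per RDAREA in parallel; return exit codes."""
    commands = {name: build_command(name, extra_args) for name in rdareas}
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        futures = {name: pool.submit(runner, cmd) for name, cmd in commands.items()}
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    # RDAREA1 and RDAREA2 are placeholder names.
    print(import_by_rdarea(["RDAREA1", "RDAREA2"]))
```

To execute the commands for real, a runner such as `lambda cmd: subprocess.run(cmd).returncode` could be passed in place of dry_run, once the full xtrep argument list for your environment is known.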

HiRDB Dataextractor supports data storage by RDAREA for the following type of row-partitioning:

Key range partitioning

Figure 3-10 provides an overview of data storage by RDAREA, and Table 3-12 shows the handling of data storage by RDAREA.

Figure 3-10 Overview of data storage by RDAREA

[Figure]

Table 3-12 Handling of data storage by RDAREA

Row partitioning method	HiRDB's pdload processing	Handling during extraction	Handling during storage
Key range partitioning	Checks that the input data is within the specified RDAREA storage range before actually storing the data. If the data extends outside the storage range and an error data file is specified (-q option in xtrep command), outputs the corresponding row data to the error data file.	Specifies extraction conditions and extracts the data that satisfies the storage conditions for the specified RDAREA.	In the case of a row-partitioned table in the server, you must use HiRDB's database reorganization utility to create a non-partitioning key index after data storage; see Note below. You can use the error data file as an input file to pdload. If data that extends outside the storage range is output to the error data file, check the data and use pdload to store it in the table, if necessary.
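The storage range check in Table 3-12 can be pictured with the following sketch. It is a conceptual model only, not pdload's implementation: the KeyRange class, the integer keys, and the inclusive bounds are assumptions made for illustration. Rows whose key falls inside the RDAREA's range are stored, and the others are set aside, as they would be in the error data file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyRange:
    """Inclusive key range held by one RDAREA (hypothetical model)."""
    low: int
    high: int

    def contains(self, key: int) -> bool:
        return self.low <= key <= self.high


def split_by_range(rows, key_of, storage_range):
    """Return (rows to store, rows destined for the error data file)."""
    accepted, rejected = [], []
    for row in rows:
        target = accepted if storage_range.contains(key_of(row)) else rejected
        target.append(row)
    return accepted, rejected


rows = [(5, "a"), (150, "b"), (99, "c")]
stored, errors = split_by_range(rows, key_of=lambda r: r[0],
                                storage_range=KeyRange(0, 99))
print(stored)  # [(5, 'a'), (99, 'c')]
print(errors)  # [(150, 'b')]
```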

Note

Creating a non-partitioning key index for a row-partitioned table in the server
In the case of a row-partitioned table in the server, HiRDB Dataextractor assumes the index information output mode for a non-partitioning key index even if you specify c in the -i option of the xtrep command to store data by RDAREA. In such a case, you must use HiRDB's database reorganization utility (pdrorg) to create the non-partitioning key index in the batch mode (by specifying ikmk in the -k option of the pdrorg command). For details about the database reorganization utility, see the HiRDB Command Reference manual. In the case of a row-partitioned table between servers, an index is created according to the -i option specification in the xtrep command.
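The rule in this note can be summarized as a small decision function. This is a sketch of the rule as stated here, not of HiRDB Dataextractor's internals: the IndexMode names and the function are hypothetical, the requested mode stands for what was specified in the -i option (the batch creation case corresponds to c in the note), and all combinations other than the one described in the note are assumed to pass through unchanged.

```python
from enum import Enum


class IndexMode(Enum):
    BATCH_CREATION = "batch index creation"
    INFO_OUTPUT = "index information output"
    UPDATE = "index update"
    INFO_OUTPUT_SUPPRESSION = "index information output suppression"


def effective_index_mode(
    requested: IndexMode,
    by_rdarea: bool,
    partitioned_within_server: bool,
    non_partitioning_key_index: bool,
) -> IndexMode:
    """Mode actually applied to one index, following the note above."""
    overridden = (
        requested is IndexMode.BATCH_CREATION
        and by_rdarea
        and partitioned_within_server
        and non_partitioning_key_index
    )
    # ASSUMPTION: every other combination keeps the requested mode.
    return IndexMode.INFO_OUTPUT if overridden else requested


mode = effective_index_mode(IndexMode.BATCH_CREATION, True, True, True)
print(mode.value)  # index information output
```

When the function returns the index information output mode, the index still has to be created afterward with pdrorg, as described in the note.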

(2) Specifying the data import method

HiRDB Dataextractor uses HiRDB's database load utility (pdload) to import data. You can specify the following import methods using the options of the xtrep command:

Extracted data storage method (-d option)
You can specify a method for storing extracted data into a table. Two methods are available:
- Deleting all existing data from the table and then storing the extracted data into the table
- Adding the extracted data to the table, retaining the existing data
Index creation method (-i option)
You can specify an index creation method. Four methods are available:
- Batch index creation mode
  This method creates indexes in batch mode after creating the table.
- Index information output mode
  This method outputs only the index information to an index information file.
- Index update mode
  This method updates indexes each time a row of data is stored.
- Index information output suppression mode
  This method does not update an index.
Parameter specification for pdload command (-l option)
You can specify desired parameters of the pdload command. For details about the use of this function, see 4.2.3 Additional data extraction and import functions or 5.1.3 Additional data extraction and import functions.
Log acquisition method (-l option)
You can select a log acquisition method. By selecting an appropriate log to be acquired according to the operation, you can reduce the log output processing time.
Number of batch output pages (-n option)
You can specify the number of pages to be output to a table in batch mode. Specifying this information can improve import efficiency, because HiRDB Dataextractor then outputs the specified number of pages in batch mode.
Zero-length data storage (-z option)
Specify this option to allow HiRDB Dataextractor to store character data with a length of zero.

(3) Converting the data types

(4) Converting character codes

(5) Creating an output file

The xtrep command lets you output extracted data to a file before starting up pdload. This file is called an output file. The output file is in the binary format. For details about the binary format, see Table 4-16.

To output data to an output file, specify the -o or -O option according to the output file processing method to be used after data is stored in a table. There are two ways of processing the output file:

Saving the output file after storing data to a table (-o option)
To save the output file after completion of import processing, specify the -o option. The created output file will be retained after import processing. You can use this output file as a backup. To re-create a HiRDB table on the basis of this output file in the event of an error, use HiRDB's database load utility.
Deleting the output file after storing data to a table (-O option)
If you want to delete the output file after completion of import processing, specify the -O option.

If extracted columns include a BLOB column, HiRDB Dataextractor creates LOB input files as well as the output file. A LOB input file is created for each LOB data item. You can use the -b option to specify a directory for storing the LOB input files. The -o or -O option determines the LOB input file processing method after import processing. To store BLOB-column data in the same output file as for non-BLOB data without creating a LOB input file, specify the XTLOBKIND environment variable.

If an output file or LOB input file to be created already exists, the -y option determines whether or not to overwrite the existing file. When the -y option is specified, HiRDB Dataextractor deletes the existing file and then outputs data to a new file. When the -y option is omitted, HiRDB Dataextractor outputs a message and terminates the processing.
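The effect of the -y option can be illustrated with the following sketch. It models only the behavior described above (delete and re-create when overwriting is allowed, otherwise stop with a message) and is not HiRDB Dataextractor code. The function name, the exception type, and the message text are hypothetical.

```python
from pathlib import Path


class OutputFileExists(Exception):
    """Raised when the file exists and overwriting was not requested."""


def open_output_file(path: Path, overwrite: bool):
    """Open a new binary output file, honoring a -y style overwrite flag."""
    if path.exists():
        if not overwrite:
            # Corresponds to "outputs a message and terminates the processing".
            raise OutputFileExists(f"{path} already exists; processing stopped")
        # Corresponds to "deletes the existing file".
        path.unlink()
    # "xb" creates the file and fails if it unexpectedly still exists.
    return path.open("xb")
```

Deleting the old file before creating the new one, rather than truncating it in place, mirrors the order of operations given in the text.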

The output file and LOB input files are created on the host of the following HiRDB server on the target system:

HiRDB/Single Server
Server at the single server
HiRDB/Parallel Server
- -f option specified
  Front-end server (FES) or back-end server (BES) specified with the -f option
- -f option omitted
  In the case of data storage in units of tables, the first FES specified in the pdstart command in the HiRDB system common definitions (pdsys)
  In the case of data storage in units of RDAREAs, the back-end server (BES) that contains the RDAREA subject to data storage
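The location rules above amount to a short decision procedure, sketched below. The function, its parameter names, and the string values "single", "parallel", "table", and "rdarea" are hypothetical labels chosen for this illustration. The caller supplies the server names, which in practice come from the -f option, the pdsys definitions, and the RDAREA placement.

```python
from typing import Optional


def output_file_server(
    server_type: str,            # "single" or "parallel" (labels assumed here)
    unit: str,                   # "table" or "rdarea" (labels assumed here)
    f_option: Optional[str] = None,
    first_fes: str = "first FES in pdsys",
    rdarea_bes: str = "BES containing the RDAREA",
) -> str:
    """Return the server on whose host the output files are created."""
    if server_type == "single":
        return "single server"
    if server_type != "parallel":
        raise ValueError(f"unknown server type: {server_type}")
    if f_option is not None:
        # -f specified: the FES or BES named by the option.
        return f_option
    if unit == "table":
        return first_fes
    if unit == "rdarea":
        return rdarea_bes
    raise ValueError(f"unknown storage unit: {unit}")


print(output_file_server("parallel", "rdarea"))  # BES containing the RDAREA
```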

(6) Notes on data import

If data is imported into a HiRDB table under the following conditions and an error occurs during import processing, the original status cannot be restored:
- No file is created (-o or -O option omitted).
- Extracted data is added to the table, retaining the existing data.
To add data, execute data import as follows:
1. Specify the option in such a manner that files are created to import data to the table.
2. If you do not want to create a file, make a backup copy beforehand to protect against possible errors.
If pdload returns an error during data import, files that were created during pdload execution may remain. If the files are unnecessary, delete them. For details about the files that are created, see the HiRDB Command Reference manual.
To start HiRDB's database load utility (pdload) with HiRDB Dataextractor, specify pdload's -x option.
If this option is not needed, use the loader parameter specification function. For details about using this function, see (2) Loader parameter specification function in 4.2.3 Additional data extraction and import functions or (2) Loader parameter specification function in 5.1.3 Additional data extraction and import functions.
If a cluster key has been defined for the target table and data is to be imported in the order of the cluster key values, extract the data with the ORDER BY clause of the SELECT statement specified in the column name specification file for the column that corresponds to the source table's cluster key. If you use the code conversion function, the data may not be sorted in ascending (or descending) order because code conversion occurs after data extraction. In this case, use the database reorganization utility to reorganize the data after importing it.
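The following sketch shows why code conversion can disturb the sort order. The encodings are assumptions chosen only for illustration (Python's cp037 codec, an EBCDIC code page, for the source and ASCII for the target); they are not necessarily the codes your system converts between. Rows sorted by their byte values in the source code are no longer in order once their byte values in the target code are compared.

```python
def is_sorted(values, key):
    """True if values are in ascending order under the given key."""
    keys = [key(v) for v in values]
    return all(a <= b for a, b in zip(keys, keys[1:]))


source_rows = ["3c", "B2", "a1"]

# Extraction with ORDER BY: ordered by byte values in the source code.
extracted = sorted(source_rows, key=lambda s: s.encode("cp037"))
print(extracted)  # ['a1', 'B2', '3c']

# After code conversion, the same sequence is compared in the target code.
print(is_sorted(extracted, key=lambda s: s.encode("ascii")))  # False

# Order the target system would expect for ascending cluster key values.
print(sorted(extracted, key=lambda s: s.encode("ascii")))  # ['3c', 'B2', 'a1']
```

In this example the extracted order is exactly reversed in the target code, which is the kind of situation where reorganizing the data after import becomes necessary.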
