2.16 Retrieving data from CSV files
HADB enables you to use the SELECT statement to retrieve data in CSV files (CSV-format files). For example, if your last year's sales management data is stored in a CSV file, you can retrieve the sales management data without having to import it to a table.
To do this, execute the SELECT statement with the path of the target CSV file specified. If you specify a search condition in the SELECT statement, you can retrieve the data that satisfies the condition in the same manner as when you retrieve data that has been imported to a table. The following figure provides an overview of data retrieval from CSV files.
- #
-
Data can be retrieved from CSV files that are not compressed or that are compressed in GZIP format.
- Explanation
-
-
Executes the SELECT statement with the CSV file path specified.
Specify the CSV file path in a function provided by HADB (ADB_CSVREAD function for retrieving data from CSV files).
-
The HADB server accesses the CSV file whose path is specified in the function.
-
The HADB server retrieves data from the CSV file and stores it temporarily as a table. This set of data stored temporarily in tabular format is called a table-function derived table.
-
The HADB server retrieves the table-function derived table according to the search condition specified in the SELECT statement and returns the retrieval results to the application program.
-
- ■ Example of SELECT statement with the ADB_CSVREAD function specified
-
This example retrieves data in the following CSV files that have been compressed in GZIP format:
-
/dir/file1.csv.gz
-
/dir/file2.csv.gz
-
/dir/file3.csv.gz
SELECT * FROM TABLE(ADB_CSVREAD( MULTISET['/dir/file1.csv.gz','/dir/file2.csv.gz','/dir/file3.csv.gz'], 'COMPRESSION_FORMAT=GZIP; FIELD_NUM=1,2; ENCLOSING_CHAR="; DELIMITER_CHAR=,;')) AS "T1" ("C1" INTEGER, "C2" VARCHAR(32))
- Explanation
-
The underlined parts indicate the information specified in the ADB_CSVREAD function.
The information specified in the ADB_CSVREAD function includes the absolute path names of the CSV files and the data to be retrieved from the CSV files.
For details about the ADB_CSVREAD function, see ADB_CSVREAD function under System-defined functions in Constituent Elements in the manual HADB SQL Reference.
-
- Important
-
-
The CSV files must meet any of the following conditions:
∙ CSV files that are not compressed
∙ CSV files that are compressed in GZIP format by using the OS command gzip
∙ CSV files that are compressed in GZIP format and exported by using the adbexport command
-
The data in such a CSV file cannot be updated.
-
Retrieval of data from a CSV file requires more time than retrieval of data that has been imported to a table because the CSV file must be analyzed.
-