Hitachi

Hitachi Advanced Database SQL Reference


7.16.1 Specification format and rules for multiset value expressions

A multiset value expression is used to collect multiple element values into a single data set. A multiset value expression can be specified in the following locations:

Organization of this subsection

(1) Specification format

multiset-value-expression ::= {multiset-value-constructor-by-enumeration | multiset-value-constructor-by-query}
 
  multiset-value-constructor-by-enumeration ::= MULTISET[multiset-element[,multiset-element]...]
  multiset-value-constructor-by-query ::= MULTISET table-subquery

[Figure]

(2) Explanation of specification format

Important

To specify a multiset value expression in the ADB_AUDITREAD function, specify multiset-value-constructor-by-enumeration. You cannot specify multiset-value-constructor-by-query.

To specify a multiset value expression in the ADB_CSVREAD function, note the following:

  • To specify individual CSV file names in the ADB_CSVREAD function, specify multiset-value-constructor-by-enumeration.

  • To use a table subquery to determine the CSV file names to be specified in the ADB_CSVREAD function, specify multiset-value-constructor-by-query.

multiset-value-constructor-by-enumeration
MULTISET[multiset-element[,multiset-element]...]:
Important

[Figure]

■ To specify multiset-value-constructor-by-enumeration in the ADB_AUDITREAD function

For multiset-element, specify in the character string literal format the path names of the audit trail files to be specified in the ADB_AUDITREAD function. For details about character string literals, see 6.3 Literals.

The following is an example:

MULTISET['/audit/adbaud-201707*.aud','/audit/adbaud-201708*.aud']

The following rules apply:

  • A maximum of 1,000 multiset elements (path names of audit trail files) can be specified.

■ To specify multiset-value-constructor-by-enumeration in the ADB_CSVREAD function

multiset-element specifies the path name of a CSV file to be specified in the ADB_CSVREAD function in the form of a character string literal. For details about character string literals, see 6.3 Literals.

The following is an example:

MULTISET['/dir/file1.csv.gz','/dir/file2.csv.gz','/dir/file3.csv.gz']

The example above specifies three CSV files.

The following rules apply:

  • No more than 1,000 multiset elements (CSV file path names) can be specified.

multiset-value-constructor-by-query
MULTISET table-subquery:

Specifies the path names of the CSV files to be specified in the ADB_CSVREAD function in the form of a table subquery. For details about table subqueries, see 7.3 Subqueries.

The following is an example:

MULTISET (SELECT "FILE_NAME" FROM "FILELIST"
              WHERE "FILE_DATE" BETWEEN '2012/01/01' AND '2012/12/31')

The above example specifies CSV file names (FILE_NAME) for which the FILE_DATE column in the file management table (FILELIST) is between 2012/01/01 and 2012/12/31.

The following rules apply:

  • The result of the table subquery must be one column.

  • The table subquery cannot contain an external reference column.

    Example that generates an error:

    The underlined portion indicates the external reference column specification.

    SELECT * FROM "T0"
      WHERE EXISTS (SELECT * FROM "T1",
                      TABLE(ADB_CSVREAD(MULTISET (SELECT "T2"."C1"
                                                    FROM "T2"
                                                      WHERE "T2"."C2" = "T0"."C2"),
                                       'COMPRESSION_FORMAT=GZIP;'))
                      AS "TF1" ("TFC1" INTEGER,"TFC2" VARCHAR(32)))

(3) Examples

Example 1: To specify the path names of audit trail files in the ADB_AUDITREAD function

Output a list of HADB users who accessed the HADB server in the term from April 1, 2017 to April 30, 2017. Assume that the audit trails that were output in the term from April 1, 2017 to April 30, 2017 are stored in the /audit directory.

SELECT DISTINCT "USER_NAME"
    FROM TABLE(ADB_AUDITREAD(MULTISET['/audit/*.aud'])) "DT"
       WHERE "EXEC_TIME" BETWEEN TIMESTAMP'2017/04/01 00:00:00.000000'
                             AND TIMESTAMP'2017/04/30 23:59:59.999999'

In the preceding example, the underlined portion indicates a multiset value expression (multiset-value-constructor-by-enumeration).

Example 2: To specify the path names of CSV files in the ADB_CSVREAD function

Extract the following data from the GZIP-compressed CSV files /dir/file1.csv.gz, /dir/file2.csv.gz, and /dir/file3.csv.gz:

  • Customer ID (USERID)

  • Customer name (NAME)

  • Age (AGE)

SELECT "USERID","NAME","AGE"
    FROM TABLE(ADB_CSVREAD(MULTISET ['/dir/file1.csv.gz','/dir/file2.csv.gz','/dir/file3.csv.gz'],
                          'COMPRESSION_FORMAT=GZIP;'))
         AS "USERSLIST" ("USERID" CHAR(10),"NAME" VARCHAR(100),"AGE" INTEGER)

In the preceding example, the underlined portion indicates a multiset value expression (multiset-value-constructor-by-enumeration).

Example 3: To use a table subquery to specify the path names of CSV files in the ADB_CSVREAD function

Extract customer information data that was registered in 2010. When the data is extracted, the following conditions hold:

  • The customer information data is stored in CSV-format files.

  • The CSV files are compressed in GZIP format.

  • The CSV files are managed in the CSV file management table (FILELIST)

  • The absolute path name (FILE_NAME) of each CSV file and the date each file was registered (FILE_DATE) are stored in the CSV file management table.

SELECT "USERID","NAME","AGE"
    FROM TABLE(ADB_CSVREAD(MULTISET (SELECT "FILE_NAME" FROM "FILELIST"
                                       WHERE "FILE_DATE" BETWEEN '2010/01/01'
                                                             AND '2010/12/31'),
                          'COMPRESSION_FORMAT=GZIP;'))
         AS "USERSLIST" ("USERID" CHAR(10),"NAME" VARCHAR(100),"AGE" INTEGER)

In the preceding example, the underlined portion indicates a multiset value expression (multiset-value-constructor-by-query).