The following figure shows the flow and details of data processing during file input by the input adaptor.
Figure 10-2 Flow and details of data processing during file input
For details about the data formats handled by an input adaptor, see (1) Data formats handled by an input adaptor. For details about the processing, see the subsections beginning with (2) File input by a file input connector.
(1) Data formats handled by an input adaptor
The data formats handled by an input adaptor are the input record and the common record. These data formats are discussed below.
Figure 10-3 Structure of a common record
(2) File input by a file input connector
You define information about file input in the file input connector definition in the adaptor configuration definition file.
A file input connector reads data in rows from one or more files stored in the input file storage directory and converts the data into input records. One row of data read by the file input connector becomes one input record.
An input adaptor can read multiple input records at a time. The number of input records that the file input connector reads at one time equals the number of records that the format conversion, mapping, and tuple transmission operations can process at one time.
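The row-by-row reading and batching described above can be pictured with a minimal Python sketch. It is an illustration only, not the product's implementation: the function names `iter_rows` and `read_batches`, the `batch_size` parameter, and reading files in name order are all assumptions made for this example.

```python
from itertools import islice
from pathlib import Path
from typing import Iterator, List


def iter_rows(input_dir: Path) -> Iterator[str]:
    """Yield one input record per row from every file in input_dir."""
    # Reading files in name order is an assumption of this sketch; the real
    # order depends on how the input files are defined for the connector.
    for path in sorted(p for p in input_dir.iterdir() if p.is_file()):
        with path.open(encoding="utf-8") as f:
            for line in f:
                yield line.rstrip("\n")


def read_batches(input_dir: Path, batch_size: int) -> Iterator[List[str]]:
    """Group input records into batches of at most batch_size records."""
    rows = iter_rows(input_dir)
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            return
        yield batch
```

Each batch returned by `read_batches` would then be handed as a unit to the later stages (format conversion, mapping, tuple transmission).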
This subsection discusses the types and structures of input files that are read by a file input connector.
File structure | Description | Prerequisite |
---|---|---|
Wraparound | Data is written to the files sequentially in the order the files were defined; there is a fixed number of input files. When all files are filled, data is written to the first file again. | The order of the files to which data is to be written is predetermined, and data is always written to the files in that order. To write data to a file that has become full, the file is first cleared of its data and then new data is written to it. |
Non-wraparound | Data is written to the files in the order the files were defined; there is no fixed number of input files. | The order of the files to which data is to be written is predetermined, and data is always written to the files in that order. |
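The difference between the two file structures can be sketched from the point of view of the program that writes the input files. The sketch below is illustrative only: the function names and the `prefix`-plus-number file naming scheme are hypothetical and are not taken from the product.

```python
from itertools import cycle, islice
from typing import Iterator, List


def wraparound_targets(files: List[str]) -> Iterator[str]:
    """Cycle endlessly through a fixed list of files in their defined order.

    A writer that returns to an already filled file is expected to clear it
    before writing new data.
    """
    return cycle(files)


def non_wraparound_targets(prefix: str) -> Iterator[str]:
    """Yield an unbounded sequence of new file names (hypothetical naming)."""
    n = 1
    while True:
        yield f"{prefix}{n}.log"
        n += 1
```

For example, `list(islice(wraparound_targets(["f1", "f2", "f3"]), 5))` visits `f1`, `f2`, `f3` and then starts again at `f1`, whereas the non-wraparound sequence never revisits a file.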
Processing mode | Reading method | Read processing completion condition |
---|---|---|
Batch processing mode | When the input adaptor starts, the input file storage directory is checked for any input files. If there are any input files, they are read. | Read processing is completed when either of the following conditions is satisfied: |
Real-time processing mode | The input file storage directory is monitored periodically at a specified interval and input files are read whenever they are detected. The input file storage directory monitoring interval and a monitoring count are specified in the input tag in the file input connector definition. When the specified monitoring count is exceeded, a warning message is issued and processing resumes. | Read processing is completed when either of the following conditions is satisfied: |
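The two processing modes can be contrasted with a rough Python sketch. Several points here are assumptions rather than documented behavior: the monitoring count is interpreted as the number of consecutive checks in which no new file is detected, `max_polls` is a hypothetical bound added only so that the example terminates, and the read completion conditions are not modeled. All function and parameter names are invented for this illustration.

```python
import time
from pathlib import Path
from typing import Callable, List, Set


def list_input_files(input_dir: Path) -> List[Path]:
    """Return the regular files currently in the input file storage directory."""
    return sorted(p for p in input_dir.iterdir() if p.is_file())


def run_batch(input_dir: Path, handle: Callable[[Path], None]) -> int:
    """Batch mode: check the directory once at startup and read what is there."""
    files = list_input_files(input_dir)
    for path in files:
        handle(path)
    return len(files)


def run_realtime(input_dir: Path, handle: Callable[[Path], None],
                 interval_s: float, monitoring_count: int,
                 max_polls: int) -> int:
    """Real-time mode: poll the directory and read files as they appear.

    Returns the number of warnings issued.
    """
    seen: Set[Path] = set()
    idle_polls = 0
    warning_count = 0
    for _ in range(max_polls):  # hypothetical bound so the sketch terminates
        new_files = [p for p in list_input_files(input_dir) if p not in seen]
        if new_files:
            idle_polls = 0
            for path in new_files:
                handle(path)
                seen.add(path)
        else:
            idle_polls += 1
            if idle_polls > monitoring_count:
                # Assumed interpretation: warn, then keep monitoring.
                print("warning: monitoring count exceeded")
                warning_count += 1
                idle_polls = 0
        time.sleep(interval_s)
    return warning_count
```

In this sketch, batch mode looks at the directory exactly once, while real-time mode keeps polling and only warns when nothing new has arrived for longer than the configured count.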
You define information about format conversion in the format conversion definition in the adaptor configuration definition file.
Format conversion involves segmenting an input record into fields and then converting the fields into a common record.
The figure below shows an example of format conversion from input record to common record. In this example, the input record consists of three fields delimited by spaces; field 1 is converted to the TIME type, and fields 2 and 3 are converted to the character string (STRING) type.
Figure 10-4 Example of format conversion from input record to common record
The table below shows the structure of the common record in the above example and the tags that are specified in the format conversion definition.
Record structure | Tag |
---|---|
Record name: R1 Record structure: ($_time) ![]() ![]() | record tag (record definition) |
Field name: time, Type: TIME | field tag (field definition) |
Field name: method, Type: STRING | field tag (field definition) |
Field name: url, Type: STRING | field tag (field definition) |
For details about the data types that can be converted and the settings for the structure of common records, see 9.11.1 Format conversion definition.
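The format conversion example above (three space-delimited fields converted to one TIME field and two STRING fields) can be approximated in Python as follows. This is a sketch under stated assumptions: the `HH:MM:SS` time format and the sample record are hypothetical, and the `R1` dataclass merely mirrors the record and field names from the table. It does not reproduce the product's record or field tags.

```python
import datetime as dt
from dataclasses import dataclass


@dataclass
class R1:
    """Common record R1 with the fields listed in the table above."""
    time: dt.time   # TIME type
    method: str     # STRING type
    url: str        # STRING type


def convert(input_record: str, time_format: str = "%H:%M:%S") -> R1:
    """Segment an input record into fields and build a common record.

    time_format is an assumption; the source does not state the time format.
    """
    fields = input_record.split(" ")
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    raw_time, method, url = fields
    parsed = dt.datetime.strptime(raw_time, time_format).time()
    return R1(time=parsed, method=method, url=url)
```

With the hypothetical record `"09:00:05 GET /index.html"`, `convert` returns an `R1` whose `time` is a `datetime.time` value and whose `method` and `url` remain strings.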
Format conversion enables you to define multiple record structures. When multiple record structures are defined, the input adaptor automatically selects the matching record structure and performs format conversion.
When multiple record structures are defined, the input adaptor uses the following method to select the appropriate record structure:
You specify information about mapping in the mapping definition in the adaptor configuration definition file.
The two types of mapping are mapping between record and stream and mapping between records. The table below provides an overview of these types of mapping.
Table 10-2 Overview of mapping
No. | Type of mapping | Description |
---|---|---|
1 | Mapping between record and stream | A common record output by the callback before mapping (mapping source) is associated with a common record based on the input stream format (mapping target). Mapping between record and stream is always performed before tuple transmission. |
2 | Mapping between records | A common record output by the callback before mapping (mapping source) is edited and converted to a target common record. If necessary, mapping between records is performed after format conversion, but before mapping between record and stream. You can use this type of mapping to change field names in the source common record or to delete fields that are not needed for the next callback processing. You can also use built-in functions to obtain character strings and time values from source common records and apply them to target common records. You can specify multiple definitions for mapping between records. |
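Mapping between records, as described in the table, amounts to renaming some fields and dropping others. A minimal Python sketch of that idea follows; the function name `map_record`, the dictionary representation of a common record, and the target field names in the usage note are hypothetical, and built-in functions are not modeled.

```python
from typing import Any, Dict, Mapping


def map_record(source: Mapping[str, Any],
               field_map: Mapping[str, str]) -> Dict[str, Any]:
    """Build a target record from a source record.

    field_map maps each target field name to a source field name. Source
    fields that are not listed are left out of the target record.
    """
    return {target: source[src] for target, src in field_map.items()}
```

For instance, mapping `{"timestamp": "time", "path": "url"}` over a source record with the fields `time`, `method`, and `url` renames two fields and deletes `method`.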
The figure below shows an example of mapping between record and stream by an input adaptor. This example maps the fields time and url, which are required for input stream s1, to the schema of the input stream and then converts them to a common record (mapping target).
Figure 10-5 Example of mapping between record and stream by an input adaptor
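The record-to-stream mapping in this example can be sketched as a projection of the source common record onto the columns of the input stream. The code below is an illustration only: the `STREAM_SCHEMAS` table and the function name are hypothetical, and the schema of `s1` is taken to be just the two fields named in the text, `time` and `url`.

```python
from typing import Any, Dict, Mapping, Tuple

# Assumed schema: the text names time and url as the fields required by s1.
STREAM_SCHEMAS: Dict[str, Tuple[str, ...]] = {"s1": ("time", "url")}


def map_to_stream(record: Mapping[str, Any], stream_name: str) -> Dict[str, Any]:
    """Return a mapping-target record laid out in the stream's column order."""
    schema = STREAM_SCHEMAS[stream_name]
    missing = [col for col in schema if col not in record]
    if missing:
        raise KeyError(f"source record lacks fields required by {stream_name}: {missing}")
    return {col: record[col] for col in schema}
```

Checking for missing fields before building the target keeps a malformed source record from reaching the tuple transmission step.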
You define information about tuple transmission in the input stream definition in the adaptor configuration definition file.
The common records are converted to tuples based on the mapping results, and the tuples are then sent to the input stream according to the input stream definition.
The figure below shows an example of a tuple transmission from the input adaptor to the input stream. This example sends tuples to input stream s1.
Figure 10-6 Example of tuple transmission from the input adaptor
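Tuple transmission can be pictured as turning each mapped common record into an ordered tuple and handing it to the input stream. In the sketch below, a standard-library `queue.Queue` stands in for input stream s1, and the names `to_tuple` and `send` are hypothetical; the actual transmission mechanism is not described in this section.

```python
import queue
from typing import Any, Iterable, Mapping, Sequence, Tuple


def to_tuple(record: Mapping[str, Any], schema: Sequence[str]) -> Tuple[Any, ...]:
    """Convert a mapped common record into a tuple in schema column order."""
    return tuple(record[col] for col in schema)


def send(records: Iterable[Mapping[str, Any]], schema: Sequence[str],
         stream: "queue.Queue") -> int:
    """Send one tuple per record to the stream and return how many were sent."""
    sent = 0
    for record in records:
        stream.put(to_tuple(record, schema))
        sent += 1
    return sent
```

A consumer on the other side of the queue would receive tuples such as `("09:00:05", "/index.html")` in the order in which the records were sent.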