9.11.1 Format conversion definition

You specify a format conversion definition (FormatDefinition tag) as a child element of a CB definition for editing (DataEditCBDefinition tag) described in 9.9.3 CB definition for editing.

For details about format conversion processing, see 10.2.2(3) Format conversion or 10.6.2(4) Format conversion.

Organization of this subsection
(1) Format
(2) Details of definition
(3) Example
(4) Specifying the start year and month for the TIMESTAMP data type
(5) Format of record structure
(6) Values of the type attribute and the corresponding Java and CQL data types
(7) Rules for representing data types

(1) Format

<FormatDefinition ioType="{INPUT|OUTPUT}">
 <common>
   <unmatchedFormat>{IGNORE|WARNING|ERROR}
   </unmatchedFormat>
   <format timestampformat="format-number" year="start-year" month="start-month"/>
 </common>
 <records>
   <record name="record-name" exp="record-structure"
     timestampformat="format-number" year="start-year" month="start-month">
     <field name="field-name"
       type="{INT|SHORT|BYTE|LONG|BIG_DECIMAL|FLOAT|DOUBLE|STRING|DATE|TIME|TIMESTAMP}"
       pattern="pattern"/>
   </record>
 </records>
</FormatDefinition>

(2) Details of definition

FormatDefinition tag (all definition information)
Defines all format conversion definition information. You specify this definition only once.
ioType="{INPUT|OUTPUT}"
Specifies the type of standard adaptor to which this definition applies. This attribute cannot be omitted.
The permitted values are as follows:
  • INPUT
    Specifies that format conversion is being defined for input adaptors.
  • OUTPUT
    Specifies that format conversion is being defined for output adaptors.
common tag (common definition)
Defines common format conversion definition information. You specify this definition only once.
unmatchedFormat tag (definition of records that do not match the conversion pattern)
Specifies how to handle a record that does not match the format conversion pattern.
You specify this definition only once. This definition is optional. If you omit this definition, ERROR is assumed.
The permitted values are as follows:
  • IGNORE
    Ignores the detected record and resumes standard adaptor processing.
  • WARNING
    Outputs a warning message and resumes standard adaptor processing.
  • ERROR
    Outputs an error message and terminates the standard adaptor.
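The following fragment is a minimal sketch of a common definition that outputs a warning for unmatched records instead of terminating the adaptor. The form prefix and namespace URI follow the example in (3); the record R1 and its fields are hypothetical and serve only to make the fragment complete.

```xml
<form:FormatDefinition
    xmlns:form="http://www.hitachi.co.jp/soft/xml/sdp/adaptor/definition/callback/FormatDefinition"
    ioType="INPUT">
  <form:common>
    <!-- A record that matches no pattern produces a warning message;
         standard adaptor processing then resumes. -->
    <form:unmatchedFormat>WARNING</form:unmatchedFormat>
  </form:common>
  <form:records>
    <!-- Hypothetical record, included only for completeness. -->
    <form:record name="R1" exp="($_F1),($_F2)">
      <form:field name="F1" type="INT"/>
      <form:field name="F2" type="STRING"/>
    </form:record>
  </form:records>
</form:FormatDefinition>
```

Omitting the unmatchedFormat tag from this sketch would give the default behavior (ERROR).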
format tag (format definition)
Defines the character string representation for the TIMESTAMP data type. You can specify this definition only once. If you do not use a character string representation for the TIMESTAMP data type, you can omit this definition.
timestampformat="format-number"
Specifies the format number of the TIMESTAMP data type, as an integer from 1 to 4. If this attribute is omitted, 1 is assumed.
Note that if you specify the timestampformat attribute in both the format tag and the record tag, the specification in the record tag takes effect.
The table below lists and describes the format numbers that can be specified.
| Format number | Format of character string representation | Format |
|---|---|---|
| 1 (default value) | year-month-day hour:minute:second.millisecond (1 to 9 digits) | yyyy-MM-dd HH:mm:ss.fffffffff |
| 2#1 | month-name#2 day hour:minute:second | MMM dd HH:mm:ss |
| 3 | year/month/day hour:minute:second.millisecond (3 digits) | yyyy/MM/dd HH:mm:ss.SSS |
| 4 | day/month-name#2/year:hour:minute:second | dd/MMM/yyyy:HH:mm:ss |
#1
Format 2 does not contain the year. If you specify format 2 for input adaptors, you should use the year and month attributes to specify the start year and month. If you specify format 2 for output adaptors, neither the year nor the month attribute can be specified.
If neither the year nor the month attribute is specified, the system time (year and month) at the time the standard adaptor is started becomes the start year and month for the TIMESTAMP type.
#2
This is the 3-letter month name in English (JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC).

year="start-year"
If you specified 2 in the timestampformat attribute, use this attribute to specify the start year for the TIMESTAMP data type. Express the year as an integer from 1970 to 2261. This attribute cannot be specified for output adaptors.
For details about specification of the start year and month for the TIMESTAMP data type, see (4) Specifying the start year and month for the TIMESTAMP data type.
month="start-month"
If you specified 2 in the timestampformat attribute, use this attribute to specify the start month for the TIMESTAMP data type. Express the month as an integer from 1 to 12. This attribute cannot be specified for output adaptors.
For details about specification of the start year and month for the TIMESTAMP data type, see (4) Specifying the start year and month for the TIMESTAMP data type.
records tag (record group definition)
Defines record group information. You specify this definition only once. This definition is mandatory.
record tag (record definition)
Defines information about a record. You can specify a maximum of 10 record definitions. This definition is mandatory.
name="record-name"
Specifies a name for identifying this record information, as 1 to 100 single-byte alphanumeric characters and the underscore (_). This name must begin with a single-byte alphabetic character.
This record name must be unique within the records tag. This attribute cannot be omitted.
exp="record-structure"
Specifies the fields that constitute the record, as 1 to 1,000,000 characters.
You can specify in the record structure a delimiter that is specific to this record. You can also specify a different delimiter for each field. This attribute cannot be omitted.
For the format of the record structure, see (5) Format of record structure.
timestampformat="format-number"
Specifies the format number of the TIMESTAMP data type, as an integer from 1 to 4. If this attribute is omitted, the timestampformat attribute value in the format tag is assumed.
Note that if you specify the timestampformat attribute in both the format tag and the record tag, the specification in the record tag takes effect.
For the values that can be specified as the format numbers and their meanings, see the description of the timestampformat attribute in the format tag.
year="start-year"
If you specified 2 in the timestampformat attribute, use this attribute to specify the start year for the TIMESTAMP data type. Express the year as an integer from 1970 to 2261. This attribute cannot be specified for output adaptors.
For details about specification of the start year and month for the TIMESTAMP data type, see (4) Specifying the start year and month for the TIMESTAMP data type.
month="start-month"
If you specified 2 in the timestampformat attribute, use this attribute to specify the start month for the TIMESTAMP data type. Express the month as an integer from 1 to 12. This attribute cannot be specified for output adaptors.
For details about specification of the start year and month for the TIMESTAMP data type, see (4) Specifying the start year and month for the TIMESTAMP data type.
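As a sketch of how the format tag and the record tag interact, the fragment below sets format 3 as the default for all records and selects format 2 for one record only. The record names, field names, and the start year and month are hypothetical; the form prefix and namespace URI follow the example in (3).

```xml
<form:FormatDefinition
    xmlns:form="http://www.hitachi.co.jp/soft/xml/sdp/adaptor/definition/callback/FormatDefinition"
    ioType="INPUT">
  <form:common>
    <!-- Default for every record: format 3 (yyyy/MM/dd HH:mm:ss.SSS). -->
    <form:format timestampformat="3"/>
  </form:common>
  <form:records>
    <!-- APP specifies no timestampformat, so the value in the format tag applies. -->
    <form:record name="APP" exp="($_T1),($_MSG1)">
      <form:field name="T1" type="TIMESTAMP"/>
      <form:field name="MSG1" type="STRING"/>
    </form:record>
    <!-- SYS selects format 2 (MMM dd HH:mm:ss). Format 2 has no year,
         so a start year and month are supplied for this input record. -->
    <form:record name="SYS" exp="($_T2),($_MSG2)"
        timestampformat="2" year="2011" month="4">
      <form:field name="T2" type="TIMESTAMP"/>
      <form:field name="MSG2" type="STRING"/>
    </form:record>
  </form:records>
</form:FormatDefinition>
```

The year and month attributes appear only because ioType is INPUT; for an output adaptor they cannot be specified.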
field tag (field definition)
Defines field information. You can specify a maximum of 3,000 field definitions. This definition is mandatory.
name="field-name"
Specifies a name for identifying a field and its information, as 1 to 100 single-byte alphanumeric characters and the underscore (_). This name must begin with a single-byte alphabetic character.
This field name must be unique within the records tag. This attribute cannot be omitted.
type="{INT|SHORT|BYTE|LONG|BIG_DECIMAL|FLOAT|DOUBLE|STRING|DATE|TIME|TIMESTAMP}"
Specifies the data type of the field, which must correspond to a Java data type. This attribute cannot be omitted.
For the values that can be specified in this attribute and the corresponding Java and CQL data types, see (6) Values of the type attribute and the corresponding Java and CQL data types.
pattern="pattern"
If you specified STRING in the type attribute, specify a pattern for the field value using a regular expression. Because the java.util.regex.Pattern class is used to analyze the regular expression, you must specify a regular expression that is within the range supported by the java.util.regex.Pattern class. The dollar sign ($) cannot be used.
If you specified a value other than STRING in the type attribute or are specifying this definition for output adaptors, this attribute cannot be specified; if it is specified in such a case, an error results.
You can specify a constant for the pattern. When a constant is specified for the pattern, that constant will be treated as the field's value.
For the rules for representing the data types that are applied when this attribute is omitted, see (7) Rules for representing data types.
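The fragment below sketches the three ways of handling the pattern attribute for STRING fields: a constant, a regular expression, and omission. All record names, field names, and patterns are hypothetical, and a record such as "ERROR KFSP1234 overflow" is assumed as input.

```xml
<form:FormatDefinition
    xmlns:form="http://www.hitachi.co.jp/soft/xml/sdp/adaptor/definition/callback/FormatDefinition"
    ioType="INPUT">
  <form:common/>
  <form:records>
    <form:record name="ERR" exp="($_LEVEL) ($_CODE) ($_TEXT)">
      <!-- Constant pattern: the constant ERROR is treated as the field's value. -->
      <form:field name="LEVEL" type="STRING" pattern="ERROR"/>
      <!-- Regular expression pattern; the dollar sign is not used. -->
      <form:field name="CODE" type="STRING" pattern="[A-Z]+[0-9]+"/>
      <!-- No pattern: the default rule for STRING described in (7) applies. -->
      <form:field name="TEXT" type="STRING"/>
    </form:record>
  </form:records>
</form:FormatDefinition>
```

Adding a pattern attribute to a non-STRING field, or to any field when ioType is OUTPUT, would result in an error.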

(3) Example

<?xml version="1.0" encoding="UTF-8"?>
<root:AdaptorCompositionDefinition
xmlns:form="http://www.hitachi.co.jp/soft/xml/sdp/adaptor/definition/callback/FormatDefinition">
<!-- Omitted -->

<!-- CB definition for editing -->
<cb:DataEditCBDefinition class="jp.co.Hitachi.soft.sdp.adaptor.callback.dataedit.formattranslate.InputFormatTranslatorCBImpl" name="editor1">
 <!-- Format conversion definition -->
 <form:FormatDefinition ioType="INPUT">
   <form:common/>
   <!-- Record group definition -->
   <form:records>
     <form:record name="R1" exp="($_F1),($_F2),($_F3),($_F4),($_F5)">
       <!-- Field definition -->
       <form:field name="F1" type="INT"/>
       <form:field name="F2" type="STRING" pattern="[^,]*"/>
       <form:field name="F3" type="STRING" pattern="[A-Z]+.[A-Z]+"/>
       <form:field name="F4" type="INT"/>
       <form:field name="F5" type="INT"/>
     </form:record>
   </form:records>
 </form:FormatDefinition>
</cb:DataEditCBDefinition>

(4) Specifying the start year and month for the TIMESTAMP data type

If you specify format number 2 in the timestampformat attribute in the format tag for input adaptors and you specify both year and month attributes in the format tag, the start year and month for a field of the TIMESTAMP data type are determined as follows:

For the second and subsequent input records, the start year and month are determined by comparing the month value in a field of the input record with the reference year-month, as described below.

The figure below shows an example where the reference year-month is set to year Y and month 4. If the month value in a field of the input record is in the range from 3 to 12, the year value for the field becomes Y. If the month value is 1 or 2, the year value for the field becomes Y + 1.

Figure 9-3 Example when the reference year-month is set to year Y and month 4

[Figure]

The figure below shows an example where the reference year-month is set to year Y and month 1. If the month value in a field of the input record is 12, the year value for the field becomes Y - 1. If the month value is in the range from 1 to 11, the year value for the field becomes Y.

Figure 9-4 Example when the reference year-month is set to year Y and month 1

[Figure]
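Read together, the two examples suggest that the 12-month window starts one month before the reference month. The fragment below is a sketch of the first case, with 2011 standing in for year Y; the record, the field names, and the sample values in the comment are hypothetical, and the comment assumes that the reference year-month is 2011-04 when each record is read.

```xml
<form:FormatDefinition
    xmlns:form="http://www.hitachi.co.jp/soft/xml/sdp/adaptor/definition/callback/FormatDefinition"
    ioType="INPUT">
  <form:common>
    <!-- Format 2 (MMM dd HH:mm:ss) with start year 2011 and start month 4. -->
    <form:format timestampformat="2" year="2011" month="4"/>
  </form:common>
  <form:records>
    <!-- With the reference year-month at 2011-04:
           MAR 31 23:59:59 is given the year 2011 (months 3 to 12)
           DEC 24 08:00:00 is given the year 2011 (months 3 to 12)
           JAN 05 08:00:00 is given the year 2012 (months 1 and 2)
           FEB 10 08:00:00 is given the year 2012 (months 1 and 2) -->
    <form:record name="SYS" exp="($_TS),($_MSG)">
      <form:field name="TS" type="TIMESTAMP"/>
      <form:field name="MSG" type="STRING"/>
    </form:record>
  </form:records>
</form:FormatDefinition>
```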

(5) Format of record structure

This subsection discusses the format of the record structure that is specified in the exp attribute in the record tag.

Format of record structure:

{[delimiter]($_field-name)[delimiter]
|[delimiter]($_field-name)[delimiter]...}

The following explains each value.
delimiter
Specifies the delimiter used to separate the fields in this record. When the value of the ioType attribute in the FormatDefinition tag is INPUT, you can use a regular expression to specify a character string. To use a character that has a special meaning in regular expressions ((, ), [, ], ., *, ?, +, ^, or $), you must use the backslash (\) as an escape character.
When the value of the ioType attribute in the FormatDefinition tag is OUTPUT, a regular expression cannot be used, so there is no need for an escape character to handle characters that have a special meaning in regular expressions.
field-name
Specifies the name of an individual field in this record.
Rules for specifying the record structure:
  • You do not need to enclose the delimiter in double quotation marks (").
  • The specification of each field name must begin with ($_ and end with ).
  • You must specify the names of all fields that constitute this record.
  • Specify in the field tag the names of the fields that constitute this record in the same order, from left to right, that the fields are to be arranged in the record.
  • The same field name cannot be specified more than once. The name of a field that is not a part of this record cannot be specified.
  • For special characters, their replacement characters must be specified. For details about the replacement characters, see the table of special characters (symbols) and replacement characters in 9.2 Notes about creating adaptor definition files.
Examples of record structure specifications:
Example 1:
  • There are five fields, F1 through F5.
  • The fields are delimited by the comma (,).
This record's structure is specified as follows:

"($_F1),($_F2),($_F3),($_F4),($_F5)"

Example 2:
  • There are three fields, F1 through F3.
  • The record begins with <.
  • The delimiter between fields F1 and F2 is > MSG.
  • The delimiter between fields F2 and F3 is a single-byte space.
This record's structure is specified as follows:

"&lt;($_F1)&gt; MSG($_F2) ($_F3)"

In this example, < and > are replaced with &lt; and &gt;, respectively, because < and > are special characters.
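The fragment below sketches a record structure whose delimiters contain characters that have a special meaning in regular expressions. It assumes input records of the hypothetical form "[1024] started"; the record and field names are also hypothetical.

```xml
<form:FormatDefinition
    xmlns:form="http://www.hitachi.co.jp/soft/xml/sdp/adaptor/definition/callback/FormatDefinition"
    ioType="INPUT">
  <form:common/>
  <form:records>
    <!-- ioType is INPUT, so the delimiters are regular expressions and the
         square brackets are escaped with a backslash. For ioType="OUTPUT"
         the same structure would be written as "[($_PID)] ($_MSG)",
         because no regular expression is used there. -->
    <form:record name="PROC" exp="\[($_PID)\] ($_MSG)">
      <form:field name="PID" type="INT"/>
      <form:field name="MSG" type="STRING"/>
    </form:record>
  </form:records>
</form:FormatDefinition>
```

The field tags appear in the same left-to-right order as the field names in the exp attribute, as the rules above require.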

(6) Values of the type attribute and the corresponding Java and CQL data types

The table below shows the correspondences between the values of the type attribute in the field tag and the Java and CQL data types.

For the rules for representing the data types, see (7) Rules for representing data types. For details about the CQL data types, see the manual uCosminexus Stream Data Platform - Application Framework Application Development Guide.

Table 9-12 Values of the type attribute and the corresponding Java and CQL data types

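| No. | Data type (value of type attribute) | Classification | Data type | Java data type | CQL data type |
|---|---|---|---|---|---|
| 1 | INT | Numeric data | 4-byte integer | Primitive type int | INT[EGER] |
| 2 | SHORT | Numeric data | 2-byte integer | Primitive type short | SMALLINT |
| 3 | BYTE | Numeric data | 1-byte integer | Primitive type byte | TINYINT |
| 4 | LONG | Numeric data | 8-byte integer | Primitive type long | BIGINT |
| 5 | BIG_DECIMAL | Numeric data | Fixed-point number | java.math.BigDecimal class | DEC[IMAL]#1, NUMERIC#1 |
| 6 | BIG_DECIMAL | Numeric data | Fixed-point number | java.math.BigDecimal class | DECIMAL(m)#2, NUMERIC(m)#2 |
| 7 | FLOAT | Numeric data | 4-byte real number | Primitive type float | REAL |
| 8 | DOUBLE | Numeric data | 8-byte real number | Primitive type double | FLOAT, DOUBLE |
| 9 | STRING | Character data | Character string | java.lang.String class | CHAR[ACTER]#3 |
| 10 | STRING | Character data | Character string | java.lang.String class | CHAR[ACTER](n)#4, VARCHAR(p)#5 |
| 11 | DATE | Date/time data | Date (year-month-day) | java.sql.Date class | DATE |
| 12 | TIME | Date/time data | Time (hour-minute-second) | java.sql.Time class | TIME |
| 13 | TIMESTAMP | Date/time data | Date and time (year-month-day + hour-minute-second + millisecond) | java.sql.Timestamp class | TIMESTAMP#6 |
| 14 | TIMESTAMP | Date/time data | Date and time (year-month-day + hour-minute-second + millisecond) | java.sql.Timestamp class | TIMESTAMP[(q)]#7 |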
#1
The number of digits is assumed to be 15. If the number of digits exceeds 15, an error occurs when tuples are sent.
#2
m is a positive integer in the range 1 ≤ m ≤ 38. If the number of digits exceeds m, an error occurs when tuples are sent.
#3
The number of characters is assumed to be 1. If the number of characters exceeds 1, an error occurs when tuples are sent.
#4
n is a positive integer in the range 1 ≤ n ≤ 255. If the number of characters exceeds n, an error occurs when tuples are sent.
#5
p is a positive integer in the range 1 ≤ p ≤ 32767. If the number of characters exceeds p, an error occurs when tuples are sent.
#6
year-month-day + hour-minute-second + millisecond (3 digits) is assumed. If a precision higher than milliseconds is specified, a tuple send error occurs.
#7
q is an integer in the range 0 ≤ q ≤ 9 and indicates the number of digits in the fractional part of a second. If a precision higher than q is specified, a tuple send error occurs.
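As a sketch of how the type attribute lines up with Table 9-12, the fragment below defines one field for several of the data types and notes the corresponding Java and CQL data types in comments. The record name and field names are hypothetical, and which CQL type applies to a field depends on the stream definition, which is not shown here.

```xml
<form:FormatDefinition
    xmlns:form="http://www.hitachi.co.jp/soft/xml/sdp/adaptor/definition/callback/FormatDefinition"
    ioType="INPUT">
  <form:common/>
  <form:records>
    <form:record name="TRADE" exp="($_ID),($_QTY),($_PRICE),($_SYMBOL),($_TS)">
      <!-- Primitive type long; CQL BIGINT. -->
      <form:field name="ID" type="LONG"/>
      <!-- Primitive type int; CQL INT[EGER]. -->
      <form:field name="QTY" type="INT"/>
      <!-- java.math.BigDecimal class; CQL DEC[IMAL] or NUMERIC, with or without (m). -->
      <form:field name="PRICE" type="BIG_DECIMAL"/>
      <!-- java.lang.String class; CQL CHAR[ACTER], CHAR[ACTER](n), or VARCHAR(p). -->
      <form:field name="SYMBOL" type="STRING"/>
      <!-- java.sql.Timestamp class; CQL TIMESTAMP or TIMESTAMP[(q)]. -->
      <form:field name="TS" type="TIMESTAMP"/>
    </form:record>
  </form:records>
</form:FormatDefinition>
```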

(7) Rules for representing data types

The table below describes the rules for representing data types when the pattern attribute is omitted from the field tag.

Table 9-13 Rules for representing data types when the pattern attribute is omitted from the field tag

Each entry below gives the number (No.), the data type (value of the type attribute), the pattern for the data type (regular expression), whether or not the pattern is changeable#, and a description.

No. 1 to 4: INT, SHORT, BYTE, LONG
Pattern: "[-]{0,1}[0-9]+"
Changeable: N
Description:
In this character string pattern, one minus sign (-) or no minus sign appears at the beginning, followed by one or more digits from 0 to 9.

No. 5: BIG_DECIMAL
Pattern: "[-]{0,1}[0-9]+\.[0-9]+"
Changeable: N
Description:
In this character string pattern:
  • One minus sign (-) or no minus sign appears at the beginning.
  • One or more digits from 0 to 9 follow.
  • A period (.) follows, and one or more digits from 0 to 9 appear immediately after it.

No. 6 and 7: FLOAT, DOUBLE
Pattern: "[-]{0,1}[0-9]+\.[0-9]+"
Changeable: N
Description:
In this character string pattern:
  • One minus sign (-) or no minus sign appears at the beginning.
  • One or more digits from 0 to 9 follow.
  • A period (.) follows, and one or more digits from 0 to 9 appear immediately after it.

No. 8: STRING
Pattern: "[^, ;]*"
Changeable: Y
Description:
In this character string pattern, a character other than the space, comma (,), or semicolon (;) appears zero or more times.
If the data type is STRING, you can change (specify) the pattern.
Example of changing (specifying) the pattern:
This example uses only the period (.) as a delimiter:
"[^ .]+"

No. 9: DATE
Pattern: "[0-9]{1,4}-[0-9]{1,2}-[0-9]{1,2}"
Changeable: N
Description:
This is a character string pattern in the format yyyy-mm-dd. The permitted characters are shown below; only the number of digits is checked.
  • yyyy
    Numeric characters from 0 to 9999
  • mm
    Numeric characters from 0 to 99
  • dd
    Numeric characters from 0 to 99
If the values of mm and dd are outside of the applicable range for MM and DD, respectively, they are converted to the regular year-month-day according to the java.sql.Date specifications.
Examples:
2001-13-12 → 2002-01-12
0000-01-12 → 0001-01-12

No. 10: TIME
Pattern: "[0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}"
Changeable: N
Description:
This is a character string pattern in the format hh:mm:ss. The permitted characters are shown below; only the number of digits is checked.
  • hh
    Numeric characters from 0 to 99
  • mm
    Numeric characters from 0 to 99
  • ss
    Numeric characters from 0 to 99
If the values of hh, mm, and ss are outside of the applicable range for HH, MM, and SS, respectively, they are converted to the regular time according to the java.sql.Time specifications.
Example:
16:22:66 → 16:23:06

No. 11: TIMESTAMP
Changeable: N
Pattern:
The pattern depends on the value of the timestampformat attribute in the record tag, as described below.
When the value of the timestampformat attribute is 1:
"[0-9]{1,4}-[0-9]{1,2}-[0-9]{1,2} [0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}\.[0-9]{1,9}"
When the value of the timestampformat attribute is 2:
"[A-Za-z]{3}[ ]+[0-9]{1,2}[ ]+[0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}"
When the value of the timestampformat attribute is 3:
"[0-9]{1,4}/[0-9]{1,2}/[0-9]{1,2}[ ]+[0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}\.[0-9]{3}"
When the value of the timestampformat attribute is 4:
"[0-9]{1,2}/[A-Za-z]{3}/[0-9]{1,4}:[0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}"
Description:
The description also depends on the value of the timestampformat attribute in the record tag, as described below.
When the value of the timestampformat attribute is 1:
This is a character string pattern in the format yyyy-mm-dd hh:mm:ss.fffffffff. The permitted characters are shown below; only the number of digits is checked.
  • yyyy
    Numeric characters from 0 to 9999
  • mm (month)
    Numeric characters from 0 to 99
  • dd
    Numeric characters from 0 to 99
  • hh
    Numeric characters from 0 to 99
  • mm (minute)
    Numeric characters from 0 to 99
  • ss
    Numeric characters from 0 to 99
  • fffffffff
    Numeric characters from 0 to 999999999
When the value of the timestampformat attribute is 2:
This is a character string pattern in the format MMM dd HH:mm:ss. The permitted characters are shown below; only the number of digits is checked.
  • MMM
    Alphabetic characters A to Z and a to z (3-letter name of the month in English)
  • dd
    Numeric characters from 0 to 99
  • HH
    Numeric characters from 0 to 99
  • mm
    Numeric characters from 0 to 99
  • ss
    Numeric characters from 0 to 99
When the value of the timestampformat attribute is 3:
This is a character string pattern in the format yyyy/MM/dd HH:mm:ss.SSS. The permitted characters are shown below; only the number of digits is checked.
  • yyyy
    Numeric characters from 0 to 9999
  • MM
    Numeric characters from 0 to 99
  • dd
    Numeric characters from 0 to 99
  • HH
    Numeric characters from 0 to 99
  • mm
    Numeric characters from 0 to 99
  • ss
    Numeric characters from 0 to 99
  • SSS
    Numeric characters from 000 to 999
When the value of the timestampformat attribute is 4:
This is a character string pattern in the format dd/MMM/yyyy:HH:mm:ss. The permitted characters are shown below; only the number of digits is checked.
  • dd
    Numeric characters from 0 to 99
  • MMM
    Alphabetic characters A to Z and a to z (3-letter name of the month in English)
  • yyyy
    Numeric characters from 0 to 9999
  • HH
    Numeric characters from 0 to 99
  • mm
    Numeric characters from 0 to 99
  • ss
    Numeric characters from 0 to 99
If the values of mm, dd, hh, mm, and ss are outside of the applicable range for MM (month), DD, HH, MM (minute), and SS, respectively, they are converted to the regular year-month-day-time according to the java.sql.Timestamp specifications.
Example:
2001-13-10 16:22:66.101 → 2002-01-10 16:23:06.101

Legend:
Y: Changeable
N: Not changeable
#
Indicates whether or not the character string pattern of each data type can be changed by specifying the pattern attribute.
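The fragment below sketches how the default patterns apply when no field specifies the pattern attribute. The record, the field names, and the sample input lines in the comments are hypothetical; the outcomes noted in the comments follow from the regular expressions in Table 9-13.

```xml
<form:FormatDefinition
    xmlns:form="http://www.hitachi.co.jp/soft/xml/sdp/adaptor/definition/callback/FormatDefinition"
    ioType="INPUT">
  <form:common/>
  <form:records>
    <!-- Matches the default patterns:
           -12,3.50,2001-01-12,16:22:06
         Does not match, because the DOUBLE pattern requires a period
         followed by at least one digit:
           12,3,2001-01-12,16:22:06
         Does not match, because the INT pattern allows no plus sign:
           +12,3.50,2001-01-12,16:22:06 -->
    <form:record name="R1" exp="($_N),($_X),($_D),($_T)">
      <form:field name="N" type="INT"/>
      <form:field name="X" type="DOUBLE"/>
      <form:field name="D" type="DATE"/>
      <form:field name="T" type="TIME"/>
    </form:record>
  </form:records>
</form:FormatDefinition>
```

A record that does not match is handled according to the unmatchedFormat tag; because the common tag is empty in this sketch, ERROR is assumed.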