9.10.2 HTTP packet input connector definition
(1) Format
<HttpPacketInputConnectorDefinition>
<input buffersize="I/O-buffer-size"
assemblingtime="packet-segment-assembly-time">
<packetdata
globalheader="global-header-area-size"
packetheader="packet-header-area-size"
packetoffset="offset-for-packet-data-size-area"
packetlength="packet-data-size-area-size"
timeoffset="offset-for-timestamp-area"/>
<command path="command-path-name"
parameter="command-parameter"/>
</input>
<output unit="maximum-output-unit">
<record name="record-name" type="{REQUEST|RESPONSE}">
<field name="field-name"/>
</record>
</output>
</HttpPacketInputConnectorDefinition>
(2) Details of definition
- HttpPacketInputConnectorDefinition tag (all definition information)
- Defines all HTTP packet input connector definition information. You specify this definition only once.
- input tag (input definition)
- Defines the HTTP packet input or output information. You specify this definition only once.
- buffersize="I/O-buffer-size"
- Specifies as an integer from 1 to 12288 the maximum number of HTTP packets that can be stored in the input buffer or the maximum number of common records that can be stored in the output buffer. If this attribute is omitted, 4096 is assumed.
- assemblingtime="packet-segment-assembly-time"
- Specifies as an integer from 1 to 5000 the packet segment assembly time (in milliseconds). If this attribute is omitted, 2000 is assumed.
- Packet segments are packet data that has been segmented at the TCP layer according to the maximum segment size (MSS). If the data collected by the HTTP packet input connector consists of packet segments, the segments are assembled into packets on the basis of the TCP header information. The packet segment assembly time is the time allowed for linking received packet segments into a complete packet, measured from the arrival of a new packet segment. Each time packet segments are assembled into a packet, the counter is reset. Note that packet data segmented at the IP layer is not assembled.
- If packet segments cannot be assembled within the amount of time specified here, those packet segments undergoing assembly processing are discarded. The most recent time information in the packet segment used for assembly is used as the timestamp.
- packetdata tag (HTTP packet definition)
- Defines the format of the HTTP packets to be acquired. You specify this definition only once.
- globalheader="global-header-area-size"
- Specifies the size (in bytes) of the global header area, as an integer from 1 to 128. If this attribute is omitted, 24 is assumed.
- packetheader="packet-header-area-size"
- Specifies the size (in bytes) of the packet header area, as an integer from 9 to 128. If this attribute is omitted, 16 is assumed.
- packetoffset="offset-for-packet-data-size-area"
- Specifies the offset (in bytes) from the beginning of the packet header to the packet data size area, as an integer from 0 to 127. If this attribute is omitted, 8 is assumed.
- packetlength="packet-data-size-area-size"
- Specifies the size (in bytes) of the packet data size area, as an integer from 1 to 4. If this attribute is omitted, 4 is assumed.
- timeoffset="offset-for-timestamp-area"
- Specifies the offset (in bytes) from the beginning of the packet header to the timestamp area, as an integer from 0 to 120. If this attribute is omitted, 0 is assumed.
- The timestamp area, which is located in the packet header area, stores the timestamp obtained when the packet analyzer captured the HTTP packet. The following figure shows the format of the timestamp area.
Figure 9-2 Format of timestamp area
![[Figure]](figure/zs090200.gif)
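As a rough illustration of how the packetdata attributes locate data inside a packet header, the following Python sketch reads the timestamp area and the packet data size area from a raw header. It assumes the classic libpcap record layout that matches the default values: a 4-byte seconds field followed by a 4-byte microseconds field, in little-endian byte order. Figure 9-2 remains the authoritative description of the timestamp area, and the function name parse_packet_header is hypothetical.

```python
def parse_packet_header(header, packetoffset=8, packetlength=4, timeoffset=0):
    """Return (seconds, microseconds, data_size) read from one packet header.

    Assumption: the timestamp area holds two 4-byte little-endian integers
    (seconds, then microseconds), as in the classic libpcap record header.
    """
    if len(header) < timeoffset + 8:
        raise ValueError("packet header is too short for the timestamp area")
    # Timestamp area: starts timeoffset bytes from the beginning of the header.
    seconds = int.from_bytes(header[timeoffset:timeoffset + 4], "little")
    microseconds = int.from_bytes(header[timeoffset + 4:timeoffset + 8], "little")

    # Packet data size area: packetlength bytes starting at packetoffset.
    size_area = header[packetoffset:packetoffset + packetlength]
    if len(size_area) != packetlength:
        raise ValueError("packet header is too short for the size area")
    data_size = int.from_bytes(size_area, "little")
    return seconds, microseconds, data_size


# A 16-byte header built by hand: 1 s, 500000 us, 60 captured bytes, 60 original bytes.
sample = b"".join(n.to_bytes(4, "little") for n in (1, 500000, 60, 60))
print(parse_packet_header(sample))
```

With the default attribute values, the size is read from bytes 8 to 11 of the header and the timestamp from bytes 0 to 7.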
- command tag (command definition)
- Defines the start command for the packet analyzer that is to be used. You specify this definition only once.
- Stream Data Platform - AF supports WinDump as the packet analyzer. For details about the information to be defined here when WinDump is used, see (4) Information to be specified in the command tag when WinDump is used.
- path="command-path-name"
- Specifies the absolute path of the packet analyzer start command, as 1 to 100 single-byte characters (ASCII codes 32 to 126).
- parameter="command-parameter"
- Specifies a parameter to be passed to the packet analyzer start command, as 1 to 100 single-byte characters (ASCII codes 32 to 126).
- output tag (output definition)
- Defines the output information for common records when the HTTP packet input connector converts HTTP packets to common records. You specify this definition only once.
- unit="maximum-output-unit"
- Specifies the maximum number of output units (in records) for common records, as an integer from 1 to 1000. If this attribute is omitted, 100 is assumed.
- record tag (record definition)
- Defines record information for common records. You may specify 1 or 2 record definitions.
- name="record-name"
- Specifies a name for a common record, as 1 to 100 single-byte alphanumeric characters and the underscore (_). This record name must begin with a single-byte alphabetic character. This attribute cannot be omitted. This record name must be unique within the record tag.
- type="{REQUEST|RESPONSE}"
- Specifies the type of the common record. This attribute cannot be omitted.
- The permitted values are as follows:
- "REQUEST"
Indicates a request record. This is a type of common record used to store data generated during request transmission from client to host in HTTP protocol communication.
- "RESPONSE"
Indicates a response record. This is a type of common record used to store data generated during response transmission from host to client in HTTP protocol communication.
- field tag (field definition)
- Defines information about the fields that constitute the common record. You may specify 1 to 16 field definitions.
- name="field-name"
- Specifies a field name.
- Specify for a field name the identifier of data that is to be extracted from HTTP packets. The data corresponding to this identifier is stored as the field's value.
- Each field name must be unique within the record definition.
- The identifiers that can be specified as field names depend on the type of common record specified in the type attribute in the record tag. For details about the identifiers that can be specified as field names, see (5) Identifiers that can be specified as field names.
- Extracted data is converted to the Java data type and then stored as field values. For details about the Java data types for the data that is stored as field values, see (6) Java data types that are stored as field values.
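The ranges and default values described above can be collected into a small pre-check that runs before a definition file is deployed. The following Python sketch is only an illustration under that assumption; the names ATTRIBUTE_RULES, resolve_attributes, and is_valid_record_name are hypothetical and are not part of the product.

```python
import re

# (minimum, maximum, default) for each integer attribute described above.
ATTRIBUTE_RULES = {
    "buffersize": (1, 12288, 4096),
    "assemblingtime": (1, 5000, 2000),
    "globalheader": (1, 128, 24),
    "packetheader": (9, 128, 16),
    "packetoffset": (0, 127, 8),
    "packetlength": (1, 4, 4),
    "timeoffset": (0, 120, 0),
    "unit": (1, 1000, 100),
}

# Record names: 1 to 100 alphanumeric characters or underscores, starting with a letter.
RECORD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,99}")


def resolve_attributes(specified):
    """Apply defaults to omitted attributes and range-check the specified ones."""
    unknown = set(specified) - set(ATTRIBUTE_RULES)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    resolved = {}
    for name, (low, high, default) in ATTRIBUTE_RULES.items():
        value = int(specified.get(name, default))
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low} to {high}")
        resolved[name] = value
    return resolved


def is_valid_record_name(name):
    """Check the record name against the character and length rules."""
    return RECORD_NAME.fullmatch(name) is not None


resolved = resolve_attributes({"buffersize": "8192", "unit": 50})
print(resolved["buffersize"], resolved["assemblingtime"], resolved["unit"])
print(is_valid_record_name("RECORD1"), is_valid_record_name("1RECORD"))
```

Omitted attributes take their documented defaults, so only the attributes that differ from the defaults need to be passed in.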
(3) Example
<?xml version="1.0" encoding="UTF-8"?>
<root:AdaptorCompositionDefinition
xmlns:hpicon="http://www.hitachi.co.jp/soft/xml/sdp/adaptor/definition/callback/HttpPacketInputConnectorDefinition">
<!-- Omitted -->
<!-- CB definition for input -->
<cb:InputCBDefinition
class="jp.co.Hitachi.soft.sdp.adaptor.callback.io.packetinput.HttpPacketInputCBImpl" name="inputer2">
<!-- HTTP packet input connector definition -->
<hpicon:HttpPacketInputConnectorDefinition>
<!-- Input definition -->
<hpicon:input buffersize="4096" assemblingtime="2000">
<!-- HTTP packet definition -->
<hpicon:packetdata globalheader="24" packetheader="16" packetoffset="8"
packetlength="4" timeoffset="0"/>
<!-- Command definition -->
<hpicon:command path="C:\Program Files\WinDump\WinDump.exe"
parameter=" -i 1 -s 2048 -w - -n &quot;tcp port 80 and host 133.145.224.19&quot;"/>
</hpicon:input>
<!-- Output definition -->
<hpicon:output unit="100">
<!-- Record definition -->
<hpicon:record name="RECORD1" type="REQUEST">
<!-- Field definition -->
<hpicon:field name="SEND_IP"/>
<hpicon:field name="RECEIVE_IP"/>
<hpicon:field name="SEND_PORT"/>
<hpicon:field name="RECEIVE_PORT"/>
<hpicon:field name="MESSAGE_TYPE"/>
<hpicon:field name="TARGET_URI"/>
</hpicon:record>
</hpicon:output>
</hpicon:HttpPacketInputConnectorDefinition>
</cb:InputCBDefinition>
(4) Information to be specified in the command tag when WinDump is used
(5) Identifiers that can be specified as field names
The table below lists and describes the identifiers that can be specified as field names in the name attribute in the field tag.
Table 9-9 Identifiers that can be specified as field names
No. | Identifier | Data | Description | Protocol | Specifiable in request record | Specifiable in response record |
---|
1 | TIME | Time#1 | Time packet data arrived | -- | Y | Y |
2 | PACKET_LENGTH | Packet size#2 | Size of packet data (bytes) | -- | Y | Y |
3 | SEND_MAC | Sending MAC address | MAC address at the packet sending end | Ethernet | Y | Y |
4 | SEND_IP | Sending IP address | IP address at the packet sending end | IP | Y | Y |
5 | RECEIVE_IP | Receiving IP address | IP address at the packet receiving end | IP | Y | Y |
6 | SEND_PORT | Sending port number | Port number at the packet sending end | TCP | Y | Y |
7 | RECEIVE_PORT | Receiving port number | Port number at the packet receiving end | TCP | Y | Y |
8 | MESSAGE_TYPE | Message type | Request or Response | HTTP | Y | Y |
9 | METHOD_NAME | Method information | Method information, such as GET and POST | HTTP | Y | N |
10 | TARGET_URI | URI information#3 | Target URI information | HTTP | Y | N |
11 | REFERER | Referer#3 | Link source URI information | HTTP | Y | N |
12 | COOKIE | Cookie#3 #4 | Cookie information#5 | HTTP | Y | Y |
13 | STATUS_CODE | Status code | Result of request processing | HTTP | N | Y |
14 | CONNECTION | Connection | Connection persistency information | HTTP | Y | Y |
15 | CONTENT_LENGTH | Content-Length | Contents size (bytes) | HTTP | Y | Y |
16 | CONTENT_TYPE | Content-Type | Contents type | HTTP | Y | Y |
17 | MESSAGE_BODY | Message body#3 | Real data | HTTP | Y#6 | N |
- Legend:
- Y: Identifier can be specified
- N: Identifier cannot be specified
- --: Not applicable
- #1
- The time is obtained from the timestamp in the packet header.
- #2
- The packet size is the sum of the size of the HTTP message start line, the size of the HTTP headers, and the value of the HTTP header "Content-Length". If there is no "Content-Length" header, this value is treated as 0.
- #3
- Percent-encoded character strings are decoded as UTF-8.
- #4
- The method for acquiring cookie data is not the same for request records and response records.
- For a request record, the data referenced by the HTTP header "Cookie" is acquired.
- For a response record, the data referenced by the HTTP header "Set-Cookie" is acquired. The data referenced by the HTTP headers "Cookie2" and "Set-Cookie2" is not acquired.
- #5
- The cookie information might contain multiple cookies. If you want to treat each cookie as a field, specify regexsubstring in the function attribute in the map tag (mapping definition) and acquire a character string using a regular expression.
- #6
- You can obtain the message body only when the method information is POST, the media type in Content-Type is text (regardless of the value of subtype), and Content-Length exists.
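Table 9-9 can also be expressed as a lookup that checks the field definitions of one record before the definition file is used. The following Python sketch is a minimal illustration; the names IDENTIFIERS and check_fields are hypothetical, and the limits of 1 to 16 fields and unique field names come from the field tag description.

```python
# Identifier -> record types in which it can be specified (from Table 9-9).
BOTH = {"REQUEST", "RESPONSE"}
IDENTIFIERS = {
    "TIME": BOTH, "PACKET_LENGTH": BOTH, "SEND_MAC": BOTH,
    "SEND_IP": BOTH, "RECEIVE_IP": BOTH, "SEND_PORT": BOTH,
    "RECEIVE_PORT": BOTH, "MESSAGE_TYPE": BOTH,
    "METHOD_NAME": {"REQUEST"}, "TARGET_URI": {"REQUEST"},
    "REFERER": {"REQUEST"}, "COOKIE": BOTH,
    "STATUS_CODE": {"RESPONSE"}, "CONNECTION": BOTH,
    "CONTENT_LENGTH": BOTH, "CONTENT_TYPE": BOTH,
    "MESSAGE_BODY": {"REQUEST"},
}


def check_fields(record_type, field_names):
    """Return a list of problems found in one record definition (empty if none)."""
    problems = []
    if not 1 <= len(field_names) <= 16:
        problems.append("a record needs 1 to 16 field definitions")
    seen = set()
    for name in field_names:
        if name in seen:
            problems.append(f"{name}: duplicate field name")
        seen.add(name)
        allowed = IDENTIFIERS.get(name)
        if allowed is None:
            problems.append(f"{name}: unknown identifier")
        elif record_type not in allowed:
            problems.append(f"{name}: cannot be specified in a {record_type} record")
    return problems


print(check_fields("RESPONSE", ["SEND_IP", "TARGET_URI"]))
```

The sketch does not cover the additional runtime condition in #6: MESSAGE_BODY is accepted for any request record here, even though the body is obtained only for qualifying POST requests.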
(6) Java data types that are stored as field values
Each protocol data item extracted according to the identifier specified in the name attribute in the field tag is converted to a Java data type, as shown in the table below. During conversion to the Java data type, if the value of the protocol data is greater than the permitted maximum value, only up to the maximum value is stored as the field value. If the data corresponding to the specified identifier is not found in the protocol data, the null character is stored as the field value for the String type and -1 is stored for the Integer type.
Table 9-10 Java data types that are stored as field values
No. | Data | Protocol | Java data type | Value range |
---|
1 | Time# | -- | Timestamp | 1970/01/01 00:00:00.000000 to 2261/12/31 23:59:59.999999 |
2 | Packet size | -- | Integer | 0 to 2147483647 |
3 | Sending MAC address | Ethernet | String | 17 characters (00:00:00:00:00:00 to FF:FF:FF:FF:FF:FF) |
4 | Sending port number | TCP | Integer | 0 to 65535 |
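5 | Receiving port number | TCP | Integer | 0 to 65535 |
6 | Sending IP address | IP | String | In IPv4: 7 to 15 characters (0.0.0.0 to 255.255.255.255). In IPv6: 40 characters (0000:0000:0000:0000:0000:0000:0000:0000 to FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF) |
7 | Receiving IP address | IP | String | In IPv4: 7 to 15 characters (0.0.0.0 to 255.255.255.255). In IPv6: 40 characters (0000:0000:0000:0000:0000:0000:0000:0000 to FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF) |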
8 | Data type | HTTP | String | 7 or 8 characters (Request or Response) |
9 | Method information | HTTP | String | 1 to 127 characters (such as GET or CONNECT) |
10 | URI information | HTTP | String | 1 to 255 characters |
11 | Referer | HTTP | String | 1 to 255 characters |
12 | Cookie | HTTP | String | 1 to 4,096 characters |
13 | Status code | HTTP | String | 3 characters (such as 200, 404) |
14 | Connection | HTTP | String | 1 to 127 characters (close or Keep-Alive) |
15 | Content-Length | HTTP | Integer | 0 to 2147483647 |
16 | Content-Type | HTTP | String | 3 to 255 characters |
17 | Message body | HTTP | String | 0 to 2,048 characters |
- Legend:
- --: Not applicable
- #
- When you specify time data in the field, you must specify at least 6 digits including the significant digits for the CQL data type (TIMESTAMP) in the query definition file.
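The handling of missing and oversized data described at the start of this subsection can be sketched as follows. This is one possible reading rather than the connector's actual behavior: it assumes that "only up to the maximum value is stored" means capping for Integer data and truncation for String data, and that the "null character" stored for a missing String is an empty string. The function names are hypothetical.

```python
INT_MAX = 2147483647  # upper bound of the Integer ranges in Table 9-10


def to_integer_field(value, maximum=INT_MAX):
    """Missing data becomes -1; larger values are capped at maximum (assumed reading)."""
    if value is None:
        return -1
    return min(int(value), maximum)


def to_string_field(value, max_chars):
    """Missing data becomes an empty string (assumed reading of "null character");
    longer strings are cut to max_chars characters (assumed reading)."""
    if value is None:
        return ""
    return value[:max_chars]


print(to_integer_field(None), to_integer_field("1234"))
print(len(to_string_field("x" * 300, 255)))
```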