9.10.2 HTTP packet input connector definition

You specify an HTTP packet input connector definition (HttpPacketInputConnectorDefinition tag) as a child element of a CB definition for input (InputCBDefinition tag).

For details about HTTP packet input processing, see 10.3 HTTP packet input.

Organization of this subsection
(1) Format
(2) Details of definition
(3) Example
(4) Information to be specified in the command tag when WinDump is used
(5) Identifiers that can be specified as field names
(6) Java data types that are stored as field values

(1) Format

<HttpPacketInputConnectorDefinition>
 <input buffersize="I/O-buffer-size"
  assemblingtime="packet-segment-assembly-time">
   <packetdata
    globalheader="global-header-area-size"
    packetheader="packet-header-area-size"
    packetoffset="offset-for-packet-data-size-area"
    packetlength="packet-data-size-area-size"
    timeoffset="offset-for-timestamp-area"/>
   <command path="command-path-name"
    parameter="command-parameter"/>
 </input>
 <output unit="maximum-output-unit">
   <record name="record-name" type="{REQUEST|RESPONSE}">
     <field name="field-name"/>
   </record>
 </output>
</HttpPacketInputConnectorDefinition>

(2) Details of definition

HttpPacketInputConnectorDefinition tag (all definition information)
Defines all HTTP packet input connector definition information. You specify this definition only once.
input tag (input definition)
Defines the HTTP packet input or output information. You specify this definition only once.
buffersize="I/O-buffer-size"
Specifies as an integer from 1 to 12288 the maximum number of HTTP packets that can be stored in the input buffer or the maximum number of common records that can be stored in the output buffer. If this attribute is omitted, 4096 is assumed.
assemblingtime="packet-segment-assembly-time"
Specifies as an integer from 1 to 5000 the packet segment assembly time (in milliseconds). If this attribute is omitted, 2000 is assumed.
Packet segments are packet data that has been segmented at the TCP layer according to the maximum segment size (MSS). If the data collected by the HTTP packet input connector consists of packet segments, the segments are assembled into packets on the basis of the TCP protocol header information. The packet segment assembly time is the time allowed, measured from the arrival of a new packet segment, for linking the received packet segments into a complete packet. Each time packet segments are assembled into a packet, the counter is reset. Note that packet data segmented at the IP layer is not assembled.
If packet segments cannot be assembled within the amount of time specified here, the packet segments undergoing assembly processing are discarded. The timestamp of an assembled packet is the most recent time information among the packet segments used for assembly.
packetdata tag (HTTP packet definition)
Defines the format of the HTTP packets to be acquired. You specify this definition only once.
globalheader="global-header-area-size"
Specifies the size (in bytes) of the global header area, as an integer from 1 to 128. If this attribute is omitted, 24 is assumed.
packetheader="packet-header-area-size"
Specifies the size (in bytes) of the packet header area, as an integer from 9 to 128. If this attribute is omitted, 16 is assumed.
packetoffset="offset-for-packet-data-size-area"
Specifies the offset (in bytes) from the beginning of the packet header to the packet data size area, as an integer from 0 to 127. If this attribute is omitted, 8 is assumed.
packetlength="packet-data-size-area-size"
Specifies the size (in bytes) of the packet data size area, as an integer from 1 to 4. If this attribute is omitted, 4 is assumed.
timeoffset="offset-for-timestamp-area"
Specifies the offset (in bytes) from the beginning of the packet header to the timestamp area, as an integer from 0 to 120. If this attribute is omitted, 0 is assumed.
The timestamp area, which is located in the packet header area, stores the timestamp obtained when the packet analyzer captured the HTTP packet. The following figure shows the format of the timestamp area.

Figure 9-2 Format of timestamp area

[Figure]
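
As a rough illustration of how the packetdata attributes describe the captured byte stream, the following Python sketch walks a stream using the default values (24, 16, 8, 4, and 0). The byte order and the layout of the timestamp area (4-byte seconds followed by 4-byte microseconds, little-endian) are assumptions borrowed from the common libpcap capture format, not taken from Figure 9-2, and read_packets is a hypothetical helper name, not part of the product.

```python
import struct
from datetime import datetime, timedelta, timezone

# Default values of the packetdata attributes described above.
GLOBAL_HEADER = 24  # globalheader: size of the global header area
PACKET_HEADER = 16  # packetheader: size of each packet header area
PACKET_OFFSET = 8   # packetoffset: offset of the packet data size area
PACKET_LENGTH = 4   # packetlength: size of the packet data size area
TIME_OFFSET = 0     # timeoffset: offset of the timestamp area


def read_packets(stream: bytes):
    """Yield (timestamp, packet_data) pairs from a captured byte stream."""
    pos = GLOBAL_HEADER  # the global header appears once, at the start
    while pos + PACKET_HEADER <= len(stream):
        header = stream[pos:pos + PACKET_HEADER]
        # ASSUMPTION: the size area is an unsigned little-endian integer.
        size_area = header[PACKET_OFFSET:PACKET_OFFSET + PACKET_LENGTH]
        size = int.from_bytes(size_area, "little")
        # ASSUMPTION: the timestamp area holds seconds, then microseconds.
        seconds, micros = struct.unpack_from("<II", header, TIME_OFFSET)
        stamp = (datetime.fromtimestamp(seconds, tz=timezone.utc)
                 + timedelta(microseconds=micros))
        start = pos + PACKET_HEADER
        if start + size > len(stream):
            break  # the last packet is incomplete; stop here
        yield stamp, stream[start:start + size]
        pos = start + size
```

With non-default attribute values, only the five constants at the top would change; the walk itself stays the same.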
command tag (command definition)
Defines the start command for the packet analyzer that is to be used. You specify this definition only once.
Stream Data Platform - AF supports WinDump as the packet analyzer. For details about the information to be defined here when WinDump is used, see (4) Information to be specified in the command tag when WinDump is used.
path="command-path-name"
Specifies the absolute path of the packet analyzer start command, as 1 to 100 single-byte characters (ASCII codes 32 to 126).
parameter="command-parameter"
Specifies a parameter to be passed to the packet analyzer start command, as 1 to 100 single-byte characters (ASCII codes 32 to 126).
output tag (output definition)
Defines the output information for common records when the HTTP packet input connector converts HTTP packets to common records. You specify this definition only once.
unit="maximum-output-unit"
Specifies the maximum number of output units (in records) for common records, as an integer from 1 to 1000. If this attribute is omitted, 100 is assumed.
record tag (record definition)
Defines record information for common records. You may specify 1 or 2 record definitions.
name="record-name"
Specifies a name for a common record, as 1 to 100 single-byte alphanumeric characters and the underscore (_). This record name must begin with a single-byte alphabetic character. This attribute cannot be omitted. This record name must be unique within the record tag.
type="{REQUEST|RESPONSE}"
Specifies the type of the common record. This attribute cannot be omitted.
The permitted values are as follows:
  • "REQUEST"
    Indicates a request record. This is a type of common record used to store data generated during request transmission from client to host in HTTP protocol communication.
  • "RESPONSE"
    Indicates a response record. This is a type of common record used to store data generated during response transmission from host to client in HTTP protocol communication.
field tag (field definition)
Defines information about the fields that constitute the common record. You may specify 1 to 16 field definitions.
name="field-name"
Specifies a field name.
Specify for a field name the identifier of data that is to be extracted from HTTP packets. The data corresponding to this identifier is stored as the field's value.
Each field name must be unique within the record definition.
The identifiers that can be specified as field names depend on the type of common record specified in the type attribute in the record tag. For details about the identifiers that can be specified as field names, see (5) Identifiers that can be specified as field names.
Extracted data is converted to the Java data type and then stored as field values. For details about the Java data types for the data that is stored as field values, see (6) Java data types that are stored as field values.
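
The rules for the record and field tags can be checked mechanically before a definition is deployed. The following Python sketch is one possible way to do so; check_record is a hypothetical helper, and the identifier sets are transcribed from Table 9-9 in (5).

```python
import re

# Identifiers allowed for both record types (from Table 9-9).
COMMON = {
    "TIME", "PACKET_LENGTH", "SEND_MAC", "SEND_IP", "RECEIVE_IP",
    "SEND_PORT", "RECEIVE_PORT", "MESSAGE_TYPE", "COOKIE",
    "CONNECTION", "CONTENT_LENGTH", "CONTENT_TYPE",
}
ALLOWED = {
    "REQUEST": COMMON | {"METHOD_NAME", "TARGET_URI", "REFERER", "MESSAGE_BODY"},
    "RESPONSE": COMMON | {"STATUS_CODE"},
}
# 1 to 100 characters, alphanumerics and underscore, starting with a letter.
RECORD_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,99}")


def check_record(name, record_type, fields):
    """Return a list of rule violations for one record definition."""
    problems = []
    if not RECORD_NAME.fullmatch(name):
        problems.append(f"invalid record name: {name!r}")
    allowed = ALLOWED.get(record_type)
    if allowed is None:
        problems.append(f"invalid type: {record_type!r}")
    if not 1 <= len(fields) <= 16:
        problems.append("a record needs 1 to 16 field definitions")
    if len(set(fields)) != len(fields):
        problems.append("field names must be unique within the record")
    if allowed is not None:
        for field in fields:
            if field not in allowed:
                problems.append(f"{field} cannot be specified for {record_type}")
    return problems
```

For example, check_record("RECORD2", "RESPONSE", ["STATUS_CODE", "TARGET_URI"]) reports that TARGET_URI cannot be specified for a response record.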

(3) Example

<?xml version="1.0" encoding="UTF-8"?>
<root:AdaptorCompositionDefinition
xmlns:hpicon="http://www.hitachi.co.jp/soft/xml/sdp/adaptor/definition/callback/HttpPacketInputConnectorDefinition">
<!-- Omitted -->

<!-- CB definition for input -->
<cb:InputCBDefinition
class="jp.co.Hitachi.soft.sdp.adaptor.callback.io.packetinput.HttpPacketInputCBImpl" name="inputer2">
 <!-- HTTP packet input connector definition -->
 <hpicon:HttpPacketInputConnectorDefinition>
   <!-- Input definition -->
   <hpicon:input buffersize="4096" assemblingtime="2000">
     <!-- HTTP packet definition -->
     <hpicon:packetdata globalheader="24" packetheader="16" packetoffset="8"
      packetlength="4" timeoffset="0"/>
     <!-- Command definition -->
     <hpicon:command path="C:\Program Files\WinDump\WinDump.exe"
      parameter=" -i 1 -s 2048 -w - -n &quot;tcp port 80 and host 133.145.224.19&quot;"/>
   </hpicon:input>
   <!-- Output definition -->
   <hpicon:output unit="100">
     <!-- Record definition -->
     <hpicon:record name="RECORD1" type="REQUEST">
       <!-- Field definition -->
       <hpicon:field name="SEND_IP"/>
       <hpicon:field name="RECEIVE_IP"/>
       <hpicon:field name="SEND_PORT"/>
       <hpicon:field name="RECEIVE_PORT"/>
       <hpicon:field name="MESSAGE_TYPE"/>
       <hpicon:field name="TARGET_URI"/>
     </hpicon:record>
   </hpicon:output>
 </hpicon:HttpPacketInputConnectorDefinition>
</cb:InputCBDefinition>
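
The numeric ranges and defaults listed in (2) can be applied to a definition like the one above with a few lines of Python. This is only a sketch: effective_values is a hypothetical helper, it expects a well-formed fragment whose root is the HttpPacketInputConnectorDefinition element, and it assumes the default for an attribute even when the enclosing tag itself is missing.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the hpicon prefix in the example above.
NS = ("http://www.hitachi.co.jp/soft/xml/sdp/adaptor/definition/"
      "callback/HttpPacketInputConnectorDefinition")

# (tag, attribute) -> (minimum, maximum, default), from (2) Details of definition.
RULES = {
    ("input", "buffersize"): (1, 12288, 4096),
    ("input", "assemblingtime"): (1, 5000, 2000),
    ("packetdata", "globalheader"): (1, 128, 24),
    ("packetdata", "packetheader"): (9, 128, 16),
    ("packetdata", "packetoffset"): (0, 127, 8),
    ("packetdata", "packetlength"): (1, 4, 4),
    ("packetdata", "timeoffset"): (0, 120, 0),
    ("output", "unit"): (1, 1000, 100),
}


def effective_values(xml_text):
    """Return ({attribute: value}, [problems]) for one connector definition."""
    root = ET.fromstring(xml_text)
    values, problems = {}, []
    for (tag, attr), (low, high, default) in RULES.items():
        elem = root.find(f".//{{{NS}}}{tag}")
        raw = elem.get(attr) if elem is not None else None
        if raw is None:
            values[attr] = default  # omitted attribute: the default is assumed
            continue
        try:
            value = int(raw)
        except ValueError:
            problems.append(f"{attr}={raw!r} is not an integer")
            continue
        if not low <= value <= high:
            problems.append(f"{attr}={value} is outside {low} to {high}")
        values[attr] = value
    return values, problems
```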

(4) Information to be specified in the command tag when WinDump is used

This subsection discusses the information to be specified in the command tag when WinDump is used as the packet analyzer.

The example presented here uses WinDump version 3.9.5. For details about the WinDump start command, see the WinDump documentation.

The HTTP packet input connector supports the WinDump start command (WinDump.exe) specified in the following format.

WinDump.exe△-i△network-device-number△-s△internal-buffer-size△-w△-△-n△"tcp△port△port-number△and△host△IP-address"
Legend:
△: Single-byte space. It might be permissible to omit parameter delimiters (single-byte spaces) in the WinDump start command, but the single-byte spaces cannot be omitted from the parameter attribute in the command tag.

The following explains each value in the format.

WinDump.exe
This is the WinDump start command. Specify the absolute path of WinDump.exe in the path attribute in the command tag.
-i△network-device-number
This option specifies the number of the network device connected to the target computer that is to be analyzed. Specify this option and value in the parameter attribute in the command tag.
-s△internal-buffer-size
This option specifies the size (in bytes) of the internal buffer for storing the captured packet data. HTTP requires a larger packet size than TCP; normally, 2,048 bytes is sufficient. Specify this option and value in the parameter attribute in the command tag.
-w△-
This option specifies a file or the standard output as the output destination of the captured packets. If you use an HTTP packet input connector, specify -w△- because packets are output to the standard output. Specify this option and value in the parameter attribute in the command tag.
-n△"tcp△port△port-number△and△host△IP-address"
This option specifies the capture filter in the filter format of the packet capture library (libpcap). If you use an HTTP packet input connector, specify the port number used with the HTTP protocol and the IP address of the computer to be analyzed. Specify this option and value in the parameter attribute in the command tag.
Note that a double quotation mark (") is treated as a special character, so code it as &quot;.
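
The following Python sketch shows one way to assemble the value of the parameter attribute with the double quotation marks coded as &quot;. windump_parameter is a hypothetical helper. The leading space mirrors the example in (3), and applying the 1-to-100 character limit before escaping is an assumption, because the text does not say which form the limit applies to.

```python
from xml.sax.saxutils import escape


def windump_parameter(device, buffer_size, port, host):
    """Build the parameter attribute value for the command tag."""
    # Single-byte spaces delimit every item; the leading space follows
    # the example in (3).
    raw = (f' -i {device} -s {buffer_size} -w - -n '
           f'"tcp port {port} and host {host}"')
    # ASSUMPTION: the 1-to-100 character limit applies before escaping.
    if not 1 <= len(raw) <= 100:
        raise ValueError("parameter must be 1 to 100 characters")
    if any(not 32 <= ord(ch) <= 126 for ch in raw):
        raise ValueError("parameter must use ASCII codes 32 to 126")
    # escape() handles &, <, and >; the extra mapping turns " into &quot;.
    return escape(raw, {'"': "&quot;"})
```

Calling windump_parameter(1, 2048, 80, "133.145.224.19") reproduces the parameter value used in the example in (3).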

(5) Identifiers that can be specified as field names

The table below lists and describes the identifiers that can be specified as field names in the name attribute in the field tag.

Table 9-9 Identifiers that can be specified as field names

| No. | Identifier | Data | Description | Protocol | Specifiable for request record | Specifiable for response record |
|---|---|---|---|---|---|---|
| 1 | TIME | Time #1 | Time packet data arrived | -- | Y | Y |
| 2 | PACKET_LENGTH | Packet size #2 | Size of packet data (bytes) | -- | Y | Y |
| 3 | SEND_MAC | Sending MAC address | MAC address at the packet sending end | Ethernet | Y | Y |
| 4 | SEND_IP | Sending IP address | IP address at the packet sending end | IP | Y | Y |
| 5 | RECEIVE_IP | Receiving IP address | IP address at the packet receiving end | IP | Y | Y |
| 6 | SEND_PORT | Sending port number | Port number at the packet sending end | TCP | Y | Y |
| 7 | RECEIVE_PORT | Receiving port number | Port number at the packet receiving end | TCP | Y | Y |
| 8 | MESSAGE_TYPE | Message type | Request or Response | HTTP | Y | Y |
| 9 | METHOD_NAME | Method information | Method information, such as GET and POST | HTTP | Y | N |
| 10 | TARGET_URI | URI information #3 | Target URI information | HTTP | Y | N |
| 11 | REFERER | Referer #3 | Link source URI information | HTTP | Y | N |
| 12 | COOKIE | Cookie #3 #4 | Cookie information #5 | HTTP | Y | Y |
| 13 | STATUS_CODE | Status code | Result of request processing | HTTP | N | Y |
| 14 | CONNECTION | Connection | Connection persistency information | HTTP | Y | Y |
| 15 | CONTENT_LENGTH | Content-Length | Contents size (bytes) | HTTP | Y | Y |
| 16 | CONTENT_TYPE | Content-Type | Contents type | HTTP | Y | Y |
| 17 | MESSAGE_BODY | Message body #3 | Real data | HTTP | Y #6 | N |
Legend:
Y: Identifier can be specified
N: Identifier cannot be specified
--: Not applicable
#1
The time is obtained from the timestamp in the packet header.
#2
The packet size is the sum of the size of the HTTP message start line, the size of the HTTP headers, and the value of the HTTP header "Content-Length". If there is no "Content-Length" header, its value is treated as 0.
#3
Percent-encoded character strings are decoded as UTF-8.
#4
The method for acquiring cookie data is not the same for request records and response records.
For a request record, the data referenced by the HTTP header "Cookie" is acquired.
For a response record, the data referenced by the HTTP header "Set-Cookie" is acquired. The data referenced by the HTTP headers "Cookie2" and "Set-Cookie2" is not acquired.
#5
The cookie information might contain multiple cookies. If you want to treat each cookie as a field, specify regexsubstring in the function attribute in the map tag (mapping definition) and acquire a character string using a regular expression.
#6
You can obtain the message body only when the method information is POST, the media type in Content-Type is text (regardless of the value of subtype), and Content-Length exists.
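
To illustrate the idea in note #5 of picking a single cookie out of the cookie information with a regular expression, here is a small Python sketch. It uses Python's re module only to show the kind of pattern involved. It does not reproduce the regexsubstring function or the map tag syntax, and cookie_value is a hypothetical helper name.

```python
import re


def cookie_value(cookie_field, name):
    """Return the value of one cookie from a 'a=1; b=2' style string."""
    # Match the name at the start of the string or right after a ';'
    # delimiter, then capture everything up to the next ';'.
    pattern = r"(?:^|;\s*)" + re.escape(name) + r"=([^;]*)"
    match = re.search(pattern, cookie_field)
    return match.group(1) if match else None
```

For a COOKIE field value of "sid=abc123; lang=en", cookie_value(value, "lang") yields "en", and a name that is not present yields None.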

(6) Java data types that are stored as field values

Each protocol data item extracted according to the identifier specified in the name attribute in the field tag is converted to a Java data type, as shown in the table below. During conversion to the Java data type, if the value of the protocol data is greater than the permitted maximum value, only up to the maximum value is stored as the field value. If the data corresponding to the specified identifier is not found in the protocol data, the null character is stored as the field value for the String type and -1 is stored for the Integer type.

Table 9-10 Java data types that are stored as field values

| No. | Data | Protocol | Java data type | Value range |
|---|---|---|---|---|
| 1 | Time # | -- | Timestamp | 1970/01/01 00:00:00.000000 to 2261/12/31 23:59:59.999999 |
| 2 | Packet size | -- | Integer | 0 to 2147483647 |
| 3 | Sending MAC address | Ethernet | String | 17 characters (00:00:00:00:00:00 to FF:FF:FF:FF:FF:FF) |
| 4 | Sending port number | TCP | Integer | 0 to 65535 |
| 5 | Receiving port number | TCP | Integer | 0 to 65535 |
| 6 | Sending IP address | IP | String | In IPv4: 7 to 15 characters (0.0.0.0 to 255.255.255.255). In IPv6: 40 characters (0000:0000:0000:0000: 0000:0000:0000:0000 to FFFF:FFFF:FFFF:FFFF: FFFF:FFFF:FFFF:FFFF) |
| 7 | Receiving IP address | IP | String | Same as for the sending IP address |
| 8 | Data type | HTTP | String | 7 or 8 characters (Request or Response) |
| 9 | Method information | HTTP | String | 1 to 127 characters (such as GET or CONNECT) |
| 10 | URI information | HTTP | String | 1 to 255 characters |
| 11 | Referer | HTTP | String | 1 to 255 characters |
| 12 | Cookie | HTTP | String | 1 to 4,096 characters |
| 13 | Status code | HTTP | String | 3 characters (such as 200, 404) |
| 14 | Connection | HTTP | String | 1 to 127 characters (close or Keep-Alive) |
| 15 | Content-Length | HTTP | Integer | 0 to 2147483647 |
| 16 | Content-Type | HTTP | String | 3 to 255 characters |
| 17 | Message body | HTTP | String | 0 to 2,048 characters |
Legend:
--: Not applicable
#
When you specify time data in the field, you must specify at least 6 digits including the significant digits for the CQL data type (TIMESTAMP) in the query definition file.
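
The conversion rules stated at the beginning of this subsection (values above the maximum, and data that is not found) might be sketched in Python as follows. Two points are assumptions rather than documented behavior: the "null character" stored for a missing String item is represented here as an empty string, and an Integer value above the maximum is capped at the maximum. to_field_value is a hypothetical helper name.

```python
def to_field_value(raw, java_type, maximum):
    """Convert one extracted item to the value stored in the field."""
    if java_type == "String":
        if raw is None:
            return ""  # ASSUMPTION: "null character" read as an empty string
        return raw[:maximum]  # keep only up to the maximum number of characters
    if java_type == "Integer":
        if raw is None:
            return -1  # data not found: -1 is stored for the Integer type
        return min(int(raw), maximum)  # ASSUMPTION: larger values are capped
    raise ValueError(f"unsupported type: {java_type}")
```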