9.3.1 Setting the timers for monitoring communication
For communications between the EADS clients and EADS servers that use TCP, EADS detects communication errors by monitoring the following durations:
- Duration from the start of connection to its completion, using a socket
- Duration from the start of a data write operation to its completion
- Duration from the start of a data read operation to its completion
  Multiple data read operations might occur when a single message is received.
- Approach
  A shorter monitoring interval lets EADS detect communication errors sooner, whereas a longer monitoring interval keeps timeouts from occurring too frequently.
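The three monitored durations correspond to ordinary TCP socket timeouts. The following Python sketch is not EADS code: it uses only the standard socket module against a throwaway loopback echo server, and the function name timed_exchange and the 3-second values are assumptions chosen for illustration.

```python
import socket
import threading


def echo_once(listener):
    """Accept one connection and echo back whatever arrives first."""
    conn, _ = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def timed_exchange(host, port, payload, connect_timeout, send_timeout, recv_timeout):
    """Send payload and read an equally long reply, each phase with its own timeout."""
    # Phase 1: connection establishment is bounded by connect_timeout.
    sock = socket.create_connection((host, port), timeout=connect_timeout)
    with sock:
        # Phase 2: the write operation is bounded by send_timeout.
        sock.settimeout(send_timeout)
        sock.sendall(payload)
        # Phase 3: every individual read is bounded by recv_timeout.
        # One message may need several reads.
        sock.settimeout(recv_timeout)
        chunks = []
        received = 0
        while received < len(payload):
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("peer closed the connection")
            chunks.append(chunk)
            received += len(chunk)
        return b"".join(chunks)


# Throwaway echo server on an OS-assigned loopback port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
worker = threading.Thread(target=echo_once, args=(listener,), daemon=True)
worker.start()

reply = timed_exchange("127.0.0.1", port, b"ping", 3.0, 3.0, 3.0)
worker.join(timeout=5)
listener.close()
```

If any phase exceeds its limit, the socket module raises a timeout error (socket.timeout, an alias of TimeoutError in recent Python versions), which is the point at which a client would treat the channel as failed.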
(1) Specifying timeout values for communication between EADS client and EADS server
You can specify timeout values for communication between EADS client and EADS server as shown in the following figure.
The table below lists the parameters used for specifying communication timeout values. The numbers 1 to 6 in the table correspond to the numbers in Figure 9-1 Specifying timeout values for communication between EADS client and EADS server.
No. | Timeout value to be specified | Property file | Parameter name |
---|---|---|---|
1 | Connection to the EADS server | Client property file | |
2 | Data transmission to the EADS server | Client property file | |
3 | Data reception from the EADS server | Client property file | |
4 | Data reception from the EADS client | Server property file | |
5 | Data transmission to the EADS client | Server property file | |
6 | Closing an idle permanent connection | Server property file | |
The following figure shows the locations at which timeout values can be specified for communication between EADS client and EADS server.
Details of specifying timeout values ((1) to (6) in the figure) are explained in the following subsections:
- 9.3.1(2) Approach to specifying communication timeout values
- 9.3.1(3) Specifying a timeout value for closing a permanent connection
In addition, tips (1 to 5) are explained in 9.3.1(4) Tips for considering the timeout values to be specified.
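As a rough illustration of where these settings live, the sketch below parses two small property texts and collects the timeout parameters that this section names. The parser is a minimal key=value reader (not the full Java properties syntax), the sample values are hypothetical, and the units are an assumption because this section does not state them.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comment lines."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# Hypothetical client-side settings (values are placeholders).
CLIENT_SAMPLE = """\
# client properties
eads.client.connection.send.timeout = 3
eads.client.clusterInfo.update.interval = 5
eads.client.connectionPool.poolsize = 10
"""

# Hypothetical server-side settings (values are placeholders).
SERVER_SAMPLE = """\
# server properties
eads.server.connection.timeout = 3
eads.server.connection.keepAlive.timeout = 3600
"""

client = parse_properties(CLIENT_SAMPLE)
server = parse_properties(SERVER_SAMPLE)

# Keep only the parameters whose names mark them as timeouts.
timeouts = {
    name: int(value)
    for name, value in {**client, **server}.items()
    if "timeout" in name.lower()
}
```

Collecting the values in one place makes it easier to check the client-side and server-side settings against each other, which the later subsections require.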
(2) Approach to specifying communication timeout values
The communication timeout values indicated by 1 to 5 in Figure 9-2 Locations at which timeout values can be specified for communication between EADS client and EADS server are intended to detect the following events. They are not deadlines by which the corresponding processing must be completed:
- Physical closure of a channel
- Closure of a channel to the communication target for a reason such as a failure at the communication target
We do not recommend using the communication timeout function to close a connection merely because no response was returned within the expected time.
The EADS server processing continues regardless of the status of the communication channel to the EADS client. Data is not rolled back even if channel closure is detected while a response is being sent to the EADS client. Therefore, if the communication timeout function is used to close communication, the EADS client cannot obtain the data operation results.
We recommend that you tune the communication timeout values so that failures at the communication target and channel closure will be detected correctly, rather than using them to detect delays in the processing and the network.
(3) Specifying a timeout value for closing a permanent connection
The timeout value for closing a permanent connection (6 in Figure 9-2 Locations at which timeout values can be specified for communication between EADS client and EADS server) covers the case in which the EADS server cannot detect that the OS or host running the application program (user program) has shut down. Without this timeout, the connection and its thread would remain unavailable until the TCP keep-alive idle time set in the OS is reached.
We recommend that you consider the following two points and specify a sufficient value within a range that will not affect other functions and that will avoid unnecessary communication closure and errors.
- Relationship with the number of connections on the EADS client
The EADS client maintains permanent connections even when there is no operation request from the application program to the EADS server for a specified period of time. Permanent connections are used for periodic checking of cluster information. The maximum number of permanent connections per EADS server is specified in the eads.client.connectionPool.poolsize parameter in the client properties. The permanent connections are used at the interval specified for each EADS server in the eads.client.clusterInfo.update.interval parameter in the client properties.
Only one thread is used to check the cluster information. If an attempt is made to check the cluster information at the address of a host whose OS is not running normally, the processing will be placed in wait status for a maximum of the duration specified in the eads.client.connection.send.timeout parameter in the client properties. During this time, communication with other EADS servers for checking the cluster information will be placed on hold.
In such a situation, to prevent permanent connections from being closed when there is no request from the EADS client, specify a sufficient value in the eads.server.connection.keepAlive.timeout parameter in the server properties that can satisfy the following condition:
eads.server.connection.keepAlive.timeout parameter value in the server properties
> (eads.client.connection.send.timeout parameter value in the client properties × permissible number of EADS server failures
+ eads.client.clusterInfo.update.interval parameter value in the client properties)
× eads.client.connectionPool.poolsize parameter value in the client properties
If the EADS server process is shut down but there is no problem on the OS, wait status does not last as long as the duration specified in the above eads.client.connection.send.timeout parameter. Therefore, when determining the permissible number of EADS server failures, count failures such as host failures, not process failures.
- Relationship with the maximum number of connections on the EADS server
If the eads.server.maxConnections parameter is specified in the server properties, the number of connections from the EADS client is limited and any connection attempted beyond the specified value will result in an error.
If the minimum value is specified in this parameter and a network disconnection that cannot be detected from the EADS server's network occurs due to a host or OS error, the EADS server's connection might be maintained until the time specified in the eads.server.connection.keepAlive.timeout parameter in the server properties is reached. As a result, the EADS client might not be able to reestablish connection after error recovery processing. To resolve this problem, specify a parameter value that satisfies the following condition:
eads.server.connection.keepAlive.timeout parameter value in the server properties
< time required for the EADS client to reestablish connection after a failure
Note that the handling time depends greatly on the nature of the failure. If there is sufficient memory, we recommend that you add the number of connections that will be used for reconnection to the eads.server.maxConnections parameter value in advance and specify a sufficient value, such as 3600 seconds, in the eads.server.connection.keepAlive.timeout parameter.
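The first condition above (relationship with the number of connections on the EADS client) can be worked through numerically. The sketch below reads the condition as (send timeout × permissible number of failures + update interval) × pool size; that reading of the operators, the function names, and the sample values are assumptions, and all values are assumed to be expressed in the same time unit.

```python
def min_keepalive_timeout(send_timeout, permissible_host_failures,
                          update_interval, pool_size):
    """Lower bound that the keep-alive timeout must exceed.

    One cluster-information check can stall for send_timeout on each failed
    host, and each pooled connection is assumed to be used once per
    pool_size checks.
    """
    per_check = send_timeout * permissible_host_failures + update_interval
    return per_check * pool_size


def keepalive_is_sufficient(keepalive_timeout, send_timeout,
                            permissible_host_failures, update_interval,
                            pool_size):
    """True if the keep-alive timeout strictly exceeds the lower bound."""
    bound = min_keepalive_timeout(send_timeout, permissible_host_failures,
                                  update_interval, pool_size)
    return keepalive_timeout > bound


# Hypothetical settings: 3 s send timeout, 1 tolerated host failure,
# 5 s update interval, 10 pooled connections.
lower_bound = min_keepalive_timeout(3, 1, 5, 10)
print(lower_bound)  # 80
print(keepalive_is_sufficient(3600, 3, 1, 5, 10))  # True
```

With these sample numbers, a keep-alive timeout of 3600 seconds leaves a wide margin over the bound of 80, whereas a value of 60 would allow idle permanent connections to be closed between cluster-information checks.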
(4) Tips for considering the timeout values to be specified
This subsection provides tips for considering the timeout values to be specified. These tips correspond to 1 to 5 in Figure 9-2.
- Tip 1 (1)
Normally, a permanent connection is used for communication from the EADS client. Therefore, connection processing occurs only when connection is established for the first time and when the number of threads that use client libraries increases. For the connection timeout value, the same value is used as for the transmission timeout value (eads.client.connection.send.timeout parameter value in the client properties).
- Tip 2 (2)
Data send processing is treated as being complete when data has been stored in the sender's send buffer. Therefore, data send processing succeeds regardless of the statuses of the receive process and channel as long as the data fits in the send buffer (if there is a problem in the channel, the next receive processing results in an error).
If an attempt is made to send data that is larger than the size of the send buffer, a send error might occur due to a problem (such as FullGC) at the receiving process. If you will be sending data that is larger than the send buffer size, take into account the time required for processing events such as the receiving process's FullGC.
- Tip 3 (3)
Send and receive processing takes place internally in multiple segments. The timeout values are applied to each segment. Therefore, the timeout values are not for guaranteeing the length of time before replies are returned.
- Tip 4 (4)
The EADS client performs the next read processing when its current send processing is completed. The EADS client does not consider whether the EADS server has received the data that the EADS client sent.
The EADS client's receive timeout includes the time for performing data operations and executing user functions on the EADS server. If there is a user function that requires a long time to process and the client properties are specified to accommodate such a user function's processing time, the value might not be suitable for normal communication processing. In such a case, we recommend that you use API functions with timeout settings instead of specifying the timeout values in the client properties.
- Tip 5 (5)
After sending a response to the EADS client, the EADS server waits for the next request. For this receive processing, the timeout value for permanent connections is used (eads.server.connection.keepAlive.timeout parameter value specified in the server properties), not the normal communication timeout value (eads.server.connection.timeout parameter value specified in the server properties). For details about the timeout value for permanent connections, see 9.3.1(3) Specifying a timeout value for closing a permanent connection.
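Tip 3 states that a timeout applies to each internal segment rather than to the whole exchange. The following simulation is a simplified model of that behavior, not EADS code; the function name simulate_receive and the delay values are hypothetical.

```python
def simulate_receive(segment_delays, per_segment_timeout):
    """Model a receive made of several segments, each timed separately.

    Returns (completed, elapsed): whether every segment arrived within the
    per-segment timeout, and the total time spent until completion or until
    the first timeout fired.
    """
    elapsed = 0.0
    for delay in segment_delays:
        if delay > per_segment_timeout:
            # The timer for this segment expires before the data arrives.
            return False, elapsed + per_segment_timeout
        elapsed += delay
    return True, elapsed


# Four segments of 2 s each with a 3 s timeout: no timeout fires,
# yet the whole reply takes 8 s.
print(simulate_receive([2, 2, 2, 2], 3))  # (True, 8.0)

# A single 4 s stall exceeds the 3 s timeout and is detected after 1 + 3 s.
print(simulate_receive([1, 4], 3))  # (False, 4.0)
```

The first case shows why a timeout value does not guarantee how long a reply takes: the total time can exceed the configured value without any timeout occurring.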