Tuning TCP KeepAlive intervals on Windows Agents

ISSUE

Long periods of inactivity on either the DATA or CONTROL channel can cause intermediate routers and switches in the network infrastructure to drop the connection. The telltale sign that this has happened is a log entry on each end indicating that the "other" side closed the connection.

On the agent side, the Windows default Keep Alive timeout is very long (2 hours), and most network infrastructures will terminate an idle connection long before that period expires. The solution is to tune the Windows TCP Keep Alives to use more realistic Timeout and Interval values by executing an IOCTL (SIO_KEEPALIVE_VALS) on the socket when it is opened. The defaults chosen are 10 minutes for the timeout and 10 seconds for the retry interval. Windows itself fixes the number of retries at 10. Both the timeout and the interval can be set by the user in the MASTER.INI file. The keep alive mechanism can also be disabled there completely (not recommended!).
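The IOCTL described above can be illustrated with a minimal Python sketch. This is not the agent's actual code, which this article does not show. The helper names keepalive_vals and apply_keepalive are hypothetical. Python exposes SIO_KEEPALIVE_VALS through socket.ioctl on Windows only, and the underlying structure takes its timeout and interval in milliseconds, so the values in seconds are converted first.

```python
import socket


def keepalive_vals(enabled=True, timeout_sec=600, interval_sec=10):
    """Build the (onoff, keepalivetime, keepaliveinterval) tuple.

    SIO_KEEPALIVE_VALS expects both times in milliseconds.
    The defaults mirror the ones described in this article.
    """
    return (1 if enabled else 0, timeout_sec * 1000, interval_sec * 1000)


def apply_keepalive(sock, enabled=True, timeout_sec=600, interval_sec=10):
    """Apply the keep alive values to an open TCP socket.

    Returns True if the IOCTL was issued, False on platforms where
    SIO_KEEPALIVE_VALS is not available (anything other than Windows).
    """
    vals = keepalive_vals(enabled, timeout_sec, interval_sec)
    if hasattr(socket, "SIO_KEEPALIVE_VALS"):
        sock.ioctl(socket.SIO_KEEPALIVE_VALS, vals)
        return True
    return False
```

A caller would typically invoke apply_keepalive(sock) right after creating or connecting the socket, which corresponds to the "when it is opened" step described above.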

Master.ini values are as follows:

[BProfessional]
...
   NetTCPKeepAlive = ; Enable or disable TCP Keep Alives (default: True)
   NetTCPKeepAliveTimeout = ; Timeout Value in seconds (default: 600)
   NetTCPKeepAliveInterval = ; Keep Alive Retry Interval in seconds (default: 10)
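To make the default handling concrete, here is a small Python sketch that reads these three keys and falls back to the documented defaults when a key is missing or left empty. It is only an illustration: the function name read_keepalive_settings is hypothetical, and the assumptions that keys are matched case-insensitively and that booleans follow the usual true/false spellings are ours, not documented behavior of the agent.

```python
import configparser

# Defaults as documented in this article.
DEFAULT_ENABLED = True
DEFAULT_TIMEOUT_SEC = 600
DEFAULT_INTERVAL_SEC = 10


def read_keepalive_settings(ini_text):
    """Return (enabled, timeout_sec, interval_sec) from Master.ini text."""
    # Treat "; ..." after a value as a comment, as in the listing above.
    parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(ini_text)

    def raw(key):
        # configparser lowercases option names, so the lookup is
        # case-insensitive (an assumption about the real agent).
        return parser.get("BProfessional", key, fallback="").strip()

    states = configparser.ConfigParser.BOOLEAN_STATES
    enabled = states.get(raw("NetTCPKeepAlive").lower(), DEFAULT_ENABLED)
    timeout = raw("NetTCPKeepAliveTimeout")
    interval = raw("NetTCPKeepAliveInterval")
    return (
        enabled,
        int(timeout) if timeout else DEFAULT_TIMEOUT_SEC,
        int(interval) if interval else DEFAULT_INTERVAL_SEC,
    )
```

For example, passing the text "[BProfessional]\nNetTcpKeepAliveTimeout=120\nNetTcpKeepAliveInterval=5\n" yields keep alives enabled with a 120 second timeout and a 5 second interval.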
 
The WBPS_x.log messages have been enhanced to be more informative. Sample log entries for several Master.ini configurations:
 
Master.ini: No entries related to Keep Alives (default)
Log entry : Jul 18 10:29:29 : OpenSocket     : set socket options SIO_KEEPALIVE_VALS (Enabled, TO: 600 sec,TI: 10 sec)
 
Master.ini: NetTcpKeepAlive=False
Log entry : Jul 18 10:49:18 : OpenSocket     : set socket options SIO_KEEPALIVE_VALS (Disabled)
 
Master.ini: NetTcpKeepAliveTimeout=120
            NetTcpKeepAliveInterval=5
Log entry : Jul 18 10:56:05 : OpenSocket     : set socket options SIO_KEEPALIVE_VALS (Enabled, TO: 120 sec,TI: 5 sec)
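To confirm which settings actually took effect, the log entries above can be checked mechanically. The following Python sketch parses a WBPS_x.log line of the form shown in these samples. The function name parse_keepalive_log is hypothetical, and the pattern is based only on the three sample entries in this article, so other log formats may not match.

```python
import re

# Matches "(Disabled)" or "(Enabled, TO: <n> sec,TI: <n> sec)".
KEEPALIVE_PATTERN = re.compile(
    r"SIO_KEEPALIVE_VALS \((Disabled|Enabled, TO: (\d+) sec,\s*TI: (\d+) sec)\)"
)


def parse_keepalive_log(line):
    """Return (enabled, timeout_sec, interval_sec), or None if no match."""
    match = KEEPALIVE_PATTERN.search(line)
    if match is None:
        return None
    if match.group(1) == "Disabled":
        return (False, None, None)
    return (True, int(match.group(2)), int(match.group(3)))
```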
