SSH and keepalive

Jephe Wu -  http://linuxtechres.blogspot.com

Objective: make your SSH connection more stable, so that it does not disconnect due to inactivity
Environment: CentOS 5, Windows XP, putty 0.60, ssh client on CentOS 5

Concepts:
1. Why an SSH connection gets dropped while idle
A router or firewall in between invalidates the connection state.
According to http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/overview.html, this behavior is caused by the connection tracking procedures implemented in proxies and firewalls, which keep track of all connections that pass through them. Because of the physical limits of these machines, they can only keep a finite number of connections in their memory. The most common and logical policy is to keep newest connections and to discard old and inactive connections first.

Thus the trick is to send packets over idle connections periodically: often enough that the connection is never considered inactive, but otherwise as infrequently as possible.
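
The effect of an idle timeout can be illustrated with a toy model. The sketch below is only an assumption about how a connection-tracking table might behave (a simple per-connection idle timer); the class ConnTrackTable and the function survives are hypothetical names, and real firewalls use their own eviction policies.

```python
class ConnTrackTable:
    """Toy model of a firewall's connection-tracking table (hypothetical)."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout  # seconds a silent connection is remembered
        self.last_seen = {}               # connection id -> time of last packet

    def packet(self, conn_id, now):
        """Record a packet; return False if the state had already expired."""
        last = self.last_seen.get(conn_id)
        if last is not None and now - last > self.idle_timeout:
            # The firewall forgot this connection, so the packet is dropped.
            del self.last_seen[conn_id]
            return False
        self.last_seen[conn_id] = now
        return True


def survives(idle_timeout, keepalive_interval, idle_period):
    """Return True if a session that is idle for idle_period seconds still works.

    keepalive_interval == 0 means that no keepalives are sent.
    """
    table = ConnTrackTable(idle_timeout)
    table.packet("ssh", 0)  # the connection is opened at t=0
    if keepalive_interval:
        t = keepalive_interval
        while t < idle_period:
            if not table.packet("ssh", t):
                return False
            t += keepalive_interval
    # The user finally types something after the idle period.
    return table.packet("ssh", idle_period)


if __name__ == "__main__":
    print(survives(600, 0, 3600))    # no keepalive, 10 minute timeout
    print(survives(600, 300, 3600))  # keepalive every 5 minutes
```

In this model the session only survives when the keepalive interval is shorter than the idle timeout of the table.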

2. TCP keepalive and application-level keepalive
According to http://the.earth.li/~sgtatham/putty/0.58/htmldoc/Chapter4.html#config-keepalive:
TCP keepalives are similar to application-level keepalives, and the same caveats apply. The main differences are:

    * TCP keepalives are available on all connection types, including Raw and Rlogin (in PuTTY).
    * The interval between TCP keepalives is usually much longer, typically two hours; this is set by the operating system, and cannot be configured within PuTTY.
    * If the operating system does not receive a response to a keepalive, it may send out more in quick succession and terminate the connection if no response is received.

TCP keepalives may be more useful for ensuring that half-open connections are terminated than for keeping a connection alive, although they can also prevent disconnection due to network inactivity.

3. TCP keepalive
3.1 How it works
When TCP keepalive is enabled on the connection, the client's TCP stack (not ssh itself) sends a probe packet carrying no application data once the connection has been idle for tcp_keepalive_time seconds. sshd does not care about this, but the server's TCP stack must send back an ACK for that packet. If the client's TCP stack does not receive an ACK, it repeats the probe every tcp_keepalive_intvl seconds, and after tcp_keepalive_probes unanswered probes it signals a connection timeout to ssh, causing ssh to exit.
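
As an illustration of how an application asks the operating system for TCP keepalives, here is a minimal sketch in Python. The function name enable_tcp_keepalive is hypothetical. SO_KEEPALIVE is the standard socket option; the per-socket overrides TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are not mentioned in this article and are assumed to be available only on Linux, so the code skips them where they do not exist.

```python
import socket


def enable_tcp_keepalive(sock, idle=7200, interval=75, probes=9):
    """Turn on TCP keepalive for one socket (sketch, defaults mirror Linux)."""
    # Ask the OS to send keepalive probes on this connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Optional per-socket overrides of the system-wide values.
    # These constants are platform specific, so look them up defensively.
    overrides = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", probes),      # unanswered probes before giving up
    )
    for name, value in overrides:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_tcp_keepalive(s, idle=600, interval=60, probes=5)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

The probes themselves are generated by the kernel; the application only switches the feature on.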

3.2 Configuration of TCP keepalive
On Linux, the kernel settings can be read or changed at runtime through these files:

/proc/sys/net/ipv4/tcp_keepalive_intvl
/proc/sys/net/ipv4/tcp_keepalive_probes
/proc/sys/net/ipv4/tcp_keepalive_time


or permanently set them in /etc/sysctl.conf as follows:
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9


Note:
a. The settings above are the defaults; you can change them.
b. If the network hardware or software drops connections that have been idle for less than the two-hour default, the client session will fail. Keepalive timeouts are configured at the OS level and apply to all TCP connections whose application has enabled the keepalive function. Applications usually offer an option to enable it, such as the one in PuTTY.

If the network hardware or software (including firewalls) has an idle limit of one hour, then the keepalive timeout must be less than one hour. To rectify this situation, the TCP/IP keepalive settings can be lowered to fit inside the firewall limits. The implementation of TCP keepalive may vary from vendor to vendor. The original definition is quite old and is described in RFC 1122.
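
The arithmetic behind these settings can be sketched as follows. This is only an illustration: the function names are hypothetical, the fallback values are the defaults listed above, and the detection time is the usual approximation (idle time plus all unanswered probes).

```python
from pathlib import Path

PROC_DIR = Path("/proc/sys/net/ipv4")
DEFAULTS = {
    "tcp_keepalive_time": 7200,
    "tcp_keepalive_intvl": 75,
    "tcp_keepalive_probes": 9,
}


def read_keepalive_settings(proc_dir=PROC_DIR):
    """Read the three values from /proc, falling back to the defaults."""
    settings = {}
    for name, default in DEFAULTS.items():
        try:
            settings[name] = int((proc_dir / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = default
    return settings


def first_probe_fits(settings, idle_limit):
    """True if the first probe is sent before the firewall's idle limit."""
    return settings["tcp_keepalive_time"] < idle_limit


def dead_peer_detection_seconds(settings):
    """Approximate time until a dead peer is noticed on an idle connection."""
    return (settings["tcp_keepalive_time"]
            + settings["tcp_keepalive_intvl"] * settings["tcp_keepalive_probes"])


if __name__ == "__main__":
    current = read_keepalive_settings()
    print(current)
    print("fits a one hour idle limit:", first_probe_fits(current, 3600))
    print("dead peer noticed after about",
          dead_peer_detection_seconds(current), "seconds")
```

With the default values the first probe is sent after two hours, so a firewall with a one hour idle limit has already dropped the connection by then.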

4. Application level keepalive



4.1 How to configure it for the OpenSSH client command 'ssh' to prevent disconnection
Run man ssh_config on Linux and you will find:

ServerAliveInterval:
Sets a timeout interval in seconds after which if no data has been received from the server, ssh will send a message through the encrypted channel to request a response from the server. The default is 0, indicating that these messages will not be sent to the server.

This option applies to protocol version 2 only.

ServerAliveCountMax:
Sets the number of server alive messages (see above) which may be sent without ssh receiving any messages back from the server. If this threshold is reached while server alive messages are being sent, ssh will disconnect from the server, terminating the session. It is important to note that the use of server alive messages is very different from TCPKeepAlive (below). The server alive messages are sent through the encrypted channel and therefore will not be spoofable. The TCP keepalive option enabled by TCPKeepAlive is spoofable. The server alive mechanism is valuable when the client or server depend on knowing when a connection has become inactive.

The default value is 3. If, for example, ServerAliveInterval (above) is set to 30, and ServerAliveCountMax is left at the default, if the server becomes unresponsive ssh will disconnect after approximately 90 seconds.

You can use the following command (user@host is a placeholder for your own login and server):

% ssh -o TCPKeepAlive=no -o ServerAliveInterval=30 user@host

or put the options in /etc/ssh/ssh_config, for example:

    ServerAliveInterval 60
    ServerAliveCountMax 600

(the default ServerAliveCountMax is 3 according to man ssh_config)

or
put above options in $HOME/.ssh/config  (see man ssh_config)
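
The command-line form and the config-file form carry the same options, which a small sketch can make explicit. The helper names as_command and as_config are hypothetical, and user@example.com and the Host pattern * are placeholders, not values from this article. The code only builds the strings; it does not run ssh or write any file.

```python
def as_command(host, options):
    """Build an ssh argument list with one -o Key=Value pair per option."""
    command = ["ssh"]
    for key, value in options.items():
        command += ["-o", f"{key}={value}"]
    command.append(host)
    return command


def as_config(host_pattern, options):
    """Render the same options as a Host block for an ssh config file."""
    lines = [f"Host {host_pattern}"]
    lines += [f"    {key} {value}" for key, value in options.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    opts = {"TCPKeepAlive": "no", "ServerAliveInterval": 30}
    print(" ".join(as_command("user@example.com", opts)))
    print(as_config("*", opts), end="")
```

Options given with -o on the command line take precedence over the config files, so the command form is convenient for testing a value before making it permanent.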

Make sure that ServerAliveInterval*ServerAliveCountMax <= 0.8*N, where N is the idle timeout of the firewall or router. This is a conservative rule, since it also keeps ServerAliveInterval itself well below N. With the default ServerAliveCountMax of 3 (man ssh_config) and ServerAliveInterval set to 30, the product is 3x30 = 90 seconds (1.5 minutes), which satisfies the rule when N is about two minutes or more.

If you set the interval too low, the keepalive messages add unnecessary traffic between client and server, which decreases performance.
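
The rule of thumb above can be checked with a few lines of integer arithmetic. This is a minimal sketch; the function names satisfies_rule and max_interval are hypothetical, and the 0.8 factor is simply the one used in the rule.

```python
def satisfies_rule(interval, count_max, timeout):
    """True if interval * count_max <= 0.8 * timeout (all in seconds)."""
    # Multiply both sides by 10 to stay in exact integer arithmetic.
    return 10 * interval * count_max <= 8 * timeout


def max_interval(timeout, count_max=3):
    """Largest whole-second ServerAliveInterval that still satisfies the rule."""
    return (8 * timeout) // (10 * count_max)


if __name__ == "__main__":
    # Example: a firewall that drops connections after 10 idle minutes.
    n = 600
    print("largest interval:", max_interval(n))
    print("30 s x 3 fits:", satisfies_rule(30, 3, n))
```

Picking the largest interval that still satisfies the rule keeps the extra traffic as low as possible.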



4.2 How to configure it to prevent SSH disconnection in PuTTY

'Enable TCP keepalives' and 'Seconds between keepalives' are two completely different things: the first is the TCP-level keepalive, the second is the application-level implementation.

Option 1: use TCP keepalive.
In the 'Connection' panel:
Disable Nagle's algorithm
Enable TCP keepalives

Option 2: use application-level keepalive.
In the 'Connection' panel: Seconds between keepalives (0 to turn off)

You might also consider enabling, under 'Connection' -> 'SSH' -> 'X11':
Enable X11 forwarding
Enable MIT-Magic-Cookie-1

Save the session.


The following is from the PuTTY documentation: http://the.earth.li/~sgtatham/putty/0.58/htmldoc/Chapter4.html#config-keepalive

If you find your sessions are closing unexpectedly (most often with ‘Connection reset by peer’) after they have been idle for a while, you might want to try using this option.

Some network routers and firewalls need to keep track of all connections through them. Usually, these firewalls will assume a connection is dead if no data is transferred in either direction after a certain time interval. This can cause PuTTY sessions to be unexpectedly closed by the firewall if no traffic is seen in the session for some time.

The keepalive option (‘Seconds between keepalives’) allows you to configure PuTTY to send data through the session at regular intervals, in a way that does not disrupt the actual terminal session. If you find your firewall is cutting idle connections off, you can try entering a non-zero value in this field. The value is measured in seconds; so, for example, if your firewall cuts connections off after ten minutes then you might want to enter 300 seconds (5 minutes) in the box.

Note that keepalives are not always helpful. They help if you have a firewall which drops your connection after an idle period; but if the network between you and the server suffers from breaks in connectivity then keepalives can actually make things worse. If a session is idle, and connectivity is temporarily lost between the endpoints, but the connectivity is restored before either side tries to send anything, then there will be no problem - neither endpoint will notice that anything was wrong. However, if one side does send something during the break, it will repeatedly try to re-send, and eventually give up and abandon the connection. Then when connectivity is restored, the other side will find that the first side doesn't believe there is an open connection any more. Keepalives can make this sort of problem worse, because they increase the probability that PuTTY will attempt to send data during a break in connectivity. Therefore, you might find they help connection loss, or you might find they make it worse, depending on what kind of network problems you have between you and the server.

Keepalives are only supported in Telnet and SSH; the Rlogin and Raw protocols offer no way of implementing them. (For an alternative, see section 4.13.3.)

Note that if you are using SSH-1 and the server has a bug that makes it unable to deal with SSH-1 ignore messages (see section 4.23.1), enabling keepalives will have no effect.