Making autossh reconnect after a dynamic address change

furface · 05-15-2015, 04:46 PM

Is there any way to make autossh reconnect after the client's network changes its public IP address? It seems that the only way to make the session reconnect is to:

1. Kill the process on the hosting server
2. Reboot the client machine that started autossh.

I need solid reliability that lasts multiple months on ssh tunnels that are monitoring remote devices. I'm worried that I'm going to have to go physically to various sites to reboot periodically to reestablish connections.

One solution I've thought of is to run my own cron process that checks the tunnels and then responds accordingly with some actions. Isn't this what autossh is supposed to do already?
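A minimal sketch of such a check, assuming it runs on a machine where the forwarded port is reachable and that the service behind the tunnel answers a simple probe. The function name `tunnel_responds`, the HTTP-style probe, and the timeout are illustrative assumptions and not part of autossh. It sends data and waits for a reply because a plain TCP connect may still succeed when the tunnel behind the port is stale.

```python
import socket


def tunnel_responds(host, port, probe=b"HEAD / HTTP/1.0\r\n\r\n", timeout=10.0):
    """Return True if sending `probe` to host:port yields at least one reply byte."""
    try:
        # The timeout applies to the connect and to the later send/recv calls.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(probe)
            # An empty read means the peer closed without answering.
            return len(sock.recv(1)) > 0
    except OSError:
        # Refused, unreachable, reset, or timed out: treat the tunnel as down.
        return False
```

A cron job could call this every few minutes and restart autossh, or raise an alert, whenever it returns False.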

Thanks.

nini09 · 05-18-2015, 02:43 PM

Are you using port monitoring? Autossh should restart the connection automatically when the IP changes.

furface · 05-19-2015, 10:54 AM

Thanks. I don't run any specific port monitoring software. Just autossh. I think I'm getting an idea of what's going on. Both Autossh and the ssh server track connections and kill processes based on configuration settings. For Autossh I used recommended settings without thinking about them:

ServerAliveInterval 60
ServerAliveCountMax 3

I should have set ServerAliveCountMax to something much larger, like 100.
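For reference, the ssh_config documentation describes the client giving up after roughly ServerAliveInterval multiplied by ServerAliveCountMax seconds without a reply. A small sketch of that arithmetic (the function name is only illustrative):

```python
def server_alive_timeout(interval_s, count_max):
    """Approximate seconds of server silence before the ssh client exits."""
    return interval_s * count_max


print(server_alive_timeout(60, 3))    # settings above: 180
print(server_alive_timeout(60, 100))  # with ServerAliveCountMax 100: 6000
```

So a larger ServerAliveCountMax makes ssh wait longer before it exits, which also delays the restart by autossh.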

I think that sshd_config has these settings:

ClientAliveCountMax
ClientAliveInterval

In my case I believe that the server was not killing the process and removing the PID before my Autossh connection was disconnecting. The machine I'm using is out working somewhere, so I'll have to set up a test machine to see if this is correct.

Thanks.

furface · 05-20-2015, 04:39 PM

Actually I was wrong about ServerAliveCountMax. That's what tells ssh when to terminate. However, the problem is that it only checks if the server is alive, not if the connection is stale. If the local machine changes its IP address, then the connection is stale. I don't use port monitoring.

su -s /bin/sh autossh -c 'autossh -M 0 -q -f -N -o "ServerAliveInterval 60" -o "ServerAliveCountMax 3" -R 1000:192.168.3.2:80 xxx_myhost_xxxxxx.com'

I'm not sure how to fix this short of shutting down the ssh server for a few minutes to refresh all the connections. I think what I'm going to do is write a script or small app that monitors the connection and has the ability to reboot the machine remotely if either instructed to or else if certain catastrophic conditions exist.
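One way such a monitor could decide what to do is to escalate with the number of failed checks in a row. This is only a sketch: the function name, the action labels, and both thresholds are hypothetical and would need tuning per site.

```python
def next_action(consecutive_failures, restart_after=3, reboot_after=10):
    """Pick an action from the number of failed tunnel checks in a row."""
    if consecutive_failures >= reboot_after:
        return "reboot"            # last resort, e.g. the machine is wedged
    if consecutive_failures >= restart_after:
        return "restart_autossh"   # cheaper fix, tried first
    return "wait"                  # too early to act on a brief outage
```

The caller would map each label to a real command and reset the failure counter after any successful check.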

Thanks.

furface · 05-20-2015, 08:16 PM

OK, I did some more testing, and I think the problem is that ssh clients can't reliably tell when the ssh connection is stale. ServerAliveCountMax only tests whether or not the server (not just sshd) is up. So if sshd goes down and the server stays up, it will not force ssh clients to quit, even though the connection may be unusable.

Basically there are 2 situations:

1. Connection gets interrupted for a long time so both ssh and sshd conclude that the connection has been lost: This is good, Autossh will reliably reconnect.

2. Connection gets interrupted for a period that is short relative to the timeout periods of ssh and sshd: This is bad. It can leave the connection in an unusable state, and Autossh will not reconnect because ssh thinks the connection is still open.

When the connection goes stale, the ssh client still thinks it's connected, and I only have access to the server, the only way I've found to reliably force a reconnect is to:

1. Kill the PID for the connection on the server.

2. Also on the server, put in a firewall rule to completely block the remote tunnel machine from reaching the server. This could be more fine-grained if one could tell which port ssh clients use to establish that the server is still alive.

3. Wait until you are sure that ssh on the tunnel machine has timed out, and then remove the firewall rule.
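The three steps above can be sketched as an ordered list of commands. This assumes iptables is the firewall in use and that the commands would be run as root on the server; the PID and IP address are placeholders, and the default wait of 180 seconds is taken from ServerAliveInterval 60 and ServerAliveCountMax 3 quoted earlier in the thread. The sketch only builds and prints the commands and does not execute them.

```python
def build_reset_plan(sshd_pid, client_ip, wait_s=180):
    """Return the commands, in order, for the kill/block/wait/unblock steps."""
    rule = ["INPUT", "-s", client_ip, "-j", "DROP"]
    return [
        ["kill", str(sshd_pid)],      # step 1: kill the stale session's process
        ["iptables", "-I"] + rule,    # step 2: block the tunnel machine
        ["sleep", str(wait_s)],       # step 3: wait for the client to time out
        ["iptables", "-D"] + rule,    # then remove the firewall rule again
    ]


# Placeholder PID and documentation-range IP, for illustration only.
for cmd in build_reset_plan(12345, "203.0.113.7"):
    print(" ".join(cmd))
```

Keeping the plan as data makes it easy to review before anything is run against a production server.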

nini09 · 05-21-2015, 02:44 PM

You should use port or connection monitoring, the -M option, to detect a stale connection. If the connection is stale, that means it can't forward traffic. The port or connection monitoring should detect this.
Using -M 0 turns the monitoring off.

furface · 05-22-2015, 12:24 PM

nini09, the -M option makes use of an echo server, like the one hosted by inetd. It doesn't tell you anything about the state of ssh connections. It could be used to reset all of the tunnels hosted by a server at once. However, it's a bit of overkill. The docs suggest using ServerAliveInterval and ServerAliveCountMax instead.

http://manpages.ubuntu.com/manpages/...autossh.1.html

Quote:

Setting the monitor port to 0 turns the monitoring function off, and autossh will only restart ssh upon ssh’s exit. For example, if you are using a recent version of OpenSSH, you may wish to explore using the ServerAliveInterval and ServerAliveCountMax options to have the SSH client exit if it finds itself no longer connected to the server. In many ways this may be a better solution than the monitoring port.

Again, the problem is that ssh clients don't seem to be able to reliably detect stale connections. There may be a way to do it, but I don't know.

Thanks

IngoMeyer · 11-13-2023, 03:09 AM

This is a very old thread, but I had the same problem previously. What helped in my case:

Activate TCP keep alive in `/etc/ssh/sshd_config` with `TCPKeepAlive yes`.

Create a file `/etc/sysctl.d/01-tcp_keepalive.conf` and set

Code:

net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_probes=3
net.ipv4.tcp_keepalive_intvl=10

Reboot the server.

Stale connections are now cleaned up by the OS on the TCP level. After 60 seconds of inactivity, the kernel sends keepalive probes to test the TCP connection, one every 10 seconds. If the third probe also goes unanswered (30 seconds of probing), the connection is closed. So in total, broken connections are cleaned up after about 90 seconds.
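The 90-second figure follows from idle time plus probes times interval. A small sketch that derives it from the sysctl settings above (the function name and the inline `CONF` string are just for illustration):

```python
def keepalive_cleanup_seconds(conf_text):
    """Idle time plus probing time before the kernel drops a dead connection."""
    values = {}
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = int(value.strip())
    return (values["net.ipv4.tcp_keepalive_time"]
            + values["net.ipv4.tcp_keepalive_probes"]
            * values["net.ipv4.tcp_keepalive_intvl"])


CONF = """\
net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_probes=3
net.ipv4.tcp_keepalive_intvl=10
"""
print(keepalive_cleanup_seconds(CONF))  # 60 + 3 * 10 = 90
```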

But be warned, this could affect other applications! TCP keepalive probes are sometimes filtered by firewalls, so connections that are actually alive could also be closed if they are not used for 90 seconds.