Hi everyone,
I have a simple socket application running on a couple of machines, pumping messages back and forth between them. The problem I'm having is that the connection between these two machines is routed via one of two OTHER machines, and it has to tolerate the route changing. That is, the topology is:
Code:
 Machine 1
  |     |
  r1    r2
  |     |
 Machine 2
By default, the connection between machines 1 and 2 is routed through r1, but if r1 fails, scripts on Machine 1 and Machine 2 change their routes to use r2. Right now, when this happens, my socket application can no longer send to or receive from the other end until I shut down and reconnect the socket. I'm currently using a simple heartbeat to detect when communication has been lost, and then each application shuts down its socket and attempts to reconnect. It seems to work reasonably well, but someone told me that I shouldn't have to do this at all, and that a socket should tolerate this situation on its own. I note that if I have an open SSH session between the two machines, it keeps working after the route change. I'd always assumed that SSH itself used a similar application-level mechanism to re-create its sockets after a problem.
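For anyone who wants something concrete to look at, here is a minimal Python sketch of the heartbeat-and-reconnect approach described above. It is an illustration only, not the actual application: the names HeartbeatMonitor, connect_with_retry and run, the "PING" wire format, and the interval and timeout values are all assumptions made up for this example, and it assumes the peer sends something back for every heartbeat.

```python
import socket
import time

HEARTBEAT_INTERVAL = 1.0  # seconds between heartbeats (assumed value)
HEARTBEAT_TIMEOUT = 3.0   # silence longer than this means "link lost" (assumed value)


class HeartbeatMonitor:
    """Tracks how long it has been since the peer was last heard from."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_seen = clock()

    def touch(self):
        # Call whenever any data arrives from the peer.
        self.last_seen = self.clock()

    def expired(self):
        return self.clock() - self.last_seen > self.timeout


def connect_with_retry(addr, retry_delay=1.0, max_attempts=None):
    """Keep trying to connect until it succeeds or attempts run out."""
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            return socket.create_connection(addr, timeout=HEARTBEAT_TIMEOUT)
        except OSError:
            time.sleep(retry_delay)
    raise ConnectionError("could not connect to %r" % (addr,))


def run(addr):
    """Send heartbeats forever; tear down and reconnect when the peer goes quiet."""
    sock = connect_with_retry(addr)
    monitor = HeartbeatMonitor(HEARTBEAT_TIMEOUT)
    while True:
        try:
            sock.settimeout(HEARTBEAT_INTERVAL)
            sock.sendall(b"PING\n")  # hypothetical heartbeat message
            try:
                data = sock.recv(4096)
                if not data:
                    raise ConnectionError("peer closed the connection")
                monitor.touch()
                time.sleep(HEARTBEAT_INTERVAL)  # pace the heartbeats
            except socket.timeout:
                pass  # nothing received this interval; check the monitor below
            if monitor.expired():
                raise ConnectionError("heartbeat timed out")
        except OSError:
            # Covers send/recv failures and the timeouts raised above.
            sock.close()
            sock = connect_with_retry(addr)
            monitor.touch()
```

Each side would call something like run(("machine2", 5000)), where the host name and port are placeholders. The real application messages are left out; in this sketch any received data counts as proof that the peer is alive.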
So my question is: is there a socket option or the like I should be using to make everything easier on myself, negating the need for the heartbeat and disconnect/reconnect? I'm obviously pretty new at this. Thanks for the help, everyone.