Hi guys,
I'm having some trouble tracking down an *outbound* connection problem on my Linode / Debian Unstable setup. Hopefully someone can point me in the right direction. Here's the situation:
I have a Node.js script running that polls a couple of external machines/routers every minute. It sets the idle socket timeouts long enough that it can reuse existing connections on subsequent polls. Sometimes the other end closes the sockets, but the script reconnects if it needs to. Everything works great for several days.
Sometimes a machine quits responding in a timely manner (> 5 seconds), and I have node abort the connection attempt for that poll cycle by calling socket.destroy() (http://nodejs.org/api/net.html#net_socket_destroy). This works fine for occasional bad connections, but in certain cases (perhaps multiple consecutive bad attempts) something in Debian's network stack seems to get into a bad state, and outbound connections to those machines start failing consistently with ECONNREFUSED. I tried to figure out what could be going on but have been unsuccessful.
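To make the abort logic concrete, here is a minimal sketch of the kind of thing described above: try to connect, and call socket.destroy() if the attempt has not completed within a deadline. It is not the actual script. The helper name connectWithTimeout and the host, port, and timeout values in the usage note are made up for illustration.

```javascript
const net = require('net');

// Hypothetical helper (not from the real script): try to open a TCP
// connection and give up after timeoutMs milliseconds.
function connectWithTimeout(host, port, timeoutMs) {
  return new Promise((resolve, reject) => {
    const socket = net.connect({ host, port });
    let settled = false;

    // If the connection has not been established in time, abort the attempt.
    const timer = setTimeout(() => {
      if (settled) return;
      settled = true;
      socket.destroy(); // tears down the pending connection attempt
      const err = new Error('connect timed out after ' + timeoutMs + ' ms');
      err.code = 'ETIMEDOUT'; // label set here by the sketch, not by the OS
      reject(err);
    }, timeoutMs);

    socket.once('connect', () => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      resolve(socket); // caller may keep the socket for later polls
    });

    // Errors such as ECONNREFUSED arrive here with err.code set by Node.
    socket.once('error', (err) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      socket.destroy();
      reject(err);
    });
  });
}
```

Called as connectWithTimeout('192.0.2.10', 8080, 5000) (a placeholder address and port), the returned promise rejects with err.code set to 'ECONNREFUSED' when the attempt is actively refused, and to 'ETIMEDOUT' when the 5 second deadline passes first, which makes it easy to log which of the two cases each poll cycle hits.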
I tried telnetting to the remote machines from an SSH session on the Linode and got "Connection refused". Telnet to google:80 works okay. The firewall log on the target router shows that the connection packet was received and accepted. I can telnet to the machine itself just fine locally from Windows, but the Linode/Debian side still says connection refused. I looked at netstat -ton output and did not see an unusually large number of CLOSE_WAIT or other hanging sockets.
BUT...restarting the Linode fixes everything.
I tried restarting just the networking service, but that kicks me off SSH, so I cannot tell whether it fixes anything, and it doesn't let me SSH back in until I reboot the machine through the Linode Manager.
If anyone has some ideas, please let me know.
Thanks!