Scripting a network restart when net goes down.
Alright, so I've come up with an idea for a first-line fix.
Every now and then we get very short incoming DDoS attacks, and it seems the NIC gives up while the server itself stays online. To counteract this, I thought it would be good to have a script that checks whether it can ping google.com or some other reliable host. The script can run via cron every 5 minutes. When it can't contact google.com and has already failed 2 times, it restarts the network with 'service network restart' and then removes the log of its past two attempts. I'm just wondering, how would I get the script to check whether the ping is succeeding? I'm not great at bash scripting, so I would appreciate any help. Thanks! |
How about
Code:
ping -i120 google.com |
Hmm,
Maybe this could help, dunno... Quote:
It needs root rights, though... Of course, the "bacon" of the thing is the line where the restart happens, and that depends on your distro. On mine (Arch Linux) this could be: Quote:
Thor |
I wrote a script to do exactly this on one of our embedded systems a while back, works great.
The basic gist is to ping a known host (I usually use the router's IP, since you don't want to go around restarting the network if it's just an Internet outage) and check the exit code like Thor showed. If the ping succeeds, you exit; if it fails, you wait a minute and try again. The second attempt works the same way. If the ping fails a third time, you write the event to a log and issue a network restart command. |
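For anyone who wants something whole to start from, the gist above could be sketched roughly as follows. This is not the script from the embedded system, just a minimal illustration: the names GATEWAY, RETRY_DELAY, MAX_TRIES, LOGFILE, check_link, restart_network and watch_once are made up for the example, 192.168.1.1 is a placeholder router address, and the restart command will differ per distro.

```shell
#!/bin/bash
# Sketch only: every name and default below is a placeholder for illustration.
GATEWAY="${GATEWAY:-192.168.1.1}"        # assumed router IP; use your own
RETRY_DELAY="${RETRY_DELAY:-60}"         # seconds to wait between attempts
MAX_TRIES="${MAX_TRIES:-3}"              # failed pings before restarting
LOGFILE="${LOGFILE:-/tmp/netwatch.log}"  # assumed log location

# ping exits 0 when a reply arrives and non-zero otherwise,
# so the exit code is all the script needs to look at.
check_link() {
    ping -c 1 -W 5 "$GATEWAY" >/dev/null 2>&1
}

# Distro-specific and needs root; this is the command from the first post.
restart_network() {
    service network restart
}

# Try up to MAX_TRIES times; log and restart only if every attempt fails.
watch_once() {
    local try=1
    while [ "$try" -le "$MAX_TRIES" ]; do
        if check_link; then
            return 0
        fi
        # No point in sleeping after the final failed attempt.
        if [ "$try" -lt "$MAX_TRIES" ]; then
            sleep "$RETRY_DELAY"
        fi
        try=$((try + 1))
    done
    echo "$(date): $GATEWAY unreachable after $MAX_TRIES tries, restarting network" >>"$LOGFILE"
    restart_network
}
```

The block only defines the functions. Adding a final `watch_once` line gives a script that cron can call, and wrapping that call in `while true; do watch_once; sleep "$RETRY_DELAY"; done` would give an always-running variant.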
Quote:
I'm definitely not any good at bash, so I'm piecing bits together, and I don't think that method works all too well. I would like to learn from something whole, if you would be kind enough. |
Keep in mind that this was developed for an embedded Debian system with a lot of restrictions, which is why it's written the way it is (no cron, etc.). It should run on a regular system as well, but in that case I would probably make a few changes.
It's invoked in rc.local using nohup so it's always running in the background. With cron this wouldn't be necessary. Code:
#!/bin/bash Code:
export ETHWD_SLEEP=60 |
Quote:
|
What kind of problems are you having? I just ran it without any issues on Redhat Enterprise 4 and Fedora 15.
|
sleep: missing operand
Try `sleep --help' for more information.
./test.sh: line 7: ethconfig: No such file or directory
Usage: ping [-LRUbdfnqrvVaA] [-c count] [-i interval] [-w deadline] [-p pattern] [-s packetsize] [-t ttl] [-I interface or address] [-M mtu discovery hint] [-S sndbuf] [ -T timestamp option ] [ -Q tos ] [hop1 ...] destination
I've tried changing ethconfig to ifconfig, but the commands are different. I'm not sure about the ping and sleep errors. It seems your script would work, but I need to know how to convert several parts for CentOS. |
ethconfig is an ascii file with two lines as shown, it tells the script which IP to ping and how long to wait between repeat attempts. The script is failing on line 7 because you didn't create the ethconfig file. Since you didn't create the ethconfig file, that means the $ETH_GW and $ETHWD_SLEEP variables are empty, which is why the calls to ping and sleep are failing. Create the ethconfig file in your pwd (or create it elsewhere and hard-code the location in the script) and all of those problems will go away.
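A guard along these lines would turn the confusing ping and sleep errors into one clear message. It is only a sketch: load_config is a hypothetical helper, not part of the script above, while ETH_GW and ETHWD_SLEEP are the variable names already used in this thread.

```shell
#!/bin/bash
# load_config is a hypothetical helper for illustration only.
# It sources the given file and refuses to continue unless both
# ETH_GW and ETHWD_SLEEP end up non-empty.
load_config() {
    local file="$1"
    if [ ! -r "$file" ]; then
        echo "config file '$file' not found or unreadable" >&2
        return 1
    fi
    # Clear old values so a half-written file cannot pass the check.
    unset ETH_GW ETHWD_SLEEP
    . "$file"
    if [ -z "$ETH_GW" ] || [ -z "$ETHWD_SLEEP" ]; then
        echo "ETH_GW and ETHWD_SLEEP must both be set in '$file'" >&2
        return 1
    fi
}
```

Calling `load_config ./ethconfig || exit 1` before the ping would stop the script early when the file is missing or incomplete.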
Alternatively, you can just put the values from ethconfig into the script directly and remove the "source ethconfig" line. I didn't do that in my version because this code is always running, 24/7, so if I ever wanted to change the IP or the delay time, I would have to kill the script, change the values, and then restart it. Separating those values into their own file allows me to change them without having to kill and restart the script (which is why ethconfig is sourced on every iteration of the loop). If you convert this script into a cron version (i.e., not an infinite loop), then those problems become moot. |
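A cron version might look something like the sketch below, which does one check per run and keeps a failure count in a small state file, much like the scheme described in the first post. TARGET, STATE_FILE, MAX_FAILS, check_link, restart_network and run_check are all placeholder names, and the address and paths are assumptions to adapt.

```shell
#!/bin/bash
# Sketch only: every name and default below is a placeholder for illustration.
TARGET="${TARGET:-192.168.1.1}"                  # assumed host to ping
STATE_FILE="${STATE_FILE:-/tmp/netcheck.fails}"  # assumed counter location
MAX_FAILS="${MAX_FAILS:-3}"                      # consecutive failures before restart

check_link() {
    ping -c 1 -W 5 "$TARGET" >/dev/null 2>&1
}

# Distro-specific and needs root.
restart_network() {
    service network restart
}

# One check per invocation; the counter survives between cron runs.
run_check() {
    local fails=0
    if check_link; then
        rm -f "$STATE_FILE"      # link is fine, forget earlier failures
        return 0
    fi
    if [ -r "$STATE_FILE" ]; then
        fails="$(cat "$STATE_FILE")"
    fi
    fails=$((fails + 1))
    if [ "$fails" -ge "$MAX_FAILS" ]; then
        rm -f "$STATE_FILE"      # start counting from scratch after a restart
        restart_network
    else
        echo "$fails" >"$STATE_FILE"
    fi
}
```

With a final `run_check` line added, a root crontab entry such as `*/5 * * * * /usr/local/sbin/netcheck.sh` (the path is hypothetical) would run it every 5 minutes, so a restart happens only after three consecutive failed checks.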
Quote:
ping failed
ping failed
ping failed
ping failed
However, it doesn't seem to be restarting the network, and it isn't logging or echoing that it is even attempting to. Any advice? Once again, I appreciate it. |
I suggest using monit for this sort of thing. That is what it's designed to do.
|
Don't do like I did!
I once put a script on one of my client machines which rebooted the machine when the network was down. I can't remember why I had the machine reboot instead of simply restarting the network. Maybe it had something to do with module loading. I must have installed it back in 2005 or so. Since then I have renewed some system components, but I cloned the installation when replacing the hard disk.
Anyway, years went by and (as usual with Linux) I never had a problem, and (as usual with Linux) things were working so well that I had totally forgotten about installing that script. I even forgot that I ever wrote it.
What did this script do? Exactly 10 minutes after booting, the client would ping our main server. That was the WinNT (at that time) PDC, and if that one was down, there would be something very wrong anyway, so it looked like a reliable source. The PDC never went down and there were no problems at all.
Until, 2 years ago, I replaced that WinNT box with an all-new Linux server. With a different IP address, of course, because the two servers would need to run side by side for a while. Then the WinNT server was taken off-line. Nothing happened, because my client only checked 10 minutes after booting, not every 10 minutes. So weeks after taking the WinNT server off-line, I rebooted the client. Which faithfully rebooted after 10 minutes. Not only could I not remember ever installing the script, there was also no obvious association with the server going off-line. There were weeks in between.
So what does one do with a desktop computer which reboots after 10 minutes? Right: suspect the cooling, clean the fan, replace it, check the power supply, replace it, memory, hard disk...
Long story short, I can't remember anymore how I discovered the script again, but after replacing most of the hardware I finally did. IIRC I logged an entry in /var/log/messages, but at first I didn't understand why a network failure would force a reboot.
Anyway, don't do it my way. It was stupid.
jlinkels |