LinuxQuestions.org


mac_phil 12-03-2004 05:42 PM

Script to check connection (and restart network if down)
 
I leave my desktop online 24/7, and often SSH into it when I'm away.

Every once in a while the system loses its network connection. (This is with DHCP.)

If I'm home, I can just type 'service network restart'. If I'm not home, I'm screwed.

How would I write a simple script to do the following every 15 minutes:

1) Check if the network is up
1a) If up, exit.
1b) If down, issue command 'service network restart' and goto 1.


I understand the basic ideas, that 1 will probably involve pinging some host, but I don't know how to write such a script.

Thanks!

scowles 12-03-2004 07:17 PM

1) copy/paste the following lines into a new file named /usr/local/bin/check_network. Change the ROUTER_IP variable to the IP address that can verify that your network is up.
Code:

#!/bin/bash

ROUTER_IP=192.168.9.6
( ! ping -c1 $ROUTER_IP >/dev/null 2>&1 ) && service network restart >/dev/null 2>&1

2) Add the following line to /etc/crontab. Change the */2 to whatever interval, in minutes, you want the script to run at. It's currently set to run /usr/local/bin/check_network every two minutes.
Code:

*/2 * * * * root /usr/local/bin/check_network
3) Set the execute permissions on the script. As root, type:
chmod +x /usr/local/bin/check_network

That's it!

mac_phil 12-05-2004 12:24 AM

Thanks! That is really helpful.

VibeOfOurTribe 12-09-2004 12:44 PM

OK, I guess I don't really understand the 2>&1 part of the script, but your script does not appear to be correct. Here is an example (I removed the >/dev/null so I could see what is going on):

My network is alive and I type:
Code:

(ping -c1 google.com 2>&1) && echo "true"
Here is what that does:
"PING google.com (216.239.37.99) 56(84) bytes of data.

--- google.com ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms"

Notice it doesn't echo "true", which, if I'm understanding your script correctly, it should.

So if I do:
Code:

(! ping -c1 google.com 2>&1) && echo "true"
Again, while my network is up, it says:
"PING google.com (216.239.37.99) 56(84) bytes of data.

--- google.com ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

true"


According to this, with your script
Code:

#!/bin/bash

ROUTER_IP=192.168.9.6
( ! ping -c1 $ROUTER_IP >/dev/null 2>&1 ) && service network restart >/dev/null 2>&1

the network will get restarted even if it's up. Removing the "!" seems to have the desired effect, but it just looks incorrect. What is going on here? Possibly our versions of the "ping" program have different implementations?
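
One way to check the logic without depending on what any particular ping build prints: the `&&` list looks only at the exit status, and `>/dev/null 2>&1` merely discards output (stdout goes to /dev/null, then stderr is sent wherever stdout now points). The sketch below substitutes the shell builtins `true` and `false` for a ping that succeeds or fails. `restart_if_down` is a hypothetical name used only for this illustration, and it echoes a message instead of calling `service network restart`.

```shell
#!/bin/bash
# restart_if_down is a hypothetical stand-in for the one-liner in check_network.
# It runs the probe command given as its arguments and discards the output:
#   >/dev/null  sends stdout to /dev/null
#   2>&1        then sends stderr to the same place as stdout
restart_if_down() {
    # "!" inverts the probe's exit status, so the right-hand side of "&&"
    # runs only when the probe FAILED (non-zero exit status).
    ( ! "$@" >/dev/null 2>&1 ) && echo "would restart network"
    return 0
}

restart_if_down true     # probe succeeds: prints nothing
restart_if_down false    # probe fails: prints "would restart network"
```

With iputils ping, a run that receives no replies (100% packet loss) exits non-zero, so the version with `!` restarts the network in that case. That would be consistent with the output quoted above if the target simply was not answering ICMP echo requests.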

mac_phil 12-09-2004 01:08 PM

Yes, it seems there is something slightly wrong with the script.

My network connection usually dies about once every three weeks. I've been running this script nonstop for about five days, and it has restarted my network at least 15 times. So it is giving some false positives for the network being down.

VibeOfOurTribe 12-09-2004 01:13 PM

Ok here is a script that works beautifully for me:
Code:

#!/bin/bash
x=`ping -c1 google.com 2>&1 | grep unknown`
if [ ! "$x" = "" ]; then
        echo "It's down!! Attempting to restart."
        service network restart
fi

This script works because if the network is down, ping will return "ping: unknown host google.com". If it is up then the word "unknown" shouldn't lie anywhere in the output.

Hope that works for you too!
:)
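
An alternative sketch that does not depend on the wording of ping's error message: retry the probe a few times and act only if every attempt fails, which may also cut down on the false positives reported earlier in the thread. `probe_with_retries` is a hypothetical helper name, and the ping options and address in the usage comment are assumptions rather than something tested here.

```shell
#!/bin/bash
# Hypothetical helper: run the probe command up to N times and
# succeed as soon as one attempt succeeds.
probe_with_retries() {
    local attempts=$1
    shift
    local i
    for ((i = 0; i < attempts; i++)); do
        "$@" >/dev/null 2>&1 && return 0
    done
    return 1
}

# Example use in check_network (the address is a placeholder):
#   probe_with_retries 3 ping -c1 -W2 192.168.9.6 || service network restart
```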

VibeOfOurTribe 12-09-2004 01:21 PM

Also, to be on the safe side, I put a second line in my crontab that restarts the network every hour regardless. So my crontab for root looks like this:

Code:

*/2 * * * * /usr/local/bin/check_network
0 * * * *    service network restart > /dev/null


mac_phil 12-09-2004 01:21 PM

I'll give that one a try, thanks!

hande1 10-14-2009 08:48 AM

A similar script
 
Did you have any luck with this Phil?

I'm looking to do something similar but I'm not overly familiar with bash scripting so it's driving me mad.

In an ideal world my script would ping 3 different IPs in turn, and if packet loss on any link exceeded, say, 10%, a set of actions would be taken to restart the link:

a) Restart networking
b) Restart networkmanager
c) Ask networkmanager to dial up my link again (using cnetworkmanager as a command line utility )

I have the latter commands, but am struggling to structure and build a script to perform the above logic.

Any gurus fancy offering their kind assistance?

abibibo 01-05-2010 09:10 PM

Hande: If it's not too late, here's something I just wrote for my own use. It should be simple to modify so that it tests packet loss for multiple IPs and acts based on the average. Run it using the crontab examples posted previously.


Code:

#!/usr/bin/env bash
#/usr/local/bin/wap_check

# Script to monitor and restart wireless access point when needed

maxPloss=10 #Maximum percent packet loss before a restart

restart_networking() {
        # Add any commands needed to get the network back up and running
        /etc/init.d/networking restart

        # only needed if you're running a wireless AP
        /etc/init.d/dhcp3-server restart
}

# First make sure we can resolve google, otherwise 'ping -w' would hang
if ! host -W5 www.google.com > /dev/null 2>&1; then
        #Make a note in syslog
        logger "wap_check: Network connection is down, restarting network ..."
        restart_networking
        exit
fi

# now ping google for 10 seconds and count packet loss
# (stderr is discarded inside the substitution so only the summary is parsed)
ploss=$(ping -q -w10 www.google.com 2>/dev/null | grep -o "[0-9]*%" | tr -d %)
# Fall back to a value that forces a restart
# (just in case ping gives an error and ploss doesn't get set)
ploss=${ploss:-101}

if [ "$ploss" -gt "$maxPloss" ]; then
        logger "Packet loss ($ploss%) exceeded $maxPloss, restarting network ..."
        restart_networking
fi
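
The multi-host variant suggested above could look roughly like this. It is only a sketch: `loss_from_summary` and `average` are hypothetical helper names, the parsing assumes the iputils summary line ("N% packet loss"), and the hosts in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read ping output on stdin and print the integer part
# of the packet-loss percentage; print 101 if no summary line is found.
loss_from_summary() {
    local pct
    pct=$(grep -o '[0-9][0-9.]*% packet loss' | grep -o '^[0-9]*')
    echo "${pct:-101}"
}

# Hypothetical helper: integer average of the arguments
# (prints 101 when called with no arguments).
average() {
    local sum=0 n=0 v
    for v in "$@"; do
        sum=$((sum + v))
        n=$((n + 1))
    done
    [ "$n" -gt 0 ] || { echo 101; return; }
    echo $((sum / n))
}

# Example use (hosts are placeholders):
#   losses=()
#   for h in 192.0.2.1 192.0.2.2 192.0.2.3; do
#       losses+=("$(ping -q -w10 "$h" 2>/dev/null | loss_from_summary)")
#   done
#   [ "$(average "${losses[@]}")" -gt "$maxPloss" ] && restart_networking
```

Averaging hides a single dead link behind two healthy ones, so comparing each host's loss against `maxPloss` separately may be closer to what was asked for.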


hande1 01-06-2010 05:19 AM

Thanks abibibo!

jnovos 04-07-2016 03:43 AM

The ping command may not work inside an sh script unless its permissions are changed.

On Ubuntu, run this command: sudo chmod u+s `which ping`



Thanks for the script :)

pingu_penguin 04-11-2016 04:09 AM

Try dig; it doesn't require root privileges.

A script could go this way

#!/bin/bash

dig google.com
REPLY=$?    # exit status of dig (non-zero if no server could be reached)
if [ "$REPLY" != 0 ]
then service network restart
else
sleep 60s
fi


You can put this script in an infinite loop.
I believe this is a lot simpler. You can redirect output to /dev/null wherever needed and tune the sleep time too as required.

Also, you can set the HOSTIP variable as needed with the following:
HOSTIP=`ifconfig eth0 | grep 'inet addr:' | cut -d: -f2 | awk '{ print $1}'`

to check whether your Ethernet card has been given an IP address.
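
On newer systems `ifconfig` may be missing, or may print `inet 192.168.1.5` rather than `inet addr:192.168.1.5`, in which case the pipeline above yields an empty string. A possible alternative, sketched here, parses the one-line output of `ip -4 -o addr show`. `first_ipv4` is a hypothetical helper name, and `eth0` is simply the interface name used above.

```shell
#!/bin/bash
# Hypothetical helper: read `ip -4 -o addr show dev IFACE` output on stdin
# and print the first IPv4 address without its /prefix length.
first_ipv4() {
    awk '$3 == "inet" { sub(/\/.*/, "", $4); print $4; exit }'
}

# Example use:
#   HOSTIP=$(ip -4 -o addr show dev eth0 | first_ipv4)
#   [ -n "$HOSTIP" ] || echo "eth0 has no IPv4 address"
```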

rob.rice 04-12-2016 12:00 PM

The simplest way to do what you want would be to install
wicd and set it to reconnect automatically, BUT you need to go back to the out-of-the-box network settings for the network
interface you use to connect to the Internet.
It will work for Ethernet connections.
I use it for WiFi; it's so fast at reconnecting that videos
on YouTube don't even buffer before the connection is back up.

sundialsvcs 04-12-2016 04:03 PM

There's a really good, generalized, solution to this sort of system-management task: Nagios.

There are both commercial and open-source versions of this battle-tested product. I can personally attest that it works beautifully, for this and many other things.

pshanks 06-21-2016 05:59 PM

An updated script for checking an unstable wifi connection
 
Nagios is a great solution... if you are managing a whole environment. My use case exactly matches the OP's, and I was also interested in a shell scripting exercise. So I wrote a short script to address exactly this problem; it is updated to include: logging to systemd via systemd-cat, checking connectivity using an HTTP HEAD request instead of ping or HTTP GET, and extensive use of nmcli as the scripting interface for Network Manager. Moreover, this script does not use the brute-force method of restarting all network services, but rather operates only on the WiFi radio and wlan interface.
Warning, this was written late in the day, I was tired, and the script is FAR from optimized or even logical. Feel free to improve on it.

Code:

#! /bin/bash
# This script can be run by any user with permissions to control the wifi
# interface - however, if it is run as a cron job it will require root
# permissions to re-enable a disabled WiFi radio.  It has been tested
# successfully with the following use cases:
# WiFi disabled (nmcli radio wifi off)
# Network device disabled, but WiFi enabled (nmcli dev dis ifname)
# Ethernet connection down (nmcli con down conn_name)
#
# Besides testing with nmcli, connectivity is confirmed with a wget spider
# command to a well known highly available web server (e.g., google.com)
# We do not use icmp/ping to do connectivity testing; this may be
# prohibited in some environments and would always fail.
# It is not necessary to restart the network service; operations are
# limited to just the WiFi connection.
#
# This script logs to the systemd journal with systemd-cat, and you can
# see the entries with "journalctl -t script_name"
# You can see the latest entries first by using "-r", or you
# can have "follow" functionality with "-f"
# Finally, you can see all messages since last boot with "--boot"
# Example:  journalctl -rt wifi-check --boot
#
# Best to call this script as a cron job, with something like 20 minute
# granularity, depending on how unstable your wifi connection is.
# Example crontab entry:
# */20 * * * * /path/to/script
#
# Finally, this script will not fix network connectivity issues that
# originate outside of this machine. But you already knew this.


device="wlp3s0"                        # device name, e.g., wlp[bus #]s[slot #]
conn="My_SSID"                        # connection name often matches WiFi SSID
test_target="google.com"        # choose something with high availability
hostname=$(uname -n)
script=$(basename "$0")
DO_RECOVERY=0

# try to get an http response from a well known HA server
# We use "--spider" to avoid actually bringing down content, because
# all we really need is a success response like HTTP 200
online_test() {
        wget -q --tries=10 --timeout=20 --spider http://${test_target}
}

online_test
if [ $? -eq 0 ]; then
        echo "$hostname wifi is connected" | systemd-cat -t $script -p info
else
        echo "$hostname may be offline: wget failed" | systemd-cat -t $script -p alert
        DO_RECOVERY=1
fi

# Network Manager device status codes
# GENERAL.STATE:10 (unmanaged) ==> we should never see this
# GENERAL.STATE:20 (unavailable) ==> wifi probably disabled
# GENERAL.STATE:30 (disconnected) ==> wifi enabled, but conn down
# GENERAL.STATE:100 (connected)
device_test() {
  nmcli_out=$(nmcli -t -f GENERAL.STATE device show $device)
  rgx="GENERAL.STATE\:([0-9]+)"
  [[ $nmcli_out =~ $rgx ]]
  return "${BASH_REMATCH[1]}"       
}

# network connection state; a return value of 0 means not active
conn_test() {
        # count active connections that use $device
        count=$(nmcli -t -f NAME,DEVICE conn show --active | grep -c "$device")
        return $count
}

if [[ $DO_RECOVERY -eq 1 ]]; then

# Recovery steps
# 1 - Is WiFi enabled?  If not, then fix it.
# wifi radio state is either enabled or disabled

  wifi_status=$(nmcli r wifi)
  if [[ $wifi_status =~ "disabled" ]]; then
        echo "WiFi is disabled... enabling it now." | systemd-cat -t $script -p info

        nmcli r wifi on  # this one line doesn't work for non-root cron jobs
        sleep 15  # give everything a few seconds to re-connect.

        wifi_status=$(nmcli r wifi)
        if [[ $wifi_status =~ "enabled" ]]; then
                echo "WiFi is successfully enabled in Recovery Phase 1" | systemd-cat -t $script -p info
        else
                echo "ERROR: WiFi was NOT successfully enabled in Recovery Phase 1." | systemd-cat -t $script -p err
        fi

        device_test
        dev_status=$?   # save it: $? is overwritten by each [[ ]] test
        if [[ $dev_status -eq 100 ]]; then
                echo "${device} is available and is connected" | systemd-cat -t $script -p info
        elif [[ $dev_status -eq 30 ]]; then
                echo "${device} is available, but is not connected yet" | systemd-cat -t $script -p info
        elif [[ $dev_status -eq 20 ]]; then
                echo "ERROR: ${device} is not available in Recovery Phase 1" | systemd-cat -t $script -p err
        fi

        conn_test
        if [[ $? -gt 0 ]]; then
                echo "Connection $conn is up in Recovery Phase 1" | systemd-cat -t $script -p info
        else
                echo "Connection $conn is down, attempting to bring it up" | systemd-cat -t $script -p info
                nmcli c up $conn
                conn_test
                # -gt compares numbers; ">" inside [ ] would be a redirection
                if [ $? -gt 0 ]; then echo "Success!" | systemd-cat -t $script -p info
                fi
        fi

        online_test
        if [ $? -eq 0 ]; then
                echo "$hostname is back online. Recovery script is finished." | systemd-cat -t $script -p info
                DO_RECOVERY=0                # We're done now.
        else echo "$hostname is still offline in Recovery Phase 1" | systemd-cat -t $script -p info
        fi

  else echo "Phase 1 check: WiFi is enabled" | systemd-cat -t $script -p info

  fi # end wifi_status check

fi  # end DO_RECOVERY Phase 1

# echo "DO_RECOVERY flag is ${DO_RECOVERY} after Phase 1" | systemd-cat -t $script -p debug

# re-using the recovery flag is a bit stupid, but I'm tired.
if [[ $DO_RECOVERY -eq 1 ]]; then

#  echo "In Phase 2 now" | systemd-cat -t $script -p debug

  # 2 - WiFi is enabled, how about the network device?
  device_test
  dev_status=$?
  if [[ $dev_status -gt 20 ]]; then        # numeric compare; see network device status codes above
        nmcli conn up $conn
        if [[ $? -eq 0 ]]; then
          echo "Connection ${conn} is up in Phase 2" | systemd-cat -t $script -p debug
          online_test
          if [ $? -eq 0 ]; then
            echo "$hostname is back online" | systemd-cat -t $script -p info
          else echo "$hostname is still offline in Phase 2" | systemd-cat -t $script -p debug
          fi  # end online test
        else
          echo "ERROR: Connection ${conn} did not come up in Phase 2" | systemd-cat -t $script -p err
        fi # end conn test
  else
        echo "ERROR: Device status is ${dev_status} - this can't be brought up in Phase 2" | systemd-cat -t $script -p err
  fi
fi # end second round of DO_RECOVERY
exit
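
Two shell details matter when a function's return value carries a status code, as `device_test` does: `$?` is overwritten by every command, including each `[[ ... ]]` test, and `>` compares strings inside `[[ ]]` while inside `[ ]` it is a redirection. A small sketch of the safer pattern follows. `fake_device_test` and `describe_state` are hypothetical stand-ins that use fixed codes instead of calling nmcli.

```shell
#!/bin/bash
# Hypothetical stand-in for device_test: returns the code passed in.
fake_device_test() {
    return "$1"
}

# Save $? once, then branch on the saved value with numeric operators.
describe_state() {
    fake_device_test "$1"
    local state=$?
    if [[ $state -eq 100 ]]; then
        echo "connected"
    elif [[ $state -eq 30 ]]; then
        echo "disconnected"
    elif [[ $state -gt 20 ]]; then
        echo "other state above 20"
    else
        echo "unavailable or unmanaged"
    fi
}

describe_state 100   # prints "connected"
describe_state 30    # prints "disconnected"
describe_state 20    # prints "unavailable or unmanaged"
```

With string comparison, `[[ 100 > 20 ]]` is false because "1" sorts before "2", which is why `-gt` is the safer choice for these status codes.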


