LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

Old 03-29-2012, 01:46 AM   #1
hahacc
Member
 
Registered: Oct 2010
Posts: 93

Rep: Reputation: 1
centos 6.2 lost internet connections intermittently


Hi guys,
There's one host (CentOS 6.2) that loses its network connection intermittently, leaving the whole OS unreachable. The host runs httpd, so losing connectivity is a serious problem.

The OS was not shut down or rebooted after the network was lost; it just stayed up without connectivity. I checked the error logs and could not find anything related to this strange behavior. The OS has xinetd (rsync/nrpe), httpd, mysql and vsftpd installed, and I've already run yum update, so it is now at 2.6.32-220.7.1.el6.x86_64, CentOS release 6.2 (Final).

Can anyone help with this?
 
Old 03-29-2012, 01:53 AM   #2
lithos
Senior Member
 
Registered: Jan 2010
Location: SI : 45.9531, 15.4894
Distribution: CentOS, OpenNA/Trustix, testing desktop openSuse 12.1 /Cinnamon/KDE4.8
Posts: 1,144

Rep: Reputation: 217
Hi

Is your server running any network daemon (service) with DHCP enabled?
Could the NIC be defective? Can you try replacing the network card?
What does your
Code:
# service network status
Configured devices:
lo eth0 eth1
Currently active devices:
lo eth0


and 

# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:30:4F:28:16:C2
          inet addr:192.168.0.7  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::230:4fff:fe28:16c2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:520277291 errors:0 dropped:0 overruns:0 frame:0
          TX packets:320763080 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2683477502 (2.4 GiB)  TX bytes:3405751313 (3.1 GiB)
          Interrupt:209 Base address:0x2000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:222507 errors:0 dropped:0 overruns:0 frame:0
          TX packets:222507 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:60339993 (57.5 MiB)  TX bytes:60339993 (57.5 MiB)


# cat /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
BROADCAST=192.168.0.255
IPADDR=192.168.0.7
NETMASK=255.255.255.0
NETWORK=192.168.0.0
TYPE=Ethernet
show ?

Can you ping any other computer or server on the same subnet?
Or is
Code:
ping www.google.com
giving any response?
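The two ping checks above can be combined into one staged test that narrows down where the connection breaks. This is only a rough sketch: the function names are made up for illustration, the gateway address has to be supplied by you, and some gateways drop ICMP, so an "unreachable gateway" result is a hint rather than proof.

```shell
#!/bin/bash
# Staged reachability check: gateway -> external IP -> hostname.
# All function names here are illustrative, not part of any standard tool.

# Return 0 if a single ping to $1 succeeds within 3 seconds.
probe() {
    ping -c1 -w3 "$1" > /dev/null 2>&1
}

# Map three yes/no results to a rough diagnosis.
# Arguments: gateway_ok ip_ok name_ok (each "yes" or "no").
diagnose() {
    local gateway_ok=$1 ip_ok=$2 name_ok=$3
    if [ "$gateway_ok" = no ]; then
        echo "local link or NIC problem: gateway unreachable"
    elif [ "$ip_ok" = no ]; then
        echo "routing or upstream problem: gateway reachable, Internet not"
    elif [ "$name_ok" = no ]; then
        echo "DNS problem: IP reachable, hostname not"
    else
        echo "connectivity looks fine"
    fi
}

# Run the three probes and print the diagnosis.
# Arguments: gateway address, external IP, hostname.
check_network() {
    local g=no i=no n=no
    probe "$1" && g=yes
    probe "$2" && i=yes
    probe "$3" && n=yes
    diagnose "$g" "$i" "$n"
}
```

Example call, with a placeholder gateway address: check_network 192.168.0.1 8.8.8.8 www.google.com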

Last edited by lithos; 03-29-2012 at 01:55 AM.
 
Old 03-29-2012, 03:00 AM   #3
hahacc
Member
 
Registered: Oct 2010
Posts: 93

Original Poster
Rep: Reputation: 1
Thanks.
Here are the outputs:
Quote:
[root@jingan10 network-scripts]# service network status
Configured devices:
lo eth0 eth1
Currently active devices:
lo eth1

[root@jingan10 network-scripts]# ifconfig -a
eth0 Link encap:Ethernet HWaddr BC:AE:C5:3D:316
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Interrupt:16 Memory:fbde0000-fbe00000

eth1 Link encap:Ethernet HWaddr BC:AE:C5:3D:25:71
inet addr:116.255.130.60 Bcast:116.255.130.63 Mask:255.255.255.224
inet6 addr: fe80::beae:c5ff:fe3d:2571/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:96091 errors:0 dropped:0 overruns:0 frame:0
TX packets:105631 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:20427859 (19.4 MiB) TX bytes:67686188 (64.5 MiB)
Interrupt:17 Memory:fbce0000-fbd00000

lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:59 errors:0 dropped:0 overruns:0 frame:0
TX packets:59 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:5700 (5.5 KiB) TX bytes:5700 (5.5 KiB)

[root@jingan10 network-scripts]# cat ifcfg-eth0
DEVICE="eth0"
BOOTPROTO="none"
HWADDR="BC:AE:C5:3D:316"
ONBOOT="no"
IPADDR=

[root@jingan10 network-scripts]# cat ifcfg-eth1
DEVICE="eth1"
BOOTPROTO="static"
HWADDR="BC:AE:C5:3D:25:71"
ONBOOT="yes"
IPADDR=116.255.130.60
NETMASK=255.255.255.224
GATEWAY=116.255.130.33
As far as I can tell, no service with DHCP enabled is running. I've attached a snapshot of all processes to this thread.
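The DHCP question can also be checked directly instead of reading through a process list. A minimal sketch, assuming the standard CentOS location for the ifcfg files and that a DHCP client, if present, would run as dhclient; both function names are made up for illustration.

```shell
#!/bin/bash
# Print the ifcfg files in a directory that request DHCP.
# Matches both BOOTPROTO=dhcp and BOOTPROTO="dhcp", case-insensitively.
list_dhcp_interfaces() {
    local dir=${1:-/etc/sysconfig/network-scripts}
    grep -l -i -E '^[[:space:]]*BOOTPROTO="?dhcp"?' "$dir"/ifcfg-* 2>/dev/null
}

# Return 0 if a dhclient process is currently running.
dhclient_running() {
    pgrep -x dhclient > /dev/null 2>&1
}
```

If list_dhcp_interfaces prints nothing and dhclient_running returns non-zero, DHCP is unlikely to be what is changing the interface configuration.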

I've also written a script, run from cron every 15 minutes, that checks connectivity. If the host cannot ping a set of IP addresses, the script restarts the network. It then waits a while, and if the pings still fail, it reboots the host. Here is the script:
Quote:
#!/bin/bash
# Cron entry: */15 * * * * /backup/sites/reboot_if_no_internet_access.sh
sleep 10

# Each address is listed three times, so every target gets three ping attempts.
ip_addy=(
    8.8.8.8
    8.8.8.8
    8.8.8.8
    220.181.111.85
    220.181.111.85
    220.181.111.85
    123.125.38.240
    123.125.38.240
    123.125.38.240
)
_max=7    # act only when more than 7 of the 9 pings fail
log=/var/tmp/reboot.log

# Ping every address once (3-second deadline) and print the number of failures.
count_failures() {
    local failed=0 ip
    for ip in "${ip_addy[@]}"; do
        /bin/ping -c1 -w3 "$ip" > /dev/null || failed=$(( failed + 1 ))
    done
    echo "$failed"
}

if [ "$(count_failures)" -gt "$_max" ]; then
    /bin/echo "restart networking at: $(date)" >> "$log"
    /etc/init.d/network restart
    sleep 90
    # Second check: reboot only if the restart did not bring connectivity back.
    if [ "$(count_failures)" -gt "$_max" ]; then
        /bin/echo "reboot server at: $(date)" >> "$log"
        /sbin/reboot
    fi
fi
From the script's log file, I can see that networking was restarted before the reboot. According to dmesg from that time, the network restart appeared to succeed, but the pings still failed afterwards, so the host was rebooted:

Quote:
Mar 29 15:15:38 jingan10 kernel: lo: Disabled Privacy Extensions
Mar 29 15:15:39 jingan10 kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Mar 29 15:15:40 jingan10 kernel: e1000e: eth1 NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
Mar 29 15:15:40 jingan10 kernel: e1000e 0000:02:00.0: eth1: 10/100 speed: disabling TSO
Mar 29 15:15:40 jingan10 kernel: e1000e: eth1 NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
Mar 29 15:15:40 jingan10 kernel: e1000e 0000:02:00.0: eth1: 10/100 speed: disabling TSO
Mar 29 15:15:40 jingan10 kernel: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready #seems networking restarted well
Mar 29 15:17:40 jingan10 init: tty (/dev/tty1) main process (1606) killed by TERM signal #but still host was rebooted
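Since the log above only shows that the link came back up, it may help to record more state at the moment the pings fail, before the network is restarted. The following is a sketch of how that could look; the function names are hypothetical, and the list of commands is only a suggestion.

```shell
#!/bin/bash
# Append a labelled section with a command's output to a log file.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    local logfile=$1
    shift
    {
        echo "== $* =="
        if command -v "$1" > /dev/null 2>&1; then
            "$@" 2>&1
        else
            echo "(command not available: $1)"
        fi
    } >> "$logfile"
}

# Record address, route, link, driver and counter state for one NIC.
# Usage: snapshot_state LOGFILE NIC
snapshot_state() {
    local logfile=$1 nic=$2
    echo "#### snapshot at $(date) ####" >> "$logfile"
    run_logged "$logfile" ip addr show "$nic"
    run_logged "$logfile" ip route show
    run_logged "$logfile" ethtool "$nic"       # link speed and "Link detected"
    run_logged "$logfile" ethtool -i "$nic"    # driver name and version
    run_logged "$logfile" ethtool -S "$nic"    # NIC statistics counters
}
```

Calling something like snapshot_state /var/tmp/reboot.log eth1 just before the network restart would leave a record to compare against a snapshot taken while the network is healthy.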
Attached Files
File Type: txt processes-snapshot.txt (16.9 KB, 22 views)

Last edited by hahacc; 03-29-2012 at 03:06 AM.
 
Old 03-29-2012, 07:00 AM   #4
lithos
Senior Member
 
Registered: Jan 2010
Location: SI : 45.9531, 15.4894
Distribution: CentOS, OpenNA/Trustix, testing desktop openSuse 12.1 /Cinnamon/KDE4.8
Posts: 1,144

Rep: Reputation: 217
Hi,

Just as a precaution, please mask the IP addresses in your ifcfg-ethX files (for example IPADDR=1.2.3.4), as the real addresses aren't relevant here.

It seems that you're running your connections through eth1 NIC
Code:
[root@jingan10 network-scripts]# cat ifcfg-eth0
DEVICE="eth0"
BOOTPROTO="none"
HWADDR="BC:AE:C5:3D:316"
ONBOOT="no"
IPADDR=

[root@jingan10 network-scripts]# cat ifcfg-eth1
DEVICE="eth1"
BOOTPROTO="static"
HWADDR="BC:AE:C5:3D:25:71"
ONBOOT="yes"
IPADDR=1.2.1.2
NETMASK=255.255.255.224
GATEWAY=1.2.1.3
as opposed to the more usual eth0.

But I think this kind of setup needs some routing configured,
which I unfortunately don't know how to do.
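Whether routing is really the problem can be checked by looking at which interface carries the default route. With GATEWAY set in ifcfg-eth1, the network scripts would normally install a default route through eth1, but that is an assumption worth verifying on the affected host. A small sketch, with a made-up helper name:

```shell
#!/bin/bash
# Read `ip route` output on stdin and print the device of the default route.
# Prints nothing if there is no default route.
default_route_dev() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++) if ($i == "dev") { print $(i + 1); exit }
    }'
}
```

Usage: ip route show | default_route_dev. If this prints eth1 while the network is working and prints nothing (or a different device) during an outage, the routing table is worth a closer look.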



Maybe some expert users here could help more with how to use eth1 for the default Internet connection.


Regards
 
Old 03-30-2012, 07:00 AM   #5
hahacc
Member
 
Registered: Oct 2010
Posts: 93

Original Poster
Rep: Reputation: 1
please see note below

Last edited by hahacc; 04-13-2012 at 01:17 AM.
 
Old 04-04-2012, 10:58 PM   #6
hahacc
Member
 
Registered: Oct 2010
Posts: 93

Original Poster
Rep: Reputation: 1
For anyone who arrives here after searching:
1. There's a kernel bug in the e1000e driver for the Intel 82574L on CentOS 6 (an MSI/MSI-X interrupt issue). It can be resolved by installing the kmod-e1000e package from ELRepo.org and then adding pcie_aspm=off e1000e.IntMode=1,1 e1000e.InterruptThrottleRate=10000,10000 acpi=off to the kernel parameters. More info: "Intel e1000e driver bug on 82574L Ethernet controller causing network blipping".
2. The high Tx traffic was caused by a DNS flooding attack on port 53. I resolved this by writing some iptables rules. More info: "port 53 dns flooding attack".
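For item 1, it is easy to mistype one of the kernel parameters, so it may be worth confirming after the reboot that all of them actually reached the running kernel. A minimal sketch; missing_params is a hypothetical helper, and the grubby command in the comments assumes grubby is what manages the boot configuration on your system.

```shell
#!/bin/bash
# Parameters named in item 1 above.
WANTED_PARAMS="pcie_aspm=off e1000e.IntMode=1,1 e1000e.InterruptThrottleRate=10000,10000 acpi=off"

# Print each wanted parameter that is absent from a kernel command line.
# Usage: missing_params "CMDLINE" PARAM [PARAM...]
missing_params() {
    local cmdline=" $1 " param
    shift
    for param in "$@"; do
        case "$cmdline" in
            *" $param "*) ;;          # present as a whole word
            *) echo "$param" ;;       # absent: report it
        esac
    done
}

# After a reboot, compare against the running kernel's command line
# (WANTED_PARAMS is left unquoted on purpose so it splits into words):
#   missing_params "$(cat /proc/cmdline)" $WANTED_PARAMS
# One possible way to add the parameters (assumption: grubby is in use):
#   grubby --update-kernel=ALL --args="$WANTED_PARAMS"
```

No output from missing_params means every parameter is present. The driver actually in use can be checked separately with ethtool -i eth1.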

Last edited by hahacc; 04-13-2012 at 01:17 AM.
 
Old 10-19-2012, 03:46 PM   #7
Severian37
LQ Newbie
 
Registered: Oct 2012
Posts: 1

Rep: Reputation: Disabled
Thanks for posting the info on the elrepo e1000e package and kernel parameters. This was a huge help.
 
  

