LinuxQuestions.org


davhak 08-02-2009 01:18 AM

Network hanging problem
 
Dear All,

I have Fedora Core 6 (64-bit) installed on three identical dual quad-core server machines (Intel EM64T) connected via Gigabit Ethernet NICs. All three machines mount NFS shares from another server. The FC6 configuration and the hardware are the same on all machines. We use MPICH2 for parallel simulations, and everything worked fine for about 2 years. Recently I noticed that on two of the three machines the network hangs after some time. Interestingly, the problem appeared at about the same time on both machines. All machines are separated from the outside world by a firewall which has only ssh open. Once the network hangs, ping and ssh stop working, and logging into the system succeeds only after a lengthy delay. Below is the output of the ifconfig eth0 command on one of the machines after a hang.

eth0      Link encap:Ethernet  HWaddr 00:15:17:0C:57:62
          inet addr:192.168.13.2  Bcast:192.168.13.255  Mask:255.255.255.0
          inet6 addr: fe80::215:17ff:fe0c:5762/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:17922 errors:6 dropped:0 overruns:0 frame:6
          TX packets:10900 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1319091 (1.2 MiB)  TX bytes:1154866 (1.1 MiB)
          Base address:0x2020 Memory:b8820000-b8840000
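
The RX line in this output reports 6 errors, all of them frame errors, which is the kind of counter worth comparing before and after a hang. Below is a minimal Python sketch for pulling those counters out of ifconfig text. It assumes the old net-tools "name:value" format shown in this post, and the function name parse_counters is only illustrative.

```python
import re

def parse_counters(ifconfig_output):
    """Return {'RX': {...}, 'TX': {...}} from net-tools style ifconfig text."""
    counters = {}
    for line in ifconfig_output.splitlines():
        line = line.strip()
        # Only the "RX packets:" and "TX packets:" lines carry the error counters;
        # the "RX bytes:" line is skipped because it does not match.
        match = re.match(r"(RX|TX) packets:", line)
        if not match:
            continue
        direction = match.group(1)
        # Each field looks like "name:integer", for example "errors:6".
        fields = re.findall(r"(\w+):(\d+)", line)
        counters[direction] = {name: int(value) for name, value in fields}
    return counters

# Sample taken from the ifconfig output in this post.
sample = """RX packets:17922 errors:6 dropped:0 overruns:0 frame:6
TX packets:10900 errors:0 dropped:0 overruns:0 carrier:0"""

stats = parse_counters(sample)
print(stats["RX"]["errors"], stats["RX"]["frame"])  # prints: 6 6
```

Saving the parsed counters at intervals and diffing them would show whether the frame errors keep growing around the time of a hang.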

I also attach the output of the netstat -rn command.

Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.13.0    0.0.0.0         255.255.255.0   U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
0.0.0.0         192.168.13.254  0.0.0.0         UG        0 0          0 eth0

The logs show nothing obviously wrong except many "ypbind[1989]: broadcast: RPC: Timed out" errors in /var/log/messages during the hang. When I run ifconfig eth0 down and then ifconfig eth0 up (or, roughly equivalently, restart the network service), the network recovers, although the hang may occur again after some time.

Could someone point out possible causes of this problem?

Thanks very much for any help.

jahic.mersudin 08-03-2009 03:32 AM

Do you need the gigabit link? Maybe that is causing the problem. Also try checking the cables, or simply check whether the link-status LED on the Ethernet card stays on during the downtime.

Or check the setup in /etc/network/interfaces.
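
If nobody is near the machine to look at the LED when it hangs, the link state can also be read in software. Here is a minimal Python sketch, assuming the kernel exposes a carrier file under /sys/class/net/<interface>/ (worth verifying on the installed kernel). The function name link_is_up and the sysfs_root parameter are only illustrative.

```python
import os

def link_is_up(iface="eth0", sysfs_root="/sys/class/net"):
    """Return True if the kernel reports carrier (link) on iface."""
    path = os.path.join(sysfs_root, iface, "carrier")
    try:
        with open(path) as handle:
            # The file holds "1" when a link is detected and "0" otherwise.
            return handle.read().strip() == "1"
    except OSError:
        # The interface does not exist, or the file could not be read
        # (for example while the interface is administratively down).
        return False

if __name__ == "__main__":
    print("eth0 link up:", link_is_up("eth0"))
```

Running this from cron during a hang and logging the result would show whether the physical link dropped or whether the problem is higher up the stack.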


All times are GMT -5.