LinuxQuestions.org
Old 12-28-2010, 01:13 PM   #1
rentze
LQ Newbie
 
Registered: Dec 2010
Posts: 3

Rep: Reputation: 0
Huge packet loss on a gigabit link


Hi,

I have the following configuration: 4 PCs (say, A, B, C and D), running Ubuntu or Debian, interconnected through a gigabit switch, which is connected to the Internet. Two machines (A and B) also have a direct private connection between them (provided by another pair of NICs).

Now, when I test the connection performance with iperf, the results vary. The private connection between A and B performs well - about 930Mbps using iperf's UDP test. Between C and D it is about 800Mbps which I find tolerable. Packet loss when running these tests is negligible. However, when I run iperf between any of {A,B} and {C,D}, the performance significantly drops as there is a huge number of lost packets. For example, here is the result of testing between A and C:

[ 3] local xxx.xxx.xxx.xxx port 34702 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 834 MBytes 700 Mbits/sec
[ 3] Sent 594940 datagrams
[ 3] Server Report:
[ 3] 0.0-10.2 sec 179 MBytes 147 Mbits/sec 12.645 ms 467089/594938 (79%)
[ 3] 0.0-10.2 sec 1 datagrams received out-of-order

Why is there such a large number of packets which are generated, but lost somewhere?
A<->B private link works fine, so system level parameters on both A and B are correct. Furthermore, C<->D works ok, so I guess I shouldn't blame the switch.

Is there a per-NIC configuration that I should check, or does it smell like a hardware problem? The problematic NICs on both A and B are of the same type - Allied Telesyn AT2916T.

Thanks... at least for reading this
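
For anyone comparing several runs, the loss figure can be pulled out of the server report programmatically. This is only a minimal sketch: the regular expression is an assumption based on the report lines quoted in this thread, not on any documented iperf output format, and `parse_loss` is a made-up helper name.

```python
import re

# Matches the "lost/total (percent%)" tail of a UDP server-report line such as
# "... 12.645 ms 467089/594938 (79%)". The pattern is an assumption derived
# from the output quoted in this thread, not an official format description.
REPORT_RE = re.compile(r"(\d+)\s*/\s*(\d+)\s+\(([\d.]+)%\)")


def parse_loss(line):
    """Return (lost, total, loss_fraction) for a report line, or None."""
    match = REPORT_RE.search(line)
    if match is None:
        return None
    lost = int(match.group(1))
    total = int(match.group(2))
    fraction = lost / total if total else 0.0
    return lost, total, fraction


if __name__ == "__main__":
    sample = ("[ 3] 0.0-10.2 sec 179 MBytes 147 Mbits/sec "
              "12.645 ms 467089/594938 (79%)")
    lost, total, fraction = parse_loss(sample)
    print(f"{lost} of {total} datagrams lost ({fraction:.1%})")
```

Recomputing the fraction from the two counts gives one more digit of precision than the rounded percentage iperf prints.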
 
Old 12-28-2010, 05:18 PM   #2
never say never
Member
 
Registered: Sep 2009
Location: Indiana, USA
Distribution: SLES, SLED, OpenSuse, CentOS, ubuntu 10.10, OpenBSD, FreeBSD
Posts: 195

Rep: Reputation: 37
At first blush this looks like a configuration issue (most likely a duplex mismatch). You don't say how your systems are configured. Are they set to auto-negotiate, or forced to a specific speed and duplex? What is the wire distance?

Auto negotiation is great when it works and a nightmare when it doesn't.

Another possibility is buffer overflow (the NIC buffer fills before the system empties the buffer).

If you are still having trouble, post back with your configs, and make sure not to mask every IP address with the same xxx placeholder.

Be more precise with the IPs, something like xxx.xxx.xxx.aaa, xxx.xxx.xxx.bbb, xxx.xxx.yyy.ccc, xxx.xxx.yyy.ddd, so we can tell the systems apart.
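
One way to check what auto-negotiation actually settled on is to read the `speed` and `duplex` attributes that Linux exposes under `/sys/class/net/<iface>/`. The sketch below assumes those sysfs files are present; the interface name `eth0` and the helper names are placeholders, not taken from the poster's setup.

```python
from pathlib import Path


def link_settings(iface, sysfs_root="/sys/class/net"):
    """Return the negotiated speed (Mb/s) and duplex of an interface.

    Attributes that cannot be read (for example because the link is down
    or the interface does not exist) are reported as None.
    """
    base = Path(sysfs_root) / iface
    settings = {}
    for attr in ("speed", "duplex"):
        try:
            settings[attr] = (base / attr).read_text().strip()
        except OSError:
            settings[attr] = None
    return settings


def looks_mismatched(settings, expected_speed="1000"):
    """Flag a link that is not at the expected speed in full duplex."""
    return settings["speed"] != expected_speed or settings["duplex"] != "full"


if __name__ == "__main__":
    current = link_settings("eth0")  # "eth0" is a placeholder name
    print(current, "suspicious" if looks_mismatched(current) else "ok")
```

A gigabit port that reports 100 or half duplex on either end would be worth fixing before looking anywhere else.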
 
Old 12-28-2010, 10:43 PM   #3
djtoltz
Member
 
Registered: Nov 2003
Location: Eastern North Carolina, USA
Distribution: Mandrake
Posts: 51

Rep: Reputation: 20
You can use ifconfig to see the configuration and errors on a single network interface.
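
The same per-interface counters are also available as plain files under `/sys/class/net/<iface>/statistics/`, which is handy for scripting. A rough sketch, assuming a Linux host; the helper names and the interface name are placeholders:

```python
from pathlib import Path

# Counter files the Linux kernel exposes under
# /sys/class/net/<iface>/statistics/.
COUNTERS = ("rx_packets", "rx_errors", "rx_dropped", "rx_fifo_errors",
            "tx_packets", "tx_errors", "tx_dropped")


def read_counters(iface, sysfs_root="/sys/class/net"):
    """Read the counters listed above; unreadable ones become None."""
    stats_dir = Path(sysfs_root) / iface / "statistics"
    counters = {}
    for name in COUNTERS:
        try:
            counters[name] = int((stats_dir / name).read_text())
        except (OSError, ValueError):
            counters[name] = None
    return counters


def problem_counters(counters):
    """Keep only the error/drop counters that are non-zero."""
    return {name: value for name, value in counters.items()
            if value and ("errors" in name or "dropped" in name)}


if __name__ == "__main__":
    print(problem_counters(read_counters("eth0")))  # placeholder interface
```

If these stay at zero while datagrams still go missing, the drops are probably happening above the NIC, which is what the protocol-level counters can show.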
 
Old 12-29-2010, 06:41 AM   #4
rentze
LQ Newbie
 
Registered: Dec 2010
Posts: 3

Original Poster
Rep: Reputation: 0
Thanks for your suggestions! Now I am a bit closer to the source of the trouble. Namely, I forgot to mention that A and B are not running an ordinary kernel: they are both Xen Dom0-s. When I reboot them with the same kernel, but without the Xen hypervisor, the huge packet loss disappears. The performance is not great though: I am getting about 575 Mbps uplink and 690 Mbps downlink (everything is configured by auto-negotiation, I don't specify anything explicitly). Still, this bandwidth is perfectly fine for me, I just want to get rid of the packet loss problem.

Furthermore, I have discovered that the problem occurs only when A or B act as receivers. Here is the score (A=iperf server=receiver, C=iperf client=sender):

Quote:
[ 3] local xxx.xxx.xxx.aaa port 5001 connected with xxx.xxx.xxx.bbb port 38590
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 3] 0.0-10.2 sec 324 MBytes 266 Mbits/sec 12.213 ms 356145/587534 (61%)
[ 3] 0.0-10.2 sec 1 datagrams received out-of-order
ifconfig reports no problems, but here is the output of netstat -su:

Quote:
IcmpMsg:
InType3: 19
OutType3: 27
OutType8: 14
Udp:
1181505 packets received
108486 packets to unknown port received.
2348066 packet receive errors
1481319 packets sent
RcvbufErrors: 2348066
UdpLite:
IpExt:
InMcastPkts: 17
InBcastPkts: 1926
InOctets: 1154882666
OutOctets: -2076309331
InMcastOctets: 476
InBcastOctets: 386075
(This is after several tests. After each one, not surprisingly, RcvbufErrors increases by the same number as the number of lost packets reported by iperf.)
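
The comparison of RcvbufErrors before and after a run can be automated by reading `/proc/net/snmp`, which carries the same UDP counters that `netstat -su` prints. A minimal sketch, assuming the usual two-line "header, then values" layout of that file; the function names are made up for illustration:

```python
def parse_udp_counters(snmp_text):
    """Parse the 'Udp:' header/value line pair from /proc/net/snmp text."""
    header = None
    for line in snmp_text.splitlines():
        if not line.startswith("Udp:"):
            continue  # also skips "UdpLite:" lines
        fields = line.split()[1:]
        if header is None:
            header = fields  # first Udp: line names the columns
        else:
            return dict(zip(header, (int(v) for v in fields)))
    return {}


def counter_delta(before, after, name="RcvbufErrors"):
    """How much a counter grew between two snapshots."""
    return after.get(name, 0) - before.get(name, 0)


def snapshot(path="/proc/net/snmp"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_udp_counters(handle.read())
```

Calling `snapshot()` before and after an iperf run and passing both results to `counter_delta` gives the number of datagrams dropped because the socket receive buffer was full during that run.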

Increasing UDP receiving buffer didn't help. Xen network bridge is turned off.
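
When a bigger receive buffer seems to make no difference, it is worth confirming that the kernel actually granted it. On Linux a request made with `SO_RCVBUF` is silently capped by the `net.core.rmem_max` sysctl, and the value read back is double what was stored, to allow for bookkeeping overhead. A small sketch using only the standard `socket` module; the 4 MiB figure is an arbitrary example, not a recommendation:

```python
import socket


def request_rcvbuf(sock, requested_bytes):
    """Ask for a receive buffer size and return what the kernel reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


def probe_udp_rcvbuf(requested_bytes=4 * 1024 * 1024):
    """Open a throwaway UDP socket and report the granted buffer size."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        return request_rcvbuf(sock, requested_bytes)
    finally:
        sock.close()


if __name__ == "__main__":
    wanted = 4 * 1024 * 1024
    print(f"requested {wanted} bytes, kernel reports {probe_udp_rcvbuf(wanted)}")
```

If the reported size is far below the request, the system-wide limit has to be raised first. If the buffer really is large and RcvbufErrors keeps climbing, the receiver is simply not draining the socket fast enough.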

Any other suggestions? How can I determine precisely where the packets get dropped? Judging from all this, it is Xen's fault, so I'll try to explore their mailing lists...
 
Old 12-29-2010, 07:22 AM   #5
Dani1973
Member
 
Registered: Dec 2010
Distribution: Debian testing
Posts: 148

Rep: Reputation: 16
Have you tried testing this from a virtual machine and not dom0?

Or raise the specs of your Dom0 (especially RAM) for testing purposes.
The results you are getting come close to the transfer rates of a single hard drive (is Dom0 using swap for the received or sent data?).

Last edited by Dani1973; 12-29-2010 at 07:30 AM.
 
Old 12-29-2010, 09:38 AM   #6
okcomputer44
Member
 
Registered: Jun 2008
Location: /home/laz
Distribution: CentOS/Debian
Posts: 246

Rep: Reputation: 53
Well, I had a nightmare with Xen too. Bonding does not work with Xen at all either. http://www.cyberciti.biz/tips/linux-...interface.html

It took me a while to figure out why on earth the bonding did not work.
When I rebooted my server I realized it was using the Xen kernel instead of the regular one.
I changed it and after that everything worked well. Before that I lost every second ping packet!
So the Xen kernel can cause many network issues.
 
Old 12-29-2010, 12:30 PM   #7
rentze
LQ Newbie
 
Registered: Dec 2010
Posts: 3

Original Poster
Rep: Reputation: 0
Well, I have discovered the bottleneck. Actually, the problem is the CPU. Under a "normal" kernel, processing of iperf burns 100% cycles of one, and about 50% of the other core. In the configuration with Xen, originally I had only one core dedicated to Dom0... by far insufficient for this kind of processing! Even with both the cores activated, all the cycles get consumed (because networking requires more "thinking" under Xen) and the problem persists.
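
For anyone who wants to check for the same bottleneck, per-core load can be measured by taking two snapshots of `/proc/stat` during an iperf run and comparing them. This is a rough sketch of that calculation, assuming the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names are illustrative. Inside a Xen Dom0 the file only covers the virtual CPUs assigned to Dom0.

```python
def parse_cpu_times(stat_text):
    """Map 'cpuN' to its list of time counters from /proc/stat text."""
    times = {}
    for line in stat_text.splitlines():
        parts = line.split()
        # Keep per-core lines ("cpu0", "cpu1", ...), skip the aggregate "cpu".
        if parts and parts[0].startswith("cpu") and parts[0] != "cpu":
            times[parts[0]] = [int(value) for value in parts[1:]]
    return times


def busy_percent(before, after):
    """Per-core busy percentage between two parse_cpu_times() snapshots."""
    result = {}
    for cpu, new in after.items():
        old = before.get(cpu)
        if old is None:
            continue
        deltas = [n - o for n, o in zip(new, old)]
        total = sum(deltas[:8])   # user..steal; guest time is part of user
        idle = sum(deltas[3:5])   # idle + iowait
        result[cpu] = 100.0 * (total - idle) / total if total > 0 else 0.0
    return result


def snapshot(path="/proc/stat"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as handle:
        return parse_cpu_times(handle.read())
```

Taking `snapshot()` at the start and end of a test and feeding both into `busy_percent` shows whether any core is pinned near 100%. The softirq column (index 6) is where much of the kernel's receive-side network processing is accounted, so a large delta there points at packet handling rather than the iperf process itself.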

Now I guess that the solution is to buy a faster processor

Thanks again for all your suggestions!

Last edited by rentze; 12-29-2010 at 12:39 PM.
 
  


All times are GMT -5. The time now is 11:37 PM.
