I have recently upgraded my home network to gigabit Ethernet, expecting to get NFS transfer speeds of around 50 MB/s. Unfortunately, the first results fall well short, at only around 17 MB/s. I have done a bit more testing, and here is what I came up with:
- During NFS transfers, CPU usage on the server is around 60% wait and 40% system, so it seems the network is the problem. CPU usage on the client is around 5%. The server is an IBM dual-P3/500 with 512 MB of RAM; the client is a Core 2 Duo (3 GHz) with 4 GB of RAM.
- Disk access on the server runs at around 40-60 MB/s. I expect the PCI bus to handle 50 MB/s of disk reads plus 50 MB/s of gigabit traffic (the total stays below the bus's 133 MB/s limit).
So it seems transfers are faster from the client to the server than the other way around, and that transmission without disk access is much faster on the server, but not on the client.
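If it helps to separate the two, the disk-only and network-only numbers can be measured independently with something like the following (a sketch; the device name and hostname are placeholders):
Code:
# raw sequential read speed of the server's disk, measured locally on the server
hdparm -t /dev/sda

# network-only throughput: run "iperf -s" on the server, then from the client:
iperf -c server -t 30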
client:
Code:
ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: umbg
Wake-on: d
Link detected: yes
server:
Code:
ethtool eth0
Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Half 1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: g
Wake-on: d
Current message level: 0x00000037 (55)
Link detected: yes
Does anyone know what is causing the performance bottleneck or how to fix it?
I put gigabit in about a year ago and expected to get about 900 Mbps. In practice I can get about 300-350 Mbps sustained transfer rates. I did some tweaking but didn't really improve things much. I suspect the PCI bus is in fact a bottleneck - unless you only have one device on it and the data is going one way!
In the end, because I was achieving three times the speed of 100BaseT and close to raw disk performance, there wasn't much point in pursuing it further.
I agree with you, I don't expect 900 Mb/s. I expect 35-40 MB/s (around 300 Mb/s), and that's what I get from the client to the server. My problem is that the server can only send data at half that speed. So it seems the problem is not the bus, because that would also slow down receiving on the server.
What kind of tweaking have you done to try to make yours faster?
On one machine I got the same asymmetric performance issue as you and used Ethereal (now Wireshark) to see what was happening. There were lots of TCP errors. I reduced the final numbers for wmem and rmem and played around until I got stable performance. Rather than 4194304 I used 1048576. The sysctl variables can be set like:
Code:
sysctl -w net.ipv4.tcp_wmem="4096 16384 1048576"
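The matching receive-side variable can be set the same way, and both can be made persistent across reboots via /etc/sysctl.conf (a sketch; adjust the values to taste):
Code:
# receive-buffer limits: min, default, max in bytes
sysctl -w net.ipv4.tcp_rmem="4096 87380 1048576"

# to keep the settings after a reboot, add to /etc/sysctl.conf:
#   net.ipv4.tcp_wmem = 4096 16384 1048576
#   net.ipv4.tcp_rmem = 4096 87380 1048576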
Since then I've done fresh installs of openSUSE 10.3 on all the other machines without needing to upgrade the drivers or tweak the sysctl variables. On the machine that had the problems, the variables are now:
Interestingly, when I run it between my production server and the new server (running 10.3), which I'm going to swap to, I get 335 Mbits/sec one way and 496 Mbits/sec the other. The new server is a dual-core AMD Opteron with a few GB of RAM, so I would expect asymmetric performance. If I run two clients against the new server they both get about 300 Mbits/sec, so it does look like CPU and bus performance affect things.
Thanks for the advice. I increased all the buffers to 4 MB and the asymmetric performance issue seems to be gone. However, while I get 500 Mb/s with iperf, which is good, I only get 320 Mb/s with netcat and 136 Mb/s with NFS, which is exactly what I had before. (Yes, I remounted before testing.) I tried both NFS v3 and v4 over TCP. NFS rsize and wsize are set to 64k.
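For reference, the mount options described above correspond to something like this (a sketch; the export path and mount point are placeholders, and NFSv3 is assumed):
Code:
# client-side mount with 64k read/write block sizes over TCP
mount -t nfs -o tcp,rsize=65536,wsize=65536 server:/export /mnt/export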
Looks like the usual benchmarking vs. real-world issue. As I said before, once I got it stable, given that raw disk performance (again measured with hdparm, not real-world!) limits data transfer to about 50 MB/sec, I didn't see the point in looking further. Maybe Wireshark could tell you more about the packet sizes and whether any TCP errors are slowing things down.
FTP performance is around 22 MB/s (vsftpd). NFS is at 19 MB/s.
Those are very similar. I suspect something other than the network is your bottleneck - namely the drives. What is your disk setup? Have you tried a dd test as a *write*, i.e.
Code:
time dd if=/dev/zero of=some_file bs=10M count=100
Run this locally on the server, not over NFS.
What are your MTU settings on both client and server (as can be seen via ifconfig; see the sketch after these questions)?
What sort of switch are you using?
Do you see a difference in speed when fetching a file from the NFS server as opposed to writing one?
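A minimal sketch of those checks (the interface name, mount point and file names are placeholders):
Code:
# check the MTU on each end (look for the "MTU:" field in the output)
ifconfig eth0

# NFS write test, run on the client against the mounted export
time dd if=/dev/zero of=/mnt/export/testfile bs=10M count=100

# NFS read test, run on the client; remount first so the client cache doesn't skew it
time dd if=/mnt/export/testfile of=/dev/null bs=10M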
The server has a 10 GB SCSI disk for the system, a 320 GB IDE disk for /home, and a 1 TB SATA disk for file storage (mostly pictures and video).
Write performance (on the server) is 24 MB/s for the IDE disk, 53 MB/s for the SATA disk, and an abysmal 16 MB/s for the SCSI disk (that disk is almost 10 years old). I have a spare IDE controller somewhere; maybe that can improve IDE performance.
NFS write is 12 MB/s for the IDE disk and 17 MB/s for the SATA disk; read is 18 and 21 MB/s respectively.
I tried an MTU of both 1500 and 9000 (jumbo frames) on both the client and the server (yes, the switches support it). The switches are unmanaged Netgear and D-Link gigabit switches.
I just found something that may be interesting: if I fetch a file over NFS, remount the export (to flush the client cache, but keep the server's), and fetch it again, the speed is around 60 MB/s. So it does seem to be something about the disks on the server.
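For anyone wanting to repeat that test, it was roughly this sequence (a sketch; the paths are placeholders):
Code:
# first read: the server has to hit its disks
time dd if=/mnt/export/bigfile of=/dev/null bs=1M

# remount to drop the client-side cache but leave the server's page cache warm
umount /mnt/export && mount /mnt/export

# second read: served from the server's page cache, so the disks are out of the picture
time dd if=/mnt/export/bigfile of=/dev/null bs=1M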
I think that confirms it - if iperf is symmetrical at ~500 Mbps, then the networking is going as fast as it can and is probably limited by the PCI bus. When I had asymmetric performance and some very low figures on one machine, it was due to TCP errors and retries; changing the net.ipv4 sysctl variables fixed that.
By the way, when rerunning iperf yesterday and getting ~330 Mbps, one machine was only getting ~200 Mbps. However, there was a MySQL process consuming about 60% CPU at the time, so it also depends on CPU usage as well as the PCI bus.
I also had a passing look at jumbo frames while experimenting with the sysctl variables, and looked into the theoretical performance gains that could be made. However, they were marginal at best, and I decided they were irrelevant given the PCI bus and disk access limitations.
I think my newest server can manage ~600 Mbps, but the clients can't keep up!
The new server has a SATA drive but only gets ~75 MB/sec, which isn't much better than IDE, so maybe I'll have a look at that myself.
Running iperf and dd reads on both disks at the same time gives 57 MB/s in iperf, 13 MB/s on hda and 27 MB/s on sda, for a total of 97 MB/s on the bus. (Performance is the same when reading from a file rather than doing raw disk access.)
So the bottleneck is not the bus. What I found interesting is that disk performance drops by half under heavy network usage. CPU during the combined iperf/dd runs is 70% sys / 0% idle / 30% wait, and dd alone uses 45% CPU. So that's the bottleneck. Maybe I should first look at lowering CPU usage for disk access (is that possible? DMA and ACPI are already on).
But during an NFS operation the CPU is 25% sys / 10% idle / 65% wait and I only get 18 MB/s. Why is wait so high if neither device is at full speed and the CPU is not maxed out?
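As a starting point for the disk-side CPU question above, the DMA and 32-bit I/O settings of an IDE drive can be checked with hdparm (a sketch; /dev/hda is assumed):
Code:
# show whether DMA and 32-bit I/O support are currently enabled
hdparm -d -c /dev/hda

# enable both if they are off (use with care on old hardware)
hdparm -d1 -c1 /dev/hda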