high IOWait on server when copying files from network
Hi there,
We've got a problem with our SMB fileserver here: every time we copy data from the network to the server, the IOWait time hits the 90% mark, the load average rises above 10, and the throughput drops to 3 MB/s. Browsing the file tree via SMB at the same time is virtually impossible (you have to wait 15 to 20 seconds before Windows Explorer shows you the directory contents). Copying files from the server is not a problem and works like a charm. But first, here's some information about the hardware. The server is an FSC RX300 S3:
The thing is, I can't find the bottleneck that is causing these high IOWait times, but I was able to rule out several possible causes:
And here is the output of some programs I ran while copying data to the server: Code:
$ mpstat 1

So maybe somebody has an idea where the problem might be.
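If per-device numbers would help, I can also capture extended I/O statistics while copying (this assumes the sysstat tools are installed):
Code:
# extended per-device I/O statistics, refreshed every second
$ iostat -x 1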
Quote:
While my experience is with the copper Intel Pro/1000 GT, see if the e1000 is generating excessive interrupts. Check out /usr/src/linux/Documentation/networking/e1000.txt and http://support.intel.com/support/net...o100/21397.htm for details on the InterruptThrottleRate option to the e1000 module. You can watch your interrupts by running `vmstat 5` in another window while a download is in progress.
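Setting the option when (re)loading the module would look something like this; the value 8000 is only an example rate, not a recommendation, so check e1000.txt for the documented modes:
Code:
# reload the e1000 module with a fixed interrupt throttle rate
# (do this from the local console - unloading the module drops the link)
modprobe -r e1000
modprobe e1000 InterruptThrottleRate=8000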
Also, try making a RAM disk and downloading to it:
Code:
mkdir /mnt/ram
mount -t tmpfs tmpfs /mnt/ram

HTH
Well first, thanks for the reply. :)
Second, how do I see if the two cards are connected to the same PCI bus? I tried "lspci -t": Code:
-[0000:00]-+-00.0

Does that indicate they are both on the same PCI bus? If so, then maybe that's not the problem, because I also tried the onboard SCSI HBA (PCI ID 02:08.0), which resulted in the same problem. I also switched from the Intel Pro/1000 to the onboard Broadcom 1 Gb copper card (PCI ID 04:00.0), still with the same symptoms. (All that is, if the above tree view really shows the different PCI buses.)
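For what it's worth, the tree can also be printed with device names next to the bus addresses, which makes it easier to see which bridge each card sits behind, and a single device can be inspected directly:
Code:
# tree view with device names
lspci -tv
# full details for one device, e.g. the Broadcom NIC at 04:00.0
lspci -v -s 04:00.0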
I also messed around a bit with the TCP congestion control protocol and various other TCP stack "optimizations", like changing the receive buffer size (the buffer knobs are sketched at the end of this post), which seems to soften the problem a bit. Here's what I changed:
Code:
echo "reno" > /proc/sys/net/ipv4/tcp_congestion_control

I will try changing to a 2.6.20 kernel when we can afford a little downtime (I hope that will be this week; maybe I also messed something up in the kernel config). Oh, and I remembered that we once had VMware running on that server; maybe the vmnet and vmbridge modules are interfering with the internal network handling. I will get rid of them once I've changed the kernel.

I nearly forgot: the ramdisk test. I copied a 561 MB file from the network to the ramdisk, first got around 9 MB/s, which dropped to 4 MB/s after about 300 MB. The interrupts were around 4000 while copying. Moving that file from the ramdisk to the FC disk took 5.590 sec (!), which works out to roughly 100 MB/s, so the disk subsystem itself seems to be quite fast.

Again, thanks for your suggestions.
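As for the receive-buffer tuning mentioned above, the tunables live next to the congestion-control one in /proc; a minimal sketch, with illustrative values rather than the exact ones used:
Code:
# maximum socket receive buffer in bytes (illustrative value)
echo 4194304 > /proc/sys/net/core/rmem_max
# min / default / max for TCP receive buffer autotuning (illustrative values)
echo "4096 87380 4194304" > /proc/sys/net/ipv4/tcp_rmem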