Well, first, thanks for the reply.
Second, how do I see whether the two cards are connected to the same PCI bus?
I tried "lspci -t".
Code:
-[0000:00]-+-00.0
           +-02.0-[0000:01-03]--+-00.0-[0000:02]--+-08.0
           |                    |                 \-08.1
           |                    \-00.2-[0000:03]--
           +-04.0-[0000:04]----00.0
           +-05.0-[0000:05]----00.0
           +-06.0-[0000:06-08]--+-00.0-[0000:07]----01.0
           |                    \-00.2-[0000:08]----01.0
           +-1d.0
           +-1d.1
           +-1d.2
           +-1d.3
           +-1d.7
           +-1e.0-[0000:09]----05.0
           +-1f.0
           +-1f.1
           \-1f.3
Here 08:01.0 is the FC-HBA and 07:01.0 is the Intel Pro 1000.
Does that indicate they are both on the same PCI bus?
If so, then maybe that's not the problem, because I also tried the onboard SCSI-HBA (PCI ID: 02:08.0), which resulted in the same problem.
I also switched from the Intel Pro 1000 to the onboard Broadcom 1 Gbit copper card (PCI ID: 04:00.0), still with the same symptoms.
(All of that assumes the tree view above really shows the different PCI buses.)
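One way to cross-check the tree is to look at the resolved sysfs path of each device, since every path component above the device is a bridge it sits behind. The following is only a sketch, assuming sysfs is mounted at /sys and GNU readlink is available. The helper names pci_path and pci_parent are made up for illustration. Read literally, the tree puts 07:01.0 behind 06:00.0 and 08:01.0 behind 06:00.2, with both bridges hanging off 00:06.0, so the two cards should report different parents.

```shell
#!/bin/sh
# Hypothetical helpers, not part of pciutils.

# Resolve the real sysfs path of a PCI device given its full address.
pci_path() {
    readlink -f "/sys/bus/pci/devices/$1"
}

# Print the last-but-one component of a resolved path, i.e. the
# bridge (or root complex) directly upstream of the device.
pci_parent() {
    basename "$(dirname "$1")"
}

# Only query devices that actually exist on this machine.
for dev in 0000:07:01.0 0000:08:01.0; do
    if [ -e "/sys/bus/pci/devices/$dev" ]; then
        echo "$dev sits behind $(pci_parent "$(pci_path "$dev")")"
    fi
done
```

If both devices print the same parent, they share a bus segment; different parents mean they only meet further upstream.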
I tried and messed around a bit with the TCP congestion control algorithm and various other TCP stack "optimizations", like changing the receive buffer size, which seems to soften the problem a bit.
This is what I changed:
Code:
echo "reno" > /proc/sys/net/ipv4/tcp_congestion_control
echo 1 > /proc/sys/net/ipv4/tcp_no_metrics_save
echo 16777216 > /proc/sys/net/core/rmem_max
echo 16777216 > /proc/sys/net/core/wmem_max
echo "4096 87380 16777216" > /proc/sys/net/ipv4/tcp_rmem
echo "4096 87380 16777216" > /proc/sys/net/ipv4/tcp_wmem
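The same settings can also be expressed as sysctl keys: the key is the path below /proc/sys with the slashes replaced by dots. The sketch below only prints the equivalent "sysctl -w" commands and does not change anything. The function names are made up for illustration.

```shell
#!/bin/sh
# Convert a /proc/sys path into the dotted key that sysctl expects.
proc_to_sysctl_key() {
    printf '%s\n' "$1" | sed -e 's|^/proc/sys/||' -e 's|/|.|g'
}

# Print (not execute) the sysctl equivalent of each echo line above.
# "read" puts the first word in path and the rest of the line in value.
print_sysctl_cmds() {
    while read -r path value; do
        printf 'sysctl -w %s="%s"\n' "$(proc_to_sysctl_key "$path")" "$value"
    done <<'EOF'
/proc/sys/net/ipv4/tcp_congestion_control reno
/proc/sys/net/ipv4/tcp_no_metrics_save 1
/proc/sys/net/core/rmem_max 16777216
/proc/sys/net/core/wmem_max 16777216
/proc/sys/net/ipv4/tcp_rmem 4096 87380 16777216
/proc/sys/net/ipv4/tcp_wmem 4096 87380 16777216
EOF
}

print_sysctl_cmds
```

Like the echo lines, values set this way are lost on reboot unless they are also placed in /etc/sysctl.conf.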
I still have the problem of the copy "stalling" every now and then for about 20 to 30 seconds, but the server's load average now stays below 3.
I will try switching to a 2.6.20 kernel when we can afford a little downtime (I hope that will be this week; maybe I messed something up in the kernel config).
Oh, and I remembered that we once had VMware running on that server; maybe the vmnet and vmbridge modules are interfering with the internal network handling. I will get rid of them once I have changed the kernel.
I nearly forgot the ramdisk test:
I copied a 561 MB file from the network to the ramdisk. It first ran at around 9 MB/s, which dropped to 4 MB/s after about 300 MB. The interrupts were around 4000 while copying.
Moving that file to the FC disk took 5.590 seconds (!), which seems to be quite fast.
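To put the two numbers side by side, the move to the FC disk can be converted to a rate. This is just the arithmetic on the figures above, with a made-up helper name:

```shell
#!/bin/sh
# Throughput in MB/s from a size in MB and a duration in seconds.
mb_per_s() {
    awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.1f\n", mb / s }'
}

mb_per_s 561 5.590   # prints 100.4
```

So the ramdisk-to-disk move ran at roughly 100 MB/s, against 4 to 9 MB/s for the copy from the network.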
Again, thanks for your suggestions.