LinuxQuestions.org


RadioActiveLamb 07-25-2008 04:02 PM

Horrible Ubuntu performance
 
Let me preface this question with a little background and system information:

I am a Windows veteran. I have lots of hardware and network experience and have been a systems manager for nearly 20 years. I am just now dabbling in Linux. What better way than to drop it onto a production box and force myself to use it? :)

The box is an Intel server motherboard with four dual-core Xeon 2.5GHz processors. It has 4GB RAM and RAID-5 storage with five physical drives and one logical volume. The RAID controller is a 64-bit PCI Compaq "Smart Array" 431. This controller is a dinosaur, but it's what we had lying around. The firmware is current and it operated just fine under Windows 2003.

I've installed Ubuntu 8.04 with the Gnome desktop. I've also installed Samba so that I can file-share with our existing Windows 2003 servers.

We're noticing that performance on this machine is absolutely terrible, especially when there is any disk activity. It seems that a single process runs at an acceptable level, but as soon as we try to do two operations at the same time, performance simply goes away. For example, while a large file is being copied from a Windows box, trying to access shared folders on the Ubuntu machine from another Windows workstation causes Explorer to time out. Once the file copy is done, the Ubuntu box is snappy again.

Another example keeps it all on one box: copying a 20GB file from one local folder to another under Gnome causes excruciatingly slow start-ups for relatively small Gnome applications (terminal, etc.).

The question is... Where do I START to diagnose this? Please be specific when asking for more information. For example, if you want to see a config file or log, I need to know how to find that config file or log.

Thanks for any help toward speeding this server up.

arizonagroovejet 07-25-2008 04:15 PM

The only time I've seen performance issues like that was on a desktop machine where DMA was disabled for the hard disk. Enabling DMA sorted it out. Whether this would apply to your hardware I really don't know, but it's worth a check. Try
Code:

$  sudo hdparm  /dev/whatever
where whatever is your hard disk device node, and look for mention of DMA in the output. If you don't know what device node the hard disk has, run

Code:

$ sudo fdisk -l | grep Disk
You may also want to check that all processors are recognised, though I doubt that would cause the problems you're describing

Code:

$ grep processor /proc/cpuinfo  | wc -l
The result should be 8. (4 processors with 2 cores each)
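
As a rough cross-check of that count, the same file can also be summarised by socket. This is only a sketch: the function name cpu_summary is made up for illustration, and it assumes an x86 /proc/cpuinfo that includes "physical id" lines, which not every kernel or architecture provides.

```shell
#!/bin/sh
# Summarise a /proc/cpuinfo-format file: logical CPUs and distinct sockets.
# cpu_summary is a hypothetical helper name; pass a file, or it reads /proc/cpuinfo.
cpu_summary() {
    awk -F': *' '
        /^processor[ \t]*:/   { logical++ }          # one line per logical CPU
        /^physical id[ \t]*:/ { sockets[$2] = 1 }    # remember each distinct socket id
        END {
            n = 0
            for (id in sockets) n++
            printf "logical=%d sockets=%d\n", logical, n
        }
    ' "${1:-/proc/cpuinfo}"
}

# Typical use on the machine itself:
#   cpu_summary
```

If the kernel reports physical ids, four dual-core Xeons would be expected to show up as 8 logical CPUs across 4 sockets.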

RadioActiveLamb 07-25-2008 04:29 PM

Here's what I get:

Code:

$ sudo fdisk -l | grep Disk
Disk /dev/ida/c0d0: 293.6 GB, 293609594880 bytes
Disk identifier: 0xe8408740

and for hdparm (DMA shouldn't apply to a SCSI host adapter):

Code:

$ sudo hdparm /dev/ida/c0d0
/dev/ida/c0d0:
 HDIO_GET_32BIT failed: Invalid argument
 HDIO_GET_UNMASKINTR failed: Invalid argument
 HDIO_GET_DMA failed: Invalid argument
 HDIO_GET_KEEPSETTINGS failed: Invalid argument
 readonly      =  0 (off)
 readahead    = 256 (on)
 geometry      = 35696/255/63, sectors = 573456240, start = 0

Processor power doesn't seem to be a problem. Watching the system log, the average cpu load rarely goes over 4%. It sees all the cores.

Code:

$ grep processor /proc/cpuinfo | wc -l
8

Thanks for the quick reply. Hopefully this is the first step to resolving this problem.

arizonagroovejet 07-25-2008 04:40 PM

Hmmm, yeah, forget DMA then; as you say, it doesn't apply to SCSI. The machine I had hideous disk access performance issues with had a PATA drive. I have no other suggestion. A quick Google suggests the RAID controller ought to work OK with Linux. Novell lists it as compatible with some of their Linux products. http://www.novell.com/partnerguide/product/200680.html

jkzfixme 07-25-2008 07:21 PM

DNS issue?
 
Try adding the machine name and IP address pairs for your clients and your server to the appropriate hosts file (/etc/hosts under Linux, something like c:\windows\system32\drivers\etc\hosts under Windows).
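
Before editing, it may help to check whether a name is already listed. The sketch below assumes a standard hosts-format file (address first, then one or more names). The helper hosts_has_name, the name fileserver and the address 192.168.1.10 are placeholders for illustration, not values from this thread.

```shell
#!/bin/sh
# Succeed (exit 0) if NAME appears as a hostname or alias in a hosts-format FILE.
# hosts_has_name is a hypothetical helper; hostnames are compared case-insensitively.
hosts_has_name() {
    file=$1
    name=$2
    awk -v name="$name" '
        { sub(/#.*/, "") }    # drop comments before looking at the fields
        {
            # field 1 is the address; every later field is a name or alias
            for (i = 2; i <= NF; i++)
                if (tolower($i) == tolower(name)) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Typical use (placeholder name and address):
#   hosts_has_name /etc/hosts fileserver ||
#       echo "192.168.1.10 fileserver" | sudo tee -a /etc/hosts
```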


Regards
JKZfixme

jkzfixme 07-25-2008 07:29 PM

You also might want to add this to the [global] section of your Samba conf file (smb.conf):

socket options = TCP_NODELAY SO_RCVBUF=16384 SO_SNDBUF=16384


Regards
JKZfixme

syg00 07-25-2008 07:55 PM

I run an old quad (P-III) Xeon with (Adaptec) SCSI RAID-5 and it runs fine. It has (in the past) had Ubuntu as a host for vmware, and trundled along quite nicely. Try "dmesg | less" and see if anything looks amiss (less is a pager, "q" to quit; dmesg shows {most of} the boot messages you don't normally see on Ubuntu).

Sounds like the disk is the problem if you see it on local transfers as well. Try this when things are slow and post the output:
Code:

top -b -n 1 | awk '{if (NR <=7) print; else if ($8 == "D") {print; count++} } END {print "Total tasks in I/O sleep: "count}'
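
For reference, a roughly equivalent check can be built on ps instead of top, which avoids depending on top's column layout. This is a sketch, not a replacement for the command above: filter_io_sleep is a made-up helper name, and the usage line assumes the procps ps shipped with Ubuntu.

```shell
#!/bin/sh
# Read "state pid command" lines on stdin, print those in uninterruptible
# sleep (state D, usually waiting on I/O), then a total that defaults to 0.
# filter_io_sleep is a hypothetical helper name.
filter_io_sleep() {
    awk '
        $1 ~ /^D/ { print; count++ }    # matches "D" as well as "D+", "D<" and so on
        END { print "Total tasks in I/O sleep: " (count + 0) }
    '
}

# Typical use while a slow copy is running:
#   ps -eo state=,pid=,comm= | filter_io_sleep
```

Adding 0 to the counter means the final line reads 0 rather than ending in a blank when no task is in state D.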

RadioActiveLamb 07-29-2008 12:11 PM

Here are the results. It appears that the CPUs aren't being taxed at all.

Code:

top -b -n 1 | awk '{if (NR <=7) print; else if ($8 == "D") {print; count++} } END {print "Total tasks in I/O sleep: "count}'
top - 10:09:25 up 4 days,  1:48,  2 users,  load average: 3.12, 5.07, 2.57
Tasks: 181 total,  1 running, 180 sleeping,  0 stopped,  0 zombie
Cpu(s):  0.7%us,  2.2%sy,  0.0%ni, 93.6%id,  3.3%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:  4025132k total,  3329704k used,  695428k free,    82916k buffers
Swap: 11655116k total,    39872k used, 11615244k free,  2779392k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
Total tasks in I/O sleep:


RadioActiveLamb 07-29-2008 01:06 PM

When I'm duplicating a large file on the RAID volume, I get a total of 5.5-6MB/sec. That's horrible and painful. If I try to duplicate a second file at the same time, both copies slow down so that the TOTAL is still 5.5-6MB/sec. :scratch:

Raid volume to LAN share: http://files.lambville.com/pix/Screenshot.gif
Volume to volume duplicate: http://files.lambville.com/pix/Screenshot-1.gif

The RAID controller is capable of much faster speeds than that, and I'm certain that this 64-bit PCI Intel server mainboard is capable.

Any ideas what to try next?
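
One way to get a comparable number from a terminal, independent of the Gnome copy dialog, is to time a flushed write with dd. This is a minimal sketch, assuming GNU dd (for bs=1M and conv=fsync). The helper write_throughput and the example path are hypothetical, and whole-second timing makes it coarse for short runs.

```shell
#!/bin/sh
# Write SIZE_MB mebibytes of zeros to TARGET, flush to disk, print whole MB/s.
# write_throughput is a hypothetical helper. TARGET is overwritten and then removed.
write_throughput() {
    target=$1
    size_mb=$2
    start=$(date +%s)
    # conv=fsync makes dd flush the data to disk before it exits, so the
    # page cache does not inflate the result.
    dd if=/dev/zero of="$target" bs=1M count="$size_mb" conv=fsync 2>/dev/null || return 1
    end=$(date +%s)
    elapsed=$((end - start))
    [ "$elapsed" -gt 0 ] || elapsed=1    # avoid dividing by zero on very fast runs
    echo "$((size_mb / elapsed)) MB/s"
    rm -f "$target"
}

# Typical use (placeholder path on the RAID volume, 1 GiB test file):
#   write_throughput /path/on/raid/volume/ddtest.tmp 1024
```

Running two of these at once against different files would show whether the combined rate stays pinned at the same total.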

syg00 07-29-2008 08:04 PM

That card has been supported in Linux for ages (BLK_CPQ_DA). On my Ubuntu (desktop) it is selected as a module in the config, so if you have the hardware, it should load the kernel module automagically for you on boot. What does "lsmod | grep -i cpq" produce?

BTW, that listing above shows no evidence of a problem that we can chase, so it wasn't what I thought might be the issue.

RadioActiveLamb 01-02-2009 12:04 PM

The problem turned out to be a faulty RAID controller. We swapped it out with a different model. The new RAID-10 is flying. Thanks for all your suggestions.

farslayer 01-02-2009 03:09 PM

THANK YOU for coming back and posting the resolution, even if it was just defective hardware.


All times are GMT -5.