LinuxQuestions.org
05-23-2004, 09:17 AM   #1
Boss Hoss
Member
 
Registered: Sep 2003
Distribution: SuSe
Posts: 62

Rep: Reputation: 15
High load - but CPU 99% idle?


I don't understand why a top report shows such a high server load. I've even turned off httpd for 5 minutes, since that seemed to be a large part of the load I was seeing yesterday. I posted in another topic about hdb DMA entries in my logs, but I was at least able to lower the load to 0.5 when I turned off httpd.

10:13:15 up 17:20, 2 users, load average: 17.22, 17.96, 16.04
100 processes: 96 sleeping, 4 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 99.9%
Mem: 1000284k av, 916340k used, 83944k free, 0k shrd, 193196k buff
366316k active, 442416k inactive
Swap: 4048296k av, 19608k used, 4028688k free 452296k cached

PID   USER   PRI NI SIZE RSS  SHARE STAT %CPU %MEM TIME CPU COMMAND
1     root   16  0  428  428  368   S    0.0  0.0  0:04 0   init
2     root   15  0  0    0    0     SW   0.0  0.0  0:00 0   keventd
3     root   34  19 0    0    0     SWN  0.0  0.0  0:00 0   ksoftirqd/0
5     root   25  0  0    0    0     SW   0.0  0.0  0:00 0   bdflush
4     root   16  0  0    0    0     SW   0.0  0.0  0:04 0   kswapd
6     root   15  0  0    0    0     SW   0.0  0.0  0:01 0   kupdated
7     root   20  0  0    0    0     SW   0.0  0.0  0:00 0   mdrecoveryd
11    root   15  0  0    0    0     SW   0.0  0.0  0:48 0   kjournald
73    root   15  0  0    0    0     SW   0.0  0.0  0:00 0   khubd
402   root   15  0  0    0    0     SW   0.0  0.0  0:00 0   usb-storage-0
403   root   18  0  0    0    0     SW   0.0  0.0  0:00 0   scsi_eh_0
3657  root   15  0  0    0    0     SW   0.0  0.0  0:00 0   kjournald
3689  root   15  0  0    0    0     DW   0.0  0.0  0:23 0   kjournald
4069  root   15  0  0    0    0     SW   0.0  0.0  0:03 0   kjournald
4091  root   15  0  0    0    0     SW   0.0  0.0  0:45 0   kjournald
5058  root   15  0  576  576  492   R    0.0  0.0  0:09 0   syslogd
5106  root   15  0  1228 1228 356   S    0.0  0.1  0:00 0   klogd
5148  root   17  0  652  652  468   S    0.0  0.0  0:00 0   smartd
5217  named  18  0  2640 2640 1968  S    0.0  0.2  0:01 0   named
5230  root   17  0  1648 1644 1492  S    0.0  0.1  0:00 0   sshd
5242  root   23  0  1460 1460 1268  S    0.0  0.1  0:00 0   sshd
5259  root   15  0  948  948  888   S    0.0  0.0  0:00 0   xinetd
5273  ntp    16  0  2740 2740 2216  S    0.0  0.2  0:05 0   ntpd
5283  clamav 15  0  964  960  820   S    0.0  0.0  0:00 0   freshclam
5421  root   15  0  2644 2644 1940  S    0.0  0.2  0:02 0   sendmail
5429  root   16  0  2416 2412 1760  S    0.0  0.2  0:00 0   sendmail
5438  smmsp  16  0  2284 2280 1752  S    0.0  0.2  0:00 0   sendmail
 
05-23-2004, 01:17 PM   #2
rottie
Member
 
Registered: Oct 2003
Posts: 64

Rep: Reputation: 15
Maybe it's not the CPU but disk access that's heavily loaded?
Is your system doing a lot of reading/writing to (log) files?

Do you experience the heavy load yourself?
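
For context: on Linux the load average counts tasks in uninterruptible sleep (state D, usually waiting on disk I/O) as well as runnable tasks (state R), which is how the load can sit at 17 while the CPU is idle. Below is a minimal Python sketch, assuming top-style rows like those in post #1 with the state in the STAT column. The function name `load_contributors` is made up for illustration.

```python
def load_contributors(top_rows):
    """Return (pid, stat, command) for rows whose state starts with R or D.

    Assumes the 13-column layout of the old procps top shown in this thread:
    PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
    """
    found = []
    for line in top_rows.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything that is not a process row.
        if len(fields) < 13 or not fields[0].isdigit():
            continue
        stat = fields[7]
        if stat[0] in ("R", "D"):
            found.append((int(fields[0]), stat, fields[12]))
    return found


# Three rows copied from the listing in post #1, plus its header.
sample = """\
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
11 root 15 0 0 0 0 SW 0.0 0.0 0:48 0 kjournald
3689 root 15 0 0 0 0 DW 0.0 0.0 0:23 0 kjournald
5058 root 15 0 576 576 492 R 0.0 0.0 0:09 0 syslogd
"""

for pid, stat, command in load_contributors(sample):
    print(pid, stat, command)
```

The listing in post #1 is only partial, so most of the tasks behind the 17.x load are not visible there. The kjournald thread in state DW is one that does show up, and it is waiting on the disk.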
 
05-23-2004, 02:27 PM   #3
Boss Hoss
Member
 
Registered: Sep 2003
Distribution: SuSe
Posts: 62

Original Poster
Rep: Reputation: 15
Yes, I'm going to say the problem is with the disk access.

I have my server set up using Ensim Webppliance with a dual HD config. Everything to do with the virtual hosting is on /dev/hdb in the /home directory, and everything else is on /dev/hda.

I was getting I/O errors and bad sectors on hdb, and we decided to change it out because I kept getting corrupt MySQL tables, which is just bizarre. After replacing the drive we couldn't bring up the server; it would just hang in the boot process. We could put both drives into a similar 1U server and it would come up. So we replaced the mobo, and then it would come up.

So the server has been running pretty well since Wed 7:45. But on Saturday morning I saw some DMA timeout errors on hdb. The server load would just peg. I turned off the httpd service and the load would come down, but the RAM usage never did. Since all virtual hosting is on /home, I believe the server load is due to the disk read/write speed on hdb.

Last night I also ran another script that does mass emailing (a legit newsletter) and it just crawled. So I checked the disk read speed:

[root@startbox log]# hdparm -t /dev/hdb
/dev/hdb:
Timing buffered disk reads: 2 MB in 10.28 seconds = 199.22 kB/sec

and it's doing this with the following settings (which my colo support told me to run it with):

/dev/hdb:
multcount = 16 (on)
IO_support = 1 (32-bit)
unmaskirq = 1 (on)
using_dma = 1 (on)
keepsettings = 0 (off)
readonly = 0 (off)
readahead = 8 (on)
geometry = 9964/255/63, sectors = 160086528, start = 0

but in my log files I was seeing this:

May 22 22:31:58 startbox kernel: hdb: DMA disabled
May 22 22:32:18 startbox kernel: hdb: dma_timer_expiry: dma status == 0x01
May 22 22:32:28 startbox kernel: hdb: error waiting for DMA
May 22 22:32:28 startbox kernel: hdb: dma timeout retry: status=0x50 { DriveReady SeekComplete }

When I tried to restart the server remotely, I was seeing this in the log file repeatedly:

May 23 12:34:26 startbox kernel: end_request: I/O error, dev 03:42 (hdb), sector 0


I finally had to drive down to the NOC and restart the machine manually. After checking root file system integrity, the server came right up, and here's where things are with all my sites on:

15:20:26 up 2:14, 1 user, load average: 1.04, 1.11, 1.02
125 processes: 119 sleeping, 6 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 50.8% 0.0% 6.9% 0.0% 0.0% 0.0% 42.1%
Mem: 1000284k av, 719524k used, 280760k free, 0k shrd, 83716k buff
244456k active, 419784k inactive
Swap: 4048296k av, 0k used, 4048296k free 413740k cached

And I have it set to "auto" in the BIOS and didn't send any of the hdparm commands colo told me to (this is the same as hda):

/dev/hdb:
multcount = 16 (on)
IO_support = 0 (default 16-bit)
unmaskirq = 0 (off)
using_dma = 1 (on)
keepsettings = 0 (off)
readonly = 0 (off)
readahead = 8 (on)
geometry = 9964/255/63, sectors = 160086528, start = 0

Weblogs are written every 10 minutes.
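
To see how often these kernel messages recur, the log excerpts quoted above can be tallied. This is a rough Python sketch, not a complete log parser: it assumes the syslog line format shown in this post, and the marker strings are taken only from the excerpts here, so other DMA error messages would be missed. The name `tally_disk_errors` is hypothetical.

```python
import re
from collections import Counter

# Matches the syslog prefix used in the excerpts:
# "May 22 22:31:58 startbox kernel: <message>"
KERNEL_LINE = re.compile(r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+kernel:\s+(?P<msg>.*)$")

# Error texts seen in this thread; an assumption, not an exhaustive list.
DMA_MARKERS = ("DMA disabled", "dma_timer_expiry", "error waiting for DMA", "dma timeout")


def tally_disk_errors(log_text, device="hdb"):
    """Count DMA-error and I/O-error kernel messages that mention `device`."""
    counts = Counter()
    for line in log_text.splitlines():
        match = KERNEL_LINE.match(line.strip())
        if not match:
            continue
        msg = match.group("msg")
        if device not in msg:
            continue
        if "I/O error" in msg:
            counts["io_error"] += 1
        elif any(marker in msg for marker in DMA_MARKERS):
            counts["dma"] += 1
    return counts


sample = """\
May 22 22:31:58 startbox kernel: hdb: DMA disabled
May 22 22:32:18 startbox kernel: hdb: dma_timer_expiry: dma status == 0x01
May 22 22:32:28 startbox kernel: hdb: error waiting for DMA
May 22 22:32:28 startbox kernel: hdb: dma timeout retry: status=0x50 { DriveReady SeekComplete }
May 23 12:34:26 startbox kernel: end_request: I/O error, dev 03:42 (hdb), sector 0
"""

print(dict(tally_disk_errors(sample)))
```

On a real system the text would come from the system log file instead of the `sample` string. Where that file lives depends on the distribution.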
 
05-23-2004, 02:34 PM   #4
Boss Hoss
Member
 
Registered: Sep 2003
Distribution: SuSe
Posts: 62

Original Poster
Rep: Reputation: 15
Now I have seen one dma entry in the log since bringing the server back up with the setting I just posted. I saw an entry like this on Saturday am before the problems started.

May 23 13:06:49 startbox kernel: ide0: BM-DMA at 0xff00-0xff07, BIOS settings: hda:DMA, hdb:DMA
May 23 13:06:49 startbox kernel: hdb: Maxtor 6Y080L0, ATA DISK drive
May 23 13:06:49 startbox kernel: hdb: attached ide-disk driver.
May 23 13:06:49 startbox kernel: hdb: host protected area => 1
May 23 13:06:49 startbox kernel: hdb: 160086528 sectors (81964 MB) w/2048KiB Cache, CHS=9964/255/63, UDMA(133)
May 23 13:06:49 startbox kernel: hdb: hdb1 hdb2
May 23 13:44:55 startbox kernel: hdb: dma_timer_expiry: dma status == 0x61

So now I'm wondering about the swap usage. Today isn't my peak load by any means. Could I be running into trouble when the RAM fills up and the system starts using swap? I have a swap partition on both hda and hdb.

Last edited by Boss Hoss; 05-23-2004 at 03:01 PM.
 
05-23-2004, 04:53 PM   #5
Boss Hoss
Member
 
Registered: Sep 2003
Distribution: SuSe
Posts: 62

Original Poster
Rep: Reputation: 15
Swap is on, just not in use so far since rebooting:

[root@startbox /]# swapon -s
Filename Type Size Used Priority
/dev/hdb1 partition 2024148 0 -1
/dev/hda5 partition 2024148 0 -2
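
A note on the Priority column: the kernel uses the swap area with the highest priority value first, so with the values above (-1 and -2), /dev/hdb1 on the troubled drive is used before /dev/hda5. Below is a small Python sketch, assuming `swapon -s` output in the format shown, that lists the areas in the order they would be used. The helper name `swap_order` is illustrative.

```python
def swap_order(swapon_output):
    """Return swap device names sorted by priority, highest (used first) first."""
    areas = []
    for line in swapon_output.splitlines():
        fields = line.split()
        # Data rows have five columns: Filename Type Size Used Priority.
        if len(fields) != 5 or fields[0] == "Filename":
            continue
        areas.append((int(fields[4]), fields[0]))
    # Higher priority values are preferred by the kernel.
    areas.sort(key=lambda area: area[0], reverse=True)
    return [name for _, name in areas]


sample = """\
Filename Type Size Used Priority
/dev/hdb1 partition 2024148 0 -1
/dev/hda5 partition 2024148 0 -2
"""

print(swap_order(sample))
```

Priorities can be set explicitly with `swapon -p` or the `pri=` option in /etc/fstab. Giving /dev/hda5 the higher number would make it the first choice.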
 
05-23-2004, 08:31 PM   #6
Boss Hoss
Member
 
Registered: Sep 2003
Distribution: SuSe
Posts: 62

Original Poster
Rep: Reputation: 15
Ok, I see top finally has me using some swap, but my disk read speed has dropped way off!! Knowing it was doing 2-8MB/sec earlier today before swap use, this seems like a big drop in performance?

Do I need more RAM? The new mobo has a 3rd slot for RAM where the old mobo had only 2. My server load is still ok, but this slow drive speed is going to kill me, especially with the MySQL reads/writes.

Maybe I should switch the swap priority to hda5 and give this drive a break?

[root@startbox html]# hdparm -t /dev/hdb

/dev/hdb:
Timing buffered disk reads: 4 MB in 4.16 seconds = 984.62 kB/sec
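
The two `hdparm -t` readings in this thread can be put side by side by recomputing the rate from the MB and seconds fields. This is a minimal Python sketch, assuming the output line format shown here. The function name `throughput_kb_per_sec` is made up for illustration.

```python
import re

# Matches e.g. "Timing buffered disk reads: 2 MB in 10.28 seconds = 199.22 kB/sec"
TIMING = re.compile(r"Timing buffered disk reads:\s+(\d+) MB in\s+([\d.]+) seconds")


def throughput_kb_per_sec(hdparm_line):
    """Recompute kB/sec from the MB and seconds fields of an hdparm -t line."""
    match = TIMING.search(hdparm_line)
    if match is None:
        raise ValueError("not an hdparm -t timing line")
    megabytes = int(match.group(1))
    seconds = float(match.group(2))
    # The figures quoted in this thread are consistent with 1 MB = 1024 kB.
    return megabytes * 1024 / seconds


readings = [
    "Timing buffered disk reads: 2 MB in 10.28 seconds = 199.22 kB/sec",
    "Timing buffered disk reads: 4 MB in 4.16 seconds = 984.62 kB/sec",
]

for line in readings:
    print(round(throughput_kb_per_sec(line), 2), "kB/sec")
```

Both readings come out under 1 MB/sec, well below the 2-8MB/sec seen earlier in the day.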


21:22:57 up 8:17, 2 users, load average: 1.01, 3.76, 3.75
116 processes: 115 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 46.2% 0.0% 6.1% 0.0% 0.0% 0.0% 47.6%
Mem: 1000284k av, 928072k used, 72212k free, 0k shrd, 126708k buff
399796k active, 467036k inactive
Swap: 4048296k av, 104k used, 4048192k free 521648k cached
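
On the RAM question: the "used" figure in top includes buffers and page cache, which the kernel gives back when programs need the memory, so a high "used" value alone does not mean the box is short of RAM. Below is a rough Python sketch that subtracts them out, assuming the old top header layout shown in this thread. The name `app_memory_kb` is hypothetical.

```python
import re


def app_memory_kb(top_header):
    """Estimate memory held by programs: used minus buffers minus page cache.

    Assumes the old procps top header in this thread, where "buff" is on the
    Mem: line and "cached" is at the end of the Swap: line.
    """
    values = {}
    for line in top_header.splitlines():
        line = line.strip()
        if line.startswith("Mem:"):
            for amount, label in re.findall(r"(\d+)k (used|buff)", line):
                values[label] = int(amount)
        elif line.startswith("Swap:"):
            cached = re.search(r"(\d+)k cached", line)
            if cached:
                values["cached"] = int(cached.group(1))
    return values["used"] - values["buff"] - values["cached"]


header = """\
Mem: 1000284k av, 928072k used, 72212k free, 0k shrd, 126708k buff
399796k active, 467036k inactive
Swap: 4048296k av, 104k used, 4048192k free 521648k cached
"""

print(app_memory_kb(header), "k held by programs")
```

For the header above this works out to 279716k, roughly 280 MB out of 1 GB, with the rest mostly cache. That is only an estimate, but it suggests RAM is not the bottleneck here.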
 
05-24-2004, 04:39 AM   #7
Electro
LQ Guru
 
Registered: Jan 2002
Posts: 6,042

Rep: Reputation: Disabled
I have a Maxtor drive, and sometimes if I leave my computer on for over 16 hours the drive starts acting up. The drive sometimes complains that it does not see /home, so I have to re-mount /home. My /home partition does not have any problems, and I do not have any DMA problems with it. After I enabled no write error through hdparm, I think it fixed the problem. You can try it, but do it with caution.

I read that Maxtor drives do not like to be on the same channel as other drives. Try to put the Maxtor drive on its own channel. With the Maxtor on another channel, swap can work in parallel, so it does not have to fight with other hard drives on the same channel.

For databases, use a SCSI controller with expandable cache. A lot of cache works well with databases. Hard drives with 8 megabytes of built-in cache are a waste when used on a controller with its own cache.

If you want reliability, use either SATA or SCSI hard drives.

You probably want to stop using ntpd. It can bog down the server. Use a separate system for ntpd. USB storage devices can also bog down the system because they depend on about 70% to 80% of the CPU. Use USB storage on another system, and then copy over the files that you want to back up or restore.

It could be that something is conflicting with other pieces of hardware in your system. Try taking out one component at a time. You may want to run memtest86.
 
  


All times are GMT -5.