Old 06-07-2006, 10:31 PM   #1
desibeli
LQ Newbie
 
Registered: Aug 2005
Posts: 14

Rep: Reputation: 0
RAID5 and RAID1 causing high system load on Suse 10.1 with no activity


I recently installed Suse 10.1 and when I create a RAID5 array my system load goes up by 2. If I create 2 RAID5 arrays, I'm at a load average of 4.

With RAID1, my load goes to 10 for a single array with no activity!

The arrays are empty and there is no activity on the machine at all. The HD LED is constantly on when I create a RAID5 or RAID1 array. It stays on for more than 24 hours with high load.

My computer has an on-board Highpoint 372 ATA RAID controller, but Suse reports that it is not supported in kernel 2.6 and onwards -- could there be a conflict?

I'm using reiserfs on the arrays. The RAID5 array consists of one partition from each of 4 different disks on 2 different IDE channels. The RAID1 can be pretty much any combination of any 2 partitions on any controller - still high load.

Has anybody had similar experiences, or any hints as to what could be wrong? Any hints on how I can figure out what is causing the high system load?
 
Old 06-08-2006, 12:35 AM   #2
WhatsHisName
Senior Member
 
Registered: Oct 2003
Location: /earth/usa/nj (UTC-5)
Distribution: RHL9;F1-10; CentOS4-5; DebianSarge-Squeeze
Posts: 1,151

Rep: Reputation: 46
Are you seeing the high loads after you have allowed the raids to rebuild/recover following their creation, or do the high loads just reflect the rebuild/recover process itself?
 
Old 06-08-2006, 10:22 AM   #3
desibeli
LQ Newbie
 
Registered: Aug 2005
Posts: 14

Original Poster
Rep: Reputation: 0
I see the high load just by creating the arrays on empty disks. The high load persists for more than 24 hours on empty disks with no other activity on the computer. I'm not trying to recover from a lost disk; I'm installing Linux on an empty computer.

Any hints on how I can see what is causing the load/disk activity?
 
Old 06-08-2006, 10:39 AM   #4
farslayer
Guru
 
Registered: Oct 2005
Location: Willoughby, Ohio
Distribution: linuxdebian
Posts: 7,231
Blog Entries: 5

Rep: Reputation: 189
Code:
# cat /proc/mdstat
Check the status of your array: whether it's still in the process of being built or rebuilt, and how far along it is.
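
If you want to follow the progress without retyping the command, something like this should do the trick (assuming watch is installed, which it is on most distros):
Code:
# refresh the md status every 5 seconds; stop with Ctrl-C
watch -n 5 cat /proc/mdstat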
 
Old 06-08-2006, 10:47 AM   #5
WhatsHisName
Senior Member
 
Registered: Oct 2003
Location: /earth/usa/nj (UTC-5)
Distribution: RHL9;F1-10; CentOS4-5; DebianSarge-Squeeze
Posts: 1,151

Rep: Reputation: 46
When you create a brand new software raid1 using mdadm, the system immediately begins rebuilding the mirror drive from scratch.

When you create a brand new software raid5 using mdadm, the system immediately begins building parity onto the last drive from scratch.

Depending on the size of the array and your system’s computing power, the initialization process can take anywhere from seconds to hours.

If you reboot before the rebuilding process is complete, then the initialization process will start over from the beginning. Fortunately, most true hardware raid cards are set up to restart initialization where they left off after a reboot, but the software raids aren’t so lucky.

As farslayer pointed out, the easiest way to follow the rebuilding process is:
Code:
# cat /proc/mdstat
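
For reference, a minimal creation sketch (the device names below are placeholders, not a recommendation for your particular disks): the resync kicks off the moment the array is assembled. mdadm also has an --assume-clean option that skips the initial resync, but on raid5 it leaves the parity unverified, so I wouldn't rely on it.
Code:
# example only: create a 4-disk raid5; the parity initialization starts immediately
mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/hde1 /dev/hdg1 /dev/hdi1 /dev/hdk1

# then watch the build progress
cat /proc/mdstat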
 
Old 06-09-2006, 11:46 AM   #6
desibeli
LQ Newbie
 
Registered: Aug 2005
Posts: 14

Original Poster
Rep: Reputation: 0
Thanks for the answers. It seems to be the slow building of the arrays that confuses me. My array is 4 x 190GB and the projected build time was 3300 minutes, which after many hours of running seems to be pretty accurate. The machine is an Athlon XP2200 with 1GB RAM.

Code:
# cat /proc/mdstat

md0 : active raid5 hdl1[4] hdk1[2] hdj1[1] hdi1[0]
      585110784 blocks level 5, 128k chunk, algorithm 2 [4/3] [UUU_]
      [======>..............]  recovery = 31.6% (61759884/195036928) finish=1913.0min speed=1160K/sec

I'm still puzzled as to why it would take this computer 3300 minutes to calculate the parity of empty disks. What data is there to run parity on? Couldn't mdadm just assume the entire disk is zeros? Or is there still something wrong in my HW/setup?
 
Old 06-09-2006, 12:25 PM   #7
WhatsHisName
Senior Member
 
Registered: Oct 2003
Location: /earth/usa/nj (UTC-5)
Distribution: RHL9;F1-10; CentOS4-5; DebianSarge-Squeeze
Posts: 1,151

Rep: Reputation: 46
There’s something wrong with your disk setup. The rebuild rate is way too slow.

Here’s an example from an old P3 system that is essentially used as network attached storage:
Code:
# cat /proc/mdstat

md4 : active raid5 hdk4[3] hdi4[2] hdg4[1] hde4[0]
      37760448 blocks level 5, 4k chunk, algorithm 2 [4/4] [UUUU]

# mdadm /dev/md4 -f /dev/hdk4 -r /dev/hdk4 -a /dev/hdk4

mdadm: set /dev/hdk4 faulty in /dev/md4
mdadm: hot removed /dev/hdk4
mdadm: hot added /dev/hdk4

# cat /proc/mdstat

md4 : active raid5 hdk4[4] hdi4[2] hdg4[1] hde4[0]
      37760448 blocks level 5, 4k chunk, algorithm 2 [4/3] [UUU_]
      [=>...................]  recovery =  5.5% (703104/12586816) finish=9.4min speed=20958K/sec

# cat /proc/mdstat

md4 : active raid5 hdk4[4] hdi4[2] hdg4[1] hde4[0]
      37760448 blocks level 5, 4k chunk, algorithm 2 [4/3] [UUU_]
      [===============>.....]  recovery = 75.2% (9470828/12586816) finish=2.3min speed=21619K/sec

# cat /proc/mdstat

md4 : active raid5 hdk4[3] hdi4[2] hdg4[1] hde4[0]
      37760448 blocks level 5, 4k chunk, algorithm 2 [4/4] [UUUU]
Test System: 800MHz P3 (1 cpu) with 4x320GB Western Digital drives (WD3200JB) running from two Promise Ultra100 TX2 IDE controller cards on a PCI/33 bus. The recovery example above contains a 36GB volume group and is on the bottom 5% of the drives, so it doesn’t get any slower than this.
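
One more thing worth checking while you track this down: the md driver throttles resync speed through two tunables under /proc/sys/dev/raid (values in KB/sec; the usual defaults are 1000 and 200000). If the maximum were somehow set very low, a rebuild would crawl even on healthy hardware:
Code:
# current md resync throttle values (KB/sec)
cat /proc/sys/dev/raid/speed_limit_min
cat /proc/sys/dev/raid/speed_limit_max

# temporarily raise the floor if the resync is being starved by the throttle
echo 10000 > /proc/sys/dev/raid/speed_limit_min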

Last edited by WhatsHisName; 06-09-2006 at 12:29 PM.
 
Old 06-09-2006, 12:41 PM   #8
desibeli
LQ Newbie
 
Registered: Aug 2005
Posts: 14

Original Poster
Rep: Reputation: 0
Any hints on how I can find out what is wrong in the system? The machine ran very well as a WinXP box last week; the only addition is two 30GB drives.

Disks I have:
primary on-board IDE: 60GB + 160GB disks (ATA100)
2ndary on-board IDE: CD-ROM, DVD-RW
Highpoint 372 ATA-RAID, ch1: 200GB, 300GB
Highpoint 372 ATA-RAID, ch2: 200GB, 300GB
Maxtor PCI IDE card, ch1: 30GB, 30GB
Maxtor PCI IDE card, ch2: empty
Maxtor PCI IDE card2, ch1: empty
Maxtor PCI IDE card2, ch2: empty

I've tried arrays on both the Highpoint controller and the Maxtor PCI controllers, both are slow.

I do have one funny problem. None of my disks are labeled /dev/hda; my first disk is /dev/hde and it's on the Maxtor PCI controller, while my "/dev/hda", i.e. the on-board ctrl1-ch1-master, is called /dev/hdm -- is this a hint?

Could it be some IRQ/address conflict?

Any other advice besides pulling out HW piece by piece?
 
Old 06-09-2006, 06:47 PM   #9
WhatsHisName
Senior Member
 
Registered: Oct 2003
Location: /earth/usa/nj (UTC-5)
Distribution: RHL9;F1-10; CentOS4-5; DebianSarge-Squeeze
Posts: 1,151

Rep: Reputation: 46
Quote:
I do have one funny problem. None of my disks are labeled /dev/hda; my first disk is /dev/hde and it's on the Maxtor PCI controller, while my "/dev/hda", i.e. the on-board ctrl1-ch1-master, is called /dev/hdm -- is this a hint?
IDE and SCSI assignments start with the motherboard controllers and then move to the cards on the PCI/PCI-X/PCIe buses. For example, if you have the typical 2 IDE channels on the motherboard, they will be hd[a-d]. The first 2-channel card on the PCI bus will be assigned hd[e-h], the second card hd[i-l], and so on.
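
If you want to confirm which controller each hdX name actually landed on, the boot messages and (with the legacy IDE drivers) /proc/ide usually make it obvious; roughly:
Code:
# see which driver/interface claimed each drive at boot
dmesg | grep -i 'hd[a-z]:'

# the legacy IDE layer also exposes the drive-to-interface mapping
ls -l /proc/ide/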


The speed problem may involve the driver for the Highpoint card, but I don't have a good recommendation there. Also, mixing two types of IDE controller cards might be an issue related to the motherboard BIOS, especially if the system is a Dell.

As a test, you could try removing the Highpoint card and distributing the drives across the Maxtor cards, which I assume are identical to the Promise Ultra133 TX2 cards.

The other thing that strikes me is the use of two drives per IDE channel (master/slave). Notice in my example above that only the master positions are used on each IDE card channel (hd[egik]). With some master/slave drive combinations, you can run into conflicts that slow things down. That isn't an issue when you only use 1 drive per IDE channel.

The other potential problem is PCI bus congestion, especially if you have a PCI/33 bus.
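
To get a picture of how the controllers sit on the bus, lspci (from pciutils) will show it; the tree view makes shared segments easier to spot:
Code:
# list the IDE/RAID controllers, then show the bus topology as a tree
lspci
lspci -tv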
 
Old 06-09-2006, 11:59 PM   #10
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,964
Blog Entries: 11

Rep: Reputation: 865
Another thing ALWAYS to be checked for slow IDE devices is the DMA settings.

What does hdparm say for the individual drives?
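
For example (replace /dev/hde with each of your drives):
Code:
# is DMA actually enabled on this drive?
hdparm -d /dev/hde

# quick read benchmark (cached and buffered reads)
hdparm -tT /dev/hde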


Cheers,
Tink
 
Old 06-10-2006, 12:57 AM   #11
WhatsHisName
Senior Member
 
Registered: Oct 2003
Location: /earth/usa/nj (UTC-5)
Distribution: RHL9;F1-10; CentOS4-5; DebianSarge-Squeeze
Posts: 1,151

Rep: Reputation: 46
Good point.

You see numerous posts about drives being set to the wrong udma level, but I have never personally run into a drive that was incorrectly set.

desibeli:
Code:
# hdparm -I /dev/hde

(lots of info deleted)

Capabilities:
        LBA, IORDY(can be disabled)
        bytes avail on r/w long: 74     Queue depth: 1
        Standby timer values: spec'd by Standard, with device specific minimum
        R/W multiple sector transfer: Max = 16  Current = 16
        Recommended acoustic management value: 128, current value: 254
        DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5

(lot more stuff deleted)
 
Old 06-10-2006, 01:50 PM   #12
desibeli
LQ Newbie
 
Registered: Aug 2005
Posts: 14

Original Poster
Rep: Reputation: 0
The DMA settings were "udma5" and "udma6"; the drive settings looked correct to me.

I pulled all the extra HW from the computer so I only had 1 disk and 1 CD-ROM, and no other controllers. The first disk was still called /dev/hde -- not /dev/hda -- on Suse 10.1.

I decided to try Debian instead of Suse: same problem. Debian got the disks confused too; /dev/hda was not on the motherboard's Via KT400 chipset controller but on the motherboard's Highpoint 372 controller. Debian also hangs on ide-detect with the Highpoint 372 controller module.

Gentoo boots and works, but I had to select numerous options to get basic stuff enabled (e.g. MySQL was not selected by default), and it seems like you have to recompile everything during the Gentoo install?

Finally I went back to Fedora. FC5 boots, finds the disks in the correct order, and builds the arrays at the correct speed. I still use reiserfs, with the same HW in the same locations. RAID1 builds at 20-30MB/sec in parallel with a RAID5 building at around 20MB/sec.

Something is wrong in Suse 10.1 and Debian regarding the Highpoint 372 controller. BTW, FC5 calls the disks on the Promise 133TX2 controller /dev/mapper/<something> and associates these with an hpt37x module?? Maybe this is where Suse and Debian get confused, if they use the hpt37x module for both the Promise TX2 and the on-board HPT372 controllers?
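
In case anyone else hits this: it might be worth checking whether dmraid has claimed the fake-raid metadata on the disks, since that is usually what puts them under /dev/mapper. A rough check, assuming the dmraid/device-mapper tools are installed:
Code:
# list disks carrying vendor (fake-)raid metadata
dmraid -r

# list the existing device-mapper mappings
dmsetup ls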

Anyway, thanks for the help, sorry I didn't have the stamina to find the problem in Suse and Debian.

Last edited by desibeli; 06-11-2006 at 12:15 PM.
 
  

