LinuxQuestions.org
05-12-2011, 03:00 PM   #1
The Belgain
LQ Newbie
 
Registered: Dec 2004
Posts: 17

Rep: Reputation: 1
Very slow software RAID5/LVM array - which drive is dying?


Hi,

I have an Ubuntu 10.04 system which is used as a file-server, primarily for storing video. The setup combines two RAID5 arrays joined in a single LVM volume group (details below).

It's served me very well in the past, with decent enough performance for my use (around 150-200MB/s sequential reads for example). All I really ask of it is to be able to stream HD video, which shouldn't be too onerous.

As I say, the setup used to work absolutely fine, but it now sometimes grinds to a halt (i.e. <1MB/s reads, 100ms+ seek times, ...), even with no load on the system. I strongly suspect one of the hard drives is on its way out, but I can't tell which one. I've looked at the drives in system monitor and they all look healthy: SMART reports each of them as having either no bad sectors or one or two bad sectors. I've run transfer tests on each individual drive and they're perfectly fast. The difficulty is that the issue is intermittent, so when I run a test over a particular drive it's fine more often than not.

Any suggestions for how to pin down this problem?

Physical drives (all SATA):
-- 2x 320GB drives (partitions: 320GB)
-- 3x 750GB drives (partitions: 320GB, 430GB)
-- 1x 1.5TB drive (partitions: 320GB, 430GB)

RAID5 arrays:
-- 1x 6 drive RAID5 array, comprising the 320GB partitions.
-- 1x 4 drive RAID5 array, comprising the 430GB partitions.

LVM:
-- One VG comprising the two RAID5 arrays.
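
As a quick sanity check on the layout above, the usable space can be worked out from the member counts, since RAID5 gives up one member's worth of capacity to parity. This is only a sketch: the function name is made up for illustration, and the sizes are the rounded partition sizes listed above, so the result is approximate.

```python
def raid5_usable_gb(members: int, member_size_gb: int) -> int:
    """Usable capacity of a RAID5 array: one member's worth of space holds parity."""
    if members < 3:
        raise ValueError("RAID5 needs at least three members")
    return (members - 1) * member_size_gb


# (member count, member size in GB) for the two arrays described above
arrays = [(6, 320), (4, 430)]

vg_total_gb = sum(raid5_usable_gb(n, size) for n, size in arrays)
print(vg_total_gb)  # 5*320 + 3*430 = 2890
```

Note that the 750GB and 1.5TB drives each hold a member of both arrays, so a single slow drive among those four would affect both halves of the volume group.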

(It's a slightly odd setup, the aim is to be able to grow the array by adding larger drives in future.)
 
05-12-2011, 04:24 PM   #2
jefro
Guru

Registered: Mar 2008
Posts: 10,246

Rep: Reputation: 1255
Swapping in new drives, rebuilding the array, and then seeing whether the problem goes away may be the way to go.

I'd look at all the SMART data, but it may turn out to be the controller, the cables, or something else.
 
05-13-2011, 01:39 AM   #3
The Belgain
LQ Newbie
 
Registered: Dec 2004
Posts: 17

Original Poster
Rep: Reputation: 1
Thanks for the reply. I'm hoping there's a better way though; swapping out a drive and rebuilding the array will take a very long time (especially with the array running slow). Rebuilds take several hours to complete and I'd need to do that 6 times.

I might try moving the drives between controllers and swapping the cables out, though - that would be quicker...
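
Because the slowdown is intermittent, another option that avoids any rebuild is to watch per-drive latency while the array is misbehaving. The sketch below is one possible way to do that, assuming the standard Linux /proc/diskstats layout (reads completed and milliseconds spent reading in the 4th and 7th columns, writes in the 8th and 11th). The function names and the two sample snapshots are invented for illustration; they are not real measurements from this system.

```python
import time


def parse_diskstats(text):
    """Return {device: (ios_completed, ms_spent_on_io)} from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or unexpected lines
        name = fields[2]
        reads, ms_reading = int(fields[3]), int(fields[6])
        writes, ms_writing = int(fields[7]), int(fields[10])
        stats[name] = (reads + writes, ms_reading + ms_writing)
    return stats


def average_wait_ms(before, after):
    """Average milliseconds per completed I/O between two snapshots."""
    waits = {}
    for name, (ios_after, ms_after) in after.items():
        if name not in before:
            continue
        ios_before, ms_before = before[name]
        completed = ios_after - ios_before
        if completed > 0:
            waits[name] = (ms_after - ms_before) / completed
    return waits


def sample_live(interval_s=5.0, path="/proc/diskstats"):
    """Take two real snapshots interval_s seconds apart (Linux only)."""
    with open(path) as fh:
        before = parse_diskstats(fh.read())
    time.sleep(interval_s)
    with open(path) as fh:
        after = parse_diskstats(fh.read())
    return average_wait_ms(before, after)


# Made-up snapshots: sdb completes far fewer I/Os and spends much longer on each.
BEFORE = """\
   8       0 sda 1000 0 80000 4000 500 0 40000 3000 0 5000 7000
   8      16 sdb 1000 0 80000 4000 500 0 40000 3000 0 5000 7000
"""
AFTER = """\
   8       0 sda 1100 0 88000 4500 550 0 44000 3250 0 5600 7750
   8      16 sdb 1010 0 80800 6000 505 0 40400 4000 0 7900 10000
"""

waits = average_wait_ms(parse_diskstats(BEFORE), parse_diskstats(AFTER))
for device, ms in sorted(waits.items()):
    print(f"{device}: {ms:.1f} ms per I/O")
```

Calling sample_live() in a loop and logging the result would show whether one member consistently stands out during a slow spell. A drive that is merely busier than the others will also show higher waits, so this narrows the search rather than proving a fault.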
 
05-13-2011, 01:44 AM   #4
H_TeXMeX_H
Guru

Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1266
Run a SMART long test on each drive, and then post the attributes.
 
05-14-2011, 06:26 PM   #5
The Belgain
LQ Newbie
 
Registered: Dec 2004
Posts: 17

Original Poster
Rep: Reputation: 1
I've run a long SMART test on each of the 7 drives (that's the 6 drives in the array, plus the OS boot drive) - run as "sudo smartctl --test=long <device>". All tests have passed - I've attached the full SMART data for all of them below - that's the output of "sudo smartctl --all <device>".

Any ideas? I'm wondering whether something else might be causing the array to slow down, but can't think what that might be. In terms of the drives, what's a sensible upper limit for the temperature they should run at? Some of them are just over 50 Celsius; I don't know whether that's reasonable.
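
For anyone repeating this across several drives, the two smartctl invocations quoted above can be generated per device rather than typed by hand. This is a minimal sketch: the device names are hypothetical placeholders, the helper names are invented, and run() is defined but not called here because smartctl needs root and real hardware.

```python
import subprocess


def long_test_command(device):
    """argv that starts an extended (long) SMART self-test on one device."""
    return ["smartctl", "--test=long", device]


def report_command(device):
    """argv that dumps all SMART information for one device."""
    return ["smartctl", "--all", device]


def run(argv):
    """Run one command and return its standard output as text."""
    return subprocess.run(argv, capture_output=True, text=True, check=False).stdout


# Hypothetical device names; substitute the real ones for the system.
DEVICES = ["/dev/sda", "/dev/sdb"]

commands = [long_test_command(d) for d in DEVICES] + [report_command(d) for d in DEVICES]
for argv in commands:
    print(" ".join(argv))
```

The long test runs inside the drive after smartctl returns, so the report is only meaningful once the test has had time to finish.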
Attached Files
File Type: log smart_data.log (30.8 KB, 6 views)
 
05-15-2011, 03:28 AM   #6
H_TeXMeX_H
Guru

Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1266
Here are interesting bits:

Code:
Model Family:     Seagate Barracuda 7200.10 family
#1
190 Airflow_Temperature_Cel 0x0022   048   032   045    Old_age   Always   In_the_past 52 (255 255 59 28)
#2
190 Airflow_Temperature_Cel 0x0022   047   031   045    Old_age   Always   In_the_past 53 (255 255 61 28)
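These two did overheat at some point: In_the_past means the normalized value dropped to or below the threshold (045) earlier in the drive's life. However, all the other attributes are normal, and all the SMART long tests passed.

That suggests no drives are failing, but I would add some more fans.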

Did you update recently or change anything on this system? Maybe it was a bad update, or something changed to cause this ...
Maybe check the logs for anything suspicious: /var/log/messages and /var/log/syslog.
Also check the cables.
 
05-15-2011, 11:14 AM   #7
The Belgain
LQ Newbie
 
Registered: Dec 2004
Posts: 17

Original Poster
Rep: Reputation: 1
Thanks for the help - I'll look at adding more fans to the case.

I haven't changed anything recently (other than installing standard security updates for Ubuntu). It's possible that the case has gathered some dust over time - I'll clean the filters for the case fans and see if I can blow some dust out of the case. What's a safe temperature for drives to operate at?

As and when I get new drives, I'll try to get some 5400rpm ones rather than 7200rpm, as they should run a little cooler.
 
05-15-2011, 12:05 PM   #8
H_TeXMeX_H
Guru

Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1266
I have the same drive and it runs at:

Code:
Model Family:     Seagate Barracuda 7200.10 family
...
190 Airflow_Temperature_Cel 0x0022   063   059   045    Old_age   Always       -       37 (Min/Max 27/38)
So that's 37C, yours are running at 52 and 53C ... quite a bit more. I suspect poor airflow. Certainly if there is dust, clean it out, maybe add more fans if necessary.

EDIT:
According to the manual:
http://www.seagate.com/docs/pdf/data...da_7200_10.pdf
The max operating temp is 60C, and yours have gone over in the past.
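
To see where "gone over in the past" comes from, the attribute lines can be decoded. The sketch below assumes that, for this attribute on these drives, the normalized value is 100 minus the temperature in Celsius. That is an assumption, not something the thread states, but the current readings quoted here are consistent with it (048 with raw 52, 047 with raw 53, 063 with raw 37), and the convention is vendor-specific. The class and function names are made up for illustration.

```python
from typing import NamedTuple

MAX_OPERATING_C = 60  # figure quoted from the manual in this thread


class Attribute(NamedTuple):
    attr_id: int
    name: str
    value: int       # current normalized value
    worst: int       # worst normalized value ever recorded
    thresh: int
    when_failed: str
    raw: int


def parse_attribute(line: str) -> Attribute:
    """Parse one line of smartctl's attribute table (whitespace-separated columns)."""
    f = line.split()
    return Attribute(int(f[0]), f[1], int(f[3]), int(f[4]), int(f[5]), f[8], int(f[9]))


def temperatures_c(attr: Attribute):
    """(current, hottest recorded) in Celsius, ASSUMING value = 100 - temperature."""
    return 100 - attr.value, 100 - attr.worst


# The two lines posted earlier in this thread.
LINES = [
    "190 Airflow_Temperature_Cel 0x0022   048   032   045    Old_age   Always   In_the_past 52 (255 255 59 28)",
    "190 Airflow_Temperature_Cel 0x0022   047   031   045    Old_age   Always   In_the_past 53 (255 255 61 28)",
]

decoded = [temperatures_c(parse_attribute(line)) for line in LINES]
for current, hottest in decoded:
    status = "over" if hottest > MAX_OPERATING_C else "within"
    print(f"now {current}C, hottest recorded {hottest}C ({status} the {MAX_OPERATING_C}C limit)")
```

Under that assumption the worst values of 032 and 031 correspond to 68C and 69C, and the threshold of 045 corresponds to 55C, which is why both drives are flagged In_the_past.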

Last edited by H_TeXMeX_H; 05-15-2011 at 12:11 PM.
 
05-15-2011, 02:51 PM   #9
The Belgain
LQ Newbie
 
Registered: Dec 2004
Posts: 17

Original Poster
Rep: Reputation: 1
Having given it a good clean and rerouted some cables for better airflow, the drive temperature is now a little lower (46C for those two drives). However that doesn't seem to have helped - I'm still seeing poor performance with occasional long waits for access.

One thing which has now occurred to me, though, is file fragmentation. Many of the files here have been downloaded by BitTorrent, and when upgrading from Ubuntu 8.04 to 10.04 recently, I changed BitTorrent client from Vuze to Transmission (the default Ubuntu client). One difference is that Vuze allocates the entire file on disk prior to starting the download, whereas Transmission allocates it incrementally while downloading.

I suspect that it's resulting in some very fragmented files, which is making access very slow. Picking a recently-downloaded 50MB file at random, filefrag reports it has 1399 extents which seems very poor to me (the array isn't very full: 600GB free out of 2.9TB). Picking an old 350MB file shows a more reasonable 39 extents.
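
To put those two numbers on the same scale, it helps to divide file size by extent count. The sketch below parses the summary line that filefrag prints ("<path>: N extents found") and works out the average extent size. The file names are hypothetical, and the sizes are taken as 50 and 350 MiB, which is an approximation of the figures above.

```python
import re

_SUMMARY = re.compile(r"^(?P<path>.*): (?P<count>\d+) extents? found")


def parse_filefrag(line):
    """Return (path, extent_count) from a filefrag summary line."""
    match = _SUMMARY.match(line.strip())
    if match is None:
        raise ValueError(f"not a filefrag summary line: {line!r}")
    return match.group("path"), int(match.group("count"))


def average_extent_kib(size_bytes, extents):
    """Average contiguous run length in KiB."""
    return size_bytes / extents / 1024


MIB = 1024 * 1024

# Hypothetical file names with the extent counts and approximate sizes from this post.
samples = [
    ("recent.avi: 1399 extents found", 50 * MIB),
    ("old.avi: 39 extents found", 350 * MIB),
]

averages = {}
for line, size in samples:
    path, count = parse_filefrag(line)
    averages[path] = average_extent_kib(size, count)
    print(f"{path}: {count} extents, about {averages[path]:.0f} KiB per extent")
```

That works out to under 40 KiB per extent for the recent file against roughly 9 MiB per extent for the old one, so a sequential read of the recent file involves far more seeks per megabyte.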

A quick Google shows this bug report/discussion relating to precisely this issue: https://trac.transmissionbt.com/ticket/849. I'll set the option in Transmission to preallocate files, and see if that helps. I suspect the RAID/LVM setup I have exacerbates the problem (having two bits of the same drive in the same logical volume). This is an EXT3 filesystem - I wonder whether EXT4 would have helped at all...
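
For reference, this is roughly what "preallocate" means at the system-call level: the full size is reserved before any data arrives, so the filesystem has a chance to pick a contiguous region. The sketch below uses Python's os.posix_fallocate (Linux/Unix only) purely as an illustration; it is not Transmission's code, the file name is hypothetical, and how contiguous the result ends up still depends on the filesystem and free space.

```python
import os
import tempfile


def preallocate(path, size_bytes):
    """Create path (if needed) and reserve size_bytes of disk space for it up front."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Ensures space is allocated for the byte range [0, size_bytes).
        os.posix_fallocate(fd, 0, size_bytes)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "download.part")  # hypothetical file name
    preallocate(target, 1024 * 1024)
    allocated_size = os.path.getsize(target)

print(allocated_size)
```

A downloader that works this way then writes each received piece at its final offset inside the already-reserved file, instead of growing the file piece by piece.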

Anyway, thanks for the help - hopefully I have what I need now.
 
05-16-2011, 04:17 AM   #10
H_TeXMeX_H
Guru

Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1266
I didn't know about filefrag, and was looking for such a program. I probably didn't find it because it can only be run as root.

Certainly 1399 extents is very fragmented. Try making a fresh copy of the file and using that instead (you can use cp or dd to copy it).
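
The copy-and-replace idea can be scripted along these lines. This is a sketch under a few assumptions: there is enough free space for a second copy, the file is not being written to at the time, and losing the original owner and group on the new copy is acceptable (shutil.copy2 keeps permissions and timestamps only). The function and file names are made up for illustration, and whether the new copy really ends up less fragmented depends on the filesystem, so it is worth checking with filefrag afterwards.

```python
import os
import shutil
import tempfile


def rewrite_in_place(path):
    """Replace path with a freshly written copy of itself in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".rewrite-")
    os.close(fd)
    try:
        shutil.copy2(path, tmp_path)  # sequential write of the whole file
        os.replace(tmp_path, path)    # atomic rename over the original
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)       # do not leave a half-written copy behind
        raise


# Small self-contained demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "video.bin")  # hypothetical file name
    with open(sample, "wb") as fh:
        fh.write(b"x" * 4096)
    rewrite_in_place(sample)
    with open(sample, "rb") as fh:
        data_after = fh.read()
    leftover = sorted(os.listdir(tmp))

print(len(data_after), leftover)
```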
 
  

