Very slow software RAID5/LVM array - which drive is dying?
Linux - SoftwareThis forum is for Software issues.
Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!
Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.
If you have any problems with the registration process or your account login, please contact us. If you need to reset your password, click here.
Having a problem logging in? Please visit this page to clear all LQ-related cookies.
Introduction to Linux - A Hands on Guide
This guide was created as an overview of the Linux Operating System, geared toward new users as an exploration tour and getting started guide, with exercises at the end of each chapter.
For more advanced trainees it can be a desktop reference, and a collection of the base knowledge needed to proceed with system and network administration. This book contains many real life examples derived from the author's experience as a Linux system and network administrator, trainer and consultant. They hope these examples will help you to get a better understanding of the Linux system and that you feel encouraged to try out things on your own.
Click Here to receive this Complete Guide absolutely free.
Very slow software RAID5/LVM array - which drive is dying?
I have an Ubuntu 10.04 system which is used as a file-server, primarily for storing video. The setup combines two RAID5 arrays joined in an LVM (details below).
It's served me very well in the past, with decent enough performance for my use (around 150-200MB/s sequential reads for example). All I really ask of it is to be able to stream HD video, which shouldn't be too onerous.
As I say, the setup used to work absolutely fine but it now grinds to a halt (i.e. <1MB/s reads, 100ms+ seek times, ...) sometimes - even with no load on the system. I strongly suspect one of the hard drives is on its way out, but can't tell which one. I've looked at the drives in system monitor and they all look healthy - SMART reports them all as having either no bad sectors or 1/2 bad sectors. I've run transfer tests on each individual drive and they're perfectly fast. The problem is that the issue is intermittent - when I run a test over a particular drive it's fine more often than not.
Thanks for the reply. I'm hoping there's a better way though; swapping out a drive and rebuilding the array will take a very long time (especially with the array running slow). Rebuilds take several hours to complete and I'd need to do that 6 times.
I might try moving the drives between controllers, and swapping cables out thought - that would be quicker...
I've run a long smart test on each of the 7 drives (that's the 6 drives in the array, plus the OS boot drive) - run as "sudo smartctl --test=long <device>". All tests have passed - I've attached the full SMART data for all of them below - that's the output of "sudo smartctl --al <device>".
Any ideas? I'm wondering whether something else might be causing the array to be slowing down, but can't think what that might be. In terms of the drives, what's a sensible upper limit for the temperature they should run at? Some of them are just over 50 celsius, I don't know whether that's reasonable.
These two did overheat. However, all the other attributes are normal, and all the smart long tests passed.
This means that no drives are failing, but I would add some more fans.
Did you update recently or change anything on this system ? Maybe it was a bad update, or something changed to cause this ...
Maybe check the logs for anything suspicious, /var/log/ messages syslog.
Also check the cables.
Thanks for the help - I'll look at adding more fans to the case.
I haven't changed anything recently (other than installing standard security updates for Ubuntu). It's possible that the case has gathered some dust over time - I'll clean the filters for the case fans and see if I can blow some dust out of the case. What's a safe temperature for drives to operate at?
As an when I get new drives I'll try to get some 5400rpm ones rather than 7200rpm as they should run a little cooler.
Having given it a good clean and rerouted some cables for better airflow, the drive temperature is now a little lower (46C for those two drives). However that doesn't seem to have helped - I'm still seeing poor performance with occasional long waits for access.
One thing which has now occured to me though is file fragmentation. Many of the files here have been downloaded by Bittorrent, and when upgrading from Ubuntu 8.04 to 10.04 recently, I changed Bittorrent client from Vuze to Transmission (the default Ubuntu client). One change is that Vuze allocates the entire file on disk prior to starting the download, whereas Transmission allocates it incrementally while downloading.
I suspect that it's resulting in some very fragmented files, which is making access very slow. Picking a recently-downloaded 50MB file at random, filefrag reports it has 1399 extents which seems very poor to me (the array isn't very full: 600GB free out of 2.9TB). Picking an old 350MB file shows a more reasonable 39 extents.
A quick Google shows this bug report/discussion relating to precisely this issue: https://trac.transmissionbt.com/ticket/849. I'll set the option in Transmission to preallocate files, and see if that helps. I suspect the RAID/LVM setup I have exacerbates the problem (having two bits of the same drive in the same logical volume). This is an EXT3 filesystem - I wonder whether EXT4 would have helped at all...
Anyway, thanks for the help - hopefully I have what I need now.