LinuxQuestions.org
SUSE / openSUSE This Forum is for the discussion of Suse Linux.

Old 11-07-2016, 01:01 PM   #1
jim_cliff11
LQ Newbie
 
Registered: Jan 2004
Posts: 21

Rep: Reputation: 0
Raid Issues


Hi All,

The OpenSuse server we are using is giving me a few issues regarding the RAID configuration.

See attached picture for information.

The distro boots up and runs fine but the orange lights on two of the drives are flashing.

Is there a way within OpenSuse to run diagnostics and fix any faults within the drives?

lsblk gives me the following:

Code:
cciss!c0d0   disk  1.8T LOGICAL VOLUME
cciss!c0d0p1 part    2G
cciss!c0d0p2 part   40G
cciss!c0d0p3 part  1.8T
cciss!c0d1   disk  1.4T LOGICAL VOLUME
cciss!c0d1p1 part  1.4T
I have a raid 5 configured into two logical drives.
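
A side note on the names in that listing: the `!` is how the kernel writes the `/` in cciss device names, so `cciss!c0d0` corresponds to `/dev/cciss/c0d0`. A minimal Python sketch of that mapping, where the helper name `lsblk_name_to_dev` is made up for illustration:

```python
def lsblk_name_to_dev(name):
    """Map a kernel block-device name, as lsblk prints it, to its /dev path.

    The kernel stores '/' in device names as '!', so the cciss driver's
    'cciss/c0d0' shows up as 'cciss!c0d0'.
    """
    return "/dev/" + name.replace("!", "/")


# Names copied from the lsblk output in this post.
for name in ("cciss!c0d0", "cciss!c0d0p1", "cciss!c0d1"):
    print(name, "->", lsblk_name_to_dev(name))
```

The resulting `/dev/cciss/...` path is the form that disk tools generally expect on the command line.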

Thanks,
Jim
Attached image: IMG_5507.JPG

Last edited by jim_cliff11; 11-07-2016 at 01:03 PM.
 
Old 11-07-2016, 01:04 PM   #2
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1694
"orange lights on two of the drives are flashing"

In the Dell world, amber flashing lights mean that the drive has physically failed. No command can ever fix it. You should pull them and replace them before you lose another drive and lose your array and your DATA. Get a backup.

http://www.dell.com/support/article/us/en/04/SLN292269
 
Old 11-07-2016, 03:55 PM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,125

Rep: Reputation: 4120
I would have thought the messages were self-explanatory.
Change the battery.
 
Old 11-07-2016, 04:01 PM   #4
smallpond
Senior Member
 
Registered: Feb 2011
Location: Massachusetts, USA
Distribution: Fedora
Posts: 4,140

Rep: Reputation: 1263
For hardware RAID you need to follow the RAID controller procedures for finding out what's wrong and replacing or reinitializing the drives. Anything you try to do in software directly to the drives would be likely to conflict with the RAID controller.
 
Old 11-07-2016, 09:53 PM   #5
jefro
Moderator
 
Registered: Mar 2008
Posts: 21,978

Rep: Reputation: 3624
https://www.smartmontools.org/wiki/S...ID-Controllers lists what might work with smartmontools.
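
As a rough sketch of what that can look like on a cciss controller: smartctl accepts `-d cciss,N` to address physical disk N behind the controller. The Python below only builds and prints the commands. The device path `/dev/cciss/c0d0` and the count of three disks are assumptions, so adjust both to the real hardware.

```python
def smartctl_cmd(disk_index, device="/dev/cciss/c0d0"):
    """Build a smartctl call for one physical disk behind a cciss controller."""
    return ["smartctl", "-a", "-d", "cciss,%d" % disk_index, device]


# Assumed: three physical disks, numbered from 0.
for n in range(3):
    print(" ".join(smartctl_cmd(n)))
```

The printed commands need root and an installed smartmontools package to run. Whether the controller passes SMART data through at all depends on the model and driver.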

The battery ought to be replaced before you play with the controller too much. I mean after you make a full backup.

Some RAID BIOSes do have ways to test drives. Boot into the RAID BIOS with some key combo at boot, after the normal BIOS and before the OS boots. Ctrl-A or some similar key combo.

If you have a test stand you can test drives one at a time or just read the SMART numbers. The old SCSI drives could be low-level formatted. We used to do that every year and kept them working for decades.
 
Old 11-09-2016, 12:55 PM   #6
jim_cliff11
LQ Newbie
 
Registered: Jan 2004
Posts: 21

Original Poster
Rep: Reputation: 0
Thanks for the feedback.

I've ordered a replacement battery to sort that issue out.

As for the 'imminent failure of the hard drives' warning, does this basically mean the HD is about to go FUBAR? I am running RAID 5 with 3 physical disks, but in all honesty I don't know how to go about resolving this. The drives I am running are ATA GB0750C8047 units at 750GB apiece. A few questions below:

1. In order to stabilise my system, do I need to replace these with identical drives, e.g. same make, model, size etc.?
2. Do I need to power off the server, or can I simply pull one drive at a time, replace the HD and push it back in? Then I'm guessing the RAID will do its thing and restore the data onto the new drive. Once this is done, do I follow the same procedure with the second drive? Or is this completely wrong? Is there anything I need to do or initialise to begin the restoration procedure?

Sorry for my lack of knowledge.
Any help greatly appreciated.

Jim
 
Old 11-09-2016, 12:58 PM   #7
szboardstretcher
Senior Member
 
Registered: Aug 2006
Location: Detroit, MI
Distribution: GNU/Linux systemd
Posts: 4,278

Rep: Reputation: 1694
Quote:
For hardware RAID you need to follow the RAID controller procedures for finding out what's wrong and replacing or reinitializing the drives.
You'll have to look up your raid setup and look through the manual for precise answers for your hardware.

But:
  • "Generally" the blinking amber light means that the drive will fail soon. It needs to be replaced.
  • "Generally" You do not have to replace them with the same drive, just same size or bigger
  • "Generally" with hardware raid, you do not have to shutdown the system. You can replace a drive, wait for the rebuild to happen.
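
Why one drive at a time matters on RAID 5: each stripe stores the XOR of its data blocks as parity, so any single missing block can be recomputed from the survivors, but two missing blocks cannot. A toy Python sketch with three "disks" follows. The byte values are made up, and real controllers rotate parity across the disks and work on much larger stripes.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


# One stripe on a three-disk RAID 5: two data blocks plus their parity.
d0 = b"\x10\x20\x30\x40"
d1 = b"\x0f\x0e\x0d\x0c"
parity = xor_blocks(d0, d1)

# The disk holding d1 is pulled and replaced: rebuild it from the survivors.
rebuilt_d1 = xor_blocks(d0, parity)
print(rebuilt_d1 == d1)  # True
```

If the disk holding `d0` were lost at the same time, only `parity` would remain and neither block could be recovered, which is why the first rebuild has to finish before the second drive is touched.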
 
Old 11-09-2016, 11:30 PM   #8
ember1205
Member
 
Registered: Oct 2014
Posts: 176

Rep: Reputation: 16
S.M.A.R.T. (Self-Monitoring Analysis and Reporting Technology) is a set of tools designed to inter-operate with like-built hard drives to constantly monitor a variety of different aspects of the drive's performance. Various areas are monitored and certain programmed thresholds are used as the "standard of measure" for the different areas. When those thresholds are crossed, the software is able to alert to an underperforming drive that may be on the edge of catastrophic failure. Generally, when you get "imminent failure" messages, you should be doing everything in your power to ensure you have a solid backup and then swapping the drives out for good ones and getting the array rebuilt.
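
To make the threshold idea concrete: on ATA drives each SMART attribute carries a normalized value and a vendor-set threshold, and the attribute counts as failed once the value is at or below the threshold. A small Python sketch of that comparison follows. The attribute names are common ones, but the numbers are invented and are not readings from the drives in this thread.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    name: str
    value: int      # normalized current value reported by the drive
    threshold: int  # vendor-defined failure threshold


def failed(attr):
    """An attribute has failed when its value is at or below its threshold."""
    return attr.value <= attr.threshold


# Invented sample readings, for illustration only.
readings = [
    SmartAttribute("Reallocated_Sector_Ct", value=30, threshold=36),
    SmartAttribute("Spin_Up_Time", value=97, threshold=0),
]
for attr in readings:
    print(attr.name, "FAILING" if failed(attr) else "ok")
```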

As has already been said multiple times, the warning messages should have been fairly self-explanatory.
 
Old 11-10-2016, 03:50 AM   #9
jim_cliff11
LQ Newbie
 
Registered: Jan 2004
Posts: 21

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by ember1205 View Post
As has already been said multiple times, the warning messages should have been fairly self-explanatory.
Yes, and I acknowledge this. I'm now trying to gather as much information as I can in order to replace the drives.

Can anyone elaborate on whether the P400 controller needs to have identical replacement drives? I need to be sure before I purchase new drives that they're going to do the job. I'm only able to find direct replacement drives in the USA and am struggling in the UK. So would a different manufacturer, model and larger capacity suffice?

Thanks,
Jim
 
Old 11-10-2016, 07:50 AM   #10
jim_cliff11
LQ Newbie
 
Registered: Jan 2004
Posts: 21

Original Poster
Rep: Reputation: 0
Just spoke to a local IT technician who told me the GB0750C8047 Seagate Barracuda drive will be loaded with firmware specific to HP. So I can't just throw any old 750GB Barracuda drive in.

On the actual drive itself, it does say firmware: HPG1.

Does anyone else have any experience with this?
 
Old 11-10-2016, 08:10 AM   #11
ember1205
Member
 
Registered: Oct 2014
Posts: 176

Rep: Reputation: 16
Have you considered contacting HP or an authorized HP shop for guidance? "Any" drive will work. What you're concerned with is the SMART communications between the drives and the controller and the drive being "as capable as possible" of inter-operating with the controller. Find out if you can upgrade the firmware on the drive yourself and whether you can get the actual firmware from HP's web site (you'd be surprised at the stuff you can download from them).
 
  

