LinuxQuestions.org
Linux - Server This forum is for the discussion of Linux Software used in a server related context.

Old 03-17-2018, 12:55 PM   #1
upnort
Senior Member
 
Registered: Oct 2014
Distribution: Slackware
Posts: 1,893

Rep: Reputation: 1162
Self training with a HW RAID controller


I am looking for suggestions to provide myself some bootstrap training.

I soon will have access to a Dell R710 with an H700 RAID controller with five 1 TB disks. I want to learn more about monitoring and responding to various RAID events. Basically I want to write my own lab training lesson plan.

I have only worked with RAID controllers already in production and never have faced failures or degradation events. Although by no means a guru, I am familiar with the megacli command and have written some simple megacli script wrappers. There are oodles of megacli articles online. I am not interested in the megacli command as much as I am interested in simulating common types of degradation and fixing the problem.

Just looking for a list of real-world things to learn.

Thanks!
 
Old 03-18-2018, 03:38 AM   #2
jlinkels
LQ Guru
 
Registered: Oct 2003
Location: Bonaire, Leeuwarden
Distribution: Debian /Jessie/Stretch/Sid, Linux Mint DE
Posts: 5,195

Rep: Reputation: 1043
AFAIK the Dell RAID shows up as one single drive in Linux. I am not sure if there are applications to access the RAID controller from within Linux.

If you install VMWare on the Dell the RAID is seen as a single drive and I am sure there is no way to access the controller from VMWare or from a guest. Not even when using the Dell version of ESXi.

You communicate through iDRAC with the RAID controller for configuration and monitoring.

It seems to be Dell's intention to "set and forget" the RAID. If you have a defective disk, you pull it out and insert a new one. It will rebuild.

I must advise against RAID5. Rebuilding very large arrays (a few TB) might take longer than is acceptable for running a degraded array.

jlinkels
 
Old 03-18-2018, 12:26 PM   #3
upnort
Senior Member
 
Registered: Oct 2014
Distribution: Slackware
Posts: 1,893

Original Poster
Rep: Reputation: 1162
Quote:
I am not sure if there are applications to access the RAID controller from within Linux.
megacli
 
Old 03-18-2018, 01:41 PM   #4
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 26,666

Rep: Reputation: 7970
Quote:
Originally Posted by jlinkels
AFAIK the Dell RAID shows up as one single drive in Linux. I am not sure if there are applications to access the RAID controller from within Linux.

If you install VMWare on the Dell the RAID is seen as a single drive and I am sure there is no way to access the controller from VMWare or from a guest. Not even when using the Dell version of ESXi. You communicate through iDRAC with the RAID controller for configuration and monitoring. It seems to be Dell's intention to "set and forget" the RAID. If you have a defective disk, you pull it out and insert a new one. It will rebuild.
Agree with this. While there are utilities that you CAN use to monkey around with a RAID array, it's really best (in terms of a hardware RAID solution), to let the controller do it. You can get....interesting...results otherwise. You should still be able to see the individual drives with smartctl and other utilities, though, as far as I remember. But when I build a HW RAID, I'll usually monitor it with SNMP, which will return all the goodies.
Quote:
I must advise against RAID5. Rebuilding very large arrays (a few TB) might take longer than is acceptable for running a degraded array.
I've had mixed results. If you have a controller with a decent amount of RAM, and the disk isn't getting hammered, you can get a fairly quick rebuild, but it's still going to take a while. At least your system isn't down during that time; even if the rebuild takes a week or so, your system is still up. And the chances of a second drive failing in that window are pretty small as well.
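
To put rough numbers on "it's going to be a while", here is a minimal Python sketch of the best-case arithmetic: the time to rewrite one whole disk at a sustained rate. The rates in the loop are assumptions picked for illustration, not measurements from an H700, and a controller that throttles the rebuild under production I/O will be slower.

```python
def rebuild_hours(disk_tb: float, rate_mb_s: float) -> float:
    """Hours needed to rewrite one whole disk at a sustained rebuild rate."""
    disk_mb = disk_tb * 1_000_000  # decimal TB -> MB, the way drive vendors count
    return disk_mb / rate_mb_s / 3600


if __name__ == "__main__":
    # Assumed rates: an idle array, a moderately busy one, a heavily loaded one.
    for rate in (100, 30, 10):
        print(f"1 TB at {rate} MB/s: {rebuild_hours(1, rate):.1f} h")
```

Comparing the estimate with what the controller actually reports during a lab rebuild is a useful exercise in itself.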
 
Old 03-19-2018, 02:32 PM   #5
upnort
Senior Member
 
Registered: Oct 2014
Distribution: Slackware
Posts: 1,893

Original Poster
Rep: Reputation: 1162
Well, um, thanks, I guess. I was not asking for opinions about troubleshooting or theory. I was asking for ideas how to conduct some self-training with a RAID controller. For example, pull a drive to degrade the array, replace the drive, and monitor the array rebuilding. I am looking for various ways to degrade an array and learn how to respond. I have basic experience monitoring arrays but no experience handling actual failures, and that is the gap I want to close.

Quote:
You should still be able to see the individual drives with smartctl and other utilities
With many controllers smartctl will pierce the veil to see individual drives, but fails to provide information about the array. The megacli command works great to query a supported RAID controller and fill that void.
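
A small Python sketch of both halves, for a lab wrapper script. The `-d megaraid,N` device type is smartctl's way of addressing a physical disk behind a MegaRAID-family controller. The sample text is only an illustration of the general shape of `-LDInfo` output, not a capture from a real H700, and the helper names are made up for this example.

```python
import re


def smartctl_cmd(device_id: int, block_dev: str = "/dev/sda") -> list[str]:
    """Build a smartctl call for physical disk N behind the controller."""
    return ["smartctl", "-a", "-d", f"megaraid,{device_id}", block_dev]


# Illustrative text in the style of `MegaCli -LDInfo -Lall -aALL` (assumed layout).
SAMPLE_LDINFO = """\
Virtual Drive: 0 (Target Id: 0)
RAID Level          : Primary-5, Secondary-0, RAID Level Qualifier-3
State               : Degraded
Number Of Drives    : 5
"""


def ld_states(text: str) -> dict[int, str]:
    """Map each virtual drive number to its reported state."""
    states: dict[int, str] = {}
    current = None
    for line in text.splitlines():
        m = re.match(r"Virtual Drive:\s*(\d+)", line)
        if m:
            current = int(m.group(1))
            continue
        m = re.match(r"State\s*:\s*(\S.*)", line)
        if m and current is not None:
            states[current] = m.group(1).strip()
    return states


if __name__ == "__main__":
    print(smartctl_cmd(3))
    print(ld_states(SAMPLE_LDINFO))
```

The parser works on text only, so it can be tested against saved output before it is ever pointed at a live controller.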

Quote:
I must advise against RAID5. Rebuilding very large arrays (a few TB) might take longer than is acceptable for running a degraded array.
Well, with large disks rebuilding any RAID takes a long time. I have access to one system with 1 TB drives using RAID 1 with one hot spare. I have seen that system take most of the day to rebuild the array. My cynical opinion why RAID 5 tends to be pushed is to sell more hard drives. To be fair, a 3-disk RAID 5 provides an additional disk of usable capacity compared to a 2-disk RAID 1, but can only survive one disk failure, just like RAID 1. Although striping improves overall throughput, I am not fond of the whole striping thing. It is just unnecessary complexity for most users. Yes, backups are required, but to me RAID 1 is simpler to maintain and recover. That all said, I have an opportunity with this Dell R710 to learn more about RAID and that was my hope with this thread.
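
The capacity and fault-tolerance comparison between these levels is simple arithmetic, sketched below in Python for equal-sized disks. The function name is hypothetical, and treating RAID 1 as an n-way mirror is an assumption; many hardware controllers only build two-disk mirrors.

```python
def raid_summary(level: int, disks: int, disk_tb: float = 1.0) -> tuple[float, int]:
    """Return (usable TB, disk failures survivable) for RAID 1, 5 or 6."""
    if level == 1:
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_tb, disks - 1  # every disk holds the same data
    if level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_tb, 1  # one disk's worth of parity
    if level == 6:
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return (disks - 2) * disk_tb, 2  # two disks' worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    for level, disks in ((1, 2), (5, 3), (5, 5), (6, 5)):
        usable, tolerated = raid_summary(level, disks)
        print(f"RAID {level}, {disks} x 1 TB: {usable:.0f} TB usable, "
              f"survives {tolerated} failure(s)")
```

With the five 1 TB disks in the R710, that gives 4 TB usable on RAID 5 and 3 TB on RAID 6.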
 
Old 03-19-2018, 03:42 PM   #6
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 26,666

Rep: Reputation: 7970
Quote:
Originally Posted by upnort
Well, um, thanks, I guess. I was not asking for opinions about troubleshooting or theory. I was asking for ideas how to conduct some self-training with a RAID controller. For example, pull a drive to degrade the array, replace the drive, and monitor the array rebuilding. I am looking for various ways to degrade an array and learn how to respond. I have basic experience monitoring arrays but no experience handling actual failures, and that is the gap I want to close.
I would strongly suggest you don't just 'pull a drive' to degrade the array, unless you have hot-swap drives. However, for testing, powering the system off and pulling a drive will get the array degraded, and let you look.
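
On a lab box, another way to degrade the array without touching hardware is to force a member offline through the controller's CLI. The Python sketch below only builds the argument lists and parses a progress line; it runs nothing. The flags follow commonly published MegaCli usage, but the binary name (MegaCli, MegaCli64, megacli), the enclosure and slot numbers, and the sample progress line are all assumptions to check against your own controller before use. Never try this on an array holding data you care about.

```python
import re
from typing import Optional


def megacli(*args: str, adapter: int = 0) -> list[str]:
    """Build a MegaCli argument list for one adapter (binary name assumed)."""
    return ["MegaCli", *args, f"-a{adapter}"]


def drill_commands(enclosure: int, slot: int, adapter: int = 0) -> dict[str, list[str]]:
    """Commands for a degrade-and-rebuild drill on one physical drive."""
    drive = f"[{enclosure}:{slot}]"
    return {
        "list_drives": megacli("-PDList", adapter=adapter),
        "force_offline": megacli("-PDOffline", "-PhysDrv", drive, adapter=adapter),
        "rebuild_start": megacli("-PDRbld", "-Start", "-PhysDrv", drive, adapter=adapter),
        "rebuild_progress": megacli("-PDRbld", "-ShowProg", "-PhysDrv", drive, adapter=adapter),
    }


# Illustrative progress line (assumed format, not captured from real hardware).
SAMPLE_PROGRESS = "Rebuild Progress on Device at Enclosure 32, Slot 1 Completed 45% in 12 Minutes."


def rebuild_percent(text: str) -> Optional[int]:
    """Extract the completed percentage from a progress line, if present."""
    m = re.search(r"Completed\s+(\d+)%", text)
    return int(m.group(1)) if m else None


if __name__ == "__main__":
    for name, cmd in drill_commands(32, 1).items():
        print(name, " ".join(cmd))
    print(rebuild_percent(SAMPLE_PROGRESS))
```

Keeping the command construction separate from execution means the whole drill can be printed and reviewed before anything is run as root.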

There *MAY* be tools that can let you see what you can with an mdadm command (for software RAID), but that depends on your controller. With something like RAID5 or 6, the system won't even notice, and will continue as normal. Unless you poke through the logs or have SNMP set up, you won't notice. Putting the drive in is similarly invisible...the controller is doing the grunt work there.
Quote:
With many controllers smartctl will pierce the veil to see individual drives, but fails to provide information about the array. The megacli command works great to query a supported RAID controller and fill that void.
It can, but it may not (as you say) depending on controller. LSI and Adaptec have utilities for their hardware RAID controllers that will get you further than the standard Linux utilities.
Quote:
Well, with large disks rebuilding any RAID takes a long time. I have access to one system with 1 TB drives using RAID 1 with one hot spare. I have seen that system take most of the day to rebuild the array. My cynical opinion why RAID 5 tends to be pushed is to sell more hard drives. To be fair, a 3-disk RAID 5 provides an additional disk of usable capacity compared to a 2-disk RAID 1, but can only survive one disk failure, just like RAID 1. Although striping improves overall throughput, I am not fond of the whole striping thing. It is just unnecessary complexity for most users. Yes, backups are required, but to me RAID 1 is simpler to maintain and recover. That all said, I have an opportunity with this Dell R710 to learn more about RAID and that was my hope with this thread.
Yes, it does take a long time...but honestly, who cares? Because:
  • The system is up
  • No data is lost
  • This happens in the background
For me, even if it takes several days, it's a non-issue. Because as you say, you have backups...and you're playing the odds too. Because what are the chances that two drives in one system/array are going to fail within a week of each other? RAID 5 and 6 are my go-tos, followed by RAID 50/60. I despise RAID1 because I've been bitten by the corrupted mirror numerous times. Yes, the drives are mirrored...but if one gets corrupted, it'll do nothing but corrupt the data on the second drive, leaving you dead in the water. Not to mention many times where you'd have to power the system down, put the mirror into primary slot, get a new drive, power up, and then wait xxx time for the re-mirror.
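
For a rough feel of those odds, here is a back-of-the-envelope Python sketch. It assumes the surviving disks fail independently at a constant rate, and the 2% annual failure rate is an assumed figure for illustration, not a measured one. Real arrays can do worse than this model suggests, since disks of the same age and batch are not truly independent and a rebuild puts extra load on the survivors.

```python
import math


def p_second_failure(remaining_disks: int, rebuild_days: float, afr: float) -> float:
    """Chance that at least one surviving disk fails during the rebuild window.

    afr is the assumed annual failure rate of a single disk (0.02 means 2%).
    """
    rate_per_day = -math.log(1 - afr) / 365  # constant-hazard model
    return 1 - math.exp(-rate_per_day * remaining_disks * rebuild_days)


if __name__ == "__main__":
    # Five-disk RAID 5 with one disk dead: four survivors, one-week rebuild.
    p = p_second_failure(remaining_disks=4, rebuild_days=7, afr=0.02)
    print(f"Chance of a second failure in the window: {p:.2%}")
```

Under these assumptions the figure comes out well under one percent, which is the "playing the odds" argument in numbers.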

Have only had one instance of RAID5 going squirrely, and that was because of a flaky HP RAID controller (HP...they put the "J" in quality...)
 
  

