LinuxQuestions.org
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - Distributions > Slackware
Slackware This Forum is for the discussion of Slackware Linux.

Old 05-01-2011, 01:15 PM   #1
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,045

Rep: Reputation: Disabled
Automating raid failure detection on Slack 13.1


Hey fellow Slackers! I just set up sendmail on my server to send emails and it works. Now I would like to get an email from mdadm if something goes wrong. I imagine most RAID users have this feature set up.

Right now, I have 7 raid arrays and mdadm starts at boot time. Until now, I used Mr. Goblin's script (http://connie.slackware.com/~mrgoblin/files/rc.mdadm) (thanks Mr Goblin!) to monitor my arrays.

The script is started at boot time from rc.local. I created a small script in /usr/bin that sends the following command to rc.mdadm, giving me the status of the arrays:

Code:
/etc/rc.d/rc.mdadm status
and it works fine, but it means I have to probe the arrays manually by calling the script from the command line. I would like to automate the probing, say every 10 minutes, and get an email if a fault is detected.
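For what it's worth, the "probe every 10 minutes and mail on a fault" idea could be sketched roughly like this. It is only an illustration, not part of rc.mdadm: the function names are made up, it assumes a working mail(1) command (as provided by mailx), and it simply looks for an underscore in the [UU] member map of /proc/mdstat.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, nothing here comes from rc.mdadm.

# Print the name of every md array whose member map (e.g. [U_]) shows a
# missing disk. Reads /proc/mdstat-style text on stdin.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1; next }
        name != "" && match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
            name = ""
        }
    '
}

# Mail a short report if anything is degraded; stay silent otherwise.
# Assumes mail(1) is installed and can deliver to the given address.
report_degraded() {
    addr=$1
    bad=$(degraded_arrays < /proc/mdstat)
    [ -z "$bad" ] && return 0
    { echo "Degraded arrays: $bad"; cat /proc/mdstat; } |
        mail -s "RAID degraded on $(hostname)" "$addr"
}
```

Saved as a file, this could be sourced from a root cron job that calls report_degraded with your address every 10 minutes. mdadm's own --monitor mode covers the same ground, so treat this as a fallback rather than a replacement.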

Right now, with the command:

Code:
mdadm --monitor --scan --test --oneshot
I get 7 emails saying:

Code:
This is an automatically generated mail message from mdadm
running on local-server

A TestMessage event had been detected on md device /dev/md2.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] 
md6 : active raid1 sde1[0] sdf1[1]
      1465135936 blocks [2/2] [UU]
      
md5 : active raid1 sda9[0] sdb9[1]
      253834880 blocks [2/2] [UU]
      
md4 : active raid1 sda8[0] sdb8[1]
      15366016 blocks [2/2] [UU]
      
md3 : active raid1 sda7[0] sdb7[1]
      10249344 blocks [2/2] [UU]
      
md2 : active raid1 sda6[0] sdb6[1]
      10249344 blocks [2/2] [UU]
      
md1 : active raid1 sda5[0] sdb5[1]
      20490752 blocks [2/2] [UU]
      
md0 : active raid1 sda1[0] sdb1[1]
      272960 blocks [2/2] [UU]
      
unused devices: <none>
In my mdadm.conf I have

Code:
MAILADDR email@gmail.com
That's all I have for now. Will this monitor my RAID arrays the way it is set up now, or do I need to modify the setup? Right now I am not even sure how mdadm gets started in the first place, or whether it is really monitoring my arrays. I could ultimately test the setup by unplugging a drive, but I really (really!) don't want to do that...

Thanks!

Last edited by lpallard; 05-01-2011 at 01:17 PM.
 
Old 05-01-2011, 03:48 PM   #2
vulcan59
Member
 
Registered: Sep 2007
Location: UK
Distribution: Slackware 14.2 & Current
Posts: 96

Rep: Reputation: 30
This is what I run at startup.

Code:
/sbin/mdadm -F --scan -m dave@pc1 -f -d 600
 
Old 05-01-2011, 04:13 PM   #3
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,045

Original Poster
Rep: Reputation: Disabled
Makes sense, but my problem is finding out how Slackware launches mdadm at boot time. There is nothing in rc.local, so I guess it's called from somewhere else. I also found that mdadm is running with the switches --monitor --daemonise /dev/md[0-9], but that's all.

So your command basically monitors (-F), scans the arrays (--scan), mails to dave@pc1 (-m), is daemonised (-f) and finally polls the arrays every 600 secs (-d).

Again, I think it makes sense. Mine does not poll the arrays and will not send emails. Why does it need --scan at all? Also, since you passed -m directly on the command line, do you still need your email in mdadm.conf?

I think I would be pleased with this command :

Code:
/sbin/mdadm --monitor --scan --mail email@gmail.com --daemonise --delay 600 /dev/md[0-9]
Would it work?

EDIT: I think mdadm is initialized from initrd ... If so, how do I change the parameters? Which file to modify? Last thing I want to do is to fry my setup because of a stupid error...
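One way to track down where mdadm gets launched is to search the init scripts for it. A rough sketch, assuming the usual Slackware layout with scripts under /etc/rc.d (the function name is made up for illustration):

```shell
#!/bin/sh
# Sketch only: find_mdadm_launchers is a hypothetical helper name.

# List every non-comment line under a directory (default /etc/rc.d) that
# mentions mdadm, prefixed with file name and line number.
find_mdadm_launchers() {
    dir=${1:-/etc/rc.d}
    # -r: recurse, -n: show line numbers; the second grep drops lines
    # whose content starts with a comment marker.
    grep -rn 'mdadm' "$dir" 2>/dev/null |
        grep -v '^[^:]*:[0-9]*:[[:space:]]*#'
}
```

Running find_mdadm_launchers with no argument searches /etc/rc.d. A hit in rc.local or rc.mdadm would show which script starts the monitor, without having to touch the initrd.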

Last edited by lpallard; 05-01-2011 at 04:31 PM.
 
Old 05-01-2011, 04:50 PM   #4
vulcan59
Member
 
Registered: Sep 2007
Location: UK
Distribution: Slackware 14.2 & Current
Posts: 96

Rep: Reputation: 30
First thing I should say is that my only machine using software raid is Slackware 12.2 so something may have changed with 13.1 which I don't know about.

My monitor command is run from rc.local. All my raid arrays are made up of partitions of type Linux raid autodetect and are automatically detected at system boot without anything special in the initrd. My initrd only contains modules for ext3.

--scan is there because I don't specify any device names. My mdadm.conf is empty, no email address or devices in it.

Have you actually tried simulating a failure like this to see if you get an email?
Code:
mdadm /dev/md0 --fail /dev/sda1
mdadm /dev/md0 --remove /dev/sda1
mdadm /dev/md0 --add /dev/sda1
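If you want to rehearse that sequence before touching a real array, it could be wrapped in a small function with an overridable mdadm path. This is just a sketch: the function name and the MDADM variable are made up, and setting MDADM=echo turns it into a dry run that only prints the arguments.

```shell
#!/bin/sh
# Sketch only: simulate_failure and MDADM are hypothetical names.
# Set MDADM=echo for a dry run that prints the arguments instead of acting.
MDADM=${MDADM:-/sbin/mdadm}

# Mark one member as failed, remove it, then add it back so the array
# rebuilds. Stops at the first step that fails.
simulate_failure() {
    array=$1
    member=$2
    "$MDADM" "$array" --fail "$member" &&
    "$MDADM" "$array" --remove "$member" &&
    "$MDADM" "$array" --add "$member"
}
```

For example, with MDADM=echo set, simulate_failure /dev/md0 /dev/sda1 prints the three argument lists and changes nothing. Run for real, the array is degraded until the rebuild finishes, and progress shows up in /proc/mdstat.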
 
Old 05-01-2011, 08:40 PM   #5
mRgOBLIN
Slackware Contributor
 
Registered: Jun 2002
Location: New Zealand
Distribution: Slackware
Posts: 999

Rep: Reputation: 231
Slackware by default does not start mdadm at boot time.
The rc.mdadm script does that for you (assuming you have your email address in /etc/mdadm.conf)

The mdadm you see running has been started by my script and is in fact monitoring the listed devices.
The only time you'll get an email is if some RAID event happens.

If it is started from rc.local, one of the very last lines you see after booting should be mdadm telling you it is monitoring your arrays, along with the email address that errors are sent to.
 
Old 05-01-2011, 09:48 PM   #6
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,045

Original Poster
Rep: Reputation: Disabled
Yes, I have my email in /etc/mdadm.conf as

Code:
MAILADDR email@gmail.com
It is in fact started at boot time and called from rc.local

So no need for other parameters? ps -A indicates that mdadm has been launched with only --monitor and --daemonise but nothing else... Will it probe the arrays to confirm all is fine?
 
Old 05-02-2011, 01:59 AM   #7
mRgOBLIN
Slackware Contributor
 
Registered: Jun 2002
Location: New Zealand
Distribution: Slackware
Posts: 999

Rep: Reputation: 231
ps should show something like this.

Code:
ps aux |grep mdadm
root      3777  0.0  0.0    952   172 ?        Ss   Apr09   0:02 /sbin/mdadm --monitor --daemonise /dev/md[0-9]
As I mentioned earlier, when you first start the machine you should see mdadm starting and showing the email address that will be used. It won't be part of the command-line you see from ps output though.
 
Old 05-02-2011, 04:36 AM   #8
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,045

Original Poster
Rep: Reputation: Disabled
Quote:
As I mentioned earlier, when you first start the machine you should see mdadm starting and showing the email address that will be used. It won't be part of the command-line you see from ps output though.
Yes, ps shows exactly what you posted in your last post.

This server is headless, with no monitor attached to it. Instead of watching the boot sequence, is there a way to get its output to see if it uses my email?
 
Old 05-02-2011, 05:36 AM   #9
vulcan59
Member
 
Registered: Sep 2007
Location: UK
Distribution: Slackware 14.2 & Current
Posts: 96

Rep: Reputation: 30
Quote:
Originally Posted by lpallard
This server is headless. No monitor attached to it. Instead of looking at the boot sequence, Is there a way to get the output of the boot sequence to see if it uses my email?
If you really want to check that all is working, why not do the test I described above? No hardware unplugging involved, just the mdadm commands.
 
Old 05-02-2011, 06:27 AM   #10
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,045

Original Poster
Rep: Reputation: Disabled
Good idea, I'll post back with the results!
 
Old 05-02-2011, 04:29 PM   #11
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,045

Original Poster
Rep: Reputation: Disabled
Nope! I did your test: the array degraded and then rebuilt successfully, but I never got an email from mdadm. Sendmail works perfectly, since I get the test emails with the command:

Code:
mdadm --monitor --scan --test --oneshot
Apart from "on-demand" querying, I am 99% sure the rc.mdadm script provides neither notification nor scanning or real-time monitoring per se. I have my email in /etc/mdadm.conf. Apart from normal mdadm messages such as the reconstruction of the array, there is nothing in /var/log/messages that shows mdadm sending an email.

So what next?
 
Old 05-02-2011, 05:09 PM   #12
mRgOBLIN
Slackware Contributor
 
Registered: Jun 2002
Location: New Zealand
Distribution: Slackware
Posts: 999

Rep: Reputation: 231
Did you add your arrays to /etc/mdadm.conf?
 
Old 05-02-2011, 05:35 PM   #13
granth
Member
 
Registered: Jul 2004
Location: USA
Distribution: Slackware64
Posts: 212

Rep: Reputation: 55
rc.mdadm could be the problem. It doesn't use the mdadm.conf file (scan), and doesn't specify an admin email address with the -m or --mail argument.

Code:
${PROG} --monitor --daemonise "${DEVICES}" > ${PIDFILE}

Try populating your mdadm.conf, launch mdadm manually, and test again.

This is what I use, with a populated mdadm.conf:

Code:
/sbin/mdadm --monitor --scan -f -d 120

This is interesting, from the mdadm man page:

-f, --daemonise
Tell mdadm to run as a background daemon if it decides to monitor anything. This causes it to fork and run in the child, and to disconnect from the terminal. The process id of the child is written to stdout. This is useful with --scan which will only continue monitoring if a mail address or alert program is found in the config file.
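Going by that man page text, it may be worth checking that the config file really names an alert target before starting the monitor with --scan. A minimal sketch, assuming the config lives at /etc/mdadm.conf and uses the full MAILADDR or PROGRAM keywords (the function name is made up):

```shell
#!/bin/sh
# Sketch only: has_alert_target is a hypothetical helper name.

# Succeed if the config file (default /etc/mdadm.conf) has an uncommented
# MAILADDR or PROGRAM line with a value. Abbreviated keywords are not handled.
has_alert_target() {
    conf=${1:-/etc/mdadm.conf}
    grep -Eiq '^[[:space:]]*(MAILADDR|PROGRAM)[[:space:]]+[^[:space:]]' "$conf"
}
```

It could then guard the launch, for example: has_alert_target && /sbin/mdadm --monitor --scan -f -d 120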
 
Old 05-02-2011, 05:49 PM   #14
lpallard
Senior Member
 
Registered: Nov 2008
Posts: 1,045

Original Poster
Rep: Reputation: Disabled
Yes, my arrays are in mdadm.conf. I believe if they weren't there, the arrays would auto assemble at every boot!?

/etc/mdadm.conf
Code:
DEVICE /dev/sd[abcd]1 /dev/sd[ab]5 /dev/sd[ab]6 /dev/sd[ab]7 /dev/sd[ab]8 /dev/sd[ab]9

ARRAY /dev/md0 level=raid1 num-devices=2 UUID=d5e485e2:c46c2a00:9129b1fb:ded846c9 auto=yes
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=650583a2:993c6f50:e2a67cd3:3ffc9b6b auto=yes
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=3c31073c:297fbd76:fcdeb3aa:d13ad906 auto=yes
ARRAY /dev/md3 level=raid1 num-devices=2 UUID=2f1dcfa6:cd6f6c13:35c67e6f:32735530 auto=yes
ARRAY /dev/md4 level=raid1 num-devices=2 UUID=ad057bc4:37f886c8:6a45d43a:5ece5a50 auto=yes
ARRAY /dev/md5 level=raid1 num-devices=2 UUID=4fdeb908:d96635f4:f4c68e26:2ecec5a5 auto=yes
ARRAY /dev/md6 level=raid1 num-devices=2 UUID=03a9b2d5:f6e2c821:85427e01:1986d539 auto=yes
MAILADDR email@gmail.com
Also, I currently don't have to launch mdadm manually; at every boot it is started automatically, either via rc.mdadm or something else (that I couldn't find).

I don't believe mdadm.conf is the problem.
 
Old 05-02-2011, 07:28 PM   #15
mRgOBLIN
Slackware Contributor
 
Registered: Jun 2002
Location: New Zealand
Distribution: Slackware
Posts: 999

Rep: Reputation: 231
Quote:
Originally Posted by lpallard
Yes ps shows exactly what you posted on your last post.

This server is headless. No monitor attached to it. Instead of looking at the boot sequence, Is there a way to get the output of the boot sequence to see if it uses my email?
Code:
/etc/rc.d/rc.mdadm restart
 
  


All times are GMT -5.
