LinuxQuestions.org
Old 06-28-2011, 01:12 AM   #1
xj25vm
Member
 
Registered: Jun 2008
Posts: 393

Rep: Reputation: 68
Help with smartd configuration options


Hi all,

I'm having some trouble understanding the smartd configuration options which go in /etc/smartd.conf. I've had smartd running for close to a year now, and read the manual page for smartd and smartd.conf several times, and I still can't get my head around it properly. This is what I use in smartd.conf:

/dev/sda -H -f -S on -o on -n standby,q \
-s (O/../.././(00|06|12|18)|S/../.././11|L/../../(3|6)/21) \
-m <nomailer> -M exec /usr/sbin/smartd_mailer

The above works fine (I use a custom script because I use exim). However, I had two failing hard disks on two different machines (pending-sector attribute), and I would only receive warning emails if I restarted the machines, not when the periodic tests were being performed. I have a number of questions, if any of you would like to shed some light on this:

1. It is not clear from the manual page how the monitoring and alerting mode works. Do I get an initial email when I start smartd if something is wrong, and then no other warning email after each scan, even if there is a fault, unless the fault gets worse?
2. Can I force it to send me warning emails after each test (offline, online, short, long etc.) if the fault condition is still there, even if it hasn't gotten worse?
3. There seem to be a variety of parameters monitored. I was thinking that if I monitor bad sectors, and the hdd temperature - it should be enough to warn me when the hard-disk is failing? Would the -a switch in smartd.conf cover these?
4. I don't understand the difference between monitoring and logging in relation to alerts? Do I get alerts only if I log things?
5. Do I understand correctly that the -p and -u switch would warn me of *any* changes in SMART attributes, even if they don't represent risk of failure?
6. Is there a way of finding out if the scheduled tests (offline, long, short etc.) have been performed?

Many thanks for any replies. It might be just me being thick, or the SMART specifications being a bit complex, or the manual page not being as clear as it could be - or maybe a combination of the above :-)

Last edited by xj25vm; 06-28-2011 at 01:15 AM.
 
Old 07-07-2011, 03:59 PM   #2
Andy Alt
Member
 
Registered: Jun 2004
Location: Minnesota, USA
Distribution: Slackware64-stable, Manjaro, Debian64 stable
Posts: 528

Rep: Reputation: 167
http://serverfault.com/questions/320...ecovered-value

check out the smartctl command.
http://smartmontools.sourceforge.net...martctl.8.html

As for the output messages from smartd, I'll sometimes just use tail -f /var/log/syslog | grep smartd

Though on some systems the messages are in /var/log/messages
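Since the log location differs between systems, a small sketch of a helper that picks whichever of the two files exists and filters it for smartd lines. The function names are made up for illustration; only the two log paths come from this post.

```shell
# Print the first readable file among the candidates given as arguments.
find_syslog() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# Print only the lines of a log file that mention smartd.
smartd_lines() {
    grep -F 'smartd' "$1"
}

# Usage (not run here):
#   log=$(find_syslog /var/log/syslog /var/log/messages) && smartd_lines "$log"
#   tail -f "$log" | grep --line-buffered smartd    # live view, GNU grep
```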
 
Old 10-20-2011, 05:29 AM   #3
xj25vm
Member
 
Registered: Jun 2008
Posts: 393

Original Poster
Rep: Reputation: 68
Thanks for the answer - but this is not exactly what I am asking. You are suggesting how to find the current smart status (using smartctl and /var/log/syslog).

What I am saying is that smartd is running in the background all the time, monitoring and running its periodic tests. However, it fails to spot the problem unless I restart the machine. Shouldn't I get an error/warning message when the automatic smart test occurs, several times a week? Why is it only emailing me when the machine starts that something is wrong? If I don't restart the server, it will sit there running happily for months and never notify me that something is wrong.

What is the point of having smartd running in the background and performing automatic tests, and being configured to email away warnings - if I have to manually login remotely and run smartctl and check syslog?
 
Old 10-20-2011, 07:05 AM   #4
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
Quote:
Originally Posted by xj25vm
Do I get an initial email when I start smartd if something is wrong, and then no other warning email after each scan, even if there is a fault, unless the fault gets worse?
AFAIK the default is "-M once". If you want an alert on smartd startup you add "-M test".


Quote:
Originally Posted by xj25vm
Can I force it to send me warning emails after each test (offline, online, short, long etc.) if the fault condition is still there, even if it hasn't gotten worse?
You mean '-M daily'?
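As a rough sketch, this is how the directive line from the first post might look with daily reminders switched on. It assumes the same /usr/sbin/smartd_mailer script. As far as the smartd.conf man page describes it, several -M directives may be combined, so '-M daily' sets how often a reminder goes out while '-M exec' sets what gets run.

```
# Same line as in post #1, with "-M daily" appended so that a detected
# problem triggers a reminder every day instead of a single warning.
/dev/sda -H -f -S on -o on -n standby,q \
-s (O/../.././(00|06|12|18)|S/../.././11|L/../../(3|6)/21) \
-m <nomailer> -M exec /usr/sbin/smartd_mailer -M daily
```

smartd only reads its configuration at startup or on reload, so the daemon needs restarting after the change.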


Quote:
Originally Posted by xj25vm
There seem to be a variety of parameters monitored. I was thinking that if I monitor bad sectors, and the hdd temperature - it should be enough to warn me when the hard-disk is failing? Would the -a switch in smartd.conf cover these?
Apparently the man page says it covers what you wrote about. Plus it's a smartd default.
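For illustration, a minimal smartd.conf line using -a. Going by the smartd.conf man page, -a is shorthand for '-H -f -t -l error -l selftest -C 197 -U 198' (newer releases add one more log directive), so it includes the pending (197) and offline uncorrectable (198) sector counts that the line in the first post does not list. Temperature limits are set with a separate -W directive. The email address and the temperature values below are placeholders, not recommendations.

```
# -a          health status, attribute failures and changes, error log,
#             self-test log, pending/uncorrectable sector counts
# -W 4,45,55  log temperature changes of 4 degrees or more, log a notice
#             at 45 C, log and email a warning at 55 C (example values)
/dev/sda -a -W 4,45,55 -m admin@example.com
```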


Quote:
Originally Posted by xj25vm
I don't understand the difference between monitoring and logging in relation to alerts? Do I get alerts only if I log things?
I don't know if I can explain this in a simple way but monitoring is what smartd does. Reason-for-being kind of thing. Logging means telling syslog about changes like starting smartd, starting a self-test or telling it some value has changed. Alerting means smartd emailing a problem description. While it would be odd not to log changes you can do so (use say '-l local6' and don't reference it in /etc/syslog.conf) and only keep the alerting.
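To make the logging side concrete, a sketch modelled on the example in the smartd man page: the daemon is started with a dedicated syslog facility, and syslog decides where (or whether) those messages land. The log file name is an assumption; leaving the second step out is what drops the log messages while email alerts keep working.

```
# Step 1: start the daemon with its own facility. This is a smartd
# command-line option, usually set in the init script:
#     smartd -l local6
#
# Step 2: in /etc/syslog.conf, route that facility to a file of its own.
local6.*    /var/log/smartd.log
```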


Quote:
Originally Posted by xj25vm
Do I understand correctly that the -p and -u switch would warn me of *any* changes in SMART attributes, even if they don't represent risk of failure?
"-t", which stands for "combine -p with -u", reports all changes. With "-I" you can ignore specific values.
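A short sketch of those two directives together. Attribute 194 is commonly the drive temperature, but IDs differ between models, so treat the number as an assumption and check the output of 'smartctl -A' for your drive. The email address is a placeholder.

```
# -t      track changes in both prefail (-p) and usage (-u) attributes
# -I 194  do not report changes in attribute 194 (often the temperature,
#         which otherwise produces a log line on every small change)
/dev/sda -H -f -t -I 194 -m admin@example.com
```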


Quote:
Originally Posted by xj25vm
Is there a way of finding out if the scheduled tests (offline, long, short etc.) have been performed?
As in 'smartctl -l selftest /dev/devicename'?
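A tiny wrapper around that command, in case it is handy in a script. The function name and the SMARTCTL variable are made up for this sketch; the smartctl invocation itself is the one quoted above. It usually needs root.

```shell
# Binary to call; set SMARTCTL beforehand to use a different path.
SMARTCTL=${SMARTCTL:-smartctl}

# Print the self-test log stored on the drive. Each completed or aborted
# test (short, long, offline) shows up as one entry with its result.
selftest_log() {
    "$SMARTCTL" -l selftest "$1"
}

# Usage (not run here):
#   selftest_log /dev/sda
```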
 
Old 10-20-2011, 10:17 AM   #5
xj25vm
Member
 
Registered: Jun 2008
Posts: 393

Original Poster
Rep: Reputation: 68
Thank you for your reply, unSpawn.

Quote:
AFAIK the default is "-M once". If you want an alert on smartd startup you add "-M test".
I think what I am after is -M daily. I hope this will email me daily *only* if there keeps on being a problem with the hard-disk. Although I can't tell from the man page if that is the case - or it will email me daily even if there is nothing to report. It is a bit confusing that, although you can schedule regular tests - this doesn't seem to influence how frequently you receive email alerts. I just assumed that, if a test finds a problem, it will just email the alert immediately. It seemed like the reasonable thing to expect. You enable emailing - and you receive alerts all the time while there keeps on being a problem there - every time you run a test and it keeps on finding the problem?

Thanks for the other pointers as well. I think I'm getting there :-)
 
Old 10-20-2011, 11:07 AM   #6
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
Quote:
Originally Posted by xj25vm
I hope this will email me daily *only* if there keeps on being a problem with the hard-disk. Although I can't tell from the man page if that is the case - or it will email me daily even if there is nothing to report.
If there's nothing to report smartd won't send email.


Quote:
Originally Posted by xj25vm
It is a bit confusing that, although you can schedule regular tests - this doesn't seem to influence how frequently you receive email alerts. I just assumed that, if a test finds a problem, it will just email the alert immediately. It seemed like the reasonable thing to expect.
AFAIK self-tests are independent backgrounded processes that take quite a while to complete and that's different from counters the disk maintains and which smartd can access and report about instantly.
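The difference can be seen from the command line. A sketch with made-up function names: reading the attribute table returns at once, while starting a self-test only hands the job to the drive, and the result turns up later in the self-test log.

```shell
# Binary to call; set SMARTCTL beforehand to use a different path.
SMARTCTL=${SMARTCTL:-smartctl}

# Read the counters (attributes) the drive maintains. Returns immediately.
read_attributes() {
    "$SMARTCTL" -A "$1"
}

# Ask the drive to start a short self-test. The command returns at once;
# the test itself runs inside the drive in the background.
start_short_test() {
    "$SMARTCTL" -t short "$1"
}

# Usage (not run here, usually needs root):
#   read_attributes /dev/sda
#   start_short_test /dev/sda    # result appears later in the self-test log
```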


Quote:
Originally Posted by xj25vm
You enable emailing - and you receive alerts all the time while there keeps on being a problem there - every time you run a test and it keeps on finding the problem?
No, I chose the default. In contrast to people who think getting a gazillion emails is a Good Thing I hold the opinion that if one can't or won't respond to a single message then sending duplicates won't change / teach anything priority / efficiency-wise...
 
Old 10-20-2011, 11:24 AM   #7
xj25vm
Member
 
Registered: Jun 2008
Posts: 393

Original Poster
Rep: Reputation: 68
Quote:
No, I chose the default. In contrast to people who think getting a gazillion emails is a Good Thing I hold the opinion that if one can't or won't respond to a single message then sending duplicates won't change / teach anything priority / efficiency-wise...
I can't necessarily argue with your point there. However, I prefer to be pestered again and again until I get around to solving the problem, otherwise it gets lost in the noise of daily running around and trying to fix things. I guess that means I'm not as organised as I should be.

Thanks again for your helpful replies. I just couldn't get my head around the idea of warning emails being treated as completely separate things from the scheduled scans.

Last edited by xj25vm; 10-20-2011 at 11:26 AM.
 
Old 10-20-2011, 11:45 AM   #8
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,415
Blog Entries: 55

Rep: Reputation: 3600
You're welcome.
 
  

