LinuxQuestions.org


catkin 10-12-2009 01:07 PM

A lightweight SMART exception reporting solution?
 
Hello :)

Is there a lightweight SMART exception reporting solution? Ideally something that will periodically gather SMART statistics and, when there is an exception indicating a problem worthy of investigation, send an email?

smartmontools provides the base: the ability to query drives and report statistics. It would, of course, be possible to periodically run a script that runs smartmontools commands, analyses the output and, on detecting an exception, sends an email. A sophisticated implementation would send emails reporting the same problem with decreasing frequency and would infrequently send a "green light" email to assure the recipient that it was still working.
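To make the "decreasing frequency" idea concrete, here is a minimal Python sketch of the timing logic only. It assumes reminders are spaced 1 day, then 2 days, then 4 days apart and so on, and that a "green light" email goes out every 30 days; the function names, the doubling schedule and the 30-day default are all illustrative assumptions, not part of smartmontools.

```python
from datetime import datetime, timedelta


def reminder_due(first_alert: datetime, reminders_sent: int, now: datetime) -> bool:
    """Return True when the next reminder about an ongoing problem is due.

    Assumed schedule: gaps of 1, 2, 4, ... days between emails, so reminder
    number n (0-based) is due 2**(n + 1) - 1 days after the first alert.
    """
    days_until_next = 2 ** (reminders_sent + 1) - 1
    return now - first_alert >= timedelta(days=days_until_next)


def heartbeat_due(last_heartbeat: datetime, now: datetime, every_days: int = 30) -> bool:
    """Return True when it is time for an 'all is well' email (hypothetical 30-day default)."""
    return now - last_heartbeat >= timedelta(days=every_days)
```

A cron job could call these after each check and store `first_alert`, `reminders_sent` and `last_heartbeat` in a small state file; how the drive is actually checked and how the email is sent are left out on purpose.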

This seems such an "obviously" useful function that it may have been developed already but so far I have only found graphical utilities such as munin, GSmartControl and maybe CrystalDiskInfo which would be fine tools to run constantly in an operations centre but are overkill for a single workstation.

Best

Charles

doc.nice 10-15-2009 02:14 PM

In fact, smartmontools does support this, see man smartd...

catkin 10-15-2009 02:29 PM

Quote:

Originally Posted by doc.nice (Post 3720669)
In fact, smartmontools does support this, see man smartd...

Thanks doc.nice :) Sweet as! I had been looking at smartctl but smartd is the bee's knees. There is some great information written by smartmontools' creator, Bruce Allen, in Linux Journal. Reading that in conjunction with the smartd man page and following its cross-references to the smartctl page, I came up with the following line in /etc/smartd.conf:
Code:

/dev/sda -a -I 194 -M diminishing -m root -o on -s (S/../.././17|L/../../6/15) -W 3,41,46
It is testing OK so far (I was holding off posting until I had tested a little more) but I can't think of a way to test it properly without an (ideally simulated) failing HDD.
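For anyone puzzling over the -s directive in that line: as I read the smartd.conf man page, smartd builds a string of the form T/MM/DD/d/HH for each test type (L long, S short, C conveyance, O offline), where d is the day of the week with 1 = Monday, and starts a test when the regular expression matches it. The Python sketch below mimics that matching so a schedule can be sanity-checked without waiting for the clock. It is only an approximation: `due_tests` is a made-up helper name, Python's `re` is not the POSIX regex engine smartd uses, and matching the whole string is my assumption about smartd's behaviour.

```python
import re
from datetime import datetime

# The -s argument from the smartd.conf line above.
SCHEDULE = r"(S/../.././17|L/../../6/15)"


def due_tests(now: datetime, schedule: str = SCHEDULE) -> list:
    """Return the test types whose T/MM/DD/d/HH string matches the schedule."""
    due = []
    for test_type in ("L", "S", "C", "O"):
        # isoweekday() gives 1 for Monday through 7 for Sunday.
        stamp = f"{test_type}/{now:%m}/{now:%d}/{now.isoweekday()}/{now:%H}"
        if re.fullmatch(schedule, stamp):
            due.append(test_type)
    return due
```

With this schedule, `due_tests(datetime(2009, 10, 15, 17))` selects the short test (every day at 17:00) and `due_tests(datetime(2009, 10, 17, 15))` selects the long test (Saturdays at 15:00).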

doc.nice 10-15-2009 02:51 PM

Hi,

I use /dev/hda -a -I 194 -n standby -o on -S on -s S/../.././02 -m (myadminmailaddress) -M diminishing

but I must have missed the -W switch, will add it immediately :)

About the usefulness:
I already had a dying disc and must say, SMART has done its job. It told me that something was wrong about 3 mins before the disc finally gave up... :(
I have another disc (used as swap in a private PC) that SMART has been reporting as defective for more than 3 years now...
So I wouldn't put too much trust in it, but the thermal check is quite a good indicator of real trouble in your case, I think...
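As a rough illustration of what the thermal check with -W 3,41,46 amounts to, here is a simplified Python sketch. It assumes the three numbers mean DIFF, INFO and CRIT in degrees Celsius as described in the smartd.conf man page: report a change of at least DIFF degrees since the last report, log an informational message at INFO or above, and log a critical message (plus an email if -m is set) at CRIT or above. The function name and return labels are made up, and smartd's tracking of new minimum and maximum temperatures is left out.

```python
def classify_temperature(current: int, last_reported: int,
                         diff: int = 3, info: int = 41, crit: int = 46) -> str:
    """Classify a temperature reading against -W style DIFF,INFO,CRIT limits."""
    if current >= crit:
        return "critical"   # would be logged as critical and emailed
    if current >= info:
        return "info"       # would be logged as informational
    if abs(current - last_reported) >= diff:
        return "changed"    # moved at least DIFF degrees since last report
    return "quiet"          # nothing worth reporting
```

For example, a drive going from 38 to 42 degrees would come out as "info", and one reaching 46 as "critical".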


All times are GMT -5.