LinuxQuestions.org

hahnhahnhahn 09-20-2013 12:47 AM

how to check for hard disk failure?
 
Hi to all

I'm a newbie here. I need to check for hard disk failure on my Linux system. I suspect that an "intermittent" hard disk failure is causing some of the tables in my Oracle database to have missing records. Which log should I look at, and how can I check for hard disk failure? My details are below:

Linux:
Red Hat Enterprise Linux ES release 4 (Nahant Update 8)
Kernel \r on an \m
Linux version 2.6.9-89.ELsmp (mockbuild@hs20-bc1-2.build.redhat.com) (gcc version 3.4.6 20060404 (Red Hat 3.4.6-11)) #1 SMP Mon Apr 20 10:34:33 EDT 2009

Oracle Database
Oracle 9i SE (9.2.0.7), Red Hat Linux 4.8 (32 bit)

spazticclown 09-20-2013 12:59 AM

First off, back up any important data.

Smartctl can be used to check the SMART status of the drive:
Code:

# smartctl -a /dev/sda
This prints the SMART status for sda.
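
If the full `smartctl -a` report is too much to read through, a small filter can pull out a few attributes that are commonly watched on failing disks. This is only a sketch: the function name `smart_warning_attrs` is made up for illustration, and it assumes an ATA/SATA drive whose `smartctl -A` table has the attribute name in column 2 and the raw value in column 10. SCSI/SAS drives and disks behind a RAID controller report differently.

```shell
# Hypothetical helper: print the raw values of a few SMART attributes
# that are often associated with failing disks.
# Reads `smartctl -A` output on stdin.
# Assumption: ATA attribute table layout, where column 2 is the
# attribute name and column 10 is the raw value.
smart_warning_attrs() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ { print $2, $10 }'
}
```

Run it as `smartctl -A /dev/sda | smart_warning_attrs`. Non-zero raw values for these attributes are usually worth a closer look, and `smartctl -H /dev/sda` prints the drive's own overall health verdict.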

badblocks can scan for read errors on the drives
Code:

# badblocks -s /dev/sda
-s shows the progress of the scan as a percentage complete.
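
For a long scan it can help to save the list of bad blocks to a file and count them afterwards. A minimal sketch, assuming it is run as root; the function names and the default report file name are hypothetical. Without `-n` or `-w`, badblocks runs a read-only test.

```shell
# Hypothetical wrapper around a read-only badblocks scan.
# Assumption: run as root, and DEVICE is the disk you mean to test.
scan_device() {
    device="$1"
    report="${2:-badblocks-report.txt}"   # hypothetical default file name
    # -s shows progress, -v is verbose, -o writes the bad block numbers to a file.
    badblocks -sv -o "$report" "$device"
}

# Count the bad blocks listed in a report file (one block number per line).
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# function from reporting that case as a failure.
count_bad_blocks() {
    grep -c '^[0-9][0-9]*$' "$1" || true
}
```

For example, `scan_device /dev/sda` followed by `count_bad_blocks badblocks-report.txt`. A count of 0 means the read test found no unreadable blocks.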

If you have a hardware RAID controller then SMART data should be available in the RAID management utility.

Hope that helps you out.

hahnhahnhahn 09-20-2013 01:11 AM

Hi, how about logs? Which log will show me hard disk failure details?

astrogeek 09-20-2013 01:19 AM

You might see some related errors in /var/log/syslog or /var/log/messages, but there is no dedicated log for hard disk failures, at least not until the disk fails 100% and fails to mount.

spazticclown 09-20-2013 01:19 AM

Good question. dmesg (the kernel messages, which also end up in /var/log/messages) may show you some info regarding the drive (sda, sdb, etc.) or mdraid (md0, md1, etc.).
Code:

# grep -i "sda" /var/log/messages
That is a good starting point.
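
To cut the noise down further, the same search can be limited to lines that also look like errors. This is a rough sketch: the function name `disk_errors_in_log` is hypothetical, and the keywords "error" and "fail" are my assumption about what a disk problem looks like in the log, not an exhaustive list.

```shell
# Hypothetical helper: show log lines that mention a device and also
# contain "error" or "fail" (case-insensitive).
# Usage: disk_errors_in_log DEVICE [LOGFILE]
# Assumption: the log is /var/log/messages unless another file is given.
disk_errors_in_log() {
    device="$1"
    logfile="${2:-/var/log/messages}"
    grep -i "$device" "$logfile" | grep -iE 'error|fail'
}
```

For example, `disk_errors_in_log sda`. The function returns a non-zero exit status when nothing matches, which makes it easy to use in a cron job or script.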

Good luck.

hahnhahnhahn 09-20-2013 02:00 AM

Alright. Thanks, everyone, for helping.

John VV 09-20-2013 02:28 AM

You might want to consider installing a supported OS.
RHEL 4.8 is unsupported.
You could upgrade to RHEL 4.9.
It is now on EXTRA extended life support (and you have to buy the extra support),
but that support will be ENDING in mid 2014 (that is for 4.9; 4.8 is ALREADY NOT supported).

hahnhahnhahn 09-20-2013 02:36 AM

Hi John VV,

I greatly appreciate your sharing. However, my company is reducing its budget, so we can't upgrade anytime soon. The same goes for Oracle 9i, which is already unsupported by Oracle.

gdejonge 09-21-2013 01:38 AM

You really should ask your boss what would happen to the department/company when this database system fails.
I've worked for a company where every minute of downtime would cost them thousands of dollars in lost revenue.

This is why companies that really depend on their IT infrastructure have a DR (disaster recovery) plan, and why every sysadmin worth his salt has at least thought about it.

Cheers

zeebra 09-21-2013 07:51 AM

There is no reason for old systems to fail. You do not always need the newest version to use a system successfully. Many people and companies run very old systems, and they work perfectly well.

Depending on the company's size and needs, I would instead recommend moving over to a free and unsupported system that they can manage themselves, if their resources cannot cover moving to the newest and latest release with all the best support packages.


All times are GMT -5.