Linux - EnterpriseThis forum is for all items relating to using Linux in the Enterprise.
Notices
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!
Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.
Well i've been using smartctl to check a few of my drives but I don't understand much of the output or how serious what I deem to be a problem is. Basically I have four drives, sdc, sdd, sde and sdf.
Could someone tell me if drives are faulty or not and if in fact faulty is it safe to still use them or should I replace asap?
Thanks,
icebrian
Code:
#################################################################################
#################################################################################
# smartctl -d ata -a /dev/sdc
#################################################################################
#################################################################################
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: WDC WD10EACS-00D6B0
Serial Number: WD-WCAU40236786
Firmware Version: 01.01A01
User Capacity: 1,000,204,886,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Wed Nov 4 16:10:24 2009 WET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 249) Self-test routine in progress...
90% of test remaining.
Total time to complete Offline
data collection: (22200) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x303f) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 162 148 021 Pre-fail Always - 6858
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 115
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 051 Old_age Always - 0
9 Power_On_Hours 0x0032 086 086 000 Old_age Always - 10574
10 Spin_Retry_Count 0x0032 100 100 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 100 051 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 115
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 31
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 115
194 Temperature_Celsius 0x0022 131 115 000 Old_age Always - 19
196 Reallocated_Event_Count 0x0032 199 199 000 Old_age Always - 1
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 051 Old_age Offline - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 10221 -
# 2 Extended offline Completed without error 00% 10174 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
#################################################################################
#################################################################################
# smartctl -d ata -a /dev/sdd
#################################################################################
#################################################################################
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: WDC WD10EACS-00D6B0
Serial Number: WD-WCAU40213629
Firmware Version: 01.01A01
User Capacity: 1,000,204,886,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Wed Nov 4 16:10:58 2009 WET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (24000) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x303f) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 164 149 021 Pre-fail Always - 6783
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 114
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 051 Old_age Always - 0
9 Power_On_Hours 0x0032 086 086 000 Old_age Always - 10575
10 Spin_Retry_Count 0x0032 100 100 051 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 100 051 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 114
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 29
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 114
194 Temperature_Celsius 0x0022 130 116 000 Old_age Always - 20
196 Reallocated_Event_Count 0x0032 196 196 000 Old_age Always - 4
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 051 Old_age Offline - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 10223 -
# 2 Extended offline Completed without error 00% 10176 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
#################################################################################
#################################################################################
# smartctl -d ata -a /dev/sde
#################################################################################
#################################################################################
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: SAMSUNG HD103UJ
Serial Number: S13PJ1DQ504866
Firmware Version: 1AA01112
User Capacity: 1,000,204,886,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 3b
Local Time is: Wed Nov 4 16:11:26 2009 WET
==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (11962) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 200) minutes.
Conveyance self-test routine
recommended polling time: ( 21) minutes.
SCT capabilities: (0x003f) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 100 100 051 Pre-fail Always - 1
3 Spin_Up_Time 0x0007 078 078 011 Pre-fail Always - 7530
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 97
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 100 100 051 Pre-fail Always - 0
8 Seek_Time_Performance 0x0025 100 100 015 Pre-fail Offline - 10847
9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 9093
10 Spin_Retry_Count 0x0033 100 100 051 Pre-fail Always - 0
11 Calibration_Retry_Count 0x0012 100 100 000 Old_age Always - 22
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 97
13 Read_Soft_Error_Rate 0x000e 100 100 000 Old_age Always - 1
183 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
184 Unknown_Attribute 0x0033 100 100 099 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 13
188 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 084 072 000 Old_age Always - 16 (Lifetime Min/Max 15/17)
194 Temperature_Celsius 0x0022 084 071 000 Old_age Always - 16 (Lifetime Min/Max 15/19)
195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always - 16086
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 100 099 000 Old_age Always - 1
200 Multi_Zone_Error_Rate 0x000a 100 100 000 Old_age Always - 0
201 Soft_Read_Error_Rate 0x000a 253 253 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 0
Warning: ATA Specification requires self-test log structure revision number = 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 8600 -
SMART Selective Self-Test Log Data Structure Revision Number (0) should be 1
SMART Selective self-test log data structure revision number 0
Warning: ATA Specification requires selective self-test log data structure revision number = 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
#################################################################################
#################################################################################
# smartctl -d ata -a /dev/sdf
#################################################################################
#################################################################################
smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: SAMSUNG HD103UJ
Serial Number: S13PJ1DQ504867
Firmware Version: 1AA01112
User Capacity: 1,000,204,886,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 3b
Local Time is: Wed Nov 4 16:11:59 2009 WET
==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 121) The previous self-test completed having
the read element of the test failed.
Total time to complete Offline
data collection: (11858) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 198) minutes.
Conveyance self-test routine
recommended polling time: ( 21) minutes.
SCT capabilities: (0x003f) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 099 099 051 Pre-fail Always - 15
3 Spin_Up_Time 0x0007 078 078 011 Pre-fail Always - 7310
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 98
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 100 100 051 Pre-fail Always - 0
8 Seek_Time_Performance 0x0025 100 100 015 Pre-fail Offline - 10603
9 Power_On_Hours 0x0032 098 098 000 Old_age Always - 9166
10 Spin_Retry_Count 0x0033 100 100 051 Pre-fail Always - 0
11 Calibration_Retry_Count 0x0012 100 100 000 Old_age Always - 80
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 98
13 Read_Soft_Error_Rate 0x000e 099 099 000 Old_age Always - 15
183 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
184 Unknown_Attribute 0x0033 100 100 099 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 59
188 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 083 072 000 Old_age Always - 17 (Lifetime Min/Max 17/18)
194 Temperature_Celsius 0x0022 082 071 000 Old_age Always - 18 (Lifetime Min/Max 17/21)
195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always - 1908
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 2
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 1
199 UDMA_CRC_Error_Count 0x003e 100 100 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x000a 100 100 000 Old_age Always - 0
201 Soft_Read_Error_Rate 0x000a 253 253 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 0
Warning: ATA Specification requires self-test log structure revision number = 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 8825 1953518996
# 2 Extended offline Aborted by host 90% 8808 -
# 3 Extended offline Aborted by host 90% 8668 -
SMART Selective Self-Test Log Data Structure Revision Number (0) should be 1
SMART Selective self-test log data structure revision number 0
Warning: ATA Specification requires selective self-test log data structure revision number = 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
I am not expert on this, but I don't see that there is any problem with those. Here is a rather extended quote from the smartctlman page explaining the interpretation of VALUE, WORST, THRESH, and RAW. As you can see, it is given when explaining the -A option (text in bold was highlighted in the man page; my emphasis is shown in red):
Code:
-A, --attributes
Prints only the vendor specific SMART Attributes. The
Attributes are numbered from 1 to 253 and have specific names
and ID numbers. For example Attribute 12 is "power cycle count":
how many times has the disk been powered up.
Each Attribute has a "Raw" value, printed under the heading
"RAW_VALUE", and a "Normalized" value printed under the heading
"VALUE". [Note: smartctl prints these values in base-10.] In
the example just given, the "Raw Value" for Attribute 12 would
be the actual number of times that the disk has been
power-cycled, for example 365 if the disk has been turned on
once per day for exactly one year. Each vendor uses their own
algorithm to convert this "Raw" value to a "Normalized" value in
the range from 1 to 254. Please keep in mind that smartctl only
reports the different Attribute types, values, and thresholds as
read from the device. It does not carry out the conversion
between "Raw" and "Normalized" values: this is done by the
disk's firmware.
The conversion from Raw value to a quantity with physical units
is not specified by the SMART standard. In most cases, the val-
ues printed by smartctl are sensible. For example the tempera-
ture Attribute generally has its raw value equal to the tempera-
ture in Celsius. However in some cases vendors use unusual con-
ventions. For example the Hitachi disk on my laptop reports its
power-on hours in minutes, not hours. Some IBM disks track three
temperatures rather than one, in their raw values. And so on.
Each Attribute also has a Threshold value (whose range is 0 to
255) which is printed under the heading "THRESH". If the Nor-
malized value is less than or equal to the Threshold value, then
the Attribute is said to have failed. If the Attribute is a
pre-failure Attribute, then disk failure is imminent.
Each Attribute also has a "Worst" value shown under the heading
"WORST". This is the smallest (closest to failure) value that
the disk has recorded at any time during its lifetime when SMART
was enabled. [Note however that some vendors firmware may actu-
ally increase the "Worst" value for some "rate-type"
Attributes.]
The Attribute table printed out by smartctl also shows the
"TYPE" of the Attribute. Attributes are one of two possible
types: Pre-failure or Old age. Pre-failure Attributes are ones
which, if less than or equal to their threshold values, indicate
pending disk failure. Old age, or usage Attributes, are ones
which indicate end-of-product life from old-age or normal aging
and wearout, if the Attribute value is less than or equal to the
threshold. Please note: the fact that an Attribute is of type
'Pre-fail' does not mean that your disk is about to fail! It
only has this meaning if the Attribute's current Normalized
value is less than or equal to the threshold value.
If the Attribute's current Normalized value is less than or
equal to the threshold value, then the "WHEN_FAILED" column will
display "FAILING_NOW". If not, but the worst recorded value is
less than or equal to the threshold value, then this column will
display "In_the_past". If the "WHEN_FAILED" column has no entry
(indicated by a dash: '-') then this Attribute is OK now (not
failing) and has also never failed in the past.
The table column labeled "UPDATED" shows if the SMART Attribute
values are updated during both normal operation and off-line
testing, or only during offline testing. The former are labeled
"Always" and the latter are labeled "Offline".
So to summarize: the Raw Attribute values are the ones that
might have a real physical interpretation, such as "Temperature
Celsius", "Hours", or "Start-Stop Cycles". Each manufacturer
converts these, using their detailed knowledge of the disk's
operations and failure modes, to Normalized Attribute values in
the range 1-254. The current and worst (lowest measured) of
these Normalized Attribute values are stored on the disk, along
with a Threshold value that the manufacturer has determined will
indicate that the disk is going to fail, or that it has exceeded
its design age or aging limit. smartctl does not calculate any
of the Attribute values, thresholds, or types, it merely reports
them from the SMART data on the device.
So I understand the non-zero RAW values to indicate an event happened. But since neither the VALUE nor WORST numbers are anywhere close to THRESH, I don't think you have any problem. I welcome the interpretation of anybody else who has more knowledge of this than I do.
Last edited by blackhole54; 11-10-2009 at 03:37 AM..
LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.