LinuxQuestions.org


flyinggeorge 07-18-2013 07:05 PM

Input/Output error hard drive failure?
 
This isn't really a Slackware issue, or at least I don't think it is. I was using my laptop the other day and some programs just stopped working. I opened a terminal to check dmesg and figure out what was going on, but all I got was "Input/output error" regardless of the command. I even tried halt and reboot with the same result. I assume this is related to an inability to read the hard drive.

I was forced to manually reboot, and then used the Slackware DVD to run fsck /dev/sda. fsck returned "bad superblock." I've only had to deal with the bad superblock error once before, when my file system got corrupted (due to a power outage), but I should note that everything has been running fine since the reboot. Is it possible that the hard drive just needed to shut down temporarily and get restarted? This laptop is almost brand new. I bought it about two months ago. A hard drive issue would be unfortunate.

Like I said, everything is working fine now, but why would fsck return bad superblock?

guanx 07-18-2013 08:39 PM

First check if it's a hardware or software problem with "smartctl -a /dev/sdX" where sdX is your hard disk device (probably sdX = sda).

flyinggeorge 07-18-2013 08:59 PM

Smartctl did return some errors, but I don't know how to read the output.

Code:

bash-4.2# smartctl -a /dev/sda
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.2.29] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:    Hitachi HTS727575A9E364
Serial Number:    J3740084HT22AE
LU WWN Device Id: 5 000cca 68cd90833
Firmware Version: JF4OA200
User Capacity:    750,156,374,016 bytes [750 GB]
Sector Sizes:    512 bytes logical, 4096 bytes physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:  8
ATA Standard is:  ATA-8-ACS revision 6
Local Time is:    Thu Jul 18 21:48:35 2013 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (  0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  45) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (  2) minutes.
Extended self-test routine
recommended polling time:        ( 132) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate    0x000b  100  100  062    Pre-fail  Always      -      0
  2 Throughput_Performance  0x0005  100  100  040    Pre-fail  Offline      -      0
  3 Spin_Up_Time            0x0007  216  216  033    Pre-fail  Always      -      2
  4 Start_Stop_Count        0x0012  100  100  000    Old_age  Always      -      104
  5 Reallocated_Sector_Ct  0x0033  100  100  005    Pre-fail  Always      -      0
  7 Seek_Error_Rate        0x000b  100  100  067    Pre-fail  Always      -      0
  8 Seek_Time_Performance  0x0005  100  100  040    Pre-fail  Offline      -      0
  9 Power_On_Hours          0x0012  097  097  000    Old_age  Always      -      1637
 10 Spin_Retry_Count        0x0013  100  100  060    Pre-fail  Always      -      0
 12 Power_Cycle_Count      0x0032  100  100  000    Old_age  Always      -      104
191 G-Sense_Error_Rate      0x000a  100  100  000    Old_age  Always      -      1
192 Power-Off_Retract_Count 0x0032  100  100  000    Old_age  Always      -      27
193 Load_Cycle_Count        0x0012  085  085  000    Old_age  Always      -      152926
194 Temperature_Celsius    0x0002  157  157  000    Old_age  Always      -      38 (Min/Max 20/49)
196 Reallocated_Event_Count 0x0032  100  100  000    Old_age  Always      -      47
197 Current_Pending_Sector  0x0022  100  100  000    Old_age  Always      -      0
198 Offline_Uncorrectable  0x0008  100  100  000    Old_age  Offline      -      0
199 UDMA_CRC_Error_Count    0x000a  200  200  000    Old_age  Always      -      0
223 Load_Retry_Count        0x000a  100  100  000    Old_age  Always      -      0

SMART Error Log Version: 1
ATA Error Count: 23 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 23 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:47.877  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:47.877  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:47.877  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:47.876  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:47.876  SET FEATURES [Set transfer mode]

Error 22 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:45.285  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:45.285  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:45.285  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:45.284  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:45.284  SET FEATURES [Set transfer mode]

Error 21 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:42.694  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:42.693  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:42.693  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:42.692  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:42.692  SET FEATURES [Set transfer mode]

Error 20 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:40.102  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:40.101  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:40.101  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:40.100  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:40.100  SET FEATURES [Set transfer mode]

Error 19 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:37.501  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:37.501  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:37.500  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:37.500  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:37.499  SET FEATURES [Set transfer mode]

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

They all have something to do with set transfer mode it seems. Can anyone shed some light?
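
On reading these entries: the rows under "Commands leading to the command that caused the error" carry descending timestamps, so the first row (READ FPDMA QUEUED) appears to be the most recent command, and the register line above it reports UNC, an uncorrectable read. Below is a minimal Python sketch of how the LBA on that line can be reproduced from the register bytes. It assumes the 28-bit layout (low nibble of DH, then CH, CL, SN), which matches the value smartctl printed here. `decode_registers` is a hypothetical helper for illustration, not part of smartmontools.

```python
# Sketch only: decode the "registers were:" line of one smartctl error entry.
# Assumption: 28-bit LBA layout, as used by the summary error log
# ("SMART Error Log Version: 1").

def decode_registers(line):
    """Decode the 'ER ST SC SN CL CH DH' hex bytes at the start of the line."""
    er, st, sc, sn, cl, ch, dh = (int(tok, 16) for tok in line.split()[:7])
    # LBA bits 27..24 are the low nibble of DH, then CH, CL, SN.
    lba = ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn
    return {
        "uncorrectable": bool(er & 0x40),  # UNC bit of the error register
        "error_flag": bool(st & 0x01),     # ERR bit of the status register
        "sector_count": sc,
        "lba": lba,
    }

info = decode_registers(
    "40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400"
)
print(hex(info["lba"]), info["lba"], info["uncorrectable"])
# prints: 0x3aec538 61785400 True
```

The decoded value agrees with the 61785400 that smartctl prints, which suggests the failures are reads of one specific sector rather than the SET FEATURES commands that preceded them.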

TobiSGD 07-18-2013 09:09 PM

This disk has already logged 23 errors after that short time. I would recommend backing up your data and returning it to the vendor for a hardware check.

flyinggeorge 07-18-2013 09:28 PM

Is it possible that all of these errors are related to the same issue? For instance, maybe 23 commands were run and each one ended in an error? Also, what would the vendor do for a hardware check that I can't do myself? I'll give it back to the vendor if required, but I would like to figure this out on my own if possible, mostly because this is my only computer and I would prefer not to go without.
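
One way to approach that question is to group the logged errors by power-on hour and LBA. The sketch below is a hedged illustration in Python: `SAMPLE_LOG` is an excerpt of the error log posted above (header and register lines only), and `summarize_errors` is a hypothetical helper. Note that the device log holds only the five most recent errors, so nothing can be concluded about the other 18.

```python
import re
from collections import Counter

# Excerpt of the five logged entries from the smartctl output above.
SAMPLE_LOG = """\
Error 23 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400
Error 22 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400
Error 21 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400
Error 20 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400
Error 19 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400
"""

HEADER = re.compile(r"^Error (\d+) occurred at disk power-on lifetime: (\d+) hours")
DETAIL = re.compile(r"Error: (\w+) at LBA = 0x[0-9a-f]+ = (\d+)")

def summarize_errors(text):
    """Count logged errors per (power-on hour, error kind, LBA)."""
    counts = Counter()
    hours = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            hours = int(header.group(2))
            continue
        detail = DETAIL.search(line)
        if detail and hours is not None:
            counts[(hours, detail.group(1), int(detail.group(2)))] += 1
            hours = None  # wait for the next "Error N occurred" header
    return counts

for (hours, kind, lba), n in summarize_errors(SAMPLE_LOG).items():
    print(f"{n} x {kind} at LBA {lba}, power-on hour {hours}")
```

All five logged entries fall in the same power-on hour and hit the same LBA, and their Powered_Up_Time stamps are roughly 2.6 seconds apart, which would be consistent with one failing read being retried rather than many unrelated faults.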

TobiSGD 07-18-2013 09:36 PM

This looks like a hardware error. It might be the controller, but more likely it is the disk. You can download the disk manufacturer's diagnostic software and test the disk. Before doing anything else I would recommend backing up your data: hardware tests usually stress the hardware, so a possible fault might get worse.

flyinggeorge 07-18-2013 09:54 PM

Well I'm going to try and find out if Asus offers a warranty. Like I said it's only two months old, but IDK if they offer a limited warranty or not. Looking into it...

Edit: Talked to an Asus rep and they have opened up an RMA case for me. From the sound of it, I will have to reinstall Slackware though. They seem dead set on putting Windows 8 on here. Hopefully the fact that I removed the recovery partition will stop that, but I doubt it.

H_TeXMeX_H 07-19-2013 01:52 AM

There are some errors, but nothing conclusive. Try running a SMART long test:
Code:

smartctl -t long /dev/sda

Wait for it to finish and post 'smartctl -a /dev/sda' again.

flyinggeorge 07-19-2013 02:22 PM

I ran the long test and smartctl -a again. The results look the same to me.

Code:

bash-4.2# smartctl -t long /dev/sda
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.2.29] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 132 minutes for test to complete.
Test will complete after Fri Jul 19 15:18:20 2013

Use smartctl -X to abort test.
bash-4.2# smartctl -a /dev/sda
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.2.29] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:    Hitachi HTS727575A9E364
Serial Number:    J3740084HT22AE
LU WWN Device Id: 5 000cca 68cd90833
Firmware Version: JF4OA200
User Capacity:    750,156,374,016 bytes [750 GB]
Sector Sizes:    512 bytes logical, 4096 bytes physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:  8
ATA Standard is:  ATA-8-ACS revision 6
Local Time is:    Fri Jul 19 15:19:26 2013 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 241) Self-test routine in progress...
                                        10% of test remaining.
Total time to complete Offline
data collection:                (  45) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (  2) minutes.
Extended self-test routine
recommended polling time:        ( 132) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate    0x000b  100  100  062    Pre-fail  Always      -      0
  2 Throughput_Performance  0x0005  100  100  040    Pre-fail  Offline      -      0
  3 Spin_Up_Time            0x0007  216  216  033    Pre-fail  Always      -      2
  4 Start_Stop_Count        0x0012  100  100  000    Old_age  Always      -      104
  5 Reallocated_Sector_Ct  0x0033  100  100  005    Pre-fail  Always      -      0
  7 Seek_Error_Rate        0x000b  100  100  067    Pre-fail  Always      -      0
  8 Seek_Time_Performance  0x0005  100  100  040    Pre-fail  Offline      -      0
  9 Power_On_Hours          0x0012  097  097  000    Old_age  Always      -      1652
 10 Spin_Retry_Count        0x0013  100  100  060    Pre-fail  Always      -      0
 12 Power_Cycle_Count      0x0032  100  100  000    Old_age  Always      -      104
191 G-Sense_Error_Rate      0x000a  100  100  000    Old_age  Always      -      0
192 Power-Off_Retract_Count 0x0032  100  100  000    Old_age  Always      -      27
193 Load_Cycle_Count        0x0012  085  085  000    Old_age  Always      -      158134
194 Temperature_Celsius    0x0002  125  125  000    Old_age  Always      -      48 (Min/Max 20/49)
196 Reallocated_Event_Count 0x0032  100  100  000    Old_age  Always      -      47
197 Current_Pending_Sector  0x0022  100  100  000    Old_age  Always      -      0
198 Offline_Uncorrectable  0x0008  100  100  000    Old_age  Offline      -      0
199 UDMA_CRC_Error_Count    0x000a  200  200  000    Old_age  Always      -      0
223 Load_Retry_Count        0x000a  100  100  000    Old_age  Always      -      0

SMART Error Log Version: 1
ATA Error Count: 23 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 23 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:47.877  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:47.877  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:47.877  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:47.876  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:47.876  SET FEATURES [Set transfer mode]

Error 22 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:45.285  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:45.285  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:45.285  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:45.284  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:45.284  SET FEATURES [Set transfer mode]

Error 21 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:42.694  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:42.693  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:42.693  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:42.692  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:42.692  SET FEATURES [Set transfer mode]

Error 20 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:40.102  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:40.101  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:40.101  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:40.100  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:40.100  SET FEATURES [Set transfer mode]

Error 19 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:37.501  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:37.501  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:37.500  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:37.500  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:37.499  SET FEATURES [Set transfer mode]

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
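
Whether two runs really look the same is easier to judge by diffing the RAW_VALUE column than by eye. The following is a rough Python sketch under stated assumptions: `BEFORE` and `AFTER` are a handful of rows copied from the first and second outputs in this thread, `parse_attributes` and `changed` are hypothetical helpers, and the parser assumes each raw value starts with an integer (true for this drive, not guaranteed for others).

```python
# Rows copied from the first smartctl -a output in this thread.
BEFORE = """\
  5 Reallocated_Sector_Ct  0x0033  100  100  005    Pre-fail  Always      -      0
  9 Power_On_Hours          0x0012  097  097  000    Old_age  Always      -      1637
191 G-Sense_Error_Rate      0x000a  100  100  000    Old_age  Always      -      1
193 Load_Cycle_Count        0x0012  085  085  000    Old_age  Always      -      152926
194 Temperature_Celsius    0x0002  157  157  000    Old_age  Always      -      38 (Min/Max 20/49)
196 Reallocated_Event_Count 0x0032  100  100  000    Old_age  Always      -      47
197 Current_Pending_Sector  0x0022  100  100  000    Old_age  Always      -      0
"""

# The same rows from the second output, taken while the long test was running.
AFTER = """\
  5 Reallocated_Sector_Ct  0x0033  100  100  005    Pre-fail  Always      -      0
  9 Power_On_Hours          0x0012  097  097  000    Old_age  Always      -      1652
191 G-Sense_Error_Rate      0x000a  100  100  000    Old_age  Always      -      0
193 Load_Cycle_Count        0x0012  085  085  000    Old_age  Always      -      158134
194 Temperature_Celsius    0x0002  125  125  000    Old_age  Always      -      48 (Min/Max 20/49)
196 Reallocated_Event_Count 0x0032  100  100  000    Old_age  Always      -      47
197 Current_Pending_Sector  0x0022  100  100  000    Old_age  Always      -      0
"""

def parse_attributes(text):
    """Map attribute name -> leading integer of RAW_VALUE."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split(None, 9)  # ten columns; RAW_VALUE keeps its spaces
        if len(fields) == 10 and fields[0].isdigit():
            attrs[fields[1]] = int(fields[9].split()[0])
    return attrs

def changed(before, after):
    """Return {name: (old, new)} for attributes whose raw value differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }

for name, (old, new) in changed(parse_attributes(BEFORE), parse_attributes(AFTER)).items():
    print(f"{name}: {old} -> {new}")
```

On these rows the reallocation and pending-sector counters are unchanged between the two runs, while the hours, load cycles, temperature and G-sense values moved.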


guanx 07-19-2013 04:30 PM

Quote:

Originally Posted by flyinggeorge (Post 4993407)
I ran the long test and smartctl -a again. The results look the same to me.
...

Please be patient ...
Code:

Self-test execution status:      ( 241) Self-test routine in progress...
                                        10% of test remaining.
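
The number in parentheses on that status line can be checked before reposting. A small Python sketch, assuming the usual ATA encoding of the self-test execution status byte (high nibble is the state, low nibble is the remaining tenths of the test); `read_status_byte` and `self_test_state` are hypothetical helper names:

```python
import re

STATUS_LINE = re.compile(r"Self-test execution status:\s+\(\s*(\d+)\)")

def read_status_byte(smartctl_output):
    """Extract the decimal status byte from 'smartctl -a' text, or None."""
    match = STATUS_LINE.search(smartctl_output)
    return int(match.group(1)) if match else None

def self_test_state(status_byte):
    """Split the status byte into (description, percent remaining)."""
    state = status_byte >> 4                   # high nibble: test state
    remaining = (status_byte & 0x0F) * 10      # low nibble: tenths remaining
    if state == 0xF:
        return ("in progress", remaining)
    if state == 0x0:
        return ("completed without error or never run", 0)
    return (f"stopped or failed (state code {state})", remaining)

line = "Self-test execution status:      ( 241) Self-test routine in progress..."
print(self_test_state(read_status_byte(line)))
# prints: ('in progress', 10)
```

A value of 241 is 0xF1, i.e. still running with 10% left. A value of 0 means the last test finished without error or that none was ever run, so the self-test log is the place to confirm which of the two applies.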


flyinggeorge 07-19-2013 05:03 PM

Whoops. Let me try that again.

Code:

bash-4.2# smartctl -a /dev/sda
smartctl 5.43 2012-06-30 r3573 [x86_64-linux-3.2.29] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:    Hitachi HTS727575A9E364
Serial Number:    J3740084HT22AE
LU WWN Device Id: 5 000cca 68cd90833
Firmware Version: JF4OA200
User Capacity:    750,156,374,016 bytes [750 GB]
Sector Sizes:    512 bytes logical, 4096 bytes physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:  8
ATA Standard is:  ATA-8-ACS revision 6
Local Time is:    Fri Jul 19 18:01:10 2013 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (  0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  45) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (  2) minutes.
Extended self-test routine
recommended polling time:        ( 132) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate    0x000b  100  100  062    Pre-fail  Always      -      0
  2 Throughput_Performance  0x0005  100  100  040    Pre-fail  Offline      -      0
  3 Spin_Up_Time            0x0007  216  216  033    Pre-fail  Always      -      2
  4 Start_Stop_Count        0x0012  100  100  000    Old_age  Always      -      104
  5 Reallocated_Sector_Ct  0x0033  100  100  005    Pre-fail  Always      -      0
  7 Seek_Error_Rate        0x000b  100  100  067    Pre-fail  Always      -      0
  8 Seek_Time_Performance  0x0005  100  100  040    Pre-fail  Offline      -      0
  9 Power_On_Hours          0x0012  097  097  000    Old_age  Always      -      1657
 10 Spin_Retry_Count        0x0013  100  100  060    Pre-fail  Always      -      0
 12 Power_Cycle_Count      0x0032  100  100  000    Old_age  Always      -      104
191 G-Sense_Error_Rate      0x000a  100  100  000    Old_age  Always      -      0
192 Power-Off_Retract_Count 0x0032  100  100  000    Old_age  Always      -      27
193 Load_Cycle_Count        0x0012  085  085  000    Old_age  Always      -      158139
194 Temperature_Celsius    0x0002  162  162  000    Old_age  Always      -      37 (Min/Max 20/49)
196 Reallocated_Event_Count 0x0032  100  100  000    Old_age  Always      -      47
197 Current_Pending_Sector  0x0022  100  100  000    Old_age  Always      -      0
198 Offline_Uncorrectable  0x0008  100  100  000    Old_age  Offline      -      0
199 UDMA_CRC_Error_Count    0x000a  200  200  000    Old_age  Always      -      0
223 Load_Retry_Count        0x000a  100  100  000    Old_age  Always      -      0

SMART Error Log Version: 1
ATA Error Count: 23 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 23 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:47.877  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:47.877  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:47.877  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:47.876  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:47.876  SET FEATURES [Set transfer mode]

Error 22 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:45.285  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:45.285  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:45.285  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:45.284  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:45.284  SET FEATURES [Set transfer mode]

Error 21 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:42.694  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:42.693  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:42.693  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:42.692  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:42.692  SET FEATURES [Set transfer mode]

Error 20 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:40.102  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:40.101  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:40.101  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:40.100  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:40.100  SET FEATURES [Set transfer mode]

Error 19 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 00 37 c5 ae 40 00      00:10:37.501  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:10:37.501  SET FEATURES [Reserved for Serial ATA]
  27 00 00 00 00 00 e0 00      00:10:37.500  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:10:37.500  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:10:37.499  SET FEATURES [Set transfer mode]

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error      00%      1655        -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


H_TeXMeX_H 07-20-2013 02:46 AM

Looks like it completed without error, and the attributes are still good. I don't think it is failing.

There are those errors, but one of my drives also has some of those and continues to work fine. They tend to appear after sudden loss of power.

TobiSGD 07-20-2013 06:04 AM

I wouldn't consider a disk that after working only 1600 hours has already 23 errors in the log and already 47 reallocated sectors to be in a good state. All my disks have exactly zero errors and zero reallocated sectors after running 20000+ hours, and if they had any I would back up the data on them and look for a replacement. This is especially true while the disk is still under warranty; a 2-month-old disk shouldn't show any of those signs.

guanx 07-20-2013 07:17 AM

Quote:

Originally Posted by TobiSGD (Post 4993696)
I wouldn't consider a disk that after working only 1600 hours has already 23 errors in the log and already 47 reallocated sectors to be in a good state. ...

No, that is really not good.

Just out of curiosity -- why is Reallocated_Event_Count 47 while both Current_Pending_Sector and Reallocated_Sector_Ct are zero?

H_TeXMeX_H 07-20-2013 07:55 AM

If the drive is under warranty and you believe it to be faulty, send it in for replacement. If not, I don't think it is failing.

onebuck 07-20-2013 09:34 AM

Member Response
 
Hi,

First, I would make the backup as suggested by other members. Then get the manufacturer's diagnostic set if you wish to test further. If the drive is in question and you did state that an RMA was given, then ship it back to the manufacturer. I agree that you should not be getting errors on a new drive with low hours.

You should use the diagnostic set to test then make your decision(s) based on the test results.

flyinggeorge 07-20-2013 09:59 AM

Asus said they do not offer a diagnostic tool to end users. I don't know of any others, except maybe fsck? But I am likely going to send it in for the RMA. There was a sudden loss of power about a week or two before this happened.

rknichols 07-20-2013 04:42 PM

I notice that the 5 errors listed all occurred at about the same time (1615 hours) and refer to the same LBA. I also see that the G-Sense_Error_Rate (parameter 191) had a raw value of 1 in the first report (at 1637 hours) but returned to 0 in the second report (at 1657 hours). That in combination with the oddity of a Reallocated_Event_Count of 47 while both Current_Pending_Sector and Reallocated_Sector_Ct are zero suggests that some external event happened at around 1615 hours to cause the errors. Since the long test currently passes without error, I believe that the drive is now fine.

You might have difficulty getting a warranty replacement for a drive that now passes all tests and has zero reallocated or pending sectors.
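
The pattern described above (every logged error at the same power-on hour and the same LBA) can be checked mechanically. What follows is only a minimal sketch in Python, assuming the error-log layout shown in the smartctl output earlier in the thread. The names `decode_lba28` and `summarize` are made up for illustration, and the register-to-LBA mapping assumes 28-bit addressing, which is what matches the register bytes and the LBA printed on this drive's log lines.

```python
import re

# Two entries copied from the error log in this thread (command rows omitted).
SAMPLE = """\
Error 22 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400
Error 21 occurred at disk power-on lifetime: 1615 hours (67 days + 7 hours)
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 07 38 c5 ae 03  Error: UNC at LBA = 0x03aec538 = 61785400
"""

HEADER = re.compile(r"^\s*Error (\d+) occurred at disk power-on lifetime: (\d+) hours")
# Seven hex bytes followed by "Error:" is the register dump line;
# the 8-byte command rows do not match because they end in a timestamp.
REGS = re.compile(r"^\s*((?:[0-9a-f]{2} ){6}[0-9a-f]{2})\s+Error:")


def decode_lba28(sn, cl, ch, dh):
    """Assumed 28-bit layout: low nibble of DH, then CH, CL, SN."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def summarize(log_text):
    """Return a list of (error_number, power_on_hours, lba) tuples."""
    entries = []
    number = hours = None
    for line in log_text.splitlines():
        m = HEADER.match(line)
        if m:
            number, hours = int(m.group(1)), int(m.group(2))
            continue
        m = REGS.match(line)
        if m and number is not None:
            er, st, sc, sn, cl, ch, dh = (int(b, 16) for b in m.group(1).split())
            entries.append((number, hours, decode_lba28(sn, cl, ch, dh)))
            number = None  # wait for the next "Error N occurred" header
    return entries


entries = summarize(SAMPLE)
lbas = sorted({lba for _, _, lba in entries})
hours = sorted({h for _, h, _ in entries})
print(f"{len(entries)} errors, distinct LBAs: {lbas}, distinct hours: {hours}")
```

If the result shows a single distinct LBA and a single power-on hour, as it does for the two sample entries (LBA 61785400 at hour 1615), that is consistent with repeated retries of one failed read rather than damage spread across the disk. To check a whole log, the saved output of "smartctl -a /dev/sda" could be passed to `summarize` in place of `SAMPLE`.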

mRgOBLIN 07-20-2013 08:45 PM

It's a Hitachi drive.

http://www.hgst.com/hdd/support/down...2_v416_b00.iso

http://www.hgst.com/support/troubles...eshooting-tips
A Windows version is here, if that's an option.
http://www.hgst.com/support/downloads/#WINDFT


All times are GMT -5.