LinuxQuestions.org

CamTheSaxMan 02-09-2014 04:39 PM

Disk utility quits - Is my hard drive failing?
 
I'm running a dual-boot system with Windows 7 and Linux Mint 13. After a few unsuccessful attempts to install Mint 16, I reinstalled Mint 13. However, when I start Mint and switch to a text console, error messages keep filling the screen. I ran a test in Disk Utility; it stopped about 2 seconds into the scan and displayed "Self-tests: FAILED (Read)". I'd post a screenshot if it were possible to upload images. I also ran GSmartControl in Windows, which likewise stopped after 2 seconds and produced this log output:
Code:

smartctl 6.2 2013-07-26 r3841 [x86_64-w64-mingw32-win7-sp1] (sf-6.2-1)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:    Seagate Momentus 5400.6
Device Model:    ST9500325AS
Serial Number:    S2WG54NE
LU WWN Device Id: 5 000c50 04b1c9558
Firmware Version: 0005HPM1
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:  ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Sun Feb 09 15:47:08 2014 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)        Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 121)        The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                          (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003)        Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01)        Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:          (  2) minutes.
Extended self-test routine
recommended polling time:          ( 137) minutes.
SCT capabilities:                (0x103f)        SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate    0x000f  090  083  006    Pre-fail  Always      -      167247778
  3 Spin_Up_Time            0x0002  098  098  000    Old_age  Always      -      0
  4 Start_Stop_Count        0x0033  098  098  000    Pre-fail  Always      -      2332
  5 Reallocated_Sector_Ct  0x0033  096  096  036    Pre-fail  Always      -      83
  7 Seek_Error_Rate        0x000f  077  060  030    Pre-fail  Always      -      58685100
  9 Power_On_Hours          0x0032  097  097  000    Old_age  Always      -      3199
 10 Spin_Retry_Count        0x0013  100  100  097    Pre-fail  Always      -      0
 12 Power_Cycle_Count      0x0033  099  099  020    Pre-fail  Always      -      1854
183 Runtime_Bad_Block      0x0032  100  253  000    Old_age  Always      -      0
184 End-to-End_Error        0x0033  100  100  097    Pre-fail  Always      -      0
187 Reported_Uncorrect      0x0032  001  001  000    Old_age  Always      -      2089
188 Command_Timeout        0x0032  100  099  000    Old_age  Always      -      4
189 High_Fly_Writes        0x003a  100  100  000    Old_age  Always      -      0
190 Airflow_Temperature_Cel 0x0022  056  052  045    Old_age  Always      -      44 (Min/Max 42/44)
191 G-Sense_Error_Rate      0x0032  100  100  000    Old_age  Always      -      22
192 Power-Off_Retract_Count 0x0032  100  100  000    Old_age  Always      -      51
193 Load_Cycle_Count        0x0032  070  070  000    Old_age  Always      -      61021
194 Temperature_Celsius    0x0022  044  048  000    Old_age  Always      -      44 (0 17 0 0 0)
195 Hardware_ECC_Recovered  0x001a  051  046  000    Old_age  Always      -      167247778
196 Reallocated_Event_Count 0x0033  096  096  036    Pre-fail  Always      -      83
197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      78
198 Offline_Uncorrectable  0x0010  100  100  000    Old_age  Offline      -      0
199 UDMA_CRC_Error_Count    0x003e  200  200  000    Old_age  Always      -      0

SMART Error Log Version: 1
ATA Error Count: 2533 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2533 occurred at disk power-on lifetime: 3198 hours (133 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 71 62 ab 0a  Error: UNC at LBA = 0x0aab6271 = 179004017

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 70 62 ab 4a 00      00:24:04.782  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:24:04.740  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 00      00:24:04.641  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      00:24:04.616  SET FEATURES [Set transfer mode]
  ef 10 02 00 00 00 a0 00      00:24:04.606  SET FEATURES [Enable SATA feature]

Error 2532 occurred at disk power-on lifetime: 3198 hours (133 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 71 62 ab 0a  Error: UNC at LBA = 0x0aab6271 = 179004017

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 70 62 ab 4a 00      00:24:02.264  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:24:02.254  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 00      00:24:02.253  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      00:24:02.240  SET FEATURES [Set transfer mode]
  ef 10 02 00 00 00 a0 00      00:24:02.198  SET FEATURES [Enable SATA feature]

Error 2531 occurred at disk power-on lifetime: 3198 hours (133 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 71 62 ab 0a  Error: UNC at LBA = 0x0aab6271 = 179004017

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 70 62 ab 4a 00      00:23:59.777  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:23:59.767  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 00      00:23:59.765  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      00:23:59.752  SET FEATURES [Set transfer mode]
  ef 10 02 00 00 00 a0 00      00:23:59.711  SET FEATURES [Enable SATA feature]

Error 2530 occurred at disk power-on lifetime: 3198 hours (133 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 71 62 ab 0a  Error: UNC at LBA = 0x0aab6271 = 179004017

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 70 62 ab 4a 00      00:23:57.302  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:23:57.292  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 00      00:23:57.291  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      00:23:57.277  SET FEATURES [Set transfer mode]
  ef 10 02 00 00 00 a0 00      00:23:57.268  SET FEATURES [Enable SATA feature]

Error 2529 occurred at disk power-on lifetime: 3198 hours (133 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 71 62 ab 0a  Error: WP at LBA = 0x0aab6271 = 179004017

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 98 e8 8a 5c 49 00      00:23:54.749  WRITE FPDMA QUEUED
  61 00 08 18 48 c9 47 00      00:23:54.749  WRITE FPDMA QUEUED
  61 00 08 28 4c bd 47 00      00:23:54.748  WRITE FPDMA QUEUED
  60 00 08 70 62 ab 4a 00      00:23:54.747  READ FPDMA QUEUED
  ef 10 02 00 00 00 a0 00      00:23:54.737  SET FEATURES [Enable SATA feature]

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure      90%      3199        179004017
# 2  Short offline      Completed: read failure      90%      3199        179004017
# 3  Extended offline    Completed: read failure      90%      3198        118885828
# 4  Extended offline    Completed: read failure      90%      3198        118885828
# 5  Extended offline    Completed: read failure      90%      3198        118885828
# 6  Extended offline    Completed: read failure      90%      3198        118885828
# 7  Short offline      Completed: read failure      90%      3198        118885828
# 8  Short offline      Completed: read failure      90%      3198        118885828
# 9  Short offline      Completed: read failure      90%      3198        118885828
#10  Short offline      Completed: read failure      90%      3198        118885828
#11  Extended offline    Completed: read failure      90%      3198        118885828

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Can someone make sense out of this? Is my drive bad?

Ser Olmy 02-09-2014 04:47 PM

Yes, your drive is definitely failing.

The Reallocated_Sector_Ct attribute tells you that 83 sectors have been confirmed as bad and remapped to the spare area of the disk. The Current_Pending_Sector attribute shows that a further 78 sectors could not be read and are waiting to be dealt with: each one will be reallocated (or cleared, if it turns out to be writable after all) the next time it is written.
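
For anyone who wants to pull these numbers out of a smartctl report automatically, here is a rough Python sketch. It assumes the 10-column attribute table layout shown in the log above; the set of "watched" attribute IDs is my own illustrative choice, not an official list, and the sample rows are copied from the posted output.

```python
# Sketch: flag SMART attributes that commonly point at media trouble.
# WATCHED is an illustrative selection (assumption), not an official list.
WATCHED = {5, 187, 197, 198}

# Rows copied from the smartctl output posted in this thread.
SAMPLE = """\
  5 Reallocated_Sector_Ct  0x0033  096  096  036    Pre-fail  Always      -      83
187 Reported_Uncorrect      0x0032  001  001  000    Old_age  Always      -      2089
197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      78
198 Offline_Uncorrectable  0x0010  100  100  000    Old_age  Offline      -      0
199 UDMA_CRC_Error_Count    0x003e  200  200  000    Old_age  Always      -      0
"""


def parse_attributes(text):
    """Return {id: (name, raw_value)} for each attribute row in the text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row has at least 10 columns and starts with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        # RAW_VALUE is the 10th column; it may be followed by notes such as
        # "(Min/Max 42/44)", so only the first token is used.
        raw = fields[9]
        if raw.isdigit():
            attrs[int(fields[0])] = (fields[1], int(raw))
    return attrs


def suspicious(attrs):
    """Return {name: raw} for watched attributes with a non-zero raw value."""
    return {
        name: raw
        for attr_id, (name, raw) in attrs.items()
        if attr_id in WATCHED and raw > 0
    }


flagged = suspicious(parse_attributes(SAMPLE))
for name, raw in flagged.items():
    print(f"{name}: {raw}")
```

A non-zero raw value on any of these is a reason to back up, even while the overall health line still says PASSED, as it does in the log above.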

You should back up all your data immediately and replace the drive as soon as possible.

metaschima 02-09-2014 05:11 PM

Given the failed SMART extended (long) test, the drive is likely failing. Back up your data and consider getting a new drive.
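
To see at a glance how many self-tests failed and where, the self-test log can be summarised with a few lines of Python. This is only a sketch: it assumes the row layout shown in the posted log (first-error LBA in the last column), and the sample contains just four of the eleven rows.

```python
from collections import Counter

# A few rows copied from the SMART self-test log posted above.
SELFTEST_LOG = """\
# 1  Extended offline    Completed: read failure      90%      3199        179004017
# 2  Short offline      Completed: read failure      90%      3199        179004017
# 3  Extended offline    Completed: read failure      90%      3198        118885828
#10  Short offline      Completed: read failure      90%      3198        118885828
"""


def first_error_lbas(log):
    """Count read-failure entries per LBA_of_first_error."""
    counts = Counter()
    for line in log.splitlines():
        # Log rows start with '#'; only failed reads are of interest here.
        if not line.startswith("#") or "read failure" not in line:
            continue
        last = line.split()[-1]
        # Tests without an error show '-' in the last column; skip those.
        if last.isdigit():
            counts[int(last)] += 1
    return counts


failures = first_error_lbas(SELFTEST_LOG)
for lba, count in failures.items():
    print(f"LBA {lba}: {count} failed test(s)")
```

Every test in the posted log stopped with 90% remaining, at one of two different sectors, so the bad spots are not limited to a single location.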

Some suggest that wiping the drive with zeros may allow you to continue to use it despite its continuous degradation. If you choose to do this, do NOT keep any important data on the disk as it could fail at any time.

jefro 02-10-2014 02:31 PM

See if you can get the OEM's diagnostic tool for a second test. I assume the drive itself has a problem, but double-check everything else too: the PSU, the cables, even the RAM or the controller.

CamTheSaxMan 02-10-2014 04:26 PM

Where do I find that OEM stuff? The drive works fine other than the errors Linux keeps throwing. I'm moving the Linux partition to a different area of the disk and I'm going to see if I can get it to work properly or reinstall if the data is too corrupt. GParted keeps giving me read errors, so I know there are some bad sectors in that area.
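
One way to tell whether a partition overlaps the bad area is to convert the failing LBAs from the log into positions on the disk. The sketch below assumes 512-byte sectors (as the log reports) and 28-bit LBA addressing in the summary error log; the function names are made up for illustration.

```python
SECTOR_SIZE = 512  # bytes, from "Sector Size: 512 bytes logical/physical"


def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA registers into a 28-bit LBA.

    SN, CL and CH hold LBA bits 0-23; the low nibble of DH holds bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def byte_offset(lba, sector_size=SECTOR_SIZE):
    """Byte position of a sector, counted from the start of the disk."""
    return lba * sector_size


# Registers from "40 51 00 71 62 ab 0a" (ER ST SC SN CL CH DH) in the log.
lba = lba28_from_registers(0x71, 0x62, 0xAB, 0x0A)
print(lba, byte_offset(lba))

# The two first-error LBAs from the self-test log.
for bad in (179004017, 118885828):
    gib = byte_offset(bad) / 2**30
    print(f"LBA {bad}: about {gib:.1f} GiB from the start of the disk")
```

Compare those LBAs with the start and end sectors your partitioning tool reports for each partition. Keep in mind that the pending-sector count suggests more bad sectors than these two, so moving a partition is unlikely to avoid all of them.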

metaschima 02-10-2014 05:08 PM

You don't really need the OEM tests, because the SMART self-test you already ran is essentially the same test and just as accurate. If you want them anyway, they are probably on:
http://www.ultimatebootcd.com/

Again, the drive is failing, so back up now.

jefro 02-11-2014 03:20 PM

Go to the website of the company that made the hard drive. Its diagnostics are the best way to test a drive, because OEM diags have the best inside information about that drive. The test suite they provide tends to have more exacting tests. The only better test suite would be the factory tests, and you are unlikely to get those.

You may have to create a BartPE disc to run the Windows version of the tool, or install Windows temporarily for less than 30 days.

OEM diags do more than simple SMART tests.

If you want to double check another way then put the drive into a completely different system and re-test it there with all different associated hardware.


All times are GMT -5.