Hi All,
My HDD failed to boot one day last week. It either happened out of the blue, or it happened after the computer froze and I held down the power button to turn it off (I can't recall for sure).
When I ran gparted, it flagged my linux swap partition and my home partition (I'm not familiar with the full meaning of the orangish "flag").
This is the final portion of the error output I saw when I tried to boot from the drive.
Code:
[ 18.818---] ata3.00: status: { DRDY ERR }
[ 18.818---] ata3.00: error: { UNC }
[ 18.818---] ata3.00: configured for UDMA/133
[ 18.818---] sd 2:0:0:0: [sda] unhandled sense code
[ 18.818---] sd 2:0:0:0: [sda]
[ 18.818---] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 18.818---] sd 2:0:0:0: [sda]
[ 18.818---] Add. Sense: Unrecovered read error - auto reallocate failed
[ 18.818---] sd 2:0:0:0: [sda] CDB:
[ 18.818---] Read(10): 28 00 04 99 39 e8 00 00 08 00
[ 18.818---] end_request: I/O error, dev sda, sector 77150696
[ 18.818---] Buffer I/O error on device sda7, logical block 61
[ 18.819---] ata3: EH complete
BusyBox v1.20.2 (Ubuntu 1:1.20.0-8.1ubuntu1) built-in shell (ash)
Enter 'help' for a list of built-in commands
(initramfs)
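In case it helps anyone cross-check my transcription: as far as I understand, bytes 2 to 5 of a Read(10) CDB hold the starting LBA, most significant byte first. Here is a minimal shell sketch that decodes the CDB shown above (the function name cdb_lba is just something I made up for illustration):

```shell
#!/bin/sh
# Decode the starting LBA from a SCSI Read(10) CDB given as
# space-separated hex bytes. Bytes 2..5 (counting from 0) hold the
# 32-bit logical block address, big-endian.
cdb_lba() {
    # Split the string into positional parameters: $1 is the opcode (28),
    # $2 the flags byte, $3..$6 the four LBA bytes.
    set -- $1
    printf '%d\n' "0x$3$4$5$6"
}

# CDB copied from the boot error above.
cdb_lba "28 00 04 99 39 e8 00 00 08 00"
# prints 77150696
```

That matches the sector number in the end_request line, so the CDB bytes and the sector appear to be transcribed consistently.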
This is the output of lsblk -f -o +size. sda6 is the linux swap and sda7 is the home partition.
Code:
root@sysresccd /root % lsblk -f -o +size
NAME FSTYPE LABEL UUID MOUNTPOINT SIZE
sda 931.5G
├─sda1 ext4 a1e05889-c8a7-4373-b54a-0a0dfd8a71e2 476M
├─sda2 1K
├─sda5 ext4 100f8716-048c-460e-abc9-913fa2d09c6c 14G
├─sda6 22.4G
└─sda7 838.2G
This is the output of dmesg |tail
Code:
mint@mint ~ $ sudo dmesg |tail
[ 2808.481008] nouveau 0000:01:00.0: gr: 00409610: f7f70000
[ 2808.481013] nouveau 0000:01:00.0: gr: TRAP_TEXTURE - TP2: 00000003 [ FAULT]
[ 2808.481017] nouveau 0000:01:00.0: gr: magic set 3:
[ 2808.481021] nouveau 0000:01:00.0: gr: 00409e04: dc0a6201
[ 2808.481025] nouveau 0000:01:00.0: gr: 00409e08: f700f7f7
[ 2808.481029] nouveau 0000:01:00.0: gr: 00409e0c: 40000430
[ 2808.481033] nouveau 0000:01:00.0: gr: 00409e10: f7f70000
[ 2808.481038] nouveau 0000:01:00.0: gr: TRAP_TEXTURE - TP3: 00000003 [ FAULT]
[ 2808.481044] nouveau 0000:01:00.0: gr: 00200000 [] ch 6 [001f8f9000 cinnamon[2913]] subc 3 class 8597 mthd 1b0c data 1000f010
[ 2808.481060] nouveau 0000:01:00.0: fb: trapped read at f700f7f700 on channel 6 [1f8f9000 cinnamon[2913]] engine 00 [PGRAPH] client 0a [TEXTURE] subclient 00 [] reason 00000000 [PT_NOT_PRESENT]
mint@mint ~ $
This is the output of smartctl -x /dev/sda.
[code]
root@sysresccd /root % smartctl -x /dev/sda
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.1.33-std483-amd64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Blue
Device Model: WDC WD10EZEX-00RKKA0
Serial Number: WD-WCC1S4067136
LU WWN Device Id: 5 0014ee 2088d5aeb
Firmware Version: 80.00A80
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Oct 10 15:38:16 2016 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Unavailable
Rd look-ahead is: Enabled
Write cache is: Enabled
ATA Security is: Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (11040) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 127) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x30b5) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-K 200 200 051 - 2010
3 Spin_Up_Time POS--K 173 172 021 - 2316
4 Start_Stop_Count -O--CK 100 100 000 - 841
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
7 Seek_Error_Rate -OSR-K 200 200 000 - 0
9 Power_On_Hours -O--CK 086 086 000 - 10279
10 Spin_Retry_Count -O--CK 100 100 000 - 0
11 Calibration_Retry_Count -O--CK 100 100 000 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 840
192 Power-Off_Retract_Count -O--CK 200 200 000 - 127
193 Load_Cycle_Count -O--CK 200 200 000 - 713
194 Temperature_Celsius -O---K 115 100 000 - 28
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 192 192 000 - 1373
198 Offline_Uncorrectable ----CK 192 192 000 - 1370
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 0
200 Multi_Zone_Error_Rate ---R-- 196 196 000 - 1684
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
Address Access R/W Size Description
0x00 GPL,SL R/O 1 Log Directory
0x01 SL R/O 1 Summary SMART error log
0x02 SL R/O 5 Comprehensive SMART error log
0x03 GPL R/O 6 Ext. Comprehensive SMART error log
0x06 SL R/O 1 SMART self-test log
0x07 GPL R/O 1 Extended self-test log
0x09 SL R/W 1 Selective self-test log
0x10 GPL R/O 1 SATA NCQ Queued Error log
0x11 GPL R/O 1 SATA Phy Event Counters log
0x80-0x9f GPL,SL R/W 16 Host vendor specific log
0xa0-0xa7 GPL,SL VS 16 Device vendor specific log
0xa8-0xb5 GPL,SL VS 1 Device vendor specific log
0xb6 GPL VS 1 Device vendor specific log
0xb7 GPL,SL VS 1 Device vendor specific log
0xbd GPL,SL VS 1 Device vendor specific log
0xc0 GPL,SL VS 1 Device vendor specific log
0xc1 GPL VS 93 Device vendor specific log
0xe0 GPL,SL R/W 1 SCT Command/Status
0xe1 GPL,SL R/W 1 SCT Data Transfer
SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
Device Error Count: 1899 (device log contains only the most recent 24 errors)
CR = Command Register
FEATR = Features Register
COUNT = Count (was: Sector Count) Register
LBA_48 = Upper bytes of LBA High/Mid/Low Registers ] ATA-8
LH = LBA High (was: Cylinder High) Register ] LBA
LM = LBA Mid (was: Cylinder Low) Register ] Register
LL = LBA Low (was: Sector Number) Register ]
DV = Device (was: Device/Head) Register
DC = Device Control Register
ER = Error register
ST = Status register
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 1899 [2] occurred at disk power-on lifetime: 10279 hours (428 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 04 99 38 78 40 00 Error: UNC at LBA = 0x04993878 = 77150328
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 b8 00 00 04 99 38 78 40 08 00:07:01.754 READ FPDMA QUEUED
ef 00 10 00 02 00 00 00 00 00 00 a0 08 00:07:01.753 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 00 00 00 00 00 e0 08 00:07:01.753 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 00 00 00 00 00 a0 08 00:07:01.753 IDENTIFY DEVICE
ef 00 03 00 46 00 00 00 00 00 00 a0 08 00:07:01.752 SET FEATURES [Set transfer mode]
Error 1898 [1] occurred at disk power-on lifetime: 10279 hours (428 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 04 99 38 78 40 00 Error: UNC at LBA = 0x04993878 = 77150328
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 98 00 00 04 99 38 78 40 08 00:06:59.727 READ FPDMA QUEUED
60 00 08 00 90 00 00 04 99 38 38 40 08 00:06:59.727 READ FPDMA QUEUED
60 00 08 00 88 00 00 04 99 38 18 40 08 00:06:59.727 READ FPDMA QUEUED
60 00 08 00 80 00 00 04 99 40 00 40 08 00:06:59.701 READ FPDMA QUEUED
60 00 08 00 78 00 00 6d 5f 46 f8 40 08 00:06:59.674 READ FPDMA QUEUED
Error 1897 [0] occurred at disk power-on lifetime: 10279 hours (428 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 04 99 38 78 40 00 Error: UNC at LBA = 0x04993878 = 77150328
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 b0 00 00 00 0e f0 38 40 08 00:01:50.657 READ FPDMA QUEUED
60 00 08 00 98 00 00 00 0e f0 18 40 08 00:01:50.636 READ FPDMA QUEUED
60 00 08 00 90 00 00 00 00 18 00 40 08 00:01:50.625 READ FPDMA QUEUED
60 00 08 00 88 00 00 00 0e f8 00 40 08 00:01:50.622 READ FPDMA QUEUED
60 00 08 00 40 00 00 04 99 38 78 40 08 00:01:50.614 READ FPDMA QUEUED
Error 1896 [23] occurred at disk power-on lifetime: 10279 hours (428 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 04 99 38 78 40 00 Error: UNC at LBA = 0x04993878 = 77150328
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 00 00 00 04 99 38 78 40 08 00:01:48.588 READ FPDMA QUEUED
60 00 08 00 f0 00 00 04 99 38 38 40 08 00:01:48.587 READ FPDMA QUEUED
60 00 08 00 e8 00 00 04 99 38 18 40 08 00:01:48.587 READ FPDMA QUEUED
60 00 08 00 e0 00 00 04 99 40 00 40 08 00:01:48.562 READ FPDMA QUEUED
60 00 08 00 d8 00 00 6d 5f 46 f8 40 08 00:01:48.526 READ FPDMA QUEUED
Error 1895 [22] occurred at disk power-on lifetime: 10279 hours (428 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 04 99 38 78 40 00 Error: UNC at LBA = 0x04993878 = 77150328
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 10 00 00 04 99 38 78 40 08 00:01:42.483 READ FPDMA QUEUED
60 00 08 00 f0 00 00 00 0f 00 00 40 08 00:01:42.464 READ FPDMA QUEUED
60 00 68 00 e8 00 00 01 cd ef 88 40 08 00:01:42.462 READ FPDMA QUEUED
60 00 80 00 e0 00 00 01 cd ef 00 40 08 00:01:42.462 READ FPDMA QUEUED
60 00 f8 00 d8 00 00 01 cd ee 00 40 08 00:01:42.462 READ FPDMA QUEUED
Error 1894 [21] occurred at disk power-on lifetime: 10279 hours (428 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 04 99 38 78 40 00 Error: UNC at LBA = 0x04993878 = 77150328
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 38 00 00 04 99 38 78 40 08 00:01:40.404 READ FPDMA QUEUED
60 00 08 00 30 00 00 04 99 38 38 40 08 00:01:40.404 READ FPDMA QUEUED
60 00 08 00 28 00 00 04 99 38 18 40 08 00:01:40.404 READ FPDMA QUEUED
60 00 08 00 20 00 00 04 99 40 00 40 08 00:01:40.379 READ FPDMA QUEUED
60 00 08 00 18 00 00 6d 5f 46 f8 40 08 00:01:40.356 READ FPDMA QUEUED
Error 1893 [20] occurred at disk power-on lifetime: 10279 hours (428 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 04 99 3b 08 40 00 Error: UNC at LBA = 0x04993b08 = 77150984
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 48 00 00 04 99 3b 08 40 08 00:01:08.857 READ FPDMA QUEUED
ef 00 10 00 02 00 00 00 00 00 00 a0 08 00:01:08.857 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 00 00 00 00 00 e0 08 00:01:08.856 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 00 00 00 00 00 a0 08 00:01:08.856 IDENTIFY DEVICE
ef 00 03 00 46 00 00 00 00 00 00 a0 08 00:01:08.856 SET FEATURES [Set transfer mode]
Error 1892 [19] occurred at disk power-on lifetime: 10279 hours (428 days + 7 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 04 99 3b 08 40 00 Error: UNC at LBA = 0x04993b08 = 77150984
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 20 00 00 04 99 3b 08 40 08 00:01:06.827 READ FPDMA QUEUED
60 00 08 00 18 00 00 04 99 3b 00 40 08 00:01:06.826 READ FPDMA QUEUED
60 00 08 00 10 00 00 04 99 39 10 40 08 00:01:06.826 READ FPDMA QUEUED
60 00 08 00 08 00 00 04 99 39 08 40 08 00:01:06.826 READ FPDMA QUEUED
60 00 08 00 00 00 00 04 99 39 00 40 08 00:01:06.808 READ FPDMA QUEUED
SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
SCT Status Version: 3
SCT Version (vendor specific): 258 (0x0102)
SCT Support Level: 1
Device State: Active (0)
Current Temperature: 27 Celsius
Power Cycle Min/Max Temperature: 23/27 Celsius
Lifetime Min/Max Temperature: 23/43 Celsius
Under/Over Temperature Limit Count: 0/0
Vendor specific:
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
SCT Temperature History Version: 2
Temperature Sampling Period: 1 minute
Temperature Logging Interval: 1 minute
Min/Max recommended Temperature: 0/60 Celsius
Min/Max Temperature Limit: -41/85 Celsius
Temperature History Size (Index): 478 (303)
Index Estimated Time Temperature Celsius
304 2016-10-10 07:41 35 ****************
... ..( 4 skipped). .. ****************
309 2016-10-10 07:46 35 ****************
310 2016-10-10 07:47 36 *****************
... ..(172 skipped). .. *****************
5 2016-10-10 10:40 36 *****************
6 2016-10-10 10:41 37 ******************
... ..( 7 skipped). .. ******************
14 2016-10-10 10:49 37 ******************
15 2016-10-10 10:50 36 *****************
... ..(125 skipped). .. *****************
141 2016-10-10 12:56 36 *****************
142 2016-10-10 12:57 37 ******************
... ..( 60 skipped). .. ******************
203 2016-10-10 13:58 37 ******************
204 2016-10-10 13:59 36 *****************
... ..( 12 skipped). .. *****************
217 2016-10-10 14:12 36 *****************
218 2016-10-10 14:13 37 ******************
... ..( 46 skipped). .. ******************
265 2016-10-10 15:00 37 ******************
266 2016-10-10 15:01 ? -
267 2016-10-10 15:02 23 ****
268 2016-10-10 15:03 24 *****
269 2016-10-10 15:04 25 ******
270 2016-10-10 15:05 26 *******
271 2016-10-10 15:06 26 *******
272 2016-10-10 15:07 27 ********
... ..( 2 skipped). .. ********
275 2016-10-10 15:10 27 ********
276 2016-10-10 15:11 33 **************
... ..( 2 skipped). .. **************
279 2016-10-10 15:14 33 **************
280 2016-10-10 15:15 34 ***************
... ..( 5 skipped). .. ***************
286 2016-10-10 15:21 34 ***************
287 2016-10-10 15:22 35 ****************
... ..( 15 skipped). .. ****************
303 2016-10-10 15:38 35 ****************
SCT Error Recovery Control command not supported
Device Statistics (GP/SMART Log 0x04) not supported
SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 2 0 Command failed due to ICRC error
0x0002 2 0 R_ERR response for data FIS
0x0003 2 0 R_ERR response for device-to-host data FIS
0x0004 2 0 R_ERR response for host-to-device data FIS
0x0005 2 0 R_ERR response for non-data FIS
0x0006 2 0 R_ERR response for device-to-host non-data FIS
0x0007 2 0 R_ERR response for host-to-device non-data FIS
0x0008 2 0 Device-to-host non-data FIS retries
0x0009 2 3 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 2 Device-to-host register FISes sent due to a COMRESET
0x000b 2 0 CRC errors within host-to-device FIS
0x000f 2 0 R_ERR response for host-to-device data FIS, CRC
0x0012 2 0 R_ERR response for host-to-device non-data FIS, CRC
0x8000 4 602 Vendor specific
root@sysresccd /root %
[/code]
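To pick the bad-sector-related attributes out of that wall of output, something like the following awk filter could be used. This is only a sketch: the function name is hypothetical, the choice of IDs 5, 196, 197 and 198 is my assumption about which attributes matter here, and the sample lines are copied from the table above so it runs without the drive attached.

```shell
#!/bin/sh
# Print bad-sector-related SMART attributes whose raw value is nonzero.
# Expects attribute-table lines on stdin, with the ID in the first
# column and RAW_VALUE in the last column.
flag_bad_sectors() {
    awk '$1 ~ /^(5|196|197|198)$/ && $NF + 0 > 0 { print $2 "=" $NF }'
}

# Sample lines copied from the smartctl output above.
flag_bad_sectors <<'EOF'
  5 Reallocated_Sector_Ct   PO--CK 200 200 140 - 0
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector  -O--CK 192 192 000 - 1373
198 Offline_Uncorrectable   ----CK 192 192 000 - 1370
EOF
```

With the drive attached, the same filter could presumably be fed from `smartctl -A /dev/sda` instead of the pasted lines.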
I watched a video about Ubuntu that used "Disk Utility" to show a dashboard with information about SMART-capable drives. It displayed their current condition (working or not) and had a button that claimed to try to repair the drive. I didn't find anything like this on Mint. I installed gnome-disk-utility, but the interface was very different and didn't have the SMART repair feature that interested me.
I know TestDisk exists, but I'm not confident about how to apply it to my case, where I'm trying to make the unbootable partition bootable again. I could probably figure out how to pull individual files off the partition copies, but that would not be necessary if I can get the original drive to boot.
Does anyone know of a "cookbook" style guide that I could use to try and restore the unbootable drive?
If not, can anyone provide some guidance?
Also, once I had ddrescued the image and log files to a secondary drive, I just copied and pasted those backups to a tertiary drive. Does copy and paste work fine in this case?
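One way I could imagine checking the copy myself is to compare checksums of the file on the secondary drive and its copy on the tertiary drive. A rough sketch, assuming sha256sum from coreutils is available; the function name and the paths in the usage comment are placeholders, not my real ones:

```shell
#!/bin/sh
# Return 0 if two files have identical SHA-256 digests, 1 if the
# digests differ, and 2 if either file cannot be read.
same_digest() {
    # Reading via stdin keeps the file name out of the sha256sum output,
    # so the two results can be compared as plain strings.
    a=$(sha256sum < "$1") || return 2
    b=$(sha256sum < "$2") || return 2
    [ "$a" = "$b" ]
}

# Hypothetical usage (paths are made up):
#   same_digest /mnt/secondary/sda.img /mnt/tertiary/sda.img && echo "copies match"
```

If the digests match for both the image and the log file, I would take that to mean the copy and paste did not alter them, but please correct me if that reasoning is off.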
TIA...