Hard drive failing

PsychoHermit · 12-31-2021, 09:17 PM

Hi Folks,

It looks like my hard drive is on its last legs. I didn't want to spend the money, but it looks like I get to upgrade to an SSD.

Code:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   099   062    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0025   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0023   168   100   033    Pre-fail  Always       -       1
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       3201
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002f   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   080   080   000    Old_age   Always       -       8950
 10 Spin_Retry_Count        0x0033   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       3200
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0033   100   100   097    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       592709156864
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       81606148102
190 Airflow_Temperature_Cel 0x0022   070   049   045    Old_age   Always       -       30 (Min/Max 20/31)
191 G-Sense_Error_Rate      0x0032   082   082   000    Old_age   Always       -       4689
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       851981
193 Load_Cycle_Count        0x0032   059   059   000    Old_age   Always       -       419500
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   100   100   000    Old_age   Always       -       0
223 Load_Retry_Count        0x002a   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

--glenn

rknichols · 12-31-2021, 10:22 PM

I don't see anything wrong in those attributes. The Load_Cycle_Count and Power-Off_Retract_Count are awfully high, but so far they don't seem to have hurt anything. Having the drive power cycling every 38 seconds or so (851981 times in just 8950 hours) seems like a configuration problem.

computersavvy · 12-31-2021, 10:39 PM

smartctl can usually report a lot more if you use:

Code:

sudo smartctl -a /dev/sdX

Unless you are certain it is failing (e.g. you are getting a lot of corruption that requires fsck or similar), I would dig deeper before automatically replacing it.

With that said, the output of the smartctl command above is a better judge of the status than the little bit you posted here.

Also remember that a backup is always recommended just in case of catastrophic failure.

syg00 · 12-31-2021, 11:01 PM

Well, I certainly wouldn't be happy with those numbers for 187, 188.

mrmazda · 12-31-2021, 11:14 PM

Quote:

Originally Posted by PsychoHermit

Hi Folks,

It looks like my hard drive is on its last legs. I didn't want to spend the money, but it looks like I get to upgrade to an SSD.

Code:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0

From these I see no reason to expect imminent failure, especially with so few power-on hours. Don't be afraid to upgrade to an SSD anyway. Unless your PC is running the disk on an old SATA-1 controller, the upgrade should be worth it for the speed increase. This HDD looks like a great candidate for backing up your SSD.

lvm_ · 01-01-2022, 03:49 AM

Quote:

Originally Posted by syg00

Well, I certainly wouldn't be happy with those numbers for 187, 188.

Nothing wrong with those - you should check the normalised values, not the raw ones. It is probably a Seagate drive - they always have insane raw values. And if it is, it should be replaced even if it looks perfectly healthy, as this one does.
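
To make "check normalised, not raw" concrete, here is a minimal Python sketch (not part of smartmontools). It parses a few rows copied from the table in the first post and flags any attribute whose VALUE or WORST has dropped to THRESH or below. The function names are made up for illustration, and the column positions are assumed to match that smartctl output.

```python
# Rows copied from the attribute table in the first post.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   099   062    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   080   080   000    Old_age   Always       -       8950
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       592709156864
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       81606148102
193 Load_Cycle_Count        0x0032   059   059   000    Old_age   Always       -       419500
"""


def parse_attributes(text):
    """Return one dict per attribute row; header and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW...
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        rows.append({
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
        })
    return rows


def at_or_below_threshold(rows):
    """Attributes whose normalised VALUE or WORST is <= a non-zero THRESH."""
    return [r for r in rows
            if r["thresh"] > 0 and min(r["value"], r["worst"]) <= r["thresh"]]


flagged = at_or_below_threshold(parse_attributes(SAMPLE))
print([r["name"] for r in flagged])
```

Attributes 187 and 188 have a threshold of 000, so however large their raw values get, the normalised comparison never marks them as failed.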

rknichols · 01-01-2022, 08:40 AM

Quote:

Originally Posted by syg00

Well, I certainly wouldn't be happy with those numbers for 187, 188.

Convert those values to hex and look at the low-order bits. Seagate drives (I'm guessing this is a Seagate hybrid drive) typically have some raw values for which the low-order bits are the actual exception count and the higher-order bits are the number of operations.
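
As a rough illustration of that conversion, the Python sketch below splits the two raw values from the first post into 16-bit words. The 16-bit field width is an assumption about how this drive packs its counters, not something documented in this thread, and `split_raw` is just a helper name picked for the example.

```python
def split_raw(raw):
    """Split a 48-bit SMART raw value into (high, mid, low) 16-bit words."""
    return (raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF


# Raw values for attributes 187 and 188 from the first post.
for attr_id, raw in ((187, 592709156864), (188, 81606148102)):
    high, mid, low = split_raw(raw)
    print(f"{attr_id}: 0x{raw:012x} -> high={high} mid={mid} low={low}")

# Prints:
# 187: 0x008a00380000 -> high=138 mid=56 low=0
# 188: 0x0013001b0006 -> high=19 mid=27 low=6
```

Under that assumption the low-order word is 0 for attribute 187 and 6 for attribute 188, which is a long way from the alarming decimal numbers.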

smallpond · 01-01-2022, 10:59 AM

Unless you're really into parsing low-level details, you can just ask smartctl for its overall evaluation with -H:

Code:

sudo smartctl -H /dev/nvme0n1
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.4-101.fc34.x86_64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
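
If you want to script that check, smartctl also reports its findings in the exit status, as a bit mask described in the EXIT STATUS section of its man page. The sketch below is one possible way to decode it in Python. The bit meanings are paraphrased from the man page, and `decode_exit_status` and `health_check` are hypothetical helper names, not part of smartmontools.

```python
import subprocess

# Bit meanings paraphrased from the EXIT STATUS section of smartctl(8).
EXIT_BITS = {
    0: "command line did not parse",
    1: "device open failed",
    2: "a SMART or ATA command to the disk failed",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes found at or below threshold",
    5: "attributes were at or below threshold in the past",
    6: "device error log contains errors",
    7: "device self-test log contains errors",
}


def decode_exit_status(status):
    """Return the messages for every bit set in a smartctl exit status."""
    return [msg for bit, msg in EXIT_BITS.items() if status & (1 << bit)]


def health_check(device):
    """Run `smartctl -H` on a device (needs root and smartmontools installed)."""
    result = subprocess.run(["smartctl", "-H", device],
                            capture_output=True, text=True)
    return result.returncode, decode_exit_status(result.returncode)
```

Calling `health_check("/dev/sdX")` with your own device name returns the raw status plus a list of readable findings. An empty list means smartctl had nothing to complain about.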

rclark · 01-01-2022, 11:26 AM

Looks OK. But hey ... any excuse to spend $100 on an SSD sounds good to me. My spinning rust storage devices are now only for backups. All our laptops, desktops, and a data server are running on SSDs.

syg00 · 01-01-2022, 04:55 PM

Never too late to learn - thanks for the education.

PsychoHermit · 01-01-2022, 08:23 PM

I guess I will hold off replacing the drive and see what develops. It may keep working for quite some time.

Thanks,
--glenn

sundialsvcs · 01-03-2022, 09:33 AM

If you think that a drive might be headed for failure, get rid of the damned thing.

SSDs, both internal and external, are now insanely big and no longer expensive. I have several external drives attached to all of my computers, for continuous backups and other purposes.

If you are using LVM (Logical Volume Management), as you should be, you can actually migrate all of the data off the failing drive and onto the new one automagically ... and without downtime.
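
For anyone who has not done such a migration before, the usual sequence is built from the standard LVM tools pvcreate, vgextend, pvmove, vgreduce and pvremove. The Python sketch below only prints the commands as a dry run and executes nothing. The volume group name and both device paths are placeholders I made up, so substitute your own and check the man pages before running anything as root.

```python
import shlex


def migration_commands(vg, old_pv, new_pv):
    """Build the command sequence that moves a volume group off old_pv."""
    return [
        ["pvcreate", new_pv],         # initialise the new disk as a physical volume
        ["vgextend", vg, new_pv],     # add it to the existing volume group
        ["pvmove", old_pv, new_pv],   # move all extents while volumes stay online
        ["vgreduce", vg, old_pv],     # drop the old disk from the volume group
        ["pvremove", old_pv],         # wipe the LVM label from the old disk
    ]


# Dry run with placeholder names: nothing is executed, only printed.
for argv in migration_commands("vg0", "/dev/sdX2", "/dev/sdY1"):
    print(shlex.join(argv))
```

The pvmove step is the one that does the actual copying and can take hours on a large disk.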