LinuxQuestions.org
Old 09-14-2016, 01:02 PM   #1
1337_powerslacker
Member
 
Registered: Nov 2009
Location: Kansas, USA
Distribution: Slackware64-15.0
Posts: 862
Blog Entries: 9

Mysterious errors with SSD


The past 2 weeks or so, I have been seeing this error early in the boot process:

Quote:
[ 8.507719] blk_update_request: I/O error, dev sda, sector 29390672
[ 8.509482] blk_update_request: I/O error, dev sda, sector 29390696
[ 8.511227] blk_update_request: I/O error, dev sda, sector 29390704
[ 8.512913] blk_update_request: I/O error, dev sda, sector 29390712
[ 8.514563] blk_update_request: I/O error, dev sda, sector 29390720
[ 8.516222] blk_update_request: I/O error, dev sda, sector 29390728
[ 8.517800] blk_update_request: I/O error, dev sda, sector 29390776
[ 8.519364] blk_update_request: I/O error, dev sda, sector 29390784
[ 8.520917] blk_update_request: I/O error, dev sda, sector 29390792
[ 8.522415] blk_update_request: I/O error, dev sda, sector 29390800
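When a log fills with messages like these, it can help to see at a glance how many errors there are and whether the sectors sit close together. The following is only a minimal sketch, assuming the kernel prints the exact "blk_update_request: I/O error, dev X, sector N" format quoted above; the function name `summarize_io_errors` is made up for illustration.

```shell
#!/bin/sh
# Summarize block-layer I/O errors from kernel log text read on stdin.
# Assumption: lines follow the "blk_update_request: I/O error, dev X, sector N"
# format quoted above. summarize_io_errors is a hypothetical helper name.
summarize_io_errors() {
    awk '
        /blk_update_request: I\/O error, dev / {
            # Pick out the values that follow the words "dev" and "sector".
            for (i = 1; i < NF; i++) {
                if ($i == "dev")    { dev = $(i + 1); sub(/,$/, "", dev) }
                if ($i == "sector") { sec = $(i + 1) + 0 }
            }
            count[dev]++
            if (!(dev in min) || sec < min[dev]) min[dev] = sec
            if (!(dev in max) || sec > max[dev]) max[dev] = sec
        }
        END {
            # One line per device: error count and lowest/highest sector seen.
            for (d in count)
                printf "%s: %d errors, sectors %.0f-%.0f\n", d, count[d], min[d], max[d]
        }
    '
}

# Typical use (reading the kernel log may need root on some systems):
#   dmesg | summarize_io_errors
```

For the ten lines quoted above this prints a single line for sda, with 10 errors between sectors 29390672 and 29390800, so all of them fall within 128 sectors (64 KiB at 512 bytes per sector) of each other.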
I Googled the error, and one of the sites I found said that the drive might be failing and suggested running the smartctl command. The specific command and results are shown below:

Code:
sudo smartctl -a -d ata /dev/sda
smartctl 6.5 2016-05-07 r4318 [x86_64-linux-4.7.3-ck3] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     SandForce Driven SSDs                                                                                
Device Model:     MKNSSDEC240GB                                                                                        
Serial Number:    ME151116100077F27                                                                                    
LU WWN Device Id: 5 888914 100077f27                                                                                   
Firmware Version: 604ABBF0                                                                                             
User Capacity:    240,057,409,536 bytes [240 GB]                                                                       
Sector Size:      512 bytes logical/physical                                                                           
Rotation Rate:    Solid State Device                                                                                   
Device is:        In smartctl database [for details use: -P show]                                                      
ATA Version is:   ATA8-ACS, ACS-2 T13/2015-D revision 3                                                                
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)                                                               
Local Time is:    Wed Sep 14 12:46:41 2016 CDT                                                                         
SMART support is: Available - device has SMART capability.                                                             
SMART support is: Enabled                                                                                              

=== START OF READ SMART DATA SECTION ===                                                                               
SMART overall-health self-assessment test result: PASSED                                                               

General SMART Values:                                                                                                  
Offline data collection status:  (0x00) Offline data collection activity                                               


Self-test execution status:      (   0) The previous self-test routine completed                                       


Total time to complete Offline                                                                                         
data collection:                (    0) seconds.                                                                       
Offline data collection                                                                                                
capabilities:                    (0x7d) SMART execute Offline immediate.                                               







SMART capabilities:            (0x0003) Saves SMART data before entering                                               


Error logging capability:        (0x01) Error logging supported.                                                       

Short self-test routine                                                                                                
recommended polling time:        (   1) minutes.                                                                       
Extended self-test routine                                                                                             
recommended polling time:        (  48) minutes.                                                                       
Conveyance self-test routine                                                                                           
recommended polling time:        (   2) minutes.                                                                       
SCT capabilities:              (0x0025) SCT Status supported.                                                          


SMART Attributes Data Structure revision number: 10                                                                    
Vendor Specific SMART Attributes with Thresholds:                                                                      
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE                       
  1 Raw_Read_Error_Rate     0x0032   120   120   050    Old_age   Always       -       0/0                             
  5 Retired_Block_Count     0x0033   100   100   003    Pre-fail  Always       -       0                               
  9 Power_On_Hours_and_Msec 0x0032   097   097   000    Old_age   Always       -       2864h+42m+18.480s               
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       535                             
171 Program_Fail_Count      0x000a   100   100   000    Old_age   Always       -       0                               
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0                               
174 Unexpect_Power_Loss_Ct  0x0030   000   000   000    Old_age   Offline      -       76                              
177 Wear_Range_Delta        0x0000   000   000   000    Old_age   Offline      -       0                               
181 Program_Fail_Count      0x000a   100   100   000    Old_age   Always       -       0                               
182 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0                               
187 Reported_Uncorrect      0x0012   100   100   000    Old_age   Always       -       0                               
190 Airflow_Temperature_Cel 0x0000   028   050   000    Old_age   Offline      -       28 (Min/Max 17/50)              
194 Temperature_Celsius     0x0022   028   050   000    Old_age   Always       -       28 (Min/Max 17/50)              
195 ECC_Uncorr_Error_Count  0x001c   120   120   000    Old_age   Offline      -       0/0
196 Reallocated_Event_Count 0x0033   100   100   003    Pre-fail  Always       -       0
201 Unc_Soft_Read_Err_Rate  0x001c   120   120   000    Old_age   Offline      -       0/0
204 Soft_ECC_Correct_Rate   0x001c   120   120   000    Old_age   Offline      -       0/0
230 Life_Curve_Status       0x0013   100   100   000    Pre-fail  Always       -       100
231 SSD_Life_Left           0x0013   100   100   010    Pre-fail  Always       -       8589934592
233 SandForce_Internal      0x0032   000   000   000    Old_age   Always       -       1572
234 SandForce_Internal      0x0032   000   000   000    Old_age   Always       -       492
241 Lifetime_Writes_GiB     0x0032   000   000   000    Old_age   Always       -       492
242 Lifetime_Reads_GiB      0x0032   000   000   000    Old_age   Always       -       857

SMART Error Log not supported

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
The site also said that the drive of the other user who posted this kind of data was faulty and needed to be replaced. This is a Mushkin drive that is only a few months old. Do I really need to replace it so soon?

Thanks for any input!

Regards,

Matt
 
Old 09-14-2016, 01:35 PM   #2
phenixia2003
Senior Member
 
Registered: May 2006
Location: France
Distribution: Slackware
Posts: 1,052

Hello,

I'm not an expert, but I see nothing wrong in your smartctl report. The drive seems to be in good health.

You can try to run a short and/or a long self-test:

Code:
$ smartctl --test=short /dev/sda

$ smartctl --test=long /dev/sda
According to your report, a short test should take about 1 minute and a long test about 48 minutes. When the test has finished, run smartctl as below to get the results:
Code:
$ smartctl -l selftest /dev/sda
You can also check your sata cable, and even test with another if possible.

--
SeB
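The steps above (start a test, wait, read the log) can be strung together in a small script. This is a rough sketch only: `polling_minutes` and `run_selftest` are hypothetical helper names, the waiting time is read from the "recommended polling time" lines that smartctl prints in its capabilities section (`smartctl -c`), and the whole thing needs root.

```shell
#!/bin/sh
# Print the recommended polling time (minutes) for one self-test type,
# parsed from "smartctl -c" style text on stdin.
# $1 is the label as printed in the report: Short, Extended or Conveyance.
# Both function names here are hypothetical, not part of smartmontools.
polling_minutes() {
    awk -v kind="$1" '
        # The label line is followed by "recommended polling time: ( N) minutes."
        index($0, kind " self-test routine") == 1 { want = 1; next }
        want && /recommended polling time:/ {
            gsub(/[^0-9]/, "", $0)   # keep only the digits
            print $0
            exit
        }
    '
}

# Start a self-test, sleep for the recommended time, then show the test log.
# $1 is the device, $2 is "short" or "long" as used in the commands above.
run_selftest() {
    dev=$1
    kind=$2
    case $kind in
        short) label=Short ;;
        long)  label=Extended ;;
        *)     echo "usage: run_selftest DEVICE short|long" >&2; return 2 ;;
    esac
    mins=$(smartctl -c "$dev" | polling_minutes "$label")
    smartctl --test="$kind" "$dev" || return 1
    # Fall back to one minute if the polling time could not be parsed.
    sleep $(( ${mins:-1} * 60 ))
    smartctl -l selftest "$dev"
}

# Typical use, as root:
#   run_selftest /dev/sda short
```

The polling time is only the drive's own estimate, so if the log still shows the test in progress, wait a little longer and run `smartctl -l selftest` again.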
 
Old 09-14-2016, 02:27 PM   #3
solarfields
Senior Member
 
Registered: Feb 2006
Location: slackalaxy.com
Distribution: Slackware, CRUX
Posts: 1,449

I had a disk that died showing this kind of error. I remember the I/O error thing. Back up if you can.
 
Old 09-14-2016, 02:50 PM   #4
bassmadrigal
LQ Guru
 
Registered: Nov 2003
Location: West Jordan, UT, USA
Distribution: Slackware
Posts: 8,792

As phenixia2003 said, your drive doesn't seem to be showing any SMART errors. The main attribute to watch on SSDs is Reallocated_Event_Count. It starts to climb once cells go beyond their usable write cycles and the drive has to start moving data off the worn-out cells. Your value is still at 0, so nothing has worn out according to the SMART data.

I would also do as phenixia2003 recommended and check that your cable is fully seated, or replace it with another cable.

But then, it is always possible that the error showing up in your dmesg is something that SMART doesn't log, so your drive could be a dud. If you have another computer, it might be worth testing the drive in there to see if you get the same warning (which would rule out the motherboard as the problem).

As always, it wouldn't hurt to back up your important stuff, just in case things go sideways.
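If you want to keep an eye on those counters without reading the whole table each time, something like the sketch below could work. It assumes the ten-column attribute table layout shown in the report above (`smartctl -A` prints just that table), and the attribute names are the ones this SandForce drive reports; other drives may use different names. `smart_raw_value` and `check_error_counters` are made-up helper names.

```shell
#!/bin/sh
# Print the RAW_VALUE column of one attribute from "smartctl -A" style text
# on stdin. $1 is the ATTRIBUTE_NAME exactly as it appears in the table.
# If a name appears more than once, only the first match is printed.
# Both function names here are hypothetical helpers.
smart_raw_value() {
    awk -v name="$1" '$2 == name { print $10; exit }'
}

# Report any of the wear/error counters whose raw value is not 0.
# Returns 0 if all of them are zero (or absent), 1 otherwise.
# Assumption: attribute names as reported by the drive in this thread.
check_error_counters() {
    report=$(cat)
    rc=0
    for attr in Retired_Block_Count Reallocated_Event_Count Reported_Uncorrect; do
        raw=$(printf '%s\n' "$report" | smart_raw_value "$attr")
        if [ -n "$raw" ] && [ "$raw" != "0" ]; then
            echo "$attr raw value is $raw"
            rc=1
        fi
    done
    return $rc
}

# Typical use, as root:
#   smartctl -A /dev/sda | check_error_counters && echo "counters are all zero"
```

Run against the report posted above, all three counters are 0, so nothing would be printed and the function would return success.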

Last edited by bassmadrigal; 09-14-2016 at 02:52 PM.
 
Old 09-14-2016, 02:57 PM   #5
Ilgar
Senior Member
 
Registered: Jan 2005
Location: Istanbul, Turkey
Distribution: Slackware64 15.0, Slackwarearm 14.2
Posts: 1,157

Like phenixia2003, I also think that the smartctl output looks OK, except maybe
Code:
174 Unexpect_Power_Loss_Ct  0x0030   000   000   000    Old_age   Offline      -       76
Normally, failed writes/reads should also show up in the relevant fields of the smartctl output. Perhaps it's not the drive itself but the wiring?
 
Old 09-14-2016, 03:11 PM   #6
kjhambrick
Senior Member
 
Registered: Jul 2005
Location: Round Rock, TX
Distribution: Slackware64 15.0 + Multilib
Posts: 2,159

1337_powerslacker --

Like Ilgar, it seems to me that the Unexpect_Power_Loss_Ct is suspicious.

If the cable or the interface on the MoBo is bad, all the drive would know is 'power loss'

OTOH, the OS might see some sort of request error.

Check the Cable ( and Card, if one exists ) ?

-- kjh( the only other oddity I see is that I've never had a drive in the smartctl database ... mine are all reported as not in the DataBase )
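One more thing that can be read off the report while checking the interface is the SATA link speed. The information section posted above says "6.0 Gb/s (current: 3.0 Gb/s)", meaning the drive supports 6.0 Gb/s but the link is running at 3.0 Gb/s. That may simply mean the port is SATA II, but the kernel can also drop the link speed after repeated link errors, so it is worth a look. A small sketch, assuming the "SATA Version is:" line format shown in the report (`smartctl -i` prints that section); `sata_link_speeds` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the maximum and the currently negotiated SATA speed, parsed from
# "smartctl -i" style text on stdin.
# Assumption: the line looks like
#   SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
# sata_link_speeds is a hypothetical helper name.
sata_link_speeds() {
    sed -n 's/^SATA Version is:.* \([0-9.]*\) Gb\/s (current: \([0-9.]*\) Gb\/s).*/max=\1 current=\2/p'
}

# Typical use, as root:
#   smartctl -i /dev/sda | sata_link_speeds
```

If the two numbers differ on a port that is known to be 6.0 Gb/s capable, the cable and connector are the first things to check.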
 
Old 09-17-2016, 03:13 AM   #7
BratPit
Member
 
Registered: Jan 2011
Posts: 250

http://lkcl.net/reports/ssd_analysis.html
https://news.ycombinator.com/item?id=10552218
https://forums.anandtech.com/threads...ction.2452606/

1. Check filesystem

OR

2. Do a so-called "secure erase" procedure

https://www.thomas-krenn.com/en/wiki/SSD_Secure_Erase
https://www.unixmen.com/secure-erase-your-ssd/

Last edited by BratPit; 09-17-2016 at 03:44 AM.
 
Old 09-17-2016, 06:06 AM   #8
kjhambrick
Senior Member
 
Registered: Jul 2005
Location: Round Rock, TX
Distribution: Slackware64 15.0 + Multilib
Posts: 2,159

Quote:
Originally Posted by BratPit
Eeek !!!

I've not read the content of the 'Secure Erase' links, but unless they've discovered such-a-thing as a 'non-destructive Secure Erase' then #2 above sounds like a last resort ?

Or am I missing something ?

If I understand Secure Erase, and the drive has actually gone bad, then a Sledge Hammer is MUCH quicker than Secure Erase, and it takes MUCH LESS effort to erase an SSD with a Sledge Hammer than it does with HDDs

Looking at 1337_powerslacker's `smartctl` Report, the drive itself looks OK.

IMO, check the Interface Components ( Cable and SATA Connector and optionally any SATA Card ) before doing anything else.

-- kjh
 
Old 09-17-2016, 07:29 AM   #9
BratPit
Member
 
Registered: Jan 2011
Posts: 250

Secure Erase is there to bring the disk back to life if possible. It does nothing for the mobo, SATA interface, filesystem, etc.
Those must be checked first.
It costs you your data.
If it still fails after that, your Sledge Hammer is a very reasonable option, but not the first one.

"Report, the drive itself looks OK"

Yes, "itself", but that tells you nothing about a possible power failure in the controller :-)

Sometimes SMART means not so smart.

Last edited by BratPit; 09-17-2016 at 07:38 AM.
 
Old 09-17-2016, 08:46 AM   #10
1337_powerslacker
Member
 
Registered: Nov 2009
Location: Kansas, USA
Distribution: Slackware64-15.0
Posts: 862

Original Poster
Blog Entries: 9

Well, as has been suggested several times, I opened my case and re-seated the power and SATA cables, so that there's no question of incomplete electrical contact on either. I booted up, and no errors occurred. So I think this was a one-off, where there may have been some inadvertent jostling of cables when I was fiddling in my computer's internals.

Thanks for the suggestions, everyone! It was much appreciated!

Happy Slacking!
 
  

