LinuxQuestions.org
12-10-2016, 03:32 PM   #1
qombi
Member
 
Registered: Jan 2016
Posts: 34

S.M.A.R.T Error


I assume I need to replace a hard drive, but what concerns me is possible data corruption on the file server. I reviewed recent backups, and it looks like no files were created or deleted other than the ones expected, so I assume all data is intact.

I have received multiple messages in syslog starting 12/8/16; examples are shown below:

Dec 10 15:44:07 username smartd[17919]: Device: /dev/sdb [SAT], 1 Currently unreadable (pending) sectors

Dec 9 05:45:57 username smartd[18336]: Device: /dev/sdb [SAT], 1 Offline uncorrectable sectors

I decided to run an extended smartctl self-test on the drive; here are the results. Everything passed, but I do see 1 pending sector listed here as well.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always   -           0
  3 Spin_Up_Time            0x0027   253   168   021    Pre-fail  Always   -           1100
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always   -           1902
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always   -           0
  9 Power_On_Hours          0x0032   053   053   000    Old_age   Always   -           34609
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always   -           0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always   -           255
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always   -           100
193 Load_Cycle_Count        0x0032   017   017   000    Old_age   Always   -           550601
194 Temperature_Celsius     0x0022   122   104   000    Old_age   Always   -           28
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always   -           0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always   -           1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always   -           0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline  -           0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     34597         -

Is there a way to mark that sector as bad so that nothing is written to it? I assume the drive is dying even though it passed the extended test. Any suggestions or comments? Any further testing or steps I should take? Thanks!
 
12-10-2016, 04:55 PM   #2
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,777

I don't know of anyone who would replace a drive because of a single bad sector. It's really strange that the extended test did not stop when it tried to read that sector. Maybe it's just marginal, and was read successfully by the test.

The way you fix a bad sector is by writing to it, which will cause it to be reallocated if it's actually bad, or else removed from the "pending" list if it can be written successfully. The Bad Block HOWTO has instructions for doing that, but that's all predicated on identifying the bad sector. Unless you can get the test to fail or see an I/O error from a "read" operation, there's no way to know where it is.
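
As a rough illustration of the Bad Block HOWTO arithmetic mentioned above, the sketch below maps a failing LBA to a filesystem block number and prints, without running it, the dd command that would overwrite that one block. It assumes 512-byte logical sectors. The LBA, the partition start sector, the block size, and /dev/sdb1 are hypothetical values, and both helper names are made up for this example.

```shell
# Map a disk LBA (in 512-byte sectors) to a filesystem block number.
# Usage: lba_to_fs_block LBA PARTITION_START_SECTOR FS_BLOCK_SIZE
lba_to_fs_block() {
    echo $(( ($1 - $2) * 512 / $3 ))
}

# Print (do NOT run) the dd command that would zero one filesystem block.
# Usage: zero_block_cmd PARTITION FS_BLOCK_SIZE BLOCK_NUMBER
zero_block_cmd() {
    echo "dd if=/dev/zero of=$1 bs=$2 count=1 seek=$3"
}

# Hypothetical values, for illustration only.
lba=1000000        # LBA reported by a failed self-test or an I/O error
part_start=2048    # first sector of the partition (e.g. from fdisk -l)
bs=4096            # filesystem block size

block=$(lba_to_fs_block "$lba" "$part_start" "$bs")
zero_block_cmd /dev/sdb1 "$bs" "$block"
```

Zeroing the block destroys whatever data it held, so it is worth finding out which file (if any) owns the block before running the printed command for real.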

I suppose you could run "dd if=/dev/sd{X} of=/dev/sd{X} bs=64k" to copy the whole disk back to itself. I really don't like that idea because of the chance of a memory problem or other glitch silently corrupting the data, but I can't think of any other way to rewrite a bad sector that resists being identified.
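
Before resorting to a full self-copy, a read-only pass over the disk might be enough to make the sector show itself as an I/O error. The following is only a sketch of that idea: scan_unreadable is a hypothetical helper that reads a device (or file) chunk by chunk and prints the byte offset of every chunk that dd fails to read. It never writes to the device. The size has to be supplied by the caller; for a block device it can be obtained with "blockdev --getsize64".

```shell
# Read PATH in fixed-size chunks and print the byte offset of each chunk
# that cannot be read. Read-only: nothing is written to PATH.
# Usage: scan_unreadable PATH SIZE_BYTES [CHUNK_BYTES]
scan_unreadable() {
    path=$1
    size=$2
    chunk=${3:-65536}
    offset=0
    while [ "$offset" -lt "$size" ]; do
        # dd exits non-zero if the chunk cannot be opened or read.
        if ! dd if="$path" of=/dev/null bs="$chunk" \
                skip=$((offset / chunk)) count=1 2>/dev/null; then
            echo "$offset"
        fi
        offset=$((offset + chunk))
    done
}

# Example invocation (as root, device name is a placeholder):
#   scan_unreadable /dev/sdX "$(blockdev --getsize64 /dev/sdX)"
```

Starting one dd per chunk is slow on a large disk, and reads can be satisfied from the page cache rather than the platters, so treat this as a demonstration of the approach. The badblocks utility in its default read-only mode performs the same kind of scan far more efficiently.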

Last edited by rknichols; 12-10-2016 at 10:43 PM. Reason: spelling
 
12-10-2016, 04:56 PM   #3
jefro
Moderator
 
Registered: Mar 2008
Posts: 21,978

If you believe these guys, the drive is at risk of failing if the numbers don't improve: https://kb.acronis.com/content/9133
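
One way to watch whether those numbers improve is to pull just the sector-related counters out of the attribute table at intervals and compare them. A minimal sketch, assuming the ten-column "smartctl -A" layout shown in the first post; sector_health is a hypothetical helper name.

```shell
# Read `smartctl -A` output on stdin and print NAME=RAW_VALUE for the
# attributes that track reallocated and pending sectors.
sector_health() {
    awk '$1 ~ /^(5|196|197|198)$/ { print $2 "=" $10 }'
}

# Example invocation (as root):
#   smartctl -A /dev/sdb | sector_health
```

A pending count that drops back to 0 without the reallocated count rising would suggest the sector was rewritten successfully; counts that keep climbing point the other way.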

I think most file systems have a way to mark problems. At one time Norton had a way to trick drives in a number of ways to extend their life. We used to do low level formats on SCSI drives and keep them going for a few decades.

Many of the OEM drive makers offer a bootable image that lets you run factory tests and get better information. I'd try that, or at least get Ultimate Boot CD to run generic tests.
 
12-10-2016, 05:52 PM   #4
qombi
Member
 
Registered: Jan 2016
Posts: 34

Original Poster
Thanks guys, I will try a few of these steps and see what I get.
 
  


All times are GMT -5.