LinuxQuestions.org
Old 08-02-2013, 04:04 PM   #1
alaios
Senior Member
 
Registered: Jan 2003
Location: Aachen
Distribution: Opensuse 11.2 (nice and steady)
Posts: 2,133

Rep: Reputation: 45
Reuse a faulty disk


Hi all,
I have a hard disk that has had many hardware problems in the past.
I am attaching the SMART output:
Code:
smartctl -a /dev/sda
smartctl 6.0 2012-10-10 r3643 [x86_64-linux-3.1.10-1.16-default] (SUSE RPM)
Copyright (C) 2002-12, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Scorpio Blue Serial ATA
Device Model:     WDC WD3200BEVT-22ZCT0
Serial Number:    WD-WXF0A99P7681
LU WWN Device Id: 5 0014ee 20375eb73
Firmware Version: 11.01A11
User Capacity:    320,072,933,376 bytes [320 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.5, 3.0 Gb/s
Local Time is:    Tue Mar 19 08:17:11 2013 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                ( 9960) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 118) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   198   198   051    Pre-fail  Always       -       23359
  3 Spin_Up_Time            0x0027   184   183   021    Pre-fail  Always       -       1758
  4 Start_Stop_Count        0x0032   096   096   000    Old_age   Always       -       4244
  5 Reallocated_Sector_Ct   0x0033   186   186   140    Pre-fail  Always       -       111
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       7379
 10 Spin_Retry_Count        0x0033   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   097   097   000    Old_age   Always       -       3402
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       75
193 Load_Cycle_Count        0x0032   124   124   000    Old_age   Always       -       228553
194 Temperature_Celsius     0x0022   114   074   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   184   184   000    Old_age   Always       -       16
197 Current_Pending_Sector  0x0032   198   198   000    Old_age   Always       -       112
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0009   100   253   051    Pre-fail  Offline      -       0

SMART Error Log Version: 1
ATA Error Count: 22858 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 22858 occurred at disk power-on lifetime: 7374 hours (307 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 2c 7f e4 40  Error: WP at LBA = 0x00e47f2c = 14974764

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 08 10 e8 29 7c 01 00      03:46:29.525  WRITE FPDMA QUEUED
  61 08 08 98 28 7a 01 00      03:46:29.524  WRITE FPDMA QUEUED
  61 08 f0 68 3f 53 06 00      03:46:29.524  WRITE FPDMA QUEUED
  61 08 c8 a8 b4 81 01 00      03:46:29.524  WRITE FPDMA QUEUED
  61 08 b0 b8 29 7c 01 00      03:46:29.524  WRITE FPDMA QUEUED

Error 22857 occurred at disk power-on lifetime: 7374 hours (307 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 2c 7f e4 40  Error: UNC at LBA = 0x00e47f2c = 14974764

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 28 80 08 7f e4 24 00      03:46:25.969  READ FPDMA QUEUED
  60 40 78 ba 80 91 12 00      03:46:25.969  READ FPDMA QUEUED
  60 80 70 60 68 25 0e 00      03:46:25.969  READ FPDMA QUEUED
  61 03 68 60 36 a3 09 00      03:46:25.969  WRITE FPDMA QUEUED
  60 40 60 10 cb e8 01 00      03:46:25.969  READ FPDMA QUEUED

Error 22856 occurred at disk power-on lifetime: 7374 hours (307 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 2c 7f e4 40  Error: WP at LBA = 0x00e47f2c = 14974764

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 08 18 a0 28 7a 01 00      03:46:22.588  WRITE FPDMA QUEUED
  61 08 10 b0 32 7b 01 00      03:46:22.588  WRITE FPDMA QUEUED
  61 08 08 90 28 7a 01 00      03:46:22.587  WRITE FPDMA QUEUED
  61 08 00 98 28 7a 01 00      03:46:22.587  WRITE FPDMA QUEUED
  61 01 f8 c8 d8 ff 02 00      03:46:22.586  WRITE FPDMA QUEUED

Error 22855 occurred at disk power-on lifetime: 7374 hours (307 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 2c 7f e4 40  Error: UNC at LBA = 0x00e47f2c = 14974764

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 28 80 08 7f e4 24 00      03:46:19.106  READ FPDMA QUEUED
  60 38 78 3a 80 91 12 00      03:46:19.105  READ FPDMA QUEUED
  60 80 70 10 67 25 0e 00      03:46:19.105  READ FPDMA QUEUED
  60 20 68 40 36 dd 01 00      03:46:19.105  READ FPDMA QUEUED
  60 38 60 62 e0 86 01 00      03:46:19.105  READ FPDMA QUEUED

Error 22854 occurred at disk power-on lifetime: 7374 hours (307 days + 6 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 2c 7f e4 40  Error: UNC at LBA = 0x00e47f2c = 14974764

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 08 40 0e 07 a7 01 00      03:46:15.868  READ FPDMA QUEUED
  61 08 38 a0 28 7a 01 00      03:46:15.868  WRITE FPDMA QUEUED
  60 1b 30 16 07 a7 01 00      03:46:15.868  READ FPDMA QUEUED
  61 08 28 98 32 7b 01 00      03:46:15.867  WRITE FPDMA QUEUED
  60 28 20 08 7f e4 24 00      03:46:15.867  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       60%      7378         31108065

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

I have recovered the disk's contents to a new one with dd_rescue, so the old disk is now unused. I would like to know whether I could do some kind of low-level format to get rid of the bad areas, so that I could keep it as a disk for moving files between systems.

I would like to thank you in advance for your help

Regards
Alex
 
Old 08-02-2013, 04:43 PM   #2
Doc CPU
Senior Member
 
Registered: Jun 2011
Location: Stuttgart, Germany
Distribution: Mint, Debian, Gentoo, Win 2k/XP
Posts: 1,094

Rep: Reputation: 341
Hi there,

Quote:
Originally Posted by alaios View Post
I have a hard disk that has had many hardware problems in the past.
so you have a good reason not to trust that unit any more.

Quote:
Originally Posted by alaios View Post
I have recovered the disk's contents to a new one with dd_rescue, so the old disk is now unused. I would like to know whether I could do some kind of low-level format to get rid of the bad areas, so that I could keep it as a disk for moving files between systems.
I haven't even bothered to look closely at the SMART data. If you say you had some trouble with the drive, I wouldn't use it for precious data any longer if I were you. Maybe for temporary data, or maybe as a transport medium. But confidence in the drive's reliability would be completely lost for me.

Besides, it's a WD drive. Western Digital has a long tradition of untrustworthiness, though I have to admit their reliability has improved during the last five years.

[X] Doc CPU
 
Old 08-02-2013, 07:13 PM   #3
PTrenholme
Senior Member
 
Registered: Dec 2004
Location: Olympia, WA, USA
Distribution: Fedora, (K)Ubuntu
Posts: 4,157

Rep: Reputation: 333
A quick Google search for FPDMA found several comments that FPDMA errors are usually cable or power supply problems, not physical disk problems. So, monitor your NEW drive for this type of error. If you changed the drive cable with the drive (usually a good idea), and the errors don't reappear, try mounting the old drive with a new cable and running it for a while. Perhaps you now have two good drives. (Maybe you could use them as a mirror set. :)

If you changed the cable and the problem happens with the new drive, check your power supply.

By the way, FPDMA stands for First-Party DMA: READ FPDMA QUEUED and WRITE FPDMA QUEUED are the queued (NCQ) read and write commands, in which the drive itself sets up the DMA transfer to or from the computer's memory. If the DMA mechanism itself were faulty, I would expect almost any disk usage to fail, not just a few isolated operations.
 
Old 08-02-2013, 09:10 PM   #4
jefro
Guru
 
Registered: Mar 2008
Posts: 12,329

Rep: Reputation: 1561
We have performed low-level formats on SCSI disks for decades. It is usually a good first step. At one time you could get factory low-level tools; they are not so common now. I agree that it could be any number of issues, but all the old age deals prove it is old too. Norton used to make a tool that moved sectors a bit.
 
Old 08-02-2013, 09:30 PM   #5
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 2,225

Rep: Reputation: Disabled
The S.M.A.R.T. data says this disk has 112 bad, un-reallocated sectors (the Current_Pending_Sector value) in addition to 111 reallocated sectors (the Reallocated_Sector_Ct value). That should be enough to make anyone feel uncomfortable about using this disk.
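
To watch just those two counters over time, something like the following sketch could help. It assumes the attribute table layout shown in the first post (attribute name in column 2, RAW_VALUE in column 10); the function name bad_sector_counts is made up for illustration.

```shell
#!/bin/sh
# Print the raw values of the two bad-sector counters from `smartctl -A`
# output read on standard input.
# Assumed layout (as in the first post): column 2 = name, column 10 = RAW_VALUE.
bad_sector_counts() {
    awk '$2 == "Reallocated_Sector_Ct"  { printf "reallocated=%s\n", $10 }
         $2 == "Current_Pending_Sector" { printf "pending=%s\n", $10 }'
}

# Typical use on a live system (needs root; /dev/sdX is a placeholder):
#   smartctl -A /dev/sdX | bad_sector_counts
```

If either number keeps rising between runs, the drive is still deteriorating.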

It is possible (but not at all likely) that the number of bad sectors will remain constant rather than growing. You can test this hypothesis by writing to the sectors currently marked as bad (use dd, badblocks or hdparm), as this will force a reallocation. Even if the drive should appear defect-free afterwards, you would need to keep a very close eye on the S.M.A.R.T. parameters for any indication of more defects.
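
As a rough illustration of the dd variant, the sketch below only prints the command for one sector instead of running it, because the write destroys whatever is stored there. It assumes 512-byte logical sectors (as the SMART output in the first post reports) and uses /dev/sdX as a placeholder; zero_sector_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Dry run: print (do not execute) a dd command that would overwrite a single
# 512-byte logical sector with zeros. Running the printed command destroys
# that sector's contents.
zero_sector_cmd() {
    dev=$1   # target device, e.g. /dev/sdX (placeholder)
    lba=$2   # sector number as reported by smartctl
    echo "dd if=/dev/zero of=$dev bs=512 count=1 seek=$lba oflag=direct"
}

# LBA taken from the error log in the first post; 0x00e47f2c is the same
# sector written in hexadecimal.
zero_sector_cmd /dev/sdX 14974764
zero_sector_cmd /dev/sdX $((0x00e47f2c))
```

hdparm offers a --write-sector option for the same purpose; check its man page before using it, since it also overwrites data.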

Given the price of hard drives these days, it makes little sense to keep using a flaky drive unless you're doing it for the educational value.
 
1 members found this post helpful.
Old 02-04-2014, 08:13 AM   #6
alaios
Senior Member
 
Registered: Jan 2003
Location: Aachen
Distribution: Opensuse 11.2 (nice and steady)
Posts: 2,133

Original Poster
Rep: Reputation: 45
Hi there,
Sorry for not writing for so long, but my daughter was born in the meantime.

I am thinking of reusing this hard disk as a big flash disk, for copying non-important files (e.g. movies to the TV, audio files to the audio player, etc.).

What procedure would I have to follow to protect myself from the allocated/unallocated bad sectors?

I would like to thank you in advance for your help

Regards
Alex
 
Old 02-04-2014, 08:40 AM   #7
TobiSGD
Moderator
 
Registered: Dec 2009
Location: Hanover, Germany
Distribution: Main: Gentoo Others: What fits the task
Posts: 15,799
Blog Entries: 2

Rep: Reputation: 4201
Quote:
Originally Posted by alaios View Post
What procedure would I have to follow to protect myself from the allocated/unallocated bad sectors?
There is none. You may be able to avoid the already existing bad sectors, but in my experience, if you have a few you will get more over time, in areas that were good before.
Don't waste your time on that; you can't fix hardware with software.
 
Old 02-04-2014, 12:23 PM   #8
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 1,777

Rep: Reputation: 741
If you just overwrite the whole drive with zeros, the drive will reallocate the current pending sectors to spares. But, as TobiSGD and Ser Olmy have said, it is likely that more bad sectors will develop in the future.

As for the comment about "all the old age deals," that column is just telling you what it would mean if there were something other than "-" in the WHEN_FAILED column. At the moment, none of the SMART parameters have gone past what the manufacturer considers to be their failure thresholds, though for bad sector counts most people would consider the drive to be in dire need of replacement long before it uses up most of its spare sectors and the SMART attribute declares failure.
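
To see how far each attribute still is from its failure threshold, a small filter along these lines may be useful. It is only a sketch and assumes the table layout from the first post (VALUE in column 4, THRESH in column 6, TYPE in column 7); smart_margins is an illustrative name.

```shell
#!/bin/sh
# For every attribute row of `smartctl -A` output on standard input, print the
# normalized VALUE, the THRESH, and the difference between them.
# Rows are recognized by their TYPE column (Pre-fail or Old_age).
smart_margins() {
    awk '$7 == "Pre-fail" || $7 == "Old_age" {
             printf "%s value=%d thresh=%d margin=%d\n", $2, $4, $6, $4 - $6
         }'
}

# Typical use (needs root; /dev/sdX is a placeholder):
#   smartctl -A /dev/sdX | smart_margins
```

A margin that is still positive only means the manufacturer's threshold has not been crossed; as said above, that is not the same as the drive being healthy.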
 
Old 02-04-2014, 12:41 PM   #9
metaschima
Senior Member
 
Registered: Dec 2013
Distribution: Slackware
Posts: 1,687

Rep: Reputation: 416
Wiping the disk with zeros will fix soft errors, but not hard errors. You can try it and keep using the disk, but don't put any important information on the disk that you haven't backed up.
 
Old 02-06-2014, 05:25 AM   #10
alaios
Senior Member
 
Registered: Jan 2003
Location: Aachen
Distribution: Opensuse 11.2 (nice and steady)
Posts: 2,133

Original Poster
Rep: Reputation: 45
Thanks for the answers. Can you tell me which tools I need to perform those operations? As I said, I need your help to keep using this drive just as a large USB disk inside the house, for copying non-important information. Even in that case I have a reason to keep the disk.

Alex
 
Old 02-06-2014, 05:44 AM   #11
TobiSGD
Moderator
 
Registered: Dec 2009
Location: Hanover, Germany
Distribution: Main: Gentoo Others: What fits the task
Posts: 15,799
Blog Entries: 2

Rep: Reputation: 4201
You have the tools already, just use dd to overwrite the disk with zeroes. Or use the badblocks utility in its destructive mode (-w option).
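
A minimal sketch of what those two commands might look like. The helper only prints them, since both erase the entire target; /dev/sdX is a placeholder for the old, unmounted disk, and wipe_cmds is a made-up name. The block size and the extra badblocks flags (-s for progress, -v for verbose) are choices made here, not requirements.

```shell
#!/bin/sh
# Dry run: print the two whole-disk overwrite commands mentioned above.
# Both destroy ALL data on the target device, so nothing is executed here.
wipe_cmds() {
    dev=$1   # whole-disk device, e.g. /dev/sdX (placeholder); must be unmounted
    echo "dd if=/dev/zero of=$dev bs=1M"   # overwrite everything with zeros
    echo "badblocks -wsv $dev"             # destructive write-mode test
}

wipe_cmds /dev/sdX
```

Double-check the device name (for example with lsblk) before running either printed command as root, and compare the SMART counters before and after.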
 
  


All times are GMT -5. The time now is 12:59 AM.