LinuxQuestions.org
Old 12-23-2011, 03:04 PM   #1
Kallaste
Member
 
Registered: Nov 2011
Distribution: Slackware
Posts: 363

Rep: Reputation: 85
Smartctl read failure; Is my HD failing?


My Xubuntu installation has failed repeatedly since I first installed it a month ago, so I am checking my hard disk from a Parted Magic live CD with smartctl (via the GUI, GSmartControl) before reinstalling yet again. I am getting a "completed with read failure" message for every test I run. However, I have never been through this before and I do not know whether this definitely means the disk is going bad, especially since it also says there are no errors logged and the overall health self-assessment test was passed. If someone out there could help, I would be grateful.

Here is the log from the short test:

Code:
smartctl 5.41 2011-06-09 r3365 [i686-linux-3.0.4-pmagic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint F3
Device Model:     SAMSUNG HD103SJ
Serial Number:    S246J9AZ201015
LU WWN Device Id: 5 0024e9 201e7b7d5
Firmware Version: 1AJ10001
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 6
Local Time is:    Fri Dec 23 20:24:19 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 121)	The previous self-test completed having
					the read element of the test failed.
Total time to complete Offline 
data collection: 		( 9540) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 159) minutes.
SCT capabilities: 	       (0x003f)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       10528
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   071   050   025    Pre-fail  Always       -       8977
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       469
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       4860
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       483
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       9
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   053   048   000    Old_age   Always       -       47 (Min/Max 18/52)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       10
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       483

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%      4860         280578
# 2  Extended offline    Completed: read failure       90%      4860         280578

Note: selective self-test log revision number (0) not 1 implies that no selective self-test has ever been run
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Completed_read_failure [90% left] (0-65535)
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
And here is the output from the extended test (which also only took a few seconds). I think it is the same as the other one, but maybe I am missing something.

Code:
smartctl 5.41 2011-06-09 r3365 [i686-linux-3.0.4-pmagic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint F3
Device Model:     SAMSUNG HD103SJ
Serial Number:    S246J9AZ201015
LU WWN Device Id: 5 0024e9 201e7b7d5
Firmware Version: 1AJ10001
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 6
Local Time is:    Fri Dec 23 20:57:10 2011 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 121)	The previous self-test completed having
					the read element of the test failed.
Total time to complete Offline 
data collection: 		( 9540) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 159) minutes.
SCT capabilities: 	       (0x003f)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       10528
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   071   050   025    Pre-fail  Always       -       8977
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       469
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       4861
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       483
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       9
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   053   048   000    Old_age   Always       -       47 (Min/Max 18/52)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       10
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       483

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      4861         280578
# 2  Short offline       Completed: read failure       90%      4860         280578
# 3  Extended offline    Completed: read failure       90%      4860         280578

Note: selective self-test log revision number (0) not 1 implies that no selective self-test has ever been run
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Completed_read_failure [90% left] (0-65535)
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Thanks.
 
Old 12-23-2011, 05:46 PM   #2
norobro
Member
 
Registered: Feb 2006
Distribution: Debian Sid
Posts: 792

Rep: Reputation: 331
Perhaps these links will help:

Bad block HOWTO for smartmontools - By Bruce Allen, the author of smartmontools

Smartmontools and fixing Unreadable Disk Sectors
 
Old 12-23-2011, 08:02 PM   #3
Kallaste
Member
 
Registered: Nov 2011
Distribution: Slackware
Posts: 363

Original Poster
Rep: Reputation: 85
This is good information, but as I said, this is the first time I have dealt with this issue and I do not want to make a decision about my hard disk on the basis of my inexpert interpretation of some guides. Isn't there anyone willing to offer me a well-founded opinion of whether my disk is going bad?
 
Old 12-23-2011, 09:11 PM   #4
norobro
Member
 
Registered: Feb 2006
Distribution: Debian Sid
Posts: 792

Rep: Reputation: 331
Your software tool gave you a "well-founded" opinion of the condition of your hard drive.

Maybe this will convince you: link

My opinion, and you know the old saying about opinions, is that your drive is not failing.
 
Old 12-23-2011, 09:39 PM   #5
Kallaste
Member
 
Registered: Nov 2011
Distribution: Slackware
Posts: 363

Original Poster
Rep: Reputation: 85
Quote:
Originally Posted by norobro View Post
Your software tool gave you a "well-founded" opinion of the condition of your hard drive.

Maybe this will convince you: link

My opinion, and you know the old saying about opinions, is that your drive is not failing.
Actually, I'm pretty sure software does not issue opinions.
 
Old 12-24-2011, 10:19 AM   #6
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
Quote:
Originally Posted by BloomingNutria View Post
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 3
...
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 90% 4860 280578
# 2 Extended offline Completed: read failure 90% 4860 280578
The drive currently has 3 unreadable sectors that it will reallocate to spare sectors the next time they are written. The Bad Block HOWTO at http://smartmontools.sourceforge.net/badblockhowto.html shows the steps you need to take to recover with minimal data loss.

Is your drive failing? That depends. There are various things (mechanical shock, power supply glitches, ...) that can cause a sector or two to go bad without suggesting that catastrophic failure is imminent. After you get the drive cleaned up, you should keep a close watch on its Reallocated_Sector_Ct and Current_Pending_Sector attributes. If you continue to get new bad sectors, the drive should be replaced immediately.
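
For keeping an eye on those two attributes, something like the following might do. It is only a sketch: the function name `watch_counters` is made up for illustration, and it assumes the attribute table has the same column layout as the reports in this thread (name in column 2, RAW_VALUE in column 10).

```shell
#!/bin/sh
# Hypothetical helper: print the raw values of the two attributes to watch.
# Reads `smartctl -A` style output on stdin; in the table layout shown in
# this thread, column 2 is the attribute name and column 10 is RAW_VALUE.
watch_counters() {
    awk '$2 == "Reallocated_Sector_Ct" || $2 == "Current_Pending_Sector" { print $2 "=" $10 }'
}

# Example with two lines copied from the report above (no disk access needed):
printf '%s\n' \
  '  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0' \
  '197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       3' \
  | watch_counters
```

On a real system the input would come from `smartctl -A /dev/sda` run as root, with the device name adjusted to match your disk.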
 
Old 12-24-2011, 11:41 AM   #7
Kallaste
Member
 
Registered: Nov 2011
Distribution: Slackware
Posts: 363

Original Poster
Rep: Reputation: 85
Thank you, you have helped me learn to interpret the log the right way. I will follow your suggestions, do some more reading, and hopefully get this under control with minimal loss.

I really appreciate it!
 
Old 05-11-2012, 05:28 AM   #8
nwsmith
LQ Newbie
 
Registered: Mar 2007
Posts: 1

Rep: Reputation: 1
I just noticed that an earlier post in this thread references one of my old blog posts, so I thought I would add my feedback.

In the example above, counter '197 Current_Pending_Sector' is showing a count of 3. This is bad. I've seen this a few times now on various hard drives, and I suspect it's due to the mains power failing while the drive is in the middle of a write.

The first thing to do is to run a long self-test with:
Code:
# smartctl -t long /dev/sda
The command will respond with an estimate of how long it thinks the test will take to complete.
(But this assumes that no errors will be found!)

To check progress use:
Code:
# smartctl -A /dev/sda | grep remaining
...but don't check too often, as with some drives, it can abort the test.

After the test completes examine the reported status with:
Code:
# smartctl -l selftest /dev/sda
You will see something like this:
Code:
  === START OF READ SMART DATA SECTION ===
  SMART Self-test log structure revision number 1
  Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
  # 1  Extended offline    Completed: read failure       20%      1596         44724966
So take that 'LBA' of 44724966 and multiply by (512/4096), which is the equivalent of 'divide by 8', to convert from 512-byte sectors to 4096-byte blocks:
Code:
  44724966 / 8 = 5590620.75
Round this down to 5590620 and use as follows to 'zero-out' the sector:
Code:
  # dd if=/dev/zero of=/dev/sda conv=sync bs=4096 count=1 seek=5590620 
  1+0 records in
  1+0 records out
  # sync
Now retry the self-test and see whether the errors have all cleared. If not, repeat the above.
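
The divide-by-8 step is easy to get wrong by hand, so here is a minimal sketch of it in shell. The function name `lba_to_4k_block` is made up for illustration, and it assumes the LBA reported by smartctl counts 512-byte sectors while dd is run with bs=4096, as in the commands above. It only prints the dd command for review and does not run it.

```shell
#!/bin/sh
# Hypothetical helper: convert a 512-byte LBA (as reported by smartctl) into
# the index of the 4096-byte block that contains it. Shell integer division
# already rounds down, which is the rounding described above.
lba_to_4k_block() {
    echo $(( $1 / 8 ))
}

lba=44724966                       # LBA_of_first_error from the example log
block=$(lba_to_4k_block "$lba")
echo "LBA $lba is in 4096-byte block $block"

# Print the command for review instead of running it:
# writing to a raw disk with dd destroys whatever is stored in that block.
echo "dd if=/dev/zero of=/dev/sda conv=sync bs=4096 count=1 seek=$block"
```

Check the printed seek value and the device name carefully before copying the command into a root shell.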

Of course, if that sector was in the middle of a file, this will corrupt the file. See my blog post "Smartmontools and fixing Unreadable Disk Sectors" for how to check for this.

Once the 'Current_Pending_Sector' entries are "fixed", the drive will probably be fine, in my experience. As long as 'Reallocated_Sector_Ct' is zero, you should be fine. Even a few reallocated sectors seem OK, but if that count starts to increment frequently, that is a danger sign! Of course, you need to keep a regular, close eye on the counters. Use 'smartd' to schedule daily tests.
 
Old 10-12-2012, 01:33 AM   #9
benmctee
LQ Newbie
 
Registered: Oct 2012
Location: Hawaii
Distribution: Ubuntu
Posts: 6

Rep: Reputation: Disabled
Tried advice, still having issue

I apologize for re-opening this somewhat old thread, but I'm having a similar issue and not getting the results you described. I performed the SMART short test, and received the following:

# 1 Short offline Completed: read failure 90% 10769 980534200
# 2 Short offline Completed: read failure 90% 10769 980534202

I did:
980534200 / 8 = 980534200
sudo dd if=/dev/zero of=/dev/sda conv=sync bs=4096 count=1 seek=980534200
sudo sync

That caused me to get another bad sector upon running the self test:
# 1 Short offline Completed: read failure 90% 10769 991816368
# 2 Short offline Completed: read failure 90% 10769 980534200
# 3 Short offline Completed: read failure 90% 10769 980534202

I then repeated, as you instructed, and now have the following:
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 90% 10769 3885852504
# 2 Short offline Completed: read failure 90% 10769 991816368
# 3 Short offline Completed: read failure 90% 10769 980534200
# 4 Short offline Completed: read failure 90% 10769 980534202

My full report is:

smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-31-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Green (Adv. Format)
Device Model: WDC WD20EARS-00MVWB0
Serial Number: WD-WMAZA0747797
LU WWN Device Id: 5 0014ee 655beb6e9
Firmware Version: 51.0AB51
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Thu Oct 11 20:17:24 2012 HST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x85) Offline data collection activity
was aborted by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 121) The previous self-test completed having
the read element of the test failed.
Total time to complete Offline
data collection: (37680) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x3035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 171 168 021 Pre-fail Always - 6408
4 Start_Stop_Count 0x0032 099 099 000 Old_age Always - 1321
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 086 086 000 Old_age Always - 10769
10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 236
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 193
193 Load_Cycle_Count 0x0032 106 106 000 Old_age Always - 282149
194 Temperature_Celsius 0x0022 113 106 000 Old_age Always - 37
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 1
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 22

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed: read failure 90% 10769 3885852504
# 2 Short offline Completed: read failure 90% 10769 991816368
# 3 Short offline Completed: read failure 90% 10769 980534200
# 4 Short offline Completed: read failure 90% 10769 980534202

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


===================

Any suggestions?
 
Old 10-12-2012, 10:30 AM   #10
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
You attached your problem to the end of an old and SOLVED thread, which means that few people are going to read it. You should have started a new thread, which would then get greater exposure by being on the "zero reply" list.

Also, always wrap pasted, formatted text in [code] ... [/code] tags to preserve formatting.

That said, on to the problem ...
Quote:
Originally Posted by benmctee View Post
I apologize for re-opening this somewhat old thread, but I'm having a similar issue and not getting the results you described. I performed the SMART short test, and received the following:

# 1 Short offline Completed: read failure 90% 10769 980534200
# 2 Short offline Completed: read failure 90% 10769 980534202

I did:
980534200 / 8 = 980534200
sudo dd if=/dev/zero of=/dev/sda conv=sync bs=4096 count=1 seek=980534200
sudo sync
I don't know what "980534200 / 8 = 980534200" is supposed to mean, but your seek is beyond the end of the disk. In dd, specifying "bs=4096" sets both ibs and obs to that value. Multiplying your seek value by the 4096 byte block size yields a roughly 4TB offset into your 2TB disk.
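
A quick sanity check along these lines can be done before running dd. This is only a sketch: the function name `seek_in_range` is made up for illustration, and the capacity is the "User Capacity" figure from the quoted report. It does arithmetic only and does not touch the disk.

```shell
#!/bin/sh
# Hypothetical pre-flight check: does a write of one block at dd's seek=
# position end inside the disk?
# Arguments: seek count, block size in bytes, disk capacity in bytes.
seek_in_range() {
    offset=$(( $1 * $2 ))          # byte offset where the write starts
    end=$(( offset + $2 ))         # byte offset just past the written block
    if [ "$end" -le "$3" ]; then
        echo "ok: write at byte offset $offset fits within $3 bytes"
    else
        echo "out of range: write at byte offset $offset does not fit within $3 bytes"
        return 1
    fi
}

capacity=2000398934016             # "User Capacity" from the report above
seek_in_range 980534200 4096 "$capacity" || true   # the LBA used unconverted
seek_in_range 122566775 4096 "$capacity"           # the LBA divided by 8
```

On a live system the capacity could be read with `blockdev --getsize64 /dev/sda` instead of being typed in, assuming util-linux is installed.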
Quote:
Originally Posted by benmctee View Post
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar Green (Adv. Format)
Device Model: WDC WD20EARS-00MVWB0
Serial Number: WD-WMAZA0747797
LU WWN Device Id: 5 0014ee 655beb6e9
Firmware Version: 51.0AB51
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Thu Oct 11 20:17:24 2012 HST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
The drive lies to the OS and claims to have 512-byte sectors. Try your dd command again, but this time use a seek of 122566775 (980534200/8).
 
Old 10-13-2012, 01:31 AM   #11
benmctee
LQ Newbie
 
Registered: Oct 2012
Location: Hawaii
Distribution: Ubuntu
Posts: 6

Rep: Reputation: Disabled
Still the same result

I was almost going to create a new thread, but I've seen other forums where they say "this has been solved many times before... try using the search function next time", or something along those lines. So, I apologize for posting here, and appreciate your assistance nonetheless.

Quote:
I don't know what "980534200 / 8 = 980534200" is supposed to mean
I made a mistake in creating the post. I was copy/pasting from the calculator, and didn't proof-read well enough. During my actual command though, I used 122566775.

This is my log before the dd command:
Code:
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     10793         3885852504
# 2  Short offline       Completed: read failure       90%     10769         3885852504
# 3  Short offline       Completed: read failure       90%     10769         991816368
# 4  Short offline       Completed: read failure       90%     10769         980534200
# 5  Short offline       Completed: read failure       90%     10769         980534202
This is copied directly from my terminal:
Code:
sudo dd if=/dev/zero of=/dev/sda conv=sync bs=4096 count=1 seek=122566775
[sudo] password for benmctee: 
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 3.8588e-05 s, 106 MB/s
benmctee@linux-server:~$ sudo sync
And this is my result after trying the dd command that time:
Code:
# 1  Short offline       Completed: read failure       90%     10793         3885852504
# 2  Short offline       Completed: read failure       90%     10793         3885852504
# 3  Short offline       Completed: read failure       90%     10769         3885852504
# 4  Short offline       Completed: read failure       90%     10769         991816368
# 5  Short offline       Completed: read failure       90%     10769         980534200
# 6  Short offline       Completed: read failure       90%     10769         980534202
Any ideas on my failure or why it keeps just adding additional read failures? And should I create a new post for this still?

Thanks for helping a forum newbie!

Last edited by benmctee; 10-13-2012 at 01:36 AM. Reason: did ls /dev and saw /dev/zero, no need to mkdir /dev/zero
 
Old 10-13-2012, 09:03 AM   #12
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
Quote:
Originally Posted by benmctee View Post
During my actual command though, I used 122566775.

This is my log before the dd command:
Code:
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     10793         3885852504
# 2  Short offline       Completed: read failure       90%     10769         3885852504
# 3  Short offline       Completed: read failure       90%     10769         991816368
# 4  Short offline       Completed: read failure       90%     10769         980534200
# 5  Short offline       Completed: read failure       90%     10769         980534202
This is copied directly from my terminal:
Code:
sudo dd if=/dev/zero of=/dev/sda conv=sync bs=4096 count=1 seek=122566775
[sudo] password for benmctee: 
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 3.8588e-05 s, 106 MB/s
benmctee@linux-server:~$ sudo sync
And this is my result after trying the dd command that time:
Code:
# 1  Short offline       Completed: read failure       90%     10793         3885852504
# 2  Short offline       Completed: read failure       90%     10793         3885852504
# 3  Short offline       Completed: read failure       90%     10769         3885852504
# 4  Short offline       Completed: read failure       90%     10769         991816368
# 5  Short offline       Completed: read failure       90%     10769         980534200
# 6  Short offline       Completed: read failure       90%     10769         980534202
There is no point in starting a new thread now.

The original bad sector has been remapped, and you are now finding a second error at block 3885852504. The only thing puzzling there is the failure at block 991816368, which seems to have gone away without you doing anything to fix it. Try the dd command again, this time with a seek of 485731563 (3885852504 / 8). After each try it's useful to look at the SMART attributes data for "Reallocated_Sector_Ct" and "Current_Pending_Sector" attributes.
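
To see which 512-byte LBAs a single bs=4096 write covers, the conversion can be run in reverse. This is just a sketch with a made-up function name (`lbas_in_4k_block`). Going by the arithmetic, the earlier failures at 980534200 and 980534202 both fall inside the one block that was already written.

```shell
#!/bin/sh
# Hypothetical helper: print the first and last 512-byte LBA covered by one
# 4096-byte block, i.e. the sectors a single bs=4096 count=1 write touches.
lbas_in_4k_block() {
    first=$(( $1 * 8 ))
    echo "$first $(( first + 7 ))"
}

lbas_in_4k_block 122566775    # block already written for LBA 980534200
lbas_in_4k_block 485731563    # block suggested for LBA 3885852504
```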

Once you've got the short test to complete without error, I'd recommend running the long test again. Going by the extended self-test polling time in your report, that should take about 255 minutes to run unless, of course, it finds an error earlier. In a previous post you mentioned running the extended (presumably "-t long") self-test, but I don't see any evidence of that in the self-test log.
 
Old 12-10-2013, 02:03 PM   #13
izakharyaschev
LQ Newbie
 
Registered: Apr 2011
Location: Moscow
Distribution: ALT Sisyphus
Posts: 7

Rep: Reputation: 0
the command to check whether a selftest is running

Quote:
Originally Posted by nwsmith View Post
To check progress use:
Code:
# smartctl -A /dev/sda | grep remaining
...but don't check too often, as with some drives, it can abort the test.
I've just discovered that -A is the wrong option in that command.

The correct one (for me) would be:

Code:
# smartctl -c /dev/sda | fgrep remain
					90% of test remaining.
#
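
If only the number is wanted, for example in a script, the line can be trimmed down a little further. This is a sketch with a made-up function name (`selftest_remaining`). It assumes the progress line looks exactly like the one shown above and prints nothing when no such line is present.

```shell
#!/bin/sh
# Hypothetical helper: pull the "NN% of test remaining." figure out of
# `smartctl -c` style output read from stdin. Prints nothing if the line
# is absent (for example, when no self-test is in progress).
selftest_remaining() {
    sed -n 's/^[[:space:]]*\([0-9][0-9]*\)% of test remaining\.$/\1/p'
}

# Example using the line shown above (no disk access needed):
printf '\t\t\t\t\t90%% of test remaining.\n' | selftest_remaining
```

On a real system the input would be `smartctl -c /dev/sda` run as root. As noted earlier in the thread, polling too often can abort the test on some drives.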
 
  



Tags
badblocks, disk, hard drives


