LinuxQuestions.org
Old 03-20-2008, 10:26 AM   #1
exceed1
Member
 
Registered: Mar 2008
Location: oslo
Distribution: debian,redhat
Posts: 199

Rep: Reputation: 31
fsck is not removing bad blocks, why not?


Hi

I run a dedicated server (a private one, not running anything big). I took a look at the log files the other day and noticed that the hard drive had some bad blocks. I then took the system to runlevel 1 to run fsck.ext3. I added the "-c" option and the "-y" option to add the blocks to a bad blocks list and get the problems fixed. After this I ran fsck.ext3 again with the "-y" and "-f" options; fsck.ext3 reported the filesystem as clean, so I had to force the check. The fsck tool still said that the hard drive had 55 bad blocks.

Hmm, I thought, that was weird. According to the manual page for "fsck", the "-p" option should automatically fix any errors and the "-c" option should add the bad blocks to a bad blocks list. My question is: when I now run fsck again, it still says that there are 55 bad blocks. Why isn't fsck fixing the errors? (The filesystem is unmounted.)

Any help is appreciated.

Last edited by exceed1; 03-20-2008 at 10:31 AM.
 
Old 03-20-2008, 11:51 AM   #2
MS3FGX
Guru
 
Registered: Jan 2004
Location: NJ, USA
Distribution: Slackware, Debian
Posts: 5,852

Rep: Reputation: 351
You can't fix a bad block; it is permanently destroyed on the physical disk. 55 blocks is quite a lot, and the drive is definitely no longer safe to use.

The purpose of listing them is to make the system aware of which ones are dead, so that the filesystem can still be read and you can get all of your data off it. Bad blocks on a drive are generally a sign of imminent failure. You need to back up everything as soon as possible (limit use of the drive until you have recovered everything) and replace it.

You may be able to recover some of the data that was on those blocks, but unless it was very important your best course is to just get what you can still easily copy off before the drive stops working completely.

Last edited by MS3FGX; 03-20-2008 at 11:53 AM.
 
Old 03-20-2008, 12:33 PM   #3
marozsas
Senior Member
 
Registered: Dec 2005
Location: Campinas/SP - Brazil
Distribution: SuSE, RHEL, Fedora, Ubuntu
Posts: 1,393
Blog Entries: 1

Rep: Reputation: 63
I don't know.

But you can try running fsck with the non-destructive read-write test instead of the simple read-only test. Just pass the c option twice: "fsck -cc ...".

Before you do that, save the current list of bad blocks so you can compare it with the list after the second fsck run.

Code:
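# Save the current bad block list somewhere outside the filesystem being checked
dumpe2fs -b /dev/your-block-device > /somewhere/in/other/filesystem.before-fsck-cc
# Run fsck with the non-destructive read-write test
fsck -cc ...
# Save the list again for comparison
dumpe2fs -b /dev/your-block-device > /somewhere/in/other/filesystem.after-fsck-cc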
This way you can check whether fsck marked any additional blocks as bad.
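To compare the two saved lists, something like the following might do. It is only a sketch: the function name newly_marked_blocks is made up for illustration, and it assumes each file holds one block number per line, which is how "dumpe2fs -b" normally writes them.

```shell
# Print block numbers that appear in the "after" list but not in the "before" list.
# Hypothetical helper; assumes one block number per line in each file.
newly_marked_blocks() {
    before=$1
    after=$2
    tmp_before=$(mktemp) || return 1
    tmp_after=$(mktemp) || { rm -f "$tmp_before"; return 1; }
    # comm needs both inputs sorted the same way, so work on sorted copies.
    sort -u "$before" > "$tmp_before"
    sort -u "$after" > "$tmp_after"
    # -13 hides lines unique to the first file and lines common to both,
    # leaving only the lines unique to the second ("after") file.
    comm -13 "$tmp_before" "$tmp_after"
    rm -f "$tmp_before" "$tmp_after"
}
```

Call it with the two files written by the dumpe2fs commands, for example newly_marked_blocks filesystem.before-fsck-cc filesystem.after-fsck-cc. Empty output means the second run did not add any blocks to the list. The output is in the text sort order that comm requires, not numeric order.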

Last edited by marozsas; 03-20-2008 at 12:35 PM.
 
Old 03-20-2008, 12:34 PM   #4
exceed1
Member
 
Registered: Mar 2008
Location: oslo
Distribution: debian,redhat
Posts: 199

Original Poster
Rep: Reputation: 31
Thanks for your reply, it was very interesting.

My question now is: once the blocks have been marked as bad by the badblocks program (fdisk wouldn't mark them as bad blocks, so I had to use the badblocks program), why should the disk fail completely? Why can't it just continue to work as normal, now that it knows it shouldn't read or write anything to these blocks?
 
Old 03-20-2008, 01:32 PM   #5
jailbait
Guru
 
Registered: Feb 2003
Location: Blue Ridge Mountain
Distribution: Debian Wheezy, Debian Jessie
Posts: 7,503

Rep: Reputation: 174
Quote:
Originally Posted by exceed1

Thanks for your reply, it was very interesting.

My question now is: once the blocks have been marked as bad by the badblocks program (fdisk wouldn't mark them as bad blocks, so I had to use the badblocks program), why should the disk fail completely? Why can't it just continue to work as normal, now that it knows it shouldn't read or write anything to these blocks?
How bad blocks are handled on your hard drive can vary depending on how old the drive is. The current method is this:

There are spare blocks at the end of the hard drive. When a block becomes defective, the hard drive's firmware assigns one of the spare blocks to replace the bad block. When your CPU accesses a bad block, the firmware automatically redirects the access to the spare block. This scheme works until all of the spare blocks are in use. Once you have more bad blocks than spares, you get into the situation where bad blocks cannot be fixed.
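On drives that report SMART attributes, the number of blocks the firmware has already replaced with spares usually shows up in the attribute table printed by "smartctl -A". The sketch below pulls that number out. The function name reallocated_count is made up for illustration, and it assumes the drive uses the common attribute name Reallocated_Sector_Ct with the raw value in the tenth column; some vendors name or lay out the attribute differently.

```shell
# Read `smartctl -A` output on stdin and print the raw value of the
# Reallocated_Sector_Ct attribute. Hypothetical helper; assumes the usual
# ten-column attribute table (ID#, name, flag, value, worst, thresh, type,
# updated, when_failed, raw value).
reallocated_count() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10; found = 1 }
         END { exit (found ? 0 : 1) }'
}
```

Typical use would be smartctl -A /dev/sdX | reallocated_count, where /dev/sdX stands in for your drive. The function returns a non-zero status when the attribute is not present in the table.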

--------------------
Steve Stites
 
Old 03-20-2008, 02:09 PM   #6
exceed1
Member
 
Registered: Mar 2008
Location: oslo
Distribution: debian,redhat
Posts: 199

Original Poster
Rep: Reputation: 31
Thanks for the reply, it was very informative.

When I ran the badblocks program just now, it said that it couldn't find any bad blocks, but when I run fsck it says that there are 55 bad blocks. Also, when I checked the logs from some time ago (/var/log/messages and /var/log/syslog), they said that there were bad blocks. Which tool is correct here? I have also been told that the badblocks program is better at finding bad blocks on the HDD than fsck. Is that correct?

output from the tools:
badblocks:
"Pass completed, 0 bad blocks found."

fsck:
"....other info.."
"55 bad blocks"
"...more info.."

Last edited by exceed1; 03-20-2008 at 02:12 PM.
 
Old 03-20-2008, 04:28 PM   #7
jailbait
Guru
 
Registered: Feb 2003
Location: Blue Ridge Mountain
Distribution: Debian Wheezy, Debian Jessie
Posts: 7,503

Rep: Reputation: 174
Quote:
Originally Posted by exceed1

When I ran the badblocks program just now, it said that it couldn't find any bad blocks, but when I run fsck it says that there are 55 bad blocks. Also, when I checked the logs from some time ago (/var/log/messages and /var/log/syslog), they said that there were bad blocks. Which tool is correct here? I have also been told that the badblocks program is better at finding bad blocks on the HDD than fsck. Is that correct?

output from the tools:
badblocks:
"Pass completed, 0 bad blocks found."

fsck:
"....other info.."
"55 bad blocks"
"...more info.."
I don't know which program does the best job of diagnosing bad blocks. The hard drive manufacturers have bootable diagnostic diskettes available for download which will do destructive testing on your hard drive and assign spares to bad blocks. The last time I had this problem, about 6 years ago, I used one of their diagnostic diskettes to straighten it out. I also remember a time when I used one of the diagnostic diskettes and found out that I was out of spare blocks and the drive was kaput.

If all of the bad blocks are clustered near each other you can also cure the problem by partitioning the hard drive so that all of the bad blocks are in free space not allocated to any partition.
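To get a rough idea of whether the bad blocks are clustered, you could summarize a saved bad block list like this. It is a sketch only: the function name bad_block_range is made up, and the input is assumed to be a file with one block number per line, such as the output of "dumpe2fs -b". The numbers are filesystem block numbers, so turning them into partition boundaries still needs the block size and the partition's start offset, which this does not attempt.

```shell
# Summarize a bad block list: how many entries, lowest, highest, and the
# size of the range they cover. Hypothetical helper; expects one block
# number per line and ignores anything that is not a plain number.
bad_block_range() {
    awk '/^[0-9]+$/ {
             n = $1 + 0
             if (count == 0 || n < min) min = n
             if (count == 0 || n > max) max = n
             count++
         }
         END {
             if (count == 0) { print "count=0"; exit }
             printf "count=%d min=%d max=%d span=%d\n", count, min, max, max - min + 1
         }' "$1"
}
```

A small span relative to the size of the filesystem suggests the bad blocks sit close together; a span covering most of the disk means partitioning around them is not practical.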

------------------------
Steve Stites
 
Old 03-20-2008, 05:10 PM   #8
exceed1
Member
 
Registered: Mar 2008
Location: oslo
Distribution: debian,redhat
Posts: 199

Original Poster
Rep: Reputation: 31
OK, I'll check it out.

I tried to check the health of the disk with smartctl and the output I got was this:
"SMART overall-health self-assessment test result: PASSED"

Is this information I can trust, or does smartctl only check some parts of the disk and not everything?
 
Old 03-20-2008, 08:11 PM   #9
MS3FGX
Guru
 
Registered: Jan 2004
Location: NJ, USA
Distribution: Slackware, Debian
Posts: 5,852

Rep: Reputation: 351
SMART testing is little more than an educated guess. I have had completely dead drives pass SMART tests in the past, and perfectly functional ones fail.
 
Old 03-21-2008, 08:07 AM   #10
exceed1
Member
 
Registered: Mar 2008
Location: oslo
Distribution: debian,redhat
Posts: 199

Original Poster
Rep: Reputation: 31
Hmm, OK, so the SMART tool isn't something you should rely on, or at least you should be aware that the status from the test can be wrong.
 
  


All times are GMT -5.
