LinuxQuestions.org
Old 07-11-2016, 01:24 AM   #1
DaneM
Member
 
Registered: Oct 2003
Location: Chico, CA, USA
Distribution: Linux Mint
Posts: 881

Rep: Reputation: 130
Why wouldn't I want to fix an error with fsck?


Is there any reason why someone wouldn't want to allow fsck to fix errors automatically? The documentation implies that this could cause problems, in some cases; but as a person who knows Linux well, but filesystems poorly, I can't think of how one would know the difference between a "good" error to fix and a "bad" error to fix.

Is there any reason not to use the -a or -p options?
 
Old 07-11-2016, 01:53 AM   #2
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 21,850

Rep: Reputation: 7309
Sometimes a corrupted filesystem should not be altered in any way, so that all the available data can still be saved or backed up.
 
2 members found this post helpful.
Old 07-11-2016, 02:49 AM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,128

Rep: Reputation: 4121
Quote:
Originally Posted by DaneM View Post
Is there any reason why someone wouldn't want to allow fsck to fix errors automatically? The documentation implies that this could cause problems, in some cases;
Citation? Running with "-y" is just asking for trouble, IMHO.

It is not safe to run on mounted filesystems. Other than that, if "-p" fails, take careful note of why.
My rule: if I have a serious failure (say, a power failure followed by errors that fsck needs me to answer), I reformat and restore. No question, I just do it. fsck is designed to correct the filesystem; any files that become victims are not (well) documented. How can you trust that data?
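
Since running fsck on a mounted filesystem is the one point nobody here disputes, a quick guard may be worth having. A minimal sketch, assuming Linux's /proc/mounts; the function name is_mounted is made up for illustration, and it only matches the exact device name listed in the table (a filesystem mounted under a /dev/mapper or by-uuid alias would be missed):

```shell
# Succeed (exit 0) if device $1 is listed as a mount source in the
# mounts table $2 (defaults to /proc/mounts on Linux).
is_mounted() {
    dev=$1
    table=${2:-/proc/mounts}
    # Field 1 of each line is the mounted device; exit 0 only on a match.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}
```

Something like `is_mounted /dev/sdX1 || e2fsck -n /dev/sdX1` would then only run the read-only check when the (placeholder) device is not mounted.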
 
2 members found this post helpful.
Old 07-11-2016, 07:11 AM   #4
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
There are times when there are two alternatives...

One problem that can occur is that the same blocks get allocated to two different files. One solution is to delete one file and assign all the shared blocks to the other...

But that could leave you with a bad file, while the other file was valid... You can't know which file is the actual bad one without manually examining both files.

Another time is if a block is marked free, but is in use by a file... The usual solution is to mark the blocks as used - but the file could still be otherwise corrupted. The only way to know is to examine the file itself.

In both cases, if the files are easily recovered from backup, just deleting the files with fsck is fine. But if it happens to a directory, deleting it can cause a LOT of other failures, mostly newly orphaned or lost files.

"fsck -y" itself is rather safe. fsck will ONLY take decisions that have only one option. If there are two or more, then the fsck run gets aborted with no changes (and requires a manual choice made).

MOST errors on a filesystem can be fixed with an "fsck -y", which is why an "fsck -y" used to be the standard operation on a root filesystem. No administrator actions were required.
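
Whichever option is used, the exit status says whether anything was left unfixed. Per the e2fsck man page, -p exits with bit 4 set when a problem needs an administrator's decision, while -y answers "yes" to every question. A rough sketch of decoding the status bits documented in fsck(8); the function name describe_fsck_status is invented for this example:

```shell
# Turn an fsck exit status (a bit mask, per fsck(8)) into words.
describe_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    out=""
    # Each entry is "bit:meaning"; several bits may be set at once.
    for pair in "1:errors corrected" "2:reboot needed" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage or syntax error" "32:cancelled by user" \
                "128:shared-library error"; do
        bit=${pair%%:*}
        text=${pair#*:}
        if [ $((status & bit)) -ne 0 ]; then
            out="${out:+$out; }$text"
        fi
    done
    echo "$out"
}
```

After a run, `describe_fsck_status $?` would spell out the result; a status with bit 4 set is the case where "take careful note of why" applies.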

Last edited by jpollard; 07-11-2016 at 07:15 AM.
 
1 members found this post helpful.
Old 07-11-2016, 07:55 AM   #5
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,659
Blog Entries: 4

Rep: Reputation: 3941
It might well be said that an argument against "fix errors automatically," at least as the first thing to be done, is that you are thereby handing over your disk's future to software instead of using human judgment.

Obviously, "file-system corruption is something that should never happen," and it usually never does. Therefore, when it does start showing up, there necessarily is some underlying cause, which fsck and its brethren would not know about. If the hardware is going south, their attempts to "fix" the problem might well run afoul of the same problem.

I would therefore recommend that you first take the drive completely off-line, then thoroughly examine it for errors. Judge for yourself what each error implies, as you're driving to the hardware store to buy a new drive to replace the old one.

Another important thing to look at (which fsck also "knows not of") is the SMART on-board diagnostics that nearly all disk drives today can provide. The on-board hardware of a disk drive is anything but "passive." It very actively monitors itself for errors, can "spare out" defective disk sectors without informing the host, and in general does a lot of things "behind the scenes" that have the cumulative effect of making the device look a lot more reliable than it physically is ... until it can't do that anymore. That's often when a drive, in the real world, starts to "mysteriously" fail.
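
For anyone wanting to look at SMART from the command line, smartctl (from smartmontools) has a -H option that prints the drive's overall health verdict. A small sketch that reduces that report to one word; the function name smart_verdict is hypothetical, and the strings it matches are an assumption about how smartctl words its ATA and SCSI reports, so check them against your own output:

```shell
# Read `smartctl -H` output on stdin and print PASSED, FAILED or UNKNOWN.
smart_verdict() {
    report=$(cat)
    case $report in
        # Assumed wording for ATA and SCSI drives respectively.
        *"test result: PASSED"*|*"SMART Health Status: OK"*) echo PASSED ;;
        *"test result: FAILED"*) echo FAILED ;;
        *) echo UNKNOWN ;;
    esac
}
```

Typical use would be `smartctl -H /dev/sdX | smart_verdict` (as root, with /dev/sdX a placeholder). A PASSED verdict only means the drive has not declared itself failing; `smartctl -A` lists the individual attributes, such as reallocated sector counts.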
 
1 members found this post helpful.
Old 07-11-2016, 08:29 AM   #6
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
In my experience, once the drive has failed SMART (hard, not soft), it doesn't work at all. Soft failures are covered. Once the number of soft failures exceeds a given threshold, the drive is in "pre-fail" mode and you had better get ready to replace it.

Last edited by jpollard; 07-11-2016 at 08:30 AM.
 
1 members found this post helpful.
Old 07-11-2016, 10:12 AM   #7
DaneM
Member
 
Registered: Oct 2003
Location: Chico, CA, USA
Distribution: Linux Mint
Posts: 881

Original Poster
Rep: Reputation: 130
I understand the way in which a failing drive would produce problems that fsck can't fix, and, of course, the need for backups. I guess what I'm not clear on is how a person might know what files need to be looked at/verified, when the fsck output looks something like this:

Code:
 49182(f): expecting 77542800 got phys 77543296 (blkcnt 1836366)
 49182(f): expecting 77543312 got phys 77543808 (blkcnt 1836382)
 49182(f): expecting 77543824 got phys 77543904 (blkcnt 1836398)
 49182(f): expecting 77543920 got phys 77544160 (blkcnt 1836414)
 49182(f): expecting 77544176 got phys 77544416 (blkcnt 1836430)
 49182(f): expecting 77544432 got phys 77578715 (blkcnt 1836446)
 49182(f): expecting 77579979 got phys 77580524 (blkcnt 1837709)
 49182(f): expecting 77582958 got phys 77584109 (blkcnt 1840140)
(A.K.A. "What the heck does this mean?")

Last edited by DaneM; 07-11-2016 at 10:16 AM.
 
Old 07-11-2016, 10:46 AM   #8
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: Rocky Linux
Posts: 4,779

Rep: Reputation: 2212
Quote:
Originally Posted by DaneM View Post
Code:
 49182(f): expecting 77542800 got phys 77543296 (blkcnt 1836366)
 49182(f): expecting 77543312 got phys 77543808 (blkcnt 1836382)
 49182(f): expecting 77543824 got phys 77543904 (blkcnt 1836398)
...
(A.K.A. "What the heck does this mean?")
I have no clue what that means, but the decision to run "fsck -y" right away depends on your goals. If you just want to get the machine back up and running as quickly as possible and are prepared to accept possible data loss, then fine, run "fsck -y". But, if you want to maximize the chances of recovering data, then you really should save an image of the damaged filesystem first. The actions taken by fsck to patch up the filesystem can make forensic data recovery a lot more difficult.
 
2 members found this post helpful.
Old 07-11-2016, 01:29 PM   #9
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
It is reporting fragmentation in inode 49182, which is a regular file (f). The other lines are just reports on each fragment identified.

This should not count as an error; it is part of the fragmentation check.

reference: http://git.whamcloud.com/?p=tools/e2...8970da644e0953

This might be counted as part of filesystem debugging information.
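
To get a feel for how many of these lines belong to each file, the output can be tallied. This is only a sketch that assumes the line layout shown in the posted output ("inode(type): expecting N got phys M (blkcnt K)"); the function name summarize_fragcheck is made up:

```shell
# Tally fragcheck lines read from stdin.
# Prints "<inode> <type> <number of reported discontinuities>" per inode.
summarize_fragcheck() {
    awk '
        / expecting [0-9]+ got phys [0-9]+ / {
            p = index($1, "(")              # $1 looks like "49182(f):"
            inode = substr($1, 1, p - 1)
            type = substr($1, p + 1, 1)     # f = regular file
            count[inode " " type]++
        }
        END { for (k in count) print k, count[k] }
    '
}
```

Piping the saved fsck output through `summarize_fragcheck` gives one line per inode, which is easier to scan than the raw list.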

Last edited by jpollard; 07-11-2016 at 01:31 PM.
 
2 members found this post helpful.
Old 07-11-2016, 11:19 PM   #10
DaneM
Member
 
Registered: Oct 2003
Location: Chico, CA, USA
Distribution: Linux Mint
Posts: 881

Original Poster
Rep: Reputation: 130
Excellent! Thanks for helping me solve that mystery. I forgot that the fragcheck option might produce extra messages...
 
Old 07-12-2016, 07:08 AM   #11
LukeRFI
Member
 
Registered: Jun 2016
Location: Canada
Distribution: Various versions of Fedora
Posts: 30

Rep: Reputation: Disabled
From my perspective, I don't recommend any file system repair on a drive unless you are 100% sure that everything is backed up. In many cases, file system issues are just that: issues with the file system. However, quite frequently the root cause of file system issues is a physical problem with the hard drive. When you run an fsck (or chkdsk, for our Windows friends) and encounter issues because of bad sector reads, it is not uncommon for the file system repair to drop the damaged chain rather than actually fix it (because it can't). This results in irreversible and unnecessary damage to the file system.
 
Old 07-12-2016, 09:16 AM   #12
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by LukeRFI View Post
From my perspective, I don't recommend any file system repair on a drive unless you are 100% sure that everything is backed up. In many cases, file system issues are just that: issues with the file system. However, quite frequently the root cause of file system issues is a physical problem with the hard drive. When you run an fsck (or chkdsk, for our Windows friends) and encounter issues because of bad sector reads, it is not uncommon for the file system repair to drop the damaged chain rather than actually fix it (because it can't). This results in irreversible and unnecessary damage to the file system.
Not true in most non-Microsoft filesystems. They are designed to be repaired, and they can be repaired.

If the drive is working - it can be worked on. Even bad sector reads (if the drive is STILL working), though causing problems, can recover all but those sectors. Yes, some files may become unreadable, or partially unreadable. But that doesn't prevent the rest from being recoverable.

Without errors being fixed, the file system may not be mounted, even in a read-only mode.

So unless the backup was done immediately before the failure, backups will NOT recover all your data.

Last edited by jpollard; 07-12-2016 at 09:17 AM.
 
Old 07-12-2016, 10:06 AM   #13
LukeRFI
Member
 
Registered: Jun 2016
Location: Canada
Distribution: Various versions of Fedora
Posts: 30

Rep: Reputation: Disabled
Quote:
Originally Posted by jpollard View Post
Not true in most non-Microsoft filesystems. They are designed to be repaired, and they can be repaired.
This is absolutely not true. As a data recovery professional for almost a couple of decades, I get to see the damage caused almost daily.
Quote:
If the drive is working - it can be worked on. Even bad sector reads (if the drive is STILL working), though causing problems, can recover all but those sectors. Yes, some files may become unreadable, or partially unreadable. But that doesn't prevent the rest from being recoverable.
How it is worked on has a huge impact on its overall recoverability. Again, I see the damage caused by making the assumption that a drive is not in as bad shape as it is.
 
1 members found this post helpful.
Old 07-12-2016, 10:27 AM   #14
JaredDM
LQ Newbie
 
Registered: Jul 2016
Location: Providence, RI
Distribution: Mostly Windows to be honest.
Posts: 9

Rep: Reputation: Disabled
I have to agree with LukeRFI here. Any and all filesystem repair utilities, with the exception of perhaps DiskWarrior on the Mac, will favor repairing corruption in the file tables over saving the data. Running any options to "scan for and repair" bad sectors is just a recipe for disaster if the data isn't backed up. It very often kills the read/write heads, turning what would have been an easy recovery into an expensive nightmare.

Perhaps as a constructive first step before doing that, try using ddrescue to image as many of the sectors as you can onto another good disk. Then you can run fsck against the clone and see what happens.
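
A sketch of what that sequence might look like with GNU ddrescue, written so that it only prints the commands for review instead of running them. The function name print_rescue_plan and the device and image names are placeholders, and the last step assumes an ext2/3/4 filesystem imaged from a single partition:

```shell
# Print (not run) a cautious image-first sequence.
# $1 = failing source device, $2 = image file on a healthy disk.
print_rescue_plan() {
    src=$1
    img=$2
    map="$img.map"   # ddrescue mapfile, lets an interrupted copy resume
    # First pass: grab the easily readable areas, skip the slow scraping phase.
    echo "ddrescue -n $src $img $map"
    # Second pass: direct disc access, retry the bad areas up to 3 times.
    echo "ddrescue -d -r3 $src $img $map"
    # Read-only check of the copy; -n answers "no" to every question.
    echo "e2fsck -f -n $img"
}
```

For example, `print_rescue_plan /dev/sdX1 rescue.img` lists the three commands; only once the image exists would one consider a repairing fsck, and then against a copy of the image rather than the original drive.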

Last edited by JaredDM; 07-12-2016 at 10:32 AM.
 
1 members found this post helpful.
Old 07-12-2016, 10:30 AM   #15
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
I've been working with damaged drives since about 1980. That includes everything up to a total head crash, where I was still able to get ONE good copy of the data (the disk was essentially scraped bare after that).

Formerly, it was possible to allocate your own list of replacement blocks or extend the list of bad blocks not to use (it required a low-level format; the manufacturer's list had a software extension that the system could use for the same purpose)... And disks would last many years even in the presence of read errors. They still can last years (my oldest disk right now is a 40G disk about 20 years old, still working just fine, though not as active as it used to be).

As long as the disk heads have not been damaged, it is VERY reliable to recover data by just running fsck.

Yes, if the heads are actually damaged, you can do additional damage that can't be recovered. But that doesn't happen that often. Disks submerged in water? Not functional. Disks with blown formatters? Not functional. Disks that have been dropped? Likely not functional... but it depends on the disk and how it was dropped.

File systems after a system crash? No problem for any UNIX/Linux system as long as it isn't a Microsoft filesystem (and even then, MOST of the time they can still be recovered, just not as reliably as native filesystems).

Last edited by jpollard; 07-12-2016 at 10:32 AM.
 
1 members found this post helpful.
  



Tags
errors, filesystems, fsck




All times are GMT -5.
