[SOLVED] Why wouldn't I want to fix an error with fsck?
Is there any reason why someone wouldn't want to allow fsck to fix errors automatically? The documentation implies that this could cause problems, in some cases; but as a person who knows Linux well, but filesystems poorly, I can't think of how one would know the difference between a "good" error to fix and a "bad" error to fix.
Is there any reason not to use the -a or -p options?
Quote:
Is there any reason why someone wouldn't want to allow fsck to fix errors automatically? The documentation implies that this could cause problems, in some cases;
Citation? Running with "-y" is just asking for trouble, IMHO.
It is not safe to run on mounted filesystems. Other than that, if "-p" fails, take careful note of why.
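To make "take careful note of why" a bit more concrete: fsck reports its result as an exit status made of summed bit flags, as listed in the fsck(8) man page. The snippet below is only a minimal sketch for decoding that number; the function name decode_fsck_status is made up for illustration and is not part of any tool.

```python
# Exit-status bits as documented in fsck(8); the returned value is their sum.
FSCK_EXIT_BITS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status):
    """Return the list of meanings encoded in an fsck exit status.

    Hypothetical helper for illustration only.
    """
    if status == 0:
        return ["no errors"]
    # Test each documented bit against the status value.
    return [meaning for bit, meaning in FSCK_EXIT_BITS.items() if status & bit]


if __name__ == "__main__":
    # Example: 5 = 1 + 4, i.e. some errors fixed, some left alone.
    for line in decode_fsck_status(5):
        print(line)
```

A status with bit 4 set after a "-p" run is the case worth stopping on, since it means fsck declined to fix something by itself.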
My rule is that if I have a serious failure (say, a power failure followed by errors that I have to answer prompts for), I reformat and restore. No question, I just do it - fsck is designed to correct a filesystem; any files that become victims are not (well) documented. How can you trust that data?
There are times when there are two alternatives...
One problem that can occur is that the same blocks get allocated to two different files. One solution is to delete one file and mark all the mixed blocks to the other...
But that could leave you with a bad file, while the other file was valid... You can't know which file is the actual bad one without manually examining both files.
Another time is if a block is marked free, but is in use by a file... The usual solution is to mark the blocks as used - but the file could still be otherwise corrupted. The only way to know is to examine the file itself.
In both cases, if the files are easily recovered from backup just deleting the files with fsck is fine. But if it happens to a directory, deleting the file will/can cause a LOT of other failures - newly orphaned or lost files mostly.
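The doubly-allocated-block case can be illustrated with a toy model. This is only a sketch under loud assumptions: the block_map dictionary, the file names, and the block numbers are all invented for the example, and a real filesystem checker works on on-disk inode and bitmap structures, not a Python dict.

```python
from collections import defaultdict


def find_cross_linked(block_map):
    """Return {block: sorted list of files} for blocks claimed by 2+ files.

    block_map is a hypothetical {filename: iterable of block numbers}.
    """
    owners = defaultdict(set)
    for name, blocks in block_map.items():
        for block in blocks:
            owners[block].add(name)
    # Keep only blocks with more than one claimant.
    return {b: sorted(names) for b, names in owners.items() if len(names) > 1}


if __name__ == "__main__":
    # Invented sample data: block 102 is claimed by two files.
    sample = {
        "report.txt": [100, 101, 102],
        "photo.jpg": [102, 103],
        "notes.md": [200],
    }
    print(find_cross_linked(sample))
```

The sketch can tell you which files collide on a block, but nothing in the block map says which file the data actually belongs to. That is the point made above: someone has to open both files and look.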
"fsck -y" itself is rather safe. fsck will ONLY make decisions that have a single possible answer. If there are two or more, the fsck run is aborted with no changes, and a manual choice is required.
MOST errors on a filesystem can be fixed with an "fsck -y", which is why an "fsck -y" used to be the standard operation on a root filesystem. No administrator actions were required.
It might well be said that an argument against "fix errors automatically," at least as the first thing to be done, is that you are thereby handing over your disk's future to software instead of using human judgment.
Obviously, "file-system corruption is something that should never happen," and it usually never does. Therefore, when it does start showing up, there is necessarily some underlying cause, which fsck and its brethren would not know about. If the hardware is going south, their attempts to "fix" the problem might well run afoul of the same problem.
I would therefore recommend that you first take the drive completely off-line, then thoroughly examine it for errors. Judge for yourself what each error implies, as you're driving to the hardware store to buy a new drive to replace the old one.
Another important thing to look at (which fsck also "knows not of") is the SMART on-board diagnostics that nearly all disk drives today can provide. The on-board hardware of a disk drive is anything but "passive." It very actively monitors itself for errors, can "spare out" defective disk sectors without informing the host, and in general does a lot of things "behind the scenes" that have the cumulative effect of making the device look a lot more reliable than it physically is ... until it can't do that anymore. That's often when a drive, in the real world, starts to "mysteriously" fail.
In my experience, once the drive has failed SMART (hard, not soft), it doesn't work at all. Soft failures are covered. Once the number of soft failures exceeds a given threshold, the drive is in "pre-fail" mode and you had better get ready to replace it.
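For what it's worth, the threshold idea can be sketched in a few lines. As I understand the smartctl documentation, each attribute has a normalized value and a threshold, and the attribute counts as failed once the value is less than or equal to the threshold; a failed "Pre-fail" attribute is the one that signals imminent failure. The function name and the sample numbers below are made up for illustration and do not come from a real drive.

```python
def classify_attribute(value, threshold, attr_type):
    """Classify one SMART attribute from its normalized value and threshold.

    Hypothetical helper; attr_type is "Pre-fail" or "Old_age",
    matching the TYPE column that smartctl -A prints.
    """
    if value > threshold:
        return "ok"
    if attr_type == "Pre-fail":
        return "failed: replace the drive"
    return "failed: worn out"


if __name__ == "__main__":
    # Invented sample rows: (name, normalized value, threshold, type).
    rows = [
        ("Reallocated_Sector_Ct", 100, 36, "Pre-fail"),
        ("Reallocated_Sector_Ct", 30, 36, "Pre-fail"),
        ("Power_On_Hours", 55, 0, "Old_age"),
    ]
    for name, value, threshold, attr_type in rows:
        print(name, classify_attribute(value, threshold, attr_type))
```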
I understand the way in which a failing drive would produce problems that fsck can't fix, and, of course, the need for backups. I guess what I'm not clear on is how a person might know what files need to be looked at/verified, when the fsck output looks something like this:
I have no clue what that means, but the decision to run "fsck -y" right away depends on your goals. If you just want to get the machine back up and running as quickly as possible and are prepared to accept possible data loss, then fine, run "fsck -y". But, if you want to maximize the chances of recovering data, then you really should save an image of the damaged filesystem first. The actions taken by fsck to patch up the filesystem can make forensic data recovery a lot more difficult.
From my perspective, I don't recommend any file system repair on a drive unless you are 100% sure that everything is backed up. In many cases, file system issues are just that: issues with the file system. However, quite frequently the root cause of file system issues is a physical problem with the hard drive. When you run fsck (or chkdsk, for our Windows friends) and it encounters bad sector reads, it is not uncommon for the file system repair to drop the damaged chain rather than actually fix it (because it can't). This results in irreversible and unnecessary damage to the file system.
Not true in most non-Microsoft filesystems. They are designed to be repaired, and they can be repaired.
If the drive is working - it can be worked on. Even bad sector reads (if the drive is STILL working), though causing problems, can recover all but those sectors. Yes, some files may become unreadable, or partially unreadable. But that doesn't prevent the rest from being recoverable.
Without errors being fixed, the file system may not be mounted, even in a read-only mode.
So unless the backup was done immediately before the failure, backups will NOT recover all your data.
Quote:
Not true in most non-Microsoft filesystems. They are designed to be repaired, and they can be repaired.
This is absolutely not true. As a data recovery professional for almost a couple of decades, I get to see the damage caused almost daily.
Quote:
If the drive is working - it can be worked on. Even bad sector reads (if the drive is STILL working), though causing problems, can recover all but those sectors. Yes, some files may become unreadable, or partially unreadable. But that doesn't prevent the rest from being recoverable.
How it is worked on has a huge impact on its overall recoverability. Again, I see the damage caused by making the assumption that a drive is not in as bad shape as it is.
I have to agree with LukeRFI here. Any and all filesystem repair utilities, with the exception of perhaps DiskWarrior on the Mac, will favor repairing corruption in the file tables over saving the data. Running any options to "scan for and repair" bad sectors is just a recipe for disaster if the data isn't backed up. It very often kills the read/write heads, turning what would have been an easy recovery into an expensive nightmare.
Perhaps as a constructive first step before doing that, try using ddrescue to image as many of the sectors as you can onto another good disk. Then you can run fsck against the clone and see what happens.
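A rough sketch of that image-first workflow, written as a small helper that only builds the command lines and does not run anything. Everything specific here is an assumption: the device path /dev/sdX1, the image and map file names, and the choice of e2fsck (which only applies if the filesystem is ext2/3/4). The -n option to ddrescue skips the slow scraping phase on the first pass, and e2fsck -fn forces a check while answering "no" to every question, so the image is inspected without being modified.

```python
def rescue_plan(device, image, mapfile):
    """Return the commands for imaging a partition and checking the copy.

    Hypothetical helper; it builds argument lists and executes nothing.
    """
    return [
        # Copy everything that reads cleanly; the map file records
        # progress so the run can be resumed later.
        ["ddrescue", "-n", device, image, mapfile],
        # Read-only check of the image: force (-f), answer no to all (-n).
        ["e2fsck", "-fn", image],
    ]


if __name__ == "__main__":
    # Placeholder paths, to be replaced with real ones.
    for cmd in rescue_plan("/dev/sdX1", "rescue.img", "rescue.map"):
        print(" ".join(cmd))
```

Once the read-only check has shown what fsck wants to change, a repairing run can be tried on a second copy of the image, leaving the original drive untouched.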
I've been working with damaged drives since about 1980. All the way from a total head crash - and still able to get ONE good copy of the data (the disk was essentially scraped bare after that).
Formerly, it was possible to allocate your own list of replacement blocks or extend the list of bad blocks to not use (it required a low-level format - the manufacturer's list had a software extension that the system could use for the same purpose)... And disks would last many years even in the presence of read errors. They can still last years (my oldest disk right now is a 40G disk about 20 years old, still working just fine, though not as active as it used to be).
As long as the disk heads have not been damaged, it is VERY reliable to recover data by just running fsck.
Yes, if the heads are actually damaged, you can do additional damage that can't be recovered. But that doesn't happen that often. Disks submerged in water? Not functional. Disks with blown formatters? Not functional. Disks that have been dropped? Likely not functional... but it depends on the disk and how it was dropped.
File systems after a system crash? No problem for any UNIX/Linux system as long as it isn't a Microsoft filesystem (and even then, MOST of the time they can still be recovered, just not as reliably as native filesystems).