Twice in the last 10 days, a journal failure has occurred on the partition that is my /home.
Is this a hard drive starting to go bad?
Or are there other things that could cause this?
The system re-mounts the partition that it has occurred on as read-only.
I have recovered by going to init 1 and unmounting the partition and running fsck. Will things have been lost?
I recall reading about an application that will check a hard drive, mark bad tracks, and reassign them elsewhere. Does such a program exist?
The system will force a filesystem check if the volume was not unmounted properly. Are you sure you are correctly shutting down every time and not just powering off while the system is up?
If nothing you are doing is causing the checks, it is possible there is some corruption or hardware failure.
You can check a drive for bad blocks with the aptly named "badblocks" command. As you have already found, "fsck" is used to check for and fix filesystem level errors.
It is important to understand though that you can't fix a bad block on the drive; the best you can do is tell the filesystem to not use it. Once you start losing blocks on your drive, you need to be focused on backing up your data and putting a new drive in the system. The problem will only get worse with time, until eventually you might not be able to use the drive at all.
Usually the symptoms you describe herald a hardware problem. Could be bad cable, or bad controller. Could also be a bad drive. I would assign a controller fault as being about equal probability with a drive fault, and a cable fault as being lower probability. However, since cables are cheap, I would start by changing the cable.
Badblocks is indeed a way to test the drive. You should start running the nondestructive test routinely, looking for changes in the drive. It is possible that you had a head crash due, for instance, to the cat bumping the box as it ran past fleeing the 2-year-old that was pulling its tail. In this event, you have a bad place on the drive that more than likely won't propagate. Thus, if you map out the bad place, you'll be OK.
OTOH, if the bad blocks list is growing, then as you have already been told, this is your cue to buy a new drive rather soon.
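One way to tell whether the list is growing is to keep the output of each scan and compare it with the previous one. The following is only a minimal sketch. It assumes each scan was saved as plain text with one block number per line (the format badblocks writes when given its -o option), and the function names and sample numbers are made up for illustration.

```python
def parse_badblocks_list(text):
    """Return the set of block numbers found in a saved badblocks listing."""
    blocks = set()
    for line in text.splitlines():
        line = line.strip()
        if line.isdigit():  # skip blank lines and anything that is not a block number
            blocks.add(int(line))
    return blocks


def newly_bad_blocks(previous_text, current_text):
    """Return, in ascending order, blocks that are bad now but were not before."""
    previous = parse_badblocks_list(previous_text)
    current = parse_badblocks_list(current_text)
    return sorted(current - previous)


if __name__ == "__main__":
    # Made-up sample data standing in for two saved scans of the same drive.
    previous_scan = "1024\n1025\n"
    current_scan = "1024\n1025\n20480\n"
    print(newly_bad_blocks(previous_scan, current_scan))  # prints [20480]
```

An empty result from one scan to the next fits the "one-time bump" case described above; a result that keeps gaining entries is the growing list that means the drive should be replaced.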
If you can get your hands on a copy, or if you are willing to spend the money, get SpinRite. This product is brilliant and will definitively tell you the condition of your drive, and very often will fix it.
Thanks jiml8;
The drive is an IDE drive. About the "bad controller": is there a controller on the hard drive, or are you referring to the IDE port on the motherboard? I have two SATA drives and the IDE drive, and frequently after powering off, when I power back on, none of the hard drives are found until I re-seat the serial ATA cables.
SpinRite's been around for ages; I never thought of using it for a hard drive. I used it in my Windows dark ages to test a 100 MB Zip drive.
Thanks for pointing me to "badblocks".
Thank you both for answering my query; all replies are appreciated.
I have recovered by going to init 1 and unmounting the partition and running fsck. Will things have been lost?
Every partition has a lost+found directory. fsck places any orphan files, orphan directories, or file fragments in lost+found and uses the inode number for a file name. So look in the directory called /home/lost+found for any lost files.
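To get a feel for how much fsck has put there, the directory can be summarized with a short script. This is a rough sketch rather than a recovery tool: the function name is hypothetical, it only counts regular files and adds up their sizes, and reading lost+found normally requires root.

```python
import os


def summarize_recovered(lost_found_dir):
    """Count regular files under lost_found_dir and total their sizes in bytes."""
    file_count = 0
    total_bytes = 0
    # os.walk also descends into any orphan directories fsck reattached here.
    for root, _dirs, files in os.walk(lost_found_dir):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks and special files; count only regular files.
            if os.path.isfile(path) and not os.path.islink(path):
                file_count += 1
                total_bytes += os.path.getsize(path)
    return file_count, total_bytes


if __name__ == "__main__":
    # Path taken from the post above; run as root so the directory is readable.
    count, size = summarize_recovered("/home/lost+found")
    print(f"{count} files, {size / (1024 * 1024):.1f} MB")
```

If the directory is missing or unreadable, os.walk yields nothing and the script reports 0 files, so a zero result run as a normal user does not mean nothing was recovered.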
If your ATA drive is using a 40-wire ATA cable, you may be getting crosstalk. If the drive is ancient, you may not need an 80-wire cable, but it would have to be very, very old.
Just another avenue to walk in your troubleshooting.
The part about having to re-seat the SATA cables is odd. Normally, creeping is associated with too much heat in the box, possibly an expansion/contraction issue also. But I'm thinking it's probably a controller issue, the main chip set on the mother board.
SpinRite is the only software I'm aware of that can reallocate data in bad sectors/blocks to spare sectors using a tenacious method of reading them, then isolate the bad sectors so they can't be used again, thus saving your data and drive.
Thanks for the info, Steve. I think it's time to replace the drive; there are over 1800 files containing 77 MB of data in that lost+found folder.
Quote:
Originally Posted by Junior Hacker
The part about having to re-seat the SATA cables is odd. Normally, creeping is associated with too much heat in the box, possibly an expansion/contraction issue also. But I'm thinking it's probably a controller issue, the main chip set on the mother board.
Information appreciated. I do not feel real confident troubleshooting this stuff, and it is always nice to get confirmation for some of my thoughts, specifically heat and mainboard problems.